* [PATCH 00/66] [v1] Full PPGTT minus soft pin
@ 2013-06-27 23:30 Ben Widawsky
  2013-06-27 23:30 ` [PATCH 01/66] drm/i915: Remove extra error state NULL Ben Widawsky
                   ` (67 more replies)
  0 siblings, 68 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

First, I don't think this whole series is ready to be merged yet. It
is, however, ready for review, and I think many of the prep patches in
the series could be merged now to make my rebasing life a bit easier. I
can't keep ignoring pretty much all emails/bugs, as I have for the last
month, while wrapping up this series. As for the current state: on my
IVB things are pretty stable. I've seen one unexplained hang, but I'm
hopeful review might help me uncover/explain it.

This patch series introduces the next step in enabling full PPGTT: a
per-fd address space/context. It also contains the previously unmerged
patches, some of which have been reworked, modified, or rebased. As for
the continued VMA changes, the delta from the last posting is that the
bound list was per VM and is now global. I've also moved the active
list to per VM.

Brand new in this series, we take the previous series' context per fd
(with address space) one step further and actually switch address
spaces when we do context switches. To make this happen, the series
continues to chip away at the notion of an object only ever being bound
into one address space, via the struct i915_address_space and struct
i915_vma data structures, which are really abstractions for a page
directory and the currently mapped PTEs, respectively. The error state
is improved since the last series (though some work is probably still
needed there). The series also serves to remove the notion of the
aliasing PPGTT, since in theory everything bound into the GGTT
shouldn't benefit from an aliasing PPGTT (fact check).
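
For those who haven't followed the VMA work, here is a rough sketch of
the two abstractions. Field names are illustrative; the real structs in
the series carry more state:

struct i915_address_space {
	struct drm_mm mm;		/* allocator for this VM's range */
	struct drm_device *dev;
	struct list_head global_link;	/* link in the global list of VMs */
	struct list_head active_list;	/* objects with pending GPU access */
	struct list_head inactive_list;	/* bound in this VM, but idle */
	/* page directory / PTE management hangs off of here */
};

struct i915_vma {
	struct drm_mm_node node;	/* this object's range in the VM */
	struct drm_i915_gem_object *obj;
	struct i915_address_space *vm;
	struct list_head vma_link;	/* link in the object's VMA list */
};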

With every context having its own address space, and every open DRM fd
having its own context, it's trivial at execbuf time to look up the
context and do the pinning in the proper address space. More
importantly, a context is now guaranteed to exist, and the lack of that
guarantee is what made this impossible to do earlier.
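
Conceptually, the execbuf path then becomes something like the sketch
below. This is hand-written for illustration, not lifted from the
patches; i915_gem_context_get(), ctx->ppgtt->base, and the extra vm
argument to pin are stand-ins for whatever the final code ends up
looking like:

	struct i915_hw_context *ctx;
	struct i915_address_space *vm;

	/* every fd has a default context, so this lookup is valid even
	 * for userspace that has never created a context */
	ctx = i915_gem_context_get(file_priv, ctx_id);
	if (IS_ERR(ctx))
		return PTR_ERR(ctx);

	/* pin/relocate in the context's VM rather than the global GTT */
	vm = &ctx->ppgtt->base;
	ret = i915_gem_object_pin(obj, vm, alignment, false, false);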

*A note on patch ordering:* In order to develop this series
incrementally, the final patch ordering is admittedly a little bit
strange. I'm more than willing to reorder patches as requested, but I'd
really prefer not to do heavy reordering unless there is a major
benefit, or of course to fix bugs.

# What is not in this patch series, in the order I think we should
handle it (and I acknowledge some of this stuff is non-trivial):

## Review + QA coverage

## Porting to HSW

It shouldn't be too much extra work, if any, to add support. I haven't looked
into it yet.

## Better vm/ppgtt info in error state collection

In particular, I want to dump all the PTEs at hang time, or at the very
least the guilty PTEs. This isn't difficult, and can be done with
copypasta from the existing dumper I have.
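
Something along these lines (an untested sketch for gen6, reusing the
pt_pages array and the I915_PPGTT_PT_ENTRIES define from patch 3; the
real dumper is more careful):

static void dump_ptes(struct i915_hw_ppgtt *ppgtt, u32 start, u32 len)
{
	u32 addr;

	for (addr = start; addr < start + len; addr += PAGE_SIZE) {
		/* each gen6 PDE maps 1024 PTEs * 4KiB = 4MiB */
		unsigned pde = addr >> 22;
		unsigned pte = (addr >> PAGE_SHIFT) % I915_PPGTT_PT_ENTRIES;
		gen6_gtt_pte_t *pt = kmap_atomic(ppgtt->pt_pages[pde]);

		DRM_DEBUG_DRIVER("0x%08x: PTE = 0x%08x\n", addr, pt[pte]);
		kunmap_atomic(pt);
	}
}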

## User space and the implications

Now that contexts are valid on all rings, userspace should begin
emitting the context on all rings if it expects both rings to be able
to access the same objects at the same offsets. The mesa change looks
to be just one line, which simply emits the context for batch->is_blit
as well; I doubt libva is using contexts, and SNA appears not to. The
plan for supporting mesa is to do the detection in libdrm, and go ahead
with the simple mesa one-liner. I've been using the one-liner in mesa
for a while now, but we will need to support old userspace in the
kernel, so there might be a bit of work even on the kernel side here.
We also need some IGT tools test updates. I have messy versions of
these locally already.
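
For reference, the mesa one-liner amounts to something like the
following (paraphrased from memory, so treat the exact code as
illustrative rather than the actual patch):

	/* before: blit batches bypass the hw context */
	if (intel->hw_ctx == NULL || batch->is_blit)
		ret = drm_intel_bo_mrb_exec(batch->bo, 4 * used, NULL, 0, 0,
					    batch->is_blit ? I915_EXEC_BLT
							   : I915_EXEC_RENDER);
	else
		ret = drm_intel_gem_bo_context_exec(batch->bo, intel->hw_ctx,
						    4 * used, I915_EXEC_RENDER);

	/* after: emit the context on the blit ring too */
	ret = drm_intel_gem_bo_context_exec(batch->bo, intel->hw_ctx, 4 * used,
					    batch->is_blit ? I915_EXEC_BLT
							   : I915_EXEC_RENDER);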

## Performance data

The lack of numbers shouldn't preclude preliminary review of the
patches, since the main goal of PPGTT is really about security,
correctness, and enabling other things. I will update with some numbers
after I work on it a bit more.


## Testing on SNB

If our current code is correct, then I think these patches might work
on SNB as-is, but that is untested. There is currently no way to
disconnect contexts + PPGTT from the whole thing; so if this doesn't
work, we'll need to rework some of the code. I think it should just
entail bringing back the aliasing PPGTT and not doing the address space
switch when switching contexts (the aliasing PPGTT will have a null
switch_mm()).
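
That is, roughly (a hypothetical shape, assuming the switch_mm() hook
that a later patch in this series extracts):

static int aliasing_ppgtt_switch_mm(struct i915_hw_ppgtt *ppgtt,
				    struct intel_ring_buffer *ring)
{
	/* the aliasing PPGTT mirrors the GGTT; nothing to switch */
	return 0;
}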

## Soft pin interface

I'd like to defer the discussion until these patches are merged.

## On demand page table allocation

This is a potentially very useful optimization, for at least the
following reasons (a rough sketch of the idea follows the list):
* any app using contexts will have an extra set of page tables it isn't
  using; wasted memory
* reduce DCLV to reduce PD fetch latency
* allow better use of a switch to the default context in low memory
  situations (evicting unused page tables, for example)
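
A rough sketch of the idea, reusing the pt_pages array from the
existing gen6 code (the helper is hypothetical, not part of this
series):

static int gen6_alloc_pt_on_demand(struct i915_hw_ppgtt *ppgtt,
				   unsigned pde)
{
	if (ppgtt->pt_pages[pde])
		return 0;	/* already populated */

	ppgtt->pt_pages[pde] = alloc_page(GFP_KERNEL);
	if (!ppgtt->pt_pages[pde])
		return -ENOMEM;

	/* dma map the page and write the PDE before any PTE in this
	 * range is used */
	return 0;
}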

## Per VMA cache levels/control

There are situations in the code where we have to flush the GPU
pipeline in order to change cache levels. This should no longer be
necessary for unaffected VMs (I think). The same may be true of domain
tracking.

## dmabuf/prime integration

I haven't looked into what's missing to support it. If I'm lucky, it just works.



With that, if you haven't already moved on chanting "tl;dr": all
comments welcome.

---

Ben Widawsky (65):
  drm/i915: Remove extra error state NULL
  drm/i915: Extract error buffer capture
  drm/i915: make PDE|PTE platform specific
  drm/i915: Don't clear gtt with 0 entries
  drm/i915: Conditionally use guard page based on PPGTT
  drm/i915: Use drm_mm for PPGTT PDEs
  drm/i915: cleanup context fini
  drm/i915: Do a fuller init after reset
  drm/i915: Split context enabling from init
  drm/i915: destroy i915_gem_init_global_gtt
  drm/i915: Embed PPGTT into the context
  drm/i915: Unify PPGTT codepaths on gen6+
  drm/i915: Move ppgtt initialization down
  drm/i915: Tie context to PPGTT
  drm/i915: Really share scratch page
  drm/i915: Combine scratch members into a struct
  drm/i915: Drop dev from pte_encode
  drm/i915: Use gtt shortform where possible
  drm/i915: Move fbc members out of line
  drm/i915: Move gtt and ppgtt under address space umbrella
  drm/i915: Move gtt_mtrr to i915_gtt
  drm/i915: Move stolen stuff to i915_gtt
  drm/i915: Move aliasing_ppgtt
  drm/i915: Put the mm in the parent address space
  drm/i915: Move active/inactive lists to new mm
  drm/i915: Create a global list of vms
  drm/i915: Remove object's gtt_offset
  drm: pre allocate node for create_block
  drm/i915: Getter/setter for object attributes
  drm/i915: Create VMAs (part 1)
  drm/i915: Create VMAs (part 2) - kill gtt space
  drm/i915: Create VMAs (part 3) - plumbing
  drm/i915: Create VMAs (part 3.5) - map and fenceable tracking
  drm/i915: Create VMAs (part 4) - Error capture
  drm/i915: Create VMAs (part 5) - move mm_list
  drm/i915: Create VMAs (part 6) - finish error plumbing
  drm/i915: create an object_is_active()
  drm/i915: Move active to vma
  drm/i915: Track all VMAs per VM
  drm/i915: Defer request freeing
  drm/i915: Clean up VMAs before freeing
  drm/i915: Replace has_bsd/blt with a mask
  drm/i915: Catch missed context unref earlier
  drm/i915: Add a context open function
  drm/i915: Permit contexts on all rings
  drm/i915: Fix context fini refcounts
  drm/i915: Better reset handling for contexts
  drm/i915: Create a per file_priv default context
  drm/i915: Remove ring specificity from contexts
  drm/i915: Track which ring a context ran on
  drm/i915: dump error state based on capture
  drm/i915: PPGTT should take a ppgtt argument
  drm/i915: USE LRI for switching PP_DIR_BASE
  drm/i915: Extract mm switching to function
  drm/i915: Write PDEs at init instead of enable
  drm/i915: Disallow pin with full ppgtt
  drm/i915: Get context early in execbuf
  drm/i915: Pass ctx directly to switch/hangstat
  drm/i915: Actually add the new address spaces
  drm/i915: Use multiple VMs
  drm/i915: Kill now unused ppgtt_{un,}bind
  drm/i915: Add PPGTT dumper
  drm/i915: Dump all ppgtt
  drm/i915: Add debugfs for vma info per vm
  drm/i915: Getparam full ppgtt

Chris Wilson (1):
  drm: Optionally create mm blocks from top-to-bottom

 drivers/gpu/drm/drm_mm.c                   | 134 +++---
 drivers/gpu/drm/i915/i915_debugfs.c        | 215 ++++++++--
 drivers/gpu/drm/i915/i915_dma.c            |  25 +-
 drivers/gpu/drm/i915/i915_drv.c            |  57 ++-
 drivers/gpu/drm/i915/i915_drv.h            | 353 ++++++++++------
 drivers/gpu/drm/i915/i915_gem.c            | 639 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c    | 279 +++++++++----
 drivers/gpu/drm/i915/i915_gem_debug.c      |  11 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  64 +--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 138 ++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 541 ++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  87 ++--
 drivers/gpu/drm/i915/i915_gem_tiling.c     |  21 +-
 drivers/gpu/drm/i915/i915_irq.c            | 197 ++++++---
 drivers/gpu/drm/i915/i915_trace.h          |  38 +-
 drivers/gpu/drm/i915/intel_display.c       |  40 +-
 drivers/gpu/drm/i915/intel_drv.h           |   7 -
 drivers/gpu/drm/i915/intel_fb.c            |   8 +-
 drivers/gpu/drm/i915/intel_overlay.c       |  26 +-
 drivers/gpu/drm/i915/intel_pm.c            |  65 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  32 +-
 drivers/gpu/drm/i915/intel_sprite.c        |   8 +-
 include/drm/drm_mm.h                       | 147 ++++---
 include/uapi/drm/i915_drm.h                |   1 +
 24 files changed, 2044 insertions(+), 1089 deletions(-)

-- 
1.8.3.1


* [PATCH 01/66] drm/i915: Remove extra error state NULL
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 02/66] drm/i915: Extract error buffer capture Ben Widawsky
                   ` (66 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Not only was one of these NULL assignments an extra, but since we now
kzalloc the error state, we don't need either of them.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 208e675..13574fa 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1926,10 +1926,6 @@ static void i915_capture_error_state(struct drm_device *dev)
 	i915_gem_record_fences(dev, error);
 	i915_gem_record_rings(dev, error);
 
-	/* Record buffers on the active and pinned lists. */
-	error->active_bo = NULL;
-	error->pinned_bo = NULL;
-
 	i = 0;
 	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
 		i++;
@@ -1939,8 +1935,6 @@ static void i915_capture_error_state(struct drm_device *dev)
 			i++;
 	error->pinned_bo_count = i - error->active_bo_count;
 
-	error->active_bo = NULL;
-	error->pinned_bo = NULL;
 	if (i) {
 		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
 					   GFP_ATOMIC);
-- 
1.8.3.1


* [PATCH 02/66] drm/i915: Extract error buffer capture
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
  2013-06-27 23:30 ` [PATCH 01/66] drm/i915: Remove extra error state NULL Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 03/66] drm/i915: make PDE|PTE platform specific Ben Widawsky
                   ` (65 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This will help once we have per-VM buffer capturing.
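
The eventual shape (illustrative only; the per-VM version comes much
later in the series) is a helper that takes the address space, e.g.:

static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
				struct drm_i915_error_state *error,
				struct i915_address_space *vm)
{
	/* walk vm->active_list and the objects bound into this VM
	 * instead of the single global lists */
}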

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c | 69 +++++++++++++++++++++++------------------
 1 file changed, 38 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 13574fa..fa70fd0 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1850,6 +1850,42 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	}
 }
 
+static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
+				     struct drm_i915_error_state *error)
+{
+	struct drm_i915_gem_object *obj;
+	int i;
+
+	i = 0;
+	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
+		i++;
+	error->active_bo_count = i;
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
+		if (obj->pin_count)
+			i++;
+	error->pinned_bo_count = i - error->active_bo_count;
+
+	if (i) {
+		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
+					   GFP_ATOMIC);
+		if (error->active_bo)
+			error->pinned_bo =
+				error->active_bo + error->active_bo_count;
+	}
+
+	if (error->active_bo)
+		error->active_bo_count =
+			capture_active_bo(error->active_bo,
+					  error->active_bo_count,
+					  &dev_priv->mm.active_list);
+
+	if (error->pinned_bo)
+		error->pinned_bo_count =
+			capture_pinned_bo(error->pinned_bo,
+					  error->pinned_bo_count,
+					  &dev_priv->mm.bound_list);
+}
+
 /**
  * i915_capture_error_state - capture an error record for later analysis
  * @dev: drm device
@@ -1862,10 +1898,9 @@ static void i915_gem_record_rings(struct drm_device *dev,
 static void i915_capture_error_state(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj;
 	struct drm_i915_error_state *error;
 	unsigned long flags;
-	int i, pipe;
+	int pipe;
 
 	spin_lock_irqsave(&dev_priv->gpu_error.lock, flags);
 	error = dev_priv->gpu_error.first_error;
@@ -1923,38 +1958,10 @@ static void i915_capture_error_state(struct drm_device *dev)
 
 	i915_get_extra_instdone(dev, error->extra_instdone);
 
+	i915_gem_capture_buffers(dev_priv, error);
 	i915_gem_record_fences(dev, error);
 	i915_gem_record_rings(dev, error);
 
-	i = 0;
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
-		i++;
-	error->active_bo_count = i;
-	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-		if (obj->pin_count)
-			i++;
-	error->pinned_bo_count = i - error->active_bo_count;
-
-	if (i) {
-		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
-					   GFP_ATOMIC);
-		if (error->active_bo)
-			error->pinned_bo =
-				error->active_bo + error->active_bo_count;
-	}
-
-	if (error->active_bo)
-		error->active_bo_count =
-			capture_active_bo(error->active_bo,
-					  error->active_bo_count,
-					  &dev_priv->mm.active_list);
-
-	if (error->pinned_bo)
-		error->pinned_bo_count =
-			capture_pinned_bo(error->pinned_bo,
-					  error->pinned_bo_count,
-					  &dev_priv->mm.bound_list);
-
 	do_gettimeofday(&error->time);
 
 	error->overlay = intel_overlay_capture_error_state(dev);
-- 
1.8.3.1


* [PATCH 03/66] drm/i915: make PDE|PTE platform specific
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
  2013-06-27 23:30 ` [PATCH 01/66] drm/i915: Remove extra error state NULL Ben Widawsky
  2013-06-27 23:30 ` [PATCH 02/66] drm/i915: Extract error buffer capture Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-28 16:53   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom Ben Widawsky
                   ` (64 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Nothing outside of i915_gem_gtt.c (more specifically, the relevant
gen-specific init function) should need to know the number of PDEs, or
the number of PTEs per PD. Exposing these will only lead to code that
circumvents the upcoming VM abstraction.

To accomplish this, move the defines into the .c file, rename the PDE
define to be GEN6-specific, and make the PTE count less of a magic
number.

The remaining code in the global GTT setup is a bit messy, but an
upcoming patch will clean it up.
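
For reference, with 4KiB pages and 4-byte gen6 PTEs the new define
works out to PAGE_SIZE / sizeof(gen6_gtt_pte_t) = 4096 / 4 = 1024 PTEs
per page table, so 512 PDEs * 1024 PTEs * 4KiB pages = 2GiB of PPGTT
address space.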

v2: Don't hardcode number of PDEs (Daniel + Jesse)
Reworded commit message to reflect change.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     | 2 --
 drivers/gpu/drm/i915/i915_gem_gtt.c | 9 ++++++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7940cbe..b709712 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -483,8 +483,6 @@ struct i915_gtt {
 };
 #define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
 
-#define I915_PPGTT_PD_ENTRIES 512
-#define I915_PPGTT_PT_ENTRIES 1024
 struct i915_hw_ppgtt {
 	struct drm_device *dev;
 	unsigned num_pd_entries;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5101ab6..216e7a1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -28,6 +28,9 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 
+#define GEN6_PPGTT_PD_ENTRIES 512
+#define I915_PPGTT_PT_ENTRIES (PAGE_SIZE / sizeof(gen6_gtt_pte_t))
+
 /* PPGTT stuff */
 #define GEN6_GTT_ADDR_ENCODE(addr)	((addr) | (((addr) >> 28) & 0xff0))
 
@@ -278,7 +281,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	} else {
 		ppgtt->pte_encode = gen6_pte_encode;
 	}
-	ppgtt->num_pd_entries = I915_PPGTT_PD_ENTRIES;
+	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
 	ppgtt->clear_range = gen6_ppgtt_clear_range;
 	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
@@ -688,7 +691,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 		if (INTEL_INFO(dev)->gen <= 7) {
 			/* PPGTT pdes are stolen from global gtt ptes, so shrink the
 			 * aperture accordingly when using aliasing ppgtt. */
-			gtt_size -= I915_PPGTT_PD_ENTRIES*PAGE_SIZE;
+			gtt_size -= GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
 		}
 
 		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
@@ -699,7 +702,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 
 		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
 		drm_mm_takedown(&dev_priv->mm.gtt_space);
-		gtt_size += I915_PPGTT_PD_ENTRIES*PAGE_SIZE;
+		gtt_size += GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
 	}
 	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
 }
-- 
1.8.3.1


* [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (2 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 03/66] drm/i915: make PDE|PTE platform specific Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 12:30   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 05/66] drm/i915: Don't clear gtt with 0 entries Ben Widawsky
                   ` (63 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

From: Chris Wilson <chris@chris-wilson.co.uk>

Clients like i915 need to segregate cache domains within the GTT, which
can lead to small amounts of fragmentation. By allocating the uncached
buffers from the bottom and the cacheable buffers from the top, we can
reduce the amount of wasted space and also optimize allocation of the
mappable portion of the GTT to only those buffers that require CPU
access through the GTT.

v2 by Ben:
Update callers in i915_gem_object_bind_to_gtt()
Turn search flags and allocation flags into separate enums
Make checkpatch happy where logical/easy
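
Callers that want a top-down allocation then pass the new flags (or the
DRM_MM_TOPDOWN shorthand), e.g.:

	ret = drm_mm_insert_node_in_range_generic(mm, node, size, alignment,
						  color, start, end,
						  DRM_MM_CREATE_TOP,
						  DRM_MM_SEARCH_BELOW);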

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/drm_mm.c        | 122 ++++++++++++++++++---------------
 drivers/gpu/drm/i915/i915_gem.c |   4 +-
 include/drm/drm_mm.h            | 148 ++++++++++++++++++++++++----------------
 3 files changed, 161 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 07cf99c..7095328 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -49,7 +49,7 @@
 
 #define MM_UNUSED_TARGET 4
 
-static struct drm_mm_node *drm_mm_kmalloc(struct drm_mm *mm, int atomic)
+static struct drm_mm_node *drm_mm_kmalloc(struct drm_mm *mm, bool atomic)
 {
 	struct drm_mm_node *child;
 
@@ -105,7 +105,8 @@ EXPORT_SYMBOL(drm_mm_pre_get);
 static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 				 struct drm_mm_node *node,
 				 unsigned long size, unsigned alignment,
-				 unsigned long color)
+				 unsigned long color,
+				 enum drm_mm_allocator_flags flags)
 {
 	struct drm_mm *mm = hole_node->mm;
 	unsigned long hole_start = drm_mm_hole_node_start(hole_node);
@@ -118,12 +119,22 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 	if (mm->color_adjust)
 		mm->color_adjust(hole_node, color, &adj_start, &adj_end);
 
+	if (flags & DRM_MM_CREATE_TOP)
+		adj_start = adj_end - size;
+
 	if (alignment) {
 		unsigned tmp = adj_start % alignment;
-		if (tmp)
-			adj_start += alignment - tmp;
+		if (tmp) {
+			if (flags & DRM_MM_CREATE_TOP)
+				adj_start -= tmp;
+			else
+				adj_start += alignment - tmp;
+		}
 	}
 
+	BUG_ON(adj_start < hole_start);
+	BUG_ON(adj_end > hole_end);
+
 	if (adj_start == hole_start) {
 		hole_node->hole_follows = 0;
 		list_del(&hole_node->hole_stack);
@@ -150,7 +161,7 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
 					unsigned long start,
 					unsigned long size,
-					bool atomic)
+					enum drm_mm_allocator_flags flags)
 {
 	struct drm_mm_node *hole, *node;
 	unsigned long end = start + size;
@@ -161,7 +172,7 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
 		if (hole_start > start || hole_end < end)
 			continue;
 
-		node = drm_mm_kmalloc(mm, atomic);
+		node = drm_mm_kmalloc(mm, flags & DRM_MM_CREATE_ATOMIC);
 		if (unlikely(node == NULL))
 			return NULL;
 
@@ -196,15 +207,15 @@ struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
 					     unsigned long size,
 					     unsigned alignment,
 					     unsigned long color,
-					     int atomic)
+					     enum drm_mm_allocator_flags flags)
 {
 	struct drm_mm_node *node;
 
-	node = drm_mm_kmalloc(hole_node->mm, atomic);
+	node = drm_mm_kmalloc(hole_node->mm, flags & DRM_MM_CREATE_ATOMIC);
 	if (unlikely(node == NULL))
 		return NULL;
 
-	drm_mm_insert_helper(hole_node, node, size, alignment, color);
+	drm_mm_insert_helper(hole_node, node, size, alignment, color, flags);
 
 	return node;
 }
@@ -217,32 +228,28 @@ EXPORT_SYMBOL(drm_mm_get_block_generic);
  */
 int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node,
 			       unsigned long size, unsigned alignment,
-			       unsigned long color)
+			       unsigned long color,
+			       enum drm_mm_allocator_flags aflags,
+			       enum drm_mm_search_flags sflags)
 {
 	struct drm_mm_node *hole_node;
 
 	hole_node = drm_mm_search_free_generic(mm, size, alignment,
-					       color, 0);
+					       color, sflags);
 	if (!hole_node)
 		return -ENOSPC;
 
-	drm_mm_insert_helper(hole_node, node, size, alignment, color);
+	drm_mm_insert_helper(hole_node, node, size, alignment, color, aflags);
 	return 0;
 }
 EXPORT_SYMBOL(drm_mm_insert_node_generic);
 
-int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node,
-		       unsigned long size, unsigned alignment)
-{
-	return drm_mm_insert_node_generic(mm, node, size, alignment, 0);
-}
-EXPORT_SYMBOL(drm_mm_insert_node);
-
 static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
 				       struct drm_mm_node *node,
 				       unsigned long size, unsigned alignment,
 				       unsigned long color,
-				       unsigned long start, unsigned long end)
+				       unsigned long start, unsigned long end,
+				       enum drm_mm_allocator_flags flags)
 {
 	struct drm_mm *mm = hole_node->mm;
 	unsigned long hole_start = drm_mm_hole_node_start(hole_node);
@@ -257,13 +264,20 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
 	if (adj_end > end)
 		adj_end = end;
 
+	if (flags & DRM_MM_CREATE_TOP)
+		adj_start = adj_end - size;
+
 	if (mm->color_adjust)
 		mm->color_adjust(hole_node, color, &adj_start, &adj_end);
 
 	if (alignment) {
 		unsigned tmp = adj_start % alignment;
-		if (tmp)
-			adj_start += alignment - tmp;
+		if (tmp) {
+			if (flags & DRM_MM_CREATE_TOP)
+				adj_start -= tmp;
+			else
+				adj_start += alignment - tmp;
+		}
 	}
 
 	if (adj_start == hole_start) {
@@ -280,6 +294,8 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
 	INIT_LIST_HEAD(&node->hole_stack);
 	list_add(&node->node_list, &hole_node->node_list);
 
+	BUG_ON(node->start < start);
+	BUG_ON(node->start < adj_start);
 	BUG_ON(node->start + node->size > adj_end);
 	BUG_ON(node->start + node->size > end);
 
@@ -290,22 +306,23 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
 	}
 }
 
-struct drm_mm_node *drm_mm_get_block_range_generic(struct drm_mm_node *hole_node,
-						unsigned long size,
-						unsigned alignment,
-						unsigned long color,
-						unsigned long start,
-						unsigned long end,
-						int atomic)
+struct drm_mm_node *
+drm_mm_get_block_range_generic(struct drm_mm_node *hole_node,
+			       unsigned long size,
+			       unsigned alignment,
+			       unsigned long color,
+			       unsigned long start,
+			       unsigned long end,
+			       enum drm_mm_allocator_flags flags)
 {
 	struct drm_mm_node *node;
 
-	node = drm_mm_kmalloc(hole_node->mm, atomic);
+	node = drm_mm_kmalloc(hole_node->mm, flags & DRM_MM_CREATE_ATOMIC);
 	if (unlikely(node == NULL))
 		return NULL;
 
 	drm_mm_insert_helper_range(hole_node, node, size, alignment, color,
-				   start, end);
+				   start, end, flags);
 
 	return node;
 }
@@ -318,31 +335,25 @@ EXPORT_SYMBOL(drm_mm_get_block_range_generic);
  */
 int drm_mm_insert_node_in_range_generic(struct drm_mm *mm, struct drm_mm_node *node,
 					unsigned long size, unsigned alignment, unsigned long color,
-					unsigned long start, unsigned long end)
+					unsigned long start, unsigned long end,
+					enum drm_mm_allocator_flags aflags,
+					enum drm_mm_search_flags sflags)
 {
 	struct drm_mm_node *hole_node;
 
 	hole_node = drm_mm_search_free_in_range_generic(mm,
 							size, alignment, color,
-							start, end, 0);
+							start, end, sflags);
 	if (!hole_node)
 		return -ENOSPC;
 
 	drm_mm_insert_helper_range(hole_node, node,
 				   size, alignment, color,
-				   start, end);
+				   start, end, aflags);
 	return 0;
 }
 EXPORT_SYMBOL(drm_mm_insert_node_in_range_generic);
 
-int drm_mm_insert_node_in_range(struct drm_mm *mm, struct drm_mm_node *node,
-				unsigned long size, unsigned alignment,
-				unsigned long start, unsigned long end)
-{
-	return drm_mm_insert_node_in_range_generic(mm, node, size, alignment, 0, start, end);
-}
-EXPORT_SYMBOL(drm_mm_insert_node_in_range);
-
 /**
  * Remove a memory node from the allocator.
  */
@@ -418,7 +429,7 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
 					       unsigned long size,
 					       unsigned alignment,
 					       unsigned long color,
-					       bool best_match)
+					       enum drm_mm_search_flags flags)
 {
 	struct drm_mm_node *entry;
 	struct drm_mm_node *best;
@@ -431,7 +442,8 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
 	best = NULL;
 	best_size = ~0UL;
 
-	drm_mm_for_each_hole(entry, mm, adj_start, adj_end) {
+	__drm_mm_for_each_hole(entry, mm, adj_start, adj_end,
+			       flags & DRM_MM_SEARCH_BELOW) {
 		if (mm->color_adjust) {
 			mm->color_adjust(entry, color, &adj_start, &adj_end);
 			if (adj_end <= adj_start)
@@ -441,7 +453,7 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
 		if (!check_free_hole(adj_start, adj_end, size, alignment))
 			continue;
 
-		if (!best_match)
+		if ((flags & DRM_MM_SEARCH_BEST) == 0)
 			return entry;
 
 		if (entry->size < best_size) {
@@ -454,13 +466,14 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
 }
 EXPORT_SYMBOL(drm_mm_search_free_generic);
 
-struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
-							unsigned long size,
-							unsigned alignment,
-							unsigned long color,
-							unsigned long start,
-							unsigned long end,
-							bool best_match)
+struct drm_mm_node *
+drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
+				    unsigned long size,
+				    unsigned alignment,
+				    unsigned long color,
+				    unsigned long start,
+				    unsigned long end,
+				    enum drm_mm_search_flags flags)
 {
 	struct drm_mm_node *entry;
 	struct drm_mm_node *best;
@@ -473,7 +486,8 @@ struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
 	best = NULL;
 	best_size = ~0UL;
 
-	drm_mm_for_each_hole(entry, mm, adj_start, adj_end) {
+	__drm_mm_for_each_hole(entry, mm, adj_start, adj_end,
+			       flags & DRM_MM_SEARCH_BELOW) {
 		if (adj_start < start)
 			adj_start = start;
 		if (adj_end > end)
@@ -488,7 +502,7 @@ struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
 		if (!check_free_hole(adj_start, adj_end, size, alignment))
 			continue;
 
-		if (!best_match)
+		if ((flags & DRM_MM_SEARCH_BEST) == 0)
 			return entry;
 
 		if (entry->size < best_size) {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bbc3beb..6806bb9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3131,7 +3131,9 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space, node,
 						  size, alignment,
-						  obj->cache_level, 0, gtt_max);
+						  obj->cache_level, 0, gtt_max,
+						  DRM_MM_CREATE_DEFAULT,
+						  DRM_MM_SEARCH_DEFAULT);
 	if (ret) {
 		ret = i915_gem_evict_something(dev, size, alignment,
 					       obj->cache_level,
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 88591ef..8935710 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -41,6 +41,21 @@
 #include <linux/seq_file.h>
 #endif
 
+enum drm_mm_allocator_flags {
+	DRM_MM_CREATE_DEFAULT = 0,
+	DRM_MM_CREATE_ATOMIC = 1<<0,
+	DRM_MM_CREATE_TOP = 1<<1,
+};
+
+enum drm_mm_search_flags {
+	DRM_MM_SEARCH_DEFAULT = 0,
+	DRM_MM_SEARCH_BEST = 1<<0,
+	DRM_MM_SEARCH_BELOW = 1<<1,
+};
+
+#define DRM_MM_BOTTOMUP DRM_MM_CREATE_DEFAULT, DRM_MM_SEARCH_DEFAULT
+#define DRM_MM_TOPDOWN DRM_MM_CREATE_TOP, DRM_MM_SEARCH_BELOW
+
 struct drm_mm_node {
 	struct list_head node_list;
 	struct list_head hole_stack;
@@ -135,26 +150,37 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
 	     1 : 0; \
 	     entry = list_entry(entry->hole_stack.next, struct drm_mm_node, hole_stack))
 
+#define __drm_mm_for_each_hole(entry, mm, hole_start, hole_end, backwards) \
+	for (entry = list_entry((backwards) ? (mm)->hole_stack.prev : (mm)->hole_stack.next, struct drm_mm_node, hole_stack); \
+	     &entry->hole_stack != &(mm)->hole_stack ? \
+	     hole_start = drm_mm_hole_node_start(entry), \
+	     hole_end = drm_mm_hole_node_end(entry), \
+	     1 : 0; \
+	     entry = list_entry((backwards) ? entry->hole_stack.prev : entry->hole_stack.next, struct drm_mm_node, hole_stack))
+
 /*
  * Basic range manager support (drm_mm.c)
  */
-extern struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
-					       unsigned long start,
-					       unsigned long size,
-					       bool atomic);
-extern struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *node,
-						    unsigned long size,
-						    unsigned alignment,
-						    unsigned long color,
-						    int atomic);
-extern struct drm_mm_node *drm_mm_get_block_range_generic(
-						struct drm_mm_node *node,
-						unsigned long size,
-						unsigned alignment,
-						unsigned long color,
-						unsigned long start,
-						unsigned long end,
-						int atomic);
+extern struct drm_mm_node *
+drm_mm_create_block(struct drm_mm *mm,
+		    unsigned long start,
+		    unsigned long size,
+		    enum drm_mm_allocator_flags flags);
+extern struct drm_mm_node *
+drm_mm_get_block_generic(struct drm_mm_node *node,
+			 unsigned long size,
+			 unsigned alignment,
+			 unsigned long color,
+			 enum drm_mm_allocator_flags flags);
+extern struct drm_mm_node *
+drm_mm_get_block_range_generic(struct drm_mm_node *node,
+			       unsigned long size,
+			       unsigned alignment,
+			       unsigned long color,
+			       unsigned long start,
+			       unsigned long end,
+			       enum drm_mm_allocator_flags flags);
+
 static inline struct drm_mm_node *drm_mm_get_block(struct drm_mm_node *parent,
 						   unsigned long size,
 						   unsigned alignment)
@@ -165,7 +191,8 @@ static inline struct drm_mm_node *drm_mm_get_block_atomic(struct drm_mm_node *pa
 							  unsigned long size,
 							  unsigned alignment)
 {
-	return drm_mm_get_block_generic(parent, size, alignment, 0, 1);
+	return drm_mm_get_block_generic(parent, size, alignment, 0,
+					DRM_MM_CREATE_ATOMIC);
 }
 static inline struct drm_mm_node *drm_mm_get_block_range(
 						struct drm_mm_node *parent,
@@ -196,39 +223,41 @@ static inline struct drm_mm_node *drm_mm_get_block_atomic_range(
 						unsigned long end)
 {
 	return drm_mm_get_block_range_generic(parent, size, alignment, 0,
-						start, end, 1);
+						start, end,
+						DRM_MM_CREATE_ATOMIC);
 }
 
-extern int drm_mm_insert_node(struct drm_mm *mm,
-			      struct drm_mm_node *node,
-			      unsigned long size,
-			      unsigned alignment);
-extern int drm_mm_insert_node_in_range(struct drm_mm *mm,
-				       struct drm_mm_node *node,
-				       unsigned long size,
-				       unsigned alignment,
-				       unsigned long start,
-				       unsigned long end);
 extern int drm_mm_insert_node_generic(struct drm_mm *mm,
 				      struct drm_mm_node *node,
 				      unsigned long size,
 				      unsigned alignment,
-				      unsigned long color);
-extern int drm_mm_insert_node_in_range_generic(struct drm_mm *mm,
-				       struct drm_mm_node *node,
-				       unsigned long size,
-				       unsigned alignment,
-				       unsigned long color,
-				       unsigned long start,
-				       unsigned long end);
+				      unsigned long color,
+				      enum drm_mm_allocator_flags aflags,
+				      enum drm_mm_search_flags sflags);
+#define drm_mm_insert_node(mm, node, size, alignment) \
+	drm_mm_insert_node_generic(mm, node, size, alignment, 0, 0, 0)
+extern int
+drm_mm_insert_node_in_range_generic(struct drm_mm *mm,
+				    struct drm_mm_node *node,
+				    unsigned long size,
+				    unsigned alignment,
+				    unsigned long color,
+				    unsigned long start,
+				    unsigned long end,
+				    enum drm_mm_allocator_flags aflags,
+				    enum drm_mm_search_flags sflags);
+#define drm_mm_insert_node_in_range(mm, node, size, alignment, start, end) \
+	drm_mm_insert_node_in_range_generic(mm, node, size, alignment, 0, start, end, 0, 0)
 extern void drm_mm_put_block(struct drm_mm_node *cur);
 extern void drm_mm_remove_node(struct drm_mm_node *node);
 extern void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new);
-extern struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
-						      unsigned long size,
-						      unsigned alignment,
-						      unsigned long color,
-						      bool best_match);
+
+extern struct drm_mm_node *
+drm_mm_search_free_generic(const struct drm_mm *mm,
+			   unsigned long size,
+			   unsigned alignment,
+			   unsigned long color,
+			   enum drm_mm_search_flags flags);
 extern struct drm_mm_node *drm_mm_search_free_in_range_generic(
 						const struct drm_mm *mm,
 						unsigned long size,
@@ -236,13 +265,15 @@ extern struct drm_mm_node *drm_mm_search_free_in_range_generic(
 						unsigned long color,
 						unsigned long start,
 						unsigned long end,
-						bool best_match);
-static inline struct drm_mm_node *drm_mm_search_free(const struct drm_mm *mm,
-						     unsigned long size,
-						     unsigned alignment,
-						     bool best_match)
+						enum drm_mm_search_flags flags);
+
+static inline struct drm_mm_node *
+drm_mm_search_free(const struct drm_mm *mm,
+		   unsigned long size,
+		   unsigned alignment,
+		   enum drm_mm_search_flags flags)
 {
-	return drm_mm_search_free_generic(mm,size, alignment, 0, best_match);
+	return drm_mm_search_free_generic(mm, size, alignment, 0, flags);
 }
 static inline  struct drm_mm_node *drm_mm_search_free_in_range(
 						const struct drm_mm *mm,
@@ -250,18 +281,19 @@ static inline  struct drm_mm_node *drm_mm_search_free_in_range(
 						unsigned alignment,
 						unsigned long start,
 						unsigned long end,
-						bool best_match)
+						enum drm_mm_search_flags flags)
 {
 	return drm_mm_search_free_in_range_generic(mm, size, alignment, 0,
-						   start, end, best_match);
+						   start, end, flags);
 }
-static inline struct drm_mm_node *drm_mm_search_free_color(const struct drm_mm *mm,
-							   unsigned long size,
-							   unsigned alignment,
-							   unsigned long color,
-							   bool best_match)
+static inline struct drm_mm_node *
+drm_mm_search_free_color(const struct drm_mm *mm,
+			 unsigned long size,
+			 unsigned alignment,
+			 unsigned long color,
+			 enum drm_mm_search_flags flags)
 {
-	return drm_mm_search_free_generic(mm,size, alignment, color, best_match);
+	return drm_mm_search_free_generic(mm, size, alignment, color, flags);
 }
 static inline  struct drm_mm_node *drm_mm_search_free_in_range_color(
 						const struct drm_mm *mm,
@@ -270,10 +302,10 @@ static inline  struct drm_mm_node *drm_mm_search_free_in_range_color(
 						unsigned long color,
 						unsigned long start,
 						unsigned long end,
-						bool best_match)
+						enum drm_mm_search_flags flags)
 {
 	return drm_mm_search_free_in_range_generic(mm, size, alignment, color,
-						   start, end, best_match);
+						   start, end, flags);
 }
 extern int drm_mm_init(struct drm_mm *mm,
 		       unsigned long start,
-- 
1.8.3.1


* [PATCH 05/66] drm/i915: Don't clear gtt with 0 entries
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (3 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 06/66] drm/i915: Conditionally use guard page based on PPGTT Ben Widawsky
                   ` (62 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This will help keep the next patch cleaner: that patch will
conditionally clear the guard page, and will pass num_entries == 0 when
there is no guard page space to actually clear.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 216e7a1..0fce8d0 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -516,6 +516,9 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
 	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
 	int i;
 
+	if (num_entries == 0)
+		return;
+
 	if (WARN(num_entries > max_entries,
 		 "First entry = %d; Num entries = %d (max=%d)\n",
 		 first_entry, num_entries, max_entries))
@@ -546,6 +549,9 @@ static void i915_ggtt_clear_range(struct drm_device *dev,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
 {
+	if (num_entries == 0)
+		return;
+
 	intel_gtt_clear_range(first_entry, num_entries);
 }
 
-- 
1.8.3.1


* [PATCH 06/66] drm/i915: Conditionally use guard page based on PPGTT
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (4 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 05/66] drm/i915: Don't clear gtt with 0 entries Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-28 17:57   ` Jesse Barnes
  2013-06-27 23:30 ` [PATCH 07/66] drm/i915: Use drm_mm for PPGTT PDEs Ben Widawsky
                   ` (61 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The PPGTT PDEs serve as the guard page (as long as they remain at the
top), so we don't need yet another guard page. Note that there is a
potential issue if the aliasing PPGTT (and later, the default context)
relinquishes this part of the GGTT. We should be able to assert that
won't happen, however.

While there, add some comments for the setup_global_gtt function, which
has started getting complicated.

The reason I've opted to keep the guard_page argument is that in order
to support dri1 we call the setup function, and I didn't want to have
to clear the guard page in more than one location.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  3 ++-
 drivers/gpu/drm/i915/i915_gem.c     |  4 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c | 27 ++++++++++++++++++++++-----
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b709712..c677d6c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1852,7 +1852,8 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
-			       unsigned long mappable_end, unsigned long end);
+			       unsigned long mappable_end, unsigned long end,
+			       unsigned long guard_size);
 int i915_gem_gtt_init(struct drm_device *dev);
 static inline void i915_gem_chipset_flush(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6806bb9..629e047 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -158,8 +158,8 @@ i915_gem_init_ioctl(struct drm_device *dev, void *data,
 
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_setup_global_gtt(dev, args->gtt_start, args->gtt_end,
-				  args->gtt_end);
-	dev_priv->gtt.mappable_end = args->gtt_end;
+				  args->gtt_end, PAGE_SIZE);
+	dev_priv->gtt.mappable_end = args->gtt_end - PAGE_SIZE;
 	mutex_unlock(&dev->struct_mutex);
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 0fce8d0..fb30d65 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -613,10 +613,23 @@ static void i915_gtt_color_adjust(struct drm_mm_node *node,
 			*end -= 4096;
 	}
 }
+
+/**
+ * i915_gem_setup_global_gtt() - set up an allocator for the global GTT with
+ * the given parameters and initialize all PTEs to point to the scratch page.
+ *
+ * @dev
+ * @start - first offset of managed GGTT space
+ * @mappable_end - Last offset of the aperture mapped region
+ * @end - Last offset that can be accessed by the allocator
+ * @guard_size - Size to initialize to scratch after end. (Currently only used
+ *		 for prefetching case)
+ */
 void i915_gem_setup_global_gtt(struct drm_device *dev,
 			       unsigned long start,
 			       unsigned long mappable_end,
-			       unsigned long end)
+			       unsigned long end,
+			       unsigned long guard_size)
 {
 	/* Let GEM Manage all of the aperture.
 	 *
@@ -634,8 +647,11 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	BUG_ON(mappable_end > end);
 
+	if (WARN_ON(guard_size & ~PAGE_MASK))
+		guard_size = round_up(guard_size, PAGE_SIZE);
+
 	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - PAGE_SIZE);
+	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - guard_size);
 	if (!HAS_LLC(dev))
 		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
 
@@ -665,7 +681,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	}
 
 	/* And finally clear the reserved guard page */
-	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
+	dev_priv->gtt.gtt_clear_range(dev, (end - guard_size) / PAGE_SIZE,
+				      guard_size / PAGE_SIZE);
 }
 
 static bool
@@ -700,7 +717,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 			gtt_size -= GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
 		}
 
-		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
+		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, 0);
 
 		ret = i915_gem_init_aliasing_ppgtt(dev);
 		if (!ret)
@@ -710,7 +727,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 		drm_mm_takedown(&dev_priv->mm.gtt_space);
 		gtt_size += GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
 	}
-	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
+	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, PAGE_SIZE);
 }
 
 static int setup_scratch_page(struct drm_device *dev)
-- 
1.8.3.1


* [PATCH 07/66] drm/i915: Use drm_mm for PPGTT PDEs
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (5 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 06/66] drm/i915: Conditionally use guard page based on PPGTT Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-28 18:01   ` Jesse Barnes
  2013-06-27 23:30 ` [PATCH 08/66] drm/i915: cleanup context fini Ben Widawsky
                   ` (60 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.

Since we intend to support full PPGTT, which means more than 1, and
PPGTTs can be created and destroyed ad hoc, we will need to use the
proper allocation techniques we already have.

The first step here is to make the existing single PPGTT use the
allocator.

v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)

v3: Use Chris' top down allocator

v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)

v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)
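
For the numbers: GEN6_PD_SIZE is 512 PDEs * 4KiB = 2MiB of GGTT address
space, and GEN6_PD_ALIGN of 16 pages corresponds to 16 GTT entries,
i.e. 16 * sizeof(gen6_gtt_pte_t) = 64 bytes within the GTT table, which
is where the "align PDEs to 64b" note in v2 comes from.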

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c | 45 ++++++++++++++++++++++++-------------
 2 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c677d6c..659b4aa 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -484,6 +484,7 @@ struct i915_gtt {
 #define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
 
 struct i915_hw_ppgtt {
+	struct drm_mm_node node;
 	struct drm_device *dev;
 	unsigned num_pd_entries;
 	struct page **pt_pages;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index fb30d65..5284dc5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -247,6 +247,8 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 {
 	int i;
 
+	drm_mm_remove_node(&ppgtt->node);
+
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
 			pci_unmap_page(ppgtt->dev->pdev,
@@ -263,16 +265,27 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 
 static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
+#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
+#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
 	struct drm_device *dev = ppgtt->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned first_pd_entry_in_global_pt;
 	int i;
 	int ret = -ENOMEM;
 
-	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
-	 * entries. For aliasing ppgtt support we just steal them at the end for
-	 * now. */
-	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
+	/* PPGTT PDEs reside in the GGTT stolen space, and consist of 512
+	 * entries. The allocator works in address space sizes, so it's
+	 * multiplied by page size. We allocate at the top of the GTT to avoid
+	 * fragmentation.
+	 */
+	BUG_ON(!drm_mm_initialized(&dev_priv->mm.gtt_space));
+	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space,
+						  &ppgtt->node, GEN6_PD_SIZE,
+						  GEN6_PD_ALIGN, 0,
+						  dev_priv->gtt.mappable_end,
+						  dev_priv->gtt.total,
+						  DRM_MM_TOPDOWN);
+	if (ret)
+		return ret;
 
 	if (IS_HASWELL(dev)) {
 		ppgtt->pte_encode = hsw_pte_encode;
@@ -288,8 +301,10 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->cleanup = gen6_ppgtt_cleanup;
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
 				  GFP_KERNEL);
-	if (!ppgtt->pt_pages)
+	if (!ppgtt->pt_pages) {
+		drm_mm_remove_node(&ppgtt->node);
 		return -ENOMEM;
+	}
 
 	for (i = 0; i < ppgtt->num_pd_entries; i++) {
 		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
@@ -319,7 +334,11 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	ppgtt->clear_range(ppgtt, 0,
 			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
 
-	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
+	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
+			 ppgtt->node.size >> 20,
+			 ppgtt->node.start / PAGE_SIZE);
+	ppgtt->pd_offset =
+		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
 
 	return 0;
 
@@ -336,6 +355,7 @@ err_pt_alloc:
 			__free_page(ppgtt->pt_pages[i]);
 	}
 	kfree(ppgtt->pt_pages);
+	drm_mm_remove_node(&ppgtt->node);
 
 	return ret;
 }
@@ -442,6 +462,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
 				      dev_priv->gtt.total / PAGE_SIZE);
 
+	if (dev_priv->mm.aliasing_ppgtt)
+		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
+
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
@@ -711,21 +734,13 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
 	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
 		int ret;
 
-		if (INTEL_INFO(dev)->gen <= 7) {
-			/* PPGTT pdes are stolen from global gtt ptes, so shrink the
-			 * aperture accordingly when using aliasing ppgtt. */
-			gtt_size -= GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
-		}
-
 		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, 0);
-
 		ret = i915_gem_init_aliasing_ppgtt(dev);
 		if (!ret)
 			return;
 
 		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
 		drm_mm_takedown(&dev_priv->mm.gtt_space);
-		gtt_size += GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
 	}
 	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, PAGE_SIZE);
 }
-- 
1.8.3.1


* [PATCH 08/66] drm/i915: cleanup context fini
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (6 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 07/66] drm/i915: Use drm_mm for PPGTT PDEs Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 09/66] drm/i915: Do a fuller init after reset Ben Widawsky
                   ` (59 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With the introduction of context refcounting we never explicitly
ref/unref the backing object, so the previous fix was a bit wonky.

Aside from fixing the above, this patch also puts us in good shape for
an upcoming patch which allows a failure to occur between context_init
and the first do_switch.

CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index ff47145..644df91 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -213,7 +213,6 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	 * may not be available. To avoid this we always pin the
 	 * default context.
 	 */
-	dev_priv->ring[RCS].default_context = ctx;
 	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
@@ -226,6 +225,8 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 		goto err_unpin;
 	}
 
+	dev_priv->ring[RCS].default_context = ctx;
+
 	DRM_DEBUG_DRIVER("Default HW context loaded\n");
 	return 0;
 
@@ -281,16 +282,24 @@ void i915_gem_context_fini(struct drm_device *dev)
 	 * other code, leading to spurious errors. */
 	intel_gpu_reset(dev);
 
-	i915_gem_object_unpin(dctx->obj);
-
 	/* When default context is created and switched to, base object refcount
 	 * will be 2 (+1 from object creation and +1 from do_switch()).
 	 * i915_gem_context_fini() will be called after gpu_idle() has switched
 	 * to default context. So we need to unreference the base object once
 	 * to offset the do_switch part, so that i915_gem_context_unreference()
 	 * can then free the base object correctly. */
-	drm_gem_object_unreference(&dctx->obj->base);
+	WARN_ON(!dev_priv->ring[RCS].last_context);
+	if (dev_priv->ring[RCS].last_context == dctx) {
+		/* Fake switch to NULL context */
+		WARN_ON(dctx->obj->active);
+		i915_gem_object_unpin(dctx->obj);
+		i915_gem_context_unreference(dctx);
+	}
+
+	i915_gem_object_unpin(dctx->obj);
 	i915_gem_context_unreference(dctx);
+	dev_priv->ring[RCS].default_context = NULL;
+	dev_priv->ring[RCS].last_context = NULL;
 }
 
 static int context_idr_cleanup(int id, void *p, void *data)
-- 
1.8.3.1


* [PATCH 09/66] drm/i915: Do a fuller init after reset
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (7 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 08/66] drm/i915: cleanup context fini Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 10/66] drm/i915: Split context enabling from init Ben Widawsky
                   ` (58 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

It's convenient to just call i915_gem_init_hw at reset because we'll be
adding new things to that function, and having just one function to
call instead of reimplementing it in two places is nice.

To accommodate this, we clean up the ringbuffers so that we can bring
them back up cleanly. Optionally, we could also tear down and
re-initialize the default context, but that was causing some problems
on reset which I wasn't able to fully debug, and it is unnecessary with
the previous context init/enable split.

This essentially reverts:
commit 8e88a2bd5987178d16d53686197404e149e996d9
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jun 19 18:40:00 2012 +0200

    drm/i915: don't call modeset_init_hw in i915_reset

It seems to work for me on ILK now. Perhaps it's due to:
commit 8a5c2ae753c588bcb2a4e38d1c6a39865dbf1ff3
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Mar 28 13:57:19 2013 -0700

    drm/i915: fix ILK GPU reset for render

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.c         | 29 ++++++++---------------------
 drivers/gpu/drm/i915/i915_gem.c         |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  3 ++-
 3 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index ef3a96f..30346ee 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -944,30 +944,17 @@ int i915_reset(struct drm_device *dev)
 	 */
 	if (drm_core_check_feature(dev, DRIVER_MODESET) ||
 			!dev_priv->mm.suspended) {
-		struct intel_ring_buffer *ring;
-		int i;
-
+		bool hw_contexts_disabled = dev_priv->hw_contexts_disabled;
 		dev_priv->mm.suspended = 0;
 
-		i915_gem_init_swizzling(dev);
-
-		for_each_ring(ring, dev_priv, i)
-			ring->init(ring);
-
-		i915_gem_context_init(dev);
-		if (dev_priv->mm.aliasing_ppgtt) {
-			ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
-			if (ret)
-				i915_gem_cleanup_aliasing_ppgtt(dev);
-		}
-
-		/*
-		 * It would make sense to re-init all the other hw state, at
-		 * least the rps/rc6/emon init done within modeset_init_hw. For
-		 * some unknown reason, this blows up my ilk, so don't.
-		 */
-
+		ret = i915_gem_init_hw(dev);
+		if (!hw_contexts_disabled && dev_priv->hw_contexts_disabled)
+			DRM_ERROR("HW contexts didn't survive reset\n");
 		mutex_unlock(&dev->struct_mutex);
+		if (ret) {
+			DRM_ERROR("Failed hw init on reset %d\n", ret);
+			return ret;
+		}
 
 		drm_irq_uninstall(dev);
 		drm_irq_install(dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 629e047..17fad0d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2274,6 +2274,8 @@ bool i915_gem_reset(struct drm_device *dev)
 	for_each_ring(ring, dev_priv, i)
 		ctx_banned |= i915_gem_reset_ring_lists(dev_priv, ring);
 
+	i915_gem_cleanup_ringbuffer(dev);
+
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e51ab55..901e0af 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1353,7 +1353,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
 	/* Disable the ring buffer. The ring must be idle at this point */
 	dev_priv = ring->dev->dev_private;
 	ret = intel_ring_idle(ring);
-	if (ret)
+	if (ret && !i915_reset_in_progress(&dev_priv->gpu_error))
 		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
 			  ring->name, ret);
 
@@ -1364,6 +1364,7 @@ void intel_cleanup_ring_buffer(struct intel_ring_buffer *ring)
 	i915_gem_object_unpin(ring->obj);
 	drm_gem_object_unreference(&ring->obj->base);
 	ring->obj = NULL;
+	ring->outstanding_lazy_request = 0;
 
 	if (ring->cleanup)
 		ring->cleanup(ring);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 10/66] drm/i915: Split context enabling from init
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (8 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 09/66] drm/i915: Do a fuller init after reset Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 11/66] drm/i915: destroy i915_gem_init_global_gtt Ben Widawsky
                   ` (57 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

We *need* to do this for exactly one reason: we want to embed a PPGTT
into the context, but we don't want to special-case the default context.

To achieve that, we must be able to initialize contexts after the GTT is
set up (so we can allocate and pin the default context's BO), but before
the PPGTT and rings are initialized. This is because, currently, context
initialization requires ring usage, and we don't have rings until after
the GTT is set up. If we split out the enabling part of context
initialization (the part requiring the ringbuffer), we can untangle this,
and then later embed the PPGTT.

Incidentally, this also lets us adhere to the original design of context
init/fini in future patches: they were only ever meant to be called at
driver load and unload.

v2: Move the hw_contexts_disabled test into i915_gem_context_enable()
(Chris)

v3: BUG_ON after checking for disabled contexts, or else it blows up
pre-gen6 (Ben)

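The resulting driver-load ordering, sketched with the function names from
the diff below (everything unrelated elided):

	static int i915_gem_init_order_sketch(struct drm_device *dev)
	{
		int ret;

		mutex_lock(&dev->struct_mutex);
		i915_gem_init_global_gtt(dev);	/* GTT up; no rings yet */
		i915_gem_context_init(dev);	/* alloc + pin default context BO */

		/* i915_gem_init_hw() brings up the rings, then calls the new
		 * ring-dependent half, i915_gem_context_enable(), which does
		 * the do_switch() to the default context. */
		ret = i915_gem_init_hw(dev);
		mutex_unlock(&dev->struct_mutex);

		return ret;
	}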
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem.c         | 15 ++++++++++++---
 drivers/gpu/drm/i915/i915_gem_context.c | 23 ++++++++++++-----------
 3 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 659b4aa..7c3ba90 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1814,6 +1814,7 @@ void i915_gem_restore_fences(struct drm_device *dev);
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
+int i915_gem_context_enable(struct drm_i915_private *dev_priv);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
 int i915_switch_context(struct intel_ring_buffer *ring,
 			struct drm_file *file, int to_id);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 17fad0d..64f8087 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4184,10 +4184,17 @@ i915_gem_init_hw(struct drm_device *dev)
 		return ret;
 
 	/*
-	 * XXX: There was some w/a described somewhere suggesting loading
-	 * contexts before PPGTT.
+	 * XXX: Contexts should only be initialized once. Doing a switch to the
+	 * default context switch however is something we'd like to do after
+	 * reset or thaw (the latter may not actually be necessary for HW, but
+	 * goes with our code better). Context switching requires rings (for
+	 * the do_switch), but before enabling PPGTT. So don't move this.
 	 */
-	i915_gem_context_init(dev);
+	if (i915_gem_context_enable(dev_priv)) {
+		i915_gem_context_fini(dev);
+		dev_priv->hw_contexts_disabled = true;
+	}
+
 	if (dev_priv->mm.aliasing_ppgtt) {
 		ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
 		if (ret) {
@@ -4215,6 +4222,8 @@ int i915_gem_init(struct drm_device *dev)
 
 	i915_gem_init_global_gtt(dev);
 
+	i915_gem_context_init(dev);
+
 	ret = i915_gem_init_hw(dev);
 	mutex_unlock(&dev->struct_mutex);
 	if (ret) {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 644df91..14bdf1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -219,19 +219,11 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 		goto err_destroy;
 	}
 
-	ret = do_switch(ctx);
-	if (ret) {
-		DRM_DEBUG_DRIVER("Switch failed %d\n", ret);
-		goto err_unpin;
-	}
-
 	dev_priv->ring[RCS].default_context = ctx;
 
 	DRM_DEBUG_DRIVER("Default HW context loaded\n");
 	return 0;
 
-err_unpin:
-	i915_gem_object_unpin(ctx->obj);
 err_destroy:
 	i915_gem_context_unreference(ctx);
 	return ret;
@@ -247,9 +239,10 @@ void i915_gem_context_init(struct drm_device *dev)
 		return;
 	}
 
-	/* If called from reset, or thaw... we've been here already */
-	if (dev_priv->hw_contexts_disabled ||
-	    dev_priv->ring[RCS].default_context)
+	/* Init should only be called once per module load. Eventually the
+	 * restriction on the context_disabled check can be loosened. */
+	if (WARN_ON(dev_priv->hw_contexts_disabled) ||
+	    WARN_ON(dev_priv->ring[RCS].default_context))
 		return;
 
 	dev_priv->hw_context_size = round_up(get_context_size(dev), 4096);
@@ -302,6 +295,14 @@ void i915_gem_context_fini(struct drm_device *dev)
 	dev_priv->ring[RCS].last_context = NULL;
 }
 
+int i915_gem_context_enable(struct drm_i915_private *dev_priv)
+{
+	if (dev_priv->hw_contexts_disabled)
+		return 0;
+	BUG_ON(!dev_priv->ring[RCS].default_context);
+	return do_switch(dev_priv->ring[RCS].default_context);
+}
+
 static int context_idr_cleanup(int id, void *p, void *data)
 {
 	struct i915_hw_context *ctx = p;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 11/66] drm/i915: destroy i915_gem_init_global_gtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (9 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 10/66] drm/i915: Split context enabling from init Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 12/66] drm/i915: Embed PPGTT into the context Ben Widawsky
                   ` (56 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In continuing to make the default context/aliasing PPGTT behave like any
other context/PPGTT pair, this patch sets us up by moving the
context/PPGTT init to a common location.

The resulting code isn't a huge improvement, but that will change in the
next patch (at least a bit).

In the process of doing this, make the ppgtt init function a bit more
generic.

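A hedged sketch of the i915_gem_init() flow this creates (condensed from
the diff below; error unwinding trimmed). Note that the two setup calls
differ only in the final guard_size argument (0 vs PAGE_SIZE), as posted:

	/* PPGTT-capable path first, GGTT-only as the fallback. */
	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
					  dev_priv->gtt.total, 0);
		i915_gem_context_init(dev);
		/* i915_gem_ppgtt_init() is now generic: the caller owns the
		 * allocation, which a later patch exploits to embed the
		 * PPGTT in the context. */
		ret = i915_gem_ppgtt_init(dev, ppgtt);
	}
	if (ret)
		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
					  dev_priv->gtt.total, PAGE_SIZE);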
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  4 +--
 drivers/gpu/drm/i915/i915_gem.c     | 36 ++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 63 +++++++++----------------------------
 3 files changed, 48 insertions(+), 55 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7c3ba90..1500fe4 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1839,20 +1839,20 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file);
 
 /* i915_gem_gtt.c */
+bool intel_enable_ppgtt(struct drm_device *dev);
+int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
 void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
 void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level);
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj);
-
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 				enum i915_cache_level cache_level);
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
-void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
 			       unsigned long mappable_end, unsigned long end,
 			       unsigned long guard_size);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 64f8087..4f46cf8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4209,7 +4209,7 @@ i915_gem_init_hw(struct drm_device *dev)
 int i915_gem_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	int ret;
+	int ret = -ENODEV;
 
 	mutex_lock(&dev->struct_mutex);
 
@@ -4220,16 +4220,42 @@ int i915_gem_init(struct drm_device *dev)
 			DRM_DEBUG_DRIVER("allow wake ack timed out\n");
 	}
 
-	i915_gem_init_global_gtt(dev);
+	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
+		struct i915_hw_ppgtt *ppgtt;
+		ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
+		if (!ppgtt)
+			goto ggtt_only;
+
+
+		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
+					  dev_priv->gtt.total, 0);
+		i915_gem_context_init(dev);
+		ret = i915_gem_ppgtt_init(dev, ppgtt);
+		if (ret)  {
+			kfree(ppgtt);
+			drm_mm_takedown(&dev_priv->mm.gtt_space);
+			goto ggtt_only;
+		}
 
-	i915_gem_context_init(dev);
+		dev_priv->mm.aliasing_ppgtt = ppgtt;
+	}
+
+ggtt_only:
+	if (ret) {
+		if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev))
+			DRM_DEBUG_DRIVER("Aliased PPGTT setup fail %d\n", ret);
+		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
+					  dev_priv->gtt.total, PAGE_SIZE);
+	}
 
 	ret = i915_gem_init_hw(dev);
-	mutex_unlock(&dev->struct_mutex);
 	if (ret) {
 		i915_gem_cleanup_aliasing_ppgtt(dev);
-		return ret;
+		i915_gem_context_fini(dev);
 	}
+	mutex_unlock(&dev->struct_mutex);
+	if (ret)
+		return ret;
 
 	/* Allow hardware batchbuffers unless told otherwise, but not for KMS. */
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5284dc5..6266b1a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -360,16 +360,11 @@ err_pt_alloc:
 	return ret;
 }
 
-static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
+int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_hw_ppgtt *ppgtt;
 	int ret;
 
-	ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
-	if (!ppgtt)
-		return -ENOMEM;
-
 	ppgtt->dev = dev;
 	ppgtt->scratch_page_dma_addr = dev_priv->gtt.scratch_page_dma;
 
@@ -378,11 +373,6 @@ static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
 	else
 		BUG();
 
-	if (ret)
-		kfree(ppgtt);
-	else
-		dev_priv->mm.aliasing_ppgtt = ppgtt;
-
 	return ret;
 }
 
@@ -637,6 +627,20 @@ static void i915_gtt_color_adjust(struct drm_mm_node *node,
 	}
 }
 
+bool intel_enable_ppgtt(struct drm_device *dev)
+{
+	if (i915_enable_ppgtt >= 0)
+		return i915_enable_ppgtt;
+
+#ifdef CONFIG_INTEL_IOMMU
+	/* Disable ppgtt on SNB if VT-d is on. */
+	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped)
+		return false;
+#endif
+
+	return true;
+}
+
 /**
  * i915_gem_setup_global_gtt() setup an allocator for the global GTT with the
  * given parameters and initialize all PTEs to point to the scratch page.
@@ -708,43 +712,6 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 				      guard_size / PAGE_SIZE);
 }
 
-static bool
-intel_enable_ppgtt(struct drm_device *dev)
-{
-	if (i915_enable_ppgtt >= 0)
-		return i915_enable_ppgtt;
-
-#ifdef CONFIG_INTEL_IOMMU
-	/* Disable ppgtt on SNB if VT-d is on. */
-	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped)
-		return false;
-#endif
-
-	return true;
-}
-
-void i915_gem_init_global_gtt(struct drm_device *dev)
-{
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long gtt_size, mappable_size;
-
-	gtt_size = dev_priv->gtt.total;
-	mappable_size = dev_priv->gtt.mappable_end;
-
-	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
-		int ret;
-
-		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, 0);
-		ret = i915_gem_init_aliasing_ppgtt(dev);
-		if (!ret)
-			return;
-
-		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
-		drm_mm_takedown(&dev_priv->mm.gtt_space);
-	}
-	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, PAGE_SIZE);
-}
-
 static int setup_scratch_page(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 12/66] drm/i915: Embed PPGTT into the context
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (10 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 11/66] drm/i915: destroy i915_gem_init_global_gtt Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 13/66] drm/i915: Unify PPGTT codepaths on gen6+ Ben Widawsky
                   ` (55 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

My long term vision is for contexts to have a 1:1 relationship with a
PPGTT. Sharing objects between address spaces would work similarly to
the flink/dmabuf model if needed.

The only current code to convert is the aliasing PPGTT. The aliasing
PPGTT is just the PPGTT for the default context.

The obvious downside is that until we actually do PPGTT switches, this
wastes a bit of memory; i.e. by the end of the series, it's a don't-care.
The other downside is that PPGTT can't work without contexts, which
*should* have already been the case except for debugging scenarios. (Note
this does break the potential to easily use contexts on GEN5, since I
believe our odds of running PPGTT on GEN5 are fairly close to 0.)

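The end state is small enough to spell out; a sketch assembled from this
and the following patches (the assignment appears verbatim in patch 15):

	struct i915_hw_context {
		/* ... existing members ... */
		struct i915_hw_ppgtt ppgtt;	/* embedded, no longer kzalloc'd */
	};

	/* The aliasing PPGTT becomes just a pointer into the default context: */
	dev_priv->mm.aliasing_ppgtt = &dev_priv->ring[RCS].default_context->ppgtt;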
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  2 ++
 drivers/gpu/drm/i915/i915_gem.c     | 14 ++++++++------
 drivers/gpu/drm/i915/i915_gem_gtt.c |  1 -
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1500fe4..5806d4d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -531,6 +531,8 @@ struct i915_hw_context {
 	struct intel_ring_buffer *ring;
 	struct drm_i915_gem_object *obj;
 	struct i915_ctx_hang_stats hang_stats;
+
+	struct i915_hw_ppgtt ppgtt;
 };
 
 enum no_fbc_reason {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 4f46cf8..f3d6059 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4222,17 +4222,19 @@ int i915_gem_init(struct drm_device *dev)
 
 	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
 		struct i915_hw_ppgtt *ppgtt;
-		ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
-		if (!ppgtt)
-			goto ggtt_only;
-
 
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
 					  dev_priv->gtt.total, 0);
 		i915_gem_context_init(dev);
+		if (dev_priv->hw_contexts_disabled) {
+			drm_mm_takedown(&dev_priv->mm.gtt_space);
+			goto ggtt_only;
+		}
+
+		ppgtt = &dev_priv->ring[RCS].default_context->ppgtt;
+
 		ret = i915_gem_ppgtt_init(dev, ppgtt);
-		if (ret)  {
-			kfree(ppgtt);
+		if (ret) {
 			drm_mm_takedown(&dev_priv->mm.gtt_space);
 			goto ggtt_only;
 		}
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6266b1a..87e5c7a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -260,7 +260,6 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 	for (i = 0; i < ppgtt->num_pd_entries; i++)
 		__free_page(ppgtt->pt_pages[i]);
 	kfree(ppgtt->pt_pages);
-	kfree(ppgtt);
 }
 
 static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 13/66] drm/i915: Unify PPGTT codepaths on gen6+
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (11 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 12/66] drm/i915: Embed PPGTT into the context Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 14/66] drm/i915: Move ppgtt initialization down Ben Widawsky
                   ` (54 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This will allow us to use the same code paths whether or not PPGTT is
actually turned on. It does everything short of setting the bits that
tell the HW to use PPGTT.

This patch also helps tie together contexts and PPGTT in the next patch:
that patch wants to disable contexts if there is no PPGTT, and we disable
PPGTT on gen6 with VT-d. Since Mesa depends on gen6+ having HW contexts,
this is a requirement.

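For reference, intel_enable_ppgtt() (exported in patch 11) keys off the
i915_enable_ppgtt module parameter: a non-negative value forces the
user's 0/1 choice, while the default (-1) means on, except for gen6 with
VT-d active. With this patch a false result no longer skips the PPGTT
setup entirely; it only gates the hardware enable bit, roughly:

	/* Page tables, contexts, etc. are set up unconditionally; only the
	 * GFX_MODE write that turns PPGTT on in hardware is gated. */
	if (intel_enable_ppgtt(dev))
		I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));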
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c     | 3 ++-
 drivers/gpu/drm/i915/i915_gem.c     | 6 +++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 +++-
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index adb319b..cb08907 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -977,7 +977,8 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		value = HAS_LLC(dev);
 		break;
 	case I915_PARAM_HAS_ALIASING_PPGTT:
-		value = dev_priv->mm.aliasing_ppgtt ? 1 : 0;
+		if (intel_enable_ppgtt(dev) && dev_priv->mm.aliasing_ppgtt)
+			value = 1;
 		break;
 	case I915_PARAM_HAS_WAIT_TIMEOUT:
 		value = 1;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f3d6059..a337ce1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4220,7 +4220,11 @@ int i915_gem_init(struct drm_device *dev)
 			DRM_DEBUG_DRIVER("allow wake ack timed out\n");
 	}
 
-	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
+	/* NB: In order to keep the code paths for all platforms with PPGTT the
+	 * same, we run through this next section regardless, but don't actually
+	 * enable the PPGTT via GFX_MODE.
+	 */
+	if (HAS_ALIASING_PPGTT(dev)) {
 		struct i915_hw_ppgtt *ppgtt;
 
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 87e5c7a..16a8486 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -154,7 +154,9 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
 		ecochk = I915_READ(GAM_ECOCHK);
 		I915_WRITE(GAM_ECOCHK, ecochk | ECOCHK_SNB_BIT |
 				       ECOCHK_PPGTT_CACHE64B);
-		I915_WRITE(GFX_MODE, _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
+		if (intel_enable_ppgtt(dev))
+			I915_WRITE(GFX_MODE,
+				   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 	} else if (INTEL_INFO(dev)->gen >= 7) {
 		uint32_t ecochk, ecobits;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 14/66] drm/i915: Move ppgtt initialization down
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (12 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 13/66] drm/i915: Unify PPGTT codepaths on gen6+ Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 15/66] drm/i915: Tie context to PPGTT Ben Widawsky
                   ` (53 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This will allow us to check whether the PPGTT was initialized not by the
pointer, but by the values of its function pointers.

This will be used in the next patch to determine, at context destruction,
whether the PPGTT needs cleanup; that situation should only occur in
error cases.

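Concretely, with the vfuncs assigned only after every allocation has
succeeded, the next patch can test for a live PPGTT at context
destruction like so (this exact check appears in its diff):

	/* cleanup is non-NULL only if i915_gem_ppgtt_init() ran to
	 * completion, so a partially-initialized PPGTT is never torn down. */
	if (ctx->ppgtt.cleanup)
		ctx->ppgtt.cleanup(&ctx->ppgtt);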
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 16a8486..f56e75b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -295,30 +295,26 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	} else {
 		ppgtt->pte_encode = gen6_pte_encode;
 	}
-	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
-	ppgtt->enable = gen6_ppgtt_enable;
-	ppgtt->clear_range = gen6_ppgtt_clear_range;
-	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
-	ppgtt->cleanup = gen6_ppgtt_cleanup;
-	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
+
+	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*GEN6_PPGTT_PD_ENTRIES,
 				  GFP_KERNEL);
 	if (!ppgtt->pt_pages) {
 		drm_mm_remove_node(&ppgtt->node);
 		return -ENOMEM;
 	}
 
-	for (i = 0; i < ppgtt->num_pd_entries; i++) {
+	for (i = 0; i < GEN6_PPGTT_PD_ENTRIES; i++) {
 		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
 		if (!ppgtt->pt_pages[i])
 			goto err_pt_alloc;
 	}
 
-	ppgtt->pt_dma_addr = kzalloc(sizeof(dma_addr_t) *ppgtt->num_pd_entries,
+	ppgtt->pt_dma_addr = kzalloc(sizeof(dma_addr_t) * GEN6_PPGTT_PD_ENTRIES,
 				     GFP_KERNEL);
 	if (!ppgtt->pt_dma_addr)
 		goto err_pt_alloc;
 
-	for (i = 0; i < ppgtt->num_pd_entries; i++) {
+	for (i = 0; i < GEN6_PPGTT_PD_ENTRIES; i++) {
 		dma_addr_t pt_addr;
 
 		pt_addr = pci_map_page(dev->pdev, ppgtt->pt_pages[i], 0, 4096,
@@ -332,6 +328,12 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 		ppgtt->pt_dma_addr[i] = pt_addr;
 	}
 
+	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
+	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->clear_range = gen6_ppgtt_clear_range;
+	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
+	ppgtt->cleanup = gen6_ppgtt_cleanup;
+
 	ppgtt->clear_range(ppgtt, 0,
 			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 15/66] drm/i915: Tie context to PPGTT
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (13 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 14/66] drm/i915: Move ppgtt initialization down Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 16/66] drm/i915: Really share scratch page Ben Widawsky
                   ` (52 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This is the second half of the previous patch: now if PPGTT fails to come
up, contexts are disabled, and we don't even try to bring up contexts
when we don't have PPGTT support.

NB: PPGTT cleanup is now done in context unreference.

v2: Only clean up if we set a cleanup vfunc. Don't clean the PPGTT up on
the failure paths; leave it to unref.

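Assembled from the hunks below, the teardown now rides the context's
kref; once the last reference goes away, the embedded PPGTT dies with its
context:

	void i915_gem_context_free(struct kref *ctx_ref)
	{
		struct i915_hw_context *ctx = container_of(ctx_ref,
							   typeof(*ctx), ref);

		if (ctx->ppgtt.cleanup)			/* see previous patch */
			ctx->ppgtt.cleanup(&ctx->ppgtt);
		drm_gem_object_unreference(&ctx->obj->base);
		kfree(ctx);
	}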
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c         |  2 --
 drivers/gpu/drm/i915/i915_drv.h         |  4 +--
 drivers/gpu/drm/i915/i915_gem.c         | 49 ++++++++++++---------------------
 drivers/gpu/drm/i915/i915_gem_context.c |  8 ++++++
 drivers/gpu/drm/i915/i915_gem_gtt.c     | 12 --------
 5 files changed, 28 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index cb08907..b675dc7 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1363,7 +1363,6 @@ cleanup_gem:
 	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
-	i915_gem_cleanup_aliasing_ppgtt(dev);
 	drm_mm_takedown(&dev_priv->mm.gtt_space);
 cleanup_irq:
 	drm_irq_uninstall(dev);
@@ -1748,7 +1747,6 @@ int i915_driver_unload(struct drm_device *dev)
 		i915_gem_cleanup_ringbuffer(dev);
 		i915_gem_context_fini(dev);
 		mutex_unlock(&dev->struct_mutex);
-		i915_gem_cleanup_aliasing_ppgtt(dev);
 		i915_gem_cleanup_stolen(dev);
 
 		if (!I915_NEED_GFX_HWS(dev))
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5806d4d..a7f8111 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1454,8 +1454,9 @@ struct drm_i915_file_private {
 #define HAS_LLC(dev)            (INTEL_INFO(dev)->has_llc)
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
-#define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6)
 #define HAS_ALIASING_PPGTT(dev)	(INTEL_INFO(dev)->gen >=6 && !IS_VALLEYVIEW(dev))
+#define HAS_HW_CONTEXTS(dev)	(INTEL_INFO(dev)->gen >= 6 && \
+				 HAS_ALIASING_PPGTT(dev))
 
 #define HAS_OVERLAY(dev)		(INTEL_INFO(dev)->has_overlay)
 #define OVERLAY_NEEDS_PHYSICAL(dev)	(INTEL_INFO(dev)->overlay_needs_physical)
@@ -1843,7 +1844,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 /* i915_gem_gtt.c */
 bool intel_enable_ppgtt(struct drm_device *dev);
 int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
-void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev);
 void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a337ce1..c96b422 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4190,20 +4190,20 @@ i915_gem_init_hw(struct drm_device *dev)
 	 * goes with our code better). Context switching requires rings (for
 	 * the do_switch), but before enabling PPGTT. So don't move this.
 	 */
-	if (i915_gem_context_enable(dev_priv)) {
-		i915_gem_context_fini(dev);
-		dev_priv->hw_contexts_disabled = true;
-	}
+	ret = i915_gem_context_enable(dev_priv);
+	if (ret || !dev_priv->mm.aliasing_ppgtt)
+		goto disable_ctx_out;
 
-	if (dev_priv->mm.aliasing_ppgtt) {
-		ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
-		if (ret) {
-			i915_gem_cleanup_aliasing_ppgtt(dev);
-			DRM_INFO("PPGTT enable failed. This is not fatal, but unexpected\n");
-		}
-	}
+	ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
+	if (ret)
+		goto disable_ctx_out;
 
 	return 0;
+
+disable_ctx_out:
+	i915_gem_context_fini(dev);
+	dev_priv->hw_contexts_disabled = true;
+	return ret;
 }
 
 int i915_gem_init(struct drm_device *dev)
@@ -4224,9 +4224,7 @@ int i915_gem_init(struct drm_device *dev)
 	 * same, we run through this next section regardless, but don't actually
 	 * enable the PPGTT via GFX_MODE.
 	 */
-	if (HAS_ALIASING_PPGTT(dev)) {
-		struct i915_hw_ppgtt *ppgtt;
-
+	if (HAS_HW_CONTEXTS(dev)) {
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
 					  dev_priv->gtt.total, 0);
 		i915_gem_context_init(dev);
@@ -4234,31 +4232,20 @@ int i915_gem_init(struct drm_device *dev)
 			drm_mm_takedown(&dev_priv->mm.gtt_space);
 			goto ggtt_only;
 		}
-
-		ppgtt = &dev_priv->ring[RCS].default_context->ppgtt;
-
-		ret = i915_gem_ppgtt_init(dev, ppgtt);
-		if (ret) {
-			drm_mm_takedown(&dev_priv->mm.gtt_space);
-			goto ggtt_only;
-		}
-
-		dev_priv->mm.aliasing_ppgtt = ppgtt;
-	}
+	} else
+		dev_priv->hw_contexts_disabled = true;
 
 ggtt_only:
-	if (ret) {
-		if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev))
-			DRM_DEBUG_DRIVER("Aliased PPGTT setup fail %d\n", ret);
+	if (!dev_priv->mm.aliasing_ppgtt) {
+		if (HAS_HW_CONTEXTS(dev))
+			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
 					  dev_priv->gtt.total, PAGE_SIZE);
 	}
 
 	ret = i915_gem_init_hw(dev);
-	if (ret) {
-		i915_gem_cleanup_aliasing_ppgtt(dev);
+	if (ret)
 		i915_gem_context_fini(dev);
-	}
 	mutex_unlock(&dev->struct_mutex);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 14bdf1d..d92f121 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -129,6 +129,8 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 
+	if (ctx->ppgtt.cleanup)
+		ctx->ppgtt.cleanup(&ctx->ppgtt);
 	drm_gem_object_unreference(&ctx->obj->base);
 	kfree(ctx);
 }
@@ -167,6 +169,10 @@ create_hw_context(struct drm_device *dev,
 	 */
 	ctx->ring = &dev_priv->ring[RCS];
 
+	ret = i915_gem_ppgtt_init(dev, &ctx->ppgtt);
+	if (ret)
+		goto err_out;
+
 	/* Default context will never have a file_priv */
 	if (file_priv == NULL)
 		return ctx;
@@ -220,6 +226,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	}
 
 	dev_priv->ring[RCS].default_context = ctx;
+	dev_priv->mm.aliasing_ppgtt = &ctx->ppgtt;
 
 	DRM_DEBUG_DRIVER("Default HW context loaded\n");
 	return 0;
@@ -293,6 +300,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 	i915_gem_context_unreference(dctx);
 	dev_priv->ring[RCS].default_context = NULL;
 	dev_priv->ring[RCS].last_context = NULL;
+	dev_priv->mm.aliasing_ppgtt = NULL;
 }
 
 int i915_gem_context_enable(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index f56e75b..7bbd5c1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -379,18 +379,6 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 	return ret;
 }
 
-void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
-{
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
-
-	if (!ppgtt)
-		return;
-
-	ppgtt->cleanup(ppgtt);
-	dev_priv->mm.aliasing_ppgtt = NULL;
-}
-
 void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 16/66] drm/i915: Really share scratch page
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (14 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 15/66] drm/i915: Tie context to PPGTT Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 17/66] drm/i915: Combine scratch members into a struct Ben Widawsky
                   ` (51 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

A previous patch set up the ppgtt and ggtt to use the same scratch page,
but still kept both pointers around. Kill the duplicate; it's not needed
and gets in our way for upcoming cleanups.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     | 1 -
 drivers/gpu/drm/i915/i915_gem_gtt.c | 5 ++---
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a7f8111..632f23e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -490,7 +490,6 @@ struct i915_hw_ppgtt {
 	struct page **pt_pages;
 	uint32_t pd_offset;
 	dma_addr_t *pt_dma_addr;
-	dma_addr_t scratch_page_dma_addr;
 
 	/* pte functions, mirroring the interface of the global gtt. */
 	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7bbd5c1..5313d96 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -190,13 +190,14 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 				   unsigned first_entry,
 				   unsigned num_entries)
 {
+	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
 	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 	unsigned last_pte, i;
 
 	scratch_pte = ppgtt->pte_encode(ppgtt->dev,
-					ppgtt->scratch_page_dma_addr,
+					dev_priv->gtt.scratch_page_dma,
 					I915_CACHE_LLC);
 
 	while (num_entries) {
@@ -365,11 +366,9 @@ err_pt_alloc:
 
 int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	int ret;
 
 	ppgtt->dev = dev;
-	ppgtt->scratch_page_dma_addr = dev_priv->gtt.scratch_page_dma;
 
 	if (INTEL_INFO(dev)->gen < 8)
 		ret = gen6_ppgtt_init(ppgtt);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 17/66] drm/i915: Combine scratch members into a struct
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (15 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 16/66] drm/i915: Really share scratch page Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 18/66] drm/i915: Drop dev from pte_encode Ben Widawsky
                   ` (50 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

There isn't any special reason to do this other than it makes it obvious
that the two members are connected.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  6 ++++--
 drivers/gpu/drm/i915/i915_gem_gtt.c | 17 ++++++++---------
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 632f23e..229a5d7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -462,8 +462,10 @@ struct i915_gtt {
 	void __iomem *gsm;
 
 	bool do_idle_maps;
-	dma_addr_t scratch_page_dma;
-	struct page *scratch_page;
+	struct {
+		dma_addr_t addr;
+		struct page *page;
+	} scratch;
 
 	/* global gtt ops */
 	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 5313d96..42e80b4 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -197,7 +197,7 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 	unsigned last_pte, i;
 
 	scratch_pte = ppgtt->pte_encode(ppgtt->dev,
-					dev_priv->gtt.scratch_page_dma,
+					dev_priv->gtt.scratch.addr,
 					I915_CACHE_LLC);
 
 	while (num_entries) {
@@ -527,8 +527,7 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = dev_priv->gtt.pte_encode(dev,
-					       dev_priv->gtt.scratch_page_dma,
+	scratch_pte = dev_priv->gtt.pte_encode(dev, dev_priv->gtt.scratch.addr,
 					       I915_CACHE_LLC);
 	for (i = 0; i < num_entries; i++)
 		iowrite32(scratch_pte, &gtt_base[i]);
@@ -722,8 +721,8 @@ static int setup_scratch_page(struct drm_device *dev)
 #else
 	dma_addr = page_to_phys(page);
 #endif
-	dev_priv->gtt.scratch_page = page;
-	dev_priv->gtt.scratch_page_dma = dma_addr;
+	dev_priv->gtt.scratch.page = page;
+	dev_priv->gtt.scratch.addr = dma_addr;
 
 	return 0;
 }
@@ -731,11 +730,11 @@ static int setup_scratch_page(struct drm_device *dev)
 static void teardown_scratch_page(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	set_pages_wb(dev_priv->gtt.scratch_page, 1);
-	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch_page_dma,
+	set_pages_wb(dev_priv->gtt.scratch.page, 1);
+	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
 		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	put_page(dev_priv->gtt.scratch_page);
-	__free_page(dev_priv->gtt.scratch_page);
+	put_page(dev_priv->gtt.scratch.page);
+	__free_page(dev_priv->gtt.scratch.page);
 }
 
 static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 18/66] drm/i915: Drop dev from pte_encode
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (16 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 17/66] drm/i915: Combine scratch members into a struct Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 19/66] drm/i915: Use gtt shortform where possible Ben Widawsky
                   ` (49 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The original pte_encode function needed the dev argument so we could do
platform-specific handling via IS_GENX, etc. With the merging of a pte
encoding function, there should never be a need to quirk away
gen-specific details.

The patch doesn't do much, but it makes the upcoming reworks in
gtt/ppgtt/mm slightly (albeit ever so slightly) easier.

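After this patch a call site no longer threads dev through; e.g. the
scratch-PTE encode from the diff below:

	gen6_gtt_pte_t scratch_pte;

	/* The vfunc was picked per platform at probe time, so no dev is
	 * needed at the call site. */
	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
					       I915_CACHE_LLC);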
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  6 ++----
 drivers/gpu/drm/i915/i915_gem_gtt.c | 21 ++++++++-------------
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 229a5d7..efd244d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -479,8 +479,7 @@ struct i915_gtt {
 				   struct sg_table *st,
 				   unsigned int pg_start,
 				   enum i915_cache_level cache_level);
-	gen6_gtt_pte_t (*pte_encode)(struct drm_device *dev,
-				     dma_addr_t addr,
+	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
 };
 #define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
@@ -501,8 +500,7 @@ struct i915_hw_ppgtt {
 			       struct sg_table *st,
 			       unsigned int pg_start,
 			       enum i915_cache_level cache_level);
-	gen6_gtt_pte_t (*pte_encode)(struct drm_device *dev,
-				     dma_addr_t addr,
+	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
 	int (*enable)(struct drm_device *dev);
 	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 42e80b4..746b649 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -45,8 +45,7 @@
 #define GEN6_PTE_CACHE_LLC_MLC		(3 << 1)
 #define GEN6_PTE_ADDR_ENCODE(addr)	GEN6_GTT_ADDR_ENCODE(addr)
 
-static gen6_gtt_pte_t gen6_pte_encode(struct drm_device *dev,
-				      dma_addr_t addr,
+static gen6_gtt_pte_t gen6_pte_encode(dma_addr_t addr,
 				      enum i915_cache_level level)
 {
 	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
@@ -72,8 +71,7 @@ static gen6_gtt_pte_t gen6_pte_encode(struct drm_device *dev,
 #define BYT_PTE_WRITEABLE		(1 << 1)
 #define BYT_PTE_SNOOPED_BY_CPU_CACHES	(1 << 2)
 
-static gen6_gtt_pte_t byt_pte_encode(struct drm_device *dev,
-				     dma_addr_t addr,
+static gen6_gtt_pte_t byt_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
 	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
@@ -90,8 +88,7 @@ static gen6_gtt_pte_t byt_pte_encode(struct drm_device *dev,
 	return pte;
 }
 
-static gen6_gtt_pte_t hsw_pte_encode(struct drm_device *dev,
-				     dma_addr_t addr,
+static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 				     enum i915_cache_level level)
 {
 	gen6_gtt_pte_t pte = GEN6_PTE_VALID;
@@ -196,8 +193,7 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 	unsigned last_pte, i;
 
-	scratch_pte = ppgtt->pte_encode(ppgtt->dev,
-					dev_priv->gtt.scratch.addr,
+	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
 					I915_CACHE_LLC);
 
 	while (num_entries) {
@@ -233,8 +229,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
 		dma_addr_t page_addr;
 
 		page_addr = sg_page_iter_dma_address(&sg_iter);
-		pt_vaddr[act_pte] = ppgtt->pte_encode(ppgtt->dev, page_addr,
-						      cache_level);
+		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
 		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
 			kunmap_atomic(pt_vaddr);
 			act_pt++;
@@ -486,7 +481,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_page_iter_dma_address(&sg_iter);
-		iowrite32(dev_priv->gtt.pte_encode(dev, addr, level),
+		iowrite32(dev_priv->gtt.pte_encode(addr, level),
 			  &gtt_entries[i]);
 		i++;
 	}
@@ -499,7 +494,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 	 */
 	if (i != 0)
 		WARN_ON(readl(&gtt_entries[i-1])
-			!= dev_priv->gtt.pte_encode(dev, addr, level));
+			!= dev_priv->gtt.pte_encode(addr, level));
 
 	/* This next bit makes the above posting read even more important. We
 	 * want to flush the TLBs only after we're certain all the PTE updates
@@ -527,7 +522,7 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = dev_priv->gtt.pte_encode(dev, dev_priv->gtt.scratch.addr,
+	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
 					       I915_CACHE_LLC);
 	for (i = 0; i < num_entries; i++)
 		iowrite32(scratch_pte, &gtt_base[i]);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 19/66] drm/i915: Use gtt shortform where possible
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (17 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 18/66] drm/i915: Drop dev from pte_encode Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 20/66] drm/i915: Move fbc members out of line Ben Widawsky
                   ` (48 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Just for compactness.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 36 +++++++++++++++---------------------
 1 file changed, 15 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 746b649..bb4ccb5 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -841,34 +841,28 @@ int i915_gem_gtt_init(struct drm_device *dev)
 	int ret;
 
 	if (INTEL_INFO(dev)->gen <= 5) {
-		dev_priv->gtt.gtt_probe = i915_gmch_probe;
-		dev_priv->gtt.gtt_remove = i915_gmch_remove;
+		gtt->gtt_probe = i915_gmch_probe;
+		gtt->gtt_remove = i915_gmch_remove;
 	} else {
-		dev_priv->gtt.gtt_probe = gen6_gmch_probe;
-		dev_priv->gtt.gtt_remove = gen6_gmch_remove;
-		if (IS_HASWELL(dev)) {
-			dev_priv->gtt.pte_encode = hsw_pte_encode;
-		} else if (IS_VALLEYVIEW(dev)) {
-			dev_priv->gtt.pte_encode = byt_pte_encode;
-		} else {
-			dev_priv->gtt.pte_encode = gen6_pte_encode;
-		}
+		gtt->gtt_probe = gen6_gmch_probe;
+		gtt->gtt_remove = gen6_gmch_remove;
+		if (IS_HASWELL(dev))
+			gtt->pte_encode = hsw_pte_encode;
+		else if (IS_VALLEYVIEW(dev))
+			gtt->pte_encode = byt_pte_encode;
+		else
+			gtt->pte_encode = gen6_pte_encode;
 	}
 
-	ret = dev_priv->gtt.gtt_probe(dev, &dev_priv->gtt.total,
-				     &dev_priv->gtt.stolen_size,
-				     &gtt->mappable_base,
-				     &gtt->mappable_end);
+	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
+			     &gtt->mappable_base, &gtt->mappable_end);
 	if (ret)
 		return ret;
 
 	/* GMADR is the PCI mmio aperture into the global GTT. */
-	DRM_INFO("Memory usable by graphics device = %zdM\n",
-		 dev_priv->gtt.total >> 20);
-	DRM_DEBUG_DRIVER("GMADR size = %ldM\n",
-			 dev_priv->gtt.mappable_end >> 20);
-	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n",
-			 dev_priv->gtt.stolen_size >> 20);
+	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
+	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
+	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
 
 	return 0;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 20/66] drm/i915: Move fbc members out of line
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (18 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 19/66] drm/i915: Use gtt shortform where possible Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 13:10   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
                   ` (47 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    |  2 +-
 drivers/gpu/drm/i915/i915_drv.h        | 48 +++++++++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_stolen.c | 20 +++++++-------
 drivers/gpu/drm/i915/intel_display.c   |  6 ++---
 drivers/gpu/drm/i915/intel_drv.h       |  7 -----
 drivers/gpu/drm/i915/intel_pm.c        | 49 +++++++++++++++++-----------------
 6 files changed, 67 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d4e78b6..e654bf4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1444,7 +1444,7 @@ static int i915_fbc_status(struct seq_file *m, void *unused)
 		seq_printf(m, "FBC enabled\n");
 	} else {
 		seq_printf(m, "FBC disabled: ");
-		switch (dev_priv->no_fbc_reason) {
+		switch (dev_priv->fbc.no_fbc_reason) {
 		case FBC_NO_OUTPUT:
 			seq_printf(m, "no outputs");
 			break;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index efd244d..21cf593 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -534,17 +534,35 @@ struct i915_hw_context {
 	struct i915_hw_ppgtt ppgtt;
 };
 
-enum no_fbc_reason {
-	FBC_NO_OUTPUT, /* no outputs enabled to compress */
-	FBC_STOLEN_TOO_SMALL, /* not enough space to hold compressed buffers */
-	FBC_UNSUPPORTED_MODE, /* interlace or doublescanned mode */
-	FBC_MODE_TOO_LARGE, /* mode too large for compression */
-	FBC_BAD_PLANE, /* fbc not supported on plane */
-	FBC_NOT_TILED, /* buffer not tiled */
-	FBC_MULTIPLE_PIPES, /* more than one pipe active */
-	FBC_MODULE_PARAM,
+struct i915_fbc {
+	unsigned long size;
+	unsigned int fb_id;
+	enum plane plane;
+	int y;
+
+	struct drm_mm_node *compressed_fb;
+	struct drm_mm_node *compressed_llb;
+
+	struct intel_fbc_work {
+		struct delayed_work work;
+		struct drm_crtc *crtc;
+		struct drm_framebuffer *fb;
+		int interval;
+	} *fbc_work;
+
+	enum {
+		FBC_NO_OUTPUT, /* no outputs enabled to compress */
+		FBC_STOLEN_TOO_SMALL, /* not enough space for buffers */
+		FBC_UNSUPPORTED_MODE, /* interlace or doublescanned mode */
+		FBC_MODE_TOO_LARGE, /* mode too large for compression */
+		FBC_BAD_PLANE, /* fbc not supported on plane */
+		FBC_NOT_TILED, /* buffer not tiled */
+		FBC_MULTIPLE_PIPES, /* more than one pipe active */
+		FBC_MODULE_PARAM,
+	} no_fbc_reason;
 };
 
+
 enum intel_pch {
 	PCH_NONE = 0,	/* No PCH present */
 	PCH_IBX,	/* Ibexpeak PCH */
@@ -1064,12 +1082,7 @@ typedef struct drm_i915_private {
 
 	int num_plane;
 
-	unsigned long cfb_size;
-	unsigned int cfb_fb;
-	enum plane cfb_plane;
-	int cfb_y;
-	struct intel_fbc_work *fbc_work;
-
+	struct i915_fbc fbc;
 	struct intel_opregion opregion;
 	struct intel_vbt_data vbt;
 
@@ -1147,11 +1160,6 @@ typedef struct drm_i915_private {
 	/* Haswell power well */
 	struct i915_power_well power_well;
 
-	enum no_fbc_reason no_fbc_reason;
-
-	struct drm_mm_node *compressed_fb;
-	struct drm_mm_node *compressed_llb;
-
 	struct i915_gpu_error gpu_error;
 
 	struct drm_i915_gem_object *vlv_pctx;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index f713294..8e02344 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -120,7 +120,7 @@ static int i915_setup_compression(struct drm_device *dev, int size)
 		if (!compressed_llb)
 			goto err_fb;
 
-		dev_priv->compressed_llb = compressed_llb;
+		dev_priv->fbc.compressed_llb = compressed_llb;
 
 		I915_WRITE(FBC_CFB_BASE,
 			   dev_priv->mm.stolen_base + compressed_fb->start);
@@ -128,8 +128,8 @@ static int i915_setup_compression(struct drm_device *dev, int size)
 			   dev_priv->mm.stolen_base + compressed_llb->start);
 	}
 
-	dev_priv->compressed_fb = compressed_fb;
-	dev_priv->cfb_size = size;
+	dev_priv->fbc.compressed_fb = compressed_fb;
+	dev_priv->fbc.size = size;
 
 	DRM_DEBUG_KMS("reserved %d bytes of contiguous stolen space for FBC\n",
 		      size);
@@ -150,7 +150,7 @@ int i915_gem_stolen_setup_compression(struct drm_device *dev, int size)
 	if (dev_priv->mm.stolen_base == 0)
 		return -ENODEV;
 
-	if (size < dev_priv->cfb_size)
+	if (size < dev_priv->fbc.size)
 		return 0;
 
 	/* Release any current block */
@@ -163,16 +163,16 @@ void i915_gem_stolen_cleanup_compression(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (dev_priv->cfb_size == 0)
+	if (dev_priv->fbc.size == 0)
 		return;
 
-	if (dev_priv->compressed_fb)
-		drm_mm_put_block(dev_priv->compressed_fb);
+	if (dev_priv->fbc.compressed_fb)
+		drm_mm_put_block(dev_priv->fbc.compressed_fb);
 
-	if (dev_priv->compressed_llb)
-		drm_mm_put_block(dev_priv->compressed_llb);
+	if (dev_priv->fbc.compressed_llb)
+		drm_mm_put_block(dev_priv->fbc.compressed_llb);
 
-	dev_priv->cfb_size = 0;
+	dev_priv->fbc.size = 0;
 }
 
 void i915_gem_cleanup_stolen(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 8d075b1f..f056eca 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3391,7 +3391,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
 	intel_crtc_wait_for_pending_flips(crtc);
 	drm_vblank_off(dev, pipe);
 
-	if (dev_priv->cfb_plane == plane)
+	if (dev_priv->fbc.plane == plane)
 		intel_disable_fbc(dev);
 
 	intel_crtc_update_cursor(crtc, false);
@@ -3464,7 +3464,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
 	drm_vblank_off(dev, pipe);
 
 	/* FBC must be disabled before disabling the plane on HSW. */
-	if (dev_priv->cfb_plane == plane)
+	if (dev_priv->fbc.plane == plane)
 		intel_disable_fbc(dev);
 
 	hsw_disable_ips(intel_crtc);
@@ -3705,7 +3705,7 @@ static void i9xx_crtc_disable(struct drm_crtc *crtc)
 	intel_crtc_wait_for_pending_flips(crtc);
 	drm_vblank_off(dev, pipe);
 
-	if (dev_priv->cfb_plane == plane)
+	if (dev_priv->fbc.plane == plane)
 		intel_disable_fbc(dev);
 
 	intel_crtc_dpms_overlay(intel_crtc, false);
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index ffe9d35..af68861 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -548,13 +548,6 @@ struct intel_unpin_work {
 	bool enable_stall_check;
 };
 
-struct intel_fbc_work {
-	struct delayed_work work;
-	struct drm_crtc *crtc;
-	struct drm_framebuffer *fb;
-	int interval;
-};
-
 int intel_pch_rawclk(struct drm_device *dev);
 
 int intel_connector_update_modes(struct drm_connector *connector,
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index b27bda0..d32734d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -86,7 +86,7 @@ static void i8xx_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	int plane, i;
 	u32 fbc_ctl, fbc_ctl2;
 
-	cfb_pitch = dev_priv->cfb_size / FBC_LL_SIZE;
+	cfb_pitch = dev_priv->fbc.size / FBC_LL_SIZE;
 	if (fb->pitches[0] < cfb_pitch)
 		cfb_pitch = fb->pitches[0];
 
@@ -325,7 +325,7 @@ static void intel_fbc_work_fn(struct work_struct *__work)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	mutex_lock(&dev->struct_mutex);
-	if (work == dev_priv->fbc_work) {
+	if (work == dev_priv->fbc.fbc_work) {
 		/* Double check that we haven't switched fb without cancelling
 		 * the prior work.
 		 */
@@ -333,12 +333,12 @@ static void intel_fbc_work_fn(struct work_struct *__work)
 			dev_priv->display.enable_fbc(work->crtc,
 						     work->interval);
 
-			dev_priv->cfb_plane = to_intel_crtc(work->crtc)->plane;
-			dev_priv->cfb_fb = work->crtc->fb->base.id;
-			dev_priv->cfb_y = work->crtc->y;
+			dev_priv->fbc.plane = to_intel_crtc(work->crtc)->plane;
+			dev_priv->fbc.fb_id = work->crtc->fb->base.id;
+			dev_priv->fbc.y = work->crtc->y;
 		}
 
-		dev_priv->fbc_work = NULL;
+		dev_priv->fbc.fbc_work = NULL;
 	}
 	mutex_unlock(&dev->struct_mutex);
 
@@ -347,25 +347,25 @@ static void intel_fbc_work_fn(struct work_struct *__work)
 
 static void intel_cancel_fbc_work(struct drm_i915_private *dev_priv)
 {
-	if (dev_priv->fbc_work == NULL)
+	if (dev_priv->fbc.fbc_work == NULL)
 		return;
 
 	DRM_DEBUG_KMS("cancelling pending FBC enable\n");
 
 	/* Synchronisation is provided by struct_mutex and checking of
-	 * dev_priv->fbc_work, so we can perform the cancellation
+	 * dev_priv->fbc.fbc_work, so we can perform the cancellation
 	 * entirely asynchronously.
 	 */
-	if (cancel_delayed_work(&dev_priv->fbc_work->work))
+	if (cancel_delayed_work(&dev_priv->fbc.fbc_work->work))
 		/* tasklet was killed before being run, clean up */
-		kfree(dev_priv->fbc_work);
+		kfree(dev_priv->fbc.fbc_work);
 
 	/* Mark the work as no longer wanted so that if it does
 	 * wake-up (because the work was already running and waiting
 	 * for our mutex), it will discover that is no longer
 	 * necessary to run.
 	 */
-	dev_priv->fbc_work = NULL;
+	dev_priv->fbc.fbc_work = NULL;
 }
 
 void intel_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
@@ -390,7 +390,7 @@ void intel_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	work->interval = interval;
 	INIT_DELAYED_WORK(&work->work, intel_fbc_work_fn);
 
-	dev_priv->fbc_work = work;
+	dev_priv->fbc.fbc_work = work;
 
 	DRM_DEBUG_KMS("scheduling delayed FBC enable\n");
 
@@ -418,7 +418,7 @@ void intel_disable_fbc(struct drm_device *dev)
 		return;
 
 	dev_priv->display.disable_fbc(dev);
-	dev_priv->cfb_plane = -1;
+	dev_priv->fbc.plane = -1;
 }
 
 /**
@@ -471,7 +471,8 @@ void intel_update_fbc(struct drm_device *dev)
 		    !to_intel_crtc(tmp_crtc)->primary_disabled) {
 			if (crtc) {
 				DRM_DEBUG_KMS("more than one pipe active, disabling compression\n");
-				dev_priv->no_fbc_reason = FBC_MULTIPLE_PIPES;
+				dev_priv->fbc.no_fbc_reason =
+					FBC_MULTIPLE_PIPES;
 				goto out_disable;
 			}
 			crtc = tmp_crtc;
@@ -480,7 +481,7 @@ void intel_update_fbc(struct drm_device *dev)
 
 	if (!crtc || crtc->fb == NULL) {
 		DRM_DEBUG_KMS("no output, disabling\n");
-		dev_priv->no_fbc_reason = FBC_NO_OUTPUT;
+		dev_priv->fbc.no_fbc_reason = FBC_NO_OUTPUT;
 		goto out_disable;
 	}
 
@@ -498,14 +499,14 @@ void intel_update_fbc(struct drm_device *dev)
 	}
 	if (!enable_fbc) {
 		DRM_DEBUG_KMS("fbc disabled per module param\n");
-		dev_priv->no_fbc_reason = FBC_MODULE_PARAM;
+		dev_priv->fbc.no_fbc_reason = FBC_MODULE_PARAM;
 		goto out_disable;
 	}
 	if ((crtc->mode.flags & DRM_MODE_FLAG_INTERLACE) ||
 	    (crtc->mode.flags & DRM_MODE_FLAG_DBLSCAN)) {
 		DRM_DEBUG_KMS("mode incompatible with compression, "
 			      "disabling\n");
-		dev_priv->no_fbc_reason = FBC_UNSUPPORTED_MODE;
+		dev_priv->fbc.no_fbc_reason = FBC_UNSUPPORTED_MODE;
 		goto out_disable;
 	}
 
@@ -519,13 +520,13 @@ void intel_update_fbc(struct drm_device *dev)
 	if ((crtc->mode.hdisplay > max_hdisplay) ||
 	    (crtc->mode.vdisplay > max_vdisplay)) {
 		DRM_DEBUG_KMS("mode too large for compression, disabling\n");
-		dev_priv->no_fbc_reason = FBC_MODE_TOO_LARGE;
+		dev_priv->fbc.no_fbc_reason = FBC_MODE_TOO_LARGE;
 		goto out_disable;
 	}
 	if ((IS_I915GM(dev) || IS_I945GM(dev) || IS_HASWELL(dev)) &&
 	    intel_crtc->plane != 0) {
 		DRM_DEBUG_KMS("plane not 0, disabling compression\n");
-		dev_priv->no_fbc_reason = FBC_BAD_PLANE;
+		dev_priv->fbc.no_fbc_reason = FBC_BAD_PLANE;
 		goto out_disable;
 	}
 
@@ -535,7 +536,7 @@ void intel_update_fbc(struct drm_device *dev)
 	if (obj->tiling_mode != I915_TILING_X ||
 	    obj->fence_reg == I915_FENCE_REG_NONE) {
 		DRM_DEBUG_KMS("framebuffer not tiled or fenced, disabling compression\n");
-		dev_priv->no_fbc_reason = FBC_NOT_TILED;
+		dev_priv->fbc.no_fbc_reason = FBC_NOT_TILED;
 		goto out_disable;
 	}
 
@@ -545,7 +546,7 @@ void intel_update_fbc(struct drm_device *dev)
 
 	if (i915_gem_stolen_setup_compression(dev, intel_fb->obj->base.size)) {
 		DRM_DEBUG_KMS("framebuffer too large, disabling compression\n");
-		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
+		dev_priv->fbc.no_fbc_reason = FBC_STOLEN_TOO_SMALL;
 		goto out_disable;
 	}
 
@@ -554,9 +555,9 @@ void intel_update_fbc(struct drm_device *dev)
 	 * cannot be unpinned (and have its GTT offset and fence revoked)
 	 * without first being decoupled from the scanout and FBC disabled.
 	 */
-	if (dev_priv->cfb_plane == intel_crtc->plane &&
-	    dev_priv->cfb_fb == fb->base.id &&
-	    dev_priv->cfb_y == crtc->y)
+	if (dev_priv->fbc.plane == intel_crtc->plane &&
+	    dev_priv->fbc.fb_id == fb->base.id &&
+	    dev_priv->fbc.y == crtc->y)
 		return;
 
 	if (intel_fbc_enabled(dev)) {
-- 
1.8.3.1
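
For reference, the hunks above replace the scattered dev_priv->cfb_*
and fbc_* fields with one grouped struct. A rough reconstruction of
that struct, assembled from the field accesses visible in the diff
(types are best-guess assumptions; the real definition lands in
i915_drv.h, which this excerpt does not show):

struct i915_fbc {
	unsigned long size;
	unsigned int fb_id;
	enum plane plane;
	int y;

	struct drm_mm_node *compressed_fb;	/* carved out of stolen */
	struct drm_mm_node *compressed_llb;

	struct intel_fbc_work {			/* moved from intel_drv.h */
		struct delayed_work work;
		struct drm_crtc *crtc;
		struct drm_framebuffer *fb;
		int interval;
	} *fbc_work;

	enum no_fbc_reason no_fbc_reason;
};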


* [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (19 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 20/66] drm/i915: Move fbc members out of line Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 13:12   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 22/66] drm/i915: Move gtt_mtrr to i915_gtt Ben Widawsky
                   ` (46 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The GTT and PPGTT can be thought of more generally as GPU address
spaces. Many of their actions (insert entries), state (LRU lists), and
characteristics (size) can be shared. Do that.

Created an i915_gtt_vm helper macro since, for now, we always want the
regular GTT address space. Eventually we'll wean ourselves off of using
this except in cases where we obviously want the GGTT (like display).
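
As a minimal sketch of the subclassing idiom this introduces (the
to_hw_ppgtt() helper below is hypothetical, not part of the patch; the
diff open-codes the same container_of() in gen6_ppgtt_clear_range()
and gen6_ppgtt_insert_entries()):

static inline struct i915_hw_ppgtt *
to_hw_ppgtt(struct i915_address_space *vm)
{
	/* Recover the derived type from the embedded base member. */
	return container_of(vm, struct i915_hw_ppgtt, base);
}

The i915_gtt_vm macro goes the other direction, treating the base
embedded in dev_priv->gtt as a plain i915_address_space pointer.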

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
 drivers/gpu/drm/i915/i915_drv.h     |  48 ++++++------
 drivers/gpu/drm/i915/i915_gem.c     |   8 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 145 +++++++++++++++++++-----------------
 4 files changed, 110 insertions(+), 95 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e654bf4..c10a690 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   count, size);
 
 	seq_printf(m, "%zu [%lu] gtt total\n",
-		   dev_priv->gtt.total,
-		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
+		   i915_gtt_vm->total,
+		   dev_priv->gtt.mappable_end - i915_gtt_vm->start);
 
 	seq_printf(m, "\n");
 	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 21cf593..7f4c9b6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -442,6 +442,28 @@ enum i915_cache_level {
 
 typedef uint32_t gen6_gtt_pte_t;
 
+struct i915_address_space {
+	struct drm_device *dev;
+	unsigned long start;		/* Start offset always 0 for dri2 */
+	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
+
+	struct {
+		dma_addr_t addr;
+		struct page *page;
+	} scratch;
+
+	/* FIXME: Need a more generic return type */
+	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
+				     enum i915_cache_level level);
+	void (*clear_range)(struct i915_address_space *i915_mm,
+			    unsigned int first_entry,
+			    unsigned int num_entries);
+	void (*insert_entries)(struct i915_address_space *i915_mm,
+			       struct sg_table *st,
+			       unsigned int first_entry,
+			       enum i915_cache_level cache_level);
+};
+
 /* The Graphics Translation Table is the way in which GEN hardware translates a
  * Graphics Virtual Address into a Physical Address. In addition to the normal
  * collateral associated with any va->pa translations GEN hardware also has a
@@ -450,8 +472,7 @@ typedef uint32_t gen6_gtt_pte_t;
  * the spec.
  */
 struct i915_gtt {
-	unsigned long start;		/* Start offset of used GTT */
-	size_t total;			/* Total size GTT can map */
+	struct i915_address_space base;
 	size_t stolen_size;		/* Total size of stolen memory */
 
 	unsigned long mappable_end;	/* End offset that we can CPU map */
@@ -472,34 +493,17 @@ struct i915_gtt {
 			  size_t *stolen, phys_addr_t *mappable_base,
 			  unsigned long *mappable_end);
 	void (*gtt_remove)(struct drm_device *dev);
-	void (*gtt_clear_range)(struct drm_device *dev,
-				unsigned int first_entry,
-				unsigned int num_entries);
-	void (*gtt_insert_entries)(struct drm_device *dev,
-				   struct sg_table *st,
-				   unsigned int pg_start,
-				   enum i915_cache_level cache_level);
-	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
-				     enum i915_cache_level level);
 };
-#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
+#define i915_gtt_vm ((struct i915_address_space *)&(dev_priv->gtt.base))
 
 struct i915_hw_ppgtt {
+	struct i915_address_space base;
 	struct drm_mm_node node;
-	struct drm_device *dev;
 	unsigned num_pd_entries;
 	struct page **pt_pages;
 	uint32_t pd_offset;
 	dma_addr_t *pt_dma_addr;
 
-	/* pte functions, mirroring the interface of the global gtt. */
-	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
-			    unsigned int first_entry,
-			    unsigned int num_entries);
-	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
-			       struct sg_table *st,
-			       unsigned int pg_start,
-			       enum i915_cache_level cache_level);
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
 	int (*enable)(struct drm_device *dev);
@@ -1123,7 +1127,7 @@ typedef struct drm_i915_private {
 	enum modeset_restore modeset_restore;
 	struct mutex modeset_restore_lock;
 
-	struct i915_gtt gtt;
+	struct i915_gtt gtt; /* VM representing the global address space */
 
 	struct i915_gem_mm mm;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c96b422..e31ed47 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			pinned += obj->gtt_space->size;
 	mutex_unlock(&dev->struct_mutex);
 
-	args->aper_size = dev_priv->gtt.total;
+	args->aper_size = i915_gtt_vm->total;
 	args->aper_available_size = args->aper_size - pinned;
 
 	return 0;
@@ -3083,7 +3083,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	size_t gtt_max = map_and_fenceable ?
-		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
+		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
 	int ret;
 
 	fence_size = i915_gem_get_gtt_size(dev,
@@ -4226,7 +4226,7 @@ int i915_gem_init(struct drm_device *dev)
 	 */
 	if (HAS_HW_CONTEXTS(dev)) {
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
-					  dev_priv->gtt.total, 0);
+					  i915_gtt_vm->total, 0);
 		i915_gem_context_init(dev);
 		if (dev_priv->hw_contexts_disabled) {
 			drm_mm_takedown(&dev_priv->mm.gtt_space);
@@ -4240,7 +4240,7 @@ ggtt_only:
 		if (HAS_HW_CONTEXTS(dev))
 			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
-					  dev_priv->gtt.total, PAGE_SIZE);
+					  i915_gtt_vm->total, PAGE_SIZE);
 	}
 
 	ret = i915_gem_init_hw(dev);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index bb4ccb5..6de75c7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 
 static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
 {
-	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
+	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
 	gen6_gtt_pte_t __iomem *pd_addr;
 	uint32_t pd_entry;
 	int i;
@@ -183,18 +183,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
 }
 
 /* PPGTT support for Sandybridge/Gen6 and later */
-static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
+static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 				   unsigned first_entry,
 				   unsigned num_entries)
 {
-	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
 	unsigned last_pte, i;
 
-	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
-					I915_CACHE_LLC);
+	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
 
 	while (num_entries) {
 		last_pte = first_pte + num_entries;
@@ -214,11 +214,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
 	}
 }
 
-static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
+static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 				      struct sg_table *pages,
 				      unsigned first_entry,
 				      enum i915_cache_level cache_level)
 {
+	struct i915_hw_ppgtt *ppgtt =
+		container_of(vm, struct i915_hw_ppgtt, base);
 	gen6_gtt_pte_t *pt_vaddr;
 	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
 	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
@@ -229,7 +231,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
 		dma_addr_t page_addr;
 
 		page_addr = sg_page_iter_dma_address(&sg_iter);
-		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
+		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
 		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
 			kunmap_atomic(pt_vaddr);
 			act_pt++;
@@ -243,14 +245,14 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
 
 static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 {
+	struct i915_address_space *vm = &ppgtt->base;
 	int i;
 
 	drm_mm_remove_node(&ppgtt->node);
 
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
-			pci_unmap_page(ppgtt->dev->pdev,
-				       ppgtt->pt_dma_addr[i],
+			pci_unmap_page(vm->dev->pdev, ppgtt->pt_dma_addr[i],
 				       4096, PCI_DMA_BIDIRECTIONAL);
 	}
 
@@ -264,7 +266,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 {
 #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
 #define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
-	struct drm_device *dev = ppgtt->dev;
+	struct i915_address_space *vm = &ppgtt->base;
+	struct drm_device *dev = vm->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int i;
 	int ret = -ENOMEM;
@@ -279,21 +282,22 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 						  &ppgtt->node, GEN6_PD_SIZE,
 						  GEN6_PD_ALIGN, 0,
 						  dev_priv->gtt.mappable_end,
-						  dev_priv->gtt.total,
+						  i915_gtt_vm->total,
 						  DRM_MM_TOPDOWN);
 	if (ret)
 		return ret;
 
 	if (IS_HASWELL(dev)) {
-		ppgtt->pte_encode = hsw_pte_encode;
+		vm->pte_encode = hsw_pte_encode;
 	} else if (IS_VALLEYVIEW(dev)) {
-		ppgtt->pte_encode = byt_pte_encode;
+		vm->pte_encode = byt_pte_encode;
 	} else {
-		ppgtt->pte_encode = gen6_pte_encode;
+		vm->pte_encode = gen6_pte_encode;
 	}
 
 	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*GEN6_PPGTT_PD_ENTRIES,
 				  GFP_KERNEL);
+
 	if (!ppgtt->pt_pages) {
 		drm_mm_remove_node(&ppgtt->node);
 		return -ENOMEM;
@@ -326,12 +330,15 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
-	ppgtt->clear_range = gen6_ppgtt_clear_range;
-	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
 	ppgtt->cleanup = gen6_ppgtt_cleanup;
 
-	ppgtt->clear_range(ppgtt, 0,
-			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
+	vm->clear_range = gen6_ppgtt_clear_range;
+	vm->insert_entries = gen6_ppgtt_insert_entries;
+	vm->start = 0;
+	vm->total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
+	vm->scratch = dev_priv->gtt.base.scratch;
+
+	vm->clear_range(vm, 0, ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
 
 	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
 			 ppgtt->node.size >> 20,
@@ -363,7 +370,7 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 {
 	int ret;
 
-	ppgtt->dev = dev;
+	ppgtt->base.dev = dev;
 
 	if (INTEL_INFO(dev)->gen < 8)
 		ret = gen6_ppgtt_init(ppgtt);
@@ -377,17 +384,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
 {
-	ppgtt->insert_entries(ppgtt, obj->pages,
-			      obj->gtt_space->start >> PAGE_SHIFT,
-			      cache_level);
+	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
+				   obj->gtt_space->start >> PAGE_SHIFT,
+				   cache_level);
 }
 
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
-	ppgtt->clear_range(ppgtt,
-			   obj->gtt_space->start >> PAGE_SHIFT,
-			   obj->base.size >> PAGE_SHIFT);
+	ppgtt->base.clear_range(&ppgtt->base,
+				obj->gtt_space->start >> PAGE_SHIFT,
+				obj->base.size >> PAGE_SHIFT);
 }
 
 extern int intel_iommu_gfx_mapped;
@@ -434,8 +441,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	struct drm_i915_gem_object *obj;
 
 	/* First fill our portion of the GTT with scratch pages */
-	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
-				      dev_priv->gtt.total / PAGE_SIZE);
+	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
+				       i915_gtt_vm->start / PAGE_SIZE,
+				       i915_gtt_vm->total / PAGE_SIZE);
 
 	if (dev_priv->mm.aliasing_ppgtt)
 		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
@@ -467,12 +475,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
  * within the global GTT as well as accessible by the GPU through the GMADR
  * mapped BAR (dev_priv->mm.gtt->gtt).
  */
-static void gen6_ggtt_insert_entries(struct drm_device *dev,
+static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     unsigned int first_entry,
 				     enum i915_cache_level level)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = vm->dev->dev_private;
 	gen6_gtt_pte_t __iomem *gtt_entries =
 		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
 	int i = 0;
@@ -481,8 +489,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 
 	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
 		addr = sg_page_iter_dma_address(&sg_iter);
-		iowrite32(dev_priv->gtt.pte_encode(addr, level),
-			  &gtt_entries[i]);
+		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
 		i++;
 	}
 
@@ -493,8 +500,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 	 * hardware should work, we must keep this posting read for paranoia.
 	 */
 	if (i != 0)
-		WARN_ON(readl(&gtt_entries[i-1])
-			!= dev_priv->gtt.pte_encode(addr, level));
+		WARN_ON(readl(&gtt_entries[i-1]) !=
+			vm->pte_encode(addr, level));
 
 	/* This next bit makes the above posting read even more important. We
 	 * want to flush the TLBs only after we're certain all the PTE updates
@@ -504,14 +511,14 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
 	POSTING_READ(GFX_FLSH_CNTL_GEN6);
 }
 
-static void gen6_ggtt_clear_range(struct drm_device *dev,
+static void gen6_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = vm->dev->dev_private;
 	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
 		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
-	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
+	const int max_entries = (vm->total >> PAGE_SHIFT) - first_entry;
 	int i;
 
 	if (num_entries == 0)
@@ -522,15 +529,15 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
 		 first_entry, num_entries, max_entries))
 		num_entries = max_entries;
 
-	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
-					       I915_CACHE_LLC);
+	scratch_pte = vm->pte_encode(vm->scratch.addr,
+					  I915_CACHE_LLC);
 	for (i = 0; i < num_entries; i++)
 		iowrite32(scratch_pte, &gtt_base[i]);
 	readl(gtt_base);
 }
 
 
-static void i915_ggtt_insert_entries(struct drm_device *dev,
+static void i915_ggtt_insert_entries(struct i915_address_space *vm,
 				     struct sg_table *st,
 				     unsigned int pg_start,
 				     enum i915_cache_level cache_level)
@@ -542,7 +549,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
 
 }
 
-static void i915_ggtt_clear_range(struct drm_device *dev,
+static void i915_ggtt_clear_range(struct i915_address_space *vm,
 				  unsigned int first_entry,
 				  unsigned int num_entries)
 {
@@ -559,9 +566,9 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
-					 obj->gtt_space->start >> PAGE_SHIFT,
-					 cache_level);
+	i915_gtt_vm->insert_entries(&dev_priv->gtt.base, obj->pages,
+					  obj->gtt_space->start >> PAGE_SHIFT,
+					  cache_level);
 
 	obj->has_global_gtt_mapping = 1;
 }
@@ -571,9 +578,9 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	dev_priv->gtt.gtt_clear_range(obj->base.dev,
-				      obj->gtt_space->start >> PAGE_SHIFT,
-				      obj->base.size >> PAGE_SHIFT);
+	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
+				       obj->gtt_space->start >> PAGE_SHIFT,
+				       obj->base.size >> PAGE_SHIFT);
 
 	obj->has_global_gtt_mapping = 0;
 }
@@ -679,21 +686,21 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 		obj->has_global_gtt_mapping = 1;
 	}
 
-	dev_priv->gtt.start = start;
-	dev_priv->gtt.total = end - start;
+	i915_gtt_vm->start = start;
+	i915_gtt_vm->total = end - start;
 
 	/* Clear any non-preallocated blocks */
 	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
 			     hole_start, hole_end) {
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
-					      (hole_end-hole_start) / PAGE_SIZE);
+		i915_gtt_vm->clear_range(i915_gtt_vm, hole_start / PAGE_SIZE,
+				     (hole_end-hole_start) / PAGE_SIZE);
 	}
 
 	/* And finally clear the reserved guard page */
-	dev_priv->gtt.gtt_clear_range(dev, (end - guard_size) / PAGE_SIZE,
-				      guard_size / PAGE_SIZE);
+	i915_gtt_vm->clear_range(i915_gtt_vm, (end - guard_size) / PAGE_SIZE,
+				 guard_size / PAGE_SIZE);
 }
 
 static int setup_scratch_page(struct drm_device *dev)
@@ -716,8 +723,8 @@ static int setup_scratch_page(struct drm_device *dev)
 #else
 	dma_addr = page_to_phys(page);
 #endif
-	dev_priv->gtt.scratch.page = page;
-	dev_priv->gtt.scratch.addr = dma_addr;
+	i915_gtt_vm->scratch.page = page;
+	i915_gtt_vm->scratch.addr = dma_addr;
 
 	return 0;
 }
@@ -725,11 +732,12 @@ static int setup_scratch_page(struct drm_device *dev)
 static void teardown_scratch_page(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	set_pages_wb(dev_priv->gtt.scratch.page, 1);
-	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
+
+	set_pages_wb(i915_gtt_vm->scratch.page, 1);
+	pci_unmap_page(dev->pdev, i915_gtt_vm->scratch.addr,
 		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	put_page(dev_priv->gtt.scratch.page);
-	__free_page(dev_priv->gtt.scratch.page);
+	put_page(i915_gtt_vm->scratch.page);
+	__free_page(i915_gtt_vm->scratch.page);
 }
 
 static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -792,8 +800,8 @@ static int gen6_gmch_probe(struct drm_device *dev,
 	if (ret)
 		DRM_ERROR("Scratch setup failed\n");
 
-	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
-	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
+	i915_gtt_vm->clear_range = gen6_ggtt_clear_range;
+	i915_gtt_vm->insert_entries = gen6_ggtt_insert_entries;
 
 	return ret;
 }
@@ -823,8 +831,8 @@ static int i915_gmch_probe(struct drm_device *dev,
 	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
-	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
-	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
+	i915_gtt_vm->clear_range = i915_ggtt_clear_range;
+	i915_gtt_vm->insert_entries = i915_ggtt_insert_entries;
 
 	return 0;
 }
@@ -847,20 +855,23 @@ int i915_gem_gtt_init(struct drm_device *dev)
 		gtt->gtt_probe = gen6_gmch_probe;
 		gtt->gtt_remove = gen6_gmch_remove;
 		if (IS_HASWELL(dev))
-			gtt->pte_encode = hsw_pte_encode;
+			gtt->base.pte_encode = hsw_pte_encode;
 		else if (IS_VALLEYVIEW(dev))
-			gtt->pte_encode = byt_pte_encode;
+			gtt->base.pte_encode = byt_pte_encode;
 		else
-			gtt->pte_encode = gen6_pte_encode;
+			gtt->base.pte_encode = gen6_pte_encode;
 	}
 
-	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
+	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
 			     &gtt->mappable_base, &gtt->mappable_end);
 	if (ret)
 		return ret;
 
+	gtt->base.dev = dev;
+
 	/* GMADR is the PCI mmio aperture into the global GTT. */
-	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
+	DRM_INFO("Memory usable by graphics device = %zdM\n",
+		 gtt->base.total >> 20);
 	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
 	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
 
-- 
1.8.3.1


* [PATCH 22/66] drm/i915: Move gtt_mtrr to i915_gtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (20 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 23/66] drm/i915: Move stolen stuff " Ben Widawsky
                   ` (45 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/mm.gtt_mtrr/gtt.mtrr/" $file;
done

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c | 8 ++++----
 drivers/gpu/drm/i915/i915_drv.h | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index b675dc7..3535ced 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1558,8 +1558,8 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 		goto out_rmmap;
 	}
 
-	dev_priv->mm.gtt_mtrr = arch_phys_wc_add(dev_priv->gtt.mappable_base,
-						 aperture_size);
+	dev_priv->gtt.mtrr = arch_phys_wc_add(dev_priv->gtt.mappable_base,
+					      aperture_size);
 
 	/* The i915 workqueue is primarily used for batched retirement of
 	 * requests (and thus managing bo) once the task has been completed
@@ -1667,7 +1667,7 @@ out_gem_unload:
 	intel_teardown_mchbar(dev);
 	destroy_workqueue(dev_priv->wq);
 out_mtrrfree:
-	arch_phys_wc_del(dev_priv->mm.gtt_mtrr);
+	arch_phys_wc_del(dev_priv->gtt.mtrr);
 	io_mapping_free(dev_priv->gtt.mappable);
 	dev_priv->gtt.gtt_remove(dev);
 out_rmmap:
@@ -1705,7 +1705,7 @@ int i915_driver_unload(struct drm_device *dev)
 	cancel_delayed_work_sync(&dev_priv->mm.retire_work);
 
 	io_mapping_free(dev_priv->gtt.mappable);
-	arch_phys_wc_del(dev_priv->mm.gtt_mtrr);
+	arch_phys_wc_del(dev_priv->gtt.mtrr);
 
 	acpi_video_unregister();
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7f4c9b6..f428076 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -488,6 +488,8 @@ struct i915_gtt {
 		struct page *page;
 	} scratch;
 
+	int mtrr;
+
 	/* global gtt ops */
 	int (*gtt_probe)(struct drm_device *dev, size_t *gtt_total,
 			  size_t *stolen, phys_addr_t *mappable_base,
@@ -843,8 +845,6 @@ struct i915_gem_mm {
 	/** Usable portion of the GTT for GEM */
 	unsigned long stolen_base; /* limited to low memory (32-bit) */
 
-	int gtt_mtrr;
-
 	/** PPGTT used for aliasing the PPGTT with the GTT */
 	struct i915_hw_ppgtt *aliasing_ppgtt;
 
-- 
1.8.3.1


* [PATCH 23/66] drm/i915: Move stolen stuff to i915_gtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (21 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 22/66] drm/i915: Move gtt_mtrr to i915_gtt Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 13:18   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 24/66] drm/i915: Move aliasing_ppgtt Ben Widawsky
                   ` (44 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

It doesn't apply to generic VMAs, so it belongs with the GTT.

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/mm.stolen_base/gtt.stolen_base/" $file;
done

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/mm.stolen/gtt.stolen/" $file;
done
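
Incidentally, since "mm.stolen" is a prefix of "mm.stolen_base", the
second substitution already subsumes the first; a single equivalent
pass (a sketch, assuming GNU sed and no other identifiers match the
pattern) would be:

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/mm\.stolen/gtt.stolen/g" $file;
done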

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h        |  8 +++-----
 drivers/gpu/drm/i915/i915_gem_stolen.c | 32 ++++++++++++++++----------------
 drivers/gpu/drm/i915/i915_irq.c        |  2 +-
 drivers/gpu/drm/i915/intel_pm.c        |  4 ++--
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f428076..7016074 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -473,6 +473,9 @@ struct i915_address_space {
  */
 struct i915_gtt {
 	struct i915_address_space base;
+
+	struct drm_mm stolen;
+	unsigned long stolen_base; /* limited to low memory (32-bit) */
 	size_t stolen_size;		/* Total size of stolen memory */
 
 	unsigned long mappable_end;	/* End offset that we can CPU map */
@@ -828,8 +831,6 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
-	/** Memory allocator for GTT stolen memory */
-	struct drm_mm stolen;
 	/** Memory allocator for GTT */
 	struct drm_mm gtt_space;
 	/** List of all objects in gtt_space. Used to restore gtt
@@ -842,9 +843,6 @@ struct i915_gem_mm {
 	 */
 	struct list_head unbound_list;
 
-	/** Usable portion of the GTT for GEM */
-	unsigned long stolen_base; /* limited to low memory (32-bit) */
-
 	/** PPGTT used for aliasing the PPGTT with the GTT */
 	struct i915_hw_ppgtt *aliasing_ppgtt;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 8e02344..fd812d5 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -97,10 +97,10 @@ static int i915_setup_compression(struct drm_device *dev, int size)
 	struct drm_mm_node *compressed_fb, *uninitialized_var(compressed_llb);
 
 	/* Try to over-allocate to reduce reallocations and fragmentation */
-	compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
+	compressed_fb = drm_mm_search_free(&dev_priv->gtt.stolen,
 					   size <<= 1, 4096, 0);
 	if (!compressed_fb)
-		compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
+		compressed_fb = drm_mm_search_free(&dev_priv->gtt.stolen,
 						   size >>= 1, 4096, 0);
 	if (compressed_fb)
 		compressed_fb = drm_mm_get_block(compressed_fb, size, 4096);
@@ -112,7 +112,7 @@ static int i915_setup_compression(struct drm_device *dev, int size)
 	else if (IS_GM45(dev)) {
 		I915_WRITE(DPFC_CB_BASE, compressed_fb->start);
 	} else {
-		compressed_llb = drm_mm_search_free(&dev_priv->mm.stolen,
+		compressed_llb = drm_mm_search_free(&dev_priv->gtt.stolen,
 						    4096, 4096, 0);
 		if (compressed_llb)
 			compressed_llb = drm_mm_get_block(compressed_llb,
@@ -123,9 +123,9 @@ static int i915_setup_compression(struct drm_device *dev, int size)
 		dev_priv->fbc.compressed_llb = compressed_llb;
 
 		I915_WRITE(FBC_CFB_BASE,
-			   dev_priv->mm.stolen_base + compressed_fb->start);
+			   dev_priv->gtt.stolen_base + compressed_fb->start);
 		I915_WRITE(FBC_LL_BASE,
-			   dev_priv->mm.stolen_base + compressed_llb->start);
+			   dev_priv->gtt.stolen_base + compressed_llb->start);
 	}
 
 	dev_priv->fbc.compressed_fb = compressed_fb;
@@ -147,7 +147,7 @@ int i915_gem_stolen_setup_compression(struct drm_device *dev, int size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (dev_priv->mm.stolen_base == 0)
+	if (dev_priv->gtt.stolen_base == 0)
 		return -ENODEV;
 
 	if (size < dev_priv->fbc.size)
@@ -180,7 +180,7 @@ void i915_gem_cleanup_stolen(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
 	i915_gem_stolen_cleanup_compression(dev);
-	drm_mm_takedown(&dev_priv->mm.stolen);
+	drm_mm_takedown(&dev_priv->gtt.stolen);
 }
 
 int i915_gem_init_stolen(struct drm_device *dev)
@@ -188,18 +188,18 @@ int i915_gem_init_stolen(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int bios_reserved = 0;
 
-	dev_priv->mm.stolen_base = i915_stolen_to_physical(dev);
-	if (dev_priv->mm.stolen_base == 0)
+	dev_priv->gtt.stolen_base = i915_stolen_to_physical(dev);
+	if (dev_priv->gtt.stolen_base == 0)
 		return 0;
 
 	DRM_DEBUG_KMS("found %zd bytes of stolen memory at %08lx\n",
-		      dev_priv->gtt.stolen_size, dev_priv->mm.stolen_base);
+		      dev_priv->gtt.stolen_size, dev_priv->gtt.stolen_base);
 
 	if (IS_VALLEYVIEW(dev))
 		bios_reserved = 1024*1024; /* top 1M on VLV/BYT */
 
 	/* Basic memrange allocator for stolen space */
-	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->gtt.stolen_size -
+	drm_mm_init(&dev_priv->gtt.stolen, 0, dev_priv->gtt.stolen_size -
 		    bios_reserved);
 
 	return 0;
@@ -234,7 +234,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 	sg->offset = offset;
 	sg->length = size;
 
-	sg_dma_address(sg) = (dma_addr_t)dev_priv->mm.stolen_base + offset;
+	sg_dma_address(sg) = (dma_addr_t)dev_priv->gtt.stolen_base + offset;
 	sg_dma_len(sg) = size;
 
 	return st;
@@ -300,14 +300,14 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 
-	if (dev_priv->mm.stolen_base == 0)
+	if (dev_priv->gtt.stolen_base == 0)
 		return NULL;
 
 	DRM_DEBUG_KMS("creating stolen object: size=%x\n", size);
 	if (size == 0)
 		return NULL;
 
-	stolen = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
+	stolen = drm_mm_search_free(&dev_priv->gtt.stolen, size, 4096, 0);
 	if (stolen)
 		stolen = drm_mm_get_block(stolen, size, 4096);
 	if (stolen == NULL)
@@ -331,7 +331,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 
-	if (dev_priv->mm.stolen_base == 0)
+	if (dev_priv->gtt.stolen_base == 0)
 		return NULL;
 
 	DRM_DEBUG_KMS("creating preallocated stolen object: stolen_offset=%x, gtt_offset=%x, size=%x\n",
@@ -344,7 +344,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (WARN_ON(size == 0))
 		return NULL;
 
-	stolen = drm_mm_create_block(&dev_priv->mm.stolen,
+	stolen = drm_mm_create_block(&dev_priv->gtt.stolen,
 				     stolen_offset, size,
 				     false);
 	if (stolen == NULL) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index fa70fd0..1e25920 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1538,7 +1538,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 		} else if (src->stolen) {
 			unsigned long offset;
 
-			offset = dev_priv->mm.stolen_base;
+			offset = dev_priv->gtt.stolen_base;
 			offset += src->stolen->start;
 			offset += i << PAGE_SHIFT;
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d32734d..02f2dea 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3464,7 +3464,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
 		/* BIOS set it up already, grab the pre-alloc'd space */
 		int pcbr_offset;
 
-		pcbr_offset = (pcbr & (~4095)) - dev_priv->mm.stolen_base;
+		pcbr_offset = (pcbr & (~4095)) - dev_priv->gtt.stolen_base;
 		pctx = i915_gem_object_create_stolen_for_preallocated(dev_priv->dev,
 								      pcbr_offset,
 								      -1,
@@ -3486,7 +3486,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
 		return;
 	}
 
-	pctx_paddr = dev_priv->mm.stolen_base + pctx->stolen->start;
+	pctx_paddr = dev_priv->gtt.stolen_base + pctx->stolen->start;
 	I915_WRITE(VLV_PCBR, pctx_paddr);
 
 out:
-- 
1.8.3.1


* [PATCH 24/66] drm/i915: Move aliasing_ppgtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (22 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 23/66] drm/i915: Move stolen stuff " Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 13:27   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 25/66] drm/i915: Put the mm in the parent address space Ben Widawsky
                   ` (43 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/mm.aliasing/gtt.aliasing/" $file;
done

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++--
 drivers/gpu/drm/i915/i915_dma.c            |  2 +-
 drivers/gpu/drm/i915/i915_drv.h            |  6 +++---
 drivers/gpu/drm/i915/i915_gem.c            | 12 ++++++------
 drivers/gpu/drm/i915/i915_gem_context.c    |  4 ++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 ++--
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  6 +++---
 7 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c10a690..f3c76ab 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1816,8 +1816,8 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 		seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring)));
 		seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring)));
 	}
-	if (dev_priv->mm.aliasing_ppgtt) {
-		struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+	if (dev_priv->gtt.aliasing_ppgtt) {
+		struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
 
 		seq_printf(m, "aliasing PPGTT:\n");
 		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd_offset);
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 3535ced..ef00847 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -977,7 +977,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		value = HAS_LLC(dev);
 		break;
 	case I915_PARAM_HAS_ALIASING_PPGTT:
-		if (intel_enable_ppgtt(dev) && dev_priv->mm.aliasing_ppgtt)
+		if (intel_enable_ppgtt(dev) && dev_priv->gtt.aliasing_ppgtt)
 			value = 1;
 		break;
 	case I915_PARAM_HAS_WAIT_TIMEOUT:
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7016074..0fa7a21 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -482,6 +482,9 @@ struct i915_gtt {
 	struct io_mapping *mappable;	/* Mapping to our CPU mappable region */
 	phys_addr_t mappable_base;	/* PA of our GMADR */
 
+	/** PPGTT used for aliasing the PPGTT with the GTT */
+	struct i915_hw_ppgtt *aliasing_ppgtt;
+
 	/** "Graphics Stolen Memory" holds the global PTEs */
 	void __iomem *gsm;
 
@@ -843,9 +846,6 @@ struct i915_gem_mm {
 	 */
 	struct list_head unbound_list;
 
-	/** PPGTT used for aliasing the PPGTT with the GTT */
-	struct i915_hw_ppgtt *aliasing_ppgtt;
-
 	struct shrinker inactive_shrinker;
 	bool shrinker_no_lock_stealing;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e31ed47..eb78c5b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2620,7 +2620,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (obj->has_global_gtt_mapping)
 		i915_gem_gtt_unbind_object(obj);
 	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
+		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
 		obj->has_aliasing_ppgtt_mapping = 0;
 	}
 	i915_gem_gtt_finish_object(obj);
@@ -3359,7 +3359,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		if (obj->has_global_gtt_mapping)
 			i915_gem_gtt_bind_object(obj, cache_level);
 		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
+			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
 					       obj, cache_level);
 
 		obj->gtt_space->color = cache_level;
@@ -3668,7 +3668,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		if (ret)
 			return ret;
 
-		if (!dev_priv->mm.aliasing_ppgtt)
+		if (!dev_priv->gtt.aliasing_ppgtt)
 			i915_gem_gtt_bind_object(obj, obj->cache_level);
 	}
 
@@ -4191,10 +4191,10 @@ i915_gem_init_hw(struct drm_device *dev)
 	 * the do_switch), but before enabling PPGTT. So don't move this.
 	 */
 	ret = i915_gem_context_enable(dev_priv);
-	if (ret || !dev_priv->mm.aliasing_ppgtt)
+	if (ret || !dev_priv->gtt.aliasing_ppgtt)
 		goto disable_ctx_out;
 
-	ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
+	ret = dev_priv->gtt.aliasing_ppgtt->enable(dev);
 	if (ret)
 		goto disable_ctx_out;
 
@@ -4236,7 +4236,7 @@ int i915_gem_init(struct drm_device *dev)
 		dev_priv->hw_contexts_disabled = true;
 
 ggtt_only:
-	if (!dev_priv->mm.aliasing_ppgtt) {
+	if (!dev_priv->gtt.aliasing_ppgtt) {
 		if (HAS_HW_CONTEXTS(dev))
 			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index d92f121..aa4fc4a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -226,7 +226,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	}
 
 	dev_priv->ring[RCS].default_context = ctx;
-	dev_priv->mm.aliasing_ppgtt = &ctx->ppgtt;
+	dev_priv->gtt.aliasing_ppgtt = &ctx->ppgtt;
 
 	DRM_DEBUG_DRIVER("Default HW context loaded\n");
 	return 0;
@@ -300,7 +300,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 	i915_gem_context_unreference(dctx);
 	dev_priv->ring[RCS].default_context = NULL;
 	dev_priv->ring[RCS].last_context = NULL;
-	dev_priv->mm.aliasing_ppgtt = NULL;
+	dev_priv->gtt.aliasing_ppgtt = NULL;
 }
 
 int i915_gem_context_enable(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7fcd6c0..93870bb 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -429,8 +429,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 	}
 
 	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
+	if (dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
+		i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
 				       obj, obj->cache_level);
 
 		obj->has_aliasing_ppgtt_mapping = 1;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 6de75c7..18820cb 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -127,7 +127,7 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	uint32_t pd_offset;
 	struct intel_ring_buffer *ring;
-	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
+	struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
 	int i;
 
 	BUG_ON(ppgtt->pd_offset & 0x3f);
@@ -445,8 +445,8 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 				       i915_gtt_vm->start / PAGE_SIZE,
 				       i915_gtt_vm->total / PAGE_SIZE);
 
-	if (dev_priv->mm.aliasing_ppgtt)
-		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
+	if (dev_priv->gtt.aliasing_ppgtt)
+		gen6_write_pdes(dev_priv->gtt.aliasing_ppgtt);
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		i915_gem_clflush_object(obj);
-- 
1.8.3.1


* [PATCH 25/66] drm/i915: Put the mm in the parent address space
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (23 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 24/66] drm/i915: Move aliasing_ppgtt Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 26/66] drm/i915: Move active/inactive lists to new mm Ben Widawsky
                   ` (42 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Every address space should support object allocation. It therefore makes
sense to have the allocator be part of the "superclass" from which GGTT
and PPGTT will derive.

Since our maximum address space size is only 2GB, we're not yet able to
avoid doing allocation/eviction, but we'd hope one day this becomes
almost irrelevant.
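
A minimal sketch of the setup path this enables (the
i915_address_space_init() helper is hypothetical; the patch itself
open-codes the equivalent drm_mm_init() calls for the GGTT and, in
i915_gem_ppgtt_init(), for the PPGTT):

static void i915_address_space_init(struct i915_address_space *vm,
				    unsigned long start, size_t total)
{
	vm->start = start;
	vm->total = total;
	/* Every address space now owns its own allocator. */
	drm_mm_init(&vm->mm, start, total);
}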

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c        |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h        |  3 +--
 drivers/gpu/drm/i915/i915_gem.c        |  4 ++--
 drivers/gpu/drm/i915/i915_gem_evict.c  | 10 +++++-----
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 18 +++++++++++-------
 drivers/gpu/drm/i915/i915_gem_stolen.c |  4 ++--
 6 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ef00847..7d6d4b0 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1363,7 +1363,7 @@ cleanup_gem:
 	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
-	drm_mm_takedown(&dev_priv->mm.gtt_space);
+	drm_mm_takedown(&i915_gtt_vm->mm);
 cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_gem_stolen:
@@ -1753,7 +1753,7 @@ int i915_driver_unload(struct drm_device *dev)
 			i915_free_hws(dev);
 	}
 
-	drm_mm_takedown(&dev_priv->mm.gtt_space);
+	drm_mm_takedown(&i915_gtt_vm->mm);
 	if (dev_priv->regs != NULL)
 		pci_iounmap(dev->pdev, dev_priv->regs);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0fa7a21..e65cf57 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -443,6 +443,7 @@ enum i915_cache_level {
 typedef uint32_t gen6_gtt_pte_t;
 
 struct i915_address_space {
+	struct drm_mm mm;
 	struct drm_device *dev;
 	unsigned long start;		/* Start offset always 0 for dri2 */
 	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
@@ -834,8 +835,6 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
-	/** Memory allocator for GTT */
-	struct drm_mm gtt_space;
 	/** List of all objects in gtt_space. Used to restore gtt
 	 * mappings on resume */
 	struct list_head bound_list;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index eb78c5b..608b6b5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3131,7 +3131,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	}
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space, node,
+	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm, node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max,
 						  DRM_MM_CREATE_DEFAULT,
@@ -4229,7 +4229,7 @@ int i915_gem_init(struct drm_device *dev)
 					  i915_gtt_vm->total, 0);
 		i915_gem_context_init(dev);
 		if (dev_priv->hw_contexts_disabled) {
-			drm_mm_takedown(&dev_priv->mm.gtt_space);
+			drm_mm_takedown(&i915_gtt_vm->mm);
 			goto ggtt_only;
 		}
 	} else
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index c86d5d9..6e620f86 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -78,12 +78,12 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	INIT_LIST_HEAD(&unwind_list);
 	if (mappable)
-		drm_mm_init_scan_with_range(&dev_priv->mm.gtt_space,
-					    min_size, alignment, cache_level,
-					    0, dev_priv->gtt.mappable_end);
+		drm_mm_init_scan_with_range(&i915_gtt_vm->mm, min_size,
+					    alignment, cache_level, 0,
+					    dev_priv->gtt.mappable_end);
 	else
-		drm_mm_init_scan(&dev_priv->mm.gtt_space,
-				 min_size, alignment, cache_level);
+		drm_mm_init_scan(&i915_gtt_vm->mm, min_size, alignment,
+				 cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
 	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 18820cb..4131c22 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -249,6 +249,7 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 	int i;
 
 	drm_mm_remove_node(&ppgtt->node);
+	drm_mm_takedown(&ppgtt->base.mm);
 
 	if (ppgtt->pt_dma_addr) {
 		for (i = 0; i < ppgtt->num_pd_entries; i++)
@@ -277,8 +278,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	 * multiplied by page size. We allocate at the top of the GTT to avoid
 	 * fragmentation.
 	 */
-	BUG_ON(!drm_mm_initialized(&dev_priv->mm.gtt_space));
-	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space,
+	BUG_ON(!drm_mm_initialized(&i915_gtt_vm->mm));
+	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm,
 						  &ppgtt->node, GEN6_PD_SIZE,
 						  GEN6_PD_ALIGN, 0,
 						  dev_priv->gtt.mappable_end,
@@ -377,6 +378,10 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 	else
 		BUG();
 
+	if (!ret)
+		drm_mm_init(&ppgtt->base.mm, ppgtt->base.start,
+			    ppgtt->base.total);
+
 	return ret;
 }
 
@@ -668,10 +673,9 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	if (WARN_ON(guard_size & ~PAGE_MASK))
 		guard_size = round_up(guard_size, PAGE_SIZE);
 
-	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - guard_size);
+	drm_mm_init(&i915_gtt_vm->mm, start, end - start - guard_size);
 	if (!HAS_LLC(dev))
-		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
+		i915_gtt_vm->mm.color_adjust = i915_gtt_color_adjust;
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
@@ -679,7 +683,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 			      obj->gtt_offset, obj->base.size);
 
 		BUG_ON(obj->gtt_space != I915_GTT_RESERVED);
-		obj->gtt_space = drm_mm_create_block(&dev_priv->mm.gtt_space,
+		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
 						     obj->gtt_offset,
 						     obj->base.size,
 						     false);
@@ -690,7 +694,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	i915_gtt_vm->total = end - start;
 
 	/* Clear any non-preallocated blocks */
-	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
+	drm_mm_for_each_hole(entry, &i915_gtt_vm->mm,
 			     hole_start, hole_end) {
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index fd812d5..49e8be7 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -368,8 +368,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 * setting up the GTT space. The actual reservation will occur
 	 * later.
 	 */
-	if (drm_mm_initialized(&dev_priv->mm.gtt_space)) {
-		obj->gtt_space = drm_mm_create_block(&dev_priv->mm.gtt_space,
+	if (drm_mm_initialized(&i915_gtt_vm->mm)) {
+		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
 						     gtt_offset, size,
 						     false);
 		if (obj->gtt_space == NULL) {
-- 
1.8.3.1


* [PATCH 26/66] drm/i915: Move active/inactive lists to new mm
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (24 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 25/66] drm/i915: Put the mm in the parent address space Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 15:38   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 27/66] drm/i915: Create a global list of vms Ben Widawsky
                   ` (41 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/dev_priv->mm.inactive_list/i915_gtt_vm-\>inactive_list/" $file;
done

for file in `ls drivers/gpu/drm/i915/*.c` ; do
	sed -i "s/dev_priv->mm.active_list/i915_gtt_vm-\>active_list/" $file;
done

I've also opted to move the comments out of line a bit so one can get a
better picture of what the various lists do.

v2: Leave the bound list as a global one. (Chris, indirectly)
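
A minimal sketch of what per-VM tracking buys once more than one
address space exists (count_inactive() is a hypothetical helper, not
part of the patch; it walks one VM's list exactly as the converted
call sites below do):

static unsigned count_inactive(struct i915_address_space *vm)
{
	struct drm_i915_gem_object *obj;
	unsigned count = 0;

	list_for_each_entry(obj, &vm->inactive_list, mm_list)
		count++;
	return count;
}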

CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    | 11 ++++----
 drivers/gpu/drm/i915/i915_drv.h        | 49 ++++++++++++++--------------------
 drivers/gpu/drm/i915/i915_gem.c        | 24 +++++++----------
 drivers/gpu/drm/i915/i915_gem_debug.c  |  2 +-
 drivers/gpu/drm/i915/i915_gem_evict.c  | 10 +++----
 drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
 drivers/gpu/drm/i915/i915_irq.c        |  6 ++---
 7 files changed, 46 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f3c76ab..a0babc7 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -158,11 +158,11 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_printf(m, "Active:\n");
-		head = &dev_priv->mm.active_list;
+		head = &i915_gtt_vm->active_list;
 		break;
 	case INACTIVE_LIST:
 		seq_printf(m, "Inactive:\n");
-		head = &dev_priv->mm.inactive_list;
+		head = &i915_gtt_vm->inactive_list;
 		break;
 	default:
 		mutex_unlock(&dev->struct_mutex);
@@ -247,12 +247,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&dev_priv->mm.active_list, mm_list);
+	count_objects(&i915_gtt_vm->active_list, mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&dev_priv->mm.inactive_list, mm_list);
+	count_objects(&i915_gtt_vm->inactive_list, mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -1977,7 +1977,8 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		list_for_each_entry_safe(obj, next, &dev_priv->mm.inactive_list, mm_list)
+		list_for_each_entry_safe(obj, next, &i915_gtt_vm->inactive_list,
+					 mm_list)
 			if (obj->pin_count == 0) {
 				ret = i915_gem_object_unbind(obj);
 				if (ret)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e65cf57..0553410 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -448,6 +448,22 @@ struct i915_address_space {
 	unsigned long start;		/* Start offset always 0 for dri2 */
 	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
 
+/* We use many types of lists for object tracking:
+ *  active_list: List of objects currently involved in rendering.
+ *	Includes buffers having the contents of their GPU caches flushed, not
+ *	necessarily primitives. last_rendering_seqno represents when the
+ *	rendering involved will be completed. A reference is held on the buffer
+ *	while on this list.
+ *  inactive_list: LRU list of objects which are not in the ringbuffer
+ *	and are ready to unbind, but are still mapped.
+ *	last_rendering_seqno is 0 while an object is in this list.
+ *	A reference is not held on the buffer while on this list,
+ *	as merely being GTT-bound shouldn't prevent its being
+ *	freed, and we'll pull it off the list in the free path.
+ */
+	struct list_head active_list;
+	struct list_head inactive_list;
+
 	struct {
 		dma_addr_t addr;
 		struct page *page;
@@ -835,42 +851,17 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
-	/** List of all objects in gtt_space. Used to restore gtt
-	 * mappings on resume */
-	struct list_head bound_list;
 	/**
-	 * List of objects which are not bound to the GTT (thus
-	 * are idle and not used by the GPU) but still have
-	 * (presumably uncached) pages still attached.
+	 * Lists of objects which are [not] bound to a VM. Unbound objects
+	 * are idle but still have (presumably uncached) pages attached.
 	 */
+	struct list_head bound_list;
 	struct list_head unbound_list;
 
 	struct shrinker inactive_shrinker;
 	bool shrinker_no_lock_stealing;
 
-	/**
-	 * List of objects currently involved in rendering.
-	 *
-	 * Includes buffers having the contents of their GPU caches
-	 * flushed, not necessarily primitives.  last_rendering_seqno
-	 * represents when the rendering involved will be completed.
-	 *
-	 * A reference is held on the buffer while on this list.
-	 */
-	struct list_head active_list;
-
-	/**
-	 * LRU list of objects which are not in the ringbuffer and
-	 * are ready to unbind, but are still in the GTT.
-	 *
-	 * last_rendering_seqno is 0 while an object is in this list.
-	 *
-	 * A reference is not held on the buffer while on this list,
-	 * as merely being GTT-bound shouldn't prevent its being
-	 * freed, and we'll pull it off the list in the free path.
-	 */
-	struct list_head inactive_list;
-
 	/** LRU list of objects with fence regs on them. */
 	struct list_head fence_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 608b6b5..7da06df 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1706,7 +1706,7 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 	}
 
 	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list,
+				 &i915_gtt_vm->inactive_list,
 				 mm_list) {
 		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
 		    i915_gem_object_unbind(obj) == 0 &&
@@ -1881,7 +1881,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 
 	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &dev_priv->mm.active_list);
+	list_move_tail(&obj->mm_list, &i915_gtt_vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1909,7 +1909,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+	list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2279,12 +2279,8 @@ bool i915_gem_reset(struct drm_device *dev)
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
-	list_for_each_entry(obj,
-			    &dev_priv->mm.inactive_list,
-			    mm_list)
-	{
+	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list)
 		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
-	}
 
 	/* The fence registers are invalidated so clear them out */
 	i915_gem_restore_fences(dev);
@@ -3162,7 +3158,7 @@ search_free:
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
 
 	obj->gtt_space = node;
 	obj->gtt_offset = node->start;
@@ -3313,7 +3309,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 
 	/* And bump the LRU for this access */
 	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+		list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
 
 	return 0;
 }
@@ -4291,7 +4287,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 		return ret;
 	}
 
-	BUG_ON(!list_empty(&dev_priv->mm.active_list));
+	BUG_ON(!list_empty(&i915_gtt_vm->active_list));
 	mutex_unlock(&dev->struct_mutex);
 
 	ret = drm_irq_install(dev);
@@ -4352,8 +4348,8 @@ i915_gem_load(struct drm_device *dev)
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
-	INIT_LIST_HEAD(&dev_priv->mm.active_list);
-	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
+	INIT_LIST_HEAD(&i915_gtt_vm->active_list);
+	INIT_LIST_HEAD(&i915_gtt_vm->inactive_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
@@ -4652,7 +4648,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
 		if (obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
-	list_for_each_entry(obj, &dev_priv->mm.inactive_list, global_list)
+	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, global_list)
 		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
index 582e6a5..bf945a3 100644
--- a/drivers/gpu/drm/i915/i915_gem_debug.c
+++ b/drivers/gpu/drm/i915/i915_gem_debug.c
@@ -97,7 +97,7 @@ i915_verify_lists(struct drm_device *dev)
 		}
 	}
 
-	list_for_each_entry(obj, &dev_priv->mm.inactive_list, list) {
+	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, list) {
 		if (obj->base.dev != dev ||
 		    !atomic_read(&obj->base.refcount.refcount)) {
 			DRM_ERROR("freed inactive %p\n", obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 6e620f86..92856a2 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -86,7 +86,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 				 cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
+	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list) {
 		if (mark_free(obj, &unwind_list))
 			goto found;
 	}
@@ -95,7 +95,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
+	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
 		if (mark_free(obj, &unwind_list))
 			goto found;
 	}
@@ -158,8 +158,8 @@ i915_gem_evict_everything(struct drm_device *dev)
 	bool lists_empty;
 	int ret;
 
-	lists_empty = (list_empty(&dev_priv->mm.inactive_list) &&
-		       list_empty(&dev_priv->mm.active_list));
+	lists_empty = (list_empty(&i915_gtt_vm->inactive_list) &&
+		       list_empty(&i915_gtt_vm->active_list));
 	if (lists_empty)
 		return -ENOSPC;
 
@@ -177,7 +177,7 @@ i915_gem_evict_everything(struct drm_device *dev)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list, mm_list)
+				 &i915_gtt_vm->inactive_list, mm_list)
 		if (obj->pin_count == 0)
 			WARN_ON(i915_gem_object_unbind(obj));
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 49e8be7..3f6564d 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -384,7 +384,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
+	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
 
 	return obj;
 }
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 1e25920..5dc055a 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1722,7 +1722,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
+	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
 		if (obj->ring != ring)
 			continue;
 
@@ -1857,7 +1857,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
+	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
@@ -1877,7 +1877,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 		error->active_bo_count =
 			capture_active_bo(error->active_bo,
 					  error->active_bo_count,
-					  &dev_priv->mm.active_list);
+					  &i915_gtt_vm->active_list);
 
 	if (error->pinned_bo)
 		error->pinned_bo_count =
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 27/66] drm/i915: Create a global list of vms
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (25 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 26/66] drm/i915: Move active/inactive lists to new mm Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 28/66] drm/i915: Remove object's gtt_offset Ben Widawsky
                   ` (40 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

After we plumb our code to support multiple address spaces (VMs), there
are a few situations where we want to be able to traverse the list of
all address spaces in the system. Eviction and error state collection
are obvious examples.

It's easy enough to test and make sure our list is accurate because we
already have a member in place to access our global GTT. By porting that
to use our list (which assumes the GGTT is always the first entry) we
can verify that a decent amount of the code is working correctly.

NOTE: to do this, we must initialize the list quite early.
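
For illustration only (not part of this patch), a traversal under this
scheme might look like the sketch below; it relies on the invariant
that the GGTT is always the first entry on vm_list:

	/* Hypothetical helper: walk every address space, skipping the
	 * global GTT, which always sits at the head of vm_list.
	 */
	static void for_each_ppgtt_example(struct drm_i915_private *dev_priv)
	{
		struct i915_address_space *vm;

		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
			if (vm == &dev_priv->gtt.base)
				continue; /* the global GTT itself */
			/* ... per-VM work: eviction, error capture, ... */
		}
	}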

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c | 5 +++++
 drivers/gpu/drm/i915/i915_drv.h | 7 ++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 7d6d4b0..24dd593 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1497,6 +1497,10 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_dump_device_info(dev_priv);
 
+	INIT_LIST_HEAD(&dev_priv->vm_list);
+	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
+	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
+
 	if (i915_get_bridge_dev(dev)) {
 		ret = -EIO;
 		goto free_priv;
@@ -1753,6 +1757,7 @@ int i915_driver_unload(struct drm_device *dev)
 			i915_free_hws(dev);
 	}
 
+	list_del(&dev_priv->vm_list);
 	drm_mm_takedown(&i915_gtt_vm->mm);
 	if (dev_priv->regs != NULL)
 		pci_iounmap(dev->pdev, dev_priv->regs);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0553410..bc5f656 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -445,6 +445,7 @@ typedef uint32_t gen6_gtt_pte_t;
 struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
+	struct list_head global_link;
 	unsigned long start;		/* Start offset always 0 for dri2 */
 	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
 
@@ -519,7 +520,10 @@ struct i915_gtt {
 			  unsigned long *mappable_end);
 	void (*gtt_remove)(struct drm_device *dev);
 };
-#define i915_gtt_vm ((struct i915_address_space *)&(dev_priv->gtt.base))
+#define i915_gtt_vm ((struct i915_address_space *) \
+		     list_first_entry(&dev_priv->vm_list,\
+				      struct i915_address_space, \
+				      global_link))
 
 struct i915_hw_ppgtt {
 	struct i915_address_space base;
@@ -1115,6 +1119,7 @@ typedef struct drm_i915_private {
 	enum modeset_restore modeset_restore;
 	struct mutex modeset_restore_lock;
 
+	struct list_head vm_list; /* Global list of all address spaces */
 	struct i915_gtt gtt; /* VMA representing the global address space */
 
 	struct i915_gem_mm mm;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 28/66] drm/i915: Remove object's gtt_offset
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (26 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 27/66] drm/i915: Create a global list of vms Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 29/66] drm: pre allocate node for create_block Ben Widawsky
                   ` (39 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

gtt_offset has always been equal to gtt_space->start. Removing the
duplicate makes an upcoming change much easier to swallow.
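
The conversion itself is mechanical; as a sketch, every read of the
cached copy becomes a read through the drm_mm node:

	/* Before: cached duplicate of the node's start. */
	offset = obj->gtt_offset;

	/* After: read the one authoritative value. */
	offset = obj->gtt_space->start;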

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |  9 +++---
 drivers/gpu/drm/i915/i915_drv.h            |  7 ----
 drivers/gpu/drm/i915/i915_gem.c            | 51 ++++++++++++++----------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_debug.c      |  9 +++---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 13 ++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  6 ++--
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  2 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     | 14 ++++----
 drivers/gpu/drm/i915/i915_irq.c            | 16 +++++-----
 drivers/gpu/drm/i915/intel_display.c       | 34 ++++++++++++--------
 drivers/gpu/drm/i915/intel_fb.c            |  8 ++---
 drivers/gpu/drm/i915/intel_overlay.c       | 23 +++++++-------
 drivers/gpu/drm/i915/intel_pm.c            |  8 ++---
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 12 +++----
 drivers/gpu/drm/i915/intel_sprite.c        |  8 +++--
 16 files changed, 112 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a0babc7..3d3e770 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -123,8 +123,9 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
 	if (obj->gtt_space != NULL)
-		seq_printf(m, " (gtt offset: %08x, size: %08x)",
-			   obj->gtt_offset, (unsigned int)obj->gtt_space->size);
+		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
+			   obj->gtt_space->start,
+			   (unsigned int)obj->gtt_space->size);
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
@@ -379,12 +380,12 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 			if (work->old_fb_obj) {
 				struct drm_i915_gem_object *obj = work->old_fb_obj;
 				if (obj)
-					seq_printf(m, "Old framebuffer gtt_offset 0x%08x\n", obj->gtt_offset);
+					seq_printf(m, "Old framebuffer gtt_offset 0x%08lx\n", obj->gtt_space->start);
 			}
 			if (work->pending_flip_obj) {
 				struct drm_i915_gem_object *obj = work->pending_flip_obj;
 				if (obj)
-					seq_printf(m, "New framebuffer gtt_offset 0x%08x\n", obj->gtt_offset);
+					seq_printf(m, "New framebuffer gtt_offset 0x%08lx\n", obj->gtt_space->start);
 			}
 		}
 		spin_unlock_irqrestore(&dev->event_lock, flags);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bc5f656..f6704d3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1324,13 +1324,6 @@ struct drm_i915_gem_object {
 	unsigned long exec_handle;
 	struct drm_i915_gem_exec_object2 *exec_entry;
 
-	/**
-	 * Current offset of the object in GTT space.
-	 *
-	 * This is the same as gtt_space->start
-	 */
-	uint32_t gtt_offset;
-
 	struct intel_ring_buffer *ring;
 
 	/** Breadcrumb of last rendering to the buffer. */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7da06df..d747a1f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -609,7 +609,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	user_data = to_user_ptr(args->data_ptr);
 	remain = args->size;
 
-	offset = obj->gtt_offset + args->offset;
+	offset = obj->gtt_space->start + args->offset;
 
 	while (remain > 0) {
 		/* Operation in this page
@@ -1326,7 +1326,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	pgoff_t page_offset;
-	unsigned long pfn;
+	unsigned long pfn = dev_priv->gtt.mappable_base >> PAGE_SHIFT;
 	int ret = 0;
 	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
 
@@ -1361,8 +1361,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	obj->fault_mappable = true;
 
-	pfn = ((dev_priv->gtt.mappable_base + obj->gtt_offset) >> PAGE_SHIFT) +
-		page_offset;
+	pfn += (obj->gtt_space->start >> PAGE_SHIFT) + page_offset;
 
 	/* Finally, remap it using the new GTT offset */
 	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
@@ -2109,8 +2108,8 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 
 static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
 {
-	if (acthd >= obj->gtt_offset &&
-	    acthd < obj->gtt_offset + obj->base.size)
+	if (acthd >= obj->gtt_space->start &&
+	    acthd < obj->gtt_space->start + obj->base.size)
 		return true;
 
 	return false;
@@ -2168,11 +2167,11 @@ static bool i915_set_reset_status(struct intel_ring_buffer *ring,
 
 	if (ring->hangcheck.action != wait &&
 	    i915_request_guilty(request, acthd, &inside)) {
-		DRM_ERROR("%s hung %s bo (0x%x ctx %d) at 0x%x\n",
+		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
 			  ring->name,
 			  inside ? "inside" : "flushing",
 			  request->batch_obj ?
-			  request->batch_obj->gtt_offset : 0,
+			  request->batch_obj->gtt_space->start : 0,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
 
@@ -2629,7 +2628,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 
 	drm_mm_put_block(obj->gtt_space);
 	obj->gtt_space = NULL;
-	obj->gtt_offset = 0;
 
 	return 0;
 }
@@ -2673,9 +2671,9 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 	if (obj) {
 		u32 size = obj->gtt_space->size;
 
-		val = (uint64_t)((obj->gtt_offset + size - 4096) &
+		val = (uint64_t)((obj->gtt_space->start + size - 4096) &
 				 0xfffff000) << 32;
-		val |= obj->gtt_offset & 0xfffff000;
+		val |= obj->gtt_space->start & 0xfffff000;
 		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
@@ -2699,11 +2697,11 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 		int pitch_val;
 		int tile_width;
 
-		WARN((obj->gtt_offset & ~I915_FENCE_START_MASK) ||
+		WARN((obj->gtt_space->start & ~I915_FENCE_START_MASK) ||
 		     (size & -size) != size ||
-		     (obj->gtt_offset & (size - 1)),
-		     "object 0x%08x [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
-		     obj->gtt_offset, obj->map_and_fenceable, size);
+		     (obj->gtt_space->start & (size - 1)),
+		     "object 0x%08lx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
+		     obj->gtt_space->start, obj->map_and_fenceable, size);
 
 		if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
 			tile_width = 128;
@@ -2714,7 +2712,7 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 		pitch_val = obj->stride / tile_width;
 		pitch_val = ffs(pitch_val) - 1;
 
-		val = obj->gtt_offset;
+		val = obj->gtt_space->start;
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I915_FENCE_SIZE_BITS(size);
@@ -2742,16 +2740,16 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 		u32 size = obj->gtt_space->size;
 		uint32_t pitch_val;
 
-		WARN((obj->gtt_offset & ~I830_FENCE_START_MASK) ||
+		WARN((obj->gtt_space->start & ~I830_FENCE_START_MASK) ||
 		     (size & -size) != size ||
-		     (obj->gtt_offset & (size - 1)),
-		     "object 0x%08x not 512K or pot-size 0x%08x aligned\n",
-		     obj->gtt_offset, size);
+		     (obj->gtt_space->start & (size - 1)),
+		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
+		     obj->gtt_space->start, size);
 
 		pitch_val = obj->stride / 128;
 		pitch_val = ffs(pitch_val) - 1;
 
-		val = obj->gtt_offset;
+		val = obj->gtt_space->start;
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I830_FENCE_SIZE_BITS(size);
@@ -3161,14 +3159,13 @@ search_free:
 	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
 
 	obj->gtt_space = node;
-	obj->gtt_offset = node->start;
 
 	fenceable =
 		node->size == fence_size &&
 		(node->start & (fence_alignment - 1)) == 0;
 
 	mappable =
-		obj->gtt_offset + obj->base.size <= dev_priv->gtt.mappable_end;
+		node->start + obj->base.size <= dev_priv->gtt.mappable_end;
 
 	obj->map_and_fenceable = mappable && fenceable;
 
@@ -3640,13 +3637,13 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 
 	if (obj->gtt_space != NULL) {
-		if ((alignment && obj->gtt_offset & (alignment - 1)) ||
+		if ((alignment && obj->gtt_space->start & (alignment - 1)) ||
 		    (map_and_fenceable && !obj->map_and_fenceable)) {
 			WARN(obj->pin_count,
 			     "bo is already pinned with incorrect alignment:"
-			     " offset=%x, req.alignment=%x, req.map_and_fenceable=%d,"
+			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     obj->gtt_offset, alignment,
+			     obj->gtt_space->start, alignment,
 			     map_and_fenceable,
 			     obj->map_and_fenceable);
 			ret = i915_gem_object_unbind(obj);
@@ -3731,7 +3728,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	 * as the X server doesn't manage domains yet
 	 */
 	i915_gem_object_flush_cpu_write_domain(obj);
-	args->offset = obj->gtt_offset;
+	args->offset = obj->gtt_space->start;
 out:
 	drm_gem_object_unreference(&obj->base);
 unlock:
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index aa4fc4a..1e838f4 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -395,7 +395,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, new_context->obj->gtt_offset |
+	intel_ring_emit(ring, new_context->obj->gtt_space->start |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
index bf945a3..8812ee0 100644
--- a/drivers/gpu/drm/i915/i915_gem_debug.c
+++ b/drivers/gpu/drm/i915/i915_gem_debug.c
@@ -128,11 +128,12 @@ i915_gem_object_check_coherency(struct drm_i915_gem_object *obj, int handle)
 	int bad_count = 0;
 
 	DRM_INFO("%s: checking coherency of object %p@0x%08x (%d, %zdkb):\n",
-		 __func__, obj, obj->gtt_offset, handle,
+		 __func__, obj, obj->gtt_space->start, handle,
 		 obj->size / 1024);
 
-	gtt_mapping = ioremap(dev_priv->mm.gtt_base_addr + obj->gtt_offset,
-			      obj->base.size);
+	gtt_mapping =
+		ioremap(dev_priv->mm.gtt_base_addr + obj->gtt_space->start,
+			obj->base.size);
 	if (gtt_mapping == NULL) {
 		DRM_ERROR("failed to map GTT space\n");
 		return;
@@ -156,7 +157,7 @@ i915_gem_object_check_coherency(struct drm_i915_gem_object *obj, int handle)
 			if (cpuval != gttval) {
 				DRM_INFO("incoherent CPU vs GPU at 0x%08x: "
 					 "0x%08x vs 0x%08x\n",
-					 (int)(obj->gtt_offset +
+					 (int)(obj->gtt_space->start +
 					       page * PAGE_SIZE + i * 4),
 					 cpuval, gttval);
 				if (bad_count++ >= 8) {
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 93870bb..67246a6 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -188,7 +188,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		return -ENOENT;
 
 	target_i915_obj = to_intel_bo(target_obj);
-	target_offset = target_i915_obj->gtt_offset;
+	target_offset = target_i915_obj->gtt_space->start;
 
 	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
 	 * pipe_control writes because the gpu doesn't properly redirect them
@@ -280,7 +280,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 			return ret;
 
 		/* Map the page containing the relocation we're going to perform.  */
-		reloc->offset += obj->gtt_offset;
+		reloc->offset += obj->gtt_space->start;
 		reloc_page = io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
 						      reloc->offset & PAGE_MASK);
 		reloc_entry = (uint32_t __iomem *)
@@ -436,8 +436,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != obj->gtt_offset) {
-		entry->offset = obj->gtt_offset;
+	if (entry->offset != obj->gtt_space->start) {
+		entry->offset = obj->gtt_space->start;
 		*need_reloc = true;
 	}
 
@@ -539,7 +539,8 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 				obj->tiling_mode != I915_TILING_NONE;
 			need_mappable = need_fence || need_reloc_mappable(obj);
 
-			if ((entry->alignment && obj->gtt_offset & (entry->alignment - 1)) ||
+			if ((entry->alignment &&
+			     obj->gtt_space->start & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
 				ret = i915_gem_object_unbind(obj);
 			else
@@ -1071,7 +1072,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = batch_obj->gtt_offset + args->batch_start_offset;
+	exec_start = batch_obj->gtt_space->start + args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4131c22..a45c00d 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -679,12 +679,12 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		DRM_DEBUG_KMS("reserving preallocated space: %x + %zx\n",
-			      obj->gtt_offset, obj->base.size);
+		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
+			      obj->gtt_space->start, obj->base.size);
 
 		BUG_ON(obj->gtt_space != I915_GTT_RESERVED);
 		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
-						     obj->gtt_offset,
+						     obj->gtt_space->start,
 						     obj->base.size,
 						     false);
 		obj->has_global_gtt_mapping = 1;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 3f6564d..7fba6f5 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -380,7 +380,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	} else
 		obj->gtt_space = I915_GTT_RESERVED;
 
-	obj->gtt_offset = gtt_offset;
+	obj->gtt_space->start = gtt_offset;
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 537545b..7aab12a 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -268,10 +268,10 @@ i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
 		return true;
 
 	if (INTEL_INFO(obj->base.dev)->gen == 3) {
-		if (obj->gtt_offset & ~I915_FENCE_START_MASK)
+		if (obj->gtt_space->start & ~I915_FENCE_START_MASK)
 			return false;
 	} else {
-		if (obj->gtt_offset & ~I830_FENCE_START_MASK)
+		if (obj->gtt_space->start & ~I830_FENCE_START_MASK)
 			return false;
 	}
 
@@ -279,7 +279,7 @@ i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
 	if (obj->gtt_space->size != size)
 		return false;
 
-	if (obj->gtt_offset & (size - 1))
+	if (obj->gtt_space->start & (size - 1))
 		return false;
 
 	return true;
@@ -358,9 +358,9 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		 * whilst executing a fenced command for an untiled object.
 		 */
 
-		obj->map_and_fenceable =
-			obj->gtt_space == NULL ||
-			(obj->gtt_offset + obj->base.size <= dev_priv->gtt.mappable_end &&
+		obj->map_and_fenceable = obj->gtt_space == NULL ||
+			(obj->gtt_space->start +
+			 obj->base.size <= dev_priv->gtt.mappable_end &&
 			 i915_gem_object_fence_ok(obj, args->tiling_mode));
 
 		/* Rebind if we need a change of alignment */
@@ -369,7 +369,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 				i915_gem_get_gtt_alignment(dev, obj->base.size,
 							    args->tiling_mode,
 							    false);
-			if (obj->gtt_offset & (unfenced_alignment - 1))
+			if (obj->gtt_space->start & (unfenced_alignment - 1))
 				ret = i915_gem_object_unbind(obj);
 		}
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 5dc055a..2c4fe36 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1512,7 +1512,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 	if (dst == NULL)
 		return NULL;
 
-	reloc_offset = src->gtt_offset;
+	reloc_offset = src->gtt_space->start;
 	for (i = 0; i < num_pages; i++) {
 		unsigned long flags;
 		void *d;
@@ -1564,7 +1564,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = num_pages;
-	dst->gtt_offset = src->gtt_offset;
+	dst->gtt_offset = src->gtt_space->start;
 
 	return dst;
 
@@ -1618,7 +1618,7 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 	err->name = obj->base.name;
 	err->rseqno = obj->last_read_seqno;
 	err->wseqno = obj->last_write_seqno;
-	err->gtt_offset = obj->gtt_offset;
+	err->gtt_offset = obj->gtt_space->start;
 	err->read_domains = obj->base.read_domains;
 	err->write_domain = obj->base.write_domain;
 	err->fence_reg = obj->fence_reg;
@@ -1716,8 +1716,8 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			return NULL;
 
 		obj = ring->private;
-		if (acthd >= obj->gtt_offset &&
-		    acthd < obj->gtt_offset + obj->base.size)
+		if (acthd >= obj->gtt_space->start &&
+		    acthd < obj->gtt_space->start + obj->base.size)
 			return i915_error_object_create(dev_priv, obj);
 	}
 
@@ -1798,7 +1798,7 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 		return;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if ((error->ccid & PAGE_MASK) == obj->gtt_offset) {
+		if ((error->ccid & PAGE_MASK) == obj->gtt_space->start) {
 			ering->ctx = i915_error_object_create_sized(dev_priv,
 								    obj, 1);
 		}
@@ -2152,10 +2152,10 @@ static void __always_unused i915_pageflip_stall_check(struct drm_device *dev, in
 	if (INTEL_INFO(dev)->gen >= 4) {
 		int dspsurf = DSPSURF(intel_crtc->plane);
 		stall_detected = I915_HI_DISPBASE(I915_READ(dspsurf)) ==
-					obj->gtt_offset;
+					obj->gtt_space->start;
 	} else {
 		int dspaddr = DSPADDR(intel_crtc->plane);
-		stall_detected = I915_READ(dspaddr) == (obj->gtt_offset +
+		stall_detected = I915_READ(dspaddr) == (obj->gtt_space->start +
 							crtc->y * crtc->fb->pitches[0] +
 							crtc->x * crtc->fb->bits_per_pixel/8);
 	}
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f056eca..a269d7a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1942,16 +1942,19 @@ static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 		intel_crtc->dspaddr_offset = linear_offset;
 	}
 
-	DRM_DEBUG_KMS("Writing base %08X %08lX %d %d %d\n",
-		      obj->gtt_offset, linear_offset, x, y, fb->pitches[0]);
+	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
+		      obj->gtt_space->start, linear_offset, x, y,
+		      fb->pitches[0]);
 	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
 	if (INTEL_INFO(dev)->gen >= 4) {
 		I915_MODIFY_DISPBASE(DSPSURF(plane),
-				     obj->gtt_offset + intel_crtc->dspaddr_offset);
+				     obj->gtt_space->start +
+				     intel_crtc->dspaddr_offset);
 		I915_WRITE(DSPTILEOFF(plane), (y << 16) | x);
 		I915_WRITE(DSPLINOFF(plane), linear_offset);
 	} else
-		I915_WRITE(DSPADDR(plane), obj->gtt_offset + linear_offset);
+		I915_WRITE(DSPADDR(plane),
+			   obj->gtt_space->start + linear_offset);
 	POSTING_READ(reg);
 
 	return 0;
@@ -2031,11 +2034,12 @@ static int ironlake_update_plane(struct drm_crtc *crtc,
 					       fb->pitches[0]);
 	linear_offset -= intel_crtc->dspaddr_offset;
 
-	DRM_DEBUG_KMS("Writing base %08X %08lX %d %d %d\n",
-		      obj->gtt_offset, linear_offset, x, y, fb->pitches[0]);
+	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
+		      obj->gtt_space->start, linear_offset, x, y,
+		      fb->pitches[0]);
 	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
 	I915_MODIFY_DISPBASE(DSPSURF(plane),
-			     obj->gtt_offset + intel_crtc->dspaddr_offset);
+			     obj->gtt_space->start+intel_crtc->dspaddr_offset);
 	if (IS_HASWELL(dev)) {
 		I915_WRITE(DSPOFFSET(plane), (y << 16) | x);
 	} else {
@@ -6554,7 +6558,7 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
 			goto fail_unpin;
 		}
 
-		addr = obj->gtt_offset;
+		addr = obj->gtt_space->start;
 	} else {
 		int align = IS_I830(dev) ? 16 * 1024 : 256;
 		ret = i915_gem_attach_phys_object(dev, obj,
@@ -7269,7 +7273,8 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_DISPLAY_FLIP |
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
-	intel_ring_emit(ring, obj->gtt_offset + intel_crtc->dspaddr_offset);
+	intel_ring_emit(ring,
+			obj->gtt_space->start + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
@@ -7310,7 +7315,8 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 |
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
-	intel_ring_emit(ring, obj->gtt_offset + intel_crtc->dspaddr_offset);
+	intel_ring_emit(ring,
+			obj->gtt_space->start + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
@@ -7350,7 +7356,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			(obj->gtt_offset + intel_crtc->dspaddr_offset) |
+			(obj->gtt_space->start + intel_crtc->dspaddr_offset) |
 			obj->tiling_mode);
 
 	/* XXX Enabling the panel-fitter across page-flip is so far
@@ -7393,7 +7399,8 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_DISPLAY_FLIP |
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
-	intel_ring_emit(ring, obj->gtt_offset + intel_crtc->dspaddr_offset);
+	intel_ring_emit(ring,
+			obj->gtt_space->start + intel_crtc->dspaddr_offset);
 
 	/* Contrary to the suggestions in the documentation,
 	 * "Enable Panel Fitter" does not seem to be required when page
@@ -7458,7 +7465,8 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 
 	intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
 	intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
-	intel_ring_emit(ring, obj->gtt_offset + intel_crtc->dspaddr_offset);
+	intel_ring_emit(ring,
+			obj->gtt_space->start + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 244060a..242a793 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -139,11 +139,11 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->apertures->ranges[0].base = dev->mode_config.fb_base;
 	info->apertures->ranges[0].size = dev_priv->gtt.mappable_end;
 
-	info->fix.smem_start = dev->mode_config.fb_base + obj->gtt_offset;
+	info->fix.smem_start = dev->mode_config.fb_base + obj->gtt_space->start;
 	info->fix.smem_len = size;
 
 	info->screen_base =
-		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_offset,
+		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_space->start,
 			   size);
 	if (!info->screen_base) {
 		ret = -ENOSPC;
@@ -166,9 +166,9 @@ static int intelfb_create(struct drm_fb_helper *helper,
 
 	/* Use default scratch pixmap (info->pixmap.flags = FB_PIXMAP_SYSTEM) */
 
-	DRM_DEBUG_KMS("allocated %dx%d fb: 0x%08x, bo %p\n",
+	DRM_DEBUG_KMS("allocated %dx%d fb: 0x%08lx, bo %p\n",
 		      fb->width, fb->height,
-		      obj->gtt_offset, obj);
+		      obj->gtt_space->start, obj);
 
 
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index a369881..93f2671 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -196,7 +196,7 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_obj->handle->vaddr;
 	else
 		regs = io_mapping_map_wc(dev_priv->gtt.mappable,
-					 overlay->reg_bo->gtt_offset);
+					 overlay->reg_bo->gtt_space->start);
 
 	return regs;
 }
@@ -740,7 +740,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	swidth = params->src_w;
 	swidthsw = calc_swidthsw(overlay->dev, params->offset_Y, tmp_width);
 	sheight = params->src_h;
-	iowrite32(new_bo->gtt_offset + params->offset_Y, &regs->OBUF_0Y);
+	iowrite32(new_bo->gtt_space->start + params->offset_Y, &regs->OBUF_0Y);
 	ostride = params->stride_Y;
 
 	if (params->format & I915_OVERLAY_YUV_PLANAR) {
@@ -754,8 +754,10 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 				      params->src_w/uv_hscale);
 		swidthsw |= max_t(u32, tmp_U, tmp_V) << 16;
 		sheight |= (params->src_h/uv_vscale) << 16;
-		iowrite32(new_bo->gtt_offset + params->offset_U, &regs->OBUF_0U);
-		iowrite32(new_bo->gtt_offset + params->offset_V, &regs->OBUF_0V);
+		iowrite32(new_bo->gtt_space->start + params->offset_U,
+			  &regs->OBUF_0U);
+		iowrite32(new_bo->gtt_space->start + params->offset_V,
+			  &regs->OBUF_0V);
 		ostride |= params->stride_UV << 16;
 	}
 
@@ -1355,7 +1357,7 @@ void intel_setup_overlay(struct drm_device *dev)
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
 		}
-		overlay->flip_addr = reg_bo->gtt_offset;
+		overlay->flip_addr = reg_bo->gtt_space->start;
 
 		ret = i915_gem_object_set_to_gtt_domain(reg_bo, true);
 		if (ret) {
@@ -1426,18 +1428,15 @@ static struct overlay_registers __iomem *
 intel_overlay_map_regs_atomic(struct intel_overlay *overlay)
 {
 	drm_i915_private_t *dev_priv = overlay->dev->dev_private;
-	struct overlay_registers __iomem *regs;
 
 	if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
 		/* Cast to make sparse happy, but it's wc memory anyway, so
 		 * equivalent to the wc io mapping on X86. */
-		regs = (struct overlay_registers __iomem *)
+		return (struct overlay_registers __iomem *)
 			overlay->reg_bo->phys_obj->handle->vaddr;
-	else
-		regs = io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
-						overlay->reg_bo->gtt_offset);
 
-	return regs;
+	return io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
+					overlay->reg_bo->gtt_space->start);
 }
 
 static void intel_overlay_unmap_regs_atomic(struct intel_overlay *overlay,
@@ -1468,7 +1467,7 @@ intel_overlay_capture_error_state(struct drm_device *dev)
 	if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
 		error->base = (__force long)overlay->reg_bo->phys_obj->handle->vaddr;
 	else
-		error->base = overlay->reg_bo->gtt_offset;
+		error->base = overlay->reg_bo->gtt_space->start;
 
 	regs = intel_overlay_map_regs_atomic(overlay);
 	if (!regs)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 02f2dea..73c0ee1 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -217,7 +217,7 @@ static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 		   (stall_watermark << DPFC_RECOMP_STALL_WM_SHIFT) |
 		   (interval << DPFC_RECOMP_TIMER_COUNT_SHIFT));
 	I915_WRITE(ILK_DPFC_FENCE_YOFF, crtc->y);
-	I915_WRITE(ILK_FBC_RT_BASE, obj->gtt_offset | ILK_FBC_RT_VALID);
+	I915_WRITE(ILK_FBC_RT_BASE, obj->gtt_space->start | ILK_FBC_RT_VALID);
 	/* enable it... */
 	I915_WRITE(ILK_DPFC_CONTROL, dpfc_ctl | DPFC_CTL_EN);
 
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_offset);
+	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_space->start);
 
 	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
 		   IVB_DPFC_CTL_FENCE_EN |
@@ -3685,7 +3685,7 @@ static void ironlake_enable_rc6(struct drm_device *dev)
 
 	intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, dev_priv->ips.renderctx->gtt_offset |
+	intel_ring_emit(ring, dev_priv->ips.renderctx->gtt_space->start |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
@@ -3708,7 +3708,7 @@ static void ironlake_enable_rc6(struct drm_device *dev)
 		return;
 	}
 
-	I915_WRITE(PWRCTXA, dev_priv->ips.pwrctx->gtt_offset | PWRCTX_EN);
+	I915_WRITE(PWRCTXA, dev_priv->ips.pwrctx->gtt_space->start | PWRCTX_EN);
 	I915_WRITE(RSTDBYCTL, I915_READ(RSTDBYCTL) & ~RCX_SW_EXIT);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 901e0af..c4c80c2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -424,14 +424,14 @@ static int init_ring_common(struct intel_ring_buffer *ring)
 	 * registers with the above sequence (the readback of the HEAD registers
 	 * also enforces ordering), otherwise the hw might lose the new ring
 	 * register values. */
-	I915_WRITE_START(ring, obj->gtt_offset);
+	I915_WRITE_START(ring, obj->gtt_space->start);
 	I915_WRITE_CTL(ring,
 			((ring->size - PAGE_SIZE) & RING_NR_PAGES)
 			| RING_VALID);
 
 	/* If the head is still not zero, the ring is dead */
 	if (wait_for((I915_READ_CTL(ring) & RING_VALID) != 0 &&
-		     I915_READ_START(ring) == obj->gtt_offset &&
+		     I915_READ_START(ring) == obj->gtt_space->start &&
 		     (I915_READ_HEAD(ring) & HEAD_ADDR) == 0, 50)) {
 		DRM_ERROR("%s initialization failed "
 				"ctl %08x head %08x tail %08x start %08x\n",
@@ -489,7 +489,7 @@ init_pipe_control(struct intel_ring_buffer *ring)
 	if (ret)
 		goto err_unref;
 
-	pc->gtt_offset = obj->gtt_offset;
+	pc->gtt_offset = obj->gtt_space->start;
 	pc->cpu_page = kmap(sg_page(obj->pages->sgl));
 	if (pc->cpu_page == NULL) {
 		ret = -ENOMEM;
@@ -1129,7 +1129,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
 		intel_ring_advance(ring);
 	} else {
 		struct drm_i915_gem_object *obj = ring->private;
-		u32 cs_offset = obj->gtt_offset;
+		u32 cs_offset = obj->gtt_space->start;
 
 		if (len > I830_BATCH_LIMIT)
 			return -ENOSPC;
@@ -1214,7 +1214,7 @@ static int init_status_page(struct intel_ring_buffer *ring)
 		goto err_unref;
 	}
 
-	ring->status_page.gfx_addr = obj->gtt_offset;
+	ring->status_page.gfx_addr = obj->gtt_space->start;
 	ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
 	if (ring->status_page.page_addr == NULL) {
 		ret = -ENOMEM;
@@ -1308,7 +1308,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 		goto err_unpin;
 
 	ring->virtual_start =
-		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_offset,
+		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_space->start,
 			   ring->size);
 	if (ring->virtual_start == NULL) {
 		DRM_ERROR("Failed to map ringbuffer.\n");
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 1fa5612..c342571 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -133,7 +133,7 @@ vlv_update_plane(struct drm_plane *dplane, struct drm_framebuffer *fb,
 
 	I915_WRITE(SPSIZE(pipe, plane), (crtc_h << 16) | crtc_w);
 	I915_WRITE(SPCNTR(pipe, plane), sprctl);
-	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), obj->gtt_offset +
+	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), obj->gtt_space->start +
 			     sprsurf_offset);
 	POSTING_READ(SPSURF(pipe, plane));
 }
@@ -308,7 +308,8 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
 	if (intel_plane->can_scale)
 		I915_WRITE(SPRSCALE(pipe), sprscale);
 	I915_WRITE(SPRCTL(pipe), sprctl);
-	I915_MODIFY_DISPBASE(SPRSURF(pipe), obj->gtt_offset + sprsurf_offset);
+	I915_MODIFY_DISPBASE(SPRSURF(pipe),
+			     obj->gtt_space->start + sprsurf_offset);
 	POSTING_READ(SPRSURF(pipe));
 
 	/* potentially re-enable LP watermarks */
@@ -478,7 +479,8 @@ ilk_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
 	I915_WRITE(DVSSIZE(pipe), (crtc_h << 16) | crtc_w);
 	I915_WRITE(DVSSCALE(pipe), dvsscale);
 	I915_WRITE(DVSCNTR(pipe), dvscntr);
-	I915_MODIFY_DISPBASE(DVSSURF(pipe), obj->gtt_offset + dvssurf_offset);
+	I915_MODIFY_DISPBASE(DVSSURF(pipe),
+			     obj->gtt_space->start + dvssurf_offset);
 	POSTING_READ(DVSSURF(pipe));
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 29/66] drm: pre allocate node for create_block
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (27 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 28/66] drm/i915: Remove object's gtt_offset Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 12:34   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 30/66] drm/i915: Getter/setter for object attributes Ben Widawsky
                   ` (38 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

For an upcoming patch where we introduce the i915 VMA, it's ideal to
have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
will break a bunch of code; among the affected spots are two callers of
drm_mm_create_block(), both related to stolen memory.

As a side note, this patch is able to leverage all the existing
drm_mm_put_block() calls because the node is still kzalloc'd. When the
aforementioned VMA code comes into play, that too has to change.
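
As a sketch of the new calling convention (error handling abbreviated),
the caller now owns the node's allocation and drm_mm_create_block()
merely fills it in:

	/* Old: drm_mm allocated and returned the node (NULL on error). */
	node = drm_mm_create_block(mm, start, size, flags);

	/* New: pre-allocate, then ask for the exact range; the function
	 * now returns 0 on success or -ENOSPC if the range is occupied.
	 */
	node = kzalloc(sizeof(*node), GFP_KERNEL);
	if (!node)
		return -ENOMEM;
	ret = drm_mm_create_block(mm, node, start, size);
	if (ret)
		kfree(node);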

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/drm_mm.c               | 16 +++++-----------
 drivers/gpu/drm/i915/i915_drv.h        |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 20 ++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_stolen.c | 35 +++++++++++++++++++++++-----------
 include/drm/drm_mm.h                   |  9 ++++-----
 5 files changed, 49 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 7095328..a2dcfdb 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -158,12 +158,10 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 	}
 }
 
-struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
-					unsigned long start,
-					unsigned long size,
-					enum drm_mm_allocator_flags flags)
+int drm_mm_create_block(struct drm_mm *mm, struct drm_mm_node *node,
+			unsigned long start, unsigned long size)
 {
-	struct drm_mm_node *hole, *node;
+	struct drm_mm_node *hole;
 	unsigned long end = start + size;
 	unsigned long hole_start;
 	unsigned long hole_end;
@@ -172,10 +170,6 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
 		if (hole_start > start || hole_end < end)
 			continue;
 
-		node = drm_mm_kmalloc(mm, flags & DRM_MM_CREATE_ATOMIC);
-		if (unlikely(node == NULL))
-			return NULL;
-
 		node->start = start;
 		node->size = size;
 		node->mm = mm;
@@ -195,11 +189,11 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
 			node->hole_follows = 1;
 		}
 
-		return node;
+		return 0;
 	}
 
 	WARN(1, "no hole found for block 0x%lx + 0x%lx\n", start, size);
-	return NULL;
+	return -ENOSPC;
 }
 EXPORT_SYMBOL(drm_mm_create_block);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f6704d3..bc80ce0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1197,7 +1197,7 @@ enum hdmi_force_audio {
 	HDMI_AUDIO_ON,			/* force turn on HDMI audio */
 };
 
-#define I915_GTT_RESERVED ((struct drm_mm_node *)0x1)
+#define I915_GTT_RESERVED 0x1
 
 struct drm_i915_gem_object_ops {
 	/* Interface between the GEM object and its backing storage.
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index a45c00d..17e334f 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -679,14 +679,24 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		uintptr_t gtt_offset = (uintptr_t)obj->gtt_space;
+		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      obj->gtt_space->start, obj->base.size);
 
-		BUG_ON(obj->gtt_space != I915_GTT_RESERVED);
-		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
-						     obj->gtt_space->start,
-						     obj->base.size,
-						     false);
+		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
+		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
+		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
+		if (!obj->gtt_space) {
+			DRM_ERROR("Failed to preserve all objects\n");
+			break;
+		}
+		ret = drm_mm_create_block(&i915_gtt_vm->mm,
+					  obj->gtt_space,
+					  gtt_offset,
+					  obj->base.size);
+		if (ret)
+			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 7fba6f5..925f3b1 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -330,6 +330,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
+	int ret;
 
 	if (dev_priv->gtt.stolen_base == 0)
 		return NULL;
@@ -344,11 +345,15 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (WARN_ON(size == 0))
 		return NULL;
 
-	stolen = drm_mm_create_block(&dev_priv->gtt.stolen,
-				     stolen_offset, size,
-				     false);
-	if (stolen == NULL) {
+	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
+	if (!stolen)
+		return NULL;
+
+	ret = drm_mm_create_block(&dev_priv->gtt.stolen, stolen, stolen_offset,
+				  size);
+	if (ret) {
 		DRM_DEBUG_KMS("failed to allocate stolen space\n");
+		kfree(stolen);
 		return NULL;
 	}
 
@@ -369,18 +374,26 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 * later.
 	 */
 	if (drm_mm_initialized(&i915_gtt_vm->mm)) {
-		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
-						     gtt_offset, size,
-						     false);
-		if (obj->gtt_space == NULL) {
+		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
+		if (!obj->gtt_space) {
+			drm_gem_object_unreference(&obj->base);
+			return NULL;
+		}
+		ret = drm_mm_create_block(&i915_gtt_vm->mm, obj->gtt_space,
+					  gtt_offset, size);
+		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			drm_gem_object_unreference(&obj->base);
+			kfree(obj->gtt_space);
 			return NULL;
 		}
-	} else
-		obj->gtt_space = I915_GTT_RESERVED;
+		obj->gtt_space->start = gtt_offset;
+	} else {
+		/* NB: Safe because we assert page alignment */
+		obj->gtt_space = (struct drm_mm_node *)
+			((uintptr_t)gtt_offset | I915_GTT_RESERVED);
+	}
 
-	obj->gtt_space->start = gtt_offset;
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 8935710..0cfb06c 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -161,11 +161,10 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
 /*
  * Basic range manager support (drm_mm.c)
  */
-extern struct drm_mm_node *
-drm_mm_create_block(struct drm_mm *mm,
-		    unsigned long start,
-		    unsigned long size,
-		    enum drm_mm_allocator_flags flags);
+extern int drm_mm_create_block(struct drm_mm *mm,
+			       struct drm_mm_node *node,
+			       unsigned long start,
+			       unsigned long size);
 extern struct drm_mm_node *
 drm_mm_get_block_generic(struct drm_mm_node *node,
 			 unsigned long size,
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (28 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 29/66] drm: pre allocate node for create_block Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 13:00   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 31/66] drm/i915: Create VMAs (part 1) Ben Widawsky
                   ` (37 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This will be handy when we add VMs. It's not strictly necessary, but it
will make the code much cleaner.
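
As a hedged sketch of the intent: once callers go through accessors,
later patches can change where the offset/size actually live (e.g. per
VM) in one place instead of at every gtt_space dereference:

	/* Today the accessors simply wrap the single GGTT binding. */
	if (i915_gem_obj_bound(obj))
		offset = i915_gem_obj_offset(obj);

	/* Speculative future form: the same call sites grow a vm
	 * argument, e.g. i915_gem_obj_offset(obj, vm), with no other
	 * change in the callers' logic.
	 */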

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c        | 26 +++++------
 drivers/gpu/drm/i915/i915_drv.h            | 21 +++++++++
 drivers/gpu/drm/i915/i915_gem.c            | 69 +++++++++++++++---------------
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 20 +++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 12 +++---
 drivers/gpu/drm/i915/i915_gem_tiling.c     | 14 +++---
 drivers/gpu/drm/i915/i915_irq.c            | 16 +++----
 drivers/gpu/drm/i915/i915_trace.h          |  8 ++--
 drivers/gpu/drm/i915/intel_display.c       | 22 +++++-----
 drivers/gpu/drm/i915/intel_fb.c            |  6 +--
 drivers/gpu/drm/i915/intel_overlay.c       | 15 ++++---
 drivers/gpu/drm/i915/intel_pm.c            |  7 +--
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 12 +++---
 drivers/gpu/drm/i915/intel_sprite.c        |  6 +--
 15 files changed, 143 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 3d3e770..87f813e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -122,10 +122,10 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (pinned x %d)", obj->pin_count);
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
-	if (obj->gtt_space != NULL)
-		seq_printf(m, " (gtt offset: %08lx, size: %08x)",
-			   obj->gtt_space->start,
-			   (unsigned int)obj->gtt_space->size);
+	if (i915_gem_obj_bound(obj))
+		seq_printf(m, " (gtt offset: %08lx, size: %08lx)",
+			   i915_gem_obj_offset(obj),
+			   i915_gem_obj_size(obj));
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
@@ -176,7 +176,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 		describe_obj(m, obj);
 		seq_printf(m, "\n");
 		total_obj_size += obj->base.size;
-		total_gtt_size += obj->gtt_space->size;
+		total_gtt_size += i915_gem_obj_size(obj);
 		count++;
 	}
 	mutex_unlock(&dev->struct_mutex);
@@ -188,10 +188,10 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 
 #define count_objects(list, member) do { \
 	list_for_each_entry(obj, list, member) { \
-		size += obj->gtt_space->size; \
+		size += i915_gem_obj_size(obj); \
 		++count; \
 		if (obj->map_and_fenceable) { \
-			mappable_size += obj->gtt_space->size; \
+			mappable_size += i915_gem_obj_size(obj); \
 			++mappable_count; \
 		} \
 	} \
@@ -268,11 +268,11 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	size = count = mappable_size = mappable_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		if (obj->fault_mappable) {
-			size += obj->gtt_space->size;
+			size += i915_gem_obj_size(obj);
 			++count;
 		}
 		if (obj->pin_mappable) {
-			mappable_size += obj->gtt_space->size;
+			mappable_size += i915_gem_obj_size(obj);
 			++mappable_count;
 		}
 		if (obj->madv == I915_MADV_DONTNEED) {
@@ -334,7 +334,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void* data)
 		describe_obj(m, obj);
 		seq_printf(m, "\n");
 		total_obj_size += obj->base.size;
-		total_gtt_size += obj->gtt_space->size;
+		total_gtt_size += i915_gem_obj_size(obj);
 		count++;
 	}
 
@@ -380,12 +380,14 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 			if (work->old_fb_obj) {
 				struct drm_i915_gem_object *obj = work->old_fb_obj;
 				if (obj)
-					seq_printf(m, "Old framebuffer gtt_offset 0x%08lx\n", obj->gtt_space->start);
+					seq_printf(m, "Old framebuffer gtt_offset 0x%08lx\n",
+						   i915_gem_obj_offset(obj));
 			}
 			if (work->pending_flip_obj) {
 				struct drm_i915_gem_object *obj = work->pending_flip_obj;
 				if (obj)
-					seq_printf(m, "New framebuffer gtt_offset 0x%08lx\n", obj->gtt_space->start);
+					seq_printf(m, "New framebuffer gtt_offset 0x%08lx\n",
+						   i915_gem_obj_offset(obj));
 			}
 		}
 		spin_unlock_irqrestore(&dev->event_lock, flags);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bc80ce0..56d47bc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1349,6 +1349,27 @@ struct drm_i915_gem_object {
 
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
+static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
+{
+	return o->gtt_space->start;
+}
+
+static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
+{
+	return o->gtt_space != NULL;
+}
+
+static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
+{
+	return o->gtt_space->size;
+}
+
+static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+					  enum i915_cache_level color)
+{
+	o->gtt_space->color = color;
+}
+
 /**
  * Request queue structure.
  *
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d747a1f..dd2228d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -135,7 +135,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return obj->gtt_space && !obj->active;
+	return i915_gem_obj_bound(obj) && !obj->active;
 }
 
 int
@@ -178,7 +178,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 	mutex_lock(&dev->struct_mutex);
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
 		if (obj->pin_count)
-			pinned += obj->gtt_space->size;
+			pinned += i915_gem_obj_size(obj);
 	mutex_unlock(&dev->struct_mutex);
 
 	args->aper_size = i915_gtt_vm->total;
@@ -422,7 +422,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		if (obj->gtt_space) {
+		if (i915_gem_obj_bound(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, false);
 			if (ret)
 				return ret;
@@ -609,7 +609,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	user_data = to_user_ptr(args->data_ptr);
 	remain = args->size;
 
-	offset = obj->gtt_space->start + args->offset;
+	offset = i915_gem_obj_offset(obj) + args->offset;
 
 	while (remain > 0) {
 		/* Operation in this page
@@ -739,7 +739,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		if (obj->gtt_space) {
+		if (i915_gem_obj_bound(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, true);
 			if (ret)
 				return ret;
@@ -1361,7 +1361,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	obj->fault_mappable = true;
 
-	pfn += (obj->gtt_space->start >> PAGE_SHIFT) + page_offset;
+	pfn += (i915_gem_obj_offset(obj) >> PAGE_SHIFT) + page_offset;
 
 	/* Finally, remap it using the new GTT offset */
 	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
@@ -1667,7 +1667,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->pages == NULL)
 		return 0;
 
-	BUG_ON(obj->gtt_space);
+	BUG_ON(i915_gem_obj_bound(obj));
 
 	if (obj->pages_pin_count)
 		return -EBUSY;
@@ -2587,7 +2587,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
 	int ret;
 
-	if (obj->gtt_space == NULL)
+	if (!i915_gem_obj_bound(obj))
 		return 0;
 
 	if (obj->pin_count)
@@ -2669,11 +2669,11 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 	}
 
 	if (obj) {
-		u32 size = obj->gtt_space->size;
+		u32 size = i915_gem_obj_size(obj);
 
-		val = (uint64_t)((obj->gtt_space->start + size - 4096) &
+		val = (uint64_t)((i915_gem_obj_offset(obj) + size - 4096) &
 				 0xfffff000) << 32;
-		val |= obj->gtt_space->start & 0xfffff000;
+		val |= i915_gem_obj_offset(obj) & 0xfffff000;
 		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
@@ -2693,15 +2693,15 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 	u32 val;
 
 	if (obj) {
-		u32 size = obj->gtt_space->size;
+		u32 size = i915_gem_obj_size(obj);
 		int pitch_val;
 		int tile_width;
 
-		WARN((obj->gtt_space->start & ~I915_FENCE_START_MASK) ||
+		WARN((i915_gem_obj_offset(obj) & ~I915_FENCE_START_MASK) ||
 		     (size & -size) != size ||
-		     (obj->gtt_space->start & (size - 1)),
+		     (i915_gem_obj_offset(obj) & (size - 1)),
 		     "object 0x%08lx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
-		     obj->gtt_space->start, obj->map_and_fenceable, size);
+		     i915_gem_obj_offset(obj), obj->map_and_fenceable, size);
 
 		if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
 			tile_width = 128;
@@ -2712,7 +2712,7 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 		pitch_val = obj->stride / tile_width;
 		pitch_val = ffs(pitch_val) - 1;
 
-		val = obj->gtt_space->start;
+		val = i915_gem_obj_offset(obj);
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I915_FENCE_SIZE_BITS(size);
@@ -2737,19 +2737,19 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 	uint32_t val;
 
 	if (obj) {
-		u32 size = obj->gtt_space->size;
+		u32 size = i915_gem_obj_size(obj);
 		uint32_t pitch_val;
 
-		WARN((obj->gtt_space->start & ~I830_FENCE_START_MASK) ||
+		WARN((i915_gem_obj_offset(obj) & ~I830_FENCE_START_MASK) ||
 		     (size & -size) != size ||
-		     (obj->gtt_space->start & (size - 1)),
+		     (i915_gem_obj_offset(obj) & (size - 1)),
 		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
-		     obj->gtt_space->start, size);
+		     i915_gem_obj_offset(obj), size);
 
 		pitch_val = obj->stride / 128;
 		pitch_val = ffs(pitch_val) - 1;
 
-		val = obj->gtt_space->start;
+		val = i915_gem_obj_offset(obj);
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I830_FENCE_SIZE_BITS(size);
@@ -3030,6 +3030,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 	int err = 0;
 
 	list_for_each_entry(obj, &dev_priv->mm.gtt_list, global_list) {
+		unsigned long obj_offset = i915_gem_obj_offset(obj);
 		if (obj->gtt_space == NULL) {
 			printk(KERN_ERR "object found on GTT list with no space reserved\n");
 			err++;
@@ -3038,8 +3039,8 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 
 		if (obj->cache_level != obj->gtt_space->color) {
 			printk(KERN_ERR "object reserved space [%08lx, %08lx] with wrong color, cache_level=%x, color=%lx\n",
-			       obj->gtt_space->start,
-			       obj->gtt_space->start + obj->gtt_space->size,
+			       obj_offset,
+			       obj_offset + i915_gem_obj_size(obj),
 			       obj->cache_level,
 			       obj->gtt_space->color);
 			err++;
@@ -3050,8 +3051,8 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 					      obj->gtt_space,
 					      obj->cache_level)) {
 			printk(KERN_ERR "invalid GTT space found at [%08lx, %08lx] - color=%x\n",
-			       obj->gtt_space->start,
-			       obj->gtt_space->start + obj->gtt_space->size,
+			       obj_offset,
+			       obj_offset + i915_gem_obj_size(obj),
 			       obj->cache_level);
 			err++;
 			continue;
@@ -3267,7 +3268,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	int ret;
 
 	/* Not valid to be called on unbound objects. */
-	if (obj->gtt_space == NULL)
+	if (!i915_gem_obj_bound(obj))
 		return -EINVAL;
 
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
@@ -3332,7 +3333,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			return ret;
 	}
 
-	if (obj->gtt_space) {
+	if (i915_gem_obj_bound(obj)) {
 		ret = i915_gem_object_finish_gpu(obj);
 		if (ret)
 			return ret;
@@ -3355,7 +3356,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
 					       obj, cache_level);
 
-		obj->gtt_space->color = cache_level;
+		i915_gem_obj_set_color(obj, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3636,14 +3637,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 		return -EBUSY;
 
-	if (obj->gtt_space != NULL) {
-		if ((alignment && obj->gtt_space->start & (alignment - 1)) ||
+	if (i915_gem_obj_bound(obj)) {
+		if ((alignment && i915_gem_obj_offset(obj) & (alignment - 1)) ||
 		    (map_and_fenceable && !obj->map_and_fenceable)) {
 			WARN(obj->pin_count,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     obj->gtt_space->start, alignment,
+			     i915_gem_obj_offset(obj), alignment,
 			     map_and_fenceable,
 			     obj->map_and_fenceable);
 			ret = i915_gem_object_unbind(obj);
@@ -3652,7 +3653,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		}
 	}
 
-	if (obj->gtt_space == NULL) {
+	if (!i915_gem_obj_bound(obj)) {
 		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 
 		ret = i915_gem_object_bind_to_gtt(obj, alignment,
@@ -3678,7 +3679,7 @@ void
 i915_gem_object_unpin(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pin_count == 0);
-	BUG_ON(obj->gtt_space == NULL);
+	BUG_ON(!i915_gem_obj_bound(obj));
 
 	if (--obj->pin_count == 0)
 		obj->pin_mappable = false;
@@ -3728,7 +3729,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	 * as the X server doesn't manage domains yet
 	 */
 	i915_gem_object_flush_cpu_write_domain(obj);
-	args->offset = obj->gtt_space->start;
+	args->offset = i915_gem_obj_offset(obj);
 out:
 	drm_gem_object_unreference(&obj->base);
 unlock:
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 1e838f4..75b4e27 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -395,7 +395,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, new_context->obj->gtt_space->start |
+	intel_ring_emit(ring, i915_gem_obj_offset(new_context->obj) |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 67246a6..837372d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -188,7 +188,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		return -ENOENT;
 
 	target_i915_obj = to_intel_bo(target_obj);
-	target_offset = target_i915_obj->gtt_space->start;
+	target_offset = i915_gem_obj_offset(target_i915_obj);
 
 	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
 	 * pipe_control writes because the gpu doesn't properly redirect them
@@ -280,7 +280,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 			return ret;
 
 		/* Map the page containing the relocation we're going to perform.  */
-		reloc->offset += obj->gtt_space->start;
+		reloc->offset += i915_gem_obj_offset(obj);
 		reloc_page = io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
 						      reloc->offset & PAGE_MASK);
 		reloc_entry = (uint32_t __iomem *)
@@ -436,8 +436,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != obj->gtt_space->start) {
-		entry->offset = obj->gtt_space->start;
+	if (entry->offset != i915_gem_obj_offset(obj)) {
+		entry->offset = i915_gem_obj_offset(obj);
 		*need_reloc = true;
 	}
 
@@ -458,7 +458,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!obj->gtt_space)
+	if (!i915_gem_obj_bound(obj))
 		return;
 
 	entry = obj->exec_entry;
@@ -528,11 +528,13 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 		/* Unbind any ill-fitting objects or pin. */
 		list_for_each_entry(obj, objects, exec_list) {
 			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
+			unsigned long obj_offset;
 			bool need_fence, need_mappable;
 
-			if (!obj->gtt_space)
+			if (!i915_gem_obj_bound(obj))
 				continue;
 
+			obj_offset = i915_gem_obj_offset(obj);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
@@ -540,7 +542,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			need_mappable = need_fence || need_reloc_mappable(obj);
 
 			if ((entry->alignment &&
-			     obj->gtt_space->start & (entry->alignment - 1)) ||
+			     obj_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
 				ret = i915_gem_object_unbind(obj);
 			else
@@ -551,7 +553,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 
 		/* Bind fresh objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			if (obj->gtt_space)
+			if (i915_gem_obj_bound(obj))
 				continue;
 
 			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
@@ -1072,7 +1074,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = batch_obj->gtt_space->start + args->batch_start_offset;
+	exec_start = i915_gem_obj_offset(batch_obj) + args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 17e334f..566ab76 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -390,7 +390,7 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    enum i915_cache_level cache_level)
 {
 	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   obj->gtt_space->start >> PAGE_SHIFT,
+				   i915_gem_obj_offset(obj) >> PAGE_SHIFT,
 				   cache_level);
 }
 
@@ -398,7 +398,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
 	ppgtt->base.clear_range(&ppgtt->base,
-				obj->gtt_space->start >> PAGE_SHIFT,
+				i915_gem_obj_offset(obj) >> PAGE_SHIFT,
 				obj->base.size >> PAGE_SHIFT);
 }
 
@@ -570,9 +570,10 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned long obj_offset = i915_gem_obj_offset(obj);
 
 	i915_gtt_vm->insert_entries(&dev_priv->gtt.base, obj->pages,
-					  obj->gtt_space->start >> PAGE_SHIFT,
+					  obj_offset >> PAGE_SHIFT,
 					  cache_level);
 
 	obj->has_global_gtt_mapping = 1;
@@ -582,9 +583,10 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned long obj_offset = i915_gem_obj_offset(obj);
 
 	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
-				       obj->gtt_space->start >> PAGE_SHIFT,
+				       obj_offset >> PAGE_SHIFT,
 				       obj->base.size >> PAGE_SHIFT);
 
 	obj->has_global_gtt_mapping = 0;
@@ -682,7 +684,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 		uintptr_t gtt_offset = (uintptr_t)obj->gtt_space;
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
-			      obj->gtt_space->start, obj->base.size);
+			      i915_gem_obj_offset(obj), obj->base.size);
 
 		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
 		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 7aab12a..2478114 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -268,18 +268,18 @@ i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
 		return true;
 
 	if (INTEL_INFO(obj->base.dev)->gen == 3) {
-		if (obj->gtt_space->start & ~I915_FENCE_START_MASK)
+		if (i915_gem_obj_offset(obj) & ~I915_FENCE_START_MASK)
 			return false;
 	} else {
-		if (obj->gtt_space->start & ~I830_FENCE_START_MASK)
+		if (i915_gem_obj_offset(obj) & ~I830_FENCE_START_MASK)
 			return false;
 	}
 
 	size = i915_gem_get_gtt_size(obj->base.dev, obj->base.size, tiling_mode);
-	if (obj->gtt_space->size != size)
+	if (i915_gem_obj_size(obj) != size)
 		return false;
 
-	if (obj->gtt_space->start & (size - 1))
+	if (i915_gem_obj_offset(obj) & (size - 1))
 		return false;
 
 	return true;
@@ -358,8 +358,8 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		 * whilst executing a fenced command for an untiled object.
 		 */
 
-		obj->map_and_fenceable = obj->gtt_space == NULL ||
-			(obj->gtt_space->start +
+		obj->map_and_fenceable = !i915_gem_obj_bound(obj) ||
+			(i915_gem_obj_offset(obj) +
 			 obj->base.size <= dev_priv->gtt.mappable_end &&
 			 i915_gem_object_fence_ok(obj, args->tiling_mode));
 
@@ -369,7 +369,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 				i915_gem_get_gtt_alignment(dev, obj->base.size,
 							    args->tiling_mode,
 							    false);
-			if (obj->gtt_space->start & (unfenced_alignment - 1))
+			if (i915_gem_obj_offset(obj) & (unfenced_alignment - 1))
 				ret = i915_gem_object_unbind(obj);
 		}
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2c4fe36..c0be641 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1512,7 +1512,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 	if (dst == NULL)
 		return NULL;
 
-	reloc_offset = src->gtt_space->start;
+	reloc_offset = i915_gem_obj_offset(src);
 	for (i = 0; i < num_pages; i++) {
 		unsigned long flags;
 		void *d;
@@ -1564,7 +1564,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = num_pages;
-	dst->gtt_offset = src->gtt_space->start;
+	dst->gtt_offset = i915_gem_obj_offset(src);
 
 	return dst;
 
@@ -1618,7 +1618,7 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 	err->name = obj->base.name;
 	err->rseqno = obj->last_read_seqno;
 	err->wseqno = obj->last_write_seqno;
-	err->gtt_offset = obj->gtt_space->start;
+	err->gtt_offset = i915_gem_obj_offset(obj);
 	err->read_domains = obj->base.read_domains;
 	err->write_domain = obj->base.write_domain;
 	err->fence_reg = obj->fence_reg;
@@ -1716,8 +1716,8 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			return NULL;
 
 		obj = ring->private;
-		if (acthd >= obj->gtt_space->start &&
-		    acthd < obj->gtt_space->start + obj->base.size)
+		if (acthd >= i915_gem_obj_offset(obj) &&
+		    acthd < i915_gem_obj_offset(obj) + obj->base.size)
 			return i915_error_object_create(dev_priv, obj);
 	}
 
@@ -1798,7 +1798,7 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 		return;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if ((error->ccid & PAGE_MASK) == obj->gtt_space->start) {
+		if ((error->ccid & PAGE_MASK) == i915_gem_obj_offset(obj)) {
 			ering->ctx = i915_error_object_create_sized(dev_priv,
 								    obj, 1);
 		}
@@ -2152,10 +2152,10 @@ static void __always_unused i915_pageflip_stall_check(struct drm_device *dev, in
 	if (INTEL_INFO(dev)->gen >= 4) {
 		int dspsurf = DSPSURF(intel_crtc->plane);
 		stall_detected = I915_HI_DISPBASE(I915_READ(dspsurf)) ==
-					obj->gtt_space->start;
+					i915_gem_obj_offset(obj);
 	} else {
 		int dspaddr = DSPADDR(intel_crtc->plane);
-		stall_detected = I915_READ(dspaddr) == (obj->gtt_space->start +
+		stall_detected = I915_READ(dspaddr) == (i915_gem_obj_offset(obj) +
 							crtc->y * crtc->fb->pitches[0] +
 							crtc->x * crtc->fb->bits_per_pixel/8);
 	}
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 3db4a68..e4dccb3 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -46,8 +46,8 @@ TRACE_EVENT(i915_gem_object_bind,
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = obj->gtt_space->start;
-			   __entry->size = obj->gtt_space->size;
+			   __entry->offset = i915_gem_obj_offset(obj);
+			   __entry->size = i915_gem_obj_size(obj);
 			   __entry->mappable = mappable;
 			   ),
 
@@ -68,8 +68,8 @@ TRACE_EVENT(i915_gem_object_unbind,
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = obj->gtt_space->start;
-			   __entry->size = obj->gtt_space->size;
+			   __entry->offset = i915_gem_obj_offset(obj);
+			   __entry->size = i915_gem_obj_size(obj);
 			   ),
 
 	    TP_printk("obj=%p, offset=%08x size=%x",
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a269d7a..633bfbf 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1943,18 +1943,18 @@ static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 	}
 
 	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
-		      obj->gtt_space->start, linear_offset, x, y,
+		      i915_gem_obj_offset(obj), linear_offset, x, y,
 		      fb->pitches[0]);
 	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
 	if (INTEL_INFO(dev)->gen >= 4) {
 		I915_MODIFY_DISPBASE(DSPSURF(plane),
-				     obj->gtt_space->start +
+				     i915_gem_obj_offset(obj) +
 				     intel_crtc->dspaddr_offset);
 		I915_WRITE(DSPTILEOFF(plane), (y << 16) | x);
 		I915_WRITE(DSPLINOFF(plane), linear_offset);
 	} else
 		I915_WRITE(DSPADDR(plane),
-			   obj->gtt_space->start + linear_offset);
+			   i915_gem_obj_offset(obj) + linear_offset);
 	POSTING_READ(reg);
 
 	return 0;
@@ -2035,11 +2035,11 @@ static int ironlake_update_plane(struct drm_crtc *crtc,
 	linear_offset -= intel_crtc->dspaddr_offset;
 
 	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
-		      obj->gtt_space->start, linear_offset, x, y,
+		      i915_gem_obj_offset(obj), linear_offset, x, y,
 		      fb->pitches[0]);
 	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
 	I915_MODIFY_DISPBASE(DSPSURF(plane),
-			     obj->gtt_space->start+intel_crtc->dspaddr_offset);
+			     i915_gem_obj_offset(obj)+intel_crtc->dspaddr_offset);
 	if (IS_HASWELL(dev)) {
 		I915_WRITE(DSPOFFSET(plane), (y << 16) | x);
 	} else {
@@ -6558,7 +6558,7 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
 			goto fail_unpin;
 		}
 
-		addr = obj->gtt_space->start;
+		addr = i915_gem_obj_offset(obj);
 	} else {
 		int align = IS_I830(dev) ? 16 * 1024 : 256;
 		ret = i915_gem_attach_phys_object(dev, obj,
@@ -7274,7 +7274,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			obj->gtt_space->start + intel_crtc->dspaddr_offset);
+			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
@@ -7316,7 +7316,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			obj->gtt_space->start + intel_crtc->dspaddr_offset);
+			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
@@ -7356,7 +7356,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			(obj->gtt_space->start + intel_crtc->dspaddr_offset) |
+			(i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset) |
 			obj->tiling_mode);
 
 	/* XXX Enabling the panel-fitter across page-flip is so far
@@ -7400,7 +7400,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
 	intel_ring_emit(ring,
-			obj->gtt_space->start + intel_crtc->dspaddr_offset);
+			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
 
 	/* Contrary to the suggestions in the documentation,
 	 * "Enable Panel Fitter" does not seem to be required when page
@@ -7466,7 +7466,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
 	intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
 	intel_ring_emit(ring,
-			obj->gtt_space->start + intel_crtc->dspaddr_offset);
+			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 242a793..8315a5e 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -139,11 +139,11 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->apertures->ranges[0].base = dev->mode_config.fb_base;
 	info->apertures->ranges[0].size = dev_priv->gtt.mappable_end;
 
-	info->fix.smem_start = dev->mode_config.fb_base + obj->gtt_space->start;
+	info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_offset(obj);
 	info->fix.smem_len = size;
 
 	info->screen_base =
-		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_space->start,
+		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_offset(obj),
 			   size);
 	if (!info->screen_base) {
 		ret = -ENOSPC;
@@ -168,7 +168,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 
 	DRM_DEBUG_KMS("allocated %dx%d fb: 0x%08lx, bo %p\n",
 		      fb->width, fb->height,
-		      obj->gtt_space->start, obj);
+		      i915_gem_obj_offset(obj), obj);
 
 
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 93f2671..41654b1 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -196,7 +196,7 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_obj->handle->vaddr;
 	else
 		regs = io_mapping_map_wc(dev_priv->gtt.mappable,
-					 overlay->reg_bo->gtt_space->start);
+					 i915_gem_obj_offset(overlay->reg_bo));
 
 	return regs;
 }
@@ -740,7 +740,8 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	swidth = params->src_w;
 	swidthsw = calc_swidthsw(overlay->dev, params->offset_Y, tmp_width);
 	sheight = params->src_h;
-	iowrite32(new_bo->gtt_space->start + params->offset_Y, &regs->OBUF_0Y);
+	iowrite32(i915_gem_obj_offset(new_bo) + params->offset_Y,
+		  &regs->OBUF_0Y);
 	ostride = params->stride_Y;
 
 	if (params->format & I915_OVERLAY_YUV_PLANAR) {
@@ -754,9 +755,9 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 				      params->src_w/uv_hscale);
 		swidthsw |= max_t(u32, tmp_U, tmp_V) << 16;
 		sheight |= (params->src_h/uv_vscale) << 16;
-		iowrite32(new_bo->gtt_space->start + params->offset_U,
+		iowrite32(i915_gem_obj_offset(new_bo) + params->offset_U,
 			  &regs->OBUF_0U);
-		iowrite32(new_bo->gtt_space->start + params->offset_V,
+		iowrite32(i915_gem_obj_offset(new_bo) + params->offset_V,
 			  &regs->OBUF_0V);
 		ostride |= params->stride_UV << 16;
 	}
@@ -1357,7 +1358,7 @@ void intel_setup_overlay(struct drm_device *dev)
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
 		}
-		overlay->flip_addr = reg_bo->gtt_space->start;
+		overlay->flip_addr = i915_gem_obj_offset(reg_bo);
 
 		ret = i915_gem_object_set_to_gtt_domain(reg_bo, true);
 		if (ret) {
@@ -1436,7 +1437,7 @@ intel_overlay_map_regs_atomic(struct intel_overlay *overlay)
 			overlay->reg_bo->phys_obj->handle->vaddr;
 
 	return io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
-					overlay->reg_bo->gtt_space->start);
+					i915_gem_obj_offset(overlay->reg_bo));
 }
 
 static void intel_overlay_unmap_regs_atomic(struct intel_overlay *overlay,
@@ -1467,7 +1468,7 @@ intel_overlay_capture_error_state(struct drm_device *dev)
 	if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
 		error->base = (__force long)overlay->reg_bo->phys_obj->handle->vaddr;
 	else
-		error->base = overlay->reg_bo->gtt_space->start;
+		error->base = i915_gem_obj_offset(overlay->reg_bo);
 
 	regs = intel_overlay_map_regs_atomic(overlay);
 	if (!regs)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 73c0ee1..504d96b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -217,7 +217,7 @@ static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 		   (stall_watermark << DPFC_RECOMP_STALL_WM_SHIFT) |
 		   (interval << DPFC_RECOMP_TIMER_COUNT_SHIFT));
 	I915_WRITE(ILK_DPFC_FENCE_YOFF, crtc->y);
-	I915_WRITE(ILK_FBC_RT_BASE, obj->gtt_space->start | ILK_FBC_RT_VALID);
+	I915_WRITE(ILK_FBC_RT_BASE, i915_gem_obj_offset(obj) | ILK_FBC_RT_VALID);
 	/* enable it... */
 	I915_WRITE(ILK_DPFC_CONTROL, dpfc_ctl | DPFC_CTL_EN);
 
@@ -3685,7 +3685,7 @@ static void ironlake_enable_rc6(struct drm_device *dev)
 
 	intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, dev_priv->ips.renderctx->gtt_space->start |
+	intel_ring_emit(ring, i915_gem_obj_offset(dev_priv->ips.renderctx) |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
@@ -3708,7 +3708,8 @@ static void ironlake_enable_rc6(struct drm_device *dev)
 		return;
 	}
 
-	I915_WRITE(PWRCTXA, dev_priv->ips.pwrctx->gtt_space->start | PWRCTX_EN);
+	I915_WRITE(PWRCTXA, i915_gem_obj_offset(dev_priv->ips.pwrctx) |
+			    PWRCTX_EN);
 	I915_WRITE(RSTDBYCTL, I915_READ(RSTDBYCTL) & ~RCX_SW_EXIT);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c4c80c2..64b579f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -424,14 +424,14 @@ static int init_ring_common(struct intel_ring_buffer *ring)
 	 * registers with the above sequence (the readback of the HEAD registers
 	 * also enforces ordering), otherwise the hw might lose the new ring
 	 * register values. */
-	I915_WRITE_START(ring, obj->gtt_space->start);
+	I915_WRITE_START(ring, i915_gem_obj_offset(obj));
 	I915_WRITE_CTL(ring,
 			((ring->size - PAGE_SIZE) & RING_NR_PAGES)
 			| RING_VALID);
 
 	/* If the head is still not zero, the ring is dead */
 	if (wait_for((I915_READ_CTL(ring) & RING_VALID) != 0 &&
-		     I915_READ_START(ring) == obj->gtt_space->start &&
+		     I915_READ_START(ring) == i915_gem_obj_offset(obj) &&
 		     (I915_READ_HEAD(ring) & HEAD_ADDR) == 0, 50)) {
 		DRM_ERROR("%s initialization failed "
 				"ctl %08x head %08x tail %08x start %08x\n",
@@ -489,7 +489,7 @@ init_pipe_control(struct intel_ring_buffer *ring)
 	if (ret)
 		goto err_unref;
 
-	pc->gtt_offset = obj->gtt_space->start;
+	pc->gtt_offset = i915_gem_obj_offset(obj);
 	pc->cpu_page = kmap(sg_page(obj->pages->sgl));
 	if (pc->cpu_page == NULL) {
 		ret = -ENOMEM;
@@ -1129,7 +1129,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
 		intel_ring_advance(ring);
 	} else {
 		struct drm_i915_gem_object *obj = ring->private;
-		u32 cs_offset = obj->gtt_space->start;
+		u32 cs_offset = i915_gem_obj_offset(obj);
 
 		if (len > I830_BATCH_LIMIT)
 			return -ENOSPC;
@@ -1214,7 +1214,7 @@ static int init_status_page(struct intel_ring_buffer *ring)
 		goto err_unref;
 	}
 
-	ring->status_page.gfx_addr = obj->gtt_space->start;
+	ring->status_page.gfx_addr = i915_gem_obj_offset(obj);
 	ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
 	if (ring->status_page.page_addr == NULL) {
 		ret = -ENOMEM;
@@ -1308,7 +1308,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 		goto err_unpin;
 
 	ring->virtual_start =
-		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_space->start,
+		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_offset(obj),
 			   ring->size);
 	if (ring->virtual_start == NULL) {
 		DRM_ERROR("Failed to map ringbuffer.\n");
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index c342571..117a2f8 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -133,7 +133,7 @@ vlv_update_plane(struct drm_plane *dplane, struct drm_framebuffer *fb,
 
 	I915_WRITE(SPSIZE(pipe, plane), (crtc_h << 16) | crtc_w);
 	I915_WRITE(SPCNTR(pipe, plane), sprctl);
-	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), obj->gtt_space->start +
+	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), i915_gem_obj_offset(obj) +
 			     sprsurf_offset);
 	POSTING_READ(SPSURF(pipe, plane));
 }
@@ -309,7 +309,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
 		I915_WRITE(SPRSCALE(pipe), sprscale);
 	I915_WRITE(SPRCTL(pipe), sprctl);
 	I915_MODIFY_DISPBASE(SPRSURF(pipe),
-			     obj->gtt_space->start + sprsurf_offset);
+			     i915_gem_obj_offset(obj) + sprsurf_offset);
 	POSTING_READ(SPRSURF(pipe));
 
 	/* potentially re-enable LP watermarks */
@@ -480,7 +480,7 @@ ilk_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
 	I915_WRITE(DVSSCALE(pipe), dvsscale);
 	I915_WRITE(DVSCNTR(pipe), dvscntr);
 	I915_MODIFY_DISPBASE(DVSSURF(pipe),
-			     obj->gtt_space->start + dvssurf_offset);
+			     i915_gem_obj_offset(obj) + dvssurf_offset);
 	POSTING_READ(DVSSURF(pipe));
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread
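
The conversion in the patch above is mechanical; a minimal sketch of
the resulting pattern, using a hypothetical helper that is not part of
the patch:

	/* Before, callers poked at the drm_mm_node directly:
	 *     return obj->gtt_space ? obj->gtt_space->start : 0;
	 * After, they go through the accessors, so the backing storage
	 * can become a per-VM VMA later without touching call sites. */
	static unsigned long example_get_offset(struct drm_i915_gem_object *obj)
	{
		return i915_gem_obj_bound(obj) ? i915_gem_obj_offset(obj) : 0;
	}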

* [PATCH 31/66] drm/i915: Create VMAs (part 1)
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (29 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 30/66] drm/i915: Getter/setter for object attributes Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 32/66] drm/i915: Create VMAs (part 2) - kill gtt space Ben Widawsky
                   ` (36 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Create the VMA, but leave the old obj->gtt_space in place. This
primarily just introduces the basic infrastructure, and helps check
for leaks.

BISECT WARNING: This patch was not meant for bisection. If it does end
up upstream, it should be included in the 3-part series for creating
the VMA.

v2: s/i915_obj/i915_gem_obj/ (Chris)

v3: Only move an object to the now-global unbound list if there are no
more VMAs for the object still bound into a VM (i.e. the object's
vma_list is empty).
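
To make the object/VMA relationship concrete, here is a condensed
sketch of the unbind path this patch ends up with (simplified from the
i915_gem.c hunk below; the gtt_space teardown and a FIXME are trimmed):

	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);

	list_del(&vma->vma_link);
	i915_gem_vma_destroy(vma);

	/* The unbound list is global, not per-VM, so the object only
	 * moves there once no VM maps it any more. */
	if (list_empty(&obj->vma_list))
		list_move_tail(&obj->global_list,
			       &dev_priv->mm.unbound_list);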

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h        | 30 ++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem.c        | 54 ++++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_gem_evict.c  |  8 ++++-
 drivers/gpu/drm/i915/i915_gem_gtt.c    |  3 ++
 drivers/gpu/drm/i915/i915_gem_stolen.c | 13 ++++++++
 5 files changed, 104 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 56d47bc..bd4640a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -539,6 +539,19 @@ struct i915_hw_ppgtt {
 	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
 };
 
+/* To make things as simple as possible (i.e. no refcounting), a VMA's lifetime
+ * will always be <= an object's lifetime. So object refcounting should cover us.
+ */
+struct i915_vma {
+	struct i915_address_space *vm;
+	struct drm_i915_gem_object *obj;
+	struct drm_mm_node node;
+	/* Page aligned offset (helper for stolen) */
+	unsigned long deferred_offset;
+
+	struct list_head vma_link; /* Link in the object's VMA list */
+};
+
 struct i915_ctx_hang_stats {
 	/* This context had batch pending when hang was declared */
 	unsigned batch_pending;
@@ -1222,8 +1235,9 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
-	/** Current space allocated to this object in the GTT, if any. */
 	struct drm_mm_node *gtt_space;
+	struct list_head vma_list;
+
 	/** Stolen memory for this object, instead of being backed by shmem. */
 	struct drm_mm_node *stolen;
 	struct list_head global_list;
@@ -1351,6 +1365,7 @@ struct drm_i915_gem_object {
 
 static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
 {
+	BUG_ON(list_empty(&o->vma_list));
 	return o->gtt_space->start;
 }
 
@@ -1361,6 +1376,7 @@ static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
 
 static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
 {
+	BUG_ON(list_empty(&o->vma_list));
 	return o->gtt_space->size;
 }
 
@@ -1370,6 +1386,16 @@ static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
 	o->gtt_space->color = color;
 }
 
+/* This is a temporary define to help transition us to real VMAs. If you see
+ * this, you're either reviewing code, or bisecting it. */
+static inline struct i915_vma *
+__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
+{
+	BUG_ON(!i915_gem_obj_bound(obj));
+	BUG_ON(list_empty(&obj->vma_list));
+	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
+}
+
 /**
  * Request queue structure.
  *
@@ -1680,6 +1706,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
+struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj);
+void i915_gem_vma_destroy(struct i915_vma *vma);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dd2228d..a41b2f1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2585,6 +2585,7 @@ int
 i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 {
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
+	struct i915_vma *vma;
 	int ret;
 
 	if (!i915_gem_obj_bound(obj))
@@ -2622,13 +2623,22 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	i915_gem_object_unpin_pages(obj);
 
 	list_del(&obj->mm_list);
-	list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
+	vma = __i915_gem_obj_to_vma(obj);
+	list_del(&vma->vma_link);
+	/* FIXME: drm_mm_remove_node(&vma->node); */
+	i915_gem_vma_destroy(vma);
+
 	drm_mm_put_block(obj->gtt_space);
 	obj->gtt_space = NULL;
 
+	/* Since the unbound list is global, only move to that list if
+	 * no more VMAs exist */
+	if (list_empty(&obj->vma_list))
+		list_move_tail(&obj->global_list, &dev_priv->mm.unbound_list);
+
 	return 0;
 }
 
@@ -3079,8 +3089,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	bool mappable, fenceable;
 	size_t gtt_max = map_and_fenceable ?
 		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
+	struct i915_vma *vma;
 	int ret;
 
+	if (WARN_ON(!list_empty(&obj->vma_list)))
+		return -EBUSY;
+
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -3124,6 +3138,12 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		i915_gem_object_unpin_pages(obj);
 		return -ENOMEM;
 	}
+	vma = i915_gem_vma_create(obj);
+	if (vma == NULL) {
+		kfree(node);
+		i915_gem_object_unpin_pages(obj);
+		return -ENOMEM;
+	}
 
 search_free:
 	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm, node,
@@ -3160,6 +3180,9 @@ search_free:
 	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
 
 	obj->gtt_space = node;
+	vma->node.start = node->start;
+	vma->node.size = node->size;
+	list_add(&vma->vma_link, &obj->vma_list);
 
 	fenceable =
 		node->size == fence_size &&
@@ -3317,6 +3340,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_mm_node *node = NULL;
 	int ret;
 
 	if (obj->cache_level == cache_level)
@@ -3327,7 +3351,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 	}
 
-	if (!i915_gem_valid_gtt_space(dev, obj->gtt_space, cache_level)) {
+	if (i915_gem_obj_bound(obj)) {
+		node = obj->gtt_space;
+		BUG_ON(node->start != __i915_gem_obj_to_vma(obj)->node.start);
+	}
+
+	if (!i915_gem_valid_gtt_space(dev, node, cache_level)) {
 		ret = i915_gem_object_unbind(obj);
 		if (ret)
 			return ret;
@@ -3872,6 +3901,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->exec_list);
+	INIT_LIST_HEAD(&obj->vma_list);
 
 	obj->ops = ops;
 
@@ -3992,6 +4022,26 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	i915_gem_object_free(obj);
 }
 
+struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
+	if (vma == NULL)
+		return NULL;
+
+	INIT_LIST_HEAD(&vma->vma_link);
+	vma->vm = i915_gtt_vm;
+	vma->obj = obj;
+
+	return vma;
+}
+
+void i915_gem_vma_destroy(struct i915_vma *vma)
+{
+	WARN_ON(vma->node.allocated);
+	kfree(vma);
+}
+
 int
 i915_gem_idle(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 92856a2..0434c9e 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -38,6 +38,8 @@ mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
 		return false;
 
 	list_add(&obj->exec_list, unwind);
+	BUG_ON(__i915_gem_obj_to_vma(obj)->node.start !=
+	       i915_gem_obj_offset(obj));
 	return drm_mm_scan_add_block(obj->gtt_space);
 }
 
@@ -48,6 +50,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct list_head eviction_list, unwind_list;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
 
@@ -106,7 +109,8 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-
+		vma = __i915_gem_obj_to_vma(obj);
+		BUG_ON(vma->node.start != i915_gem_obj_offset(obj));
 		ret = drm_mm_scan_remove_block(obj->gtt_space);
 		BUG_ON(ret);
 
@@ -127,6 +131,8 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
+		vma = __i915_gem_obj_to_vma(obj);
+		BUG_ON(vma->node.start != i915_gem_obj_offset(obj));
 		if (drm_mm_scan_remove_block(obj->gtt_space)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 566ab76..b59f846 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -687,6 +687,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 			      i915_gem_obj_offset(obj), obj->base.size);
 
 		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
+		BUG_ON((__i915_gem_obj_to_vma(obj)->deferred_offset
+			& I915_GTT_RESERVED) == 0);
 		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
 		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
 		if (!obj->gtt_space) {
@@ -700,6 +702,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
+		list_add(&__i915_gem_obj_to_vma(obj)->vma_link, &obj->vma_list);
 	}
 
 	i915_gtt_vm->start = start;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 925f3b1..6e22355 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -330,6 +330,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
+	struct i915_vma *vma;
 	int ret;
 
 	if (dev_priv->gtt.stolen_base == 0)
@@ -368,6 +369,12 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == -1)
 		return obj;
 
+	vma = i915_gem_vma_create(obj);
+	if (!vma) {
+		drm_gem_object_unreference(&obj->base);
+		return NULL;
+	}
+
 	/* To simplify the initialisation sequence between KMS and GTT,
 	 * we allow construction of the stolen object prior to
 	 * setting up the GTT space. The actual reservation will occur
@@ -376,6 +383,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (drm_mm_initialized(&i915_gtt_vm->mm)) {
 		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
 		if (!obj->gtt_space) {
+			i915_gem_vma_destroy(vma);
 			drm_gem_object_unreference(&obj->base);
 			return NULL;
 		}
@@ -383,15 +391,20 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					  gtt_offset, size);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
+			i915_gem_vma_destroy(vma);
 			drm_gem_object_unreference(&obj->base);
 			kfree(obj->gtt_space);
 			return NULL;
 		}
+		vma->node.start = obj->gtt_space->start;
+		vma->node.size = obj->gtt_space->size;
 		obj->gtt_space->start = gtt_offset;
+		list_add(&vma->vma_link, &obj->vma_list);
 	} else {
 		/* NB: Safe because we assert page alignment */
 		obj->gtt_space = (struct drm_mm_node *)
 			((uintptr_t)gtt_offset | I915_GTT_RESERVED);
+		vma->deferred_offset = gtt_offset | I915_GTT_RESERVED;
 	}
 
 	obj->has_global_gtt_mapping = 1;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 32/66] drm/i915: Create VMAs (part 2) - kill gtt space
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (30 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 31/66] drm/i915: Create VMAs (part 1) Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 33/66] drm/i915: Create VMAs (part 3) - plumbing Ben Widawsky
                   ` (35 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Remove obj->gtt_space and see if everything still works. This
validates that what we did in part 1 was correct.
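
Because callers were already converted to the accessors, dropping the
field is mechanical. A hypothetical caller like this one (not taken
from the patch) compiles and behaves identically before and after:

	/* The caller never learns whether the offset comes from the
	 * old obj->gtt_space or from the first VMA's drm_mm_node. */
	if (i915_gem_obj_bound(obj))
		seq_printf(m, " (gtt offset: %08lx)",
			   i915_gem_obj_offset(obj));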

BISECT WARNING: This patch was not meant for bisection. If it does end
up upstream, it should be included in the 3-part series for creating
the VMA.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    |  2 +-
 drivers/gpu/drm/i915/i915_drv.h        | 16 ++++++++----
 drivers/gpu/drm/i915/i915_gem.c        | 48 ++++++++++++----------------------
 drivers/gpu/drm/i915/i915_gem_evict.c  | 13 +++++----
 drivers/gpu/drm/i915/i915_gem_gtt.c    | 12 +++------
 drivers/gpu/drm/i915/i915_gem_stolen.c | 18 ++-----------
 drivers/gpu/drm/i915/intel_pm.c        |  2 +-
 7 files changed, 41 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 87f813e..aa6d63b 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -210,7 +210,7 @@ static int per_file_stats(int id, void *ptr, void *data)
 	stats->count++;
 	stats->total += obj->base.size;
 
-	if (obj->gtt_space) {
+	if (i915_gem_obj_bound(obj)) {
 		if (!list_empty(&obj->ring_list))
 			stats->active += obj->base.size;
 		else
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bd4640a..217695e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1235,7 +1235,6 @@ struct drm_i915_gem_object {
 
 	const struct drm_i915_gem_object_ops *ops;
 
-	struct drm_mm_node *gtt_space;
 	struct list_head vma_list;
 
 	/** Stolen memory for this object, instead of being backed by shmem. */
@@ -1365,25 +1364,32 @@ struct drm_i915_gem_object {
 
 static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
 {
+	struct i915_vma *vma;
 	BUG_ON(list_empty(&o->vma_list));
-	return o->gtt_space->start;
+	vma = list_first_entry(&o->vma_list, struct i915_vma, vma_link);
+	return vma->node.start;
 }
 
 static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
 {
-	return o->gtt_space != NULL;
+	return !list_empty(&o->vma_list);
 }
 
 static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
 {
+	struct i915_vma *vma;
 	BUG_ON(list_empty(&o->vma_list));
-	return o->gtt_space->size;
+	vma = list_first_entry(&o->vma_list, struct i915_vma, vma_link);
+	return vma->node.size;
 }
 
 static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
 					  enum i915_cache_level color)
 {
-	o->gtt_space->color = color;
+	struct i915_vma *vma;
+	BUG_ON(list_empty(&o->vma_list));
+	vma = list_first_entry(&o->vma_list, struct i915_vma, vma_link);
+	vma->node.color = color;
 }
 
 /* This is a temporary define to help transition us to real VMAs. If you see
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a41b2f1..bc9e089 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2108,8 +2108,8 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 
 static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
 {
-	if (acthd >= obj->gtt_space->start &&
-	    acthd < obj->gtt_space->start + obj->base.size)
+	if (acthd >= i915_gem_obj_offset(obj) &&
+	    acthd < i915_gem_obj_offset(obj) + obj->base.size)
 		return true;
 
 	return false;
@@ -2171,7 +2171,7 @@ static bool i915_set_reset_status(struct intel_ring_buffer *ring,
 			  ring->name,
 			  inside ? "inside" : "flushing",
 			  request->batch_obj ?
-			  request->batch_obj->gtt_space->start : 0,
+			  i915_gem_obj_offset(request->batch_obj) : 0,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
 
@@ -2628,12 +2628,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 
 	vma = __i915_gem_obj_to_vma(obj);
 	list_del(&vma->vma_link);
-	/* FIXME: drm_mm_remove_node(&vma->node); */
+	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
 
-	drm_mm_put_block(obj->gtt_space);
-	obj->gtt_space = NULL;
-
 	/* Since the unbound list is global, only move to that list if
 	 * no more VMAs exist */
 	if (list_empty(&obj->vma_list))
@@ -3084,7 +3081,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct drm_mm_node *node;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	size_t gtt_max = map_and_fenceable ?
@@ -3133,20 +3129,14 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	node = kzalloc(sizeof(*node), GFP_KERNEL);
-	if (node == NULL) {
-		i915_gem_object_unpin_pages(obj);
-		return -ENOMEM;
-	}
 	vma = i915_gem_vma_create(obj);
 	if (vma == NULL) {
-		kfree(node);
 		i915_gem_object_unpin_pages(obj);
 		return -ENOMEM;
 	}
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm, node,
+	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max,
 						  DRM_MM_CREATE_DEFAULT,
@@ -3160,36 +3150,34 @@ search_free:
 			goto search_free;
 
 		i915_gem_object_unpin_pages(obj);
-		kfree(node);
+		i915_gem_vma_destroy(vma);
 		return ret;
 	}
-	if (WARN_ON(!i915_gem_valid_gtt_space(dev, node, obj->cache_level))) {
+	if (WARN_ON(!i915_gem_valid_gtt_space(dev, &vma->node,
+					      obj->cache_level))) {
 		i915_gem_object_unpin_pages(obj);
-		drm_mm_put_block(node);
+		drm_mm_remove_node(&vma->node);
+		i915_gem_vma_destroy(vma);
 		return -EINVAL;
 	}
 
 	ret = i915_gem_gtt_prepare_object(obj);
 	if (ret) {
 		i915_gem_object_unpin_pages(obj);
-		drm_mm_put_block(node);
+		drm_mm_remove_node(&vma->node);
+		i915_gem_vma_destroy(vma);
 		return ret;
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
-
-	obj->gtt_space = node;
-	vma->node.start = node->start;
-	vma->node.size = node->size;
 	list_add(&vma->vma_link, &obj->vma_list);
 
-	fenceable =
-		node->size == fence_size &&
-		(node->start & (fence_alignment - 1)) == 0;
+	fenceable = i915_gem_obj_size(obj) == fence_size &&
+		(i915_gem_obj_offset(obj) & (fence_alignment - 1)) == 0;
 
 	mappable =
-		node->start + obj->base.size <= dev_priv->gtt.mappable_end;
+		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
 	obj->map_and_fenceable = mappable && fenceable;
 
@@ -3351,10 +3339,8 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 	}
 
-	if (i915_gem_obj_bound(obj)) {
-		node = obj->gtt_space;
-		BUG_ON(node->start != __i915_gem_obj_to_vma(obj)->node.start);
-	}
+	if (i915_gem_obj_bound(obj))
+		node = &__i915_gem_obj_to_vma(obj)->node;
 
 	if (!i915_gem_valid_gtt_space(dev, node, cache_level)) {
 		ret = i915_gem_object_unbind(obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 0434c9e..10aa4d2 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -34,13 +34,13 @@
 static bool
 mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
 {
+	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+
 	if (obj->pin_count)
 		return false;
 
 	list_add(&obj->exec_list, unwind);
-	BUG_ON(__i915_gem_obj_to_vma(obj)->node.start !=
-	       i915_gem_obj_offset(obj));
-	return drm_mm_scan_add_block(obj->gtt_space);
+	return drm_mm_scan_add_block(&vma->node);
 }
 
 int
@@ -110,8 +110,7 @@ none:
 				       struct drm_i915_gem_object,
 				       exec_list);
 		vma = __i915_gem_obj_to_vma(obj);
-		BUG_ON(vma->node.start != i915_gem_obj_offset(obj));
-		ret = drm_mm_scan_remove_block(obj->gtt_space);
+		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
 		list_del_init(&obj->exec_list);
@@ -128,12 +127,12 @@ found:
 	 * temporary list. */
 	INIT_LIST_HEAD(&eviction_list);
 	while (!list_empty(&unwind_list)) {
+		struct i915_vma *vma;
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
 		vma = __i915_gem_obj_to_vma(obj);
-		BUG_ON(vma->node.start != i915_gem_obj_offset(obj));
-		if (drm_mm_scan_remove_block(obj->gtt_space)) {
+		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
 			continue;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index b59f846..9f686c6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -681,22 +681,16 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		uintptr_t gtt_offset = (uintptr_t)obj->gtt_space;
+		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+		uintptr_t gtt_offset = (uintptr_t)vma->deferred_offset;
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
 			      i915_gem_obj_offset(obj), obj->base.size);
 
 		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
-		BUG_ON((__i915_gem_obj_to_vma(obj)->deferred_offset
-			& I915_GTT_RESERVED) == 0);
 		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
-		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
-		if (!obj->gtt_space) {
-			DRM_ERROR("Failed to preserve all objects\n");
-			break;
-		}
 		ret = drm_mm_create_block(&i915_gtt_vm->mm,
-					  obj->gtt_space,
+					  &vma->node,
 					  gtt_offset,
 					  obj->base.size);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 6e22355..13d24aa 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -381,31 +381,17 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 * later.
 	 */
 	if (drm_mm_initialized(&i915_gtt_vm->mm)) {
-		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
-		if (!obj->gtt_space) {
-			i915_gem_vma_destroy(vma);
-			drm_gem_object_unreference(&obj->base);
-			return NULL;
-		}
-		ret = drm_mm_create_block(&i915_gtt_vm->mm, obj->gtt_space,
+		ret = drm_mm_create_block(&i915_gtt_vm->mm, &vma->node,
 					  gtt_offset, size);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			i915_gem_vma_destroy(vma);
 			drm_gem_object_unreference(&obj->base);
-			kfree(obj->gtt_space);
 			return NULL;
 		}
-		vma->node.start = obj->gtt_space->start;
-		vma->node.size = obj->gtt_space->size;
-		obj->gtt_space->start = gtt_offset;
 		list_add(&vma->vma_link, &obj->vma_list);
-	} else {
-		/* NB: Safe because we assert page alignment */
-		obj->gtt_space = (struct drm_mm_node *)
-			((uintptr_t)gtt_offset | I915_GTT_RESERVED);
+	} else
 		vma->deferred_offset = gtt_offset | I915_GTT_RESERVED;
-	}
 
 	obj->has_global_gtt_mapping = 1;
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 504d96b..9bea2e0 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -274,7 +274,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, obj->gtt_space->start);
+	I915_WRITE(IVB_FBC_RT_BASE, i915_gem_obj_offset(obj));
 
 	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
 		   IVB_DPFC_CTL_FENCE_EN |
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 33/66] drm/i915: Create VMAs (part 3) - plumbing
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (31 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 32/66] drm/i915: Create VMAs (part 2) - kill gtt space Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 34/66] drm/i915: Create VMAs (part 3.5) - map and fenceable tracking Ben Widawsky
                   ` (34 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Plumb the functions we care about with VM arguments.

With the exception of the hack in i915_ppgtt_bind to only ever be able
to do aliasing PPGTT, this covers most of what we want.
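
To make the shape of the change concrete, the pattern throughout is
roughly this (a sketch; the full set of converted functions is in the
diff):

	/* before: the global GTT is implied */
	int i915_gem_object_unbind(struct drm_i915_gem_object *obj);

	/* after: the caller names the address space */
	int i915_gem_object_unbind(struct drm_i915_gem_object *obj,
				   struct i915_address_space *vm);

Per-object queries such as i915_gem_obj_offset() and
i915_gem_obj_bound() grow the same argument and resolve it by walking
obj->vma_list for the VMA whose vm matches.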

v2: Fix purge to pick an object and unbind all of its VMAs.
This was made possible by the global bound list change.
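
Concretely, purge now walks every VMA on the object before dropping
its pages (as in the __i915_gem_shrink() hunk below):

	list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
		if (i915_gem_object_unbind(obj, vma->vm))
			break;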

v3: With the commit to actually pin/unpin pages in place, there is no
longer a need to check if unbind succeeded before calling put_pages().
Make put_pages only BUG() after checking pin count.
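
For clarity, the resulting order of checks in put_pages() (matching
the i915_gem.c hunk below):

	if (obj->pages == NULL)
		return 0;

	if (obj->pages_pin_count)
		return -EBUSY;

	BUG_ON(i915_gem_obj_bound_any(obj));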

v4: Rebased on top of the new hangcheck work by Mika;
also plumbed eb_destroy.
Many checkpatch-related fixes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
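A note for reviewers: call sites that only ever deal with the global
GTT do not need to spell the VM out; the new i915_gem_ggtt_* helpers
in i915_drv.h resolve it from the object. A minimal usage sketch
(obj and alignment stand for whatever the call site already has),
mirroring the converted call sites in the diff:

	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
	if (ret)
		return ret;

	offset = i915_gem_ggtt_offset(obj);
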
 drivers/gpu/drm/i915/i915_debugfs.c        |  59 +++--
 drivers/gpu/drm/i915/i915_dma.c            |   8 +-
 drivers/gpu/drm/i915/i915_drv.h            | 109 +++++----
 drivers/gpu/drm/i915/i915_gem.c            | 377 +++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_context.c    |  11 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |  57 +++--
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  87 ++++---
 drivers/gpu/drm/i915/i915_gem_gtt.c        | 101 ++++----
 drivers/gpu/drm/i915/i915_gem_stolen.c     |  11 +-
 drivers/gpu/drm/i915/i915_gem_tiling.c     |  19 +-
 drivers/gpu/drm/i915/i915_irq.c            |  27 ++-
 drivers/gpu/drm/i915/i915_trace.h          |  20 +-
 drivers/gpu/drm/i915/intel_display.c       |  22 +-
 drivers/gpu/drm/i915/intel_fb.c            |   6 +-
 drivers/gpu/drm/i915/intel_overlay.c       |  16 +-
 drivers/gpu/drm/i915/intel_pm.c            |  11 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |  29 ++-
 drivers/gpu/drm/i915/intel_sprite.c        |   6 +-
 18 files changed, 609 insertions(+), 367 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index aa6d63b..cf50389 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -122,10 +122,14 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (pinned x %d)", obj->pin_count);
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
-	if (i915_gem_obj_bound(obj))
-		seq_printf(m, " (gtt offset: %08lx, size: %08lx)",
-			   i915_gem_obj_offset(obj),
-			   i915_gem_obj_size(obj));
+	if (i915_gem_obj_bound_any(obj)) {
+		struct i915_vma *vma;
+		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+			seq_printf(m, " (gtt offset: %08lx, size: %08lx)",
+				   i915_gem_obj_offset(obj, vma->vm),
+				   i915_gem_obj_size(obj, vma->vm));
+		}
+	}
 	if (obj->stolen)
 		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
@@ -159,11 +163,11 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_printf(m, "Active:\n");
-		head = &i915_gtt_vm->active_list;
+		head = ggtt_list(active_list);
 		break;
 	case INACTIVE_LIST:
 		seq_printf(m, "Inactive:\n");
-		head = &i915_gtt_vm->inactive_list;
+		head = ggtt_list(inactive_list);
 		break;
 	default:
 		mutex_unlock(&dev->struct_mutex);
@@ -176,7 +180,8 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 		describe_obj(m, obj);
 		seq_printf(m, "\n");
 		total_obj_size += obj->base.size;
-		total_gtt_size += i915_gem_obj_size(obj);
+		/* FIXME: Add size of all VMs */
+		total_gtt_size += i915_gem_ggtt_size(obj);
 		count++;
 	}
 	mutex_unlock(&dev->struct_mutex);
@@ -186,12 +191,13 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+/* FIXME: Support multiple VM? */
 #define count_objects(list, member) do { \
 	list_for_each_entry(obj, list, member) { \
-		size += i915_gem_obj_size(obj); \
+		size += i915_gem_ggtt_size(obj); \
 		++count; \
 		if (obj->map_and_fenceable) { \
-			mappable_size += i915_gem_obj_size(obj); \
+			mappable_size += i915_gem_ggtt_size(obj); \
 			++mappable_count; \
 		} \
 	} \
@@ -210,7 +216,7 @@ static int per_file_stats(int id, void *ptr, void *data)
 	stats->count++;
 	stats->total += obj->base.size;
 
-	if (i915_gem_obj_bound(obj)) {
+	if (i915_gem_obj_bound_any(obj)) {
 		if (!list_empty(&obj->ring_list))
 			stats->active += obj->base.size;
 		else
@@ -248,12 +254,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&i915_gtt_vm->active_list, mm_list);
+	count_objects(ggtt_list(active_list), mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&i915_gtt_vm->inactive_list, mm_list);
+	count_objects(ggtt_list(inactive_list), mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -268,11 +274,11 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	size = count = mappable_size = mappable_count = 0;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
 		if (obj->fault_mappable) {
-			size += i915_gem_obj_size(obj);
+			size += i915_gem_ggtt_size(obj);
 			++count;
 		}
 		if (obj->pin_mappable) {
-			mappable_size += i915_gem_obj_size(obj);
+			mappable_size += i915_gem_ggtt_size(obj);
 			++mappable_count;
 		}
 		if (obj->madv == I915_MADV_DONTNEED) {
@@ -288,8 +294,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   count, size);
 
 	seq_printf(m, "%zu [%lu] gtt total\n",
-		   i915_gtt_vm->total,
-		   dev_priv->gtt.mappable_end - i915_gtt_vm->start);
+		   dev_priv->gtt.base.total,
+		   dev_priv->gtt.mappable_end - dev_priv->gtt.base.start);
 
 	seq_printf(m, "\n");
 	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
@@ -334,7 +340,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void* data)
 		describe_obj(m, obj);
 		seq_printf(m, "\n");
 		total_obj_size += obj->base.size;
-		total_gtt_size += i915_gem_obj_size(obj);
+		total_gtt_size += i915_gem_ggtt_size(obj);
 		count++;
 	}
 
@@ -381,13 +387,13 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 				struct drm_i915_gem_object *obj = work->old_fb_obj;
 				if (obj)
 					seq_printf(m, "Old framebuffer gtt_offset 0x%08lx\n",
-						   i915_gem_obj_offset(obj));
+						   i915_gem_ggtt_offset(obj));
 			}
 			if (work->pending_flip_obj) {
 				struct drm_i915_gem_object *obj = work->pending_flip_obj;
 				if (obj)
 					seq_printf(m, "New framebuffer gtt_offset 0x%08lx\n",
-						   i915_gem_obj_offset(obj));
+						   i915_gem_ggtt_offset(obj));
 			}
 		}
 		spin_unlock_irqrestore(&dev->event_lock, flags);
@@ -1980,19 +1986,22 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		list_for_each_entry_safe(obj, next, &i915_gtt_vm->inactive_list,
+		/* FIXME: Do this for all vms? */
+		list_for_each_entry_safe(obj, next, ggtt_list(inactive_list),
 					 mm_list)
-			if (obj->pin_count == 0) {
-				ret = i915_gem_object_unbind(obj);
-				if (ret)
-					goto unlock;
-			}
+			if (obj->pin_count == 0) {
+				ret = i915_gem_object_unbind(obj,
+							     &dev_priv->gtt.base);
+				if (ret)
+					goto unlock;
+			}
 	}
 
 	if (val & DROP_UNBOUND) {
 		list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list,
 					 global_list)
 			if (obj->pages_pin_count == 0) {
+				/* FIXME: Do this for all vms? */
 				ret = i915_gem_object_put_pages(obj);
 				if (ret)
 					goto unlock;
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 24dd593..4b330e5 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1363,7 +1363,7 @@ cleanup_gem:
 	i915_gem_cleanup_ringbuffer(dev);
 	i915_gem_context_fini(dev);
 	mutex_unlock(&dev->struct_mutex);
-	drm_mm_takedown(&i915_gtt_vm->mm);
+	drm_mm_takedown(&dev_priv->gtt.base.mm);
 cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_gem_stolen:
@@ -1497,10 +1497,6 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 
 	i915_dump_device_info(dev_priv);
 
-	INIT_LIST_HEAD(&dev_priv->vm_list);
-	INIT_LIST_HEAD(&dev_priv->gtt.base.global_link);
-	list_add(&dev_priv->gtt.base.global_link, &dev_priv->vm_list);
-
 	if (i915_get_bridge_dev(dev)) {
 		ret = -EIO;
 		goto free_priv;
@@ -1758,7 +1754,7 @@ int i915_driver_unload(struct drm_device *dev)
 	}
 
 	list_del(&dev_priv->vm_list);
-	drm_mm_takedown(&i915_gtt_vm->mm);
+	drm_mm_takedown(&dev_priv->gtt.base.mm);
 	if (dev_priv->regs != NULL)
 		pci_iounmap(dev->pdev, dev_priv->regs);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 217695e..9042376 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -520,10 +520,6 @@ struct i915_gtt {
 			  unsigned long *mappable_end);
 	void (*gtt_remove)(struct drm_device *dev);
 };
-#define i915_gtt_vm ((struct i915_address_space *) \
-		     list_first_entry(&dev_priv->vm_list,\
-				      struct i915_address_space, \
-				      global_link))
 
 struct i915_hw_ppgtt {
 	struct i915_address_space base;
@@ -1362,46 +1358,6 @@ struct drm_i915_gem_object {
 
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
-static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
-{
-	struct i915_vma *vma;
-	BUG_ON(list_empty(&o->vma_list));
-	vma = list_first_entry(&o->vma_list, struct i915_vma, vma_link);
-	return vma->node.start;
-}
-
-static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
-{
-	return !list_empty(&o->vma_list);
-}
-
-static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
-{
-	struct i915_vma *vma;
-	BUG_ON(list_empty(&o->vma_list));
-	vma = list_first_entry(&o->vma_list, struct i915_vma, vma_link);
-	return vma->node.size;
-}
-
-static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
-					  enum i915_cache_level color)
-{
-	struct i915_vma *vma;
-	BUG_ON(list_empty(&o->vma_list));
-	vma = list_first_entry(&o->vma_list, struct i915_vma, vma_link);
-	vma->node.color = color;
-}
-
-/* This is a temporary define to help transition us to real VMAs. If you see
- * this, you're either reviewing code, or bisecting it. */
-static inline struct i915_vma *
-__i915_gem_obj_to_vma(struct drm_i915_gem_object *obj)
-{
-	BUG_ON(!i915_gem_obj_bound(obj));
-	BUG_ON(list_empty(&obj->vma_list));
-	return list_first_entry(&obj->vma_list, struct i915_vma, vma_link);
-}
-
 /**
  * Request queue structure.
  *
@@ -1712,15 +1668,18 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
-struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj);
+struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm);
 void i915_gem_vma_destroy(struct i915_vma *vma);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
 				     bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
-int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
+int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj,
+					struct i915_address_space *vm);
 int i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
@@ -1750,6 +1709,7 @@ int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
 void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    struct intel_ring_buffer *ring);
 
 int i915_gem_dumb_create(struct drm_file *file_priv,
@@ -1856,6 +1816,7 @@ i915_gem_get_gtt_alignment(struct drm_device *dev, uint32_t size,
 			    int tiling_mode, bool fenced);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    enum i915_cache_level cache_level);
 
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
@@ -1866,6 +1827,56 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 
 void i915_gem_restore_fences(struct drm_device *dev);
 
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm);
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o);
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm);
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm);
+void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+			    struct i915_address_space *vm,
+			    enum i915_cache_level color);
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm);
+/* Some GGTT VM helpers */
+#define ggtt_list(list_name) (&(dev_priv->gtt.base.list_name))
+#define obj_to_ggtt(obj) \
+	(&((struct drm_i915_private *)(obj)->base.dev->dev_private)->gtt.base)
+static inline bool is_i915_ggtt(struct i915_address_space *vm)
+{
+	struct i915_address_space *ggtt =
+		&((struct drm_i915_private *)(vm)->dev->dev_private)->gtt.base;
+	return vm == ggtt;
+}
+
+static inline bool i915_gem_obj_bound_ggtt(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_bound(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long
+i915_gem_ggtt_offset(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_offset(obj, obj_to_ggtt(obj));
+}
+
+static inline unsigned long i915_gem_ggtt_size(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_size(obj, obj_to_ggtt(obj));
+}
+
+static inline int __must_check
+i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
+		  uint32_t alignment,
+		  bool map_and_fenceable,
+		  bool nonblocking)
+{
+	return i915_gem_object_pin(obj, obj_to_ggtt(obj), alignment,
+				   map_and_fenceable, nonblocking);
+}
+#undef obj_to_ggtt
+
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
@@ -1903,6 +1914,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj);
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
+/* FIXME: this is never okay with full PPGTT */
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 				enum i915_cache_level cache_level);
 void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
@@ -1919,7 +1931,9 @@ static inline void i915_gem_chipset_flush(struct drm_device *dev)
 
 
 /* i915_gem_evict.c */
-int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
+int __must_check i915_gem_evict_something(struct drm_device *dev,
+					  struct i915_address_space *vm,
+					  int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
 					  bool mappable,
@@ -1927,6 +1941,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
 int i915_gem_evict_everything(struct drm_device *dev);
 
 /* i915_gem_stolen.c */
+#define I915_INVALID_OFFSET 0x1
 int i915_gem_init_stolen(struct drm_device *dev);
 int i915_gem_stolen_setup_compression(struct drm_device *dev, int size);
 void i915_gem_stolen_cleanup_compression(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bc9e089..8fe5f4e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -38,10 +38,12 @@
 
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
-static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
-						    unsigned alignment,
-						    bool map_and_fenceable,
-						    bool nonblocking);
+static __must_check int
+i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
+			    struct i915_address_space *vm,
+			    unsigned alignment,
+			    bool map_and_fenceable,
+			    bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -135,7 +137,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_bound(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !obj->active;
 }
 
 int
@@ -178,10 +180,10 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 	mutex_lock(&dev->struct_mutex);
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
 		if (obj->pin_count)
-			pinned += i915_gem_obj_size(obj);
+			pinned += i915_gem_ggtt_size(obj);
 	mutex_unlock(&dev->struct_mutex);
 
-	args->aper_size = i915_gtt_vm->total;
+	args->aper_size = dev_priv->gtt.base.total;
 	args->aper_available_size = args->aper_size - pinned;
 
 	return 0;
@@ -422,7 +424,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		if (i915_gem_obj_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, false);
 			if (ret)
 				return ret;
@@ -594,7 +596,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	char __user *user_data;
 	int page_offset, page_length, ret;
 
-	ret = i915_gem_object_pin(obj, 0, true, true);
+	ret = i915_gem_ggtt_pin(obj, 0, true, true);
 	if (ret)
 		goto out;
 
@@ -609,7 +611,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	user_data = to_user_ptr(args->data_ptr);
 	remain = args->size;
 
-	offset = i915_gem_obj_offset(obj) + args->offset;
+	offset = i915_gem_ggtt_offset(obj) + args->offset;
 
 	while (remain > 0) {
 		/* Operation in this page
@@ -739,7 +741,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		if (i915_gem_obj_bound(obj)) {
+		if (i915_gem_obj_bound_any(obj)) {
 			ret = i915_gem_object_set_to_gtt_domain(obj, true);
 			if (ret)
 				return ret;
@@ -1347,7 +1349,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	}
 
 	/* Now bind it into the GTT if needed */
-	ret = i915_gem_object_pin(obj, 0, true, false);
+	ret = i915_gem_ggtt_pin(obj, 0, true, false);
 	if (ret)
 		goto unlock;
 
@@ -1361,7 +1363,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	obj->fault_mappable = true;
 
-	pfn += (i915_gem_obj_offset(obj) >> PAGE_SHIFT) + page_offset;
+	pfn += (i915_gem_ggtt_offset(obj) >> PAGE_SHIFT) + page_offset;
 
 	/* Finally, remap it using the new GTT offset */
 	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
@@ -1667,11 +1669,11 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->pages == NULL)
 		return 0;
 
-	BUG_ON(i915_gem_obj_bound(obj));
-
 	if (obj->pages_pin_count)
 		return -EBUSY;
 
+	BUG_ON(i915_gem_obj_bound_any(obj));
+
 	/* ->put_pages might need to allocate memory for the bit17 swizzle
 	 * array, hence protect them from being reaped by removing them from gtt
 	 * lists early. */
@@ -1704,16 +1706,22 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
 		}
 	}
 
-	list_for_each_entry_safe(obj, next,
-				 &i915_gtt_vm->inactive_list,
-				 mm_list) {
-		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
-		    i915_gem_object_unbind(obj) == 0 &&
-		    i915_gem_object_put_pages(obj) == 0) {
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.bound_list,
+				 global_list) {
+		struct i915_vma *vma, *v;
+
+		if (!i915_gem_object_is_purgeable(obj) && purgeable_only)
+			continue;
+
+		list_for_each_entry_safe(vma, v, &obj->vma_list, vma_link)
+			if (i915_gem_object_unbind(obj, vma->vm))
+				break;
+
+		if (!i915_gem_object_put_pages(obj))
 			count += obj->base.size >> PAGE_SHIFT;
-			if (count >= target)
-				return count;
-		}
+
+		if (count >= target)
+			return count;
 	}
 
 	return count;
@@ -1864,6 +1872,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 
 void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
+			       struct i915_address_space *vm,
 			       struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -1880,7 +1889,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 
 	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &i915_gtt_vm->active_list);
+	list_move_tail(&obj->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1900,15 +1909,13 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
+i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
+				 struct i915_address_space *vm)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
+	list_move_tail(&obj->mm_list, &vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2106,10 +2113,11 @@ i915_gem_request_remove_from_client(struct drm_i915_gem_request *request)
 	spin_unlock(&file_priv->mm.lock);
 }
 
-static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj)
+static bool i915_head_inside_object(u32 acthd, struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm)
 {
-	if (acthd >= i915_gem_obj_offset(obj) &&
-	    acthd < i915_gem_obj_offset(obj) + obj->base.size)
+	if (acthd >= i915_gem_obj_offset(obj, vm) &&
+	    acthd < i915_gem_obj_offset(obj, vm) + obj->base.size)
 		return true;
 
 	return false;
@@ -2132,6 +2140,17 @@ static bool i915_head_inside_request(const u32 acthd_unmasked,
 	return false;
 }
 
+static struct i915_address_space *
+request_to_vm(struct drm_i915_gem_request *request)
+{
+	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
+	struct i915_address_space *vm;
+
+	vm = &dev_priv->gtt.base;
+
+	return vm;
+}
+
 static bool i915_request_guilty(struct drm_i915_gem_request *request,
 				const u32 acthd, bool *inside)
 {
@@ -2139,9 +2158,9 @@ static bool i915_request_guilty(struct drm_i915_gem_request *request,
 	 * pointing inside the ring, matches the batch_obj address range.
 	 * However this is extremely unlikely.
 	 */
-
 	if (request->batch_obj) {
-		if (i915_head_inside_object(acthd, request->batch_obj)) {
+		if (i915_head_inside_object(acthd, request->batch_obj,
+					    request_to_vm(request))) {
 			*inside = true;
 			return true;
 		}
@@ -2161,17 +2180,21 @@ static bool i915_set_reset_status(struct intel_ring_buffer *ring,
 {
 	struct i915_ctx_hang_stats *hs = NULL;
 	bool inside, guilty, banned;
+	unsigned long offset = 0;
 
 	/* Innocent until proven guilty */
 	guilty = banned = false;
 
+	if (request->batch_obj)
+		offset = i915_gem_obj_offset(request->batch_obj,
+					     request_to_vm(request));
+
 	if (ring->hangcheck.action != wait &&
 	    i915_request_guilty(request, acthd, &inside)) {
 		DRM_ERROR("%s hung %s bo (0x%lx ctx %d) at 0x%x\n",
 			  ring->name,
 			  inside ? "inside" : "flushing",
-			  request->batch_obj ?
-			  i915_gem_obj_offset(request->batch_obj) : 0,
+			  offset,
 			  request->ctx ? request->ctx->id : 0,
 			  acthd);
 
@@ -2239,13 +2262,15 @@ static bool i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 	}
 
 	while (!list_empty(&ring->active_list)) {
+		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
 				       struct drm_i915_gem_object,
 				       ring_list);
 
-		i915_gem_object_move_to_inactive(obj);
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+			i915_gem_object_move_to_inactive(obj, vm);
 	}
 
 	return ctx_banned;
@@ -2267,6 +2292,7 @@ bool i915_gem_reset(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
+	struct i915_address_space *vm;
 	int i;
 	bool ctx_banned = false;
 
@@ -2278,8 +2304,9 @@ bool i915_gem_reset(struct drm_device *dev)
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
-	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list)
-		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		list_for_each_entry(obj, &vm->inactive_list, mm_list)
+			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	/* The fence registers are invalidated so clear them out */
 	i915_gem_restore_fences(dev);
@@ -2327,6 +2354,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	 * by the ringbuffer to the flushing/inactive lists as appropriate.
 	 */
 	while (!list_empty(&ring->active_list)) {
+		struct drm_i915_private *dev_priv = ring->dev->dev_private;
+		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
@@ -2336,7 +2365,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
-		i915_gem_object_move_to_inactive(obj);
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+			i915_gem_object_move_to_inactive(obj, vm);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
@@ -2582,13 +2612,14 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
  * Unbinds an object from the GTT aperture.
  */
 int
-i915_gem_object_unbind(struct drm_i915_gem_object *obj)
+i915_gem_object_unbind(struct drm_i915_gem_object *obj,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
 	struct i915_vma *vma;
 	int ret;
 
-	if (!i915_gem_obj_bound(obj))
+	if (!i915_gem_obj_bound(obj, vm))
 		return 0;
 
 	if (obj->pin_count)
@@ -2611,7 +2642,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (ret)
 		return ret;
 
-	trace_i915_gem_object_unbind(obj);
+	trace_i915_gem_object_unbind(obj, vm);
 
 	if (obj->has_global_gtt_mapping)
 		i915_gem_gtt_unbind_object(obj);
@@ -2626,7 +2657,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
-	vma = __i915_gem_obj_to_vma(obj);
+	vma = i915_gem_obj_to_vma(obj, vm);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -2676,11 +2707,11 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
 	}
 
 	if (obj) {
-		u32 size = i915_gem_obj_size(obj);
+		u32 size = i915_gem_ggtt_size(obj);
 
-		val = (uint64_t)((i915_gem_obj_offset(obj) + size - 4096) &
+		val = (uint64_t)((i915_gem_ggtt_offset(obj) + size - 4096) &
 				 0xfffff000) << 32;
-		val |= i915_gem_obj_offset(obj) & 0xfffff000;
+		val |= i915_gem_ggtt_offset(obj) & 0xfffff000;
 		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
@@ -2700,15 +2731,15 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 	u32 val;
 
 	if (obj) {
-		u32 size = i915_gem_obj_size(obj);
+		u32 size = i915_gem_ggtt_size(obj);
 		int pitch_val;
 		int tile_width;
 
-		WARN((i915_gem_obj_offset(obj) & ~I915_FENCE_START_MASK) ||
+		WARN((i915_gem_ggtt_offset(obj) & ~I915_FENCE_START_MASK) ||
 		     (size & -size) != size ||
-		     (i915_gem_obj_offset(obj) & (size - 1)),
+		     (i915_gem_ggtt_offset(obj) & (size - 1)),
 		     "object 0x%08lx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
-		     i915_gem_obj_offset(obj), obj->map_and_fenceable, size);
+		     i915_gem_ggtt_offset(obj), obj->map_and_fenceable, size);
 
 		if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
 			tile_width = 128;
@@ -2719,7 +2750,7 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
 		pitch_val = obj->stride / tile_width;
 		pitch_val = ffs(pitch_val) - 1;
 
-		val = i915_gem_obj_offset(obj);
+		val = i915_gem_ggtt_offset(obj);
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I915_FENCE_SIZE_BITS(size);
@@ -2744,19 +2775,19 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
 	uint32_t val;
 
 	if (obj) {
-		u32 size = i915_gem_obj_size(obj);
+		u32 size = i915_gem_ggtt_size(obj);
 		uint32_t pitch_val;
 
-		WARN((i915_gem_obj_offset(obj) & ~I830_FENCE_START_MASK) ||
+		WARN((i915_gem_ggtt_offset(obj) & ~I830_FENCE_START_MASK) ||
 		     (size & -size) != size ||
-		     (i915_gem_obj_offset(obj) & (size - 1)),
+		     (i915_gem_ggtt_offset(obj) & (size - 1)),
 		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
-		     i915_gem_obj_offset(obj), size);
+		     i915_gem_ggtt_offset(obj), size);
 
 		pitch_val = obj->stride / 128;
 		pitch_val = ffs(pitch_val) - 1;
 
-		val = i915_gem_obj_offset(obj);
+		val = i915_gem_ggtt_offset(obj);
 		if (obj->tiling_mode == I915_TILING_Y)
 			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
 		val |= I830_FENCE_SIZE_BITS(size);
@@ -3075,6 +3106,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
  */
 static int
 i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
+			    struct i915_address_space *vm,
 			    unsigned alignment,
 			    bool map_and_fenceable,
 			    bool nonblocking)
@@ -3083,14 +3115,16 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
-	size_t gtt_max = map_and_fenceable ?
-		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
+	size_t gtt_max =
+		map_and_fenceable ? dev_priv->gtt.mappable_end : vm->total;
 	struct i915_vma *vma;
 	int ret;
 
 	if (WARN_ON(!list_empty(&obj->vma_list)))
 		return -EBUSY;
 
+	BUG_ON(!is_i915_ggtt(vm));
+
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -3129,20 +3163,23 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	vma = i915_gem_vma_create(obj);
+	/* For now we only ever use 1 vma per object */
+	WARN_ON(!list_empty(&obj->vma_list));
+
+	vma = i915_gem_vma_create(obj, vm);
 	if (vma == NULL) {
 		i915_gem_object_unpin_pages(obj);
 		return -ENOMEM;
 	}
 
 search_free:
-	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm, &vma->node,
+	ret = drm_mm_insert_node_in_range_generic(&vm->mm, &vma->node,
 						  size, alignment,
 						  obj->cache_level, 0, gtt_max,
 						  DRM_MM_CREATE_DEFAULT,
 						  DRM_MM_SEARCH_DEFAULT);
 	if (ret) {
-		ret = i915_gem_evict_something(dev, size, alignment,
+		ret = i915_gem_evict_something(dev, vm, size, alignment,
 					       obj->cache_level,
 					       map_and_fenceable,
 					       nonblocking);
@@ -3170,18 +3207,25 @@ search_free:
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
-	list_add(&vma->vma_link, &obj->vma_list);
+	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	/* Keep GGTT vmas first to make debug easier */
+	if (is_i915_ggtt(vm))
+		list_add(&vma->vma_link, &obj->vma_list);
+	else
+		list_add_tail(&vma->vma_link, &obj->vma_list);
 
-	fenceable = i915_gem_obj_size(obj) == fence_size &&
-		(i915_gem_obj_offset(obj) & (fence_alignment - 1)) == 0;
+	fenceable =
+		is_i915_ggtt(vm) &&
+		i915_gem_ggtt_size(obj) == fence_size &&
+		(i915_gem_ggtt_offset(obj) & (fence_alignment - 1)) == 0;
 
 	mappable =
+		is_i915_ggtt(vm) &&
 		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
 	obj->map_and_fenceable = mappable && fenceable;
 
-	trace_i915_gem_object_bind(obj, map_and_fenceable);
+	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
 	return 0;
 }
@@ -3279,7 +3323,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	int ret;
 
 	/* Not valid to be called on unbound objects. */
-	if (!i915_gem_obj_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return -EINVAL;
 
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
@@ -3318,12 +3362,13 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 
 	/* And bump the LRU for this access */
 	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
+		list_move_tail(&obj->mm_list, ggtt_list(inactive_list));
 
 	return 0;
 }
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
+				    struct i915_address_space *vm,
 				    enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -3339,16 +3384,19 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 	}
 
-	if (i915_gem_obj_bound(obj))
-		node = &__i915_gem_obj_to_vma(obj)->node;
+	if (i915_gem_obj_bound(obj, vm))
+		node = &(i915_gem_obj_to_vma(obj, vm)->node);
 
 	if (!i915_gem_valid_gtt_space(dev, node, cache_level)) {
-		ret = i915_gem_object_unbind(obj);
+		ret = i915_gem_object_unbind(obj, vm);
 		if (ret)
 			return ret;
 	}
 
-	if (i915_gem_obj_bound(obj)) {
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		if (!i915_gem_obj_bound(obj, vm))
+			continue;
+
 		ret = i915_gem_object_finish_gpu(obj);
 		if (ret)
 			return ret;
@@ -3371,7 +3419,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
 					       obj, cache_level);
 
-		i915_gem_obj_set_color(obj, cache_level);
+		i915_gem_obj_set_color(obj, vm, cache_level);
 	}
 
 	if (cache_level == I915_CACHE_NONE) {
@@ -3431,6 +3479,7 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 			       struct drm_file *file)
 {
 	struct drm_i915_gem_caching *args = data;
+	struct drm_i915_private *dev_priv;
 	struct drm_i915_gem_object *obj;
 	enum i915_cache_level level;
 	int ret;
@@ -3455,8 +3504,10 @@ int i915_gem_set_caching_ioctl(struct drm_device *dev, void *data,
 		ret = -ENOENT;
 		goto unlock;
 	}
+	dev_priv = obj->base.dev->dev_private;
 
-	ret = i915_gem_object_set_cache_level(obj, level);
+	/* FIXME: Add interface for specific VM? */
+	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base, level);
 
 	drm_gem_object_unreference(&obj->base);
 unlock:
@@ -3474,6 +3525,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     struct intel_ring_buffer *pipelined)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	u32 old_read_domains, old_write_domain;
 	int ret;
 
@@ -3492,7 +3544,8 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * of uncaching, which would allow us to flush all the LLC-cached data
 	 * with that bit in the PTE to main memory with just one PIPE_CONTROL.
 	 */
-	ret = i915_gem_object_set_cache_level(obj, I915_CACHE_NONE);
+	ret = i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					      I915_CACHE_NONE);
 	if (ret)
 		return ret;
 
@@ -3500,7 +3553,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * (e.g. libkms for the bootup splash), we have to ensure that we
 	 * always use map_and_fenceable for all scanout buffers.
 	 */
-	ret = i915_gem_object_pin(obj, alignment, true, false);
+	ret = i915_gem_ggtt_pin(obj, alignment, true, false);
 	if (ret)
 		return ret;
 
@@ -3643,6 +3696,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 int
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
+		    struct i915_address_space *vm,
 		    uint32_t alignment,
 		    bool map_and_fenceable,
 		    bool nonblocking)
@@ -3652,26 +3706,29 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
 		return -EBUSY;
 
-	if (i915_gem_obj_bound(obj)) {
-		if ((alignment && i915_gem_obj_offset(obj) & (alignment - 1)) ||
+	BUG_ON(map_and_fenceable && !is_i915_ggtt(vm));
+
+	if (i915_gem_obj_bound(obj, vm)) {
+		if ((alignment &&
+		     i915_gem_obj_offset(obj, vm) & (alignment - 1)) ||
 		    (map_and_fenceable && !obj->map_and_fenceable)) {
 			WARN(obj->pin_count,
 			     "bo is already pinned with incorrect alignment:"
 			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
 			     " obj->map_and_fenceable=%d\n",
-			     i915_gem_obj_offset(obj), alignment,
+			     i915_gem_obj_offset(obj, vm), alignment,
 			     map_and_fenceable,
 			     obj->map_and_fenceable);
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_unbind(obj, vm);
 			if (ret)
 				return ret;
 		}
 	}
 
-	if (!i915_gem_obj_bound(obj)) {
+	if (!i915_gem_obj_bound(obj, vm)) {
 		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 
-		ret = i915_gem_object_bind_to_gtt(obj, alignment,
+		ret = i915_gem_object_bind_to_gtt(obj, vm, alignment,
 						  map_and_fenceable,
 						  nonblocking);
 		if (ret)
@@ -3694,7 +3751,7 @@ void
 i915_gem_object_unpin(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pin_count == 0);
-	BUG_ON(!i915_gem_obj_bound(obj));
+	BUG_ON(!i915_gem_obj_bound_any(obj));
 
 	if (--obj->pin_count == 0)
 		obj->pin_mappable = false;
@@ -3732,7 +3789,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	}
 
 	if (obj->user_pin_count == 0) {
-		ret = i915_gem_object_pin(obj, args->alignment, true, false);
+		ret = i915_gem_ggtt_pin(obj, args->alignment, true, false);
 		if (ret)
 			goto out;
 	}
@@ -3744,7 +3801,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	 * as the X server doesn't manage domains yet
 	 */
 	i915_gem_object_flush_cpu_write_domain(obj);
-	args->offset = i915_gem_obj_offset(obj);
+	args->offset = i915_gem_ggtt_offset(obj);
 out:
 	drm_gem_object_unreference(&obj->base);
 unlock:
@@ -3967,6 +4024,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_vma *vma, *next;
 
 	trace_i915_gem_object_destroy(obj);
 
@@ -3974,15 +4032,21 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
 	obj->pin_count = 0;
-	if (WARN_ON(i915_gem_object_unbind(obj) == -ERESTARTSYS)) {
-		bool was_interruptible;
+	/* NB: 0 or 1 elements */
+	WARN_ON(!list_empty(&obj->vma_list) &&
+		!list_is_singular(&obj->vma_list));
+	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
+		int ret = i915_gem_object_unbind(obj, vma->vm);
+		if (WARN_ON(ret == -ERESTARTSYS)) {
+			bool was_interruptible;
 
-		was_interruptible = dev_priv->mm.interruptible;
-		dev_priv->mm.interruptible = false;
+			was_interruptible = dev_priv->mm.interruptible;
+			dev_priv->mm.interruptible = false;
 
-		WARN_ON(i915_gem_object_unbind(obj));
+			WARN_ON(i915_gem_object_unbind(obj, vma->vm));
 
-		dev_priv->mm.interruptible = was_interruptible;
+			dev_priv->mm.interruptible = was_interruptible;
+		}
 	}
 
 	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
@@ -4008,15 +4072,18 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	i915_gem_object_free(obj);
 }
 
-struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj)
+struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-	struct i915_vma *vma = kzalloc(sizeof(*vma), GFP_KERNEL);
+	struct i915_vma *vma;
+	BUG_ON(!vm);
+
+	vma = kzalloc(sizeof(*vma), GFP_KERNEL);
 	if (vma == NULL)
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&vma->vma_link);
-	vma->vm = i915_gtt_vm;
+	vma->vm = vm;
 	vma->obj = obj;
 
 	return vma;
@@ -4256,10 +4323,10 @@ int i915_gem_init(struct drm_device *dev)
 	 */
 	if (HAS_HW_CONTEXTS(dev)) {
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
-					  i915_gtt_vm->total, 0);
+					  dev_priv->gtt.base.total, 0);
 		i915_gem_context_init(dev);
 		if (dev_priv->hw_contexts_disabled) {
-			drm_mm_takedown(&i915_gtt_vm->mm);
+			drm_mm_takedown(&dev_priv->gtt.base.mm);
 			goto ggtt_only;
 		}
 	} else
@@ -4270,7 +4337,7 @@ ggtt_only:
 		if (HAS_HW_CONTEXTS(dev))
 			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
 		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
-					  i915_gtt_vm->total, PAGE_SIZE);
+					  dev_priv->gtt.base.total, PAGE_SIZE);
 	}
 
 	ret = i915_gem_init_hw(dev);
@@ -4321,7 +4388,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 		return ret;
 	}
 
-	BUG_ON(!list_empty(&i915_gtt_vm->active_list));
+	BUG_ON(!list_empty(ggtt_list(active_list)));
 	mutex_unlock(&dev->struct_mutex);
 
 	ret = drm_irq_install(dev);
@@ -4370,6 +4437,16 @@ init_ring_lists(struct intel_ring_buffer *ring)
 	INIT_LIST_HEAD(&ring->request_list);
 }
 
+static void i915_init_vm(struct drm_i915_private *dev_priv,
+			 struct i915_address_space *vm)
+{
+	vm->dev = dev_priv->dev;
+	INIT_LIST_HEAD(&vm->active_list);
+	INIT_LIST_HEAD(&vm->inactive_list);
+	INIT_LIST_HEAD(&vm->global_link);
+	list_add(&vm->global_link, &dev_priv->vm_list);
+}
+
 void
 i915_gem_load(struct drm_device *dev)
 {
@@ -4382,8 +4459,9 @@ i915_gem_load(struct drm_device *dev)
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
-	INIT_LIST_HEAD(&i915_gtt_vm->active_list);
-	INIT_LIST_HEAD(&i915_gtt_vm->inactive_list);
+	INIT_LIST_HEAD(&dev_priv->vm_list);
+	i915_init_vm(dev_priv, &dev_priv->gtt.base);
+
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
@@ -4654,8 +4732,9 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj;
-	int nr_to_scan = sc->nr_to_scan;
+	int nr_to_scan;
 	bool unlock = true;
 	int cnt;
 
@@ -4669,6 +4748,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 		unlock = false;
 	}
 
+	nr_to_scan = sc->nr_to_scan;
 	if (nr_to_scan) {
 		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
 		if (nr_to_scan > 0)
@@ -4682,11 +4762,94 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
 		if (obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
-	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, global_list)
-		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
-			cnt += obj->base.size >> PAGE_SHIFT;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		list_for_each_entry(obj, &vm->inactive_list, mm_list)
+			if (obj->pin_count == 0 && obj->pages_pin_count == 0)
+				cnt += obj->base.size >> PAGE_SHIFT;
 
 	if (unlock)
 		mutex_unlock(&dev->struct_mutex);
 	return cnt;
 }
+
+/* All the new VM stuff */
+unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
+				  struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->gtt.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return vma->node.start;
+	}
+	WARN_ON(1);
+	return I915_INVALID_OFFSET;
+}
+
+bool i915_gem_obj_bound_any(struct drm_i915_gem_object *o)
+{
+	return !list_empty(&o->vma_list);
+}
+
+bool i915_gem_obj_bound(struct drm_i915_gem_object *o,
+			struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return true;
+	}
+	return false;
+}
+
+unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o,
+				struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = o->base.dev->dev_private;
+	struct i915_vma *vma;
+
+	if (vm == &dev_priv->gtt.aliasing_ppgtt->base)
+		vm = &dev_priv->gtt.base;
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm)
+			return vma->node.size;
+	}
+
+	return 0;
+}
+
+void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
+			    struct i915_address_space *vm,
+			    enum i915_cache_level color)
+{
+	struct i915_vma *vma;
+	BUG_ON(list_empty(&o->vma_list));
+	list_for_each_entry(vma, &o->vma_list, vma_link) {
+		if (vma->vm == vm) {
+			vma->node.color = color;
+			return;
+		}
+	}
+
+	WARN(1, "Couldn't set color for VM %p\n", vm);
+}
+
+struct i915_vma *i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
+				     struct i915_address_space *vm)
+{
+	struct i915_vma *vma;
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->vm == vm)
+			return vma;
+
+	return NULL;
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 75b4e27..5d5a60f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -157,6 +157,7 @@ create_hw_context(struct drm_device *dev,
 
 	if (INTEL_INFO(dev)->gen >= 7) {
 		ret = i915_gem_object_set_cache_level(ctx->obj,
+						      &dev_priv->gtt.base,
 						      I915_CACHE_LLC_MLC);
 		/* Failure shouldn't ever happen this early */
 		if (WARN_ON(ret))
@@ -219,7 +220,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	 * may not be available. To avoid this we always pin the
 	 * default context.
 	 */
-	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_ggtt_pin(ctx->obj, CONTEXT_ALIGN, false, false);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Couldn't pin %d\n", ret);
 		goto err_destroy;
@@ -395,7 +396,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 
 	intel_ring_emit(ring, MI_NOOP);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, i915_gem_obj_offset(new_context->obj) |
+	intel_ring_emit(ring, i915_gem_ggtt_offset(new_context->obj) |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
@@ -416,6 +417,7 @@ mi_set_context(struct intel_ring_buffer *ring,
 static int do_switch(struct i915_hw_context *to)
 {
 	struct intel_ring_buffer *ring = to->ring;
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
@@ -425,7 +427,7 @@ static int do_switch(struct i915_hw_context *to)
 	if (from == to)
 		return 0;
 
-	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
+	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		return ret;
 
@@ -462,7 +464,8 @@ static int do_switch(struct i915_hw_context *to)
 	 */
 	if (from != NULL) {
 		from->obj->base.read_domains = I915_GEM_DOMAIN_INSTRUCTION;
-		i915_gem_object_move_to_active(from->obj, ring);
+		i915_gem_object_move_to_active(from->obj, &dev_priv->gtt.base,
+					       ring);
 		/* As long as MI_SET_CONTEXT is serializing, ie. it flushes the
 		 * whole damn pipeline, we don't need to explicitly mark the
 		 * object dirty. The only exception is that the context must be
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 10aa4d2..7a210b8 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -32,20 +32,18 @@
 #include "i915_trace.h"
 
 static bool
-mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
+mark_free(struct i915_vma *vma, struct list_head *unwind)
 {
-	struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
-
-	if (obj->pin_count)
+	if (vma->obj->pin_count)
 		return false;
 
-	list_add(&obj->exec_list, unwind);
+	list_add(&vma->obj->exec_list, unwind);
 	return drm_mm_scan_add_block(&vma->node);
 }
 
 int
-i915_gem_evict_something(struct drm_device *dev, int min_size,
-			 unsigned alignment, unsigned cache_level,
+i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
+			 int min_size, unsigned alignment, unsigned cache_level,
 			 bool mappable, bool nonblocking)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -81,16 +79,16 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 
 	INIT_LIST_HEAD(&unwind_list);
 	if (mappable)
-		drm_mm_init_scan_with_range(&i915_gtt_vm->mm, min_size,
-					    alignment, cache_level, 0,
+		drm_mm_init_scan_with_range(&vm->mm, min_size, alignment,
+					    cache_level, 0,
 					    dev_priv->gtt.mappable_end);
 	else
-		drm_mm_init_scan(&i915_gtt_vm->mm, min_size, alignment,
-				 cache_level);
+		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -98,8 +96,9 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
-		if (mark_free(obj, &unwind_list))
+	list_for_each_entry(obj, &vm->active_list, mm_list) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
 
@@ -109,7 +108,7 @@ none:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		ret = drm_mm_scan_remove_block(&vma->node);
 		BUG_ON(ret);
 
@@ -131,7 +130,7 @@ found:
 		obj = list_first_entry(&unwind_list,
 				       struct drm_i915_gem_object,
 				       exec_list);
-		vma = __i915_gem_obj_to_vma(obj);
+		vma = i915_gem_obj_to_vma(obj, vm);
 		if (drm_mm_scan_remove_block(&vma->node)) {
 			list_move(&obj->exec_list, &eviction_list);
 			drm_gem_object_reference(&obj->base);
@@ -146,7 +145,7 @@ found:
 				       struct drm_i915_gem_object,
 				       exec_list);
 		if (ret == 0)
-			ret = i915_gem_object_unbind(obj);
+			ret = i915_gem_object_unbind(obj, vm);
 
 		list_del_init(&obj->exec_list);
 		drm_gem_object_unreference(&obj->base);
@@ -160,11 +159,17 @@ i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
-	bool lists_empty;
+	struct i915_address_space *vm;
+	bool lists_empty = true;
 	int ret;
 
-	lists_empty = (list_empty(&i915_gtt_vm->inactive_list) &&
-		       list_empty(&i915_gtt_vm->active_list));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		lists_empty = (list_empty(&vm->inactive_list) &&
+			       list_empty(&vm->active_list));
+		if (!lists_empty)
+			break;
+	}
+
 	if (lists_empty)
 		return -ENOSPC;
 
@@ -181,10 +186,12 @@ i915_gem_evict_everything(struct drm_device *dev)
 	i915_gem_retire_requests(dev);
 
 	/* Having flushed everything, unbind() should never raise an error */
-	list_for_each_entry_safe(obj, next,
-				 &i915_gtt_vm->inactive_list, mm_list)
-		if (obj->pin_count == 0)
-			WARN_ON(i915_gem_object_unbind(obj));
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
+			if (obj->pin_count == 0)
+				WARN_ON(i915_gem_object_unbind(obj, vm));
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 837372d..620f395 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -150,7 +150,7 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
 }
 
 static void
-eb_destroy(struct eb_objects *eb)
+eb_destroy(struct eb_objects *eb, struct i915_address_space *vm)
 {
 	while (!list_empty(&eb->objects)) {
 		struct drm_i915_gem_object *obj;
@@ -174,7 +174,8 @@ static inline int use_cpu_reloc(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 				   struct eb_objects *eb,
-				   struct drm_i915_gem_relocation_entry *reloc)
+				   struct drm_i915_gem_relocation_entry *reloc,
+				   struct i915_address_space *vm)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
@@ -188,7 +189,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		return -ENOENT;
 
 	target_i915_obj = to_intel_bo(target_obj);
-	target_offset = i915_gem_obj_offset(target_i915_obj);
+	target_offset = i915_gem_obj_offset(target_i915_obj, vm);
 
 	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
 	 * pipe_control writes because the gpu doesn't properly redirect them
@@ -280,7 +281,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 			return ret;
 
 		/* Map the page containing the relocation we're going to perform.  */
-		reloc->offset += i915_gem_obj_offset(obj);
+		reloc->offset += i915_gem_obj_offset(obj, vm);
 		reloc_page = io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
 						      reloc->offset & PAGE_MASK);
 		reloc_entry = (uint32_t __iomem *)
@@ -297,7 +298,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 
 static int
 i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
-				    struct eb_objects *eb)
+				    struct eb_objects *eb,
+				    struct i915_address_space *vm)
 {
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
@@ -321,7 +323,8 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r);
+			ret = i915_gem_execbuffer_relocate_entry(obj, eb, r,
+								 vm);
 			if (ret)
 				return ret;
 
@@ -344,13 +347,15 @@ i915_gem_execbuffer_relocate_object(struct drm_i915_gem_object *obj,
 static int
 i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 					 struct eb_objects *eb,
-					 struct drm_i915_gem_relocation_entry *relocs)
+					 struct drm_i915_gem_relocation_entry *relocs,
+					 struct i915_address_space *vm)
 {
 	const struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
 	int i, ret;
 
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i]);
+		ret = i915_gem_execbuffer_relocate_entry(obj, eb, &relocs[i],
+							 vm);
 		if (ret)
 			return ret;
 	}
@@ -359,7 +364,8 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
 }
 
 static int
-i915_gem_execbuffer_relocate(struct eb_objects *eb)
+i915_gem_execbuffer_relocate(struct eb_objects *eb,
+			     struct i915_address_space *vm)
 {
 	struct drm_i915_gem_object *obj;
 	int ret = 0;
@@ -373,7 +379,7 @@ i915_gem_execbuffer_relocate(struct eb_objects *eb)
 	 */
 	pagefault_disable();
 	list_for_each_entry(obj, &eb->objects, exec_list) {
-		ret = i915_gem_execbuffer_relocate_object(obj, eb);
+		ret = i915_gem_execbuffer_relocate_object(obj, eb, vm);
 		if (ret)
 			break;
 	}
@@ -395,6 +401,7 @@ need_reloc_mappable(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 				   struct intel_ring_buffer *ring,
+				   struct i915_address_space *vm,
 				   bool *need_reloc)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
@@ -409,7 +416,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->tiling_mode != I915_TILING_NONE;
 	need_mappable = need_fence || need_reloc_mappable(obj);
 
-	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
+	ret = i915_gem_object_pin(obj, vm, entry->alignment, need_mappable,
+				  false);
 	if (ret)
 		return ret;
 
@@ -436,8 +444,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 		obj->has_aliasing_ppgtt_mapping = 1;
 	}
 
-	if (entry->offset != i915_gem_obj_offset(obj)) {
-		entry->offset = i915_gem_obj_offset(obj);
+	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
+		entry->offset = i915_gem_obj_offset(obj, vm);
 		*need_reloc = true;
 	}
 
@@ -458,7 +466,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_gem_exec_object2 *entry;
 
-	if (!i915_gem_obj_bound(obj))
+	if (!i915_gem_obj_bound_any(obj))
 		return;
 
 	entry = obj->exec_entry;
@@ -475,6 +483,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
 static int
 i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			    struct list_head *objects,
+			    struct i915_address_space *vm,
 			    bool *need_relocs)
 {
 	struct drm_i915_gem_object *obj;
@@ -531,32 +540,35 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			unsigned long obj_offset;
 			bool need_fence, need_mappable;
 
-			if (!i915_gem_obj_bound(obj))
+			if (!i915_gem_obj_bound(obj, vm))
 				continue;
 
-			obj_offset = i915_gem_obj_offset(obj);
+			obj_offset = i915_gem_obj_offset(obj, vm);
 			need_fence =
 				has_fenced_gpu_access &&
 				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
 				obj->tiling_mode != I915_TILING_NONE;
 			need_mappable = need_fence || need_reloc_mappable(obj);
 
+			BUG_ON((need_mappable || need_fence) &&
+			       !is_i915_ggtt(vm));
+
 			if ((entry->alignment &&
 			     obj_offset & (entry->alignment - 1)) ||
 			    (need_mappable && !obj->map_and_fenceable))
-				ret = i915_gem_object_unbind(obj);
+				ret = i915_gem_object_unbind(obj, vm);
 			else
-				ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+				ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
 
 		/* Bind fresh objects */
 		list_for_each_entry(obj, objects, exec_list) {
-			if (i915_gem_obj_bound(obj))
+			if (i915_gem_obj_bound(obj, vm))
 				continue;
 
-			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
+			ret = i915_gem_execbuffer_reserve_object(obj, ring, vm, need_relocs);
 			if (ret)
 				goto err;
 		}
@@ -580,7 +592,8 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 				  struct drm_file *file,
 				  struct intel_ring_buffer *ring,
 				  struct eb_objects *eb,
-				  struct drm_i915_gem_exec_object2 *exec)
+				  struct drm_i915_gem_exec_object2 *exec,
+				  struct i915_address_space *vm)
 {
 	struct drm_i915_gem_relocation_entry *reloc;
 	struct drm_i915_gem_object *obj;
@@ -664,14 +677,15 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
 		goto err;
 
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	list_for_each_entry(obj, &eb->objects, exec_list) {
 		int offset = obj->exec_entry - exec;
 		ret = i915_gem_execbuffer_relocate_object_slow(obj, eb,
-							       reloc + reloc_offset[offset]);
+							       reloc + reloc_offset[offset],
+							       vm);
 		if (ret)
 			goto err;
 	}
@@ -770,6 +784,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec,
 
 static void
 i915_gem_execbuffer_move_to_active(struct list_head *objects,
+				   struct i915_address_space *vm,
 				   struct intel_ring_buffer *ring)
 {
 	struct drm_i915_gem_object *obj;
@@ -784,7 +799,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *objects,
 		obj->base.read_domains = obj->base.pending_read_domains;
 		obj->fenced_gpu_access = obj->pending_fenced_gpu_access;
 
-		i915_gem_object_move_to_active(obj, ring);
+		i915_gem_object_move_to_active(obj, vm, ring);
 		if (obj->base.write_domain) {
 			obj->dirty = 1;
 			obj->last_write_seqno = intel_ring_get_seqno(ring);
@@ -838,7 +853,8 @@ static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct i915_address_space *vm)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct eb_objects *eb;
@@ -1001,17 +1017,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	/* Move the objects en-masse into the GTT, evicting if necessary. */
 	need_relocs = (args->flags & I915_EXEC_NO_RELOC) == 0;
-	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, &need_relocs);
+	ret = i915_gem_execbuffer_reserve(ring, &eb->objects, vm, &need_relocs);
 	if (ret)
 		goto err;
 
 	/* The objects are in their final locations, apply the relocations. */
 	if (need_relocs)
-		ret = i915_gem_execbuffer_relocate(eb);
+		ret = i915_gem_execbuffer_relocate(eb, vm);
 	if (ret) {
 		if (ret == -EFAULT) {
 			ret = i915_gem_execbuffer_relocate_slow(dev, args, file, ring,
-								eb, exec);
+								eb, exec, vm);
 			BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 		}
 		if (ret)
@@ -1074,7 +1090,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj) + args->batch_start_offset;
+	exec_start = i915_gem_obj_offset(batch_obj, vm) +
+		args->batch_start_offset;
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
@@ -1099,11 +1116,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	trace_i915_gem_ring_dispatch(ring, intel_ring_get_seqno(ring), flags);
 
-	i915_gem_execbuffer_move_to_active(&eb->objects, ring);
+	i915_gem_execbuffer_move_to_active(&eb->objects, vm, ring);
 	i915_gem_execbuffer_retire_commands(dev, file, ring, batch_obj);
 
 err:
-	eb_destroy(eb);
+	eb_destroy(eb, vm);
 
 	mutex_unlock(&dev->struct_mutex);
 
@@ -1120,6 +1137,7 @@ int
 i915_gem_execbuffer(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer *args = data;
 	struct drm_i915_gem_execbuffer2 exec2;
 	struct drm_i915_gem_exec_object *exec_list = NULL;
@@ -1175,7 +1193,8 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 	exec2.flags = I915_EXEC_RENDER;
 	i915_execbuffer2_set_context_id(exec2, 0);
 
-	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		for (i = 0; i < args->buffer_count; i++)
@@ -1201,6 +1220,7 @@ int
 i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		     struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
 	int ret;
@@ -1231,7 +1251,8 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
+	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
+				     &dev_priv->gtt.base);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9f686c6..8b59729 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -278,12 +278,12 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	 * multiplied by page size. We allocate at the top of the GTT to avoid
 	 * fragmentation.
 	 */
-	BUG_ON(!drm_mm_initialized(&i915_gtt_vm->mm));
-	ret = drm_mm_insert_node_in_range_generic(&i915_gtt_vm->mm,
+	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
+	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
 						  &ppgtt->node, GEN6_PD_SIZE,
 						  GEN6_PD_ALIGN, 0,
 						  dev_priv->gtt.mappable_end,
-						  i915_gtt_vm->total,
+						  dev_priv->gtt.base.total,
 						  DRM_MM_TOPDOWN);
 	if (ret)
 		return ret;
@@ -382,6 +382,8 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 		drm_mm_init(&ppgtt->base.mm, ppgtt->base.start,
 			    ppgtt->base.total);
 
+	/* i915_init_vm(dev_priv, &ppgtt->base) */
+
 	return ret;
 }
 
@@ -389,17 +391,22 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
 {
-	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
-				   i915_gem_obj_offset(obj) >> PAGE_SHIFT,
-				   cache_level);
+	struct i915_address_space *vm = &ppgtt->base;
+	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
+
+	vm->insert_entries(vm, obj->pages,
+			   obj_offset >> PAGE_SHIFT,
+			   cache_level);
 }
 
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
 			      struct drm_i915_gem_object *obj)
 {
-	ppgtt->base.clear_range(&ppgtt->base,
-				i915_gem_obj_offset(obj) >> PAGE_SHIFT,
-				obj->base.size >> PAGE_SHIFT);
+	struct i915_address_space *vm = &ppgtt->base;
+	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
+
+	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
+			obj->base.size >> PAGE_SHIFT);
 }
 
 extern int intel_iommu_gfx_mapped;
@@ -443,12 +450,12 @@ static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible)
 void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 
 	/* First fill our portion of the GTT with scratch pages */
-	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
-				       i915_gtt_vm->start / PAGE_SIZE,
-				       i915_gtt_vm->total / PAGE_SIZE);
+	gtt_vm->clear_range(&dev_priv->gtt.base, gtt_vm->start / PAGE_SIZE,
+			    gtt_vm->total / PAGE_SIZE);
 
 	if (dev_priv->gtt.aliasing_ppgtt)
 		gen6_write_pdes(dev_priv->gtt.aliasing_ppgtt);
@@ -570,11 +577,11 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long obj_offset = i915_gem_obj_offset(obj);
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
+	uint32_t obj_offset = i915_gem_ggtt_offset(obj);
 
-	i915_gtt_vm->insert_entries(&dev_priv->gtt.base, obj->pages,
-					  obj_offset >> PAGE_SHIFT,
-					  cache_level);
+	gtt_vm->insert_entries(gtt_vm, obj->pages, obj_offset >> PAGE_SHIFT,
+			       cache_level);
 
 	obj->has_global_gtt_mapping = 1;
 }
@@ -583,11 +590,11 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long obj_offset = i915_gem_obj_offset(obj);
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
+	uint32_t obj_offset = i915_gem_obj_offset(obj, gtt_vm);
 
-	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
-				       obj_offset >> PAGE_SHIFT,
-				       obj->base.size >> PAGE_SHIFT);
+	gtt_vm->clear_range(gtt_vm, obj_offset >> PAGE_SHIFT,
+			    obj->base.size >> PAGE_SHIFT);
 
 	obj->has_global_gtt_mapping = 0;
 }
@@ -665,7 +672,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	 * aperture.  One page should be enough to keep any prefetching inside
 	 * of the aperture.
 	 */
-	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 	struct drm_mm_node *entry;
 	struct drm_i915_gem_object *obj;
 	unsigned long hole_start, hole_end;
@@ -675,50 +683,50 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 	if (WARN_ON(guard_size & ~PAGE_MASK))
 		guard_size = round_up(guard_size, PAGE_SIZE);
 
-	drm_mm_init(&i915_gtt_vm->mm, start, end - start - guard_size);
+	drm_mm_init(&gtt_vm->mm, start, end - start - guard_size);
 	if (!HAS_LLC(dev))
-		i915_gtt_vm->mm.color_adjust = i915_gtt_color_adjust;
+		gtt_vm->mm.color_adjust = i915_gtt_color_adjust;
 
 	/* Mark any preallocated objects as occupied */
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		struct i915_vma *vma = __i915_gem_obj_to_vma(obj);
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj, gtt_vm);
 		uintptr_t gtt_offset = (uintptr_t)vma->deferred_offset;
 		int ret;
 		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
-			      i915_gem_obj_offset(obj), obj->base.size);
+			      i915_gem_ggtt_offset(obj), obj->base.size);
 
 		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
 		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
-		ret = drm_mm_create_block(&i915_gtt_vm->mm,
+		ret = drm_mm_create_block(&gtt_vm->mm,
 					  &vma->node,
 					  gtt_offset,
 					  obj->base.size);
 		if (ret)
 			DRM_DEBUG_KMS("Reservation failed\n");
 		obj->has_global_gtt_mapping = 1;
-		list_add(&__i915_gem_obj_to_vma(obj)->vma_link, &obj->vma_list);
+		list_add(&vma->vma_link, &obj->vma_list);
 	}
 
-	i915_gtt_vm->start = start;
-	i915_gtt_vm->total = end - start;
+	gtt_vm->start = start;
+	gtt_vm->total = end - start;
 
 	/* Clear any non-preallocated blocks */
-	drm_mm_for_each_hole(entry, &i915_gtt_vm->mm,
-			     hole_start, hole_end) {
+	drm_mm_for_each_hole(entry, &gtt_vm->mm, hole_start, hole_end) {
 		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
 			      hole_start, hole_end);
-		i915_gtt_vm->clear_range(i915_gtt_vm, hole_start / PAGE_SIZE,
-				     (hole_end-hole_start) / PAGE_SIZE);
+		gtt_vm->clear_range(gtt_vm, hole_start / PAGE_SIZE,
+				    (hole_end-hole_start) / PAGE_SIZE);
 	}
 
 	/* And finally clear the reserved guard page */
-	i915_gtt_vm->clear_range(i915_gtt_vm, (end - guard_size) / PAGE_SIZE,
-				 guard_size / PAGE_SIZE);
+	gtt_vm->clear_range(gtt_vm, (end - guard_size) / PAGE_SIZE,
+			    guard_size / PAGE_SIZE);
 }
 
 static int setup_scratch_page(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 	struct page *page;
 	dma_addr_t dma_addr;
 
@@ -736,8 +744,8 @@ static int setup_scratch_page(struct drm_device *dev)
 #else
 	dma_addr = page_to_phys(page);
 #endif
-	i915_gtt_vm->scratch.page = page;
-	i915_gtt_vm->scratch.addr = dma_addr;
+	gtt_vm->scratch.page = page;
+	gtt_vm->scratch.addr = dma_addr;
 
 	return 0;
 }
@@ -745,12 +753,13 @@ static int setup_scratch_page(struct drm_device *dev)
 static void teardown_scratch_page(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 
-	set_pages_wb(i915_gtt_vm->scratch.page, 1);
-	pci_unmap_page(dev->pdev, i915_gtt_vm->scratch.addr,
+	set_pages_wb(gtt_vm->scratch.page, 1);
+	pci_unmap_page(dev->pdev, gtt_vm->scratch.addr,
 		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-	put_page(i915_gtt_vm->scratch.page);
-	__free_page(i915_gtt_vm->scratch.page);
+	put_page(gtt_vm->scratch.page);
+	__free_page(gtt_vm->scratch.page);
 }
 
 static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
@@ -774,6 +783,7 @@ static int gen6_gmch_probe(struct drm_device *dev,
 			   unsigned long *mappable_end)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 	phys_addr_t gtt_bus_addr;
 	unsigned int gtt_size;
 	u16 snb_gmch_ctl;
@@ -813,8 +823,8 @@ static int gen6_gmch_probe(struct drm_device *dev,
 	if (ret)
 		DRM_ERROR("Scratch setup failed\n");
 
-	i915_gtt_vm->clear_range = gen6_ggtt_clear_range;
-	i915_gtt_vm->insert_entries = gen6_ggtt_insert_entries;
+	gtt_vm->clear_range = gen6_ggtt_clear_range;
+	gtt_vm->insert_entries = gen6_ggtt_insert_entries;
 
 	return ret;
 }
@@ -833,6 +843,7 @@ static int i915_gmch_probe(struct drm_device *dev,
 			   unsigned long *mappable_end)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 	int ret;
 
 	ret = intel_gmch_probe(dev_priv->bridge_dev, dev_priv->dev->pdev, NULL);
@@ -844,8 +855,8 @@ static int i915_gmch_probe(struct drm_device *dev,
 	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
 
 	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
-	i915_gtt_vm->clear_range = i915_ggtt_clear_range;
-	i915_gtt_vm->insert_entries = i915_ggtt_insert_entries;
+	gtt_vm->clear_range = i915_ggtt_clear_range;
+	gtt_vm->insert_entries = i915_ggtt_insert_entries;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 13d24aa..4863219 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -328,6 +328,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *gtt_vm = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 	struct drm_mm_node *stolen;
 	struct i915_vma *vma;
@@ -369,7 +370,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (gtt_offset == -1)
 		return obj;
 
-	vma = i915_gem_vma_create(obj);
+	vma = i915_gem_vma_create(obj, gtt_vm);
 	if (!vma) {
 		drm_gem_object_unreference(&obj->base);
 		return NULL;
@@ -380,9 +381,9 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	 * setting up the GTT space. The actual reservation will occur
 	 * later.
 	 */
-	if (drm_mm_initialized(&i915_gtt_vm->mm)) {
-		ret = drm_mm_create_block(&i915_gtt_vm->mm, &vma->node,
-					  gtt_offset, size);
+	if (drm_mm_initialized(&gtt_vm->mm)) {
+		ret = drm_mm_create_block(&gtt_vm->mm, &vma->node, gtt_offset,
+					  size);
 		if (ret) {
 			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
 			i915_gem_vma_destroy(vma);
@@ -396,7 +397,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
+	list_add_tail(&obj->mm_list, &gtt_vm->inactive_list);
 
 	return obj;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index 2478114..25c89a0 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -268,18 +268,18 @@ i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
 		return true;
 
 	if (INTEL_INFO(obj->base.dev)->gen == 3) {
-		if (i915_gem_obj_offset(obj) & ~I915_FENCE_START_MASK)
+		if (i915_gem_ggtt_offset(obj) & ~I915_FENCE_START_MASK)
 			return false;
 	} else {
-		if (i915_gem_obj_offset(obj) & ~I830_FENCE_START_MASK)
+		if (i915_gem_ggtt_offset(obj) & ~I830_FENCE_START_MASK)
 			return false;
 	}
 
 	size = i915_gem_get_gtt_size(obj->base.dev, obj->base.size, tiling_mode);
-	if (i915_gem_obj_size(obj) != size)
+	if (i915_gem_ggtt_size(obj) != size)
 		return false;
 
-	if (i915_gem_obj_offset(obj) & (size - 1))
+	if (i915_gem_ggtt_offset(obj) & (size - 1))
 		return false;
 
 	return true;
@@ -358,19 +358,20 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
 		 * whilst executing a fenced command for an untiled object.
 		 */
 
-		obj->map_and_fenceable = !i915_gem_obj_bound(obj) ||
-			(i915_gem_obj_offset(obj) +
+		obj->map_and_fenceable = !i915_gem_obj_bound_ggtt(obj) ||
+			(i915_gem_ggtt_offset(obj) +
 			 obj->base.size <= dev_priv->gtt.mappable_end &&
 			 i915_gem_object_fence_ok(obj, args->tiling_mode));
 
 		/* Rebind if we need a change of alignment */
 		if (!obj->map_and_fenceable) {
-			u32 unfenced_alignment =
+			struct i915_address_space *ggtt = &dev_priv->gtt.base;
+			u32 unfenced_align =
 				i915_gem_get_gtt_alignment(dev, obj->base.size,
 							    args->tiling_mode,
 							    false);
-			if (i915_gem_obj_offset(obj) & (unfenced_alignment - 1))
-				ret = i915_gem_object_unbind(obj);
+			if (i915_gem_ggtt_offset(obj) & (unfenced_align - 1))
+				ret = i915_gem_object_unbind(obj, ggtt);
 		}
 
 		if (ret == 0) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index c0be641..050eea3 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1512,7 +1512,8 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 	if (dst == NULL)
 		return NULL;
 
-	reloc_offset = i915_gem_obj_offset(src);
+	/* FIXME: must handle per faulty VM */
+	reloc_offset = i915_gem_ggtt_offset(src);
 	for (i = 0; i < num_pages; i++) {
 		unsigned long flags;
 		void *d;
@@ -1564,7 +1565,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = num_pages;
-	dst->gtt_offset = i915_gem_obj_offset(src);
+	dst->gtt_offset = i915_gem_ggtt_offset(src);
 
 	return dst;
 
@@ -1618,7 +1619,8 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 	err->name = obj->base.name;
 	err->rseqno = obj->last_read_seqno;
 	err->wseqno = obj->last_write_seqno;
-	err->gtt_offset = i915_gem_obj_offset(obj);
+	/* FIXME: plumb the actual context into here to pull the right VM */
+	err->gtt_offset = i915_gem_ggtt_offset(obj);
 	err->read_domains = obj->base.read_domains;
 	err->write_domain = obj->base.write_domain;
 	err->fence_reg = obj->fence_reg;
@@ -1712,17 +1714,20 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	if (HAS_BROKEN_CS_TLB(dev_priv->dev)) {
 		u32 acthd = I915_READ(ACTHD);
 
+		if (WARN_ON(HAS_HW_CONTEXTS(dev_priv->dev)))
+			return NULL;
+
 		if (WARN_ON(ring->id != RCS))
 			return NULL;
 
 		obj = ring->private;
-		if (acthd >= i915_gem_obj_offset(obj) &&
-		    acthd < i915_gem_obj_offset(obj) + obj->base.size)
+		if (acthd >= i915_gem_ggtt_offset(obj) &&
+		    acthd < i915_gem_ggtt_offset(obj) + obj->base.size)
 			return i915_error_object_create(dev_priv, obj);
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
+	list_for_each_entry(obj, ggtt_list(active_list), mm_list) {
 		if (obj->ring != ring)
 			continue;
 
@@ -1798,7 +1803,7 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 		return;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
-		if ((error->ccid & PAGE_MASK) == i915_gem_obj_offset(obj)) {
+		if ((error->ccid & PAGE_MASK) == i915_gem_ggtt_offset(obj)) {
 			ering->ctx = i915_error_object_create_sized(dev_priv,
 								    obj, 1);
 		}
@@ -1857,7 +1862,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list)
+	list_for_each_entry(obj, ggtt_list(active_list), mm_list)
 		i++;
 	error->active_bo_count = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
@@ -1877,7 +1882,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 		error->active_bo_count =
 			capture_active_bo(error->active_bo,
 					  error->active_bo_count,
-					  &i915_gtt_vm->active_list);
+					  ggtt_list(active_list));
 
 	if (error->pinned_bo)
 		error->pinned_bo_count =
@@ -2152,10 +2157,10 @@ static void __always_unused i915_pageflip_stall_check(struct drm_device *dev, in
 	if (INTEL_INFO(dev)->gen >= 4) {
 		int dspsurf = DSPSURF(intel_crtc->plane);
 		stall_detected = I915_HI_DISPBASE(I915_READ(dspsurf)) ==
-					i915_gem_obj_offset(obj);
+					i915_gem_ggtt_offset(obj);
 	} else {
 		int dspaddr = DSPADDR(intel_crtc->plane);
-		stall_detected = I915_READ(dspaddr) == (i915_gem_obj_offset(obj) +
+		stall_detected = I915_READ(dspaddr) == (i915_gem_ggtt_offset(obj) +
 							crtc->y * crtc->fb->pitches[0] +
 							crtc->x * crtc->fb->bits_per_pixel/8);
 	}
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index e4dccb3..3f019d3 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -34,11 +34,13 @@ TRACE_EVENT(i915_gem_object_create,
 );
 
 TRACE_EVENT(i915_gem_object_bind,
-	    TP_PROTO(struct drm_i915_gem_object *obj, bool mappable),
-	    TP_ARGS(obj, mappable),
+	    TP_PROTO(struct drm_i915_gem_object *obj,
+		     struct i915_address_space *vm, bool mappable),
+	    TP_ARGS(obj, vm, mappable),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     __field(bool, mappable)
@@ -46,8 +48,8 @@ TRACE_EVENT(i915_gem_object_bind,
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_offset(obj);
-			   __entry->size = i915_gem_obj_size(obj);
+			   __entry->offset = i915_gem_obj_offset(obj, vm);
+			   __entry->size = i915_gem_obj_size(obj, vm);
 			   __entry->mappable = mappable;
 			   ),
 
@@ -57,19 +59,21 @@ TRACE_EVENT(i915_gem_object_bind,
 );
 
 TRACE_EVENT(i915_gem_object_unbind,
-	    TP_PROTO(struct drm_i915_gem_object *obj),
-	    TP_ARGS(obj),
+	    TP_PROTO(struct drm_i915_gem_object *obj,
+		     struct i915_address_space *vm),
+	    TP_ARGS(obj, vm),
 
 	    TP_STRUCT__entry(
 			     __field(struct drm_i915_gem_object *, obj)
+			     __field(struct i915_address_space *, vm)
 			     __field(u32, offset)
 			     __field(u32, size)
 			     ),
 
 	    TP_fast_assign(
 			   __entry->obj = obj;
-			   __entry->offset = i915_gem_obj_offset(obj);
-			   __entry->size = i915_gem_obj_size(obj);
+			   __entry->offset = i915_gem_obj_offset(obj, vm);
+			   __entry->size = i915_gem_obj_size(obj, vm);
 			   ),
 
 	    TP_printk("obj=%p, offset=%08x size=%x",
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 633bfbf..bd1d1bb 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1943,18 +1943,18 @@ static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 	}
 
 	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
-		      i915_gem_obj_offset(obj), linear_offset, x, y,
+		      i915_gem_ggtt_offset(obj), linear_offset, x, y,
 		      fb->pitches[0]);
 	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
 	if (INTEL_INFO(dev)->gen >= 4) {
 		I915_MODIFY_DISPBASE(DSPSURF(plane),
-				     i915_gem_obj_offset(obj) +
+				     i915_gem_ggtt_offset(obj) +
 				     intel_crtc->dspaddr_offset);
 		I915_WRITE(DSPTILEOFF(plane), (y << 16) | x);
 		I915_WRITE(DSPLINOFF(plane), linear_offset);
 	} else
 		I915_WRITE(DSPADDR(plane),
-			   i915_gem_obj_offset(obj) + linear_offset);
+			   i915_gem_ggtt_offset(obj) + linear_offset);
 	POSTING_READ(reg);
 
 	return 0;
@@ -2035,11 +2035,11 @@ static int ironlake_update_plane(struct drm_crtc *crtc,
 	linear_offset -= intel_crtc->dspaddr_offset;
 
 	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
-		      i915_gem_obj_offset(obj), linear_offset, x, y,
+		      i915_gem_ggtt_offset(obj), linear_offset, x, y,
 		      fb->pitches[0]);
 	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
 	I915_MODIFY_DISPBASE(DSPSURF(plane),
-			     i915_gem_obj_offset(obj)+intel_crtc->dspaddr_offset);
+			     i915_gem_ggtt_offset(obj)+intel_crtc->dspaddr_offset);
 	if (IS_HASWELL(dev)) {
 		I915_WRITE(DSPOFFSET(plane), (y << 16) | x);
 	} else {
@@ -6558,7 +6558,7 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
 			goto fail_unpin;
 		}
 
-		addr = i915_gem_obj_offset(obj);
+		addr = i915_gem_ggtt_offset(obj);
 	} else {
 		int align = IS_I830(dev) ? 16 * 1024 : 256;
 		ret = i915_gem_attach_phys_object(dev, obj,
@@ -7274,7 +7274,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
+			i915_gem_ggtt_offset(obj) + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, 0); /* aux display base address, unused */
 
 	intel_mark_page_flip_active(intel_crtc);
@@ -7316,7 +7316,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
+			i915_gem_ggtt_offset(obj) + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, MI_NOOP);
 
 	intel_mark_page_flip_active(intel_crtc);
@@ -7356,7 +7356,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0]);
 	intel_ring_emit(ring,
-			(i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset) |
+			(i915_gem_ggtt_offset(obj) + intel_crtc->dspaddr_offset) |
 			obj->tiling_mode);
 
 	/* XXX Enabling the panel-fitter across page-flip is so far
@@ -7400,7 +7400,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
 			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
 	intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
 	intel_ring_emit(ring,
-			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
+			i915_gem_ggtt_offset(obj) + intel_crtc->dspaddr_offset);
 
 	/* Contrary to the suggestions in the documentation,
 	 * "Enable Panel Fitter" does not seem to be required when page
@@ -7466,7 +7466,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
 	intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
 	intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
 	intel_ring_emit(ring,
-			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
+			i915_gem_ggtt_offset(obj) + intel_crtc->dspaddr_offset);
 	intel_ring_emit(ring, (MI_NOOP));
 
 	intel_mark_page_flip_active(intel_crtc);
diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 8315a5e..1e56ab2 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -139,11 +139,11 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	info->apertures->ranges[0].base = dev->mode_config.fb_base;
 	info->apertures->ranges[0].size = dev_priv->gtt.mappable_end;
 
-	info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_offset(obj);
+	info->fix.smem_start = dev->mode_config.fb_base + i915_gem_ggtt_offset(obj);
 	info->fix.smem_len = size;
 
 	info->screen_base =
-		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_offset(obj),
+		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_ggtt_offset(obj),
 			   size);
 	if (!info->screen_base) {
 		ret = -ENOSPC;
@@ -168,7 +168,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 
 	DRM_DEBUG_KMS("allocated %dx%d fb: 0x%08lx, bo %p\n",
 		      fb->width, fb->height,
-		      i915_gem_obj_offset(obj), obj);
+		      i915_gem_ggtt_offset(obj), obj);
 
 
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 41654b1..24aeb02 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -196,7 +196,7 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
 		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_obj->handle->vaddr;
 	else
 		regs = io_mapping_map_wc(dev_priv->gtt.mappable,
-					 i915_gem_obj_offset(overlay->reg_bo));
+					 i915_gem_ggtt_offset(overlay->reg_bo));
 
 	return regs;
 }
@@ -740,7 +740,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	swidth = params->src_w;
 	swidthsw = calc_swidthsw(overlay->dev, params->offset_Y, tmp_width);
 	sheight = params->src_h;
-	iowrite32(i915_gem_obj_offset(new_bo) + params->offset_Y,
+	iowrite32(i915_gem_ggtt_offset(new_bo) + params->offset_Y,
 		  &regs->OBUF_0Y);
 	ostride = params->stride_Y;
 
@@ -755,9 +755,9 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 				      params->src_w/uv_hscale);
 		swidthsw |= max_t(u32, tmp_U, tmp_V) << 16;
 		sheight |= (params->src_h/uv_vscale) << 16;
-		iowrite32(i915_gem_obj_offset(new_bo) + params->offset_U,
+		iowrite32(i915_gem_ggtt_offset(new_bo) + params->offset_U,
 			  &regs->OBUF_0U);
-		iowrite32(i915_gem_obj_offset(new_bo) + params->offset_V,
+		iowrite32(i915_gem_ggtt_offset(new_bo) + params->offset_V,
 			  &regs->OBUF_0V);
 		ostride |= params->stride_UV << 16;
 	}
@@ -1353,12 +1353,12 @@ void intel_setup_overlay(struct drm_device *dev)
 		}
 		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
 	} else {
-		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
+		ret = i915_gem_ggtt_pin(reg_bo, PAGE_SIZE, true, false);
 		if (ret) {
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
 		}
-		overlay->flip_addr = i915_gem_obj_offset(reg_bo);
+		overlay->flip_addr = i915_gem_ggtt_offset(reg_bo);
 
 		ret = i915_gem_object_set_to_gtt_domain(reg_bo, true);
 		if (ret) {
@@ -1437,7 +1437,7 @@ intel_overlay_map_regs_atomic(struct intel_overlay *overlay)
 			overlay->reg_bo->phys_obj->handle->vaddr;
 
 	return io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
-					i915_gem_obj_offset(overlay->reg_bo));
+					i915_gem_ggtt_offset(overlay->reg_bo));
 }
 
 static void intel_overlay_unmap_regs_atomic(struct intel_overlay *overlay,
@@ -1468,7 +1468,7 @@ intel_overlay_capture_error_state(struct drm_device *dev)
 	if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
 		error->base = (__force long)overlay->reg_bo->phys_obj->handle->vaddr;
 	else
-		error->base = i915_gem_obj_offset(overlay->reg_bo);
+		error->base = i915_gem_ggtt_offset(overlay->reg_bo);
 
 	regs = intel_overlay_map_regs_atomic(overlay);
 	if (!regs)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9bea2e0..f8f2f1d 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -217,7 +217,8 @@ static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 		   (stall_watermark << DPFC_RECOMP_STALL_WM_SHIFT) |
 		   (interval << DPFC_RECOMP_TIMER_COUNT_SHIFT));
 	I915_WRITE(ILK_DPFC_FENCE_YOFF, crtc->y);
-	I915_WRITE(ILK_FBC_RT_BASE, i915_gem_obj_offset(obj) | ILK_FBC_RT_VALID);
+	I915_WRITE(ILK_FBC_RT_BASE,
+		   i915_gem_ggtt_offset(obj) | ILK_FBC_RT_VALID);
 	/* enable it... */
 	I915_WRITE(ILK_DPFC_CONTROL, dpfc_ctl | DPFC_CTL_EN);
 
@@ -274,7 +275,7 @@ static void gen7_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
 	struct drm_i915_gem_object *obj = intel_fb->obj;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	I915_WRITE(IVB_FBC_RT_BASE, i915_gem_obj_offset(obj));
+	I915_WRITE(IVB_FBC_RT_BASE, i915_gem_ggtt_offset(obj));
 
 	I915_WRITE(ILK_DPFC_CONTROL, DPFC_CTL_EN | DPFC_CTL_LIMIT_1X |
 		   IVB_DPFC_CTL_FENCE_EN |
@@ -2860,7 +2861,7 @@ intel_alloc_context_page(struct drm_device *dev)
 		return NULL;
 	}
 
-	ret = i915_gem_object_pin(ctx, 4096, true, false);
+	ret = i915_gem_ggtt_pin(ctx, 4096, true, false);
 	if (ret) {
 		DRM_ERROR("failed to pin power context: %d\n", ret);
 		goto err_unref;
@@ -3685,7 +3686,7 @@ static void ironlake_enable_rc6(struct drm_device *dev)
 
 	intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
 	intel_ring_emit(ring, MI_SET_CONTEXT);
-	intel_ring_emit(ring, i915_gem_obj_offset(dev_priv->ips.renderctx) |
+	intel_ring_emit(ring, i915_gem_ggtt_offset(dev_priv->ips.renderctx) |
 			MI_MM_SPACE_GTT |
 			MI_SAVE_EXT_STATE_EN |
 			MI_RESTORE_EXT_STATE_EN |
@@ -3708,7 +3709,7 @@ static void ironlake_enable_rc6(struct drm_device *dev)
 		return;
 	}
 
-	I915_WRITE(PWRCTXA, i915_gem_obj_offset(dev_priv->ips.pwrctx) |
+	I915_WRITE(PWRCTXA, i915_gem_ggtt_offset(dev_priv->ips.pwrctx) |
 			    PWRCTX_EN);
 	I915_WRITE(RSTDBYCTL, I915_READ(RSTDBYCTL) & ~RCX_SW_EXIT);
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 64b579f..4c6cf56 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -424,14 +424,14 @@ static int init_ring_common(struct intel_ring_buffer *ring)
 	 * registers with the above sequence (the readback of the HEAD registers
 	 * also enforces ordering), otherwise the hw might lose the new ring
 	 * register values. */
-	I915_WRITE_START(ring, i915_gem_obj_offset(obj));
+	I915_WRITE_START(ring, i915_gem_ggtt_offset(obj));
 	I915_WRITE_CTL(ring,
 			((ring->size - PAGE_SIZE) & RING_NR_PAGES)
 			| RING_VALID);
 
 	/* If the head is still not zero, the ring is dead */
 	if (wait_for((I915_READ_CTL(ring) & RING_VALID) != 0 &&
-		     I915_READ_START(ring) == i915_gem_obj_offset(obj) &&
+		     I915_READ_START(ring) == i915_gem_ggtt_offset(obj) &&
 		     (I915_READ_HEAD(ring) & HEAD_ADDR) == 0, 50)) {
 		DRM_ERROR("%s initialization failed "
 				"ctl %08x head %08x tail %08x start %08x\n",
@@ -465,6 +465,7 @@ out:
 static int
 init_pipe_control(struct intel_ring_buffer *ring)
 {
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct pipe_control *pc;
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -483,13 +484,14 @@ init_pipe_control(struct intel_ring_buffer *ring)
 		goto err;
 	}
 
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
 	if (ret)
 		goto err_unref;
 
-	pc->gtt_offset = i915_gem_obj_offset(obj);
+	pc->gtt_offset = i915_gem_ggtt_offset(obj);
 	pc->cpu_page = kmap(sg_page(obj->pages->sgl));
 	if (pc->cpu_page == NULL) {
 		ret = -ENOMEM;
@@ -1129,7 +1131,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
 		intel_ring_advance(ring);
 	} else {
 		struct drm_i915_gem_object *obj = ring->private;
-		u32 cs_offset = i915_gem_obj_offset(obj);
+		u32 cs_offset = i915_gem_ggtt_offset(obj);
 
 		if (len > I830_BATCH_LIMIT)
 			return -ENOSPC;
@@ -1197,6 +1199,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
 static int init_status_page(struct intel_ring_buffer *ring)
 {
 	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -1207,14 +1210,15 @@ static int init_status_page(struct intel_ring_buffer *ring)
 		goto err;
 	}
 
-	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
+	i915_gem_object_set_cache_level(obj, &dev_priv->gtt.base,
+					I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true, false);
+	ret = i915_gem_ggtt_pin(obj, 4096, true, false);
 	if (ret != 0) {
 		goto err_unref;
 	}
 
-	ring->status_page.gfx_addr = i915_gem_obj_offset(obj);
+	ring->status_page.gfx_addr = i915_gem_ggtt_offset(obj);
 	ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
 	if (ring->status_page.page_addr == NULL) {
 		ret = -ENOMEM;
@@ -1299,7 +1303,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 
 	ring->obj = obj;
 
-	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
+	ret = i915_gem_ggtt_pin(obj, PAGE_SIZE, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -1308,7 +1312,8 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 		goto err_unpin;
 
 	ring->virtual_start =
-		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_offset(obj),
+		ioremap_wc(dev_priv->gtt.mappable_base +
+			   i915_gem_ggtt_offset(obj),
 			   ring->size);
 	if (ring->virtual_start == NULL) {
 		DRM_ERROR("Failed to map ringbuffer.\n");
@@ -1821,7 +1826,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			return -ENOMEM;
 		}
 
-		ret = i915_gem_object_pin(obj, 0, true, false);
+		ret = i915_gem_ggtt_pin(obj, 0, true, false);
 		if (ret != 0) {
 			drm_gem_object_unreference(&obj->base);
 			DRM_ERROR("Failed to ping batch bo\n");
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 117a2f8..3555cca 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -133,7 +133,7 @@ vlv_update_plane(struct drm_plane *dplane, struct drm_framebuffer *fb,
 
 	I915_WRITE(SPSIZE(pipe, plane), (crtc_h << 16) | crtc_w);
 	I915_WRITE(SPCNTR(pipe, plane), sprctl);
-	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), i915_gem_obj_offset(obj) +
+	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), i915_gem_ggtt_offset(obj) +
 			     sprsurf_offset);
 	POSTING_READ(SPSURF(pipe, plane));
 }
@@ -309,7 +309,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
 		I915_WRITE(SPRSCALE(pipe), sprscale);
 	I915_WRITE(SPRCTL(pipe), sprctl);
 	I915_MODIFY_DISPBASE(SPRSURF(pipe),
-			     i915_gem_obj_offset(obj) + sprsurf_offset);
+			     i915_gem_ggtt_offset(obj) + sprsurf_offset);
 	POSTING_READ(SPRSURF(pipe));
 
 	/* potentially re-enable LP watermarks */
@@ -480,7 +480,7 @@ ilk_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
 	I915_WRITE(DVSSCALE(pipe), dvsscale);
 	I915_WRITE(DVSCNTR(pipe), dvscntr);
 	I915_MODIFY_DISPBASE(DVSSURF(pipe),
-			     i915_gem_obj_offset(obj) + dvssurf_offset);
+			     i915_gem_ggtt_offset(obj) + dvssurf_offset);
 	POSTING_READ(DVSSURF(pipe));
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 34/66] drm/i915: Create VMAs (part 3.5) - map and fenceable tracking
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (32 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 33/66] drm/i915: Create VMAs (part 3) - plumbing Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 35/66] drm/i915: Create VMAs (part 4) - Error capture Ben Widawsky
                   ` (33 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This commit is split out because it's a bit tricky (or at least it was
for me). It could very well be squashed into the previous commits.

The map_and_fenceable tracking is per object. map_and_fenceable,
however, only makes sense in the context of the global GTT. As such,
VMAs created for any other address space should never modify these
flags.
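
In miniature, every write to the flag gains a GGTT check. A condensed
sketch of the pattern applied at the bind and unbind sites below
(is_i915_ggtt() is the helper used throughout the series):

	/* bind: the flag only ever reflects a global-GTT binding */
	if (is_i915_ggtt(vm))
		obj->map_and_fenceable = mappable && fenceable;

	/* unbind: reset the fast-path hint only when the GGTT VMA dies */
	if (is_i915_ggtt(vm))
		obj->map_and_fenceable = true;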

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8fe5f4e..83e2eb3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2644,7 +2644,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	trace_i915_gem_object_unbind(obj, vm);
 
-	if (obj->has_global_gtt_mapping)
+	if (obj->has_global_gtt_mapping && is_i915_ggtt(vm))
 		i915_gem_gtt_unbind_object(obj);
 	if (obj->has_aliasing_ppgtt_mapping) {
 		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
@@ -2655,7 +2655,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
-	obj->map_and_fenceable = true;
+	if (is_i915_ggtt(vm))
+		obj->map_and_fenceable = true;
 
 	vma = i915_gem_obj_to_vma(obj, vm);
 	list_del(&vma->vma_link);
@@ -3223,7 +3224,9 @@ search_free:
 		is_i915_ggtt(vm) &&
 		vma->node.start + obj->base.size <= dev_priv->gtt.mappable_end;
 
-	obj->map_and_fenceable = mappable && fenceable;
+	/* Map and fenceable only changes if the VM is the global GGTT */
+	if (is_i915_ggtt(vm))
+		obj->map_and_fenceable = mappable && fenceable;
 
 	trace_i915_gem_object_bind(obj, vm, map_and_fenceable);
 	i915_gem_verify_gtt(dev);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 35/66] drm/i915: Create VMAs (part 4) - Error capture
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (33 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 34/66] drm/i915: Create VMAs (part 3.5) - map and fenceable tracking Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 36/66] drm/i915: Create VMAs (part 5) - move mm_list Ben Widawsky
                   ` (32 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Since the active/inactive lists are per VM, we need to modify the error
capture code to be aware of this, and also extend it to capture the
buffers from all the VMs. For now, all the code assumes only 1 VM, but
it will become more generic over the next few patches.

NOTE: If the number of VMs in a real-world system grows significantly,
we'll have to focus on capturing only the guilty VM, or else it's
likely there won't be enough space for error capture.
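
Since the arrays become per VM, a consumer indexes by VM slot. A
minimal, hypothetical walker (print_capture() is illustrative only and
not part of this patch; the slot order matches dev_priv->vm_list):

	static void print_capture(struct drm_i915_error_state *error,
				  int num_vms)
	{
		int i;

		for (i = 0; i < num_vms; i++)
			DRM_DEBUG("vm %d: %u active, %u pinned\n", i,
				  error->active_bo_count[i],
				  error->pinned_bo_count[i]);
	}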

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   8 +--
 drivers/gpu/drm/i915/i915_drv.h     |   4 +-
 drivers/gpu/drm/i915/i915_irq.c     | 104 ++++++++++++++++++++++++------------
 3 files changed, 77 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index cf50389..7d01fb6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -842,13 +842,13 @@ static int i915_error_state(struct i915_error_state_file_priv *error_priv,
 
 	if (error->active_bo)
 		print_error_buffers(m, "Active",
-				    error->active_bo,
-				    error->active_bo_count);
+				    error->active_bo[0],
+				    error->active_bo_count[0]);
 
 	if (error->pinned_bo)
 		print_error_buffers(m, "Pinned",
-				    error->pinned_bo,
-				    error->pinned_bo_count);
+				    error->pinned_bo[0],
+				    error->pinned_bo_count[0]);
 
 	for (i = 0; i < ARRAY_SIZE(error->ring); i++) {
 		struct drm_i915_error_object *obj;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9042376..b0d1008 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -320,8 +320,8 @@ struct drm_i915_error_state {
 		u32 purgeable:1;
 		s32 ring:4;
 		u32 cache_level:2;
-	} *active_bo, *pinned_bo;
-	u32 active_bo_count, pinned_bo_count;
+	} **active_bo, **pinned_bo;
+	u32 *active_bo_count, *pinned_bo_count;
 	struct intel_overlay_error_state *overlay;
 	struct intel_display_error_state *display;
 };
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 050eea3..b786fcd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1607,7 +1607,11 @@ i915_error_state_free(struct kref *error_ref)
 		kfree(error->ring[i].requests);
 	}
 
+	/* FIXME: Assume always 1 VM for now */
+	kfree(error->active_bo[0]);
 	kfree(error->active_bo);
+	kfree(error->active_bo_count);
+	kfree(error->pinned_bo);
+	kfree(error->pinned_bo_count);
 	kfree(error->overlay);
 	kfree(error->display);
 	kfree(error);
@@ -1705,6 +1709,7 @@ static struct drm_i915_error_object *
 i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			     struct intel_ring_buffer *ring)
 {
+	struct i915_address_space *vm;
 	struct drm_i915_gem_object *obj;
 	u32 seqno;
 
@@ -1727,20 +1732,23 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	}
 
 	seqno = ring->get_seqno(ring, false);
-	list_for_each_entry(obj, ggtt_list(active_list), mm_list) {
-		if (obj->ring != ring)
-			continue;
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_for_each_entry(obj, &vm->active_list, mm_list) {
+			if (obj->ring != ring)
+				continue;
 
-		if (i915_seqno_passed(seqno, obj->last_read_seqno))
-			continue;
+			if (i915_seqno_passed(seqno, obj->last_read_seqno))
+				continue;
 
-		if ((obj->base.read_domains & I915_GEM_DOMAIN_COMMAND) == 0)
-			continue;
+			if (!(obj->base.read_domains & I915_GEM_DOMAIN_COMMAND))
+				continue;
 
-		/* We need to copy these to an anonymous buffer as the simplest
-		 * method to avoid being overwritten by userspace.
-		 */
-		return i915_error_object_create(dev_priv, obj);
+			/* We need to copy these to an anonymous buffer as the
+			 * simplest method to avoid being overwritten by
+			 * userspace.
+			 */
+			return i915_error_object_create(dev_priv, obj);
+		}
 	}
 
 	return NULL;
@@ -1855,40 +1863,70 @@ static void i915_gem_record_rings(struct drm_device *dev,
 	}
 }
 
-static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
-				     struct drm_i915_error_state *error)
+/* FIXME: Since pin count/bound list is global, we duplicate what we capture per
+ * VM.
+ */
+static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
+				struct drm_i915_error_state *error,
+				struct i915_address_space *vm,
+				const int ndx)
 {
+	struct drm_i915_error_buffer *active_bo = NULL, *pinned_bo = NULL;
 	struct drm_i915_gem_object *obj;
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, ggtt_list(active_list), mm_list)
+	list_for_each_entry(obj, &vm->active_list, mm_list)
 		i++;
-	error->active_bo_count = i;
+	error->active_bo_count[ndx] = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
 		if (obj->pin_count)
 			i++;
-	error->pinned_bo_count = i - error->active_bo_count;
+	error->pinned_bo_count[ndx] = i - error->active_bo_count[ndx];
 
 	if (i) {
-		error->active_bo = kmalloc(sizeof(*error->active_bo)*i,
-					   GFP_ATOMIC);
-		if (error->active_bo)
-			error->pinned_bo =
-				error->active_bo + error->active_bo_count;
-	}
-
-	if (error->active_bo)
-		error->active_bo_count =
-			capture_active_bo(error->active_bo,
-					  error->active_bo_count,
-					  ggtt_list(active_list));
-
-	if (error->pinned_bo)
-		error->pinned_bo_count =
-			capture_pinned_bo(error->pinned_bo,
-					  error->pinned_bo_count,
+		active_bo = kmalloc(sizeof(*active_bo)*i, GFP_ATOMIC);
+		if (active_bo)
+			pinned_bo = active_bo + error->active_bo_count[ndx];
+	}
+
+	if (active_bo)
+		error->active_bo_count[ndx] =
+			capture_active_bo(active_bo,
+					  error->active_bo_count[ndx],
+					  &vm->active_list);
+
+	if (pinned_bo)
+		error->pinned_bo_count[ndx] =
+			capture_pinned_bo(pinned_bo,
+					  error->pinned_bo_count[ndx],
 					  &dev_priv->mm.bound_list);
+	error->active_bo[ndx] = active_bo;
+	error->pinned_bo[ndx] = pinned_bo;
+}
+
+static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
+				     struct drm_i915_error_state *error)
+{
+	struct i915_address_space *vm;
+	int cnt = 0;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		cnt++;
+
+	if (WARN(cnt > 1, "Multiple VMs not yet supported\n"))
+		cnt = 1;
+
+	vm = &dev_priv->gtt.base;
+
+	error->active_bo = kcalloc(cnt, sizeof(*error->active_bo), GFP_ATOMIC);
+	error->pinned_bo = kcalloc(cnt, sizeof(*error->pinned_bo), GFP_ATOMIC);
+	error->active_bo_count = kcalloc(cnt, sizeof(*error->active_bo_count),
+					 GFP_ATOMIC);
+	error->pinned_bo_count = kcalloc(cnt, sizeof(*error->pinned_bo_count),
+					 GFP_ATOMIC);
+
+	i915_gem_capture_vm(dev_priv, error, vm, 0);
 }
 
 /**
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 36/66] drm/i915: Create VMAs (part 5) - move mm_list
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (34 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 35/66] drm/i915: Create VMAs (part 4) - Error capture Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 37/66] drm/i915: Create VMAs (part 6) - finish error plumbing Ben Widawsky
                   ` (31 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The mm_list is used for the active/inactive LRUs. Since those LRUs are
per address space, the link should be per VMA.

Because we only ever have 1 VMA per object before this point in the
series, it's not incorrect to defer this change until now, and doing it
here makes the change much easier to understand.

v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
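
The shape of the change in miniature, sketched against the debugfs walk
(the real sites are in the diff below):

	/* before: objects sat directly on the VM's LRU lists */
	list_for_each_entry(obj, &vm->inactive_list, mm_list)
		describe_obj(m, obj);

	/* after: the VMA carries the link; the object hangs off it */
	list_for_each_entry(vma, &vm->inactive_list, mm_list)
		describe_obj(m, vma->obj);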

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c    | 49 ++++++++++++++++++++++------------
 drivers/gpu/drm/i915/i915_drv.h        |  5 ++--
 drivers/gpu/drm/i915/i915_gem.c        | 31 ++++++++++++++-------
 drivers/gpu/drm/i915/i915_gem_evict.c  | 14 +++++-----
 drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
 drivers/gpu/drm/i915/i915_irq.c        | 13 +++++----
 6 files changed, 71 insertions(+), 43 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 7d01fb6..60d2a94 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -152,7 +152,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	struct list_head *head;
 	struct drm_device *dev = node->minor->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	size_t total_obj_size, total_gtt_size;
 	int count, ret;
 
@@ -160,6 +160,7 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	if (ret)
 		return ret;
 
+	/* FIXME: the user of this interface might want more than just GGTT */
 	switch (list) {
 	case ACTIVE_LIST:
 		seq_printf(m, "Active:\n");
@@ -175,13 +176,12 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
 	}
 
 	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(obj, head, mm_list) {
+	list_for_each_entry(vma, head, mm_list) {
 		seq_printf(m, "   ");
-		describe_obj(m, obj);
+		describe_obj(m, vma->obj);
 		seq_printf(m, "\n");
-		total_obj_size += obj->base.size;
-		/* FIXME: Add size of all VMs */
-		total_gtt_size += i915_gem_ggtt_size(obj);
+		total_obj_size += vma->obj->base.size;
+		total_gtt_size += i915_gem_ggtt_size(vma->obj);
 		count++;
 	}
 	mutex_unlock(&dev->struct_mutex);
@@ -229,6 +229,17 @@ static int per_file_stats(int id, void *ptr, void *data)
 	return 0;
 }
 
+#define count_vmas(list, member) do { \
+	list_for_each_entry(vma, list, member) { \
+		size += i915_gem_ggtt_size(vma->obj); \
+		++count; \
+		if (vma->obj->map_and_fenceable) { \
+			mappable_size += i915_gem_ggtt_size(vma->obj); \
+			++mappable_count; \
+		} \
+	} \
+} while (0)
+
 static int i915_gem_object_info(struct seq_file *m, void* data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -238,6 +249,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	size_t size, mappable_size, purgeable_size;
 	struct drm_i915_gem_object *obj;
 	struct drm_file *file;
+	struct i915_vma *vma;
 	int ret;
 
 	ret = mutex_lock_interruptible(&dev->struct_mutex);
@@ -254,12 +266,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(ggtt_list(active_list), mm_list);
+	count_vmas(ggtt_list(active_list), mm_list);
 	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(ggtt_list(inactive_list), mm_list);
+	count_vmas(ggtt_list(inactive_list), mm_list);
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -1966,6 +1978,8 @@ i915_drop_caches_set(void *data, u64 val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
+	struct i915_address_space *vm;
+	struct i915_vma *vma, *x;
 	int ret;
 
 	DRM_DEBUG_DRIVER("Dropping caches: 0x%08llx\n", val);
@@ -1986,15 +2000,16 @@ i915_drop_caches_set(void *data, u64 val)
 		i915_gem_retire_requests(dev);
 
 	if (val & DROP_BOUND) {
-		/* FIXME: Do this for all vms? */
-		list_for_each_entry_safe(obj, next, ggtt_list(inactive_list),
-					 mm_list)
-			if (obj->pin_count)
-				continue;
-
-			ret = i915_gem_object_unbind(obj, &dev_priv->gtt.base);
-			if (ret)
-				goto unlock;
+		list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+			list_for_each_entry_safe(vma, x, &vm->inactive_list,
+						 mm_list)
+				if (vma->obj->pin_count == 0) {
+					ret = i915_gem_object_unbind(vma->obj,
+								     vm);
+					if (ret)
+						goto unlock;
+				}
+		}
 	}
 
 	if (val & DROP_UNBOUND) {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b0d1008..cc18349 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -545,6 +545,9 @@ struct i915_vma {
 	/* Page aligned offset (helper for stolen) */
 	unsigned long deferred_offset;
 
+	/** This object's place on the active/inactive lists */
+	struct list_head mm_list;
+
 	struct list_head vma_link; /* Link in the object's VMA list */
 };
 
@@ -1237,9 +1240,7 @@ struct drm_i915_gem_object {
 	struct drm_mm_node *stolen;
 	struct list_head global_list;
 
-	/** This object's place on the active/inactive lists */
 	struct list_head ring_list;
-	struct list_head mm_list;
 	/** This object's place in the batchbuffer or on the eviction list */
 	struct list_head exec_list;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 83e2eb3..950a14b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1878,6 +1878,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	u32 seqno = intel_ring_get_seqno(ring);
+	struct i915_vma *vma;
 
 	BUG_ON(ring == NULL);
 	obj->ring = ring;
@@ -1889,7 +1890,8 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	}
 
 	/* Move from whatever list we were on to the tail of execution. */
-	list_move_tail(&obj->mm_list, &vm->active_list);
+	vma = i915_gem_obj_to_vma(obj, vm);
+	list_move_tail(&vma->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
 	obj->last_read_seqno = seqno;
@@ -1912,10 +1914,13 @@ static void
 i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 				 struct i915_address_space *vm)
 {
+	struct i915_vma *vma;
+
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
 	BUG_ON(!obj->active);
 
-	list_move_tail(&obj->mm_list, &vm->inactive_list);
+	vma = i915_gem_obj_to_vma(obj, vm);
+	list_move_tail(&vma->mm_list, &vm->inactive_list);
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -2290,9 +2295,9 @@ void i915_gem_restore_fences(struct drm_device *dev)
 bool i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj;
 	struct intel_ring_buffer *ring;
 	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	int i;
 	bool ctx_banned = false;
 
@@ -2305,8 +2310,8 @@ bool i915_gem_reset(struct drm_device *dev)
 	 * necessary invalidation upon reuse.
 	 */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-		list_for_each_entry(obj, &vm->inactive_list, mm_list)
-			obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
+		list_for_each_entry(vma, &vm->inactive_list, mm_list)
+			vma->obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 
 	/* The fence registers are invalidated so clear them out */
 	i915_gem_restore_fences(dev);
@@ -2653,12 +2658,12 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
-	list_del(&obj->mm_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	if (is_i915_ggtt(vm))
 		obj->map_and_fenceable = true;
 
 	vma = i915_gem_obj_to_vma(obj, vm);
+	list_del(&vma->mm_list);
 	list_del(&vma->vma_link);
 	drm_mm_remove_node(&vma->node);
 	i915_gem_vma_destroy(vma);
@@ -3208,7 +3213,7 @@ search_free:
 	}
 
 	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &vm->inactive_list);
+	list_add_tail(&vma->mm_list, &vm->inactive_list);
 	/* Keep GGTT vmas first to make debug easier */
 	if (is_i915_ggtt(vm))
 		list_add(&vma->vma_link, &obj->vma_list);
@@ -3364,8 +3369,14 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 					    old_write_domain);
 
 	/* And bump the LRU for this access */
-	if (i915_gem_object_is_inactive(obj))
-		list_move_tail(&obj->mm_list, ggtt_list(inactive_list));
+	if (i915_gem_object_is_inactive(obj)) {
+		struct i915_vma *vma = i915_gem_obj_to_vma(obj,
+							   &dev_priv->gtt.base);
+		if (vma)
+			list_move_tail(&vma->mm_list,
+				       &dev_priv->gtt.base.inactive_list);
+
+	}
 
 	return 0;
 }
@@ -3943,7 +3954,6 @@ unlock:
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			  const struct drm_i915_gem_object_ops *ops)
 {
-	INIT_LIST_HEAD(&obj->mm_list);
 	INIT_LIST_HEAD(&obj->global_list);
 	INIT_LIST_HEAD(&obj->ring_list);
 	INIT_LIST_HEAD(&obj->exec_list);
@@ -4086,6 +4096,7 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 		return ERR_PTR(-ENOMEM);
 
 	INIT_LIST_HEAD(&vma->vma_link);
+	INIT_LIST_HEAD(&vma->mm_list);
 	vma->vm = vm;
 	vma->obj = obj;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 7a210b8..028e8b1 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -86,8 +86,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		drm_mm_init_scan(&vm->mm, min_size, alignment, cache_level);
 
 	/* First see if there is a large enough contiguous idle region... */
-	list_for_each_entry(obj, &vm->inactive_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->inactive_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -96,8 +95,7 @@ i915_gem_evict_something(struct drm_device *dev, struct i915_address_space *vm,
 		goto none;
 
 	/* Now merge in the soon-to-be-expired objects... */
-	list_for_each_entry(obj, &vm->active_list, mm_list) {
-		struct i915_vma *vma = i915_gem_obj_to_vma(obj, vm);
+	list_for_each_entry(vma, &vm->active_list, mm_list) {
 		if (mark_free(vma, &unwind_list))
 			goto found;
 	}
@@ -158,8 +156,8 @@ int
 i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj, *next;
 	struct i915_address_space *vm;
+	struct i915_vma *vma, *next;
 	bool lists_empty = true;
 	int ret;
 
@@ -187,9 +185,9 @@ i915_gem_evict_everything(struct drm_device *dev)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		list_for_each_entry_safe(obj, next, &vm->inactive_list, mm_list)
-			if (obj->pin_count == 0)
-				WARN_ON(i915_gem_object_unbind(obj, vm));
+		list_for_each_entry_safe(vma, next, &vm->inactive_list, mm_list)
+			if (vma->obj->pin_count == 0)
+				WARN_ON(i915_gem_object_unbind(vma->obj, vm));
 	}
 
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 4863219..d393298 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -397,7 +397,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj->has_global_gtt_mapping = 1;
 
 	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
-	list_add_tail(&obj->mm_list, &gtt_vm->inactive_list);
+	list_add_tail(&vma->mm_list, &gtt_vm->inactive_list);
 
 	return obj;
 }
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index b786fcd..28880bf 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1643,11 +1643,11 @@ static void capture_bo(struct drm_i915_error_buffer *err,
 static u32 capture_active_bo(struct drm_i915_error_buffer *err,
 			     int count, struct list_head *head)
 {
-	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
 	int i = 0;
 
-	list_for_each_entry(obj, head, mm_list) {
-		capture_bo(err++, obj);
+	list_for_each_entry(vma, head, mm_list) {
+		capture_bo(err++, vma->obj);
 		if (++i == count)
 			break;
 	}
@@ -1710,6 +1710,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			     struct intel_ring_buffer *ring)
 {
 	struct i915_address_space *vm;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	u32 seqno;
 
@@ -1733,7 +1734,8 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 
 	seqno = ring->get_seqno(ring, false);
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
-		list_for_each_entry(obj, &vm->active_list, mm_list) {
+		list_for_each_entry(vma, &vm->active_list, mm_list) {
+			obj = vma->obj;
 			if (obj->ring != ring)
 				continue;
 
@@ -1872,11 +1874,12 @@ static void i915_gem_capture_vm(struct drm_i915_private *dev_priv,
 				const int ndx)
 {
 	struct drm_i915_error_buffer *active_bo = NULL, *pinned_bo = NULL;
+	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	int i;
 
 	i = 0;
-	list_for_each_entry(obj, &vm->active_list, mm_list)
+	list_for_each_entry(vma, &vm->active_list, mm_list)
 		i++;
 	error->active_bo_count[ndx] = i;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 37/66] drm/i915: Create VMAs (part 6) - finish error plumbing
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (35 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 36/66] drm/i915: Create VMAs (part 5) - move mm_list Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 38/66] drm/i915: create an object_is_active() Ben Widawsky
                   ` (30 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_irq.c | 53 ++++++++++++++++++++++++++++-------------
 1 file changed, 37 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 28880bf..e1653fd 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1499,6 +1499,7 @@ static void i915_get_extra_instdone(struct drm_device *dev,
 static struct drm_i915_error_object *
 i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 			       struct drm_i915_gem_object *src,
+			       struct i915_address_space *vm,
 			       const int num_pages)
 {
 	struct drm_i915_error_object *dst;
@@ -1512,8 +1513,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 	if (dst == NULL)
 		return NULL;
 
-	/* FIXME: must handle per faulty VM */
-	reloc_offset = i915_gem_ggtt_offset(src);
+	reloc_offset = i915_gem_obj_offset(src, vm);
 	for (i = 0; i < num_pages; i++) {
 		unsigned long flags;
 		void *d;
@@ -1565,7 +1565,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
 		reloc_offset += PAGE_SIZE;
 	}
 	dst->page_count = num_pages;
-	dst->gtt_offset = i915_gem_ggtt_offset(src);
+	dst->gtt_offset = i915_gem_obj_offset(src, vm);
 
 	return dst;
 
@@ -1575,8 +1575,9 @@ unwind:
 	kfree(dst);
 	return NULL;
 }
-#define i915_error_object_create(dev_priv, src) \
+#define i915_error_object_create(dev_priv, src, vm) \
 	i915_error_object_create_sized((dev_priv), (src), \
+				       vm, \
 				       (src)->base.size>>PAGE_SHIFT)
 
 static void
@@ -1617,14 +1618,14 @@ i915_error_state_free(struct kref *error_ref)
 	kfree(error);
 }
 static void capture_bo(struct drm_i915_error_buffer *err,
-		       struct drm_i915_gem_object *obj)
+		       struct drm_i915_gem_object *obj,
+		       struct i915_address_space *vm)
 {
 	err->size = obj->base.size;
 	err->name = obj->base.name;
 	err->rseqno = obj->last_read_seqno;
 	err->wseqno = obj->last_write_seqno;
-	/* FIXME: plumb the actual context into here to pull the right VM */
-	err->gtt_offset = i915_gem_ggtt_offset(obj);
+	err->gtt_offset = i915_gem_obj_offset(obj, vm);
 	err->read_domains = obj->base.read_domains;
 	err->write_domain = obj->base.write_domain;
 	err->fence_reg = obj->fence_reg;
@@ -1647,7 +1648,7 @@ static u32 capture_active_bo(struct drm_i915_error_buffer *err,
 	int i = 0;
 
 	list_for_each_entry(vma, head, mm_list) {
-		capture_bo(err++, vma->obj);
+		capture_bo(err++, vma->obj, vma->vm);
 		if (++i == count)
 			break;
 	}
@@ -1662,10 +1663,14 @@ static u32 capture_pinned_bo(struct drm_i915_error_buffer *err,
 	int i = 0;
 
 	list_for_each_entry(obj, head, global_list) {
+		struct i915_vma *vma;
 		if (obj->pin_count == 0)
 			continue;
 
-		capture_bo(err++, obj);
+		/* Object may be pinned in multiple VMs, just take first */
+		vma = list_first_entry(&obj->vma_list, struct i915_vma,
+				       vma_link);
+		capture_bo(err++, obj, vma->vm);
 		if (++i == count)
 			break;
 	}
@@ -1713,6 +1718,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 	struct i915_vma *vma;
 	struct drm_i915_gem_object *obj;
 	u32 seqno;
+	u32 pp_db;
 
 	if (!ring->get_seqno)
 		return NULL;
@@ -1729,11 +1735,19 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 		obj = ring->private;
 		if (acthd >= i915_gem_ggtt_offset(obj) &&
 		    acthd < i915_gem_ggtt_offset(obj) + obj->base.size)
-			return i915_error_object_create(dev_priv, obj);
+			return i915_error_object_create(dev_priv, obj,
+							&dev_priv->gtt.base);
 	}
 
+	pp_db = I915_READ(RING_PP_DIR_BASE(ring));
 	seqno = ring->get_seqno(ring, false);
+
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		struct i915_hw_ppgtt *ppgtt =
+			container_of(vm, struct i915_hw_ppgtt, base);
+		if (!is_i915_ggtt(vm) && pp_db >> 10 != ppgtt->pd_offset)
+			continue;
+
 		list_for_each_entry(vma, &vm->active_list, mm_list) {
 			obj = vma->obj;
 			if (obj->ring != ring)
@@ -1749,7 +1763,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
 			 * simplest method to avoid being overwritten by
 			 * userspace.
 			 */
-			return i915_error_object_create(dev_priv, obj);
+			return i915_error_object_create(dev_priv, obj, vm);
 		}
 	}
 
@@ -1806,6 +1820,7 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 					   struct drm_i915_error_ring *ering)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct i915_address_space *ggtt = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
 
 	/* Currently render ring is the only HW context user */
@@ -1813,9 +1828,14 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
 		return;
 
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		if (!i915_gem_obj_bound(obj, ggtt))
+			continue;
+
 		if ((error->ccid & PAGE_MASK) == i915_gem_ggtt_offset(obj)) {
 			ering->ctx = i915_error_object_create_sized(dev_priv,
-								    obj, 1);
+								    obj,
+								    ggtt,
+								    1);
 		}
 	}
 }
@@ -1835,8 +1855,8 @@ static void i915_gem_record_rings(struct drm_device *dev,
 			i915_error_first_batchbuffer(dev_priv, ring);
 
 		error->ring[i].ringbuffer =
-			i915_error_object_create(dev_priv, ring->obj);
-
+			i915_error_object_create(dev_priv, ring->obj,
+						 &dev_priv->gtt.base);
 
 		i915_gem_record_active_context(ring, error, &error->ring[i]);
 
@@ -1912,7 +1932,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 				     struct drm_i915_error_state *error)
 {
 	struct i915_address_space *vm;
-	int cnt = 0;
+	int cnt = 0, i = 0;
 
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
 		cnt++;
@@ -1929,7 +1949,8 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 	error->pinned_bo_count = kcalloc(cnt, sizeof(*error->pinned_bo_count),
 					 GFP_ATOMIC);
 
-	i915_gem_capture_vm(dev_priv, error, vm, 0);
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
+		i915_gem_capture_vm(dev_priv, error, vm, i++);
 }
 
 /**
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 38/66] drm/i915: create an object_is_active()
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (36 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 37/66] drm/i915: Create VMAs (part 6) - finish error plumbing Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 39/66] drm/i915: Move active to vma Ben Widawsky
                   ` (29 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This is simply obj->active for now, but will serve a purpose when we
track activity per vma.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  1 +
 drivers/gpu/drm/i915/i915_gem.c            | 18 ++++++++++++------
 drivers/gpu/drm/i915/i915_gem_context.c    |  2 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  2 +-
 4 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index cc18349..b3eb067 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1709,6 +1709,7 @@ static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
+bool i915_gem_object_is_active(struct drm_i915_gem_object *obj);
 void i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 				    struct i915_address_space *vm,
 				    struct intel_ring_buffer *ring);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 950a14b..f448804 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -134,10 +134,16 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	return 0;
 }
 
+/* NB: Not the same as !i915_gem_object_is_inactive */
+bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
+{
+	return obj->active;
+}
+
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return i915_gem_obj_bound_any(obj) && !obj->active;
+	return i915_gem_obj_bound_any(obj) && !i915_gem_object_is_active(obj);
 }
 
 int
@@ -1884,7 +1890,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	obj->ring = ring;
 
 	/* Add a reference if we're newly entering the active list. */
-	if (!obj->active) {
+	if (!i915_gem_object_is_active(obj)) {
 		drm_gem_object_reference(&obj->base);
 		obj->active = 1;
 	}
@@ -1917,7 +1923,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
-	BUG_ON(!obj->active);
+	BUG_ON(!i915_gem_object_is_active(obj));
 
 	vma = i915_gem_obj_to_vma(obj, vm);
 	list_move_tail(&vma->mm_list, &vm->inactive_list);
@@ -2446,7 +2452,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 {
 	int ret;
 
-	if (obj->active) {
+	if (i915_gem_object_is_active(obj)) {
 		ret = i915_gem_check_olr(obj->ring, obj->last_read_seqno);
 		if (ret)
 			return ret;
@@ -2511,7 +2517,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	if (ret)
 		goto out;
 
-	if (obj->active) {
+	if (i915_gem_object_is_active(obj)) {
 		seqno = obj->last_read_seqno;
 		ring = obj->ring;
 	}
@@ -3885,7 +3891,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 	 */
 	ret = i915_gem_object_flush_active(obj);
 
-	args->busy = obj->active;
+	args->busy = i915_gem_object_is_active(obj);
 	if (obj->ring) {
 		BUILD_BUG_ON(I915_NUM_RINGS > 16);
 		args->busy |= intel_ring_flag(obj->ring) << 16;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 5d5a60f..988123f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -292,7 +292,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 	WARN_ON(!dev_priv->ring[RCS].last_context);
 	if (dev_priv->ring[RCS].last_context == dctx) {
 		/* Fake switch to NULL context */
-		WARN_ON(dctx->obj->active);
+		WARN_ON(i915_gem_object_is_active(dctx->obj));
 		i915_gem_object_unpin(dctx->obj);
 		i915_gem_context_unreference(dctx);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 620f395..7e9823f 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -251,7 +251,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	}
 
 	/* We can't wait for rendering with pagefaults disabled */
-	if (obj->active && in_atomic())
+	if (i915_gem_object_is_active(obj) && in_atomic())
 		return -EFAULT;
 
 	reloc->delta += target_offset;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 39/66] drm/i915: Move active to vma
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (37 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 38/66] drm/i915: create an object_is_active() Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 40/66] drm/i915: Track all VMAs per VM Ben Widawsky
                   ` (28 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Probably need to squash the whole thing, or just the inactive part; TBD...

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
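The subtle part is reference counting: a VMA that becomes active takes
one GEM reference on its object, so retiring must drop exactly one
reference per VMA it deactivates - hence the while (i--) unreference
loop in the hunk below. A distilled sketch of that rule (not driver
code; the function name is illustrative):

/* Sketch: the object's activity refcount equals the number of its
 * VMAs with vma->active set; deactivating drops one ref per VMA. */
static void retire_object(struct drm_i915_gem_object *obj)
{
	struct i915_vma *vma;
	int refs_to_drop = 0;

	list_for_each_entry(vma, &obj->vma_list, vma_link) {
		if (!vma->active)
			continue;
		vma->active = 0;
		refs_to_drop++;
	}

	while (refs_to_drop--)
		drm_gem_object_unreference(&obj->base);
}
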
 drivers/gpu/drm/i915/i915_drv.h | 14 ++++++------
 drivers/gpu/drm/i915/i915_gem.c | 47 ++++++++++++++++++++++++-----------------
 2 files changed, 35 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b3eb067..247a124 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -545,6 +545,13 @@ struct i915_vma {
 	/* Page aligned offset (helper for stolen) */
 	unsigned long deferred_offset;
 
+	/**
+	 * This is set if the object is on the active lists (has pending
+	 * rendering and so a non-zero seqno), and is not set if it is on
+	 * inactive (ready to be unbound) list.
+	 */
+	unsigned int active:1;
+
 	/** This object's place on the active/inactive lists */
 	struct list_head mm_list;
 
@@ -1245,13 +1252,6 @@ struct drm_i915_gem_object {
 	struct list_head exec_list;
 
 	/**
-	 * This is set if the object is on the active lists (has pending
-	 * rendering and so a non-zero seqno), and is not set if it is on
-	 * inactive (ready to be unbound) list.
-	 */
-	unsigned int active:1;
-
-	/**
 	 * This is set if the object has been written to since last bound
 	 * to the GTT
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f448804..a3e8c26 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -137,7 +137,13 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 /* NB: Not the same as !i915_gem_object_is_inactive */
 bool i915_gem_object_is_active(struct drm_i915_gem_object *obj)
 {
-	return obj->active;
+	struct i915_vma *vma;
+
+	list_for_each_entry(vma, &obj->vma_list, vma_link)
+		if (vma->active)
+			return true;
+
+	return false;
 }
 
 static inline bool
@@ -1889,14 +1895,14 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 	BUG_ON(ring == NULL);
 	obj->ring = ring;
 
+	/* Move from whatever list we were on to the tail of execution. */
+	vma = i915_gem_obj_to_vma(obj, vm);
 	/* Add a reference if we're newly entering the active list. */
-	if (!i915_gem_object_is_active(obj)) {
+	if (!vma->active) {
 		drm_gem_object_reference(&obj->base);
-		obj->active = 1;
+		vma->active = 1;
 	}
 
-	/* Move from whatever list we were on to the tail of execution. */
-	vma = i915_gem_obj_to_vma(obj, vm);
 	list_move_tail(&vma->mm_list, &vm->active_list);
 	list_move_tail(&obj->ring_list, &ring->active_list);
 
@@ -1917,16 +1923,23 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 }
 
 static void
-i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
-				 struct i915_address_space *vm)
+i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	struct i915_address_space *vm;
 	struct i915_vma *vma;
+	int i = 0;
 
 	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
-	BUG_ON(!i915_gem_object_is_active(obj));
 
-	vma = i915_gem_obj_to_vma(obj, vm);
-	list_move_tail(&vma->mm_list, &vm->inactive_list);
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		vma = i915_gem_obj_to_vma(obj, vm);
+		if (!vma || !vma->active)
+			continue;
+		list_move_tail(&vma->mm_list, &vm->inactive_list);
+		vma->active = 0;
+		i++;
+	}
 
 	list_del_init(&obj->ring_list);
 	obj->ring = NULL;
@@ -1938,8 +1951,8 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj,
 	obj->last_fenced_seqno = 0;
 	obj->fenced_gpu_access = false;
 
-	obj->active = 0;
-	drm_gem_object_unreference(&obj->base);
+	while (i--)
+		drm_gem_object_unreference(&obj->base);
 
 	WARN_ON(i915_verify_lists(dev));
 }
@@ -2273,15 +2286,13 @@ static bool i915_gem_reset_ring_lists(struct drm_i915_private *dev_priv,
 	}
 
 	while (!list_empty(&ring->active_list)) {
-		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
 				       struct drm_i915_gem_object,
 				       ring_list);
 
-		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-			i915_gem_object_move_to_inactive(obj, vm);
+		i915_gem_object_move_to_inactive(obj);
 	}
 
 	return ctx_banned;
@@ -2365,8 +2376,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	 * by the ringbuffer to the flushing/inactive lists as appropriate.
 	 */
 	while (!list_empty(&ring->active_list)) {
-		struct drm_i915_private *dev_priv = ring->dev->dev_private;
-		struct i915_address_space *vm;
 		struct drm_i915_gem_object *obj;
 
 		obj = list_first_entry(&ring->active_list,
@@ -2376,8 +2385,8 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		if (!i915_seqno_passed(seqno, obj->last_read_seqno))
 			break;
 
-		list_for_each_entry(vm, &dev_priv->vm_list, global_link)
-			i915_gem_object_move_to_inactive(obj, vm);
+		BUG_ON(!i915_gem_object_is_active(obj));
+		i915_gem_object_move_to_inactive(obj);
 	}
 
 	if (unlikely(ring->trace_irq_seqno &&
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 40/66] drm/i915: Track all VMAs per VM
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (38 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 39/66] drm/i915: Move active to vma Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-30 15:35   ` Daniel Vetter
  2013-06-27 23:30 ` [PATCH 41/66] drm/i915: Defer request freeing Ben Widawsky
                   ` (27 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This allows us to be aware of all the VMAs left over at teardown, and
is useful for debugging. I suspect it will prove even more useful
later.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
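As a hypothetical example of the debug use mentioned above (a sketch,
not part of the patch): a teardown-time check that walks the new
per-VM list and reports anything left behind.

/* Sketch: report VMAs still hanging off a VM at teardown time. */
static void vm_report_leftovers(struct i915_address_space *vm)
{
	struct i915_vma *vma;

	WARN(!list_empty(&vm->vma_list),
	     "vm %p torn down with live VMAs\n", vm);

	list_for_each_entry(vma, &vm->vma_list, per_vm_link)
		DRM_DEBUG_DRIVER("  leftover vma for obj %p\n", vma->obj);
}
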
 drivers/gpu/drm/i915/i915_drv.h | 2 ++
 drivers/gpu/drm/i915/i915_gem.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 247a124..0bc4251 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -446,6 +446,7 @@ struct i915_address_space {
 	struct drm_mm mm;
 	struct drm_device *dev;
 	struct list_head global_link;
+	struct list_head vma_list;
 	unsigned long start;		/* Start offset always 0 for dri2 */
 	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
 
@@ -556,6 +557,7 @@ struct i915_vma {
 	struct list_head mm_list;
 
 	struct list_head vma_link; /* Link in the object's VMA list */
+	struct list_head per_vm_link; /* Link in the VM's VMA list */
 };
 
 struct i915_ctx_hang_stats {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a3e8c26..5c0ad6a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4112,14 +4112,17 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&vma->vma_link);
 	INIT_LIST_HEAD(&vma->mm_list);
+	INIT_LIST_HEAD(&vma->per_vm_link);
 	vma->vm = vm;
 	vma->obj = obj;
+	list_add_tail(&vma->per_vm_link, &vm->vma_list);
 
 	return vma;
 }
 
 void i915_gem_vma_destroy(struct i915_vma *vma)
 {
+	list_del(&vma->per_vm_link);
 	WARN_ON(vma->node.allocated);
 	kfree(vma);
 }
@@ -4473,6 +4476,7 @@ static void i915_init_vm(struct drm_i915_private *dev_priv,
 	INIT_LIST_HEAD(&vm->active_list);
 	INIT_LIST_HEAD(&vm->inactive_list);
 	INIT_LIST_HEAD(&vm->global_link);
+	INIT_LIST_HEAD(&vm->vma_list);
 	list_add(&vm->global_link, &dev_priv->vm_list);
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 41/66] drm/i915: Defer request freeing
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (39 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 40/66] drm/i915: Track all VMAs per VM Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 42/66] drm/i915: Clean up VMAs before freeing Ben Widawsky
                   ` (26 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With context destruction, we always want to be able to tear down the
underlying address space. Destruction is invoked on the last
unreference to the context, which can happen before we've moved all
objects to the inactive list. To enable a clean teardown of the
address space, make sure the request freeing is processed last.

Without this change, we cannot guarantee that we don't still have
active objects in the VM.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
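The idiom, reduced to its core (a sketch only - the seqno checks and
the actual retirement work are elided): splice the requests onto a
private list, finish everything that may still need the context's VM,
and only then free.

static void retire_ring_sketch(struct intel_ring_buffer *ring)
{
	LIST_HEAD(deferred_free);
	struct drm_i915_gem_request *request, *next;

	/* 1) Unhook the requests to retire, but keep them alive. */
	list_for_each_entry_safe(request, next, &ring->request_list, list)
		list_move_tail(&request->list, &deferred_free);

	/* 2) Move objects to their VMs' inactive lists here; that can
	 * still dereference each request's context and address space. */

	/* 3) Only now free. This may drop the last context reference
	 * and tear down its VM; i915_gem_free_request() unlinks the
	 * entry from whatever list it is on. */
	list_for_each_entry_safe(request, next, &deferred_free, list)
		i915_gem_free_request(request);
}
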
 drivers/gpu/drm/i915/i915_gem.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5c0ad6a..12d0e61 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2342,6 +2342,8 @@ bool i915_gem_reset(struct drm_device *dev)
 void
 i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 {
+	LIST_HEAD(deferred_request_free);
+	struct drm_i915_gem_request *request;
 	uint32_t seqno;
 
 	if (list_empty(&ring->request_list))
@@ -2352,8 +2354,6 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 	seqno = ring->get_seqno(ring, true);
 
 	while (!list_empty(&ring->request_list)) {
-		struct drm_i915_gem_request *request;
-
 		request = list_first_entry(&ring->request_list,
 					   struct drm_i915_gem_request,
 					   list);
@@ -2369,7 +2369,7 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		 */
 		ring->last_retired_head = request->tail;
 
-		i915_gem_free_request(request);
+		list_move_tail(&request->list, &deferred_request_free);
 	}
 
 	/* Move any buffers on the active list that are no longer referenced
@@ -2395,6 +2395,13 @@ i915_gem_retire_requests_ring(struct intel_ring_buffer *ring)
 		ring->trace_irq_seqno = 0;
 	}
 
+	/* Finish processing active list before freeing request */
+	while (!list_empty(&deferred_request_free)) {
+		request = list_first_entry(&deferred_request_free,
+					   struct drm_i915_gem_request,
+					   list);
+		i915_gem_free_request(request);
+	}
 	WARN_ON(i915_verify_lists(ring->dev));
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 42/66] drm/i915: Clean up VMAs before freeing
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (40 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 41/66] drm/i915: Defer request freeing Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-07-02 10:59   ` Ville Syrjälä
  2013-06-27 23:30 ` [PATCH 43/66] drm/i915: Replace has_bsd/blt with a mask Ben Widawsky
                   ` (25 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

It's quite common for an object to simply be on the inactive list (and
not unbound) when we want to free the context. This of course happens
with lazy unbinding. Put simply, this is needed when an object isn't
fully unbound but we want to free one VMA of the object, for whatever
reason.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem.c         | 28 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.c |  1 +
 3 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0bc4251..9febcdd 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1674,6 +1674,7 @@ void i915_gem_free_object(struct drm_gem_object *obj);
 struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm);
 void i915_gem_vma_destroy(struct i915_vma *vma);
+void i915_gem_vma_cleanup(struct i915_address_space *vm);
 
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 12d0e61..9abc3c8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4134,6 +4134,34 @@ void i915_gem_vma_destroy(struct i915_vma *vma)
 	kfree(vma);
 }
 
+/* This is like unbind() but without gtt considerations */
+void i915_gem_vma_cleanup(struct i915_address_space *vm)
+{
+	struct drm_i915_private *dev_priv = vm->dev->dev_private;
+	struct i915_vma *vma, *n;
+
+	BUG_ON(is_i915_ggtt(vm));
+	WARN_ON(!list_empty(&vm->active_list));
+
+	list_for_each_entry_safe(vma, n, &vm->vma_list, per_vm_link) {
+		struct drm_i915_gem_object *obj = vma->obj;
+
+		if (WARN_ON(!i915_gem_obj_bound(obj, vm)))
+			continue;
+
+		i915_gem_object_unpin_pages(obj);
+
+		list_del(&vma->mm_list);
+		list_del(&vma->vma_link);
+		drm_mm_remove_node(&vma->node);
+		i915_gem_vma_destroy(vma);
+
+		if (list_empty(&obj->vma_list))
+			list_move_tail(&obj->global_list,
+				       &dev_priv->mm.unbound_list);
+	}
+}
+
 int
 i915_gem_idle(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 988123f..c45cd5c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -129,6 +129,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
 	struct i915_hw_context *ctx = container_of(ctx_ref,
 						   typeof(*ctx), ref);
 
+	i915_gem_vma_cleanup(&ctx->ppgtt.base);
 	if (ctx->ppgtt.cleanup)
 		ctx->ppgtt.cleanup(&ctx->ppgtt);
 	drm_gem_object_unreference(&ctx->obj->base);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 43/66] drm/i915: Replace has_bsd/blt with a mask
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (41 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 42/66] drm/i915: Clean up VMAs before freeing Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 44/66] drm/i915: Catch missed context unref earlier Ben Widawsky
                   ` (24 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

v2: Big conflict resolution on Damien's DEV_INFO_FOR_EACH stuff

v3: Resolved vebox addition

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
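One convenience the mask buys - a hypothetical helper, not added by
this patch - is iterating only the rings a device actually has rather
than open-coding HAS_BSD()/HAS_BLT() checks everywhere:

/* Hypothetical sketch: visit only the rings present in ring_mask. */
#define for_each_hw_ring(ring__, dev_priv__, i__) \
	for ((i__) = 0; (i__) < I915_NUM_RINGS; (i__)++) \
		if (((ring__) = &(dev_priv__)->ring[(i__)]), \
		    INTEL_INFO((dev_priv__)->dev)->ring_mask & (1 << (i__)))

Patch 46 below open-codes the same ring_mask test when creating a
default context per ring.
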
 drivers/gpu/drm/i915/i915_drv.c | 28 ++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_drv.h | 11 ++++++++---
 2 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 30346ee..d11ebc0 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -157,49 +157,58 @@ extern int intel_agp_enabled;
 static const struct intel_device_info intel_i830_info = {
 	.gen = 2, .is_mobile = 1, .cursor_needs_physical = 1, .num_pipes = 2,
 	.has_overlay = 1, .overlay_needs_physical = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_845g_info = {
 	.gen = 2, .num_pipes = 1,
 	.has_overlay = 1, .overlay_needs_physical = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_i85x_info = {
 	.gen = 2, .is_i85x = 1, .is_mobile = 1, .num_pipes = 2,
 	.cursor_needs_physical = 1,
 	.has_overlay = 1, .overlay_needs_physical = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_i865g_info = {
 	.gen = 2, .num_pipes = 1,
 	.has_overlay = 1, .overlay_needs_physical = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_i915g_info = {
 	.gen = 3, .is_i915g = 1, .cursor_needs_physical = 1, .num_pipes = 2,
 	.has_overlay = 1, .overlay_needs_physical = 1,
+	.ring_mask = RENDER_RING,
 };
 static const struct intel_device_info intel_i915gm_info = {
 	.gen = 3, .is_mobile = 1, .num_pipes = 2,
 	.cursor_needs_physical = 1,
 	.has_overlay = 1, .overlay_needs_physical = 1,
 	.supports_tv = 1,
+	.ring_mask = RENDER_RING,
 };
 static const struct intel_device_info intel_i945g_info = {
 	.gen = 3, .has_hotplug = 1, .cursor_needs_physical = 1, .num_pipes = 2,
 	.has_overlay = 1, .overlay_needs_physical = 1,
+	.ring_mask = RENDER_RING,
 };
 static const struct intel_device_info intel_i945gm_info = {
 	.gen = 3, .is_i945gm = 1, .is_mobile = 1, .num_pipes = 2,
 	.has_hotplug = 1, .cursor_needs_physical = 1,
 	.has_overlay = 1, .overlay_needs_physical = 1,
 	.supports_tv = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_i965g_info = {
 	.gen = 4, .is_broadwater = 1, .num_pipes = 2,
 	.has_hotplug = 1,
 	.has_overlay = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_i965gm_info = {
@@ -207,18 +216,20 @@ static const struct intel_device_info intel_i965gm_info = {
 	.is_mobile = 1, .has_fbc = 1, .has_hotplug = 1,
 	.has_overlay = 1,
 	.supports_tv = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_g33_info = {
 	.gen = 3, .is_g33 = 1, .num_pipes = 2,
 	.need_gfx_hws = 1, .has_hotplug = 1,
 	.has_overlay = 1,
+	.ring_mask = RENDER_RING,
 };
 
 static const struct intel_device_info intel_g45_info = {
 	.gen = 4, .is_g4x = 1, .need_gfx_hws = 1, .num_pipes = 2,
 	.has_pipe_cxsr = 1, .has_hotplug = 1,
-	.has_bsd_ring = 1,
+	.ring_mask = RENDER_RING | BSD_RING,
 };
 
 static const struct intel_device_info intel_gm45_info = {
@@ -226,7 +237,7 @@ static const struct intel_device_info intel_gm45_info = {
 	.is_mobile = 1, .need_gfx_hws = 1, .has_fbc = 1,
 	.has_pipe_cxsr = 1, .has_hotplug = 1,
 	.supports_tv = 1,
-	.has_bsd_ring = 1,
+	.ring_mask = RENDER_RING | BSD_RING,
 };
 
 static const struct intel_device_info intel_pineview_info = {
@@ -238,21 +249,20 @@ static const struct intel_device_info intel_pineview_info = {
 static const struct intel_device_info intel_ironlake_d_info = {
 	.gen = 5, .num_pipes = 2,
 	.need_gfx_hws = 1, .has_hotplug = 1,
-	.has_bsd_ring = 1,
+	.ring_mask = RENDER_RING | BSD_RING,
 };
 
 static const struct intel_device_info intel_ironlake_m_info = {
 	.gen = 5, .is_mobile = 1, .num_pipes = 2,
 	.need_gfx_hws = 1, .has_hotplug = 1,
 	.has_fbc = 1,
-	.has_bsd_ring = 1,
+	.ring_mask = RENDER_RING | BSD_RING,
 };
 
 static const struct intel_device_info intel_sandybridge_d_info = {
 	.gen = 6, .num_pipes = 2,
 	.need_gfx_hws = 1, .has_hotplug = 1,
-	.has_bsd_ring = 1,
-	.has_blt_ring = 1,
+	.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
 	.has_llc = 1,
 	.has_force_wake = 1,
 };
@@ -261,8 +271,7 @@ static const struct intel_device_info intel_sandybridge_m_info = {
 	.gen = 6, .is_mobile = 1, .num_pipes = 2,
 	.need_gfx_hws = 1, .has_hotplug = 1,
 	.has_fbc = 1,
-	.has_bsd_ring = 1,
-	.has_blt_ring = 1,
+	.ring_mask = RENDER_RING | BSD_RING | BLT_RING,
 	.has_llc = 1,
 	.has_force_wake = 1,
 };
@@ -270,9 +279,8 @@ static const struct intel_device_info intel_sandybridge_m_info = {
 #define GEN7_FEATURES  \
 	.gen = 7, .num_pipes = 3, \
 	.need_gfx_hws = 1, .has_hotplug = 1, \
-	.has_bsd_ring = 1, \
-	.has_blt_ring = 1, \
 	.has_llc = 1, \
+	.ring_mask = RENDER_RING | BSD_RING | BLT_RING, \
 	.has_force_wake = 1
 
 static const struct intel_device_info intel_ivybridge_d_info = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 9febcdd..f17d825 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -428,6 +428,7 @@ struct intel_device_info {
 	u32 display_mmio_offset;
 	u8 num_pipes:3;
 	u8 gen;
+	u8 ring_mask; /* Rings supported by the HW */
 	DEV_INFO_FOR_EACH_FLAG(DEFINE_FLAG, SEP_SEMICOLON);
 };
 
@@ -1457,9 +1458,13 @@ struct drm_i915_file_private {
 #define IS_GEN6(dev)	(INTEL_INFO(dev)->gen == 6)
 #define IS_GEN7(dev)	(INTEL_INFO(dev)->gen == 7)
 
-#define HAS_BSD(dev)            (INTEL_INFO(dev)->has_bsd_ring)
-#define HAS_BLT(dev)            (INTEL_INFO(dev)->has_blt_ring)
-#define HAS_VEBOX(dev)          (INTEL_INFO(dev)->has_vebox_ring)
+#define RENDER_RING		(1<<0)
+#define BSD_RING		(1<<1)
+#define BLT_RING		(1<<2)
+#define VEBOX_RING		(1<<3)
+#define HAS_BSD(dev)            (INTEL_INFO(dev)->ring_mask & BSD_RING)
+#define HAS_BLT(dev)            (INTEL_INFO(dev)->ring_mask & BLT_RING)
+#define HAS_VEBOX(dev)            (INTEL_INFO(dev)->ring_mask & VEBOX_RING)
 #define HAS_LLC(dev)            (INTEL_INFO(dev)->has_llc)
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 44/66] drm/i915: Catch missed context unref earlier
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (42 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 43/66] drm/i915: Replace has_bsd/blt with a mask Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 45/66] drm/i915: Add a context open function Ben Widawsky
                   ` (23 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         | 4 ++--
 drivers/gpu/drm/i915/i915_gem_context.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f17d825..bead414 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1900,9 +1900,9 @@ static inline void i915_gem_context_reference(struct i915_hw_context *ctx)
 	kref_get(&ctx->ref);
 }
 
-static inline void i915_gem_context_unreference(struct i915_hw_context *ctx)
+static inline int i915_gem_context_unreference(struct i915_hw_context *ctx)
 {
-	kref_put(&ctx->ref, i915_gem_context_free);
+	return kref_put(&ctx->ref, i915_gem_context_free);
 }
 
 struct i915_ctx_hang_stats * __must_check
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c45cd5c..9176559 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -299,7 +299,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 	}
 
 	i915_gem_object_unpin(dctx->obj);
-	i915_gem_context_unreference(dctx);
+	WARN_ON(!i915_gem_context_unreference(dctx));
 	dev_priv->ring[RCS].default_context = NULL;
 	dev_priv->ring[RCS].last_context = NULL;
 	dev_priv->gtt.aliasing_ppgtt = NULL;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 45/66] drm/i915: Add a context open function
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (43 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 44/66] drm/i915: Catch missed context unref earlier Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 46/66] drm/i915: Permit contexts on all rings Ben Widawsky
                   ` (22 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

We'll be doing a bit more with each file, so having our own open
function should keep things clean.

This also allows us to easily add conditionals for work we don't want
to do when we don't have HW contexts.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c         |  4 +---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem_context.c | 22 ++++++++++++++++++++++
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4b330e5..9955dc7 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1789,9 +1789,7 @@ int i915_driver_open(struct drm_device *dev, struct drm_file *file)
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
 
-	idr_init(&file_priv->context_idr);
-
-	return 0;
+	return i915_gem_context_open(dev, file);
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bead414..19ae0dc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1891,6 +1891,7 @@ i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
+int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
 int i915_switch_context(struct intel_ring_buffer *ring,
 			struct drm_file *file, int to_id);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 9176559..a5ac6dd 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -351,9 +351,31 @@ i915_gem_context_get_hang_stats(struct intel_ring_buffer *ring,
 	return &to->hang_stats;
 }
 
+int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	mutex_lock(&dev->struct_mutex);
+	if (dev_priv->hw_contexts_disabled) {
+		mutex_unlock(&dev->struct_mutex);
+		return 0;
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+
+	idr_init(&file_priv->context_idr);
+
+	return 0;
+}
+
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+
+	if (dev_priv->hw_contexts_disabled)
+		return;
 
 	mutex_lock(&dev->struct_mutex);
 	idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 46/66] drm/i915: Permit contexts on all rings
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (44 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 45/66] drm/i915: Add a context open function Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 47/66] drm/i915: Fix context fini refcounts Ben Widawsky
                   ` (21 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

If we want to use contexts in more abstract terms (specifically with
PPGTT in mind), we need to allow them to be specified for any ring.

NOTE: This commit requires an update to intel-gpu-tools to keep it
from failing.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
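From userspace, what this permits (a sketch assuming the existing uapi
in i915_drm.h; fd, context id, and exec list come from elsewhere, and
error handling is elided) is an execbuffer on a non-render ring with an
explicit context, which previously returned -EPERM:

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Sketch: submit a batch on the BLT ring under a specific context. */
static int submit_blit(int fd, uint32_t ctx_id,
		       struct drm_i915_gem_exec_object2 *exec,
		       uint32_t count)
{
	struct drm_i915_gem_execbuffer2 eb;

	memset(&eb, 0, sizeof(eb));
	eb.buffers_ptr = (uintptr_t)exec;
	eb.buffer_count = count;
	eb.flags = I915_EXEC_BLT;
	i915_execbuffer2_set_context_id(eb, ctx_id);

	return ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &eb);
}
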
 drivers/gpu/drm/i915/i915_gem_context.c    | 56 ++++++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 16 ---------
 2 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index a5ac6dd..74714e56 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -97,7 +97,8 @@
 
 static struct i915_hw_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
-static int do_switch(struct i915_hw_context *to);
+static int do_switch(struct intel_ring_buffer *ring,
+		     struct i915_hw_context *to);
 
 static int get_context_size(struct drm_device *dev)
 {
@@ -204,13 +205,19 @@ static inline bool is_default_context(struct i915_hw_context *ctx)
  * context state of the GPU for applications that don't utilize HW contexts, as
  * well as an idle case.
  */
-static int create_default_context(struct drm_i915_private *dev_priv)
+static int create_default_context(struct drm_i915_private *dev_priv,
+				  struct intel_ring_buffer *ring)
 {
 	struct i915_hw_context *ctx;
 	int ret;
 
 	BUG_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
+	if (dev_priv->ring[RCS].default_context) {
+		ring->default_context = dev_priv->ring[RCS].default_context;
+		return 0;
+	}
+
 	ctx = create_hw_context(dev_priv->dev, NULL);
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);
@@ -241,6 +248,8 @@ err_destroy:
 void i915_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring;
+	int i;
 
 	if (!HAS_HW_CONTEXTS(dev)) {
 		dev_priv->hw_contexts_disabled = true;
@@ -262,10 +271,17 @@ void i915_gem_context_init(struct drm_device *dev)
 		return;
 	}
 
-	if (create_default_context(dev_priv)) {
-		dev_priv->hw_contexts_disabled = true;
-		DRM_DEBUG_DRIVER("Disabling HW Contexts; create failed\n");
-		return;
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		if (!(INTEL_INFO(dev)->ring_mask & (1<<i)))
+			continue;
+
+		ring = &dev_priv->ring[i];
+		if (create_default_context(dev_priv, ring)) {
+			dev_priv->hw_contexts_disabled = true;
+			DRM_DEBUG_DRIVER("Disabling HW Contexts; create failed\n");
+			return;
+		}
 	}
 
 	DRM_DEBUG_DRIVER("HW context support initialized\n");
@@ -310,7 +326,8 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv)
 	if (dev_priv->hw_contexts_disabled)
 		return 0;
 	BUG_ON(!dev_priv->ring[RCS].default_context);
-	return do_switch(dev_priv->ring[RCS].default_context);
+	return do_switch(&dev_priv->ring[RCS],
+			 dev_priv->ring[RCS].default_context);
 }
 
 static int context_idr_cleanup(int id, void *p, void *data)
@@ -437,19 +454,32 @@ mi_set_context(struct intel_ring_buffer *ring,
 	return ret;
 }
 
-static int do_switch(struct i915_hw_context *to)
+static int do_switch(struct intel_ring_buffer *ring,
+		     struct i915_hw_context *to)
 {
-	struct intel_ring_buffer *ring = to->ring;
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
 	struct i915_hw_context *from = ring->last_context;
 	u32 hw_flags = 0;
 	int ret;
 
-	BUG_ON(from != NULL && from->obj != NULL && from->obj->pin_count == 0);
+	if (from != NULL && ring == &dev_priv->ring[RCS]) {
+		BUG_ON(from->obj == NULL);
+		BUG_ON(from->obj->pin_count == 0);
+	}
 
 	if (from == to)
 		return 0;
 
+	if (ring != &dev_priv->ring[RCS] && from) {
+		ret = i915_add_request(ring, NULL);
+		if (ret)
+			return ret;
+		i915_gem_context_unreference(from);
+	}
+
+	if (ring != &dev_priv->ring[RCS])
+		goto done;
+
 	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		return ret;
@@ -514,6 +544,7 @@ static int do_switch(struct i915_hw_context *to)
 		i915_gem_context_unreference(from);
 	}
 
+done:
 	i915_gem_context_reference(to);
 	ring->last_context = to;
 	to->is_initialized = true;
@@ -546,9 +577,6 @@ int i915_switch_context(struct intel_ring_buffer *ring,
 
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
-	if (ring != &dev_priv->ring[RCS])
-		return 0;
-
 	if (to_id == DEFAULT_CONTEXT_ID) {
 		to = ring->default_context;
 	} else {
@@ -560,7 +588,7 @@ int i915_switch_context(struct intel_ring_buffer *ring,
 			return -ENOENT;
 	}
 
-	return do_switch(to);
+	return do_switch(ring, to);
 }
 
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7e9823f..b3e3658 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -892,29 +892,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		break;
 	case I915_EXEC_BSD:
 		ring = &dev_priv->ring[VCS];
-		if (ctx_id != 0) {
-			DRM_DEBUG("Ring %s doesn't support contexts\n",
-				  ring->name);
-			return -EPERM;
-		}
 		break;
 	case I915_EXEC_BLT:
 		ring = &dev_priv->ring[BCS];
-		if (ctx_id != 0) {
-			DRM_DEBUG("Ring %s doesn't support contexts\n",
-				  ring->name);
-			return -EPERM;
-		}
 		break;
 	case I915_EXEC_VEBOX:
 		ring = &dev_priv->ring[VECS];
-		if (ctx_id != 0) {
-			DRM_DEBUG("Ring %s doesn't support contexts\n",
-				  ring->name);
-			return -EPERM;
-		}
 		break;
-
 	default:
 		DRM_DEBUG("execbuf with unknown ring: %d\n",
 			  (int)(args->flags & I915_EXEC_RING_MASK));
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 47/66] drm/i915: Fix context fini refcounts
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (45 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 46/66] drm/i915: Permit contexts on all rings Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 48/66] drm/i915: Better reset handling for contexts Ben Widawsky
                   ` (20 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

With multiple rings having a last context, we must now unreference these
contexts.

This could be squashed if desired, since without it the previous patch
is left broken. However, because it was a bit tricky to catch, I've
left it separate, primarily for review purposes.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 74714e56..3e0413e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -291,6 +291,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_hw_context *dctx = dev_priv->ring[RCS].default_context;
+	int i;
 
 	if (dev_priv->hw_contexts_disabled)
 		return;
@@ -312,12 +313,20 @@ void i915_gem_context_fini(struct drm_device *dev)
 		WARN_ON(i915_gem_object_is_active(dctx->obj));
 		i915_gem_object_unpin(dctx->obj);
 		i915_gem_context_unreference(dctx);
+		dev_priv->ring[RCS].last_context = NULL;
+	}
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		struct intel_ring_buffer *ring = &dev_priv->ring[i];
+		if (!(INTEL_INFO(dev)->ring_mask & (1<<i)))
+			continue;
+		if (ring->last_context)
+			i915_gem_context_unreference(ring->last_context);
+		ring->default_context = NULL;
 	}
 
 	i915_gem_object_unpin(dctx->obj);
 	WARN_ON(!i915_gem_context_unreference(dctx));
-	dev_priv->ring[RCS].default_context = NULL;
-	dev_priv->ring[RCS].last_context = NULL;
 	dev_priv->gtt.aliasing_ppgtt = NULL;
 }
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 48/66] drm/i915: Better reset handling for contexts
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (46 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 47/66] drm/i915: Fix context fini refcounts Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 49/66] drm/i915: Create a per file_priv default context Ben Widawsky
                   ` (19 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This patch adds two changes for contexts on reset:
Set the last context to the default - this will prevent the context
switch from happening after a reset. That switch is not possible
because the rings are hung during reset, and a context switch cannot
happen until after the reset completes. This behavior will need to be
reworked in the future, but this is what we want for now.

In the future, we'll also want to reset the guilty context to
uninitialized. We should wait for ARB_Robustness related code to land
for that.

This is somewhat for paranoia. Because we really don't know what the
GPU was doing when it hung, or the state it was in (mid context write,
for example), restoring the context later is a bad idea. By marking the
context as not initialized, the next load of that context will not
restore the state, and the subsequent switch away from the context will
overwrite the old data.
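
As a sketch of how the flag is consumed - this is roughly the existing
logic in do_switch(), quoted from memory rather than from this patch:

	hw_flags = 0;
	if (!to->is_initialized || is_default_context(to))
		hw_flags |= MI_RESTORE_INHIBIT; /* skip restoring saved state */

	ret = mi_set_context(ring, to, hw_flags);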

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         |  1 +
 drivers/gpu/drm/i915/i915_gem.c         |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 54 +++++++++++++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 19ae0dc..5beba7a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1890,6 +1890,7 @@ i915_gem_ggtt_pin(struct drm_i915_gem_object *obj,
 /* i915_gem_context.c */
 void i915_gem_context_init(struct drm_device *dev);
 void i915_gem_context_fini(struct drm_device *dev);
+void i915_gem_context_reset(struct drm_device *dev);
 int i915_gem_context_enable(struct drm_i915_private *dev_priv);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9abc3c8..d440dd5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2323,6 +2323,8 @@ bool i915_gem_reset(struct drm_device *dev)
 
 	i915_gem_cleanup_ringbuffer(dev);
 
+	i915_gem_context_reset(dev);
+
 	/* Move everything out of the GPU domains to ensure we do any
 	 * necessary invalidation upon reuse.
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 3e0413e..e585e5a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -245,6 +245,60 @@ err_destroy:
 	return ret;
 }
 
+static int context_reset(int id, void *p, void *data)
+{
+#if 0
+	struct i915_hw_context *ctx = p;
+	ctx->is_initialized = false;
+#endif
+	return 0;
+}
+
+void i915_gem_context_reset(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct intel_ring_buffer *ring;
+	struct drm_file *file;
+	int i;
+
+	if (dev_priv->hw_contexts_disabled)
+		return;
+
+	/* Prevent the hardware from restoring the last context (which hung) on
+	 * the next switch */
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		struct i915_hw_context *dctx;
+		if (!(INTEL_INFO(dev)->ring_mask & (1<<i)))
+			continue;
+
+		/* Do a fake switch to the default context */
+		ring = &dev_priv->ring[i];
+		dctx = ring->default_context;
+		if (WARN_ON(!dctx))
+			continue;
+
+		if (!ring->last_context)
+			continue;
+
+		if (ring->last_context == dctx)
+			continue;
+
+		if (i == RCS)
+			WARN_ON(i915_gem_ggtt_pin(dctx->obj, CONTEXT_ALIGN,
+						  false, false));
+
+		i915_gem_context_unreference(ring->last_context);
+		i915_gem_context_reference(dctx);
+		ring->last_context = dctx;
+	}
+
+
+	list_for_each_entry(file, &dev->filelist, lhead) {
+		struct drm_i915_file_private *file_priv = file->driver_priv;
+		idr_for_each(&file_priv->context_idr, context_reset, NULL);
+	}
+}
+
 void i915_gem_context_init(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 49/66] drm/i915: Create a per file_priv default context
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (47 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 48/66] drm/i915: Better reset handling for contexts Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 50/66] drm/i915: Remove ring specificity from contexts Ben Widawsky
                   ` (18 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Every file will get its own context, and we use this context instead of
the default context. The default context still exists for future
shrinker usage as well as reset handling.
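
As a hypothetical illustration of the user-visible effect (device path
and file descriptors are made up; the ioctl interface is unchanged):

	int fd_a = open("/dev/dri/card0", O_RDWR); /* private_default_ctx A */
	int fd_b = open("/dev/dri/card0", O_RDWR); /* private_default_ctx B */
	/* An execbuf with context id 0 on fd_a resolves to context A; the
	 * same id 0 on fd_b resolves to context B, so each fd eventually
	 * gets its own address space. */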

Since open will now fail if there is no space in the GGTT for the PPGTT
PDEs and the context object, try calling the shrinker once to see
whether we can carry on before giving up.

v2: Updated to address Mika's recent context guilty changes
Some more changes around this come up in later patches as well.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Conflicts:
	drivers/gpu/drm/i915/i915_gem_context.c
---
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         |  6 ++++--
 drivers/gpu/drm/i915/i915_gem_context.c | 31 ++++++++++++++++++++-----------
 drivers/gpu/drm/i915/i915_gem_gtt.c     | 13 +++++++++++++
 4 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 5beba7a..4fbc4ab 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1409,7 +1409,7 @@ struct drm_i915_file_private {
 	} mm;
 	struct idr context_idr;
 
-	struct i915_ctx_hang_stats hang_stats;
+	struct i915_hw_context *private_default_ctx;
 };
 
 #define INTEL_INFO(dev)	(((struct drm_i915_private *) (dev)->dev_private)->info)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d440dd5..c91abda 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2209,9 +2209,10 @@ static bool i915_set_reset_status(struct intel_ring_buffer *ring,
 	/* Innocent until proven guilty */
 	guilty = banned = false;
 
-	if (request->batch_obj)
+	if (request->batch_obj) {
 		offset = i915_gem_obj_offset(request->batch_obj,
 					     request_to_vm(request));
+	}
 
 	if (ring->hangcheck.action != wait &&
 	    i915_request_guilty(request, acthd, &inside)) {
@@ -2231,7 +2232,7 @@ static bool i915_set_reset_status(struct intel_ring_buffer *ring,
 	if (request->ctx && request->ctx->id != DEFAULT_CONTEXT_ID)
 		hs = &request->ctx->hang_stats;
 	else if (request->file_priv)
-		hs = &request->file_priv->hang_stats;
+		hs = &request->file_priv->private_default_ctx->hang_stats;
 
 	if (hs) {
 		if (guilty) {
@@ -4859,6 +4860,7 @@ unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o,
 			return vma->node.start;
 
 	}
+
 	WARN_ON(1);
 	return I915_INVALID_OFFSET;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e585e5a..0d6ed19 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -197,7 +197,8 @@ err_out:
 
 static inline bool is_default_context(struct i915_hw_context *ctx)
 {
-	return (ctx == ctx->ring->default_context);
+	/* Cheap trick to determine default contexts */
+	return ctx->file_priv ? false : true;
 }
 
 /**
@@ -422,7 +423,7 @@ i915_gem_context_get_hang_stats(struct intel_ring_buffer *ring,
 		return ERR_PTR(-EINVAL);
 
 	if (id == DEFAULT_CONTEXT_ID)
-		return &file_priv->hang_stats;
+		return &file_priv->private_default_ctx->hang_stats;
 
 	to = i915_gem_context_get(file->driver_priv, id);
 	if (to == NULL)
@@ -442,9 +443,13 @@ int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
 		return 0;
 	}
 
+	idr_init(&file_priv->context_idr);
+	file_priv->private_default_ctx = create_hw_context(dev, NULL);
+
 	mutex_unlock(&dev->struct_mutex);
 
-	idr_init(&file_priv->context_idr);
+	if (IS_ERR(file_priv->private_default_ctx))
+		return PTR_ERR(file_priv->private_default_ctx);
 
 	return 0;
 }
@@ -460,12 +465,16 @@ void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
 	mutex_lock(&dev->struct_mutex);
 	idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
 	idr_destroy(&file_priv->context_idr);
+	i915_gem_context_unreference(file_priv->private_default_ctx);
 	mutex_unlock(&dev->struct_mutex);
 }
 
 static struct i915_hw_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 {
+
+	if (id == DEFAULT_CONTEXT_ID)
+		return file_priv->private_default_ctx;
 	return (struct i915_hw_context *)idr_find(&file_priv->context_idr, id);
 }
 
@@ -640,16 +649,13 @@ int i915_switch_context(struct intel_ring_buffer *ring,
 
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
-	if (to_id == DEFAULT_CONTEXT_ID) {
+	if (file == NULL)
 		to = ring->default_context;
-	} else {
-		if (file == NULL)
-			return -EINVAL;
-
+	else
 		to = i915_gem_context_get(file->driver_priv, to_id);
-		if (to == NULL)
-			return -ENOENT;
-	}
+
+	if (to == NULL)
+		return -ENOENT;
 
 	return do_switch(ring, to);
 }
@@ -695,6 +701,9 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	if (!(dev->driver->driver_features & DRIVER_GEM))
 		return -ENODEV;
 
+	if (args->ctx_id == DEFAULT_CONTEXT_ID)
+		return -EPERM;
+
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8b59729..c3294c3 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -271,6 +271,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	struct drm_device *dev = vm->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	int i;
+	bool retried = false;
 	int ret = -ENOMEM;
 
 	/* PPGTT PDEs reside in the GGTT stolen space, and consists of 512
@@ -279,12 +280,24 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 	 * fragmentation.
 	 */
 	BUG_ON(!drm_mm_initialized(&dev_priv->gtt.base.mm));
+alloc:
 	ret = drm_mm_insert_node_in_range_generic(&dev_priv->gtt.base.mm,
 						  &ppgtt->node, GEN6_PD_SIZE,
 						  GEN6_PD_ALIGN, 0,
 						  dev_priv->gtt.mappable_end,
 						  dev_priv->gtt.base.total,
 						  DRM_MM_TOPDOWN);
+	if (ret == -ENOSPC && !retried) {
+		ret = i915_gem_evict_something(dev, &dev_priv->gtt.base,
+					       GEN6_PD_SIZE, GEN6_PD_ALIGN,
+					       I915_CACHE_NONE, false, true);
+		if (ret)
+			return ret;
+
+		retried = true;
+		goto alloc;
+	}
+
 	if (ret)
 		return ret;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 50/66] drm/i915: Remove ring specificity from contexts
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (48 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 49/66] drm/i915: Create a per file_priv default context Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 51/66] drm/i915: Track which ring a context ran on Ben Widawsky
                   ` (17 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

When originally implementing HW contexts it was not clear if we'd
strongly associate a context with a ring. Now it is clear: a context
will not belong to a ring. We've removed all remnants of the field's
usage, so drop it completely now.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         | 1 -
 drivers/gpu/drm/i915/i915_gem_context.c | 8 ++------
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 4fbc4ab..fa8a432 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -582,7 +582,6 @@ struct i915_hw_context {
 	int id;
 	bool is_initialized;
 	struct drm_i915_file_private *file_priv;
-	struct intel_ring_buffer *ring;
 	struct drm_i915_gem_object *obj;
 	struct i915_ctx_hang_stats hang_stats;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 0d6ed19..faddf43 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -166,12 +166,6 @@ create_hw_context(struct drm_device *dev,
 			goto err_out;
 	}
 
-	/* The ring associated with the context object is handled by the normal
-	 * object tracking code. We give an initial ring value simple to pass an
-	 * assertion in the context switch code.
-	 */
-	ctx->ring = &dev_priv->ring[RCS];
-
 	ret = i915_gem_ppgtt_init(dev, &ctx->ppgtt);
 	if (ret)
 		goto err_out;
@@ -215,6 +209,7 @@ static int create_default_context(struct drm_i915_private *dev_priv,
 	BUG_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
 	if (dev_priv->ring[RCS].default_context) {
+		/* NB: RCS will hold a ref for all rings */
 		ring->default_context = dev_priv->ring[RCS].default_context;
 		return 0;
 	}
@@ -378,6 +373,7 @@ void i915_gem_context_fini(struct drm_device *dev)
 		if (ring->last_context)
 			i915_gem_context_unreference(ring->last_context);
 		ring->default_context = NULL;
+		ring->last_context = NULL;
 	}
 
 	i915_gem_object_unpin(dctx->obj);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 51/66] drm/i915: Track which ring a context ran on
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (49 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 50/66] drm/i915: Remove ring specificity from contexts Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 52/66] drm/i915: dump error state based on capture Ben Widawsky
                   ` (16 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Previously we dropped the association of a context to a ring. It is,
however, very important to know which ring a context last ran on (we
could have reused the member removed in the previous patch, but I was
nitpicky).

This is very important when we switch address spaces, which, unlike
context objects, do change per ring.

As an example, if we have:

        RCS   BCS
ctx            A
ctx      A
ctx      B
ctx            B

Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.
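
A sketch of the check this enables (both code lines appear in the
do_switch() hunk below; the comments are mine):

	if (from == to && from->last_ring == ring)
		return 0; /* context and its VM are already current here */
	...
	to->last_ring = ring; /* remember where this context last ran */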

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h         | 1 +
 drivers/gpu/drm/i915/i915_gem_context.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fa8a432..b1b31c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -586,6 +586,7 @@ struct i915_hw_context {
 	struct i915_ctx_hang_stats hang_stats;
 
 	struct i915_hw_ppgtt ppgtt;
+	struct intel_ring_buffer *last_ring;
 };
 
 struct i915_fbc {
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index faddf43..e20ece6 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -535,7 +535,7 @@ static int do_switch(struct intel_ring_buffer *ring,
 		BUG_ON(from->obj->pin_count == 0);
 	}
 
-	if (from == to)
+	if (from == to && from->last_ring == ring)
 		return 0;
 
 	if (ring != &dev_priv->ring[RCS] && from) {
@@ -616,6 +616,7 @@ done:
 	i915_gem_context_reference(to);
 	ring->last_context = to;
 	to->is_initialized = true;
+	to->last_ring = ring;
 
 	return 0;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 52/66] drm/i915: dump error state based on capture
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (50 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 51/66] drm/i915: Track which ring a context ran on Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 53/66] drm/i915: PPGTT should take a ppgtt argument Ben Widawsky
                   ` (15 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

If something bad happens to the rings on init after a reset but debugfs
is still available, we should still dump the information. This is only
possible with the change to do more teardown on reset.

NOTE: I've hit this in development, but it should be very unlikely once
the patches are stable.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 60d2a94..15f29de 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -815,7 +815,6 @@ static int i915_error_state(struct i915_error_state_file_priv *error_priv,
 	struct drm_device *dev = error_priv->dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_error_state *error = error_priv->error;
-	struct intel_ring_buffer *ring;
 	int i, j, page, offset, elt;
 
 	if (!error) {
@@ -849,8 +848,9 @@ static int i915_error_state(struct i915_error_state_file_priv *error_priv,
 	if (INTEL_INFO(dev)->gen == 7)
 		err_printf(m, "ERR_INT: 0x%08x\n", error->err_int);
 
-	for_each_ring(ring, dev_priv, i)
-		i915_ring_error_state(m, dev, error, i);
+	for (i = 0; i < I915_NUM_RINGS; i++)
+		if (error->ring[i].ringbuffer)
+			i915_ring_error_state(m, dev, error, i);
 
 	if (error->active_bo)
 		print_error_buffers(m, "Active",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 53/66] drm/i915: PPGTT should take a ppgtt argument
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (51 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 52/66] drm/i915: dump error state based on capture Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 54/66] drm/i915: USE LRI for switching PP_DIR_BASE Ben Widawsky
                   ` (14 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

At one time I was planning to call enable() for all PPGTTs. I dropped
that plan, but I kept this change around because it looks better to me.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     | 2 +-
 drivers/gpu/drm/i915/i915_gem.c     | 4 +++-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 4 ++--
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b1b31c0..883d314 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -533,7 +533,7 @@ struct i915_hw_ppgtt {
 
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
-	int (*enable)(struct drm_device *dev);
+	int (*enable)(struct i915_hw_ppgtt *ppgtt);
 	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c91abda..a4db2cc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4328,6 +4328,7 @@ int
 i915_gem_init_hw(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct i915_hw_ppgtt *ppgtt;
 	int ret;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
@@ -4361,7 +4362,8 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (ret || !dev_priv->gtt.aliasing_ppgtt)
 		goto disable_ctx_out;
 
-	ret = dev_priv->gtt.aliasing_ppgtt->enable(dev);
+	ppgtt = dev_priv->gtt.aliasing_ppgtt;
+	ret = ppgtt->enable(ppgtt);
 	if (ret)
 		goto disable_ctx_out;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index c3294c3..583d136 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -122,12 +122,12 @@ static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
 	readl(pd_addr);
 }
 
-static int gen6_ppgtt_enable(struct drm_device *dev)
+static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
+	struct drm_device *dev = ppgtt->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	uint32_t pd_offset;
 	struct intel_ring_buffer *ring;
-	struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
 	int i;
 
 	BUG_ON(ppgtt->pd_offset & 0x3f);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 54/66] drm/i915: USE LRI for switching PP_DIR_BASE
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (52 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 53/66] drm/i915: PPGTT should take a ppgtt argument Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 55/66] drm/i915: Extract mm switching to function Ben Widawsky
                   ` (13 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

The docs seem to suggest this is the appropriate method (though they
don't say so outright). We certainly must do this for switching VMs on
the fly, since synchronizing the rings to MMIO updates isn't acceptable.
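
For reference, the emitted packet decodes as follows (my annotation of
the hunk below, not text from the docs):

	MI_LOAD_REGISTER_IMM(2)  /* header: load two registers */
	RING_PP_DIR_DCLV(ring)   /* register: PD cacheline valid bits */
	PP_DIR_DCLV_2G           /* value: all entries valid (full 2G) */
	RING_PP_DIR_BASE(ring)   /* register: page directory base */
	pd_offset                /* value: the new PD offset */
	MI_NOOP                  /* pad out the reserved dwords */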

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 583d136..be5c7a9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -171,13 +171,44 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 		/* GFX_MODE is per-ring on gen7+ */
 	}
 
+	POSTING_READ(GAM_ECOCHK);
 	for_each_ring(ring, dev_priv, i) {
+		int ret;
+
 		if (INTEL_INFO(dev)->gen >= 7)
 			I915_WRITE(RING_MODE_GEN7(ring),
 				   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 
-		I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
-		I915_WRITE(RING_PP_DIR_BASE(ring), pd_offset);
+		/* If we're in reset, we can assume the GPU is sufficiently idle
+		 * to manually frob these bits. Ideally we could use the ring
+		 * functions, except our error handling makes it quite difficult
+		 * (can't use intel_ring_begin, ring->flush, or
+		 * intel_ring_advance)
+		 */
+		if (i915_reset_in_progress(&dev_priv->gpu_error)) {
+			WARN_ON(ppgtt != dev_priv->gtt.aliasing_ppgtt);
+			I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
+			I915_WRITE(RING_PP_DIR_BASE(ring), pd_offset);
+			return 0;
+		}
+
+		/* NB: TLBs must be flushed and invalidated before a switch */
+		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS,
+				  I915_GEM_GPU_DOMAINS);
+		if (ret)
+			return ret;
+
+		ret = intel_ring_begin(ring, 6);
+		if (ret)
+			return ret;
+
+		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
+		intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
+		intel_ring_emit(ring, PP_DIR_DCLV_2G);
+		intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
+		intel_ring_emit(ring, pd_offset);
+		intel_ring_emit(ring, MI_NOOP);
+		intel_ring_advance(ring);
 	}
 	return 0;
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 55/66] drm/i915: Extract mm switching to function
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (53 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 54/66] drm/i915: USE LRI for switching PP_DIR_BASE Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 56/66] drm/i915: Write PDEs at init instead of enable Ben Widawsky
                   ` (12 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

In order to do the full context switch with the address space, it's
convenient to have a way to switch just the address space. We already
have this in our code - pull it out so it can be called by the context
switch code later.
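
With the hook in place, the eventual caller (added later in this
series) reduces to:

	ret = to->ppgtt.switch_mm(&to->ppgtt, ring);
	if (ret)
		return ret;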

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  2 +
 drivers/gpu/drm/i915/i915_gem_gtt.c | 79 +++++++++++++++++++++----------------
 2 files changed, 48 insertions(+), 33 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 883d314..7865618 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -534,6 +534,8 @@ struct i915_hw_ppgtt {
 	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
 				     enum i915_cache_level level);
 	int (*enable)(struct i915_hw_ppgtt *ppgtt);
+	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
+			 struct intel_ring_buffer *ring);
 	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
 };
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index be5c7a9..646e8ef 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -122,11 +122,54 @@ static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
 	readl(pd_addr);
 }
 
+static int gen6_mm_switch(struct i915_hw_ppgtt *ppgtt,
+			  struct intel_ring_buffer *ring)
+{
+	struct drm_device *dev = ppgtt->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	uint32_t pd_offset = ppgtt->pd_offset;
+	int ret;
+
+	pd_offset /= 64; /* in cachelines, */
+	pd_offset <<= 16;
+
+	/* If we're in reset, we can assume the GPU is sufficiently idle
+	 * to manually frob these bits. Ideally we could use the ring
+	 * functions, except our error handling makes it quite difficult
+	 * (can't use intel_ring_begin, ring->flush, or
+	 * intel_ring_advance)
+	 */
+	if (i915_reset_in_progress(&dev_priv->gpu_error)) {
+		WARN_ON(ppgtt != dev_priv->gtt.aliasing_ppgtt);
+		I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
+		I915_WRITE(RING_PP_DIR_BASE(ring), pd_offset);
+		return 0;
+	}
+
+	/* NB: TLBs must be flushed and invalidated before a switch */
+	ret = ring->flush(ring, I915_GEM_GPU_DOMAINS, I915_GEM_GPU_DOMAINS);
+	if (ret)
+		return ret;
+
+	ret = intel_ring_begin(ring, 6);
+	if (ret)
+		return ret;
+
+	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
+	intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
+	intel_ring_emit(ring, PP_DIR_DCLV_2G);
+	intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
+	intel_ring_emit(ring, pd_offset);
+	intel_ring_emit(ring, MI_NOOP);
+	intel_ring_advance(ring);
+
+	return 0;
+}
+
 static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_device *dev = ppgtt->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
-	uint32_t pd_offset;
 	struct intel_ring_buffer *ring;
 	int i;
 
@@ -134,10 +177,6 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 
 	gen6_write_pdes(ppgtt);
 
-	pd_offset = ppgtt->pd_offset;
-	pd_offset /= 64; /* in cachelines, */
-	pd_offset <<= 16;
-
 	if (INTEL_INFO(dev)->gen == 6) {
 		uint32_t ecochk, gab_ctl, ecobits;
 
@@ -179,36 +218,9 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 			I915_WRITE(RING_MODE_GEN7(ring),
 				   _MASKED_BIT_ENABLE(GFX_PPGTT_ENABLE));
 
-		/* If we're in reset, we can assume the GPU is sufficiently idle
-		 * to manually frob these bits. Ideally we could use the ring
-		 * functions, except our error handling makes it quite difficult
-		 * (can't use intel_ring_begin, ring->flush, or
-		 * intel_ring_advance)
-		 */
-		if (i915_reset_in_progress(&dev_priv->gpu_error)) {
-			WARN_ON(ppgtt != dev_priv->gtt.aliasing_ppgtt);
-			I915_WRITE(RING_PP_DIR_DCLV(ring), PP_DIR_DCLV_2G);
-			I915_WRITE(RING_PP_DIR_BASE(ring), pd_offset);
-			return 0;
-		}
-
-		/* NB: TLBs must be flushed and invalidated before a switch */
-		ret = ring->flush(ring, I915_GEM_GPU_DOMAINS,
-				  I915_GEM_GPU_DOMAINS);
+		ret = ppgtt->switch_mm(ppgtt, ring);
 		if (ret)
 			return ret;
-
-		ret = intel_ring_begin(ring, 6);
-		if (ret)
-			return ret;
-
-		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
-		intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
-		intel_ring_emit(ring, PP_DIR_DCLV_2G);
-		intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
-		intel_ring_emit(ring, pd_offset);
-		intel_ring_emit(ring, MI_NOOP);
-		intel_ring_advance(ring);
 	}
 	return 0;
 }
@@ -375,6 +387,7 @@ alloc:
 
 	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
 	ppgtt->enable = gen6_ppgtt_enable;
+	ppgtt->switch_mm = gen6_mm_switch;
 	ppgtt->cleanup = gen6_ppgtt_cleanup;
 
 	vm->clear_range = gen6_ppgtt_clear_range;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 56/66] drm/i915: Write PDEs at init instead of enable
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (54 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 55/66] drm/i915: Extract mm switching to function Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:30 ` [PATCH 57/66] drm/i915: Disallow pin with full ppgtt Ben Widawsky
                   ` (11 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

We won't be calling enable() for all PPGTTs. We do, however, need to
write the PDEs for all PPGTTs. By moving the writing to init (which is
called for all PPGTTs) we accomplish this.
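
For context, this is roughly what the existing gen6_write_pdes() helper
does (a condensed sketch from memory, not code in this patch):

	/* Write one valid PDE per page table through the GGTT mapping of
	 * the page directory, then read back to flush the writes. */
	for (i = 0; i < ppgtt->num_pd_entries; i++) {
		u32 pd_entry = GEN6_PDE_ADDR_ENCODE(ppgtt->pt_dma_addr[i]);
		pd_entry |= GEN6_PDE_VALID;
		writel(pd_entry, pd_addr + i);
	}
	readl(pd_addr);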

TODO: Eventually, we should allocate the page tables on demand.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 646e8ef..7e9b2e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -175,8 +175,6 @@ static int gen6_ppgtt_enable(struct i915_hw_ppgtt *ppgtt)
 
 	BUG_ON(ppgtt->pd_offset & 0x3f);
 
-	gen6_write_pdes(ppgtt);
-
 	if (INTEL_INFO(dev)->gen == 6) {
 		uint32_t ecochk, gab_ctl, ecobits;
 
@@ -435,9 +433,11 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 	else
 		BUG();
 
-	if (!ret)
+	if (!ret) {
+		gen6_write_pdes(ppgtt);
 		drm_mm_init(&ppgtt->base.mm, ppgtt->base.start,
 			    ppgtt->base.total);
+	}
 
 	/* i915_init_vm(dev_priv, &ppgtt->base) */
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (55 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 56/66] drm/i915: Write PDEs at init instead of enable Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-28  8:55   ` Chris Wilson
  2013-06-27 23:30 ` [PATCH 58/66] drm/i915: Get context early in execbuf Ben Widawsky
                   ` (10 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Pin doesn't fit with PPGTT since the interface doesn't allow specifying
the context into which we want to pin.

Full PPGTT will bring a new "soft pin" interface, whose semantics will
probably take some time to iron out.
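
The userspace-visible consequence, as a sketch (a hypothetical
libdrm-style caller; not code from this patch):

	struct drm_i915_gem_pin pin = { .handle = handle };
	ret = drmIoctl(fd, DRM_IOCTL_I915_GEM_PIN, &pin);
	/* with HW contexts (and thus full PPGTT) enabled, this now fails
	 * and errno is ENXIO */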

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a4db2cc..e58584b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3746,6 +3746,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		return -EBUSY;
 
 	BUG_ON(map_and_fenceable && !is_i915_ggtt(vm));
+	BUG_ON(!HAS_HW_CONTEXTS(obj->base.dev) && !is_i915_ggtt(vm));
 
 	if (i915_gem_obj_bound(obj, vm)) {
 		if ((alignment &&
@@ -3800,6 +3801,7 @@ int
 i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 		   struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_pin *args = data;
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -3808,6 +3810,11 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		return ret;
 
+	if (!dev_priv->hw_contexts_disabled) {
+		mutex_unlock(&dev->struct_mutex);
+		return -ENXIO;
+	}
+
 	obj = to_intel_bo(drm_gem_object_lookup(dev, file, args->handle));
 	if (&obj->base == NULL) {
 		ret = -ENOENT;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 58/66] drm/i915: Get context early in execbuf
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (56 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 57/66] drm/i915: Disallow pin with full ppgtt Ben Widawsky
@ 2013-06-27 23:30 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 59/66] drm/i915: Pass ctx directly to switch/hangstat Ben Widawsky
                   ` (9 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:30 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

We need to have the address space when reserving space for the objects.
Since the address space and context are tied together, and reserve
occurs before the context switch (for good reason), we must look up our
context earlier in the process.

This leaves some room for optimizations where we no longer need to use
ctx_id in certain places. This will be addressed in a subsequent patch.
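
The resulting ordering inside i915_gem_do_execbuffer() is, as a sketch
(condensed from the hunks below; error handling elided):

	/* 1. look up the context, and thus the VM, before reserving */
	ctx = i915_gem_context_get(file->driver_priv, ctx_id);
	vm = ctx ? &ctx->ppgtt.base : &dev_priv->gtt.base;
	/* 2. reserve and relocate the objects into vm */
	/* 3. i915_switch_context() runs knowing ctx already exists */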

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 ++
 drivers/gpu/drm/i915/i915_gem_context.c    |  4 +---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 24 ++++++++++++++++--------
 3 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7865618..3b452f2 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1909,6 +1909,8 @@ static inline int i915_gem_context_unreference(struct i915_hw_context *ctx)
 	return kref_put(&ctx->ref, i915_gem_context_free);
 }
 
+struct i915_hw_context *
+i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
 struct i915_ctx_hang_stats * __must_check
 i915_gem_context_get_hang_stats(struct intel_ring_buffer *ring,
 				struct drm_file *file,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e20ece6..2975ca0 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -95,8 +95,6 @@
  */
 #define CONTEXT_ALIGN (64<<10)
 
-static struct i915_hw_context *
-i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
 static int do_switch(struct intel_ring_buffer *ring,
 		     struct i915_hw_context *to);
 
@@ -465,7 +463,7 @@ void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
 	mutex_unlock(&dev->struct_mutex);
 }
 
-static struct i915_hw_context *
+struct i915_hw_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id)
 {
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b3e3658..3eda4e1 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -853,8 +853,7 @@ static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec,
-		       struct i915_address_space *vm)
+		       struct drm_i915_gem_exec_object2 *exec)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct eb_objects *eb;
@@ -862,6 +861,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct drm_clip_rect *cliprects = NULL;
 	struct intel_ring_buffer *ring;
 	struct i915_ctx_hang_stats *hs;
+	struct i915_hw_context *ctx;
+	struct i915_address_space *vm;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
 	u32 exec_start, exec_len;
 	u32 mask, flags;
@@ -989,6 +990,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto pre_mutex_err;
 	}
 
+	ctx = i915_gem_context_get(file->driver_priv, ctx_id);
+	if (!ctx && dev_priv->gtt.aliasing_ppgtt) {
+		mutex_unlock(&dev->struct_mutex);
+		ret = -ENOENT;
+		goto pre_mutex_err;
+	} else if (!ctx) {
+		vm = &dev_priv->gtt.base;
+	} else {
+		vm = &ctx->ppgtt.base;
+	}
+
 	/* Look up object handles */
 	ret = eb_lookup_objects(eb, exec, args, file);
 	if (ret)
@@ -1121,7 +1133,6 @@ int
 i915_gem_execbuffer(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer *args = data;
 	struct drm_i915_gem_execbuffer2 exec2;
 	struct drm_i915_gem_exec_object *exec_list = NULL;
@@ -1177,8 +1188,7 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 	exec2.flags = I915_EXEC_RENDER;
 	i915_execbuffer2_set_context_id(exec2, 0);
 
-	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list,
-				     &dev_priv->gtt.base);
+	ret = i915_gem_do_execbuffer(dev, data, file, &exec2, exec2_list);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		for (i = 0; i < args->buffer_count; i++)
@@ -1204,7 +1214,6 @@ int
 i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		     struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list = NULL;
 	int ret;
@@ -1235,8 +1244,7 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list,
-				     &dev_priv->gtt.base);
+	ret = i915_gem_do_execbuffer(dev, data, file, args, exec2_list);
 	if (!ret) {
 		/* Copy the new buffer offsets back to the user's exec list. */
 		ret = copy_to_user(to_user_ptr(args->buffers_ptr),
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 59/66] drm/i915: Pass ctx directly to switch/hangstat
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (57 preceding siblings ...)
  2013-06-27 23:30 ` [PATCH 58/66] drm/i915: Get context early in execbuf Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 60/66] drm/i915: Actually add the new address spaces Ben Widawsky
                   ` (8 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

We have all the info earlier now, so we may as well avoid the excess
lookup.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h            |  5 ++---
 drivers/gpu/drm/i915/i915_gem.c            |  2 +-
 drivers/gpu/drm/i915/i915_gem_context.c    | 21 ++-------------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  5 ++---
 4 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3b452f2..736c714 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1897,7 +1897,7 @@ int i915_gem_context_enable(struct drm_i915_private *dev_priv);
 int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
 void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
 int i915_switch_context(struct intel_ring_buffer *ring,
-			struct drm_file *file, int to_id);
+			struct i915_hw_context *to);
 void i915_gem_context_free(struct kref *ctx_ref);
 static inline void i915_gem_context_reference(struct i915_hw_context *ctx)
 {
@@ -1913,8 +1913,7 @@ struct i915_hw_context *
 i915_gem_context_get(struct drm_i915_file_private *file_priv, u32 id);
 struct i915_ctx_hang_stats * __must_check
 i915_gem_context_get_hang_stats(struct intel_ring_buffer *ring,
-				struct drm_file *file,
-				u32 id);
+				struct i915_hw_context *to);
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file);
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e58584b..73e116e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2709,7 +2709,7 @@ int i915_gpu_idle(struct drm_device *dev)
 
 	/* Flush everything onto the inactive list. */
 	for_each_ring(ring, dev_priv, i) {
-		ret = i915_switch_context(ring, NULL, DEFAULT_CONTEXT_ID);
+		ret = i915_switch_context(ring, ring->default_context);
 		if (ret)
 			return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 2975ca0..37ebfa2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -400,12 +400,9 @@ static int context_idr_cleanup(int id, void *p, void *data)
 
 struct i915_ctx_hang_stats *
 i915_gem_context_get_hang_stats(struct intel_ring_buffer *ring,
-				struct drm_file *file,
-				u32 id)
+				struct i915_hw_context *to)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct drm_i915_file_private *file_priv = file->driver_priv;
-	struct i915_hw_context *to;
 
 	if (dev_priv->hw_contexts_disabled)
 		return ERR_PTR(-ENOENT);
@@ -413,13 +410,6 @@ i915_gem_context_get_hang_stats(struct intel_ring_buffer *ring,
 	if (ring->id != RCS)
 		return ERR_PTR(-EINVAL);
 
-	if (file == NULL)
-		return ERR_PTR(-EINVAL);
-
-	if (id == DEFAULT_CONTEXT_ID)
-		return &file_priv->private_default_ctx->hang_stats;
-
-	to = i915_gem_context_get(file->driver_priv, id);
 	if (to == NULL)
 		return ERR_PTR(-ENOENT);
 
@@ -633,22 +623,15 @@ done:
  * object while letting the normal object tracking destroy the backing BO.
  */
 int i915_switch_context(struct intel_ring_buffer *ring,
-			struct drm_file *file,
-			int to_id)
+			struct i915_hw_context *to)
 {
 	struct drm_i915_private *dev_priv = ring->dev->dev_private;
-	struct i915_hw_context *to;
 
 	if (dev_priv->hw_contexts_disabled)
 		return 0;
 
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
-	if (file == NULL)
-		to = ring->default_context;
-	else
-		to = i915_gem_context_get(file->driver_priv, to_id);
-
 	if (to == NULL)
 		return -ENOENT;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 3eda4e1..aeec8c0 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1049,8 +1049,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret)
 		goto err;
 
-	hs = i915_gem_context_get_hang_stats(&dev_priv->ring[RCS],
-					     file, ctx_id);
+	hs = i915_gem_context_get_hang_stats(&dev_priv->ring[RCS], ctx);
 	if (IS_ERR(hs)) {
 		ret = PTR_ERR(hs);
 		goto err;
@@ -1061,7 +1060,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto err;
 	}
 
-	ret = i915_switch_context(ring, file, ctx_id);
+	ret = i915_switch_context(ring, ctx);
 	if (ret)
 		goto err;
 
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 60/66] drm/i915: Actually add the new address spaces
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (58 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 59/66] drm/i915: Pass ctx directly to switch/hangstat Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 61/66] drm/i915: Use multiple VMs Ben Widawsky
                   ` (7 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This doesn't actually do the switch, but it does add and remove the new
address spaces as needed. It is a good point for bisection.

It also adds create/destroy trace events. Notice the FIXME by the
destroy where I acknowledge a layering violation which can be fixed
later (or copy pasted, whatever).
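
Since these are ordinary tracepoints, they should show up under the
usual tracefs path once loaded, e.g.
/sys/kernel/debug/tracing/events/i915/i915_address_space_create/ -
handy for confirming that VMs are created and destroyed as expected
while bisecting.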

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  2 ++
 drivers/gpu/drm/i915/i915_gem.c     | 17 ++++-------------
 drivers/gpu/drm/i915/i915_gem_gtt.c |  9 +++++++--
 drivers/gpu/drm/i915/i915_irq.c     |  3 ---
 drivers/gpu/drm/i915/i915_trace.h   | 18 ++++++++++++++++++
 5 files changed, 31 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 736c714..c251724 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1678,6 +1678,8 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
+void i915_init_vm(struct drm_i915_private *dev_priv,
+		  struct i915_address_space *vm);
 struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
 				     struct i915_address_space *vm);
 void i915_gem_vma_destroy(struct i915_vma *vma);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 73e116e..af0150e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3151,11 +3151,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	struct i915_vma *vma;
 	int ret;
 
-	if (WARN_ON(!list_empty(&obj->vma_list)))
-		return -EBUSY;
-
-	BUG_ON(!is_i915_ggtt(vm));
-
 	fence_size = i915_gem_get_gtt_size(dev,
 					   obj->base.size,
 					   obj->tiling_mode);
@@ -3194,9 +3189,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 
 	i915_gem_object_pin_pages(obj);
 
-	/* For now we only ever use 1 vma per object */
-	WARN_ON(!list_empty(&obj->vma_list));
-
 	vma = i915_gem_vma_create(obj, vm);
 	if (vma == NULL) {
 		i915_gem_object_unpin_pages(obj);
@@ -4077,9 +4069,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
 	obj->pin_count = 0;
-	/* NB: 0 or 1 elements */
-	WARN_ON(!list_empty(&obj->vma_list) &&
-		!list_is_singular(&obj->vma_list));
+
 	list_for_each_entry_safe(vma, next, &obj->vma_list, vma_link) {
 		int ret = i915_gem_object_unbind(obj, vma->vm);
 		if (WARN_ON(ret == -ERESTARTSYS)) {
@@ -4516,8 +4506,8 @@ init_ring_lists(struct intel_ring_buffer *ring)
 	INIT_LIST_HEAD(&ring->request_list);
 }
 
-static void i915_init_vm(struct drm_i915_private *dev_priv,
-			 struct i915_address_space *vm)
+void i915_init_vm(struct drm_i915_private *dev_priv,
+		  struct i915_address_space *vm)
 {
 	vm->dev = dev_priv->dev;
 	INIT_LIST_HEAD(&vm->active_list);
@@ -4525,6 +4515,7 @@ static void i915_init_vm(struct drm_i915_private *dev_priv,
 	INIT_LIST_HEAD(&vm->global_link);
 	INIT_LIST_HEAD(&vm->vma_list);
 	list_add(&vm->global_link, &dev_priv->vm_list);
+	trace_i915_address_space_create(vm);
 }
 
 void
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 7e9b2e2..2f9af3e 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -289,6 +289,10 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
 	struct i915_address_space *vm = &ppgtt->base;
 	int i;
 
+	trace_i915_address_space_destroy(vm);
+	/* FIXME: It's a bit of a layering violation to remove ourselves here.
+	 * Fix when we have more VM types */
+	list_del(&ppgtt->base.global_link);
 	drm_mm_remove_node(&ppgtt->node);
 	drm_mm_takedown(&ppgtt->base.mm);
 
@@ -437,10 +441,11 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 		gen6_write_pdes(ppgtt);
 		drm_mm_init(&ppgtt->base.mm, ppgtt->base.start,
 			    ppgtt->base.total);
+		i915_init_vm(dev->dev_private, &ppgtt->base);
+		DRM_DEBUG("Adding PPGTT at offset %x\n",
+			  ppgtt->pd_offset << 10);
 	}
 
-	/* i915_init_vm(dev_priv, &ppgtt->base) */
-
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e1653fd..5622012 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1937,9 +1937,6 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
 	list_for_each_entry(vm, &dev_priv->vm_list, global_link)
 		cnt++;
 
-	if (WARN(cnt > 1, "Multiple VMs not yet supported\n"))
-		cnt = 1;
-
 	vm = &dev_priv->gtt.base;
 
 	error->active_bo = kcalloc(cnt, sizeof(*error->active_bo), GFP_ATOMIC);
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 3f019d3..afd0428 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -194,6 +194,24 @@ DEFINE_EVENT(i915_gem_object, i915_gem_object_destroy,
 	    TP_ARGS(obj)
 );
 
+DECLARE_EVENT_CLASS(i915_address_space,
+		    TP_PROTO(struct i915_address_space *vm),
+		    TP_ARGS(vm),
+		    TP_STRUCT__entry(__field(struct i915_address_space *, vm)),
+		    TP_fast_assign(__entry->vm = vm;),
+		    TP_printk("vm = %p", __entry->vm)
+		   );
+
+DEFINE_EVENT(i915_address_space, i915_address_space_create,
+	     TP_PROTO(struct i915_address_space *vm),
+	     TP_ARGS(vm)
+	    );
+
+DEFINE_EVENT(i915_address_space, i915_address_space_destroy,
+	     TP_PROTO(struct i915_address_space *vm),
+	     TP_ARGS(vm)
+	    );
+
 TRACE_EVENT(i915_gem_evict,
 	    TP_PROTO(struct drm_device *dev, u32 size, u32 align, bool mappable),
 	    TP_ARGS(dev, size, align, mappable),
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 124+ messages in thread

* [PATCH 61/66] drm/i915: Use multiple VMs
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (59 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 60/66] drm/i915: Actually add the new address spaces Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:43   ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 62/66] drm/i915: Kill now unused ppgtt_{un, }bind Ben Widawsky
                   ` (6 subsequent siblings)
  67 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

This requires doing an actual switch of the page tables during the
context switch/execbuf.

Along the way, cut away as much "aliasing" PPGTT usage as possible.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_gem.c            | 22 +++++++++++++---------
 drivers/gpu/drm/i915/i915_gem_context.c    | 29 +++++++++++++++++------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 27 ++++++++++++++++++++-------
 3 files changed, 50 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index af0150e..f05d585 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2170,7 +2170,10 @@ request_to_vm(struct drm_i915_gem_request *request)
 	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
 	struct i915_address_space *vm;
 
-	vm = &dev_priv->gtt.base;
+	if (request->ctx)
+		vm = &request->ctx->ppgtt.base;
+	else
+		vm = &dev_priv->gtt.base;
 
 	return vm;
 }
@@ -2676,10 +2679,10 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
 
 	if (obj->has_global_gtt_mapping && is_i915_ggtt(vm))
 		i915_gem_gtt_unbind_object(obj);
-	if (obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
-		obj->has_aliasing_ppgtt_mapping = 0;
-	}
+
+	vm->clear_range(vm, i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
+			obj->base.size >> PAGE_SHIFT);
+
 	i915_gem_gtt_finish_object(obj);
 	i915_gem_object_unpin_pages(obj);
 
@@ -3444,11 +3447,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 				return ret;
 		}
 
-		if (obj->has_global_gtt_mapping)
+		if (!is_i915_ggtt(vm) && obj->has_global_gtt_mapping)
 			i915_gem_gtt_bind_object(obj, cache_level);
-		if (obj->has_aliasing_ppgtt_mapping)
-			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
-					       obj, cache_level);
+
+		vm->insert_entries(vm, obj->pages,
+				   i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
+				   cache_level);
 
 		i915_gem_obj_set_color(obj, vm, cache_level);
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 37ebfa2..cea036e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -526,10 +526,14 @@ static int do_switch(struct intel_ring_buffer *ring,
 	if (from == to && from->last_ring == ring)
 		return 0;
 
+	ret = to->ppgtt.switch_mm(&to->ppgtt, ring);
+	if (ret)
+		return ret;
+
 	if (ring != &dev_priv->ring[RCS] && from) {
 		ret = i915_add_request(ring, NULL);
 		if (ret)
-			return ret;
+			goto err_out;
 		i915_gem_context_unreference(from);
 	}
 
@@ -538,7 +542,7 @@ static int do_switch(struct intel_ring_buffer *ring,
 
 	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
-		return ret;
+		goto err_out;
 
 	/* Clear this page out of any CPU caches for coherent swap-in/out. Note
 	 * that thanks to write = false in this call and us not setting any gpu
@@ -546,10 +550,8 @@ static int do_switch(struct intel_ring_buffer *ring,
 	 * (when switching away from it), this won't block.
 	 * XXX: We need a real interface to do this instead of trickery. */
 	ret = i915_gem_object_set_to_gtt_domain(to->obj, false);
-	if (ret) {
-		i915_gem_object_unpin(to->obj);
-		return ret;
-	}
+	if (ret)
+		goto unpin_out;
 
 	if (!to->obj->has_global_gtt_mapping)
 		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
@@ -560,10 +562,8 @@ static int do_switch(struct intel_ring_buffer *ring,
 		hw_flags |= MI_FORCE_RESTORE;
 
 	ret = mi_set_context(ring, to, hw_flags);
-	if (ret) {
-		i915_gem_object_unpin(to->obj);
-		return ret;
-	}
+	if (ret)
+		goto unpin_out;
 
 	/* The backing object for the context is done after switching to the
 	 * *next* context. Therefore we cannot retire the previous context until
@@ -593,7 +593,7 @@ static int do_switch(struct intel_ring_buffer *ring,
 			 * scream.
 			 */
 			WARN_ON(mi_set_context(ring, from, MI_RESTORE_INHIBIT));
-			return ret;
+			goto err_out;
 		}
 
 		i915_gem_object_unpin(from->obj);
@@ -605,8 +605,13 @@ done:
 	ring->last_context = to;
 	to->is_initialized = true;
 	to->last_ring = ring;
-
 	return 0;
+
+unpin_out:
+	i915_gem_object_unpin(to->obj);
+err_out:
+	WARN_ON(from->ppgtt.switch_mm(&from->ppgtt, ring));
+	return ret;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index aeec8c0..0f6bf3c 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -437,11 +437,21 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
 	}
 
 	/* Ensure ppgtt mapping exists if needed */
-	if (dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
-		i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
-				       obj, obj->cache_level);
-
+	if (is_i915_ggtt(vm) &&
+	    dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
+		/* FIXME: remove this later */
+		struct i915_address_space *appgtt =
+			&dev_priv->gtt.aliasing_ppgtt->base;
+		unsigned long obj_offset = i915_gem_obj_offset(obj, appgtt);
+
+		appgtt->insert_entries(appgtt, obj->pages,
+				       obj_offset >> PAGE_SHIFT,
+				       obj->cache_level);
 		obj->has_aliasing_ppgtt_mapping = 1;
+	} else {
+		vm->insert_entries(vm, obj->pages,
+				   i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
+				   obj->cache_level);
 	}
 
 	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
@@ -864,7 +874,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	struct i915_hw_context *ctx;
 	struct i915_address_space *vm;
 	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
-	u32 exec_start, exec_len;
+	u32 exec_start = args->batch_start_offset, exec_len;
 	u32 mask, flags;
 	int ret, mode, i;
 	bool need_relocs;
@@ -1085,8 +1095,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 			goto err;
 	}
 
-	exec_start = i915_gem_obj_offset(batch_obj, vm) +
-		args->batch_start_offset;
+	if (batch_obj->has_global_gtt_mapping)
+		exec_start += i915_gem_ggtt_offset(batch_obj);
+	else
+		exec_start += i915_gem_obj_offset(batch_obj, vm);
+
 	exec_len = args->batch_len;
 	if (cliprects) {
 		for (i = 0; i < args->num_cliprects; i++) {
-- 
1.8.3.1

* [PATCH 62/66] drm/i915: Kill now unused ppgtt_{un, }bind
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (60 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 61/66] drm/i915: Use multiple VMs Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 63/66] drm/i915: Add PPGTT dumper Ben Widawsky
                   ` (5 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

vm->insert_entries (and vm->clear_range on the unbind side) was good
enough. We can bring these helpers back later if needed.
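
For reference, the removed bind helper reduces to a direct call on the
address space, so callers just open-code it. A minimal sketch (vm is
&ppgtt->base, exactly as in the function removed below):

	vm->insert_entries(vm, obj->pages,
			   i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
			   cache_level);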

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.h     |  5 -----
 drivers/gpu/drm/i915/i915_gem_gtt.c | 22 ----------------------
 2 files changed, 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c251724..63ba242 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1924,11 +1924,6 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 /* i915_gem_gtt.c */
 bool intel_enable_ppgtt(struct drm_device *dev);
 int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt);
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level);
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj);
 void i915_gem_restore_gtt_mappings(struct drm_device *dev);
 int __must_check i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj);
 /* FIXME: this is never okay with full PPGTT */
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 2f9af3e..e42059a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -449,28 +449,6 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 	return ret;
 }
 
-void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
-			    struct drm_i915_gem_object *obj,
-			    enum i915_cache_level cache_level)
-{
-	struct i915_address_space *vm = &ppgtt->base;
-	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
-
-	vm->insert_entries(vm, obj->pages,
-			   obj_offset >> PAGE_SHIFT,
-			   cache_level);
-}
-
-void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
-			      struct drm_i915_gem_object *obj)
-{
-	struct i915_address_space *vm = &ppgtt->base;
-	unsigned long obj_offset = i915_gem_obj_offset(obj, vm);
-
-	vm->clear_range(vm, obj_offset >> PAGE_SHIFT,
-			obj->base.size >> PAGE_SHIFT);
-}
-
 extern int intel_iommu_gfx_mapped;
 /* Certain Gen5 chipsets require idling the GPU before
  * unmapping anything from the GTT when VT-d is enabled.
-- 
1.8.3.1

* [PATCH 63/66] drm/i915: Add PPGTT dumper
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (61 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 62/66] drm/i915: Kill now unused ppgtt_{un, }bind Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 64/66] drm/i915: Dump all ppgtt Ben Widawsky
                   ` (4 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Hook the aliasing PPGTT up to the new dumper as well. The aliasing
PPGTT should actually always be empty.
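
The dump lands in the existing i915_ppgtt_info debugfs file. For
reference, a trivial way to read it back (a sketch; assumes debugfs is
mounted in the usual place and the device is DRI minor 0):

	#include <stdio.h>

	int main(void)
	{
		char line[256];
		FILE *f = fopen("/sys/kernel/debug/dri/0/i915_ppgtt_info", "r");

		if (!f)
			return 1;
		/* expect "aliasing PPGTT:" followed by "\tempty" */
		while (fgets(line, sizeof(line), f))
			fputs(line, stdout);
		fclose(f);
		return 0;
	}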

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  3 +-
 drivers/gpu/drm/i915/i915_drv.h     |  1 +
 drivers/gpu/drm/i915/i915_gem_gtt.c | 60 +++++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 15f29de..2dfa784 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1839,9 +1839,8 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 	}
 	if (dev_priv->gtt.aliasing_ppgtt) {
 		struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
-
 		seq_printf(m, "aliasing PPGTT:\n");
-		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd_offset);
+		ppgtt->debug_dump(ppgtt, m);
 	}
 	seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 63ba242..f317d29 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -537,6 +537,7 @@ struct i915_hw_ppgtt {
 	int (*switch_mm)(struct i915_hw_ppgtt *ppgtt,
 			 struct intel_ring_buffer *ring);
 	void (*cleanup)(struct i915_hw_ppgtt *ppgtt);
+	void (*debug_dump)(struct i915_hw_ppgtt *ppgtt, struct seq_file *m);
 };
 
 /* To make things as simple as possible (ie. no refcounting), a VMA's lifetime
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index e42059a..cb151fc6 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -100,6 +100,65 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
 	return pte;
 }
 
+static void gen6_dump_ppgtt(struct i915_hw_ppgtt *ppgtt, struct seq_file *m)
+{
+	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
+	struct i915_hw_context *ctx =
+		container_of(ppgtt, struct i915_hw_context, ppgtt);
+	gen6_gtt_pte_t __iomem *pd_addr;
+	gen6_gtt_pte_t scratch_pte;
+	uint32_t pd_entry;
+	bool found = false;
+	int pte, pde;
+
+	scratch_pte = ppgtt->base.pte_encode(ppgtt->base.scratch.addr,
+					     I915_CACHE_LLC);
+
+
+	pd_addr = (gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm +
+		ppgtt->pd_offset / sizeof(gen6_gtt_pte_t);
+
+	seq_printf(m, "  Context %d (pd_offset %x-%x):\n", ctx->id,
+		   ppgtt->pd_offset, ppgtt->pd_offset + ppgtt->num_pd_entries);
+	for (pde = 0; pde < ppgtt->num_pd_entries; pde++) {
+		u32 expected;
+		gen6_gtt_pte_t *pt_vaddr;
+		dma_addr_t pt_addr = ppgtt->pt_dma_addr[pde];
+		pd_entry = readl(pd_addr + pde);
+		expected = (GEN6_PDE_ADDR_ENCODE(pt_addr) | GEN6_PDE_VALID);
+
+		if (pd_entry != expected)
+			seq_printf(m, "\tPDE #%d mismatch: Actual PDE: %x Expected PDE: %x\n",
+				   pde,
+				   pd_entry,
+				   expected);
+#if 0
+		seq_printf(m, "\tPDE: %x\n", pd_entry);
+#endif
+
+		pt_vaddr = kmap_atomic(ppgtt->pt_pages[pde]);
+		for (pte = 0; pte < I915_PPGTT_PT_ENTRIES; pte += 4) {
+			unsigned long va =
+				(pde * PAGE_SIZE * I915_PPGTT_PT_ENTRIES) +
+				(pte * PAGE_SIZE);
+			if (pt_vaddr[pte] != scratch_pte) {
+				seq_printf(m, "\t\t0x%lx [%03d,%04d]: = %08x %08x %08x %08x\n",
+					   va,
+					   pde, pte,
+					   pt_vaddr[pte],
+					   pt_vaddr[pte + 1],
+					   pt_vaddr[pte + 2],
+					   pt_vaddr[pte + 3]);
+				found = true;
+			}
+		}
+		kunmap_atomic(pt_vaddr);
+	}
+
+	if (!found)
+		seq_puts(m, "\tempty\n");
+}
+
 static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
@@ -391,6 +450,7 @@ alloc:
 	ppgtt->enable = gen6_ppgtt_enable;
 	ppgtt->switch_mm = gen6_mm_switch;
 	ppgtt->cleanup = gen6_ppgtt_cleanup;
+	ppgtt->debug_dump = gen6_dump_ppgtt;
 
 	vm->clear_range = gen6_ppgtt_clear_range;
 	vm->insert_entries = gen6_ppgtt_insert_entries;
-- 
1.8.3.1

* [PATCH 64/66] drm/i915: Dump all ppgtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (62 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 63/66] drm/i915: Add PPGTT dumper Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 65/66] drm/i915: Add debugfs for vma info per vm Ben Widawsky
                   ` (3 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2dfa784..20d6265 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -1814,11 +1814,22 @@ static int i915_swizzle_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int per_file_ctx(int id, void *ptr, void *data)
+{
+	struct i915_hw_context *ctx = ptr;
+	struct seq_file *m = data;
+
+	ctx->ppgtt.debug_dump(&ctx->ppgtt, m);
+
+	return 0;
+}
+
 static int i915_ppgtt_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_file *file;
 	struct intel_ring_buffer *ring;
 	int i, ret;
 
@@ -1837,12 +1848,28 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
 		seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring)));
 		seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring)));
 	}
+	seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
+
 	if (dev_priv->gtt.aliasing_ppgtt) {
 		struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
 		seq_printf(m, "aliasing PPGTT:\n");
 		ppgtt->debug_dump(ppgtt, m);
+	} else
+		goto out;
+
+	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
+		struct drm_i915_file_private *file_priv = file->driver_priv;
+		struct i915_hw_ppgtt *pvt_ppgtt;
+
+		pvt_ppgtt = &file_priv->private_default_ctx->ppgtt;
+		seq_printf(m, "proc: %s\n",
+			   get_pid_task(file->pid, PIDTYPE_PID)->comm);
+		seq_puts(m, "  default context:\n");
+		pvt_ppgtt->debug_dump(pvt_ppgtt, m);
+		idr_for_each(&file_priv->context_idr, per_file_ctx, m);
 	}
-	seq_printf(m, "ECOCHK: 0x%08x\n", I915_READ(GAM_ECOCHK));
+
+out:
 	mutex_unlock(&dev->struct_mutex);
 
 	return 0;
-- 
1.8.3.1

* [PATCH 65/66] drm/i915: Add debugfs for vma info per vm
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (63 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 64/66] drm/i915: Dump all ppgtt Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-27 23:31 ` [PATCH 66/66] drm/i915: Getparam full ppgtt Ben Widawsky
                   ` (2 subsequent siblings)
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_debugfs.c | 81 +++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 20d6265..6bbb602 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -30,6 +30,7 @@
 #include <linux/debugfs.h>
 #include <linux/slab.h>
 #include <linux/export.h>
+#include <linux/list_sort.h>
 #include <generated/utsrelease.h>
 #include <drm/drmP.h>
 #include "intel_drv.h"
@@ -145,6 +146,42 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		seq_printf(m, " (%s)", obj->ring->name);
 }
 
+static void
+describe_less_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
+{
+	seq_printf(m, "%pK: %s%s %02x %02x %d %d %d%s%s%s",
+		   &obj->base,
+		   get_pin_flag(obj),
+		   get_tiling_flag(obj),
+		   obj->base.read_domains,
+		   obj->base.write_domain,
+		   obj->last_read_seqno,
+		   obj->last_write_seqno,
+		   obj->last_fenced_seqno,
+		   cache_level_str(obj->cache_level),
+		   obj->dirty ? " dirty" : "",
+		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
+	if (obj->base.name)
+		seq_printf(m, " (name: %d)", obj->base.name);
+	if (obj->pin_count)
+		seq_printf(m, " (pinned x %d)", obj->pin_count);
+	if (obj->fence_reg != I915_FENCE_REG_NONE)
+		seq_printf(m, " (fence: %d)", obj->fence_reg);
+	if (obj->stolen)
+		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
+	if (obj->pin_mappable || obj->fault_mappable) {
+		char s[3], *t = s;
+		if (obj->pin_mappable)
+			*t++ = 'p';
+		if (obj->fault_mappable)
+			*t++ = 'f';
+		*t = '\0';
+		seq_printf(m, " (%s mappable)", s);
+	}
+	if (obj->ring != NULL)
+		seq_printf(m, " (%s)", obj->ring->name);
+}
+
 static int i915_gem_object_list_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
@@ -1922,6 +1959,49 @@ static int i915_dpio_info(struct seq_file *m, void *data)
 	return 0;
 }
 
+static int vma_compare(void *priv, struct list_head *a, struct list_head *b)
+{
+	struct i915_vma *vma1, *vma2;
+
+	vma1 = list_entry(a, struct i915_vma, per_vm_link);
+	vma2 = list_entry(b, struct i915_vma, per_vm_link);
+
+	return vma1->node.start - vma2->node.start;
+}
+
+static int i915_vm_info(struct seq_file *m, void *data)
+{
+	LIST_HEAD(sorted_vmas);
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_address_space *vm;
+	struct i915_vma *vma;
+	int ret;
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	list_for_each_entry(vm, &dev_priv->vm_list, global_link) {
+		list_sort(NULL, &vm->vma_list, vma_compare);
+		if (is_i915_ggtt(vm))
+			seq_puts(m, "Global ");
+		seq_printf(m, "VM: %p\n", vm);
+		list_for_each_entry(vma, &vm->vma_list, per_vm_link) {
+			seq_printf(m, "  VMA: 0x%08lx-0x%08lx (obj = ",
+				   vma->node.start,
+				   vma->node.start + vma->node.size);
+			describe_less_obj(m, vma->obj);
+			seq_puts(m, ")\n");
+		}
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
 static int
 i915_wedged_get(void *data, u64 *val)
 {
@@ -2358,6 +2438,7 @@ static struct drm_info_list i915_debugfs_list[] = {
 	{"i915_swizzle_info", i915_swizzle_info, 0},
 	{"i915_ppgtt_info", i915_ppgtt_info, 0},
 	{"i915_dpio", i915_dpio_info, 0},
+	{"i915_vm_info", i915_vm_info, 0},
 };
 #define I915_DEBUGFS_ENTRIES ARRAY_SIZE(i915_debugfs_list)
 
-- 
1.8.3.1

* [PATCH 66/66] drm/i915: Getparam full ppgtt
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (64 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 65/66] drm/i915: Add debugfs for vma info per vm Ben Widawsky
@ 2013-06-27 23:31 ` Ben Widawsky
  2013-06-28  3:38 ` [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
  2013-07-01 21:39 ` Daniel Vetter
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:31 UTC (permalink / raw)
  To: Intel GFX; +Cc: Ben Widawsky

As of now (and this may change) an aliasing-only PPGTT can't happen
anymore: every context gets a full address space. We do still keep the
aliasing ppgtt structure internally, though, so its presence can be
used to tell userspace whether full PPGTT is enabled.
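
A minimal userspace probe could look like this (untested sketch; uses
libdrm's drmIoctl and the define added below):

	#include <string.h>
	#include <xf86drm.h>
	#include <i915_drm.h>

	static int has_full_ppgtt(int fd)
	{
		drm_i915_getparam_t gp;
		int value = 0;

		memset(&gp, 0, sizeof(gp));
		gp.param = I915_PARAM_HAS_FULL_PPGTT;
		gp.value = &value;

		/* older kernels reject unknown params with -EINVAL */
		if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp))
			return 0;

		return value;
	}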

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_dma.c | 7 +++++--
 include/uapi/drm/i915_drm.h     | 1 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 9955dc7..d354c64 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -977,8 +977,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		value = HAS_LLC(dev);
 		break;
 	case I915_PARAM_HAS_ALIASING_PPGTT:
-		if (intel_enable_ppgtt(dev) && dev_priv->gtt.aliasing_ppgtt)
-			value = 1;
+		value = 0;
 		break;
 	case I915_PARAM_HAS_WAIT_TIMEOUT:
 		value = 1;
@@ -1001,6 +1000,10 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_HANDLE_LUT:
 		value = 1;
 		break;
+	case I915_PARAM_HAS_FULL_PPGTT:
+		if (intel_enable_ppgtt(dev) && dev_priv->gtt.aliasing_ppgtt)
+			value = 1;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 923ed7f..5cb9fd1 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -310,6 +310,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_PINNED_BATCHES	 24
 #define I915_PARAM_HAS_EXEC_NO_RELOC	 25
 #define I915_PARAM_HAS_EXEC_HANDLE_LUT   26
+#define I915_PARAM_HAS_FULL_PPGTT	 27
 
 typedef struct drm_i915_getparam {
 	int param;
-- 
1.8.3.1

* Re: [PATCH 61/66] drm/i915: Use multiple VMs
  2013-06-27 23:31 ` [PATCH 61/66] drm/i915: Use multiple VMs Ben Widawsky
@ 2013-06-27 23:43   ` Ben Widawsky
  2013-07-02 10:58     ` Ville Syrjälä
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-27 23:43 UTC (permalink / raw)
  To: Intel GFX

On Thu, Jun 27, 2013 at 04:31:02PM -0700, Ben Widawsky wrote:
> This requires doing an actual switch of the page tables during the
> context switch/execbuf.
> 
> Along the way, cut away as much "aliasing" ppgtt as possible
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_gem.c            | 22 +++++++++++++---------
>  drivers/gpu/drm/i915/i915_gem_context.c    | 29 +++++++++++++++++------------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 27 ++++++++++++++++++++-------
>  3 files changed, 50 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index af0150e..f05d585 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2170,7 +2170,10 @@ request_to_vm(struct drm_i915_gem_request *request)
>  	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
>  	struct i915_address_space *vm;
>  
> -	vm = &dev_priv->gtt.base;
> +	if (request->ctx)
> +		vm = &request->ctx->ppgtt.base;
> +	else
> +		vm = &dev_priv->gtt.base;
>  
>  	return vm;
>  }
> @@ -2676,10 +2679,10 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
>  
>  	if (obj->has_global_gtt_mapping && is_i915_ggtt(vm))
>  		i915_gem_gtt_unbind_object(obj);
> -	if (obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
> -		obj->has_aliasing_ppgtt_mapping = 0;
> -	}
> +
> +	vm->clear_range(vm, i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
> +			obj->base.size >> PAGE_SHIFT);
> +
>  	i915_gem_gtt_finish_object(obj);
>  	i915_gem_object_unpin_pages(obj);
>  
> @@ -3444,11 +3447,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  				return ret;
>  		}
>  
> -		if (obj->has_global_gtt_mapping)
> +		if (!is_i915_ggtt(vm) && obj->has_global_gtt_mapping)
>  			i915_gem_gtt_bind_object(obj, cache_level);
> -		if (obj->has_aliasing_ppgtt_mapping)
> -			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
> -					       obj, cache_level);
> +
> +		vm->insert_entries(vm, obj->pages,
> +				   i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
> +				   cache_level);
>  
>  		i915_gem_obj_set_color(obj, vm, cache_level);
>  	}
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 37ebfa2..cea036e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -526,10 +526,14 @@ static int do_switch(struct intel_ring_buffer *ring,
>  	if (from == to && from->last_ring == ring)
>  		return 0;
>  
> +	ret = to->ppgtt.switch_mm(&to->ppgtt, ring);
> +	if (ret)
> +		return ret;
> +
>  	if (ring != &dev_priv->ring[RCS] && from) {
>  		ret = i915_add_request(ring, NULL);
>  		if (ret)
> -			return ret;
> +			goto err_out;
>  		i915_gem_context_unreference(from);
>  	}
>  
> @@ -538,7 +542,7 @@ static int do_switch(struct intel_ring_buffer *ring,
>  
>  	ret = i915_gem_ggtt_pin(to->obj, CONTEXT_ALIGN, false, false);
>  	if (ret)
> -		return ret;
> +		goto err_out;
>  
>  	/* Clear this page out of any CPU caches for coherent swap-in/out. Note
>  	 * that thanks to write = false in this call and us not setting any gpu
> @@ -546,10 +550,8 @@ static int do_switch(struct intel_ring_buffer *ring,
>  	 * (when switching away from it), this won't block.
>  	 * XXX: We need a real interface to do this instead of trickery. */
>  	ret = i915_gem_object_set_to_gtt_domain(to->obj, false);
> -	if (ret) {
> -		i915_gem_object_unpin(to->obj);
> -		return ret;
> -	}
> +	if (ret)
> +		goto unpin_out;
>  
>  	if (!to->obj->has_global_gtt_mapping)
>  		i915_gem_gtt_bind_object(to->obj, to->obj->cache_level);
> @@ -560,10 +562,8 @@ static int do_switch(struct intel_ring_buffer *ring,
>  		hw_flags |= MI_FORCE_RESTORE;
>  
>  	ret = mi_set_context(ring, to, hw_flags);
> -	if (ret) {
> -		i915_gem_object_unpin(to->obj);
> -		return ret;
> -	}
> +	if (ret)
> +		goto unpin_out;
>  
>  	/* The backing object for the context is done after switching to the
>  	 * *next* context. Therefore we cannot retire the previous context until
> @@ -593,7 +593,7 @@ static int do_switch(struct intel_ring_buffer *ring,
>  			 * scream.
>  			 */
>  			WARN_ON(mi_set_context(ring, from, MI_RESTORE_INHIBIT));
> -			return ret;
> +			goto err_out;
>  		}
>  
>  		i915_gem_object_unpin(from->obj);
> @@ -605,8 +605,13 @@ done:
>  	ring->last_context = to;
>  	to->is_initialized = true;
>  	to->last_ring = ring;
> -
>  	return 0;
> +
> +unpin_out:
> +	i915_gem_object_unpin(to->obj);
> +err_out:
> +	WARN_ON(from->ppgtt.switch_mm(&from->ppgtt, ring));
> +	return ret;
>  }
>  
>  /**
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index aeec8c0..0f6bf3c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -437,11 +437,21 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  	}
>  
>  	/* Ensure ppgtt mapping exists if needed */
> -	if (dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
> -				       obj, obj->cache_level);
> -
> +	if (is_i915_ggtt(vm) &&
> +	    dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> +		/* FIXME: remove this later */
> +		struct i915_address_space *appgtt =
> +			&dev_priv->gtt.aliasing_ppgtt->base;
> +		unsigned long obj_offset = i915_gem_obj_offset(obj, appgtt);
> +
> +		appgtt->insert_entries(appgtt, obj->pages,
> +				       obj_offset >> PAGE_SHIFT,
> +				       obj->cache_level);
>
I meant to remove this, but I missed it. In theory I don't ever want to
insert Aliasing PPGTT PTEs. Will remove it locally and test it now.
>
>  		obj->has_aliasing_ppgtt_mapping = 1;
> +	} else {
> +		vm->insert_entries(vm, obj->pages,
> +				   i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
> +				   obj->cache_level);
>  	}
>  
>  	if (entry->offset != i915_gem_obj_offset(obj, vm)) {
> @@ -864,7 +874,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	struct i915_hw_context *ctx;
>  	struct i915_address_space *vm;
>  	u32 ctx_id = i915_execbuffer2_get_context_id(*args);
> -	u32 exec_start, exec_len;
> +	u32 exec_start = args->batch_start_offset, exec_len;
>  	u32 mask, flags;
>  	int ret, mode, i;
>  	bool need_relocs;
> @@ -1085,8 +1095,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  			goto err;
>  	}
>  
> -	exec_start = i915_gem_obj_offset(batch_obj, vm) +
> -		args->batch_start_offset;
> +	if (batch_obj->has_global_gtt_mapping)
> +		exec_start += i915_gem_ggtt_offset(batch_obj);
> +	else
> +		exec_start += i915_gem_obj_offset(batch_obj, vm);
> +
>  	exec_len = args->batch_len;
>  	if (cliprects) {
>  		for (i = 0; i < args->num_cliprects; i++) {
> -- 
> 1.8.3.1
> 

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (65 preceding siblings ...)
  2013-06-27 23:31 ` [PATCH 66/66] drm/i915: Getparam full ppgtt Ben Widawsky
@ 2013-06-28  3:38 ` Ben Widawsky
  2013-07-01 21:39 ` Daniel Vetter
  67 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-06-28  3:38 UTC (permalink / raw)
  To: Intel GFX

Forgot the repo:

http://cgit.freedesktop.org/~bwidawsk/drm-intel/log/?h=ppgtt

On Thu, Jun 27, 2013 at 04:30:01PM -0700, Ben Widawsky wrote:
> First, I don't think this whole series is ready for merge yet. It is
> however ready for a review, and I think a lot of the prep patches in the
> series could be merged to make my rebasing life a bit easier. I cannot
> continue ignoring pretty much all emails/bugs as I have for the last
> month to wrap up this series. The current state is on my IVB, things are
> pretty stable. I've seen one unexplained hang, but I'm hopeful review
> might help me uncover/explain.
> 
> This patch series introduces the next step in enabling full PPGTT, which
> is per fd address space/context, and it also contains the previously
> unmerged patches (some of which have been reworked, modified, or
> rebased). In regards to the continued VMA changes, I think in these, the
> delta with regard to the last posting is the bound list was per VM. It
> is now global. I've also moved the active list to per VM.
> 
> Brand new in the series we take the previous series' context per fd
> (with address space) one step further and actually switch the address
> spaces when we do context switches. In order to make this happen, the
> series continues to chip away at removing the notion of an object only
> ever being bound into one address space via the struct
> i915_address_space and struct i915_vma data structures which are really
> abstractions for a page directory, and current mapped ptes respectively.
> The error state is improved since the last series (though still some
> work there is probably needed). It also serves to remove the notion of
> the aliasing PPGTT since in theory everything bound into the GGTT
> shouldn't benefit from an aliasing PPGTT (fact check).
> 
> With every context having it's own address space, and every open DRM fd
> having it's own context, it's trivial on execbuf to lookup a context and
> do the pinning in the proper address space. More importantly, it's
> implicit that a context exists, which made this impossible to do
> earlier.
> 
> *A note on patch ordering:* In order to work this series incrementally, the
> final patch ordering is admittedly a little bit strange. I'm more than willing
> to rework these as requested, but I'd really prefer not to do really heavy
> reordering unless there is a major benefit, or of course to fix bugs.
> 
> # What is not in this patch series in the order I think we should handle it
> (and I acknowledge some of this stuff is non-trivial):
> 
> ## Review + QA coverage
> 
> ## Porting to HSW
> 
> It shouldn't be too much extra work, if any, to add support. I haven't looked
> into it yet.
> 
> ## Better vm/ppgtt info in error state collection
> 
> In particular, I want to dump all the PTEs at hang, and at the very least the
> guilt PTEs.  This isn't difficult, and can be done with copypasta from the
> existing dumper I have.
> 
> ## User space and the implications
> 
> Now that contexts are valid on all rings, userspace should begin emitting the
> context for all rings if it expects both rings to be able to access both
> objects in the same offset. The mesa change looks to be just one line which
> simplies emits the context for batch->is_blit, I doubt libva is using contexts,
> and SNA seems not to. The plan to support mesa will be to do the detection in
> libdrm, and go ahead with the simple mesa one liner. I've been using the
> oneliner if mesa for a while now, but we will need to support old user space in
> the kernel. So there might be a bit of work even on the kernel side here. We
> also need some IGT tools test updates. I have messy versions of these locally
> already.
> 
> ## Performance data
> 
> I think it doesn't preclude preliminary review of the patches since the main
> goal of PPGTT is really abourt security, correctness, and enabling other
> things. I will update with some numbers after I work on it a bit more.
> 
> 
> ## Testing on SNB
> 
> If our current code is correct, then I think these patches might work on SNB
> as is, but it's untested. There is currently no way to disconnect contexts +
> PPGTT from the whole thing; so if this doesn't work - we'll need rework some of
> the code. I think it should just entail bringing back aliasing ppgtt, and not
> doing the address space switch when switching contexts (aliasing ppgtt will
> have a null switch_mm()).
> 
> ## Soft pin interface
> 
> I'd like to defer the discussion until these patches are merged.
> 
> ## On demand page table allocation
> 
> This is a potentially very useful optimization for at least the following
> reasons:
> * any app using contexts will have an extra set of page tables it isn't using;
>   wasted memory
> * Reduce DCLV to reduce pd fetch latency
> * Allows Better use of a switch to default context for low memory situations
>   (evicting unused page tables for example)
> 
> ## Per VMA cache levels/control
> 
> There are situations in the code where we have to flush the GPU pipeline in
> order to change cache levels.  This should no longer be the case for unaffected
> VMs (I think). The same may be true with domain tracking.
> 
> ## dmabuf/prime integration
> 
> I haven't looked into what's missing to support it. If I'm lucky, it just works.
> 
> 
> 
> With that, if you haven't already moved on, chanting tl;dr - all comments
> welcome.
> 
> ---
> 
> Ben Widawsky (65):
>   drm/i915: Remove extra error state NULL
>   drm/i915: Extract error buffer capture
>   drm/i915: make PDE|PTE platform specific
>   drm/i915: Don't clear gtt with 0 entries
>   drm/i915: Conditionally use guard page based on PPGTT
>   drm/i915: Use drm_mm for PPGTT PDEs
>   drm/i915: cleanup context fini
>   drm/i915: Do a fuller init after reset
>   drm/i915: Split context enabling from init
>   drm/i915: destroy i915_gem_init_global_gtt
>   drm/i915: Embed PPGTT into the context
>   drm/i915: Unify PPGTT codepaths on gen6+
>   drm/i915: Move ppgtt initialization down
>   drm/i915: Tie context to PPGTT
>   drm/i915: Really share scratch page
>   drm/i915: Combine scratch members into a struct
>   drm/i915: Drop dev from pte_encode
>   drm/i915: Use gtt shortform where possible
>   drm/i915: Move fbc members out of line
>   drm/i915: Move gtt and ppgtt under address space umbrella
>   drm/i915: Move gtt_mtrr to i915_gtt
>   drm/i915: Move stolen stuff to i915_gtt
>   drm/i915: Move aliasing_ppgtt
>   drm/i915: Put the mm in the parent address space
>   drm/i915: Move active/inactive lists to new mm
>   drm/i915: Create a global list of vms
>   drm/i915: Remove object's gtt_offset
>   drm: pre allocate node for create_block
>   drm/i915: Getter/setter for object attributes
>   drm/i915: Create VMAs (part 1)
>   drm/i915: Create VMAs (part 2) - kill gtt space
>   drm/i915: Create VMAs (part 3) - plumbing
>   drm/i915: Create VMAs (part 3.5) - map and fenceable tracking
>   drm/i915: Create VMAs (part 4) - Error capture
>   drm/i915: Create VMAs (part 5) - move mm_list
>   drm/i915: Create VMAs (part 6) - finish error plumbing
>   drm/i915: create an object_is_active()
>   drm/i915: Move active to vma
>   drm/i915: Track all VMAs per VM
>   drm/i915: Defer request freeing
>   drm/i915: Clean up VMAs before freeing
>   drm/i915: Replace has_bsd/blt with a mask
>   drm/i915: Catch missed context unref earlier
>   drm/i915: Add a context open function
>   drm/i915: Permit contexts on all rings
>   drm/i915: Fix context fini refcounts
>   drm/i915: Better reset handling for contexts
>   drm/i915: Create a per file_priv default context
>   drm/i915: Remove ring specificity from contexts
>   drm/i915: Track which ring a context ran on
>   drm/i915: dump error state based on capture
>   drm/i915: PPGTT should take a ppgtt argument
>   drm/i915: USE LRI for switching PP_DIR_BASE
>   drm/i915: Extract mm switching to function
>   drm/i915: Write PDEs at init instead of enable
>   drm/i915: Disallow pin with full ppgtt
>   drm/i915: Get context early in execbuf
>   drm/i915: Pass ctx directly to switch/hangstat
>   drm/i915: Actually add the new address spaces
>   drm/i915: Use multiple VMs
>   drm/i915: Kill now unused ppgtt_{un,}bind
>   drm/i915: Add PPGTT dumper
>   drm/i915: Dump all ppgtt
>   drm/i915: Add debugfs for vma info per vm
>   drm/i915: Getparam full ppgtt
> 
> Chris Wilson (1):
>   drm: Optionally create mm blocks from top-to-bottom
> 
>  drivers/gpu/drm/drm_mm.c                   | 134 +++---
>  drivers/gpu/drm/i915/i915_debugfs.c        | 215 ++++++++--
>  drivers/gpu/drm/i915/i915_dma.c            |  25 +-
>  drivers/gpu/drm/i915/i915_drv.c            |  57 ++-
>  drivers/gpu/drm/i915/i915_drv.h            | 353 ++++++++++------
>  drivers/gpu/drm/i915/i915_gem.c            | 639 +++++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem_context.c    | 279 +++++++++----
>  drivers/gpu/drm/i915/i915_gem_debug.c      |  11 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  64 +--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 138 ++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 541 ++++++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |  87 ++--
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |  21 +-
>  drivers/gpu/drm/i915/i915_irq.c            | 197 ++++++---
>  drivers/gpu/drm/i915/i915_trace.h          |  38 +-
>  drivers/gpu/drm/i915/intel_display.c       |  40 +-
>  drivers/gpu/drm/i915/intel_drv.h           |   7 -
>  drivers/gpu/drm/i915/intel_fb.c            |   8 +-
>  drivers/gpu/drm/i915/intel_overlay.c       |  26 +-
>  drivers/gpu/drm/i915/intel_pm.c            |  65 +--
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |  32 +-
>  drivers/gpu/drm/i915/intel_sprite.c        |   8 +-
>  include/drm/drm_mm.h                       | 147 ++++---
>  include/uapi/drm/i915_drm.h                |   1 +
>  24 files changed, 2044 insertions(+), 1089 deletions(-)
> 
> -- 
> 1.8.3.1
> 

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-27 23:30 ` [PATCH 57/66] drm/i915: Disallow pin with full ppgtt Ben Widawsky
@ 2013-06-28  8:55   ` Chris Wilson
  2013-06-29  5:43     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Chris Wilson @ 2013-06-28  8:55 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> Pin doesn't fit with PPGTT since the interface doesn't allow for the
> context for which we want to pin.

Nak. Pin still retains it semantics with the gtt and only applies to the
gtt.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 03/66] drm/i915: make PDE|PTE platform specific
  2013-06-27 23:30 ` [PATCH 03/66] drm/i915: make PDE|PTE platform specific Ben Widawsky
@ 2013-06-28 16:53   ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-06-28 16:53 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:04PM -0700, Ben Widawsky wrote:
> Nothing outside of i915_gem_gtt.c and more specifically, the relevant
> gen specific init function should need to know about number of PDEs, or
> PTEs per PD. Exposing this will only lead to circumventing using the
> upcoming VM abstraction.
> 
> To accomplish this, move the defines into the .c file, rename the PDE
> define to be GEN6, and make the PTE count less of a magic number.
> 
> The remaining code in the global gtt setup is a bit messy, but an
> upcoming patch will clean that one up.
> 
> v2: Don't hardcode number of PDEs (Daniel + Jesse)
> Reworded commit message to reflect change.
> 
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

First 3 patches merged to dinq, thanks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 06/66] drm/i915: Conditionally use guard page based on PPGTT
  2013-06-27 23:30 ` [PATCH 06/66] drm/i915: Conditionally use guard page based on PPGTT Ben Widawsky
@ 2013-06-28 17:57   ` Jesse Barnes
  0 siblings, 0 replies; 124+ messages in thread
From: Jesse Barnes @ 2013-06-28 17:57 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, 27 Jun 2013 16:30:07 -0700
Ben Widawsky <ben@bwidawsk.net> wrote:

> The PPGTT PDEs serve as the guard page (as long as they remain at the
> top) so we don't need yet another guard page. Note that there is a
> potential issue if the aliasing PPGTT (and later, the default context)
> relinquish this part of the GGTT. We should be able to assert that won't
> happen however.
> 
> While there, add some comments for the setup_global_gtt function which
> started getting complicated.
> 
> The reason I've opted not to leave out the guard_page argument is that
> in order to support dri1, we call the setup function, and I didn't like
> to have to clear the guard page in more than 1 location.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h     |  3 ++-
>  drivers/gpu/drm/i915/i915_gem.c     |  4 ++--
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 27 ++++++++++++++++++++++-----
>  3 files changed, 26 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index b709712..c677d6c 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1852,7 +1852,8 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
>  void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
>  void i915_gem_init_global_gtt(struct drm_device *dev);
>  void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
> -			       unsigned long mappable_end, unsigned long end);
> +			       unsigned long mappable_end, unsigned long end,
> +			       unsigned long guard_size);
>  int i915_gem_gtt_init(struct drm_device *dev);
>  static inline void i915_gem_chipset_flush(struct drm_device *dev)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6806bb9..629e047 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -158,8 +158,8 @@ i915_gem_init_ioctl(struct drm_device *dev, void *data,
>  
>  	mutex_lock(&dev->struct_mutex);
>  	i915_gem_setup_global_gtt(dev, args->gtt_start, args->gtt_end,
> -				  args->gtt_end);
> -	dev_priv->gtt.mappable_end = args->gtt_end;
> +				  args->gtt_end, PAGE_SIZE);
> +	dev_priv->gtt.mappable_end = args->gtt_end - PAGE_SIZE;
>  	mutex_unlock(&dev->struct_mutex);
>  
>  	return 0;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 0fce8d0..fb30d65 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -613,10 +613,23 @@ static void i915_gtt_color_adjust(struct drm_mm_node *node,
>  			*end -= 4096;
>  	}
>  }
> +
> +/**
> + * i915_gem_setup_global_gtt() setup an allocator for the global GTT with the
> + * given parameters and initialize all PTEs to point to the scratch page.
> + *
> + * @dev
> + * @start - first offset of managed GGTT space
> + * @mappable_end - Last offset of the aperture mapped region
> + * @end - Last offset that can be accessed by the allocator
> + * @guard_size - Size to initialize to scratch after end. (Currently only used
> + *		 for prefetching case)
> + */
>  void i915_gem_setup_global_gtt(struct drm_device *dev,
>  			       unsigned long start,
>  			       unsigned long mappable_end,
> -			       unsigned long end)
> +			       unsigned long end,
> +			       unsigned long guard_size)
>  {
>  	/* Let GEM Manage all of the aperture.
>  	 *
> @@ -634,8 +647,11 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  
>  	BUG_ON(mappable_end > end);
>  
> +	if (WARN_ON(guard_size & ~PAGE_MASK))
> +		guard_size = round_up(guard_size, PAGE_SIZE);
> +
>  	/* Subtract the guard page ... */
> -	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - PAGE_SIZE);
> +	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - guard_size);
>  	if (!HAS_LLC(dev))
>  		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
>  
> @@ -665,7 +681,8 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  	}
>  
>  	/* And finally clear the reserved guard page */
> -	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
> +	dev_priv->gtt.gtt_clear_range(dev, (end - guard_size) / PAGE_SIZE,
> +				      guard_size / PAGE_SIZE);
>  }
>  
>  static bool
> @@ -700,7 +717,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
>  			gtt_size -= GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
>  		}
>  
> -		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
> +		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, 0);
>  
>  		ret = i915_gem_init_aliasing_ppgtt(dev);
>  		if (!ret)
> @@ -710,7 +727,7 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
>  		drm_mm_takedown(&dev_priv->mm.gtt_space);
>  		gtt_size += GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
>  	}
> -	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
> +	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, PAGE_SIZE);
>  }
>  
>  static int setup_scratch_page(struct drm_device *dev)

Just a nitpick that can be changed with a follow on patch if others
agree: I'd rather see the WARN_ON made a BUG_ON when checking that the
guard_size is a multiple of PAGE_SIZE (which, incidentally, is the
wrong value to use, but that's also for another cleanup).

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

-- 
Jesse Barnes, Intel Open Source Technology Center

* Re: [PATCH 07/66] drm/i915: Use drm_mm for PPGTT PDEs
  2013-06-27 23:30 ` [PATCH 07/66] drm/i915: Use drm_mm for PPGTT PDEs Ben Widawsky
@ 2013-06-28 18:01   ` Jesse Barnes
  0 siblings, 0 replies; 124+ messages in thread
From: Jesse Barnes @ 2013-06-28 18:01 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, 27 Jun 2013 16:30:08 -0700
Ben Widawsky <ben@bwidawsk.net> wrote:

> When PPGTT support was originally enabled, it was only designed to
> support 1 PPGTT. It therefore made sense to simply hide the GGTT space
> required to enable this from the drm_mm allocator.
> 
> Since we intend to support full PPGTT, which means more than 1, and they
> can be created and destroyed ad hoc it will be required to use the
> proper allocation techniques we already have.
> 
> The first step here is to make the existing single PPGTT use the allocator.
> 
> v2: Align PDEs to 64b in GTT
> Allocate the node dynamically so we can use drm_mm_put_block
> Now tested on IGT
> Allocate node at the top to avoid fragmentation (Chris)
> 
> v3: Use Chris' top down allocator
> 
> v4: Embed drm_mm_node into ppgtt struct (Jesse)
> Remove hunks which didn't belong (Jesse)
> 
> v5: Don't subtract guard page since we now killed the guard page prior
> to this patch. (Ben)
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h     |  1 +
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 45 ++++++++++++++++++++++++-------------
>  2 files changed, 31 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index c677d6c..659b4aa 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -484,6 +484,7 @@ struct i915_gtt {
>  #define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
>  
>  struct i915_hw_ppgtt {
> +	struct drm_mm_node node;
>  	struct drm_device *dev;
>  	unsigned num_pd_entries;
>  	struct page **pt_pages;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index fb30d65..5284dc5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -247,6 +247,8 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
>  {
>  	int i;
>  
> +	drm_mm_remove_node(&ppgtt->node);
> +
>  	if (ppgtt->pt_dma_addr) {
>  		for (i = 0; i < ppgtt->num_pd_entries; i++)
>  			pci_unmap_page(ppgtt->dev->pdev,
> @@ -263,16 +265,27 @@ static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
>  
>  static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  {
> +#define GEN6_PD_ALIGN (PAGE_SIZE * 16)
> +#define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
>  	struct drm_device *dev = ppgtt->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	unsigned first_pd_entry_in_global_pt;
>  	int i;
>  	int ret = -ENOMEM;
>  
> -	/* ppgtt PDEs reside in the global gtt pagetable, which has 512*1024
> -	 * entries. For aliasing ppgtt support we just steal them at the end for
> -	 * now. */
> -	first_pd_entry_in_global_pt = gtt_total_entries(dev_priv->gtt);
> +	/* PPGTT PDEs reside in the GGTT stolen space, and consists of 512
> +	 * entries. The allocator works in address space sizes, so it's
> +	 * multiplied by page size. We allocate at the top of the GTT to avoid
> +	 * fragmentation.
> +	 */
> +	BUG_ON(!drm_mm_initialized(&dev_priv->mm.gtt_space));
> +	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space,
> +						  &ppgtt->node, GEN6_PD_SIZE,
> +						  GEN6_PD_ALIGN, 0,
> +						  dev_priv->gtt.mappable_end,
> +						  dev_priv->gtt.total,
> +						  DRM_MM_TOPDOWN);
> +	if (ret)
> +		return ret;
>  
>  	if (IS_HASWELL(dev)) {
>  		ppgtt->pte_encode = hsw_pte_encode;
> @@ -288,8 +301,10 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	ppgtt->cleanup = gen6_ppgtt_cleanup;
>  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*ppgtt->num_pd_entries,
>  				  GFP_KERNEL);
> -	if (!ppgtt->pt_pages)
> +	if (!ppgtt->pt_pages) {
> +		drm_mm_remove_node(&ppgtt->node);
>  		return -ENOMEM;
> +	}
>  
>  	for (i = 0; i < ppgtt->num_pd_entries; i++) {
>  		ppgtt->pt_pages[i] = alloc_page(GFP_KERNEL);
> @@ -319,7 +334,11 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  	ppgtt->clear_range(ppgtt, 0,
>  			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
>  
> -	ppgtt->pd_offset = first_pd_entry_in_global_pt * sizeof(gen6_gtt_pte_t);
> +	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
> +			 ppgtt->node.size >> 20,
> +			 ppgtt->node.start / PAGE_SIZE);
> +	ppgtt->pd_offset =
> +		ppgtt->node.start / PAGE_SIZE * sizeof(gen6_gtt_pte_t);
>  
>  	return 0;
>  
> @@ -336,6 +355,7 @@ err_pt_alloc:
>  			__free_page(ppgtt->pt_pages[i]);
>  	}
>  	kfree(ppgtt->pt_pages);
> +	drm_mm_remove_node(&ppgtt->node);
>  
>  	return ret;
>  }
> @@ -442,6 +462,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
>  				      dev_priv->gtt.total / PAGE_SIZE);
>  
> +	if (dev_priv->mm.aliasing_ppgtt)
> +		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> +
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		i915_gem_clflush_object(obj);
>  		i915_gem_gtt_bind_object(obj, obj->cache_level);
> @@ -711,21 +734,13 @@ void i915_gem_init_global_gtt(struct drm_device *dev)
>  	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
>  		int ret;
>  
> -		if (INTEL_INFO(dev)->gen <= 7) {
> -			/* PPGTT pdes are stolen from global gtt ptes, so shrink the
> -			 * aperture accordingly when using aliasing ppgtt. */
> -			gtt_size -= GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
> -		}
> -
>  		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, 0);
> -
>  		ret = i915_gem_init_aliasing_ppgtt(dev);
>  		if (!ret)
>  			return;
>  
>  		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
>  		drm_mm_takedown(&dev_priv->mm.gtt_space);
> -		gtt_size += GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE;
>  	}
>  	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, PAGE_SIZE);
>  }

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

-- 
Jesse Barnes, Intel Open Source Technology Center

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-28  8:55   ` Chris Wilson
@ 2013-06-29  5:43     ` Ben Widawsky
  2013-06-29  6:44       ` Chris Wilson
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-29  5:43 UTC (permalink / raw)
  To: Chris Wilson, Intel GFX

On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
> On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> > Pin doesn't fit with PPGTT since the interface doesn't allow for the
> > context for which we want to pin.
> 
> Nak. Pin still retains it semantics with the gtt and only applies to the
> gtt.
> -Chris
> 
> -- 
> Chris Wilson, Intel Open Source Technology Centre


Here is the error I have on pin. I was trying to debug it previously but
got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
make it work, but I never finished. Maybe you know offhand what I've
messed up, and the right way to fix it?

gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-29  5:43     ` Ben Widawsky
@ 2013-06-29  6:44       ` Chris Wilson
  2013-06-29 14:34         ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Chris Wilson @ 2013-06-29  6:44 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Fri, Jun 28, 2013 at 10:43:30PM -0700, Ben Widawsky wrote:
> On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
> > On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> > > Pin doesn't fit with PPGTT since the interface doesn't allow for the
> > > context for which we want to pin.
> > 
> > Nak. Pin still retains it semantics with the gtt and only applies to the
> > gtt.
> 
> 
> Here is the error I have on pin. I was trying to debug it previously but
> got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
> make it work, but I never finished. Maybe you know offhand what I've
> messed up, and the right way to fix it?
> 
> gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.

Ok, that is a condition that no longer holds with full ppgtt. Now
fortunately, userspace that might depend upon that is limited to DRI1
era machines (at least in the userspace I know about) and we can just
update the test to understand that pinning and exec are two different
address spaces.

How do you handle EXEC_OBJECT_NEEDS_GTT? That may be an acceptable
w/a. Or just skip that portion of the test if PARAM_HAS_FULL_PPGTT. Soft
pinning should be tested separately (so that it isn't confused with
pinning to the ggtt), but that is also a viable solution to this portion
of the test.
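
Something along these lines in the test, say (sketch only; the getparam
wrapper is hypothetical, the param comes from the last patch of the
series):

	/* exec offsets now live in the per-fd ppgtt, not the ggtt the
	 * object was pinned into, so only assert equality without it */
	if (!gem_has_full_ppgtt(fd))	/* hypothetical GETPARAM wrapper */
		assert(gem_exec[0].offset == offset);
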
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-29  6:44       ` Chris Wilson
@ 2013-06-29 14:34         ` Daniel Vetter
  2013-06-30  6:56           ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-29 14:34 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Sat, Jun 29, 2013 at 8:44 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Fri, Jun 28, 2013 at 10:43:30PM -0700, Ben Widawsky wrote:
>> On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
>> > On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
>> > > Pin doesn't fit with PPGTT since the interface doesn't allow for the
>> > > context for which we want to pin.
>> >
>> > Nak. Pin still retains it semantics with the gtt and only applies to the
>> > gtt.
>>
>>
>> Here is the error I have on pin. I was trying to debug it previously but
>> got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
>> make it work, but I never finished. Maybe you know offhand what I've
>> messed up, and the right way to fix it?
>>
>> gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.
>
> Ok, that is a condition that no longer holds with full ppgtt. Now
> fortunately, userspace that might depend upon that is limited to DRI1
> era machines (at least in the userspace I know about) and we can just
> update the test to understand that pinning and exec are two different
> address spaces.
>
> How do you handle EXEC_OBJECT_NEEDS_GTT? As that may be an acceptable
> w/a. Or just skip that portion of the test if PRAM_HAS_FULL_PPGTT. Soft
> pinning should be tested separately (so that it isn't confused with
> pinning to the ggtt), but that is also a viable solution to this portion
> of the test.

NEEDS_GTT is only valid where we alias the global GTT and PPGTT (i.e.
snb). So I think we should reject it on other platforms (or silently
ignore it if userspace uses it already).
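
Concretely I'd expect a check along these lines in the execbuf path
(sketch only; "full_ppgtt_enabled" is a placeholder for whatever
predicate the series settles on):

	if ((entry->flags & EXEC_OBJECT_NEEDS_GTT) &&
	    full_ppgtt_enabled(dev))
		return -EINVAL;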

Now since we've had a few funny bugs in this area already which proved
to be rather hard to track down I think it's time to implement the
relevant igt tests. The (currently only internal) i-g-t wiki has some
information about what exactly blows up and what I think should be
tested. I think it'd be a good requirement to block the real ppgtt
enabling (not all the vma prep patches) until that test is ready.

On that topic: Do we have other gaps in our testing, or is the current
igt coverage sufficient for this massive refactoring? Ben, has
anything blown up while you've developed these patches which was not
caught by i-g-t?

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-29 14:34         ` Daniel Vetter
@ 2013-06-30  6:56           ` Ben Widawsky
  2013-06-30 11:06             ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-06-30  6:56 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sat, Jun 29, 2013 at 04:34:07PM +0200, Daniel Vetter wrote:
> On Sat, Jun 29, 2013 at 8:44 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > On Fri, Jun 28, 2013 at 10:43:30PM -0700, Ben Widawsky wrote:
> >> On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
> >> > On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> >> > > Pin doesn't fit with PPGTT since the interface doesn't allow for the
> >> > > context for which we want to pin.
> >> >
> >> > Nak. Pin still retains it semantics with the gtt and only applies to the
> >> > gtt.
> >>
> >>
> >> Here is the error I have on pin. I was trying to debug it previously but
> >> got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
> >> make it work, but I never finished. Maybe you know offhand what I've
> >> messed up, and the right way to fix it?
> >>
> >> gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.
> >
> > Ok, that is a condition that no longer holds with full ppgtt. Now
> > fortunately, userspace that might depend upon that is limited to DRI1
> > era machines (at least in the userspace I know about) and we can just
> > update the test to understand that pinning and exec are two different
> > address spaces.
> >
> > How do you handle EXEC_OBJECT_NEEDS_GTT? As that may be an acceptable
> > w/a. Or just skip that portion of the test if PARAM_HAS_FULL_PPGTT. Soft
> > pinning should be tested separately (so that it isn't confused with
> > pinning to the ggtt), but that is also a viable solution to this portion
> > of the test.
> 
> NEEDS_GTT is only valid where we alias the global GTT and PPGTT (i.e.
> snb). So I think we should reject it on other platforms (or silently
> ignore it if userspace uses it already).

What do you recommend as the resolution to the failing gem_pin then?

> 
> Now since we've had a few funny bugs in this area already which proved
> to be rather hard to track down I think it's time to implement the
> relevant igt tests. The (currently only internal) i-g-t wiki has some
> information about what exactly blows up and what I think should be
> tested. I think it'd be a good requirement to block the real ppgtt
> enabling (not all the vma prep patches) until that test is ready.
> 

I'll take a look on Monday and try to start on it. I think if the
test case is reasonable, then blocking the merge is fair, but I have to
see what's in store before I decide whether or not to argue.

> On that topic: Do we have other gaps in our testing, or is the current
> igt coverage sufficient for this massive refactoring? Ben, has
> anything blown up while you've developed these patches which was not
> caught by i-g-t?

Other than the one bug I haven't yet tracked down which I mentioned in
0000-cover-letter (I've never hit it in many IGT runs), I had one
reproducible bug which was really hard to resolve. It passed all of the
IGT tests, and caused some weird display corruption in UXA (it was the
screenshot I posted, if you recall). It ran fine in SNA for a while, and
then would hang. That wasn't even full PPGTT, it was just after the
refactor. The root cause was some screwed up unbind logic where I ended
up with a bogus unbind offset. In retrospect, I'm not really sure why
the issue wasn't either more severe, or less, and also why IGT didn't hit
it. It was the kind of bug which I don't feel is worth testing.

Since address spaces are per fd, any test which opens multiple fds
and executes testable batches is a good test. I'm not really sure how
many of them we have offhand, but we have at least a few. I think piglit
on a composited desktop is one of the best focus tests we can run.


> 
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-30  6:56           ` Ben Widawsky
@ 2013-06-30 11:06             ` Daniel Vetter
  2013-06-30 11:31               ` Chris Wilson
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 11:06 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Sat, Jun 29, 2013 at 11:56:53PM -0700, Ben Widawsky wrote:
> On Sat, Jun 29, 2013 at 04:34:07PM +0200, Daniel Vetter wrote:
> > On Sat, Jun 29, 2013 at 8:44 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > On Fri, Jun 28, 2013 at 10:43:30PM -0700, Ben Widawsky wrote:
> > >> On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
> > >> > On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> > >> > > Pin doesn't fit with PPGTT since the interface doesn't allow for the
> > >> > > context for which we want to pin.
> > >> >
> > >> > Nak. Pin still retains its semantics with the gtt and only applies to the
> > >> > gtt.
> > >>
> > >>
> > >> Here is the error I have on pin. I was trying to debug it previously but
> > >> got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
> > >> make it work, but I never finished. Maybe you know offhand what I've
> > >> messed up, and the right way to fix it?
> > >>
> > >> gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.
> > >
> > > Ok, that is a condition that no longer holds with full ppgtt. Now
> > > fortunately, userspace that might depend upon that is limited to DRI1
> > > era machines (at least in the userspace I know about) and we can just
> > > update the test to understand that pinning and exec are two different
> > > address spaces.
> > >
> > > How do you handle EXEC_OBJECT_NEEDS_GTT? As that may be an acceptable
> > > w/a. Or just skip that portion of the test if PARAM_HAS_FULL_PPGTT. Soft
> > > pinning should be tested separately (so that it isn't confused with
> > > pinning to the ggtt), but that is also a viable solution to this portion
> > > of the test.
> > 
> > NEEDS_GTT is only valid where we alias the global GTT and PPGTT (i.e.
> > snb). So I think we should reject it on other platforms (or silently
> > ignore it if userspace uses it already).
> 
> What do you recommend as the resolution to the failing gem_pin then?

IIRC that was a test I asked for since Chris wanted to use pin/unpin to
work around the i830/i845 cs tlb bug in a better way, just to exercise the
basics. I think the right option would be to reject pin on gen6+ since
those platforms are kms-only and not one of the few kms platforms where we
(ab)use pinning. Also all the earlier platforms don't have any (useful) hw
ppgtt implementation. We don't need pinning on gen4/5 iirc, but still
allowing it there increases test coverage with igt/gem_pin - now that we
have a test we better make use of it.
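
Something like the following in i915_gem_pin_ioctl() would do it
(sketch only; the exact cutoff is the assumption here):

	/* Pinning via ioctl is only (ab)used on old, pre-ppgtt
	 * platforms; reject it everywhere else. */
	if (INTEL_INFO(dev)->gen >= 6)
		return -ENODEV;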

> > Now since we've had a few funny bugs in this area already which proved
> > to be rather hard to track down I think it's time to implement the
> > relevant igt tests. The (currently only internal) i-g-t wiki has some
> > information about what exactly blows up and what I think should be
> > tested. I think it'd be a good requirement to block the real ppgtt
> > enabling (not all the vma prep patches) until that test is ready.
> > 
> 
> I'll take a look on Monday and try to start on it. I think if the
> test case is reasonable, then blocking the merge is fair, but I have to
> see what's in store before I decide whether or not to argue.
> 
> > On that topic: Do we have other gaps in our testing, or is the current
> > igt coverage sufficient for this massive refactoring? Ben, has
> > anything blown up while you've developed these patches which was not
> > caught by i-g-t?
> 
> Other than the one bug I haven't yet tracked down which I mentioned in
> 0000-cover-letter (I've never hit it in many IGT runs), I had one
> reproducible bug which was really hard to resolve. It passed all of the
> IGT tests, and caused some weird display corruption in UXA (it was the
> screenshot I posted, if you recall). It ran fine in SNA for a while, and
> then would hang. That wasn't even full PPGTT, it was just after the
> refactor. The root cause was some screwed up unbind logic where I ended
> up with a bogus unbind offset. In retrospect, I'm not really sure why
> the issue wasn't either more severe, or less, and also why IGT didn't hit
> it. It was the kind of bug which I don't feel is worth testing.

Hm, I'm intrigued by this one, but also confused. Can you please
elaborate on what exactly blows up? "unbind offset" is a new concept
to me ...

> Since address spaces are per fd, any test which opens multiple fds
> and executes testable batches is a good test. I'm not really sure how
> many of them we have offhand, but we have at least a few. I think piglit
> on a composited desktop is one of the best focus tests we can run.

Yeah, switching between contexts and ppgtt is probably best done with real
workloads. Worth checking though might be a very basic test with
- 2 fds (so 2 address spaces)
- a bunch of equally-sized objects
- using them in inverse order on the two fds in the first batch (or any
  other trick to make sure that the two address spaces have completely
  different bindings)
- blitting a bit of data around, alternating between the two fds

Just to have a basic check for ppgtt switching. This test can probably
be derived quickly from one of the existing "copy stuff around" tests;
a skeleton is sketched below.
drmtest helpers should also be useful. Even libdrm should keep on working
if we set up two libdrm bufmgr instances.
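
Roughly like this (drm_open_any() and gem_create() stand in for the
usual drmtest helpers; blit_copy(fd, dst, src) is a hypothetical
wrapper that submits a blit batch and is where the real test logic
would live):

	#define NUM_OBJECTS 16
	#define OBJECT_SIZE (1024 * 1024)

	static void check_ppgtt_switching(void)
	{
		int fd1 = drm_open_any();
		int fd2 = drm_open_any();
		uint32_t bo1[NUM_OBJECTS], bo2[NUM_OBJECTS];
		int i;

		for (i = 0; i < NUM_OBJECTS; i++) {
			bo1[i] = gem_create(fd1, OBJECT_SIZE);
			bo2[i] = gem_create(fd2, OBJECT_SIZE);
		}

		/* First batch on fd2 uses the objects in inverse order so
		 * the two address spaces end up with completely different
		 * bindings for equivalent buffers. */
		for (i = 0; i < NUM_OBJECTS / 2; i++)
			blit_copy(fd2, bo2[NUM_OBJECTS - 1 - i], bo2[i]);

		/* Alternate between the two fds so every batch forces a
		 * ppgtt switch, then verify the copied contents. */
		for (i = 0; i < NUM_OBJECTS - 1; i++) {
			blit_copy(fd1, bo1[i + 1], bo1[i]);
			blit_copy(fd2, bo2[i + 1], bo2[i]);
		}
	}
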
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-30 11:06             ` Daniel Vetter
@ 2013-06-30 11:31               ` Chris Wilson
  2013-06-30 11:36                 ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Chris Wilson @ 2013-06-30 11:31 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Ben Widawsky, Intel GFX

On Sun, Jun 30, 2013 at 01:06:47PM +0200, Daniel Vetter wrote:
> On Sat, Jun 29, 2013 at 11:56:53PM -0700, Ben Widawsky wrote:
> > On Sat, Jun 29, 2013 at 04:34:07PM +0200, Daniel Vetter wrote:
> > > On Sat, Jun 29, 2013 at 8:44 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > > > On Fri, Jun 28, 2013 at 10:43:30PM -0700, Ben Widawsky wrote:
> > > >> On Fri, Jun 28, 2013 at 09:55:27AM +0100, Chris Wilson wrote:
> > > >> > On Thu, Jun 27, 2013 at 04:30:58PM -0700, Ben Widawsky wrote:
> > > >> > > Pin doesn't fit with PPGTT since the interface doesn't allow for the
> > > >> > > context for which we want to pin.
> > > >> >
> > > >> > Nak. Pin still retains its semantics with the gtt and only applies to the
> > > >> > gtt.
> > > >>
> > > >>
> > > >> Here is the error I have on pin. I was trying to debug it previously but
> > > >> got sidetracked. I thought some combo of EXEC_GTT flag and hacks would
> > > >> make it work, but I never finished. Maybe you know offhand what I've
> > > >> messed up, and the right way to fix it?
> > > >>
> > > >> gem_pin: gem_pin.c:84: exec: Assertion `gem_exec[0].offset == offset' failed.
> > > >
> > > > Ok, that is a condition that no longer holds with full ppgtt. Now
> > > > fortunately, userspace that might depend upon that is limited to DRI1
> > > > era machines (at least in the userspace I know about) and we can just
> > > > update the test to understand that pinning and exec are two different
> > > > address spaces.
> > > >
> > > > How do you handle EXEC_OBJECT_NEEDS_GTT? As that may be an acceptable
> > > > w/a. Or just skip that portion of the test if PARAM_HAS_FULL_PPGTT. Soft
> > > > pinning should be tested separately (so that it isn't confused with
> > > > pinning to the ggtt), but that is also a viable solution to this portion
> > > > of the test.
> > > 
> > > NEEDS_GTT is only valid where we alias the global GTT and PPGTT (i.e.
> > > snb). So I think we should reject it on other platforms (or silently
> > > ignore it if userspace uses it already).
> > 
> > What do you recommend as the resolution to the failing gem_pin then?
> 
> IIRC that was a test I asked for since Chris wanted to use pin/unpin to
> work around the i830/i845 cs tlb bug in a better way, just to exercise the
> basics. I think the right option would be to reject pin on gen6+ since
> those platforms are kms-only and not one of the few kms platforms where we
> (ab)use pinning. Also all the earlier platforms don't have any (useful) hw
> ppgtt implementation. We don't need pinning on gen4/5 iirc, but still
> allowing it there increases test coverage with igt/gem_pin - now that we
> have a test we better make use of it.

I respectfully disagree. The semantics of the pin ioctl remain useful
even with the ggtt/ppgtt split, and I think barring its use forever more
is unwise. Not that pinning is a good solution, just in some cases it
may be the only solution. (It has proven useful in the past, it is
likely to do so again.) All that we need to do is note that the offset
returned by pin is ggtt and the offsets used by execbuffer are ppgtt. So
keep pin-ioctl and fix the test not to assume that pin.offset is
meaningful with execbuffer after HAS_FULL_PPGTT.
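
Something along these lines in gem_pin.c would keep both cases covered
(pin() and has_full_ppgtt() are placeholders for whatever helpers the
test grows):

	offset = pin(fd, handle);	/* ggtt offset */
	exec(fd, handle);		/* execbuffer, i.e. ppgtt offset */

	/* Only with a single, shared address space do the pinned and
	 * execbuffer offsets have to agree. */
	if (!has_full_ppgtt(fd))
		assert(gem_exec[0].offset == offset);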
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-30 11:31               ` Chris Wilson
@ 2013-06-30 11:36                 ` Daniel Vetter
  2013-07-01 18:27                   ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 11:36 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, Ben Widawsky, Intel GFX

On Sun, Jun 30, 2013 at 1:31 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> I respectfully disagree. The semantics of the pin ioctl remain useful
> even with the ggtt/ppgtt split, and I think barring its use forever more
> is unwise. Not that pinning is a good solution, just in some cases it
> may be the only solution. (It has proven useful in the past, it is
> likely to do so again.) All that we need to do is note that the offset
> returned by pin is ggtt and the offsets used by execbuffer are ppgtt. So
> keep pin-ioctl and fix the test not to assume that pin.offset is
> meaningful with execbuffer after HAS_FULL_PPGTT.

I was eyeing for the most minimal fix, but this is ok with me, too.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom
  2013-06-27 23:30 ` [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom Ben Widawsky
@ 2013-06-30 12:30   ` Daniel Vetter
  2013-06-30 12:40     ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 12:30 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, DRI Development

On Thu, Jun 27, 2013 at 04:30:05PM -0700, Ben Widawsky wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Clients like i915 need to segregate cache domains within the GTT which
> can lead to small amounts of fragmentation. By allocating the uncached
> buffers from the bottom and the cacheable buffers from the top, we can
> reduce the amount of wasted space and also optimize allocation of the
> mappable portion of the GTT to only those buffers that require CPU
> access through the GTT.
> 
> v2 by Ben:
> Update callers in i915_gem_object_bind_to_gtt()
> Turn search flags and allocation flags into separate enums
> Make checkpatch happy where logical/easy
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Since this is a core drm patch it must be cc'ed to dri-devel (and acked by
Dave) before I can merge it. Can you please resend?
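
For reference, callers then pick the placement like this (sketch;
matches the signatures in the hunks below):

	/* Default bottom-up placement. */
	ret = drm_mm_insert_node_in_range_generic(mm, node, size, alignment,
						  color, start, end,
						  DRM_MM_CREATE_DEFAULT,
						  DRM_MM_SEARCH_DEFAULT);

	/* Or top-down, via the convenience macro which expands to
	 * DRM_MM_CREATE_TOP, DRM_MM_SEARCH_BELOW. */
	ret = drm_mm_insert_node_in_range_generic(mm, node, size, alignment,
						  color, start, end,
						  DRM_MM_TOPDOWN);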
-Daniel

> ---
>  drivers/gpu/drm/drm_mm.c        | 122 ++++++++++++++++++---------------
>  drivers/gpu/drm/i915/i915_gem.c |   4 +-
>  include/drm/drm_mm.h            | 148 ++++++++++++++++++++++++----------------
>  3 files changed, 161 insertions(+), 113 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> index 07cf99c..7095328 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -49,7 +49,7 @@
>  
>  #define MM_UNUSED_TARGET 4
>  
> -static struct drm_mm_node *drm_mm_kmalloc(struct drm_mm *mm, int atomic)
> +static struct drm_mm_node *drm_mm_kmalloc(struct drm_mm *mm, bool atomic)
>  {
>  	struct drm_mm_node *child;
>  
> @@ -105,7 +105,8 @@ EXPORT_SYMBOL(drm_mm_pre_get);
>  static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>  				 struct drm_mm_node *node,
>  				 unsigned long size, unsigned alignment,
> -				 unsigned long color)
> +				 unsigned long color,
> +				 enum drm_mm_allocator_flags flags)
>  {
>  	struct drm_mm *mm = hole_node->mm;
>  	unsigned long hole_start = drm_mm_hole_node_start(hole_node);
> @@ -118,12 +119,22 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>  	if (mm->color_adjust)
>  		mm->color_adjust(hole_node, color, &adj_start, &adj_end);
>  
> +	if (flags & DRM_MM_CREATE_TOP)
> +		adj_start = adj_end - size;
> +
>  	if (alignment) {
>  		unsigned tmp = adj_start % alignment;
> -		if (tmp)
> -			adj_start += alignment - tmp;
> +		if (tmp) {
> +			if (flags & DRM_MM_CREATE_TOP)
> +				adj_start -= tmp;
> +			else
> +				adj_start += alignment - tmp;
> +		}
>  	}
>  
> +	BUG_ON(adj_start < hole_start);
> +	BUG_ON(adj_end > hole_end);
> +
>  	if (adj_start == hole_start) {
>  		hole_node->hole_follows = 0;
>  		list_del(&hole_node->hole_stack);
> @@ -150,7 +161,7 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>  struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
>  					unsigned long start,
>  					unsigned long size,
> -					bool atomic)
> +					enum drm_mm_allocator_flags flags)
>  {
>  	struct drm_mm_node *hole, *node;
>  	unsigned long end = start + size;
> @@ -161,7 +172,7 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
>  		if (hole_start > start || hole_end < end)
>  			continue;
>  
> -		node = drm_mm_kmalloc(mm, atomic);
> +		node = drm_mm_kmalloc(mm, flags & DRM_MM_CREATE_ATOMIC);
>  		if (unlikely(node == NULL))
>  			return NULL;
>  
> @@ -196,15 +207,15 @@ struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
>  					     unsigned long size,
>  					     unsigned alignment,
>  					     unsigned long color,
> -					     int atomic)
> +					     enum drm_mm_allocator_flags flags)
>  {
>  	struct drm_mm_node *node;
>  
> -	node = drm_mm_kmalloc(hole_node->mm, atomic);
> +	node = drm_mm_kmalloc(hole_node->mm, flags & DRM_MM_CREATE_ATOMIC);
>  	if (unlikely(node == NULL))
>  		return NULL;
>  
> -	drm_mm_insert_helper(hole_node, node, size, alignment, color);
> +	drm_mm_insert_helper(hole_node, node, size, alignment, color, flags);
>  
>  	return node;
>  }
> @@ -217,32 +228,28 @@ EXPORT_SYMBOL(drm_mm_get_block_generic);
>   */
>  int drm_mm_insert_node_generic(struct drm_mm *mm, struct drm_mm_node *node,
>  			       unsigned long size, unsigned alignment,
> -			       unsigned long color)
> +			       unsigned long color,
> +			       enum drm_mm_allocator_flags aflags,
> +			       enum drm_mm_search_flags sflags)
>  {
>  	struct drm_mm_node *hole_node;
>  
>  	hole_node = drm_mm_search_free_generic(mm, size, alignment,
> -					       color, 0);
> +					       color, sflags);
>  	if (!hole_node)
>  		return -ENOSPC;
>  
> -	drm_mm_insert_helper(hole_node, node, size, alignment, color);
> +	drm_mm_insert_helper(hole_node, node, size, alignment, color, aflags);
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_mm_insert_node_generic);
>  
> -int drm_mm_insert_node(struct drm_mm *mm, struct drm_mm_node *node,
> -		       unsigned long size, unsigned alignment)
> -{
> -	return drm_mm_insert_node_generic(mm, node, size, alignment, 0);
> -}
> -EXPORT_SYMBOL(drm_mm_insert_node);
> -
>  static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
>  				       struct drm_mm_node *node,
>  				       unsigned long size, unsigned alignment,
>  				       unsigned long color,
> -				       unsigned long start, unsigned long end)
> +				       unsigned long start, unsigned long end,
> +				       enum drm_mm_search_flags flags)
>  {
>  	struct drm_mm *mm = hole_node->mm;
>  	unsigned long hole_start = drm_mm_hole_node_start(hole_node);
> @@ -257,13 +264,20 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
>  	if (adj_end > end)
>  		adj_end = end;
>  
> +	if (flags & DRM_MM_CREATE_TOP)
> +		adj_start = adj_end - size;
> +
>  	if (mm->color_adjust)
>  		mm->color_adjust(hole_node, color, &adj_start, &adj_end);
>  
>  	if (alignment) {
>  		unsigned tmp = adj_start % alignment;
> -		if (tmp)
> -			adj_start += alignment - tmp;
> +		if (tmp) {
> +			if (flags & DRM_MM_CREATE_TOP)
> +				adj_start -= tmp;
> +			else
> +				adj_start += alignment - tmp;
> +		}
>  	}
>  
>  	if (adj_start == hole_start) {
> @@ -280,6 +294,8 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
>  	INIT_LIST_HEAD(&node->hole_stack);
>  	list_add(&node->node_list, &hole_node->node_list);
>  
> +	BUG_ON(node->start < start);
> +	BUG_ON(node->start < adj_start);
>  	BUG_ON(node->start + node->size > adj_end);
>  	BUG_ON(node->start + node->size > end);
>  
> @@ -290,22 +306,23 @@ static void drm_mm_insert_helper_range(struct drm_mm_node *hole_node,
>  	}
>  }
>  
> -struct drm_mm_node *drm_mm_get_block_range_generic(struct drm_mm_node *hole_node,
> -						unsigned long size,
> -						unsigned alignment,
> -						unsigned long color,
> -						unsigned long start,
> -						unsigned long end,
> -						int atomic)
> +struct drm_mm_node *
> +drm_mm_get_block_range_generic(struct drm_mm_node *hole_node,
> +			       unsigned long size,
> +			       unsigned alignment,
> +			       unsigned long color,
> +			       unsigned long start,
> +			       unsigned long end,
> +			       enum drm_mm_allocator_flags flags)
>  {
>  	struct drm_mm_node *node;
>  
> -	node = drm_mm_kmalloc(hole_node->mm, atomic);
> +	node = drm_mm_kmalloc(hole_node->mm, flags & DRM_MM_CREATE_ATOMIC);
>  	if (unlikely(node == NULL))
>  		return NULL;
>  
>  	drm_mm_insert_helper_range(hole_node, node, size, alignment, color,
> -				   start, end);
> +				   start, end, flags);
>  
>  	return node;
>  }
> @@ -318,31 +335,25 @@ EXPORT_SYMBOL(drm_mm_get_block_range_generic);
>   */
>  int drm_mm_insert_node_in_range_generic(struct drm_mm *mm, struct drm_mm_node *node,
>  					unsigned long size, unsigned alignment, unsigned long color,
> -					unsigned long start, unsigned long end)
> +					unsigned long start, unsigned long end,
> +					enum drm_mm_allocator_flags aflags,
> +					enum drm_mm_search_flags sflags)
>  {
>  	struct drm_mm_node *hole_node;
>  
>  	hole_node = drm_mm_search_free_in_range_generic(mm,
>  							size, alignment, color,
> -							start, end, 0);
> +							start, end, sflags);
>  	if (!hole_node)
>  		return -ENOSPC;
>  
>  	drm_mm_insert_helper_range(hole_node, node,
>  				   size, alignment, color,
> -				   start, end);
> +				   start, end, aflags);
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_mm_insert_node_in_range_generic);
>  
> -int drm_mm_insert_node_in_range(struct drm_mm *mm, struct drm_mm_node *node,
> -				unsigned long size, unsigned alignment,
> -				unsigned long start, unsigned long end)
> -{
> -	return drm_mm_insert_node_in_range_generic(mm, node, size, alignment, 0, start, end);
> -}
> -EXPORT_SYMBOL(drm_mm_insert_node_in_range);
> -
>  /**
>   * Remove a memory node from the allocator.
>   */
> @@ -418,7 +429,7 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
>  					       unsigned long size,
>  					       unsigned alignment,
>  					       unsigned long color,
> -					       bool best_match)
> +					       enum drm_mm_search_flags flags)
>  {
>  	struct drm_mm_node *entry;
>  	struct drm_mm_node *best;
> @@ -431,7 +442,8 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
>  	best = NULL;
>  	best_size = ~0UL;
>  
> -	drm_mm_for_each_hole(entry, mm, adj_start, adj_end) {
> +	__drm_mm_for_each_hole(entry, mm, adj_start, adj_end,
> +			       flags & DRM_MM_SEARCH_BELOW) {
>  		if (mm->color_adjust) {
>  			mm->color_adjust(entry, color, &adj_start, &adj_end);
>  			if (adj_end <= adj_start)
> @@ -441,7 +453,7 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
>  		if (!check_free_hole(adj_start, adj_end, size, alignment))
>  			continue;
>  
> -		if (!best_match)
> +		if ((flags & DRM_MM_SEARCH_BEST) == 0)
>  			return entry;
>  
>  		if (entry->size < best_size) {
> @@ -454,13 +466,14 @@ struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
>  }
>  EXPORT_SYMBOL(drm_mm_search_free_generic);
>  
> -struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
> -							unsigned long size,
> -							unsigned alignment,
> -							unsigned long color,
> -							unsigned long start,
> -							unsigned long end,
> -							bool best_match)
> +struct drm_mm_node *
> +drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
> +				    unsigned long size,
> +				    unsigned alignment,
> +				    unsigned long color,
> +				    unsigned long start,
> +				    unsigned long end,
> +				    enum drm_mm_search_flags flags)
>  {
>  	struct drm_mm_node *entry;
>  	struct drm_mm_node *best;
> @@ -473,7 +486,8 @@ struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
>  	best = NULL;
>  	best_size = ~0UL;
>  
> -	drm_mm_for_each_hole(entry, mm, adj_start, adj_end) {
> +	__drm_mm_for_each_hole(entry, mm, adj_start, adj_end,
> +			       flags & DRM_MM_SEARCH_BELOW) {
>  		if (adj_start < start)
>  			adj_start = start;
>  		if (adj_end > end)
> @@ -488,7 +502,7 @@ struct drm_mm_node *drm_mm_search_free_in_range_generic(const struct drm_mm *mm,
>  		if (!check_free_hole(adj_start, adj_end, size, alignment))
>  			continue;
>  
> -		if (!best_match)
> +		if ((flags & DRM_MM_SEARCH_BEST) == 0)
>  			return entry;
>  
>  		if (entry->size < best_size) {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bbc3beb..6806bb9 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3131,7 +3131,9 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  search_free:
>  	ret = drm_mm_insert_node_in_range_generic(&dev_priv->mm.gtt_space, node,
>  						  size, alignment,
> -						  obj->cache_level, 0, gtt_max);
> +						  obj->cache_level, 0, gtt_max,
> +						  DRM_MM_CREATE_DEFAULT,
> +						  DRM_MM_SEARCH_DEFAULT);
>  	if (ret) {
>  		ret = i915_gem_evict_something(dev, size, alignment,
>  					       obj->cache_level,
> diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
> index 88591ef..8935710 100644
> --- a/include/drm/drm_mm.h
> +++ b/include/drm/drm_mm.h
> @@ -41,6 +41,21 @@
>  #include <linux/seq_file.h>
>  #endif
>  
> +enum drm_mm_allocator_flags {
> +	DRM_MM_CREATE_DEFAULT = 0,
> +	DRM_MM_CREATE_ATOMIC = 1<<0,
> +	DRM_MM_CREATE_TOP = 1<<1,
> +};
> +
> +enum drm_mm_search_flags {
> +	DRM_MM_SEARCH_DEFAULT = 0,
> +	DRM_MM_SEARCH_BEST = 1<<0,
> +	DRM_MM_SEARCH_BELOW = 1<<1,
> +};
> +
> +#define DRM_MM_BOTTOMUP DRM_MM_CREATE_DEFAULT, DRM_MM_SEARCH_DEFAULT
> +#define DRM_MM_TOPDOWN DRM_MM_CREATE_TOP, DRM_MM_SEARCH_BELOW
> +
>  struct drm_mm_node {
>  	struct list_head node_list;
>  	struct list_head hole_stack;
> @@ -135,26 +150,37 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
>  	     1 : 0; \
>  	     entry = list_entry(entry->hole_stack.next, struct drm_mm_node, hole_stack))
>  
> +#define __drm_mm_for_each_hole(entry, mm, hole_start, hole_end, backwards) \
> +	for (entry = list_entry((backwards) ? (mm)->hole_stack.prev : (mm)->hole_stack.next, struct drm_mm_node, hole_stack); \
> +	     &entry->hole_stack != &(mm)->hole_stack ? \
> +	     hole_start = drm_mm_hole_node_start(entry), \
> +	     hole_end = drm_mm_hole_node_end(entry), \
> +	     1 : 0; \
> +	     entry = list_entry((backwards) ? entry->hole_stack.prev : entry->hole_stack.next, struct drm_mm_node, hole_stack))
> +
>  /*
>   * Basic range manager support (drm_mm.c)
>   */
> -extern struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
> -					       unsigned long start,
> -					       unsigned long size,
> -					       bool atomic);
> -extern struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *node,
> -						    unsigned long size,
> -						    unsigned alignment,
> -						    unsigned long color,
> -						    int atomic);
> -extern struct drm_mm_node *drm_mm_get_block_range_generic(
> -						struct drm_mm_node *node,
> -						unsigned long size,
> -						unsigned alignment,
> -						unsigned long color,
> -						unsigned long start,
> -						unsigned long end,
> -						int atomic);
> +extern struct drm_mm_node *
> +drm_mm_create_block(struct drm_mm *mm,
> +		    unsigned long start,
> +		    unsigned long size,
> +		    enum drm_mm_allocator_flags flags);
> +extern struct drm_mm_node *
> +drm_mm_get_block_generic(struct drm_mm_node *node,
> +			 unsigned long size,
> +			 unsigned alignment,
> +			 unsigned long color,
> +			 enum drm_mm_allocator_flags flags);
> +extern struct drm_mm_node *
> +drm_mm_get_block_range_generic(struct drm_mm_node *node,
> +			       unsigned long size,
> +			       unsigned alignment,
> +			       unsigned long color,
> +			       unsigned long start,
> +			       unsigned long end,
> +			       enum drm_mm_allocator_flags flags);
> +
>  static inline struct drm_mm_node *drm_mm_get_block(struct drm_mm_node *parent,
>  						   unsigned long size,
>  						   unsigned alignment)
> @@ -165,7 +191,8 @@ static inline struct drm_mm_node *drm_mm_get_block_atomic(struct drm_mm_node *pa
>  							  unsigned long size,
>  							  unsigned alignment)
>  {
> -	return drm_mm_get_block_generic(parent, size, alignment, 0, 1);
> +	return drm_mm_get_block_generic(parent, size, alignment, 0,
> +					DRM_MM_CREATE_ATOMIC);
>  }
>  static inline struct drm_mm_node *drm_mm_get_block_range(
>  						struct drm_mm_node *parent,
> @@ -196,39 +223,41 @@ static inline struct drm_mm_node *drm_mm_get_block_atomic_range(
>  						unsigned long end)
>  {
>  	return drm_mm_get_block_range_generic(parent, size, alignment, 0,
> -						start, end, 1);
> +						start, end,
> +						DRM_MM_CREATE_ATOMIC);
>  }
>  
> -extern int drm_mm_insert_node(struct drm_mm *mm,
> -			      struct drm_mm_node *node,
> -			      unsigned long size,
> -			      unsigned alignment);
> -extern int drm_mm_insert_node_in_range(struct drm_mm *mm,
> -				       struct drm_mm_node *node,
> -				       unsigned long size,
> -				       unsigned alignment,
> -				       unsigned long start,
> -				       unsigned long end);
>  extern int drm_mm_insert_node_generic(struct drm_mm *mm,
>  				      struct drm_mm_node *node,
>  				      unsigned long size,
>  				      unsigned alignment,
> -				      unsigned long color);
> -extern int drm_mm_insert_node_in_range_generic(struct drm_mm *mm,
> -				       struct drm_mm_node *node,
> -				       unsigned long size,
> -				       unsigned alignment,
> -				       unsigned long color,
> -				       unsigned long start,
> -				       unsigned long end);
> +				      unsigned long color,
> +				      enum drm_mm_allocator_flags aflags,
> +				      enum drm_mm_search_flags sflags);
> +#define drm_mm_insert_node(mm, node, size, alignment) \
> +	drm_mm_insert_node_generic(mm, node, size, alignment, 0, 0)
> +extern int
> +drm_mm_insert_node_in_range_generic(struct drm_mm *mm,
> +				    struct drm_mm_node *node,
> +				    unsigned long size,
> +				    unsigned alignment,
> +				    unsigned long color,
> +				    unsigned long start,
> +				    unsigned long end,
> +				    enum drm_mm_allocator_flags aflags,
> +				    enum drm_mm_search_flags sflags);
> +#define drm_mm_insert_node_in_range(mm, node, size, alignment, start, end) \
> +	drm_mm_insert_node_in_range_generic(mm, node, size, alignment, 0, start, end, 0)
>  extern void drm_mm_put_block(struct drm_mm_node *cur);
>  extern void drm_mm_remove_node(struct drm_mm_node *node);
>  extern void drm_mm_replace_node(struct drm_mm_node *old, struct drm_mm_node *new);
> -extern struct drm_mm_node *drm_mm_search_free_generic(const struct drm_mm *mm,
> -						      unsigned long size,
> -						      unsigned alignment,
> -						      unsigned long color,
> -						      bool best_match);
> +
> +extern struct drm_mm_node *
> +drm_mm_search_free_generic(const struct drm_mm *mm,
> +			   unsigned long size,
> +			   unsigned alignment,
> +			   unsigned long color,
> +			   enum drm_mm_search_flags flags);
>  extern struct drm_mm_node *drm_mm_search_free_in_range_generic(
>  						const struct drm_mm *mm,
>  						unsigned long size,
> @@ -236,13 +265,15 @@ extern struct drm_mm_node *drm_mm_search_free_in_range_generic(
>  						unsigned long color,
>  						unsigned long start,
>  						unsigned long end,
> -						bool best_match);
> -static inline struct drm_mm_node *drm_mm_search_free(const struct drm_mm *mm,
> -						     unsigned long size,
> -						     unsigned alignment,
> -						     bool best_match)
> +						enum drm_mm_search_flags flags);
> +
> +static inline struct drm_mm_node *
> +drm_mm_search_free(const struct drm_mm *mm,
> +		   unsigned long size,
> +		   unsigned alignment,
> +		   enum drm_mm_search_flags flags)
>  {
> -	return drm_mm_search_free_generic(mm,size, alignment, 0, best_match);
> +	return drm_mm_search_free_generic(mm, size, alignment, 0, flags);
>  }
>  static inline  struct drm_mm_node *drm_mm_search_free_in_range(
>  						const struct drm_mm *mm,
> @@ -250,18 +281,19 @@ static inline  struct drm_mm_node *drm_mm_search_free_in_range(
>  						unsigned alignment,
>  						unsigned long start,
>  						unsigned long end,
> -						bool best_match)
> +						enum drm_mm_search_flags flags)
>  {
>  	return drm_mm_search_free_in_range_generic(mm, size, alignment, 0,
> -						   start, end, best_match);
> +						   start, end, flags);
>  }
> -static inline struct drm_mm_node *drm_mm_search_free_color(const struct drm_mm *mm,
> -							   unsigned long size,
> -							   unsigned alignment,
> -							   unsigned long color,
> -							   bool best_match)
> +static inline struct drm_mm_node *
> +drm_mm_search_free_color(const struct drm_mm *mm,
> +			 unsigned long size,
> +			 unsigned alignment,
> +			 unsigned long color,
> +			 enum drm_mm_search_flags flags)
>  {
> -	return drm_mm_search_free_generic(mm,size, alignment, color, best_match);
> +	return drm_mm_search_free_generic(mm, size, alignment, color, flags);
>  }
>  static inline  struct drm_mm_node *drm_mm_search_free_in_range_color(
>  						const struct drm_mm *mm,
> @@ -270,10 +302,10 @@ static inline  struct drm_mm_node *drm_mm_search_free_in_range_color(
>  						unsigned long color,
>  						unsigned long start,
>  						unsigned long end,
> -						bool best_match)
> +						enum drm_mm_search_flags flags)
>  {
>  	return drm_mm_search_free_in_range_generic(mm, size, alignment, color,
> -						   start, end, best_match);
> +						   start, end, flags);
>  }
>  extern int drm_mm_init(struct drm_mm *mm,
>  		       unsigned long start,
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 29/66] drm: pre allocate node for create_block
  2013-06-27 23:30 ` [PATCH 29/66] drm: pre allocate node for create_block Ben Widawsky
@ 2013-06-30 12:34   ` Daniel Vetter
  2013-07-01 18:30     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 12:34 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:30PM -0700, Ben Widawsky wrote:
> For an upcoming patch where we introduce the i915 VMA, it's ideal to
> have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
> Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
> will break a bunch of code, but amongst them are 2 callers of
> drm_mm_create_block(), both related to stolen memory.
> 
> As a side note, this patch is able to leverage all the existing
> drm_mm_put_block calls because the node is still kzalloc'd. When the
> aforementioned VMA code comes into play, that too has to change.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Same here about cc'ing dri-devel. Furthermore I think it'd be nice to kill
the interfaces from drm_mm.c which allocate the drm_mm_node themselves.
The last user outside of drm/i915 is ttm, killing that one would also
allow us to remove the (racy) preallocation madness.

So if you convert over all of drm/i915 to the preallocate functions which
pass in the drm_mm_node, I'll volunteer myself to fix up ttm.
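
The end state would look something like this (sketch; the i915_vma
layout is an assumption based on the cover letter, not this patch):

	struct i915_vma {
		struct drm_mm_node node;	/* embedded, no kzalloc */
		struct drm_i915_gem_object *obj;
		struct i915_address_space *vm;
	};

	/* At bind time the node already exists, so callers just do: */
	ret = drm_mm_create_block(&vm->mm, &vma->node,
				  gtt_offset, obj->base.size);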
-Daniel

> ---
>  drivers/gpu/drm/drm_mm.c               | 16 +++++-----------
>  drivers/gpu/drm/i915/i915_drv.h        |  2 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c    | 20 ++++++++++++++-----
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 35 +++++++++++++++++++++++-----------
>  include/drm/drm_mm.h                   |  9 ++++-----
>  5 files changed, 49 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
> index 7095328..a2dcfdb 100644
> --- a/drivers/gpu/drm/drm_mm.c
> +++ b/drivers/gpu/drm/drm_mm.c
> @@ -158,12 +158,10 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
>  	}
>  }
>  
> -struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
> -					unsigned long start,
> -					unsigned long size,
> -					enum drm_mm_allocator_flags flags)
> +int drm_mm_create_block(struct drm_mm *mm, struct drm_mm_node *node,
> +			unsigned long start, unsigned long size)
>  {
> -	struct drm_mm_node *hole, *node;
> +	struct drm_mm_node *hole;
>  	unsigned long end = start + size;
>  	unsigned long hole_start;
>  	unsigned long hole_end;
> @@ -172,10 +170,6 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
>  		if (hole_start > start || hole_end < end)
>  			continue;
>  
> -		node = drm_mm_kmalloc(mm, flags & DRM_MM_CREATE_ATOMIC);
> -		if (unlikely(node == NULL))
> -			return NULL;
> -
>  		node->start = start;
>  		node->size = size;
>  		node->mm = mm;
> @@ -195,11 +189,11 @@ struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
>  			node->hole_follows = 1;
>  		}
>  
> -		return node;
> +		return 0;
>  	}
>  
>  	WARN(1, "no hole found for block 0x%lx + 0x%lx\n", start, size);
> -	return NULL;
> +	return -ENOSPC;
>  }
>  EXPORT_SYMBOL(drm_mm_create_block);
>  
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f6704d3..bc80ce0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1197,7 +1197,7 @@ enum hdmi_force_audio {
>  	HDMI_AUDIO_ON,			/* force turn on HDMI audio */
>  };
>  
> -#define I915_GTT_RESERVED ((struct drm_mm_node *)0x1)
> +#define I915_GTT_RESERVED 0x1
>  
>  struct drm_i915_gem_object_ops {
>  	/* Interface between the GEM object and its backing storage.
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index a45c00d..17e334f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -679,14 +679,24 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  
>  	/* Mark any preallocated objects as occupied */
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> +		uintptr_t gtt_offset = (uintptr_t)obj->gtt_space;
> +		int ret;
>  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
>  			      obj->gtt_space->start, obj->base.size);
>  
> -		BUG_ON(obj->gtt_space != I915_GTT_RESERVED);
> -		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
> -						     obj->gtt_space->start,
> -						     obj->base.size,
> -						     false);
> +		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
> +		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
> +		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
> +		if (!obj->gtt_space) {
> +			DRM_ERROR("Failed to preserve all objects\n");
> +			break;
> +		}
> +		ret = drm_mm_create_block(&i915_gtt_vm->mm,
> +					  obj->gtt_space,
> +					  gtt_offset,
> +					  obj->base.size);
> +		if (ret)
> +			DRM_DEBUG_KMS("Reservation failed\n");
>  		obj->has_global_gtt_mapping = 1;
>  	}
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 7fba6f5..925f3b1 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -330,6 +330,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_object *obj;
>  	struct drm_mm_node *stolen;
> +	int ret;
>  
>  	if (dev_priv->gtt.stolen_base == 0)
>  		return NULL;
> @@ -344,11 +345,15 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	if (WARN_ON(size == 0))
>  		return NULL;
>  
> -	stolen = drm_mm_create_block(&dev_priv->gtt.stolen,
> -				     stolen_offset, size,
> -				     false);
> -	if (stolen == NULL) {
> +	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
> +	if (!stolen)
> +		return NULL;
> +
> +	ret = drm_mm_create_block(&dev_priv->gtt.stolen, stolen, stolen_offset,
> +				  size);
> +	if (ret) {
>  		DRM_DEBUG_KMS("failed to allocate stolen space\n");
> +		kfree(stolen);
>  		return NULL;
>  	}
>  
> @@ -369,18 +374,26 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	 * later.
>  	 */
>  	if (drm_mm_initialized(&i915_gtt_vm->mm)) {
> -		obj->gtt_space = drm_mm_create_block(&i915_gtt_vm->mm,
> -						     gtt_offset, size,
> -						     false);
> -		if (obj->gtt_space == NULL) {
> +		obj->gtt_space = kzalloc(sizeof(*obj->gtt_space), GFP_KERNEL);
> +		if (!obj->gtt_space) {
> +			drm_gem_object_unreference(&obj->base);
> +			return NULL;
> +		}
> +		ret = drm_mm_create_block(&i915_gtt_vm->mm, obj->gtt_space,
> +					  gtt_offset, size);
> +		if (ret) {
>  			DRM_DEBUG_KMS("failed to allocate stolen GTT space\n");
>  			drm_gem_object_unreference(&obj->base);
> +			kfree(obj->gtt_space);
>  			return NULL;
>  		}
> -	} else
> -		obj->gtt_space = I915_GTT_RESERVED;
> +		obj->gtt_space->start = gtt_offset;
> +	} else {
> +		/* NB: Safe because we assert page alignment */
> +		obj->gtt_space = (struct drm_mm_node *)
> +			((uintptr_t)gtt_offset | I915_GTT_RESERVED);
> +	}
>  
> -	obj->gtt_space->start = gtt_offset;
>  	obj->has_global_gtt_mapping = 1;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
> index 8935710..0cfb06c 100644
> --- a/include/drm/drm_mm.h
> +++ b/include/drm/drm_mm.h
> @@ -161,11 +161,10 @@ static inline unsigned long drm_mm_hole_node_end(struct drm_mm_node *hole_node)
>  /*
>   * Basic range manager support (drm_mm.c)
>   */
> -extern struct drm_mm_node *
> -drm_mm_create_block(struct drm_mm *mm,
> -		    unsigned long start,
> -		    unsigned long size,
> -		    enum drm_mm_allocator_flags flags);
> +extern int drm_mm_create_block(struct drm_mm *mm,
> +			       struct drm_mm_node *node,
> +			       unsigned long start,
> +			       unsigned long size);
>  extern struct drm_mm_node *
>  drm_mm_get_block_generic(struct drm_mm_node *node,
>  			 unsigned long size,
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom
  2013-06-30 12:30   ` Daniel Vetter
@ 2013-06-30 12:40     ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 12:40 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX, DRI Development

On Sun, Jun 30, 2013 at 2:30 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Jun 27, 2013 at 04:30:05PM -0700, Ben Widawsky wrote:
>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>
>> Clients like i915 need to segregate cache domains within the GTT which
>> can lead to small amounts of fragmentation. By allocating the uncached
>> buffers from the bottom and the cacheable buffers from the top, we can
>> reduce the amount of wasted space and also optimize allocation of the
>> mappable portion of the GTT to only those buffers that require CPU
>> access through the GTT.
>>
>> v2 by Ben:
>> Update callers in i915_gem_object_bind_to_gtt()
>> Turn search flags and allocation flags into separate enums
>> Make checkpatch happy where logical/easy
>>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>
> Since this is a core drm patch it must be cc'ed to dri-devel (and acked by
> Dave) before I can merge it. Can you please resend?

And the same review as for Chris' original patch still applies:
best_match is unused (and it's better that way, really) so it can be
garbage collected.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-06-27 23:30 ` [PATCH 30/66] drm/i915: Getter/setter for object attributes Ben Widawsky
@ 2013-06-30 13:00   ` Daniel Vetter
  2013-07-01 18:32     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 13:00 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:31PM -0700, Ben Widawsky wrote:
> This will be handy when we add VMs. It's not strictly necessary, but it
> will make the code much cleaner.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

You're going to hate this, but it's a patch ordering fail. Imo this should be
one of the very first patches, at least before you kill obj->gtt_offset.

To increase your hatred some more, I have bikesheds on the names, too.

I think the best would be to respin this patch and merge it right away.
It'll cause tons of conflicts. But keeping it as no. 30 in this series
will be even worse, since merging the first 30 patches won't happen
instantly. So much more potential for rebase hell imo.

The MO for when you stumble over such a giant renaming operation should be
imo to submit the "add inline abstraction functions" patch(es) right away.
That way everyone else who potentially works in the same area also gets a
heads up.


> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bc80ce0..56d47bc 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1349,6 +1349,27 @@ struct drm_i915_gem_object {
>  
>  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
>  
> +static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
> +{
> +	return o->gtt_space->start;
> +}

To differentiate from the ppgtt offset I'd call this
i915_gem_obj_ggtt_offset.

> +
> +static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
> +{
> +	return o->gtt_space != NULL;
> +}

Same here, I think we want  ggtt inserted.

> +
> +static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
> +{
> +	return o->gtt_space->size;
> +}

This is even more misleading and the real reason I vote for all the ggtt
bikesheds: ggtt_size != obj->size is very much possible (on gen2/3 only
though). We use that to satisfy alignment/size constraints on tiled
objects. So the i915_gem_obj_ggtt_size rename is mandatory here.

> +
> +static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> +					  enum i915_cache_level color)
> +{
> +	o->gtt_space->color = color;
> +}

Ditto for consistency.
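
Pulling those together, the helpers would then read (sketch of the
suggested renames applied to the versions in this patch):

	static inline unsigned long
	i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
	{
		return o->gtt_space->start;
	}

	static inline bool
	i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
	{
		return o->gtt_space != NULL;
	}

	static inline unsigned long
	i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
	{
		return o->gtt_space->size;
	}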

Cheers, Daniel

> +
>  /**
>   * Request queue structure.
>   *
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index d747a1f..dd2228d 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -135,7 +135,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  static inline bool
>  i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
>  {
> -	return obj->gtt_space && !obj->active;
> +	return i915_gem_obj_bound(obj) && !obj->active;
>  }
>  
>  int
> @@ -178,7 +178,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>  	mutex_lock(&dev->struct_mutex);
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
>  		if (obj->pin_count)
> -			pinned += obj->gtt_space->size;
> +			pinned += i915_gem_obj_size(obj);
>  	mutex_unlock(&dev->struct_mutex);
>  
>  	args->aper_size = i915_gtt_vm->total;
> @@ -422,7 +422,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
>  		 * anyway again before the next pread happens. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush = 1;
> -		if (obj->gtt_space) {
> +		if (i915_gem_obj_bound(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, false);
>  			if (ret)
>  				return ret;
> @@ -609,7 +609,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
>  	user_data = to_user_ptr(args->data_ptr);
>  	remain = args->size;
>  
> -	offset = obj->gtt_space->start + args->offset;
> +	offset = i915_gem_obj_offset(obj) + args->offset;
>  
>  	while (remain > 0) {
>  		/* Operation in this page
> @@ -739,7 +739,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  		 * right away and we therefore have to clflush anyway. */
>  		if (obj->cache_level == I915_CACHE_NONE)
>  			needs_clflush_after = 1;
> -		if (obj->gtt_space) {
> +		if (i915_gem_obj_bound(obj)) {
>  			ret = i915_gem_object_set_to_gtt_domain(obj, true);
>  			if (ret)
>  				return ret;
> @@ -1361,7 +1361,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>  
>  	obj->fault_mappable = true;
>  
> -	pfn += (obj->gtt_space->start >> PAGE_SHIFT) + page_offset;
> +	pfn += (i915_gem_obj_offset(obj) >> PAGE_SHIFT) + page_offset;
>  
>  	/* Finally, remap it using the new GTT offset */
>  	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
> @@ -1667,7 +1667,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
>  	if (obj->pages == NULL)
>  		return 0;
>  
> -	BUG_ON(obj->gtt_space);
> +	BUG_ON(i915_gem_obj_bound(obj));
>  
>  	if (obj->pages_pin_count)
>  		return -EBUSY;
> @@ -2587,7 +2587,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	drm_i915_private_t *dev_priv = obj->base.dev->dev_private;
>  	int ret;
>  
> -	if (obj->gtt_space == NULL)
> +	if (!i915_gem_obj_bound(obj))
>  		return 0;
>  
>  	if (obj->pin_count)
> @@ -2669,11 +2669,11 @@ static void i965_write_fence_reg(struct drm_device *dev, int reg,
>  	}
>  
>  	if (obj) {
> -		u32 size = obj->gtt_space->size;
> +		u32 size = i915_gem_obj_size(obj);
>  
> -		val = (uint64_t)((obj->gtt_space->start + size - 4096) &
> +		val = (uint64_t)((i915_gem_obj_offset(obj) + size - 4096) &
>  				 0xfffff000) << 32;
> -		val |= obj->gtt_space->start & 0xfffff000;
> +		val |= i915_gem_obj_offset(obj) & 0xfffff000;
>  		val |= (uint64_t)((obj->stride / 128) - 1) << fence_pitch_shift;
>  		if (obj->tiling_mode == I915_TILING_Y)
>  			val |= 1 << I965_FENCE_TILING_Y_SHIFT;
> @@ -2693,15 +2693,15 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
>  	u32 val;
>  
>  	if (obj) {
> -		u32 size = obj->gtt_space->size;
> +		u32 size = i915_gem_obj_size(obj);
>  		int pitch_val;
>  		int tile_width;
>  
> -		WARN((obj->gtt_space->start & ~I915_FENCE_START_MASK) ||
> +		WARN((i915_gem_obj_offset(obj) & ~I915_FENCE_START_MASK) ||
>  		     (size & -size) != size ||
> -		     (obj->gtt_space->start & (size - 1)),
> +		     (i915_gem_obj_offset(obj) & (size - 1)),
>  		     "object 0x%08lx [fenceable? %d] not 1M or pot-size (0x%08x) aligned\n",
> -		     obj->gtt_space->start, obj->map_and_fenceable, size);
> +		     i915_gem_obj_offset(obj), obj->map_and_fenceable, size);
>  
>  		if (obj->tiling_mode == I915_TILING_Y && HAS_128_BYTE_Y_TILING(dev))
>  			tile_width = 128;
> @@ -2712,7 +2712,7 @@ static void i915_write_fence_reg(struct drm_device *dev, int reg,
>  		pitch_val = obj->stride / tile_width;
>  		pitch_val = ffs(pitch_val) - 1;
>  
> -		val = obj->gtt_space->start;
> +		val = i915_gem_obj_offset(obj);
>  		if (obj->tiling_mode == I915_TILING_Y)
>  			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
>  		val |= I915_FENCE_SIZE_BITS(size);
> @@ -2737,19 +2737,19 @@ static void i830_write_fence_reg(struct drm_device *dev, int reg,
>  	uint32_t val;
>  
>  	if (obj) {
> -		u32 size = obj->gtt_space->size;
> +		u32 size = i915_gem_obj_size(obj);
>  		uint32_t pitch_val;
>  
> -		WARN((obj->gtt_space->start & ~I830_FENCE_START_MASK) ||
> +		WARN((i915_gem_obj_offset(obj) & ~I830_FENCE_START_MASK) ||
>  		     (size & -size) != size ||
> -		     (obj->gtt_space->start & (size - 1)),
> +		     (i915_gem_obj_offset(obj) & (size - 1)),
>  		     "object 0x%08lx not 512K or pot-size 0x%08x aligned\n",
> -		     obj->gtt_space->start, size);
> +		     i915_gem_obj_offset(obj), size);
>  
>  		pitch_val = obj->stride / 128;
>  		pitch_val = ffs(pitch_val) - 1;
>  
> -		val = obj->gtt_space->start;
> +		val = i915_gem_obj_offset(obj);
>  		if (obj->tiling_mode == I915_TILING_Y)
>  			val |= 1 << I830_FENCE_TILING_Y_SHIFT;
>  		val |= I830_FENCE_SIZE_BITS(size);
> @@ -3030,6 +3030,7 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>  	int err = 0;
>  
>  	list_for_each_entry(obj, &dev_priv->mm.gtt_list, global_list) {
> +		unsigned long obj_offset = i915_gem_obj_offset(obj);
>  		if (obj->gtt_space == NULL) {
>  			printk(KERN_ERR "object found on GTT list with no space reserved\n");
>  			err++;
> @@ -3038,8 +3039,8 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>  
>  		if (obj->cache_level != obj->gtt_space->color) {
>  			printk(KERN_ERR "object reserved space [%08lx, %08lx] with wrong color, cache_level=%x, color=%lx\n",
> -			       obj->gtt_space->start,
> -			       obj->gtt_space->start + obj->gtt_space->size,
> +			       obj_offset,
> +			       obj_offset + i915_gem_obj_size(obj),
>  			       obj->cache_level,
>  			       obj->gtt_space->color);
>  			err++;
> @@ -3050,8 +3051,8 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
>  					      obj->gtt_space,
>  					      obj->cache_level)) {
>  			printk(KERN_ERR "invalid GTT space found at [%08lx, %08lx] - color=%x\n",
> -			       obj->gtt_space->start,
> -			       obj->gtt_space->start + obj->gtt_space->size,
> +			       obj_offset,
> +			       obj_offset + i915_gem_obj_size(obj),
>  			       obj->cache_level);
>  			err++;
>  			continue;
> @@ -3267,7 +3268,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  	int ret;
>  
>  	/* Not valid to be called on unbound objects. */
> -	if (obj->gtt_space == NULL)
> +	if (!i915_gem_obj_bound(obj))
>  		return -EINVAL;
>  
>  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
> @@ -3332,7 +3333,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  			return ret;
>  	}
>  
> -	if (obj->gtt_space) {
> +	if (i915_gem_obj_bound(obj)) {
>  		ret = i915_gem_object_finish_gpu(obj);
>  		if (ret)
>  			return ret;
> @@ -3355,7 +3356,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
>  					       obj, cache_level);
>  
> -		obj->gtt_space->color = cache_level;
> +		i915_gem_obj_set_color(obj, cache_level);
>  	}
>  
>  	if (cache_level == I915_CACHE_NONE) {
> @@ -3636,14 +3637,14 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  	if (WARN_ON(obj->pin_count == DRM_I915_GEM_OBJECT_MAX_PIN_COUNT))
>  		return -EBUSY;
>  
> -	if (obj->gtt_space != NULL) {
> -		if ((alignment && obj->gtt_space->start & (alignment - 1)) ||
> +	if (i915_gem_obj_bound(obj)) {
> +		if ((alignment && i915_gem_obj_offset(obj) & (alignment - 1)) ||
>  		    (map_and_fenceable && !obj->map_and_fenceable)) {
>  			WARN(obj->pin_count,
>  			     "bo is already pinned with incorrect alignment:"
>  			     " offset=%lx, req.alignment=%x, req.map_and_fenceable=%d,"
>  			     " obj->map_and_fenceable=%d\n",
> -			     obj->gtt_space->start, alignment,
> +			     i915_gem_obj_offset(obj), alignment,
>  			     map_and_fenceable,
>  			     obj->map_and_fenceable);
>  			ret = i915_gem_object_unbind(obj);
> @@ -3652,7 +3653,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  		}
>  	}
>  
> -	if (obj->gtt_space == NULL) {
> +	if (!i915_gem_obj_bound(obj)) {
>  		struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  
>  		ret = i915_gem_object_bind_to_gtt(obj, alignment,
> @@ -3678,7 +3679,7 @@ void
>  i915_gem_object_unpin(struct drm_i915_gem_object *obj)
>  {
>  	BUG_ON(obj->pin_count == 0);
> -	BUG_ON(obj->gtt_space == NULL);
> +	BUG_ON(!i915_gem_obj_bound(obj));
>  
>  	if (--obj->pin_count == 0)
>  		obj->pin_mappable = false;
> @@ -3728,7 +3729,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
>  	 * as the X server doesn't manage domains yet
>  	 */
>  	i915_gem_object_flush_cpu_write_domain(obj);
> -	args->offset = obj->gtt_space->start;
> +	args->offset = i915_gem_obj_offset(obj);
>  out:
>  	drm_gem_object_unreference(&obj->base);
>  unlock:
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 1e838f4..75b4e27 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -395,7 +395,7 @@ mi_set_context(struct intel_ring_buffer *ring,
>  
>  	intel_ring_emit(ring, MI_NOOP);
>  	intel_ring_emit(ring, MI_SET_CONTEXT);
> -	intel_ring_emit(ring, new_context->obj->gtt_space->start |
> +	intel_ring_emit(ring, i915_gem_obj_offset(new_context->obj) |
>  			MI_MM_SPACE_GTT |
>  			MI_SAVE_EXT_STATE_EN |
>  			MI_RESTORE_EXT_STATE_EN |
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 67246a6..837372d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -188,7 +188,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  		return -ENOENT;
>  
>  	target_i915_obj = to_intel_bo(target_obj);
> -	target_offset = target_i915_obj->gtt_space->start;
> +	target_offset = i915_gem_obj_offset(target_i915_obj);
>  
>  	/* Sandybridge PPGTT errata: We need a global gtt mapping for MI and
>  	 * pipe_control writes because the gpu doesn't properly redirect them
> @@ -280,7 +280,7 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>  			return ret;
>  
>  		/* Map the page containing the relocation we're going to perform.  */
> -		reloc->offset += obj->gtt_space->start;
> +		reloc->offset += i915_gem_obj_offset(obj);
>  		reloc_page = io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
>  						      reloc->offset & PAGE_MASK);
>  		reloc_entry = (uint32_t __iomem *)
> @@ -436,8 +436,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  		obj->has_aliasing_ppgtt_mapping = 1;
>  	}
>  
> -	if (entry->offset != obj->gtt_space->start) {
> -		entry->offset = obj->gtt_space->start;
> +	if (entry->offset != i915_gem_obj_offset(obj)) {
> +		entry->offset = i915_gem_obj_offset(obj);
>  		*need_reloc = true;
>  	}
>  
> @@ -458,7 +458,7 @@ i915_gem_execbuffer_unreserve_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_gem_exec_object2 *entry;
>  
> -	if (!obj->gtt_space)
> +	if (!i915_gem_obj_bound(obj))
>  		return;
>  
>  	entry = obj->exec_entry;
> @@ -528,11 +528,13 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  		/* Unbind any ill-fitting objects or pin. */
>  		list_for_each_entry(obj, objects, exec_list) {
>  			struct drm_i915_gem_exec_object2 *entry = obj->exec_entry;
> +			unsigned long obj_offset;
>  			bool need_fence, need_mappable;
>  
> -			if (!obj->gtt_space)
> +			if (!i915_gem_obj_bound(obj))
>  				continue;
>  
> +			obj_offset = i915_gem_obj_offset(obj);
>  			need_fence =
>  				has_fenced_gpu_access &&
>  				entry->flags & EXEC_OBJECT_NEEDS_FENCE &&
> @@ -540,7 +542,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  			need_mappable = need_fence || need_reloc_mappable(obj);
>  
>  			if ((entry->alignment &&
> -			     obj->gtt_space->start & (entry->alignment - 1)) ||
> +			     obj_offset & (entry->alignment - 1)) ||
>  			    (need_mappable && !obj->map_and_fenceable))
>  				ret = i915_gem_object_unbind(obj);
>  			else
> @@ -551,7 +553,7 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
>  
>  		/* Bind fresh objects */
>  		list_for_each_entry(obj, objects, exec_list) {
> -			if (obj->gtt_space)
> +			if (i915_gem_obj_bound(obj))
>  				continue;
>  
>  			ret = i915_gem_execbuffer_reserve_object(obj, ring, need_relocs);
> @@ -1072,7 +1074,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  			goto err;
>  	}
>  
> -	exec_start = batch_obj->gtt_space->start + args->batch_start_offset;
> +	exec_start = i915_gem_obj_offset(batch_obj) + args->batch_start_offset;
>  	exec_len = args->batch_len;
>  	if (cliprects) {
>  		for (i = 0; i < args->num_cliprects; i++) {
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 17e334f..566ab76 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -390,7 +390,7 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  			    enum i915_cache_level cache_level)
>  {
>  	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> -				   obj->gtt_space->start >> PAGE_SHIFT,
> +				   i915_gem_obj_offset(obj) >> PAGE_SHIFT,
>  				   cache_level);
>  }
>  
> @@ -398,7 +398,7 @@ void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  			      struct drm_i915_gem_object *obj)
>  {
>  	ppgtt->base.clear_range(&ppgtt->base,
> -				obj->gtt_space->start >> PAGE_SHIFT,
> +				i915_gem_obj_offset(obj) >> PAGE_SHIFT,
>  				obj->base.size >> PAGE_SHIFT);
>  }
>  
> @@ -570,9 +570,10 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	unsigned long obj_offset = i915_gem_obj_offset(obj);
>  
>  	i915_gtt_vm->insert_entries(&dev_priv->gtt.base, obj->pages,
> -					  obj->gtt_space->start >> PAGE_SHIFT,
> +					  obj_offset >> PAGE_SHIFT,
>  					  cache_level);
>  
>  	obj->has_global_gtt_mapping = 1;
> @@ -582,9 +583,10 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	unsigned long obj_offset = i915_gem_obj_offset(obj);
>  
>  	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
> -				       obj->gtt_space->start >> PAGE_SHIFT,
> +				       obj_offset >> PAGE_SHIFT,
>  				       obj->base.size >> PAGE_SHIFT);
>  
>  	obj->has_global_gtt_mapping = 0;
> @@ -682,7 +684,7 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  		uintptr_t gtt_offset = (uintptr_t)obj->gtt_space;
>  		int ret;
>  		DRM_DEBUG_KMS("reserving preallocated space: %lx + %zx\n",
> -			      obj->gtt_space->start, obj->base.size);
> +			      i915_gem_obj_offset(obj), obj->base.size);
>  
>  		BUG_ON((gtt_offset & I915_GTT_RESERVED) == 0);
>  		gtt_offset = gtt_offset & ~I915_GTT_RESERVED;
> diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
> index 7aab12a..2478114 100644
> --- a/drivers/gpu/drm/i915/i915_gem_tiling.c
> +++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
> @@ -268,18 +268,18 @@ i915_gem_object_fence_ok(struct drm_i915_gem_object *obj, int tiling_mode)
>  		return true;
>  
>  	if (INTEL_INFO(obj->base.dev)->gen == 3) {
> -		if (obj->gtt_space->start & ~I915_FENCE_START_MASK)
> +		if (i915_gem_obj_offset(obj) & ~I915_FENCE_START_MASK)
>  			return false;
>  	} else {
> -		if (obj->gtt_space->start & ~I830_FENCE_START_MASK)
> +		if (i915_gem_obj_offset(obj) & ~I830_FENCE_START_MASK)
>  			return false;
>  	}
>  
>  	size = i915_gem_get_gtt_size(obj->base.dev, obj->base.size, tiling_mode);
> -	if (obj->gtt_space->size != size)
> +	if (i915_gem_obj_size(obj) != size)
>  		return false;
>  
> -	if (obj->gtt_space->start & (size - 1))
> +	if (i915_gem_obj_offset(obj) & (size - 1))
>  		return false;
>  
>  	return true;
> @@ -358,8 +358,8 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  		 * whilst executing a fenced command for an untiled object.
>  		 */
>  
> -		obj->map_and_fenceable = obj->gtt_space == NULL ||
> -			(obj->gtt_space->start +
> +		obj->map_and_fenceable = !i915_gem_obj_bound(obj) ||
> +			(i915_gem_obj_offset(obj) +
>  			 obj->base.size <= dev_priv->gtt.mappable_end &&
>  			 i915_gem_object_fence_ok(obj, args->tiling_mode));
>  
> @@ -369,7 +369,7 @@ i915_gem_set_tiling(struct drm_device *dev, void *data,
>  				i915_gem_get_gtt_alignment(dev, obj->base.size,
>  							    args->tiling_mode,
>  							    false);
> -			if (obj->gtt_space->start & (unfenced_alignment - 1))
> +			if (i915_gem_obj_offset(obj) & (unfenced_alignment - 1))
>  				ret = i915_gem_object_unbind(obj);
>  		}
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 2c4fe36..c0be641 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1512,7 +1512,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
>  	if (dst == NULL)
>  		return NULL;
>  
> -	reloc_offset = src->gtt_space->start;
> +	reloc_offset = i915_gem_obj_offset(src);
>  	for (i = 0; i < num_pages; i++) {
>  		unsigned long flags;
>  		void *d;
> @@ -1564,7 +1564,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
>  		reloc_offset += PAGE_SIZE;
>  	}
>  	dst->page_count = num_pages;
> -	dst->gtt_offset = src->gtt_space->start;
> +	dst->gtt_offset = i915_gem_obj_offset(src);
>  
>  	return dst;
>  
> @@ -1618,7 +1618,7 @@ static void capture_bo(struct drm_i915_error_buffer *err,
>  	err->name = obj->base.name;
>  	err->rseqno = obj->last_read_seqno;
>  	err->wseqno = obj->last_write_seqno;
> -	err->gtt_offset = obj->gtt_space->start;
> +	err->gtt_offset = i915_gem_obj_offset(obj);
>  	err->read_domains = obj->base.read_domains;
>  	err->write_domain = obj->base.write_domain;
>  	err->fence_reg = obj->fence_reg;
> @@ -1716,8 +1716,8 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  			return NULL;
>  
>  		obj = ring->private;
> -		if (acthd >= obj->gtt_space->start &&
> -		    acthd < obj->gtt_space->start + obj->base.size)
> +		if (acthd >= i915_gem_obj_offset(obj) &&
> +		    acthd < i915_gem_obj_offset(obj) + obj->base.size)
>  			return i915_error_object_create(dev_priv, obj);
>  	}
>  
> @@ -1798,7 +1798,7 @@ static void i915_gem_record_active_context(struct intel_ring_buffer *ring,
>  		return;
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> -		if ((error->ccid & PAGE_MASK) == obj->gtt_space->start) {
> +		if ((error->ccid & PAGE_MASK) == i915_gem_obj_offset(obj)) {
>  			ering->ctx = i915_error_object_create_sized(dev_priv,
>  								    obj, 1);
>  		}
> @@ -2152,10 +2152,10 @@ static void __always_unused i915_pageflip_stall_check(struct drm_device *dev, in
>  	if (INTEL_INFO(dev)->gen >= 4) {
>  		int dspsurf = DSPSURF(intel_crtc->plane);
>  		stall_detected = I915_HI_DISPBASE(I915_READ(dspsurf)) ==
> -					obj->gtt_space->start;
> +					i915_gem_obj_offset(obj);
>  	} else {
>  		int dspaddr = DSPADDR(intel_crtc->plane);
> -		stall_detected = I915_READ(dspaddr) == (obj->gtt_space->start +
> +		stall_detected = I915_READ(dspaddr) == (i915_gem_obj_offset(obj) +
>  							crtc->y * crtc->fb->pitches[0] +
>  							crtc->x * crtc->fb->bits_per_pixel/8);
>  	}
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 3db4a68..e4dccb3 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -46,8 +46,8 @@ TRACE_EVENT(i915_gem_object_bind,
>  
>  	    TP_fast_assign(
>  			   __entry->obj = obj;
> -			   __entry->offset = obj->gtt_space->start;
> -			   __entry->size = obj->gtt_space->size;
> +			   __entry->offset = i915_gem_obj_offset(obj);
> +			   __entry->size = i915_gem_obj_size(obj);
>  			   __entry->mappable = mappable;
>  			   ),
>  
> @@ -68,8 +68,8 @@ TRACE_EVENT(i915_gem_object_unbind,
>  
>  	    TP_fast_assign(
>  			   __entry->obj = obj;
> -			   __entry->offset = obj->gtt_space->start;
> -			   __entry->size = obj->gtt_space->size;
> +			   __entry->offset = i915_gem_obj_offset(obj);
> +			   __entry->size = i915_gem_obj_size(obj);
>  			   ),
>  
>  	    TP_printk("obj=%p, offset=%08x size=%x",
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index a269d7a..633bfbf 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -1943,18 +1943,18 @@ static int i9xx_update_plane(struct drm_crtc *crtc, struct drm_framebuffer *fb,
>  	}
>  
>  	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
> -		      obj->gtt_space->start, linear_offset, x, y,
> +		      i915_gem_obj_offset(obj), linear_offset, x, y,
>  		      fb->pitches[0]);
>  	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
>  	if (INTEL_INFO(dev)->gen >= 4) {
>  		I915_MODIFY_DISPBASE(DSPSURF(plane),
> -				     obj->gtt_space->start +
> +				     i915_gem_obj_offset(obj) +
>  				     intel_crtc->dspaddr_offset);
>  		I915_WRITE(DSPTILEOFF(plane), (y << 16) | x);
>  		I915_WRITE(DSPLINOFF(plane), linear_offset);
>  	} else
>  		I915_WRITE(DSPADDR(plane),
> -			   obj->gtt_space->start + linear_offset);
> +			   i915_gem_obj_offset(obj) + linear_offset);
>  	POSTING_READ(reg);
>  
>  	return 0;
> @@ -2035,11 +2035,11 @@ static int ironlake_update_plane(struct drm_crtc *crtc,
>  	linear_offset -= intel_crtc->dspaddr_offset;
>  
>  	DRM_DEBUG_KMS("Writing base %08lX %08lX %d %d %d\n",
> -		      obj->gtt_space->start, linear_offset, x, y,
> +		      i915_gem_obj_offset(obj), linear_offset, x, y,
>  		      fb->pitches[0]);
>  	I915_WRITE(DSPSTRIDE(plane), fb->pitches[0]);
>  	I915_MODIFY_DISPBASE(DSPSURF(plane),
> -			     obj->gtt_space->start+intel_crtc->dspaddr_offset);
> +			     i915_gem_obj_offset(obj)+intel_crtc->dspaddr_offset);
>  	if (IS_HASWELL(dev)) {
>  		I915_WRITE(DSPOFFSET(plane), (y << 16) | x);
>  	} else {
> @@ -6558,7 +6558,7 @@ static int intel_crtc_cursor_set(struct drm_crtc *crtc,
>  			goto fail_unpin;
>  		}
>  
> -		addr = obj->gtt_space->start;
> +		addr = i915_gem_obj_offset(obj);
>  	} else {
>  		int align = IS_I830(dev) ? 16 * 1024 : 256;
>  		ret = i915_gem_attach_phys_object(dev, obj,
> @@ -7274,7 +7274,7 @@ static int intel_gen2_queue_flip(struct drm_device *dev,
>  			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
>  	intel_ring_emit(ring, fb->pitches[0]);
>  	intel_ring_emit(ring,
> -			obj->gtt_space->start + intel_crtc->dspaddr_offset);
> +			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
>  	intel_ring_emit(ring, 0); /* aux display base address, unused */
>  
>  	intel_mark_page_flip_active(intel_crtc);
> @@ -7316,7 +7316,7 @@ static int intel_gen3_queue_flip(struct drm_device *dev,
>  			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
>  	intel_ring_emit(ring, fb->pitches[0]);
>  	intel_ring_emit(ring,
> -			obj->gtt_space->start + intel_crtc->dspaddr_offset);
> +			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
>  	intel_ring_emit(ring, MI_NOOP);
>  
>  	intel_mark_page_flip_active(intel_crtc);
> @@ -7356,7 +7356,7 @@ static int intel_gen4_queue_flip(struct drm_device *dev,
>  			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
>  	intel_ring_emit(ring, fb->pitches[0]);
>  	intel_ring_emit(ring,
> -			(obj->gtt_space->start + intel_crtc->dspaddr_offset) |
> +			(i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset) |
>  			obj->tiling_mode);
>  
>  	/* XXX Enabling the panel-fitter across page-flip is so far
> @@ -7400,7 +7400,7 @@ static int intel_gen6_queue_flip(struct drm_device *dev,
>  			MI_DISPLAY_FLIP_PLANE(intel_crtc->plane));
>  	intel_ring_emit(ring, fb->pitches[0] | obj->tiling_mode);
>  	intel_ring_emit(ring,
> -			obj->gtt_space->start + intel_crtc->dspaddr_offset);
> +			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
>  
>  	/* Contrary to the suggestions in the documentation,
>  	 * "Enable Panel Fitter" does not seem to be required when page
> @@ -7466,7 +7466,7 @@ static int intel_gen7_queue_flip(struct drm_device *dev,
>  	intel_ring_emit(ring, MI_DISPLAY_FLIP_I915 | plane_bit);
>  	intel_ring_emit(ring, (fb->pitches[0] | obj->tiling_mode));
>  	intel_ring_emit(ring,
> -			obj->gtt_space->start + intel_crtc->dspaddr_offset);
> +			i915_gem_obj_offset(obj) + intel_crtc->dspaddr_offset);
>  	intel_ring_emit(ring, (MI_NOOP));
>  
>  	intel_mark_page_flip_active(intel_crtc);
> diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
> index 242a793..8315a5e 100644
> --- a/drivers/gpu/drm/i915/intel_fb.c
> +++ b/drivers/gpu/drm/i915/intel_fb.c
> @@ -139,11 +139,11 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  	info->apertures->ranges[0].base = dev->mode_config.fb_base;
>  	info->apertures->ranges[0].size = dev_priv->gtt.mappable_end;
>  
> -	info->fix.smem_start = dev->mode_config.fb_base + obj->gtt_space->start;
> +	info->fix.smem_start = dev->mode_config.fb_base + i915_gem_obj_offset(obj);
>  	info->fix.smem_len = size;
>  
>  	info->screen_base =
> -		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_space->start,
> +		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_offset(obj),
>  			   size);
>  	if (!info->screen_base) {
>  		ret = -ENOSPC;
> @@ -168,7 +168,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
>  
>  	DRM_DEBUG_KMS("allocated %dx%d fb: 0x%08lx, bo %p\n",
>  		      fb->width, fb->height,
> -		      obj->gtt_space->start, obj);
> +		      i915_gem_obj_offset(obj), obj);
>  
>  
>  	mutex_unlock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 93f2671..41654b1 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -196,7 +196,7 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
>  		regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_obj->handle->vaddr;
>  	else
>  		regs = io_mapping_map_wc(dev_priv->gtt.mappable,
> -					 overlay->reg_bo->gtt_space->start);
> +					 i915_gem_obj_offset(overlay->reg_bo));
>  
>  	return regs;
>  }
> @@ -740,7 +740,8 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
>  	swidth = params->src_w;
>  	swidthsw = calc_swidthsw(overlay->dev, params->offset_Y, tmp_width);
>  	sheight = params->src_h;
> -	iowrite32(new_bo->gtt_space->start + params->offset_Y, &regs->OBUF_0Y);
> +	iowrite32(i915_gem_obj_offset(new_bo) + params->offset_Y,
> +		  &regs->OBUF_0Y);
>  	ostride = params->stride_Y;
>  
>  	if (params->format & I915_OVERLAY_YUV_PLANAR) {
> @@ -754,9 +755,9 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
>  				      params->src_w/uv_hscale);
>  		swidthsw |= max_t(u32, tmp_U, tmp_V) << 16;
>  		sheight |= (params->src_h/uv_vscale) << 16;
> -		iowrite32(new_bo->gtt_space->start + params->offset_U,
> +		iowrite32(i915_gem_obj_offset(new_bo) + params->offset_U,
>  			  &regs->OBUF_0U);
> -		iowrite32(new_bo->gtt_space->start + params->offset_V,
> +		iowrite32(i915_gem_obj_offset(new_bo) + params->offset_V,
>  			  &regs->OBUF_0V);
>  		ostride |= params->stride_UV << 16;
>  	}
> @@ -1357,7 +1358,7 @@ void intel_setup_overlay(struct drm_device *dev)
>  			DRM_ERROR("failed to pin overlay register bo\n");
>  			goto out_free_bo;
>  		}
> -		overlay->flip_addr = reg_bo->gtt_space->start;
> +		overlay->flip_addr = i915_gem_obj_offset(reg_bo);
>  
>  		ret = i915_gem_object_set_to_gtt_domain(reg_bo, true);
>  		if (ret) {
> @@ -1436,7 +1437,7 @@ intel_overlay_map_regs_atomic(struct intel_overlay *overlay)
>  			overlay->reg_bo->phys_obj->handle->vaddr;
>  
>  	return io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
> -					overlay->reg_bo->gtt_space->start);
> +					i915_gem_obj_offset(overlay->reg_bo));
>  }
>  
>  static void intel_overlay_unmap_regs_atomic(struct intel_overlay *overlay,
> @@ -1467,7 +1468,7 @@ intel_overlay_capture_error_state(struct drm_device *dev)
>  	if (OVERLAY_NEEDS_PHYSICAL(overlay->dev))
>  		error->base = (__force long)overlay->reg_bo->phys_obj->handle->vaddr;
>  	else
> -		error->base = overlay->reg_bo->gtt_space->start;
> +		error->base = i915_gem_obj_offset(overlay->reg_bo);
>  
>  	regs = intel_overlay_map_regs_atomic(overlay);
>  	if (!regs)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 73c0ee1..504d96b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -217,7 +217,7 @@ static void ironlake_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  		   (stall_watermark << DPFC_RECOMP_STALL_WM_SHIFT) |
>  		   (interval << DPFC_RECOMP_TIMER_COUNT_SHIFT));
>  	I915_WRITE(ILK_DPFC_FENCE_YOFF, crtc->y);
> -	I915_WRITE(ILK_FBC_RT_BASE, obj->gtt_space->start | ILK_FBC_RT_VALID);
> +	I915_WRITE(ILK_FBC_RT_BASE, i915_gem_obj_offset(obj) | ILK_FBC_RT_VALID);
>  	/* enable it... */
>  	I915_WRITE(ILK_DPFC_CONTROL, dpfc_ctl | DPFC_CTL_EN);
>  
> @@ -3685,7 +3685,7 @@ static void ironlake_enable_rc6(struct drm_device *dev)
>  
>  	intel_ring_emit(ring, MI_SUSPEND_FLUSH | MI_SUSPEND_FLUSH_EN);
>  	intel_ring_emit(ring, MI_SET_CONTEXT);
> -	intel_ring_emit(ring, dev_priv->ips.renderctx->gtt_space->start |
> +	intel_ring_emit(ring, i915_gem_obj_offset(dev_priv->ips.renderctx) |
>  			MI_MM_SPACE_GTT |
>  			MI_SAVE_EXT_STATE_EN |
>  			MI_RESTORE_EXT_STATE_EN |
> @@ -3708,7 +3708,8 @@ static void ironlake_enable_rc6(struct drm_device *dev)
>  		return;
>  	}
>  
> -	I915_WRITE(PWRCTXA, dev_priv->ips.pwrctx->gtt_space->start | PWRCTX_EN);
> +	I915_WRITE(PWRCTXA, i915_gem_obj_offset(dev_priv->ips.pwrctx) |
> +			    PWRCTX_EN);
>  	I915_WRITE(RSTDBYCTL, I915_READ(RSTDBYCTL) & ~RCX_SW_EXIT);
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index c4c80c2..64b579f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -424,14 +424,14 @@ static int init_ring_common(struct intel_ring_buffer *ring)
>  	 * registers with the above sequence (the readback of the HEAD registers
>  	 * also enforces ordering), otherwise the hw might lose the new ring
>  	 * register values. */
> -	I915_WRITE_START(ring, obj->gtt_space->start);
> +	I915_WRITE_START(ring, i915_gem_obj_offset(obj));
>  	I915_WRITE_CTL(ring,
>  			((ring->size - PAGE_SIZE) & RING_NR_PAGES)
>  			| RING_VALID);
>  
>  	/* If the head is still not zero, the ring is dead */
>  	if (wait_for((I915_READ_CTL(ring) & RING_VALID) != 0 &&
> -		     I915_READ_START(ring) == obj->gtt_space->start &&
> +		     I915_READ_START(ring) == i915_gem_obj_offset(obj) &&
>  		     (I915_READ_HEAD(ring) & HEAD_ADDR) == 0, 50)) {
>  		DRM_ERROR("%s initialization failed "
>  				"ctl %08x head %08x tail %08x start %08x\n",
> @@ -489,7 +489,7 @@ init_pipe_control(struct intel_ring_buffer *ring)
>  	if (ret)
>  		goto err_unref;
>  
> -	pc->gtt_offset = obj->gtt_space->start;
> +	pc->gtt_offset = i915_gem_obj_offset(obj);
>  	pc->cpu_page = kmap(sg_page(obj->pages->sgl));
>  	if (pc->cpu_page == NULL) {
>  		ret = -ENOMEM;
> @@ -1129,7 +1129,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  		intel_ring_advance(ring);
>  	} else {
>  		struct drm_i915_gem_object *obj = ring->private;
> -		u32 cs_offset = obj->gtt_space->start;
> +		u32 cs_offset = i915_gem_obj_offset(obj);
>  
>  		if (len > I830_BATCH_LIMIT)
>  			return -ENOSPC;
> @@ -1214,7 +1214,7 @@ static int init_status_page(struct intel_ring_buffer *ring)
>  		goto err_unref;
>  	}
>  
> -	ring->status_page.gfx_addr = obj->gtt_space->start;
> +	ring->status_page.gfx_addr = i915_gem_obj_offset(obj);
>  	ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
>  	if (ring->status_page.page_addr == NULL) {
>  		ret = -ENOMEM;
> @@ -1308,7 +1308,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>  		goto err_unpin;
>  
>  	ring->virtual_start =
> -		ioremap_wc(dev_priv->gtt.mappable_base + obj->gtt_space->start,
> +		ioremap_wc(dev_priv->gtt.mappable_base + i915_gem_obj_offset(obj),
>  			   ring->size);
>  	if (ring->virtual_start == NULL) {
>  		DRM_ERROR("Failed to map ringbuffer.\n");
> diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
> index c342571..117a2f8 100644
> --- a/drivers/gpu/drm/i915/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/intel_sprite.c
> @@ -133,7 +133,7 @@ vlv_update_plane(struct drm_plane *dplane, struct drm_framebuffer *fb,
>  
>  	I915_WRITE(SPSIZE(pipe, plane), (crtc_h << 16) | crtc_w);
>  	I915_WRITE(SPCNTR(pipe, plane), sprctl);
> -	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), obj->gtt_space->start +
> +	I915_MODIFY_DISPBASE(SPSURF(pipe, plane), i915_gem_obj_offset(obj) +
>  			     sprsurf_offset);
>  	POSTING_READ(SPSURF(pipe, plane));
>  }
> @@ -309,7 +309,7 @@ ivb_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
>  		I915_WRITE(SPRSCALE(pipe), sprscale);
>  	I915_WRITE(SPRCTL(pipe), sprctl);
>  	I915_MODIFY_DISPBASE(SPRSURF(pipe),
> -			     obj->gtt_space->start + sprsurf_offset);
> +			     i915_gem_obj_offset(obj) + sprsurf_offset);
>  	POSTING_READ(SPRSURF(pipe));
>  
>  	/* potentially re-enable LP watermarks */
> @@ -480,7 +480,7 @@ ilk_update_plane(struct drm_plane *plane, struct drm_framebuffer *fb,
>  	I915_WRITE(DVSSCALE(pipe), dvsscale);
>  	I915_WRITE(DVSCNTR(pipe), dvscntr);
>  	I915_MODIFY_DISPBASE(DVSSURF(pipe),
> -			     obj->gtt_space->start + dvssurf_offset);
> +			     i915_gem_obj_offset(obj) + dvssurf_offset);
>  	POSTING_READ(DVSSURF(pipe));
>  }
>  
> -- 
> 1.8.3.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 20/66] drm/i915: Move fbc members out of line
  2013-06-27 23:30 ` [PATCH 20/66] drm/i915: Move fbc members out of line Ben Widawsky
@ 2013-06-30 13:10   ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 13:10 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:21PM -0700, Ben Widawsky wrote:
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Merged patches 16-20, thanks.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c    |  2 +-
>  drivers/gpu/drm/i915/i915_drv.h        | 48 +++++++++++++++++++--------------
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 20 +++++++-------
>  drivers/gpu/drm/i915/intel_display.c   |  6 ++---
>  drivers/gpu/drm/i915/intel_drv.h       |  7 -----
>  drivers/gpu/drm/i915/intel_pm.c        | 49 +++++++++++++++++-----------------
>  6 files changed, 67 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index d4e78b6..e654bf4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1444,7 +1444,7 @@ static int i915_fbc_status(struct seq_file *m, void *unused)
>  		seq_printf(m, "FBC enabled\n");
>  	} else {
>  		seq_printf(m, "FBC disabled: ");
> -		switch (dev_priv->no_fbc_reason) {
> +		switch (dev_priv->fbc.no_fbc_reason) {
>  		case FBC_NO_OUTPUT:
>  			seq_printf(m, "no outputs");
>  			break;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index efd244d..21cf593 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -534,17 +534,35 @@ struct i915_hw_context {
>  	struct i915_hw_ppgtt ppgtt;
>  };
>  
> -enum no_fbc_reason {
> -	FBC_NO_OUTPUT, /* no outputs enabled to compress */
> -	FBC_STOLEN_TOO_SMALL, /* not enough space to hold compressed buffers */
> -	FBC_UNSUPPORTED_MODE, /* interlace or doublescanned mode */
> -	FBC_MODE_TOO_LARGE, /* mode too large for compression */
> -	FBC_BAD_PLANE, /* fbc not supported on plane */
> -	FBC_NOT_TILED, /* buffer not tiled */
> -	FBC_MULTIPLE_PIPES, /* more than one pipe active */
> -	FBC_MODULE_PARAM,
> +struct i915_fbc {
> +	unsigned long size;
> +	unsigned int fb_id;
> +	enum plane plane;
> +	int y;
> +
> +	struct drm_mm_node *compressed_fb;
> +	struct drm_mm_node *compressed_llb;
> +
> +	struct intel_fbc_work {
> +		struct delayed_work work;
> +		struct drm_crtc *crtc;
> +		struct drm_framebuffer *fb;
> +		int interval;
> +	} *fbc_work;
> +
> +	enum {
> +		FBC_NO_OUTPUT, /* no outputs enabled to compress */
> +		FBC_STOLEN_TOO_SMALL, /* not enough space for buffers */
> +		FBC_UNSUPPORTED_MODE, /* interlace or doublescanned mode */
> +		FBC_MODE_TOO_LARGE, /* mode too large for compression */
> +		FBC_BAD_PLANE, /* fbc not supported on plane */
> +		FBC_NOT_TILED, /* buffer not tiled */
> +		FBC_MULTIPLE_PIPES, /* more than one pipe active */
> +		FBC_MODULE_PARAM,
> +	} no_fbc_reason;
>  };
>  
> +
>  enum intel_pch {
>  	PCH_NONE = 0,	/* No PCH present */
>  	PCH_IBX,	/* Ibexpeak PCH */
> @@ -1064,12 +1082,7 @@ typedef struct drm_i915_private {
>  
>  	int num_plane;
>  
> -	unsigned long cfb_size;
> -	unsigned int cfb_fb;
> -	enum plane cfb_plane;
> -	int cfb_y;
> -	struct intel_fbc_work *fbc_work;
> -
> +	struct i915_fbc fbc;
>  	struct intel_opregion opregion;
>  	struct intel_vbt_data vbt;
>  
> @@ -1147,11 +1160,6 @@ typedef struct drm_i915_private {
>  	/* Haswell power well */
>  	struct i915_power_well power_well;
>  
> -	enum no_fbc_reason no_fbc_reason;
> -
> -	struct drm_mm_node *compressed_fb;
> -	struct drm_mm_node *compressed_llb;
> -
>  	struct i915_gpu_error gpu_error;
>  
>  	struct drm_i915_gem_object *vlv_pctx;
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index f713294..8e02344 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -120,7 +120,7 @@ static int i915_setup_compression(struct drm_device *dev, int size)
>  		if (!compressed_llb)
>  			goto err_fb;
>  
> -		dev_priv->compressed_llb = compressed_llb;
> +		dev_priv->fbc.compressed_llb = compressed_llb;
>  
>  		I915_WRITE(FBC_CFB_BASE,
>  			   dev_priv->mm.stolen_base + compressed_fb->start);
> @@ -128,8 +128,8 @@ static int i915_setup_compression(struct drm_device *dev, int size)
>  			   dev_priv->mm.stolen_base + compressed_llb->start);
>  	}
>  
> -	dev_priv->compressed_fb = compressed_fb;
> -	dev_priv->cfb_size = size;
> +	dev_priv->fbc.compressed_fb = compressed_fb;
> +	dev_priv->fbc.size = size;
>  
>  	DRM_DEBUG_KMS("reserved %d bytes of contiguous stolen space for FBC\n",
>  		      size);
> @@ -150,7 +150,7 @@ int i915_gem_stolen_setup_compression(struct drm_device *dev, int size)
>  	if (dev_priv->mm.stolen_base == 0)
>  		return -ENODEV;
>  
> -	if (size < dev_priv->cfb_size)
> +	if (size < dev_priv->fbc.size)
>  		return 0;
>  
>  	/* Release any current block */
> @@ -163,16 +163,16 @@ void i915_gem_stolen_cleanup_compression(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	if (dev_priv->cfb_size == 0)
> +	if (dev_priv->fbc.size == 0)
>  		return;
>  
> -	if (dev_priv->compressed_fb)
> -		drm_mm_put_block(dev_priv->compressed_fb);
> +	if (dev_priv->fbc.compressed_fb)
> +		drm_mm_put_block(dev_priv->fbc.compressed_fb);
>  
> -	if (dev_priv->compressed_llb)
> -		drm_mm_put_block(dev_priv->compressed_llb);
> +	if (dev_priv->fbc.compressed_llb)
> +		drm_mm_put_block(dev_priv->fbc.compressed_llb);
>  
> -	dev_priv->cfb_size = 0;
> +	dev_priv->fbc.size = 0;
>  }
>  
>  void i915_gem_cleanup_stolen(struct drm_device *dev)
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 8d075b1f..f056eca 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3391,7 +3391,7 @@ static void ironlake_crtc_disable(struct drm_crtc *crtc)
>  	intel_crtc_wait_for_pending_flips(crtc);
>  	drm_vblank_off(dev, pipe);
>  
> -	if (dev_priv->cfb_plane == plane)
> +	if (dev_priv->fbc.plane == plane)
>  		intel_disable_fbc(dev);
>  
>  	intel_crtc_update_cursor(crtc, false);
> @@ -3464,7 +3464,7 @@ static void haswell_crtc_disable(struct drm_crtc *crtc)
>  	drm_vblank_off(dev, pipe);
>  
>  	/* FBC must be disabled before disabling the plane on HSW. */
> -	if (dev_priv->cfb_plane == plane)
> +	if (dev_priv->fbc.plane == plane)
>  		intel_disable_fbc(dev);
>  
>  	hsw_disable_ips(intel_crtc);
> @@ -3705,7 +3705,7 @@ static void i9xx_crtc_disable(struct drm_crtc *crtc)
>  	intel_crtc_wait_for_pending_flips(crtc);
>  	drm_vblank_off(dev, pipe);
>  
> -	if (dev_priv->cfb_plane == plane)
> +	if (dev_priv->fbc.plane == plane)
>  		intel_disable_fbc(dev);
>  
>  	intel_crtc_dpms_overlay(intel_crtc, false);
> diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
> index ffe9d35..af68861 100644
> --- a/drivers/gpu/drm/i915/intel_drv.h
> +++ b/drivers/gpu/drm/i915/intel_drv.h
> @@ -548,13 +548,6 @@ struct intel_unpin_work {
>  	bool enable_stall_check;
>  };
>  
> -struct intel_fbc_work {
> -	struct delayed_work work;
> -	struct drm_crtc *crtc;
> -	struct drm_framebuffer *fb;
> -	int interval;
> -};
> -
>  int intel_pch_rawclk(struct drm_device *dev);
>  
>  int intel_connector_update_modes(struct drm_connector *connector,
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index b27bda0..d32734d 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -86,7 +86,7 @@ static void i8xx_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	int plane, i;
>  	u32 fbc_ctl, fbc_ctl2;
>  
> -	cfb_pitch = dev_priv->cfb_size / FBC_LL_SIZE;
> +	cfb_pitch = dev_priv->fbc.size / FBC_LL_SIZE;
>  	if (fb->pitches[0] < cfb_pitch)
>  		cfb_pitch = fb->pitches[0];
>  
> @@ -325,7 +325,7 @@ static void intel_fbc_work_fn(struct work_struct *__work)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
>  	mutex_lock(&dev->struct_mutex);
> -	if (work == dev_priv->fbc_work) {
> +	if (work == dev_priv->fbc.fbc_work) {
>  		/* Double check that we haven't switched fb without cancelling
>  		 * the prior work.
>  		 */
> @@ -333,12 +333,12 @@ static void intel_fbc_work_fn(struct work_struct *__work)
>  			dev_priv->display.enable_fbc(work->crtc,
>  						     work->interval);
>  
> -			dev_priv->cfb_plane = to_intel_crtc(work->crtc)->plane;
> -			dev_priv->cfb_fb = work->crtc->fb->base.id;
> -			dev_priv->cfb_y = work->crtc->y;
> +			dev_priv->fbc.plane = to_intel_crtc(work->crtc)->plane;
> +			dev_priv->fbc.fb_id = work->crtc->fb->base.id;
> +			dev_priv->fbc.y = work->crtc->y;
>  		}
>  
> -		dev_priv->fbc_work = NULL;
> +		dev_priv->fbc.fbc_work = NULL;
>  	}
>  	mutex_unlock(&dev->struct_mutex);
>  
> @@ -347,25 +347,25 @@ static void intel_fbc_work_fn(struct work_struct *__work)
>  
>  static void intel_cancel_fbc_work(struct drm_i915_private *dev_priv)
>  {
> -	if (dev_priv->fbc_work == NULL)
> +	if (dev_priv->fbc.fbc_work == NULL)
>  		return;
>  
>  	DRM_DEBUG_KMS("cancelling pending FBC enable\n");
>  
>  	/* Synchronisation is provided by struct_mutex and checking of
> -	 * dev_priv->fbc_work, so we can perform the cancellation
> +	 * dev_priv->fbc.fbc_work, so we can perform the cancellation
>  	 * entirely asynchronously.
>  	 */
> -	if (cancel_delayed_work(&dev_priv->fbc_work->work))
> +	if (cancel_delayed_work(&dev_priv->fbc.fbc_work->work))
>  		/* tasklet was killed before being run, clean up */
> -		kfree(dev_priv->fbc_work);
> +		kfree(dev_priv->fbc.fbc_work);
>  
>  	/* Mark the work as no longer wanted so that if it does
>  	 * wake-up (because the work was already running and waiting
>  	 * for our mutex), it will discover that is no longer
>  	 * necessary to run.
>  	 */
> -	dev_priv->fbc_work = NULL;
> +	dev_priv->fbc.fbc_work = NULL;
>  }
>  
>  void intel_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
> @@ -390,7 +390,7 @@ void intel_enable_fbc(struct drm_crtc *crtc, unsigned long interval)
>  	work->interval = interval;
>  	INIT_DELAYED_WORK(&work->work, intel_fbc_work_fn);
>  
> -	dev_priv->fbc_work = work;
> +	dev_priv->fbc.fbc_work = work;
>  
>  	DRM_DEBUG_KMS("scheduling delayed FBC enable\n");
>  
> @@ -418,7 +418,7 @@ void intel_disable_fbc(struct drm_device *dev)
>  		return;
>  
>  	dev_priv->display.disable_fbc(dev);
> -	dev_priv->cfb_plane = -1;
> +	dev_priv->fbc.plane = -1;
>  }
>  
>  /**
> @@ -471,7 +471,8 @@ void intel_update_fbc(struct drm_device *dev)
>  		    !to_intel_crtc(tmp_crtc)->primary_disabled) {
>  			if (crtc) {
>  				DRM_DEBUG_KMS("more than one pipe active, disabling compression\n");
> -				dev_priv->no_fbc_reason = FBC_MULTIPLE_PIPES;
> +				dev_priv->fbc.no_fbc_reason =
> +					FBC_MULTIPLE_PIPES;
>  				goto out_disable;
>  			}
>  			crtc = tmp_crtc;
> @@ -480,7 +481,7 @@ void intel_update_fbc(struct drm_device *dev)
>  
>  	if (!crtc || crtc->fb == NULL) {
>  		DRM_DEBUG_KMS("no output, disabling\n");
> -		dev_priv->no_fbc_reason = FBC_NO_OUTPUT;
> +		dev_priv->fbc.no_fbc_reason = FBC_NO_OUTPUT;
>  		goto out_disable;
>  	}
>  
> @@ -498,14 +499,14 @@ void intel_update_fbc(struct drm_device *dev)
>  	}
>  	if (!enable_fbc) {
>  		DRM_DEBUG_KMS("fbc disabled per module param\n");
> -		dev_priv->no_fbc_reason = FBC_MODULE_PARAM;
> +		dev_priv->fbc.no_fbc_reason = FBC_MODULE_PARAM;
>  		goto out_disable;
>  	}
>  	if ((crtc->mode.flags & DRM_MODE_FLAG_INTERLACE) ||
>  	    (crtc->mode.flags & DRM_MODE_FLAG_DBLSCAN)) {
>  		DRM_DEBUG_KMS("mode incompatible with compression, "
>  			      "disabling\n");
> -		dev_priv->no_fbc_reason = FBC_UNSUPPORTED_MODE;
> +		dev_priv->fbc.no_fbc_reason = FBC_UNSUPPORTED_MODE;
>  		goto out_disable;
>  	}
>  
> @@ -519,13 +520,13 @@ void intel_update_fbc(struct drm_device *dev)
>  	if ((crtc->mode.hdisplay > max_hdisplay) ||
>  	    (crtc->mode.vdisplay > max_vdisplay)) {
>  		DRM_DEBUG_KMS("mode too large for compression, disabling\n");
> -		dev_priv->no_fbc_reason = FBC_MODE_TOO_LARGE;
> +		dev_priv->fbc.no_fbc_reason = FBC_MODE_TOO_LARGE;
>  		goto out_disable;
>  	}
>  	if ((IS_I915GM(dev) || IS_I945GM(dev) || IS_HASWELL(dev)) &&
>  	    intel_crtc->plane != 0) {
>  		DRM_DEBUG_KMS("plane not 0, disabling compression\n");
> -		dev_priv->no_fbc_reason = FBC_BAD_PLANE;
> +		dev_priv->fbc.no_fbc_reason = FBC_BAD_PLANE;
>  		goto out_disable;
>  	}
>  
> @@ -535,7 +536,7 @@ void intel_update_fbc(struct drm_device *dev)
>  	if (obj->tiling_mode != I915_TILING_X ||
>  	    obj->fence_reg == I915_FENCE_REG_NONE) {
>  		DRM_DEBUG_KMS("framebuffer not tiled or fenced, disabling compression\n");
> -		dev_priv->no_fbc_reason = FBC_NOT_TILED;
> +		dev_priv->fbc.no_fbc_reason = FBC_NOT_TILED;
>  		goto out_disable;
>  	}
>  
> @@ -545,7 +546,7 @@ void intel_update_fbc(struct drm_device *dev)
>  
>  	if (i915_gem_stolen_setup_compression(dev, intel_fb->obj->base.size)) {
>  		DRM_DEBUG_KMS("framebuffer too large, disabling compression\n");
> -		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
> +		dev_priv->fbc.no_fbc_reason = FBC_STOLEN_TOO_SMALL;
>  		goto out_disable;
>  	}
>  
> @@ -554,9 +555,9 @@ void intel_update_fbc(struct drm_device *dev)
>  	 * cannot be unpinned (and have its GTT offset and fence revoked)
>  	 * without first being decoupled from the scanout and FBC disabled.
>  	 */
> -	if (dev_priv->cfb_plane == intel_crtc->plane &&
> -	    dev_priv->cfb_fb == fb->base.id &&
> -	    dev_priv->cfb_y == crtc->y)
> +	if (dev_priv->fbc.plane == intel_crtc->plane &&
> +	    dev_priv->fbc.fb_id == fb->base.id &&
> +	    dev_priv->fbc.y == crtc->y)
>  		return;
>  
>  	if (intel_fbc_enabled(dev)) {
> -- 
> 1.8.3.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-06-27 23:30 ` [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
@ 2013-06-30 13:12   ` Daniel Vetter
  2013-07-01 18:40     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 13:12 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:22PM -0700, Ben Widawsky wrote:
> The GTT and PPGTT can be thought of more generally as GPU address
> spaces. Many of their actions (insert entries), much of their state (LRU
> lists), and many of their characteristics (size) can be shared. Do that.
> 
> Created an i915_gtt_vm helper macro since for now we always want the
> regular GTT address space. Eventually we'll wean ourselves off using
> this except in cases where we obviously want the GGTT (like display).
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
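
The shape of the refactor, distilled from the diff below (a sketch with
abbreviated field lists, not the verbatim code): both the global GTT and
each PPGTT embed the common base, the shared callbacks take the base
pointer, and an implementation recovers its container via container_of()
when it needs type-specific state.

struct i915_address_space {
	struct drm_device *dev;
	unsigned long start;
	size_t total;
	void (*clear_range)(struct i915_address_space *vm,
			    unsigned int first_entry,
			    unsigned int num_entries);
	/* ... scratch page, pte_encode, insert_entries ... */
};

struct i915_hw_ppgtt {
	struct i915_address_space base;
	/* ... PPGTT-only state: pt_pages, pd_offset, ... */
};

/* A PPGTT callback receives the generic vm pointer and digs out its
 * containing struct only when it needs PPGTT-specific state: */
static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
				   unsigned int first_entry,
				   unsigned int num_entries)
{
	struct i915_hw_ppgtt *ppgtt =
		container_of(vm, struct i915_hw_ppgtt, base);
	/* ... write scratch PTEs into ppgtt's page tables ... */
}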

The i915_gtt_vm #define is imo too evil. Looks like a local variable, but
isn't. I think in most places we should just drop it, in others we should
add a real local vm variable. I'll punt for now on this one.
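For concreteness, the #define from the patch expands against whatever
dev_priv happens to be in scope at the call site:

#define i915_gtt_vm ((struct i915_address_space *)&(dev_priv->gtt.base))

so something that reads like a variable, i915_gtt_vm->clear_range(...),
only compiles because a local dev_priv exists. The explicit-local
alternative would be along these lines (sketch):

	struct i915_address_space *vm = &dev_priv->gtt.base;

	vm->clear_range(vm, first_entry, num_entries);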
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
>  drivers/gpu/drm/i915/i915_drv.h     |  48 ++++++------
>  drivers/gpu/drm/i915/i915_gem.c     |   8 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 145 +++++++++++++++++++-----------------
>  4 files changed, 110 insertions(+), 95 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index e654bf4..c10a690 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
>  		   count, size);
>  
>  	seq_printf(m, "%zu [%lu] gtt total\n",
> -		   dev_priv->gtt.total,
> -		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
> +		   i915_gtt_vm->total,
> +		   dev_priv->gtt.mappable_end - i915_gtt_vm->start);
>  
>  	seq_printf(m, "\n");
>  	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 21cf593..7f4c9b6 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -442,6 +442,28 @@ enum i915_cache_level {
>  
>  typedef uint32_t gen6_gtt_pte_t;
>  
> +struct i915_address_space {
> +	struct drm_device *dev;
> +	unsigned long start;		/* Start offset always 0 for dri2 */
> +	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> +
> +	struct {
> +		dma_addr_t addr;
> +		struct page *page;
> +	} scratch;
> +
> +	/* FIXME: Need a more generic return type */
> +	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> +				     enum i915_cache_level level);
> +	void (*clear_range)(struct i915_address_space *i915_mm,
> +			    unsigned int first_entry,
> +			    unsigned int num_entries);
> +	void (*insert_entries)(struct i915_address_space *i915_mm,
> +			       struct sg_table *st,
> +			       unsigned int first_entry,
> +			       enum i915_cache_level cache_level);
> +};
> +
>  /* The Graphics Translation Table is the way in which GEN hardware translates a
>   * Graphics Virtual Address into a Physical Address. In addition to the normal
>   * collateral associated with any va->pa translations GEN hardware also has a
> @@ -450,8 +472,7 @@ typedef uint32_t gen6_gtt_pte_t;
>   * the spec.
>   */
>  struct i915_gtt {
> -	unsigned long start;		/* Start offset of used GTT */
> -	size_t total;			/* Total size GTT can map */
> +	struct i915_address_space base;
>  	size_t stolen_size;		/* Total size of stolen memory */
>  
>  	unsigned long mappable_end;	/* End offset that we can CPU map */
> @@ -472,34 +493,17 @@ struct i915_gtt {
>  			  size_t *stolen, phys_addr_t *mappable_base,
>  			  unsigned long *mappable_end);
>  	void (*gtt_remove)(struct drm_device *dev);
> -	void (*gtt_clear_range)(struct drm_device *dev,
> -				unsigned int first_entry,
> -				unsigned int num_entries);
> -	void (*gtt_insert_entries)(struct drm_device *dev,
> -				   struct sg_table *st,
> -				   unsigned int pg_start,
> -				   enum i915_cache_level cache_level);
> -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> -				     enum i915_cache_level level);
>  };
> -#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
> +#define i915_gtt_vm ((struct i915_address_space *)&(dev_priv->gtt.base))
>  
>  struct i915_hw_ppgtt {
> +	struct i915_address_space base;
>  	struct drm_mm_node node;
> -	struct drm_device *dev;
>  	unsigned num_pd_entries;
>  	struct page **pt_pages;
>  	uint32_t pd_offset;
>  	dma_addr_t *pt_dma_addr;
>  
> -	/* pte functions, mirroring the interface of the global gtt. */
> -	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
> -			    unsigned int first_entry,
> -			    unsigned int num_entries);
> -	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
> -			       struct sg_table *st,
> -			       unsigned int pg_start,
> -			       enum i915_cache_level cache_level);
>  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
>  				     enum i915_cache_level level);
>  	int (*enable)(struct drm_device *dev);
> @@ -1123,7 +1127,7 @@ typedef struct drm_i915_private {
>  	enum modeset_restore modeset_restore;
>  	struct mutex modeset_restore_lock;
>  
> -	struct i915_gtt gtt;
> +	struct i915_gtt gtt; /* VMA representing the global address space */
>  
>  	struct i915_gem_mm mm;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c96b422..e31ed47 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>  			pinned += obj->gtt_space->size;
>  	mutex_unlock(&dev->struct_mutex);
>  
> -	args->aper_size = dev_priv->gtt.total;
> +	args->aper_size = i915_gtt_vm->total;
>  	args->aper_available_size = args->aper_size - pinned;
>  
>  	return 0;
> @@ -3083,7 +3083,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  	u32 size, fence_size, fence_alignment, unfenced_alignment;
>  	bool mappable, fenceable;
>  	size_t gtt_max = map_and_fenceable ?
> -		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
> +		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
>  	int ret;
>  
>  	fence_size = i915_gem_get_gtt_size(dev,
> @@ -4226,7 +4226,7 @@ int i915_gem_init(struct drm_device *dev)
>  	 */
>  	if (HAS_HW_CONTEXTS(dev)) {
>  		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
> -					  dev_priv->gtt.total, 0);
> +					  i915_gtt_vm->total, 0);
>  		i915_gem_context_init(dev);
>  		if (dev_priv->hw_contexts_disabled) {
>  			drm_mm_takedown(&dev_priv->mm.gtt_space);
> @@ -4240,7 +4240,7 @@ ggtt_only:
>  		if (HAS_HW_CONTEXTS(dev))
>  			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
>  		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
> -					  dev_priv->gtt.total, PAGE_SIZE);
> +					  i915_gtt_vm->total, PAGE_SIZE);
>  	}
>  
>  	ret = i915_gem_init_hw(dev);
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index bb4ccb5..6de75c7 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
>  
>  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
>  {
> -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
>  	gen6_gtt_pte_t __iomem *pd_addr;
>  	uint32_t pd_entry;
>  	int i;
> @@ -183,18 +183,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
>  }
>  
>  /* PPGTT support for Sandybdrige/Gen6 and later */
> -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
>  				   unsigned first_entry,
>  				   unsigned num_entries)
>  {
> -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> +	struct i915_hw_ppgtt *ppgtt =
> +		container_of(vm, struct i915_hw_ppgtt, base);
>  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
>  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
>  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
>  	unsigned last_pte, i;
>  
> -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> -					I915_CACHE_LLC);
> +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
>  
>  	while (num_entries) {
>  		last_pte = first_pte + num_entries;
> @@ -214,11 +214,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
>  	}
>  }
>  
> -static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> +static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
>  				      struct sg_table *pages,
>  				      unsigned first_entry,
>  				      enum i915_cache_level cache_level)
>  {
> +	struct i915_hw_ppgtt *ppgtt =
> +		container_of(vm, struct i915_hw_ppgtt, base);
>  	gen6_gtt_pte_t *pt_vaddr;
>  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
>  	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> @@ -229,7 +231,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
>  		dma_addr_t page_addr;
>  
>  		page_addr = sg_page_iter_dma_address(&sg_iter);
> -		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
> +		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
>  		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
>  			kunmap_atomic(pt_vaddr);
>  			act_pt++;
> @@ -243,14 +245,14 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
>  
>  static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
>  {
> +	struct i915_address_space *vm = &ppgtt->base;
>  	int i;
>  
>  	drm_mm_remove_node(&ppgtt->node);
>  
>  	if (ppgtt->pt_dma_addr) {
>  		for (i = 0; i < ppgtt->num_pd_entries; i++)
> -			pci_unmap_page(ppgtt->dev->pdev,
> -				       ppgtt->pt_dma_addr[i],
> +			pci_unmap_page(vm->dev->pdev, ppgtt->pt_dma_addr[i],
>  				       4096, PCI_DMA_BIDIRECTIONAL);
>  	}
>  
> @@ -264,7 +266,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  {
>  #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
>  #define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
> -	struct drm_device *dev = ppgtt->dev;
> +	struct i915_address_space *vm = &ppgtt->base;
> +	struct drm_device *dev = vm->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int i;
>  	int ret = -ENOMEM;
> @@ -279,21 +282,22 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  						  &ppgtt->node, GEN6_PD_SIZE,
>  						  GEN6_PD_ALIGN, 0,
>  						  dev_priv->gtt.mappable_end,
> -						  dev_priv->gtt.total,
> +						  i915_gtt_vm->total,
>  						  DRM_MM_TOPDOWN);
>  	if (ret)
>  		return ret;
>  
>  	if (IS_HASWELL(dev)) {
> -		ppgtt->pte_encode = hsw_pte_encode;
> +		vm->pte_encode = hsw_pte_encode;
>  	} else if (IS_VALLEYVIEW(dev)) {
> -		ppgtt->pte_encode = byt_pte_encode;
> +		vm->pte_encode = byt_pte_encode;
>  	} else {
> -		ppgtt->pte_encode = gen6_pte_encode;
> +		vm->pte_encode = gen6_pte_encode;
>  	}
>  
>  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*GEN6_PPGTT_PD_ENTRIES,
>  				  GFP_KERNEL);
> +
>  	if (!ppgtt->pt_pages) {
>  		drm_mm_remove_node(&ppgtt->node);
>  		return -ENOMEM;
> @@ -326,12 +330,15 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
>  
>  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
>  	ppgtt->enable = gen6_ppgtt_enable;
> -	ppgtt->clear_range = gen6_ppgtt_clear_range;
> -	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
>  	ppgtt->cleanup = gen6_ppgtt_cleanup;
>  
> -	ppgtt->clear_range(ppgtt, 0,
> -			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
> +	vm->clear_range = gen6_ppgtt_clear_range;
> +	vm->insert_entries = gen6_ppgtt_insert_entries;
> +	vm->start = 0;
> +	vm->total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
> +	vm->scratch = dev_priv->gtt.base.scratch;
> +
> +	vm->clear_range(vm, 0, ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
>  
>  	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
>  			 ppgtt->node.size >> 20,
> @@ -363,7 +370,7 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
>  {
>  	int ret;
>  
> -	ppgtt->dev = dev;
> +	ppgtt->base.dev = dev;
>  
>  	if (INTEL_INFO(dev)->gen < 8)
>  		ret = gen6_ppgtt_init(ppgtt);
> @@ -377,17 +384,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
>  			    struct drm_i915_gem_object *obj,
>  			    enum i915_cache_level cache_level)
>  {
> -	ppgtt->insert_entries(ppgtt, obj->pages,
> -			      obj->gtt_space->start >> PAGE_SHIFT,
> -			      cache_level);
> +	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> +				   obj->gtt_space->start >> PAGE_SHIFT,
> +				   cache_level);
>  }
>  
>  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
>  			      struct drm_i915_gem_object *obj)
>  {
> -	ppgtt->clear_range(ppgtt,
> -			   obj->gtt_space->start >> PAGE_SHIFT,
> -			   obj->base.size >> PAGE_SHIFT);
> +	ppgtt->base.clear_range(&ppgtt->base,
> +				obj->gtt_space->start >> PAGE_SHIFT,
> +				obj->base.size >> PAGE_SHIFT);
>  }
>  
>  extern int intel_iommu_gfx_mapped;
> @@ -434,8 +441,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  	struct drm_i915_gem_object *obj;
>  
>  	/* First fill our portion of the GTT with scratch pages */
> -	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
> -				      dev_priv->gtt.total / PAGE_SIZE);
> +	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
> +				       i915_gtt_vm->start / PAGE_SIZE,
> +				       i915_gtt_vm->total / PAGE_SIZE);
>  
>  	if (dev_priv->mm.aliasing_ppgtt)
>  		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> @@ -467,12 +475,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
>   * within the global GTT as well as accessible by the GPU through the GMADR
>   * mapped BAR (dev_priv->mm.gtt->gtt).
>   */
> -static void gen6_ggtt_insert_entries(struct drm_device *dev,
> +static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
>  				     struct sg_table *st,
>  				     unsigned int first_entry,
>  				     enum i915_cache_level level)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
>  	gen6_gtt_pte_t __iomem *gtt_entries =
>  		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
>  	int i = 0;
> @@ -481,8 +489,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
>  
>  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
>  		addr = sg_page_iter_dma_address(&sg_iter);
> -		iowrite32(dev_priv->gtt.pte_encode(addr, level),
> -			  &gtt_entries[i]);
> +		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
>  		i++;
>  	}
>  
> @@ -493,8 +500,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
>  	 * hardware should work, we must keep this posting read for paranoia.
>  	 */
>  	if (i != 0)
> -		WARN_ON(readl(&gtt_entries[i-1])
> -			!= dev_priv->gtt.pte_encode(addr, level));
> +		WARN_ON(readl(&gtt_entries[i-1]) !=
> +			vm->pte_encode(addr, level));
>  
>  	/* This next bit makes the above posting read even more important. We
>  	 * want to flush the TLBs only after we're certain all the PTE updates
> @@ -504,14 +511,14 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
>  	POSTING_READ(GFX_FLSH_CNTL_GEN6);
>  }
>  
> -static void gen6_ggtt_clear_range(struct drm_device *dev,
> +static void gen6_ggtt_clear_range(struct i915_address_space *vm,
>  				  unsigned int first_entry,
>  				  unsigned int num_entries)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
>  	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
>  		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
> -	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
> +	const int max_entries = (vm->total >> PAGE_SHIFT) - first_entry;
>  	int i;
>  
>  	if (num_entries == 0)
> @@ -522,15 +529,15 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
>  		 first_entry, num_entries, max_entries))
>  		num_entries = max_entries;
>  
> -	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
> -					       I915_CACHE_LLC);
> +	scratch_pte = vm->pte_encode(vm->scratch.addr,
> +					  I915_CACHE_LLC);
>  	for (i = 0; i < num_entries; i++)
>  		iowrite32(scratch_pte, &gtt_base[i]);
>  	readl(gtt_base);
>  }
>  
>  
> -static void i915_ggtt_insert_entries(struct drm_device *dev,
> +static void i915_ggtt_insert_entries(struct i915_address_space *vm,
>  				     struct sg_table *st,
>  				     unsigned int pg_start,
>  				     enum i915_cache_level cache_level)
> @@ -542,7 +549,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
>  
>  }
>  
> -static void i915_ggtt_clear_range(struct drm_device *dev,
> +static void i915_ggtt_clear_range(struct i915_address_space *vm,
>  				  unsigned int first_entry,
>  				  unsigned int num_entries)
>  {
> @@ -559,9 +566,9 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
> -					 obj->gtt_space->start >> PAGE_SHIFT,
> -					 cache_level);
> +	i915_gtt_vm->insert_entries(&dev_priv->gtt.base, obj->pages,
> +					  obj->gtt_space->start >> PAGE_SHIFT,
> +					  cache_level);
>  
>  	obj->has_global_gtt_mapping = 1;
>  }
> @@ -571,9 +578,9 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	dev_priv->gtt.gtt_clear_range(obj->base.dev,
> -				      obj->gtt_space->start >> PAGE_SHIFT,
> -				      obj->base.size >> PAGE_SHIFT);
> +	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
> +				       obj->gtt_space->start >> PAGE_SHIFT,
> +				       obj->base.size >> PAGE_SHIFT);
>  
>  	obj->has_global_gtt_mapping = 0;
>  }
> @@ -679,21 +686,21 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
>  		obj->has_global_gtt_mapping = 1;
>  	}
>  
> -	dev_priv->gtt.start = start;
> -	dev_priv->gtt.total = end - start;
> +	i915_gtt_vm->start = start;
> +	i915_gtt_vm->total = end - start;
>  
>  	/* Clear any non-preallocated blocks */
>  	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
>  			     hole_start, hole_end) {
>  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
>  			      hole_start, hole_end);
> -		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
> -					      (hole_end-hole_start) / PAGE_SIZE);
> +		i915_gtt_vm->clear_range(i915_gtt_vm, hole_start / PAGE_SIZE,
> +				     (hole_end-hole_start) / PAGE_SIZE);
>  	}
>  
>  	/* And finally clear the reserved guard page */
> -	dev_priv->gtt.gtt_clear_range(dev, (end - guard_size) / PAGE_SIZE,
> -				      guard_size / PAGE_SIZE);
> +	i915_gtt_vm->clear_range(i915_gtt_vm, (end - guard_size) / PAGE_SIZE,
> +				 guard_size / PAGE_SIZE);
>  }
>  
>  static int setup_scratch_page(struct drm_device *dev)
> @@ -716,8 +723,8 @@ static int setup_scratch_page(struct drm_device *dev)
>  #else
>  	dma_addr = page_to_phys(page);
>  #endif
> -	dev_priv->gtt.scratch.page = page;
> -	dev_priv->gtt.scratch.addr = dma_addr;
> +	i915_gtt_vm->scratch.page = page;
> +	i915_gtt_vm->scratch.addr = dma_addr;
>  
>  	return 0;
>  }
> @@ -725,11 +732,12 @@ static int setup_scratch_page(struct drm_device *dev)
>  static void teardown_scratch_page(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	set_pages_wb(dev_priv->gtt.scratch.page, 1);
> -	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
> +
> +	set_pages_wb(i915_gtt_vm->scratch.page, 1);
> +	pci_unmap_page(dev->pdev, i915_gtt_vm->scratch.addr,
>  		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> -	put_page(dev_priv->gtt.scratch.page);
> -	__free_page(dev_priv->gtt.scratch.page);
> +	put_page(i915_gtt_vm->scratch.page);
> +	__free_page(i915_gtt_vm->scratch.page);
>  }
>  
>  static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
> @@ -792,8 +800,8 @@ static int gen6_gmch_probe(struct drm_device *dev,
>  	if (ret)
>  		DRM_ERROR("Scratch setup failed\n");
>  
> -	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
> -	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
> +	i915_gtt_vm->clear_range = gen6_ggtt_clear_range;
> +	i915_gtt_vm->insert_entries = gen6_ggtt_insert_entries;
>  
>  	return ret;
>  }
> @@ -823,8 +831,8 @@ static int i915_gmch_probe(struct drm_device *dev,
>  	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
>  
>  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> -	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
> -	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
> +	i915_gtt_vm->clear_range = i915_ggtt_clear_range;
> +	i915_gtt_vm->insert_entries = i915_ggtt_insert_entries;
>  
>  	return 0;
>  }
> @@ -847,20 +855,23 @@ int i915_gem_gtt_init(struct drm_device *dev)
>  		gtt->gtt_probe = gen6_gmch_probe;
>  		gtt->gtt_remove = gen6_gmch_remove;
>  		if (IS_HASWELL(dev))
> -			gtt->pte_encode = hsw_pte_encode;
> +			gtt->base.pte_encode = hsw_pte_encode;
>  		else if (IS_VALLEYVIEW(dev))
> -			gtt->pte_encode = byt_pte_encode;
> +			gtt->base.pte_encode = byt_pte_encode;
>  		else
> -			gtt->pte_encode = gen6_pte_encode;
> +			gtt->base.pte_encode = gen6_pte_encode;
>  	}
>  
> -	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
> +	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
>  			     &gtt->mappable_base, &gtt->mappable_end);
>  	if (ret)
>  		return ret;
>  
> +	gtt->base.dev = dev;
> +
>  	/* GMADR is the PCI mmio aperture into the global GTT. */
> -	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
> +	DRM_INFO("Memory usable by graphics device = %zdM\n",
> +		 gtt->base.total >> 20);
>  	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
>  	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
>  
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 23/66] drm/i915: Move stolen stuff to i915_gtt
  2013-06-27 23:30 ` [PATCH 23/66] drm/i915: Move stolen stuff " Ben Widawsky
@ 2013-06-30 13:18   ` Daniel Vetter
  2013-07-01 18:43     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 13:18 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:24PM -0700, Ben Widawsky wrote:
> It doesn't apply to a generic VMA, so it belongs with the gtt.
> 
> for file in `ls drivers/gpu/drm/i915/*.c` ; do
> 	sed -i "s/mm.stolen_base/gtt.stolen_base/" $file;
> done
> 
> for file in `ls drivers/gpu/drm/i915/*.c` ; do
> 	sed -i "s/mm.stolen/gtt.stolen/" $file;
> done
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

Before I keep on merging I'd like to clarify the plan a bit: afaics the
goal is to extract the useful stuff shared between the global gtt and
ppgtt into i915_address_space. But I'm a bit unclear on what
dev_priv->mm will hold in the end, so I'm not sure whether moving stolen
around makes sense.

Can you please elaborate a bit on how dev_priv->gtt and dev_priv->mm
will relate in the end?
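
In my head the end state looks very roughly like this (just guessing
from what this patch moves around; consider all names made up):

struct i915_gtt {
	struct i915_address_space base;

	/* global gtt specifics only: stolen allocator + base, mappable
	 * aperture, gsm, ... */
	struct drm_mm stolen;
	unsigned long stolen_base;
};

struct i915_gem_mm {
	/* vm-agnostic object bookkeeping only: bound/unbound lists,
	 * shrinker, fence lru, ... */
};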

Thanks, Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h        |  8 +++-----
>  drivers/gpu/drm/i915/i915_gem_stolen.c | 32 ++++++++++++++++----------------
>  drivers/gpu/drm/i915/i915_irq.c        |  2 +-
>  drivers/gpu/drm/i915/intel_pm.c        |  4 ++--
>  4 files changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index f428076..7016074 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -473,6 +473,9 @@ struct i915_address_space {
>   */
>  struct i915_gtt {
>  	struct i915_address_space base;
> +
> +	struct drm_mm stolen;
> +	unsigned long stolen_base; /* limited to low memory (32-bit) */
>  	size_t stolen_size;		/* Total size of stolen memory */
>  
>  	unsigned long mappable_end;	/* End offset that we can CPU map */
> @@ -828,8 +831,6 @@ struct intel_l3_parity {
>  };
>  
>  struct i915_gem_mm {
> -	/** Memory allocator for GTT stolen memory */
> -	struct drm_mm stolen;
>  	/** Memory allocator for GTT */
>  	struct drm_mm gtt_space;
>  	/** List of all objects in gtt_space. Used to restore gtt
> @@ -842,9 +843,6 @@ struct i915_gem_mm {
>  	 */
>  	struct list_head unbound_list;
>  
> -	/** Usable portion of the GTT for GEM */
> -	unsigned long stolen_base; /* limited to low memory (32-bit) */
> -
>  	/** PPGTT used for aliasing the PPGTT with the GTT */
>  	struct i915_hw_ppgtt *aliasing_ppgtt;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 8e02344..fd812d5 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -97,10 +97,10 @@ static int i915_setup_compression(struct drm_device *dev, int size)
>  	struct drm_mm_node *compressed_fb, *uninitialized_var(compressed_llb);
>  
>  	/* Try to over-allocate to reduce reallocations and fragmentation */
> -	compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
> +	compressed_fb = drm_mm_search_free(&dev_priv->gtt.stolen,
>  					   size <<= 1, 4096, 0);
>  	if (!compressed_fb)
> -		compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen,
> +		compressed_fb = drm_mm_search_free(&dev_priv->gtt.stolen,
>  						   size >>= 1, 4096, 0);
>  	if (compressed_fb)
>  		compressed_fb = drm_mm_get_block(compressed_fb, size, 4096);
> @@ -112,7 +112,7 @@ static int i915_setup_compression(struct drm_device *dev, int size)
>  	else if (IS_GM45(dev)) {
>  		I915_WRITE(DPFC_CB_BASE, compressed_fb->start);
>  	} else {
> -		compressed_llb = drm_mm_search_free(&dev_priv->mm.stolen,
> +		compressed_llb = drm_mm_search_free(&dev_priv->gtt.stolen,
>  						    4096, 4096, 0);
>  		if (compressed_llb)
>  			compressed_llb = drm_mm_get_block(compressed_llb,
> @@ -123,9 +123,9 @@ static int i915_setup_compression(struct drm_device *dev, int size)
>  		dev_priv->fbc.compressed_llb = compressed_llb;
>  
>  		I915_WRITE(FBC_CFB_BASE,
> -			   dev_priv->mm.stolen_base + compressed_fb->start);
> +			   dev_priv->gtt.stolen_base + compressed_fb->start);
>  		I915_WRITE(FBC_LL_BASE,
> -			   dev_priv->mm.stolen_base + compressed_llb->start);
> +			   dev_priv->gtt.stolen_base + compressed_llb->start);
>  	}
>  
>  	dev_priv->fbc.compressed_fb = compressed_fb;
> @@ -147,7 +147,7 @@ int i915_gem_stolen_setup_compression(struct drm_device *dev, int size)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	if (dev_priv->mm.stolen_base == 0)
> +	if (dev_priv->gtt.stolen_base == 0)
>  		return -ENODEV;
>  
>  	if (size < dev_priv->fbc.size)
> @@ -180,7 +180,7 @@ void i915_gem_cleanup_stolen(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
>  	i915_gem_stolen_cleanup_compression(dev);
> -	drm_mm_takedown(&dev_priv->mm.stolen);
> +	drm_mm_takedown(&dev_priv->gtt.stolen);
>  }
>  
>  int i915_gem_init_stolen(struct drm_device *dev)
> @@ -188,18 +188,18 @@ int i915_gem_init_stolen(struct drm_device *dev)
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	int bios_reserved = 0;
>  
> -	dev_priv->mm.stolen_base = i915_stolen_to_physical(dev);
> -	if (dev_priv->mm.stolen_base == 0)
> +	dev_priv->gtt.stolen_base = i915_stolen_to_physical(dev);
> +	if (dev_priv->gtt.stolen_base == 0)
>  		return 0;
>  
>  	DRM_DEBUG_KMS("found %zd bytes of stolen memory at %08lx\n",
> -		      dev_priv->gtt.stolen_size, dev_priv->mm.stolen_base);
> +		      dev_priv->gtt.stolen_size, dev_priv->gtt.stolen_base);
>  
>  	if (IS_VALLEYVIEW(dev))
>  		bios_reserved = 1024*1024; /* top 1M on VLV/BYT */
>  
>  	/* Basic memrange allocator for stolen space */
> -	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->gtt.stolen_size -
> +	drm_mm_init(&dev_priv->gtt.stolen, 0, dev_priv->gtt.stolen_size -
>  		    bios_reserved);
>  
>  	return 0;
> @@ -234,7 +234,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
>  	sg->offset = offset;
>  	sg->length = size;
>  
> -	sg_dma_address(sg) = (dma_addr_t)dev_priv->mm.stolen_base + offset;
> +	sg_dma_address(sg) = (dma_addr_t)dev_priv->gtt.stolen_base + offset;
>  	sg_dma_len(sg) = size;
>  
>  	return st;
> @@ -300,14 +300,14 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
>  	struct drm_i915_gem_object *obj;
>  	struct drm_mm_node *stolen;
>  
> -	if (dev_priv->mm.stolen_base == 0)
> +	if (dev_priv->gtt.stolen_base == 0)
>  		return NULL;
>  
>  	DRM_DEBUG_KMS("creating stolen object: size=%x\n", size);
>  	if (size == 0)
>  		return NULL;
>  
> -	stolen = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
> +	stolen = drm_mm_search_free(&dev_priv->gtt.stolen, size, 4096, 0);
>  	if (stolen)
>  		stolen = drm_mm_get_block(stolen, size, 4096);
>  	if (stolen == NULL)
> @@ -331,7 +331,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	struct drm_i915_gem_object *obj;
>  	struct drm_mm_node *stolen;
>  
> -	if (dev_priv->mm.stolen_base == 0)
> +	if (dev_priv->gtt.stolen_base == 0)
>  		return NULL;
>  
>  	DRM_DEBUG_KMS("creating preallocated stolen object: stolen_offset=%x, gtt_offset=%x, size=%x\n",
> @@ -344,7 +344,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	if (WARN_ON(size == 0))
>  		return NULL;
>  
> -	stolen = drm_mm_create_block(&dev_priv->mm.stolen,
> +	stolen = drm_mm_create_block(&dev_priv->gtt.stolen,
>  				     stolen_offset, size,
>  				     false);
>  	if (stolen == NULL) {
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index fa70fd0..1e25920 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1538,7 +1538,7 @@ i915_error_object_create_sized(struct drm_i915_private *dev_priv,
>  		} else if (src->stolen) {
>  			unsigned long offset;
>  
> -			offset = dev_priv->mm.stolen_base;
> +			offset = dev_priv->gtt.stolen_base;
>  			offset += src->stolen->start;
>  			offset += i << PAGE_SHIFT;
>  
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index d32734d..02f2dea 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -3464,7 +3464,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
>  		/* BIOS set it up already, grab the pre-alloc'd space */
>  		int pcbr_offset;
>  
> -		pcbr_offset = (pcbr & (~4095)) - dev_priv->mm.stolen_base;
> +		pcbr_offset = (pcbr & (~4095)) - dev_priv->gtt.stolen_base;
>  		pctx = i915_gem_object_create_stolen_for_preallocated(dev_priv->dev,
>  								      pcbr_offset,
>  								      -1,
> @@ -3486,7 +3486,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
>  		return;
>  	}
>  
> -	pctx_paddr = dev_priv->mm.stolen_base + pctx->stolen->start;
> +	pctx_paddr = dev_priv->gtt.stolen_base + pctx->stolen->start;
>  	I915_WRITE(VLV_PCBR, pctx_paddr);
>  
>  out:
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 24/66] drm/i915: Move aliasing_ppgtt
  2013-06-27 23:30 ` [PATCH 24/66] drm/i915: Move aliasing_ppgtt Ben Widawsky
@ 2013-06-30 13:27   ` Daniel Vetter
  2013-07-01 18:52     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 13:27 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:25PM -0700, Ben Widawsky wrote:
> for file in `ls drivers/gpu/drm/i915/*.c` ; do
> 	sed -i "s/mm.aliasing/gtt.aliasing/" $file;
> done

The commit message should explain _why_ we do something. Again I'm
asking since I'm unclear about how things all fit together and what the
different responsibilities are. I think I understand your design here to
make both real ppgtt work and keep aliasing ppgtt going, but I'd like
you to explain this in your own words ;-)

One thing which looks a bit peculiar at the end is that struct
i915_hw_ppgtt is actually used as the real ppgtt object (since it
subclasses i915_address_space). My original plan was that we'd add a new
struct i915_ppgtt {
	struct i915_address_space base;
	struct i915_hw_ppgtt hw_ppgtt;
};

To fit into your design the aliasing ppgtt pointer in dev_priv->gtt
would then only point at a hw_ppgtt struct, not the full deal with the
address space and everything else attached.
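
Roughly, and completely untested:

struct i915_ppgtt {
	struct i915_address_space base;	/* the real per-fd address space */
	struct i915_hw_ppgtt hw_ppgtt;	/* just the pd/pt hw state */
};

struct i915_gtt {
	struct i915_address_space base;

	/* aliasing ppgtt shares the ggtt address space, so hw state only */
	struct i915_hw_ppgtt *aliasing_ppgtt;
};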

Cheers, Daniel

> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++--
>  drivers/gpu/drm/i915/i915_dma.c            |  2 +-
>  drivers/gpu/drm/i915/i915_drv.h            |  6 +++---
>  drivers/gpu/drm/i915/i915_gem.c            | 12 ++++++------
>  drivers/gpu/drm/i915/i915_gem_context.c    |  4 ++--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 ++--
>  drivers/gpu/drm/i915/i915_gem_gtt.c        |  6 +++---
>  7 files changed, 19 insertions(+), 19 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index c10a690..f3c76ab 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -1816,8 +1816,8 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
>  		seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring)));
>  		seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring)));
>  	}
> -	if (dev_priv->mm.aliasing_ppgtt) {
> -		struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> +	if (dev_priv->gtt.aliasing_ppgtt) {
> +		struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
>  
>  		seq_printf(m, "aliasing PPGTT:\n");
>  		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd_offset);
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 3535ced..ef00847 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -977,7 +977,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
>  		value = HAS_LLC(dev);
>  		break;
>  	case I915_PARAM_HAS_ALIASING_PPGTT:
> -		if (intel_enable_ppgtt(dev) && dev_priv->mm.aliasing_ppgtt)
> +		if (intel_enable_ppgtt(dev) && dev_priv->gtt.aliasing_ppgtt)
>  			value = 1;
>  		break;
>  	case I915_PARAM_HAS_WAIT_TIMEOUT:
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 7016074..0fa7a21 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -482,6 +482,9 @@ struct i915_gtt {
>  	struct io_mapping *mappable;	/* Mapping to our CPU mappable region */
>  	phys_addr_t mappable_base;	/* PA of our GMADR */
>  
> +	/** PPGTT used for aliasing the PPGTT with the GTT */
> +	struct i915_hw_ppgtt *aliasing_ppgtt;
> +
>  	/** "Graphics Stolen Memory" holds the global PTEs */
>  	void __iomem *gsm;
>  
> @@ -843,9 +846,6 @@ struct i915_gem_mm {
>  	 */
>  	struct list_head unbound_list;
>  
> -	/** PPGTT used for aliasing the PPGTT with the GTT */
> -	struct i915_hw_ppgtt *aliasing_ppgtt;
> -
>  	struct shrinker inactive_shrinker;
>  	bool shrinker_no_lock_stealing;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index e31ed47..eb78c5b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2620,7 +2620,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
>  	if (obj->has_global_gtt_mapping)
>  		i915_gem_gtt_unbind_object(obj);
>  	if (obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> +		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
>  		obj->has_aliasing_ppgtt_mapping = 0;
>  	}
>  	i915_gem_gtt_finish_object(obj);
> @@ -3359,7 +3359,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
>  		if (obj->has_global_gtt_mapping)
>  			i915_gem_gtt_bind_object(obj, cache_level);
>  		if (obj->has_aliasing_ppgtt_mapping)
> -			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> +			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
>  					       obj, cache_level);
>  
>  		obj->gtt_space->color = cache_level;
> @@ -3668,7 +3668,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  		if (ret)
>  			return ret;
>  
> -		if (!dev_priv->mm.aliasing_ppgtt)
> +		if (!dev_priv->gtt.aliasing_ppgtt)
>  			i915_gem_gtt_bind_object(obj, obj->cache_level);
>  	}
>  
> @@ -4191,10 +4191,10 @@ i915_gem_init_hw(struct drm_device *dev)
>  	 * the do_switch), but before enabling PPGTT. So don't move this.
>  	 */
>  	ret = i915_gem_context_enable(dev_priv);
> -	if (ret || !dev_priv->mm.aliasing_ppgtt)
> +	if (ret || !dev_priv->gtt.aliasing_ppgtt)
>  		goto disable_ctx_out;
>  
> -	ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
> +	ret = dev_priv->gtt.aliasing_ppgtt->enable(dev);
>  	if (ret)
>  		goto disable_ctx_out;
>  
> @@ -4236,7 +4236,7 @@ int i915_gem_init(struct drm_device *dev)
>  		dev_priv->hw_contexts_disabled = true;
>  
>  ggtt_only:
> -	if (!dev_priv->mm.aliasing_ppgtt) {
> +	if (!dev_priv->gtt.aliasing_ppgtt) {
>  		if (HAS_HW_CONTEXTS(dev))
>  			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
>  		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index d92f121..aa4fc4a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -226,7 +226,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
>  	}
>  
>  	dev_priv->ring[RCS].default_context = ctx;
> -	dev_priv->mm.aliasing_ppgtt = &ctx->ppgtt;
> +	dev_priv->gtt.aliasing_ppgtt = &ctx->ppgtt;
>  
>  	DRM_DEBUG_DRIVER("Default HW context loaded\n");
>  	return 0;
> @@ -300,7 +300,7 @@ void i915_gem_context_fini(struct drm_device *dev)
>  	i915_gem_context_unreference(dctx);
>  	dev_priv->ring[RCS].default_context = NULL;
>  	dev_priv->ring[RCS].last_context = NULL;
> -	dev_priv->mm.aliasing_ppgtt = NULL;
> +	dev_priv->gtt.aliasing_ppgtt = NULL;
>  }
>  
>  int i915_gem_context_enable(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 7fcd6c0..93870bb 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -429,8 +429,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
>  	}
>  
>  	/* Ensure ppgtt mapping exists if needed */
> -	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> -		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> +	if (dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> +		i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
>  				       obj, obj->cache_level);
>  
>  		obj->has_aliasing_ppgtt_mapping = 1;
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 6de75c7..18820cb 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -127,7 +127,7 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
>  	drm_i915_private_t *dev_priv = dev->dev_private;
>  	uint32_t pd_offset;
>  	struct intel_ring_buffer *ring;
> -	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> +	struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
>  	int i;
>  
>  	BUG_ON(ppgtt->pd_offset & 0x3f);
> @@ -445,8 +445,8 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
>  				       i915_gtt_vm->start / PAGE_SIZE,
>  				       i915_gtt_vm->total / PAGE_SIZE);
>  
> -	if (dev_priv->mm.aliasing_ppgtt)
> -		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> +	if (dev_priv->gtt.aliasing_ppgtt)
> +		gen6_write_pdes(dev_priv->gtt.aliasing_ppgtt);
>  
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
>  		i915_gem_clflush_object(obj);
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 40/66] drm/i915: Track all VMAs per VM
  2013-06-27 23:30 ` [PATCH 40/66] drm/i915: Track all VMAs per VM Ben Widawsky
@ 2013-06-30 15:35   ` Daniel Vetter
  2013-07-01 19:04     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 15:35 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:41PM -0700, Ben Widawsky wrote:
> This allows us to be aware of all the VMAs left over at teardown, and is
> useful for debug. I suspect it will prove even more useful later.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h | 2 ++
>  drivers/gpu/drm/i915/i915_gem.c | 4 ++++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 247a124..0bc4251 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -446,6 +446,7 @@ struct i915_address_space {
>  	struct drm_mm mm;
>  	struct drm_device *dev;
>  	struct list_head global_link;
> +	struct list_head vma_list;

This one feels a bit unnecessary. With the drm_mm_node embedded we already
have a total of 4 lists:
- The node_list in the drm_mm. There's even a for_each helper for it. This
  lists nodes in ascending offset order. We only need to upcast from the
  drm_mm_node to our vma, but since it's embedded that's no problem (see
  the sketch below).
- The hole list in drm_mm. Again comes with a for_each helper included.
- The inactive/active lists. Together they again list all vmas in a vm.
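
E.g. walking every vma in a vm without the new list would look roughly
like this (untested, assuming the vma embeds its drm_mm_node as ->node):

	struct drm_mm_node *entry;

	drm_mm_for_each_node(entry, &vm->mm) {
		struct i915_vma *vma =
			container_of(entry, struct i915_vma, node);
		/* inspect or tear down the vma here */
	}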

What's the new one doing that we need it so much?
-Daniel

>  	unsigned long start;		/* Start offset always 0 for dri2 */
>  	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
>  
> @@ -556,6 +557,7 @@ struct i915_vma {
>  	struct list_head mm_list;
>  
>  	struct list_head vma_link; /* Link in the object's VMA list */
> +	struct list_head per_vm_link; /* Link in the VM's VMA list */
>  };
>  
>  struct i915_ctx_hang_stats {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a3e8c26..5c0ad6a 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4112,14 +4112,17 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  
>  	INIT_LIST_HEAD(&vma->vma_link);
>  	INIT_LIST_HEAD(&vma->mm_list);
> +	INIT_LIST_HEAD(&vma->per_vm_link);
>  	vma->vm = vm;
>  	vma->obj = obj;
> +	list_add_tail(&vma->per_vm_link, &vm->vma_list);
>  
>  	return vma;
>  }
>  
>  void i915_gem_vma_destroy(struct i915_vma *vma)
>  {
> +	list_del(&vma->per_vm_link);
>  	WARN_ON(vma->node.allocated);
>  	kfree(vma);
>  }
> @@ -4473,6 +4476,7 @@ static void i915_init_vm(struct drm_i915_private *dev_priv,
>  	INIT_LIST_HEAD(&vm->active_list);
>  	INIT_LIST_HEAD(&vm->inactive_list);
>  	INIT_LIST_HEAD(&vm->global_link);
> +	INIT_LIST_HEAD(&vm->vma_list);
>  	list_add(&vm->global_link, &dev_priv->vm_list);
>  }
>  
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 26/66] drm/i915: Move active/inactive lists to new mm
  2013-06-27 23:30 ` [PATCH 26/66] drm/i915: Move active/inactive lists to new mm Ben Widawsky
@ 2013-06-30 15:38   ` Daniel Vetter
  2013-07-01 22:56     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-06-30 15:38 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:27PM -0700, Ben Widawsky wrote:
> for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.inactive_list/i915_gtt_mm-\>inactive_list/" $file; done
> for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.active_list/i915_gtt_mm-\>active_list/" $file; done
> 
> I've also opted to move the comments out of line a bit so one can get a
> better picture of what the various lists do.

Bikeshed: That now makes you inconsistent with all the other in-detail
structure member comments we have. And I don't see how it looks better,
so I'd vote to keep things as-is with per-member comments.

> v2: Leave the bound list as a global one. (Chris, indirectly)
> 
> CC: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>

The real comment though is on the commit message: it fails to explain why
we want to move the active/inactive lists from mm/obj to the address
space/vma pair. I think I understand, but this should be explained in
more depth.

I think in the first commit which starts moving those lists and the
execution-tracking state you should also mention why some of the state
(e.g. the bound/unbound lists) is not moved.

Cheers, Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c    | 11 ++++----
>  drivers/gpu/drm/i915/i915_drv.h        | 49 ++++++++++++++--------------------
>  drivers/gpu/drm/i915/i915_gem.c        | 24 +++++++----------
>  drivers/gpu/drm/i915/i915_gem_debug.c  |  2 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c  | 10 +++----
>  drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
>  drivers/gpu/drm/i915/i915_irq.c        |  6 ++---
>  7 files changed, 46 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index f3c76ab..a0babc7 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -158,11 +158,11 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
>  	switch (list) {
>  	case ACTIVE_LIST:
>  		seq_printf(m, "Active:\n");
> -		head = &dev_priv->mm.active_list;
> +		head = &i915_gtt_vm->active_list;
>  		break;
>  	case INACTIVE_LIST:
>  		seq_printf(m, "Inactive:\n");
> -		head = &dev_priv->mm.inactive_list;
> +		head = &i915_gtt_vm->inactive_list;
>  		break;
>  	default:
>  		mutex_unlock(&dev->struct_mutex);
> @@ -247,12 +247,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&dev_priv->mm.active_list, mm_list);
> +	count_objects(&i915_gtt_vm->active_list, mm_list);
>  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> -	count_objects(&dev_priv->mm.inactive_list, mm_list);
> +	count_objects(&i915_gtt_vm->inactive_list, mm_list);
>  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
> @@ -1977,7 +1977,8 @@ i915_drop_caches_set(void *data, u64 val)
>  		i915_gem_retire_requests(dev);
>  
>  	if (val & DROP_BOUND) {
> -		list_for_each_entry_safe(obj, next, &dev_priv->mm.inactive_list, mm_list)
> +		list_for_each_entry_safe(obj, next, &i915_gtt_vm->inactive_list,
> +					 mm_list)
>  			if (obj->pin_count == 0) {
>  				ret = i915_gem_object_unbind(obj);
>  				if (ret)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e65cf57..0553410 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -448,6 +448,22 @@ struct i915_address_space {
>  	unsigned long start;		/* Start offset always 0 for dri2 */
>  	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
>  
> +/* We use many types of lists for object tracking:
> + *  active_list: List of objects currently involved in rendering.
> + *	Includes buffers having the contents of their GPU caches flushed, not
> + *	necessarily primitives. last_rendering_seqno represents when the
> + *	rendering involved will be completed. A reference is held on the buffer
> + *	while on this list.
> + *  inactive_list: LRU list of objects which are not in the ringbuffer
> + *	objects are ready to unbind but are still mapped.
> + *	last_rendering_seqno is 0 while an object is in this list.
> + *	A reference is not held on the buffer while on this list,
> + *	as merely being GTT-bound shouldn't prevent its being
> + *	freed, and we'll pull it off the list in the free path.
> + */
> +	struct list_head active_list;
> +	struct list_head inactive_list;
> +
>  	struct {
>  		dma_addr_t addr;
>  		struct page *page;
> @@ -835,42 +851,17 @@ struct intel_l3_parity {
>  };
>  
>  struct i915_gem_mm {
> -	/** List of all objects in gtt_space. Used to restore gtt
> -	 * mappings on resume */
> -	struct list_head bound_list;
>  	/**
> -	 * List of objects which are not bound to the GTT (thus
> -	 * are idle and not used by the GPU) but still have
> -	 * (presumably uncached) pages still attached.
> +	 * Lists of objects which are [not] bound to a VM. Unbound objects are
> +	 * idle are idle but still have (presumably uncached) pages still
> +	 * attached.
>  	 */
> +	struct list_head bound_list;
>  	struct list_head unbound_list;
>  
>  	struct shrinker inactive_shrinker;
>  	bool shrinker_no_lock_stealing;
>  
> -	/**
> -	 * List of objects currently involved in rendering.
> -	 *
> -	 * Includes buffers having the contents of their GPU caches
> -	 * flushed, not necessarily primitives.  last_rendering_seqno
> -	 * represents when the rendering involved will be completed.
> -	 *
> -	 * A reference is held on the buffer while on this list.
> -	 */
> -	struct list_head active_list;
> -
> -	/**
> -	 * LRU list of objects which are not in the ringbuffer and
> -	 * are ready to unbind, but are still in the GTT.
> -	 *
> -	 * last_rendering_seqno is 0 while an object is in this list.
> -	 *
> -	 * A reference is not held on the buffer while on this list,
> -	 * as merely being GTT-bound shouldn't prevent its being
> -	 * freed, and we'll pull it off the list in the free path.
> -	 */
> -	struct list_head inactive_list;
> -
>  	/** LRU list of objects with fence regs on them. */
>  	struct list_head fence_list;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 608b6b5..7da06df 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1706,7 +1706,7 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
>  	}
>  
>  	list_for_each_entry_safe(obj, next,
> -				 &dev_priv->mm.inactive_list,
> +				 &i915_gtt_vm->inactive_list,
>  				 mm_list) {
>  		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
>  		    i915_gem_object_unbind(obj) == 0 &&
> @@ -1881,7 +1881,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  	}
>  
>  	/* Move from whatever list we were on to the tail of execution. */
> -	list_move_tail(&obj->mm_list, &dev_priv->mm.active_list);
> +	list_move_tail(&obj->mm_list, &i915_gtt_vm->active_list);
>  	list_move_tail(&obj->ring_list, &ring->active_list);
>  
>  	obj->last_read_seqno = seqno;
> @@ -1909,7 +1909,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
>  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
>  	BUG_ON(!obj->active);
>  
> -	list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> +	list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
>  
>  	list_del_init(&obj->ring_list);
>  	obj->ring = NULL;
> @@ -2279,12 +2279,8 @@ bool i915_gem_reset(struct drm_device *dev)
>  	/* Move everything out of the GPU domains to ensure we do any
>  	 * necessary invalidation upon reuse.
>  	 */
> -	list_for_each_entry(obj,
> -			    &dev_priv->mm.inactive_list,
> -			    mm_list)
> -	{
> +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list)
>  		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> -	}
>  
>  	/* The fence registers are invalidated so clear them out */
>  	i915_gem_restore_fences(dev);
> @@ -3162,7 +3158,7 @@ search_free:
>  	}
>  
>  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> +	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
>  
>  	obj->gtt_space = node;
>  	obj->gtt_offset = node->start;
> @@ -3313,7 +3309,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  
>  	/* And bump the LRU for this access */
>  	if (i915_gem_object_is_inactive(obj))
> -		list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> +		list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
>  
>  	return 0;
>  }
> @@ -4291,7 +4287,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
>  		return ret;
>  	}
>  
> -	BUG_ON(!list_empty(&dev_priv->mm.active_list));
> +	BUG_ON(!list_empty(&i915_gtt_vm->active_list));
>  	mutex_unlock(&dev->struct_mutex);
>  
>  	ret = drm_irq_install(dev);
> @@ -4352,8 +4348,8 @@ i915_gem_load(struct drm_device *dev)
>  				  SLAB_HWCACHE_ALIGN,
>  				  NULL);
>  
> -	INIT_LIST_HEAD(&dev_priv->mm.active_list);
> -	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
> +	INIT_LIST_HEAD(&i915_gtt_vm->active_list);
> +	INIT_LIST_HEAD(&i915_gtt_vm->inactive_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
>  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> @@ -4652,7 +4648,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
>  		if (obj->pages_pin_count == 0)
>  			cnt += obj->base.size >> PAGE_SHIFT;
> -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, global_list)
> +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, global_list)
>  		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
>  			cnt += obj->base.size >> PAGE_SHIFT;
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
> index 582e6a5..bf945a3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_debug.c
> +++ b/drivers/gpu/drm/i915/i915_gem_debug.c
> @@ -97,7 +97,7 @@ i915_verify_lists(struct drm_device *dev)
>  		}
>  	}
>  
> -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, list) {
> +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, list) {
>  		if (obj->base.dev != dev ||
>  		    !atomic_read(&obj->base.refcount.refcount)) {
>  			DRM_ERROR("freed inactive %p\n", obj);
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 6e620f86..92856a2 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -86,7 +86,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  				 cache_level);
>  
>  	/* First see if there is a large enough contiguous idle region... */
> -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
> +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list) {
>  		if (mark_free(obj, &unwind_list))
>  			goto found;
>  	}
> @@ -95,7 +95,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
>  		goto none;
>  
>  	/* Now merge in the soon-to-be-expired objects... */
> -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
> +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
>  		if (mark_free(obj, &unwind_list))
>  			goto found;
>  	}
> @@ -158,8 +158,8 @@ i915_gem_evict_everything(struct drm_device *dev)
>  	bool lists_empty;
>  	int ret;
>  
> -	lists_empty = (list_empty(&dev_priv->mm.inactive_list) &&
> -		       list_empty(&dev_priv->mm.active_list));
> +	lists_empty = (list_empty(&i915_gtt_vm->inactive_list) &&
> +		       list_empty(&i915_gtt_vm->active_list));
>  	if (lists_empty)
>  		return -ENOSPC;
>  
> @@ -177,7 +177,7 @@ i915_gem_evict_everything(struct drm_device *dev)
>  
>  	/* Having flushed everything, unbind() should never raise an error */
>  	list_for_each_entry_safe(obj, next,
> -				 &dev_priv->mm.inactive_list, mm_list)
> +				 &i915_gtt_vm->inactive_list, mm_list)
>  		if (obj->pin_count == 0)
>  			WARN_ON(i915_gem_object_unbind(obj));
>  
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 49e8be7..3f6564d 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -384,7 +384,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>  	obj->has_global_gtt_mapping = 1;
>  
>  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> -	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> +	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
>  
>  	return obj;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 1e25920..5dc055a 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -1722,7 +1722,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
>  	}
>  
>  	seqno = ring->get_seqno(ring, false);
> -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
> +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
>  		if (obj->ring != ring)
>  			continue;
>  
> @@ -1857,7 +1857,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
>  	int i;
>  
>  	i = 0;
> -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
> +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list)
>  		i++;
>  	error->active_bo_count = i;
>  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> @@ -1877,7 +1877,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
>  		error->active_bo_count =
>  			capture_active_bo(error->active_bo,
>  					  error->active_bo_count,
> -					  &dev_priv->mm.active_list);
> +					  &i915_gtt_vm->active_list);
>  
>  	if (error->pinned_bo)
>  		error->pinned_bo_count =
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 57/66] drm/i915: Disallow pin with full ppgtt
  2013-06-30 11:36                 ` Daniel Vetter
@ 2013-07-01 18:27                   ` Ben Widawsky
  0 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 18:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 01:36:39PM +0200, Daniel Vetter wrote:
> On Sun, Jun 30, 2013 at 1:31 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > I respectfully disagree. The semantics of the pin ioctl remain useful
> > even with the ggtt/ppgtt split, and I think barring its use forever more
> > is unwise. Not that pinning is a good solution, just in some cases it
> > may be the only solution. (It has proven useful in the past, it is
> > likely to do so again.) All that we need to do is note that the offset
> > returned by pin is ggtt and the offsets used by execbuffer are ppgtt. So
> > keep pin-ioctl and fix the test not to assume that pin.offset is
> > meaningful with execbuffer after HAS_FULL_PPGTT.
> 
> I was eyeing the most minimal fix, but this is ok with me, too.
> -Daniel
>
So, just fix the test?

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 29/66] drm: pre allocate node for create_block
  2013-06-30 12:34   ` Daniel Vetter
@ 2013-07-01 18:30     ` Ben Widawsky
  0 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 18:30 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 02:34:43PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:30PM -0700, Ben Widawsky wrote:
> > For an upcoming patch where we introduce the i915 VMA, it's ideal to
> > have the drm_mm_node as part of the VMA struct (ie. it's pre-allocated).
> > Part of the conversion to VMAs is to kill off obj->gtt_space. Doing this
> > will break a bunch of code, but amongst them are 2 callers of
> > drm_mm_create_block(), both related to stolen memory.
> > 
> > As a side note, this patch has is able to leverage all the existing
> > drm_mm_put_block because the node is still kzalloc'd. When the
> > aforementioned VMA code comes into play, that too has to change.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Same here about cc'ing dri-devel. Furthermore I think it'd be nice to kill
> the interfaces from drm_mm.c which allocate the drm_mm_node themselves.
> The last user outside of drm/i915 is ttm; killing that one would also
> allow us to remove the (racy) preallocation madness.
> 
> So if you convert over all of drm/i915 to the preallocate functions which
> pass in the drm_mm_node, I'll volunteer myself to fix up ttm.
> -Daniel

I will do that and send it as a separate series.
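
For reference, the pattern I'll be converting to is roughly the
following (a sketch from memory; assuming drm_mm_insert_node keeps its
current signature):

	struct drm_mm_node *node;
	int ret;

	/* the caller preallocates the node instead of drm_mm doing it */
	node = kzalloc(sizeof(*node), GFP_KERNEL);
	if (node == NULL)
		return -ENOMEM;

	ret = drm_mm_insert_node(&dev_priv->mm.gtt_space, node,
				 size, alignment);
	if (ret) {
		kfree(node);
		return ret;
	}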
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-06-30 13:00   ` Daniel Vetter
@ 2013-07-01 18:32     ` Ben Widawsky
  2013-07-01 18:43       ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 18:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 03:00:05PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:31PM -0700, Ben Widawsky wrote:
> > This will be handy when we add VMs. It's not strictly necessary, but it
> > will make the code much cleaner.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> You're going to hate me, but this is patch-ordering fail. Imo this should be
> one of the very first patches, at least before you kill obj->gtt_offset.
> 
> To increase your hatred some more, I have bikesheds on the names, too.
> 
> I think the best would be to respin this patch and merge it right away.
> It'll cause tons of conflicts. But keeping it as no. 30 in this series
> will be even worse, since merging the first 30 patches won't happen
> instantly. So much more potential for rebase hell imo.
> 
> The MO for when you stumble over such a giant renaming operation should
> imo be to submit the "add inline abstraction functions" patch(es) right away.
> That way everyone else who potentially works in the same area also gets a
> heads up.
> 
> 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index bc80ce0..56d47bc 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1349,6 +1349,27 @@ struct drm_i915_gem_object {
> >  
> >  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
> >  
> > +static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
> > +{
> > +	return o->gtt_space->start;
> > +}
> 
> To differentiate from the ppgtt offset I'd call this
> i915_gem_obj_ggtt_offset.
> 
> > +
> > +static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
> > +{
> > +	return o->gtt_space != NULL;
> > +}
> 
> Same here, I think we want  ggtt inserted.
> 
> > +
> > +static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
> > +{
> > +	return o->gtt_space->size;
> > +}
> 
> This is even more misleading and the real reason I vote for all the ggtt
> bikesheds: ggtt_size != obj->size is very much possible (on gen2/3 only
> though). We use that to satisfy alignment/size constraints on tiled
> objects. So the i915_gem_obj_ggtt_size rename is mandatory here.
> 
> > +
> > +static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
> > +					  enum i915_cache_level color)
> > +{
> > +	o->gtt_space->color = color;
> > +}
> 
> Dito for consistency.
> 
> Cheers, Daniel
> 
>
All of this is addressed in future patches. As we've discussed, I think
I'll have to respin it anyway, so I'll name it as such upfront. To me it
felt a little weird to start calling things "ggtt" before I made the
separation.
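
Concretely, the respun helpers should end up looking something like this
(names per your suggestion, bodies unchanged from this patch):

static inline unsigned long
i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
{
	return o->gtt_space->start;
}

static inline bool
i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
{
	return o->gtt_space != NULL;
}

static inline unsigned long
i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
{
	return o->gtt_space->size;
}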

[snip]

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-06-30 13:12   ` Daniel Vetter
@ 2013-07-01 18:40     ` Ben Widawsky
  2013-07-01 18:48       ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 18:40 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 03:12:35PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:22PM -0700, Ben Widawsky wrote:
> > The GTT and PPGTT can be thought of more generally as GPU address
> > spaces. Many of their actions (insert entries), state (LRU lists) and
> > many of their characteristics (size), can be shared. Do that.
> > 
> > Created a i915_gtt_vm helper macro since for now we always want the
> > regular GTT address space. Eventually we'll ween ourselves off of using
> > this except in cases where we obviously want the GGTT (like display).
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> The i915_gtt_vm #define is imo too evil. Looks like a local variable, but
> isn't. I think in most places we should just drop it, in others we should
> add a real local vm variable. I'll punt for now on this one.
> -Daniel

It's dropped later in the series. It was a temporary bandaid to make the
diffs a bit easier to swallow. I can certainly get rid of it next time
around.
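
I.e. at each call site, instead of the #define, something like:

	struct i915_address_space *vm = &dev_priv->gtt.base;

	args->aper_size = vm->total;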

> 
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c |   4 +-
> >  drivers/gpu/drm/i915/i915_drv.h     |  48 ++++++------
> >  drivers/gpu/drm/i915/i915_gem.c     |   8 +-
> >  drivers/gpu/drm/i915/i915_gem_gtt.c | 145 +++++++++++++++++++-----------------
> >  4 files changed, 110 insertions(+), 95 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index e654bf4..c10a690 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -287,8 +287,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
> >  		   count, size);
> >  
> >  	seq_printf(m, "%zu [%lu] gtt total\n",
> > -		   dev_priv->gtt.total,
> > -		   dev_priv->gtt.mappable_end - dev_priv->gtt.start);
> > +		   i915_gtt_vm->total,
> > +		   dev_priv->gtt.mappable_end - i915_gtt_vm->start);
> >  
> >  	seq_printf(m, "\n");
> >  	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 21cf593..7f4c9b6 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -442,6 +442,28 @@ enum i915_cache_level {
> >  
> >  typedef uint32_t gen6_gtt_pte_t;
> >  
> > +struct i915_address_space {
> > +	struct drm_device *dev;
> > +	unsigned long start;		/* Start offset always 0 for dri2 */
> > +	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> > +
> > +	struct {
> > +		dma_addr_t addr;
> > +		struct page *page;
> > +	} scratch;
> > +
> > +	/* FIXME: Need a more generic return type */
> > +	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > +				     enum i915_cache_level level);
> > +	void (*clear_range)(struct i915_address_space *i915_mm,
> > +			    unsigned int first_entry,
> > +			    unsigned int num_entries);
> > +	void (*insert_entries)(struct i915_address_space *i915_mm,
> > +			       struct sg_table *st,
> > +			       unsigned int first_entry,
> > +			       enum i915_cache_level cache_level);
> > +};
> > +
> >  /* The Graphics Translation Table is the way in which GEN hardware translates a
> >   * Graphics Virtual Address into a Physical Address. In addition to the normal
> >   * collateral associated with any va->pa translations GEN hardware also has a
> > @@ -450,8 +472,7 @@ typedef uint32_t gen6_gtt_pte_t;
> >   * the spec.
> >   */
> >  struct i915_gtt {
> > -	unsigned long start;		/* Start offset of used GTT */
> > -	size_t total;			/* Total size GTT can map */
> > +	struct i915_address_space base;
> >  	size_t stolen_size;		/* Total size of stolen memory */
> >  
> >  	unsigned long mappable_end;	/* End offset that we can CPU map */
> > @@ -472,34 +493,17 @@ struct i915_gtt {
> >  			  size_t *stolen, phys_addr_t *mappable_base,
> >  			  unsigned long *mappable_end);
> >  	void (*gtt_remove)(struct drm_device *dev);
> > -	void (*gtt_clear_range)(struct drm_device *dev,
> > -				unsigned int first_entry,
> > -				unsigned int num_entries);
> > -	void (*gtt_insert_entries)(struct drm_device *dev,
> > -				   struct sg_table *st,
> > -				   unsigned int pg_start,
> > -				   enum i915_cache_level cache_level);
> > -	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> > -				     enum i915_cache_level level);
> >  };
> > -#define gtt_total_entries(gtt) ((gtt).total >> PAGE_SHIFT)
> > +#define i915_gtt_vm ((struct i915_address_space *)&(dev_priv->gtt.base))
> >  
> >  struct i915_hw_ppgtt {
> > +	struct i915_address_space base;
> >  	struct drm_mm_node node;
> > -	struct drm_device *dev;
> >  	unsigned num_pd_entries;
> >  	struct page **pt_pages;
> >  	uint32_t pd_offset;
> >  	dma_addr_t *pt_dma_addr;
> >  
> > -	/* pte functions, mirroring the interface of the global gtt. */
> > -	void (*clear_range)(struct i915_hw_ppgtt *ppgtt,
> > -			    unsigned int first_entry,
> > -			    unsigned int num_entries);
> > -	void (*insert_entries)(struct i915_hw_ppgtt *ppgtt,
> > -			       struct sg_table *st,
> > -			       unsigned int pg_start,
> > -			       enum i915_cache_level cache_level);
> >  	gen6_gtt_pte_t (*pte_encode)(dma_addr_t addr,
> >  				     enum i915_cache_level level);
> >  	int (*enable)(struct drm_device *dev);
> > @@ -1123,7 +1127,7 @@ typedef struct drm_i915_private {
> >  	enum modeset_restore modeset_restore;
> >  	struct mutex modeset_restore_lock;
> >  
> > -	struct i915_gtt gtt;
> > +	struct i915_gtt gtt; /* VMA representing the global address space */
> >  
> >  	struct i915_gem_mm mm;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index c96b422..e31ed47 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -181,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
> >  			pinned += obj->gtt_space->size;
> >  	mutex_unlock(&dev->struct_mutex);
> >  
> > -	args->aper_size = dev_priv->gtt.total;
> > +	args->aper_size = i915_gtt_vm->total;
> >  	args->aper_available_size = args->aper_size - pinned;
> >  
> >  	return 0;
> > @@ -3083,7 +3083,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
> >  	u32 size, fence_size, fence_alignment, unfenced_alignment;
> >  	bool mappable, fenceable;
> >  	size_t gtt_max = map_and_fenceable ?
> > -		dev_priv->gtt.mappable_end : dev_priv->gtt.total;
> > +		dev_priv->gtt.mappable_end : dev_priv->gtt.base.total;
> >  	int ret;
> >  
> >  	fence_size = i915_gem_get_gtt_size(dev,
> > @@ -4226,7 +4226,7 @@ int i915_gem_init(struct drm_device *dev)
> >  	 */
> >  	if (HAS_HW_CONTEXTS(dev)) {
> >  		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
> > -					  dev_priv->gtt.total, 0);
> > +					  i915_gtt_vm->total, 0);
> >  		i915_gem_context_init(dev);
> >  		if (dev_priv->hw_contexts_disabled) {
> >  			drm_mm_takedown(&dev_priv->mm.gtt_space);
> > @@ -4240,7 +4240,7 @@ ggtt_only:
> >  		if (HAS_HW_CONTEXTS(dev))
> >  			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
> >  		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
> > -					  dev_priv->gtt.total, PAGE_SIZE);
> > +					  i915_gtt_vm->total, PAGE_SIZE);
> >  	}
> >  
> >  	ret = i915_gem_init_hw(dev);
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index bb4ccb5..6de75c7 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -102,7 +102,7 @@ static gen6_gtt_pte_t hsw_pte_encode(dma_addr_t addr,
> >  
> >  static void gen6_write_pdes(struct i915_hw_ppgtt *ppgtt)
> >  {
> > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > +	struct drm_i915_private *dev_priv = ppgtt->base.dev->dev_private;
> >  	gen6_gtt_pte_t __iomem *pd_addr;
> >  	uint32_t pd_entry;
> >  	int i;
> > @@ -183,18 +183,18 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> >  }
> >  
> >  /* PPGTT support for Sandybdrige/Gen6 and later */
> > -static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> > +static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
> >  				   unsigned first_entry,
> >  				   unsigned num_entries)
> >  {
> > -	struct drm_i915_private *dev_priv = ppgtt->dev->dev_private;
> > +	struct i915_hw_ppgtt *ppgtt =
> > +		container_of(vm, struct i915_hw_ppgtt, base);
> >  	gen6_gtt_pte_t *pt_vaddr, scratch_pte;
> >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> >  	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> >  	unsigned last_pte, i;
> >  
> > -	scratch_pte = ppgtt->pte_encode(dev_priv->gtt.scratch.addr,
> > -					I915_CACHE_LLC);
> > +	scratch_pte = vm->pte_encode(vm->scratch.addr, I915_CACHE_LLC);
> >  
> >  	while (num_entries) {
> >  		last_pte = first_pte + num_entries;
> > @@ -214,11 +214,13 @@ static void gen6_ppgtt_clear_range(struct i915_hw_ppgtt *ppgtt,
> >  	}
> >  }
> >  
> > -static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> > +static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
> >  				      struct sg_table *pages,
> >  				      unsigned first_entry,
> >  				      enum i915_cache_level cache_level)
> >  {
> > +	struct i915_hw_ppgtt *ppgtt =
> > +		container_of(vm, struct i915_hw_ppgtt, base);
> >  	gen6_gtt_pte_t *pt_vaddr;
> >  	unsigned act_pt = first_entry / I915_PPGTT_PT_ENTRIES;
> >  	unsigned act_pte = first_entry % I915_PPGTT_PT_ENTRIES;
> > @@ -229,7 +231,7 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> >  		dma_addr_t page_addr;
> >  
> >  		page_addr = sg_page_iter_dma_address(&sg_iter);
> > -		pt_vaddr[act_pte] = ppgtt->pte_encode(page_addr, cache_level);
> > +		pt_vaddr[act_pte] = vm->pte_encode(page_addr, cache_level);
> >  		if (++act_pte == I915_PPGTT_PT_ENTRIES) {
> >  			kunmap_atomic(pt_vaddr);
> >  			act_pt++;
> > @@ -243,14 +245,14 @@ static void gen6_ppgtt_insert_entries(struct i915_hw_ppgtt *ppgtt,
> >  
> >  static void gen6_ppgtt_cleanup(struct i915_hw_ppgtt *ppgtt)
> >  {
> > +	struct i915_address_space *vm = &ppgtt->base;
> >  	int i;
> >  
> >  	drm_mm_remove_node(&ppgtt->node);
> >  
> >  	if (ppgtt->pt_dma_addr) {
> >  		for (i = 0; i < ppgtt->num_pd_entries; i++)
> > -			pci_unmap_page(ppgtt->dev->pdev,
> > -				       ppgtt->pt_dma_addr[i],
> > +			pci_unmap_page(vm->dev->pdev, ppgtt->pt_dma_addr[i],
> >  				       4096, PCI_DMA_BIDIRECTIONAL);
> >  	}
> >  
> > @@ -264,7 +266,8 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  {
> >  #define GEN6_PD_ALIGN (PAGE_SIZE * 16)
> >  #define GEN6_PD_SIZE (GEN6_PPGTT_PD_ENTRIES * PAGE_SIZE)
> > -	struct drm_device *dev = ppgtt->dev;
> > +	struct i915_address_space *vm = &ppgtt->base;
> > +	struct drm_device *dev = vm->dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  	int i;
> >  	int ret = -ENOMEM;
> > @@ -279,21 +282,22 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  						  &ppgtt->node, GEN6_PD_SIZE,
> >  						  GEN6_PD_ALIGN, 0,
> >  						  dev_priv->gtt.mappable_end,
> > -						  dev_priv->gtt.total,
> > +						  i915_gtt_vm->total,
> >  						  DRM_MM_TOPDOWN);
> >  	if (ret)
> >  		return ret;
> >  
> >  	if (IS_HASWELL(dev)) {
> > -		ppgtt->pte_encode = hsw_pte_encode;
> > +		vm->pte_encode = hsw_pte_encode;
> >  	} else if (IS_VALLEYVIEW(dev)) {
> > -		ppgtt->pte_encode = byt_pte_encode;
> > +		vm->pte_encode = byt_pte_encode;
> >  	} else {
> > -		ppgtt->pte_encode = gen6_pte_encode;
> > +		vm->pte_encode = gen6_pte_encode;
> >  	}
> >  
> >  	ppgtt->pt_pages = kzalloc(sizeof(struct page *)*GEN6_PPGTT_PD_ENTRIES,
> >  				  GFP_KERNEL);
> > +
> >  	if (!ppgtt->pt_pages) {
> >  		drm_mm_remove_node(&ppgtt->node);
> >  		return -ENOMEM;
> > @@ -326,12 +330,15 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
> >  
> >  	ppgtt->num_pd_entries = GEN6_PPGTT_PD_ENTRIES;
> >  	ppgtt->enable = gen6_ppgtt_enable;
> > -	ppgtt->clear_range = gen6_ppgtt_clear_range;
> > -	ppgtt->insert_entries = gen6_ppgtt_insert_entries;
> >  	ppgtt->cleanup = gen6_ppgtt_cleanup;
> >  
> > -	ppgtt->clear_range(ppgtt, 0,
> > -			   ppgtt->num_pd_entries*I915_PPGTT_PT_ENTRIES);
> > +	vm->clear_range = gen6_ppgtt_clear_range;
> > +	vm->insert_entries = gen6_ppgtt_insert_entries;
> > +	vm->start = 0;
> > +	vm->total = GEN6_PPGTT_PD_ENTRIES * I915_PPGTT_PT_ENTRIES * PAGE_SIZE;
> > +	vm->scratch = dev_priv->gtt.base.scratch;
> > +
> > +	vm->clear_range(vm, 0, ppgtt->num_pd_entries * I915_PPGTT_PT_ENTRIES);
> >  
> >  	DRM_DEBUG_DRIVER("Allocated pde space (%ldM) at GTT entry: %lx\n",
> >  			 ppgtt->node.size >> 20,
> > @@ -363,7 +370,7 @@ int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
> >  {
> >  	int ret;
> >  
> > -	ppgtt->dev = dev;
> > +	ppgtt->base.dev = dev;
> >  
> >  	if (INTEL_INFO(dev)->gen < 8)
> >  		ret = gen6_ppgtt_init(ppgtt);
> > @@ -377,17 +384,17 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
> >  			    struct drm_i915_gem_object *obj,
> >  			    enum i915_cache_level cache_level)
> >  {
> > -	ppgtt->insert_entries(ppgtt, obj->pages,
> > -			      obj->gtt_space->start >> PAGE_SHIFT,
> > -			      cache_level);
> > +	ppgtt->base.insert_entries(&ppgtt->base, obj->pages,
> > +				   obj->gtt_space->start >> PAGE_SHIFT,
> > +				   cache_level);
> >  }
> >  
> >  void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
> >  			      struct drm_i915_gem_object *obj)
> >  {
> > -	ppgtt->clear_range(ppgtt,
> > -			   obj->gtt_space->start >> PAGE_SHIFT,
> > -			   obj->base.size >> PAGE_SHIFT);
> > +	ppgtt->base.clear_range(&ppgtt->base,
> > +				obj->gtt_space->start >> PAGE_SHIFT,
> > +				obj->base.size >> PAGE_SHIFT);
> >  }
> >  
> >  extern int intel_iommu_gfx_mapped;
> > @@ -434,8 +441,9 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  	struct drm_i915_gem_object *obj;
> >  
> >  	/* First fill our portion of the GTT with scratch pages */
> > -	dev_priv->gtt.gtt_clear_range(dev, dev_priv->gtt.start / PAGE_SIZE,
> > -				      dev_priv->gtt.total / PAGE_SIZE);
> > +	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
> > +				       i915_gtt_vm->start / PAGE_SIZE,
> > +				       i915_gtt_vm->total / PAGE_SIZE);
> >  
> >  	if (dev_priv->mm.aliasing_ppgtt)
> >  		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > @@ -467,12 +475,12 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
> >   * within the global GTT as well as accessible by the GPU through the GMADR
> >   * mapped BAR (dev_priv->mm.gtt->gtt).
> >   */
> > -static void gen6_ggtt_insert_entries(struct drm_device *dev,
> > +static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
> >  				     struct sg_table *st,
> >  				     unsigned int first_entry,
> >  				     enum i915_cache_level level)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> >  	gen6_gtt_pte_t __iomem *gtt_entries =
> >  		(gen6_gtt_pte_t __iomem *)dev_priv->gtt.gsm + first_entry;
> >  	int i = 0;
> > @@ -481,8 +489,7 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> >  
> >  	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
> >  		addr = sg_page_iter_dma_address(&sg_iter);
> > -		iowrite32(dev_priv->gtt.pte_encode(addr, level),
> > -			  &gtt_entries[i]);
> > +		iowrite32(vm->pte_encode(addr, level), &gtt_entries[i]);
> >  		i++;
> >  	}
> >  
> > @@ -493,8 +500,8 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> >  	 * hardware should work, we must keep this posting read for paranoia.
> >  	 */
> >  	if (i != 0)
> > -		WARN_ON(readl(&gtt_entries[i-1])
> > -			!= dev_priv->gtt.pte_encode(addr, level));
> > +		WARN_ON(readl(&gtt_entries[i-1]) !=
> > +			vm->pte_encode(addr, level));
> >  
> >  	/* This next bit makes the above posting read even more important. We
> >  	 * want to flush the TLBs only after we're certain all the PTE updates
> > @@ -504,14 +511,14 @@ static void gen6_ggtt_insert_entries(struct drm_device *dev,
> >  	POSTING_READ(GFX_FLSH_CNTL_GEN6);
> >  }
> >  
> > -static void gen6_ggtt_clear_range(struct drm_device *dev,
> > +static void gen6_ggtt_clear_range(struct i915_address_space *vm,
> >  				  unsigned int first_entry,
> >  				  unsigned int num_entries)
> >  {
> > -	struct drm_i915_private *dev_priv = dev->dev_private;
> > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> >  	gen6_gtt_pte_t scratch_pte, __iomem *gtt_base =
> >  		(gen6_gtt_pte_t __iomem *) dev_priv->gtt.gsm + first_entry;
> > -	const int max_entries = gtt_total_entries(dev_priv->gtt) - first_entry;
> > +	const int max_entries = (vm->total >> PAGE_SHIFT) - first_entry;
> >  	int i;
> >  
> >  	if (num_entries == 0)
> > @@ -522,15 +529,15 @@ static void gen6_ggtt_clear_range(struct drm_device *dev,
> >  		 first_entry, num_entries, max_entries))
> >  		num_entries = max_entries;
> >  
> > -	scratch_pte = dev_priv->gtt.pte_encode(dev_priv->gtt.scratch.addr,
> > -					       I915_CACHE_LLC);
> > +	scratch_pte = vm->pte_encode(vm->scratch.addr,
> > +					  I915_CACHE_LLC);
> >  	for (i = 0; i < num_entries; i++)
> >  		iowrite32(scratch_pte, &gtt_base[i]);
> >  	readl(gtt_base);
> >  }
> >  
> >  
> > -static void i915_ggtt_insert_entries(struct drm_device *dev,
> > +static void i915_ggtt_insert_entries(struct i915_address_space *vm,
> >  				     struct sg_table *st,
> >  				     unsigned int pg_start,
> >  				     enum i915_cache_level cache_level)
> > @@ -542,7 +549,7 @@ static void i915_ggtt_insert_entries(struct drm_device *dev,
> >  
> >  }
> >  
> > -static void i915_ggtt_clear_range(struct drm_device *dev,
> > +static void i915_ggtt_clear_range(struct i915_address_space *vm,
> >  				  unsigned int first_entry,
> >  				  unsigned int num_entries)
> >  {
> > @@ -559,9 +566,9 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  
> > -	dev_priv->gtt.gtt_insert_entries(dev, obj->pages,
> > -					 obj->gtt_space->start >> PAGE_SHIFT,
> > -					 cache_level);
> > +	i915_gtt_vm->insert_entries(&dev_priv->gtt.base, obj->pages,
> > +					  obj->gtt_space->start >> PAGE_SHIFT,
> > +					  cache_level);
> >  
> >  	obj->has_global_gtt_mapping = 1;
> >  }
> > @@ -571,9 +578,9 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj)
> >  	struct drm_device *dev = obj->base.dev;
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> >  
> > -	dev_priv->gtt.gtt_clear_range(obj->base.dev,
> > -				      obj->gtt_space->start >> PAGE_SHIFT,
> > -				      obj->base.size >> PAGE_SHIFT);
> > +	i915_gtt_vm->clear_range(&dev_priv->gtt.base,
> > +				       obj->gtt_space->start >> PAGE_SHIFT,
> > +				       obj->base.size >> PAGE_SHIFT);
> >  
> >  	obj->has_global_gtt_mapping = 0;
> >  }
> > @@ -679,21 +686,21 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
> >  		obj->has_global_gtt_mapping = 1;
> >  	}
> >  
> > -	dev_priv->gtt.start = start;
> > -	dev_priv->gtt.total = end - start;
> > +	i915_gtt_vm->start = start;
> > +	i915_gtt_vm->total = end - start;
> >  
> >  	/* Clear any non-preallocated blocks */
> >  	drm_mm_for_each_hole(entry, &dev_priv->mm.gtt_space,
> >  			     hole_start, hole_end) {
> >  		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
> >  			      hole_start, hole_end);
> > -		dev_priv->gtt.gtt_clear_range(dev, hole_start / PAGE_SIZE,
> > -					      (hole_end-hole_start) / PAGE_SIZE);
> > +		i915_gtt_vm->clear_range(i915_gtt_vm, hole_start / PAGE_SIZE,
> > +				     (hole_end-hole_start) / PAGE_SIZE);
> >  	}
> >  
> >  	/* And finally clear the reserved guard page */
> > -	dev_priv->gtt.gtt_clear_range(dev, (end - guard_size) / PAGE_SIZE,
> > -				      guard_size / PAGE_SIZE);
> > +	i915_gtt_vm->clear_range(i915_gtt_vm, (end - guard_size) / PAGE_SIZE,
> > +				 guard_size / PAGE_SIZE);
> >  }
> >  
> >  static int setup_scratch_page(struct drm_device *dev)
> > @@ -716,8 +723,8 @@ static int setup_scratch_page(struct drm_device *dev)
> >  #else
> >  	dma_addr = page_to_phys(page);
> >  #endif
> > -	dev_priv->gtt.scratch.page = page;
> > -	dev_priv->gtt.scratch.addr = dma_addr;
> > +	i915_gtt_vm->scratch.page = page;
> > +	i915_gtt_vm->scratch.addr = dma_addr;
> >  
> >  	return 0;
> >  }
> > @@ -725,11 +732,12 @@ static int setup_scratch_page(struct drm_device *dev)
> >  static void teardown_scratch_page(struct drm_device *dev)
> >  {
> >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > -	set_pages_wb(dev_priv->gtt.scratch.page, 1);
> > -	pci_unmap_page(dev->pdev, dev_priv->gtt.scratch.addr,
> > +
> > +	set_pages_wb(i915_gtt_vm->scratch.page, 1);
> > +	pci_unmap_page(dev->pdev, i915_gtt_vm->scratch.addr,
> >  		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> > -	put_page(dev_priv->gtt.scratch.page);
> > -	__free_page(dev_priv->gtt.scratch.page);
> > +	put_page(i915_gtt_vm->scratch.page);
> > +	__free_page(i915_gtt_vm->scratch.page);
> >  }
> >  
> >  static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl)
> > @@ -792,8 +800,8 @@ static int gen6_gmch_probe(struct drm_device *dev,
> >  	if (ret)
> >  		DRM_ERROR("Scratch setup failed\n");
> >  
> > -	dev_priv->gtt.gtt_clear_range = gen6_ggtt_clear_range;
> > -	dev_priv->gtt.gtt_insert_entries = gen6_ggtt_insert_entries;
> > +	i915_gtt_vm->clear_range = gen6_ggtt_clear_range;
> > +	i915_gtt_vm->insert_entries = gen6_ggtt_insert_entries;
> >  
> >  	return ret;
> >  }
> > @@ -823,8 +831,8 @@ static int i915_gmch_probe(struct drm_device *dev,
> >  	intel_gtt_get(gtt_total, stolen, mappable_base, mappable_end);
> >  
> >  	dev_priv->gtt.do_idle_maps = needs_idle_maps(dev_priv->dev);
> > -	dev_priv->gtt.gtt_clear_range = i915_ggtt_clear_range;
> > -	dev_priv->gtt.gtt_insert_entries = i915_ggtt_insert_entries;
> > +	i915_gtt_vm->clear_range = i915_ggtt_clear_range;
> > +	i915_gtt_vm->insert_entries = i915_ggtt_insert_entries;
> >  
> >  	return 0;
> >  }
> > @@ -847,20 +855,23 @@ int i915_gem_gtt_init(struct drm_device *dev)
> >  		gtt->gtt_probe = gen6_gmch_probe;
> >  		gtt->gtt_remove = gen6_gmch_remove;
> >  		if (IS_HASWELL(dev))
> > -			gtt->pte_encode = hsw_pte_encode;
> > +			gtt->base.pte_encode = hsw_pte_encode;
> >  		else if (IS_VALLEYVIEW(dev))
> > -			gtt->pte_encode = byt_pte_encode;
> > +			gtt->base.pte_encode = byt_pte_encode;
> >  		else
> > -			gtt->pte_encode = gen6_pte_encode;
> > +			gtt->base.pte_encode = gen6_pte_encode;
> >  	}
> >  
> > -	ret = gtt->gtt_probe(dev, &gtt->total, &gtt->stolen_size,
> > +	ret = gtt->gtt_probe(dev, &gtt->base.total, &gtt->stolen_size,
> >  			     &gtt->mappable_base, &gtt->mappable_end);
> >  	if (ret)
> >  		return ret;
> >  
> > +	gtt->base.dev = dev;
> > +
> >  	/* GMADR is the PCI mmio aperture into the global GTT. */
> > -	DRM_INFO("Memory usable by graphics device = %zdM\n", gtt->total >> 20);
> > +	DRM_INFO("Memory usable by graphics device = %zdM\n",
> > +		 gtt->base.total >> 20);
> >  	DRM_DEBUG_DRIVER("GMADR size = %ldM\n", gtt->mappable_end >> 20);
> >  	DRM_DEBUG_DRIVER("GTT stolen size = %zdM\n", gtt->stolen_size >> 20);
> >  
> > -- 
> > 1.8.3.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 23/66] drm/i915: Move stolen stuff to i915_gtt
  2013-06-30 13:18   ` Daniel Vetter
@ 2013-07-01 18:43     ` Ben Widawsky
  2013-07-01 18:51       ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 18:43 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 03:18:46PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:24PM -0700, Ben Widawsky wrote:
> > It doesn't apply to generic VMA, so it belongs with the gtt.
> > 
> > for file in `ls drivers/gpu/drm/i915/*.c` ; do
> > 	sed -i "s/mm.stolen_base/gtt.stolen_base/" $file;
> > done
> > 
> > for file in `ls drivers/gpu/drm/i915/*.c` ; do
> > 	sed -i "s/mm.stolen/gtt.stolen/" $file;
> > done
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> Before I keep on merging I'd like to clarify the plan a bit: Afaics the
> goal is to extract useful stuff shared between global gtt and ppgtt into
> i915_address_space. But I'm a bit unclear what dev_priv->mm will hold in
> the end, so I'm not sure whether moving stolen around makes sense.
> 
> Can you please elaborate on your plan a bit on how dev_priv->gtt and
> dev_priv->mm will relate in the end?
> 
> Thanks, Daniel
>
This patch is leftover from when I completely removed mm. In the high
level abstraction, the stolen memory belongs to the GTT, so I decided to
keep it. There is no other reason for the patch. We can drop it if
nobody likes it.

[snip]
-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-07-01 18:32     ` Ben Widawsky
@ 2013-07-01 18:43       ` Daniel Vetter
  2013-07-01 19:08         ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 18:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 1, 2013 at 8:32 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sun, Jun 30, 2013 at 03:00:05PM +0200, Daniel Vetter wrote:
>> On Thu, Jun 27, 2013 at 04:30:31PM -0700, Ben Widawsky wrote:
>> > This will be handy when we add VMs. It's not strictly necessary, but it
>> > will make the code much cleaner.
>> >
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>
>> You're going to hate me, but this is patch ordering fail. Imo this should be
>> one of the very first patches, at least before you kill obj->gtt_offset.
>>
>> To increase your hatred some more, I have bikesheds on the names, too.
>>
>> I think the best would be to respin this patch and merge it right away.
>> It'll cause tons of conflicts. But keeping it as no. 30 in this series
>> will be even worse, since merging the first 30 patches won't happen
>> instantly. So much more potential for rebase hell imo.
>>
>> The MO for when you stumble over such a giant renaming operation should be
>> imo to submit the "add inline abstraction functions" patch(es) right away.
>> That way everyone else who potentially works in the same area also gets a
>> heads up.
>>
>>
>> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> > index bc80ce0..56d47bc 100644
>> > --- a/drivers/gpu/drm/i915/i915_drv.h
>> > +++ b/drivers/gpu/drm/i915/i915_drv.h
>> > @@ -1349,6 +1349,27 @@ struct drm_i915_gem_object {
>> >
>> >  #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
>> >
>> > +static inline unsigned long i915_gem_obj_offset(struct drm_i915_gem_object *o)
>> > +{
>> > +   return o->gtt_space->start;
>> > +}
>>
>> To differentiate from the ppgtt offset I'd call this
>> i915_gem_obj_ggtt_offset.
>>
>> > +
>> > +static inline bool i915_gem_obj_bound(struct drm_i915_gem_object *o)
>> > +{
>> > +   return o->gtt_space != NULL;
>> > +}
>>
>> Same here, I think we want  ggtt inserted.
>>
>> > +
>> > +static inline unsigned long i915_gem_obj_size(struct drm_i915_gem_object *o)
>> > +{
>> > +   return o->gtt_space->size;
>> > +}
>>
>> This is even more misleading and the real reason I vote for all the ggtt
>> bikesheds: ggtt_size != obj->size is very much possible (on gen2/3 only
>> though). We use that to satisfy alignment/size constraints on tiled
>> objects. So the i915_gem_obj_ggtt_size rename is mandatory here.
>>
>> > +
>> > +static inline void i915_gem_obj_set_color(struct drm_i915_gem_object *o,
>> > +                                     enum i915_cache_level color)
>> > +{
>> > +   o->gtt_space->color = color;
>> > +}
>>
>> Dito for consistency.
>>
>> Cheers, Daniel
>>
>>
> All of this is addressed in future patches. As we've discussed, I think
> I'll have to respin it anyway, so I'll name it as such upfront. To me it
> felt a little weird to start calling things "ggtt" before I made the
> separation.

I think now that we know what the end result should (more or less at
least) look like we can aim to make it right the first time we touch a
piece of code. That will reduce the churn in the patch series and so
make the beast easier to review.

Imo foreshadowing (to keep consistent with the "a patch series should
tell a story" analogy) is perfectly fine, and in many cases helps in
understanding the big picture of a large pile of patches.
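
Concretely, the renamed helpers would look roughly like this (a sketch
only; the gtt_space backing fields are unchanged from the quoted patch):

	static inline unsigned long
	i915_gem_obj_ggtt_offset(struct drm_i915_gem_object *o)
	{
		return o->gtt_space->start;
	}

	static inline bool i915_gem_obj_ggtt_bound(struct drm_i915_gem_object *o)
	{
		return o->gtt_space != NULL;
	}

	static inline unsigned long
	i915_gem_obj_ggtt_size(struct drm_i915_gem_object *o)
	{
		/* can exceed o->base.size for tiled objects on gen2/3 */
		return o->gtt_space->size;
	}
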
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella
  2013-07-01 18:40     ` Ben Widawsky
@ 2013-07-01 18:48       ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 18:48 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 1, 2013 at 8:40 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sun, Jun 30, 2013 at 03:12:35PM +0200, Daniel Vetter wrote:
>> On Thu, Jun 27, 2013 at 04:30:22PM -0700, Ben Widawsky wrote:
>> > The GTT and PPGTT can be thought of more generally as GPU address
>> > spaces. Many of their actions (insert entries), state (LRU lists) and
>> > many of their characteristics (size), can be shared. Do that.
>> >
>> > Created a i915_gtt_vm helper macro since for now we always want the
>> > regular GTT address space. Eventually we'll ween ourselves off of using
>> > this except in cases where we obviously want the GGTT (like display).
>> >
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>
>> The i915_gtt_vm #define is imo too evil. Looks like a local variable, but
>> isn't. I think in most places we should just drop it, in others we should
>> add a real local vm variable. I'll punt for now on this one.
>> -Daniel
>
> It's dropped later in the series. It was a temporary bandaid to make the
> diffs a bit easier to swallow. I can certainly get rid of it next time
> around.
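
(For reference, the bandaid in question is presumably just

	/* shorthand for the global GTT's embedded address space */
	#define i915_gtt_vm (&dev_priv->gtt.base)

which silently relies on a dev_priv local being in scope - hence the
"looks like a local variable, but isn't" complaint.)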

Again I think (hope) that now that we know what this should look like
in the end, we can cut some diff churn. So e.g. when you move
something from dev_priv->mm to i915_address_space you check each place
you touch in the end-state of your branch and
- either add a local variable if the specific function will look up
the relevant address_space object itself
- or add the right function argument right away if it gets the
address_space passed in from callers.

Equivalently for moving stuff from the gem object to the vma.

Since both address space and vma have pointers to the old place (i.e.
drm_device or gem_object) you can add a temporary variable if a
function newly gets an address_space/vma while the conversion is
still ongoing. That way we should not end up with needless churn in
function signatures, either.
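
As a sketch, with made-up frob_* names just to show the two shapes:

	/* option 1: the function looks up the address space itself */
	static void frob_global_gtt(struct drm_device *dev)
	{
		struct drm_i915_private *dev_priv = dev->dev_private;
		struct i915_address_space *vm = &dev_priv->gtt.base;

		vm->clear_range(vm, 0, vm->total >> PAGE_SHIFT);
	}

	/* option 2: the caller passes the address space in */
	static void frob_vm(struct i915_address_space *vm)
	{
		vm->clear_range(vm, 0, vm->total >> PAGE_SHIFT);
	}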

Again I think more foreshadowing here would be really nice, so that's
what I'm aiming for with this idea.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 23/66] drm/i915: Move stolen stuff to i915_gtt
  2013-07-01 18:43     ` Ben Widawsky
@ 2013-07-01 18:51       ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 18:51 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 1, 2013 at 8:43 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Sun, Jun 30, 2013 at 03:18:46PM +0200, Daniel Vetter wrote:
>> On Thu, Jun 27, 2013 at 04:30:24PM -0700, Ben Widawsky wrote:
>> > It doesn't apply to generic VMA, so it belongs with the gtt.
>> >
>> > for file in `ls drivers/gpu/drm/i915/*.c` ; do
>> >     sed -i "s/mm.stolen_base/gtt.stolen_base/" $file;
>> > done
>> >
>> > for file in `ls drivers/gpu/drm/i915/*.c` ; do
>> >     sed -i "s/mm.stolen/gtt.stolen/" $file;
>> > done
>> >
>> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
>>
>> Before I keep on merging I'd like to clarify the plan a bit: Afaics the
>> goal is to extract useful stuff shared between global gtt and ppgtt into
>> i915_address_space. But I'm a bit unclear what dev_priv->mm will hold in
>> the end, so I'm not sure whether moving stolen around makes sense.
>>
>> Can you please elaborate on your plan a bit on how dev_priv->gtt and
>> dev_priv->mm will relate in the end?
>>
>> Thanks, Daniel
>>
> This patch is leftover from when I completely removed mm. In the high
> level abstraction, the stolen memory belongs to the GTT, so I decided to
> keep it. There is no other reason for the patch. We can drop it if
> nobody likes it.

I think keeping the stolen allocator in ->mm makes more sense then,
since it doesn't strictly have anything to do with managing the global
gtt address space. If the linux memory hotplug support wouldn't be so
limited we could even hand out that block of RAM to the core page
allocator ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 24/66] drm/i915: Move aliasing_ppgtt
  2013-06-30 13:27   ` Daniel Vetter
@ 2013-07-01 18:52     ` Ben Widawsky
  2013-07-01 19:06       ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 18:52 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 03:27:44PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:25PM -0700, Ben Widawsky wrote:
> > for file in `ls drivers/gpu/drm/i915/*.c` ; do
> > 	sed -i "s/mm.aliasing/gtt.aliasing/" $file;
> > done
> 
> Commit message should explain _why_ we do something. Again I'm asking
> since I'm unclear about how things fit all together and what the different
> responsibilities are. I think I understand your design here to make both
> real ppgtt work and keep aliasing ppgtt going, but I'd like you to explain
> this in your words ;-)

That's fair.

> 
> One thing which looks a bit peculiar at the end is that struct
> i915_hw_ppgtt is actually used as the real ppgtt object (since it
> subclasses i915_address_space). My original plan was that we'll add a new
> struct i915_ppgtt {
> 	struct i915_address_space base;
> 	struct i915_hw_ppgtt hw_ppgtt;
> }
> 
> To fit into your design the aliasing ppgtt pointer in dev_priv->gtt would
> then only point at a hw_ppgtt struct, not the full deal with address space
> and everything else around attached.
> 
> Cheers, Daniel

I don't think creating a struct i915_ppgtt is necessary or buys much. We
can rename i915_hw_ppgtt to i915_ppgtt, and it accomplishes the same
thing. Same for the i915_hw_context for that matter. I wanted to do any
sort of renaming after the rest of the series.

Can you explain why we'd want to keep the hw_ppgtt, and ppgtt with the
track lists etc. distinct?
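
To spell out the two shapes being weighed here, both as sketches (only
one of them would actually exist):

	/* Daniel's layout: a wrapper owning the address space, with the
	 * hw state split out so the aliasing case can use it alone */
	struct i915_ppgtt {
		struct i915_address_space base;
		struct i915_hw_ppgtt hw_ppgtt;
	};

	/* the rename variant: keep the one subclassing struct and just
	 * call it i915_ppgtt; the aliasing ppgtt then drags an unused
	 * address space along */
	struct i915_ppgtt {
		struct i915_address_space base;
		/* pd/pt state: num_pd_entries, pt_pages, pt_dma_addr, ... */
	};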

> 
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c        |  4 ++--
> >  drivers/gpu/drm/i915/i915_dma.c            |  2 +-
> >  drivers/gpu/drm/i915/i915_drv.h            |  6 +++---
> >  drivers/gpu/drm/i915/i915_gem.c            | 12 ++++++------
> >  drivers/gpu/drm/i915/i915_gem_context.c    |  4 ++--
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c |  4 ++--
> >  drivers/gpu/drm/i915/i915_gem_gtt.c        |  6 +++---
> >  7 files changed, 19 insertions(+), 19 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index c10a690..f3c76ab 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -1816,8 +1816,8 @@ static int i915_ppgtt_info(struct seq_file *m, void *data)
> >  		seq_printf(m, "PP_DIR_BASE_READ: 0x%08x\n", I915_READ(RING_PP_DIR_BASE_READ(ring)));
> >  		seq_printf(m, "PP_DIR_DCLV: 0x%08x\n", I915_READ(RING_PP_DIR_DCLV(ring)));
> >  	}
> > -	if (dev_priv->mm.aliasing_ppgtt) {
> > -		struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> > +	if (dev_priv->gtt.aliasing_ppgtt) {
> > +		struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
> >  
> >  		seq_printf(m, "aliasing PPGTT:\n");
> >  		seq_printf(m, "pd gtt offset: 0x%08x\n", ppgtt->pd_offset);
> > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> > index 3535ced..ef00847 100644
> > --- a/drivers/gpu/drm/i915/i915_dma.c
> > +++ b/drivers/gpu/drm/i915/i915_dma.c
> > @@ -977,7 +977,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
> >  		value = HAS_LLC(dev);
> >  		break;
> >  	case I915_PARAM_HAS_ALIASING_PPGTT:
> > -		if (intel_enable_ppgtt(dev) && dev_priv->mm.aliasing_ppgtt)
> > +		if (intel_enable_ppgtt(dev) && dev_priv->gtt.aliasing_ppgtt)
> >  			value = 1;
> >  		break;
> >  	case I915_PARAM_HAS_WAIT_TIMEOUT:
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 7016074..0fa7a21 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -482,6 +482,9 @@ struct i915_gtt {
> >  	struct io_mapping *mappable;	/* Mapping to our CPU mappable region */
> >  	phys_addr_t mappable_base;	/* PA of our GMADR */
> >  
> > +	/** PPGTT used for aliasing the PPGTT with the GTT */
> > +	struct i915_hw_ppgtt *aliasing_ppgtt;
> > +
> >  	/** "Graphics Stolen Memory" holds the global PTEs */
> >  	void __iomem *gsm;
> >  
> > @@ -843,9 +846,6 @@ struct i915_gem_mm {
> >  	 */
> >  	struct list_head unbound_list;
> >  
> > -	/** PPGTT used for aliasing the PPGTT with the GTT */
> > -	struct i915_hw_ppgtt *aliasing_ppgtt;
> > -
> >  	struct shrinker inactive_shrinker;
> >  	bool shrinker_no_lock_stealing;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index e31ed47..eb78c5b 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2620,7 +2620,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
> >  	if (obj->has_global_gtt_mapping)
> >  		i915_gem_gtt_unbind_object(obj);
> >  	if (obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_unbind_object(dev_priv->mm.aliasing_ppgtt, obj);
> > +		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
> >  		obj->has_aliasing_ppgtt_mapping = 0;
> >  	}
> >  	i915_gem_gtt_finish_object(obj);
> > @@ -3359,7 +3359,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  		if (obj->has_global_gtt_mapping)
> >  			i915_gem_gtt_bind_object(obj, cache_level);
> >  		if (obj->has_aliasing_ppgtt_mapping)
> > -			i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > +			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
> >  					       obj, cache_level);
> >  
> >  		obj->gtt_space->color = cache_level;
> > @@ -3668,7 +3668,7 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  		if (ret)
> >  			return ret;
> >  
> > -		if (!dev_priv->mm.aliasing_ppgtt)
> > +		if (!dev_priv->gtt.aliasing_ppgtt)
> >  			i915_gem_gtt_bind_object(obj, obj->cache_level);
> >  	}
> >  
> > @@ -4191,10 +4191,10 @@ i915_gem_init_hw(struct drm_device *dev)
> >  	 * the do_switch), but before enabling PPGTT. So don't move this.
> >  	 */
> >  	ret = i915_gem_context_enable(dev_priv);
> > -	if (ret || !dev_priv->mm.aliasing_ppgtt)
> > +	if (ret || !dev_priv->gtt.aliasing_ppgtt)
> >  		goto disable_ctx_out;
> >  
> > -	ret = dev_priv->mm.aliasing_ppgtt->enable(dev);
> > +	ret = dev_priv->gtt.aliasing_ppgtt->enable(dev);
> >  	if (ret)
> >  		goto disable_ctx_out;
> >  
> > @@ -4236,7 +4236,7 @@ int i915_gem_init(struct drm_device *dev)
> >  		dev_priv->hw_contexts_disabled = true;
> >  
> >  ggtt_only:
> > -	if (!dev_priv->mm.aliasing_ppgtt) {
> > +	if (!dev_priv->gtt.aliasing_ppgtt) {
> >  		if (HAS_HW_CONTEXTS(dev))
> >  			DRM_DEBUG_DRIVER("Context setup failed %d\n", ret);
> >  		i915_gem_setup_global_gtt(dev, 0, dev_priv->gtt.mappable_end,
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index d92f121..aa4fc4a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -226,7 +226,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
> >  	}
> >  
> >  	dev_priv->ring[RCS].default_context = ctx;
> > -	dev_priv->mm.aliasing_ppgtt = &ctx->ppgtt;
> > +	dev_priv->gtt.aliasing_ppgtt = &ctx->ppgtt;
> >  
> >  	DRM_DEBUG_DRIVER("Default HW context loaded\n");
> >  	return 0;
> > @@ -300,7 +300,7 @@ void i915_gem_context_fini(struct drm_device *dev)
> >  	i915_gem_context_unreference(dctx);
> >  	dev_priv->ring[RCS].default_context = NULL;
> >  	dev_priv->ring[RCS].last_context = NULL;
> > -	dev_priv->mm.aliasing_ppgtt = NULL;
> > +	dev_priv->gtt.aliasing_ppgtt = NULL;
> >  }
> >  
> >  int i915_gem_context_enable(struct drm_i915_private *dev_priv)
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 7fcd6c0..93870bb 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -429,8 +429,8 @@ i915_gem_execbuffer_reserve_object(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	/* Ensure ppgtt mapping exists if needed */
> > -	if (dev_priv->mm.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_bind_object(dev_priv->mm.aliasing_ppgtt,
> > +	if (dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
> > +		i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
> >  				       obj, obj->cache_level);
> >  
> >  		obj->has_aliasing_ppgtt_mapping = 1;
> > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > index 6de75c7..18820cb 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> > @@ -127,7 +127,7 @@ static int gen6_ppgtt_enable(struct drm_device *dev)
> >  	drm_i915_private_t *dev_priv = dev->dev_private;
> >  	uint32_t pd_offset;
> >  	struct intel_ring_buffer *ring;
> > -	struct i915_hw_ppgtt *ppgtt = dev_priv->mm.aliasing_ppgtt;
> > +	struct i915_hw_ppgtt *ppgtt = dev_priv->gtt.aliasing_ppgtt;
> >  	int i;
> >  
> >  	BUG_ON(ppgtt->pd_offset & 0x3f);
> > @@ -445,8 +445,8 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
> >  				       i915_gtt_vm->start / PAGE_SIZE,
> >  				       i915_gtt_vm->total / PAGE_SIZE);
> >  
> > -	if (dev_priv->mm.aliasing_ppgtt)
> > -		gen6_write_pdes(dev_priv->mm.aliasing_ppgtt);
> > +	if (dev_priv->gtt.aliasing_ppgtt)
> > +		gen6_write_pdes(dev_priv->gtt.aliasing_ppgtt);
> >  
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
> >  		i915_gem_clflush_object(obj);
> > -- 
> > 1.8.3.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 40/66] drm/i915: Track all VMAs per VM
  2013-06-30 15:35   ` Daniel Vetter
@ 2013-07-01 19:04     ` Ben Widawsky
  0 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 19:04 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 05:35:00PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:41PM -0700, Ben Widawsky wrote:
> > This allows us to be aware of all the VMAs leftover and teardown, and is
> > useful for debug. I suspect it will prove even more useful later.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h | 2 ++
> >  drivers/gpu/drm/i915/i915_gem.c | 4 ++++
> >  2 files changed, 6 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 247a124..0bc4251 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -446,6 +446,7 @@ struct i915_address_space {
> >  	struct drm_mm mm;
> >  	struct drm_device *dev;
> >  	struct list_head global_link;
> > +	struct list_head vma_list;
> 
> This one feels a bit unnecessary. With the drm_mm_node embedded we already
> have a total of 4 lists:
> - The node_list in the drm_mm. There's even a for_each helper for it. This
>   lists nodes in ascending offset ordering. We only need to upcast from
>   the drm_mm_node to our vma, but due to the embedding that's no problem.
> - The hole list in drm_mm. Again comes with a for_each helper included.
> - The inactive/active lists. Together they again list all vmas in a vm.
> 
> What's the new one doing that we need it so much?
> -Daniel
>

I can try to use the existing data structures to make it work. It was
really easy to do it with our own list though, a list which is really
easy to maintain, and not often traversed. So in all, I don't find the
additional list offensive. I guess a fair argument would be since we'll
have at least as many VMAs as BOs, the extra list_head is a bit
offensive. If you want to make that your tune, then I can agree with you
on that.

One reason I didn't try harder to not use this list was I felt it would
be a nice thing when we properly support page faults, though even there
I think the existing lists could probably be used.

Also, at one time, I was still using drm_mm_node *, so the upcast wasn't
possible.
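
For reference, walking a VM's VMAs off the drm_mm instead would look
roughly like this (assuming the vma embeds its drm_mm_node as
vma->node):

	struct drm_mm_node *node;

	drm_mm_for_each_node(node, &vm->mm) {
		struct i915_vma *vma =
			container_of(node, struct i915_vma, node);
		/* inspect or tear down vma */
	}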

> 
> >  	unsigned long start;		/* Start offset always 0 for dri2 */
> >  	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> >  
> > @@ -556,6 +557,7 @@ struct i915_vma {
> >  	struct list_head mm_list;
> >  
> >  	struct list_head vma_link; /* Link in the object's VMA list */
> > +	struct list_head per_vm_link; /* Link in the VM's VMA list */
> >  };
> >  
> >  struct i915_ctx_hang_stats {
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index a3e8c26..5c0ad6a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4112,14 +4112,17 @@ struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  
> >  	INIT_LIST_HEAD(&vma->vma_link);
> >  	INIT_LIST_HEAD(&vma->mm_list);
> > +	INIT_LIST_HEAD(&vma->per_vm_link);
> >  	vma->vm = vm;
> >  	vma->obj = obj;
> > +	list_add_tail(&vma->per_vm_link, &vm->vma_list);
> >  
> >  	return vma;
> >  }
> >  
> >  void i915_gem_vma_destroy(struct i915_vma *vma)
> >  {
> > +	list_del(&vma->per_vm_link);
> >  	WARN_ON(vma->node.allocated);
> >  	kfree(vma);
> >  }
> > @@ -4473,6 +4476,7 @@ static void i915_init_vm(struct drm_i915_private *dev_priv,
> >  	INIT_LIST_HEAD(&vm->active_list);
> >  	INIT_LIST_HEAD(&vm->inactive_list);
> >  	INIT_LIST_HEAD(&vm->global_link);
> > +	INIT_LIST_HEAD(&vm->vma_list);
> >  	list_add(&vm->global_link, &dev_priv->vm_list);
> >  }
> >  
> > -- 
> > 1.8.3.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 24/66] drm/i915: Move aliasing_ppgtt
  2013-07-01 18:52     ` Ben Widawsky
@ 2013-07-01 19:06       ` Daniel Vetter
  2013-07-01 19:48         ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 19:06 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 1, 2013 at 8:52 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> One thing which looks a bit peculiar at the end is that struct
>> i915_hw_ppgtt is actually used as the real ppgtt object (since it
>> subclasses i915_address_space). My original plan was that we'll add a new
>> struct i915_ppgtt {
>>       struct i915_address_space base;
>>       struct i915_hw_ppgtt hw_ppgtt;
>> }
>>
>> To fit into your design the aliasing ppgtt pointer in dev_priv->gtt would
>> then only point at a hw_ppgtt struct, not the full deal with address space
>> and everything else around attached.
>>
>> Cheers, Daniel
>
> I don't think creating a struct i915_ppgtt is necessary or buys much. We
> can rename i915_hw_ppgtt to i915_ppgtt, and it accomplishes the same
> thing. Same for the i915_hw_context for that matter. I wanted to do any
> sort of renaming after the rest of the series.
>
> Can you explain why we'd want to keep the hw_ppgtt, and ppgtt with the
> track lists etc. distinct?

The difference is that for the aliasing ppgtt we don't really need an
address space object to track object offsets since those are identical
to the ggtt offsets. Hence we only need the hw part of the ppgtt, i.e.
writing/clearing ptes.

Full ppgtt otoh needs the address space of course.

The above structure layout would allow that. It means that aliasing
ppgtt will stick out a bit like a sore thumb but imo that's ok since
it really _is_ a bit of an odd thing. I've just noticed that you've
partially disabled the has_aliasing_ppgtt_mapping logic in your
branch, so I wonder a bit how aliasing ppgtt works there. But I guess
the simplest approach would be to simply keep that stuff around as-is,
with a few twists to wrestle it into the new world: Batches would be
executed as if they'd run in the global gtt (i.e. for eviction/binding
purposes) with the simple difference that we'd check whether the
global gtt has the aliasing_ppgtt pointer set up, and then write the
ptes there. Note that unconditionally writing both ppgtt and global
gtt ptes isn't too good, since in some virtualized environments
tracking those separately is a pretty substantial win apparently.
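
In code that twist is roughly the check the execbuffer reserve path in
the quoted patch 24 already carries:

	/* offsets still come from the global gtt drm_mm, but the ptes
	 * used for rendering go into the aliasing ppgtt */
	if (dev_priv->gtt.aliasing_ppgtt && !obj->has_aliasing_ppgtt_mapping) {
		i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
				       obj, obj->cache_level);
		obj->has_aliasing_ppgtt_mapping = 1;
	}
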
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-07-01 18:43       ` Daniel Vetter
@ 2013-07-01 19:08         ` Daniel Vetter
  2013-07-01 22:59           ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 19:08 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 1, 2013 at 8:43 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> All of this is addressed in future patches. As we've discussed, I think
>> I'll have to respin it anyway, so I'll name it as such upfront. To me it
>> felt a little weird to start calling things "ggtt" before I made the
>> separation.
>
> I think now that we know what the end result should (more or less at
> least) look like we can aim to make it right the first time we touch a
> piece of code. That will reduce the churn in the patch series and so
> make the beast easier to review.
>
> Imo foreshadowing (to keep consistent with the "a patch series should
> tell a story" analogy) is perfectly fine, and in many cases helps in
> understanding the big picture of a large pile of patches.

I've forgotten to add one thing: If you switch these again later on
(lazy me didn't check for that) it's imo best to stick with those
names (presuming they fit, since the gtt_size vs. obj->size
distinction is a rather important one). Again I think now that we know
where to go, it's best to get there with as few intermediate steps
as possible.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 24/66] drm/i915: Move aliasing_ppgtt
  2013-07-01 19:06       ` Daniel Vetter
@ 2013-07-01 19:48         ` Ben Widawsky
  2013-07-01 19:54           ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 19:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Mon, Jul 01, 2013 at 09:06:20PM +0200, Daniel Vetter wrote:
> On Mon, Jul 1, 2013 at 8:52 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> >> One thing which looks a bit peculiar at the end is that struct
> >> i915_hw_ppgtt is actually used as the real ppgtt object (since it
> >> subclasses i915_address_space). My original plan was that we'll add a new
> >> struct i915_ppgtt {
> >>       struct i915_address_space base;
> >>       struct i915_hw_ppgtt hw_ppgtt;
> >> }
> >>
> >> To fit into your design the aliasing ppgtt pointer in dev_priv->gtt would
> >> then only point at a hw_ppgtt struct, not the full deal with address space
> >> and everything else around attached.
> >>
> >> Cheers, Daniel
> >
> > I don't think creating a struct i915_ppgtt is necessary or buys much. We
> > can rename i915_hw_ppgtt to i915_ppgtt, and it accomplishes the same
> > thing. Same for the i915_hw_context for that matter. I wanted to do any
> > sort of renaming after the rest of the series.
> >
> > Can you explain why we'd want to keep the hw_ppgtt, and ppgtt with the
> > track lists etc. distinct?
> 
> The difference is that for the aliasing ppgtt we don't really need an
> address space object to track object offsets since those are identical
> to the ggtt offsets. Hence we only need the hw part of the ppgtt, i.e.
> writing/clearing ptes.

I actually did attempt to make the aliasing ppgtt a real address space
(so the ggtt was the special case instead of vice versa). It proved too
much work, so I dropped it.
> 
> Full ppgtt otoh needs the address space of course.
> 
> The above structure layout would allow that. It means that aliasing
> ppgtt will stick out a bit like a sore thumb but imo that's ok since
> it really _is_ a bit of an odd thing. I've just noticed that you've
> partially disabled the has_aliasing_ppgtt_mapping logic in your
> branch, so I wonder a bit how aliasing ppgtt works there.

It doesn't. The intent was to completely eclipse aliasing PPGTT. If we
keep the pin interface (as discussed in the other thread), I am not
certain that is still true, though I personally have no issue forcing a
performance penalty on users that wish to use that interface.

> But I guess
> the simplest approach would be to simply keep that stuff around as-is,
> with a few twists to wrestle it into the new world: Batches would be
> executed as if they'd run in the global gtt (i.e. for eviction/binding
> purposes) with the simple difference that we'd check whether the
> global gtt has the aliasing_ppgtt pointer set up, and then write the
> ptes there. Note that unconditionally writing both ppgtt and global
> gtt ptes isn't too good, since in some virtualized environments
> tracking those separately is a pretty substantial win apparently.
> -Daniel

I'm still not convinced that having distinct i915_ppgtt and
i915_hw_ppgtt buys us anything.

> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 24/66] drm/i915: Move aliasing_ppgtt
  2013-07-01 19:48         ` Ben Widawsky
@ 2013-07-01 19:54           ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 19:54 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 1, 2013 at 9:48 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
>> Full ppgtt otoh needs the address space of course.
>>
>> The above structure layout would allow that. It means that aliasing
>> ppgtt will stick out a bit like a sore thumb but imo that's ok since
>> it really _is_ a bit an odd thing. I've just noticed that you've
>> partially disabled the has_aliasing_ppgtt_mapping logic in your
>> branch, so I wonder a bit how aliasing ppgtt works there.
>
> It doesn't. The intent was to completely eclipse aliasing PPGTT. If we
> keep the pin interface (as discussed in the other thread), I am not
> certain that is still true, though I personally have no issue forcing a
> performance penalty on users that wish to use that interface.

Ah, that explains my confusion ;-)

I guess we simply have to be careful to keep this working properly
when merging the patches. Like I've said I'm pretty sure that we can
keep all the major concepts unchanged and just bolt aliasing ppgtt
onto the side like it's done today.

For i915_ppgtt vs. i915_hw_ppgtt I think we can revisit that when the
in-depth review of the relevant patches is on the table.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
                   ` (66 preceding siblings ...)
  2013-06-28  3:38 ` [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
@ 2013-07-01 21:39 ` Daniel Vetter
  2013-07-01 22:36   ` Ben Widawsky
  2013-10-29 23:08   ` Eric Anholt
  67 siblings, 2 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-01 21:39 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

Hi Ben

So first things first: I rather like what the code looks like overall at
the end. I've done a light read-through (by far not a full review) and
besides a few bikesheds (all mentioned by mail already) the big thing is
the 1:1 context:ppgtt address space relationship.

We've discussed this at length in private irc and agreed that we need to
change this to an n:1 relationship, so I'll just reiterate the reasons
for that on the list:

- Current userspace expects that different contexts created on the same fd
  all use the same address space (since there's really only one). So if we
  want to not add a new ABI (and for testing I really think we want to
  enable ppgtt on current unchanged userspace) we must keep that promise.
  Hence we need to be able to point the different contexts created on an
  fd all at the same (per-fd) address space.

  If we want we could later on extend this and allow a context to have a
  private address space all of its own.

- HW contexts are pretty much the execution/scheduling primitive the hw
  provides us with. On current platforms that's a bit of a stretch, but it's
  much clearer on future platforms.

  The equivalent concept on the cpu would be threads, and history
  unanimously established that having multiple threads in the same process
  is a useful concept. So I think allowing the same N:1 context:ppgtt
  relation on gpus is sound. Of course that does not preclude other ways
  to share individual buffers, which we already support with
  flink/dma_buf.

With that big issue resolved there's only the bikesheds left. I'm not
really worried about those, and in any case we already have some good
discussions going on about them.

So merge plan:

Since the context<->ppgtt relation needs to be revamped I'd like to punt
on patches in that area at first and concentrate on merging the
address_space/vma conversion and all the prep work leading to that first.

We're already discussing some of the details, so I won't repeat that here
again. I think for the overall approach I wouldn't bother with rebasing -
shuffling around such a massive&invasive patch series is a major pain and
rather error-prone.

Instead I'd just freeze the current branch as a known-working reference
point and cherry-pick individual subseries out of it. Also some patches
will need to be redone, but thanks to the benefit of hindsight I hope that
v2 will have much less churn. Again I've tossed around a few ideas in
replies to individual patches.

For prep work I see a few pieces:

- drm_mm related prep work, especially drm_mm.c core changes. I think
  simply reordering all the relevant patches and resubmitting them (with
  cc: dri-devel) is all we need to get those in (minus the oddball last
  minute bikeshed).

- Small static inline functions to ease the pain of the conversion. I
  think those need to be redone with as much foreshadowing as possible (so
  that later patches don't suffer from overblown diff sizes). Maybe we
  should also do the lookup-helpers upfront.

- Other prep work like killing obj->gtt_offset and stuff I've missed (but
  which doesn't touch the context<->ppgtt relation).

- I think we can also merge as much of the hw interfacing code as possible
  up-front (or in parallel), e.g. converting the pde loading to LRI.

Then we can slurp all the address_space/vma conversion patches. Imo the
important part is to flesh out the commit message with the hindsight
insights and explain a bit why certain stuff gets moved and why other
stuff should stay where it currently is. We also need to make sure
bisecting isn't broken anywhere, but since we don't yet add a real ppgtt I
don't expect any issues.

Once that's all in we can revisit the context vs. ppgtt question and
figure out how to make it work. I expect that we need to refcount ppgtt
address spaces. But if we keep the per-fd contexts around (imo a sane idea
anyway) we should get by when just the contexts hold references onto the
ppgtt object. Since the context is guaranteed to stay around until the
last object is inactive again that should ensure that the ppgtt address
space stays around for long enough, too. And it would avoid the ugliness
of adding more tricky active refcounting on top of the context/obj
refcounting we already have.
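
A sketch of that lifetime model, assuming the ppgtt becomes a
separately allocated object with a kref (rather than being embedded in
the context as in this series), and that the context grows a ppgtt
pointer:

	struct i915_hw_ppgtt {
		struct i915_address_space base;
		struct kref ref;	/* one reference per context using this vm */
		/* pd/pt state as before */
	};

	static void ppgtt_release(struct kref *ref)
	{
		struct i915_hw_ppgtt *ppgtt =
			container_of(ref, struct i915_hw_ppgtt, ref);

		ppgtt->cleanup(ppgtt);
		kfree(ppgtt);
	}

	/* creating a context on an fd shares the per-fd address space */
	ctx->ppgtt = file_priv->ppgtt;
	kref_get(&ctx->ppgtt->ref);

	/* and the final context unref drops the vm */
	kref_put(&ctx->ppgtt->ref, ppgtt_release);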

Comments?

Cheers, Daniel

On Thu, Jun 27, 2013 at 04:30:01PM -0700, Ben Widawsky wrote:
> First, I don't think this whole series is ready for merge yet. It is
> however ready for a review, and I think a lot of the prep patches in the
> series could be merged to make my rebasing life a bit easier. I cannot
> continue ignoring pretty much all emails/bugs as I have for the last
> month to wrap up this series. The current state is on my IVB, things are
> pretty stable. I've seen one unexplained hang, but I'm hopeful review
> might help me uncover/explain.
> 
> This patch series introduces the next step in enabling full PPGTT, which
> is per fd address space/context, and it also contains the previously
> unmerged patches (some of which have been reworked, modified, or
> rebased). In regards to the continued VMA changes, I think in these, the
> delta with regard to the last posting is the bound list was per VM. It
> is now global. I've also moved the active list to per VM.
> 
> Brand new in the series we take the previous series' context per fd
> (with address space) one step further and actually switch the address
> spaces when we do context switches. In order to make this happen, the
> series continues to chip away at removing the notion of an object only
> ever being bound into one address space via the struct
> i915_address_space and struct i915_vma data structures which are really
> abstractions for a page directory, and current mapped ptes respectively.
> The error state is improved since the last series (though still some
> work there is probably needed). It also serves to remove the notion of
> the aliasing PPGTT since in theory everything bound into the GGTT
> shouldn't benefit from an aliasing PPGTT (fact check).
> 
> With every context having it's own address space, and every open DRM fd
> having it's own context, it's trivial on execbuf to lookup a context and
> do the pinning in the proper address space. More importantly, it's
> implicit that a context exists, which made this impossible to do
> earlier.
> 
> *A note on patch ordering:* In order to work this series incrementally, the
> final patch ordering is admittedly a little bit strange. I'm more than willing
> to rework these as requested, but I'd really prefer not to do really heavy
> reordering unless there is a major benefit, or of course to fix bugs.
> 
> # What is not in this patch series in the order I think we should handle it
> (and I acknowledge some of this stuff is non-trivial):
> 
> ## Review + QA coverage
> 
> ## Porting to HSW
> 
> It shouldn't be too much extra work, if any, to add support. I haven't looked
> into it yet.
> 
> ## Better vm/ppgtt info in error state collection
> 
> In particular, I want to dump all the PTEs at hang, and at the very least the
> guilt PTEs.  This isn't difficult, and can be done with copypasta from the
> existing dumper I have.
> 
> ## User space and the implications
> 
> Now that contexts are valid on all rings, userspace should begin emitting the
> context for all rings if it expects both rings to be able to access both
> objects in the same offset. The mesa change looks to be just one line which
> simplies emits the context for batch->is_blit, I doubt libva is using contexts,
> and SNA seems not to. The plan to support mesa will be to do the detection in
> libdrm, and go ahead with the simple mesa one liner. I've been using the
> oneliner in mesa for a while now, but we will need to support old user space in
> the kernel. So there might be a bit of work even on the kernel side here. We
> also need some IGT tools test updates. I have messy versions of these locally
> already.
> 
> ## Performance data
> 
> I think it doesn't preclude preliminary review of the patches since the main
> goal of PPGTT is really about security, correctness, and enabling other
> things. I will update with some numbers after I work on it a bit more.
> 
> 
> ## Testing on SNB
> 
> If our current code is correct, then I think these patches might work on SNB
> as is, but it's untested. There is currently no way to disconnect contexts +
> PPGTT from the whole thing; so if this doesn't work - we'll need to rework some of
> the code. I think it should just entail bringing back aliasing ppgtt, and not
> doing the address space switch when switching contexts (aliasing ppgtt will
> have a null switch_mm()).
> 
> ## Soft pin interface
> 
> I'd like to defer the discussion until these patches are merged.
> 
> ## On demand page table allocation
> 
> This is a potentially very useful optimization for at least the following
> reasons:
> * any app using contexts will have an extra set of page tables it isn't using;
>   wasted memory
> * Reduce DCLV to reduce pd fetch latency
> * Allows better use of a switch to the default context for low memory situations
>   (evicting unused page tables for example)
> 
> ## Per VMA cache levels/control
> 
> There are situations in the code where we have to flush the GPU pipeline in
> order to change cache levels.  This should no longer be the case for unaffected
> VMs (I think). The same may be true with domain tracking.
> 
> ## dmabuf/prime integration
> 
> I haven't looked into what's missing to support it. If I'm lucky, it just works.
> 
> 
> 
> With that, if you haven't already moved on, chanting tl;dr - all comments
> welcome.
> 
> ---
> 
> Ben Widawsky (65):
>   drm/i915: Remove extra error state NULL
>   drm/i915: Extract error buffer capture
>   drm/i915: make PDE|PTE platform specific
>   drm/i915: Don't clear gtt with 0 entries
>   drm/i915: Conditionally use guard page based on PPGTT
>   drm/i915: Use drm_mm for PPGTT PDEs
>   drm/i915: cleanup context fini
>   drm/i915: Do a fuller init after reset
>   drm/i915: Split context enabling from init
>   drm/i915: destroy i915_gem_init_global_gtt
>   drm/i915: Embed PPGTT into the context
>   drm/i915: Unify PPGTT codepaths on gen6+
>   drm/i915: Move ppgtt initialization down
>   drm/i915: Tie context to PPGTT
>   drm/i915: Really share scratch page
>   drm/i915: Combine scratch members into a struct
>   drm/i915: Drop dev from pte_encode
>   drm/i915: Use gtt shortform where possible
>   drm/i915: Move fbc members out of line
>   drm/i915: Move gtt and ppgtt under address space umbrella
>   drm/i915: Move gtt_mtrr to i915_gtt
>   drm/i915: Move stolen stuff to i915_gtt
>   drm/i915: Move aliasing_ppgtt
>   drm/i915: Put the mm in the parent address space
>   drm/i915: Move active/inactive lists to new mm
>   drm/i915: Create a global list of vms
>   drm/i915: Remove object's gtt_offset
>   drm: pre allocate node for create_block
>   drm/i915: Getter/setter for object attributes
>   drm/i915: Create VMAs (part 1)
>   drm/i915: Create VMAs (part 2) - kill gtt space
>   drm/i915: Create VMAs (part 3) - plumbing
>   drm/i915: Create VMAs (part 3.5) - map and fenceable tracking
>   drm/i915: Create VMAs (part 4) - Error capture
>   drm/i915: Create VMAs (part 5) - move mm_list
>   drm/i915: Create VMAs (part 6) - finish error plumbing
>   drm/i915: create an object_is_active()
>   drm/i915: Move active to vma
>   drm/i915: Track all VMAs per VM
>   drm/i915: Defer request freeing
>   drm/i915: Clean up VMAs before freeing
>   drm/i915: Replace has_bsd/blt with a mask
>   drm/i915: Catch missed context unref earlier
>   drm/i915: Add a context open function
>   drm/i915: Permit contexts on all rings
>   drm/i915: Fix context fini refcounts
>   drm/i915: Better reset handling for contexts
>   drm/i915: Create a per file_priv default context
>   drm/i915: Remove ring specificity from contexts
>   drm/i915: Track which ring a context ran on
>   drm/i915: dump error state based on capture
>   drm/i915: PPGTT should take a ppgtt argument
>   drm/i915: USE LRI for switching PP_DIR_BASE
>   drm/i915: Extract mm switching to function
>   drm/i915: Write PDEs at init instead of enable
>   drm/i915: Disallow pin with full ppgtt
>   drm/i915: Get context early in execbuf
>   drm/i915: Pass ctx directly to switch/hangstat
>   drm/i915: Actually add the new address spaces
>   drm/i915: Use multiple VMs
>   drm/i915: Kill now unused ppgtt_{un,}bind
>   drm/i915: Add PPGTT dumper
>   drm/i915: Dump all ppgtt
>   drm/i915: Add debugfs for vma info per vm
>   drm/i915: Getparam full ppgtt
> 
> Chris Wilson (1):
>   drm: Optionally create mm blocks from top-to-bottom
> 
>  drivers/gpu/drm/drm_mm.c                   | 134 +++---
>  drivers/gpu/drm/i915/i915_debugfs.c        | 215 ++++++++--
>  drivers/gpu/drm/i915/i915_dma.c            |  25 +-
>  drivers/gpu/drm/i915/i915_drv.c            |  57 ++-
>  drivers/gpu/drm/i915/i915_drv.h            | 353 ++++++++++------
>  drivers/gpu/drm/i915/i915_gem.c            | 639 +++++++++++++++++++++--------
>  drivers/gpu/drm/i915/i915_gem_context.c    | 279 +++++++++----
>  drivers/gpu/drm/i915/i915_gem_debug.c      |  11 +-
>  drivers/gpu/drm/i915/i915_gem_evict.c      |  64 +--
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 138 ++++---
>  drivers/gpu/drm/i915/i915_gem_gtt.c        | 541 ++++++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_stolen.c     |  87 ++--
>  drivers/gpu/drm/i915/i915_gem_tiling.c     |  21 +-
>  drivers/gpu/drm/i915/i915_irq.c            | 197 ++++++---
>  drivers/gpu/drm/i915/i915_trace.h          |  38 +-
>  drivers/gpu/drm/i915/intel_display.c       |  40 +-
>  drivers/gpu/drm/i915/intel_drv.h           |   7 -
>  drivers/gpu/drm/i915/intel_fb.c            |   8 +-
>  drivers/gpu/drm/i915/intel_overlay.c       |  26 +-
>  drivers/gpu/drm/i915/intel_pm.c            |  65 +--
>  drivers/gpu/drm/i915/intel_ringbuffer.c    |  32 +-
>  drivers/gpu/drm/i915/intel_sprite.c        |   8 +-
>  include/drm/drm_mm.h                       | 147 ++++---
>  include/uapi/drm/i915_drm.h                |   1 +
>  24 files changed, 2044 insertions(+), 1089 deletions(-)
> 
> -- 
> 1.8.3.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-07-01 21:39 ` Daniel Vetter
@ 2013-07-01 22:36   ` Ben Widawsky
  2013-07-02  7:43     ` Daniel Vetter
  2013-10-29 23:08   ` Eric Anholt
  1 sibling, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 22:36 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Mon, Jul 01, 2013 at 11:39:30PM +0200, Daniel Vetter wrote:
> Hi Ben
> 
> So first things first: I rather like what the code looks like overall at
> the end. I've done a light read-through (by far not a full review) and
> besides a few bikesheds (all mentioned by mail already) the big thing is
> the 1:1 context:ppgtt address space relationship.
> 
> We've discussed this at length in private irc and agreed that we need to
> change this to an n:1 relationship, so I'll just reiterate the reasons
> for that on the list:
> 
> - Current userspace expects that different contexts created on the same fd
>   all use the same address space (since there's really only one). So if we
>   don't want to add a new ABI (and for testing I really think we want to
>   enable ppgtt on current unchanged userspace), we must keep that promise.
>   Hence we need to be able to point the different contexts created on an
>   fd all at the same (per-fd) address space.
> 
>   If we want we could later on extend this and allow a context to have a
>   private address space all of its own.
> 
> - HW contexts are pretty much the execution/scheduling primitive the hw
>   provides us with. On current platforms that's a bit of a stretch, but it's
>   much clearer on future platforms.
> 
>   The equivalent concept on the cpu would be threads, and history
>   unanimously established that having multiple threads in the same process
>   is a useful concept. So I think allowing the same N:1 context:ppgtt
>   relation on gpus is sound. Of course that does not preclude other ways
>   to share individual buffers, which we already support with
>   flink/dma_buf.
> 
> With that big issue resolved there's only the bikesheds left. I'm not
> really worried about those, and in any case we already have some good
> discussions going on about them.

I've discussed this with the Mesa team, and I believe this is what they
want. I'll highlight the important bit for TL;DR people:
>   Hence we need to be able to point the different contexts created on an
>   fd all at the same (per-fd) address space.

If one wants a new address space, they will have to open a new fd.
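
To make the per-fd model concrete, here's a hypothetical userspace
sketch (not from this series; it only assumes the long-standing
context-create ioctl): each open() of the DRM node gets its own default
context and therefore its own address space, while contexts created on
the same fd share that fd's PPGTT.

#include <fcntl.h>
#include <sys/ioctl.h>
#include <xf86drm.h>
#include <i915_drm.h>

int main(void)
{
	int fd_a = open("/dev/dri/card0", O_RDWR);
	int fd_b = open("/dev/dri/card0", O_RDWR); /* distinct address space */
	struct drm_i915_gem_context_create create = { 0 };

	/* These two contexts share fd_a's per-fd PPGTT... */
	drmIoctl(fd_a, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &create);
	drmIoctl(fd_a, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &create);

	/* ...while anything created on fd_b lives in a separate one. */
	drmIoctl(fd_b, DRM_IOCTL_I915_GEM_CONTEXT_CREATE, &create);

	return 0;
}

(Error handling elided for brevity.)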

> 
> So merge plan:
> 
> Since the context<->ppgtt relation needs to be revamped I'd like to punt
> on patches in that area at first and concentrate on merging the
> address_space/vma conversion and all the prep work leading to that first.

The main idea [ack, or nak] is that the ppgtt becomes *ppgtt, and the
context will refcount it on context creation/destruction - that way all
the existing tricky context refcounting should still work, and the last
context to drop its reference on a ppgtt will destroy it in its handler.
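
A minimal sketch of that refcounting scheme (the struct layouts and
helper names below are illustrative stand-ins, not the actual series
code):

#include <linux/kref.h>
#include <linux/slab.h>

struct i915_hw_ppgtt {
	struct kref ref;	/* kref_init() done at ppgtt creation */
	/* page directory, PTE pages, drm_mm, ... */
};

struct i915_hw_context {
	struct i915_hw_ppgtt *ppgtt;
	/* backing object, per-ring state, ... */
};

static void ppgtt_release(struct kref *kref)
{
	struct i915_hw_ppgtt *ppgtt =
		container_of(kref, struct i915_hw_ppgtt, ref);

	/* Tear down page tables here, then free the address space. */
	kfree(ppgtt);
}

/* Context creation shares the per-fd ppgtt and bumps its refcount. */
static void context_bind_ppgtt(struct i915_hw_context *ctx,
			       struct i915_hw_ppgtt *ppgtt)
{
	kref_get(&ppgtt->ref);
	ctx->ppgtt = ppgtt;
}

/* The last context to drop its reference destroys the ppgtt. */
static void context_unbind_ppgtt(struct i915_hw_context *ctx)
{
	kref_put(&ctx->ppgtt->ref, ppgtt_release);
	ctx->ppgtt = NULL;
}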

> 
> We're already discussing some of the details, so I won't repeat that here
> again. I think for the overall approach I wouldn't bother with rebasing -
> shuffling around such a massive&invasive patch series is a major pain and
> rather error-prone.
> 
> Instead I'd just freeze the current branch as a known-working reference
> point and cherry-pick individual subseries out of it. Also some patches
> will need to be redone, but thanks to the benefit of hindsight I hope that
> v2 will have much less churn. Again I've tossed around a few ideas in
> replies to individual patches.
> 
> For prep work I see a few pieces:
> 
> - drm_mm related prep work, especially drm_mm.c core changes. I think
>   simply reordering all the relevant patches and resubmitting them (with
>   cc: dri-devel) is all we need to get those in (minus the oddball last
>   minute bikeshed).
> 
> - Small static inline functions to ease the pain of the conversion. I
>   think those need to be redone with as much foreshadowing as possible (so
>   that later patches don't suffer from overblown diff sizes). Maybe we
>   should also do the lookup-helpers upfront.
> 
> - Other prep work like killing obj->gtt_offset and stuff I've missed (but
>   which doesn't touch the context<->ppgtt relation).

I think this might be mergeable now. Did you try it and hit conflicts?

> 
> - I think we can also merge as much of the hw interfacing code as possible
>   up-front (or in parallel), e.g. converting the pde loading to LRI.

I was thinking this as well. I wasn't sure how you'd feel about the
idea. I'd really like that. This should also have no conflicts at
present (that I'm aware of).
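
For reference, a sketch of what the LRI-based PP_DIR_BASE load could
look like (register macros as in i915_reg.h; the function itself is an
assumption for illustration, not the series' final code):

static int mm_switch_via_lri(struct intel_ring_buffer *ring,
			     u32 pp_dir_base)
{
	int ret = intel_ring_begin(ring, 6);
	if (ret)
		return ret;

	/* Load DCLV and PP_DIR_BASE from the ring instead of via MMIO. */
	intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(2));
	intel_ring_emit(ring, RING_PP_DIR_DCLV(ring));
	intel_ring_emit(ring, PP_DIR_DCLV_2G);
	intel_ring_emit(ring, RING_PP_DIR_BASE(ring));
	intel_ring_emit(ring, pp_dir_base);
	intel_ring_emit(ring, MI_NOOP);
	intel_ring_advance(ring);

	return 0;
}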

> 
> Then we can slurp all the address_space/vma conversion patches. Imo the
> important part is to flesh out the commit messages with the hindsight
> insights and explain a bit why certain stuff gets moved and why other
> stuff should stay where it currently is. We also need to check that
> bisecting isn't broken anywhere, but since we don't yet add a real ppgtt I
> don't expect any issues.
> 
> Once that's all in we can revisit the context vs. ppgtt question and
> figure out how to make it work. I expect that we need to refcount ppgtt
> address spaces. But if we keep the per-fd contexts around (imo a sane idea
>   anyway) we should get by with just the contexts holding references onto the
> ppgtt object. Since the context is guaranteed to stay around until the
> last object is inactive again that should ensure that the ppgtt address
> space stays around for long enough, too. And it would avoid the ugliness
> of adding more tricky active refcounting on top of the context/obj
> refcounting we already have.
> 
> Comments?

The high level plan sounds totally fine to me, but I still have some
issues with how to break up the huge changes in the VMA/VM conversion,
since many interfaces need to change and that gets messy. In the end I like
what I did where I split out the 6 or 7 major changes for review. Would
you be okay with that again? Maybe if all goes well, it won't be a
problem.

> 
> Cheers, Daniel
> 
> On Thu, Jun 27, 2013 at 04:30:01PM -0700, Ben Widawsky wrote:
> > [...]

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH 26/66] drm/i915: Move active/inactive lists to new mm
  2013-06-30 15:38   ` Daniel Vetter
@ 2013-07-01 22:56     ` Ben Widawsky
  2013-07-02  7:26       ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 22:56 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Sun, Jun 30, 2013 at 05:38:16PM +0200, Daniel Vetter wrote:
> On Thu, Jun 27, 2013 at 04:30:27PM -0700, Ben Widawsky wrote:
> > > for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.inactive_list/i915_gtt_vm-\>inactive_list/" $file; done
> > > for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.active_list/i915_gtt_vm-\>active_list/" $file; done
> > 
> > I've also opted to move the comments out of line a bit so one can get a
> > better picture of what the various lists do.
> 
> Bikeshed: That makes you now inconsistent with all the other in-detail
> > structure member comments we have. And I don't see how it looks better,
> so I'd vote to keep things as-is with per-member comments.
>
Initially I moved all the comments (in the original mm destruction I
did).

> 
> > v2: Leave the bound list as a global one. (Chris, indirectly)
> > 
> > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> 
> The real comment though is on the commit message: it fails to explain why
> we want to move the active/inactive lists from mm/obj to the address
> space/vma pair. I think I understand, but this should be explained more
> in-depth.
> 
> I think in the first commit which starts moving those lists and execution
> tracking state you should also mention why some of the state
> (bound/unbound lists e.g.) is not moved.
> 
> Cheers, Daniel

Can I use, "because Chris told me to"? :p

> 
> > ---
> >  drivers/gpu/drm/i915/i915_debugfs.c    | 11 ++++----
> >  drivers/gpu/drm/i915/i915_drv.h        | 49 ++++++++++++++--------------------
> >  drivers/gpu/drm/i915/i915_gem.c        | 24 +++++++----------
> >  drivers/gpu/drm/i915/i915_gem_debug.c  |  2 +-
> >  drivers/gpu/drm/i915/i915_gem_evict.c  | 10 +++----
> >  drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
> >  drivers/gpu/drm/i915/i915_irq.c        |  6 ++---
> >  7 files changed, 46 insertions(+), 58 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > index f3c76ab..a0babc7 100644
> > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > @@ -158,11 +158,11 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> >  	switch (list) {
> >  	case ACTIVE_LIST:
> >  		seq_printf(m, "Active:\n");
> > -		head = &dev_priv->mm.active_list;
> > +		head = &i915_gtt_vm->active_list;
> >  		break;
> >  	case INACTIVE_LIST:
> >  		seq_printf(m, "Inactive:\n");
> > -		head = &dev_priv->mm.inactive_list;
> > +		head = &i915_gtt_vm->inactive_list;
> >  		break;
> >  	default:
> >  		mutex_unlock(&dev->struct_mutex);
> > @@ -247,12 +247,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
> >  		   count, mappable_count, size, mappable_size);
> >  
> >  	size = count = mappable_size = mappable_count = 0;
> > -	count_objects(&dev_priv->mm.active_list, mm_list);
> > +	count_objects(&i915_gtt_vm->active_list, mm_list);
> >  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
> >  		   count, mappable_count, size, mappable_size);
> >  
> >  	size = count = mappable_size = mappable_count = 0;
> > -	count_objects(&dev_priv->mm.inactive_list, mm_list);
> > +	count_objects(&i915_gtt_vm->inactive_list, mm_list);
> >  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
> >  		   count, mappable_count, size, mappable_size);
> >  
> > @@ -1977,7 +1977,8 @@ i915_drop_caches_set(void *data, u64 val)
> >  		i915_gem_retire_requests(dev);
> >  
> >  	if (val & DROP_BOUND) {
> > -		list_for_each_entry_safe(obj, next, &dev_priv->mm.inactive_list, mm_list)
> > +		list_for_each_entry_safe(obj, next, &i915_gtt_vm->inactive_list,
> > +					 mm_list)
> >  			if (obj->pin_count == 0) {
> >  				ret = i915_gem_object_unbind(obj);
> >  				if (ret)
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index e65cf57..0553410 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -448,6 +448,22 @@ struct i915_address_space {
> >  	unsigned long start;		/* Start offset always 0 for dri2 */
> >  	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> >  
> > +/* We use many types of lists for object tracking:
> > + *  active_list: List of objects currently involved in rendering.
> > + *	Includes buffers having the contents of their GPU caches flushed, not
> > + *	necessarily primitives. last_rendering_seqno represents when the
> > + *	rendering involved will be completed. A reference is held on the buffer
> > + *	while on this list.
> > > + *  inactive_list: LRU list of objects which are not in the ringbuffer;
> > + *	objects are ready to unbind but are still mapped.
> > + *	last_rendering_seqno is 0 while an object is in this list.
> > + *	A reference is not held on the buffer while on this list,
> > + *	as merely being GTT-bound shouldn't prevent its being
> > + *	freed, and we'll pull it off the list in the free path.
> > + */
> > +	struct list_head active_list;
> > +	struct list_head inactive_list;
> > +
> >  	struct {
> >  		dma_addr_t addr;
> >  		struct page *page;
> > @@ -835,42 +851,17 @@ struct intel_l3_parity {
> >  };
> >  
> >  struct i915_gem_mm {
> > -	/** List of all objects in gtt_space. Used to restore gtt
> > -	 * mappings on resume */
> > -	struct list_head bound_list;
> >  	/**
> > -	 * List of objects which are not bound to the GTT (thus
> > -	 * are idle and not used by the GPU) but still have
> > -	 * (presumably uncached) pages still attached.
> > +	 * Lists of objects which are [not] bound to a VM. Unbound objects are
> > +	 * idle are idle but still have (presumably uncached) pages still
> > +	 * attached.
> >  	 */
> > +	struct list_head bound_list;
> >  	struct list_head unbound_list;
> >  
> >  	struct shrinker inactive_shrinker;
> >  	bool shrinker_no_lock_stealing;
> >  
> > -	/**
> > -	 * List of objects currently involved in rendering.
> > -	 *
> > -	 * Includes buffers having the contents of their GPU caches
> > -	 * flushed, not necessarily primitives.  last_rendering_seqno
> > -	 * represents when the rendering involved will be completed.
> > -	 *
> > -	 * A reference is held on the buffer while on this list.
> > -	 */
> > -	struct list_head active_list;
> > -
> > -	/**
> > -	 * LRU list of objects which are not in the ringbuffer and
> > -	 * are ready to unbind, but are still in the GTT.
> > -	 *
> > -	 * last_rendering_seqno is 0 while an object is in this list.
> > -	 *
> > -	 * A reference is not held on the buffer while on this list,
> > -	 * as merely being GTT-bound shouldn't prevent its being
> > -	 * freed, and we'll pull it off the list in the free path.
> > -	 */
> > -	struct list_head inactive_list;
> > -
> >  	/** LRU list of objects with fence regs on them. */
> >  	struct list_head fence_list;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 608b6b5..7da06df 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1706,7 +1706,7 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> >  	}
> >  
> >  	list_for_each_entry_safe(obj, next,
> > -				 &dev_priv->mm.inactive_list,
> > +				 &i915_gtt_vm->inactive_list,
> >  				 mm_list) {
> >  		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> >  		    i915_gem_object_unbind(obj) == 0 &&
> > @@ -1881,7 +1881,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> >  	}
> >  
> >  	/* Move from whatever list we were on to the tail of execution. */
> > -	list_move_tail(&obj->mm_list, &dev_priv->mm.active_list);
> > +	list_move_tail(&obj->mm_list, &i915_gtt_vm->active_list);
> >  	list_move_tail(&obj->ring_list, &ring->active_list);
> >  
> >  	obj->last_read_seqno = seqno;
> > @@ -1909,7 +1909,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> >  	BUG_ON(!obj->active);
> >  
> > -	list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > +	list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> >  
> >  	list_del_init(&obj->ring_list);
> >  	obj->ring = NULL;
> > @@ -2279,12 +2279,8 @@ bool i915_gem_reset(struct drm_device *dev)
> >  	/* Move everything out of the GPU domains to ensure we do any
> >  	 * necessary invalidation upon reuse.
> >  	 */
> > -	list_for_each_entry(obj,
> > -			    &dev_priv->mm.inactive_list,
> > -			    mm_list)
> > -	{
> > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list)
> >  		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > -	}
> >  
> >  	/* The fence registers are invalidated so clear them out */
> >  	i915_gem_restore_fences(dev);
> > @@ -3162,7 +3158,7 @@ search_free:
> >  	}
> >  
> >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > -	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > +	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> >  
> >  	obj->gtt_space = node;
> >  	obj->gtt_offset = node->start;
> > @@ -3313,7 +3309,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> >  
> >  	/* And bump the LRU for this access */
> >  	if (i915_gem_object_is_inactive(obj))
> > -		list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > +		list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> >  
> >  	return 0;
> >  }
> > @@ -4291,7 +4287,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
> >  		return ret;
> >  	}
> >  
> > -	BUG_ON(!list_empty(&dev_priv->mm.active_list));
> > +	BUG_ON(!list_empty(&i915_gtt_vm->active_list));
> >  	mutex_unlock(&dev->struct_mutex);
> >  
> >  	ret = drm_irq_install(dev);
> > @@ -4352,8 +4348,8 @@ i915_gem_load(struct drm_device *dev)
> >  				  SLAB_HWCACHE_ALIGN,
> >  				  NULL);
> >  
> > -	INIT_LIST_HEAD(&dev_priv->mm.active_list);
> > -	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
> > +	INIT_LIST_HEAD(&i915_gtt_vm->active_list);
> > +	INIT_LIST_HEAD(&i915_gtt_vm->inactive_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > @@ -4652,7 +4648,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> >  		if (obj->pages_pin_count == 0)
> >  			cnt += obj->base.size >> PAGE_SHIFT;
> > -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, global_list)
> > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, global_list)
> >  		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> >  			cnt += obj->base.size >> PAGE_SHIFT;
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
> > index 582e6a5..bf945a3 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_debug.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_debug.c
> > @@ -97,7 +97,7 @@ i915_verify_lists(struct drm_device *dev)
> >  		}
> >  	}
> >  
> > -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, list) {
> > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, list) {
> >  		if (obj->base.dev != dev ||
> >  		    !atomic_read(&obj->base.refcount.refcount)) {
> >  			DRM_ERROR("freed inactive %p\n", obj);
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index 6e620f86..92856a2 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -86,7 +86,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  				 cache_level);
> >  
> >  	/* First see if there is a large enough contiguous idle region... */
> > -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
> > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list) {
> >  		if (mark_free(obj, &unwind_list))
> >  			goto found;
> >  	}
> > @@ -95,7 +95,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> >  		goto none;
> >  
> >  	/* Now merge in the soon-to-be-expired objects... */
> > -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
> > +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
> >  		if (mark_free(obj, &unwind_list))
> >  			goto found;
> >  	}
> > @@ -158,8 +158,8 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  	bool lists_empty;
> >  	int ret;
> >  
> > -	lists_empty = (list_empty(&dev_priv->mm.inactive_list) &&
> > -		       list_empty(&dev_priv->mm.active_list));
> > +	lists_empty = (list_empty(&i915_gtt_vm->inactive_list) &&
> > +		       list_empty(&i915_gtt_vm->active_list));
> >  	if (lists_empty)
> >  		return -ENOSPC;
> >  
> > @@ -177,7 +177,7 @@ i915_gem_evict_everything(struct drm_device *dev)
> >  
> >  	/* Having flushed everything, unbind() should never raise an error */
> >  	list_for_each_entry_safe(obj, next,
> > -				 &dev_priv->mm.inactive_list, mm_list)
> > +				 &i915_gtt_vm->inactive_list, mm_list)
> >  		if (obj->pin_count == 0)
> >  			WARN_ON(i915_gem_object_unbind(obj));
> >  
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index 49e8be7..3f6564d 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -384,7 +384,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> >  	obj->has_global_gtt_mapping = 1;
> >  
> >  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > -	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > +	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> >  
> >  	return obj;
> >  }
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 1e25920..5dc055a 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -1722,7 +1722,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> >  	}
> >  
> >  	seqno = ring->get_seqno(ring, false);
> > -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
> > +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
> >  		if (obj->ring != ring)
> >  			continue;
> >  
> > @@ -1857,7 +1857,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> >  	int i;
> >  
> >  	i = 0;
> > -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
> > +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list)
> >  		i++;
> >  	error->active_bo_count = i;
> >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> > @@ -1877,7 +1877,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> >  		error->active_bo_count =
> >  			capture_active_bo(error->active_bo,
> >  					  error->active_bo_count,
> > -					  &dev_priv->mm.active_list);
> > +					  &i915_gtt_vm->active_list);
> >  
> >  	if (error->pinned_bo)
> >  		error->pinned_bo_count =
> > -- 
> > 1.8.3.1
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-07-01 19:08         ` Daniel Vetter
@ 2013-07-01 22:59           ` Ben Widawsky
  2013-07-02  7:28             ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-01 22:59 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Mon, Jul 01, 2013 at 09:08:58PM +0200, Daniel Vetter wrote:
> On Mon, Jul 1, 2013 at 8:43 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> >> All of this is addressed in future patches. As we've discussed, I think
> >> I'll have to respin it anyway, so I'll name it as such upfront. To me it
> >> felt a little weird to start calling things "ggtt" before I made the
> >> separation.
> >
> > I think now that we know what the end result should (more or less at
> > least) look like we can aim to make it right the first time we touch a
> > piece of code. That will reduce the churn in the patch series and so
> > make the beast easier to review.
> >
> > Imo foreshadowing (to keep consistent with the "a patch series should
> > tell a story" analogy) is perfectly fine, and in many cases helps in
> > understanding the big picture of a large pile of patches.
> 
> I've forgotten to add one thing: If you switch these again later on
> (lazy me didn't check for that) it's imo best to stick with those
> names (presuming they fit, since the gtt_size vs. obj->size
> distinction is a rather important one). Again I think now that we know
> where to go, it's best to get there with as few intermediate steps
> as possible.
> -Daniel
>

I don't recall object size being very important actually, so I don't
think the distinction is too important, but I'm just arguing for the
sake of arguing. With the sg page stuff that Imre did, I think most size
calculations unrelated to gtt size are there anyway, and most of our mm
(not page allocation) code should only ever care about the gtt.

> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

* Re: [PATCH 26/66] drm/i915: Move active/inactive lists to new mm
  2013-07-01 22:56     ` Ben Widawsky
@ 2013-07-02  7:26       ` Daniel Vetter
  2013-07-02 16:47         ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-07-02  7:26 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 01, 2013 at 03:56:50PM -0700, Ben Widawsky wrote:
> On Sun, Jun 30, 2013 at 05:38:16PM +0200, Daniel Vetter wrote:
> > On Thu, Jun 27, 2013 at 04:30:27PM -0700, Ben Widawsky wrote:
> > > > for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.inactive_list/i915_gtt_vm-\>inactive_list/" $file; done
> > > > for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.active_list/i915_gtt_vm-\>active_list/" $file; done
> > > 
> > > I've also opted to move the comments out of line a bit so one can get a
> > > better picture of what the various lists do.
> > 
> > Bikeshed: That makes you now inconsistent with all the other in-detail
> > > structure member comments we have. And I don't see how it looks better,
> > so I'd vote to keep things as-is with per-member comments.
> >
> Initially I moved all the comments (in the original mm destruction I
> did).

I mean to keep the per-struct-member comments right next to each
individual declaration.

> > > v2: Leave the bound list as a global one. (Chris, indirectly)
> > > 
> > > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > 
> > The real comment though is on the commit message: it fails to explain why
> > we want to move the active/inactive lists from mm/obj to the address
> > space/vma pair. I think I understand, but this should be explained more
> > in-depth.
> > 
> > I think in the first commit which starts moving those lists and execution
> > tracking state you should also mention why some of the state
> > (bound/unbound lists e.g.) is not moved.
> > 
> > Cheers, Daniel
> 
> Can I use, "because Chris told me to"? :p

I think some high-level explanation should be doable ;-) E.g. when moving
the lists around explain that the active/inactive stuff is used by
eviction when we run out of address space, so needs to be per-vma and
per-address space. Bound/unbound otoh is used by the shrinker which only
cares about the amount of memory used and not one bit about which
address space that memory is used in. Of course, to actually kick out an
object we need to unbind it from every address space, but for that we have
the per-object list of vmas.
-Daniel
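
A compact sketch of the resulting split (field names as used in the
series; the comments are editorial glosses on the rationale above):

#include <linux/list.h>
#include <drm/drm_mm.h>

struct i915_address_space {
	struct drm_mm mm;
	/* Per-VM LRUs: eviction runs when one address space fills up. */
	struct list_head active_list;
	struct list_head inactive_list;
	/* ... */
};

struct i915_gem_mm {
	/* Global lists: the shrinker cares about system memory usage,
	 * regardless of which VM(s) an object happens to be bound in. */
	struct list_head bound_list;
	struct list_head unbound_list;
	/* ... */
};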

> [...]
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-07-01 22:59           ` Ben Widawsky
@ 2013-07-02  7:28             ` Daniel Vetter
  2013-07-02 16:51               ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Daniel Vetter @ 2013-07-02  7:28 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 01, 2013 at 03:59:51PM -0700, Ben Widawsky wrote:
> On Mon, Jul 01, 2013 at 09:08:58PM +0200, Daniel Vetter wrote:
> > On Mon, Jul 1, 2013 at 8:43 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > >> All of this is addressed in future patches. As we've discussed, I think
> > >> I'll have to respin it anyway, so I'll name it as such upfront. To me it
> > >> felt a little weird to start calling things "ggtt" before I made the
> > >> separation.
> > >
> > > I think now that we know what the end result should (more or less at
> > > least) look like we can aim to make it right the first time we touch a
> > > piece of code. That will reduce the churn in the patch series and so
> > > make the beast easier to review.
> > >
> > > Imo foreshadowing (to keep consistent with the "a patch series should
> > > tell a story" analogy) is perfectly fine, and in many cases helps in
> > > understanding the big picture of a large pile of patches.
> > 
> > I've forgotten to add one thing: If you switch these again later on
> > (lazy me didn't check for that) it's imo best to stick with those
> > names (presuming they fit, since the gtt_size vs. obj->size
> > distinction is a rather important one). Again I think now that we know
> > where to go to it's best to get there with as few intermediate steps
> > as possible.
> > -Daniel
> >
> 
> I don't recall object size being very important actually, so I don't
> think the distinction is too important, but I'm just arguing for the
> sake of arguing. With the sg page stuff that Imre did, I think most size
> calculations unrelated to gtt size are there anyway, and most of our mm
> (not page allocation) code should only ever care about the gtt.

The distinction is only important on gen2/3, which is why you don't recall
it being important ;-)

I think you have two options:
- Trust me that it's indeed important.
- Read up on gen2/3 fencing code and make up your own mind.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-07-01 22:36   ` Ben Widawsky
@ 2013-07-02  7:43     ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-02  7:43 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Mon, Jul 01, 2013 at 03:36:13PM -0700, Ben Widawsky wrote:
> On Mon, Jul 01, 2013 at 11:39:30PM +0200, Daniel Vetter wrote:
> > Hi Ben
> > 
> > So first things first: I rather like what the code looks like overall at
> > the end. I've done a light read-through (by far not a full review) and
> > besides a few bikesheds (all mentioned by mail already) the big thing is
> > the 1:1 context:ppgtt address space relationship.
> > 
> > We've discussed this at length in private irc and agreed that we need to
> > change this to an n:1 relationship, so I'll just reiterate the reasons
> > for that on the list:
> > 
> > - Current userspace expects that different contexts created on the same fd
> >   all use the same address space (since there's really only one). So if we
> >   want to not add a new ABI (and for testing I really think we want to
> >   enable ppgtt on current unchanged userspace) we must keep that promise.
> >   Hence we need to be able to point the different contexts created on an
> >   fd all at the same (per-fd) address space.
> > 
> >   If we want we could later on extend this and allow a context to have a
> >   private address space all of its own.
> > 
> > - HW contexts are pretty much the execution/scheduling primitive the hw
> >   provides us with. On current platforms that's a bit a stretch, but it's
> >   much clearer on future platforms.
> > 
> >   The equivalent concept on the cpu would be threads, and history
> >   unanimously established that having multiple threads in the same process
> >   is a useful concept. So I think allowing the same N:1 context:ppgtt
> >   relation on gpus is sound. Of course that does not preclude other ways
> >   to share individual buffers, which we already support with
> >   flink/dma_buf.
> > 
> > With that big issue resolved there's only the bikesheds left. I'm not
> > really worried about those, and in any case we already have some good
> > discussions going on about them.
> 
> I've discussed this with the Mesa team, and I believe this is what they
> want. I'll highlight the important bit for TL;DR people:
> >   Hence we need to be able to point the different contexts created on an
> >   fd all at the same (per-fd) address space.
> 
> If one wants a new address space, they will have to open a new fd.

Yeah, that's the gist of it.

> > So merge plan:
> > 
> > Since the context<->ppgtt relation needs to be revamped I'd like to punt
> > on patches in that area at first and concentrate on merging the
> > address_space/vma conversion and all the prep work leading to that first.
> 
> The main idea [ack, or nak] is the ppgtt becomes *ppgtt, and the context
> will refcount it on context creation/destruction - that way all the
> existing tricky context refcounting should still work, and the last
> context to down ref a ppgtt will destroy it in its handler.

Yep, that's what I have in mind.
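
For concreteness, a minimal sketch of that scheme (assuming a kref
embedded in the ppgtt; the names here are illustrative, not a final
interface):

  struct i915_hw_ppgtt {
    struct i915_address_space base;
    struct kref ref;  /* one ref per context using this address space */
  };

  static void ppgtt_release(struct kref *ref)
  {
    struct i915_hw_ppgtt *ppgtt =
      container_of(ref, struct i915_hw_ppgtt, ref);

    /* the last context to go tears down the page tables */
    ppgtt->base.cleanup(&ppgtt->base);
    kfree(ppgtt);
  }

  /* on context creation: share the per-fd address space */
  kref_get(&file_priv->ppgtt->ref);
  ctx->ppgtt = file_priv->ppgtt;

  /* in i915_gem_context_free(): drop this context's reference */
  kref_put(&ctx->ppgtt->ref, ppgtt_release);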

> > We're already discussing some of the details, so I won't repeat that here
> > again. I think for the overall approach I wouldn't bother with rebasing -
> > shuffling around such a massive&invasive patch series is a major pain and
> > rather error-prone.
> > 
> > Instead I'd just freeze the current branch as a known-working reference
> > point and cherry-pick individual subseries out of it. Also some patches
> > will need to be redone, but thanks to the benefit of hindsight I hope that
> > v2 will have much less churn. Again I've tossed around a few ideas in
> > replies to individual patches.
> > 
> > For prep work I see a few pieces:
> > 
> > - drm_mm related prep work, especially drm_mm.c core changes. I think
> >   simply reordering all the relevant patches and resubmitting them (with
> >   cc: dri-devel) is all we need to get those in (minus the oddball last
> >   minute bikeshed).
> > 
> > - Small static inline functions to ease the pain of the conversion. I
> >   think those need to be redone with as much foreshadowing as possible (so
> >   that later patches don't suffer from overblown diff sizes). Maybe we
> >   should also do the lookup-helpers upfront.
> > 
> > - Other prep work like killing obj->gtt_offset and stuff I've missed (but
> >   which doesn't touch the context<->ppgtt relation).
> 
> I think this might be mergeable now. Did you try and have conflicts?

It's mergeable but imo it makes much more sense to add the gtt_space access
helpers first and then embed the drm_mm_node. Same for killing gtt_offset.
Atm your patch ordering for those three things is the wrong way round,
resulting in needless amounts of diff churn. I should have replied to that
patch with this comment, but maybe I've failed ...

> > - I think we can also merge as much of the hw interfacing code as possible
> >   up-front (or in paralle), e.g. converting the pde loading to LRI.
> 
> I was thinking this as well. I wasn't sure how you'd feel about the
> idea. I'd really like that. This should also have no conflicts at
> present (that I'm aware of).

Yeah, if you can prep a subseries of cherry-picked hw interface prep stuff I
can merge it right away. Should also make reviewing stuff a bit easier if
it's split out from the main thing.

> > Then we can slurp all the address_space/vma conversion patches. Imo the
> > important part is to fledge out the commit message with the hindsight
> > insights and explain a bit why certain stuff gets moved and why other
> > stuff should stay where it currently is. We also need to review whether
> > bisecting isn't broken anywhere, but since we don't yet add a real ppgtt I
> > don't expect any issues.
> > 
> > Once that's all in we can revisit the context vs. ppgtt question and
> > figure out how to make it work. I expect that we need to refcount ppgtt
> > address spaces. But if we keep the per-fd contexts around (imo a sane idea
> > anyway) we should get by when just the contexts hold references onto the
> > ppgtt object. Since the context is guaranteed to stay around until the
> > last object is inactive again that should ensure that the ppgtt address
> > space stays around for long enough, too. And it would avoid the ugliness
> > of adding more tricky active refcounting on top of the context/obj
> > refcounting we already have.
> > 
> > Comments?
> 
> The high level plan sounds totally fine to me, I still have some issues
> with how to break up the huge changes in the VMA/VM conversion since
> many interfaces need to change, and that gets messy. In the end I like
> what I did where I split out the 6 or 7 major changes for review. Would
> you be okay with that again? Maybe if all goes well, it won't be a
> problem.

The current vma patches have some scary looking warnings about bisecting,
which need to be addressed. One trick could be to simply embed the vma
object temporarily into the gem_object until everything is moved into the
new place. That way the logic stays the same, we only have the code churn
to deal with (like sprinkling vma arguments instead of obj arguments over
tons of functions). Even the list handling code could be implemented while
the (single) vma is still embedded.

Then the actual behaviour change of a free-standing vma would reduce a lot
(and that change is the tricky one for bisecting I'd guess).
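
A sketch of that transitional step (field and helper names purely
illustrative):

  /* transitional form: the (single) vma stays embedded in the object,
   * so the logic is unchanged while functions grow vma arguments */
  struct drm_i915_gem_object {
    /* ... existing members ... */
    struct i915_vma vma;  /* becomes a vma_list once vmas are free-standing */
  };

  static inline struct i915_vma *
  i915_gem_obj_to_vma(struct drm_i915_gem_object *obj,
                      struct i915_address_space *vm)
  {
    return &obj->vma;  /* later: look up the vma matching vm in the list */
  }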

This approach for vmas would mirror your current approach for growing the
address_space struct by piecewise moving stuff from dev_priv->mm to the
(global gtt) address space struct.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 61/66] drm/i915: Use multiple VMs
  2013-06-27 23:43   ` Ben Widawsky
@ 2013-07-02 10:58     ` Ville Syrjälä
  2013-07-02 11:07       ` Chris Wilson
  0 siblings, 1 reply; 124+ messages in thread
From: Ville Syrjälä @ 2013-07-02 10:58 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:43:40PM -0700, Ben Widawsky wrote:
> On Thu, Jun 27, 2013 at 04:31:02PM -0700, Ben Widawsky wrote:
> > This requires doing an actual switch of the page tables during the
> > context switch/execbuf.
> > 
> > Along the way, cut away as much "aliasing" ppgtt as possible
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c            | 22 +++++++++++++---------
> >  drivers/gpu/drm/i915/i915_gem_context.c    | 29 +++++++++++++++++------------
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 27 ++++++++++++++++++++-------
> >  3 files changed, 50 insertions(+), 28 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index af0150e..f05d585 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2170,7 +2170,10 @@ request_to_vm(struct drm_i915_gem_request *request)
> >  	struct drm_i915_private *dev_priv = request->ring->dev->dev_private;
> >  	struct i915_address_space *vm;
> >  
> > -	vm = &dev_priv->gtt.base;
> > +	if (request->ctx)
> > +		vm = &request->ctx->ppgtt.base;
> > +	else
> > +		vm = &dev_priv->gtt.base;
> >  
> >  	return vm;
> >  }
> > @@ -2676,10 +2679,10 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj,
> >  
> >  	if (obj->has_global_gtt_mapping && is_i915_ggtt(vm))
> >  		i915_gem_gtt_unbind_object(obj);
> > -	if (obj->has_aliasing_ppgtt_mapping) {
> > -		i915_ppgtt_unbind_object(dev_priv->gtt.aliasing_ppgtt, obj);
> > -		obj->has_aliasing_ppgtt_mapping = 0;
> > -	}
> > +
> > +	vm->clear_range(vm, i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
> > +			obj->base.size >> PAGE_SHIFT);
> > +
> >  	i915_gem_gtt_finish_object(obj);
> >  	i915_gem_object_unpin_pages(obj);
> >  
> > @@ -3444,11 +3447,12 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
> >  				return ret;
> >  		}
> >  
> > -		if (obj->has_global_gtt_mapping)
> > +		if (!is_i915_ggtt(vm) && obj->has_global_gtt_mapping)
> >  			i915_gem_gtt_bind_object(obj, cache_level);

Are you planning to kill i915_gem_gtt_(un)bind_object? In many cases you
seem to end up writing the global GTT PTEs twice because of it. I guess
the only catch is that obj->has_gtt_mapping must be kept in sync w/
reality if you kill it.

> > -		if (obj->has_aliasing_ppgtt_mapping)
> > -			i915_ppgtt_bind_object(dev_priv->gtt.aliasing_ppgtt,
> > -					       obj, cache_level);
> > +
> > +		vm->insert_entries(vm, obj->pages,
> > +				   i915_gem_obj_offset(obj, vm) >> PAGE_SHIFT,
> > +				   cache_level);
> >  
> >  		i915_gem_obj_set_color(obj, vm, cache_level);

This cache level stuff ends up looking a bit wonky, but I guess you
didn't spend much time on it yet.

I'll just note that I don't think we can allow per address space cache
levels through this interface since that would break the case where the
client renders to the buffer, then the server/compositor sets it to
uncached and does a page flip. After this the client has to also use
uncached mappings or the server/compositor won't realize that it would
need to clflush again before page flipping.

So I think you can just eliminate the vm argument from this function and
then the function could look something like this (pardon my pseudo
code):

  ...
  if (bound_any(obj)) {
    finish_gpu
    finish_gtt
    put_fence
    ...
  }

  for_each_safe(obj->vma_list) {
    vm = vma->vm;
    node = vma->node;
    /* unbind when the node placement is invalid for the new level */
    if (!verify_gtt(node, cache_level)) {
      unbind(obj, vm);
      continue;
    }
    vm->insert_entries();
    vma->color = cache_level;
  }
  ...


This also got me thinking about MOCS. That seems a bit dangerous
in this case since the client can easily override the cache level w/o
the server/compositor noticing. I guess we just have to make a rule that
clients aren't allowed to override cache level w/ MOCS for window
system buffers.

Also IIRC someone told me that w/ uncached mappings the caches aren't
snooped even on LLC platforms. If that's true, MOCS seems even more
dangerous since the client could easily mix cached and uncached
accesses. I don't really understand why uncached mappings wouldn't
snoop on LLC platforms since the snooping should be more or less free.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 42/66] drm/i915: Clean up VMAs before freeing
  2013-06-27 23:30 ` [PATCH 42/66] drm/i915: Clean up VMAs before freeing Ben Widawsky
@ 2013-07-02 10:59   ` Ville Syrjälä
  2013-07-02 16:58     ` Ben Widawsky
  0 siblings, 1 reply; 124+ messages in thread
From: Ville Syrjälä @ 2013-07-02 10:59 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Thu, Jun 27, 2013 at 04:30:43PM -0700, Ben Widawsky wrote:
> It's quite common for an object to simply be on the inactive list (and
> not unbound) when we want to free the context. This of course happens
> with lazy unbinding. Simply, this is needed when an object isn't fully
> unbound but we want to free one VMA of the object, for whatever reason.
> 
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  1 +
>  drivers/gpu/drm/i915/i915_gem.c         | 28 ++++++++++++++++++++++++++++
>  drivers/gpu/drm/i915/i915_gem_context.c |  1 +
>  3 files changed, 30 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 0bc4251..9febcdd 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1674,6 +1674,7 @@ void i915_gem_free_object(struct drm_gem_object *obj);
>  struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm);
>  void i915_gem_vma_destroy(struct i915_vma *vma);
> +void i915_gem_vma_cleanup(struct i915_address_space *vm);
>  
>  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
>  				     struct i915_address_space *vm,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 12d0e61..9abc3c8 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4134,6 +4134,34 @@ void i915_gem_vma_destroy(struct i915_vma *vma)
>  	kfree(vma);
>  }
>  
> +/* This is like unbind() but without gtt considerations */
> +void i915_gem_vma_cleanup(struct i915_address_space *vm)
> +{
> +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> +	struct i915_vma *vma, *n;
> +
> +	BUG_ON(is_i915_ggtt(vm));
> +	WARN_ON(!list_empty(&vm->active_list));
> +
> +	list_for_each_entry_safe(vma, n, &vm->vma_list, per_vm_link) {
> +		struct drm_i915_gem_object *obj = vma->obj;
> +
> +		if (WARN_ON(!i915_gem_obj_bound(obj, vm)))
> +			continue;
> +
> +		i915_gem_object_unpin_pages(obj);
> +
> +		list_del(&vma->mm_list);
> +		list_del(&vma->vma_link);
> +		drm_mm_remove_node(&vma->node);
> +		i915_gem_vma_destroy(vma);

Is there a good reason why all of that stuff isn't included in
i915_gem_vma_destroy()? It seems like it should be there.

> +
> +		if (list_empty(&obj->vma_list))
> +			list_move_tail(&obj->global_list,
> +				       &dev_priv->mm.unbound_list);
> +	}
> +}
> +
>  int
>  i915_gem_idle(struct drm_device *dev)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 988123f..c45cd5c 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -129,6 +129,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
>  	struct i915_hw_context *ctx = container_of(ctx_ref,
>  						   typeof(*ctx), ref);
>  
> +	i915_gem_vma_cleanup(&ctx->ppgtt.base);
>  	if (ctx->ppgtt.cleanup)
>  		ctx->ppgtt.cleanup(&ctx->ppgtt);
>  	drm_gem_object_unreference(&ctx->obj->base);
> -- 
> 1.8.3.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 61/66] drm/i915: Use multiple VMs
  2013-07-02 10:58     ` Ville Syrjälä
@ 2013-07-02 11:07       ` Chris Wilson
  2013-07-02 11:34         ` Ville Syrjälä
  0 siblings, 1 reply; 124+ messages in thread
From: Chris Wilson @ 2013-07-02 11:07 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Ben Widawsky, Intel GFX

On Tue, Jul 02, 2013 at 01:58:13PM +0300, Ville Syrjälä wrote:
> Also IIRC someone told me that w/ uncached mappings the caches aren't
> snooped even on LLC platforms. If that's true, MOCS seems even more
> dangerous since the client could easily mix cached and uncached
> accesses. I don't really understand why uncached mappings wouldn't
> snoop on LLC platforms since the snooping should be more or less free.

Who said that? Doesn't appear to be the case on SNB/IVB/HSW as far as I
can tell.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 61/66] drm/i915: Use multiple VMs
  2013-07-02 11:07       ` Chris Wilson
@ 2013-07-02 11:34         ` Ville Syrjälä
  2013-07-02 11:38           ` Chris Wilson
  0 siblings, 1 reply; 124+ messages in thread
From: Ville Syrjälä @ 2013-07-02 11:34 UTC (permalink / raw)
  To: Chris Wilson, Ben Widawsky, Intel GFX

On Tue, Jul 02, 2013 at 12:07:10PM +0100, Chris Wilson wrote:
> On Tue, Jul 02, 2013 at 01:58:13PM +0300, Ville Syrjälä wrote:
> > Also IIRC someone told me that w/ uncached mappings the caches aren't
> > snooped even on LLC platforms. If that's true, MOCS seems even more
> > dangerous since the client could easily mix cached and uncached
> > accesses. I don't really understand why uncached mappings wouldn't
> > snoop on LLC platforms since the snooping should be more or less free.
> 
> Who said that? Doesn't appear to be the case on SNB/IVB/HSW as far as I
> can tell.

Can't remember now who said it, or could be I just misunderstood. But if
it's not true, then MOCS seems actually useful. Otherwise it's going to
be a clflush fest when you want to change from cached to uncached.

-- 
Ville Syrjälä
Intel OTC

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 61/66] drm/i915: Use multiple VMs
  2013-07-02 11:34         ` Ville Syrjälä
@ 2013-07-02 11:38           ` Chris Wilson
  2013-07-02 12:34             ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Chris Wilson @ 2013-07-02 11:38 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Ben Widawsky, Intel GFX

On Tue, Jul 02, 2013 at 02:34:59PM +0300, Ville Syrjälä wrote:
> On Tue, Jul 02, 2013 at 12:07:10PM +0100, Chris Wilson wrote:
> > On Tue, Jul 02, 2013 at 01:58:13PM +0300, Ville Syrjälä wrote:
> > > Also IIRC someone told me that w/ uncached mappings the caches aren't
> > > snooped even on LLC platforms. If that's true, MOCS seems even more
> > > dangerous since the client could easily mix cached and uncached
> > > accesses. I don't really understand why uncached mappings wouldn't
> > > snoop on LLC platforms since the snooping should be more or less free.
> > 
> > Who said that? Doesn't appear to be the case on SNB/IVB/HSW as far as I
> > can tell.
> 
> Can't remember now who said it, or could be I just misunderstood. But if
> it's not true, then MOCS seems actually useful. Otherwise it's going to
> be a clflush fest when you want to change from cached to uncached.

Well MOCS doesn't apply for GTT access by the CPU, so there you only
need to be concerned about whether you are rendering to a scanout. That
is tracked in the ddx, but as you point out mesa would have to assume
that any winsys buffer is a potential scanout unless we add an interface
for it to ask the display server.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 61/66] drm/i915: Use multiple VMs
  2013-07-02 11:38           ` Chris Wilson
@ 2013-07-02 12:34             ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-02 12:34 UTC (permalink / raw)
  To: Chris Wilson, Ville Syrjälä, Ben Widawsky, Intel GFX

On Tue, Jul 02, 2013 at 12:38:33PM +0100, Chris Wilson wrote:
> On Tue, Jul 02, 2013 at 02:34:59PM +0300, Ville Syrjälä wrote:
> > On Tue, Jul 02, 2013 at 12:07:10PM +0100, Chris Wilson wrote:
> > > On Tue, Jul 02, 2013 at 01:58:13PM +0300, Ville Syrjälä wrote:
> > > > Also IIRC someone told me that w/ uncached mappings the caches aren't
> > > > snooped even on LLC platforms. If that's true, MOCS seems even more
> > > > dangerous since the client could easily mix cached and uncached
> > > > accesses. I don't really understand why uncached mappings wouldn't
> > > > snoop on LLC platforms since the snooping should be more or less free.
> > > 
> > > Who said that? Doesn't appear to be the case on SNB/IVB/HSW as far as I
> > > can tell.
> > 
> > Can't remember now who said it, or could be I just misunderstood. But if
> > it's not true, then MOCS seems actually useful. Otherwise it's going to
> > be a clflush fest when you want to change from cached to uncached.
> 
> Well MOCS doesn't apply for GTT access by the CPU, so there you only
> need to be concerned about whether you are rendering to a scanout. That
> is tracked in the ddx, but as you point out mesa would have to assume
> that any winsys buffer is a potential scanout unless we add an interface
> for it to ask the display server.

I think userspace should make damn sure that it doesn't change the cache
level (i.e. snooped vs. non-snooped or that special write-back mode we
have on some hsw machines). Otherwise we'd indeed be screwed, since the
clflush tracking done by the kernel would be wrong.

But thus far all the MOCS stuff seems to only be used to select in which
caches and at what age we should allocate cachelines ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 26/66] drm/i915: Move active/inactive lists to new mm
  2013-07-02  7:26       ` Daniel Vetter
@ 2013-07-02 16:47         ` Ben Widawsky
  0 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-07-02 16:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 02, 2013 at 09:26:45AM +0200, Daniel Vetter wrote:
> On Mon, Jul 01, 2013 at 03:56:50PM -0700, Ben Widawsky wrote:
> > On Sun, Jun 30, 2013 at 05:38:16PM +0200, Daniel Vetter wrote:
> > > On Thu, Jun 27, 2013 at 04:30:27PM -0700, Ben Widawsky wrote:
> > > > for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.inactive_list/i915_gtt_mm-\>inactive_list/" $file; done
> > > > for file in `ls drivers/gpu/drm/i915/*.c` ; do sed -i "s/dev_priv->mm.active_list/i915_gtt_mm-\>active_list/" $file; done
> > > > 
> > > > I've also opted to move the comments out of line a bit so one can get a
> > > > better picture of what the various lists do.
> > > 
> > > Bikeshed: That makes you now inconsistent with all the other in-detail
> > > structure memeber comments we have. And I don't see how it looks better,
> > > so I'd vote to keep things as-is with per-member comments.
> > >
> > Initially I moved all the comments (in the original mm destruction I
> > did).
> 
> I mean to keep the per-struct-member comments right next to each
> individual declaration.

I meant, in the initial version I had a big blob where I wrote about all
the tracking, and what each list did. It actually was pretty cool, but
at that time I was trying to track [un]bound with the vm.

> 
> > > > v2: Leave the bound list as a global one. (Chris, indirectly)
> > > > 
> > > > CC: Chris Wilson <chris@chris-wilson.co.uk>
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > 
> > > The real comment though is on the commit message, it fails to explain why
> > > we want to move the active/inactive lists from mm/obj to the address
> > > space/vma pair. I think I understand, but this should be explained more
> > > in-depth.
> > > 
> > > I think in the first commit which starts moving those lists and execution
> > > tracking state you should also mention why some of the state
> > > (bound/unbound lists e.g.) are not moved.
> > > 
> > > Cheers, Daniel
> > 
> > Can I use, "because Chris told me to"? :p
> 
> I think some high-level explanation should be doable ;-) E.g. when moving
> the lists around explain that the active/inactive stuff is used by
> eviction when we run out of address space, so needs to be per-vma and
> per-address space. Bound/unbound otoh is used by the shrinker which only
> cares about the amount of memory used and not one bit about in which
> address space this memory is all used in. Of course to actually kick out an
> object we need to unbind it from every address space, but for that we have
> the per-object list of vmas.
> -Daniel

I was being facetious, but thanks for writing the commit message for me
:D

> 
> > 
> > > 
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_debugfs.c    | 11 ++++----
> > > >  drivers/gpu/drm/i915/i915_drv.h        | 49 ++++++++++++++--------------------
> > > >  drivers/gpu/drm/i915/i915_gem.c        | 24 +++++++----------
> > > >  drivers/gpu/drm/i915/i915_gem_debug.c  |  2 +-
> > > >  drivers/gpu/drm/i915/i915_gem_evict.c  | 10 +++----
> > > >  drivers/gpu/drm/i915/i915_gem_stolen.c |  2 +-
> > > >  drivers/gpu/drm/i915/i915_irq.c        |  6 ++---
> > > >  7 files changed, 46 insertions(+), 58 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > index f3c76ab..a0babc7 100644
> > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > @@ -158,11 +158,11 @@ static int i915_gem_object_list_info(struct seq_file *m, void *data)
> > > >  	switch (list) {
> > > >  	case ACTIVE_LIST:
> > > >  		seq_printf(m, "Active:\n");
> > > > -		head = &dev_priv->mm.active_list;
> > > > +		head = &i915_gtt_vm->active_list;
> > > >  		break;
> > > >  	case INACTIVE_LIST:
> > > >  		seq_printf(m, "Inactive:\n");
> > > > -		head = &dev_priv->mm.inactive_list;
> > > > +		head = &i915_gtt_vm->inactive_list;
> > > >  		break;
> > > >  	default:
> > > >  		mutex_unlock(&dev->struct_mutex);
> > > > @@ -247,12 +247,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
> > > >  		   count, mappable_count, size, mappable_size);
> > > >  
> > > >  	size = count = mappable_size = mappable_count = 0;
> > > > -	count_objects(&dev_priv->mm.active_list, mm_list);
> > > > +	count_objects(&i915_gtt_vm->active_list, mm_list);
> > > >  	seq_printf(m, "  %u [%u] active objects, %zu [%zu] bytes\n",
> > > >  		   count, mappable_count, size, mappable_size);
> > > >  
> > > >  	size = count = mappable_size = mappable_count = 0;
> > > > -	count_objects(&dev_priv->mm.inactive_list, mm_list);
> > > > +	count_objects(&i915_gtt_vm->inactive_list, mm_list);
> > > >  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
> > > >  		   count, mappable_count, size, mappable_size);
> > > >  
> > > > @@ -1977,7 +1977,8 @@ i915_drop_caches_set(void *data, u64 val)
> > > >  		i915_gem_retire_requests(dev);
> > > >  
> > > >  	if (val & DROP_BOUND) {
> > > > -		list_for_each_entry_safe(obj, next, &dev_priv->mm.inactive_list, mm_list)
> > > > +		list_for_each_entry_safe(obj, next, &i915_gtt_vm->inactive_list,
> > > > +					 mm_list)
> > > >  			if (obj->pin_count == 0) {
> > > >  				ret = i915_gem_object_unbind(obj);
> > > >  				if (ret)
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > > index e65cf57..0553410 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > > @@ -448,6 +448,22 @@ struct i915_address_space {
> > > >  	unsigned long start;		/* Start offset always 0 for dri2 */
> > > >  	size_t total;		/* size addr space maps (ex. 2GB for ggtt) */
> > > >  
> > > > +/* We use many types of lists for object tracking:
> > > > + *  active_list: List of objects currently involved in rendering.
> > > > + *	Includes buffers having the contents of their GPU caches flushed, not
> > > > + *	necessarily primitives. last_rendering_seqno represents when the
> > > > + *	rendering involved will be completed. A reference is held on the buffer
> > > > + *	while on this list.
> > > > + *  inactive_list: LRU list of objects which are not in the ringbuffer
> > > > + *	objects are ready to unbind but are still mapped.
> > > > + *	last_rendering_seqno is 0 while an object is in this list.
> > > > + *	A reference is not held on the buffer while on this list,
> > > > + *	as merely being GTT-bound shouldn't prevent its being
> > > > + *	freed, and we'll pull it off the list in the free path.
> > > > + */
> > > > +	struct list_head active_list;
> > > > +	struct list_head inactive_list;
> > > > +
> > > >  	struct {
> > > >  		dma_addr_t addr;
> > > >  		struct page *page;
> > > > @@ -835,42 +851,17 @@ struct intel_l3_parity {
> > > >  };
> > > >  
> > > >  struct i915_gem_mm {
> > > > -	/** List of all objects in gtt_space. Used to restore gtt
> > > > -	 * mappings on resume */
> > > > -	struct list_head bound_list;
> > > >  	/**
> > > > -	 * List of objects which are not bound to the GTT (thus
> > > > -	 * are idle and not used by the GPU) but still have
> > > > -	 * (presumably uncached) pages still attached.
> > > > +	 * Lists of objects which are [not] bound to a VM. Unbound objects are
> > > > +	 * idle but still have (presumably uncached) pages still
> > > > +	 * attached.
> > > >  	 */
> > > > +	struct list_head bound_list;
> > > >  	struct list_head unbound_list;
> > > >  
> > > >  	struct shrinker inactive_shrinker;
> > > >  	bool shrinker_no_lock_stealing;
> > > >  
> > > > -	/**
> > > > -	 * List of objects currently involved in rendering.
> > > > -	 *
> > > > -	 * Includes buffers having the contents of their GPU caches
> > > > -	 * flushed, not necessarily primitives.  last_rendering_seqno
> > > > -	 * represents when the rendering involved will be completed.
> > > > -	 *
> > > > -	 * A reference is held on the buffer while on this list.
> > > > -	 */
> > > > -	struct list_head active_list;
> > > > -
> > > > -	/**
> > > > -	 * LRU list of objects which are not in the ringbuffer and
> > > > -	 * are ready to unbind, but are still in the GTT.
> > > > -	 *
> > > > -	 * last_rendering_seqno is 0 while an object is in this list.
> > > > -	 *
> > > > -	 * A reference is not held on the buffer while on this list,
> > > > -	 * as merely being GTT-bound shouldn't prevent its being
> > > > -	 * freed, and we'll pull it off the list in the free path.
> > > > -	 */
> > > > -	struct list_head inactive_list;
> > > > -
> > > >  	/** LRU list of objects with fence regs on them. */
> > > >  	struct list_head fence_list;
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > > index 608b6b5..7da06df 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > > @@ -1706,7 +1706,7 @@ __i915_gem_shrink(struct drm_i915_private *dev_priv, long target,
> > > >  	}
> > > >  
> > > >  	list_for_each_entry_safe(obj, next,
> > > > -				 &dev_priv->mm.inactive_list,
> > > > +				 &i915_gtt_vm->inactive_list,
> > > >  				 mm_list) {
> > > >  		if ((i915_gem_object_is_purgeable(obj) || !purgeable_only) &&
> > > >  		    i915_gem_object_unbind(obj) == 0 &&
> > > > @@ -1881,7 +1881,7 @@ i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
> > > >  	}
> > > >  
> > > >  	/* Move from whatever list we were on to the tail of execution. */
> > > > -	list_move_tail(&obj->mm_list, &dev_priv->mm.active_list);
> > > > +	list_move_tail(&obj->mm_list, &i915_gtt_vm->active_list);
> > > >  	list_move_tail(&obj->ring_list, &ring->active_list);
> > > >  
> > > >  	obj->last_read_seqno = seqno;
> > > > @@ -1909,7 +1909,7 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
> > > >  	BUG_ON(obj->base.write_domain & ~I915_GEM_GPU_DOMAINS);
> > > >  	BUG_ON(!obj->active);
> > > >  
> > > > -	list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > > > +	list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> > > >  
> > > >  	list_del_init(&obj->ring_list);
> > > >  	obj->ring = NULL;
> > > > @@ -2279,12 +2279,8 @@ bool i915_gem_reset(struct drm_device *dev)
> > > >  	/* Move everything out of the GPU domains to ensure we do any
> > > >  	 * necessary invalidation upon reuse.
> > > >  	 */
> > > > -	list_for_each_entry(obj,
> > > > -			    &dev_priv->mm.inactive_list,
> > > > -			    mm_list)
> > > > -	{
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list)
> > > >  		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
> > > > -	}
> > > >  
> > > >  	/* The fence registers are invalidated so clear them out */
> > > >  	i915_gem_restore_fences(dev);
> > > > @@ -3162,7 +3158,7 @@ search_free:
> > > >  	}
> > > >  
> > > >  	list_move_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > > -	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > > > +	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> > > >  
> > > >  	obj->gtt_space = node;
> > > >  	obj->gtt_offset = node->start;
> > > > @@ -3313,7 +3309,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
> > > >  
> > > >  	/* And bump the LRU for this access */
> > > >  	if (i915_gem_object_is_inactive(obj))
> > > > -		list_move_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > > > +		list_move_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> > > >  
> > > >  	return 0;
> > > >  }
> > > > @@ -4291,7 +4287,7 @@ i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
> > > >  		return ret;
> > > >  	}
> > > >  
> > > > -	BUG_ON(!list_empty(&dev_priv->mm.active_list));
> > > > +	BUG_ON(!list_empty(&i915_gtt_vm->active_list));
> > > >  	mutex_unlock(&dev->struct_mutex);
> > > >  
> > > >  	ret = drm_irq_install(dev);
> > > > @@ -4352,8 +4348,8 @@ i915_gem_load(struct drm_device *dev)
> > > >  				  SLAB_HWCACHE_ALIGN,
> > > >  				  NULL);
> > > >  
> > > > -	INIT_LIST_HEAD(&dev_priv->mm.active_list);
> > > > -	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
> > > > +	INIT_LIST_HEAD(&i915_gtt_vm->active_list);
> > > > +	INIT_LIST_HEAD(&i915_gtt_vm->inactive_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> > > >  	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
> > > > @@ -4652,7 +4648,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> > > >  	list_for_each_entry(obj, &dev_priv->mm.unbound_list, global_list)
> > > >  		if (obj->pages_pin_count == 0)
> > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > > -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, global_list)
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, global_list)
> > > >  		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
> > > >  			cnt += obj->base.size >> PAGE_SHIFT;
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_debug.c b/drivers/gpu/drm/i915/i915_gem_debug.c
> > > > index 582e6a5..bf945a3 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_debug.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_debug.c
> > > > @@ -97,7 +97,7 @@ i915_verify_lists(struct drm_device *dev)
> > > >  		}
> > > >  	}
> > > >  
> > > > -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, list) {
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, list) {
> > > >  		if (obj->base.dev != dev ||
> > > >  		    !atomic_read(&obj->base.refcount.refcount)) {
> > > >  			DRM_ERROR("freed inactive %p\n", obj);
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > index 6e620f86..92856a2 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > > > @@ -86,7 +86,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > >  				 cache_level);
> > > >  
> > > >  	/* First see if there is a large enough contiguous idle region... */
> > > > -	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list) {
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->inactive_list, mm_list) {
> > > >  		if (mark_free(obj, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > > @@ -95,7 +95,7 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
> > > >  		goto none;
> > > >  
> > > >  	/* Now merge in the soon-to-be-expired objects... */
> > > > -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
> > > >  		if (mark_free(obj, &unwind_list))
> > > >  			goto found;
> > > >  	}
> > > > @@ -158,8 +158,8 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  	bool lists_empty;
> > > >  	int ret;
> > > >  
> > > > -	lists_empty = (list_empty(&dev_priv->mm.inactive_list) &&
> > > > -		       list_empty(&dev_priv->mm.active_list));
> > > > +	lists_empty = (list_empty(&i915_gtt_vm->inactive_list) &&
> > > > +		       list_empty(&i915_gtt_vm->active_list));
> > > >  	if (lists_empty)
> > > >  		return -ENOSPC;
> > > >  
> > > > @@ -177,7 +177,7 @@ i915_gem_evict_everything(struct drm_device *dev)
> > > >  
> > > >  	/* Having flushed everything, unbind() should never raise an error */
> > > >  	list_for_each_entry_safe(obj, next,
> > > > -				 &dev_priv->mm.inactive_list, mm_list)
> > > > +				 &i915_gtt_vm->inactive_list, mm_list)
> > > >  		if (obj->pin_count == 0)
> > > >  			WARN_ON(i915_gem_object_unbind(obj));
> > > >  
> > > > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > index 49e8be7..3f6564d 100644
> > > > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > > > @@ -384,7 +384,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
> > > >  	obj->has_global_gtt_mapping = 1;
> > > >  
> > > >  	list_add_tail(&obj->global_list, &dev_priv->mm.bound_list);
> > > > -	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
> > > > +	list_add_tail(&obj->mm_list, &i915_gtt_vm->inactive_list);
> > > >  
> > > >  	return obj;
> > > >  }
> > > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > > > index 1e25920..5dc055a 100644
> > > > --- a/drivers/gpu/drm/i915/i915_irq.c
> > > > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > > > @@ -1722,7 +1722,7 @@ i915_error_first_batchbuffer(struct drm_i915_private *dev_priv,
> > > >  	}
> > > >  
> > > >  	seqno = ring->get_seqno(ring, false);
> > > > -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list) {
> > > >  		if (obj->ring != ring)
> > > >  			continue;
> > > >  
> > > > @@ -1857,7 +1857,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> > > >  	int i;
> > > >  
> > > >  	i = 0;
> > > > -	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
> > > > +	list_for_each_entry(obj, &i915_gtt_vm->active_list, mm_list)
> > > >  		i++;
> > > >  	error->active_bo_count = i;
> > > >  	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
> > > > @@ -1877,7 +1877,7 @@ static void i915_gem_capture_buffers(struct drm_i915_private *dev_priv,
> > > >  		error->active_bo_count =
> > > >  			capture_active_bo(error->active_bo,
> > > >  					  error->active_bo_count,
> > > > -					  &dev_priv->mm.active_list);
> > > > +					  &i915_gtt_vm->active_list);
> > > >  
> > > >  	if (error->pinned_bo)
> > > >  		error->pinned_bo_count =
> > > > -- 
> > > > 1.8.3.1
> > > > 
> > > > _______________________________________________
> > > > Intel-gfx mailing list
> > > > Intel-gfx@lists.freedesktop.org
> > > > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> > > 
> > > -- 
> > > Daniel Vetter
> > > Software Engineer, Intel Corporation
> > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> > 
> > -- 
> > Ben Widawsky, Intel Open Source Technology Center
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-07-02  7:28             ` Daniel Vetter
@ 2013-07-02 16:51               ` Ben Widawsky
  2013-07-02 17:07                 ` Daniel Vetter
  0 siblings, 1 reply; 124+ messages in thread
From: Ben Widawsky @ 2013-07-02 16:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel GFX

On Tue, Jul 02, 2013 at 09:28:56AM +0200, Daniel Vetter wrote:
> On Mon, Jul 01, 2013 at 03:59:51PM -0700, Ben Widawsky wrote:
> > On Mon, Jul 01, 2013 at 09:08:58PM +0200, Daniel Vetter wrote:
> > > On Mon, Jul 1, 2013 at 8:43 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > >> All of this is addressed in future patches. As we've discussed, I think
> > > >> I'll have to respin it anyway, so I'll name it as such upfront. To me it
> > > >> felt a little weird to start calling things "ggtt" before I made the
> > > >> separation.
> > > >
> > > > I think now that we know what the end result should (more or less at
> > > > least) look like we can aim to make it right the first time we touch a
> > > > piece of code. That will reduce the churn in the patch series and so
> > > > make the beast easier to review.
> > > >
> > > > Imo foreshadowing (to keep consistent with the "a patch series should
> > > > tell a story" analogy) is perfectly fine, and in many cases helps in
> > > > understanding the big picture of a large pile of patches.
> > > 
> > > I've forgotten to add one thing: If you switch these again later on
> > > (lazy me didn't check for that) it's imo best to stick with those
> > > names (presuming they fit, since the gtt_size vs. obj->size
> > > distinction is a rather important one). Again I think now that we know
> > > where to go to it's best to get there with as few intermediate steps
> > > as possible.
> > > -Daniel
> > >
> > 
> > I don't recall object size being very important actually, so I don't
> > think the distinction is too important, but I'm just arguing for the
> > sake of arguing. With the sg page stuff that Imre did, I think most size
> > calculations unrelated to gtt size are there anyway, and most of our mm
> > (not page allocation) code should only ever care about the gtt.
> 
> The distinction is only important on gen2/3, which is why you don't recall
> it being important ;-)
> 
> I think you have two options:
> - Trust me that it's indeed important.
> - Read up on gen2/3 fencing code and make up your own mind.
> 
> Cheers, Daniel
I am not saying the distinction doesn't exist. I was saying it's not
prevalent in too many places. See the second part of my statement, I
believe it holds.

But please be clear what you're asking for, do you want 2 getters for
vma size vs. obj size?

> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

-- 
Ben Widawsky, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 42/66] drm/i915: Clean up VMAs before freeing
  2013-07-02 10:59   ` Ville Syrjälä
@ 2013-07-02 16:58     ` Ben Widawsky
  0 siblings, 0 replies; 124+ messages in thread
From: Ben Widawsky @ 2013-07-02 16:58 UTC (permalink / raw)
  To: Ville Syrjälä; +Cc: Intel GFX

On Tue, Jul 02, 2013 at 01:59:17PM +0300, Ville Syrjälä wrote:
> On Thu, Jun 27, 2013 at 04:30:43PM -0700, Ben Widawsky wrote:
> > It's quite common for an object to simply be on the inactive list (and
> > not unbound) when we want to free the context. This of course happens
> > with lazy unbinding. Simply, this is needed when an object isn't fully
> > unbound but we want to free one VMA of the object, for whatever reason.
> > 
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         |  1 +
> >  drivers/gpu/drm/i915/i915_gem.c         | 28 ++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/i915_gem_context.c |  1 +
> >  3 files changed, 30 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 0bc4251..9febcdd 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1674,6 +1674,7 @@ void i915_gem_free_object(struct drm_gem_object *obj);
> >  struct i915_vma *i915_gem_vma_create(struct drm_i915_gem_object *obj,
> >  				     struct i915_address_space *vm);
> >  void i915_gem_vma_destroy(struct i915_vma *vma);
> > +void i915_gem_vma_cleanup(struct i915_address_space *vm);
> >  
> >  int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
> >  				     struct i915_address_space *vm,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 12d0e61..9abc3c8 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4134,6 +4134,34 @@ void i915_gem_vma_destroy(struct i915_vma *vma)
> >  	kfree(vma);
> >  }
> >  
> > +/* This is like unbind() but without gtt considerations */
> > +void i915_gem_vma_cleanup(struct i915_address_space *vm)
> > +{
> > +	struct drm_i915_private *dev_priv = vm->dev->dev_private;
> > +	struct i915_vma *vma, *n;
> > +
> > +	BUG_ON(is_i915_ggtt(vm));
> > +	WARN_ON(!list_empty(&vm->active_list));
> > +
> > +	list_for_each_entry_safe(vma, n, &vm->vma_list, per_vm_link) {
> > +		struct drm_i915_gem_object *obj = vma->obj;
> > +
> > +		if (WARN_ON(!i915_gem_obj_bound(obj, vm)))
> > +			continue;
> > +
> > +		i915_gem_object_unpin_pages(obj);
> > +
> > +		list_del(&vma->mm_list);
> > +		list_del(&vma->vma_link);
> > +		drm_mm_remove_node(&vma->node);
> > +		i915_gem_vma_destroy(vma);
> 
> Is there a good reason why all of that stuff isn't included in
> i915_gem_vma_destroy()? It seems like it should be there.

No good reason, just sloppy. The remove node ideally should only happen
conditionally (ie. at unbind), but the list_del stuff should go in
destroy.
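
Roughly, as a sketch (leaving the drm_mm node to unbind):

  void i915_gem_vma_destroy(struct i915_vma *vma)
  {
    /* unbind (drm_mm_remove_node) must already have happened */
    WARN_ON(drm_mm_node_allocated(&vma->node));

    list_del(&vma->mm_list);
    list_del(&vma->vma_link);
    kfree(vma);
  }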

> 
> > +
> > +		if (list_empty(&obj->vma_list))
> > +			list_move_tail(&obj->global_list,
> > +				       &dev_priv->mm.unbound_list);
> > +	}
> > +}
> > +
> >  int
> >  i915_gem_idle(struct drm_device *dev)
> >  {
> > diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> > index 988123f..c45cd5c 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_context.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> > @@ -129,6 +129,7 @@ void i915_gem_context_free(struct kref *ctx_ref)
> >  	struct i915_hw_context *ctx = container_of(ctx_ref,
> >  						   typeof(*ctx), ref);
> >  
> > +	i915_gem_vma_cleanup(&ctx->ppgtt.base);
> >  	if (ctx->ppgtt.cleanup)
> >  		ctx->ppgtt.cleanup(&ctx->ppgtt);
> >  	drm_gem_object_unreference(&ctx->obj->base);
> > -- 
> > 1.8.3.1
> > 
> > _______________________________________________
> > Intel-gfx mailing list
> > Intel-gfx@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
> 
> -- 
> Ville Syrjälä
> Intel OTC

-- 
Ben Widawsky, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 30/66] drm/i915: Getter/setter for object attributes
  2013-07-02 16:51               ` Ben Widawsky
@ 2013-07-02 17:07                 ` Daniel Vetter
  0 siblings, 0 replies; 124+ messages in thread
From: Daniel Vetter @ 2013-07-02 17:07 UTC (permalink / raw)
  To: Ben Widawsky; +Cc: Intel GFX

On Tue, Jul 2, 2013 at 6:51 PM, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Tue, Jul 02, 2013 at 09:28:56AM +0200, Daniel Vetter wrote:
>> On Mon, Jul 01, 2013 at 03:59:51PM -0700, Ben Widawsky wrote:
>> > On Mon, Jul 01, 2013 at 09:08:58PM +0200, Daniel Vetter wrote:
>> > > On Mon, Jul 1, 2013 at 8:43 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> > > >> All of this is addressed in future patches. As we've discussed, I think
>> > > >> I'll have to respin it anyway, so I'll name it as such upfront. To me it
>> > > >> felt a little weird to start calling things "ggtt" before I made the
>> > > >> separation.
>> > > >
>> > > > I think now that we know what the end result should (more or less at
>> > > > least) look like we can aim to make it right the first time we touch a
>> > > > piece of code. That will reduce the churn in the patch series and so
>> > > > make the beast easier to review.
>> > > >
>> > > > Imo foreshadowing (to keep consistent with the "a patch series should
>> > > > tell a story" analogy) is perfectly fine, and in many cases helps in
>> > > > understanding the big picture of a large pile of patches.
>> > >
>> > > I've forgotten to add one thing: If you switch these again later on
>> > > (lazy me didn't check for that) it's imo best to stick with those
>> > > names (presuming they fit, since the gtt_size vs. obj->size
>> > > distinction is a rather important one). Again I think now that we know
>> > > where to go to it's best to get there with as few intermediate steps
>> > > as possible.
>> > > -Daniel
>> > >
>> >
>> > I don't recall object size being very important actually, so I don't
>> > think the distinction is too important, but I'm just arguing for the
>> > sake of arguing. With the sg page stuff that Imre did, I think most size
>> > calculations unrelated to gtt size are there anyway, and most of our mm
>> > (not page allocation) code should only ever care about the gtt.
>>
>> The distinction is only important on gen2/3, which is why you don't recall
>> it being important ;-)
>>
>> I think you have two options:
>> - Trust me that it's indeed important.
>> - Read up on gen2/3 fencing code and make up your own mind.
>>
>> Cheers, Daniel
> I am not saying the distinction doesn't exist. I was saying it's not
> prevalent in too many places. See the second part of my statement, I
> believe it holds.
>
> But please be clear what you're asking for, do you want 2 getters for
> vma size vs. obj size?

I guess we need a few different things:
- vma->node.size: Most functions can probably just access that one
directly (e.g. for writing the ptes), maybe we need something for
transition. I guess we could call it obj_gtt_size since during the
transition there's only the one gtt address space (see the sketch below).
- The obj size in the gtt. Differs from obj->size on gen2/3 for tiled
objects (and has pretty funny rules for how it changes, at that). Mostly
important for debugfs, for the gtt stats. Any access helper which just
calls this obj_size is imo highly misleading, hence why I've voted
for obj_gtt_size.
- I don't think we need an access helper for obj->size itself, since
that is invariant and I don't think we'll move it around either.
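
Rough sketch of the helpers for the transition period (using the
obj_gtt naming from above; exact signatures up for grabs):

  /* while there's only the global gtt these can look at the single
   * node directly; the names keep the gtt vs. obj size distinction */
  static inline unsigned long
  i915_gem_obj_gtt_size(struct drm_i915_gem_object *obj)
  {
    return obj->gtt_space->size;  /* later: vma->node.size */
  }

  static inline unsigned long
  i915_gem_obj_gtt_offset(struct drm_i915_gem_object *obj)
  {
    return obj->gtt_space->start;  /* later: vma->node.start */
  }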

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-07-01 21:39 ` Daniel Vetter
  2013-07-01 22:36   ` Ben Widawsky
@ 2013-10-29 23:08   ` Eric Anholt
  2013-10-30  0:10     ` Jesse Barnes
  1 sibling, 1 reply; 124+ messages in thread
From: Eric Anholt @ 2013-10-29 23:08 UTC (permalink / raw)
  To: Daniel Vetter, Ben Widawsky; +Cc: Intel GFX


Daniel Vetter <daniel@ffwll.ch> writes:

> Hi Ben
>
> So first things first: I rather like what the code looks like overall at
> the end. I've done a light read-through (by far not a full review) and
> besides a few bikesheds (all mentioned by mail already) the big thing is
> the 1:1 context:ppgtt address space relationship.
>
> We've discussed this at length in private irc and agreed that we need to
> change this to an n:1 relationship, so I'll just reiterate the reasons
> for that on the list:
>
> - Current userspace expects that different contexts created on the same fd
>   all use the same address space (since there's really only one). So if we
>   want to not add a new ABI (and for testing I really think we want to
>   enable ppgtt on current unchanged userspace) we must keep that promise.
>   Hence we need to be able to point the different contexts created on an
>   fd all at the same (per-fd) address space.

I'm not coming up with anything in userland requiring this.  Can you
clarify?

For the GL context reset stuff, it is required that we have more than
one address space per fd, because the fd is global to all contexts, not
just a share group.


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 124+ messages in thread

* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-10-29 23:08   ` Eric Anholt
@ 2013-10-30  0:10     ` Jesse Barnes
  2013-11-01 17:20       ` Jesse Barnes
  0 siblings, 1 reply; 124+ messages in thread
From: Jesse Barnes @ 2013-10-30  0:10 UTC (permalink / raw)
  To: Eric Anholt; +Cc: Ben Widawsky, Intel GFX

On Tue, 29 Oct 2013 16:08:24 -0700
Eric Anholt <eric@anholt.net> wrote:

> Daniel Vetter <daniel@ffwll.ch> writes:
> 
> > Hi Ben
> >
> > So first things first: I rather like what the code looks like overall at
> > the end. I've done a light read-through (by far not a full review) and
> > besides a few bikesheds (all mentioned by mail already) the big thing is
> > the 1:1 context:ppgtt address space relationship.
> >
> > We've discussed this at length in private irc and agreed that we need to
> > change this to an n:1 relationship, so I'll just reiterate the reasons
> > for that on the list:
> >
> > - Current userspace expects that different contexts created on the same fd
> >   all use the same address space (since there's really only one). So if we
> >   want to not add a new ABI (and for testing I really think we want to
> >   enable ppgtt on current unchanged userspace) we must keep that promise.
> >   Hence we need to be able to point the different contexts created on an
> >   fd all at the same (per-fd) address space.
> 
> I'm not coming up with anything in userland requiring this.  Can you
> clarify?
> 
> For the GL context reset stuff, it is required that we have more than
> one address space per fd, because the fd is global to all contexts, not
> just a share group.

I think Daniel was just worried about the potential semantic change?
But if userspace doesn't rely on it, we can go the easier route of
simply creating one address space per context.

But overall, do we need to allow creating multiple contexts in the same
address space for GL share groups or any other feature?  If so, we'd
need to track contexts and address spaces separately and refcount them
like Ben has done with the per-fd work, though we could go back
to sharing a single fd and exposing the feature through the context
create ioctl instead, or possibly a new one if we need the notion of an
ASID as a separate entity.
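
If we did need n:1, a rough sketch of that separate tracking (types
and names invented here for illustration, not Ben's actual per-fd
patches):

#include <linux/kref.h>
#include <linux/slab.h>

/* An address space refcounted independently of the contexts that
 * point at it, so several contexts can share one vm. */
struct i915_address_space {
	struct kref ref;
	/* page directory, drm_mm, vma lists, ... */
};

struct i915_hw_context {
	struct i915_address_space *vm;	/* holds a reference */
	/* ring state, hang stats, ... */
};

static void vm_release(struct kref *ref)
{
	struct i915_address_space *vm =
		container_of(ref, struct i915_address_space, ref);

	kfree(vm);
}

static void context_set_vm(struct i915_hw_context *ctx,
			   struct i915_address_space *vm)
{
	kref_get(&vm->ref);	/* the context pins the vm */
	ctx->vm = vm;
}

static void context_free(struct i915_hw_context *ctx)
{
	kref_put(&ctx->vm->ref, vm_release);	/* vm dies with last ctx */
	kfree(ctx);
}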

-- 
Jesse Barnes, Intel Open Source Technology Center


* Re: [PATCH 00/66] [v1] Full PPGTT minus soft pin
  2013-10-30  0:10     ` Jesse Barnes
@ 2013-11-01 17:20       ` Jesse Barnes
  0 siblings, 0 replies; 124+ messages in thread
From: Jesse Barnes @ 2013-11-01 17:20 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: Ben Widawsky, Intel GFX

On Tue, 29 Oct 2013 17:10:11 -0700
Jesse Barnes <jbarnes@virtuousgeek.org> wrote:

> On Tue, 29 Oct 2013 16:08:24 -0700
> Eric Anholt <eric@anholt.net> wrote:
> 
> > Daniel Vetter <daniel@ffwll.ch> writes:
> > 
> > > Hi Ben
> > >
> > > So first things first: I rather like what the code looks like overall at
> > > the end. I've done a light read-through (by far not a full review) and
> > > besides a few bikesheds (all mentioned by mail already) the big thing is
> > > the 1:1 context:ppgtt address space relationship.
> > >
> > > We've discussed this at length in private irc and agreed that we need to
> > > change this to an n:1 relationship, so I'll just reiterate the reasons
> > > for that on the list:
> > >
> > > - Current userspace expects that different contexts created on the same fd
> > >   all use the same address space (since there's really only one). So if we
> > >   want to not add a new ABI (and for testing I really think we want to
> > >   enable ppgtt on current unchanged userspace) we must keep that promise.
> > >   Hence we need to be able to point the different contexts created on an
> > >   fd all at the same (per-fd) address space.
> > 
> > I'm not coming up with anything in userland requiring this.  Can you
> > clarify?
> > 
> > For the GL context reset stuff, it is required that we have more than
> > one address space per fd, because the fd is global to all contexts, not
> > just a share group.
> 
> I think Daniel was just worried about the potential semantic change?
> But if userspace doesn't rely on it, we can go the easier route of
> simply creating one address space per context.
> 
> But overall, do we need to allow creating multiple contexts in the same
> address space for GL share groups or any other feature?  If so, we'd
> need to track contexts and address spaces separately and refcount them
> like Ben has done with the per-fd work, though we could go back
> to sharing a single fd and exposing the feature through the context
> create ioctl instead, or possibly a new one if we need the notion of an
> ASID as a separate entity.

Ok so per discussion on IRC:
  - requiring multiple opens and fds is definitely not desired
  - multiple contexts sharing a single address space is not required

Thus simply creating a new ppgtt instance with every new context via
the context create ioctl ought to be sufficient.
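
Roughly, as a sketch only (i915_ppgtt_create() and the struct layout
here are made up to show the shape, not the actual series):

static struct i915_hw_context *
create_hw_context(struct drm_device *dev,
		  struct drm_i915_file_private *file_priv)
{
	struct i915_hw_ppgtt *ppgtt;
	struct i915_hw_context *ctx;

	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
	if (!ctx)
		return ERR_PTR(-ENOMEM);

	/* Every context gets its own ppgtt instance; contexts never
	 * share an address space, so no separate vm/ASID object or
	 * ioctl is needed. */
	ppgtt = i915_ppgtt_create(dev);
	if (IS_ERR(ppgtt)) {
		kfree(ctx);
		return ERR_CAST(ppgtt);
	}
	ctx->ppgtt = ppgtt;

	return ctx;
}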

Any objections?

-- 
Jesse Barnes, Intel Open Source Technology Center


end of thread, other threads:[~2013-11-01 17:20 UTC | newest]

Thread overview: 124+ messages
2013-06-27 23:30 [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
2013-06-27 23:30 ` [PATCH 01/66] drm/i915: Remove extra error state NULL Ben Widawsky
2013-06-27 23:30 ` [PATCH 02/66] drm/i915: Extract error buffer capture Ben Widawsky
2013-06-27 23:30 ` [PATCH 03/66] drm/i915: make PDE|PTE platform specific Ben Widawsky
2013-06-28 16:53   ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 04/66] drm: Optionally create mm blocks from top-to-bottom Ben Widawsky
2013-06-30 12:30   ` Daniel Vetter
2013-06-30 12:40     ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 05/66] drm/i915: Don't clear gtt with 0 entries Ben Widawsky
2013-06-27 23:30 ` [PATCH 06/66] drm/i915: Conditionally use guard page based on PPGTT Ben Widawsky
2013-06-28 17:57   ` Jesse Barnes
2013-06-27 23:30 ` [PATCH 07/66] drm/i915: Use drm_mm for PPGTT PDEs Ben Widawsky
2013-06-28 18:01   ` Jesse Barnes
2013-06-27 23:30 ` [PATCH 08/66] drm/i915: cleanup context fini Ben Widawsky
2013-06-27 23:30 ` [PATCH 09/66] drm/i915: Do a fuller init after reset Ben Widawsky
2013-06-27 23:30 ` [PATCH 10/66] drm/i915: Split context enabling from init Ben Widawsky
2013-06-27 23:30 ` [PATCH 11/66] drm/i915: destroy i915_gem_init_global_gtt Ben Widawsky
2013-06-27 23:30 ` [PATCH 12/66] drm/i915: Embed PPGTT into the context Ben Widawsky
2013-06-27 23:30 ` [PATCH 13/66] drm/i915: Unify PPGTT codepaths on gen6+ Ben Widawsky
2013-06-27 23:30 ` [PATCH 14/66] drm/i915: Move ppgtt initialization down Ben Widawsky
2013-06-27 23:30 ` [PATCH 15/66] drm/i915: Tie context to PPGTT Ben Widawsky
2013-06-27 23:30 ` [PATCH 16/66] drm/i915: Really share scratch page Ben Widawsky
2013-06-27 23:30 ` [PATCH 17/66] drm/i915: Combine scratch members into a struct Ben Widawsky
2013-06-27 23:30 ` [PATCH 18/66] drm/i915: Drop dev from pte_encode Ben Widawsky
2013-06-27 23:30 ` [PATCH 19/66] drm/i915: Use gtt shortform where possible Ben Widawsky
2013-06-27 23:30 ` [PATCH 20/66] drm/i915: Move fbc members out of line Ben Widawsky
2013-06-30 13:10   ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 21/66] drm/i915: Move gtt and ppgtt under address space umbrella Ben Widawsky
2013-06-30 13:12   ` Daniel Vetter
2013-07-01 18:40     ` Ben Widawsky
2013-07-01 18:48       ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 22/66] drm/i915: Move gtt_mtrr to i915_gtt Ben Widawsky
2013-06-27 23:30 ` [PATCH 23/66] drm/i915: Move stolen stuff " Ben Widawsky
2013-06-30 13:18   ` Daniel Vetter
2013-07-01 18:43     ` Ben Widawsky
2013-07-01 18:51       ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 24/66] drm/i915: Move aliasing_ppgtt Ben Widawsky
2013-06-30 13:27   ` Daniel Vetter
2013-07-01 18:52     ` Ben Widawsky
2013-07-01 19:06       ` Daniel Vetter
2013-07-01 19:48         ` Ben Widawsky
2013-07-01 19:54           ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 25/66] drm/i915: Put the mm in the parent address space Ben Widawsky
2013-06-27 23:30 ` [PATCH 26/66] drm/i915: Move active/inactive lists to new mm Ben Widawsky
2013-06-30 15:38   ` Daniel Vetter
2013-07-01 22:56     ` Ben Widawsky
2013-07-02  7:26       ` Daniel Vetter
2013-07-02 16:47         ` Ben Widawsky
2013-06-27 23:30 ` [PATCH 27/66] drm/i915: Create a global list of vms Ben Widawsky
2013-06-27 23:30 ` [PATCH 28/66] drm/i915: Remove object's gtt_offset Ben Widawsky
2013-06-27 23:30 ` [PATCH 29/66] drm: pre allocate node for create_block Ben Widawsky
2013-06-30 12:34   ` Daniel Vetter
2013-07-01 18:30     ` Ben Widawsky
2013-06-27 23:30 ` [PATCH 30/66] drm/i915: Getter/setter for object attributes Ben Widawsky
2013-06-30 13:00   ` Daniel Vetter
2013-07-01 18:32     ` Ben Widawsky
2013-07-01 18:43       ` Daniel Vetter
2013-07-01 19:08         ` Daniel Vetter
2013-07-01 22:59           ` Ben Widawsky
2013-07-02  7:28             ` Daniel Vetter
2013-07-02 16:51               ` Ben Widawsky
2013-07-02 17:07                 ` Daniel Vetter
2013-06-27 23:30 ` [PATCH 31/66] drm/i915: Create VMAs (part 1) Ben Widawsky
2013-06-27 23:30 ` [PATCH 32/66] drm/i915: Create VMAs (part 2) - kill gtt space Ben Widawsky
2013-06-27 23:30 ` [PATCH 33/66] drm/i915: Create VMAs (part 3) - plumbing Ben Widawsky
2013-06-27 23:30 ` [PATCH 34/66] drm/i915: Create VMAs (part 3.5) - map and fenceable tracking Ben Widawsky
2013-06-27 23:30 ` [PATCH 35/66] drm/i915: Create VMAs (part 4) - Error capture Ben Widawsky
2013-06-27 23:30 ` [PATCH 36/66] drm/i915: Create VMAs (part 5) - move mm_list Ben Widawsky
2013-06-27 23:30 ` [PATCH 37/66] drm/i915: Create VMAs (part 6) - finish error plumbing Ben Widawsky
2013-06-27 23:30 ` [PATCH 38/66] drm/i915: create an object_is_active() Ben Widawsky
2013-06-27 23:30 ` [PATCH 39/66] drm/i915: Move active to vma Ben Widawsky
2013-06-27 23:30 ` [PATCH 40/66] drm/i915: Track all VMAs per VM Ben Widawsky
2013-06-30 15:35   ` Daniel Vetter
2013-07-01 19:04     ` Ben Widawsky
2013-06-27 23:30 ` [PATCH 41/66] drm/i915: Defer request freeing Ben Widawsky
2013-06-27 23:30 ` [PATCH 42/66] drm/i915: Clean up VMAs before freeing Ben Widawsky
2013-07-02 10:59   ` Ville Syrjälä
2013-07-02 16:58     ` Ben Widawsky
2013-06-27 23:30 ` [PATCH 43/66] drm/i915: Replace has_bsd/blt with a mask Ben Widawsky
2013-06-27 23:30 ` [PATCH 44/66] drm/i915: Catch missed context unref earlier Ben Widawsky
2013-06-27 23:30 ` [PATCH 45/66] drm/i915: Add a context open function Ben Widawsky
2013-06-27 23:30 ` [PATCH 46/66] drm/i915: Permit contexts on all rings Ben Widawsky
2013-06-27 23:30 ` [PATCH 47/66] drm/i915: Fix context fini refcounts Ben Widawsky
2013-06-27 23:30 ` [PATCH 48/66] drm/i915: Better reset handling for contexts Ben Widawsky
2013-06-27 23:30 ` [PATCH 49/66] drm/i915: Create a per file_priv default context Ben Widawsky
2013-06-27 23:30 ` [PATCH 50/66] drm/i915: Remove ring specificity from contexts Ben Widawsky
2013-06-27 23:30 ` [PATCH 51/66] drm/i915: Track which ring a context ran on Ben Widawsky
2013-06-27 23:30 ` [PATCH 52/66] drm/i915: dump error state based on capture Ben Widawsky
2013-06-27 23:30 ` [PATCH 53/66] drm/i915: PPGTT should take a ppgtt argument Ben Widawsky
2013-06-27 23:30 ` [PATCH 54/66] drm/i915: USE LRI for switching PP_DIR_BASE Ben Widawsky
2013-06-27 23:30 ` [PATCH 55/66] drm/i915: Extract mm switching to function Ben Widawsky
2013-06-27 23:30 ` [PATCH 56/66] drm/i915: Write PDEs at init instead of enable Ben Widawsky
2013-06-27 23:30 ` [PATCH 57/66] drm/i915: Disallow pin with full ppgtt Ben Widawsky
2013-06-28  8:55   ` Chris Wilson
2013-06-29  5:43     ` Ben Widawsky
2013-06-29  6:44       ` Chris Wilson
2013-06-29 14:34         ` Daniel Vetter
2013-06-30  6:56           ` Ben Widawsky
2013-06-30 11:06             ` Daniel Vetter
2013-06-30 11:31               ` Chris Wilson
2013-06-30 11:36                 ` Daniel Vetter
2013-07-01 18:27                   ` Ben Widawsky
2013-06-27 23:30 ` [PATCH 58/66] drm/i915: Get context early in execbuf Ben Widawsky
2013-06-27 23:31 ` [PATCH 59/66] drm/i915: Pass ctx directly to switch/hangstat Ben Widawsky
2013-06-27 23:31 ` [PATCH 60/66] drm/i915: Actually add the new address spaces Ben Widawsky
2013-06-27 23:31 ` [PATCH 61/66] drm/i915: Use multiple VMs Ben Widawsky
2013-06-27 23:43   ` Ben Widawsky
2013-07-02 10:58     ` Ville Syrjälä
2013-07-02 11:07       ` Chris Wilson
2013-07-02 11:34         ` Ville Syrjälä
2013-07-02 11:38           ` Chris Wilson
2013-07-02 12:34             ` Daniel Vetter
2013-06-27 23:31 ` [PATCH 62/66] drm/i915: Kill now unused ppgtt_{un, }bind Ben Widawsky
2013-06-27 23:31 ` [PATCH 63/66] drm/i915: Add PPGTT dumper Ben Widawsky
2013-06-27 23:31 ` [PATCH 64/66] drm/i915: Dump all ppgtt Ben Widawsky
2013-06-27 23:31 ` [PATCH 65/66] drm/i915: Add debugfs for vma info per vm Ben Widawsky
2013-06-27 23:31 ` [PATCH 66/66] drm/i915: Getparam full ppgtt Ben Widawsky
2013-06-28  3:38 ` [PATCH 00/66] [v1] Full PPGTT minus soft pin Ben Widawsky
2013-07-01 21:39 ` Daniel Vetter
2013-07-01 22:36   ` Ben Widawsky
2013-07-02  7:43     ` Daniel Vetter
2013-10-29 23:08   ` Eric Anholt
2013-10-30  0:10     ` Jesse Barnes
2013-11-01 17:20       ` Jesse Barnes
