intel-gfx.lists.freedesktop.org archive mirror
* [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend
@ 2021-01-19 14:49 Chris Wilson
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Amalgamate clflushes on freeze Chris Wilson
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 14:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

When flushing objects larger than the CPU cache it is preferable to use
a single wbinvd() rather than overlapping clflush(). At runtime, we
avoid wbinvd() due to its system-wide latencies, but during
single-threaded suspend, no one will observe the imposed latency and we
can opt for the faster wbinvd to clear all objects in a single hit.
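
For illustration only, a minimal sketch of the two flushes being traded
off (using the existing drm_clflush_sg() and x86 wbinvd_on_all_cpus()
helpers; not part of the patch):

	/* runtime: per-object flush, walking every cacheline */
	static void flush_object(struct drm_i915_gem_object *obj)
	{
		drm_clflush_sg(obj->mm.pages); /* <drm/drm_cache.h> */
	}

	/* suspend: one system-wide writeback+invalidate for everything */
	static void flush_everything(void)
	{
		wbinvd_on_all_cpus(); /* <asm/smp.h> on x86 */
	}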

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_pm.c | 40 +++++++++-----------------
 1 file changed, 13 insertions(+), 27 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
index 40d3e40500fa..38c1298cb14b 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
@@ -11,6 +11,12 @@
 
 #include "i915_drv.h"
 
+#if defined(CONFIG_X86)
+#include <asm/smp.h>
+#else
+#define wbinvd_on_all_cpus()
+#endif
+
 void i915_gem_suspend(struct drm_i915_private *i915)
 {
 	GEM_TRACE("%s\n", dev_name(i915->drm.dev));
@@ -32,13 +38,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
 	i915_gem_drain_freed_objects(i915);
 }
 
-static struct drm_i915_gem_object *first_mm_object(struct list_head *list)
-{
-	return list_first_entry_or_null(list,
-					struct drm_i915_gem_object,
-					mm.link);
-}
-
 void i915_gem_suspend_late(struct drm_i915_private *i915)
 {
 	struct drm_i915_gem_object *obj;
@@ -48,6 +47,7 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
 		NULL
 	}, **phase;
 	unsigned long flags;
+	bool flush = false;
 
 	/*
 	 * Neither the BIOS, ourselves or any other kernel
@@ -73,29 +73,15 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
 
 	spin_lock_irqsave(&i915->mm.obj_lock, flags);
 	for (phase = phases; *phase; phase++) {
-		LIST_HEAD(keep);
-
-		while ((obj = first_mm_object(*phase))) {
-			list_move_tail(&obj->mm.link, &keep);
-
-			/* Beware the background _i915_gem_free_objects */
-			if (!kref_get_unless_zero(&obj->base.refcount))
-				continue;
-
-			spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
-
-			i915_gem_object_lock(obj, NULL);
-			drm_WARN_ON(&i915->drm,
-			    i915_gem_object_set_to_gtt_domain(obj, false));
-			i915_gem_object_unlock(obj);
-			i915_gem_object_put(obj);
-
-			spin_lock_irqsave(&i915->mm.obj_lock, flags);
+		list_for_each_entry(obj, *phase, mm.link) {
+			if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
+				flush |= (obj->read_domains & I915_GEM_DOMAIN_CPU) == 0;
+			__start_cpu_write(obj); /* presume auto-hibernate */
 		}
-
-		list_splice_tail(&keep, *phase);
 	}
 	spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
+	if (flush)
+		wbinvd_on_all_cpus();
 }
 
 void i915_gem_resume(struct drm_i915_private *i915)
-- 
2.20.1


* [Intel-gfx] [PATCH 2/6] drm/i915/gem: Amalgamate clflushes on freeze
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
@ 2021-01-19 14:49 ` Chris Wilson
  2021-01-19 15:34   ` Matthew Auld
  2021-01-23 14:46   ` Guenter Roeck
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 3/6] drm/i915/gem: Move stolen node into GEM object union Chris Wilson
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 14:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

When flushing objects larger than the CPU cache it is preferable to use
a single wbinvd() rather than overlapping clflush(). At runtime, we
avoid wbinvd() due to its system-wide latencies, but during
single-threaded suspend, no one will observe the imposed latency and we
can opt for the faster wbinvd to clear all objects in a single hit.
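
Also switch the shrink to the scoped runtime-pm form while here. As a
sketch of that pattern (equivalent to a get/put pair around the call,
but it cannot leak the wakeref on an early return):

	intel_wakeref_t wakeref;

	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
		i915_gem_shrink(i915, -1UL, NULL, ~0);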

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c013148835e6..d3a287bf56c5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1175,19 +1175,13 @@ int i915_gem_freeze_late(struct drm_i915_private *i915)
 	 * the objects as well, see i915_gem_freeze()
 	 */
 
-	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
-
-	i915_gem_shrink(i915, -1UL, NULL, ~0);
+	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
+		i915_gem_shrink(i915, -1UL, NULL, ~0);
 	i915_gem_drain_freed_objects(i915);
 
-	list_for_each_entry(obj, &i915->mm.shrink_list, mm.link) {
-		i915_gem_object_lock(obj, NULL);
-		drm_WARN_ON(&i915->drm,
-			    i915_gem_object_set_to_cpu_domain(obj, true));
-		i915_gem_object_unlock(obj);
-	}
-
-	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
+	wbinvd_on_all_cpus();
+	list_for_each_entry(obj, &i915->mm.shrink_list, mm.link)
+		__start_cpu_write(obj);
 
 	return 0;
 }
-- 
2.20.1


* [Intel-gfx] [PATCH 3/6] drm/i915/gem: Move stolen node into GEM object union
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Amalgamate clflushes on freeze Chris Wilson
@ 2021-01-19 14:49 ` Chris Wilson
  2021-01-19 15:40   ` Matthew Auld
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Use shrinkable status for unknown swizzle quirks Chris Wilson
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 14:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/display/intel_fbdev.c       | 4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_object.h       | 2 ++
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 4 ++--
 drivers/gpu/drm/i915/gem/i915_gem_phys.c         | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c        | 5 +++++
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c       | 5 +++++
 drivers/gpu/drm/i915/gem/i915_gem_stolen.h       | 2 ++
 drivers/gpu/drm/i915/gt/intel_ring.c             | 2 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c            | 2 +-
 drivers/gpu/drm/i915/i915_debugfs.c              | 2 +-
 10 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_fbdev.c b/drivers/gpu/drm/i915/display/intel_fbdev.c
index 842c04e63214..84f853f113b9 100644
--- a/drivers/gpu/drm/i915/display/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/display/intel_fbdev.c
@@ -256,7 +256,7 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	 * If the object is stolen however, it will be full of whatever
 	 * garbage was left in there.
 	 */
-	if (vma->obj->stolen && !prealloc)
+	if (!i915_gem_object_is_shmem(vma->obj) && !prealloc)
 		memset_io(info->screen_base, 0, info->screen_size);
 
 	/* Use default scratch pixmap (info->pixmap.flags = FB_PIXMAP_SYSTEM) */
@@ -595,7 +595,7 @@ void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous
 	 * full of whatever garbage was left in there.
 	 */
 	if (state == FBINFO_STATE_RUNNING &&
-	    intel_fb_obj(&ifbdev->fb->base)->stolen)
+	    !i915_gem_object_is_shmem(intel_fb_obj(&ifbdev->fb->base)))
 		memset_io(info->screen_base, 0, info->screen_size);
 
 	drm_fb_helper_set_suspend(&ifbdev->helper, state);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index be14486f63a7..3603b702a14e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -540,4 +540,6 @@ i915_gem_object_invalidate_frontbuffer(struct drm_i915_gem_object *obj,
 		__i915_gem_object_invalidate_frontbuffer(obj, origin);
 }
 
+bool i915_gem_object_is_shmem(const struct drm_i915_gem_object *obj);
+
 #endif
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index e2d9b7e1e152..b01b4b307ee7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -142,8 +142,6 @@ struct drm_i915_gem_object {
 	 */
 	struct list_head obj_link;
 
-	/** Stolen memory for this object, instead of being backed by shmem. */
-	struct drm_mm_node *stolen;
 	union {
 		struct rcu_head rcu;
 		struct llist_node freed;
@@ -295,6 +293,8 @@ struct drm_i915_gem_object {
 			struct work_struct *work;
 		} userptr;
 
+		struct drm_mm_node *stolen;
+
 		unsigned long scratch;
 		u64 encode;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 3a4dfe2ef1da..3bb65a1b1d93 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -213,7 +213,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 	if (obj->ops == &i915_gem_phys_ops)
 		return 0;
 
-	if (obj->ops != &i915_gem_shmem_ops)
+	if (!i915_gem_object_is_shmem(obj))
 		return -EINVAL;
 
 	err = i915_gem_object_unbind(obj, I915_GEM_OBJECT_UNBIND_ACTIVE);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
index 722e02164c3e..cf83c208688c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c
@@ -612,3 +612,8 @@ struct intel_memory_region *i915_gem_shmem_setup(struct drm_i915_private *i915)
 					  PAGE_SIZE, 0,
 					  &shmem_region_ops);
 }
+
+bool i915_gem_object_is_shmem(const struct drm_i915_gem_object *obj)
+{
+	return obj->ops == &i915_gem_shmem_ops;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
index b221492531ef..551935348ad8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c
@@ -781,3 +781,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *i915,
 	kfree(stolen);
 	return obj;
 }
+
+bool i915_gem_object_is_stolen(const struct drm_i915_gem_object *obj)
+{
+	return obj->ops == &i915_gem_object_stolen_ops;
+}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
index 61e028063f9f..b03489706796 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.h
@@ -30,6 +30,8 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_i915_private *dev_priv
 					       resource_size_t stolen_offset,
 					       resource_size_t size);
 
+bool i915_gem_object_is_stolen(const struct drm_i915_gem_object *obj);
+
 #define I915_GEM_STOLEN_BIAS SZ_128K
 
 #endif /* __I915_GEM_STOLEN_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c
index 06385550450c..78d1360caa0f 100644
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -42,7 +42,7 @@ int intel_ring_pin(struct intel_ring *ring, struct i915_gem_ww_ctx *ww)
 	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
 	flags = PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
 
-	if (vma->obj->stolen)
+	if (i915_gem_object_is_stolen(vma->obj))
 		flags |= PIN_MAPPABLE;
 	else
 		flags |= PIN_HIGH;
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 5982b62f913d..a4d8fc9e2374 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -33,7 +33,7 @@ struct file *shmem_create_from_object(struct drm_i915_gem_object *obj)
 	struct file *file;
 	void *ptr;
 
-	if (obj->ops == &i915_gem_shmem_ops) {
+	if (i915_gem_object_is_shmem(obj)) {
 		file = obj->base.filp;
 		atomic_long_inc(&file->f_count);
 		return file;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index de8e0e44cfb6..ec5a32d0eeca 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -210,7 +210,7 @@ i915_debugfs_describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 	spin_unlock(&obj->vma.lock);
 
 	seq_printf(m, " (pinned x %d)", pin_count);
-	if (obj->stolen)
+	if (i915_gem_object_is_stolen(obj))
 		seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
 	if (i915_gem_object_is_framebuffer(obj))
 		seq_printf(m, " (fb)");
-- 
2.20.1


* [Intel-gfx] [PATCH 4/6] drm/i915/gem: Use shrinkable status for unknown swizzle quirks
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Amalgamate clflushes on freeze Chris Wilson
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 3/6] drm/i915/gem: Move stolen node into GEM object union Chris Wilson
@ 2021-01-19 14:49 ` Chris Wilson
  2021-01-19 16:13   ` Matthew Auld
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 5/6] drm/i915/gem: Make i915_gem_object_flush_write_domain() static Chris Wilson
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 14:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_object.h    | 18 ++++++++++++++++++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  7 +------
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     | 19 +++++++++++--------
 drivers/gpu/drm/i915/gem/i915_gem_phys.c      |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_tiling.c    | 12 ++++++------
 drivers/gpu/drm/i915/i915_gem.c               | 12 ++++++------
 .../gpu/drm/i915/selftests/i915_gem_evict.c   | 10 +++++-----
 7 files changed, 48 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3603b702a14e..6e28159a883d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -187,6 +187,24 @@ i915_gem_object_set_volatile(struct drm_i915_gem_object *obj)
 	obj->flags |= I915_BO_ALLOC_VOLATILE;
 }
 
+static inline bool
+i915_gem_object_has_tiling_quirk(struct drm_i915_gem_object *obj)
+{
+	return test_bit(I915_TILING_QUIRK_BIT, &obj->flags);
+}
+
+static inline void
+i915_gem_object_set_tiling_quirk(struct drm_i915_gem_object *obj)
+{
+	set_bit(I915_TILING_QUIRK_BIT, &obj->flags);
+}
+
+static inline void
+i915_gem_object_clear_tiling_quirk(struct drm_i915_gem_object *obj)
+{
+	clear_bit(I915_TILING_QUIRK_BIT, &obj->flags);
+}
+
 static inline bool
 i915_gem_object_type_has(const struct drm_i915_gem_object *obj,
 			 unsigned long flags)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index b01b4b307ee7..273694592596 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -165,6 +165,7 @@ struct drm_i915_gem_object {
 #define I915_BO_ALLOC_VOLATILE   BIT(1)
 #define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | I915_BO_ALLOC_VOLATILE)
 #define I915_BO_READONLY         BIT(2)
+#define I915_TILING_QUIRK_BIT    3 /* unknown swizzling; do not release! */
 
 	/*
 	 * Is the object to be mapped as read-only to the GPU
@@ -273,12 +274,6 @@ struct drm_i915_gem_object {
 		 * pages were last acquired.
 		 */
 		bool dirty:1;
-
-		/**
-		 * This is set if the object has been pinned due to unknown
-		 * swizzling.
-		 */
-		bool quirked:1;
 	} mm;
 
 	/** Record of address bit 17 of each page at last unbind. */
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
index 3db3c667c486..43028f3539a6 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c
@@ -16,6 +16,7 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 {
 	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	unsigned long supported = INTEL_INFO(i915)->page_sizes;
+	bool shrinkable;
 	int i;
 
 	lockdep_assert_held(&obj->mm.lock);
@@ -38,13 +39,6 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 
 	obj->mm.pages = pages;
 
-	if (i915_gem_object_is_tiled(obj) &&
-	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
-		GEM_BUG_ON(obj->mm.quirked);
-		__i915_gem_object_pin_pages(obj);
-		obj->mm.quirked = true;
-	}
-
 	GEM_BUG_ON(!sg_page_sizes);
 	obj->mm.page_sizes.phys = sg_page_sizes;
 
@@ -63,7 +57,16 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
 	}
 	GEM_BUG_ON(!HAS_PAGE_SIZES(i915, obj->mm.page_sizes.sg));
 
-	if (i915_gem_object_is_shrinkable(obj)) {
+	shrinkable = i915_gem_object_is_shrinkable(obj);
+
+	if (i915_gem_object_is_tiled(obj) &&
+	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
+		GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj));
+		i915_gem_object_set_tiling_quirk(obj);
+		shrinkable = false;
+	}
+
+	if (shrinkable) {
 		struct list_head *list;
 		unsigned long flags;
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index 3bb65a1b1d93..3c0b157e2a35 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -227,7 +227,7 @@ int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
 		goto err_unlock;
 	}
 
-	if (obj->mm.quirked) {
+	if (i915_gem_object_has_tiling_quirk(obj)) {
 		err = -EFAULT;
 		goto err_unlock;
 	}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
index ffcaee74a249..b4f720ed80cd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_tiling.c
@@ -270,14 +270,14 @@ i915_gem_object_set_tiling(struct drm_i915_gem_object *obj,
 	    obj->mm.madv == I915_MADV_WILLNEED &&
 	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
 		if (tiling == I915_TILING_NONE) {
-			GEM_BUG_ON(!obj->mm.quirked);
-			__i915_gem_object_unpin_pages(obj);
-			obj->mm.quirked = false;
+			GEM_BUG_ON(!i915_gem_object_has_tiling_quirk(obj));
+			i915_gem_object_make_shrinkable(obj);
+			i915_gem_object_clear_tiling_quirk(obj);
 		}
 		if (!i915_gem_object_is_tiled(obj)) {
-			GEM_BUG_ON(obj->mm.quirked);
-			__i915_gem_object_pin_pages(obj);
-			obj->mm.quirked = true;
+			GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj));
+			i915_gem_object_make_unshrinkable(obj);
+			i915_gem_object_set_tiling_quirk(obj);
 		}
 	}
 	mutex_unlock(&obj->mm.lock);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d3a287bf56c5..9a534c4023ef 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -957,14 +957,14 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	    i915_gem_object_is_tiled(obj) &&
 	    i915->quirks & QUIRK_PIN_SWIZZLED_PAGES) {
 		if (obj->mm.madv == I915_MADV_WILLNEED) {
-			GEM_BUG_ON(!obj->mm.quirked);
-			__i915_gem_object_unpin_pages(obj);
-			obj->mm.quirked = false;
+			GEM_BUG_ON(!i915_gem_object_has_tiling_quirk(obj));
+			i915_gem_object_make_shrinkable(obj);
+			i915_gem_object_clear_tiling_quirk(obj);
 		}
 		if (args->madv == I915_MADV_WILLNEED) {
-			GEM_BUG_ON(obj->mm.quirked);
-			__i915_gem_object_pin_pages(obj);
-			obj->mm.quirked = true;
+			GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj));
+			i915_gem_object_set_tiling_quirk(obj);
+			i915_gem_object_make_unshrinkable(obj);
 		}
 	}
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
index 3512bb8433cf..f99bb0113726 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_evict.c
@@ -38,8 +38,8 @@ static void quirk_add(struct drm_i915_gem_object *obj,
 		      struct list_head *objects)
 {
 	/* quirk is only for live tiled objects, use it to declare ownership */
-	GEM_BUG_ON(obj->mm.quirked);
-	obj->mm.quirked = true;
+	GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj));
+	i915_gem_object_set_tiling_quirk(obj);
 	list_add(&obj->st_link, objects);
 }
 
@@ -85,7 +85,7 @@ static void unpin_ggtt(struct i915_ggtt *ggtt)
 	struct i915_vma *vma;
 
 	list_for_each_entry(vma, &ggtt->vm.bound_list, vm_link)
-		if (vma->obj->mm.quirked)
+		if (i915_gem_object_has_tiling_quirk(vma->obj))
 			i915_vma_unpin(vma);
 }
 
@@ -94,8 +94,8 @@ static void cleanup_objects(struct i915_ggtt *ggtt, struct list_head *list)
 	struct drm_i915_gem_object *obj, *on;
 
 	list_for_each_entry_safe(obj, on, list, st_link) {
-		GEM_BUG_ON(!obj->mm.quirked);
-		obj->mm.quirked = false;
+		GEM_BUG_ON(!i915_gem_object_has_tiling_quirk(obj));
+		i915_gem_object_clear_tiling_quirk(obj);
 		i915_gem_object_put(obj);
 	}
 
-- 
2.20.1


* [Intel-gfx] [PATCH 5/6] drm/i915/gem: Make i915_gem_object_flush_write_domain() static
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
                   ` (2 preceding siblings ...)
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Use shrinkable status for unknown swizzle quirks Chris Wilson
@ 2021-01-19 14:49 ` Chris Wilson
  2021-01-19 16:16   ` Matthew Auld
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning Chris Wilson
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 14:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

flush_write_domain() is only used within the GEM domain managament code,
so move it to i915_gem_domain.c and drop the export.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gem/i915_gem_domain.c | 58 +++++++++++++++++++---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 47 ------------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h |  4 --
 3 files changed, 52 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index fcce6909f201..f0379b550dfc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -5,6 +5,7 @@
  */
 
 #include "display/intel_frontbuffer.h"
+#include "gt/intel_gt.h"
 
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
@@ -15,13 +16,58 @@
 #include "i915_gem_lmem.h"
 #include "i915_gem_mman.h"
 
+static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
+{
+	return !(obj->cache_level == I915_CACHE_NONE ||
+		 obj->cache_level == I915_CACHE_WT);
+}
+
+static void
+flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
+{
+	struct i915_vma *vma;
+
+	assert_object_held(obj);
+
+	if (!(obj->write_domain & flush_domains))
+		return;
+
+	switch (obj->write_domain) {
+	case I915_GEM_DOMAIN_GTT:
+		spin_lock(&obj->vma.lock);
+		for_each_ggtt_vma(vma, obj) {
+			if (i915_vma_unset_ggtt_write(vma))
+				intel_gt_flush_ggtt_writes(vma->vm->gt);
+		}
+		spin_unlock(&obj->vma.lock);
+
+		i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
+		break;
+
+	case I915_GEM_DOMAIN_WC:
+		wmb();
+		break;
+
+	case I915_GEM_DOMAIN_CPU:
+		i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
+		break;
+
+	case I915_GEM_DOMAIN_RENDER:
+		if (gpu_write_needs_clflush(obj))
+			obj->cache_dirty = true;
+		break;
+	}
+
+	obj->write_domain = 0;
+}
+
 static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj)
 {
 	/*
 	 * We manually flush the CPU domain so that we can override and
 	 * force the flush for the display, and perform it asynchronously.
 	 */
-	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 	if (obj->cache_dirty)
 		i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
 	obj->write_domain = 0;
@@ -80,7 +126,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
+	flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -141,7 +187,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
+	flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -451,7 +497,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* Flush the CPU cache if it's still invalid. */
 	if ((obj->read_domains & I915_GEM_DOMAIN_CPU) == 0) {
@@ -619,7 +665,7 @@ int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu read domain, set ourself into the gtt
 	 * read domain and manually flush cachelines (if required). This
@@ -670,7 +716,7 @@ int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu write domain, set ourself into the
 	 * gtt write domain and manually flush cachelines (as required).
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 00d24000b5e8..83c6ee6a509a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -25,7 +25,6 @@
 #include <linux/sched/mm.h>
 
 #include "display/intel_frontbuffer.h"
-#include "gt/intel_gt.h"
 #include "i915_drv.h"
 #include "i915_gem_clflush.h"
 #include "i915_gem_context.h"
@@ -313,52 +312,6 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		queue_work(i915->wq, &i915->mm.free_work);
 }
 
-static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)
-{
-	return !(obj->cache_level == I915_CACHE_NONE ||
-		 obj->cache_level == I915_CACHE_WT);
-}
-
-void
-i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
-				   unsigned int flush_domains)
-{
-	struct i915_vma *vma;
-
-	assert_object_held(obj);
-
-	if (!(obj->write_domain & flush_domains))
-		return;
-
-	switch (obj->write_domain) {
-	case I915_GEM_DOMAIN_GTT:
-		spin_lock(&obj->vma.lock);
-		for_each_ggtt_vma(vma, obj) {
-			if (i915_vma_unset_ggtt_write(vma))
-				intel_gt_flush_ggtt_writes(vma->vm->gt);
-		}
-		spin_unlock(&obj->vma.lock);
-
-		i915_gem_object_flush_frontbuffer(obj, ORIGIN_CPU);
-		break;
-
-	case I915_GEM_DOMAIN_WC:
-		wmb();
-		break;
-
-	case I915_GEM_DOMAIN_CPU:
-		i915_gem_clflush_object(obj, I915_CLFLUSH_SYNC);
-		break;
-
-	case I915_GEM_DOMAIN_RENDER:
-		if (gpu_write_needs_clflush(obj))
-			obj->cache_dirty = true;
-		break;
-	}
-
-	obj->write_domain = 0;
-}
-
 void __i915_gem_object_flush_frontbuffer(struct drm_i915_gem_object *obj,
 					 enum fb_op_origin origin)
 {
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 6e28159a883d..72c699088be0 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -453,10 +453,6 @@ static inline void i915_gem_object_unpin_map(struct drm_i915_gem_object *obj)
 
 void __i915_gem_object_release_map(struct drm_i915_gem_object *obj);
 
-void
-i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
-				   unsigned int flush_domains);
-
 int i915_gem_object_prepare_read(struct drm_i915_gem_object *obj,
 				 unsigned int *needs_clflush);
 int i915_gem_object_prepare_write(struct drm_i915_gem_object *obj,
-- 
2.20.1


* [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
                   ` (3 preceding siblings ...)
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 5/6] drm/i915/gem: Make i915_gem_object_flush_write_domain() static Chris Wilson
@ 2021-01-19 14:49 ` Chris Wilson
  2021-01-19 16:38   ` Matthew Auld
  2021-01-19 15:30 ` [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Matthew Auld
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 14:49 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson

Simplify the frontbuffer unpin by removing the lock requirement. The LRU
bumping was primarily to protect the GTT from being evicted and from
frontbuffers being eagerly shrunk. Now we protect frontbuffers from the
shrinker, and we avoid accidentally evicting from the GTT, so the
benefit from bumping LRU is no more, and we can save more time by not.
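
As a reference sketch only (the exact condition in the shrinker may
differ), the protection in question is a framebuffer check of this
shape in i915_gem_shrink():

	if (!(shrink & I915_SHRINK_ACTIVE) &&
	    i915_gem_object_is_framebuffer(obj))
		continue; /* defer frontbuffers to a later pass */

so an unpinned frontbuffer is no longer eagerly reaped.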

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/display/intel_display.c |  7 +--
 drivers/gpu/drm/i915/display/intel_overlay.c |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_domain.c   | 45 --------------------
 drivers/gpu/drm/i915/gem/i915_gem_object.h   |  1 -
 4 files changed, 4 insertions(+), 53 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index b614987eddf1..3c120a4a6800 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -1430,7 +1430,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 		 */
 		ret = i915_vma_pin_fence(vma);
 		if (ret != 0 && INTEL_GEN(dev_priv) < 4) {
-			i915_gem_object_unpin_from_display_plane(vma);
+			i915_vma_unpin(vma);
 			vma = ERR_PTR(ret);
 			goto err;
 		}
@@ -1448,12 +1448,9 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb,
 
 void intel_unpin_fb_vma(struct i915_vma *vma, unsigned long flags)
 {
-	i915_gem_object_lock(vma->obj, NULL);
 	if (flags & PLANE_HAS_FENCE)
 		i915_vma_unpin_fence(vma);
-	i915_gem_object_unpin_from_display_plane(vma);
-	i915_gem_object_unlock(vma->obj);
-
+	i915_vma_unpin(vma);
 	i915_vma_put(vma);
 }
 
diff --git a/drivers/gpu/drm/i915/display/intel_overlay.c b/drivers/gpu/drm/i915/display/intel_overlay.c
index 6be5d8946c69..8edb9b2cdbeb 100644
--- a/drivers/gpu/drm/i915/display/intel_overlay.c
+++ b/drivers/gpu/drm/i915/display/intel_overlay.c
@@ -360,7 +360,7 @@ static void intel_overlay_release_old_vma(struct intel_overlay *overlay)
 	intel_frontbuffer_flip_complete(overlay->i915,
 					INTEL_FRONTBUFFER_OVERLAY(overlay->crtc->pipe));
 
-	i915_gem_object_unpin_from_display_plane(vma);
+	i915_vma_unpin(vma);
 	i915_vma_put(vma);
 }
 
@@ -861,7 +861,7 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	return 0;
 
 out_unpin:
-	i915_gem_object_unpin_from_display_plane(vma);
+	i915_vma_unpin(vma);
 out_pin_section:
 	atomic_dec(&dev_priv->gpu_error.pending_fb_pin);
 
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_domain.c b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
index f0379b550dfc..666e4c60d610 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_domain.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_domain.c
@@ -433,48 +433,6 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	return vma;
 }
 
-static void i915_gem_object_bump_inactive_ggtt(struct drm_i915_gem_object *obj)
-{
-	struct drm_i915_private *i915 = to_i915(obj->base.dev);
-	struct i915_vma *vma;
-
-	if (list_empty(&obj->vma.list))
-		return;
-
-	mutex_lock(&i915->ggtt.vm.mutex);
-	spin_lock(&obj->vma.lock);
-	for_each_ggtt_vma(vma, obj) {
-		if (!drm_mm_node_allocated(&vma->node))
-			continue;
-
-		GEM_BUG_ON(vma->vm != &i915->ggtt.vm);
-		list_move_tail(&vma->vm_link, &vma->vm->bound_list);
-	}
-	spin_unlock(&obj->vma.lock);
-	mutex_unlock(&i915->ggtt.vm.mutex);
-
-	if (i915_gem_object_is_shrinkable(obj)) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&i915->mm.obj_lock, flags);
-
-		if (obj->mm.madv == I915_MADV_WILLNEED &&
-		    !atomic_read(&obj->mm.shrink_pin))
-			list_move_tail(&obj->mm.link, &i915->mm.shrink_list);
-
-		spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
-	}
-}
-
-void
-i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
-{
-	/* Bump the LRU to try and avoid premature eviction whilst flipping  */
-	i915_gem_object_bump_inactive_ggtt(vma->obj);
-
-	i915_vma_unpin(vma);
-}
-
 /**
  * Moves a single object to the CPU read, and possibly write domain.
  * @obj: object to act on
@@ -615,9 +573,6 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	else
 		err = i915_gem_object_set_to_cpu_domain(obj, write_domain);
 
-	/* And bump the LRU for this access */
-	i915_gem_object_bump_inactive_ggtt(obj);
-
 	i915_gem_object_unlock(obj);
 
 	if (write_domain)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 72c699088be0..7ccf786c41a9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -500,7 +500,6 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 				     u32 alignment,
 				     const struct i915_ggtt_view *view,
 				     unsigned int flags);
-void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
 
 void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj);
 void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj);
-- 
2.20.1


* Re: [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
                   ` (4 preceding siblings ...)
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning Chris Wilson
@ 2021-01-19 15:30 ` Matthew Auld
  2021-01-19 15:37   ` Chris Wilson
  2021-01-19 17:26 ` Matthew Auld
  2021-01-19 21:50 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/6] drm/i915/gem: Amalgamate clflushes on suspend (rev2) Patchwork
  7 siblings, 1 reply; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 15:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> When flushing objects larger than the CPU cache it is preferable to use
> a single wbinvd() rather than overlapping clflush(). At runtime, we
> avoid wbinvd() due to its system-wide latencies, but during
> single-threaded suspend, no one will observe the imposed latency and we
> can opt for the faster wbinvd to clear all objects in a single hit.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_pm.c | 40 +++++++++-----------------
>  1 file changed, 13 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> index 40d3e40500fa..38c1298cb14b 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> @@ -11,6 +11,12 @@
>
>  #include "i915_drv.h"
>
> +#if defined(CONFIG_X86)
> +#include <asm/smp.h>
> +#else
> +#define wbinvd_on_all_cpus()
> +#endif
> +
>  void i915_gem_suspend(struct drm_i915_private *i915)
>  {
>         GEM_TRACE("%s\n", dev_name(i915->drm.dev));
> @@ -32,13 +38,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
>         i915_gem_drain_freed_objects(i915);
>  }
>
> -static struct drm_i915_gem_object *first_mm_object(struct list_head *list)
> -{
> -       return list_first_entry_or_null(list,
> -                                       struct drm_i915_gem_object,
> -                                       mm.link);
> -}
> -
>  void i915_gem_suspend_late(struct drm_i915_private *i915)
>  {
>         struct drm_i915_gem_object *obj;
> @@ -48,6 +47,7 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
>                 NULL
>         }, **phase;
>         unsigned long flags;
> +       bool flush = false;
>
>         /*
>          * Neither the BIOS, ourselves or any other kernel
> @@ -73,29 +73,15 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
>
>         spin_lock_irqsave(&i915->mm.obj_lock, flags);
>         for (phase = phases; *phase; phase++) {
> -               LIST_HEAD(keep);
> -
> -               while ((obj = first_mm_object(*phase))) {
> -                       list_move_tail(&obj->mm.link, &keep);
> -
> -                       /* Beware the background _i915_gem_free_objects */
> -                       if (!kref_get_unless_zero(&obj->base.refcount))
> -                               continue;
> -
> -                       spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> -
> -                       i915_gem_object_lock(obj, NULL);
> -                       drm_WARN_ON(&i915->drm,
> -                           i915_gem_object_set_to_gtt_domain(obj, false));
> -                       i915_gem_object_unlock(obj);
> -                       i915_gem_object_put(obj);
> -
> -                       spin_lock_irqsave(&i915->mm.obj_lock, flags);
> +               list_for_each_entry(obj, *phase, mm.link) {
> +                       if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> +                               flush |= (obj->read_domains & I915_GEM_DOMAIN_CPU) == 0;
> +                       __start_cpu_write(obj); /* presume auto-hibernate */
>                 }
> -
> -               list_splice_tail(&keep, *phase);
>         }
>         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> +       if (flush)
> +               wbinvd_on_all_cpus();

Hmmm, this builds on !CONFIG_X86?

>  }
>
>  void i915_gem_resume(struct drm_i915_private *i915)
> --
> 2.20.1
>

* Re: [Intel-gfx] [PATCH 2/6] drm/i915/gem: Amalgamate clflushes on freeze
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Amalgamate clflushes on freeze Chris Wilson
@ 2021-01-19 15:34   ` Matthew Auld
  2021-01-23 14:46   ` Guenter Roeck
  1 sibling, 0 replies; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 15:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> When flushing objects larger than the CPU cache it is preferable to use
> a single wbinvd() rather than overlapping clflush(). At runtime, we
> avoid wbinvd() due to its system-wide latencies, but during
> single-threaded suspend, no one will observe the imposed latency and we
> can opt for the faster wbinvd to clear all objects in a single hit.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend
  2021-01-19 15:30 ` [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Matthew Auld
@ 2021-01-19 15:37   ` Chris Wilson
  0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 15:37 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development

Quoting Matthew Auld (2021-01-19 15:30:41)
> On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > When flushing objects larger than the CPU cache it is preferable to use
> > a single wbinvd() rather than overlapping clflush(). At runtime, we
> > avoid wbinvd() due to its system-wide latencies, but during
> > single-threaded suspend, no one will observe the imposed latency and we
> > can opt for the faster wbinvd to clear all objects in a single hit.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_pm.c | 40 +++++++++-----------------
> >  1 file changed, 13 insertions(+), 27 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pm.c b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > index 40d3e40500fa..38c1298cb14b 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_pm.c
> > @@ -11,6 +11,12 @@
> >
> >  #include "i915_drv.h"
> >
> > +#if defined(CONFIG_X86)
> > +#include <asm/smp.h>
> > +#else
> > +#define wbinvd_on_all_cpus()
> > +#endif
> > +
> >  void i915_gem_suspend(struct drm_i915_private *i915)
> >  {
> >         GEM_TRACE("%s\n", dev_name(i915->drm.dev));
> > @@ -32,13 +38,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
> >         i915_gem_drain_freed_objects(i915);
> >  }
> >
> > -static struct drm_i915_gem_object *first_mm_object(struct list_head *list)
> > -{
> > -       return list_first_entry_or_null(list,
> > -                                       struct drm_i915_gem_object,
> > -                                       mm.link);
> > -}
> > -
> >  void i915_gem_suspend_late(struct drm_i915_private *i915)
> >  {
> >         struct drm_i915_gem_object *obj;
> > @@ -48,6 +47,7 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
> >                 NULL
> >         }, **phase;
> >         unsigned long flags;
> > +       bool flush = false;
> >
> >         /*
> >          * Neither the BIOS, ourselves or any other kernel
> > @@ -73,29 +73,15 @@ void i915_gem_suspend_late(struct drm_i915_private *i915)
> >
> >         spin_lock_irqsave(&i915->mm.obj_lock, flags);
> >         for (phase = phases; *phase; phase++) {
> > -               LIST_HEAD(keep);
> > -
> > -               while ((obj = first_mm_object(*phase))) {
> > -                       list_move_tail(&obj->mm.link, &keep);
> > -
> > -                       /* Beware the background _i915_gem_free_objects */
> > -                       if (!kref_get_unless_zero(&obj->base.refcount))
> > -                               continue;
> > -
> > -                       spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> > -
> > -                       i915_gem_object_lock(obj, NULL);
> > -                       drm_WARN_ON(&i915->drm,
> > -                           i915_gem_object_set_to_gtt_domain(obj, false));
> > -                       i915_gem_object_unlock(obj);
> > -                       i915_gem_object_put(obj);
> > -
> > -                       spin_lock_irqsave(&i915->mm.obj_lock, flags);
> > +               list_for_each_entry(obj, *phase, mm.link) {
> > +                       if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
> > +                               flush |= (obj->read_domains & I915_GEM_DOMAIN_CPU) == 0;
> > +                       __start_cpu_write(obj); /* presume auto-hibernate */
> >                 }
> > -
> > -               list_splice_tail(&keep, *phase);
> >         }
> >         spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
> > +       if (flush)
> > +               wbinvd_on_all_cpus();
> 
> Hmmm, this builds on !CONFIG_X86?

It builds; but does it do anything? The answer is no, but finding the
answer to that is a bridge I can cross later -- it's probably something
like flush_dcache_range(0, HUGEVAL) / __flush_dcache_all(). I expect it
to be a solved problem, just not sure what the solution is.

Maybe dev_warn() instead of a quiet macro.
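
Something like the following, say (sketch only):

	#if defined(CONFIG_X86)
	#include <asm/smp.h>
	#else
	#define wbinvd_on_all_cpus() \
		pr_warn(DRIVER_NAME ": Missing cache flush in %s\n", __func__)
	#endif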
-Chris

* Re: [Intel-gfx] [PATCH 3/6] drm/i915/gem: Move stolen node into GEM object union
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 3/6] drm/i915/gem: Move stolen node into GEM object union Chris Wilson
@ 2021-01-19 15:40   ` Matthew Auld
  0 siblings, 0 replies; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 15:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [Intel-gfx] [PATCH 4/6] drm/i915/gem: Use shrinkable status for unknown swizzle quirks
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Use shrinkable status for unknown swizzle quirks Chris Wilson
@ 2021-01-19 16:13   ` Matthew Auld
  0 siblings, 0 replies; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 16:13 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [Intel-gfx] [PATCH 5/6] drm/i915/gem: Make i915_gem_object_flush_write_domain() static
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 5/6] drm/i915/gem: Make i915_gem_object_flush_write_domain() static Chris Wilson
@ 2021-01-19 16:16   ` Matthew Auld
  0 siblings, 0 replies; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 16:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> flush_write_domain() is only used within the GEM domain managament code,
                                                       management

> so move it to i915_gem_domain.c and drop the export.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* Re: [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning Chris Wilson
@ 2021-01-19 16:38   ` Matthew Auld
  2021-01-19 17:02     ` Chris Wilson
  0 siblings, 1 reply; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 16:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Simplify the frontbuffer unpin by removing the lock requirement. The LRU
> bumping was primarily to protect the GTT from being evicted and from
> frontbuffers being eagerly shrunk. Now we protect frontbuffers from the
> shrinker, and we avoid accidentally evicting from the GTT, so the
> benefit from bumping LRU is no more, and we can save more time by not.

For the GTT evict case, where/how do we currently try to prevent
accidental eviction for fb?

* Re: [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning
  2021-01-19 16:38   ` Matthew Auld
@ 2021-01-19 17:02     ` Chris Wilson
  2021-01-19 17:14       ` Matthew Auld
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Wilson @ 2021-01-19 17:02 UTC (permalink / raw)
  To: Matthew Auld; +Cc: Intel Graphics Development

Quoting Matthew Auld (2021-01-19 16:38:04)
> On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > Simplify the frontbuffer unpin by removing the lock requirement. The LRU
> > bumping was primarily to protect the GTT from being evicted and from
> > frontbuffers being eagerly shrunk. Now we protect frontbuffers from the
> > shrinker, and we avoid accidentally evicting from the GTT, so the
> > benefit from bumping LRU is no more, and we can save more time by not.
> 
> For the GTT evict case, where/how do we currently try to prevent
> accidental eviction for fb?

Our preference is to try with NOEVICT, and then use smaller partial
mappings, reducing the risk of evicting anything that may be reused in
the near future. The goal is to really only use the full GTT mapping for
when HW needs access to the whole object.

However, we could apply the same rule as we do for the shrinker and leave
frontbuffer objects until the second pass. Such as

--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -61,6 +61,19 @@ mark_free(struct drm_mm_scan *scan,
        return drm_mm_scan_add_block(scan, &vma->node);
 }

+static bool skip_vma(struct i915_vma *vma)
+{
+       if (i915_vma_is_active(vma))
+               return true;
+
+       if (i915_is_ggtt(vma) &&
+           vma->obj &&
+           i915_gem_object_is_framebuffer(vma->obj))
+               return true;
+
+       return false;
+}
+
 /**
  * i915_gem_evict_something - Evict vmas to make room for binding a new one
  * @vm: address space to evict from
@@ -150,7 +163,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
                 * To notice when we complete one full cycle, we record the
                 * first active element seen, before moving it to the tail.
                 */
-               if (active != ERR_PTR(-EAGAIN) && i915_vma_is_active(vma)) {
+               if (active != ERR_PTR(-EAGAIN) && skip_vma(vma)) {
                        if (!active)
                                active = vma;

That would be better if we mark the vma as being used by display.
-Chris

* Re: [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning
  2021-01-19 17:02     ` Chris Wilson
@ 2021-01-19 17:14       ` Matthew Auld
  0 siblings, 0 replies; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 17:14 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 17:02, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> Quoting Matthew Auld (2021-01-19 16:38:04)
> > On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > >
> > > Simplify the frontbuffer unpin by removing the lock requirement. The LRU
> > > bumping was primarily to protect the GTT from being evicted and from
> > > frontbuffers being eagerly shrunk. Now we protect frontbuffers from the
> > > shrinker, and we avoid accidentally evicting from the GTT, so the
> > > benefit from bumping LRU is no more, and we can save more time by not.
> >
> > For the GTT evict case, where/how do we currently try to prevent
> > accidental eviction for fb?
>
> Our preference is to try with NOEVICT, and then use smaller partial
> mappings, reducing the risk of evicting anything that may be reused in
> the near future. The goal is to really only use the full GTT mapping for
> when HW needs access to the whole object.

Ah, right.

>
> However, we could apply the same rule as we do for the shrinker and leave
> frontbuffer objects until the second pass. Such as
>
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -61,6 +61,19 @@ mark_free(struct drm_mm_scan *scan,
>         return drm_mm_scan_add_block(scan, &vma->node);
>  }
>
> +static bool skip_vma(struct i915_vma *vma)
> +{
> +       if (i915_vma_is_active(vma))
> +               return true;
> +
> +       if (i915_is_ggtt(vma) &&
> +           vma->obj &&
> +           i915_gem_object_is_framebuffer(vma->obj))
> +               return true;
> +
> +       return false;
> +}
> +
>  /**
>   * i915_gem_evict_something - Evict vmas to make room for binding a new one
>   * @vm: address space to evict from
> @@ -150,7 +163,7 @@ i915_gem_evict_something(struct i915_address_space *vm,
>                  * To notice when we complete one full cycle, we record the
>                  * first active element seen, before moving it to the tail.
>                  */
> -               if (active != ERR_PTR(-EAGAIN) && i915_vma_is_active(vma)) {
> +               if (active != ERR_PTR(-EAGAIN) && skip_vma(vma)) {
>                         if (!active)
>                                 active = vma;
>
> That would be better if we mark the vma as being used by display.

Makes sense. For whichever method,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

> -Chris

* Re: [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
                   ` (5 preceding siblings ...)
  2021-01-19 15:30 ` [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Matthew Auld
@ 2021-01-19 17:26 ` Matthew Auld
  2021-01-19 21:50 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/6] drm/i915/gem: Amalgamate clflushes on suspend (rev2) Patchwork
  7 siblings, 0 replies; 19+ messages in thread
From: Matthew Auld @ 2021-01-19 17:26 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Intel Graphics Development

On Tue, 19 Jan 2021 at 14:49, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> When flushing objects larger than the CPU cache it is preferable to use
> a single wbinvd() rather than overlapping clflush(). At runtime, we
> avoid wbinvd() due to its system-wide latencies, but during
> single-threaded suspend, no one will observe the imposed latency and we
> can opt for the faster wbinvd to clear all objects in a single hit.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/6] drm/i915/gem: Amalgamate clflushes on suspend (rev2)
  2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Amalgamate clflushes on suspend Chris Wilson
                   ` (6 preceding siblings ...)
  2021-01-19 17:26 ` Matthew Auld
@ 2021-01-19 21:50 ` Patchwork
  7 siblings, 0 replies; 19+ messages in thread
From: Patchwork @ 2021-01-19 21:50 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/6] drm/i915/gem: Almagamate clflushes on suspend (rev2)
URL   : https://patchwork.freedesktop.org/series/86046/
State : failure

== Summary ==

Applying: drm/i915/gem: Almagamate clflushes on suspend
Applying: drm/i915/gem: Almagamate clflushes on freeze
Applying: drm/i915/gem: Move stolen node into GEM object union
Applying: drm/i915/gem: Use shrinkable status for unknown swizzle quirks
Applying: drm/i915/gem: Make i915_gem_object_flush_write_domain() static
error: sha1 information is lacking or useless (drivers/gpu/drm/i915/gem/i915_gem_object.h).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0005 drm/i915/gem: Make i915_gem_object_flush_write_domain() static
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



* Re: [Intel-gfx] [PATCH 2/6] drm/i915/gem: Almagamate clflushes on freeze
  2021-01-19 14:49 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Almagamate clflushes on freeze Chris Wilson
  2021-01-19 15:34   ` Matthew Auld
@ 2021-01-23 14:46   ` Guenter Roeck
  2021-01-23 14:53     ` Chris Wilson
  1 sibling, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2021-01-23 14:46 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Tue, Jan 19, 2021 at 02:49:08PM +0000, Chris Wilson wrote:
> When flushing objects larger than the CPU cache it is preferable to use
> a single wbinvd() rather than overlapping clflush(). At runtime, we
> avoid wbinvd() due to its system-wide latencies, but during
> single-threaded suspend, no one will observe the imposed latency and we
> can opt for the faster wbinvd to clear all objects in a single hit.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c013148835e6..d3a287bf56c5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1175,19 +1175,13 @@ int i915_gem_freeze_late(struct drm_i915_private *i915)
>  	 * the objects as well, see i915_gem_freeze()
>  	 */
>  
> -	wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> -
> -	i915_gem_shrink(i915, -1UL, NULL, ~0);
> +	with_intel_runtime_pm(&i915->runtime_pm, wakeref)
> +		i915_gem_shrink(i915, -1UL, NULL, ~0);
>  	i915_gem_drain_freed_objects(i915);
>  
> -	list_for_each_entry(obj, &i915->mm.shrink_list, mm.link) {
> -		i915_gem_object_lock(obj, NULL);
> -		drm_WARN_ON(&i915->drm,
> -			    i915_gem_object_set_to_cpu_domain(obj, true));
> -		i915_gem_object_unlock(obj);
> -	}
> -
> -	intel_runtime_pm_put(&i915->runtime_pm, wakeref);
> +	wbinvd_on_all_cpus();

with CONFIG_SMP=n, this results in:

drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_freeze_late':
drivers/gpu/drm/i915/i915_gem.c:1182:2: error: implicit declaration of function 'wbinvd_on_all_cpus'

Other drivers calling this function include <asm/smp.h>.

Guenter

> +	list_for_each_entry(obj, &i915->mm.shrink_list, mm.link)
> +		__start_cpu_write(obj);
>  
>  	return 0;
>  }

* Re: [Intel-gfx] [PATCH 2/6] drm/i915/gem: Almagamate clflushes on freeze
  2021-01-23 14:46   ` Guenter Roeck
@ 2021-01-23 14:53     ` Chris Wilson
  0 siblings, 0 replies; 19+ messages in thread
From: Chris Wilson @ 2021-01-23 14:53 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: intel-gfx

Quoting Guenter Roeck (2021-01-23 14:46:33)
> On Tue, Jan 19, 2021 at 02:49:08PM +0000, Chris Wilson wrote:
> > When flushing objects larger than the CPU cache it is preferable to use
> > a single wbinvd() rather than overlapping clflush(). At runtime, we
> > avoid wbinvd() due to its system-wide latencies, but during
> > single-threaded suspend, no one will observe the imposed latency and we
> > can opt for the faster wbinvd to clear all objects in a single hit.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Reviewed-by: Matthew Auld <matthew.auld@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 16 +++++-----------
> >  1 file changed, 5 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index c013148835e6..d3a287bf56c5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1175,19 +1175,13 @@ int i915_gem_freeze_late(struct drm_i915_private *i915)
> >        * the objects as well, see i915_gem_freeze()
> >        */
> >  
> > -     wakeref = intel_runtime_pm_get(&i915->runtime_pm);
> > -
> > -     i915_gem_shrink(i915, -1UL, NULL, ~0);
> > +     with_intel_runtime_pm(&i915->runtime_pm, wakeref)
> > +             i915_gem_shrink(i915, -1UL, NULL, ~0);
> >       i915_gem_drain_freed_objects(i915);
> >  
> > -     list_for_each_entry(obj, &i915->mm.shrink_list, mm.link) {
> > -             i915_gem_object_lock(obj, NULL);
> > -             drm_WARN_ON(&i915->drm,
> > -                         i915_gem_object_set_to_cpu_domain(obj, true));
> > -             i915_gem_object_unlock(obj);
> > -     }
> > -
> > -     intel_runtime_pm_put(&i915->runtime_pm, wakeref);
> > +     wbinvd_on_all_cpus();
> 
> with CONFIG_SMP=n, this results in:
> 
> drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_freeze_late':
> drivers/gpu/drm/i915/i915_gem.c:1182:2: error: implicit declaration of function 'wbinvd_on_all_cpus'
> 
> Other drivers calling this function include <asm/smp.h>.

I mistakenly thought this was next to i915_gem_suspend...
-Chris
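
The fix amounts to pulling in the declaration next to the new call
site; a guard along these lines would also cover non-x86 builds (an
illustrative form, not necessarily the fixup that eventually landed):

#if IS_ENABLED(CONFIG_X86)
#include <asm/smp.h>	/* declares wbinvd_on_all_cpus() on UP and SMP */
#else
static inline int wbinvd_on_all_cpus(void)
{
	return 0;	/* no wbinvd instruction to issue off x86 */
}
#endif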

Thread overview: 19+ messages
2021-01-19 14:49 [Intel-gfx] [PATCH 1/6] drm/i915/gem: Almagamate clflushes on suspend Chris Wilson
2021-01-19 14:49 ` [Intel-gfx] [PATCH 2/6] drm/i915/gem: Almagamate clflushes on freeze Chris Wilson
2021-01-19 15:34   ` Matthew Auld
2021-01-23 14:46   ` Guenter Roeck
2021-01-23 14:53     ` Chris Wilson
2021-01-19 14:49 ` [Intel-gfx] [PATCH 3/6] drm/i915/gem: Move stolen node into GEM object union Chris Wilson
2021-01-19 15:40   ` Matthew Auld
2021-01-19 14:49 ` [Intel-gfx] [PATCH 4/6] drm/i915/gem: Use shrinkable status for unknown swizzle quirks Chris Wilson
2021-01-19 16:13   ` Matthew Auld
2021-01-19 14:49 ` [Intel-gfx] [PATCH 5/6] drm/i915/gem: Make i915_gem_object_flush_write_domain() static Chris Wilson
2021-01-19 16:16   ` Matthew Auld
2021-01-19 14:49 ` [Intel-gfx] [PATCH 6/6] drm/i915/gem: Drop lru bumping on display unpinning Chris Wilson
2021-01-19 16:38   ` Matthew Auld
2021-01-19 17:02     ` Chris Wilson
2021-01-19 17:14       ` Matthew Auld
2021-01-19 15:30 ` [Intel-gfx] [PATCH 1/6] drm/i915/gem: Almagamate clflushes on suspend Matthew Auld
2021-01-19 15:37   ` Chris Wilson
2021-01-19 17:26 ` Matthew Auld
2021-01-19 21:50 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/6] drm/i915/gem: Almagamate clflushes on suspend (rev2) Patchwork
