* [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
@ 2019-05-07 10:55 Matthew Auld
  2019-05-07 10:55 ` [PATCH v2 2/2] drm/i915: add in-kernel blitter client Matthew Auld
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Matthew Auld @ 2019-05-07 10:55 UTC (permalink / raw)
  To: intel-gfx

Some steps in gen6_alloc_va_range require the HW to be awake, so ideally
we should be grabbing the wakeref ourselves and not relying on the
caller already holding it for us.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 8f5db787b7f2..ffb8c3d011e9 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1745,10 +1745,13 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 {
 	struct gen6_hw_ppgtt *ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
 	struct i915_page_table *pt;
+	intel_wakeref_t wakeref;
 	u64 from = start;
 	unsigned int pde;
 	bool flush = false;
 
+	wakeref = intel_runtime_pm_get(vm->i915);
+
 	gen6_for_each_pde(pt, &ppgtt->base.pd, start, length, pde) {
 		const unsigned int count = gen6_pte_count(start, length);
 
@@ -1774,12 +1777,15 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
 
 	if (flush) {
 		mark_tlbs_dirty(&ppgtt->base);
-		gen6_ggtt_invalidate(ppgtt->base.vm.i915);
+		gen6_ggtt_invalidate(vm->i915);
 	}
 
+	intel_runtime_pm_put(vm->i915, wakeref);
+
 	return 0;
 
 unwind_out:
+	intel_runtime_pm_put(vm->i915, wakeref);
 	gen6_ppgtt_clear_range(vm, from, start - from);
 	return -ENOMEM;
 }
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH v2 2/2] drm/i915: add in-kernel blitter client
  2019-05-07 10:55 [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Matthew Auld
@ 2019-05-07 10:55 ` Matthew Auld
  2019-05-07 11:57   ` Chris Wilson
  2019-05-07 11:36 ` [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Chris Wilson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 7+ messages in thread
From: Matthew Auld @ 2019-05-07 10:55 UTC (permalink / raw)
  To: intel-gfx

The plan is to use the blitter engine for async object clearing when
using local memory, but before we can move the worker to get_pages() we
have to first tame some more of our struct_mutex usage. With this in
mind we should be able to upstream the object clearing as some
selftests, which should serve as a guinea pig for the ongoing locking
rework and upcoming async get_pages() framework.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |   2 +
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h  |   1 +
 drivers/gpu/drm/i915/i915_gem_client_blt.c    | 297 ++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_client_blt.h    |  21 ++
 drivers/gpu/drm/i915/i915_gem_object_blt.c    | 103 ++++++
 drivers/gpu/drm/i915/i915_gem_object_blt.h    |  24 ++
 .../drm/i915/selftests/i915_gem_client_blt.c  | 131 ++++++++
 .../drm/i915/selftests/i915_gem_object_blt.c  | 114 +++++++
 .../drm/i915/selftests/i915_live_selftests.h  |   2 +
 9 files changed, 695 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_client_blt.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_client_blt.h
 create mode 100644 drivers/gpu/drm/i915/i915_gem_object_blt.c
 create mode 100644 drivers/gpu/drm/i915/i915_gem_object_blt.h
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c
 create mode 100644 drivers/gpu/drm/i915/selftests/i915_gem_object_blt.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 68106fe35a04..a1690aade273 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -90,6 +90,7 @@ i915-y += \
 	  i915_cmd_parser.o \
 	  i915_gem_batch_pool.o \
 	  i915_gem_clflush.o \
+	  i915_gem_client_blt.o \
 	  i915_gem_context.o \
 	  i915_gem_dmabuf.o \
 	  i915_gem_evict.o \
@@ -99,6 +100,7 @@ i915-y += \
 	  i915_gem_internal.o \
 	  i915_gem.o \
 	  i915_gem_object.o \
+	  i915_gem_object_blt.o \
 	  i915_gem_pm.o \
 	  i915_gem_render_state.o \
 	  i915_gem_shrinker.o \
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index a34ece53a771..7e95827b0726 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -180,6 +180,7 @@
 #define GFX_OP_DRAWRECT_INFO_I965  ((0x7900<<16)|0x2)
 
 #define COLOR_BLT_CMD			(2<<29 | 0x40<<22 | (5-2))
+#define XY_COLOR_BLT_CMD		(2<<29 | 0x50<<22)
 #define SRC_COPY_BLT_CMD		((2<<29)|(0x43<<22)|4)
 #define XY_SRC_COPY_BLT_CMD		((2<<29)|(0x53<<22)|6)
 #define XY_MONO_SRC_COPY_IMM_BLT	((2<<29)|(0x71<<22)|5)
diff --git a/drivers/gpu/drm/i915/i915_gem_client_blt.c b/drivers/gpu/drm/i915/i915_gem_client_blt.c
new file mode 100644
index 000000000000..0a25df8d567a
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_client_blt.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+#include "i915_gem_client_blt.h"
+
+#include "i915_gem_object_blt.h"
+#include "intel_drv.h"
+
+struct i915_sleeve {
+	struct i915_vma *vma;
+	struct drm_i915_gem_object *obj;
+	struct sg_table *pages;
+	struct i915_page_sizes page_sizes;
+};
+
+static int vma_set_pages(struct i915_vma *vma)
+{
+	struct i915_sleeve *sleeve = vma->private;
+
+	vma->pages = sleeve->pages;
+	vma->page_sizes = sleeve->page_sizes;
+
+	return 0;
+}
+
+static void vma_clear_pages(struct i915_vma *vma)
+{
+	GEM_BUG_ON(!vma->pages);
+	vma->pages = NULL;
+}
+
+static int vma_bind(struct i915_vma *vma,
+		    enum i915_cache_level cache_level,
+		    u32 flags)
+{
+	return vma->vm->vma_ops.bind_vma(vma, cache_level, flags);
+}
+
+static void vma_unbind(struct i915_vma *vma)
+{
+	vma->vm->vma_ops.unbind_vma(vma);
+}
+
+static const struct i915_vma_ops proxy_vma_ops = {
+	.set_pages = vma_set_pages,
+	.clear_pages = vma_clear_pages,
+	.bind_vma = vma_bind,
+	.unbind_vma = vma_unbind,
+};
+
+static struct i915_sleeve *create_sleeve(struct i915_address_space *vm,
+					 struct drm_i915_gem_object *obj,
+					 struct sg_table *pages,
+					 struct i915_page_sizes *page_sizes)
+{
+	struct i915_sleeve *sleeve;
+	struct i915_vma *vma;
+	int err;
+
+	sleeve = kzalloc(sizeof(*sleeve), GFP_KERNEL);
+	if (!sleeve)
+		return ERR_PTR(-ENOMEM);
+
+	vma = i915_vma_instance(obj, vm, NULL);
+	if (IS_ERR(vma)) {
+		err = PTR_ERR(vma);
+		goto err_free;
+	}
+
+	vma->private = sleeve;
+	vma->ops = &proxy_vma_ops;
+
+	sleeve->vma = vma;
+	sleeve->obj = i915_gem_object_get(obj);
+	sleeve->pages = pages;
+	sleeve->page_sizes = *page_sizes;
+
+	return sleeve;
+
+err_free:
+	kfree(sleeve);
+	return ERR_PTR(err);
+}
+
+static void destroy_sleeve(struct i915_sleeve *sleeve)
+{
+	i915_gem_object_put(sleeve->obj);
+	kfree(sleeve);
+}
+
+struct clear_pages_work {
+	struct dma_fence dma;
+	struct dma_fence_cb cb;
+	struct i915_sw_fence wait;
+	struct work_struct work;
+	struct irq_work irq_work;
+	struct i915_sleeve *sleeve;
+	struct intel_context *ce;
+	u32 value;
+};
+
+static const char *clear_pages_work_driver_name(struct dma_fence *fence)
+{
+	return DRIVER_NAME;
+}
+
+static const char *clear_pages_work_timeline_name(struct dma_fence *fence)
+{
+	return "clear";
+}
+
+static void clear_pages_work_release(struct dma_fence *fence)
+{
+	struct clear_pages_work *w = container_of(fence, typeof(*w), dma);
+
+	destroy_sleeve(w->sleeve);
+
+	i915_sw_fence_fini(&w->wait);
+
+	BUILD_BUG_ON(offsetof(typeof(*w), dma));
+	dma_fence_free(&w->dma);
+}
+
+static const struct dma_fence_ops clear_pages_work_ops = {
+	.get_driver_name = clear_pages_work_driver_name,
+	.get_timeline_name = clear_pages_work_timeline_name,
+	.release = clear_pages_work_release,
+};
+
+static void clear_pages_signal_irq_worker(struct irq_work *work)
+{
+	struct clear_pages_work *w = container_of(work, typeof(*w), irq_work);
+
+	dma_fence_signal(&w->dma);
+	dma_fence_put(&w->dma);
+}
+
+static void clear_pages_dma_fence_cb(struct dma_fence *fence,
+				     struct dma_fence_cb *cb)
+{
+	struct clear_pages_work *w = container_of(cb, typeof(*w), cb);
+
+	/*
+	 * Push the signalling of the fence into yet another worker to avoid
+	 * the nightmare locking around the fence spinlock.
+	 */
+	irq_work_queue(&w->irq_work);
+}
+
+static void clear_pages_worker(struct work_struct *work)
+{
+	struct clear_pages_work *w = container_of(work, typeof(*w), work);
+	struct drm_i915_private *i915 = w->ce->gem_context->i915;
+	struct drm_i915_gem_object *obj = w->sleeve->obj;
+	struct i915_vma *vma = w->sleeve->vma;
+	struct i915_request *rq;
+	int err;
+
+	if (w->dma.error)
+		goto out_signal;
+
+	if (obj->cache_dirty) {
+		obj->write_domain = 0;
+		if (i915_gem_object_has_struct_page(obj))
+			drm_clflush_sg(w->sleeve->pages);
+		obj->cache_dirty = false;
+	}
+
+	mutex_lock(&i915->drm.struct_mutex);
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (unlikely(err)) {
+		mutex_unlock(&i915->drm.struct_mutex);
+		dma_fence_set_error(&w->dma, err);
+		goto out_signal;
+	}
+
+	rq = i915_request_create(w->ce);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out_unpin;
+	}
+
+	err = intel_emit_vma_fill_blt(rq, vma, w->value);
+	if (unlikely(err))
+		goto out_request;
+
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+out_request:
+	if (unlikely(err))
+		i915_request_skip(rq, err);
+	else
+		i915_request_get(rq);
+
+	i915_request_add(rq);
+out_unpin:
+	i915_vma_unpin(vma);
+
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	if (!err) {
+		err = dma_fence_add_callback(&rq->fence, &w->cb,
+					     clear_pages_dma_fence_cb);
+		i915_request_put(rq);
+		if (!err)
+			return;
+	} else {
+		dma_fence_set_error(&w->dma, err);
+	}
+out_signal:
+	dma_fence_signal(&w->dma);
+	dma_fence_put(&w->dma);
+}
+
+static int __i915_sw_fence_call
+clear_pages_work_notify(struct i915_sw_fence *fence,
+			enum i915_sw_fence_notify state)
+{
+	struct clear_pages_work *w = container_of(fence, typeof(*w), wait);
+
+	switch (state) {
+	case FENCE_COMPLETE:
+		schedule_work(&w->work);
+		break;
+
+	case FENCE_FREE:
+		dma_fence_put(&w->dma);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static DEFINE_SPINLOCK(fence_lock);
+
+int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
+				     struct intel_context *ce,
+				     struct sg_table *pages,
+				     struct i915_page_sizes *page_sizes,
+				     u32 value)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_gem_context *ctx = ce->gem_context;
+	struct i915_address_space *vm;
+	struct clear_pages_work *work;
+	struct i915_sleeve *sleeve;
+	int err;
+
+	vm = ctx->ppgtt ? &ctx->ppgtt->vm : &i915->ggtt.vm;
+
+	sleeve = create_sleeve(vm, obj, pages, page_sizes);
+	if (IS_ERR(sleeve))
+		return PTR_ERR(sleeve);
+
+	work = kmalloc(sizeof(*work), GFP_KERNEL);
+	if (work == NULL) {
+		destroy_sleeve(sleeve);
+		return -ENOMEM;
+	}
+
+	work->value = value;
+	work->sleeve = sleeve;
+	work->ce = ce;
+
+	INIT_WORK(&work->work, clear_pages_worker);
+
+	init_irq_work(&work->irq_work, clear_pages_signal_irq_worker);
+
+	dma_fence_init(&work->dma,
+		       &clear_pages_work_ops,
+		       &fence_lock,
+		       i915->mm.unordered_timeline,
+		       0);
+	i915_sw_fence_init(&work->wait, clear_pages_work_notify);
+
+	i915_gem_object_lock(obj);
+	err = i915_sw_fence_await_reservation(&work->wait,
+					      obj->resv, NULL,
+					      true, I915_FENCE_TIMEOUT,
+					      I915_FENCE_GFP);
+	if (err < 0) {
+		dma_fence_set_error(&work->dma, err);
+	} else {
+		reservation_object_add_excl_fence(obj->resv, &work->dma);
+		err = 0;
+	}
+	i915_gem_object_unlock(obj);
+
+	dma_fence_get(&work->dma);
+	i915_sw_fence_commit(&work->wait);
+
+	return err;
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/i915_gem_client_blt.c"
+#endif
diff --git a/drivers/gpu/drm/i915/i915_gem_client_blt.h b/drivers/gpu/drm/i915/i915_gem_client_blt.h
new file mode 100644
index 000000000000..a7080623e741
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_client_blt.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+#ifndef __I915_GEM_CLIENT_BLT_H__
+#define __I915_GEM_CLIENT_BLT_H__
+
+#include <linux/types.h>
+
+struct drm_i915_gem_object;
+struct intel_context;
+struct i915_page_sizes;
+struct sg_table;
+
+int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
+				     struct intel_context *ce,
+				     struct sg_table *pages,
+				     struct i915_page_sizes *page_sizes,
+				     u32 value);
+
+#endif
diff --git a/drivers/gpu/drm/i915/i915_gem_object_blt.c b/drivers/gpu/drm/i915/i915_gem_object_blt.c
new file mode 100644
index 000000000000..3fda33e5dcf5
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_object_blt.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "i915_gem_object_blt.h"
+
+#include "i915_gem_clflush.h"
+#include "intel_drv.h"
+
+int intel_emit_vma_fill_blt(struct i915_request *rq,
+			    struct i915_vma *vma,
+			    u32 value)
+{
+	struct intel_context *ce = rq->hw_context;
+	u32 *cs;
+	int err;
+
+	if (ce->engine->emit_init_breadcrumb) {
+		err = ce->engine->emit_init_breadcrumb(rq);
+		if (unlikely(err))
+			return err;
+	}
+
+	cs = intel_ring_begin(rq, 8);
+
+	if (INTEL_GEN(rq->i915) >= 8) {
+		*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7-2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = vma->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = lower_32_bits(vma->node.start);
+		*cs++ = upper_32_bits(vma->node.start);
+		*cs++ = value;
+		*cs++ = MI_NOOP;
+	} else {
+		*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6-2);
+		*cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
+		*cs++ = 0;
+		*cs++ = vma->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
+		*cs++ = vma->node.start;
+		*cs++ = value;
+		*cs++ = MI_NOOP;
+		*cs++ = MI_NOOP;
+	}
+
+	intel_ring_advance(rq, cs);
+
+	return 0;
+}
+
+int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
+			     struct intel_context *ce,
+			     u32 value)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_gem_context *ctx = ce->gem_context;
+	struct i915_address_space *vm;
+	struct i915_request *rq;
+	struct i915_vma *vma;
+	int err;
+
+	vm = ctx->ppgtt ? &ctx->ppgtt->vm : &i915->ggtt.vm;
+
+	vma = i915_vma_instance(obj, vm, NULL);
+	if (IS_ERR(vma))
+		return PTR_ERR(vma);
+
+	err = i915_vma_pin(vma, 0, 0, PIN_USER);
+	if (unlikely(err))
+		return err;
+
+	if (obj->cache_dirty)
+		i915_gem_clflush_object(obj, 0);
+
+	rq = i915_request_create(ce);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto out_unpin;
+	}
+
+	err = i915_request_await_object(rq, obj, true);
+	if (unlikely(err))
+		goto out_request;
+
+	err = intel_emit_vma_fill_blt(rq, vma, value);
+	if (unlikely(err))
+		goto out_request;
+
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+out_request:
+	if (unlikely(err))
+		i915_request_skip(rq, err);
+
+	i915_request_add(rq);
+out_unpin:
+	i915_vma_unpin(vma);
+	return err;
+}
+
+#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
+#include "selftests/i915_gem_object_blt.c"
+#endif
diff --git a/drivers/gpu/drm/i915/i915_gem_object_blt.h b/drivers/gpu/drm/i915/i915_gem_object_blt.h
new file mode 100644
index 000000000000..7ec7de6ac0c0
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_object_blt.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: MIT */
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#ifndef __I915_GEM_OBJECT_BLT_H__
+#define __I915_GEM_OBJECT_BLT_H__
+
+#include <linux/types.h>
+
+struct drm_i915_gem_object;
+struct intel_context;
+struct i915_request;
+struct i915_vma;
+
+int intel_emit_vma_fill_blt(struct i915_request *rq,
+			    struct i915_vma *vma,
+			    u32 value);
+
+int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
+			     struct intel_context *ce,
+			     u32 value);
+
+#endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c
new file mode 100644
index 000000000000..54b15d22e310
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c
@@ -0,0 +1,131 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "../i915_selftest.h"
+
+#include "igt_flush_test.h"
+#include "mock_drm.h"
+#include "mock_context.h"
+
+static int igt_client_fill(void *arg)
+{
+	struct intel_context *ce = arg;
+	struct drm_i915_private *i915 = ce->gem_context->i915;
+	struct drm_i915_gem_object *obj;
+	struct rnd_state prng;
+	IGT_TIMEOUT(end);
+	u32 *vaddr;
+	int err = 0;
+
+	prandom_seed_state(&prng, i915_selftest.random_seed);
+
+	do {
+		u32 sz = prandom_u32_state(&prng) % SZ_32M;
+		u32 val = prandom_u32_state(&prng);
+		u32 i;
+
+		sz = round_up(sz, PAGE_SIZE);
+
+		pr_info("%s with sz=%x, val=%x\n", __func__, sz, val);
+
+		obj = i915_gem_object_create_internal(i915, sz);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_flush;
+		}
+
+		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto err_put;
+		}
+
+		/*
+		 * XXX: The goal is move this to get_pages, so try to dirty the
+		 * CPU cache first to check that we do the required clflush
+		 * before scheduling the blt for !llc platforms. This matches
+		 * some version of reality where at get_pages the pages
+		 * themselves may not yet be coherent with the GPU(swap-in). If
+		 * we are missing the flush then we should see the stale cache
+		 * values after we do the set_to_cpu_domain and pick it up as a
+		 * test failure.
+		 */
+		memset32(vaddr, val ^ 0xdeadbeaf, obj->base.size / sizeof(u32));
+
+		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
+			obj->cache_dirty = true;
+
+		err = i915_gem_schedule_fill_pages_blt(obj, ce, obj->mm.pages,
+						       &obj->mm.page_sizes,
+						       val);
+		if (err)
+			goto err_unpin;
+
+		/*
+		 * XXX: For now do the wait without the BKL to ensure we don't
+		 * deadlock.
+		 */
+		err = i915_gem_object_wait(obj,
+					   I915_WAIT_INTERRUPTIBLE |
+					   I915_WAIT_ALL,
+					   MAX_SCHEDULE_TIMEOUT);
+		if (err)
+			goto err_unpin;
+
+		mutex_lock(&i915->drm.struct_mutex);
+		err = i915_gem_object_set_to_cpu_domain(obj, false);
+		mutex_unlock(&i915->drm.struct_mutex);
+		if (err)
+			goto err_unpin;
+
+		for (i = 0; i < obj->base.size / sizeof(u32); ++i) {
+			if (vaddr[i] != val) {
+				pr_err("vaddr[%u]=%x, expected=%x\n", i,
+				       vaddr[i], val);
+				err = -EINVAL;
+				goto err_unpin;
+			}
+		}
+
+		i915_gem_object_unpin_map(obj);
+
+		mutex_lock(&i915->drm.struct_mutex);
+		__i915_gem_object_release_unless_active(obj);
+		mutex_unlock(&i915->drm.struct_mutex);
+	} while (!time_after(jiffies, end));
+
+	goto err_flush;
+
+err_unpin:
+	i915_gem_object_unpin_map(obj);
+err_put:
+	mutex_lock(&i915->drm.struct_mutex);
+	__i915_gem_object_release_unless_active(obj);
+	mutex_unlock(&i915->drm.struct_mutex);
+err_flush:
+	mutex_lock(&i915->drm.struct_mutex);
+	igt_flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
+int i915_gem_client_blt_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_client_fill),
+	};
+
+	if (i915_terminally_wedged(i915))
+		return 0;
+
+	if (!HAS_ENGINE(i915, BCS0))
+		return 0;
+
+	return i915_subtests(tests, i915->engine[BCS0]->kernel_context);
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object_blt.c b/drivers/gpu/drm/i915/selftests/i915_gem_object_blt.c
new file mode 100644
index 000000000000..9216f24c7c8f
--- /dev/null
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object_blt.c
@@ -0,0 +1,114 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright © 2019 Intel Corporation
+ */
+
+#include "../i915_selftest.h"
+
+#include "igt_flush_test.h"
+#include "mock_drm.h"
+#include "mock_context.h"
+
+static int igt_fill_blt(void *arg)
+{
+	struct intel_context *ce = arg;
+	struct drm_i915_private *i915 = ce->gem_context->i915;
+	struct drm_i915_gem_object *obj;
+	struct rnd_state prng;
+	IGT_TIMEOUT(end);
+	u32 *vaddr;
+	int err = 0;
+
+	prandom_seed_state(&prng, i915_selftest.random_seed);
+
+	do {
+		u32 sz = prandom_u32_state(&prng) % SZ_32M;
+		u32 val = prandom_u32_state(&prng);
+		u32 i;
+
+		sz = round_up(sz, PAGE_SIZE);
+
+		pr_info("%s with sz=%x, val=%x\n", __func__, sz, val);
+
+		obj = i915_gem_object_create_internal(i915, sz);
+		if (IS_ERR(obj)) {
+			err = PTR_ERR(obj);
+			goto err_flush;
+		}
+
+		vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
+		if (IS_ERR(vaddr)) {
+			err = PTR_ERR(vaddr);
+			goto err_put;
+		}
+
+		/*
+		 * Make sure the potentially async clflush does its job, if
+		 * required.
+		 */
+		memset32(vaddr, val ^ 0xdeadbeaf, obj->base.size / sizeof(u32));
+
+		if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
+			obj->cache_dirty = true;
+
+		mutex_lock(&i915->drm.struct_mutex);
+		err = i915_gem_object_fill_blt(obj, ce, val);
+		mutex_unlock(&i915->drm.struct_mutex);
+		if (err)
+			goto err_unpin;
+
+		mutex_lock(&i915->drm.struct_mutex);
+		err = i915_gem_object_set_to_cpu_domain(obj, false);
+		mutex_unlock(&i915->drm.struct_mutex);
+		if (err)
+			goto err_unpin;
+
+		for (i = 0; i < obj->base.size / sizeof(u32); ++i) {
+			if (vaddr[i] != val) {
+				pr_err("vaddr[%u]=%x, expected=%x\n", i,
+				       vaddr[i], val);
+				err = -EINVAL;
+				goto err_unpin;
+			}
+		}
+
+		i915_gem_object_unpin_map(obj);
+
+		mutex_lock(&i915->drm.struct_mutex);
+		__i915_gem_object_release_unless_active(obj);
+		mutex_unlock(&i915->drm.struct_mutex);
+	} while (!time_after(jiffies, end));
+
+	goto err_flush;
+
+err_unpin:
+	i915_gem_object_unpin_map(obj);
+err_put:
+	mutex_lock(&i915->drm.struct_mutex);
+	__i915_gem_object_release_unless_active(obj);
+	mutex_unlock(&i915->drm.struct_mutex);
+err_flush:
+	mutex_lock(&i915->drm.struct_mutex);
+	igt_flush_test(i915, I915_WAIT_LOCKED);
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	if (err == -ENOMEM)
+		err = 0;
+
+	return err;
+}
+
+int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
+{
+	static const struct i915_subtest tests[] = {
+		SUBTEST(igt_fill_blt),
+	};
+
+	if (i915_terminally_wedged(i915))
+		return 0;
+
+	if (!HAS_ENGINE(i915, BCS0))
+		return 0;
+
+	return i915_subtests(tests, i915->engine[BCS0]->kernel_context);
+}
diff --git a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
index 6d766925ad04..3797574e1984 100644
--- a/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
+++ b/drivers/gpu/drm/i915/selftests/i915_live_selftests.h
@@ -23,6 +23,8 @@ selftest(gem, i915_gem_live_selftests)
 selftest(evict, i915_gem_evict_live_selftests)
 selftest(hugepages, i915_gem_huge_page_live_selftests)
 selftest(contexts, i915_gem_context_live_selftests)
+selftest(blt, i915_gem_object_blt_live_selftests)
+selftest(client, i915_gem_client_blt_live_selftests)
 selftest(hangcheck, intel_hangcheck_live_selftests)
 selftest(execlists, intel_execlists_live_selftests)
 selftest(guc, intel_guc_live_selftest)
-- 
2.20.1


* Re: [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
  2019-05-07 10:55 [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Matthew Auld
  2019-05-07 10:55 ` [PATCH v2 2/2] drm/i915: add in-kernel blitter client Matthew Auld
@ 2019-05-07 11:36 ` Chris Wilson
  2019-05-07 12:14 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2,1/2] " Patchwork
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-05-07 11:36 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-05-07 11:55:56)
> Some steps in gen6_alloc_va_range require the HW to be awake, so ideally
> we should be grabbing the wakeref ourselves and not relying on the
> caller already holding it for us.
> 
> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_gtt.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index 8f5db787b7f2..ffb8c3d011e9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1745,10 +1745,13 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
>  {
>         struct gen6_hw_ppgtt *ppgtt = to_gen6_ppgtt(i915_vm_to_ppgtt(vm));
>         struct i915_page_table *pt;
> +       intel_wakeref_t wakeref;
>         u64 from = start;
>         unsigned int pde;
>         bool flush = false;
>  
> +       wakeref = intel_runtime_pm_get(vm->i915);
> +
>         gen6_for_each_pde(pt, &ppgtt->base.pd, start, length, pde) {
>                 const unsigned int count = gen6_pte_count(start, length);
>  
> @@ -1774,12 +1777,15 @@ static int gen6_alloc_va_range(struct i915_address_space *vm,
>  
>         if (flush) {
>                 mark_tlbs_dirty(&ppgtt->base);
> -               gen6_ggtt_invalidate(ppgtt->base.vm.i915);
> +               gen6_ggtt_invalidate(vm->i915);
>         }
>  
> +       intel_runtime_pm_put(vm->i915, wakeref);
> +
>         return 0;
>  
>  unwind_out:
> +       intel_runtime_pm_put(vm->i915, wakeref);
>         gen6_ppgtt_clear_range(vm, from, start - from);
>         return -ENOMEM;

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

It's a bit too fiddly here to try and defer it until the next time the
HW is awake -- and really if we are adjusting the iova, then we are
going to be using the HW, and normally would be under a longterm wakeref
(e.g. execbuf) but for in-kernel clients, we need to be more precise.
-Chris
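
For readers following the thread, the point of the patch can be shown with a small userspace sketch: take the wakeref at the top of the function and release it on every exit path, including the unwind path. Everything here is a mock and an assumption for illustration. mock_pm_get() and mock_pm_put() are hypothetical stand-ins for intel_runtime_pm_get() and intel_runtime_pm_put(), and mock_alloc_range() is not the driver's gen6_alloc_va_range().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device's runtime-PM reference count. */
static int wakeref_count;

/* Mock of "grab a wakeref": returns a cookie, as the real API does. */
static int mock_pm_get(void)
{
	return ++wakeref_count;
}

/* Mock of "release a wakeref": the count must never go negative. */
static void mock_pm_put(int cookie)
{
	(void)cookie;
	assert(wakeref_count > 0);
	--wakeref_count;
}

/*
 * Hypothetical allocation step. The wakeref is held for the whole
 * function and dropped on both the success and the failure path.
 */
static int mock_alloc_range(bool fail)
{
	int wakeref = mock_pm_get();
	int err = 0;

	if (fail) {
		err = -ENOMEM;
		goto out;
	}

	/* ... work that needs the HW awake would go here ... */

out:
	mock_pm_put(wakeref);
	return err;
}
```

The sketch funnels both paths through one label, whereas the patch places a put before each return; either way the count is balanced when the function exits.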

* Re: [PATCH v2 2/2] drm/i915: add in-kernel blitter client
  2019-05-07 10:55 ` [PATCH v2 2/2] drm/i915: add in-kernel blitter client Matthew Auld
@ 2019-05-07 11:57   ` Chris Wilson
  0 siblings, 0 replies; 7+ messages in thread
From: Chris Wilson @ 2019-05-07 11:57 UTC (permalink / raw)
  To: Matthew Auld, intel-gfx

Quoting Matthew Auld (2019-05-07 11:55:57)
> The plan is to use the blitter engine for async object clearing when
> using local memory, but before we can move the worker to get_pages() we
> have to first tame some more of our struct_mutex usage. With this in
> mind we should be able to upstream the object clearing as some
> selftests, which should serve as a guinea pig for the ongoing locking
> rework and upcoming async get_pages() framework.
> 
> Signed-off-by: Matthew Auld <matthew.auld@intel.com>
> ---
> +struct clear_pages_work {
> +       struct dma_fence dma;
> +       struct dma_fence_cb cb;
> +       struct i915_sw_fence wait;
> +       struct work_struct work;
> +       struct irq_work irq_work;
> +       struct i915_sleeve *sleeve;
> +       struct intel_context *ce;
> +       u32 value;
> +};
> +
> +static const char *clear_pages_work_driver_name(struct dma_fence *fence)
> +{
> +       return DRIVER_NAME;
> +}
> +
> +static const char *clear_pages_work_timeline_name(struct dma_fence *fence)
> +{
> +       return "clear";
> +}
> +
> +static void clear_pages_work_release(struct dma_fence *fence)
> +{
> +       struct clear_pages_work *w = container_of(fence, typeof(*w), dma);
> +
> +       destroy_sleeve(w->sleeve);
> +
> +       i915_sw_fence_fini(&w->wait);
> +
> +       BUILD_BUG_ON(offsetof(typeof(*w), dma));
> +       dma_fence_free(&w->dma);
> +}
> +
> +static const struct dma_fence_ops clear_pages_work_ops = {
> +       .get_driver_name = clear_pages_work_driver_name,
> +       .get_timeline_name = clear_pages_work_timeline_name,
> +       .release = clear_pages_work_release,
> +};
> +
> +static void clear_pages_signal_irq_worker(struct irq_work *work)
> +{
> +       struct clear_pages_work *w = container_of(work, typeof(*w), irq_work);
> +
> +       dma_fence_signal(&w->dma);
> +       dma_fence_put(&w->dma);
> +}
> +
> +static void clear_pages_dma_fence_cb(struct dma_fence *fence,
> +                                    struct dma_fence_cb *cb)
> +{
> +       struct clear_pages_work *w = container_of(cb, typeof(*w), cb);
> +
> +       /*
> +        * Push the signalling of the fence into yet another worker to avoid
> +        * the nightmare locking around the fence spinlock.
> +        */
> +       irq_work_queue(&w->irq_work);
> +}
> +
> +static void clear_pages_worker(struct work_struct *work)
> +{
> +       struct clear_pages_work *w = container_of(work, typeof(*w), work);
> +       struct drm_i915_private *i915 = w->ce->gem_context->i915;
> +       struct drm_i915_gem_object *obj = w->sleeve->obj;
> +       struct i915_vma *vma = w->sleeve->vma;
> +       struct i915_request *rq;
> +       int err;
> +
> +       if (w->dma.error)
> +               goto out_signal;
> +
> +       if (obj->cache_dirty) {
> +               obj->write_domain = 0;
> +               if (i915_gem_object_has_struct_page(obj))
> +                       drm_clflush_sg(w->sleeve->pages);
> +               obj->cache_dirty = false;

Interesting. If we have no struct_page, can we be cache_dirty here?
That might be a useful thought exercise and worth verifying at odd
points.

> +       }
> +
> +       mutex_lock(&i915->drm.struct_mutex);

This will become vm->mutex. But that's why we need this patch so that
we can trim down the locking with a working test.

> +       err = i915_vma_pin(vma, 0, 0, PIN_USER);
> +       if (unlikely(err)) {
> +               mutex_unlock(&i915->drm.struct_mutex);
> +               dma_fence_set_error(&w->dma, err);
> +               goto out_signal;
> +       }
> +
> +       rq = i915_request_create(w->ce);
> +       if (IS_ERR(rq)) {
> +               err = PTR_ERR(rq);
> +               goto out_unpin;
> +       }
> +
> +       err = intel_emit_vma_fill_blt(rq, vma, w->value);
> +       if (unlikely(err))
> +               goto out_request;
> +
> +       err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
> +out_request:
> +       if (unlikely(err))
> +               i915_request_skip(rq, err);
> +       else
> +               i915_request_get(rq);
> +
> +       i915_request_add(rq);
> +out_unpin:
> +       i915_vma_unpin(vma);
> +
> +       mutex_unlock(&i915->drm.struct_mutex);
> +
> +       if (!err) {
> +               err = dma_fence_add_callback(&rq->fence, &w->cb,
> +                                            clear_pages_dma_fence_cb);
> +               i915_request_put(rq);
> +               if (!err)
> +                       return;

This should be rearranged such that after we have a rq allocated, we
always attach the callback and propagate via the callback. Even on the
error path. That should tidy this up quite a bit. (It's pretty much the
point of why we always i915_request_add even when skipping, we always
have an intact timeline for conveying errors.)
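
One possible reading of this suggestion, as a userspace sketch only: once a request exists, the worker always attaches the callback, and the callback alone forwards the request's status (including any error) to the outer fence and signals it. struct mock_fence, propagate_cb() and mock_worker() are hypothetical names for illustration; this is not the dma_fence API and not necessarily the rearrangement that was eventually merged.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, much simplified fence: an error slot plus one callback. */
struct mock_fence {
	int error;
	bool signaled;
	void (*cb)(struct mock_fence *fence, void *data);
	void *cb_data;
};

/* Signal a fence once and run its callback, if one is attached. */
static void mock_fence_signal(struct mock_fence *fence)
{
	assert(!fence->signaled);
	fence->signaled = true;
	if (fence->cb)
		fence->cb(fence, fence->cb_data);
}

/* Callback: copy the request's error to the outer fence, then signal it. */
static void propagate_cb(struct mock_fence *rq_fence, void *data)
{
	struct mock_fence *outer = data;

	if (rq_fence->error)
		outer->error = rq_fence->error;
	mock_fence_signal(outer);
}

/*
 * Hypothetical worker body, starting from the point where the request
 * has already been allocated. A failed emit only marks the request as
 * skipped; the request is still submitted and the callback is still
 * attached, so there is a single completion path.
 */
static void mock_worker(struct mock_fence *rq_fence,
			struct mock_fence *outer, int emit_err)
{
	if (emit_err)
		rq_fence->error = emit_err;

	rq_fence->cb = propagate_cb;
	rq_fence->cb_data = outer;

	/* Stand-in for the submitted request completing later on. */
	mock_fence_signal(rq_fence);
}
```

With this shape the worker has no separate signal-and-put branch for the error case after request allocation; the error travels along the request's own timeline.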

> +       } else {
> +               dma_fence_set_error(&w->dma, err);
> +       }
> +out_signal:
> +       dma_fence_signal(&w->dma);
> +       dma_fence_put(&w->dma);
> +}
> +
> +static int __i915_sw_fence_call
> +clear_pages_work_notify(struct i915_sw_fence *fence,
> +                       enum i915_sw_fence_notify state)
> +{
> +       struct clear_pages_work *w = container_of(fence, typeof(*w), wait);
> +
> +       switch (state) {
> +       case FENCE_COMPLETE:
> +               schedule_work(&w->work);
> +               break;
> +
> +       case FENCE_FREE:
> +               dma_fence_put(&w->dma);
> +               break;
> +       }
> +
> +       return NOTIFY_DONE;
> +}
> +
> +static DEFINE_SPINLOCK(fence_lock);
> +
> +int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,

Not sold on this name. Scheduling is inherent in the name GEM, and this
takes the i915_gem_object as its primary argument. I'd favour
i915_gem_object_fill_blt, though it's really part of the mman family. To
be resolved later.

> +                                    struct intel_context *ce,
> +                                    struct sg_table *pages,
> +                                    struct i915_page_sizes *page_sizes,
> +                                    u32 value)
> +{
> +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +       struct i915_gem_context *ctx = ce->gem_context;
> +       struct i915_address_space *vm;
> +       struct clear_pages_work *work;
> +       struct i915_sleeve *sleeve;
> +       int err;
> +
> +       vm = ctx->ppgtt ? &ctx->ppgtt->vm : &i915->ggtt.vm;

Remind me, this needs to be ce->vm.

> +       sleeve = create_sleeve(vm, obj, pages, page_sizes);
> +       if (IS_ERR(sleeve))
> +               return PTR_ERR(sleeve);
> +
> +       work = kmalloc(sizeof(*work), GFP_KERNEL);
> +       if (work == NULL) {
> +               destroy_sleeve(sleeve);
> +               return -ENOMEM;
> +       }
> +
> +       work->value = value;
> +       work->sleeve = sleeve;
> +       work->ce = ce;
> +
> +       INIT_WORK(&work->work, clear_pages_worker);
> +
> +       init_irq_work(&work->irq_work, clear_pages_signal_irq_worker);
> +
> +       dma_fence_init(&work->dma,
> +                      &clear_pages_work_ops,
> +                      &fence_lock,
> +                      i915->mm.unordered_timeline,
> +                      0);
> +       i915_sw_fence_init(&work->wait, clear_pages_work_notify);
> +
> +       i915_gem_object_lock(obj);
> +       err = i915_sw_fence_await_reservation(&work->wait,
> +                                             obj->resv, NULL,
> +                                             true, I915_FENCE_TIMEOUT,
> +                                             I915_FENCE_GFP);
> +       if (err < 0) {
> +               dma_fence_set_error(&work->dma, err);
> +       } else {
> +               reservation_object_add_excl_fence(obj->resv, &work->dma);
> +               err = 0;
> +       }
> +       i915_gem_object_unlock(obj);
> +
> +       dma_fence_get(&work->dma);
> +       i915_sw_fence_commit(&work->wait);
> +
> +       return err;
> +}
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +#include "selftests/i915_gem_client_blt.c"
> +#endif
> diff --git a/drivers/gpu/drm/i915/i915_gem_client_blt.h b/drivers/gpu/drm/i915/i915_gem_client_blt.h
> new file mode 100644
> index 000000000000..a7080623e741
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_client_blt.h
> @@ -0,0 +1,21 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +#ifndef __I915_GEM_CLIENT_BLT_H__
> +#define __I915_GEM_CLIENT_BLT_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_gem_object;
> +struct intel_context ce;
> +struct i915_page_sizes;
> +struct sg_table;
> +
> +int i915_gem_schedule_fill_pages_blt(struct drm_i915_gem_object *obj,
> +                                    struct intel_context *ce,
> +                                    struct sg_table *pages,
> +                                    struct i915_page_sizes *page_sizes,
> +                                    u32 value);
> +
> +#endif
> diff --git a/drivers/gpu/drm/i915/i915_gem_object_blt.c b/drivers/gpu/drm/i915/i915_gem_object_blt.c
> new file mode 100644
> index 000000000000..3fda33e5dcf5
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_object_blt.c
> @@ -0,0 +1,103 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "i915_gem_object_blt.h"
> +
> +#include "i915_gem_clflush.h"
> +#include "intel_drv.h"
> +
> +int intel_emit_vma_fill_blt(struct i915_request *rq,
> +                           struct i915_vma *vma,
> +                           u32 value)
> +{
> +       struct intel_context *ce = rq->hw_context;
> +       u32 *cs;
> +       int err;
> +
> +       if (ce->engine->emit_init_breadcrumb) {
> +               err = ce->engine->emit_init_breadcrumb(rq);
> +               if (unlikely(err))
> +                       return err;
> +       }

Though it may push some duplication into the callers, this is the
caller's duty.

> +       cs = intel_ring_begin(rq, 8);

if (IS_ERR(cs))
	return PTR_ERR(cs);
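
Since the function returns int, the pointer-encoded error needs
converting with PTR_ERR() before it is returned. A minimal userspace
sketch of that pattern follows. The ERR_PTR/IS_ERR/PTR_ERR helpers here
are simplified re-implementations modelled on the kernel ones, and
mock_ring_begin() and emit_fill() are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace approximations of the kernel's err-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error values occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static uint32_t ring[8];

/* Hypothetical stand-in for reserving ring space; may fail with -ENOMEM. */
static uint32_t *mock_ring_begin(int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	return ring;
}

/* Returns 0 on success or a negative errno, never a pointer. */
static int emit_fill(int fail, uint32_t value)
{
	uint32_t *cs = mock_ring_begin(fail);

	if (IS_ERR(cs))
		return (int)PTR_ERR(cs);

	*cs++ = value;
	return 0;
}
```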
> +
> +       if (INTEL_GEN(rq->i915) >= 8) {
> +               *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7-2);
> +               *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
> +               *cs++ = 0;
> +               *cs++ = vma->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> +               *cs++ = lower_32_bits(vma->node.start);
> +               *cs++ = upper_32_bits(vma->node.start);
> +               *cs++ = value;
> +               *cs++ = MI_NOOP;
> +       } else {
> +               *cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6-2);
> +               *cs++ = BLT_DEPTH_32 | BLT_ROP_COLOR_COPY | PAGE_SIZE;
> +               *cs++ = 0;
> +               *cs++ = vma->size >> PAGE_SHIFT << 16 | PAGE_SIZE / 4;
> +               *cs++ = vma->node.start;
> +               *cs++ = value;
> +               *cs++ = MI_NOOP;
> +               *cs++ = MI_NOOP;
> +       }
> +
> +       intel_ring_advance(rq, cs);
> +
> +       return 0;
> +}
> +
> +int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> +                            struct intel_context *ce,
> +                            u32 value)
> +{
> +       struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +       struct i915_gem_context *ctx = ce->gem_context;
> +       struct i915_address_space *vm;
> +       struct i915_request *rq;
> +       struct i915_vma *vma;
> +       int err;
> +
> +       vm = ctx->ppgtt ? &ctx->ppgtt->vm : &i915->ggtt.vm;
> +
> +       vma = i915_vma_instance(obj, vm, NULL);
> +       if (IS_ERR(vma))
> +               return PTR_ERR(vma);
> +
> +       err = i915_vma_pin(vma, 0, 0, PIN_USER);
> +       if (unlikely(err))
> +               return err;
> +
> +       if (obj->cache_dirty)

if (obj->cache_dirty & ~obj->cache_coherent)

> +               i915_gem_clflush_object(obj, 0);
> +
> +       rq = i915_request_create(ce);
> +       if (IS_ERR(rq)) {
> +               err = PTR_ERR(rq);
> +               goto out_unpin;
> +       }
> +
> +       err = i915_request_await_object(rq, obj, true);
> +       if (unlikely(err))
> +               goto out_request;
> +
> +       err = intel_emit_vma_fill_blt(rq, vma, value);
> +       if (unlikely(err))
> +               goto out_request;
> +
> +       err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
> +out_request:
> +       if (unlikely(err))
> +               i915_request_skip(rq, err);
> +
> +       i915_request_add(rq);
> +out_unpin:
> +       i915_vma_unpin(vma);
> +       return err;

Ok.

> +}
> +
> +#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
> +#include "selftests/i915_gem_object_blt.c"
> +#endif
> diff --git a/drivers/gpu/drm/i915/i915_gem_object_blt.h b/drivers/gpu/drm/i915/i915_gem_object_blt.h
> new file mode 100644
> index 000000000000..7ec7de6ac0c0
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_gem_object_blt.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: MIT */
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#ifndef __I915_GEM_OBJECT_BLT_H__
> +#define __I915_GEM_OBJECT_BLT_H__
> +
> +#include <linux/types.h>
> +
> +struct drm_i915_gem_object;
> +struct intel_context;
> +struct i915_request;
> +struct i915_vma;
> +
> +int intel_emit_vma_fill_blt(struct i915_request *rq,
> +                           struct i915_vma *vma,
> +                           u32 value);
> +
> +int i915_gem_object_fill_blt(struct drm_i915_gem_object *obj,
> +                            struct intel_context *ce,
> +                            u32 value);
> +
> +#endif
> diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c
> new file mode 100644
> index 000000000000..54b15d22e310
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c
> @@ -0,0 +1,131 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright © 2019 Intel Corporation
> + */
> +
> +#include "../i915_selftest.h"
> +
> +#include "igt_flush_test.h"
> +#include "mock_drm.h"
> +#include "mock_context.h"
> +
> +static int igt_client_fill(void *arg)
> +{
> +       struct intel_context *ce = arg;
> +       struct drm_i915_private *i915 = ce->gem_context->i915;
> +       struct drm_i915_gem_object *obj;
> +       struct rnd_state prng;
> +       IGT_TIMEOUT(end);
> +       u32 *vaddr;
> +       int err = 0;
> +
> +       prandom_seed_state(&prng, i915_selftest.random_seed);
> +
> +       do {
> +               u32 sz = prandom_u32_state(&prng) % SZ_32M;
> +               u32 val = prandom_u32_state(&prng);
> +               u32 i;
> +
> +               sz = round_up(sz, PAGE_SIZE);
> +
> +               pr_info("%s with sz=%x, val=%x\n", __func__, sz, val);

pr_debug? Won't this be quite frequent?

> +               obj = i915_gem_object_create_internal(i915, sz);
> +               if (IS_ERR(obj)) {
> +                       err = PTR_ERR(obj);
> +                       goto err_flush;
> +               }
> +
> +               vaddr = i915_gem_object_pin_map(obj, I915_MAP_WB);
> +               if (IS_ERR(vaddr)) {
> +                       err = PTR_ERR(vaddr);
> +                       goto err_put;
> +               }
> +
> +               /*
> +                * XXX: The goal is move this to get_pages, so try to dirty the
> +                * CPU cache first to check that we do the required clflush
> +                * before scheduling the blt for !llc platforms. This matches
> +                * some version of reality where at get_pages the pages
> +                * themselves may not yet be coherent with the GPU(swap-in). If
> +                * we are missing the flush then we should see the stale cache
> +                * values after we do the set_to_cpu_domain and pick it up as a
> +                * test failure.
> +                */
> +               memset32(vaddr, val ^ 0xdeadbeaf, obj->base.size / sizeof(u32));
> +
> +               if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
> +                       obj->cache_dirty = true;
> +
> +               err = i915_gem_schedule_fill_pages_blt(obj, ce, obj->mm.pages,
> +                                                      &obj->mm.page_sizes,
> +                                                      val);
> +               if (err)
> +                       goto err_unpin;
> +
> +               /*
> +                * XXX: For now do the wait without the BKL to ensure we don't
> +                * deadlock.
> +                */
> +               err = i915_gem_object_wait(obj,
> +                                          I915_WAIT_INTERRUPTIBLE |
> +                                          I915_WAIT_ALL,
> +                                          MAX_SCHEDULE_TIMEOUT);
> +               if (err)
> +                       goto err_unpin;
> +
> +               mutex_lock(&i915->drm.struct_mutex);
> +               err = i915_gem_object_set_to_cpu_domain(obj, false);
> +               mutex_unlock(&i915->drm.struct_mutex);
> +               if (err)
> +                       goto err_unpin;
> +
> +               for (i = 0; i < obj->base.size / sizeof(u32); ++i) {
> +                       if (vaddr[i] != val) {
> +                               pr_err("vaddr[%u]=%x, expected=%x\n", i,
> +                                      vaddr[i], val);
> +                               err = -EINVAL;
> +                               goto err_unpin;
> +                       }
> +               }
> +
> +               i915_gem_object_unpin_map(obj);
> +
> +               mutex_lock(&i915->drm.struct_mutex);
> +               __i915_gem_object_release_unless_active(obj);
> +               mutex_unlock(&i915->drm.struct_mutex);
> +       } while (!time_after(jiffies, end));
> +
> +       goto err_flush;
> +
> +err_unpin:
> +       i915_gem_object_unpin_map(obj);
> +err_put:
> +       mutex_lock(&i915->drm.struct_mutex);
> +       __i915_gem_object_release_unless_active(obj);
> +       mutex_unlock(&i915->drm.struct_mutex);
> +err_flush:
> +       mutex_lock(&i915->drm.struct_mutex);
> +       igt_flush_test(i915, I915_WAIT_LOCKED);

if (igt...)
	err = -EIO;

When it fails, we are wedged, so promote the result to an -EIO.
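
As a rough sketch of the resulting exit logic, assuming the flush helper
reports failure with a nonzero return: finish_test() below is a
hypothetical userspace helper, not part of the selftest, and it keeps
the -ENOMEM filtering from the quoted hunk that follows.

```c
#include <errno.h>

/*
 * Hypothetical helper: fold the flush result into the test result.
 * A failed flush overrides whatever error was recorded before it.
 */
static int finish_test(int err, int flush_failed)
{
	if (flush_failed)
		err = -EIO;

	/* As in the quoted patch, running out of memory is not a failure. */
	if (err == -ENOMEM)
		err = 0;

	return err;
}
```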

> +       mutex_unlock(&i915->drm.struct_mutex);
> +
> +       if (err == -ENOMEM)
> +               err = 0;
> +
> +       return err;
> +}

So much work to do to reduce lock coverage so that we can emit requests
from inside obj->mm.lock. This code is very much a WIP and not near
ready for actual use, but serves a very, very useful purpose in
providing a test bed for incremental improvements.
-Chris

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2,1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
  2019-05-07 10:55 [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Matthew Auld
  2019-05-07 10:55 ` [PATCH v2 2/2] drm/i915: add in-kernel blitter client Matthew Auld
  2019-05-07 11:36 ` [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Chris Wilson
@ 2019-05-07 12:14 ` Patchwork
  2019-05-07 12:15 ` ✗ Fi.CI.SPARSE: " Patchwork
  2019-05-07 12:45 ` ✗ Fi.CI.BAT: failure " Patchwork
  4 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-05-07 12:14 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2,1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
URL   : https://patchwork.freedesktop.org/series/60364/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
85a64e026cc5 drm/i915/gtt: grab wakeref in gen6_alloc_va_range
12c7827ed80a drm/i915: add in-kernel blitter client
-:43: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#43: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:183:
+#define XY_COLOR_BLT_CMD		(2<<29 | 0x50<<22)
                         		  ^

-:43: CHECK:SPACING: spaces preferred around that '<<' (ctx:VxV)
#43: FILE: drivers/gpu/drm/i915/gt/intel_gpu_commands.h:183:
+#define XY_COLOR_BLT_CMD		(2<<29 | 0x50<<22)
                         		             ^

-:48: WARNING:FILE_PATH_CHANGES: added, moved or deleted file(s), does MAINTAINERS need updating?
#48: 
new file mode 100644

-:308: CHECK:COMPARISON_TO_NULL: Comparison to NULL could be written "!work"
#308: FILE: drivers/gpu/drm/i915/i915_gem_client_blt.c:256:
+	if (work == NULL) {

-:410: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)
#410: FILE: drivers/gpu/drm/i915/i915_gem_object_blt.c:28:
+		*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (7-2);
 		                                              ^

-:419: CHECK:SPACING: spaces preferred around that '-' (ctx:VxV)
#419: FILE: drivers/gpu/drm/i915/i915_gem_object_blt.c:37:
+		*cs++ = XY_COLOR_BLT_CMD | BLT_WRITE_RGBA | (6-2);
 		                                              ^

-:539: WARNING:LINE_SPACING: Missing a blank line after declarations
#539: FILE: drivers/gpu/drm/i915/selftests/i915_gem_client_blt.c:18:
+	struct rnd_state prng;
+	IGT_TIMEOUT(end);

-:676: WARNING:LINE_SPACING: Missing a blank line after declarations
#676: FILE: drivers/gpu/drm/i915/selftests/i915_gem_object_blt.c:18:
+	struct rnd_state prng;
+	IGT_TIMEOUT(end);

total: 0 errors, 3 warnings, 5 checks, 719 lines checked


* ✗ Fi.CI.SPARSE: warning for series starting with [v2,1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
  2019-05-07 10:55 [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Matthew Auld
                   ` (2 preceding siblings ...)
  2019-05-07 12:14 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [v2,1/2] " Patchwork
@ 2019-05-07 12:15 ` Patchwork
  2019-05-07 12:45 ` ✗ Fi.CI.BAT: failure " Patchwork
  4 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-05-07 12:15 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2,1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
URL   : https://patchwork.freedesktop.org/series/60364/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/gtt: grab wakeref in gen6_alloc_va_range
-O:drivers/gpu/drm/i915/i915_gem_gtt.c:1752:9: warning: expression using sizeof(void)
-O:drivers/gpu/drm/i915/i915_gem_gtt.c:1752:9: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_gem_gtt.c:1755:9: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/i915_gem_gtt.c:1755:9: warning: expression using sizeof(void)

Commit: drm/i915: add in-kernel blitter client
+drivers/gpu/drm/i915/i915_gem_client_blt.h:11:22: warning: symbol 'ce' was not declared. Should it be static?
+./include/linux/reservation.h:220:20: warning: dereference of noderef expression
+./include/linux/reservation.h:220:45: warning: dereference of noderef expression
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
+./include/uapi/linux/perf_event.h:147:56: warning: cast truncates bits from constant value (8000000000000000 becomes 0)
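
The 'ce' warning above appears to come from the line
"struct intel_context ce;" in i915_gem_client_blt.h, which declares a
global object named ce instead of forward-declaring the type. A minimal
sketch of the difference, using hypothetical names (context_is_set and
the one-field struct body are for illustration only):

```c
#include <stddef.h>

/* Forward declaration: names an incomplete type, defines no storage. */
struct intel_context;

/* By contrast, "struct intel_context ce;" would declare an object 'ce'. */

/* Pointers to an incomplete type are fine in prototypes and callers. */
static int context_is_set(const struct intel_context *ce)
{
	return ce != NULL;
}

/* Illustrative definition only; the real structure lives in the driver. */
struct intel_context {
	int id;
};
```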


* ✗ Fi.CI.BAT: failure for series starting with [v2,1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
  2019-05-07 10:55 [PATCH v2 1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range Matthew Auld
                   ` (3 preceding siblings ...)
  2019-05-07 12:15 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2019-05-07 12:45 ` Patchwork
  4 siblings, 0 replies; 7+ messages in thread
From: Patchwork @ 2019-05-07 12:45 UTC (permalink / raw)
  To: Matthew Auld; +Cc: intel-gfx

== Series Details ==

Series: series starting with [v2,1/2] drm/i915/gtt: grab wakeref in gen6_alloc_va_range
URL   : https://patchwork.freedesktop.org/series/60364/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_6057 -> Patchwork_12975
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_12975 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_12975, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_12975:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_flip@basic-flip-vs-modeset:
    - fi-kbl-r:           [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6057/fi-kbl-r/igt@kms_flip@basic-flip-vs-modeset.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/fi-kbl-r/igt@kms_flip@basic-flip-vs-modeset.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@gem_exec_basic@readonly-vebox:
    - {fi-cml-u}:         [PASS][3] -> [INCOMPLETE][4]
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6057/fi-cml-u/igt@gem_exec_basic@readonly-vebox.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/fi-cml-u/igt@gem_exec_basic@readonly-vebox.html

  
New tests
---------

  New tests have been introduced between CI_DRM_6057 and Patchwork_12975:

### New IGT tests (2) ###

  * igt@i915_selftest@live_blt:
    - Statuses : 40 pass(s)
    - Exec time: [0.37, 1.96] s

  * igt@i915_selftest@live_client:
    - Statuses : 40 pass(s)
    - Exec time: [0.38, 2.00] s

  

Known issues
------------

  Here are the changes found in Patchwork_12975 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@i915_selftest@live_gtt:
    - fi-glk-dsi:         [PASS][5] -> [INCOMPLETE][6] ([fdo#103359] / [k.org#198133])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6057/fi-glk-dsi/igt@i915_selftest@live_gtt.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/fi-glk-dsi/igt@i915_selftest@live_gtt.html

  
#### Possible fixes ####

  * igt@i915_selftest@live_contexts:
    - fi-skl-gvtdvm:      [DMESG-FAIL][7] ([fdo#110235]) -> [PASS][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6057/fi-skl-gvtdvm/igt@i915_selftest@live_contexts.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/fi-skl-gvtdvm/igt@i915_selftest@live_contexts.html

  
#### Warnings ####

  * igt@i915_selftest@live_hangcheck:
    - fi-apl-guc:         [DMESG-FAIL][9] ([fdo#110620]) -> [INCOMPLETE][10] ([fdo#103927] / [fdo#110624])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6057/fi-apl-guc/igt@i915_selftest@live_hangcheck.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/fi-apl-guc/igt@i915_selftest@live_hangcheck.html

  * igt@runner@aborted:
    - fi-apl-guc:         [FAIL][11] ([fdo#110622]) -> [FAIL][12] ([fdo#110624])
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6057/fi-apl-guc/igt@runner@aborted.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/fi-apl-guc/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103359]: https://bugs.freedesktop.org/show_bug.cgi?id=103359
  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#110235]: https://bugs.freedesktop.org/show_bug.cgi?id=110235
  [fdo#110620]: https://bugs.freedesktop.org/show_bug.cgi?id=110620
  [fdo#110622]: https://bugs.freedesktop.org/show_bug.cgi?id=110622
  [fdo#110624]: https://bugs.freedesktop.org/show_bug.cgi?id=110624
  [k.org#198133]: https://bugzilla.kernel.org/show_bug.cgi?id=198133


Participating hosts (51 -> 44)
------------------------------

  Additional (1): fi-icl-u2 
  Missing    (8): fi-kbl-soraka fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-icl-y fi-byt-clapper 


Build changes
-------------

  * Linux: CI_DRM_6057 -> Patchwork_12975

  CI_DRM_6057: d3aeed5c84b415c6113d04e9574e7516e401fdb7 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4972: f052e49a43cc9704ea5f240df15dd9d3dfed68ab @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_12975: 12c7827ed80a6073b923293aa3e6995adf3df94d @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

12c7827ed80a drm/i915: add in-kernel blitter client
85a64e026cc5 drm/i915/gtt: grab wakeref in gen6_alloc_va_range

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12975/
