* [PATCH 00/11] drm/scheduler dependency tracking
@ 2021-06-24 14:00 Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                   ` (10 more replies)
  0 siblings, 11 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter

Hi all,

While trying to carefully audit how all the various drivers handle the
implicit dependencies in the dma-resv object I got a bit too annoyed about
all the hand-rolling. Here's some patches to unify this at least for
drivers using the drm/scheduler.

4 out of 5 drivers are converted over (but only compile-tested). I think
amdgpu would also work:

- handle the job->sync dependencies using drm_sched_job_await*

- build up the job->sched_sync fences needed for deciding whether we need
  a full flush or not before we push the job into the scheduler, instead
  of in the ->dependency callback. This also has the benefit of removing
  a bunch of allocations from scheduler callbacks, where they're not ok
  (due to recursion into mmu notifier/shrinker on direct reclaim)

- keep the vmid_grab stuff in the ->dependency callback, for special
  things like that I've kept that as a fallback.
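
As a rough illustration of the fallback arrangement in the last bullet,
here is a userspace toy model, not kernel code. All toy_* names and the
driver_private field are made up for this sketch; only the ordering
(scheduler-tracked dependencies first, then the driver's ->dependency
callback, then NULL) follows what this series does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a fence; not struct dma_fence. */
struct toy_fence {
	int id;
};

struct toy_job {
	struct toy_fence **deps;	/* dependencies tracked by the scheduler */
	size_t num_deps;
	size_t next_dep;		/* index of the next tracked dependency */
	struct toy_fence *driver_private;	/* state for the example hook */
	/* optional driver hook, NULL if the driver has nothing special */
	struct toy_fence *(*driver_dependency)(struct toy_job *job);
};

/* Example driver hook: hands out one extra fence, then reports "done". */
struct toy_fence *toy_driver_hook(struct toy_job *job)
{
	struct toy_fence *f = job->driver_private;

	job->driver_private = NULL;
	return f;
}

/* Return the next thing the job has to wait for, or NULL if it can run. */
struct toy_fence *toy_job_next_dependency(struct toy_job *job)
{
	assert(job != NULL);

	/* Generic, scheduler-tracked dependencies are drained first. */
	if (job->next_dep < job->num_deps)
		return job->deps[job->next_dep++];

	/* Only then is the driver asked for anything special. */
	if (job->driver_dependency)
		return job->driver_dependency(job);

	return NULL;
}
```

A caller would loop on toy_job_next_dependency() until it returns NULL,
which mirrors how the scheduler keeps asking for dependencies before it
runs a job.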

There's a few complications though:

- amdgpu_sync is used both for amdgpu_job and for other things, mostly
  amdkfd, but also some bo wait functions

- amdgpu_job is used both for pushing jobs into the scheduler and for
  directly pushing a job into the hw through an ib

All not insurmountable, but a bit too much when the main goal here was
just to establish the drm_sched_job_await api.

Wrt the datastructure I picked: Since 3 out of 5 drivers used the xarray,
and that should at least be fairly storage efficient and easy to grow, I
went with that. We can bikeshed/tune the backing implementation later on.

Similarly the await_implicit implementation is as inefficient as the one
the drivers currently use, relying on dma_resv_get_fences(). This means we
copy all the fences to some temporary array first, which is entirely
unnecessary because we're holding the dma_resv lock.

All that can be tuned later on easily.

Review, comments and especially testing very much welcome.

Cheers, Daniel

Daniel Vetter (11):
  drm/sched: Split drm_sched_job_init
  drm/sched: Add dependency tracking
  drm/sched: drop entity parameter from drm_sched_push_job
  drm/panfrost: use scheduler dependency tracking
  drm/lima: use scheduler dependency tracking
  drm/v3d: Move drm_sched_job_init to v3d_job_init
  drm/v3d: Use scheduler dependency handling
  drm/etnaviv: Use scheduler dependency handling
  drm/gem: Delete gem array fencing helpers
  drm/scheduler: Don't store self-dependencies
  drm/sched: Check locking in drm_sched_job_await_implicit

 .gitignore                                   |   1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c      |   4 +-
 drivers/gpu/drm/drm_gem.c                    |  96 -------------
 drivers/gpu/drm/etnaviv/etnaviv_gem.h        |   5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c |  32 ++---
 drivers/gpu/drm/etnaviv/etnaviv_sched.c      |  63 +--------
 drivers/gpu/drm/etnaviv/etnaviv_sched.h      |   3 +-
 drivers/gpu/drm/lima/lima_gem.c              |   7 +-
 drivers/gpu/drm/lima/lima_sched.c            |  28 +---
 drivers/gpu/drm/lima/lima_sched.h            |   6 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c      |  14 +-
 drivers/gpu/drm/panfrost/panfrost_job.c      |  39 +-----
 drivers/gpu/drm/panfrost/panfrost_job.h      |   5 +-
 drivers/gpu/drm/scheduler/sched_entity.c     |  30 +++--
 drivers/gpu/drm/scheduler/sched_fence.c      |  15 ++-
 drivers/gpu/drm/scheduler/sched_main.c       | 135 ++++++++++++++++++-
 drivers/gpu/drm/v3d/v3d_drv.h                |   5 -
 drivers/gpu/drm/v3d/v3d_gem.c                |  91 ++++---------
 drivers/gpu/drm/v3d/v3d_sched.c              |  29 +---
 include/drm/drm_gem.h                        |   5 -
 include/drm/gpu_scheduler.h                  |  40 +++++-
 22 files changed, 282 insertions(+), 375 deletions(-)

-- 
2.32.0.rc2



* [PATCH 01/11] drm/sched: Split drm_sched_job_init
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
@ 2021-06-24 14:00   ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Lucas Stach, Russell King,
	Christian Gmeiner, Qiang Yu, Rob Herring, Tomeu Vizoso,
	Steven Price, Alyssa Rosenzweig, David Airlie, Daniel Vetter,
	Sumit Semwal, Christian König, Masahiro Yamada, Kees Cook,
	Adam Borowski, Nick Terrell, Mauro Carvalho Chehab, Paul Menzel,
	Sami Tolvanen, Viresh Kumar, Alex Deucher, Dave Airlie,
	Nirmoy Das, Deepak R Varma, Lee Jones, Kevin Wang, Chen Li,
	Luben Tuikov, Marek Olšák, Dennis Li,
	Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, etnaviv, lima,
	linux-media, linaro-mm-sig

This is a very confusingly named function, because it does not just
init an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that were a bit
clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.
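
To make the two-phase idea concrete, here is a small userspace toy
model, not the kernel implementation. Every toy_* name is hypothetical.
It only shows the property the split is after: the init step does the
allocation that can fail, while the arm step cannot fail and is the
point where the fence sequence number, and with it the ordering, gets
fixed.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy userspace stand-ins; none of these are the kernel structures. */
struct toy_entity {
	unsigned int fence_seq;	/* last sequence number handed out */
};

struct toy_fence {
	unsigned int seqno;	/* 0 means "not armed yet" */
};

struct toy_job {
	struct toy_entity *entity;
	struct toy_fence *fence;
};

/* Phase 1: do everything that can fail; no ordering is fixed yet. */
int toy_job_init(struct toy_job *job, struct toy_entity *entity)
{
	job->fence = calloc(1, sizeof(*job->fence));
	if (!job->fence)
		return -1;
	job->entity = entity;
	return 0;
}

/* Phase 2: cannot fail; takes the next sequence number. */
void toy_job_arm(struct toy_job *job)
{
	/* Arming requires a successful init and must happen only once. */
	assert(job->fence != NULL && job->fence->seqno == 0);
	job->fence->seqno = ++job->entity->fence_seq;
}

/* Release what toy_job_init() allocated. */
void toy_job_cleanup(struct toy_job *job)
{
	free(job->fence);
	job->fence = NULL;
}
```

In this model two jobs can be initialized in any order, collect their
dependencies, and only get their place in the sequence when they are
armed, which is why arming and pushing belong under a common lock.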

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 .gitignore                               |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
 drivers/gpu/drm/lima/lima_sched.c        |  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
 drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
 drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
 drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
 include/drm/gpu_scheduler.h              |  6 +++++-
 10 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/.gitignore b/.gitignore
index 7afd412dadd2..52433a930299 100644
--- a/.gitignore
+++ b/.gitignore
@@ -66,6 +66,7 @@ modules.order
 /modules.builtin
 /modules.builtin.modinfo
 /modules.nsdeps
+*.builtin
 
 #
 # RPM spec file (make rpm-pkg)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	if (r)
 		goto error_unlock;
 
+	drm_sched_job_arm(&job->base);
+
 	/* No memory allocation is allowed while holding the notifier lock.
 	 * The lock is held until amdgpu_cs_submit is finished and fence is
 	 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 	if (r)
 		return r;
 
+	drm_sched_job_arm(&job->base);
+
 	*f = dma_fence_get(&job->base.s_fence->finished);
 	amdgpu_job_free_resources(job);
 	drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 19826e504efc..af1671f01c7f 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	if (ret)
 		goto out_unlock;
 
+	drm_sched_job_arm(&submit->sched_job);
+
 	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
 	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
 						submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ecf3267334ff..bd1af1fd8c0f 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
 		return err;
 	}
 
+	drm_sched_job_arm(&task->base);
+
 	task->num_bos = num_bos;
 	task->vm = lima_vm_get(vm);
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index beb62c8fc851..1e950534b9b0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
 		goto unlock;
 	}
 
+	drm_sched_job_arm(&job->base);
+
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 79554aa4dbb1..f7347c284886 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  * @sched_job: job to submit
  * @entity: scheduler entity
  *
- * Note: To guarantee that the order of insertion to queue matches
- * the job's fence sequence number this function should be
- * called with drm_sched_job_init under common lock.
+ * Note: To guarantee that the order of insertion to queue matches the job's
+ * fence sequence number this function should be called with drm_sched_job_arm()
+ * under common lock.
  *
  * Returns 0 for success, negative error code otherwise.
  */
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 69de2c76731f..0ba810c198bd 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 }
 EXPORT_SYMBOL(to_drm_sched_fence);
 
-struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
-					       void *owner)
+struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
+					      void *owner)
 {
 	struct drm_sched_fence *fence = NULL;
-	unsigned seq;
 
 	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
 	if (fence == NULL)
@@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
 	fence->sched = entity->rq->sched;
 	spin_lock_init(&fence->lock);
 
+	return fence;
+}
+
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity)
+{
+	unsigned seq;
+
 	seq = atomic_inc_return(&entity->fence_seq);
 	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
 		       &fence->lock, entity->fence_context, seq);
 	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
 		       &fence->lock, entity->fence_context + 1, seq);
-
-	return fence;
 }
 
 module_init(drm_sched_fence_slab_init);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 61420a9c1021..70eefed17e06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -48,9 +48,11 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/completion.h>
+#include <linux/dma-resv.h>
 #include <uapi/linux/sched/types.h>
 
 #include <drm/drm_print.h>
+#include <drm/drm_gem.h>
 #include <drm/gpu_scheduler.h>
 #include <drm/spsc_queue.h>
 
@@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 	job->sched = sched;
 	job->entity = entity;
 	job->s_priority = entity->rq - sched->sched_rq;
-	job->s_fence = drm_sched_fence_create(entity, owner);
+	job->s_fence = drm_sched_fence_alloc(entity, owner);
 	if (!job->s_fence)
 		return -ENOMEM;
 	job->id = atomic64_inc_return(&sched->job_id_count);
@@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
 }
 EXPORT_SYMBOL(drm_sched_job_init);
 
+/**
+ * drm_sched_job_arm - arm a scheduler job for execution
+ * @job: scheduler job to arm
+ *
+ * This arms a scheduler job for execution. Specifically it initializes the
+ * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
+ * or other places that need to track the completion of this job.
+ *
+ * Refer to drm_sched_entity_push_job() documentation for locking
+ * considerations.
+ *
+ * This can only be called if drm_sched_job_init() succeeded.
+ */
+void drm_sched_job_arm(struct drm_sched_job *job)
+{
+	drm_sched_fence_init(job->s_fence, job->entity);
+}
+EXPORT_SYMBOL(drm_sched_job_arm);
+
 /**
  * drm_sched_job_cleanup - clean up scheduler job resources
  *
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d18af49fd009..80438d126c9d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
 		       void *owner);
+void drm_sched_job_arm(struct drm_sched_job *job);
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
                                    unsigned int num_sched_list);
@@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 				   enum drm_sched_priority priority);
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 
-struct drm_sched_fence *drm_sched_fence_create(
+struct drm_sched_fence *drm_sched_fence_alloc(
 	struct drm_sched_entity *s_entity, void *owner);
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity);
+
 void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
 void drm_sched_fence_finished(struct drm_sched_fence *fence);
 
-- 
2.32.0.rc2



* [PATCH 02/11] drm/sched: Add dependency tracking
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
@ 2021-06-24 14:00   ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, David Airlie, Daniel Vetter,
	Sumit Semwal, Christian König, Andrey Grodzovsky, Lee Jones,
	Nirmoy Das, Boris Brezillon, Luben Tuikov, Alex Deucher,
	Jack Zhang, linux-media, linaro-mm-sig

Instead of just a callback we can just glue in the gem helpers that
panfrost, v3d and lima currently use. There's really not that many
ways to skin this cat.

On the naming bikeshed: The idea for using _await_ to denote adding
dependencies to a job comes from i915, where that's used quite
extensively all over the place, in lots of datastructures.
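
For readers who want the core of the helper without the xarray details,
here is a userspace toy model of the per-context deduplication. It is a
sketch under assumptions: the toy_* names are invented, a fixed-size
array stands in for the xarray, there is no reference counting, and a
plain seqno comparison stands in for dma_fence_is_later().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEPS 16

/* Toy stand-in for a fence: a timeline (context) and a point on it. */
struct toy_fence {
	unsigned long context;
	unsigned long seqno;
};

struct toy_deps {
	const struct toy_fence *slot[TOY_MAX_DEPS];
	size_t count;
};

/*
 * Add a dependency, keeping at most one fence per context: waiting for
 * the later fence on a timeline implies the earlier one has signaled.
 * Returns 0 on success, -1 if the toy array is full.
 */
int toy_deps_await(struct toy_deps *deps, const struct toy_fence *fence)
{
	size_t i;

	assert(deps != NULL);

	if (!fence)
		return 0;

	for (i = 0; i < deps->count; i++) {
		if (deps->slot[i]->context != fence->context)
			continue;

		/* Same timeline: keep only the later of the two fences. */
		if (fence->seqno > deps->slot[i]->seqno)
			deps->slot[i] = fence;
		return 0;
	}

	if (deps->count == TOY_MAX_DEPS)
		return -1;

	deps->slot[deps->count++] = fence;
	return 0;
}
```

With this shape the number of stored dependencies grows with the number
of distinct contexts (roughly, engines) rather than with the number of
buffers a job touches.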

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
 drivers/gpu/drm/scheduler/sched_main.c   | 103 +++++++++++++++++++++++
 include/drm/gpu_scheduler.h              |  31 ++++++-
 3 files changed, 146 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index f7347c284886..b6f72fafd504 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
 	job->sched->ops->free_job(job);
 }
 
+static struct dma_fence *
+drm_sched_job_dependency(struct drm_sched_job *job,
+			 struct drm_sched_entity *entity)
+{
+	if (!xa_empty(&job->dependencies))
+		return xa_erase(&job->dependencies, job->last_dependency++);
+
+	if (job->sched->ops->dependency)
+		return job->sched->ops->dependency(job, entity);
+
+	return NULL;
+}
+
 /**
  * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
  *
@@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
 		struct drm_sched_fence *s_fence = job->s_fence;
 
 		/* Wait for all dependencies to avoid data corruptions */
-		while ((f = job->sched->ops->dependency(job, entity)))
+		while ((f = drm_sched_job_dependency(job, entity)))
 			dma_fence_wait(f, false);
 
 		drm_sched_fence_scheduled(s_fence);
@@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
  */
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 {
-	struct drm_gpu_scheduler *sched = entity->rq->sched;
 	struct drm_sched_job *sched_job;
 
 	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
@@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
 		return NULL;
 
 	while ((entity->dependency =
-			sched->ops->dependency(sched_job, entity))) {
+			drm_sched_job_dependency(sched_job, entity))) {
 		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
 
 		if (drm_sched_entity_add_dependency_cb(entity))
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 70eefed17e06..370c336d383f 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
 
 	INIT_LIST_HEAD(&job->list);
 
+	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
+
 	return 0;
 }
 EXPORT_SYMBOL(drm_sched_job_init);
@@ -626,6 +628,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
 }
 EXPORT_SYMBOL(drm_sched_job_arm);
 
+/**
+ * drm_sched_job_await_fence - adds the fence as a job dependency
+ * @job: scheduler job to add the dependencies to
+ * @fence: the dma_fence to add to the list of dependencies.
+ *
+ * Note that @fence is consumed in both the success and error cases.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_await_fence(struct drm_sched_job *job,
+			      struct dma_fence *fence)
+{
+	struct dma_fence *entry;
+	unsigned long index;
+	u32 id = 0;
+	int ret;
+
+	if (!fence)
+		return 0;
+
+	/* Deduplicate if we already depend on a fence from the same context.
+	 * This lets the size of the array of deps scale with the number of
+	 * engines involved, rather than the number of BOs.
+	 */
+	xa_for_each(&job->dependencies, index, entry) {
+		if (entry->context != fence->context)
+			continue;
+
+		if (dma_fence_is_later(fence, entry)) {
+			dma_fence_put(entry);
+			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
+		} else {
+			dma_fence_put(fence);
+		}
+		return 0;
+	}
+
+	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
+	if (ret != 0)
+		dma_fence_put(fence);
+
+	return ret;
+}
+EXPORT_SYMBOL(drm_sched_job_await_fence);
+
+/**
+ * drm_sched_job_await_implicit - adds implicit dependencies as job dependencies
+ * @job: scheduler job to add the dependencies to
+ * @obj: the gem object to add new dependencies from.
+ * @write: whether the job might write the object (so we need to depend on
+ * shared fences in the reservation object).
+ *
+ * This should be called after drm_gem_lock_reservations() on your array of
+ * GEM objects used in the job but before updating the reservations with your
+ * own fences.
+ *
+ * Returns:
+ * 0 on success, or an error on failing to expand the array.
+ */
+int drm_sched_job_await_implicit(struct drm_sched_job *job,
+				 struct drm_gem_object *obj,
+				 bool write)
+{
+	int ret;
+	struct dma_fence **fences;
+	unsigned int i, fence_count;
+
+	if (!write) {
+		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
+
+		return drm_sched_job_await_fence(job, fence);
+	}
+
+	ret = dma_resv_get_fences(obj->resv, NULL, &fence_count, &fences);
+	if (ret || !fence_count)
+		return ret;
+
+	for (i = 0; i < fence_count; i++) {
+		ret = drm_sched_job_await_fence(job, fences[i]);
+		if (ret)
+			break;
+	}
+
+	for (; i < fence_count; i++)
+		dma_fence_put(fences[i]);
+	kfree(fences);
+	return ret;
+}
+EXPORT_SYMBOL(drm_sched_job_await_implicit);
+
+
 /**
  * drm_sched_job_cleanup - clean up scheduler job resources
  *
@@ -633,8 +727,17 @@ EXPORT_SYMBOL(drm_sched_job_arm);
  */
 void drm_sched_job_cleanup(struct drm_sched_job *job)
 {
+	struct dma_fence *fence;
+	unsigned long index;
+
 	dma_fence_put(&job->s_fence->finished);
 	job->s_fence = NULL;
+
+	xa_for_each(&job->dependencies, index, fence) {
+		dma_fence_put(fence);
+	}
+	xa_destroy(&job->dependencies);
+
 }
 EXPORT_SYMBOL(drm_sched_job_cleanup);
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 80438d126c9d..e4d7e1496296 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -27,9 +27,12 @@
 #include <drm/spsc_queue.h>
 #include <linux/dma-fence.h>
 #include <linux/completion.h>
+#include <linux/xarray.h>
 
 #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
 
+struct drm_gem_object;
+
 struct drm_gpu_scheduler;
 struct drm_sched_rq;
 
@@ -198,6 +201,16 @@ struct drm_sched_job {
 	enum drm_sched_priority		s_priority;
 	struct drm_sched_entity         *entity;
 	struct dma_fence_cb		cb;
+	/**
+	 * @dependencies:
+	 *
+	 * Contains the dependencies as struct dma_fence for this job, see
+	 * drm_sched_job_await_fence() and drm_sched_job_await_implicit().
+	 */
+	struct xarray			dependencies;
+
+	/** @last_dependency: tracks @dependencies as they signal */
+	unsigned long			last_dependency;
 };
 
 static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
@@ -220,9 +233,14 @@ enum drm_gpu_sched_stat {
  */
 struct drm_sched_backend_ops {
 	/**
-         * @dependency: Called when the scheduler is considering scheduling
-         * this job next, to get another struct dma_fence for this job to
-	 * block on.  Once it returns NULL, run_job() may be called.
+	 * @dependency:
+	 *
+	 * Called when the scheduler is considering scheduling this job next, to
+	 * get another struct dma_fence for this job to block on.  Once it
+	 * returns NULL, run_job() may be called.
+	 *
+	 * If a driver exclusively uses drm_sched_job_await_fence() and
+	 * drm_sched_job_await_implicit() this can be omitted and left as NULL.
 	 */
 	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
 					struct drm_sched_entity *s_entity);
@@ -314,6 +332,13 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
 		       void *owner);
 void drm_sched_job_arm(struct drm_sched_job *job);
+int drm_sched_job_await_fence(struct drm_sched_job *job,
+			      struct dma_fence *fence);
+int drm_sched_job_await_implicit(struct drm_sched_job *job,
+				 struct drm_gem_object *obj,
+				 bool write);
+
+
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
                                    unsigned int num_sched_list);
-- 
2.32.0.rc2



* [PATCH 03/11] drm/sched: drop entity parameter from drm_sched_push_job
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
@ 2021-06-24 14:00   ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Lucas Stach, Russell King,
	Christian Gmeiner, Qiang Yu, Rob Herring, Tomeu Vizoso,
	Steven Price, Alyssa Rosenzweig, Emma Anholt, David Airlie,
	Daniel Vetter, Sumit Semwal, Christian König, Alex Deucher,
	Nirmoy Das, Dave Airlie, Chen Li, Lee Jones, Deepak R Varma,
	Kevin Wang, Luben Tuikov, Marek Olšák,
	Maarten Lankhorst, Andrey Grodzovsky, Dennis Li, Boris Brezillon,
	etnaviv, lima, linux-media, linaro-mm-sig

Originally a job was only bound to its entity's queue when it was pushed,
but now that's done in drm_sched_job_init(), making the entity parameter
entirely redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task(), so simplify that too.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
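For readers skimming the diff, here is a minimal userspace sketch of the
pattern this patch relies on: the job records its entity at init time, so
the push function can look it up instead of taking it as a parameter. The
struct and function names below are hypothetical stand-ins and not the
kernel definitions; only the shape of the call is meant to match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct entity {
	int queued;		/* number of jobs pushed so far */
};

struct job {
	struct entity *entity;	/* bound once, at init time */
};

/* Models drm_sched_job_init(): the job remembers its entity here. */
static int job_init(struct job *job, struct entity *entity)
{
	if (!entity)
		return -1;
	job->entity = entity;
	return 0;
}

/* Models the new drm_sched_entity_push_job(): no entity argument. */
static void entity_push_job(struct job *job)
{
	struct entity *entity = job->entity;

	entity->queued++;
}
```

A caller would do job_init(&job, &entity) once and later just
entity_push_job(&job), so there is no second place where a mismatched
entity could be passed in.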
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
 drivers/gpu/drm/lima/lima_gem.c          | 3 +--
 drivers/gpu/drm/lima/lima_sched.c        | 5 ++---
 drivers/gpu/drm/lima/lima_sched.h        | 3 +--
 drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
 drivers/gpu/drm/scheduler/sched_entity.c | 6 ++----
 drivers/gpu/drm/v3d/v3d_gem.c            | 2 +-
 include/drm/gpu_scheduler.h              | 3 +--
 10 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a4ec092af9a7..18f63567fb69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
 	trace_amdgpu_cs_ioctl(job);
 	amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
-	drm_sched_entity_push_job(&job->base, entity);
+	drm_sched_entity_push_job(&job->base);
 
 	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 5ddb955d2315..b8609cccc9c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 
 	*f = dma_fence_get(&job->base.s_fence->finished);
 	amdgpu_job_free_resources(job);
-	drm_sched_entity_push_job(&job->base, entity);
+	drm_sched_entity_push_job(&job->base);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index af1671f01c7f..77995f190790 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	/* the scheduler holds on to the job now */
 	kref_get(&submit->refcount);
 
-	drm_sched_entity_push_job(&submit->sched_job, sched_entity);
+	drm_sched_entity_push_job(&submit->sched_job);
 
 out_unlock:
 	mutex_unlock(&submit->gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
 			goto err_out2;
 	}
 
-	fence = lima_sched_context_queue_task(
-		submit->ctx->context + submit->pipe, submit->task);
+	fence = lima_sched_context_queue_task(submit->task);
 
 	for (i = 0; i < submit->nr_bos; i++) {
 		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index bd1af1fd8c0f..de7e71c42a69 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 	drm_sched_entity_fini(&context->base);
 }
 
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
-						struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 {
 	struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
 
 	trace_lima_task_submit(task);
-	drm_sched_entity_push_job(&task->base, &context->base);
+	drm_sched_entity_push_job(&task->base);
 	return fence;
 }
 
diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h
index 90f03c48ef4a..ac70006b0e26 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
 			    atomic_t *guilty);
 void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 			     struct lima_sched_context *context);
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
-						struct lima_sched_task *task);
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task);
 
 int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name);
 void lima_sched_pipe_fini(struct lima_sched_pipe *pipe);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 1e950534b9b0..2d01a670a4e8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -257,7 +257,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
 	kref_get(&job->refcount); /* put by scheduler job completion */
 
-	drm_sched_entity_push_job(&job->base, entity);
+	drm_sched_entity_push_job(&job->base);
 
 	mutex_unlock(&pfdev->sched_lock);
 
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index b6f72fafd504..2ab1b9e648f2 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -493,9 +493,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
 
 /**
  * drm_sched_entity_push_job - Submit a job to the entity's job queue
- *
  * @sched_job: job to submit
- * @entity: scheduler entity
  *
  * Note: To guarantee that the order of insertion to queue matches the job's
  * fence sequence number this function should be called with drm_sched_job_arm()
@@ -503,9 +501,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  *
  * Returns 0 for success, negative error code otherwise.
  */
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
-			       struct drm_sched_entity *entity)
+void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 {
+	struct drm_sched_entity *entity = sched_job->entity;
 	bool first;
 
 	trace_drm_sched_job(sched_job, entity);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 4eb354226972..ac608eb9b594 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -480,7 +480,7 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
 	/* put by scheduler job completion */
 	kref_get(&job->refcount);
 
-	drm_sched_entity_push_job(&job->base, &v3d_priv->sched_entity[queue]);
+	drm_sched_entity_push_job(&job->base);
 
 	return 0;
 }
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e4d7e1496296..55e0acf8015a 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -372,8 +372,7 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity);
 void drm_sched_entity_destroy(struct drm_sched_entity *entity);
 void drm_sched_entity_select_rq(struct drm_sched_entity *entity);
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity);
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
-			       struct drm_sched_entity *entity);
+void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
 void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 				   enum drm_sched_priority priority);
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
-- 
2.32.0.rc2



* [PATCH 03/11] drm/sched: drop entity parameter from drm_sched_push_job
@ 2021-06-24 14:00   ` Daniel Vetter
  0 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Emma Anholt, David Airlie, Daniel Vetter, Nirmoy Das,
	Daniel Vetter, Lee Jones, lima, Steven Price, Luben Tuikov,
	Alyssa Rosenzweig, Russell King, Dave Airlie, Dennis Li, Chen Li,
	Marek Olšák, Kevin Wang, etnaviv, linaro-mm-sig,
	Deepak R Varma, Tomeu Vizoso, Boris Brezillon, Qiang Yu,
	Alex Deucher, linux-media, Christian König

Originally a job was only bound to the queue when we pushed this, but
now that's done in drm_sched_job_init, making that parameter entirely
redundant.

Remove it.

The same applies to the context parameter in
lima_sched_context_queue_task, simplify that too.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
 drivers/gpu/drm/lima/lima_gem.c          | 3 +--
 drivers/gpu/drm/lima/lima_sched.c        | 5 ++---
 drivers/gpu/drm/lima/lima_sched.h        | 3 +--
 drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
 drivers/gpu/drm/scheduler/sched_entity.c | 6 ++----
 drivers/gpu/drm/v3d/v3d_gem.c            | 2 +-
 include/drm/gpu_scheduler.h              | 3 +--
 10 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a4ec092af9a7..18f63567fb69 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 
 	trace_amdgpu_cs_ioctl(job);
 	amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
-	drm_sched_entity_push_job(&job->base, entity);
+	drm_sched_entity_push_job(&job->base);
 
 	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 5ddb955d2315..b8609cccc9c1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 
 	*f = dma_fence_get(&job->base.s_fence->finished);
 	amdgpu_job_free_resources(job);
-	drm_sched_entity_push_job(&job->base, entity);
+	drm_sched_entity_push_job(&job->base);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index af1671f01c7f..77995f190790 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	/* the scheduler holds on to the job now */
 	kref_get(&submit->refcount);
 
-	drm_sched_entity_push_job(&submit->sched_job, sched_entity);
+	drm_sched_entity_push_job(&submit->sched_job);
 
 out_unlock:
 	mutex_unlock(&submit->gpu->fence_lock);
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index de62966243cd..c528f40981bb 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
 			goto err_out2;
 	}
 
-	fence = lima_sched_context_queue_task(
-		submit->ctx->context + submit->pipe, submit->task);
+	fence = lima_sched_context_queue_task(submit->task);
 
 	for (i = 0; i < submit->nr_bos; i++) {
 		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index bd1af1fd8c0f..de7e71c42a69 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 	drm_sched_entity_fini(&context->base);
 }
 
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
-						struct lima_sched_task *task)
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 {
 	struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
 
 	trace_lima_task_submit(task);
-	drm_sched_entity_push_job(&task->base, &context->base);
+	drm_sched_entity_push_job(&task->base);
 	return fence;
 }
 
diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h
index 90f03c48ef4a..ac70006b0e26 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
 			    atomic_t *guilty);
 void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 			     struct lima_sched_context *context);
-struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
-						struct lima_sched_task *task);
+struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task);
 
 int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name);
 void lima_sched_pipe_fini(struct lima_sched_pipe *pipe);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 1e950534b9b0..2d01a670a4e8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -257,7 +257,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
 	kref_get(&job->refcount); /* put by scheduler job completion */
 
-	drm_sched_entity_push_job(&job->base, entity);
+	drm_sched_entity_push_job(&job->base);
 
 	mutex_unlock(&pfdev->sched_lock);
 
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index b6f72fafd504..2ab1b9e648f2 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -493,9 +493,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
 
 /**
  * drm_sched_entity_push_job - Submit a job to the entity's job queue
- *
  * @sched_job: job to submit
- * @entity: scheduler entity
  *
  * Note: To guarantee that the order of insertion to queue matches the job's
  * fence sequence number this function should be called with drm_sched_job_arm()
@@ -503,9 +501,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  *
  * Returns 0 for success, negative error code otherwise.
  */
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
-			       struct drm_sched_entity *entity)
+void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 {
+	struct drm_sched_entity *entity = sched_job->entity;
 	bool first;
 
 	trace_drm_sched_job(sched_job, entity);
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 4eb354226972..ac608eb9b594 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -480,7 +480,7 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
 	/* put by scheduler job completion */
 	kref_get(&job->refcount);
 
-	drm_sched_entity_push_job(&job->base, &v3d_priv->sched_entity[queue]);
+	drm_sched_entity_push_job(&job->base);
 
 	return 0;
 }
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index e4d7e1496296..55e0acf8015a 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -372,8 +372,7 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity);
 void drm_sched_entity_destroy(struct drm_sched_entity *entity);
 void drm_sched_entity_select_rq(struct drm_sched_entity *entity);
 struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity);
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
-			       struct drm_sched_entity *entity);
+void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
 void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 				   enum drm_sched_priority priority);
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
-- 
2.32.0.rc2



* [PATCH 04/11] drm/panfrost: use scheduler dependency tracking
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
@ 2021-06-24 14:00   ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Rob Herring, Tomeu Vizoso,
	Steven Price, Alyssa Rosenzweig, Sumit Semwal,
	Christian König, linux-media, linaro-mm-sig

This mostly just deletes code that now lives in the shared scheduler
helpers.

Note that thanks to the split into drm_sched_job_init/arm we can now
easily pull the _init() part out from under the submission lock and
move it way ahead, to where we add the sync file in-fences as
dependencies.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 14 +++++++---
 drivers/gpu/drm/panfrost/panfrost_job.c | 37 +++----------------------
 drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
 3 files changed, 15 insertions(+), 41 deletions(-)
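
Reader note on the init/arm ordering described in the commit message
above. This is a minimal userspace sketch, not kernel code: every
mock_* name is made up for illustration, the submission lock is
modelled as a plain flag, and the implicit-fence step is left out. It
only shows why the fallible init can run before the lock is taken
while the infallible arm stays under it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct mock_entity {
	unsigned long last_seqno;	/* last sequence number handed out */
};

struct mock_job {
	struct mock_entity *entity;	/* recorded by init, used by arm */
	unsigned long seqno;		/* 0 until the job is armed */
	int num_deps;			/* dependencies recorded so far */
};

/* Models the submission lock as a flag so misuse trips an assert. */
static int mock_lock_held;

/* Fallible step: runs early, with no lock held. */
static int mock_job_init(struct mock_job *job, struct mock_entity *entity)
{
	assert(!mock_lock_held);
	if (!entity)
		return -ENOENT;	/* error code chosen for the mock only */
	job->entity = entity;
	job->seqno = 0;
	job->num_deps = 0;
	return 0;
}

/* Records one dependency; only needs an initialised job. */
static int mock_job_await_fence(struct mock_job *job)
{
	job->num_deps++;
	return 0;
}

/* Infallible step: picks the sequence number under the lock. */
static void mock_job_arm(struct mock_job *job)
{
	assert(mock_lock_held);
	job->seqno = ++job->entity->last_seqno;
}

/* Order sketched here: init, in-fences, lock, arm, unlock. */
static int mock_submit(struct mock_job *job, struct mock_entity *entity)
{
	int ret;

	ret = mock_job_init(job, entity);
	if (ret)
		return ret;	/* nothing is locked yet, nothing to undo */

	ret = mock_job_await_fence(job);
	if (ret)
		return ret;

	mock_lock_held = 1;
	mock_job_arm(job);
	mock_lock_held = 0;
	return 0;
}
```

Calling mock_submit() twice on the same entity should hand out
increasing sequence numbers, and a failing init returns before the
lock flag is ever set, which is the error-path simplification the
commit message is pointing at.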

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 1ffaef5ec5ff..79904f55c19f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
 		if (ret)
 			goto fail;
 
-		ret = drm_gem_fence_array_add(&job->deps, fence);
+		ret = drm_sched_job_await_fence(&job->base, fence);
 
 		if (ret)
 			goto fail;
@@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 	struct drm_panfrost_submit *args = data;
 	struct drm_syncobj *sync_out = NULL;
 	struct panfrost_job *job;
-	int ret = 0;
+	int ret = 0, slot;
 
 	if (!args->jc)
 		return -EINVAL;
@@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 
 	kref_init(&job->refcount);
 
-	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
-
 	job->pfdev = pfdev;
 	job->jc = args->jc;
 	job->requirements = args->requirements;
 	job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
 	job->file_priv = file->driver_priv;
 
+	slot = panfrost_job_get_slot(job);
+
+	ret = drm_sched_job_init(&job->base,
+				 &job->file_priv->sched_entity[slot],
+				 NULL);
+	if (ret)
+		goto fail_job;
+
 	ret = panfrost_copy_in_sync(dev, file, args, job);
 	if (ret)
 		goto fail_job;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 2d01a670a4e8..d097e52f8caa 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -109,7 +109,7 @@ static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, in
 	return &fence->base;
 }
 
-static int panfrost_job_get_slot(struct panfrost_job *job)
+int panfrost_job_get_slot(struct panfrost_job *job)
 {
 	/* JS0: fragment jobs.
 	 * JS1: vertex/tiler jobs
@@ -198,13 +198,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 
 static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
 					  int bo_count,
-					  struct xarray *deps)
+					  struct drm_sched_job *job)
 {
 	int i, ret;
 
 	for (i = 0; i < bo_count; i++) {
 		/* panfrost always uses write mode in its current uapi */
-		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
+		ret = drm_sched_job_await_implicit(job, bos[i], true);
 		if (ret)
 			return ret;
 	}
@@ -225,31 +225,21 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
 int panfrost_job_push(struct panfrost_job *job)
 {
 	struct panfrost_device *pfdev = job->pfdev;
-	int slot = panfrost_job_get_slot(job);
-	struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot];
 	struct ww_acquire_ctx acquire_ctx;
 	int ret = 0;
 
-
 	ret = drm_gem_lock_reservations(job->bos, job->bo_count,
 					    &acquire_ctx);
 	if (ret)
 		return ret;
 
 	mutex_lock(&pfdev->sched_lock);
-
-	ret = drm_sched_job_init(&job->base, entity, NULL);
-	if (ret) {
-		mutex_unlock(&pfdev->sched_lock);
-		goto unlock;
-	}
-
 	drm_sched_job_arm(&job->base);
 
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
-					     &job->deps);
+					     &job->base);
 	if (ret) {
 		mutex_unlock(&pfdev->sched_lock);
 		goto unlock;
@@ -274,15 +264,8 @@ static void panfrost_job_cleanup(struct kref *ref)
 {
 	struct panfrost_job *job = container_of(ref, struct panfrost_job,
 						refcount);
-	struct dma_fence *fence;
-	unsigned long index;
 	unsigned int i;
 
-	xa_for_each(&job->deps, index, fence) {
-		dma_fence_put(fence);
-	}
-	xa_destroy(&job->deps);
-
 	dma_fence_put(job->done_fence);
 	dma_fence_put(job->render_done_fence);
 
@@ -321,17 +304,6 @@ static void panfrost_job_free(struct drm_sched_job *sched_job)
 	panfrost_job_put(job);
 }
 
-static struct dma_fence *panfrost_job_dependency(struct drm_sched_job *sched_job,
-						 struct drm_sched_entity *s_entity)
-{
-	struct panfrost_job *job = to_panfrost_job(sched_job);
-
-	if (!xa_empty(&job->deps))
-		return xa_erase(&job->deps, job->last_dep++);
-
-	return NULL;
-}
-
 static struct dma_fence *panfrost_job_run(struct drm_sched_job *sched_job)
 {
 	struct panfrost_job *job = to_panfrost_job(sched_job);
@@ -457,7 +429,6 @@ static enum drm_gpu_sched_stat panfrost_job_timedout(struct drm_sched_job
 }
 
 static const struct drm_sched_backend_ops panfrost_sched_ops = {
-	.dependency = panfrost_job_dependency,
 	.run_job = panfrost_job_run,
 	.timedout_job = panfrost_job_timedout,
 	.free_job = panfrost_job_free
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
index 82306a03b57e..77e6d0e6f612 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -19,10 +19,6 @@ struct panfrost_job {
 	struct panfrost_device *pfdev;
 	struct panfrost_file_priv *file_priv;
 
-	/* Contains both explicit and implicit fences */
-	struct xarray deps;
-	unsigned long last_dep;
-
 	/* Fence to be signaled by IRQ handler when the job is complete. */
 	struct dma_fence *done_fence;
 
@@ -42,6 +38,7 @@ int panfrost_job_init(struct panfrost_device *pfdev);
 void panfrost_job_fini(struct panfrost_device *pfdev);
 int panfrost_job_open(struct panfrost_file_priv *panfrost_priv);
 void panfrost_job_close(struct panfrost_file_priv *panfrost_priv);
+int panfrost_job_get_slot(struct panfrost_job *job);
 int panfrost_job_push(struct panfrost_job *job);
 void panfrost_job_put(struct panfrost_job *job);
 void panfrost_job_enable_interrupts(struct panfrost_device *pfdev);
-- 
2.32.0.rc2



* [PATCH 05/11] drm/lima: use scheduler dependency tracking
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
                   ` (3 preceding siblings ...)
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:00 ` Daniel Vetter
  2021-06-24 14:00 ` [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init Daniel Vetter
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Daniel Vetter

Nothing special going on here.

Aside: Reviewing the code, it seems like drm_sched_job_arm() should be
moved into lima_sched_context_queue_task and put under some mutex
together with drm_sched_entity_push_job(). See the kerneldoc for
drm_sched_entity_push_job().

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/lima/lima_gem.c   |  4 ++--
 drivers/gpu/drm/lima/lima_sched.c | 21 ---------------------
 drivers/gpu/drm/lima/lima_sched.h |  3 ---
 3 files changed, 2 insertions(+), 26 deletions(-)
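
Reader note on the aside above about keeping arm and push under one
mutex. A minimal single-threaded sketch, assuming a flag in place of
the mutex and an intrusive list in place of the entity queue; all
mock_* names are hypothetical and this is not the lima or scheduler
code. It only illustrates the invariant: if the sequence number is
picked and the job is queued inside the same critical section, queue
order matches sequence order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduler job. */
struct mock_task {
	unsigned long seqno;		/* assigned when armed */
	struct mock_task *next;		/* intrusive queue link */
};

/* Hypothetical stand-in for the per-context state. */
struct mock_context {
	int lock_held;			/* models the mutex as a flag */
	unsigned long last_seqno;	/* last sequence number handed out */
	struct mock_task *head;		/* oldest queued task */
	struct mock_task *tail;		/* newest queued task */
};

/* Picks the sequence number; must run inside the critical section. */
static void mock_arm(struct mock_context *ctx, struct mock_task *task)
{
	assert(ctx->lock_held);
	task->seqno = ++ctx->last_seqno;
}

/* Appends to the queue; must run inside the same critical section. */
static void mock_push(struct mock_context *ctx, struct mock_task *task)
{
	assert(ctx->lock_held);
	task->next = NULL;
	if (ctx->tail)
		ctx->tail->next = task;
	else
		ctx->head = task;
	ctx->tail = task;
}

/* Arm and push back to back, with nothing able to slip in between. */
static void mock_queue_task(struct mock_context *ctx, struct mock_task *task)
{
	ctx->lock_held = 1;
	mock_arm(ctx, task);
	mock_push(ctx, task);
	ctx->lock_held = 0;
}

/* Returns 1 if sequence numbers strictly increase along the queue. */
static int mock_queue_is_ordered(const struct mock_context *ctx)
{
	const struct mock_task *t;

	for (t = ctx->head; t && t->next; t = t->next)
		if (t->seqno >= t->next->seqno)
			return 0;
	return 1;
}
```

Being single-threaded, the sketch cannot show the race itself; it
only makes the ordering invariant checkable. With arm and push in
separate critical sections, two submitters could interleave so that
the job holding the lower sequence number is queued second.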

diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index c528f40981bb..e54a88d5037a 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -267,7 +267,7 @@ static int lima_gem_sync_bo(struct lima_sched_task *task, struct lima_bo *bo,
 	if (explicit)
 		return 0;
 
-	return drm_gem_fence_array_add_implicit(&task->deps, &bo->base.base, write);
+	return drm_sched_job_await_implicit(&task->base, &bo->base.base, write);
 }
 
 static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
@@ -285,7 +285,7 @@ static int lima_gem_add_deps(struct drm_file *file, struct lima_submit *submit)
 		if (err)
 			return err;
 
-		err = drm_gem_fence_array_add(&submit->task->deps, fence);
+		err = drm_sched_job_await_fence(&submit->task->base, fence);
 		if (err) {
 			dma_fence_put(fence);
 			return err;
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index de7e71c42a69..c8d032429853 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -134,24 +134,15 @@ int lima_sched_task_init(struct lima_sched_task *task,
 	task->num_bos = num_bos;
 	task->vm = lima_vm_get(vm);
 
-	xa_init_flags(&task->deps, XA_FLAGS_ALLOC);
-
 	return 0;
 }
 
 void lima_sched_task_fini(struct lima_sched_task *task)
 {
-	struct dma_fence *fence;
-	unsigned long index;
 	int i;
 
 	drm_sched_job_cleanup(&task->base);
 
-	xa_for_each(&task->deps, index, fence) {
-		dma_fence_put(fence);
-	}
-	xa_destroy(&task->deps);
-
 	if (task->bos) {
 		for (i = 0; i < task->num_bos; i++)
 			drm_gem_object_put(&task->bos[i]->base.base);
@@ -186,17 +177,6 @@ struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
 	return fence;
 }
 
-static struct dma_fence *lima_sched_dependency(struct drm_sched_job *job,
-					       struct drm_sched_entity *entity)
-{
-	struct lima_sched_task *task = to_lima_task(job);
-
-	if (!xa_empty(&task->deps))
-		return xa_erase(&task->deps, task->last_dep++);
-
-	return NULL;
-}
-
 static int lima_pm_busy(struct lima_device *ldev)
 {
 	int ret;
@@ -472,7 +452,6 @@ static void lima_sched_free_job(struct drm_sched_job *job)
 }
 
 static const struct drm_sched_backend_ops lima_sched_ops = {
-	.dependency = lima_sched_dependency,
 	.run_job = lima_sched_run_job,
 	.timedout_job = lima_sched_timedout_job,
 	.free_job = lima_sched_free_job,
diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h
index ac70006b0e26..6a11764d87b3 100644
--- a/drivers/gpu/drm/lima/lima_sched.h
+++ b/drivers/gpu/drm/lima/lima_sched.h
@@ -23,9 +23,6 @@ struct lima_sched_task {
 	struct lima_vm *vm;
 	void *frame;
 
-	struct xarray deps;
-	unsigned long last_dep;
-
 	struct lima_bo **bos;
 	int num_bos;
 
-- 
2.32.0.rc2



* [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
                   ` (4 preceding siblings ...)
  2021-06-24 14:00 ` [PATCH 05/11] drm/lima: " Daniel Vetter
@ 2021-06-24 14:00 ` Daniel Vetter
  2021-06-24 16:59   ` Emma Anholt
  2021-06-24 20:45   ` [PATCH] " Daniel Vetter
  2021-06-24 14:00 ` [PATCH 07/11] drm/v3d: Use scheduler dependency handling Daniel Vetter
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Emma Anholt, Daniel Vetter

Prep work for using the scheduler dependency handling.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
---
 drivers/gpu/drm/v3d/v3d_gem.c | 64 +++++++++++------------------------
 1 file changed, 20 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index ac608eb9b594..40302c77e667 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -433,9 +433,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	     struct v3d_job *job, void (*free)(struct kref *ref),
-	     u32 in_sync)
+	     u32 in_sync, enum v3d_queue queue)
 {
 	struct dma_fence *in_fence = NULL;
+	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	int ret;
 
 	job->v3d = v3d;
@@ -446,6 +447,10 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 		return ret;
 
 	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
+	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
+				 v3d_priv);
+	if (ret)
+		goto fail;
 
 	ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
 	if (ret == -EINVAL)
@@ -464,25 +469,15 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	return ret;
 }
 
-static int
-v3d_push_job(struct v3d_file_priv *v3d_priv,
-	     struct v3d_job *job, enum v3d_queue queue)
+static void
+v3d_push_job(struct v3d_job *job)
 {
-	int ret;
-
-	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-				 v3d_priv);
-	if (ret)
-		return ret;
-
 	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	/* put by scheduler job completion */
 	kref_get(&job->refcount);
 
 	drm_sched_entity_push_job(&job->base);
-
-	return 0;
 }
 
 static void
@@ -528,7 +523,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 		    struct drm_file *file_priv)
 {
 	struct v3d_dev *v3d = to_v3d_dev(dev);
-	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	struct drm_v3d_submit_cl *args = data;
 	struct v3d_bin_job *bin = NULL;
 	struct v3d_render_job *render;
@@ -554,7 +548,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	INIT_LIST_HEAD(&render->unref_list);
 
 	ret = v3d_job_init(v3d, file_priv, &render->base,
-			   v3d_render_job_free, args->in_sync_rcl);
+			   v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
 	if (ret) {
 		kfree(render);
 		return ret;
@@ -568,7 +562,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 		}
 
 		ret = v3d_job_init(v3d, file_priv, &bin->base,
-				   v3d_job_free, args->in_sync_bcl);
+				   v3d_job_free, args->in_sync_bcl, V3D_BIN);
 		if (ret) {
 			v3d_job_put(&render->base);
 			kfree(bin);
@@ -590,7 +584,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 			goto fail;
 		}
 
-		ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+		ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, V3D_CACHE_CLEAN);
 		if (ret) {
 			kfree(clean_job);
 			clean_job = NULL;
@@ -613,9 +607,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 
 	mutex_lock(&v3d->sched_lock);
 	if (bin) {
-		ret = v3d_push_job(v3d_priv, &bin->base, V3D_BIN);
-		if (ret)
-			goto fail_unreserve;
+		v3d_push_job(&bin->base);
 
 		ret = drm_gem_fence_array_add(&render->base.deps,
 					      dma_fence_get(bin->base.done_fence));
@@ -623,9 +615,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 			goto fail_unreserve;
 	}
 
-	ret = v3d_push_job(v3d_priv, &render->base, V3D_RENDER);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(&render->base);
 
 	if (clean_job) {
 		struct dma_fence *render_fence =
@@ -633,9 +623,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 		ret = drm_gem_fence_array_add(&clean_job->deps, render_fence);
 		if (ret)
 			goto fail_unreserve;
-		ret = v3d_push_job(v3d_priv, clean_job, V3D_CACHE_CLEAN);
-		if (ret)
-			goto fail_unreserve;
+		v3d_push_job(clean_job);
 	}
 
 	mutex_unlock(&v3d->sched_lock);
@@ -682,7 +670,6 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file_priv)
 {
 	struct v3d_dev *v3d = to_v3d_dev(dev);
-	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	struct drm_v3d_submit_tfu *args = data;
 	struct v3d_tfu_job *job;
 	struct ww_acquire_ctx acquire_ctx;
@@ -695,7 +682,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 
 	ret = v3d_job_init(v3d, file_priv, &job->base,
-			   v3d_job_free, args->in_sync);
+			   v3d_job_free, args->in_sync, V3D_TFU);
 	if (ret) {
 		kfree(job);
 		return ret;
@@ -739,9 +726,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 		goto fail;
 
 	mutex_lock(&v3d->sched_lock);
-	ret = v3d_push_job(v3d_priv, &job->base, V3D_TFU);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(&job->base);
 	mutex_unlock(&v3d->sched_lock);
 
 	v3d_attach_fences_and_unlock_reservation(file_priv,
@@ -753,10 +738,6 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 
 	return 0;
 
-fail_unreserve:
-	mutex_unlock(&v3d->sched_lock);
-	drm_gem_unlock_reservations(job->base.bo, job->base.bo_count,
-				    &acquire_ctx);
 fail:
 	v3d_job_put(&job->base);
 
@@ -777,7 +758,6 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file_priv)
 {
 	struct v3d_dev *v3d = to_v3d_dev(dev);
-	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	struct drm_v3d_submit_csd *args = data;
 	struct v3d_csd_job *job;
 	struct v3d_job *clean_job;
@@ -796,7 +776,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 
 	ret = v3d_job_init(v3d, file_priv, &job->base,
-			   v3d_job_free, args->in_sync);
+			   v3d_job_free, args->in_sync, V3D_CSD);
 	if (ret) {
 		kfree(job);
 		return ret;
@@ -809,7 +789,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 	}
 
-	ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+	ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, V3D_CACHE_CLEAN);
 	if (ret) {
 		v3d_job_put(&job->base);
 		kfree(clean_job);
@@ -828,18 +808,14 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		goto fail;
 
 	mutex_lock(&v3d->sched_lock);
-	ret = v3d_push_job(v3d_priv, &job->base, V3D_CSD);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(&job->base);
 
 	ret = drm_gem_fence_array_add(&clean_job->deps,
 				      dma_fence_get(job->base.done_fence));
 	if (ret)
 		goto fail_unreserve;
 
-	ret = v3d_push_job(v3d_priv, clean_job, V3D_CACHE_CLEAN);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(clean_job);
 	mutex_unlock(&v3d->sched_lock);
 
 	v3d_attach_fences_and_unlock_reservation(file_priv,
-- 
2.32.0.rc2



* [PATCH 07/11] drm/v3d: Use scheduler dependency handling
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
                   ` (5 preceding siblings ...)
  2021-06-24 14:00 ` [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init Daniel Vetter
@ 2021-06-24 14:00 ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Daniel Vetter

With the prep work out of the way this isn't tricky anymore.

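Aside: The chaining of the various jobs is a bit awkward, with the
possibility of failure in bad places: adding the done fence of one job
as a dependency of the next can still fail after the first job has
already been pushed. I think with the drm_sched_job_init/arm split and
maybe preloading the job->dependencies xarray this should be fixable.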

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/v3d/v3d_drv.h   |  5 -----
 drivers/gpu/drm/v3d/v3d_gem.c   | 25 ++++++++-----------------
 drivers/gpu/drm/v3d/v3d_sched.c | 29 +----------------------------
 3 files changed, 9 insertions(+), 50 deletions(-)
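
Reader note on what this patch removes. Each driver kept a container
of fences plus a last_dep cursor and drained it one entry at a time
from its ->dependency callback; with this series that bookkeeping
sits on the scheduler job instead. Below is a minimal userspace
sketch of that bookkeeping, assuming a plain growable array rather
than the xarray the series uses. All mock_* names are made up for
illustration and none of this is the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fence; only its identity matters here. */
struct mock_fence {
	int signaled;
};

/* Hypothetical job with the dependency bookkeeping kept on the job. */
struct mock_job {
	struct mock_fence **deps;	/* fences to wait for, in add order */
	size_t num_deps;		/* entries in use */
	size_t cap;			/* entries allocated */
	size_t last_dep;		/* cursor: next entry to hand out */
};

/* Adds one dependency. Allocates, so it can fail with -ENOMEM. */
static int mock_job_await_fence(struct mock_job *job, struct mock_fence *fence)
{
	if (!fence)
		return 0;	/* the mock treats "no fence" as a no-op */

	if (job->num_deps == job->cap) {
		size_t cap = job->cap ? job->cap * 2 : 4;
		struct mock_fence **deps;

		deps = realloc(job->deps, cap * sizeof(*deps));
		if (!deps)
			return -ENOMEM;
		job->deps = deps;
		job->cap = cap;
	}
	job->deps[job->num_deps++] = fence;
	return 0;
}

/* Hands out dependencies one by one; NULL means the job may run. */
static struct mock_fence *mock_job_next_dependency(struct mock_job *job)
{
	if (job->last_dep < job->num_deps)
		return job->deps[job->last_dep++];
	return NULL;
}

/* Releases the array; the mock does not own the fences themselves. */
static void mock_job_cleanup(struct mock_job *job)
{
	free(job->deps);
	job->deps = NULL;
	job->num_deps = 0;
	job->cap = 0;
	job->last_dep = 0;
}
```

The allocating add is the part the aside in the commit message is
about: because adding can fail, it is awkward to call it once an
earlier job in a chain has already been pushed. Reserving the slot
ahead of time would be one way to make the late add infallible, but
that is an assumption about a possible follow-up, not something this
patch does.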

diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h
index 8a390738d65b..73559eb368a7 100644
--- a/drivers/gpu/drm/v3d/v3d_drv.h
+++ b/drivers/gpu/drm/v3d/v3d_drv.h
@@ -192,11 +192,6 @@ struct v3d_job {
 	struct drm_gem_object **bo;
 	u32 bo_count;
 
-	/* Array of struct dma_fence * to block on before submitting this job.
-	 */
-	struct xarray deps;
-	unsigned long last_dep;
-
 	/* v3d fence to be signaled by IRQ handler when the job is complete. */
 	struct dma_fence *irq_fence;
 
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 40302c77e667..c54684f8b877 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -257,8 +257,8 @@ v3d_lock_bo_reservations(struct v3d_job *job,
 		return ret;
 
 	for (i = 0; i < job->bo_count; i++) {
-		ret = drm_gem_fence_array_add_implicit(&job->deps,
-						       job->bo[i], true);
+		ret = drm_sched_job_await_implicit(&job->base,
+						   job->bo[i], true);
 		if (ret) {
 			drm_gem_unlock_reservations(job->bo, job->bo_count,
 						    acquire_ctx);
@@ -354,8 +354,6 @@ static void
 v3d_job_free(struct kref *ref)
 {
 	struct v3d_job *job = container_of(ref, struct v3d_job, refcount);
-	unsigned long index;
-	struct dma_fence *fence;
 	int i;
 
 	for (i = 0; i < job->bo_count; i++) {
@@ -364,11 +362,6 @@ v3d_job_free(struct kref *ref)
 	}
 	kvfree(job->bo);
 
-	xa_for_each(&job->deps, index, fence) {
-		dma_fence_put(fence);
-	}
-	xa_destroy(&job->deps);
-
 	dma_fence_put(job->irq_fence);
 	dma_fence_put(job->done_fence);
 
@@ -446,7 +439,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	if (ret < 0)
 		return ret;
 
-	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
 	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
 				 v3d_priv);
 	if (ret)
@@ -456,7 +448,7 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	if (ret == -EINVAL)
 		goto fail;
 
-	ret = drm_gem_fence_array_add(&job->deps, in_fence);
+	ret = drm_sched_job_await_fence(&job->base, in_fence);
 	if (ret)
 		goto fail;
 
@@ -464,7 +456,6 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 
 	return 0;
 fail:
-	xa_destroy(&job->deps);
 	pm_runtime_put_autosuspend(v3d->drm.dev);
 	return ret;
 }
@@ -609,8 +600,8 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	if (bin) {
 		v3d_push_job(&bin->base);
 
-		ret = drm_gem_fence_array_add(&render->base.deps,
-					      dma_fence_get(bin->base.done_fence));
+		ret = drm_sched_job_await_fence(&render->base.base,
+						dma_fence_get(bin->base.done_fence));
 		if (ret)
 			goto fail_unreserve;
 	}
@@ -620,7 +611,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	if (clean_job) {
 		struct dma_fence *render_fence =
 			dma_fence_get(render->base.done_fence);
-		ret = drm_gem_fence_array_add(&clean_job->deps, render_fence);
+		ret = drm_sched_job_await_fence(&clean_job->base, render_fence);
 		if (ret)
 			goto fail_unreserve;
 		v3d_push_job(clean_job);
@@ -810,8 +801,8 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 	mutex_lock(&v3d->sched_lock);
 	v3d_push_job(&job->base);
 
-	ret = drm_gem_fence_array_add(&clean_job->deps,
-				      dma_fence_get(job->base.done_fence));
+	ret = drm_sched_job_await_fence(&clean_job->base,
+					dma_fence_get(job->base.done_fence));
 	if (ret)
 		goto fail_unreserve;
 
diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c
index 8992480c88fa..3af6a09764c7 100644
--- a/drivers/gpu/drm/v3d/v3d_sched.c
+++ b/drivers/gpu/drm/v3d/v3d_sched.c
@@ -13,7 +13,7 @@
  * jobs when bulk background jobs are queued up, we submit a new job
  * to the HW only when it has completed the last one, instead of
  * filling up the CT[01]Q FIFOs with jobs.  Similarly, we use
- * v3d_job_dependency() to manage the dependency between bin and
+ * drm_sched_job_await_fence() to manage the dependency between bin and
  * render, instead of having the clients submit jobs using the HW's
  * semaphores to interlock between them.
  */
@@ -63,28 +63,6 @@ v3d_job_free(struct drm_sched_job *sched_job)
 	v3d_job_put(job);
 }
 
-/*
- * Returns the fences that the job depends on, one by one.
- *
- * If placed in the scheduler's .dependency method, the corresponding
- * .run_job won't be called until all of them have been signaled.
- */
-static struct dma_fence *
-v3d_job_dependency(struct drm_sched_job *sched_job,
-		   struct drm_sched_entity *s_entity)
-{
-	struct v3d_job *job = to_v3d_job(sched_job);
-
-	/* XXX: Wait on a fence for switching the GMP if necessary,
-	 * and then do so.
-	 */
-
-	if (!xa_empty(&job->deps))
-		return xa_erase(&job->deps, job->last_dep++);
-
-	return NULL;
-}
-
 static struct dma_fence *v3d_bin_job_run(struct drm_sched_job *sched_job)
 {
 	struct v3d_bin_job *job = to_bin_job(sched_job);
@@ -357,35 +335,30 @@ v3d_csd_job_timedout(struct drm_sched_job *sched_job)
 }
 
 static const struct drm_sched_backend_ops v3d_bin_sched_ops = {
-	.dependency = v3d_job_dependency,
 	.run_job = v3d_bin_job_run,
 	.timedout_job = v3d_bin_job_timedout,
 	.free_job = v3d_job_free,
 };
 
 static const struct drm_sched_backend_ops v3d_render_sched_ops = {
-	.dependency = v3d_job_dependency,
 	.run_job = v3d_render_job_run,
 	.timedout_job = v3d_render_job_timedout,
 	.free_job = v3d_job_free,
 };
 
 static const struct drm_sched_backend_ops v3d_tfu_sched_ops = {
-	.dependency = v3d_job_dependency,
 	.run_job = v3d_tfu_job_run,
 	.timedout_job = v3d_generic_job_timedout,
 	.free_job = v3d_job_free,
 };
 
 static const struct drm_sched_backend_ops v3d_csd_sched_ops = {
-	.dependency = v3d_job_dependency,
 	.run_job = v3d_csd_job_run,
 	.timedout_job = v3d_csd_job_timedout,
 	.free_job = v3d_job_free
 };
 
 static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = {
-	.dependency = v3d_job_dependency,
 	.run_job = v3d_cache_clean_job_run,
 	.timedout_job = v3d_generic_job_timedout,
 	.free_job = v3d_job_free
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 08/11] drm/etnaviv: Use scheduler dependency handling
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
@ 2021-06-24 14:00   ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Lucas Stach, Russell King,
	Christian Gmeiner, Sumit Semwal, Christian König, etnaviv,
	linux-media, linaro-mm-sig

drm_sched_job_init() must now be called much earlier, before any
dependencies are added to the job, but that's very minor surgery.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: etnaviv@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/gpu/drm/etnaviv/etnaviv_gem.h        |  5 +-
 drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c | 32 +++++-----
 drivers/gpu/drm/etnaviv/etnaviv_sched.c      | 61 +-------------------
 drivers/gpu/drm/etnaviv/etnaviv_sched.h      |  3 +-
 4 files changed, 20 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.h b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
index 98e60df882b6..63688e6e4580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.h
@@ -80,9 +80,6 @@ struct etnaviv_gem_submit_bo {
 	u64 va;
 	struct etnaviv_gem_object *obj;
 	struct etnaviv_vram_mapping *mapping;
-	struct dma_fence *excl;
-	unsigned int nr_shared;
-	struct dma_fence **shared;
 };
 
 /* Created per submit-ioctl, to track bo's and cmdstream bufs, etc,
@@ -95,7 +92,7 @@ struct etnaviv_gem_submit {
 	struct etnaviv_file_private *ctx;
 	struct etnaviv_gpu *gpu;
 	struct etnaviv_iommu_context *mmu_context, *prev_mmu_context;
-	struct dma_fence *out_fence, *in_fence;
+	struct dma_fence *out_fence;
 	int out_fence_id;
 	struct list_head node; /* GPU active submit list */
 	struct etnaviv_cmdbuf cmdbuf;
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 4dd7d9d541c0..92478a50a580 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -188,16 +188,10 @@ static int submit_fence_sync(struct etnaviv_gem_submit *submit)
 		if (submit->flags & ETNA_SUBMIT_NO_IMPLICIT)
 			continue;
 
-		if (bo->flags & ETNA_SUBMIT_BO_WRITE) {
-			ret = dma_resv_get_fences(robj, &bo->excl,
-						  &bo->nr_shared,
-						  &bo->shared);
-			if (ret)
-				return ret;
-		} else {
-			bo->excl = dma_resv_get_excl_unlocked(robj);
-		}
-
+		ret = drm_sched_job_await_implicit(&submit->sched_job, &bo->obj->base,
+						   bo->flags & ETNA_SUBMIT_BO_WRITE);
+		if (ret)
+			return ret;
 	}
 
 	return ret;
@@ -403,8 +397,6 @@ static void submit_cleanup(struct kref *kref)
 
 	wake_up_all(&submit->gpu->fence_event);
 
-	if (submit->in_fence)
-		dma_fence_put(submit->in_fence);
 	if (submit->out_fence) {
 		/* first remove from IDR, so fence can not be found anymore */
 		mutex_lock(&submit->gpu->fence_lock);
@@ -537,6 +529,12 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 	submit->exec_state = args->exec_state;
 	submit->flags = args->flags;
 
+	ret = drm_sched_job_init(&submit->sched_job,
+				 &ctx->sched_entity[args->pipe],
+				 submit->ctx);
+	if (ret)
+		goto err_submit_objects;
+
 	ret = submit_lookup_objects(submit, file, bos, args->nr_bos);
 	if (ret)
 		goto err_submit_objects;
@@ -549,11 +547,15 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 	}
 
 	if (args->flags & ETNA_SUBMIT_FENCE_FD_IN) {
-		submit->in_fence = sync_file_get_fence(args->fence_fd);
-		if (!submit->in_fence) {
+		struct dma_fence *in_fence = sync_file_get_fence(args->fence_fd);
+		if (!in_fence) {
 			ret = -EINVAL;
 			goto err_submit_objects;
 		}
+
+		ret = drm_sched_job_await_fence(&submit->sched_job, in_fence);
+		if (ret)
+			goto err_submit_objects;
 	}
 
 	ret = submit_pin_objects(submit);
@@ -579,7 +581,7 @@ int etnaviv_ioctl_gem_submit(struct drm_device *dev, void *data,
 	if (ret)
 		goto err_submit_objects;
 
-	ret = etnaviv_sched_push_job(&ctx->sched_entity[args->pipe], submit);
+	ret = etnaviv_sched_push_job(submit);
 	if (ret)
 		goto err_submit_objects;
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 77995f190790..d62053b69953 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -17,58 +17,6 @@ module_param_named(job_hang_limit, etnaviv_job_hang_limit, int , 0444);
 static int etnaviv_hw_jobs_limit = 4;
 module_param_named(hw_job_limit, etnaviv_hw_jobs_limit, int , 0444);
 
-static struct dma_fence *
-etnaviv_sched_dependency(struct drm_sched_job *sched_job,
-			 struct drm_sched_entity *entity)
-{
-	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
-	struct dma_fence *fence;
-	int i;
-
-	if (unlikely(submit->in_fence)) {
-		fence = submit->in_fence;
-		submit->in_fence = NULL;
-
-		if (!dma_fence_is_signaled(fence))
-			return fence;
-
-		dma_fence_put(fence);
-	}
-
-	for (i = 0; i < submit->nr_bos; i++) {
-		struct etnaviv_gem_submit_bo *bo = &submit->bos[i];
-		int j;
-
-		if (bo->excl) {
-			fence = bo->excl;
-			bo->excl = NULL;
-
-			if (!dma_fence_is_signaled(fence))
-				return fence;
-
-			dma_fence_put(fence);
-		}
-
-		for (j = 0; j < bo->nr_shared; j++) {
-			if (!bo->shared[j])
-				continue;
-
-			fence = bo->shared[j];
-			bo->shared[j] = NULL;
-
-			if (!dma_fence_is_signaled(fence))
-				return fence;
-
-			dma_fence_put(fence);
-		}
-		kfree(bo->shared);
-		bo->nr_shared = 0;
-		bo->shared = NULL;
-	}
-
-	return NULL;
-}
-
 static struct dma_fence *etnaviv_sched_run_job(struct drm_sched_job *sched_job)
 {
 	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
@@ -140,14 +88,12 @@ static void etnaviv_sched_free_job(struct drm_sched_job *sched_job)
 }
 
 static const struct drm_sched_backend_ops etnaviv_sched_ops = {
-	.dependency = etnaviv_sched_dependency,
 	.run_job = etnaviv_sched_run_job,
 	.timedout_job = etnaviv_sched_timedout_job,
 	.free_job = etnaviv_sched_free_job,
 };
 
-int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
-			   struct etnaviv_gem_submit *submit)
+int etnaviv_sched_push_job(struct etnaviv_gem_submit *submit)
 {
 	int ret = 0;
 
@@ -158,11 +104,6 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	 */
 	mutex_lock(&submit->gpu->fence_lock);
 
-	ret = drm_sched_job_init(&submit->sched_job, sched_entity,
-				 submit->ctx);
-	if (ret)
-		goto out_unlock;
-
 	drm_sched_job_arm(&submit->sched_job);
 
 	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.h b/drivers/gpu/drm/etnaviv/etnaviv_sched.h
index c0a6796e22c9..baebfa069afc 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.h
@@ -18,7 +18,6 @@ struct etnaviv_gem_submit *to_etnaviv_submit(struct drm_sched_job *sched_job)
 
 int etnaviv_sched_init(struct etnaviv_gpu *gpu);
 void etnaviv_sched_fini(struct etnaviv_gpu *gpu);
-int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
-			   struct etnaviv_gem_submit *submit);
+int etnaviv_sched_push_job(struct etnaviv_gem_submit *submit);
 
 #endif /* __ETNAVIV_SCHED_H__ */
-- 
2.32.0.rc2
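
The ordering this conversion relies on (init the job early, add
dependencies, arm it only at the point of no return) can be sketched in
plain userspace C. This is a toy model, not the scheduler API: struct
toy_job and the toy_job_*() helpers are hypothetical names, and rejecting
out-of-order calls with -1 is an assumption made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical job states; a zero-initialized job starts as TOY_JOB_NEW. */
enum toy_job_state { TOY_JOB_NEW, TOY_JOB_INITIALIZED, TOY_JOB_ARMED };

/* Hypothetical stand-in for a scheduler job. */
struct toy_job {
	enum toy_job_state state;
	size_t nr_deps;		/* number of dependencies recorded so far */
};

/* Step 1: set up dependency tracking, without arming the job. */
static int toy_job_init(struct toy_job *job)
{
	job->state = TOY_JOB_INITIALIZED;
	job->nr_deps = 0;
	return 0;
}

/* Step 2: record one dependency; only valid between init and arm. */
static int toy_job_await(struct toy_job *job)
{
	if (job->state != TOY_JOB_INITIALIZED)
		return -1;
	job->nr_deps++;
	return 0;
}

/* Step 3: point of no return; no more dependencies after this. */
static int toy_job_arm(struct toy_job *job)
{
	if (job->state != TOY_JOB_INITIALIZED)
		return -1;
	job->state = TOY_JOB_ARMED;
	return 0;
}
```

In this model a dependency can only be recorded between toy_job_init() and
toy_job_arm(), which mirrors why the init call has to move ahead of the
fence handling in the submit ioctl.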


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 09/11] drm/gem: Delete gem array fencing helpers
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
@ 2021-06-24 14:00   ` Daniel Vetter
  2021-06-24 14:00   ` Daniel Vetter
                     ` (9 subsequent siblings)
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Daniel Vetter, Sumit Semwal,
	Christian König, linux-media, linaro-mm-sig

These helpers are now integrated into the scheduler and all users have
been converted over, so delete them.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/gpu/drm/drm_gem.c | 96 ---------------------------------------
 include/drm/drm_gem.h     |  5 --
 2 files changed, 101 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 68deb1de8235..24d49a2636e0 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1294,99 +1294,3 @@ drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
 	ww_acquire_fini(acquire_ctx);
 }
 EXPORT_SYMBOL(drm_gem_unlock_reservations);
-
-/**
- * drm_gem_fence_array_add - Adds the fence to an array of fences to be
- * waited on, deduplicating fences from the same context.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @fence: the dma_fence to add to the list of dependencies.
- *
- * This functions consumes the reference for @fence both on success and error
- * cases.
- *
- * Returns:
- * 0 on success, or an error on failing to expand the array.
- */
-int drm_gem_fence_array_add(struct xarray *fence_array,
-			    struct dma_fence *fence)
-{
-	struct dma_fence *entry;
-	unsigned long index;
-	u32 id = 0;
-	int ret;
-
-	if (!fence)
-		return 0;
-
-	/* Deduplicate if we already depend on a fence from the same context.
-	 * This lets the size of the array of deps scale with the number of
-	 * engines involved, rather than the number of BOs.
-	 */
-	xa_for_each(fence_array, index, entry) {
-		if (entry->context != fence->context)
-			continue;
-
-		if (dma_fence_is_later(fence, entry)) {
-			dma_fence_put(entry);
-			xa_store(fence_array, index, fence, GFP_KERNEL);
-		} else {
-			dma_fence_put(fence);
-		}
-		return 0;
-	}
-
-	ret = xa_alloc(fence_array, &id, fence, xa_limit_32b, GFP_KERNEL);
-	if (ret != 0)
-		dma_fence_put(fence);
-
-	return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add);
-
-/**
- * drm_gem_fence_array_add_implicit - Adds the implicit dependencies tracked
- * in the GEM object's reservation object to an array of dma_fences for use in
- * scheduling a rendering job.
- *
- * This should be called after drm_gem_lock_reservations() on your array of
- * GEM objects used in the job but before updating the reservations with your
- * own fences.
- *
- * @fence_array: array of dma_fence * for the job to block on.
- * @obj: the gem object to add new dependencies from.
- * @write: whether the job might write the object (so we need to depend on
- * shared fences in the reservation object).
- */
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-				     struct drm_gem_object *obj,
-				     bool write)
-{
-	int ret;
-	struct dma_fence **fences;
-	unsigned int i, fence_count;
-
-	if (!write) {
-		struct dma_fence *fence =
-			dma_resv_get_excl_unlocked(obj->resv);
-
-		return drm_gem_fence_array_add(fence_array, fence);
-	}
-
-	ret = dma_resv_get_fences(obj->resv, NULL,
-						&fence_count, &fences);
-	if (ret || !fence_count)
-		return ret;
-
-	for (i = 0; i < fence_count; i++) {
-		ret = drm_gem_fence_array_add(fence_array, fences[i]);
-		if (ret)
-			break;
-	}
-
-	for (; i < fence_count; i++)
-		dma_fence_put(fences[i]);
-	kfree(fences);
-	return ret;
-}
-EXPORT_SYMBOL(drm_gem_fence_array_add_implicit);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 240049566592..6d5e33b89074 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -409,11 +409,6 @@ int drm_gem_lock_reservations(struct drm_gem_object **objs, int count,
 			      struct ww_acquire_ctx *acquire_ctx);
 void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count,
 				 struct ww_acquire_ctx *acquire_ctx);
-int drm_gem_fence_array_add(struct xarray *fence_array,
-			    struct dma_fence *fence);
-int drm_gem_fence_array_add_implicit(struct xarray *fence_array,
-				     struct drm_gem_object *obj,
-				     bool write);
 int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
 			    u32 handle, u64 *offset);
 
-- 
2.32.0.rc2
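
The deduplication rule in the deleted helper (keep at most one fence per
context, and keep the later one) can be illustrated with a small userspace
sketch. Everything here is hypothetical: struct toy_fence, struct toy_deps
and toy_deps_add() are stand-ins, a fixed array replaces the xarray,
reference counting is left out, and the seqno comparison ignores the
wraparound handling a real fence comparison needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fence: only context and seqno. */
struct toy_fence {
	uint64_t context;	/* timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline */
};

#define TOY_MAX_DEPS 8

/* Hypothetical fixed-size dependency set. */
struct toy_deps {
	struct toy_fence slots[TOY_MAX_DEPS];
	size_t count;
};

/* True if a is later than b; both must be on the same timeline. */
static bool toy_fence_is_later(const struct toy_fence *a,
			       const struct toy_fence *b)
{
	assert(a->context == b->context);
	return a->seqno > b->seqno;
}

/*
 * Add a dependency, keeping at most one fence per context.
 * Returns 0 on success, -1 if the set is full.
 */
static int toy_deps_add(struct toy_deps *deps, struct toy_fence fence)
{
	size_t i;

	for (i = 0; i < deps->count; i++) {
		if (deps->slots[i].context != fence.context)
			continue;
		/* Same timeline: only the later fence needs to be kept. */
		if (toy_fence_is_later(&fence, &deps->slots[i]))
			deps->slots[i] = fence;
		return 0;
	}

	if (deps->count == TOY_MAX_DEPS)
		return -1;

	deps->slots[deps->count++] = fence;
	return 0;
}
```

With this rule the set grows with the number of distinct contexts rather
than with the number of fences added, which is the property the comment in
the deleted code describes.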


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
                   ` (8 preceding siblings ...)
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:00 ` Daniel Vetter
  2021-06-24 14:42   ` Lucas Stach
  2021-06-24 17:03   ` Christian König
  2021-06-24 14:00 ` [PATCH 11/11] drm/sched: Check locking in drm_sched_job_await_implicit Daniel Vetter
  10 siblings, 2 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Jack Zhang, Daniel Vetter, Luben Tuikov, Alex Deucher,
	Daniel Vetter, Christian König

This is essentially part of drm_sched_dependency_optimized(), which
only amdgpu seems to make use of. Do the same self-dependency check in
drm_sched_job_await_fence() so that all drivers benefit.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 370c336d383f..c31d7cf7df74 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
 	if (!fence)
 		return 0;
 
+	/* if it's a fence from us it's guaranteed to be earlier */
+	if (fence->context == job->entity->fence_context ||
+	    fence->context == job->entity->fence_context + 1) {
+		dma_fence_put(fence);
+		return 0;
+	}
+
 	/* Deduplicate if we already depend on a fence from the same context.
 	 * This lets the size of the array of deps scale with the number of
 	 * engines involved, rather than the number of BOs.
-- 
2.32.0.rc2
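
The check added above can be read as a small predicate. The sketch below
is plain userspace C with hypothetical names (struct toy_entity,
toy_is_self_dependency()). It assumes, as the "+ 1" in the patch suggests,
that an entity owns two consecutive fence contexts, so a fence on either
of them was emitted by the entity itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduler entity. */
struct toy_entity {
	uint64_t fence_context;	/* first of two consecutive contexts */
};

/*
 * True if a fence with the given context comes from this entity.
 * Such a fence belongs to an earlier job on the same entity, so it
 * does not need to be stored as a dependency.
 */
static bool toy_is_self_dependency(const struct toy_entity *entity,
				   uint64_t fence_context)
{
	return fence_context == entity->fence_context ||
	       fence_context == entity->fence_context + 1;
}
```

A caller would drop its reference and return early when the predicate is
true, as the hunk above does with dma_fence_put().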


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH 11/11] drm/sched: Check locking in drm_sched_job_await_implicit
  2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
                   ` (9 preceding siblings ...)
  2021-06-24 14:00 ` [PATCH 10/11] drm/scheduler: Don't store self-dependencies Daniel Vetter
@ 2021-06-24 14:00 ` Daniel Vetter
  10 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 14:00 UTC (permalink / raw)
  To: DRI Development
  Cc: Jack Zhang, Daniel Vetter, Luben Tuikov, Alex Deucher,
	Daniel Vetter, Christian König

The caller must hold the reservation lock here; otherwise the set of
fences on the object can change between grabbing the dependencies and
inserting the new fences.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index c31d7cf7df74..ee75383331af 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -703,6 +703,8 @@ int drm_sched_job_await_implicit(struct drm_sched_job *job,
 	struct dma_fence **fences;
 	unsigned int i, fence_count;
 
+	dma_resv_assert_held(obj->resv);
+
 	if (!write) {
 		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
 
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH 01/11] drm/sched: Split drm_sched_job_init
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:32     ` Steven Price
  -1 siblings, 0 replies; 57+ messages in thread
From: Steven Price @ 2021-06-24 14:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Daniel Vetter, Lucas Stach, Russell King, Christian Gmeiner,
	Qiang Yu, Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig,
	David Airlie, Daniel Vetter, Sumit Semwal, Christian König,
	Masahiro Yamada, Kees Cook, Adam Borowski, Nick Terrell,
	Mauro Carvalho Chehab, Paul Menzel, Sami Tolvanen, Viresh Kumar,
	Alex Deucher, Dave Airlie, Nirmoy Das, Deepak R Varma, Lee Jones,
	Kevin Wang, Chen Li, Luben Tuikov, Marek Olšák,
	Dennis Li, Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, etnaviv, lima,
	linux-media, linaro-mm-sig

On 24/06/2021 15:00, Daniel Vetter wrote:
> This is a very confusingly named function, because not just does it
> init an object, it arms it and provides a point of no return for
> pushing a job into the scheduler. It would be nice if that's a bit
> clearer in the interface.
> 
> But the real reason is that I want to push the dependency tracking
> helpers into the scheduler code, and that means drm_sched_job_init
> must be called a lot earlier, without arming the job.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Adam Borowski <kilobyte@angband.pl>
> Cc: Nick Terrell <terrelln@fb.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Sonny Jiang <sonny.jiang@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  .gitignore                               |  1 +
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>  drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>  drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
>  drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
>  drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
>  include/drm/gpu_scheduler.h              |  6 +++++-
>  10 files changed, 51 insertions(+), 10 deletions(-)
> 
> diff --git a/.gitignore b/.gitignore
> index 7afd412dadd2..52433a930299 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -66,6 +66,7 @@ modules.order
>  /modules.builtin
>  /modules.builtin.modinfo
>  /modules.nsdeps
> +*.builtin

I don't think this belongs in this patch!

[...]
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index beb62c8fc851..1e950534b9b0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
>  		goto unlock;
>  	}
>  
> +	drm_sched_job_arm(&job->base);
> +
>  	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>  
>  	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,

Acked-by: Steven Price <steven.price@arm.com>
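
For readers following the init/arm split described in the commit message,
here is a minimal userspace sketch of the two-phase lifecycle. It is a
model only, not the scheduler code: model_entity, model_job and the
model_job_*() helpers are hypothetical names, and the idea that arming
hands out a per-entity sequence number is an assumption made for the
illustration.

```c
/* Userspace model only: every name here is a hypothetical stand-in,
 * not part of the drm/scheduler API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_entity {
	unsigned long next_seqno;	/* last sequence number handed out */
};

struct model_job {
	struct model_entity *entity;	/* bound at init time */
	unsigned long seqno;		/* 0 until the job is armed */
	bool armed;
};

/* Phase 1: bind the job to its entity. This can still be undone. */
static int model_job_init(struct model_job *job, struct model_entity *entity)
{
	if (!entity)
		return -1;
	job->entity = entity;
	job->seqno = 0;
	job->armed = false;
	return 0;
}

/* Phase 2: point of no return. In this model arming consumes a sequence
 * number (an assumption), so the job must be pushed afterwards. */
static void model_job_arm(struct model_job *job)
{
	assert(job->entity && !job->armed);
	job->seqno = ++job->entity->next_seqno;
	job->armed = true;
}

/* Error paths may only drop a job that has not been armed yet. */
static bool model_job_can_abort(const struct model_job *job)
{
	return !job->armed;
}
```

In this model, work that can fail (such as collecting dependencies) sits
between model_job_init() and model_job_arm(), which is the window the
later patches in the series rely on.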

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] drm/sched: Add dependency tracking
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:32     ` Steven Price
  -1 siblings, 0 replies; 57+ messages in thread
From: Steven Price @ 2021-06-24 14:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Jack Zhang, David Airlie, linaro-mm-sig, Boris Brezillon,
	Alex Deucher, Daniel Vetter, linux-media, Lee Jones,
	Christian König, Luben Tuikov, Nirmoy Das

On 24/06/2021 15:00, Daniel Vetter wrote:
> Instead of just a callback we can glue in the gem helpers that
> panfrost, v3d and lima currently use. There's really not that many
> ways to skin this cat.
> 
> On the naming bikeshed: The idea for using _await_ to denote adding
> dependencies to a job comes from i915, where that's used quite
> extensively all over the place, in lots of data structures.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

LGTM:

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
>  drivers/gpu/drm/scheduler/sched_main.c   | 103 +++++++++++++++++++++++
>  include/drm/gpu_scheduler.h              |  31 ++++++-
>  3 files changed, 146 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index f7347c284886..b6f72fafd504 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>  	job->sched->ops->free_job(job);
>  }
>  
> +static struct dma_fence *
> +drm_sched_job_dependency(struct drm_sched_job *job,
> +			 struct drm_sched_entity *entity)
> +{
> +	if (!xa_empty(&job->dependencies))
> +		return xa_erase(&job->dependencies, job->last_dependency++);
> +
> +	if (job->sched->ops->dependency)
> +		return job->sched->ops->dependency(job, entity);
> +
> +	return NULL;
> +}
> +
>  /**
>   * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
>   *
> @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
>  		struct drm_sched_fence *s_fence = job->s_fence;
>  
>  		/* Wait for all dependencies to avoid data corruptions */
> -		while ((f = job->sched->ops->dependency(job, entity)))
> +		while ((f = drm_sched_job_dependency(job, entity)))
>  			dma_fence_wait(f, false);
>  
>  		drm_sched_fence_scheduled(s_fence);
> @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
>   */
>  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>  {
> -	struct drm_gpu_scheduler *sched = entity->rq->sched;
>  	struct drm_sched_job *sched_job;
>  
>  	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>  		return NULL;
>  
>  	while ((entity->dependency =
> -			sched->ops->dependency(sched_job, entity))) {
> +			drm_sched_job_dependency(sched_job, entity))) {
>  		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>  
>  		if (drm_sched_entity_add_dependency_cb(entity))
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 70eefed17e06..370c336d383f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  
>  	INIT_LIST_HEAD(&job->list);
>  
> +	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_sched_job_init);
> @@ -626,6 +628,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>  }
>  EXPORT_SYMBOL(drm_sched_job_arm);
>  
> +/**
> + * drm_sched_job_await_fence - adds the fence as a job dependency
> + * @job: scheduler job to add the dependencies to
> + * @fence: the dma_fence to add to the list of dependencies.
> + *
> + * Note that @fence is consumed in both the success and error cases.
> + *
> + * Returns:
> + * 0 on success, or an error on failing to expand the array.
> + */
> +int drm_sched_job_await_fence(struct drm_sched_job *job,
> +			      struct dma_fence *fence)
> +{
> +	struct dma_fence *entry;
> +	unsigned long index;
> +	u32 id = 0;
> +	int ret;
> +
> +	if (!fence)
> +		return 0;
> +
> +	/* Deduplicate if we already depend on a fence from the same context.
> +	 * This lets the size of the array of deps scale with the number of
> +	 * engines involved, rather than the number of BOs.
> +	 */
> +	xa_for_each(&job->dependencies, index, entry) {
> +		if (entry->context != fence->context)
> +			continue;
> +
> +		if (dma_fence_is_later(fence, entry)) {
> +			dma_fence_put(entry);
> +			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
> +		} else {
> +			dma_fence_put(fence);
> +		}
> +		return 0;
> +	}
> +
> +	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
> +	if (ret != 0)
> +		dma_fence_put(fence);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_sched_job_await_fence);
> +
> +/**
> + * drm_sched_job_await_implicit - adds implicit dependencies as job dependencies
> + * @job: scheduler job to add the dependencies to
> + * @obj: the gem object to add new dependencies from.
> + * @write: whether the job might write the object (so we need to depend on
> + * shared fences in the reservation object).
> + *
> + * This should be called after drm_gem_lock_reservations() on your array of
> + * GEM objects used in the job but before updating the reservations with your
> + * own fences.
> + *
> + * Returns:
> + * 0 on success, or an error on failing to expand the array.
> + */
> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
> +				 struct drm_gem_object *obj,
> +				 bool write)
> +{
> +	int ret;
> +	struct dma_fence **fences;
> +	unsigned int i, fence_count;
> +
> +	if (!write) {
> +		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
> +
> +		return drm_sched_job_await_fence(job, fence);
> +	}
> +
> +	ret = dma_resv_get_fences(obj->resv, NULL, &fence_count, &fences);
> +	if (ret || !fence_count)
> +		return ret;
> +
> +	for (i = 0; i < fence_count; i++) {
> +		ret = drm_sched_job_await_fence(job, fences[i]);
> +		if (ret)
> +			break;
> +	}
> +
> +	for (; i < fence_count; i++)
> +		dma_fence_put(fences[i]);
> +	kfree(fences);
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_sched_job_await_implicit);
> +
> +
>  /**
>   * drm_sched_job_cleanup - clean up scheduler job resources
>   *
> @@ -633,8 +727,17 @@ EXPORT_SYMBOL(drm_sched_job_arm);
>   */
>  void drm_sched_job_cleanup(struct drm_sched_job *job)
>  {
> +	struct dma_fence *fence;
> +	unsigned long index;
> +
>  	dma_fence_put(&job->s_fence->finished);
>  	job->s_fence = NULL;
> +
> +	xa_for_each(&job->dependencies, index, fence) {
> +		dma_fence_put(fence);
> +	}
> +	xa_destroy(&job->dependencies);
> +
>  }
>  EXPORT_SYMBOL(drm_sched_job_cleanup);
>  
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 80438d126c9d..e4d7e1496296 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -27,9 +27,12 @@
>  #include <drm/spsc_queue.h>
>  #include <linux/dma-fence.h>
>  #include <linux/completion.h>
> +#include <linux/xarray.h>
>  
>  #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>  
> +struct drm_gem_object;
> +
>  struct drm_gpu_scheduler;
>  struct drm_sched_rq;
>  
> @@ -198,6 +201,16 @@ struct drm_sched_job {
>  	enum drm_sched_priority		s_priority;
>  	struct drm_sched_entity         *entity;
>  	struct dma_fence_cb		cb;
> +	/**
> +	 * @dependencies:
> +	 *
> +	 * Contains the dependencies as struct dma_fence for this job, see
> +	 * drm_sched_job_await_fence() and drm_sched_job_await_implicit().
> +	 */
> +	struct xarray			dependencies;
> +
> +	/** @last_dependency: tracks @dependencies as they signal */
> +	unsigned long			last_dependency;
>  };
>  
>  static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
> @@ -220,9 +233,14 @@ enum drm_gpu_sched_stat {
>   */
>  struct drm_sched_backend_ops {
>  	/**
> -         * @dependency: Called when the scheduler is considering scheduling
> -         * this job next, to get another struct dma_fence for this job to
> -	 * block on.  Once it returns NULL, run_job() may be called.
> +	 * @dependency:
> +	 *
> +	 * Called when the scheduler is considering scheduling this job next, to
> +	 * get another struct dma_fence for this job to block on.  Once it
> +	 * returns NULL, run_job() may be called.
> +	 *
> +	 * If a driver exclusively uses drm_sched_job_await_fence() and
> +	 * drm_sched_job_await_implicit() this can be omitted and left as NULL.
>  	 */
>  	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>  					struct drm_sched_entity *s_entity);
> @@ -314,6 +332,13 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  		       struct drm_sched_entity *entity,
>  		       void *owner);
>  void drm_sched_job_arm(struct drm_sched_job *job);
> +int drm_sched_job_await_fence(struct drm_sched_job *job,
> +			      struct dma_fence *fence);
> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
> +				 struct drm_gem_object *obj,
> +				 bool write);
> +
> +
>  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>  				    struct drm_gpu_scheduler **sched_list,
>                                     unsigned int num_sched_list);
> 
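
The per-context deduplication in drm_sched_job_await_fence() quoted above
can be modelled in plain userspace C. This is a sketch under stated
assumptions, not the kernel implementation: model_fence, model_deps and
model_deps_add() are hypothetical names, a fixed array stands in for the
xarray, and the seqno comparison ignores the wraparound handling that a
real fence comparison would need.

```c
/* Userspace model only: hypothetical names; the quoted patch stores
 * struct dma_fence pointers in an xarray instead. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_DEPS 8

struct model_fence {
	uint64_t context;	/* timeline (roughly: engine) that emits it */
	uint64_t seqno;		/* position on that timeline */
};

struct model_deps {
	struct model_fence slot[MODEL_MAX_DEPS];
	size_t count;
};

/* True if a signals after b; only meaningful within one context.
 * Simplified: no seqno wraparound handling. */
static bool model_fence_is_later(const struct model_fence *a,
				 const struct model_fence *b)
{
	return a->seqno > b->seqno;
}

/* Add a dependency, keeping at most one fence per context. */
static int model_deps_add(struct model_deps *deps, struct model_fence fence)
{
	size_t i;

	for (i = 0; i < deps->count; i++) {
		if (deps->slot[i].context != fence.context)
			continue;
		/* Same timeline: waiting for the later fence covers both. */
		if (model_fence_is_later(&fence, &deps->slot[i]))
			deps->slot[i] = fence;
		return 0;
	}

	if (deps->count == MODEL_MAX_DEPS)
		return -1;	/* stands in for the allocation failure path */

	deps->slot[deps->count++] = fence;
	return 0;
}
```

Because only one entry per context survives, the number of stored
dependencies in this model grows with the number of distinct timelines
rather than with the number of buffers, which is the property the
comment in the patch describes.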


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 03/11] drm/sched: drop entity parameter from drm_sched_push_job
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:32     ` Steven Price
  -1 siblings, 0 replies; 57+ messages in thread
From: Steven Price @ 2021-06-24 14:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Daniel Vetter, Lucas Stach, Russell King, Christian Gmeiner,
	Qiang Yu, Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig,
	Emma Anholt, David Airlie, Daniel Vetter, Sumit Semwal,
	Christian König, Alex Deucher, Nirmoy Das, Dave Airlie,
	Chen Li, Lee Jones, Deepak R Varma, Kevin Wang, Luben Tuikov,
	Marek Olšák, Maarten Lankhorst, Andrey Grodzovsky,
	Dennis Li, Boris Brezillon, etnaviv, lima, linux-media,
	linaro-mm-sig

On 24/06/2021 15:00, Daniel Vetter wrote:
> Originally a job was only bound to the queue when we pushed this, but
> now that's done in drm_sched_job_init, making that parameter entirely
> redundant.
> 
> Remove it.
> 
> The same applies to the context parameter in
> lima_sched_context_queue_task, simplify that too.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Emma Anholt <emma@anholt.net>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  | 2 +-
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c  | 2 +-
>  drivers/gpu/drm/lima/lima_gem.c          | 3 +--
>  drivers/gpu/drm/lima/lima_sched.c        | 5 ++---
>  drivers/gpu/drm/lima/lima_sched.h        | 3 +--
>  drivers/gpu/drm/panfrost/panfrost_job.c  | 2 +-
>  drivers/gpu/drm/scheduler/sched_entity.c | 6 ++----
>  drivers/gpu/drm/v3d/v3d_gem.c            | 2 +-
>  include/drm/gpu_scheduler.h              | 3 +--
>  10 files changed, 12 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index a4ec092af9a7..18f63567fb69 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1267,7 +1267,7 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>  
>  	trace_amdgpu_cs_ioctl(job);
>  	amdgpu_vm_bo_trace_cs(&fpriv->vm, &p->ticket);
> -	drm_sched_entity_push_job(&job->base, entity);
> +	drm_sched_entity_push_job(&job->base);
>  
>  	amdgpu_vm_move_to_lru_tail(p->adev, &fpriv->vm);
>  
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index 5ddb955d2315..b8609cccc9c1 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -174,7 +174,7 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>  
>  	*f = dma_fence_get(&job->base.s_fence->finished);
>  	amdgpu_job_free_resources(job);
> -	drm_sched_entity_push_job(&job->base, entity);
> +	drm_sched_entity_push_job(&job->base);
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index af1671f01c7f..77995f190790 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -178,7 +178,7 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>  	/* the scheduler holds on to the job now */
>  	kref_get(&submit->refcount);
>  
> -	drm_sched_entity_push_job(&submit->sched_job, sched_entity);
> +	drm_sched_entity_push_job(&submit->sched_job);
>  
>  out_unlock:
>  	mutex_unlock(&submit->gpu->fence_lock);
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index de62966243cd..c528f40981bb 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -359,8 +359,7 @@ int lima_gem_submit(struct drm_file *file, struct lima_submit *submit)
>  			goto err_out2;
>  	}
>  
> -	fence = lima_sched_context_queue_task(
> -		submit->ctx->context + submit->pipe, submit->task);
> +	fence = lima_sched_context_queue_task(submit->task);
>  
>  	for (i = 0; i < submit->nr_bos; i++) {
>  		if (submit->bos[i].flags & LIMA_SUBMIT_BO_WRITE)
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index bd1af1fd8c0f..de7e71c42a69 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -177,13 +177,12 @@ void lima_sched_context_fini(struct lima_sched_pipe *pipe,
>  	drm_sched_entity_fini(&context->base);
>  }
>  
> -struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
> -						struct lima_sched_task *task)
> +struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
>  {
>  	struct dma_fence *fence = dma_fence_get(&task->base.s_fence->finished);
>  
>  	trace_lima_task_submit(task);
> -	drm_sched_entity_push_job(&task->base, &context->base);
> +	drm_sched_entity_push_job(&task->base);
>  	return fence;
>  }
>  
> diff --git a/drivers/gpu/drm/lima/lima_sched.h b/drivers/gpu/drm/lima/lima_sched.h
> index 90f03c48ef4a..ac70006b0e26 100644
> --- a/drivers/gpu/drm/lima/lima_sched.h
> +++ b/drivers/gpu/drm/lima/lima_sched.h
> @@ -98,8 +98,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
>  			    atomic_t *guilty);
>  void lima_sched_context_fini(struct lima_sched_pipe *pipe,
>  			     struct lima_sched_context *context);
> -struct dma_fence *lima_sched_context_queue_task(struct lima_sched_context *context,
> -						struct lima_sched_task *task);
> +struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task);
>  
>  int lima_sched_pipe_init(struct lima_sched_pipe *pipe, const char *name);
>  void lima_sched_pipe_fini(struct lima_sched_pipe *pipe);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 1e950534b9b0..2d01a670a4e8 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -257,7 +257,7 @@ int panfrost_job_push(struct panfrost_job *job)
>  
>  	kref_get(&job->refcount); /* put by scheduler job completion */
>  
> -	drm_sched_entity_push_job(&job->base, entity);
> +	drm_sched_entity_push_job(&job->base);
>  
>  	mutex_unlock(&pfdev->sched_lock);
>  
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index b6f72fafd504..2ab1b9e648f2 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -493,9 +493,7 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>  
>  /**
>   * drm_sched_entity_push_job - Submit a job to the entity's job queue
> - *
>   * @sched_job: job to submit
> - * @entity: scheduler entity
>   *
>   * Note: To guarantee that the order of insertion to queue matches the job's
>   * fence sequence number this function should be called with drm_sched_job_arm()
> @@ -503,9 +501,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>   *
>   * Returns 0 for success, negative error code otherwise.
>   */
> -void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
> -			       struct drm_sched_entity *entity)
> +void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
>  {
> +	struct drm_sched_entity *entity = sched_job->entity;
>  	bool first;
>  
>  	trace_drm_sched_job(sched_job, entity);
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 4eb354226972..ac608eb9b594 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -480,7 +480,7 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>  	/* put by scheduler job completion */
>  	kref_get(&job->refcount);
>  
> -	drm_sched_entity_push_job(&job->base, &v3d_priv->sched_entity[queue]);
> +	drm_sched_entity_push_job(&job->base);
>  
>  	return 0;
>  }
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index e4d7e1496296..55e0acf8015a 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -372,8 +372,7 @@ void drm_sched_entity_fini(struct drm_sched_entity *entity);
>  void drm_sched_entity_destroy(struct drm_sched_entity *entity);
>  void drm_sched_entity_select_rq(struct drm_sched_entity *entity);
>  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity);
> -void drm_sched_entity_push_job(struct drm_sched_job *sched_job,
> -			       struct drm_sched_entity *entity);
> +void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
>  void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>  				   enum drm_sched_priority priority);
>  bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 04/11] drm/panfrost: use scheduler dependency tracking
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:32     ` Steven Price
  -1 siblings, 0 replies; 57+ messages in thread
From: Steven Price @ 2021-06-24 14:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Daniel Vetter, Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig,
	Sumit Semwal, Christian König, linux-media, linaro-mm-sig

On 24/06/2021 15:00, Daniel Vetter wrote:
> Just deletes some code that's now more shared.
> 
> Note that thanks to the split into drm_sched_job_init/arm we can now
> easily pull the _init() part from under the submission lock way ahead
> where we're adding the sync file in-fences as dependencies.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 14 +++++++---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 37 +++----------------------
>  drivers/gpu/drm/panfrost/panfrost_job.h |  5 +---
>  3 files changed, 15 insertions(+), 41 deletions(-)

Nice!

Reviewed-by: Steven Price <steven.price@arm.com>
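
For readers following the init/arm split described in the commit message, here is a toy userspace model of the ordering. It is only a sketch: the model_* names, the integer entity id and the lock_held flag are all made up for illustration and are not the scheduler API. It shows the entity being bound at init time, dependencies being added both before and after arming, arming being refused without the submission lock, and the push step taking no entity argument.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states, for illustration only. */
enum model_job_state {
	MODEL_JOB_NEW,
	MODEL_JOB_INITIALIZED,
	MODEL_JOB_ARMED,
	MODEL_JOB_PUSHED,
};

/* Hypothetical stand-in for a scheduler job. */
struct model_job {
	enum model_job_state state;
	int entity;		/* non-negative id, bound once at init time */
	size_t num_deps;	/* number of recorded dependencies */
};

/* Step 1, outside the submission lock: bind the job to its entity. */
int model_job_init(struct model_job *job, int entity)
{
	assert(job != NULL);
	if (job->state != MODEL_JOB_NEW || entity < 0)
		return -EINVAL;
	job->entity = entity;
	job->num_deps = 0;
	job->state = MODEL_JOB_INITIALIZED;
	return 0;
}

/* Dependencies may be recorded any time between init and push. */
int model_job_await(struct model_job *job)
{
	if (job->state != MODEL_JOB_INITIALIZED &&
	    job->state != MODEL_JOB_ARMED)
		return -EINVAL;
	job->num_deps++;
	return 0;
}

/* Step 2: arming is only allowed while the submission lock is held. */
int model_job_arm(struct model_job *job, bool lock_held)
{
	if (!lock_held || job->state != MODEL_JOB_INITIALIZED)
		return -EINVAL;
	job->state = MODEL_JOB_ARMED;
	return 0;
}

/*
 * Step 3: push takes no entity argument, it uses the one bound at init.
 * Returns the entity id the job was queued on, or -EINVAL.
 */
int model_job_push(struct model_job *job)
{
	if (job->state != MODEL_JOB_ARMED)
		return -EINVAL;
	job->state = MODEL_JOB_PUSHED;
	return job->entity;
}
```

In this model a caller would run model_job_init(), add explicit dependencies with model_job_await(), then take the lock, call model_job_arm(), add any implicit dependencies, and finish with model_job_push().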

> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 1ffaef5ec5ff..79904f55c19f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -218,7 +218,7 @@ panfrost_copy_in_sync(struct drm_device *dev,
>  		if (ret)
>  			goto fail;
>  
> -		ret = drm_gem_fence_array_add(&job->deps, fence);
> +		ret = drm_sched_job_await_fence(&job->base, fence);
>  
>  		if (ret)
>  			goto fail;
> @@ -236,7 +236,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
>  	struct drm_panfrost_submit *args = data;
>  	struct drm_syncobj *sync_out = NULL;
>  	struct panfrost_job *job;
> -	int ret = 0;
> +	int ret = 0, slot;
>  
>  	if (!args->jc)
>  		return -EINVAL;
> @@ -258,14 +258,20 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
>  
>  	kref_init(&job->refcount);
>  
> -	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
> -
>  	job->pfdev = pfdev;
>  	job->jc = args->jc;
>  	job->requirements = args->requirements;
>  	job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
>  	job->file_priv = file->driver_priv;
>  
> +	slot = panfrost_job_get_slot(job);
> +
> +	ret = drm_sched_job_init(&job->base,
> +				 &job->file_priv->sched_entity[slot],
> +				 NULL);
> +	if (ret)
> +		goto fail_job;
> +
>  	ret = panfrost_copy_in_sync(dev, file, args, job);
>  	if (ret)
>  		goto fail_job;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 2d01a670a4e8..d097e52f8caa 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -109,7 +109,7 @@ static struct dma_fence *panfrost_fence_create(struct panfrost_device *pfdev, in
>  	return &fence->base;
>  }
>  
> -static int panfrost_job_get_slot(struct panfrost_job *job)
> +int panfrost_job_get_slot(struct panfrost_job *job)
>  {
>  	/* JS0: fragment jobs.
>  	 * JS1: vertex/tiler jobs
> @@ -198,13 +198,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>  
>  static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
>  					  int bo_count,
> -					  struct xarray *deps)
> +					  struct drm_sched_job *job)
>  {
>  	int i, ret;
>  
>  	for (i = 0; i < bo_count; i++) {
>  		/* panfrost always uses write mode in its current uapi */
> -		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
> +		ret = drm_sched_job_await_implicit(job, bos[i], true);
>  		if (ret)
>  			return ret;
>  	}
> @@ -225,31 +225,21 @@ static void panfrost_attach_object_fences(struct drm_gem_object **bos,
>  int panfrost_job_push(struct panfrost_job *job)
>  {
>  	struct panfrost_device *pfdev = job->pfdev;
> -	int slot = panfrost_job_get_slot(job);
> -	struct drm_sched_entity *entity = &job->file_priv->sched_entity[slot];
>  	struct ww_acquire_ctx acquire_ctx;
>  	int ret = 0;
>  
> -
>  	ret = drm_gem_lock_reservations(job->bos, job->bo_count,
>  					    &acquire_ctx);
>  	if (ret)
>  		return ret;
>  
>  	mutex_lock(&pfdev->sched_lock);
> -
> -	ret = drm_sched_job_init(&job->base, entity, NULL);
> -	if (ret) {
> -		mutex_unlock(&pfdev->sched_lock);
> -		goto unlock;
> -	}
> -
>  	drm_sched_job_arm(&job->base);
>  
>  	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>  
>  	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> -					     &job->deps);
> +					     &job->base);
>  	if (ret) {
>  		mutex_unlock(&pfdev->sched_lock);
>  		goto unlock;
> @@ -274,15 +264,8 @@ static void panfrost_job_cleanup(struct kref *ref)
>  {
>  	struct panfrost_job *job = container_of(ref, struct panfrost_job,
>  						refcount);
> -	struct dma_fence *fence;
> -	unsigned long index;
>  	unsigned int i;
>  
> -	xa_for_each(&job->deps, index, fence) {
> -		dma_fence_put(fence);
> -	}
> -	xa_destroy(&job->deps);
> -
>  	dma_fence_put(job->done_fence);
>  	dma_fence_put(job->render_done_fence);
>  
> @@ -321,17 +304,6 @@ static void panfrost_job_free(struct drm_sched_job *sched_job)
>  	panfrost_job_put(job);
>  }
>  
> -static struct dma_fence *panfrost_job_dependency(struct drm_sched_job *sched_job,
> -						 struct drm_sched_entity *s_entity)
> -{
> -	struct panfrost_job *job = to_panfrost_job(sched_job);
> -
> -	if (!xa_empty(&job->deps))
> -		return xa_erase(&job->deps, job->last_dep++);
> -
> -	return NULL;
> -}
> -
>  static struct dma_fence *panfrost_job_run(struct drm_sched_job *sched_job)
>  {
>  	struct panfrost_job *job = to_panfrost_job(sched_job);
> @@ -457,7 +429,6 @@ static enum drm_gpu_sched_stat panfrost_job_timedout(struct drm_sched_job
>  }
>  
>  static const struct drm_sched_backend_ops panfrost_sched_ops = {
> -	.dependency = panfrost_job_dependency,
>  	.run_job = panfrost_job_run,
>  	.timedout_job = panfrost_job_timedout,
>  	.free_job = panfrost_job_free
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
> index 82306a03b57e..77e6d0e6f612 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.h
> @@ -19,10 +19,6 @@ struct panfrost_job {
>  	struct panfrost_device *pfdev;
>  	struct panfrost_file_priv *file_priv;
>  
> -	/* Contains both explicit and implicit fences */
> -	struct xarray deps;
> -	unsigned long last_dep;
> -
>  	/* Fence to be signaled by IRQ handler when the job is complete. */
>  	struct dma_fence *done_fence;
>  
> @@ -42,6 +38,7 @@ int panfrost_job_init(struct panfrost_device *pfdev);
>  void panfrost_job_fini(struct panfrost_device *pfdev);
>  int panfrost_job_open(struct panfrost_file_priv *panfrost_priv);
>  void panfrost_job_close(struct panfrost_file_priv *panfrost_priv);
> +int panfrost_job_get_slot(struct panfrost_job *job);
>  int panfrost_job_push(struct panfrost_job *job);
>  void panfrost_job_put(struct panfrost_job *job);
>  void panfrost_job_enable_interrupts(struct panfrost_device *pfdev);
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] drm/sched: Add dependency tracking
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 14:39     ` Lucas Stach
  -1 siblings, 0 replies; 57+ messages in thread
From: Lucas Stach @ 2021-06-24 14:39 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Jack Zhang, David Airlie, linaro-mm-sig, Boris Brezillon,
	Alex Deucher, Daniel Vetter, linux-media, Lee Jones,
	Christian König, Luben Tuikov, Nirmoy Das

On Thursday, 24.06.2021 at 16:00 +0200, Daniel Vetter wrote:
> Instead of just a callback we can just glue in the gem helpers that
> panfrost, v3d and lima currently use. There's really not that many
> ways to skin this cat.
> 
> On the naming bikeshed: The idea for using _await_ to denote adding
> dependencies to a job comes from i915, where that's used quite
> extensively all over the place, in lots of datastructures.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
>  drivers/gpu/drm/scheduler/sched_main.c   | 103 +++++++++++++++++++++++
>  include/drm/gpu_scheduler.h              |  31 ++++++-
>  3 files changed, 146 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index f7347c284886..b6f72fafd504 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>  	job->sched->ops->free_job(job);
>  }
>  
> +static struct dma_fence *
> +drm_sched_job_dependency(struct drm_sched_job *job,
> +			 struct drm_sched_entity *entity)
> +{
> +	if (!xa_empty(&job->dependencies))
> +		return xa_erase(&job->dependencies, job->last_dependency++);

Not sure how much it buys us now that you dedup fences before adding
them to the xa, but we could potentially avoid some ping-pong
looping in the scheduler by checking whether the fence we are about to
return here is already signaled and skipping to the next one if so.
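
A minimal userspace sketch of that idea, not kernel code: mock_fence,
mock_job and mock_job_dependency() are hypothetical stand-ins for
struct dma_fence, the job's dependency xarray plus last_dependency, and
drm_sched_job_dependency(). It only models the "skip fences that have
already signaled" loop and assumes a plain array instead of an xarray.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence; only the signaled state matters. */
struct mock_fence {
	int id;
	bool signaled;
};

/* Hypothetical stand-in for job->dependencies and job->last_dependency. */
struct mock_job {
	struct mock_fence **deps;
	size_t num_deps;
	size_t last_dependency;
};

/*
 * Return the next dependency that has not signaled yet, or NULL once none
 * is left. Fences that have already signaled are consumed here instead of
 * being handed back, so the caller never has to loop over them.
 */
static struct mock_fence *mock_job_dependency(struct mock_job *job)
{
	assert(job != NULL);

	while (job->last_dependency < job->num_deps) {
		struct mock_fence *f = job->deps[job->last_dependency++];

		if (f && !f->signaled)
			return f;
		/* A kernel version would drop the fence reference at this point. */
	}

	return NULL;
}
```

With three dependencies where only the middle one is still pending, the
first call would hand back the middle fence and the second call would
return NULL, so the caller sees one dependency instead of three.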

Regards,
Lucas

> +
> +	if (job->sched->ops->dependency)
> +		return job->sched->ops->dependency(job, entity);
> +
> +	return NULL;
> +}
> +
>  /**
>   * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
>   *
> @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
>  		struct drm_sched_fence *s_fence = job->s_fence;
>  
>  		/* Wait for all dependencies to avoid data corruptions */
> -		while ((f = job->sched->ops->dependency(job, entity)))
> +		while ((f = drm_sched_job_dependency(job, entity)))
>  			dma_fence_wait(f, false);
>  
>  		drm_sched_fence_scheduled(s_fence);
> @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
>   */
>  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>  {
> -	struct drm_gpu_scheduler *sched = entity->rq->sched;
>  	struct drm_sched_job *sched_job;
>  
>  	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>  		return NULL;
>  
>  	while ((entity->dependency =
> -			sched->ops->dependency(sched_job, entity))) {
> +			drm_sched_job_dependency(sched_job, entity))) {
>  		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>  
>  		if (drm_sched_entity_add_dependency_cb(entity))
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 70eefed17e06..370c336d383f 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  
>  	INIT_LIST_HEAD(&job->list);
>  
> +	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL(drm_sched_job_init);
> @@ -626,6 +628,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>  }
>  EXPORT_SYMBOL(drm_sched_job_arm);
>  
> +/**
> + * drm_sched_job_await_fence - adds the fence as a job dependency
> + * @job: scheduler job to add the dependencies to
> + * @fence: the dma_fence to add to the list of dependencies.
> + *
> + * Note that @fence is consumed in both the success and error cases.
> + *
> + * Returns:
> + * 0 on success, or an error on failing to expand the array.
> + */
> +int drm_sched_job_await_fence(struct drm_sched_job *job,
> +			      struct dma_fence *fence)
> +{
> +	struct dma_fence *entry;
> +	unsigned long index;
> +	u32 id = 0;
> +	int ret;
> +
> +	if (!fence)
> +		return 0;
> +
> +	/* Deduplicate if we already depend on a fence from the same context.
> +	 * This lets the size of the array of deps scale with the number of
> +	 * engines involved, rather than the number of BOs.
> +	 */
> +	xa_for_each(&job->dependencies, index, entry) {
> +		if (entry->context != fence->context)
> +			continue;
> +
> +		if (dma_fence_is_later(fence, entry)) {
> +			dma_fence_put(entry);
> +			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
> +		} else {
> +			dma_fence_put(fence);
> +		}
> +		return 0;
> +	}
> +
> +	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
> +	if (ret != 0)
> +		dma_fence_put(fence);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_sched_job_await_fence);
> +
> +/**
> + * drm_sched_job_await_implicit - adds implicit dependencies as job dependencies
> + * @job: scheduler job to add the dependencies to
> + * @obj: the gem object to add new dependencies from.
> + * @write: whether the job might write the object (so we need to depend on
> + * shared fences in the reservation object).
> + *
> + * This should be called after drm_gem_lock_reservations() on your array of
> + * GEM objects used in the job but before updating the reservations with your
> + * own fences.
> + *
> + * Returns:
> + * 0 on success, or an error on failing to expand the array.
> + */
> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
> +				 struct drm_gem_object *obj,
> +				 bool write)
> +{
> +	int ret;
> +	struct dma_fence **fences;
> +	unsigned int i, fence_count;
> +
> +	if (!write) {
> +		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
> +
> +		return drm_sched_job_await_fence(job, fence);
> +	}
> +
> +	ret = dma_resv_get_fences(obj->resv, NULL, &fence_count, &fences);
> +	if (ret || !fence_count)
> +		return ret;
> +
> +	for (i = 0; i < fence_count; i++) {
> +		ret = drm_sched_job_await_fence(job, fences[i]);
> +		if (ret)
> +			break;
> +	}
> +
> +	for (; i < fence_count; i++)
> +		dma_fence_put(fences[i]);
> +	kfree(fences);
> +	return ret;
> +}
> +EXPORT_SYMBOL(drm_sched_job_await_implicit);
> +
> +
>  /**
>   * drm_sched_job_cleanup - clean up scheduler job resources
>   *
> @@ -633,8 +727,17 @@ EXPORT_SYMBOL(drm_sched_job_arm);
>   */
>  void drm_sched_job_cleanup(struct drm_sched_job *job)
>  {
> +	struct dma_fence *fence;
> +	unsigned long index;
> +
>  	dma_fence_put(&job->s_fence->finished);
>  	job->s_fence = NULL;
> +
> +	xa_for_each(&job->dependencies, index, fence) {
> +		dma_fence_put(fence);
> +	}
> +	xa_destroy(&job->dependencies);
> +
>  }
>  EXPORT_SYMBOL(drm_sched_job_cleanup);
>  
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index 80438d126c9d..e4d7e1496296 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -27,9 +27,12 @@
>  #include <drm/spsc_queue.h>
>  #include <linux/dma-fence.h>
>  #include <linux/completion.h>
> +#include <linux/xarray.h>
>  
>  #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>  
> +struct drm_gem_object;
> +
>  struct drm_gpu_scheduler;
>  struct drm_sched_rq;
>  
> @@ -198,6 +201,16 @@ struct drm_sched_job {
>  	enum drm_sched_priority		s_priority;
>  	struct drm_sched_entity         *entity;
>  	struct dma_fence_cb		cb;
> +	/**
> +	 * @dependencies:
> +	 *
> +	 * Contains the dependencies as struct dma_fence for this job, see
> +	 * drm_sched_job_await_fence() and drm_sched_job_await_implicit().
> +	 */
> +	struct xarray			dependencies;
> +
> +	/** @last_dependency: tracks @dependencies as they signal */
> +	unsigned long			last_dependency;
>  };
>  
>  static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
> @@ -220,9 +233,14 @@ enum drm_gpu_sched_stat {
>   */
>  struct drm_sched_backend_ops {
>  	/**
> -         * @dependency: Called when the scheduler is considering scheduling
> -         * this job next, to get another struct dma_fence for this job to
> -	 * block on.  Once it returns NULL, run_job() may be called.
> +	 * @dependency:
> +	 *
> +	 * Called when the scheduler is considering scheduling this job next, to
> +	 * get another struct dma_fence for this job to block on.  Once it
> +	 * returns NULL, run_job() may be called.
> +	 *
> +	 * If a driver exclusively uses drm_sched_job_await_fence() and
> +	 * drm_sched_job_await_implicit() this can be omitted and left as NULL.
>  	 */
>  	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>  					struct drm_sched_entity *s_entity);
> @@ -314,6 +332,13 @@ int drm_sched_job_init(struct drm_sched_job *job,
>  		       struct drm_sched_entity *entity,
>  		       void *owner);
>  void drm_sched_job_arm(struct drm_sched_job *job);
> +int drm_sched_job_await_fence(struct drm_sched_job *job,
> +			      struct dma_fence *fence);
> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
> +				 struct drm_gem_object *obj,
> +				 bool write);
> +
> +
>  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>  				    struct drm_gpu_scheduler **sched_list,
>                                     unsigned int num_sched_list);



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 14:00 ` [PATCH 10/11] drm/scheduler: Don't store self-dependencies Daniel Vetter
@ 2021-06-24 14:42   ` Lucas Stach
  2021-06-24 17:03   ` Christian König
  1 sibling, 0 replies; 57+ messages in thread
From: Lucas Stach @ 2021-06-24 14:42 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Alex Deucher, Daniel Vetter, Jack Zhang, Luben Tuikov,
	Christian König

On Thursday, 24.06.2021 at 16:00 +0200, Daniel Vetter wrote:
> This is essentially part of drm_sched_dependency_optimized(), which
> only amdgpu seems to make use of. Use it a bit more.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 370c336d383f..c31d7cf7df74 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
>  	if (!fence)
>  		return 0;
>  
> +	/* if it's a fence from us it's guaranteed to be earlier */
> +	if (fence->context == job->entity->fence_context ||
> +	    fence->context == job->entity->fence_context + 1) {
> +		dma_fence_put(fence);
> +		return 0;
> +	}
> +
>  	/* Deduplicate if we already depend on a fence from the same context.
>  	 * This lets the size of the array of deps scale with the number of
>  	 * engines involved, rather than the number of BOs.



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] drm/sched: Add dependency tracking
  2021-06-24 14:39     ` Lucas Stach
@ 2021-06-24 15:26       ` Daniel Vetter
  -1 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 15:26 UTC (permalink / raw)
  To: Lucas Stach
  Cc: Daniel Vetter, DRI Development, Jack Zhang, David Airlie,
	linaro-mm-sig, Boris Brezillon, Alex Deucher, Daniel Vetter,
	linux-media, Lee Jones, Christian König, Luben Tuikov,
	Nirmoy Das

On Thu, Jun 24, 2021 at 04:39:09PM +0200, Lucas Stach wrote:
> On Thursday, 24.06.2021 at 16:00 +0200, Daniel Vetter wrote:
> > Instead of just a callback we can just glue in the gem helpers that
> > panfrost, v3d and lima currently use. There's really not that many
> > ways to skin this cat.
> > 
> > On the naming bikeshed: The idea for using _await_ to denote adding
> > dependencies to a job comes from i915, where that's used quite
> > extensively all over the place, in lots of datastructures.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Cc: Lee Jones <lee.jones@linaro.org>
> > Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
> > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
> >  drivers/gpu/drm/scheduler/sched_main.c   | 103 +++++++++++++++++++++++
> >  include/drm/gpu_scheduler.h              |  31 ++++++-
> >  3 files changed, 146 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index f7347c284886..b6f72fafd504 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
> >  	job->sched->ops->free_job(job);
> >  }
> >  
> > +static struct dma_fence *
> > +drm_sched_job_dependency(struct drm_sched_job *job,
> > +			 struct drm_sched_entity *entity)
> > +{
> > +	if (!xa_empty(&job->dependencies))
> > +		return xa_erase(&job->dependencies, job->last_dependency++);
> 
> > Not sure how much it buys us now that you dedup fences before adding
> > them to the xa, but we could potentially avoid some ping-pong
> > looping in the scheduler by checking whether the fence we are about to
> > return here is already signaled and skipping to the next one if so.

I think there are quite a few tricks you can play if the scheduler has
more direct visibility into all (or at least most) of the fences. I'm
really just trying to establish the drm_sched_job_await API so there's a
notch more structure to the dependency handling.

IOW, bikesheds on the interface and testing of the patches are what I'm
looking for :-)
-Daniel

> 
> Regards,
> Lucas
> 
> > +
> > +	if (job->sched->ops->dependency)
> > +		return job->sched->ops->dependency(job, entity);
> > +
> > +	return NULL;
> > +}
> > +
> >  /**
> >   * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
> >   *
> > @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
> >  		struct drm_sched_fence *s_fence = job->s_fence;
> >  
> >  		/* Wait for all dependencies to avoid data corruptions */
> > -		while ((f = job->sched->ops->dependency(job, entity)))
> > +		while ((f = drm_sched_job_dependency(job, entity)))
> >  			dma_fence_wait(f, false);
> >  
> >  		drm_sched_fence_scheduled(s_fence);
> > @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
> >   */
> >  struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> >  {
> > -	struct drm_gpu_scheduler *sched = entity->rq->sched;
> >  	struct drm_sched_job *sched_job;
> >  
> >  	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
> > @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
> >  		return NULL;
> >  
> >  	while ((entity->dependency =
> > -			sched->ops->dependency(sched_job, entity))) {
> > +			drm_sched_job_dependency(sched_job, entity))) {
> >  		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
> >  
> >  		if (drm_sched_entity_add_dependency_cb(entity))
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 70eefed17e06..370c336d383f 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >  
> >  	INIT_LIST_HEAD(&job->list);
> >  
> > +	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
> > +
> >  	return 0;
> >  }
> >  EXPORT_SYMBOL(drm_sched_job_init);
> > @@ -626,6 +628,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
> >  }
> >  EXPORT_SYMBOL(drm_sched_job_arm);
> >  
> > +/**
> > + * drm_sched_job_await_fence - adds the fence as a job dependency
> > + * @job: scheduler job to add the dependencies to
> > + * @fence: the dma_fence to add to the list of dependencies.
> > + *
> > + * Note that @fence is consumed in both the success and error cases.
> > + *
> > + * Returns:
> > + * 0 on success, or an error on failing to expand the array.
> > + */
> > +int drm_sched_job_await_fence(struct drm_sched_job *job,
> > +			      struct dma_fence *fence)
> > +{
> > +	struct dma_fence *entry;
> > +	unsigned long index;
> > +	u32 id = 0;
> > +	int ret;
> > +
> > +	if (!fence)
> > +		return 0;
> > +
> > +	/* Deduplicate if we already depend on a fence from the same context.
> > +	 * This lets the size of the array of deps scale with the number of
> > +	 * engines involved, rather than the number of BOs.
> > +	 */
> > +	xa_for_each(&job->dependencies, index, entry) {
> > +		if (entry->context != fence->context)
> > +			continue;
> > +
> > +		if (dma_fence_is_later(fence, entry)) {
> > +			dma_fence_put(entry);
> > +			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
> > +		} else {
> > +			dma_fence_put(fence);
> > +		}
> > +		return 0;
> > +	}
> > +
> > +	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
> > +	if (ret != 0)
> > +		dma_fence_put(fence);
> > +
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(drm_sched_job_await_fence);
> > +
> > +/**
> > + * drm_sched_job_await_implicit - adds implicit dependencies as job dependencies
> > + * @job: scheduler job to add the dependencies to
> > + * @obj: the gem object to add new dependencies from.
> > + * @write: whether the job might write the object (so we need to depend on
> > + * shared fences in the reservation object).
> > + *
> > + * This should be called after drm_gem_lock_reservations() on your array of
> > + * GEM objects used in the job but before updating the reservations with your
> > + * own fences.
> > + *
> > + * Returns:
> > + * 0 on success, or an error on failing to expand the array.
> > + */
> > +int drm_sched_job_await_implicit(struct drm_sched_job *job,
> > +				 struct drm_gem_object *obj,
> > +				 bool write)
> > +{
> > +	int ret;
> > +	struct dma_fence **fences;
> > +	unsigned int i, fence_count;
> > +
> > +	if (!write) {
> > +		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
> > +
> > +		return drm_sched_job_await_fence(job, fence);
> > +	}
> > +
> > +	ret = dma_resv_get_fences(obj->resv, NULL, &fence_count, &fences);
> > +	if (ret || !fence_count)
> > +		return ret;
> > +
> > +	for (i = 0; i < fence_count; i++) {
> > +		ret = drm_sched_job_await_fence(job, fences[i]);
> > +		if (ret)
> > +			break;
> > +	}
> > +
> > +	for (; i < fence_count; i++)
> > +		dma_fence_put(fences[i]);
> > +	kfree(fences);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(drm_sched_job_await_implicit);
> > +
> > +
> >  /**
> >   * drm_sched_job_cleanup - clean up scheduler job resources
> >   *
> > @@ -633,8 +727,17 @@ EXPORT_SYMBOL(drm_sched_job_arm);
> >   */
> >  void drm_sched_job_cleanup(struct drm_sched_job *job)
> >  {
> > +	struct dma_fence *fence;
> > +	unsigned long index;
> > +
> >  	dma_fence_put(&job->s_fence->finished);
> >  	job->s_fence = NULL;
> > +
> > +	xa_for_each(&job->dependencies, index, fence) {
> > +		dma_fence_put(fence);
> > +	}
> > +	xa_destroy(&job->dependencies);
> > +
> >  }
> >  EXPORT_SYMBOL(drm_sched_job_cleanup);
> >  
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index 80438d126c9d..e4d7e1496296 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -27,9 +27,12 @@
> >  #include <drm/spsc_queue.h>
> >  #include <linux/dma-fence.h>
> >  #include <linux/completion.h>
> > +#include <linux/xarray.h>
> >  
> >  #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
> >  
> > +struct drm_gem_object;
> > +
> >  struct drm_gpu_scheduler;
> >  struct drm_sched_rq;
> >  
> > @@ -198,6 +201,16 @@ struct drm_sched_job {
> >  	enum drm_sched_priority		s_priority;
> >  	struct drm_sched_entity         *entity;
> >  	struct dma_fence_cb		cb;
> > +	/**
> > +	 * @dependencies:
> > +	 *
> > +	 * Contains the dependencies as struct dma_fence for this job, see
> > +	 * drm_sched_job_await_fence() and drm_sched_job_await_implicit().
> > +	 */
> > +	struct xarray			dependencies;
> > +
> > +	/** @last_dependency: tracks @dependencies as they signal */
> > +	unsigned long			last_dependency;
> >  };
> >  
> >  static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
> > @@ -220,9 +233,14 @@ enum drm_gpu_sched_stat {
> >   */
> >  struct drm_sched_backend_ops {
> >  	/**
> > -         * @dependency: Called when the scheduler is considering scheduling
> > -         * this job next, to get another struct dma_fence for this job to
> > -	 * block on.  Once it returns NULL, run_job() may be called.
> > +	 * @dependency:
> > +	 *
> > +	 * Called when the scheduler is considering scheduling this job next, to
> > +	 * get another struct dma_fence for this job to block on.  Once it
> > +	 * returns NULL, run_job() may be called.
> > +	 *
> > +	 * If a driver exclusively uses drm_sched_job_await_fence() and
> > +	 * drm_sched_job_await_implicit() this can be omitted and left as NULL.
> >  	 */
> >  	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
> >  					struct drm_sched_entity *s_entity);
> > @@ -314,6 +332,13 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >  		       struct drm_sched_entity *entity,
> >  		       void *owner);
> >  void drm_sched_job_arm(struct drm_sched_job *job);
> > +int drm_sched_job_await_fence(struct drm_sched_job *job,
> > +			      struct dma_fence *fence);
> > +int drm_sched_job_await_implicit(struct drm_sched_job *job,
> > +				 struct drm_gem_object *obj,
> > +				 bool write);
> > +
> > +
> >  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >  				    struct drm_gpu_scheduler **sched_list,
> >                                     unsigned int num_sched_list);
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init
  2021-06-24 14:00 ` [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init Daniel Vetter
@ 2021-06-24 16:59   ` Emma Anholt
  2021-06-24 17:24     ` Daniel Vetter
  2021-06-24 20:45   ` [PATCH] " Daniel Vetter
  1 sibling, 1 reply; 57+ messages in thread
From: Emma Anholt @ 2021-06-24 16:59 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development

On Thu, Jun 24, 2021 at 7:00 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Prep work for using the scheduler dependency handling.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Emma Anholt <emma@anholt.net>

Back when I wrote this, I think there were rules that there had to be
no failure paths between a job_init and a push.  Has that changed?

I really don't have the context to evaluate this. I'm not sure what the
new "scheduler dependency handling" is, given that there was already
something that I considered to be dependency handling!

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] drm/sched: Add dependency tracking
  2021-06-24 14:39     ` Lucas Stach
@ 2021-06-24 16:59       ` Christian König
  -1 siblings, 0 replies; 57+ messages in thread
From: Christian König @ 2021-06-24 16:59 UTC (permalink / raw)
  To: Lucas Stach, Daniel Vetter, DRI Development
  Cc: Jack Zhang, David Airlie, linaro-mm-sig, Boris Brezillon,
	Alex Deucher, Daniel Vetter, linux-media, Lee Jones,
	Luben Tuikov, Nirmoy Das

Am 24.06.21 um 16:39 schrieb Lucas Stach:
> Am Donnerstag, dem 24.06.2021 um 16:00 +0200 schrieb Daniel Vetter:
>> Instead of just a callback we can just glue in the gem helpers that
>> panfrost, v3d and lima currently use. There's really not that many
>> ways to skin this cat.
>>
>> On the naming bikeshed: The idea for using _await_ to denote adding
>> dependencies to a job comes from i915, where that's used quite
>> extensively all over the place, in lots of datastructures.
>>
>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: David Airlie <airlied@linux.ie>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>> Cc: "Christian König" <christian.koenig@amd.com>
>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Cc: Lee Jones <lee.jones@linaro.org>
>> Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>> Cc: linux-media@vger.kernel.org
>> Cc: linaro-mm-sig@lists.linaro.org
>> ---
>>   drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
>>   drivers/gpu/drm/scheduler/sched_main.c   | 103 +++++++++++++++++++++++
>>   include/drm/gpu_scheduler.h              |  31 ++++++-
>>   3 files changed, 146 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>> index f7347c284886..b6f72fafd504 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>>   	job->sched->ops->free_job(job);
>>   }
>>   
>> +static struct dma_fence *
>> +drm_sched_job_dependency(struct drm_sched_job *job,
>> +			 struct drm_sched_entity *entity)
>> +{
>> +	if (!xa_empty(&job->dependencies))
>> +		return xa_erase(&job->dependencies, job->last_dependency++);
> Not sure how much it buys us now that you dedup fences before adding
> them to the xa, but we could avoid potentially avoid some ping-pong
> looping in the scheduler by checking if the fence we are about to
> return here is already signaled and skipping to the next one if so.

You absolutely need this, especially for TTM-based drivers, since you
basically need to add all the fences from all the BOs in your relocation
list.
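
To make the suggestion concrete, here is a minimal userspace sketch of
the "skip already-signaled fences" idea. It is not the scheduler code:
toy_fence, toy_job and both helpers are hypothetical stand-ins, and a
fixed array replaces the xarray. In the kernel the check would presumably
be dma_fence_is_signaled(), and a skipped fence would also need its
reference dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence: only the signaled state. */
struct toy_fence {
	bool signaled;
};

#define TOY_MAX_DEPS 8

/* Hypothetical stand-in for the per-job dependency container. */
struct toy_job {
	struct toy_fence *deps[TOY_MAX_DEPS];
	size_t count;	/* number of stored dependencies */
	size_t next;	/* plays the role of last_dependency */
};

/* Append a dependency; the toy container has a fixed capacity. */
static void toy_job_add(struct toy_job *job, struct toy_fence *fence)
{
	assert(job->count < TOY_MAX_DEPS);
	job->deps[job->count++] = fence;
}

/*
 * Return the next dependency the job still has to wait for, or NULL when
 * none is left. Fences that are already signaled are consumed and
 * skipped, so the caller never has to install a callback on them.
 */
static struct toy_fence *toy_job_next_dependency(struct toy_job *job)
{
	while (job->next < job->count) {
		struct toy_fence *fence = job->deps[job->next++];

		if (!fence->signaled)
			return fence;
	}
	return NULL;
}
```

With three dependencies of which only the middle one is unsignaled, the
first call returns that middle fence and the second call returns NULL,
so the caller loops once instead of three times.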

When I initially implemented the dependency handling I tried multiple
approaches, including something similar to the one here. Not sure how
good the performance will be, but I think we can switch to something
more complicated rather easily if we find that it doesn't work as
expected.
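
For reference, the approach in the quoted patch keeps at most one fence
per context and replaces it when a later one shows up. A rough userspace
model of just that rule follows. All names here (toy_fence, toy_deps,
toy_deps_add) are made up for illustration, a fixed array stands in for
the xarray, and a plain seqno comparison stands in for
dma_fence_is_later(); reference counting is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dma_fence. */
struct toy_fence {
	uint64_t context;	/* timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline */
};

#define TOY_MAX_DEPS 8

/* Hypothetical stand-in for the job's dependency container. */
struct toy_deps {
	struct toy_fence slots[TOY_MAX_DEPS];
	size_t count;
};

/*
 * Add a dependency, keeping only the latest fence per context.
 * Returns true on success, false if the fixed-size array is full.
 */
static bool toy_deps_add(struct toy_deps *deps, struct toy_fence fence)
{
	assert(deps != NULL);

	for (size_t i = 0; i < deps->count; i++) {
		if (deps->slots[i].context != fence.context)
			continue;

		/* Same timeline: the later fence implies the earlier one. */
		if (fence.seqno > deps->slots[i].seqno)
			deps->slots[i] = fence;
		return true;
	}

	if (deps->count == TOY_MAX_DEPS)
		return false;

	deps->slots[deps->count++] = fence;
	return true;
}
```

The number of stored entries then grows with the number of distinct
contexts rather than with the number of fences handed in, which is the
property the comment in the patch describes.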

One unresolved problem is that we need to track the last fence we 
optimized by looking at the scheduler instance. This is necessary since 
Vulkan dependencies don't work correctly otherwise.

Amdgpu currently has a rather awkward workaround for that.

But in general it looks like the right thing to do.

Regards,
Christian.


>
> Regards,
> Lucas
>
>> +
>> +	if (job->sched->ops->dependency)
>> +		return job->sched->ops->dependency(job, entity);
>> +
>> +	return NULL;
>> +}
>> +
>>   /**
>>    * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
>>    *
>> @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
>>   		struct drm_sched_fence *s_fence = job->s_fence;
>>   
>>   		/* Wait for all dependencies to avoid data corruptions */
>> -		while ((f = job->sched->ops->dependency(job, entity)))
>> +		while ((f = drm_sched_job_dependency(job, entity)))
>>   			dma_fence_wait(f, false);
>>   
>>   		drm_sched_fence_scheduled(s_fence);
>> @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
>>    */
>>   struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>   {
>> -	struct drm_gpu_scheduler *sched = entity->rq->sched;
>>   	struct drm_sched_job *sched_job;
>>   
>>   	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
>> @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>   		return NULL;
>>   
>>   	while ((entity->dependency =
>> -			sched->ops->dependency(sched_job, entity))) {
>> +			drm_sched_job_dependency(sched_job, entity))) {
>>   		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>>   
>>   		if (drm_sched_entity_add_dependency_cb(entity))
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 70eefed17e06..370c336d383f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>   
>>   	INIT_LIST_HEAD(&job->list);
>>   
>> +	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
>> +
>>   	return 0;
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_init);
>> @@ -626,6 +628,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_arm);
>>   
>> +/**
>> + * drm_sched_job_await_fence - adds the fence as a job dependency
>> + * @job: scheduler job to add the dependencies to
>> + * @fence: the dma_fence to add to the list of dependencies.
>> + *
>> + * Note that @fence is consumed in both the success and error cases.
>> + *
>> + * Returns:
>> + * 0 on success, or an error on failing to expand the array.
>> + */
>> +int drm_sched_job_await_fence(struct drm_sched_job *job,
>> +			      struct dma_fence *fence)
>> +{
>> +	struct dma_fence *entry;
>> +	unsigned long index;
>> +	u32 id = 0;
>> +	int ret;
>> +
>> +	if (!fence)
>> +		return 0;
>> +
>> +	/* Deduplicate if we already depend on a fence from the same context.
>> +	 * This lets the size of the array of deps scale with the number of
>> +	 * engines involved, rather than the number of BOs.
>> +	 */
>> +	xa_for_each(&job->dependencies, index, entry) {
>> +		if (entry->context != fence->context)
>> +			continue;
>> +
>> +		if (dma_fence_is_later(fence, entry)) {
>> +			dma_fence_put(entry);
>> +			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
>> +		} else {
>> +			dma_fence_put(fence);
>> +		}
>> +		return 0;
>> +	}
>> +
>> +	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
>> +	if (ret != 0)
>> +		dma_fence_put(fence);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(drm_sched_job_await_fence);
>> +
>> +/**
>> + * drm_sched_job_await_implicit - adds implicit dependencies as job dependencies
>> + * @job: scheduler job to add the dependencies to
>> + * @obj: the gem object to add new dependencies from.
>> + * @write: whether the job might write the object (so we need to depend on
>> + * shared fences in the reservation object).
>> + *
>> + * This should be called after drm_gem_lock_reservations() on your array of
>> + * GEM objects used in the job but before updating the reservations with your
>> + * own fences.
>> + *
>> + * Returns:
>> + * 0 on success, or an error on failing to expand the array.
>> + */
>> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
>> +				 struct drm_gem_object *obj,
>> +				 bool write)
>> +{
>> +	int ret;
>> +	struct dma_fence **fences;
>> +	unsigned int i, fence_count;
>> +
>> +	if (!write) {
>> +		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
>> +
>> +		return drm_sched_job_await_fence(job, fence);
>> +	}
>> +
>> +	ret = dma_resv_get_fences(obj->resv, NULL, &fence_count, &fences);
>> +	if (ret || !fence_count)
>> +		return ret;
>> +
>> +	for (i = 0; i < fence_count; i++) {
>> +		ret = drm_sched_job_await_fence(job, fences[i]);
>> +		if (ret)
>> +			break;
>> +	}
>> +
>> +	for (; i < fence_count; i++)
>> +		dma_fence_put(fences[i]);
>> +	kfree(fences);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(drm_sched_job_await_implicit);
>> +
>> +
>>   /**
>>    * drm_sched_job_cleanup - clean up scheduler job resources
>>    *
>> @@ -633,8 +727,17 @@ EXPORT_SYMBOL(drm_sched_job_arm);
>>    */
>>   void drm_sched_job_cleanup(struct drm_sched_job *job)
>>   {
>> +	struct dma_fence *fence;
>> +	unsigned long index;
>> +
>>   	dma_fence_put(&job->s_fence->finished);
>>   	job->s_fence = NULL;
>> +
>> +	xa_for_each(&job->dependencies, index, fence) {
>> +		dma_fence_put(fence);
>> +	}
>> +	xa_destroy(&job->dependencies);
>> +
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_cleanup);
>>   
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 80438d126c9d..e4d7e1496296 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -27,9 +27,12 @@
>>   #include <drm/spsc_queue.h>
>>   #include <linux/dma-fence.h>
>>   #include <linux/completion.h>
>> +#include <linux/xarray.h>
>>   
>>   #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>>   
>> +struct drm_gem_object;
>> +
>>   struct drm_gpu_scheduler;
>>   struct drm_sched_rq;
>>   
>> @@ -198,6 +201,16 @@ struct drm_sched_job {
>>   	enum drm_sched_priority		s_priority;
>>   	struct drm_sched_entity         *entity;
>>   	struct dma_fence_cb		cb;
>> +	/**
>> +	 * @dependencies:
>> +	 *
>> +	 * Contains the dependencies as struct dma_fence for this job, see
>> +	 * drm_sched_job_await_fence() and drm_sched_job_await_implicit().
>> +	 */
>> +	struct xarray			dependencies;
>> +
>> +	/** @last_dependency: tracks @dependencies as they signal */
>> +	unsigned long			last_dependency;
>>   };
>>   
>>   static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>> @@ -220,9 +233,14 @@ enum drm_gpu_sched_stat {
>>    */
>>   struct drm_sched_backend_ops {
>>   	/**
>> -         * @dependency: Called when the scheduler is considering scheduling
>> -         * this job next, to get another struct dma_fence for this job to
>> -	 * block on.  Once it returns NULL, run_job() may be called.
>> +	 * @dependency:
>> +	 *
>> +	 * Called when the scheduler is considering scheduling this job next, to
>> +	 * get another struct dma_fence for this job to block on.  Once it
>> +	 * returns NULL, run_job() may be called.
>> +	 *
>> +	 * If a driver exclusively uses drm_sched_job_await_fence() and
>> +	 * drm_sched_job_await_implicit() this can be ommitted and left as NULL.
>>   	 */
>>   	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>>   					struct drm_sched_entity *s_entity);
>> @@ -314,6 +332,13 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>   		       struct drm_sched_entity *entity,
>>   		       void *owner);
>>   void drm_sched_job_arm(struct drm_sched_job *job);
>> +int drm_sched_job_await_fence(struct drm_sched_job *job,
>> +			      struct dma_fence *fence);
>> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
>> +				 struct drm_gem_object *obj,
>> +				 bool write);
>> +
>> +
>>   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>   				    struct drm_gpu_scheduler **sched_list,
>>                                      unsigned int num_sched_list);
>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 02/11] drm/sched: Add dependency tracking
@ 2021-06-24 16:59       ` Christian König
  0 siblings, 0 replies; 57+ messages in thread
From: Christian König @ 2021-06-24 16:59 UTC (permalink / raw)
  To: Lucas Stach, Daniel Vetter, DRI Development
  Cc: Luben Tuikov, Jack Zhang, David Airlie, linaro-mm-sig,
	Boris Brezillon, Alex Deucher, Daniel Vetter, Nirmoy Das,
	Lee Jones, linux-media

Am 24.06.21 um 16:39 schrieb Lucas Stach:
> Am Donnerstag, dem 24.06.2021 um 16:00 +0200 schrieb Daniel Vetter:
>> Instead of just a callback we can just glue in the gem helpers that
>> panfrost, v3d and lima currently use. There's really not that many
>> ways to skin this cat.
>>
>> On the naming bikeshed: The idea for using _await_ to denote adding
>> dependencies to a job comes from i915, where that's used quite
>> extensively all over the place, in lots of datastructures.
>>
>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>> Cc: David Airlie <airlied@linux.ie>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>> Cc: "Christian König" <christian.koenig@amd.com>
>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>> Cc: Lee Jones <lee.jones@linaro.org>
>> Cc: Nirmoy Das <nirmoy.aiemd@gmail.com>
>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>> Cc: Alex Deucher <alexander.deucher@amd.com>
>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>> Cc: linux-media@vger.kernel.org
>> Cc: linaro-mm-sig@lists.linaro.org
>> ---
>>   drivers/gpu/drm/scheduler/sched_entity.c |  18 +++-
>>   drivers/gpu/drm/scheduler/sched_main.c   | 103 +++++++++++++++++++++++
>>   include/drm/gpu_scheduler.h              |  31 ++++++-
>>   3 files changed, 146 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>> index f7347c284886..b6f72fafd504 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -211,6 +211,19 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
>>   	job->sched->ops->free_job(job);
>>   }
>>   
>> +static struct dma_fence *
>> +drm_sched_job_dependency(struct drm_sched_job *job,
>> +			 struct drm_sched_entity *entity)
>> +{
>> +	if (!xa_empty(&job->dependencies))
>> +		return xa_erase(&job->dependencies, job->last_dependency++);
> Not sure how much it buys us now that you dedup fences before adding
> them to the xa, but we could avoid potentially avoid some ping-pong
> looping in the scheduler by checking if the fence we are about to
> return here is already signaled and skipping to the next one if so.

You absolutely need this, especially for TTM-based drivers, since you
basically need to add all the fences from all the BOs in your relocation
list.

When I initially implemented the dependency handling I tried multiple
approaches, including something similar to the one here. Not sure how
good the performance will be, but I think we can switch to something
more complicated rather easily if we find that it doesn't work as
expected.

One unresolved problem is that we need to track the last fence we 
optimized by looking at the scheduler instance. This is necessary since 
Vulkan dependencies don't work correctly otherwise.

Amdgpu currently has a rather awkward workaround for that.

But in general it looks like the right thing to do.

Regards,
Christian.


>
> Regards,
> Lucas
>
>> +
>> +	if (job->sched->ops->dependency)
>> +		return job->sched->ops->dependency(job, entity);
>> +
>> +	return NULL;
>> +}
>> +
>>   /**
>>    * drm_sched_entity_kill_jobs - Make sure all remaining jobs are killed
>>    *
>> @@ -229,7 +242,7 @@ static void drm_sched_entity_kill_jobs(struct drm_sched_entity *entity)
>>   		struct drm_sched_fence *s_fence = job->s_fence;
>>   
>>   		/* Wait for all dependencies to avoid data corruptions */
>> -		while ((f = job->sched->ops->dependency(job, entity)))
>> +		while ((f = drm_sched_job_dependency(job, entity)))
>>   			dma_fence_wait(f, false);
>>   
>>   		drm_sched_fence_scheduled(s_fence);
>> @@ -419,7 +432,6 @@ static bool drm_sched_entity_add_dependency_cb(struct drm_sched_entity *entity)
>>    */
>>   struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>   {
>> -	struct drm_gpu_scheduler *sched = entity->rq->sched;
>>   	struct drm_sched_job *sched_job;
>>   
>>   	sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
>> @@ -427,7 +439,7 @@ struct drm_sched_job *drm_sched_entity_pop_job(struct drm_sched_entity *entity)
>>   		return NULL;
>>   
>>   	while ((entity->dependency =
>> -			sched->ops->dependency(sched_job, entity))) {
>> +			drm_sched_job_dependency(sched_job, entity))) {
>>   		trace_drm_sched_job_wait_dep(sched_job, entity->dependency);
>>   
>>   		if (drm_sched_entity_add_dependency_cb(entity))
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>> index 70eefed17e06..370c336d383f 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -603,6 +603,8 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>   
>>   	INIT_LIST_HEAD(&job->list);
>>   
>> +	xa_init_flags(&job->dependencies, XA_FLAGS_ALLOC);
>> +
>>   	return 0;
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_init);
>> @@ -626,6 +628,98 @@ void drm_sched_job_arm(struct drm_sched_job *job)
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_arm);
>>   
>> +/**
>> + * drm_sched_job_await_fence - adds the fence as a job dependency
>> + * @job: scheduler job to add the dependencies to
>> + * @fence: the dma_fence to add to the list of dependencies.
>> + *
>> + * Note that @fence is consumed in both the success and error cases.
>> + *
>> + * Returns:
>> + * 0 on success, or an error on failing to expand the array.
>> + */
>> +int drm_sched_job_await_fence(struct drm_sched_job *job,
>> +			      struct dma_fence *fence)
>> +{
>> +	struct dma_fence *entry;
>> +	unsigned long index;
>> +	u32 id = 0;
>> +	int ret;
>> +
>> +	if (!fence)
>> +		return 0;
>> +
>> +	/* Deduplicate if we already depend on a fence from the same context.
>> +	 * This lets the size of the array of deps scale with the number of
>> +	 * engines involved, rather than the number of BOs.
>> +	 */
>> +	xa_for_each(&job->dependencies, index, entry) {
>> +		if (entry->context != fence->context)
>> +			continue;
>> +
>> +		if (dma_fence_is_later(fence, entry)) {
>> +			dma_fence_put(entry);
>> +			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
>> +		} else {
>> +			dma_fence_put(fence);
>> +		}
>> +		return 0;
>> +	}
>> +
>> +	ret = xa_alloc(&job->dependencies, &id, fence, xa_limit_32b, GFP_KERNEL);
>> +	if (ret != 0)
>> +		dma_fence_put(fence);
>> +
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(drm_sched_job_await_fence);
>> +
>> +/**
>> + * drm_sched_job_await_implicit - adds implicit dependencies as job dependencies
>> + * @job: scheduler job to add the dependencies to
>> + * @obj: the gem object to add new dependencies from.
>> + * @write: whether the job might write the object (so we need to depend on
>> + * shared fences in the reservation object).
>> + *
>> + * This should be called after drm_gem_lock_reservations() on your array of
>> + * GEM objects used in the job but before updating the reservations with your
>> + * own fences.
>> + *
>> + * Returns:
>> + * 0 on success, or an error on failing to expand the array.
>> + */
>> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
>> +				 struct drm_gem_object *obj,
>> +				 bool write)
>> +{
>> +	int ret;
>> +	struct dma_fence **fences;
>> +	unsigned int i, fence_count;
>> +
>> +	if (!write) {
>> +		struct dma_fence *fence = dma_resv_get_excl_unlocked(obj->resv);
>> +
>> +		return drm_sched_job_await_fence(job, fence);
>> +	}
>> +
>> +	ret = dma_resv_get_fences(obj->resv, NULL, &fence_count, &fences);
>> +	if (ret || !fence_count)
>> +		return ret;
>> +
>> +	for (i = 0; i < fence_count; i++) {
>> +		ret = drm_sched_job_await_fence(job, fences[i]);
>> +		if (ret)
>> +			break;
>> +	}
>> +
>> +	for (; i < fence_count; i++)
>> +		dma_fence_put(fences[i]);
>> +	kfree(fences);
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(drm_sched_job_await_implicit);
>> +
>> +
>>   /**
>>    * drm_sched_job_cleanup - clean up scheduler job resources
>>    *
>> @@ -633,8 +727,17 @@ EXPORT_SYMBOL(drm_sched_job_arm);
>>    */
>>   void drm_sched_job_cleanup(struct drm_sched_job *job)
>>   {
>> +	struct dma_fence *fence;
>> +	unsigned long index;
>> +
>>   	dma_fence_put(&job->s_fence->finished);
>>   	job->s_fence = NULL;
>> +
>> +	xa_for_each(&job->dependencies, index, fence) {
>> +		dma_fence_put(fence);
>> +	}
>> +	xa_destroy(&job->dependencies);
>> +
>>   }
>>   EXPORT_SYMBOL(drm_sched_job_cleanup);
>>   
>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>> index 80438d126c9d..e4d7e1496296 100644
>> --- a/include/drm/gpu_scheduler.h
>> +++ b/include/drm/gpu_scheduler.h
>> @@ -27,9 +27,12 @@
>>   #include <drm/spsc_queue.h>
>>   #include <linux/dma-fence.h>
>>   #include <linux/completion.h>
>> +#include <linux/xarray.h>
>>   
>>   #define MAX_WAIT_SCHED_ENTITY_Q_EMPTY msecs_to_jiffies(1000)
>>   
>> +struct drm_gem_object;
>> +
>>   struct drm_gpu_scheduler;
>>   struct drm_sched_rq;
>>   
>> @@ -198,6 +201,16 @@ struct drm_sched_job {
>>   	enum drm_sched_priority		s_priority;
>>   	struct drm_sched_entity         *entity;
>>   	struct dma_fence_cb		cb;
>> +	/**
>> +	 * @dependencies:
>> +	 *
>> +	 * Contains the dependencies as struct dma_fence for this job, see
>> +	 * drm_sched_job_await_fence() and drm_sched_job_await_implicit().
>> +	 */
>> +	struct xarray			dependencies;
>> +
>> +	/** @last_dependency: tracks @dependencies as they signal */
>> +	unsigned long			last_dependency;
>>   };
>>   
>>   static inline bool drm_sched_invalidate_job(struct drm_sched_job *s_job,
>> @@ -220,9 +233,14 @@ enum drm_gpu_sched_stat {
>>    */
>>   struct drm_sched_backend_ops {
>>   	/**
>> -         * @dependency: Called when the scheduler is considering scheduling
>> -         * this job next, to get another struct dma_fence for this job to
>> -	 * block on.  Once it returns NULL, run_job() may be called.
>> +	 * @dependency:
>> +	 *
>> +	 * Called when the scheduler is considering scheduling this job next, to
>> +	 * get another struct dma_fence for this job to block on.  Once it
>> +	 * returns NULL, run_job() may be called.
>> +	 *
>> +	 * If a driver exclusively uses drm_sched_job_await_fence() and
>> +	 * drm_sched_job_await_implicit() this can be omitted and left as NULL.
>>   	 */
>>   	struct dma_fence *(*dependency)(struct drm_sched_job *sched_job,
>>   					struct drm_sched_entity *s_entity);
>> @@ -314,6 +332,13 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>   		       struct drm_sched_entity *entity,
>>   		       void *owner);
>>   void drm_sched_job_arm(struct drm_sched_job *job);
>> +int drm_sched_job_await_fence(struct drm_sched_job *job,
>> +			      struct dma_fence *fence);
>> +int drm_sched_job_await_implicit(struct drm_sched_job *job,
>> +				 struct drm_gem_object *obj,
>> +				 bool write);
>> +
>> +
>>   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>   				    struct drm_gpu_scheduler **sched_list,
>>                                      unsigned int num_sched_list);
>



* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 14:00 ` [PATCH 10/11] drm/scheduler: Don't store self-dependencies Daniel Vetter
  2021-06-24 14:42   ` Lucas Stach
@ 2021-06-24 17:03   ` Christian König
  2021-06-24 17:29     ` Daniel Vetter
  1 sibling, 1 reply; 57+ messages in thread
From: Christian König @ 2021-06-24 17:03 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Alex Deucher, Daniel Vetter, Jack Zhang, Luben Tuikov

Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> This is essentially part of drm_sched_dependency_optimized(), which
> only amdgpu seems to make use of. Use it a bit more.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> ---
>   drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
>   1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 370c336d383f..c31d7cf7df74 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
>   	if (!fence)
>   		return 0;
>   
> +	/* if it's a fence from us it's guaranteed to be earlier */
> +	if (fence->context == job->entity->fence_context ||
> +	    fence->context == job->entity->fence_context + 1) {
> +		dma_fence_put(fence);
> +		return 0;
> +	}
> +

Well NAK. That would break Vulkan.

The problem is that Vulkan can insert dependencies between jobs which 
run on the same queue.

So we need to track those as well and if the previous job for the same 
queue/scheduler is not yet finished a pipeline synchronization needs to 
be inserted.

That's one of the reasons we weren't able to unify the dependency
handling yet.

Christian.

>   	/* Deduplicate if we already depend on a fence from the same context.
>   	 * This lets the size of the array of deps scale with the number of
>   	 * engines involved, rather than the number of BOs.
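
For readers following the thread, the deduplication rule quoted above (keep
at most one fence per context, and keep the later one) can be illustrated
with a small userspace sketch. This is not the scheduler code: the toy_*
types and functions are hypothetical stand-ins, a fixed-size array replaces
the xarray, and reference counting is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dma_fence: context and seqno only. */
struct toy_fence {
	uint64_t context;
	uint64_t seqno;
};

#define TOY_MAX_DEPS 16

/* Hypothetical stand-in for the job's dependency container. */
struct toy_job {
	struct toy_fence deps[TOY_MAX_DEPS];
	size_t count;
};

/* Same-context ordering, modelled on the idea behind dma_fence_is_later(). */
bool toy_fence_is_later(const struct toy_fence *a, const struct toy_fence *b)
{
	return a->context == b->context && a->seqno > b->seqno;
}

/*
 * Add a dependency while keeping at most one fence per context.
 * Returns 0 on success, -1 if the fixed-size array is full.
 */
int toy_job_await_fence(struct toy_job *job, const struct toy_fence *fence)
{
	size_t i;

	for (i = 0; i < job->count; i++) {
		if (job->deps[i].context != fence->context)
			continue;

		/* Same context: keep whichever fence signals later. */
		if (toy_fence_is_later(fence, &job->deps[i]))
			job->deps[i] = *fence;
		return 0;
	}

	if (job->count == TOY_MAX_DEPS)
		return -1;

	job->deps[job->count++] = *fence;
	return 0;
}
```

With this rule the number of stored dependencies is bounded by the number of
distinct contexts, not by the number of fences passed in.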



* Re: [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init
  2021-06-24 16:59   ` Emma Anholt
@ 2021-06-24 17:24     ` Daniel Vetter
  0 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 17:24 UTC (permalink / raw)
  To: Emma Anholt; +Cc: Daniel Vetter, DRI Development, Daniel Vetter

On Thu, Jun 24, 2021 at 09:59:08AM -0700, Emma Anholt wrote:
> On Thu, Jun 24, 2021 at 7:00 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > Prep work for using the scheduler dependency handling.

I'll add "We need to call drm_sched_job_init earlier so we can use the new
drm_sched_job_await* functions for dependency handling here." That gives a
bit more context on what's going on here.

> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Emma Anholt <emma@anholt.net>
> 
> Back when I wrote this, I think there were rules that there had to be
> no failure paths between a job_init and a push.  Has that changed?
> 
> I really don't have the context to evaluate this, I'm not sure what
> new "scheduler dependency handling" is given that there was already
> something that I considered to be dependency handling!

Full patch series link:

https://lore.kernel.org/dri-devel/20210624140025.438303-1-daniel.vetter@ffwll.ch/T/#t


The job_init vs push_job is addressed here:

https://lore.kernel.org/dri-devel/20210624140025.438303-1-daniel.vetter@ffwll.ch/T/#mb6c4d75e1c57a5056d7b2ec8fbb9839fc5be41a7

I split job_init into job_init (which can fail, and be done earlier) and
job_arm (which can't fail, and must be atomic with the push_job).
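
As a rough illustration of that split, here is a userspace sketch. All demo_*
names are hypothetical and only model the idea: the fallible allocation
happens in init, while arm just hands out the sequence number. The real
drm_sched_fence_init() uses an atomic counter and dma_fence_init(), both of
which are simplified away here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct drm_sched_entity. */
struct demo_entity {
	uint64_t fence_seq;	/* last sequence number handed out */
};

/* Hypothetical stand-in for struct drm_sched_fence. */
struct demo_fence {
	uint64_t seqno;
	bool initialized;
};

/* Hypothetical stand-in for struct drm_sched_job. */
struct demo_job {
	struct demo_entity *entity;
	struct demo_fence *s_fence;
};

/* Phase 1: allocate only. This can fail, so it may be called early. */
int demo_job_init(struct demo_job *job, struct demo_entity *entity)
{
	job->entity = entity;
	job->s_fence = calloc(1, sizeof(*job->s_fence));
	return job->s_fence ? 0 : -1;
}

/* Phase 2: assign the seqno. This cannot fail; it is the point of no return. */
void demo_job_arm(struct demo_job *job)
{
	job->s_fence->seqno = ++job->entity->fence_seq;
	job->s_fence->initialized = true;
}

/* Release what phase 1 allocated. */
void demo_job_cleanup(struct demo_job *job)
{
	free(job->s_fence);
	job->s_fence = NULL;
}
```

A job that is initialized and then abandoned before arming consumes no
sequence number in this model, which is why the failure paths can sit between
init and arm.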

The entire goal of this is to lift the dependency handling from "everyone
copypastes v3d" to "drm_sched has it for you already", which is this patch
here:

https://lore.kernel.org/dri-devel/20210624140025.438303-1-daniel.vetter@ffwll.ch/T/#m9f64aaf840cbf8815cd2ea9a68f99a51da9baa5f

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 17:03   ` Christian König
@ 2021-06-24 17:29     ` Daniel Vetter
  2021-06-24 17:38       ` Christian König
  0 siblings, 1 reply; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 17:29 UTC (permalink / raw)
  To: Christian König
  Cc: Jack Zhang, Daniel Vetter, DRI Development, Luben Tuikov,
	Alex Deucher, Daniel Vetter

On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> > This is essentially part of drm_sched_dependency_optimized(), which
> > only amdgpu seems to make use of. Use it a bit more.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
> >   1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 370c336d383f..c31d7cf7df74 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
> >   	if (!fence)
> >   		return 0;
> > +	/* if it's a fence from us it's guaranteed to be earlier */
> > +	if (fence->context == job->entity->fence_context ||
> > +	    fence->context == job->entity->fence_context + 1) {
> > +		dma_fence_put(fence);
> > +		return 0;
> > +	}
> > +
> 
> Well NAK. That would break Vulkan.
> 
> The problem is that Vulkan can insert dependencies between jobs which run on
> the same queue.
> 
> So we need to track those as well and if the previous job for the same
> queue/scheduler is not yet finished a pipeline synchronization needs to be
> inserted.
> 
> That's one of the reasons we weren't able to unify the dependency handling
> yet.

That sounds like an extremely amdgpu-specific constraint? You're also the
only one who keeps track of whether the previous job we've scheduled has
finished already (I guess they can get pipelined and you don't flush by
default), so you insert fences.

I guess we can add an await_fence_no_dedup or so for amdgpu, but I'm not
sure why we have to inflict this design constraint on all other drivers?
At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
would break with this, and i915 will also be perfectly fine.
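
To make the disagreement concrete, here is a small userspace sketch of the
check in question. An entity's "scheduled" fences use fence_context and its
"finished" fences use fence_context + 1, so a fence from either context came
from the same entity. The sketch_* names and the keep_self flag are
hypothetical; keep_self only models what an opt-out variant such as the
await_fence_no_dedup idea might look like, and is not an existing API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct drm_sched_entity. */
struct sketch_entity {
	/* "scheduled" fences use this context, "finished" fences use + 1 */
	uint64_t fence_context;
};

/* True if a fence with this context was emitted by the entity itself. */
bool sketch_fence_is_from_entity(const struct sketch_entity *entity,
				 uint64_t fence_context)
{
	return fence_context == entity->fence_context ||
	       fence_context == entity->fence_context + 1;
}

/*
 * Decide whether a dependency has to be stored.
 * keep_self == false: drop self-dependencies, as in the patch.
 * keep_self == true:  store them anyway (hypothetical opt-out for drivers
 *                     that need same-queue dependencies tracked).
 */
bool sketch_must_track(const struct sketch_entity *entity,
		       uint64_t fence_context, bool keep_self)
{
	if (keep_self)
		return true;

	return !sketch_fence_is_from_entity(entity, fence_context);
}
```

Fences from any other context are tracked in both modes; only the handling
of the entity's own two contexts differs.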

Also note: I'm not using this for amdgpu, exactly because there's a few
funny things going on.

Finally: You _really_ need explicit dependency handling for vulkan in your
uapi, instead of the kernel second-guessing what userspace might be doing.
That's really not how vulkan is designed to work :-)

Cheers, Daniel


> Christian.
> 
> >   	/* Deduplicate if we already depend on a fence from the same context.
> >   	 * This lets the size of the array of deps scale with the number of
> >   	 * engines involved, rather than the number of BOs.
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 01/11] drm/sched: Split drm_sched_job_init
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 17:29     ` Christian König
  -1 siblings, 0 replies; 57+ messages in thread
From: Christian König @ 2021-06-24 17:29 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Daniel Vetter, Lucas Stach, Russell King, Christian Gmeiner,
	Qiang Yu, Rob Herring, Tomeu Vizoso, Steven Price,
	Alyssa Rosenzweig, David Airlie, Daniel Vetter, Sumit Semwal,
	Masahiro Yamada, Kees Cook, Adam Borowski, Nick Terrell,
	Mauro Carvalho Chehab, Paul Menzel, Sami Tolvanen, Viresh Kumar,
	Alex Deucher, Dave Airlie, Nirmoy Das, Deepak R Varma, Lee Jones,
	Kevin Wang, Chen Li, Luben Tuikov, Marek Olšák,
	Dennis Li, Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, etnaviv, lima,
	linux-media, linaro-mm-sig

Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> This is a very confusingly named function, because not just does it
> init an object, it arms it and provides a point of no return for
> pushing a job into the scheduler. It would be nice if that's a bit
> clearer in the interface.

We originally had that in the push_job interface, but moved that to init 
for some reason I don't remember.

> But the real reason is that I want to push the dependency tracking
> helpers into the scheduler code, and that means drm_sched_job_init
> must be called a lot earlier, without arming the job.

I'm really not sure I like that naming.

What about using drm_sched_job_add_dependency instead?

Christian.

>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Adam Borowski <kilobyte@angband.pl>
> Cc: Nick Terrell <terrelln@fb.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Sonny Jiang <sonny.jiang@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>   .gitignore                               |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>   drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>   drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
>   drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
>   drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
>   include/drm/gpu_scheduler.h              |  6 +++++-
>   10 files changed, 51 insertions(+), 10 deletions(-)
>
> diff --git a/.gitignore b/.gitignore
> index 7afd412dadd2..52433a930299 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -66,6 +66,7 @@ modules.order
>   /modules.builtin
>   /modules.builtin.modinfo
>   /modules.nsdeps
> +*.builtin
>   
>   #
>   # RPM spec file (make rpm-pkg)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index c5386d13eb4a..a4ec092af9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>   	if (r)
>   		goto error_unlock;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	/* No memory allocation is allowed while holding the notifier lock.
>   	 * The lock is held until amdgpu_cs_submit is finished and fence is
>   	 * added to BOs.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index d33e6d97cc89..5ddb955d2315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>   	if (r)
>   		return r;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	*f = dma_fence_get(&job->base.s_fence->finished);
>   	amdgpu_job_free_resources(job);
>   	drm_sched_entity_push_job(&job->base, entity);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 19826e504efc..af1671f01c7f 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>   	if (ret)
>   		goto out_unlock;
>   
> +	drm_sched_job_arm(&submit->sched_job);
> +
>   	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>   	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>   						submit->out_fence, 0,
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ecf3267334ff..bd1af1fd8c0f 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>   		return err;
>   	}
>   
> +	drm_sched_job_arm(&task->base);
> +
>   	task->num_bos = num_bos;
>   	task->vm = lima_vm_get(vm);
>   
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index beb62c8fc851..1e950534b9b0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
>   		goto unlock;
>   	}
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>   
>   	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 79554aa4dbb1..f7347c284886 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>    * @sched_job: job to submit
>    * @entity: scheduler entity
>    *
> - * Note: To guarantee that the order of insertion to queue matches
> - * the job's fence sequence number this function should be
> - * called with drm_sched_job_init under common lock.
> + * Note: To guarantee that the order of insertion to queue matches the job's
> + * fence sequence number this function should be called with drm_sched_job_arm()
> + * under common lock.
>    *
>    * Returns 0 for success, negative error code otherwise.
>    */
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 69de2c76731f..0ba810c198bd 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>   }
>   EXPORT_SYMBOL(to_drm_sched_fence);
>   
> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> -					       void *owner)
> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> +					      void *owner)
>   {
>   	struct drm_sched_fence *fence = NULL;
> -	unsigned seq;
>   
>   	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>   	if (fence == NULL)
> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>   	fence->sched = entity->rq->sched;
>   	spin_lock_init(&fence->lock);
>   
> +	return fence;
> +}
> +
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +			  struct drm_sched_entity *entity)
> +{
> +	unsigned seq;
> +
>   	seq = atomic_inc_return(&entity->fence_seq);
>   	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>   		       &fence->lock, entity->fence_context, seq);
>   	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>   		       &fence->lock, entity->fence_context + 1, seq);
> -
> -	return fence;
>   }
>   
>   module_init(drm_sched_fence_slab_init);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 61420a9c1021..70eefed17e06 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -48,9 +48,11 @@
>   #include <linux/wait.h>
>   #include <linux/sched.h>
>   #include <linux/completion.h>
> +#include <linux/dma-resv.h>
>   #include <uapi/linux/sched/types.h>
>   
>   #include <drm/drm_print.h>
> +#include <drm/drm_gem.h>
>   #include <drm/gpu_scheduler.h>
>   #include <drm/spsc_queue.h>
>   
> @@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   	job->sched = sched;
>   	job->entity = entity;
>   	job->s_priority = entity->rq - sched->sched_rq;
> -	job->s_fence = drm_sched_fence_create(entity, owner);
> +	job->s_fence = drm_sched_fence_alloc(entity, owner);
>   	if (!job->s_fence)
>   		return -ENOMEM;
>   	job->id = atomic64_inc_return(&sched->job_id_count);
> @@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   }
>   EXPORT_SYMBOL(drm_sched_job_init);
>   
> +/**
> + * drm_sched_job_arm - arm a scheduler job for execution
> + * @job: scheduler job to arm
> + *
> + * This arms a scheduler job for execution. Specifically it initializes the
> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> + * or other places that need to track the completion of this job.
> + *
> + * Refer to drm_sched_entity_push_job() documentation for locking
> + * considerations.
> + *
> + * This can only be called if drm_sched_job_init() succeeded.
> + */
> +void drm_sched_job_arm(struct drm_sched_job *job)
> +{
> +	drm_sched_fence_init(job->s_fence, job->entity);
> +}
> +EXPORT_SYMBOL(drm_sched_job_arm);
> +
>   /**
>    * drm_sched_job_cleanup - clean up scheduler job resources
>    *
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index d18af49fd009..80438d126c9d 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>   int drm_sched_job_init(struct drm_sched_job *job,
>   		       struct drm_sched_entity *entity,
>   		       void *owner);
> +void drm_sched_job_arm(struct drm_sched_job *job);
>   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>   				    struct drm_gpu_scheduler **sched_list,
>                                      unsigned int num_sched_list);
> @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>   				   enum drm_sched_priority priority);
>   bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>   
> -struct drm_sched_fence *drm_sched_fence_create(
> +struct drm_sched_fence *drm_sched_fence_alloc(
>   	struct drm_sched_entity *s_entity, void *owner);
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +			  struct drm_sched_entity *entity);
> +
>   void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>   void drm_sched_fence_finished(struct drm_sched_fence *fence);
>   



* Re: [PATCH 01/11] drm/sched: Split drm_sched_job_init
@ 2021-06-24 17:29     ` Christian König
  0 siblings, 0 replies; 57+ messages in thread
From: Christian König @ 2021-06-24 17:29 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Adam Borowski, David Airlie, Viresh Kumar, Sonny Jiang,
	Nirmoy Das, Daniel Vetter, Lee Jones, Jack Zhang, lima,
	Mauro Carvalho Chehab, Masahiro Yamada, Steven Price,
	Luben Tuikov, Alyssa Rosenzweig, Sami Tolvanen, Russell King,
	Dave Airlie, Dennis Li, Chen Li, Paul Menzel, Kees Cook,
	Marek Olšák, Kevin Wang, etnaviv, linaro-mm-sig,
	Nick Terrell, Deepak R Varma, Tomeu Vizoso, Boris Brezillon,
	Qiang Yu, Alex Deucher, Tian Tao, linux-media

Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> This is a very confusingly named function, because not just does it
> init an object, it arms it and provides a point of no return for
> pushing a job into the scheduler. It would be nice if that's a bit
> clearer in the interface.

We originally had that in the push_job interface, but moved that to init 
for some reason I don't remember.

> But the real reason is that I want to push the dependency tracking
> helpers into the scheduler code, and that means drm_sched_job_init
> must be called a lot earlier, without arming the job.

I'm really not sure I like that naming.

What about using drm_sched_job_add_dependency instead?

Christian.

>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Adam Borowski <kilobyte@angband.pl>
> Cc: Nick Terrell <terrelln@fb.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Sonny Jiang <sonny.jiang@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>   .gitignore                               |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>   drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>   drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
>   drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
>   drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
>   include/drm/gpu_scheduler.h              |  6 +++++-
>   10 files changed, 51 insertions(+), 10 deletions(-)
>
> diff --git a/.gitignore b/.gitignore
> index 7afd412dadd2..52433a930299 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -66,6 +66,7 @@ modules.order
>   /modules.builtin
>   /modules.builtin.modinfo
>   /modules.nsdeps
> +*.builtin
>   
>   #
>   # RPM spec file (make rpm-pkg)
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index c5386d13eb4a..a4ec092af9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>   	if (r)
>   		goto error_unlock;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	/* No memory allocation is allowed while holding the notifier lock.
>   	 * The lock is held until amdgpu_cs_submit is finished and fence is
>   	 * added to BOs.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index d33e6d97cc89..5ddb955d2315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>   	if (r)
>   		return r;
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	*f = dma_fence_get(&job->base.s_fence->finished);
>   	amdgpu_job_free_resources(job);
>   	drm_sched_entity_push_job(&job->base, entity);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 19826e504efc..af1671f01c7f 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>   	if (ret)
>   		goto out_unlock;
>   
> +	drm_sched_job_arm(&submit->sched_job);
> +
>   	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>   	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>   						submit->out_fence, 0,
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ecf3267334ff..bd1af1fd8c0f 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>   		return err;
>   	}
>   
> +	drm_sched_job_arm(&task->base);
> +
>   	task->num_bos = num_bos;
>   	task->vm = lima_vm_get(vm);
>   
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index beb62c8fc851..1e950534b9b0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
>   		goto unlock;
>   	}
>   
> +	drm_sched_job_arm(&job->base);
> +
>   	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>   
>   	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 79554aa4dbb1..f7347c284886 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>    * @sched_job: job to submit
>    * @entity: scheduler entity
>    *
> - * Note: To guarantee that the order of insertion to queue matches
> - * the job's fence sequence number this function should be
> - * called with drm_sched_job_init under common lock.
> + * Note: To guarantee that the order of insertion to queue matches the job's
> + * fence sequence number this function should be called with drm_sched_job_arm()
> + * under common lock.
>    *
>    * Returns 0 for success, negative error code otherwise.
>    */
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 69de2c76731f..0ba810c198bd 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>   }
>   EXPORT_SYMBOL(to_drm_sched_fence);
>   
> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> -					       void *owner)
> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> +					      void *owner)
>   {
>   	struct drm_sched_fence *fence = NULL;
> -	unsigned seq;
>   
>   	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>   	if (fence == NULL)
> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>   	fence->sched = entity->rq->sched;
>   	spin_lock_init(&fence->lock);
>   
> +	return fence;
> +}
> +
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +			  struct drm_sched_entity *entity)
> +{
> +	unsigned seq;
> +
>   	seq = atomic_inc_return(&entity->fence_seq);
>   	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>   		       &fence->lock, entity->fence_context, seq);
>   	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>   		       &fence->lock, entity->fence_context + 1, seq);
> -
> -	return fence;
>   }
>   
>   module_init(drm_sched_fence_slab_init);
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> index 61420a9c1021..70eefed17e06 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -48,9 +48,11 @@
>   #include <linux/wait.h>
>   #include <linux/sched.h>
>   #include <linux/completion.h>
> +#include <linux/dma-resv.h>
>   #include <uapi/linux/sched/types.h>
>   
>   #include <drm/drm_print.h>
> +#include <drm/drm_gem.h>
>   #include <drm/gpu_scheduler.h>
>   #include <drm/spsc_queue.h>
>   
> @@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   	job->sched = sched;
>   	job->entity = entity;
>   	job->s_priority = entity->rq - sched->sched_rq;
> -	job->s_fence = drm_sched_fence_create(entity, owner);
> +	job->s_fence = drm_sched_fence_alloc(entity, owner);
>   	if (!job->s_fence)
>   		return -ENOMEM;
>   	job->id = atomic64_inc_return(&sched->job_id_count);
> @@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
>   }
>   EXPORT_SYMBOL(drm_sched_job_init);
>   
> +/**
> + * drm_sched_job_arm - arm a scheduler job for execution
> + * @job: scheduler job to arm
> + *
> + * This arms a scheduler job for execution. Specifically it initializes the
> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> + * or other places that need to track the completion of this job.
> + *
> + * Refer to drm_sched_entity_push_job() documentation for locking
> + * considerations.
> + *
> + * This can only be called if drm_sched_job_init() succeeded.
> + */
> +void drm_sched_job_arm(struct drm_sched_job *job)
> +{
> +	drm_sched_fence_init(job->s_fence, job->entity);
> +}
> +EXPORT_SYMBOL(drm_sched_job_arm);
> +
>   /**
>    * drm_sched_job_cleanup - clean up scheduler job resources
>    *
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index d18af49fd009..80438d126c9d 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>   int drm_sched_job_init(struct drm_sched_job *job,
>   		       struct drm_sched_entity *entity,
>   		       void *owner);
> +void drm_sched_job_arm(struct drm_sched_job *job);
>   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>   				    struct drm_gpu_scheduler **sched_list,
>                                      unsigned int num_sched_list);
> @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>   				   enum drm_sched_priority priority);
>   bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>   
> -struct drm_sched_fence *drm_sched_fence_create(
> +struct drm_sched_fence *drm_sched_fence_alloc(
>   	struct drm_sched_entity *s_entity, void *owner);
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +			  struct drm_sched_entity *entity);
> +
>   void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>   void drm_sched_fence_finished(struct drm_sched_fence *fence);
>   


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 01/11] drm/sched: Split drm_sched_job_init
  2021-06-24 17:29     ` Christian König
@ 2021-06-24 17:37       ` Daniel Vetter
  -1 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 17:37 UTC (permalink / raw)
  To: Christian König
  Cc: DRI Development, Daniel Vetter, Lucas Stach, Russell King,
	Christian Gmeiner, Qiang Yu, Rob Herring, Tomeu Vizoso,
	Steven Price, Alyssa Rosenzweig, David Airlie, Sumit Semwal,
	Masahiro Yamada, Kees Cook, Adam Borowski, Nick Terrell,
	Mauro Carvalho Chehab, Paul Menzel, Sami Tolvanen, Viresh Kumar,
	Alex Deucher, Dave Airlie, Nirmoy Das, Deepak R Varma, Lee Jones,
	Kevin Wang, Chen Li, Luben Tuikov, Marek Olšák,
	Dennis Li, Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, The etnaviv authors, lima,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, Jun 24, 2021 at 7:30 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> > This is a very confusingly named function, because not just does it
> > init an object, it arms it and provides a point of no return for
> > pushing a job into the scheduler. It would be nice if that's a bit
> > clearer in the interface.
>
> We originally had that in the push_job interface, but moved that to init
> for some reason I don't remember.
>
> > But the real reason is that I want to push the dependency tracking
> > helpers into the scheduler code, and that means drm_sched_job_init
> > must be called a lot earlier, without arming the job.
>
> I'm really not sure whether I like that naming.
>
> What about using drm_sched_job_add_dependency instead?

Are you suggesting a
s/drm_sched_job_init/drm_sched_job_add_dependency/, or did you just reply
to the wrong patch?
-Daniel

>
> Christian.
>
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > Cc: Qiang Yu <yuq825@gmail.com>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Masahiro Yamada <masahiroy@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Adam Borowski <kilobyte@angband.pl>
> > Cc: Nick Terrell <terrelln@fb.com>
> > Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> > Cc: Sami Tolvanen <samitolvanen@google.com>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Lee Jones <lee.jones@linaro.org>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: "Marek Olšák" <marek.olsak@amd.com>
> > Cc: Dennis Li <Dennis.Li@amd.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Cc: Sonny Jiang <sonny.jiang@amd.com>
> > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Tian Tao <tiantao6@hisilicon.com>
> > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > Cc: etnaviv@lists.freedesktop.org
> > Cc: lima@lists.freedesktop.org
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >   .gitignore                               |  1 +
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >   drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >   drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >   drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >   drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
> >   drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
> >   drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
> >   include/drm/gpu_scheduler.h              |  6 +++++-
> >   10 files changed, 51 insertions(+), 10 deletions(-)
> >
> > diff --git a/.gitignore b/.gitignore
> > index 7afd412dadd2..52433a930299 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -66,6 +66,7 @@ modules.order
> >   /modules.builtin
> >   /modules.builtin.modinfo
> >   /modules.nsdeps
> > +*.builtin
> >
> >   #
> >   # RPM spec file (make rpm-pkg)
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index c5386d13eb4a..a4ec092af9a7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >       if (r)
> >               goto error_unlock;
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       /* No memory allocation is allowed while holding the notifier lock.
> >        * The lock is held until amdgpu_cs_submit is finished and fence is
> >        * added to BOs.
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index d33e6d97cc89..5ddb955d2315 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >       if (r)
> >               return r;
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       *f = dma_fence_get(&job->base.s_fence->finished);
> >       amdgpu_job_free_resources(job);
> >       drm_sched_entity_push_job(&job->base, entity);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > index 19826e504efc..af1671f01c7f 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >       if (ret)
> >               goto out_unlock;
> >
> > +     drm_sched_job_arm(&submit->sched_job);
> > +
> >       submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >       submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >                                               submit->out_fence, 0,
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index ecf3267334ff..bd1af1fd8c0f 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >               return err;
> >       }
> >
> > +     drm_sched_job_arm(&task->base);
> > +
> >       task->num_bos = num_bos;
> >       task->vm = lima_vm_get(vm);
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index beb62c8fc851..1e950534b9b0 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >               goto unlock;
> >       }
> >
> > +     drm_sched_job_arm(&job->base);
> > +
> >       job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >
> >       ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 79554aa4dbb1..f7347c284886 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >    * @sched_job: job to submit
> >    * @entity: scheduler entity
> >    *
> > - * Note: To guarantee that the order of insertion to queue matches
> > - * the job's fence sequence number this function should be
> > - * called with drm_sched_job_init under common lock.
> > + * Note: To guarantee that the order of insertion to queue matches the job's
> > + * fence sequence number this function should be called with drm_sched_job_arm()
> > + * under common lock.
> >    *
> >    * Returns 0 for success, negative error code otherwise.
> >    */
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index 69de2c76731f..0ba810c198bd 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >   }
> >   EXPORT_SYMBOL(to_drm_sched_fence);
> >
> > -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > -                                            void *owner)
> > +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > +                                           void *owner)
> >   {
> >       struct drm_sched_fence *fence = NULL;
> > -     unsigned seq;
> >
> >       fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >       if (fence == NULL)
> > @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >       fence->sched = entity->rq->sched;
> >       spin_lock_init(&fence->lock);
> >
> > +     return fence;
> > +}
> > +
> > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > +                       struct drm_sched_entity *entity)
> > +{
> > +     unsigned seq;
> > +
> >       seq = atomic_inc_return(&entity->fence_seq);
> >       dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >                      &fence->lock, entity->fence_context, seq);
> >       dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >                      &fence->lock, entity->fence_context + 1, seq);
> > -
> > -     return fence;
> >   }
> >
> >   module_init(drm_sched_fence_slab_init);
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> > index 61420a9c1021..70eefed17e06 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -48,9 +48,11 @@
> >   #include <linux/wait.h>
> >   #include <linux/sched.h>
> >   #include <linux/completion.h>
> > +#include <linux/dma-resv.h>
> >   #include <uapi/linux/sched/types.h>
> >
> >   #include <drm/drm_print.h>
> > +#include <drm/drm_gem.h>
> >   #include <drm/gpu_scheduler.h>
> >   #include <drm/spsc_queue.h>
> >
> > @@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >       job->sched = sched;
> >       job->entity = entity;
> >       job->s_priority = entity->rq - sched->sched_rq;
> > -     job->s_fence = drm_sched_fence_create(entity, owner);
> > +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >       if (!job->s_fence)
> >               return -ENOMEM;
> >       job->id = atomic64_inc_return(&sched->job_id_count);
> > @@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >   }
> >   EXPORT_SYMBOL(drm_sched_job_init);
> >
> > +/**
> > + * drm_sched_job_arm - arm a scheduler job for execution
> > + * @job: scheduler job to arm
> > + *
> > + * This arms a scheduler job for execution. Specifically it initializes the
> > + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> > + * or other places that need to track the completion of this job.
> > + *
> > + * Refer to drm_sched_entity_push_job() documentation for locking
> > + * considerations.
> > + *
> > + * This can only be called if drm_sched_job_init() succeeded.
> > + */
> > +void drm_sched_job_arm(struct drm_sched_job *job)
> > +{
> > +     drm_sched_fence_init(job->s_fence, job->entity);
> > +}
> > +EXPORT_SYMBOL(drm_sched_job_arm);
> > +
> >   /**
> >    * drm_sched_job_cleanup - clean up scheduler job resources
> >    *
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index d18af49fd009..80438d126c9d 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >   int drm_sched_job_init(struct drm_sched_job *job,
> >                      struct drm_sched_entity *entity,
> >                      void *owner);
> > +void drm_sched_job_arm(struct drm_sched_job *job);
> >   void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >                                   struct drm_gpu_scheduler **sched_list,
> >                                      unsigned int num_sched_list);
> > @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >                                  enum drm_sched_priority priority);
> >   bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >
> > -struct drm_sched_fence *drm_sched_fence_create(
> > +struct drm_sched_fence *drm_sched_fence_alloc(
> >       struct drm_sched_entity *s_entity, void *owner);
> > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > +                       struct drm_sched_entity *entity);
> > +
> >   void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >   void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >
>
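
For readers following the split, here is a minimal userspace sketch of the
two-phase pattern the patch introduces: a fallible allocation step that can
run early, and a separate arm step that cannot fail and hands out the fence
sequence number. All toy_* names are hypothetical stand-ins and not the
kernel API; calloc() and C11 atomics only model kmem_cache_zalloc() and
atomic_inc_return().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Toy stand-ins for the scheduler types; not the kernel definitions. */
struct toy_entity {
	atomic_uint fence_seq;		/* per-entity sequence counter */
};

struct toy_fence {
	unsigned int seqno;
	int armed;
};

struct toy_job {
	struct toy_entity *entity;
	struct toy_fence *s_fence;
};

/* Phase 1: allocate only. May fail, can be undone, consumes no seqno. */
static int toy_job_init(struct toy_job *job, struct toy_entity *entity)
{
	job->entity = entity;
	job->s_fence = calloc(1, sizeof(*job->s_fence));
	return job->s_fence ? 0 : -1;
}

/* Phase 2: point of no return. Cannot fail; takes the next seqno. */
static void toy_job_arm(struct toy_job *job)
{
	/* Only valid after toy_job_init() succeeded. */
	assert(job->s_fence != NULL);

	/* fetch_add returns the old value, so add 1 to get the new one. */
	job->s_fence->seqno =
		atomic_fetch_add(&job->entity->fence_seq, 1) + 1;
	job->s_fence->armed = 1;
}

/* Undo phase 1 for a job that never got pushed. */
static void toy_job_cleanup(struct toy_job *job)
{
	free(job->s_fence);
	job->s_fence = NULL;
}
```

In this model the sequence number follows the order of the arm calls, not
the order of the init calls. That is why the updated comment on
drm_sched_entity_push_job() asks for arm and push to happen under a common
lock.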


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 17:29     ` Daniel Vetter
@ 2021-06-24 17:38       ` Christian König
  2021-06-24 17:43         ` Daniel Vetter
  0 siblings, 1 reply; 57+ messages in thread
From: Christian König @ 2021-06-24 17:38 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jack Zhang, Daniel Vetter, DRI Development, Luben Tuikov,
	Alex Deucher, Daniel Vetter

Am 24.06.21 um 19:29 schrieb Daniel Vetter:
> On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
>> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
>>> This is essentially part of drm_sched_dependency_optimized(), which
>>> only amdgpu seems to make use of. Use it a bit more.
>>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>> ---
>>>    drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
>>>    1 file changed, 7 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 370c336d383f..c31d7cf7df74 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
>>>    	if (!fence)
>>>    		return 0;
>>> +	/* if it's a fence from us it's guaranteed to be earlier */
>>> +	if (fence->context == job->entity->fence_context ||
>>> +	    fence->context == job->entity->fence_context + 1) {
>>> +		dma_fence_put(fence);
>>> +		return 0;
>>> +	}
>>> +
>> Well NAK. That would break Vulkan.
>>
>> The problem is that Vulkan can insert dependencies between jobs which run on
>> the same queue.
>>
>> So we need to track those as well and if the previous job for the same
>> queue/scheduler is not yet finished a pipeline synchronization needs to be
>> inserted.
>>
>> That's one of the reasons we weren't able to unify the dependency handling
>> yet.
> That sounds like an extremely amdgpu specific constraint?

Yeah, that's totally hardware specific.

It's just that I don't know how else we could track that without having 
the same separation as in amdgpu between implicit and explicit fences. 
And as far as I understand it that's exactly what you want to avoid.

As I said this turned out to be really awkward.

> You're also the
> only one who keeps track of whether the previous job we've scheduled has
> finished already (I guess they can get pipelined and you don't flush by
> default), so you insert fences.

Yes, exactly that.

> I guess we can add a await_fence_no_dedup or so for amdgpu, but I'm not
> sure why we have to inflict this design constraint on all other drivers?
> At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
> would break with this, and i915 will also be perfectly fine.
>
> Also note: I'm not using this for amdgpu, exactly because there's a few
> funny things going on.

Yeah, exactly the reason why we never unified this.
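
For illustration, the check under discussion can be modelled in plain userspace C. This is only a sketch, not kernel code: the model_* types and functions are hypothetical stand-ins for dma_fence and drm_sched_entity, and the keep_self flag merely models the "await_fence_no_dedup or so" variant floated above, which does not exist in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for dma_fence and drm_sched_entity. */
struct model_fence {
	uint64_t context;	/* timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline */
};

struct model_entity {
	/* scheduled fences use fence_context, finished ones fence_context + 1 */
	uint64_t fence_context;
};

/* True if @fence was emitted by @entity itself, on either of its timelines. */
static bool model_is_self_dependency(const struct model_entity *entity,
				     const struct model_fence *fence)
{
	assert(entity && fence);
	return fence->context == entity->fence_context ||
	       fence->context == entity->fence_context + 1;
}

/*
 * Returns true if the dependency has to be stored for the job.
 * @keep_self models a hypothetical opt-out for drivers that still need
 * to see fences from their own entity (the amdgpu case in this thread).
 */
static bool model_must_track(const struct model_entity *entity,
			     const struct model_fence *fence, bool keep_self)
{
	if (!fence)
		return false;	/* nothing to wait for */
	if (!keep_self && model_is_self_dependency(entity, fence))
		return false;	/* own entity: already ordered by submission */
	return true;
}
```

With keep_self false this mirrors the hunk quoted above; with keep_self true, a fence from the same entity stays visible to the driver, so it can still decide whether a pipeline synchronization is needed.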

Regards,
Christian.

> Finally: You _really_ need explicit dependency handling for vulkan in your
> uapi, instead of the kernel second-guessing what userspace might be doing.
> That's really not how vulkan is designed to work :-)

>
> Cheers, Daniel
>
>
>> Christian.
>>
>>>    	/* Deduplicate if we already depend on a fence from the same context.
>>>    	 * This lets the size of the array of deps scale with the number of
>>>    	 * engines involved, rather than the number of BOs.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 01/11] drm/sched: Split drm_sched_job_init
  2021-06-24 17:37       ` Daniel Vetter
@ 2021-06-24 17:39         ` Christian König
  -1 siblings, 0 replies; 57+ messages in thread
From: Christian König @ 2021-06-24 17:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Daniel Vetter, Lucas Stach, Russell King,
	Christian Gmeiner, Qiang Yu, Rob Herring, Tomeu Vizoso,
	Steven Price, Alyssa Rosenzweig, David Airlie, Sumit Semwal,
	Masahiro Yamada, Kees Cook, Adam Borowski, Nick Terrell,
	Mauro Carvalho Chehab, Paul Menzel, Sami Tolvanen, Viresh Kumar,
	Alex Deucher, Dave Airlie, Nirmoy Das, Deepak R Varma, Lee Jones,
	Kevin Wang, Chen Li, Luben Tuikov, Marek Olšák,
	Dennis Li, Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, The etnaviv authors, lima,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK



Am 24.06.21 um 19:37 schrieb Daniel Vetter:
> On Thu, Jun 24, 2021 at 7:30 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
>>> This is a very confusingly named function, because not just does it
>>> init an object, it arms it and provides a point of no return for
>>> pushing a job into the scheduler. It would be nice if that's a bit
>>> clearer in the interface.
>> We originally had that in the push_job interface, but moved that to init
>> for some reason I don't remember.
>>
>>> But the real reason is that I want to push the dependency tracking
>>> helpers into the scheduler code, and that means drm_sched_job_init
>>> must be called a lot earlier, without arming the job.
>> I'm really questioning myself if I like that naming.
>>
>> What about using drm_sched_job_add_dependency instead?
> You're suggesting a
> s/drm_sched_job_init/drm_sched_job_add_dependency/, or just replied to
> the wrong patch?

Replied to the wrong patch accidentally. I was talking about the "await" 
terminology.

Christian.

> -Daniel
>
>> Christian.
>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
>>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
>>> Cc: Qiang Yu <yuq825@gmail.com>
>>> Cc: Rob Herring <robh@kernel.org>
>>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
>>> Cc: Steven Price <steven.price@arm.com>
>>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
>>> Cc: David Airlie <airlied@linux.ie>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Masahiro Yamada <masahiroy@kernel.org>
>>> Cc: Kees Cook <keescook@chromium.org>
>>> Cc: Adam Borowski <kilobyte@angband.pl>
>>> Cc: Nick Terrell <terrelln@fb.com>
>>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
>>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>>> Cc: Sami Tolvanen <samitolvanen@google.com>
>>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: Dave Airlie <airlied@redhat.com>
>>> Cc: Nirmoy Das <nirmoy.das@amd.com>
>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>> Cc: Lee Jones <lee.jones@linaro.org>
>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>> Cc: Chen Li <chenli@uniontech.com>
>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>> Cc: "Marek Olšák" <marek.olsak@amd.com>
>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>> Cc: Sonny Jiang <sonny.jiang@amd.com>
>>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
>>> Cc: Tian Tao <tiantao6@hisilicon.com>
>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>> Cc: etnaviv@lists.freedesktop.org
>>> Cc: lima@lists.freedesktop.org
>>> Cc: linux-media@vger.kernel.org
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> ---
>>>    .gitignore                               |  1 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>>>    drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>>>    drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>>>    drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
>>>    drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
>>>    drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
>>>    include/drm/gpu_scheduler.h              |  6 +++++-
>>>    10 files changed, 51 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/.gitignore b/.gitignore
>>> index 7afd412dadd2..52433a930299 100644
>>> --- a/.gitignore
>>> +++ b/.gitignore
>>> @@ -66,6 +66,7 @@ modules.order
>>>    /modules.builtin
>>>    /modules.builtin.modinfo
>>>    /modules.nsdeps
>>> +*.builtin
>>>
>>>    #
>>>    # RPM spec file (make rpm-pkg)
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index c5386d13eb4a..a4ec092af9a7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>        if (r)
>>>                goto error_unlock;
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        /* No memory allocation is allowed while holding the notifier lock.
>>>         * The lock is held until amdgpu_cs_submit is finished and fence is
>>>         * added to BOs.
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> index d33e6d97cc89..5ddb955d2315 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
>>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>>>        if (r)
>>>                return r;
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        *f = dma_fence_get(&job->base.s_fence->finished);
>>>        amdgpu_job_free_resources(job);
>>>        drm_sched_entity_push_job(&job->base, entity);
>>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> index 19826e504efc..af1671f01c7f 100644
>>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
>>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>>>        if (ret)
>>>                goto out_unlock;
>>>
>>> +     drm_sched_job_arm(&submit->sched_job);
>>> +
>>>        submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>>>        submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>>>                                                submit->out_fence, 0,
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>> index ecf3267334ff..bd1af1fd8c0f 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>>>                return err;
>>>        }
>>>
>>> +     drm_sched_job_arm(&task->base);
>>> +
>>>        task->num_bos = num_bos;
>>>        task->vm = lima_vm_get(vm);
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index beb62c8fc851..1e950534b9b0 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
>>>                goto unlock;
>>>        }
>>>
>>> +     drm_sched_job_arm(&job->base);
>>> +
>>>        job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>>>
>>>        ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 79554aa4dbb1..f7347c284886 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>     * @sched_job: job to submit
>>>     * @entity: scheduler entity
>>>     *
>>> - * Note: To guarantee that the order of insertion to queue matches
>>> - * the job's fence sequence number this function should be
>>> - * called with drm_sched_job_init under common lock.
>>> + * Note: To guarantee that the order of insertion to queue matches the job's
>>> + * fence sequence number this function should be called with drm_sched_job_arm()
>>> + * under common lock.
>>>     *
>>>     * Returns 0 for success, negative error code otherwise.
>>>     */
>>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
>>> index 69de2c76731f..0ba810c198bd 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
>>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>>>    }
>>>    EXPORT_SYMBOL(to_drm_sched_fence);
>>>
>>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>> -                                            void *owner)
>>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
>>> +                                           void *owner)
>>>    {
>>>        struct drm_sched_fence *fence = NULL;
>>> -     unsigned seq;
>>>
>>>        fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>>>        if (fence == NULL)
>>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>>>        fence->sched = entity->rq->sched;
>>>        spin_lock_init(&fence->lock);
>>>
>>> +     return fence;
>>> +}
>>> +
>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>> +                       struct drm_sched_entity *entity)
>>> +{
>>> +     unsigned seq;
>>> +
>>>        seq = atomic_inc_return(&entity->fence_seq);
>>>        dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
>>>                       &fence->lock, entity->fence_context, seq);
>>>        dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
>>>                       &fence->lock, entity->fence_context + 1, seq);
>>> -
>>> -     return fence;
>>>    }
>>>
>>>    module_init(drm_sched_fence_slab_init);
>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>> index 61420a9c1021..70eefed17e06 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>> @@ -48,9 +48,11 @@
>>>    #include <linux/wait.h>
>>>    #include <linux/sched.h>
>>>    #include <linux/completion.h>
>>> +#include <linux/dma-resv.h>
>>>    #include <uapi/linux/sched/types.h>
>>>
>>>    #include <drm/drm_print.h>
>>> +#include <drm/drm_gem.h>
>>>    #include <drm/gpu_scheduler.h>
>>>    #include <drm/spsc_queue.h>
>>>
>>> @@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>        job->sched = sched;
>>>        job->entity = entity;
>>>        job->s_priority = entity->rq - sched->sched_rq;
>>> -     job->s_fence = drm_sched_fence_create(entity, owner);
>>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
>>>        if (!job->s_fence)
>>>                return -ENOMEM;
>>>        job->id = atomic64_inc_return(&sched->job_id_count);
>>> @@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
>>>    }
>>>    EXPORT_SYMBOL(drm_sched_job_init);
>>>
>>> +/**
>>> + * drm_sched_job_arm - arm a scheduler job for execution
>>> + * @job: scheduler job to arm
>>> + *
>>> + * This arms a scheduler job for execution. Specifically it initializes the
>>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
>>> + * or other places that need to track the completion of this job.
>>> + *
>>> + * Refer to drm_sched_entity_push_job() documentation for locking
>>> + * considerations.
>>> + *
>>> + * This can only be called if drm_sched_job_init() succeeded.
>>> + */
>>> +void drm_sched_job_arm(struct drm_sched_job *job)
>>> +{
>>> +     drm_sched_fence_init(job->s_fence, job->entity);
>>> +}
>>> +EXPORT_SYMBOL(drm_sched_job_arm);
>>> +
>>>    /**
>>>     * drm_sched_job_cleanup - clean up scheduler job resources
>>>     *
>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>> index d18af49fd009..80438d126c9d 100644
>>> --- a/include/drm/gpu_scheduler.h
>>> +++ b/include/drm/gpu_scheduler.h
>>> @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>>>    int drm_sched_job_init(struct drm_sched_job *job,
>>>                       struct drm_sched_entity *entity,
>>>                       void *owner);
>>> +void drm_sched_job_arm(struct drm_sched_job *job);
>>>    void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>>>                                    struct drm_gpu_scheduler **sched_list,
>>>                                       unsigned int num_sched_list);
>>> @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>>>                                   enum drm_sched_priority priority);
>>>    bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>>>
>>> -struct drm_sched_fence *drm_sched_fence_create(
>>> +struct drm_sched_fence *drm_sched_fence_alloc(
>>>        struct drm_sched_entity *s_entity, void *owner);
>>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
>>> +                       struct drm_sched_entity *entity);
>>> +
>>>    void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>>>    void drm_sched_fence_finished(struct drm_sched_fence *fence);
>>>
>
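
The init/arm split in the patch quoted above can be sketched as a small userspace model. This is an illustration under stated assumptions, not the scheduler implementation: the model_* names are hypothetical, calloc() stands in for the slab allocation, -1 stands in for -ENOMEM, and a plain counter stands in for the atomic fence_seq.

```c
#include <assert.h>
#include <stdlib.h>

struct model_entity {
	unsigned int fence_seq;		/* last sequence number handed out */
};

struct model_fence {
	unsigned int seqno;		/* stays 0 until the job is armed */
};

struct model_job {
	struct model_entity *entity;
	struct model_fence *s_fence;
};

/* Phase 1: allocate only. May fail, and can still be undone afterwards. */
static int model_job_init(struct model_job *job, struct model_entity *entity)
{
	job->entity = entity;
	job->s_fence = calloc(1, sizeof(*job->s_fence));
	return job->s_fence ? 0 : -1;	/* -1 stands in for -ENOMEM */
}

/* Phase 2: take a sequence number. Cannot fail; point of no return. */
static void model_job_arm(struct model_job *job)
{
	assert(job->s_fence);	/* only valid after a successful init */
	job->s_fence->seqno = ++job->entity->fence_seq;
}

/* Release what phase 1 allocated. */
static void model_job_cleanup(struct model_job *job)
{
	free(job->s_fence);
	job->s_fence = NULL;
}
```

The point of the model is that sequence numbers follow the order of the arm calls, not the order of the init calls, which is why the kerneldoc asks for drm_sched_job_arm() and drm_sched_entity_push_job() to be called under a common lock.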


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 17:38       ` Christian König
@ 2021-06-24 17:43         ` Daniel Vetter
  2021-06-24 17:56           ` Christian König
  0 siblings, 1 reply; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 17:43 UTC (permalink / raw)
  To: Christian König
  Cc: Jack Zhang, DRI Development, Luben Tuikov, Alex Deucher, Daniel Vetter

On Thu, Jun 24, 2021 at 7:38 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 24.06.21 um 19:29 schrieb Daniel Vetter:
> > On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
> >> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> >>> This is essentially part of drm_sched_dependency_optimized(), which
> >>> only amdgpu seems to make use of. Use it a bit more.
> >>>
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> Cc: "Christian König" <christian.koenig@amd.com>
> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>> ---
> >>>    drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
> >>>    1 file changed, 7 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 370c336d383f..c31d7cf7df74 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
> >>>     if (!fence)
> >>>             return 0;
> >>> +   /* if it's a fence from us it's guaranteed to be earlier */
> >>> +   if (fence->context == job->entity->fence_context ||
> >>> +       fence->context == job->entity->fence_context + 1) {
> >>> +           dma_fence_put(fence);
> >>> +           return 0;
> >>> +   }
> >>> +
> >> Well NAK. That would break Vulkan.

I'm assuming your reply means the NAK is retracted and was just the
usual "this doesn't perfectly fit for amdgpu" reflex?

> >> The problem is that Vulkan can insert dependencies between jobs which run on
> >> the same queue.
> >>
> >> So we need to track those as well and if the previous job for the same
> >> queue/scheduler is not yet finished a pipeline synchronization needs to be
> >> inserted.
> >>
> >> That's one of the reasons we weren't able to unify the dependency handling
> >> yet.
> > That sounds like an extremely amdgpu specific constraint?
>
> Yeah, that's totally hardware specific.
>
> It's just that I don't know how else we could track that without having
> the same separation as in amdgpu between implicit and explicit fences.
> And as far as I understand it that's exactly what you want to avoid.
>
> As I said this turned out to be really awkward.
>
> > You're also the
> > only one who keeps track of whether the previous job we've scheduled has
> > finished already (I guess they can get pipelined and you don't flush by
> > default), so you insert fences.
>
> Yes, exactly that.
>
> > I guess we can add a await_fence_no_dedup or so for amdgpu, but I'm not
> > sure why we have to inflict this design constraint on all other drivers?
> > At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
> > would break with this, and i915 will also be perfectly fine.
> >
> > Also note: I'm not using this for amdgpu, exactly because there's a few
> > funny things going on.
>
> Yeah, exactly the reason why we never unified this.

Yeah there are clear limits to this: you also can't use the
await_implicit helper, because you have to keep filtering by owner or
the current amdgpu uapi goes horribly slow. I think the benefit would
be just that we could share the datastructure and the book-keeping,
but aside from that you'd need your own integration in amdgpu.

One idea I just had was whether we could use the tag bits xarray has
for amdgpu's purposes. Like we could do a
drm_sched_job_await_fence_tagged, where you supply additional
information (like the "this might be relevant for the vm_flush" and
things like that). Afaiui xarray tags are very fast to enumerate on if
you're looking for specific tags, but I might be wrong. Ideally this
would avoid the need for the duplicated amdgpu_job->sched.
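
To make the tagged-await idea a bit more concrete, here is a minimal
userspace sketch in plain C. It is only a model: the kernel xarray is
replaced by a fixed-size table, the xarray marks by a bitmask, and every
name in it (model_job_await_fence_tagged, DEP_TAG_VM_FLUSH, ...) is
hypothetical and not part of this series. Merging the tags when two fences
share a context is also just an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tag: "this dependency might be relevant for the vm_flush". */
#define DEP_TAG_VM_FLUSH (1u << 0)
#define MAX_DEPS 16

struct model_fence {
	uint64_t context;	/* timeline the fence belongs to */
	uint64_t seqno;		/* position on that timeline */
};

struct model_dep {
	struct model_fence fence;
	unsigned int tags;	/* stands in for xarray marks */
};

struct model_job {
	struct model_dep deps[MAX_DEPS];
	size_t num_deps;
};

/*
 * Record a dependency together with caller-supplied tags.
 * Returns 0 on success, -1 if the fixed-size table is full.
 */
static int model_job_await_fence_tagged(struct model_job *job,
					struct model_fence fence,
					unsigned int tags)
{
	size_t i;

	assert(job != NULL);

	/* One slot per context: keep the later fence, accumulate tags. */
	for (i = 0; i < job->num_deps; i++) {
		struct model_dep *dep = &job->deps[i];

		if (dep->fence.context != fence.context)
			continue;
		if (fence.seqno > dep->fence.seqno)
			dep->fence = fence;
		dep->tags |= tags;
		return 0;
	}

	if (job->num_deps == MAX_DEPS)
		return -1;

	job->deps[job->num_deps].fence = fence;
	job->deps[job->num_deps].tags = tags;
	job->num_deps++;
	return 0;
}

/* Count the dependencies that carry a given tag. */
static size_t model_job_count_tagged(const struct model_job *job,
				     unsigned int tag)
{
	size_t i, n = 0;

	for (i = 0; i < job->num_deps; i++)
		if (job->deps[i].tags & tag)
			n++;
	return n;
}
```

A driver would tag only the fences it cares about and later walk just
those. In the kernel the walk would presumably go through the xarray mark
iteration helpers instead of the linear scan used here.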

Cheers, Daniel


> Regards,
> Christian.
>
> > Finally: You _really_ need explicit dependency handling for vulkan in your
> > uapi, instead of the kernel second-guessing what userspace might be doing.
> > That's really not how vulkan is designed to work :-)
>
> >
> > Cheers, Daniel
> >
> >
> >> Christian.
> >>
> >>>     /* Deduplicate if we already depend on a fence from the same context.
> >>>      * This lets the size of the array of deps scale with the number of
> >>>      * engines involved, rather than the number of BOs.
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 17:43         ` Daniel Vetter
@ 2021-06-24 17:56           ` Christian König
  2021-06-24 18:21             ` Daniel Vetter
  0 siblings, 1 reply; 57+ messages in thread
From: Christian König @ 2021-06-24 17:56 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Jack Zhang, DRI Development, Luben Tuikov, Alex Deucher, Daniel Vetter

Am 24.06.21 um 19:43 schrieb Daniel Vetter:
> On Thu, Jun 24, 2021 at 7:38 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 24.06.21 um 19:29 schrieb Daniel Vetter:
>>> On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
>>>> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
>>>>> This is essentially part of drm_sched_dependency_optimized(), which
>>>>> only amdgpu seems to make use of. Use it a bit more.
>>>>>
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
>>>>>     1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 370c336d383f..c31d7cf7df74 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
>>>>>      if (!fence)
>>>>>              return 0;
>>>>> +   /* if it's a fence from us it's guaranteed to be earlier */
>>>>> +   if (fence->context == job->entity->fence_context ||
>>>>> +       fence->context == job->entity->fence_context + 1) {
>>>>> +           dma_fence_put(fence);
>>>>> +           return 0;
>>>>> +   }
>>>>> +
>>>> Well NAK. That would break Vulkan.
> I'm assuming your reply means the NAK is retracted and was just the
> usual "this doesn't perfectly fit for amdgpu" reflex?

Well rather "NAK, you haven't considered that special handling in amdgpu 
and if you really want to unify this you need that as well."

>
>>>> The problem is that Vulkan can insert dependencies between jobs which run on
>>>> the same queue.
>>>>
>>>> So we need to track those as well and if the previous job for the same
>>>> queue/scheduler is not yet finished a pipeline synchronization needs to be
>>>> inserted.
>>>>
>>>> That's one of the reasons we weren't able to unify the dependency handling
>>>> yet.
>>> That sounds like an extremely amdgpu specific constraint?
>> Yeah, that's totally hardware specific.
>>
>> It's just that I don't know how else we could track that without having
>> the same separation as in amdgpu between implicit and explicit fences.
>> And as far as I understand it that's exactly what you want to avoid.
>>
>> As I said this turned out to be really awkward.
>>
>>> You're also the
>>> only one who keeps track of whether the previous job we've scheduled has
>>> finished already (I guess they can get pipelined and you don't flush by
>>> default), so you insert fences.
>> Yes, exactly that.
>>
>>> I guess we can add a await_fence_no_dedup or so for amdgpu, but I'm not
>>> sure why we have to inflict this design constraint on all other drivers?
>>> At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
>>> would break with this, and i915 will also be perfectly fine.
>>>
>>> Also note: I'm not using this for amdgpu, exactly because there's a few
>>> funny things going on.
>> Yeah, exactly the reason why we never unified this.
> Yeah there's clear limits to this, because you also can't use the
> await_implicit helper, because you have to keep filtering for owner or
> the current amdgpu uapi goes horribly slow. I think the benefit would
> be just that we could share the datastructure and the book-keeping,
> but aside from that you'd need your own integration in amdgpu.

Yeah, but that is trivial. The _add_dependency() function (or whatever we
want to call it) needs to be exported anyway for adding fences from
syncfile and syncobj.

Or do you also want to unify the handling for those?

> One idea I just had was whether we could use the tag bits xarray has
> for amdgpu's purposes. Like we could do a
> drm_sched_job_await_fence_tagged, where you supply additional
> information (like the "this might be relevant for the vm_flush" and
> things like that). Afaiui xarray tags are very fast to enumerate on if
> you're looking for specific tags, but I might be wrong. Ideally this
> would avoid the need for the duplicated amdgpu_job->sched.

That could work.

Essentially we just need the scheduler to tell us which fence was the
last one that was dependency optimized.

In other words, when you push jobs like these to the same scheduler

J1
J2 -> depends on J1
J3 -> depends on whatever, but not J2

the hardware needs to insert a flush between J1 and J2, but not between
J2 and J3.

This makes a roughly 19% performance difference for some OpenGL games, and
causes incorrect rendering for Vulkan, if you get it wrong one way or the other.
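
The J1/J2/J3 case can be modelled with a small userspace sketch in plain
C. This is not how amdgpu implements it, just an illustration under
simplifying assumptions: only dependencies on earlier jobs of the same
queue are recorded, "finished" is sampled at the moment we decide, and all
names (model_queue_job, model_needs_pipeline_sync) are made up for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_JOB_DEPS 4
#define NO_DEP (-1)

struct model_queue_job {
	/* Indices of earlier same-queue jobs we depend on, NO_DEP if unused. */
	int deps[MAX_JOB_DEPS];
	bool finished;
};

/*
 * Returns true if a pipeline sync has to be emitted before jobs[idx]:
 * it depends on an earlier job of the same queue that has not finished.
 */
static bool model_needs_pipeline_sync(const struct model_queue_job *jobs,
				      size_t idx)
{
	size_t i;

	for (i = 0; i < MAX_JOB_DEPS; i++) {
		int dep = jobs[idx].deps[i];

		if (dep == NO_DEP)
			continue;
		/* A job can only depend on jobs pushed before it. */
		assert(dep >= 0 && (size_t)dep < idx);
		if (!jobs[dep].finished)
			return true;
	}
	return false;
}
```

With J1 still running, this reports a sync for J2 (which depends on J1)
and none for J3. Once J1 has finished, J2 no longer needs one either.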

Regards,
Christian.


>
> Cheers, Daniel
>
>
>> Regards,
>> Christian.
>>
>>> Finally: You _really_ need explicit dependency handling for vulkan in your
>>> uapi, instead of the kernel second-guessing what userspace might be doing.
>>> That's really not how vulkan is designed to work :-)
>>> Cheers, Daniel
>>>
>>>
>>>> Christian.
>>>>
>>>>>      /* Deduplicate if we already depend on a fence from the same context.
>>>>>       * This lets the size of the array of deps scale with the number of
>>>>>       * engines involved, rather than the number of BOs.
>



* Re: [PATCH 10/11] drm/scheduler: Don't store self-dependencies
  2021-06-24 17:56           ` Christian König
@ 2021-06-24 18:21             ` Daniel Vetter
  0 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 18:21 UTC (permalink / raw)
  To: Christian König
  Cc: Jack Zhang, DRI Development, Luben Tuikov, Alex Deucher, Daniel Vetter

On Thu, Jun 24, 2021 at 7:56 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 24.06.21 um 19:43 schrieb Daniel Vetter:
> > On Thu, Jun 24, 2021 at 7:38 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 24.06.21 um 19:29 schrieb Daniel Vetter:
> >>> On Thu, Jun 24, 2021 at 07:03:10PM +0200, Christian König wrote:
> >>>> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> >>>>> This is essentially part of drm_sched_dependency_optimized(), which
> >>>>> only amdgpu seems to make use of. Use it a bit more.
> >>>>>
> >>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>>>> ---
> >>>>>     drivers/gpu/drm/scheduler/sched_main.c | 7 +++++++
> >>>>>     1 file changed, 7 insertions(+)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> index 370c336d383f..c31d7cf7df74 100644
> >>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>>>> @@ -649,6 +649,13 @@ int drm_sched_job_await_fence(struct drm_sched_job *job,
> >>>>>      if (!fence)
> >>>>>              return 0;
> >>>>> +   /* if it's a fence from us it's guaranteed to be earlier */
> >>>>> +   if (fence->context == job->entity->fence_context ||
> >>>>> +       fence->context == job->entity->fence_context + 1) {
> >>>>> +           dma_fence_put(fence);
> >>>>> +           return 0;
> >>>>> +   }
> >>>>> +
> >>>> Well NAK. That would break Vulkan.
> > I'm assuming your reply means the NAK is retracted and was just the
> > usual "this doesn't perfectly fit for amdgpu" reflex?
>
> Well rather "NAK, you haven't considered that special handling in amdgpu
> and if you really want to unify this you need that as well."
>
> >
> >>>> The problem is that Vulkan can insert dependencies between jobs which run on
> >>>> the same queue.
> >>>>
> >>>> So we need to track those as well and if the previous job for the same
> >>>> queue/scheduler is not yet finished a pipeline synchronization needs to be
> >>>> inserted.
> >>>>
> >>>> That's one of the reasons we weren't able to unify the dependency handling
> >>>> yet.
> >>> That sounds like an extremely amdgpu specific constraint?
> >> Yeah, that's totally hardware specific.
> >>
> >> It's just that I don't know how else we could track that without having
> >> the same separation as in amdgpu between implicit and explicit fences.
> >> And as far as I understand it that's exactly what you want to avoid.
> >>
> >> As I said this turned out to be really awkward.
> >>
> >>> You're also the
> >>> only one who keeps track of whether the previous job we've scheduled has
> >>> finished already (I guess they can get pipelined and you don't flush by
> >>> default), so you insert fences.
> >> Yes, exactly that.
> >>
> >>> I guess we can add a await_fence_no_dedup or so for amdgpu, but I'm not
> >>> sure why we have to inflict this design constraint on all other drivers?
> >>> At least I'm not seeing anything in lima, panfrost, v3d or etnaviv that
> >>> would break with this, and i915 will also be perfectly fine.
> >>>
> >>> Also note: I'm not using this for amdgpu, exactly because there's a few
> >>> funny things going on.
> >> Yeah, exactly the reason why we never unified this.
> > Yeah there's clear limits to this, because you also can't use the
> > await_implicit helper, because you have to keep filtering for owner or
> > the current amdgpu uapi goes horribly slow. I think the benefit would
> > be just that we could share the datastructure and the book-keeping,
> > but aside from that you'd need your own integration in amdgpu.
>
> Yeah, but that is trivial. The _add_dependency() function (or however we
> want to call it) needs to be exported anyway for adding fences from
> syncfile and syncobj.
>
> Or do you also want to unify the handling for those?

I guess we could add some convenience wrapper that pulls in a
sync_file or syncobj automatically. But there's not that much code
involved there, and it's also not tricky. Also drivers might need to
add dependencies for other things anyway. The await_implicit is a bit
different, because that defines how implicit sync is supposed to work.

I guess the bikeshed then boils down to which one is the simple
await_fence() function: the one that filters for same timeline, or the
one that doesn't. I'd make the non-filtering one the special case so
that amdgpu sticks out a bit more - out of 6 drivers with schedulers
(i915 included) it seems to be the special one.
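
As a rough illustration of the two variants, here is a userspace model in
plain C. It mirrors the context check from the quoted patch (an entity
owns fence_context for "scheduled" and fence_context + 1 for "finished"),
but everything else is an assumption for the sake of the example: the
model_ names are hypothetical, fences are reduced to their context
number, and the dependency store is a fixed-size array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_DEPS 8

struct model_entity {
	/* "scheduled" timeline; "finished" is fence_context + 1 */
	uint64_t fence_context;
};

struct model_sched_job {
	const struct model_entity *entity;
	uint64_t dep_contexts[MODEL_MAX_DEPS];
	size_t num_deps;
};

/* True if ctx is one of the two timelines owned by the job's entity. */
static bool model_is_self_dependency(const struct model_sched_job *job,
				     uint64_t ctx)
{
	return ctx == job->entity->fence_context ||
	       ctx == job->entity->fence_context + 1;
}

/* Non-filtering variant: always records the dependency. */
static int model_await_fence_no_dedup(struct model_sched_job *job,
				      uint64_t ctx)
{
	assert(job->entity != NULL);

	if (job->num_deps == MODEL_MAX_DEPS)
		return -1;
	job->dep_contexts[job->num_deps++] = ctx;
	return 0;
}

/* Filtering variant: silently drops fences from our own timelines. */
static int model_await_fence(struct model_sched_job *job, uint64_t ctx)
{
	if (model_is_self_dependency(job, ctx))
		return 0;
	return model_await_fence_no_dedup(job, ctx);
}
```

Most drivers would only ever call the filtering one. A driver that needs
to see same-queue dependencies, like amdgpu, would call the non-filtering
one explicitly, which makes the special case visible at the call site.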

> > One idea I just had was whether we could use the tag bits xarray has
> > for amdgpu's purposes. Like we could do a
> > drm_sched_job_await_fence_tagged, where you supply additional
> > information (like the "this might be relevant for the vm_flush" and
> > things like that). Afaiui xarray tags are very fast to enumerate on if
> > you're looking for specific tags, but I might be wrong. Ideally this
> > would avoid the need for the duplicated amdgpu_job->sched.
>
> That could work.
>
> Essentially we just need the information from the scheduler which is the
> last fence which was dependency optimized.
>
> In other words when you push jobs like those to the same scheduler
>
> J1
> J2 -> depends on J1.
> J3 -> depends on whatever, but not j2
>
> The hardware needs to insert a flush between J2 and J1, but not between
> j3 and j2.
>
> This makes roughly 19% performance difference for some OpenGL games and
> incorrect rendering for Vulkan if you mess it up either way or the other.

Yeah that's massive. On i915 the cost of "too many pipeline stalls", even
within batches, is a lot smaller, so we never bothered with this at all.
-Daniel

>
> Regards,
> Christian.
>
>
> >
> > Cheers, Daniel
> >
> >
> >> Regards,
> >> Christian.
> >>
> >>> Finally: You _really_ need explicit dependency handling for vulkan in your
> >>> uapi, instead of the kernel second-guessing what userspace might be doing.
> >>> That's really not how vulkan is designed to work :-)
> >>> Cheers, Daniel
> >>>
> >>>
> >>>> Christian.
> >>>>
> >>>>>      /* Deduplicate if we already depend on a fence from the same context.
> >>>>>       * This lets the size of the array of deps scale with the number of
> >>>>>       * engines involved, rather than the number of BOs.
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 01/11] drm/sched: Split drm_sched_job_init
  2021-06-24 17:39         ` Christian König
@ 2021-06-24 18:22           ` Daniel Vetter
  -1 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 18:22 UTC (permalink / raw)
  To: Christian König
  Cc: DRI Development, Daniel Vetter, Lucas Stach, Russell King,
	Christian Gmeiner, Qiang Yu, Rob Herring, Tomeu Vizoso,
	Steven Price, Alyssa Rosenzweig, David Airlie, Sumit Semwal,
	Masahiro Yamada, Kees Cook, Adam Borowski, Nick Terrell,
	Mauro Carvalho Chehab, Paul Menzel, Sami Tolvanen, Viresh Kumar,
	Alex Deucher, Dave Airlie, Nirmoy Das, Deepak R Varma, Lee Jones,
	Kevin Wang, Chen Li, Luben Tuikov, Marek Olšák,
	Dennis Li, Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, The etnaviv authors, lima,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, Jun 24, 2021 at 7:39 PM Christian König
<christian.koenig@amd.com> wrote:
>
>
>
> Am 24.06.21 um 19:37 schrieb Daniel Vetter:
> > On Thu, Jun 24, 2021 at 7:30 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 24.06.21 um 16:00 schrieb Daniel Vetter:
> >>> This is a very confusingly named function, because not just does it
> >>> init an object, it arms it and provides a point of no return for
> >>> pushing a job into the scheduler. It would be nice if that's a bit
> >>> clearer in the interface.
> >> We originally had that in the push_job interface, but moved that to init
> >> for some reason I don't remember.
> >>
> >>> But the real reason is that I want to push the dependency tracking
> >>> helpers into the scheduler code, and that means drm_sched_job_init
> >>> must be called a lot earlier, without arming the job.
> >> I'm really questioning myself if I like that naming.
> >>
> >> What about using drm_sched_job_add_dependency instead?
> > You're suggesting a
> > s/drm_sched_job_init/drm_sched_job_add_dependency/, or just replied to
> > the wrong patch?
>
> Replied to the wrong patch accidentally. I was talking about the "await"
> terminology.

Can you please reply there so the discussion doesn't get too
confusing?
Thanks, Daniel


> Christian.
>
> > -Daniel
> >
> >> Christian.
> >>
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> Cc: Lucas Stach <l.stach@pengutronix.de>
> >>> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> >>> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> >>> Cc: Qiang Yu <yuq825@gmail.com>
> >>> Cc: Rob Herring <robh@kernel.org>
> >>> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> >>> Cc: Steven Price <steven.price@arm.com>
> >>> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> >>> Cc: David Airlie <airlied@linux.ie>
> >>> Cc: Daniel Vetter <daniel@ffwll.ch>
> >>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>> Cc: "Christian König" <christian.koenig@amd.com>
> >>> Cc: Masahiro Yamada <masahiroy@kernel.org>
> >>> Cc: Kees Cook <keescook@chromium.org>
> >>> Cc: Adam Borowski <kilobyte@angband.pl>
> >>> Cc: Nick Terrell <terrelln@fb.com>
> >>> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> >>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> >>> Cc: Sami Tolvanen <samitolvanen@google.com>
> >>> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> >>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>> Cc: Dave Airlie <airlied@redhat.com>
> >>> Cc: Nirmoy Das <nirmoy.das@amd.com>
> >>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>> Cc: Lee Jones <lee.jones@linaro.org>
> >>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>> Cc: Chen Li <chenli@uniontech.com>
> >>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>> Cc: "Marek Olšák" <marek.olsak@amd.com>
> >>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> >>> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> >>> Cc: Sonny Jiang <sonny.jiang@amd.com>
> >>> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> >>> Cc: Tian Tao <tiantao6@hisilicon.com>
> >>> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> >>> Cc: etnaviv@lists.freedesktop.org
> >>> Cc: lima@lists.freedesktop.org
> >>> Cc: linux-media@vger.kernel.org
> >>> Cc: linaro-mm-sig@lists.linaro.org
> >>> ---
> >>>    .gitignore                               |  1 +
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >>>    drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >>>    drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >>>    drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >>>    drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >>>    drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
> >>>    drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
> >>>    drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
> >>>    include/drm/gpu_scheduler.h              |  6 +++++-
> >>>    10 files changed, 51 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/.gitignore b/.gitignore
> >>> index 7afd412dadd2..52433a930299 100644
> >>> --- a/.gitignore
> >>> +++ b/.gitignore
> >>> @@ -66,6 +66,7 @@ modules.order
> >>>    /modules.builtin
> >>>    /modules.builtin.modinfo
> >>>    /modules.nsdeps
> >>> +*.builtin
> >>>
> >>>    #
> >>>    # RPM spec file (make rpm-pkg)
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> index c5386d13eb4a..a4ec092af9a7 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >>>        if (r)
> >>>                goto error_unlock;
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        /* No memory allocation is allowed while holding the notifier lock.
> >>>         * The lock is held until amdgpu_cs_submit is finished and fence is
> >>>         * added to BOs.
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> index d33e6d97cc89..5ddb955d2315 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> >>> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >>>        if (r)
> >>>                return r;
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        *f = dma_fence_get(&job->base.s_fence->finished);
> >>>        amdgpu_job_free_resources(job);
> >>>        drm_sched_entity_push_job(&job->base, entity);
> >>> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> index 19826e504efc..af1671f01c7f 100644
> >>> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> >>> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >>>        if (ret)
> >>>                goto out_unlock;
> >>>
> >>> +     drm_sched_job_arm(&submit->sched_job);
> >>> +
> >>>        submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >>>        submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >>>                                                submit->out_fence, 0,
> >>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> >>> index ecf3267334ff..bd1af1fd8c0f 100644
> >>> --- a/drivers/gpu/drm/lima/lima_sched.c
> >>> +++ b/drivers/gpu/drm/lima/lima_sched.c
> >>> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >>>                return err;
> >>>        }
> >>>
> >>> +     drm_sched_job_arm(&task->base);
> >>> +
> >>>        task->num_bos = num_bos;
> >>>        task->vm = lima_vm_get(vm);
> >>>
> >>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> index beb62c8fc851..1e950534b9b0 100644
> >>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> >>> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >>>                goto unlock;
> >>>        }
> >>>
> >>> +     drm_sched_job_arm(&job->base);
> >>> +
> >>>        job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >>>
> >>>        ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> index 79554aa4dbb1..f7347c284886 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> >>> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >>>     * @sched_job: job to submit
> >>>     * @entity: scheduler entity
> >>>     *
> >>> - * Note: To guarantee that the order of insertion to queue matches
> >>> - * the job's fence sequence number this function should be
> >>> - * called with drm_sched_job_init under common lock.
> >>> + * Note: To guarantee that the order of insertion to queue matches the job's
> >>> + * fence sequence number this function should be called with drm_sched_job_arm()
> >>> + * under common lock.
> >>>     *
> >>>     * Returns 0 for success, negative error code otherwise.
> >>>     */
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> >>> index 69de2c76731f..0ba810c198bd 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> >>> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >>>    }
> >>>    EXPORT_SYMBOL(to_drm_sched_fence);
> >>>
> >>> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>> -                                            void *owner)
> >>> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> >>> +                                           void *owner)
> >>>    {
> >>>        struct drm_sched_fence *fence = NULL;
> >>> -     unsigned seq;
> >>>
> >>>        fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >>>        if (fence == NULL)
> >>> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >>>        fence->sched = entity->rq->sched;
> >>>        spin_lock_init(&fence->lock);
> >>>
> >>> +     return fence;
> >>> +}
> >>> +
> >>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>> +                       struct drm_sched_entity *entity)
> >>> +{
> >>> +     unsigned seq;
> >>> +
> >>>        seq = atomic_inc_return(&entity->fence_seq);
> >>>        dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
> >>>                       &fence->lock, entity->fence_context, seq);
> >>>        dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
> >>>                       &fence->lock, entity->fence_context + 1, seq);
> >>> -
> >>> -     return fence;
> >>>    }
> >>>
> >>>    module_init(drm_sched_fence_slab_init);
> >>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
> >>> index 61420a9c1021..70eefed17e06 100644
> >>> --- a/drivers/gpu/drm/scheduler/sched_main.c
> >>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> >>> @@ -48,9 +48,11 @@
> >>>    #include <linux/wait.h>
> >>>    #include <linux/sched.h>
> >>>    #include <linux/completion.h>
> >>> +#include <linux/dma-resv.h>
> >>>    #include <uapi/linux/sched/types.h>
> >>>
> >>>    #include <drm/drm_print.h>
> >>> +#include <drm/drm_gem.h>
> >>>    #include <drm/gpu_scheduler.h>
> >>>    #include <drm/spsc_queue.h>
> >>>
> >>> @@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>        job->sched = sched;
> >>>        job->entity = entity;
> >>>        job->s_priority = entity->rq - sched->sched_rq;
> >>> -     job->s_fence = drm_sched_fence_create(entity, owner);
> >>> +     job->s_fence = drm_sched_fence_alloc(entity, owner);
> >>>        if (!job->s_fence)
> >>>                return -ENOMEM;
> >>>        job->id = atomic64_inc_return(&sched->job_id_count);
> >>> @@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
> >>>    }
> >>>    EXPORT_SYMBOL(drm_sched_job_init);
> >>>
> >>> +/**
> >>> + * drm_sched_job_arm - arm a scheduler job for execution
> >>> + * @job: scheduler job to arm
> >>> + *
> >>> + * This arms a scheduler job for execution. Specifically it initializes the
> >>> + * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
> >>> + * or other places that need to track the completion of this job.
> >>> + *
> >>> + * Refer to drm_sched_entity_push_job() documentation for locking
> >>> + * considerations.
> >>> + *
> >>> + * This can only be called if drm_sched_job_init() succeeded.
> >>> + */
> >>> +void drm_sched_job_arm(struct drm_sched_job *job)
> >>> +{
> >>> +     drm_sched_fence_init(job->s_fence, job->entity);
> >>> +}
> >>> +EXPORT_SYMBOL(drm_sched_job_arm);
> >>> +
> >>>    /**
> >>>     * drm_sched_job_cleanup - clean up scheduler job resources
> >>>     *
> >>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> >>> index d18af49fd009..80438d126c9d 100644
> >>> --- a/include/drm/gpu_scheduler.h
> >>> +++ b/include/drm/gpu_scheduler.h
> >>> @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >>>    int drm_sched_job_init(struct drm_sched_job *job,
> >>>                       struct drm_sched_entity *entity,
> >>>                       void *owner);
> >>> +void drm_sched_job_arm(struct drm_sched_job *job);
> >>>    void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >>>                                    struct drm_gpu_scheduler **sched_list,
> >>>                                       unsigned int num_sched_list);
> >>> @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >>>                                   enum drm_sched_priority priority);
> >>>    bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >>>
> >>> -struct drm_sched_fence *drm_sched_fence_create(
> >>> +struct drm_sched_fence *drm_sched_fence_alloc(
> >>>        struct drm_sched_entity *s_entity, void *owner);
> >>> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> >>> +                       struct drm_sched_entity *entity);
> >>> +
> >>>    void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >>>    void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >>>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* [PATCH] drm/v3d: Move drm_sched_job_init to v3d_job_init
  2021-06-24 14:00 ` [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init Daniel Vetter
  2021-06-24 16:59   ` Emma Anholt
@ 2021-06-24 20:45   ` Daniel Vetter
  2021-06-24 20:49     ` Emma Anholt
  1 sibling, 1 reply; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 20:45 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Emma Anholt, Daniel Vetter

Prep work for using the scheduler dependency handling. We need to call
drm_sched_job_init earlier so we can use the new drm_sched_job_await*
functions for dependency handling here.

v2: Slightly better commit message and rebase to include the
drm_sched_job_arm() call (Emma).
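
As an illustration of the reordering described above, here is a minimal
userspace sketch, not the driver code. All mock_* names are hypothetical
stand-ins, and the sketch assumes that dependencies are collected before the
job is armed. It shows why the fallible init step has to come first:
dependency tracking needs an initialized job, and once init has moved out of
the push path, the push itself has nothing left that can fail.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_DEPS 4

/* Hypothetical stand-in for a scheduler job; not the kernel struct. */
struct mock_job {
	int initialized;		/* set by mock_job_init() */
	int armed;			/* set by mock_job_push() */
	int pushed;
	size_t num_deps;
	int deps[MOCK_MAX_DEPS];	/* fence ids this job must wait for */
};

/* Fallible step, now done early during job setup. */
static int mock_job_init(struct mock_job *job)
{
	job->initialized = 1;
	job->armed = 0;
	job->pushed = 0;
	job->num_deps = 0;
	return 0;
}

/* Dependency tracking needs an initialized job, hence the early init. */
static int mock_job_add_dep(struct mock_job *job, int fence_id)
{
	/* Assumption of this sketch: no new dependencies once armed. */
	if (!job->initialized || job->armed)
		return -1;
	if (job->num_deps == MOCK_MAX_DEPS)
		return -1;
	job->deps[job->num_deps++] = fence_id;
	return 0;
}

/* Infallible tail of the submission: arm, then push. Returns void. */
static void mock_job_push(struct mock_job *job)
{
	assert(job->initialized);
	job->armed = 1;
	job->pushed = 1;
}
```

With this shape the caller handles every error before taking the submission
lock, which is what lets the fail_unreserve paths after the push calls go away
in the diff below.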

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Emma Anholt <emma@anholt.net>
---
 drivers/gpu/drm/v3d/v3d_gem.c | 64 +++++++++++------------------------
 1 file changed, 20 insertions(+), 44 deletions(-)

diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 69ac20e11b09..d1028ccf6dd5 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -433,9 +433,10 @@ v3d_wait_bo_ioctl(struct drm_device *dev, void *data,
 static int
 v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	     struct v3d_job *job, void (*free)(struct kref *ref),
-	     u32 in_sync)
+	     u32 in_sync, enum v3d_queue queue)
 {
 	struct dma_fence *in_fence = NULL;
+	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	int ret;
 
 	job->v3d = v3d;
@@ -446,6 +447,10 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 		return ret;
 
 	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
+	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
+				 v3d_priv);
+	if (ret)
+		goto fail;
 
 	ret = drm_syncobj_find_fence(file_priv, in_sync, 0, 0, &in_fence);
 	if (ret == -EINVAL)
@@ -464,17 +469,9 @@ v3d_job_init(struct v3d_dev *v3d, struct drm_file *file_priv,
 	return ret;
 }
 
-static int
-v3d_push_job(struct v3d_file_priv *v3d_priv,
-	     struct v3d_job *job, enum v3d_queue queue)
+static void
+v3d_push_job(struct v3d_job *job)
 {
-	int ret;
-
-	ret = drm_sched_job_init(&job->base, &v3d_priv->sched_entity[queue],
-				 v3d_priv);
-	if (ret)
-		return ret;
-
 	drm_sched_job_arm(&job->base);
 
 	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
@@ -483,8 +480,6 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
 	kref_get(&job->refcount);
 
 	drm_sched_entity_push_job(&job->base);
-
-	return 0;
 }
 
 static void
@@ -530,7 +525,6 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 		    struct drm_file *file_priv)
 {
 	struct v3d_dev *v3d = to_v3d_dev(dev);
-	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	struct drm_v3d_submit_cl *args = data;
 	struct v3d_bin_job *bin = NULL;
 	struct v3d_render_job *render;
@@ -556,7 +550,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 	INIT_LIST_HEAD(&render->unref_list);
 
 	ret = v3d_job_init(v3d, file_priv, &render->base,
-			   v3d_render_job_free, args->in_sync_rcl);
+			   v3d_render_job_free, args->in_sync_rcl, V3D_RENDER);
 	if (ret) {
 		kfree(render);
 		return ret;
@@ -570,7 +564,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 		}
 
 		ret = v3d_job_init(v3d, file_priv, &bin->base,
-				   v3d_job_free, args->in_sync_bcl);
+				   v3d_job_free, args->in_sync_bcl, V3D_BIN);
 		if (ret) {
 			v3d_job_put(&render->base);
 			kfree(bin);
@@ -592,7 +586,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 			goto fail;
 		}
 
-		ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+		ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, V3D_CACHE_CLEAN);
 		if (ret) {
 			kfree(clean_job);
 			clean_job = NULL;
@@ -615,9 +609,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 
 	mutex_lock(&v3d->sched_lock);
 	if (bin) {
-		ret = v3d_push_job(v3d_priv, &bin->base, V3D_BIN);
-		if (ret)
-			goto fail_unreserve;
+		v3d_push_job(&bin->base);
 
 		ret = drm_gem_fence_array_add(&render->base.deps,
 					      dma_fence_get(bin->base.done_fence));
@@ -625,9 +617,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 			goto fail_unreserve;
 	}
 
-	ret = v3d_push_job(v3d_priv, &render->base, V3D_RENDER);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(&render->base);
 
 	if (clean_job) {
 		struct dma_fence *render_fence =
@@ -635,9 +625,7 @@ v3d_submit_cl_ioctl(struct drm_device *dev, void *data,
 		ret = drm_gem_fence_array_add(&clean_job->deps, render_fence);
 		if (ret)
 			goto fail_unreserve;
-		ret = v3d_push_job(v3d_priv, clean_job, V3D_CACHE_CLEAN);
-		if (ret)
-			goto fail_unreserve;
+		v3d_push_job(clean_job);
 	}
 
 	mutex_unlock(&v3d->sched_lock);
@@ -684,7 +672,6 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file_priv)
 {
 	struct v3d_dev *v3d = to_v3d_dev(dev);
-	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	struct drm_v3d_submit_tfu *args = data;
 	struct v3d_tfu_job *job;
 	struct ww_acquire_ctx acquire_ctx;
@@ -697,7 +684,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 
 	ret = v3d_job_init(v3d, file_priv, &job->base,
-			   v3d_job_free, args->in_sync);
+			   v3d_job_free, args->in_sync, V3D_TFU);
 	if (ret) {
 		kfree(job);
 		return ret;
@@ -741,9 +728,7 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 		goto fail;
 
 	mutex_lock(&v3d->sched_lock);
-	ret = v3d_push_job(v3d_priv, &job->base, V3D_TFU);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(&job->base);
 	mutex_unlock(&v3d->sched_lock);
 
 	v3d_attach_fences_and_unlock_reservation(file_priv,
@@ -755,10 +740,6 @@ v3d_submit_tfu_ioctl(struct drm_device *dev, void *data,
 
 	return 0;
 
-fail_unreserve:
-	mutex_unlock(&v3d->sched_lock);
-	drm_gem_unlock_reservations(job->base.bo, job->base.bo_count,
-				    &acquire_ctx);
 fail:
 	v3d_job_put(&job->base);
 
@@ -779,7 +760,6 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		     struct drm_file *file_priv)
 {
 	struct v3d_dev *v3d = to_v3d_dev(dev);
-	struct v3d_file_priv *v3d_priv = file_priv->driver_priv;
 	struct drm_v3d_submit_csd *args = data;
 	struct v3d_csd_job *job;
 	struct v3d_job *clean_job;
@@ -798,7 +778,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 
 	ret = v3d_job_init(v3d, file_priv, &job->base,
-			   v3d_job_free, args->in_sync);
+			   v3d_job_free, args->in_sync, V3D_CSD);
 	if (ret) {
 		kfree(job);
 		return ret;
@@ -811,7 +791,7 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 	}
 
-	ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0);
+	ret = v3d_job_init(v3d, file_priv, clean_job, v3d_job_free, 0, V3D_CACHE_CLEAN);
 	if (ret) {
 		v3d_job_put(&job->base);
 		kfree(clean_job);
@@ -830,18 +810,14 @@ v3d_submit_csd_ioctl(struct drm_device *dev, void *data,
 		goto fail;
 
 	mutex_lock(&v3d->sched_lock);
-	ret = v3d_push_job(v3d_priv, &job->base, V3D_CSD);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(&job->base);
 
 	ret = drm_gem_fence_array_add(&clean_job->deps,
 				      dma_fence_get(job->base.done_fence));
 	if (ret)
 		goto fail_unreserve;
 
-	ret = v3d_push_job(v3d_priv, clean_job, V3D_CACHE_CLEAN);
-	if (ret)
-		goto fail_unreserve;
+	v3d_push_job(clean_job);
 	mutex_unlock(&v3d->sched_lock);
 
 	v3d_attach_fences_and_unlock_reservation(file_priv,
-- 
2.32.0.rc2


* [PATCH] drm/sched: Split drm_sched_job_init
  2021-06-24 14:00   ` Daniel Vetter
@ 2021-06-24 20:45     ` Daniel Vetter
  -1 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 20:45 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Steven Price, Daniel Vetter, Lucas Stach,
	Russell King, Christian Gmeiner, Qiang Yu, Rob Herring,
	Tomeu Vizoso, Alyssa Rosenzweig, David Airlie, Daniel Vetter,
	Sumit Semwal, Christian König, Masahiro Yamada, Kees Cook,
	Adam Borowski, Nick Terrell, Mauro Carvalho Chehab, Paul Menzel,
	Sami Tolvanen, Viresh Kumar, Alex Deucher, Dave Airlie,
	Nirmoy Das, Deepak R Varma, Lee Jones, Kevin Wang, Chen Li,
	Luben Tuikov, Marek Olšák, Dennis Li,
	Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, etnaviv, lima,
	linux-media, linaro-mm-sig, Emma Anholt

This is a very confusingly named function: it does not just init an
object, it also arms it and provides a point of no return for pushing
a job into the scheduler. It would be nice if that were a bit clearer
in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
 drivers/gpu/drm/lima/lima_sched.c        |  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
 drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
 drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
 drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
 drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
 include/drm/gpu_scheduler.h              |  6 +++++-
 10 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	if (r)
 		goto error_unlock;
 
+	drm_sched_job_arm(&job->base);
+
 	/* No memory allocation is allowed while holding the notifier lock.
 	 * The lock is held until amdgpu_cs_submit is finished and fence is
 	 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 	if (r)
 		return r;
 
+	drm_sched_job_arm(&job->base);
+
 	*f = dma_fence_get(&job->base.s_fence->finished);
 	amdgpu_job_free_resources(job);
 	drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 19826e504efc..af1671f01c7f 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	if (ret)
 		goto out_unlock;
 
+	drm_sched_job_arm(&submit->sched_job);
+
 	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
 	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
 						submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ecf3267334ff..bd1af1fd8c0f 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
 		return err;
 	}
 
+	drm_sched_job_arm(&task->base);
+
 	task->num_bos = num_bos;
 	task->vm = lima_vm_get(vm);
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index beb62c8fc851..1e950534b9b0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
 		goto unlock;
 	}
 
+	drm_sched_job_arm(&job->base);
+
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 79554aa4dbb1..f7347c284886 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  * @sched_job: job to submit
  * @entity: scheduler entity
  *
- * Note: To guarantee that the order of insertion to queue matches
- * the job's fence sequence number this function should be
- * called with drm_sched_job_init under common lock.
+ * Note: To guarantee that the order of insertion to queue matches the job's
+ * fence sequence number this function should be called with drm_sched_job_arm()
+ * under common lock.
  *
  * Returns 0 for success, negative error code otherwise.
  */
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 69de2c76731f..0ba810c198bd 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 }
 EXPORT_SYMBOL(to_drm_sched_fence);
 
-struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
-					       void *owner)
+struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
+					      void *owner)
 {
 	struct drm_sched_fence *fence = NULL;
-	unsigned seq;
 
 	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
 	if (fence == NULL)
@@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
 	fence->sched = entity->rq->sched;
 	spin_lock_init(&fence->lock);
 
+	return fence;
+}
+
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity)
+{
+	unsigned seq;
+
 	seq = atomic_inc_return(&entity->fence_seq);
 	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
 		       &fence->lock, entity->fence_context, seq);
 	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
 		       &fence->lock, entity->fence_context + 1, seq);
-
-	return fence;
 }
 
 module_init(drm_sched_fence_slab_init);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 61420a9c1021..70eefed17e06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -48,9 +48,11 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/completion.h>
+#include <linux/dma-resv.h>
 #include <uapi/linux/sched/types.h>
 
 #include <drm/drm_print.h>
+#include <drm/drm_gem.h>
 #include <drm/gpu_scheduler.h>
 #include <drm/spsc_queue.h>
 
@@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 	job->sched = sched;
 	job->entity = entity;
 	job->s_priority = entity->rq - sched->sched_rq;
-	job->s_fence = drm_sched_fence_create(entity, owner);
+	job->s_fence = drm_sched_fence_alloc(entity, owner);
 	if (!job->s_fence)
 		return -ENOMEM;
 	job->id = atomic64_inc_return(&sched->job_id_count);
@@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
 }
 EXPORT_SYMBOL(drm_sched_job_init);
 
+/**
+ * drm_sched_job_arm - arm a scheduler job for execution
+ * @job: scheduler job to arm
+ *
+ * This arms a scheduler job for execution. Specifically it initializes the
+ * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
+ * or other places that need to track the completion of this job.
+ *
+ * Refer to drm_sched_entity_push_job() documentation for locking
+ * considerations.
+ *
+ * This can only be called if drm_sched_job_init() succeeded.
+ */
+void drm_sched_job_arm(struct drm_sched_job *job)
+{
+	drm_sched_fence_init(job->s_fence, job->entity);
+}
+EXPORT_SYMBOL(drm_sched_job_arm);
+
 /**
  * drm_sched_job_cleanup - clean up scheduler job resources
  *
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 4eb354226972..5c3a99027ecd 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
 	if (ret)
 		return ret;
 
+	drm_sched_job_arm(&job->base);
+
 	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	/* put by scheduler job completion */
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d18af49fd009..80438d126c9d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
 		       void *owner);
+void drm_sched_job_arm(struct drm_sched_job *job);
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
                                    unsigned int num_sched_list);
@@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 				   enum drm_sched_priority priority);
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 
-struct drm_sched_fence *drm_sched_fence_create(
+struct drm_sched_fence *drm_sched_fence_alloc(
 	struct drm_sched_entity *s_entity, void *owner);
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity);
+
 void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
 void drm_sched_fence_finished(struct drm_sched_fence *fence);
 
-- 
2.32.0.rc2



* [PATCH] drm/sched: Split drm_sched_job_init
@ 2021-06-24 20:45     ` Daniel Vetter
  0 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 20:45 UTC (permalink / raw)
  To: DRI Development
  Cc: Emma Anholt, Adam Borowski, David Airlie, Daniel Vetter,
	Sonny Jiang, Nirmoy Das, Daniel Vetter, Lee Jones, Jack Zhang,
	lima, Mauro Carvalho Chehab, Masahiro Yamada, Steven Price,
	Luben Tuikov, Alyssa Rosenzweig, Sami Tolvanen, Viresh Kumar,
	Dave Airlie, Dennis Li, Chen Li, Paul Menzel, Kevin Wang,
	Kees Cook, Marek Olšák, Russell King, etnaviv,
	linaro-mm-sig, Deepak R Varma, Tomeu Vizoso, Nick Terrell,
	Boris Brezillon, Qiang Yu, Alex Deucher, Tian Tao, linux-media,
	Christian König

This is a very confusingly named function: it does not just init
an object, it also arms it and provides a point of no return for
pushing a job into the scheduler. It would be nice if that were a
bit clearer in the interface.

But the real reason is that I want to push the dependency tracking
helpers into the scheduler code, and that means drm_sched_job_init
must be called a lot earlier, without arming the job.
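
To illustrate the split described above, here is a minimal userspace
sketch in C. It is not kernel code: the mock_* types and functions are
hypothetical stand-ins, and the only behaviour modelled is that the init
step allocates the fence and can fail, while the arm step cannot fail and
is the point where the fence sequence number is taken from the entity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct mock_entity {
	unsigned int fence_seq;		/* models entity->fence_seq */
};

struct mock_fence {
	unsigned int seqno;
	int armed;
};

struct mock_job {
	struct mock_entity *entity;
	struct mock_fence *s_fence;
};

/* Phase 1: allocate only. Can fail, and can still be undone. */
static int mock_job_init(struct mock_job *job, struct mock_entity *entity)
{
	job->entity = entity;
	job->s_fence = calloc(1, sizeof(*job->s_fence));
	return job->s_fence ? 0 : -1;
}

/* Phase 2: cannot fail. The sequence number is assigned here. */
static void mock_job_arm(struct mock_job *job)
{
	job->s_fence->seqno = ++job->entity->fence_seq;
	job->s_fence->armed = 1;
}

static void mock_job_cleanup(struct mock_job *job)
{
	free(job->s_fence);
	job->s_fence = NULL;
}

/* Returns 0 if sequence numbers follow arm order, not init order. */
static int mock_demo_arm_order(void)
{
	struct mock_entity entity = { 0 };
	struct mock_job a = { 0 }, b = { 0 };
	int ok;

	if (mock_job_init(&a, &entity))
		return -1;
	if (mock_job_init(&b, &entity)) {
		mock_job_cleanup(&a);
		return -1;
	}

	/* Dependency collection would happen here, before arming. */
	mock_job_arm(&b);
	mock_job_arm(&a);

	ok = (b.s_fence->seqno == 1 && a.s_fence->seqno == 2);
	mock_job_cleanup(&a);
	mock_job_cleanup(&b);
	return ok ? 0 : -1;
}
```

In this model, job a is initialized first but armed last, so it ends up
with the later sequence number. That is why the ordering lock mentioned
in the drm_sched_entity_push_job() kerneldoc now has to cover the arm
step rather than the init step.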

v2:
- don't change .gitignore (Steven)
- don't forget v3d (Emma)

Acked-by: Steven Price <steven.price@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Cc: Qiang Yu <yuq825@gmail.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: "Marek Olšák" <marek.olsak@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Cc: Sonny Jiang <sonny.jiang@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: Jack Zhang <Jack.Zhang1@amd.com>
Cc: etnaviv@lists.freedesktop.org
Cc: lima@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: Emma Anholt <emma@anholt.net>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
 drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
 drivers/gpu/drm/lima/lima_sched.c        |  2 ++
 drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
 drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
 drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
 drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
 drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
 include/drm/gpu_scheduler.h              |  6 +++++-
 10 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index c5386d13eb4a..a4ec092af9a7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 	if (r)
 		goto error_unlock;
 
+	drm_sched_job_arm(&job->base);
+
 	/* No memory allocation is allowed while holding the notifier lock.
 	 * The lock is held until amdgpu_cs_submit is finished and fence is
 	 * added to BOs.
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index d33e6d97cc89..5ddb955d2315 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
 	if (r)
 		return r;
 
+	drm_sched_job_arm(&job->base);
+
 	*f = dma_fence_get(&job->base.s_fence->finished);
 	amdgpu_job_free_resources(job);
 	drm_sched_entity_push_job(&job->base, entity);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 19826e504efc..af1671f01c7f 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
 	if (ret)
 		goto out_unlock;
 
+	drm_sched_job_arm(&submit->sched_job);
+
 	submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
 	submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
 						submit->out_fence, 0,
diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ecf3267334ff..bd1af1fd8c0f 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
 		return err;
 	}
 
+	drm_sched_job_arm(&task->base);
+
 	task->num_bos = num_bos;
 	task->vm = lima_vm_get(vm);
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index beb62c8fc851..1e950534b9b0 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
 		goto unlock;
 	}
 
+	drm_sched_job_arm(&job->base);
+
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 79554aa4dbb1..f7347c284886 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  * @sched_job: job to submit
  * @entity: scheduler entity
  *
- * Note: To guarantee that the order of insertion to queue matches
- * the job's fence sequence number this function should be
- * called with drm_sched_job_init under common lock.
+ * Note: To guarantee that the order of insertion to queue matches the job's
+ * fence sequence number this function should be called with drm_sched_job_arm()
+ * under common lock.
  *
  * Returns 0 for success, negative error code otherwise.
  */
diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
index 69de2c76731f..0ba810c198bd 100644
--- a/drivers/gpu/drm/scheduler/sched_fence.c
+++ b/drivers/gpu/drm/scheduler/sched_fence.c
@@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
 }
 EXPORT_SYMBOL(to_drm_sched_fence);
 
-struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
-					       void *owner)
+struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
+					      void *owner)
 {
 	struct drm_sched_fence *fence = NULL;
-	unsigned seq;
 
 	fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
 	if (fence == NULL)
@@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
 	fence->sched = entity->rq->sched;
 	spin_lock_init(&fence->lock);
 
+	return fence;
+}
+
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity)
+{
+	unsigned seq;
+
 	seq = atomic_inc_return(&entity->fence_seq);
 	dma_fence_init(&fence->scheduled, &drm_sched_fence_ops_scheduled,
 		       &fence->lock, entity->fence_context, seq);
 	dma_fence_init(&fence->finished, &drm_sched_fence_ops_finished,
 		       &fence->lock, entity->fence_context + 1, seq);
-
-	return fence;
 }
 
 module_init(drm_sched_fence_slab_init);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 61420a9c1021..70eefed17e06 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -48,9 +48,11 @@
 #include <linux/wait.h>
 #include <linux/sched.h>
 #include <linux/completion.h>
+#include <linux/dma-resv.h>
 #include <uapi/linux/sched/types.h>
 
 #include <drm/drm_print.h>
+#include <drm/drm_gem.h>
 #include <drm/gpu_scheduler.h>
 #include <drm/spsc_queue.h>
 
@@ -594,7 +596,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 	job->sched = sched;
 	job->entity = entity;
 	job->s_priority = entity->rq - sched->sched_rq;
-	job->s_fence = drm_sched_fence_create(entity, owner);
+	job->s_fence = drm_sched_fence_alloc(entity, owner);
 	if (!job->s_fence)
 		return -ENOMEM;
 	job->id = atomic64_inc_return(&sched->job_id_count);
@@ -605,6 +607,25 @@ int drm_sched_job_init(struct drm_sched_job *job,
 }
 EXPORT_SYMBOL(drm_sched_job_init);
 
+/**
+ * drm_sched_job_arm - arm a scheduler job for execution
+ * @job: scheduler job to arm
+ *
+ * This arms a scheduler job for execution. Specifically it initializes the
+ * &drm_sched_job.s_fence of @job, so that it can be attached to struct dma_resv
+ * or other places that need to track the completion of this job.
+ *
+ * Refer to drm_sched_entity_push_job() documentation for locking
+ * considerations.
+ *
+ * This can only be called if drm_sched_job_init() succeeded.
+ */
+void drm_sched_job_arm(struct drm_sched_job *job)
+{
+	drm_sched_fence_init(job->s_fence, job->entity);
+}
+EXPORT_SYMBOL(drm_sched_job_arm);
+
 /**
  * drm_sched_job_cleanup - clean up scheduler job resources
  *
diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
index 4eb354226972..5c3a99027ecd 100644
--- a/drivers/gpu/drm/v3d/v3d_gem.c
+++ b/drivers/gpu/drm/v3d/v3d_gem.c
@@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
 	if (ret)
 		return ret;
 
+	drm_sched_job_arm(&job->base);
+
 	job->done_fence = dma_fence_get(&job->base.s_fence->finished);
 
 	/* put by scheduler job completion */
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index d18af49fd009..80438d126c9d 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
 int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
 		       void *owner);
+void drm_sched_job_arm(struct drm_sched_job *job);
 void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
 				    struct drm_gpu_scheduler **sched_list,
                                    unsigned int num_sched_list);
@@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
 				   enum drm_sched_priority priority);
 bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
 
-struct drm_sched_fence *drm_sched_fence_create(
+struct drm_sched_fence *drm_sched_fence_alloc(
 	struct drm_sched_entity *s_entity, void *owner);
+void drm_sched_fence_init(struct drm_sched_fence *fence,
+			  struct drm_sched_entity *entity);
+
 void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
 void drm_sched_fence_finished(struct drm_sched_fence *fence);
 
-- 
2.32.0.rc2



* Re: [PATCH] drm/v3d: Move drm_sched_job_init to v3d_job_init
  2021-06-24 20:45   ` [PATCH] " Daniel Vetter
@ 2021-06-24 20:49     ` Emma Anholt
  0 siblings, 0 replies; 57+ messages in thread
From: Emma Anholt @ 2021-06-24 20:49 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, DRI Development

On Thu, Jun 24, 2021 at 1:45 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Prep work for using the scheduler dependency handling. We need to call
> drm_sched_job_init earlier so we can use the new drm_sched_job_await*
> functions for dependency handling here.
>
> v2: Slightly better commit message and rebase to include the
> drm_sched_job_arm() call (Emma).
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Emma Anholt <emma@anholt.net>

OK, makes sense now.  r-b.


* Re: [PATCH] drm/sched: Split drm_sched_job_init
  2021-06-24 20:45     ` Daniel Vetter
@ 2021-06-24 21:00       ` Emma Anholt
  -1 siblings, 0 replies; 57+ messages in thread
From: Emma Anholt @ 2021-06-24 21:00 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Steven Price, Daniel Vetter, Lucas Stach,
	Russell King, Christian Gmeiner, Qiang Yu, Rob Herring,
	Tomeu Vizoso, Alyssa Rosenzweig, David Airlie, Daniel Vetter,
	Sumit Semwal, Christian König, Masahiro Yamada, Kees Cook,
	Adam Borowski, Nick Terrell, Mauro Carvalho Chehab, Paul Menzel,
	Sami Tolvanen, Viresh Kumar, Alex Deucher, Dave Airlie,
	Nirmoy Das, Deepak R Varma, Lee Jones, Kevin Wang, Chen Li,
	Luben Tuikov, Marek Olšák, Dennis Li,
	Maarten Lankhorst, Andrey Grodzovsky, Sonny Jiang,
	Boris Brezillon, Tian Tao, Jack Zhang, etnaviv, lima,
	linux-media, linaro-mm-sig

On Thu, Jun 24, 2021 at 1:45 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> This is a very confusingly named function: it does not just init
> an object, it also arms it and provides a point of no return for
> pushing a job into the scheduler. It would be nice if that were a
> bit clearer in the interface.
>
> But the real reason is that I want to push the dependency tracking
> helpers into the scheduler code, and that means drm_sched_job_init
> must be called a lot earlier, without arming the job.
>
> v2:
> - don't change .gitignore (Steven)
> - don't forget v3d (Emma)
>
> Acked-by: Steven Price <steven.price@arm.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Adam Borowski <kilobyte@angband.pl>
> Cc: Nick Terrell <terrelln@fb.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Sonny Jiang <sonny.jiang@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: Emma Anholt <emma@anholt.net>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>  drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>  drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
>  drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
>  drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
>  drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>  include/drm/gpu_scheduler.h              |  6 +++++-
>  10 files changed, 52 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index c5386d13eb4a..a4ec092af9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>         if (r)
>                 goto error_unlock;
>
> +       drm_sched_job_arm(&job->base);
> +
>         /* No memory allocation is allowed while holding the notifier lock.
>          * The lock is held until amdgpu_cs_submit is finished and fence is
>          * added to BOs.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index d33e6d97cc89..5ddb955d2315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>         if (r)
>                 return r;
>
> +       drm_sched_job_arm(&job->base);
> +
>         *f = dma_fence_get(&job->base.s_fence->finished);
>         amdgpu_job_free_resources(job);
>         drm_sched_entity_push_job(&job->base, entity);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 19826e504efc..af1671f01c7f 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>         if (ret)
>                 goto out_unlock;
>
> +       drm_sched_job_arm(&submit->sched_job);
> +
>         submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>         submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>                                                 submit->out_fence, 0,
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ecf3267334ff..bd1af1fd8c0f 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>                 return err;
>         }
>
> +       drm_sched_job_arm(&task->base);
> +
>         task->num_bos = num_bos;
>         task->vm = lima_vm_get(vm);
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index beb62c8fc851..1e950534b9b0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
>                 goto unlock;
>         }
>
> +       drm_sched_job_arm(&job->base);
> +
>         job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>
>         ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 79554aa4dbb1..f7347c284886 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>   * @sched_job: job to submit
>   * @entity: scheduler entity
>   *
> - * Note: To guarantee that the order of insertion to queue matches
> - * the job's fence sequence number this function should be
> - * called with drm_sched_job_init under common lock.
> + * Note: To guarantee that the order of insertion to queue matches the job's
> + * fence sequence number this function should be called with drm_sched_job_arm()
> + * under common lock.
>   *
>   * Returns 0 for success, negative error code otherwise.
>   */
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 69de2c76731f..0ba810c198bd 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>  }
>  EXPORT_SYMBOL(to_drm_sched_fence);
>
> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> -                                              void *owner)
> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> +                                             void *owner)
>  {
>         struct drm_sched_fence *fence = NULL;
> -       unsigned seq;
>
>         fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>         if (fence == NULL)
> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>         fence->sched = entity->rq->sched;
>         spin_lock_init(&fence->lock);
>
> +       return fence;
> +}

If there's an error path between fence_alloc() (job_init()) time and
fence_init() (job_arm()) time, how does the s_fence get freed?
Before, I was committed to calling drm_sched_entity_push_job(), which
led to the job being processed and freed, but now I think we need
some other free path for a job that was never pushed.
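
One possible shape for such a free path, as a minimal userspace sketch
in C. This is an assumption for illustration and not what the patch
above does: the sketch_* names are hypothetical, and the idea modelled
is that a zero-initialized fence refcount means "never armed", so
cleanup can free the fence directly instead of dropping a reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct sketch_fence {
	unsigned int refcount;	/* stays 0 until the job is armed */
	unsigned int seqno;
};

struct sketch_job {
	struct sketch_fence *s_fence;
};

static int sketch_frees;	/* counts frees so the demo can check them */

static void sketch_fence_free(struct sketch_fence *fence)
{
	free(fence);
	sketch_frees++;
}

static void sketch_fence_put(struct sketch_fence *fence)
{
	if (--fence->refcount == 0)
		sketch_fence_free(fence);
}

static int sketch_job_init(struct sketch_job *job)
{
	job->s_fence = calloc(1, sizeof(*job->s_fence));
	return job->s_fence ? 0 : -1;
}

static void sketch_job_arm(struct sketch_job *job, unsigned int seqno)
{
	job->s_fence->refcount = 1;
	job->s_fence->seqno = seqno;
}

/* Works for both armed and never-armed jobs. */
static void sketch_job_cleanup(struct sketch_job *job)
{
	if (!job->s_fence)
		return;
	if (job->s_fence->refcount)
		sketch_fence_put(job->s_fence);		/* armed: drop the ref */
	else
		sketch_fence_free(job->s_fence);	/* never armed: free */
	job->s_fence = NULL;
}

/* fail_early simulates an error between the init and arm steps. */
static int sketch_submit(int fail_early)
{
	struct sketch_job job = { 0 };

	if (sketch_job_init(&job))
		return -1;
	if (fail_early) {
		sketch_job_cleanup(&job);
		return -2;
	}
	sketch_job_arm(&job, 1);
	/* A real driver would push the job here; the model just drops it. */
	sketch_job_cleanup(&job);
	return 0;
}
```

Calling sketch_submit(1) takes the early error path and still frees the
fence exactly once, which is the property the question above is asking
for.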

> +EXPORT_SYMBOL(drm_sched_job_arm);
> +
>  /**
>   * drm_sched_job_cleanup - clean up scheduler job resources
>   *
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 4eb354226972..5c3a99027ecd 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>         if (ret)
>                 return ret;
>
> +       drm_sched_job_arm(&job->base);
> +
>         job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>
>         /* put by scheduler job completion */
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index d18af49fd009..80438d126c9d 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>  int drm_sched_job_init(struct drm_sched_job *job,
>                        struct drm_sched_entity *entity,
>                        void *owner);
> +void drm_sched_job_arm(struct drm_sched_job *job);
>  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>                                     struct drm_gpu_scheduler **sched_list,
>                                     unsigned int num_sched_list);
> @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>                                    enum drm_sched_priority priority);
>  bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>
> -struct drm_sched_fence *drm_sched_fence_create(
> +struct drm_sched_fence *drm_sched_fence_alloc(
>         struct drm_sched_entity *s_entity, void *owner);
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +                         struct drm_sched_entity *entity);
> +
>  void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>  void drm_sched_fence_finished(struct drm_sched_fence *fence);
>
> --
> 2.32.0.rc2
>


* Re: [PATCH] drm/sched: Split drm_sched_job_init
@ 2021-06-24 21:00       ` Emma Anholt
  0 siblings, 0 replies; 57+ messages in thread
From: Emma Anholt @ 2021-06-24 21:00 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Adam Borowski, David Airlie, Viresh Kumar, DRI Development,
	Sonny Jiang, Nirmoy Das, Daniel Vetter, Lee Jones, Jack Zhang,
	lima, Mauro Carvalho Chehab, Masahiro Yamada, Steven Price,
	Luben Tuikov, Alyssa Rosenzweig, Sami Tolvanen, Russell King,
	Dave Airlie, Dennis Li, Chen Li, Paul Menzel, Kees Cook,
	Marek Olšák, Kevin Wang, etnaviv, linaro-mm-sig,
	Deepak R Varma, Tomeu Vizoso, Nick Terrell, Boris Brezillon,
	Qiang Yu, Alex Deucher, Tian Tao, linux-media,
	Christian König

On Thu, Jun 24, 2021 at 1:45 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> This is a very confusingly named function: it does not just init
> an object, it also arms it and provides a point of no return for
> pushing a job into the scheduler. It would be nice if that were a
> bit clearer in the interface.
>
> But the real reason is that I want to push the dependency tracking
> helpers into the scheduler code, and that means drm_sched_job_init
> must be called a lot earlier, without arming the job.
>
> v2:
> - don't change .gitignore (Steven)
> - don't forget v3d (Emma)
>
> Acked-by: Steven Price <steven.price@arm.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> Cc: Qiang Yu <yuq825@gmail.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Adam Borowski <kilobyte@angband.pl>
> Cc: Nick Terrell <terrelln@fb.com>
> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: "Marek Olšák" <marek.olsak@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> Cc: Sonny Jiang <sonny.jiang@amd.com>
> Cc: Boris Brezillon <boris.brezillon@collabora.com>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Jack Zhang <Jack.Zhang1@amd.com>
> Cc: etnaviv@lists.freedesktop.org
> Cc: lima@lists.freedesktop.org
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Cc: Emma Anholt <emma@anholt.net>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
>  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
>  drivers/gpu/drm/lima/lima_sched.c        |  2 ++
>  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
>  drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
>  drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
>  drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
>  drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
>  include/drm/gpu_scheduler.h              |  6 +++++-
>  10 files changed, 52 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index c5386d13eb4a..a4ec092af9a7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>         if (r)
>                 goto error_unlock;
>
> +       drm_sched_job_arm(&job->base);
> +
>         /* No memory allocation is allowed while holding the notifier lock.
>          * The lock is held until amdgpu_cs_submit is finished and fence is
>          * added to BOs.
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> index d33e6d97cc89..5ddb955d2315 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
>         if (r)
>                 return r;
>
> +       drm_sched_job_arm(&job->base);
> +
>         *f = dma_fence_get(&job->base.s_fence->finished);
>         amdgpu_job_free_resources(job);
>         drm_sched_entity_push_job(&job->base, entity);
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> index 19826e504efc..af1671f01c7f 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
>         if (ret)
>                 goto out_unlock;
>
> +       drm_sched_job_arm(&submit->sched_job);
> +
>         submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
>         submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
>                                                 submit->out_fence, 0,
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ecf3267334ff..bd1af1fd8c0f 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
>                 return err;
>         }
>
> +       drm_sched_job_arm(&task->base);
> +
>         task->num_bos = num_bos;
>         task->vm = lima_vm_get(vm);
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index beb62c8fc851..1e950534b9b0 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
>                 goto unlock;
>         }
>
> +       drm_sched_job_arm(&job->base);
> +
>         job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>
>         ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 79554aa4dbb1..f7347c284886 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>   * @sched_job: job to submit
>   * @entity: scheduler entity
>   *
> - * Note: To guarantee that the order of insertion to queue matches
> - * the job's fence sequence number this function should be
> - * called with drm_sched_job_init under common lock.
> + * Note: To guarantee that the order of insertion to queue matches the job's
> + * fence sequence number this function should be called with drm_sched_job_arm()
> + * under common lock.
>   *
>   * Returns 0 for success, negative error code otherwise.
>   */
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> index 69de2c76731f..0ba810c198bd 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
>  }
>  EXPORT_SYMBOL(to_drm_sched_fence);
>
> -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> -                                              void *owner)
> +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> +                                             void *owner)
>  {
>         struct drm_sched_fence *fence = NULL;
> -       unsigned seq;
>
>         fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
>         if (fence == NULL)
> @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
>         fence->sched = entity->rq->sched;
>         spin_lock_init(&fence->lock);
>
> +       return fence;
> +}

If there's an error path between fence_alloc() (in job_init()) and
fence_init() (in job_arm()), how does the s_fence get freed?
Before, I was committed to calling drm_sched_entity_push_job(), which
led to the job being processed and freed, but now I think we need
some other free path for jobs that were never pushed.

> +EXPORT_SYMBOL(drm_sched_job_arm);
> +
>  /**
>   * drm_sched_job_cleanup - clean up scheduler job resources
>   *
> diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> index 4eb354226972..5c3a99027ecd 100644
> --- a/drivers/gpu/drm/v3d/v3d_gem.c
> +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
>         if (ret)
>                 return ret;
>
> +       drm_sched_job_arm(&job->base);
> +
>         job->done_fence = dma_fence_get(&job->base.s_fence->finished);
>
>         /* put by scheduler job completion */
> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> index d18af49fd009..80438d126c9d 100644
> --- a/include/drm/gpu_scheduler.h
> +++ b/include/drm/gpu_scheduler.h
> @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
>  int drm_sched_job_init(struct drm_sched_job *job,
>                        struct drm_sched_entity *entity,
>                        void *owner);
> +void drm_sched_job_arm(struct drm_sched_job *job);
>  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
>                                     struct drm_gpu_scheduler **sched_list,
>                                     unsigned int num_sched_list);
> @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
>                                    enum drm_sched_priority priority);
>  bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
>
> -struct drm_sched_fence *drm_sched_fence_create(
> +struct drm_sched_fence *drm_sched_fence_alloc(
>         struct drm_sched_entity *s_entity, void *owner);
> +void drm_sched_fence_init(struct drm_sched_fence *fence,
> +                         struct drm_sched_entity *entity);
> +
>  void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
>  void drm_sched_fence_finished(struct drm_sched_fence *fence);
>
> --
> 2.32.0.rc2
>

* Re: [PATCH] drm/sched: Split drm_sched_job_init
  2021-06-24 21:00       ` Emma Anholt
@ 2021-06-24 21:30         ` Daniel Vetter
  -1 siblings, 0 replies; 57+ messages in thread
From: Daniel Vetter @ 2021-06-24 21:30 UTC (permalink / raw)
  To: Emma Anholt
  Cc: DRI Development, Steven Price, Daniel Vetter, Lucas Stach,
	Russell King, Christian Gmeiner, Qiang Yu, Rob Herring,
	Tomeu Vizoso, Alyssa Rosenzweig, David Airlie, Sumit Semwal,
	Christian König, Masahiro Yamada, Kees Cook, Adam Borowski,
	Nick Terrell, Mauro Carvalho Chehab, Paul Menzel, Sami Tolvanen,
	Viresh Kumar, Alex Deucher, Dave Airlie, Nirmoy Das,
	Deepak R Varma, Lee Jones, Kevin Wang, Chen Li, Luben Tuikov,
	Marek Olšák, Dennis Li, Maarten Lankhorst,
	Andrey Grodzovsky, Sonny Jiang, Boris Brezillon, Tian Tao,
	Jack Zhang, The etnaviv authors, lima,
	open list:DMA BUFFER SHARING FRAMEWORK,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, Jun 24, 2021 at 11:00 PM Emma Anholt <emma@anholt.net> wrote:
>
> On Thu, Jun 24, 2021 at 1:45 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > This is a very confusingly named function, because not just does it
> > init an object, it arms it and provides a point of no return for
> > pushing a job into the scheduler. It would be nice if that's a bit
> > clearer in the interface.
> >
> > But the real reason is that I want to push the dependency tracking
> > helpers into the scheduler code, and that means drm_sched_job_init
> > must be called a lot earlier, without arming the job.
> >
> > v2:
> > - don't change .gitignore (Steven)
> > - don't forget v3d (Emma)
> >
> > Acked-by: Steven Price <steven.price@arm.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Russell King <linux+etnaviv@armlinux.org.uk>
> > Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
> > Cc: Qiang Yu <yuq825@gmail.com>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Masahiro Yamada <masahiroy@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Adam Borowski <kilobyte@angband.pl>
> > Cc: Nick Terrell <terrelln@fb.com>
> > Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> > Cc: Sami Tolvanen <samitolvanen@google.com>
> > Cc: Viresh Kumar <viresh.kumar@linaro.org>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Lee Jones <lee.jones@linaro.org>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: "Marek Olšák" <marek.olsak@amd.com>
> > Cc: Dennis Li <Dennis.Li@amd.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
> > Cc: Sonny Jiang <sonny.jiang@amd.com>
> > Cc: Boris Brezillon <boris.brezillon@collabora.com>
> > Cc: Tian Tao <tiantao6@hisilicon.com>
> > Cc: Jack Zhang <Jack.Zhang1@amd.com>
> > Cc: etnaviv@lists.freedesktop.org
> > Cc: lima@lists.freedesktop.org
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > Cc: Emma Anholt <emma@anholt.net>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c   |  2 ++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_job.c  |  2 ++
> >  drivers/gpu/drm/etnaviv/etnaviv_sched.c  |  2 ++
> >  drivers/gpu/drm/lima/lima_sched.c        |  2 ++
> >  drivers/gpu/drm/panfrost/panfrost_job.c  |  2 ++
> >  drivers/gpu/drm/scheduler/sched_entity.c |  6 +++---
> >  drivers/gpu/drm/scheduler/sched_fence.c  | 15 ++++++++++-----
> >  drivers/gpu/drm/scheduler/sched_main.c   | 23 ++++++++++++++++++++++-
> >  drivers/gpu/drm/v3d/v3d_gem.c            |  2 ++
> >  include/drm/gpu_scheduler.h              |  6 +++++-
> >  10 files changed, 52 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index c5386d13eb4a..a4ec092af9a7 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -1226,6 +1226,8 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
> >         if (r)
> >                 goto error_unlock;
> >
> > +       drm_sched_job_arm(&job->base);
> > +
> >         /* No memory allocation is allowed while holding the notifier lock.
> >          * The lock is held until amdgpu_cs_submit is finished and fence is
> >          * added to BOs.
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > index d33e6d97cc89..5ddb955d2315 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
> > @@ -170,6 +170,8 @@ int amdgpu_job_submit(struct amdgpu_job *job, struct drm_sched_entity *entity,
> >         if (r)
> >                 return r;
> >
> > +       drm_sched_job_arm(&job->base);
> > +
> >         *f = dma_fence_get(&job->base.s_fence->finished);
> >         amdgpu_job_free_resources(job);
> >         drm_sched_entity_push_job(&job->base, entity);
> > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > index 19826e504efc..af1671f01c7f 100644
> > --- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > +++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
> > @@ -163,6 +163,8 @@ int etnaviv_sched_push_job(struct drm_sched_entity *sched_entity,
> >         if (ret)
> >                 goto out_unlock;
> >
> > +       drm_sched_job_arm(&submit->sched_job);
> > +
> >         submit->out_fence = dma_fence_get(&submit->sched_job.s_fence->finished);
> >         submit->out_fence_id = idr_alloc_cyclic(&submit->gpu->fence_idr,
> >                                                 submit->out_fence, 0,
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index ecf3267334ff..bd1af1fd8c0f 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -129,6 +129,8 @@ int lima_sched_task_init(struct lima_sched_task *task,
> >                 return err;
> >         }
> >
> > +       drm_sched_job_arm(&task->base);
> > +
> >         task->num_bos = num_bos;
> >         task->vm = lima_vm_get(vm);
> >
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index beb62c8fc851..1e950534b9b0 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -244,6 +244,8 @@ int panfrost_job_push(struct panfrost_job *job)
> >                 goto unlock;
> >         }
> >
> > +       drm_sched_job_arm(&job->base);
> > +
> >         job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
> >
> >         ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 79554aa4dbb1..f7347c284886 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -485,9 +485,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >   * @sched_job: job to submit
> >   * @entity: scheduler entity
> >   *
> > - * Note: To guarantee that the order of insertion to queue matches
> > - * the job's fence sequence number this function should be
> > - * called with drm_sched_job_init under common lock.
> > + * Note: To guarantee that the order of insertion to queue matches the job's
> > + * fence sequence number this function should be called with drm_sched_job_arm()
> > + * under common lock.
> >   *
> >   * Returns 0 for success, negative error code otherwise.
> >   */
> > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c
> > index 69de2c76731f..0ba810c198bd 100644
> > --- a/drivers/gpu/drm/scheduler/sched_fence.c
> > +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> > @@ -152,11 +152,10 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f)
> >  }
> >  EXPORT_SYMBOL(to_drm_sched_fence);
> >
> > -struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> > -                                              void *owner)
> > +struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity,
> > +                                             void *owner)
> >  {
> >         struct drm_sched_fence *fence = NULL;
> > -       unsigned seq;
> >
> >         fence = kmem_cache_zalloc(sched_fence_slab, GFP_KERNEL);
> >         if (fence == NULL)
> > @@ -166,13 +165,19 @@ struct drm_sched_fence *drm_sched_fence_create(struct drm_sched_entity *entity,
> >         fence->sched = entity->rq->sched;
> >         spin_lock_init(&fence->lock);
> >
> > +       return fence;
> > +}
>
> If there's an error path between fence_alloc() (in job_init()) and
> fence_init() (in job_arm()), how does the s_fence get freed?
> Before, I was committed to calling drm_sched_entity_push_job(), which
> led to the job being processed and freed, but now I think we need
> some other free path for jobs that were never pushed.

Yeah, I need to fix that in each driver when I move the
drm_sched_job_init() call around. From a quick look I just need to
move the drm_sched_job_cleanup() call slightly in each case, but it's
a bit too late in the day to do that without screwing it up. I'll look
into that tomorrow.
-Daniel
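
To make the error path under discussion concrete, here is a minimal
userspace sketch. It is not kernel code and not part of the patch: every
mock_* name, the live_fences counter and the fail_deps flag are
hypothetical stand-ins. It only models the idea that init allocates, arm
is the point of no return, and any failure in between has to go through
a cleanup call so the fence is not leaked.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-ins only; the names echo the patch but this is not kernel code. */
struct mock_fence {
	unsigned int seqno;
	int armed;
};

struct mock_job {
	struct mock_fence *s_fence;
};

static int live_fences;          /* fences allocated and not yet freed */
static unsigned int next_seqno;  /* last sequence number handed out */

/* Phase 1: allocate only, modelled on what job_init does after the split. */
static int mock_job_init(struct mock_job *job)
{
	job->s_fence = calloc(1, sizeof(*job->s_fence));
	if (!job->s_fence)
		return -ENOMEM;
	live_fences++;
	return 0;
}

/* Phase 2: point of no return, the fence gets its sequence number here. */
static void mock_job_arm(struct mock_job *job)
{
	assert(job->s_fence);
	job->s_fence->seqno = ++next_seqno;
	job->s_fence->armed = 1;
}

/* Free path for a job that was initialized but will not be pushed. */
static void mock_job_cleanup(struct mock_job *job)
{
	free(job->s_fence);
	job->s_fence = NULL;
	live_fences--;
}

/*
 * Hypothetical driver submit path. fail_deps simulates an error that
 * happens after init but before arm, e.g. while collecting dependencies.
 */
static int mock_submit(struct mock_job *job, int fail_deps)
{
	int ret;

	ret = mock_job_init(job);
	if (ret)
		return ret;

	if (fail_deps) {
		ret = -EINVAL;
		goto err_cleanup;
	}

	mock_job_arm(job);
	/* A real driver would push the job here; after that the scheduler owns it. */
	return 0;

err_cleanup:
	mock_job_cleanup(job);
	return ret;
}
```

In this model, a failing mock_submit() returns the error with no fence
left allocated, while a successful one leaves the armed fence alive for
whoever completes the job. Where exactly the cleanup call belongs in
each real driver is the part that still has to be checked.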

>
> > +EXPORT_SYMBOL(drm_sched_job_arm);
> > +
> >  /**
> >   * drm_sched_job_cleanup - clean up scheduler job resources
> >   *
> > diff --git a/drivers/gpu/drm/v3d/v3d_gem.c b/drivers/gpu/drm/v3d/v3d_gem.c
> > index 4eb354226972..5c3a99027ecd 100644
> > --- a/drivers/gpu/drm/v3d/v3d_gem.c
> > +++ b/drivers/gpu/drm/v3d/v3d_gem.c
> > @@ -475,6 +475,8 @@ v3d_push_job(struct v3d_file_priv *v3d_priv,
> >         if (ret)
> >                 return ret;
> >
> > +       drm_sched_job_arm(&job->base);
> > +
> >         job->done_fence = dma_fence_get(&job->base.s_fence->finished);
> >
> >         /* put by scheduler job completion */
> > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
> > index d18af49fd009..80438d126c9d 100644
> > --- a/include/drm/gpu_scheduler.h
> > +++ b/include/drm/gpu_scheduler.h
> > @@ -313,6 +313,7 @@ void drm_sched_fini(struct drm_gpu_scheduler *sched);
> >  int drm_sched_job_init(struct drm_sched_job *job,
> >                        struct drm_sched_entity *entity,
> >                        void *owner);
> > +void drm_sched_job_arm(struct drm_sched_job *job);
> >  void drm_sched_entity_modify_sched(struct drm_sched_entity *entity,
> >                                     struct drm_gpu_scheduler **sched_list,
> >                                     unsigned int num_sched_list);
> > @@ -352,8 +353,11 @@ void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
> >                                    enum drm_sched_priority priority);
> >  bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
> >
> > -struct drm_sched_fence *drm_sched_fence_create(
> > +struct drm_sched_fence *drm_sched_fence_alloc(
> >         struct drm_sched_entity *s_entity, void *owner);
> > +void drm_sched_fence_init(struct drm_sched_fence *fence,
> > +                         struct drm_sched_entity *entity);
> > +
> >  void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
> >  void drm_sched_fence_finished(struct drm_sched_fence *fence);
> >
> > --
> > 2.32.0.rc2
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Thread overview: 57+ messages
2021-06-24 14:00 [PATCH 00/11] drm/scheduler dependency tracking Daniel Vetter
2021-06-24 14:00 ` [PATCH 01/11] drm/sched: Split drm_sched_job_init Daniel Vetter
2021-06-24 14:00   ` Daniel Vetter
2021-06-24 14:32   ` Steven Price
2021-06-24 14:32     ` Steven Price
2021-06-24 17:29   ` Christian König
2021-06-24 17:29     ` Christian König
2021-06-24 17:37     ` Daniel Vetter
2021-06-24 17:37       ` Daniel Vetter
2021-06-24 17:39       ` Christian König
2021-06-24 17:39         ` Christian König
2021-06-24 18:22         ` Daniel Vetter
2021-06-24 18:22           ` Daniel Vetter
2021-06-24 20:45   ` [PATCH] " Daniel Vetter
2021-06-24 20:45     ` Daniel Vetter
2021-06-24 21:00     ` Emma Anholt
2021-06-24 21:00       ` Emma Anholt
2021-06-24 21:30       ` Daniel Vetter
2021-06-24 21:30         ` Daniel Vetter
2021-06-24 14:00 ` [PATCH 02/11] drm/sched: Add dependency tracking Daniel Vetter
2021-06-24 14:00   ` Daniel Vetter
2021-06-24 14:32   ` Steven Price
2021-06-24 14:32     ` Steven Price
2021-06-24 14:39   ` Lucas Stach
2021-06-24 14:39     ` Lucas Stach
2021-06-24 15:26     ` Daniel Vetter
2021-06-24 15:26       ` Daniel Vetter
2021-06-24 16:59     ` Christian König
2021-06-24 16:59       ` Christian König
2021-06-24 14:00 ` [PATCH 03/11] drm/sched: drop entity parameter from drm_sched_push_job Daniel Vetter
2021-06-24 14:00   ` Daniel Vetter
2021-06-24 14:32   ` Steven Price
2021-06-24 14:32     ` Steven Price
2021-06-24 14:00 ` [PATCH 04/11] drm/panfrost: use scheduler dependency tracking Daniel Vetter
2021-06-24 14:00   ` Daniel Vetter
2021-06-24 14:32   ` Steven Price
2021-06-24 14:32     ` Steven Price
2021-06-24 14:00 ` [PATCH 05/11] drm/lima: " Daniel Vetter
2021-06-24 14:00 ` [PATCH 06/11] drm/v3d: Move drm_sched_job_init to v3d_job_init Daniel Vetter
2021-06-24 16:59   ` Emma Anholt
2021-06-24 17:24     ` Daniel Vetter
2021-06-24 20:45   ` [PATCH] " Daniel Vetter
2021-06-24 20:49     ` Emma Anholt
2021-06-24 14:00 ` [PATCH 07/11] drm/v3d: Use scheduler dependency handling Daniel Vetter
2021-06-24 14:00 ` [PATCH 08/11] drm/etnaviv: " Daniel Vetter
2021-06-24 14:00   ` Daniel Vetter
2021-06-24 14:00 ` [PATCH 09/11] drm/gem: Delete gem array fencing helpers Daniel Vetter
2021-06-24 14:00   ` Daniel Vetter
2021-06-24 14:00 ` [PATCH 10/11] drm/scheduler: Don't store self-dependencies Daniel Vetter
2021-06-24 14:42   ` Lucas Stach
2021-06-24 17:03   ` Christian König
2021-06-24 17:29     ` Daniel Vetter
2021-06-24 17:38       ` Christian König
2021-06-24 17:43         ` Daniel Vetter
2021-06-24 17:56           ` Christian König
2021-06-24 18:21             ` Daniel Vetter
2021-06-24 14:00 ` [PATCH 11/11] drm/sched: Check locking in drm_sched_job_await_implicit Daniel Vetter
