All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl
@ 2021-09-30 19:09 Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 1/8] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences() Boris Brezillon
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

Hello,

I finally got to resubmitting a new version of this series. I think
I fixed all the issues reported by Steve and Daniel. Still no support
for {IN,OUT}_FENCE_FD, but that can be added later if we need it.

For those who didn't follow the previous iterations, this is an
attempt at providing a new submit ioctl that's more Vulkan-friendly
than the existing one. This ioctl

1/ allows passing several out syncobjs so we can easily update
   several fence/semaphore in a single ioctl() call
2/ allows passing several jobs so we don't have to have one ioctl
   per job-chain recorded in the command buffer
3/ supports disabling implicit dependencies as well as 
   non-exclusive access to BOs, thus removing unnecessary
   synchronization

I've also been looking at adding {IN,OUT}_FENCE_FD support (allowing
one to pass at most one sync_file object in input and/or creating a
sync_file FD embedding the render out fence), but it's not entirely
clear to me when that's useful. Indeed, we can already do the
sync_file <-> syncobj conversion using the
SYNCOBJ_{FD_TO_HANDLE,HANDLE_TO_FD} ioctls if we have to.
Note that, unlike Turnip, PanVk is using syncobjs to implement
vkQueueWaitIdle(), so the syncobj -> sync_file conversion doesn't
have to happen for each submission, but maybe there's a good reason
to use sync_files for that too. Any feedback on that aspect would
be useful I guess.

Any feedback on this new ioctl is welcome, in particular, do you
think other things are missing/would be nice to have for Vulkan?

Regards,

Boris

P.S.: basic igt tests for these new ioctls re available there [1]

[1]https://gitlab.freedesktop.org/bbrezillon/igt-gpu-tools/-/tree/panfrost-batch-submit

Boris Brezillon (8):
  drm/panfrost: Pass a job to panfrost_{acquire,attach}_object_fences()
  drm/panfrost: Move the mappings collection out of
    panfrost_lookup_bos()
  drm/panfrost: Add BO access flags to relax dependencies between jobs
  drm/panfrost: Add the ability to create submit queues
  drm/panfrost: Add a new ioctl to submit batches
  drm/panfrost: Support synchronization jobs
  drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature
  drm/panfrost: Bump minor version to reflect the feature additions

 drivers/gpu/drm/panfrost/Makefile             |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h    |   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       | 637 +++++++++++++-----
 drivers/gpu/drm/panfrost/panfrost_job.c       |  93 ++-
 drivers/gpu/drm/panfrost/panfrost_job.h       |   8 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 132 ++++
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  26 +
 include/uapi/drm/panfrost_drm.h               | 119 ++++
 8 files changed, 796 insertions(+), 224 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

-- 
2.31.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v5 1/8] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences()
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 2/8] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos() Boris Brezillon
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

So we don't have to change the prototype if we extend the function.

v3:
* Fix subject

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 908d79520853..ed8d1588b1de 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -240,15 +240,13 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 	spin_unlock(&pfdev->js->job_lock);
 }
 
-static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
-					  int bo_count,
-					  struct drm_sched_job *job)
+static int panfrost_acquire_object_fences(struct panfrost_job *job)
 {
 	int i, ret;
 
-	for (i = 0; i < bo_count; i++) {
+	for (i = 0; i < job->bo_count; i++) {
 		/* panfrost always uses write mode in its current uapi */
-		ret = drm_sched_job_add_implicit_dependencies(job, bos[i],
+		ret = drm_sched_job_add_implicit_dependencies(&job->base, job->bos[i],
 							      true);
 		if (ret)
 			return ret;
@@ -257,14 +255,12 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
 	return 0;
 }
 
-static void panfrost_attach_object_fences(struct drm_gem_object **bos,
-					  int bo_count,
-					  struct dma_fence *fence)
+static void panfrost_attach_object_fences(struct panfrost_job *job)
 {
 	int i;
 
-	for (i = 0; i < bo_count; i++)
-		dma_resv_add_excl_fence(bos[i]->resv, fence);
+	for (i = 0; i < job->bo_count; i++)
+		dma_resv_add_excl_fence(job->bos[i]->resv, job->render_done_fence);
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -283,8 +279,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
-	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
-					     &job->base);
+	ret = panfrost_acquire_object_fences(job);
 	if (ret) {
 		mutex_unlock(&pfdev->sched_lock);
 		goto unlock;
@@ -296,8 +291,7 @@ int panfrost_job_push(struct panfrost_job *job)
 
 	mutex_unlock(&pfdev->sched_lock);
 
-	panfrost_attach_object_fences(job->bos, job->bo_count,
-				      job->render_done_fence);
+	panfrost_attach_object_fences(job);
 
 unlock:
 	drm_gem_unlock_reservations(job->bos, job->bo_count, &acquire_ctx);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 2/8] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos()
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 1/8] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences() Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 3/8] drm/panfrost: Add BO access flags to relax dependencies between jobs Boris Brezillon
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

So we can re-use it from elsewhere.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 53 ++++++++++++++-----------
 1 file changed, 29 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 40e4a4db3ab1..b131da3c9399 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -118,6 +118,34 @@ static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int
+panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
+{
+	struct panfrost_file_priv *priv = file_priv->driver_priv;
+	unsigned int i;
+
+	job->mappings = kvmalloc_array(job->bo_count,
+				       sizeof(*job->mappings),
+				       GFP_KERNEL | __GFP_ZERO);
+	if (!job->mappings)
+		return -ENOMEM;
+
+	for (i = 0; i < job->bo_count; i++) {
+		struct panfrost_gem_mapping *mapping;
+		struct panfrost_gem_object *bo;
+
+		bo = to_panfrost_bo(job->bos[i]);
+		mapping = panfrost_gem_mapping_get(bo, priv);
+		if (!mapping)
+			return -EINVAL;
+
+		atomic_inc(&bo->gpu_usecount);
+		job->mappings[i] = mapping;
+	}
+
+	return 0;
+}
+
 /**
  * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
  * referenced by the job.
@@ -137,9 +165,6 @@ panfrost_lookup_bos(struct drm_device *dev,
 		  struct drm_panfrost_submit *args,
 		  struct panfrost_job *job)
 {
-	struct panfrost_file_priv *priv = file_priv->driver_priv;
-	struct panfrost_gem_object *bo;
-	unsigned int i;
 	int ret;
 
 	job->bo_count = args->bo_handle_count;
@@ -153,27 +178,7 @@ panfrost_lookup_bos(struct drm_device *dev,
 	if (ret)
 		return ret;
 
-	job->mappings = kvmalloc_array(job->bo_count,
-				       sizeof(struct panfrost_gem_mapping *),
-				       GFP_KERNEL | __GFP_ZERO);
-	if (!job->mappings)
-		return -ENOMEM;
-
-	for (i = 0; i < job->bo_count; i++) {
-		struct panfrost_gem_mapping *mapping;
-
-		bo = to_panfrost_bo(job->bos[i]);
-		mapping = panfrost_gem_mapping_get(bo, priv);
-		if (!mapping) {
-			ret = -EINVAL;
-			break;
-		}
-
-		atomic_inc(&bo->gpu_usecount);
-		job->mappings[i] = mapping;
-	}
-
-	return ret;
+	return panfrost_get_job_mappings(file_priv, job);
 }
 
 /**
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 3/8] drm/panfrost: Add BO access flags to relax dependencies between jobs
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 1/8] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences() Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 2/8] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos() Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 4/8] drm/panfrost: Add the ability to create submit queues Boris Brezillon
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

Jobs reading from the same BO should not be serialized. Add access
flags so we can relax the implicit dependencies in that case. We force
exclusive access for now to keep the behavior unchanged, but a new
SUBMIT ioctl taking explicit access flags will be introduced.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 10 ++++++++++
 drivers/gpu/drm/panfrost/panfrost_job.c | 21 ++++++++++++++++++---
 drivers/gpu/drm/panfrost/panfrost_job.h |  1 +
 include/uapi/drm/panfrost_drm.h         |  3 +++
 4 files changed, 32 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index b131da3c9399..a386c66f349c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -165,6 +165,7 @@ panfrost_lookup_bos(struct drm_device *dev,
 		  struct drm_panfrost_submit *args,
 		  struct panfrost_job *job)
 {
+	unsigned int i;
 	int ret;
 
 	job->bo_count = args->bo_handle_count;
@@ -172,6 +173,15 @@ panfrost_lookup_bos(struct drm_device *dev,
 	if (!job->bo_count)
 		return 0;
 
+	job->bo_flags = kvmalloc_array(job->bo_count,
+				       sizeof(*job->bo_flags),
+				       GFP_KERNEL | __GFP_ZERO);
+	if (!job->bo_flags)
+		return -ENOMEM;
+
+	for (i = 0; i < job->bo_count; i++)
+		job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
+
 	ret = drm_gem_objects_lookup(file_priv,
 				     (void __user *)(uintptr_t)args->bo_handles,
 				     job->bo_count, &job->bos);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index ed8d1588b1de..1a9085d8dcf1 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -245,9 +245,17 @@ static int panfrost_acquire_object_fences(struct panfrost_job *job)
 	int i, ret;
 
 	for (i = 0; i < job->bo_count; i++) {
+		bool exclusive = job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE;
+
+		if (!exclusive) {
+			ret = dma_resv_reserve_shared(job->bos[i]->resv, 1);
+			if (ret)
+				return ret;
+		}
+
 		/* panfrost always uses write mode in its current uapi */
 		ret = drm_sched_job_add_implicit_dependencies(&job->base, job->bos[i],
-							      true);
+							      exclusive);
 		if (ret)
 			return ret;
 	}
@@ -259,8 +267,14 @@ static void panfrost_attach_object_fences(struct panfrost_job *job)
 {
 	int i;
 
-	for (i = 0; i < job->bo_count; i++)
-		dma_resv_add_excl_fence(job->bos[i]->resv, job->render_done_fence);
+	for (i = 0; i < job->bo_count; i++) {
+		struct dma_resv *robj = job->bos[i]->resv;
+
+		if (job->bo_flags[i] & PANFROST_BO_REF_EXCLUSIVE)
+			dma_resv_add_excl_fence(robj, job->render_done_fence);
+		else
+			dma_resv_add_shared_fence(robj, job->render_done_fence);
+	}
 }
 
 int panfrost_job_push(struct panfrost_job *job)
@@ -326,6 +340,7 @@ static void panfrost_job_cleanup(struct kref *ref)
 		kvfree(job->bos);
 	}
 
+	kvfree(job->bo_flags);
 	kfree(job);
 }
 
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
index 77e6d0e6f612..96d755f12cf7 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -28,6 +28,7 @@ struct panfrost_job {
 
 	struct panfrost_gem_mapping **mappings;
 	struct drm_gem_object **bos;
+	u32 *bo_flags;
 	u32 bo_count;
 
 	/* Fence to be signaled by drm-sched once its done with the job */
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index a2de81225125..c8fdf45b1573 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -226,6 +226,9 @@ struct drm_panfrost_madvise {
 	__u32 retained;       /* out, whether backing store still exists */
 };
 
+/* Exclusive (AKA write) access to the BO */
+#define PANFROST_BO_REF_EXCLUSIVE	0x1
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 4/8] drm/panfrost: Add the ability to create submit queues
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
                   ` (2 preceding siblings ...)
  2021-09-30 19:09 ` [PATCH v5 3/8] drm/panfrost: Add BO access flags to relax dependencies between jobs Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 5/8] drm/panfrost: Add a new ioctl to submit batches Boris Brezillon
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

Needed to keep VkQueues isolated from each other.

v4:
* Make panfrost_ioctl_create_submitqueue() return the queue ID
  instead of a queue object

v3:
* Limit the number of submitqueue per context to 16
* Fix a deadlock

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/Makefile             |   3 +-
 drivers/gpu/drm/panfrost/panfrost_device.h    |   2 +-
 drivers/gpu/drm/panfrost/panfrost_drv.c       |  71 ++++++++--
 drivers/gpu/drm/panfrost/panfrost_job.c       |  44 ++----
 drivers/gpu/drm/panfrost/panfrost_job.h       |   7 +-
 .../gpu/drm/panfrost/panfrost_submitqueue.c   | 132 ++++++++++++++++++
 .../gpu/drm/panfrost/panfrost_submitqueue.h   |  26 ++++
 include/uapi/drm/panfrost_drm.h               |  17 +++
 8 files changed, 260 insertions(+), 42 deletions(-)
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.c
 create mode 100644 drivers/gpu/drm/panfrost/panfrost_submitqueue.h

diff --git a/drivers/gpu/drm/panfrost/Makefile b/drivers/gpu/drm/panfrost/Makefile
index b71935862417..e99192b66ec9 100644
--- a/drivers/gpu/drm/panfrost/Makefile
+++ b/drivers/gpu/drm/panfrost/Makefile
@@ -9,6 +9,7 @@ panfrost-y := \
 	panfrost_gpu.o \
 	panfrost_job.o \
 	panfrost_mmu.o \
-	panfrost_perfcnt.o
+	panfrost_perfcnt.o \
+	panfrost_submitqueue.o
 
 obj-$(CONFIG_DRM_PANFROST) += panfrost.o
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 8b25278f34c8..51c0ba4e50f5 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -137,7 +137,7 @@ struct panfrost_mmu {
 struct panfrost_file_priv {
 	struct panfrost_device *pfdev;
 
-	struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
+	struct idr queues;
 
 	struct panfrost_mmu *mmu;
 };
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index a386c66f349c..f8f430f68090 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -19,6 +19,7 @@
 #include "panfrost_job.h"
 #include "panfrost_gpu.h"
 #include "panfrost_perfcnt.h"
+#include "panfrost_submitqueue.h"
 
 static bool unstable_ioctls;
 module_param_unsafe(unstable_ioctls, bool, 0600);
@@ -259,6 +260,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 	struct panfrost_device *pfdev = dev->dev_private;
 	struct drm_panfrost_submit *args = data;
 	struct drm_syncobj *sync_out = NULL;
+	struct panfrost_submitqueue *queue;
 	struct panfrost_job *job;
 	int ret = 0, slot;
 
@@ -268,10 +270,16 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 	if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
 		return -EINVAL;
 
+	queue = panfrost_submitqueue_get(file->driver_priv, 0);
+	if (IS_ERR(queue))
+		return PTR_ERR(queue);
+
 	if (args->out_sync > 0) {
 		sync_out = drm_syncobj_find(file, args->out_sync);
-		if (!sync_out)
-			return -ENODEV;
+		if (!sync_out) {
+			ret = -ENODEV;
+			goto fail_put_queue;
+		}
 	}
 
 	job = kzalloc(sizeof(*job), GFP_KERNEL);
@@ -291,7 +299,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 	slot = panfrost_job_get_slot(job);
 
 	ret = drm_sched_job_init(&job->base,
-				 &job->file_priv->sched_entity[slot],
+				 &queue->sched_entity[slot],
 				 NULL);
 	if (ret)
 		goto out_put_job;
@@ -304,7 +312,7 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 	if (ret)
 		goto out_cleanup_job;
 
-	ret = panfrost_job_push(job);
+	ret = panfrost_job_push(queue, job);
 	if (ret)
 		goto out_cleanup_job;
 
@@ -320,6 +328,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 out_put_syncout:
 	if (sync_out)
 		drm_syncobj_put(sync_out);
+fail_put_queue:
+	panfrost_submitqueue_put(queue);
 
 	return ret;
 }
@@ -469,6 +479,36 @@ static int panfrost_ioctl_madvise(struct drm_device *dev, void *data,
 	return ret;
 }
 
+static int
+panfrost_ioctl_create_submitqueue(struct drm_device *dev, void *data,
+				  struct drm_file *file_priv)
+{
+	struct panfrost_file_priv *priv = file_priv->driver_priv;
+	struct drm_panfrost_create_submitqueue *args = data;
+	int ret;
+
+	ret = panfrost_submitqueue_create(priv, args->priority, args->flags);
+	if (ret < 0)
+		return ret;
+
+	args->id = ret;
+	return 0;
+}
+
+static int
+panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
+				   struct drm_file *file_priv)
+{
+	struct panfrost_file_priv *priv = file_priv->driver_priv;
+	u32 id = *((u32 *)data);
+
+	/* Default queue can't be destroyed. */
+	if (!id)
+		return -ENOENT;
+
+	return panfrost_submitqueue_destroy(priv, id);
+}
+
 int panfrost_unstable_ioctl_check(void)
 {
 	if (!unstable_ioctls)
@@ -497,13 +537,22 @@ panfrost_open(struct drm_device *dev, struct drm_file *file)
 		goto err_free;
 	}
 
-	ret = panfrost_job_open(panfrost_priv);
+	idr_init(&panfrost_priv->queues);
+
+	ret = panfrost_submitqueue_create(panfrost_priv,
+					  PANFROST_SUBMITQUEUE_PRIORITY_MEDIUM,
+					  0);
+
+	/* We expect the default queue to get id 0, a positive queue id is
+	 * considered a failure in that case.
+	 */
 	if (ret)
-		goto err_job;
+		goto err_destroy_idr;
 
 	return 0;
 
-err_job:
+err_destroy_idr:
+	idr_destroy(&panfrost_priv->queues);
 	panfrost_mmu_ctx_put(panfrost_priv->mmu);
 err_free:
 	kfree(panfrost_priv);
@@ -514,11 +563,15 @@ static void
 panfrost_postclose(struct drm_device *dev, struct drm_file *file)
 {
 	struct panfrost_file_priv *panfrost_priv = file->driver_priv;
+	u32 id;
 
 	panfrost_perfcnt_close(file);
-	panfrost_job_close(panfrost_priv);
+
+	for (id = 0; idr_get_next(&panfrost_priv->queues, &id); id++)
+		panfrost_submitqueue_destroy(panfrost_priv, id);
 
 	panfrost_mmu_ctx_put(panfrost_priv->mmu);
+	idr_destroy(&panfrost_priv->queues);
 	kfree(panfrost_priv);
 }
 
@@ -535,6 +588,8 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
 	PANFROST_IOCTL(PERFCNT_ENABLE,	perfcnt_enable,	DRM_RENDER_ALLOW),
 	PANFROST_IOCTL(PERFCNT_DUMP,	perfcnt_dump,	DRM_RENDER_ALLOW),
 	PANFROST_IOCTL(MADVISE,		madvise,	DRM_RENDER_ALLOW),
+	PANFROST_IOCTL(CREATE_SUBMITQUEUE, create_submitqueue, DRM_RENDER_ALLOW),
+	PANFROST_IOCTL(DESTROY_SUBMITQUEUE, destroy_submitqueue, DRM_RENDER_ALLOW),
 };
 
 DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 1a9085d8dcf1..62e010307afc 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -20,6 +20,7 @@
 #include "panfrost_regs.h"
 #include "panfrost_gpu.h"
 #include "panfrost_mmu.h"
+#include "panfrost_submitqueue.h"
 
 #define JOB_TIMEOUT_MS 500
 
@@ -277,7 +278,8 @@ static void panfrost_attach_object_fences(struct panfrost_job *job)
 	}
 }
 
-int panfrost_job_push(struct panfrost_job *job)
+int panfrost_job_push(struct panfrost_submitqueue *queue,
+		      struct panfrost_job *job)
 {
 	struct panfrost_device *pfdev = job->pfdev;
 	struct ww_acquire_ctx acquire_ctx;
@@ -855,43 +857,18 @@ void panfrost_job_fini(struct panfrost_device *pfdev)
 	destroy_workqueue(pfdev->reset.wq);
 }
 
-int panfrost_job_open(struct panfrost_file_priv *panfrost_priv)
+void panfrost_job_kill_queue(struct panfrost_submitqueue *queue)
 {
-	struct panfrost_device *pfdev = panfrost_priv->pfdev;
-	struct panfrost_job_slot *js = pfdev->js;
-	struct drm_gpu_scheduler *sched;
-	int ret, i;
+	struct panfrost_device *pfdev = queue->pfdev;
+	int i, j;
 
-	for (i = 0; i < NUM_JOB_SLOTS; i++) {
-		sched = &js->queue[i].sched;
-		ret = drm_sched_entity_init(&panfrost_priv->sched_entity[i],
-					    DRM_SCHED_PRIORITY_NORMAL, &sched,
-					    1, NULL);
-		if (WARN_ON(ret))
-			return ret;
-	}
-	return 0;
-}
-
-void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
-{
-	struct panfrost_device *pfdev = panfrost_priv->pfdev;
-	int i;
-
-	for (i = 0; i < NUM_JOB_SLOTS; i++)
-		drm_sched_entity_destroy(&panfrost_priv->sched_entity[i]);
-
-	/* Kill in-flight jobs */
 	spin_lock(&pfdev->js->job_lock);
 	for (i = 0; i < NUM_JOB_SLOTS; i++) {
-		struct drm_sched_entity *entity = &panfrost_priv->sched_entity[i];
-		int j;
-
 		for (j = ARRAY_SIZE(pfdev->jobs[0]) - 1; j >= 0; j--) {
 			struct panfrost_job *job = pfdev->jobs[i][j];
 			u32 cmd;
 
-			if (!job || job->base.entity != entity)
+			if (!job || job->base.entity != &queue->sched_entity[i])
 				continue;
 
 			if (j == 1) {
@@ -910,7 +887,6 @@ void panfrost_job_close(struct panfrost_file_priv *panfrost_priv)
 			} else {
 				cmd = JS_COMMAND_HARD_STOP;
 			}
-
 			job_write(pfdev, JS_COMMAND(i), cmd);
 		}
 	}
@@ -930,3 +906,9 @@ int panfrost_job_is_idle(struct panfrost_device *pfdev)
 
 	return true;
 }
+
+struct drm_gpu_scheduler *
+panfrost_job_get_sched(struct panfrost_device *pfdev, unsigned int js)
+{
+	return &pfdev->js->queue[js].sched;
+}
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
index 96d755f12cf7..636f91fab67c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -10,6 +10,7 @@
 struct panfrost_device;
 struct panfrost_gem_object;
 struct panfrost_file_priv;
+struct panfrost_submitqueue;
 
 struct panfrost_job {
 	struct drm_sched_job base;
@@ -40,9 +41,13 @@ void panfrost_job_fini(struct panfrost_device *pfdev);
 int panfrost_job_open(struct panfrost_file_priv *panfrost_priv);
 void panfrost_job_close(struct panfrost_file_priv *panfrost_priv);
 int panfrost_job_get_slot(struct panfrost_job *job);
-int panfrost_job_push(struct panfrost_job *job);
+int panfrost_job_push(struct panfrost_submitqueue *queue,
+		      struct panfrost_job *job);
 void panfrost_job_put(struct panfrost_job *job);
 void panfrost_job_enable_interrupts(struct panfrost_device *pfdev);
 int panfrost_job_is_idle(struct panfrost_device *pfdev);
+void panfrost_job_kill_queue(struct panfrost_submitqueue *queue);
+struct drm_gpu_scheduler *
+panfrost_job_get_sched(struct panfrost_device *pfdev, unsigned int js);
 
 #endif
diff --git a/drivers/gpu/drm/panfrost/panfrost_submitqueue.c b/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
new file mode 100644
index 000000000000..8944b4410be3
--- /dev/null
+++ b/drivers/gpu/drm/panfrost/panfrost_submitqueue.c
@@ -0,0 +1,132 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright 2021 Collabora ltd. */
+
+#include <linux/idr.h>
+
+#include "panfrost_device.h"
+#include "panfrost_job.h"
+#include "panfrost_submitqueue.h"
+
+#define PAN_MAX_SUBMITQUEUES	16
+
+static enum drm_sched_priority
+to_sched_prio(enum panfrost_submitqueue_priority priority)
+{
+	switch (priority) {
+	case PANFROST_SUBMITQUEUE_PRIORITY_LOW:
+		return DRM_SCHED_PRIORITY_MIN;
+	case PANFROST_SUBMITQUEUE_PRIORITY_MEDIUM:
+		return DRM_SCHED_PRIORITY_NORMAL;
+	case PANFROST_SUBMITQUEUE_PRIORITY_HIGH:
+		return DRM_SCHED_PRIORITY_HIGH;
+	default:
+		break;
+	}
+
+	return DRM_SCHED_PRIORITY_UNSET;
+}
+
+static void
+panfrost_submitqueue_cleanup(struct kref *ref)
+{
+	struct panfrost_submitqueue *queue;
+	unsigned int i;
+
+	queue = container_of(ref, struct panfrost_submitqueue, refcount);
+
+	for (i = 0; i < NUM_JOB_SLOTS; i++)
+		drm_sched_entity_destroy(&queue->sched_entity[i]);
+
+	/* Kill in-flight jobs */
+	panfrost_job_kill_queue(queue);
+
+	kfree(queue);
+}
+
+void panfrost_submitqueue_put(struct panfrost_submitqueue *queue)
+{
+	if (!IS_ERR_OR_NULL(queue))
+		kref_put(&queue->refcount, panfrost_submitqueue_cleanup);
+}
+
+int
+panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
+			    enum panfrost_submitqueue_priority priority,
+			    u32 flags)
+{
+	struct panfrost_submitqueue *queue;
+	enum drm_sched_priority sched_prio;
+	int ret, i;
+
+	if (flags || priority >= PANFROST_SUBMITQUEUE_PRIORITY_COUNT)
+		return -EINVAL;
+
+	queue = kzalloc(sizeof(*queue), GFP_KERNEL);
+	if (!queue)
+		return -ENOMEM;
+
+	queue->pfdev = ctx->pfdev;
+	sched_prio = to_sched_prio(priority);
+	for (i = 0; i < NUM_JOB_SLOTS; i++) {
+		struct drm_gpu_scheduler *sched;
+
+		sched = panfrost_job_get_sched(ctx->pfdev, i);
+		ret = drm_sched_entity_init(&queue->sched_entity[i],
+					    sched_prio, &sched, 1, NULL);
+		if (ret)
+			break;
+	}
+
+	if (ret) {
+		for (i--; i >= 0; i--)
+			drm_sched_entity_destroy(&queue->sched_entity[i]);
+
+		return ret;
+	}
+
+	kref_init(&queue->refcount);
+
+	idr_preload(GFP_KERNEL);
+	idr_lock(&ctx->queues);
+	ret = idr_alloc(&ctx->queues, queue, 0, PAN_MAX_SUBMITQUEUES,
+			GFP_NOWAIT);
+	idr_unlock(&ctx->queues);
+	idr_preload_end();
+
+	if (ret < 0)
+		panfrost_submitqueue_put(queue);
+
+	return ret;
+}
+
+int panfrost_submitqueue_destroy(struct panfrost_file_priv *ctx, u32 id)
+{
+	struct panfrost_submitqueue *queue;
+
+	idr_lock(&ctx->queues);
+	queue = idr_remove(&ctx->queues, id);
+	idr_unlock(&ctx->queues);
+
+	if (!queue)
+		return -ENOENT;
+
+	panfrost_submitqueue_put(queue);
+	return 0;
+}
+
+struct panfrost_submitqueue *
+panfrost_submitqueue_get(struct panfrost_file_priv *ctx, u32 id)
+{
+	struct panfrost_submitqueue *queue;
+
+	idr_lock(&ctx->queues);
+	queue = idr_find(&ctx->queues, id);
+	if (queue)
+		kref_get(&queue->refcount);
+	idr_unlock(&ctx->queues);
+
+	if (!queue)
+		return ERR_PTR(-ENOENT);
+
+	return queue;
+}
diff --git a/drivers/gpu/drm/panfrost/panfrost_submitqueue.h b/drivers/gpu/drm/panfrost/panfrost_submitqueue.h
new file mode 100644
index 000000000000..ade224725844
--- /dev/null
+++ b/drivers/gpu/drm/panfrost/panfrost_submitqueue.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2032 Collabora ltd. */
+
+#ifndef __PANFROST_SUBMITQUEUE_H__
+#define __PANFROST_SUBMITQUEUE_H__
+
+#include <drm/panfrost_drm.h>
+
+struct panfrost_submitqueue {
+	struct kref refcount;
+	struct panfrost_device *pfdev;
+	struct drm_sched_entity sched_entity[NUM_JOB_SLOTS];
+};
+
+struct panfrost_file_priv;
+
+int
+panfrost_submitqueue_create(struct panfrost_file_priv *ctx,
+			    enum panfrost_submitqueue_priority priority,
+			    u32 flags);
+int panfrost_submitqueue_destroy(struct panfrost_file_priv *ctx, u32 id);
+struct panfrost_submitqueue *
+panfrost_submitqueue_get(struct panfrost_file_priv *ctx, u32 id);
+void panfrost_submitqueue_put(struct panfrost_submitqueue *queue);
+
+#endif
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index c8fdf45b1573..66e1c2f1ff4b 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -21,6 +21,8 @@ extern "C" {
 #define DRM_PANFROST_PERFCNT_ENABLE		0x06
 #define DRM_PANFROST_PERFCNT_DUMP		0x07
 #define DRM_PANFROST_MADVISE			0x08
+#define DRM_PANFROST_CREATE_SUBMITQUEUE		0x09
+#define DRM_PANFROST_DESTROY_SUBMITQUEUE	0x0a
 
 #define DRM_IOCTL_PANFROST_SUBMIT		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
 #define DRM_IOCTL_PANFROST_WAIT_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
@@ -29,6 +31,8 @@ extern "C" {
 #define DRM_IOCTL_PANFROST_GET_PARAM		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_PARAM, struct drm_panfrost_get_param)
 #define DRM_IOCTL_PANFROST_GET_BO_OFFSET	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_GET_BO_OFFSET, struct drm_panfrost_get_bo_offset)
 #define DRM_IOCTL_PANFROST_MADVISE		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MADVISE, struct drm_panfrost_madvise)
+#define DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_CREATE_SUBMITQUEUE, struct drm_panfrost_create_submitqueue)
+#define DRM_IOCTL_PANFROST_DESTROY_SUBMITQUEUE	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_DESTROY_SUBMITQUEUE, __u32)
 
 /*
  * Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
@@ -226,6 +230,19 @@ struct drm_panfrost_madvise {
 	__u32 retained;       /* out, whether backing store still exists */
 };
 
+enum panfrost_submitqueue_priority {
+	PANFROST_SUBMITQUEUE_PRIORITY_LOW = 0,
+	PANFROST_SUBMITQUEUE_PRIORITY_MEDIUM,
+	PANFROST_SUBMITQUEUE_PRIORITY_HIGH,
+	PANFROST_SUBMITQUEUE_PRIORITY_COUNT,
+};
+
+struct drm_panfrost_create_submitqueue {
+	__u32 flags;	/* in, PANFROST_SUBMITQUEUE_x */
+	__u32 priority;	/* in, enum panfrost_submitqueue_priority */
+	__u32 id;	/* out, identifier */
+};
+
 /* Exclusive (AKA write) access to the BO */
 #define PANFROST_BO_REF_EXCLUSIVE	0x1
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 5/8] drm/panfrost: Add a new ioctl to submit batches
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
                   ` (3 preceding siblings ...)
  2021-09-30 19:09 ` [PATCH v5 4/8] drm/panfrost: Add the ability to create submit queues Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-10-04 11:30   ` Steven Price
  2021-09-30 19:09 ` [PATCH v5 6/8] drm/panfrost: Support synchronization jobs Boris Brezillon
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

This should help limit the number of ioctls when submitting multiple
jobs. The new ioctl also supports syncobj timelines and BO access flags.

v5:
* Fix typos
* Add BUILD_BUG_ON() checks to make sure SUBMIT_BATCH_VERSION and
  descriptor sizes are synced
* Simplify error handling in panfrost_ioctl_batch_submit()
* Don't disable implicit fences on exclusive references

v4:
* Implement panfrost_ioctl_submit() as a wrapper around
  panfrost_submit_job()
* Replace stride fields by a version field which is mapped to
  a <job_stride,bo_ref_stride,syncobj_ref_stride> tuple internally

v3:
* Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
  old submit path

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 584 ++++++++++++++++--------
 drivers/gpu/drm/panfrost/panfrost_job.c |   4 +-
 include/uapi/drm/panfrost_drm.h         |  92 ++++
 3 files changed, 492 insertions(+), 188 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index f8f430f68090..30dc158d56e6 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -147,193 +147,6 @@ panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
 	return 0;
 }
 
-/**
- * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve handles from userspace to BOs and attach them to job.
- *
- * Note that this function doesn't need to unreference the BOs on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
-static int
-panfrost_lookup_bos(struct drm_device *dev,
-		  struct drm_file *file_priv,
-		  struct drm_panfrost_submit *args,
-		  struct panfrost_job *job)
-{
-	unsigned int i;
-	int ret;
-
-	job->bo_count = args->bo_handle_count;
-
-	if (!job->bo_count)
-		return 0;
-
-	job->bo_flags = kvmalloc_array(job->bo_count,
-				       sizeof(*job->bo_flags),
-				       GFP_KERNEL | __GFP_ZERO);
-	if (!job->bo_flags)
-		return -ENOMEM;
-
-	for (i = 0; i < job->bo_count; i++)
-		job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
-
-	ret = drm_gem_objects_lookup(file_priv,
-				     (void __user *)(uintptr_t)args->bo_handles,
-				     job->bo_count, &job->bos);
-	if (ret)
-		return ret;
-
-	return panfrost_get_job_mappings(file_priv, job);
-}
-
-/**
- * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
- * referenced by the job.
- * @dev: DRM device
- * @file_priv: DRM file for this fd
- * @args: IOCTL args
- * @job: job being set up
- *
- * Resolve syncobjs from userspace to fences and attach them to job.
- *
- * Note that this function doesn't need to unreference the fences on
- * failure, because that will happen at panfrost_job_cleanup() time.
- */
-static int
-panfrost_copy_in_sync(struct drm_device *dev,
-		  struct drm_file *file_priv,
-		  struct drm_panfrost_submit *args,
-		  struct panfrost_job *job)
-{
-	u32 *handles;
-	int ret = 0;
-	int i, in_fence_count;
-
-	in_fence_count = args->in_sync_count;
-
-	if (!in_fence_count)
-		return 0;
-
-	handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
-	if (!handles) {
-		ret = -ENOMEM;
-		DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
-		goto fail;
-	}
-
-	if (copy_from_user(handles,
-			   (void __user *)(uintptr_t)args->in_syncs,
-			   in_fence_count * sizeof(u32))) {
-		ret = -EFAULT;
-		DRM_DEBUG("Failed to copy in syncobj handles\n");
-		goto fail;
-	}
-
-	for (i = 0; i < in_fence_count; i++) {
-		struct dma_fence *fence;
-
-		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
-					     &fence);
-		if (ret)
-			goto fail;
-
-		ret = drm_sched_job_add_dependency(&job->base, fence);
-
-		if (ret)
-			goto fail;
-	}
-
-fail:
-	kvfree(handles);
-	return ret;
-}
-
-static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
-		struct drm_file *file)
-{
-	struct panfrost_device *pfdev = dev->dev_private;
-	struct drm_panfrost_submit *args = data;
-	struct drm_syncobj *sync_out = NULL;
-	struct panfrost_submitqueue *queue;
-	struct panfrost_job *job;
-	int ret = 0, slot;
-
-	if (!args->jc)
-		return -EINVAL;
-
-	if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
-		return -EINVAL;
-
-	queue = panfrost_submitqueue_get(file->driver_priv, 0);
-	if (IS_ERR(queue))
-		return PTR_ERR(queue);
-
-	if (args->out_sync > 0) {
-		sync_out = drm_syncobj_find(file, args->out_sync);
-		if (!sync_out) {
-			ret = -ENODEV;
-			goto fail_put_queue;
-		}
-	}
-
-	job = kzalloc(sizeof(*job), GFP_KERNEL);
-	if (!job) {
-		ret = -ENOMEM;
-		goto out_put_syncout;
-	}
-
-	kref_init(&job->refcount);
-
-	job->pfdev = pfdev;
-	job->jc = args->jc;
-	job->requirements = args->requirements;
-	job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
-	job->file_priv = file->driver_priv;
-
-	slot = panfrost_job_get_slot(job);
-
-	ret = drm_sched_job_init(&job->base,
-				 &queue->sched_entity[slot],
-				 NULL);
-	if (ret)
-		goto out_put_job;
-
-	ret = panfrost_copy_in_sync(dev, file, args, job);
-	if (ret)
-		goto out_cleanup_job;
-
-	ret = panfrost_lookup_bos(dev, file, args, job);
-	if (ret)
-		goto out_cleanup_job;
-
-	ret = panfrost_job_push(queue, job);
-	if (ret)
-		goto out_cleanup_job;
-
-	/* Update the return sync object for the job */
-	if (sync_out)
-		drm_syncobj_replace_fence(sync_out, job->render_done_fence);
-
-out_cleanup_job:
-	if (ret)
-		drm_sched_job_cleanup(&job->base);
-out_put_job:
-	panfrost_job_put(job);
-out_put_syncout:
-	if (sync_out)
-		drm_syncobj_put(sync_out);
-fail_put_queue:
-	panfrost_submitqueue_put(queue);
-
-	return ret;
-}
-
 static int
 panfrost_ioctl_wait_bo(struct drm_device *dev, void *data,
 		       struct drm_file *file_priv)
@@ -509,6 +322,402 @@ panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
 	return panfrost_submitqueue_destroy(priv, id);
 }
 
+#define PANFROST_BO_REF_ALLOWED_FLAGS \
+	(PANFROST_BO_REF_EXCLUSIVE | PANFROST_BO_REF_NO_IMPLICIT_DEP)
+
+static int
+panfrost_get_job_bos(struct drm_file *file_priv,
+		     u64 refs, u32 ref_stride, u32 count,
+		     struct panfrost_job *job)
+{
+	void __user *in = u64_to_user_ptr(refs);
+	unsigned int i;
+
+	job->bo_count = count;
+
+	if (!count)
+		return 0;
+
+	job->bos = kvmalloc_array(job->bo_count, sizeof(*job->bos),
+				  GFP_KERNEL | __GFP_ZERO);
+	job->bo_flags = kvmalloc_array(job->bo_count,
+				       sizeof(*job->bo_flags),
+				       GFP_KERNEL | __GFP_ZERO);
+	if (!job->bos || !job->bo_flags)
+		return -ENOMEM;
+
+	for (i = 0; i < count; i++) {
+		struct drm_panfrost_bo_ref ref = { };
+		int ret;
+
+		ret = copy_struct_from_user(&ref, sizeof(ref),
+					    in + (i * ref_stride),
+					    ref_stride);
+		if (ret)
+			return ret;
+
+		/* Prior to the BATCH_SUBMIT ioctl all accessed BOs were
+		 * treated as exclusive.
+		 */
+		if (ref_stride == sizeof(u32))
+			ref.flags = PANFROST_BO_REF_EXCLUSIVE;
+
+		if ((ref.flags & ~PANFROST_BO_REF_ALLOWED_FLAGS))
+			return -EINVAL;
+
+		job->bos[i] = drm_gem_object_lookup(file_priv, ref.handle);
+		if (!job->bos[i])
+			return -EINVAL;
+
+		job->bo_flags[i] = ref.flags;
+	}
+
+	return 0;
+}
+
+static int
+panfrost_get_job_in_syncs(struct drm_file *file_priv,
+			  u64 refs, u32 ref_stride,
+			  u32 count, struct panfrost_job *job)
+{
+	const void __user *in = u64_to_user_ptr(refs);
+	unsigned int i;
+	int ret;
+
+	if (!count)
+		return 0;
+
+	for (i = 0; i < count; i++) {
+		struct drm_panfrost_syncobj_ref ref = { };
+		struct dma_fence *fence;
+
+		ret = copy_struct_from_user(&ref, sizeof(ref),
+					    in + (i * ref_stride),
+					    ref_stride);
+		if (ret)
+			return ret;
+
+		if (ref.pad)
+			return -EINVAL;
+
+		ret = drm_syncobj_find_fence(file_priv, ref.handle, ref.point,
+					     0, &fence);
+		if (ret)
+			return ret;
+
+		ret = drm_sched_job_add_dependency(&job->base, fence);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+struct panfrost_job_out_sync {
+	struct drm_syncobj *syncobj;
+	struct dma_fence_chain *chain;
+	u64 point;
+};
+
+static void
+panfrost_put_job_out_syncs(struct panfrost_job_out_sync *out_syncs, u32 count)
+{
+	unsigned int i;
+
+	for (i = 0; i < count; i++) {
+		if (!out_syncs[i].syncobj)
+			break;
+
+		drm_syncobj_put(out_syncs[i].syncobj);
+		kvfree(out_syncs[i].chain);
+	}
+
+	kvfree(out_syncs);
+}
+
+static struct panfrost_job_out_sync *
+panfrost_get_job_out_syncs(struct drm_file *file_priv,
+			   u64 refs, u32 ref_stride,
+			   u32 count)
+{
+	void __user *in = u64_to_user_ptr(refs);
+	struct panfrost_job_out_sync *out_syncs;
+	unsigned int i;
+	int ret;
+
+	if (!count)
+		return NULL;
+
+	/* If the syncobj ref_stride == sizeof(u32) we are called from the
+	 * old submit ioctl() which only accepted one out syncobj. In that
+	 * case the syncobj handle is passed directly through the
+	 * ->out_syncs field, so let's make sure the refs fits in a u32.
+	 */
+	if (ref_stride == sizeof(u32) &&
+	    (count != 1 || refs > UINT_MAX))
+		return ERR_PTR(-EINVAL);
+
+	out_syncs = kvmalloc_array(count, sizeof(*out_syncs),
+				   GFP_KERNEL | __GFP_ZERO);
+	if (!out_syncs)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < count; i++) {
+		struct drm_panfrost_syncobj_ref ref = { };
+
+		if (ref_stride == sizeof(u32)) {
+			/* Special case for the old submit wrapper: in that
+			 * case there's only one out_sync, and the syncobj
+			 * handle is passed directly in the out_syncs field.
+			 */
+			ref.handle = refs;
+		} else {
+			ret = copy_struct_from_user(&ref, sizeof(ref),
+						    in + (i * ref_stride),
+						    ref_stride);
+			if (ret)
+				goto err_free_out_syncs;
+		}
+
+		if (ref.pad) {
+			ret = -EINVAL;
+			goto err_free_out_syncs;
+		}
+
+		out_syncs[i].syncobj = drm_syncobj_find(file_priv, ref.handle);
+		if (!out_syncs[i].syncobj) {
+			ret = -ENODEV;
+			goto err_free_out_syncs;
+		}
+
+		out_syncs[i].point = ref.point;
+		if (!out_syncs[i].point)
+			continue;
+
+		out_syncs[i].chain = kmalloc(sizeof(*out_syncs[i].chain),
+					     GFP_KERNEL);
+		if (!out_syncs[i].chain) {
+			ret = -ENOMEM;
+			goto err_free_out_syncs;
+		}
+	}
+
+	return out_syncs;
+
+err_free_out_syncs:
+	panfrost_put_job_out_syncs(out_syncs, count);
+	return ERR_PTR(ret);
+}
+
+static void
+panfrost_set_job_out_fence(struct panfrost_job_out_sync *out_syncs,
+			   unsigned int count, struct dma_fence *fence)
+{
+	unsigned int i;
+
+	for (i = 0; i < count; i++) {
+		if (out_syncs[i].chain) {
+			drm_syncobj_add_point(out_syncs[i].syncobj,
+					      out_syncs[i].chain,
+					      fence, out_syncs[i].point);
+			out_syncs[i].chain = NULL;
+		} else {
+			drm_syncobj_replace_fence(out_syncs[i].syncobj,
+						  fence);
+		}
+	}
+}
+
+struct panfrost_submit_ioctl_version_info {
+	u32 job_stride;
+	u32 bo_ref_stride;
+	u32 syncobj_ref_stride;
+};
+
+static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
+	/* SUBMIT */
+	[0] = { 0, 4, 4 },
+
+	/* BATCH_SUBMIT v1 */
+	[1] = { 48, 8, 16 },
+};
+
+#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
+
+static int
+panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
+		    struct panfrost_submitqueue *queue,
+		    const struct drm_panfrost_job *args,
+		    u32 version)
+{
+	struct panfrost_device *pfdev = dev->dev_private;
+	struct panfrost_job_out_sync *out_syncs;
+	u32 bo_stride, syncobj_stride;
+	struct panfrost_job *job;
+	int ret, slot;
+
+	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
+		return -EINVAL;
+
+	if (!args->head)
+		return -EINVAL;
+
+	bo_stride = submit_versions[version].bo_ref_stride;
+	syncobj_stride = submit_versions[version].syncobj_ref_stride;
+
+	job = kzalloc(sizeof(*job), GFP_KERNEL);
+	if (!job)
+		return -ENOMEM;
+
+	kref_init(&job->refcount);
+
+	job->pfdev = pfdev;
+	job->jc = args->head;
+	job->requirements = args->requirements;
+	job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
+	job->file_priv = file_priv->driver_priv;
+
+	slot = panfrost_job_get_slot(job);
+
+	ret = drm_sched_job_init(&job->base,
+				 &queue->sched_entity[slot],
+				 NULL);
+	if (ret)
+		goto out_put_job;
+
+	ret = panfrost_get_job_in_syncs(file_priv,
+					args->in_syncs,
+					syncobj_stride,
+					args->in_sync_count,
+					job);
+	if (ret)
+		goto out_cleanup_job;
+
+	out_syncs = panfrost_get_job_out_syncs(file_priv,
+					       args->out_syncs,
+					       syncobj_stride,
+					       args->out_sync_count);
+	if (IS_ERR(out_syncs)) {
+		ret = PTR_ERR(out_syncs);
+		goto out_cleanup_job;
+	}
+
+	ret = panfrost_get_job_bos(file_priv, args->bos, bo_stride,
+				   args->bo_count, job);
+	if (ret)
+		goto out_put_outsyncs;
+
+	ret = panfrost_get_job_mappings(file_priv, job);
+	if (ret)
+		goto out_put_outsyncs;
+
+	ret = panfrost_job_push(queue, job);
+	if (ret)
+		goto out_put_outsyncs;
+
+	panfrost_set_job_out_fence(out_syncs, args->out_sync_count,
+				   job->render_done_fence);
+
+out_put_outsyncs:
+	panfrost_put_job_out_syncs(out_syncs, args->out_sync_count);
+out_cleanup_job:
+	if (ret)
+		drm_sched_job_cleanup(&job->base);
+out_put_job:
+	panfrost_job_put(job);
+	return ret;
+}
+
+static int
+panfrost_ioctl_submit(struct drm_device *dev, void *data,
+		      struct drm_file *file)
+{
+	struct drm_panfrost_submit *args = data;
+	struct drm_panfrost_job job_args = {
+		.head = args->jc,
+		.bos = args->bo_handles,
+		.in_syncs = args->in_syncs,
+
+		/* We are abusing .out_syncs and passing the handle directly
+		 * instead of a pointer to a user u32 array, but
+		 * panfrost_job_submit() knows about it, so it's fine.
+		 */
+		.out_syncs = args->out_sync,
+		.in_sync_count = args->in_sync_count,
+		.out_sync_count = args->out_sync > 0 ? 1 : 0,
+		.bo_count = args->bo_handle_count,
+		.requirements = args->requirements
+	};
+	struct panfrost_submitqueue *queue;
+	int ret;
+
+	queue = panfrost_submitqueue_get(file->driver_priv, 0);
+	if (IS_ERR(queue))
+		return PTR_ERR(queue);
+
+	ret = panfrost_submit_job(dev, file, queue, &job_args, 0);
+	panfrost_submitqueue_put(queue);
+
+	return ret;
+}
+
+static int
+panfrost_ioctl_batch_submit(struct drm_device *dev, void *data,
+			    struct drm_file *file_priv)
+{
+	struct drm_panfrost_batch_submit *args = data;
+	void __user *jobs_args = u64_to_user_ptr(args->jobs);
+	struct panfrost_submitqueue *queue;
+	u32 version = args->version;
+	u32 job_stride;
+	unsigned int i;
+	int ret = 0;
+
+	/* Make sure the SUBMIT_BATCH_VERSION and desc sizes are synced */
+	BUILD_BUG_ON(sizeof(struct drm_panfrost_job) !=
+		     submit_versions[PANFROST_SUBMIT_BATCH_VERSION].job_stride);
+	BUILD_BUG_ON(sizeof(struct drm_panfrost_bo_ref) !=
+		     submit_versions[PANFROST_SUBMIT_BATCH_VERSION].bo_ref_stride);
+	BUILD_BUG_ON(sizeof(struct drm_panfrost_syncobj_ref) !=
+		     submit_versions[PANFROST_SUBMIT_BATCH_VERSION].syncobj_ref_stride);
+
+	/* Version 0 doesn't exists (it's reserved for the SUBMIT ioctl) */
+	if (!version)
+		return -EINVAL;
+
+	/* If the version specified is bigger than what we currently support,
+	 * pick the last supported version and let copy_struct_from_user()
+	 * check that any extra job, bo_ref and syncobj_ref fields are zeroed.
+	 */
+	if (version >= ARRAY_SIZE(submit_versions))
+		version = ARRAY_SIZE(submit_versions) - 1;
+
+	queue = panfrost_submitqueue_get(file_priv->driver_priv, args->queue);
+	if (IS_ERR(queue))
+		return PTR_ERR(queue);
+
+	job_stride = submit_versions[version].job_stride;
+	for (i = 0; i < args->job_count; i++) {
+		struct drm_panfrost_job job_args = { };
+
+		ret = copy_struct_from_user(&job_args, sizeof(job_args),
+					    jobs_args + (i * job_stride),
+					    job_stride);
+		if (!ret) {
+			ret = panfrost_submit_job(dev, file_priv, queue,
+						  &job_args, version);
+		}
+
+		if (ret) {
+			args->fail_idx = i;
+			break;
+		}
+	}
+
+	panfrost_submitqueue_put(queue);
+	return ret;
+}
+
 int panfrost_unstable_ioctl_check(void)
 {
 	if (!unstable_ioctls)
@@ -590,6 +799,7 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
 	PANFROST_IOCTL(MADVISE,		madvise,	DRM_RENDER_ALLOW),
 	PANFROST_IOCTL(CREATE_SUBMITQUEUE, create_submitqueue, DRM_RENDER_ALLOW),
 	PANFROST_IOCTL(DESTROY_SUBMITQUEUE, destroy_submitqueue, DRM_RENDER_ALLOW),
+	PANFROST_IOCTL(BATCH_SUBMIT,	batch_submit,	DRM_RENDER_ALLOW),
 };
 
 DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 62e010307afc..0367cee8f6df 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -254,7 +254,9 @@ static int panfrost_acquire_object_fences(struct panfrost_job *job)
 				return ret;
 		}
 
-		/* panfrost always uses write mode in its current uapi */
+		if ((job->bo_flags[i] & PANFROST_BO_REF_NO_IMPLICIT_DEP) && !exclusive)
+			continue;
+
 		ret = drm_sched_job_add_implicit_dependencies(&job->base, job->bos[i],
 							      exclusive);
 		if (ret)
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 66e1c2f1ff4b..5e3f8a344f41 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -23,6 +23,7 @@ extern "C" {
 #define DRM_PANFROST_MADVISE			0x08
 #define DRM_PANFROST_CREATE_SUBMITQUEUE		0x09
 #define DRM_PANFROST_DESTROY_SUBMITQUEUE	0x0a
+#define DRM_PANFROST_BATCH_SUBMIT		0x0b
 
 #define DRM_IOCTL_PANFROST_SUBMIT		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
 #define DRM_IOCTL_PANFROST_WAIT_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
@@ -33,6 +34,7 @@ extern "C" {
 #define DRM_IOCTL_PANFROST_MADVISE		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MADVISE, struct drm_panfrost_madvise)
 #define DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_CREATE_SUBMITQUEUE, struct drm_panfrost_create_submitqueue)
 #define DRM_IOCTL_PANFROST_DESTROY_SUBMITQUEUE	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_DESTROY_SUBMITQUEUE, __u32)
+#define DRM_IOCTL_PANFROST_BATCH_SUBMIT		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_BATCH_SUBMIT, struct drm_panfrost_batch_submit)
 
 /*
  * Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
@@ -243,9 +245,99 @@ struct drm_panfrost_create_submitqueue {
 	__u32 id;	/* out, identifier */
 };
 
+/* Syncobj reference passed at job submission time to encode explicit
+ * input/output fences.
+ */
+struct drm_panfrost_syncobj_ref {
+	/** Syncobj handle */
+	__u32 handle;
+
+	/** Padding field, must be set to 0 */
+	__u32 pad;
+
+	/**
+	 * For timeline syncobjs, the point on the timeline the reference
+	 * points to. 0 for the last point.
+	 * Must be set to 0 for non-timeline syncobjs
+	 */
+	__u64 point;
+};
+
 /* Exclusive (AKA write) access to the BO */
 #define PANFROST_BO_REF_EXCLUSIVE	0x1
 
+/* Disable the implicit dependency on the BO fence */
+#define PANFROST_BO_REF_NO_IMPLICIT_DEP	0x2
+
+/* Describes a BO referenced by a job and the type of access. */
+struct drm_panfrost_bo_ref {
+	/** A GEM handle */
+	__u32 handle;
+
+	/** A combination of PANFROST_BO_REF_x flags */
+	__u32 flags;
+};
+
+/* Describes a GPU job and the resources attached to it. */
+struct drm_panfrost_job {
+	/** GPU pointer to the head of the job chain. */
+	__u64 head;
+
+	/**
+	 * Array of drm_panfrost_bo_ref objects describing the BOs referenced
+	 * by this job.
+	 */
+	__u64 bos;
+
+	/**
+	 * Arrays of drm_panfrost_syncobj_ref objects describing the input
+	 * and output fences.
+	 */
+	__u64 in_syncs;
+	__u64 out_syncs;
+
+	/** Syncobj reference array sizes. */
+	__u32 in_sync_count;
+	__u32 out_sync_count;
+
+	/** BO reference array size. */
+	__u32 bo_count;
+
+	/** Combination of PANFROST_JD_REQ_* flags. */
+	__u32 requirements;
+};
+
+#define PANFROST_SUBMIT_BATCH_VERSION	1
+
+/* Used to submit multiple jobs in one call */
+struct drm_panfrost_batch_submit {
+	/**
+	 * Always set to PANFROST_SUBMIT_BATCH_VERSION. This is used to let the
+	 * kernel know about the size of the various structs passed to the
+	 * BATCH_SUBMIT ioctl.
+	 */
+	__u32 version;
+
+	/** Number of jobs to submit. */
+	__u32 job_count;
+
+	/* Pointer to a job array. */
+	__u64 jobs;
+
+	/**
+	 * ID of the queue to submit those jobs to. 0 is the default
+	 * submit queue and should always exists. If you need a dedicated
+	 * queue, create it with DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE.
+	 */
+	__u32 queue;
+
+	/**
+	 * If the submission fails, this encodes the index of the first job
+	 * that failed.
+	 */
+	__u32 fail_idx;
+};
+
 #if defined(__cplusplus)
 }
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 6/8] drm/panfrost: Support synchronization jobs
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
                   ` (4 preceding siblings ...)
  2021-09-30 19:09 ` [PATCH v5 5/8] drm/panfrost: Add a new ioctl to submit batches Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-10-04 11:30   ` Steven Price
  2021-09-30 19:09 ` [PATCH v5 7/8] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 8/8] drm/panfrost: Bump minor version to reflect the feature additions Boris Brezillon
  7 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

Sometimes, all the user wants to do is add a synchronization point.
Userspace can already do that by submitting a NULL job, but this implies
submitting something to the GPU when we could simply skip the job and
signal the done fence directly.

v5:
* New patch

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++--
 drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
 include/uapi/drm/panfrost_drm.h         | 7 +++++++
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 30dc158d56e6..89a0c484310c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
 	[1] = { 48, 8, 16 },
 };
 
-#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
+#define PANFROST_JD_ALLOWED_REQS \
+	(PANFROST_JD_REQ_FS | \
+	 PANFROST_JD_REQ_DEP_ONLY)
 
 static int
 panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
@@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
 	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
 		return -EINVAL;
 
-	if (!args->head)
+	/* If this is a dependency-only job, the job chain head should be NULL,
+	 * otherwise it should be non-NULL.
+	 */
+	if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY))
 		return -EINVAL;
 
 	bo_stride = submit_versions[version].bo_ref_stride;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 0367cee8f6df..6d8706d4a096 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 	u64 jc_head = job->jc;
 	int ret;
 
+	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) {
+		/* Nothing to execute, signal the fence directly. */
+		dma_fence_signal_locked(job->done_fence);
+		return;
+	}
+
 	panfrost_devfreq_record_busy(&pfdev->pfdevfreq);
 
 	ret = pm_runtime_get_sync(pfdev->dev);
diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
index 5e3f8a344f41..b9df066970f6 100644
--- a/include/uapi/drm/panfrost_drm.h
+++ b/include/uapi/drm/panfrost_drm.h
@@ -46,6 +46,13 @@ extern "C" {
 #define DRM_IOCTL_PANFROST_PERFCNT_DUMP		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
 
 #define PANFROST_JD_REQ_FS (1 << 0)
+
+/*
+ * Dependency only job. The job chain head should be set to 0 when this flag
+ * is set.
+ */
+#define PANFROST_JD_REQ_DEP_ONLY (1 << 1)
+
 /**
  * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
  * engine.
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 7/8] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
                   ` (5 preceding siblings ...)
  2021-09-30 19:09 ` [PATCH v5 6/8] drm/panfrost: Support synchronization jobs Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  2021-09-30 19:09 ` [PATCH v5 8/8] drm/panfrost: Bump minor version to reflect the feature additions Boris Brezillon
  7 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

Now that we have a new SUBMIT ioctl dealing with timelined syncojbs we
can advertise the feature.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 89a0c484310c..9f983b763372 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -817,7 +817,8 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.3 - adds PANFROST_BO_NO{READ,WRITE} flags
  */
 static const struct drm_driver panfrost_drm_driver = {
-	.driver_features	= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ,
+	.driver_features	= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
+				  DRIVER_SYNCOBJ_TIMELINE,
 	.open			= panfrost_open,
 	.postclose		= panfrost_postclose,
 	.ioctls			= panfrost_drm_driver_ioctls,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v5 8/8] drm/panfrost: Bump minor version to reflect the feature additions
  2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
                   ` (6 preceding siblings ...)
  2021-09-30 19:09 ` [PATCH v5 7/8] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature Boris Brezillon
@ 2021-09-30 19:09 ` Boris Brezillon
  7 siblings, 0 replies; 13+ messages in thread
From: Boris Brezillon @ 2021-09-30 19:09 UTC (permalink / raw)
  To: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Steven Price, Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König, Boris Brezillon

We now have a new ioctl that allows submitting multiple jobs at once
(among other things) and we support timelined syncobjs. Bump the
minor version number to reflect those changes.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 9f983b763372..21871810df77 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -815,6 +815,9 @@ DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
  * - 1.1 - adds HEAP and NOEXEC flags for CREATE_BO
  * - 1.2 - adds AFBC_FEATURES query
  * - 1.3 - adds PANFROST_BO_NO{READ,WRITE} flags
+ * - 1.4 - adds the BATCH_SUBMIT, CREATE_SUBMITQUEUE, DESTROY_SUBMITQUEUE
+ *	   ioctls, adds support for DEP_ONLY jobs and advertises the
+ *	   SYNCOBJ_TIMELINE feature
  */
 static const struct drm_driver panfrost_drm_driver = {
 	.driver_features	= DRIVER_RENDER | DRIVER_GEM | DRIVER_SYNCOBJ |
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 5/8] drm/panfrost: Add a new ioctl to submit batches
  2021-09-30 19:09 ` [PATCH v5 5/8] drm/panfrost: Add a new ioctl to submit batches Boris Brezillon
@ 2021-10-04 11:30   ` Steven Price
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Price @ 2021-10-04 11:30 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig,
	Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König

On 30/09/2021 20:09, Boris Brezillon wrote:
> This should help limit the number of ioctls when submitting multiple
> jobs. The new ioctl also supports syncobj timelines and BO access flags.
> 
> v5:
> * Fix typos
> * Add BUILD_BUG_ON() checks to make sure SUBMIT_BATCH_VERSION and
>   descriptor sizes are synced
> * Simplify error handling in panfrost_ioctl_batch_submit()
> * Don't disable implicit fences on exclusive references
> 
> v4:
> * Implement panfrost_ioctl_submit() as a wrapper around
>   panfrost_submit_job()
> * Replace stride fields by a version field which is mapped to
>   a <job_stride,bo_ref_stride,syncobj_ref_stride> tuple internally
> 
> v3:
> * Re-use panfrost_get_job_bos() and panfrost_get_job_in_syncs() in the
>   old submit path
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 584 ++++++++++++++++--------
>  drivers/gpu/drm/panfrost/panfrost_job.c |   4 +-
>  include/uapi/drm/panfrost_drm.h         |  92 ++++
>  3 files changed, 492 insertions(+), 188 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index f8f430f68090..30dc158d56e6 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -147,193 +147,6 @@ panfrost_get_job_mappings(struct drm_file *file_priv, struct panfrost_job *job)
>  	return 0;
>  }
>  
> -/**
> - * panfrost_lookup_bos() - Sets up job->bo[] with the GEM objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve handles from userspace to BOs and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the BOs on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> -static int
> -panfrost_lookup_bos(struct drm_device *dev,
> -		  struct drm_file *file_priv,
> -		  struct drm_panfrost_submit *args,
> -		  struct panfrost_job *job)
> -{
> -	unsigned int i;
> -	int ret;
> -
> -	job->bo_count = args->bo_handle_count;
> -
> -	if (!job->bo_count)
> -		return 0;
> -
> -	job->bo_flags = kvmalloc_array(job->bo_count,
> -				       sizeof(*job->bo_flags),
> -				       GFP_KERNEL | __GFP_ZERO);
> -	if (!job->bo_flags)
> -		return -ENOMEM;
> -
> -	for (i = 0; i < job->bo_count; i++)
> -		job->bo_flags[i] = PANFROST_BO_REF_EXCLUSIVE;
> -
> -	ret = drm_gem_objects_lookup(file_priv,
> -				     (void __user *)(uintptr_t)args->bo_handles,
> -				     job->bo_count, &job->bos);
> -	if (ret)
> -		return ret;
> -
> -	return panfrost_get_job_mappings(file_priv, job);
> -}
> -
> -/**
> - * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
> - * referenced by the job.
> - * @dev: DRM device
> - * @file_priv: DRM file for this fd
> - * @args: IOCTL args
> - * @job: job being set up
> - *
> - * Resolve syncobjs from userspace to fences and attach them to job.
> - *
> - * Note that this function doesn't need to unreference the fences on
> - * failure, because that will happen at panfrost_job_cleanup() time.
> - */
> -static int
> -panfrost_copy_in_sync(struct drm_device *dev,
> -		  struct drm_file *file_priv,
> -		  struct drm_panfrost_submit *args,
> -		  struct panfrost_job *job)
> -{
> -	u32 *handles;
> -	int ret = 0;
> -	int i, in_fence_count;
> -
> -	in_fence_count = args->in_sync_count;
> -
> -	if (!in_fence_count)
> -		return 0;
> -
> -	handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
> -	if (!handles) {
> -		ret = -ENOMEM;
> -		DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
> -		goto fail;
> -	}
> -
> -	if (copy_from_user(handles,
> -			   (void __user *)(uintptr_t)args->in_syncs,
> -			   in_fence_count * sizeof(u32))) {
> -		ret = -EFAULT;
> -		DRM_DEBUG("Failed to copy in syncobj handles\n");
> -		goto fail;
> -	}
> -
> -	for (i = 0; i < in_fence_count; i++) {
> -		struct dma_fence *fence;
> -
> -		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
> -					     &fence);
> -		if (ret)
> -			goto fail;
> -
> -		ret = drm_sched_job_add_dependency(&job->base, fence);
> -
> -		if (ret)
> -			goto fail;
> -	}
> -
> -fail:
> -	kvfree(handles);
> -	return ret;
> -}
> -
> -static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
> -		struct drm_file *file)
> -{
> -	struct panfrost_device *pfdev = dev->dev_private;
> -	struct drm_panfrost_submit *args = data;
> -	struct drm_syncobj *sync_out = NULL;
> -	struct panfrost_submitqueue *queue;
> -	struct panfrost_job *job;
> -	int ret = 0, slot;
> -
> -	if (!args->jc)
> -		return -EINVAL;
> -
> -	if (args->requirements && args->requirements != PANFROST_JD_REQ_FS)
> -		return -EINVAL;
> -
> -	queue = panfrost_submitqueue_get(file->driver_priv, 0);
> -	if (IS_ERR(queue))
> -		return PTR_ERR(queue);
> -
> -	if (args->out_sync > 0) {
> -		sync_out = drm_syncobj_find(file, args->out_sync);
> -		if (!sync_out) {
> -			ret = -ENODEV;
> -			goto fail_put_queue;
> -		}
> -	}
> -
> -	job = kzalloc(sizeof(*job), GFP_KERNEL);
> -	if (!job) {
> -		ret = -ENOMEM;
> -		goto out_put_syncout;
> -	}
> -
> -	kref_init(&job->refcount);
> -
> -	job->pfdev = pfdev;
> -	job->jc = args->jc;
> -	job->requirements = args->requirements;
> -	job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
> -	job->file_priv = file->driver_priv;
> -
> -	slot = panfrost_job_get_slot(job);
> -
> -	ret = drm_sched_job_init(&job->base,
> -				 &queue->sched_entity[slot],
> -				 NULL);
> -	if (ret)
> -		goto out_put_job;
> -
> -	ret = panfrost_copy_in_sync(dev, file, args, job);
> -	if (ret)
> -		goto out_cleanup_job;
> -
> -	ret = panfrost_lookup_bos(dev, file, args, job);
> -	if (ret)
> -		goto out_cleanup_job;
> -
> -	ret = panfrost_job_push(queue, job);
> -	if (ret)
> -		goto out_cleanup_job;
> -
> -	/* Update the return sync object for the job */
> -	if (sync_out)
> -		drm_syncobj_replace_fence(sync_out, job->render_done_fence);
> -
> -out_cleanup_job:
> -	if (ret)
> -		drm_sched_job_cleanup(&job->base);
> -out_put_job:
> -	panfrost_job_put(job);
> -out_put_syncout:
> -	if (sync_out)
> -		drm_syncobj_put(sync_out);
> -fail_put_queue:
> -	panfrost_submitqueue_put(queue);
> -
> -	return ret;
> -}
> -
>  static int
>  panfrost_ioctl_wait_bo(struct drm_device *dev, void *data,
>  		       struct drm_file *file_priv)
> @@ -509,6 +322,402 @@ panfrost_ioctl_destroy_submitqueue(struct drm_device *dev, void *data,
>  	return panfrost_submitqueue_destroy(priv, id);
>  }
>  
> +#define PANFROST_BO_REF_ALLOWED_FLAGS \
> +	(PANFROST_BO_REF_EXCLUSIVE | PANFROST_BO_REF_NO_IMPLICIT_DEP)
> +
> +static int
> +panfrost_get_job_bos(struct drm_file *file_priv,
> +		     u64 refs, u32 ref_stride, u32 count,
> +		     struct panfrost_job *job)
> +{
> +	void __user *in = u64_to_user_ptr(refs);
> +	unsigned int i;
> +
> +	job->bo_count = count;
> +
> +	if (!count)
> +		return 0;
> +
> +	job->bos = kvmalloc_array(job->bo_count, sizeof(*job->bos),
> +				  GFP_KERNEL | __GFP_ZERO);
> +	job->bo_flags = kvmalloc_array(job->bo_count,
> +				       sizeof(*job->bo_flags),
> +				       GFP_KERNEL | __GFP_ZERO);
> +	if (!job->bos || !job->bo_flags)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < count; i++) {
> +		struct drm_panfrost_bo_ref ref = { };
> +		int ret;
> +
> +		ret = copy_struct_from_user(&ref, sizeof(ref),
> +					    in + (i * ref_stride),
> +					    ref_stride);
> +		if (ret)
> +			return ret;
> +
> +		/* Prior to the BATCH_SUBMIT ioctl all accessed BOs were
> +		 * treated as exclusive.
> +		 */
> +		if (ref_stride == sizeof(u32))
> +			ref.flags = PANFROST_BO_REF_EXCLUSIVE;
> +
> +		if ((ref.flags & ~PANFROST_BO_REF_ALLOWED_FLAGS))
> +			return -EINVAL;
> +
> +		job->bos[i] = drm_gem_object_lookup(file_priv, ref.handle);
> +		if (!job->bos[i])
> +			return -EINVAL;
> +
> +		job->bo_flags[i] = ref.flags;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +panfrost_get_job_in_syncs(struct drm_file *file_priv,
> +			  u64 refs, u32 ref_stride,
> +			  u32 count, struct panfrost_job *job)
> +{
> +	const void __user *in = u64_to_user_ptr(refs);
> +	unsigned int i;
> +	int ret;
> +
> +	if (!count)
> +		return 0;
> +
> +	for (i = 0; i < count; i++) {
> +		struct drm_panfrost_syncobj_ref ref = { };
> +		struct dma_fence *fence;
> +
> +		ret = copy_struct_from_user(&ref, sizeof(ref),
> +					    in + (i * ref_stride),
> +					    ref_stride);
> +		if (ret)
> +			return ret;
> +
> +		if (ref.pad)
> +			return -EINVAL;
> +
> +		ret = drm_syncobj_find_fence(file_priv, ref.handle, ref.point,
> +					     0, &fence);
> +		if (ret)
> +			return ret;
> +
> +		ret = drm_sched_job_add_dependency(&job->base, fence);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +struct panfrost_job_out_sync {
> +	struct drm_syncobj *syncobj;
> +	struct dma_fence_chain *chain;
> +	u64 point;
> +};
> +
> +static void
> +panfrost_put_job_out_syncs(struct panfrost_job_out_sync *out_syncs, u32 count)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < count; i++) {
> +		if (!out_syncs[i].syncobj)
> +			break;
> +
> +		drm_syncobj_put(out_syncs[i].syncobj);
> +		kvfree(out_syncs[i].chain);
> +	}
> +
> +	kvfree(out_syncs);
> +}
> +
> +static struct panfrost_job_out_sync *
> +panfrost_get_job_out_syncs(struct drm_file *file_priv,
> +			   u64 refs, u32 ref_stride,
> +			   u32 count)
> +{
> +	void __user *in = u64_to_user_ptr(refs);
> +	struct panfrost_job_out_sync *out_syncs;
> +	unsigned int i;
> +	int ret;
> +
> +	if (!count)
> +		return NULL;
> +
> +	/* If the syncobj ref_stride == sizeof(u32) we are called from the
> +	 * old submit ioctl() which only accepted one out syncobj. In that
> +	 * case the syncobj handle is passed directly through the
> +	 * ->out_syncs field, so let's make sure the refs fits in a u32.
> +	 */
> +	if (ref_stride == sizeof(u32) &&
> +	    (count != 1 || refs > UINT_MAX))
> +		return ERR_PTR(-EINVAL);
> +
> +	out_syncs = kvmalloc_array(count, sizeof(*out_syncs),
> +				   GFP_KERNEL | __GFP_ZERO);
> +	if (!out_syncs)
> +		return ERR_PTR(-ENOMEM);
> +
> +	for (i = 0; i < count; i++) {
> +		struct drm_panfrost_syncobj_ref ref = { };
> +
> +		if (ref_stride == sizeof(u32)) {
> +			/* Special case for the old submit wrapper: in that
> +			 * case there's only one out_sync, and the syncobj
> +			 * handle is passed directly in the out_syncs field.
> +			 */
> +			ref.handle = refs;
> +		} else {
> +			ret = copy_struct_from_user(&ref, sizeof(ref),
> +						    in + (i * ref_stride),
> +						    ref_stride);
> +			if (ret)
> +				goto err_free_out_syncs;
> +		}
> +
> +		if (ref.pad) {
> +			ret = -EINVAL;
> +			goto err_free_out_syncs;
> +		}
> +
> +		out_syncs[i].syncobj = drm_syncobj_find(file_priv, ref.handle);
> +		if (!out_syncs[i].syncobj) {
> +			ret = -ENODEV;
> +			goto err_free_out_syncs;
> +		}
> +
> +		out_syncs[i].point = ref.point;
> +		if (!out_syncs[i].point)
> +			continue;
> +
> +		out_syncs[i].chain = kmalloc(sizeof(*out_syncs[i].chain),
> +					     GFP_KERNEL);
> +		if (!out_syncs[i].chain) {
> +			ret = -ENOMEM;
> +			goto err_free_out_syncs;
> +		}
> +	}
> +
> +	return out_syncs;
> +
> +err_free_out_syncs:
> +	panfrost_put_job_out_syncs(out_syncs, count);
> +	return ERR_PTR(ret);
> +}
> +
> +static void
> +panfrost_set_job_out_fence(struct panfrost_job_out_sync *out_syncs,
> +			   unsigned int count, struct dma_fence *fence)
> +{
> +	unsigned int i;
> +
> +	for (i = 0; i < count; i++) {
> +		if (out_syncs[i].chain) {
> +			drm_syncobj_add_point(out_syncs[i].syncobj,
> +					      out_syncs[i].chain,
> +					      fence, out_syncs[i].point);
> +			out_syncs[i].chain = NULL;
> +		} else {
> +			drm_syncobj_replace_fence(out_syncs[i].syncobj,
> +						  fence);
> +		}
> +	}
> +}
> +
> +struct panfrost_submit_ioctl_version_info {
> +	u32 job_stride;
> +	u32 bo_ref_stride;
> +	u32 syncobj_ref_stride;
> +};
> +
> +static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
> +	/* SUBMIT */
> +	[0] = { 0, 4, 4 },
> +
> +	/* BATCH_SUBMIT v1 */
> +	[1] = { 48, 8, 16 },
> +};
> +
> +#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
> +
> +static int
> +panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> +		    struct panfrost_submitqueue *queue,
> +		    const struct drm_panfrost_job *args,
> +		    u32 version)
> +{
> +	struct panfrost_device *pfdev = dev->dev_private;
> +	struct panfrost_job_out_sync *out_syncs;
> +	u32 bo_stride, syncobj_stride;
> +	struct panfrost_job *job;
> +	int ret, slot;
> +
> +	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
> +		return -EINVAL;
> +
> +	if (!args->head)
> +		return -EINVAL;
> +
> +	bo_stride = submit_versions[version].bo_ref_stride;
> +	syncobj_stride = submit_versions[version].syncobj_ref_stride;
> +
> +	job = kzalloc(sizeof(*job), GFP_KERNEL);
> +	if (!job)
> +		return -ENOMEM;
> +
> +	kref_init(&job->refcount);
> +
> +	job->pfdev = pfdev;
> +	job->jc = args->head;
> +	job->requirements = args->requirements;
> +	job->flush_id = panfrost_gpu_get_latest_flush_id(pfdev);
> +	job->file_priv = file_priv->driver_priv;
> +
> +	slot = panfrost_job_get_slot(job);
> +
> +	ret = drm_sched_job_init(&job->base,
> +				 &queue->sched_entity[slot],
> +				 NULL);
> +	if (ret)
> +		goto out_put_job;
> +
> +	ret = panfrost_get_job_in_syncs(file_priv,
> +					args->in_syncs,
> +					syncobj_stride,
> +					args->in_sync_count,
> +					job);
> +	if (ret)
> +		goto out_cleanup_job;
> +
> +	out_syncs = panfrost_get_job_out_syncs(file_priv,
> +					       args->out_syncs,
> +					       syncobj_stride,
> +					       args->out_sync_count);
> +	if (IS_ERR(out_syncs)) {
> +		ret = PTR_ERR(out_syncs);
> +		goto out_cleanup_job;
> +	}
> +
> +	ret = panfrost_get_job_bos(file_priv, args->bos, bo_stride,
> +				   args->bo_count, job);
> +	if (ret)
> +		goto out_put_outsyncs;
> +
> +	ret = panfrost_get_job_mappings(file_priv, job);
> +	if (ret)
> +		goto out_put_outsyncs;
> +
> +	ret = panfrost_job_push(queue, job);
> +	if (ret)
> +		goto out_put_outsyncs;
> +
> +	panfrost_set_job_out_fence(out_syncs, args->out_sync_count,
> +				   job->render_done_fence);
> +
> +out_put_outsyncs:
> +	panfrost_put_job_out_syncs(out_syncs, args->out_sync_count);
> +out_cleanup_job:
> +	if (ret)
> +		drm_sched_job_cleanup(&job->base);
> +out_put_job:
> +	panfrost_job_put(job);
> +	return ret;
> +}
> +
> +static int
> +panfrost_ioctl_submit(struct drm_device *dev, void *data,
> +		      struct drm_file *file)
> +{
> +	struct drm_panfrost_submit *args = data;
> +	struct drm_panfrost_job job_args = {
> +		.head = args->jc,
> +		.bos = args->bo_handles,
> +		.in_syncs = args->in_syncs,
> +
> +		/* We are abusing .out_syncs and passing the handle directly
> +		 * instead of a pointer to a user u32 array, but
> +		 * panfrost_job_submit() knows about it, so it's fine.
> +		 */
> +		.out_syncs = args->out_sync,
> +		.in_sync_count = args->in_sync_count,
> +		.out_sync_count = args->out_sync > 0 ? 1 : 0,
> +		.bo_count = args->bo_handle_count,
> +		.requirements = args->requirements
> +	};
> +	struct panfrost_submitqueue *queue;
> +	int ret;
> +
> +	queue = panfrost_submitqueue_get(file->driver_priv, 0);
> +	if (IS_ERR(queue))
> +		return PTR_ERR(queue);
> +
> +	ret = panfrost_submit_job(dev, file, queue, &job_args, 0);
> +	panfrost_submitqueue_put(queue);
> +
> +	return ret;
> +}
> +
> +static int
> +panfrost_ioctl_batch_submit(struct drm_device *dev, void *data,
> +			    struct drm_file *file_priv)
> +{
> +	struct drm_panfrost_batch_submit *args = data;
> +	void __user *jobs_args = u64_to_user_ptr(args->jobs);
> +	struct panfrost_submitqueue *queue;
> +	u32 version = args->version;
> +	u32 job_stride;
> +	unsigned int i;
> +	int ret = 0;
> +
> +	/* Make sure the SUBMIT_BATCH_VERSION and desc sizes are synced */
> +	BUILD_BUG_ON(sizeof(struct drm_panfrost_job) !=
> +		     submit_versions[PANFROST_SUBMIT_BATCH_VERSION].job_stride);
> +	BUILD_BUG_ON(sizeof(struct drm_panfrost_bo_ref) !=
> +		     submit_versions[PANFROST_SUBMIT_BATCH_VERSION].bo_ref_stride);
> +	BUILD_BUG_ON(sizeof(struct drm_panfrost_syncobj_ref) !=
> +		     submit_versions[PANFROST_SUBMIT_BATCH_VERSION].syncobj_ref_stride);
> +
> +	/* Version 0 doesn't exists (it's reserved for the SUBMIT ioctl) */
> +	if (!version)
> +		return -EINVAL;
> +
> +	/* If the version specified is bigger than what we currently support,
> +	 * pick the last supported version and let copy_struct_from_user()
> +	 * check that any extra job, bo_ref and syncobj_ref fields are zeroed.
> +	 */
> +	if (version >= ARRAY_SIZE(submit_versions))
> +		version = ARRAY_SIZE(submit_versions) - 1;
> +
> +	queue = panfrost_submitqueue_get(file_priv->driver_priv, args->queue);
> +	if (IS_ERR(queue))
> +		return PTR_ERR(queue);
> +
> +	job_stride = submit_versions[version].job_stride;
> +	for (i = 0; i < args->job_count; i++) {
> +		struct drm_panfrost_job job_args = { };
> +
> +		ret = copy_struct_from_user(&job_args, sizeof(job_args),
> +					    jobs_args + (i * job_stride),
> +					    job_stride);
> +		if (!ret) {
> +			ret = panfrost_submit_job(dev, file_priv, queue,
> +						  &job_args, version);
> +		}
> +
> +		if (ret) {
> +			args->fail_idx = i;
> +			break;
> +		}
> +	}
> +
> +	panfrost_submitqueue_put(queue);
> +	return ret;
> +}
> +
>  int panfrost_unstable_ioctl_check(void)
>  {
>  	if (!unstable_ioctls)
> @@ -590,6 +799,7 @@ static const struct drm_ioctl_desc panfrost_drm_driver_ioctls[] = {
>  	PANFROST_IOCTL(MADVISE,		madvise,	DRM_RENDER_ALLOW),
>  	PANFROST_IOCTL(CREATE_SUBMITQUEUE, create_submitqueue, DRM_RENDER_ALLOW),
>  	PANFROST_IOCTL(DESTROY_SUBMITQUEUE, destroy_submitqueue, DRM_RENDER_ALLOW),
> +	PANFROST_IOCTL(BATCH_SUBMIT,	batch_submit,	DRM_RENDER_ALLOW),
>  };
>  
>  DEFINE_DRM_GEM_FOPS(panfrost_drm_driver_fops);
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 62e010307afc..0367cee8f6df 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -254,7 +254,9 @@ static int panfrost_acquire_object_fences(struct panfrost_job *job)
>  				return ret;
>  		}
>  
> -		/* panfrost always uses write mode in its current uapi */
> +		if ((job->bo_flags[i] & PANFROST_BO_REF_NO_IMPLICIT_DEP) && !exclusive)
> +			continue;
> +
>  		ret = drm_sched_job_add_implicit_dependencies(&job->base, job->bos[i],
>  							      exclusive);
>  		if (ret)
> diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> index 66e1c2f1ff4b..5e3f8a344f41 100644
> --- a/include/uapi/drm/panfrost_drm.h
> +++ b/include/uapi/drm/panfrost_drm.h
> @@ -23,6 +23,7 @@ extern "C" {
>  #define DRM_PANFROST_MADVISE			0x08
>  #define DRM_PANFROST_CREATE_SUBMITQUEUE		0x09
>  #define DRM_PANFROST_DESTROY_SUBMITQUEUE	0x0a
> +#define DRM_PANFROST_BATCH_SUBMIT		0x0b
>  
>  #define DRM_IOCTL_PANFROST_SUBMIT		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_SUBMIT, struct drm_panfrost_submit)
>  #define DRM_IOCTL_PANFROST_WAIT_BO		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_WAIT_BO, struct drm_panfrost_wait_bo)
> @@ -33,6 +34,7 @@ extern "C" {
>  #define DRM_IOCTL_PANFROST_MADVISE		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_MADVISE, struct drm_panfrost_madvise)
>  #define DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_CREATE_SUBMITQUEUE, struct drm_panfrost_create_submitqueue)
>  #define DRM_IOCTL_PANFROST_DESTROY_SUBMITQUEUE	DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_DESTROY_SUBMITQUEUE, __u32)
> +#define DRM_IOCTL_PANFROST_BATCH_SUBMIT		DRM_IOWR(DRM_COMMAND_BASE + DRM_PANFROST_BATCH_SUBMIT, struct drm_panfrost_batch_submit)
>  
>  /*
>   * Unstable ioctl(s): only exposed when the unsafe unstable_ioctls module
> @@ -243,9 +245,99 @@ struct drm_panfrost_create_submitqueue {
>  	__u32 id;	/* out, identifier */
>  };
>  
> +/* Syncobj reference passed at job submission time to encode explicit
> + * input/output fences.
> + */
> +struct drm_panfrost_syncobj_ref {
> +	/** Syncobj handle */
> +	__u32 handle;
> +
> +	/** Padding field, must be set to 0 */
> +	__u32 pad;
> +
> +	/**
> +	 * For timeline syncobjs, the point on the timeline the reference
> +	 * points to. 0 for the last point.
> +	 * Must be set to 0 for non-timeline syncobjs
> +	 */
> +	__u64 point;
> +};
> +
>  /* Exclusive (AKA write) access to the BO */
>  #define PANFROST_BO_REF_EXCLUSIVE	0x1
>  
> +/* Disable the implicit dependency on the BO fence */
> +#define PANFROST_BO_REF_NO_IMPLICIT_DEP	0x2
> +
> +/* Describes a BO referenced by a job and the type of access. */
> +struct drm_panfrost_bo_ref {
> +	/** A GEM handle */
> +	__u32 handle;
> +
> +	/** A combination of PANFROST_BO_REF_x flags */
> +	__u32 flags;
> +};
> +
> +/* Describes a GPU job and the resources attached to it. */
> +struct drm_panfrost_job {
> +	/** GPU pointer to the head of the job chain. */
> +	__u64 head;
> +
> +	/**
> +	 * Array of drm_panfrost_bo_ref objects describing the BOs referenced
> +	 * by this job.
> +	 */
> +	__u64 bos;
> +
> +	/**
> +	 * Arrays of drm_panfrost_syncobj_ref objects describing the input
> +	 * and output fences.
> +	 */
> +	__u64 in_syncs;
> +	__u64 out_syncs;
> +
> +	/** Syncobj reference array sizes. */
> +	__u32 in_sync_count;
> +	__u32 out_sync_count;
> +
> +	/** BO reference array size. */
> +	__u32 bo_count;
> +
> +	/** Combination of PANFROST_JD_REQ_* flags. */
> +	__u32 requirements;
> +};
> +
> +#define PANFROST_SUBMIT_BATCH_VERSION	1
> +
> +/* Used to submit multiple jobs in one call */
> +struct drm_panfrost_batch_submit {
> +	/**
> +	 * Always set to PANFROST_SUBMIT_BATCH_VERSION. This is used to let the
> +	 * kernel know about the size of the various structs passed to the
> +	 * BATCH_SUBMIT ioctl.
> +	 */
> +	__u32 version;
> +
> +	/** Number of jobs to submit. */
> +	__u32 job_count;
> +
> +	/* Pointer to a job array. */
> +	__u64 jobs;
> +
> +	/**
> +	 * ID of the queue to submit those jobs to. 0 is the default
> +	 * submit queue and should always exists. If you need a dedicated
> +	 * queue, create it with DRM_IOCTL_PANFROST_CREATE_SUBMITQUEUE.
> +	 */
> +	__u32 queue;
> +
> +	/**
> +	 * If the submission fails, this encodes the index of the first job
> +	 * that failed.
> +	 */
> +	__u32 fail_idx;
> +};
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 6/8] drm/panfrost: Support synchronization jobs
  2021-09-30 19:09 ` [PATCH v5 6/8] drm/panfrost: Support synchronization jobs Boris Brezillon
@ 2021-10-04 11:30   ` Steven Price
  2021-10-04 12:24     ` Boris Brezillon
  0 siblings, 1 reply; 13+ messages in thread
From: Steven Price @ 2021-10-04 11:30 UTC (permalink / raw)
  To: Boris Brezillon, Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig,
	Robin Murphy
  Cc: dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König

On 30/09/2021 20:09, Boris Brezillon wrote:
> Sometimes, all the user wants to do is add a synchronization point.
> Userspace can already do that by submitting a NULL job, but this implies
> submitting something to the GPU when we could simply skip the job and
> signal the done fence directly.
> 
> v5:
> * New patch
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>

I had thought we would be fine without kbase's "dependency only atom"
because we don't have the fan-{in,out} problems that kbase's atoms
produce. But if we're ending up with real hardware NULL jobs then this
is clearly better.

A couple of minor points below, but as far as I can tell this is
functionally correct.

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++--
>  drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
>  include/uapi/drm/panfrost_drm.h         | 7 +++++++
>  3 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 30dc158d56e6..89a0c484310c 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
>  	[1] = { 48, 8, 16 },
>  };
>  
> -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
> +#define PANFROST_JD_ALLOWED_REQS \
> +	(PANFROST_JD_REQ_FS | \
> +	 PANFROST_JD_REQ_DEP_ONLY)
>  
>  static int
>  panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
>  	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
>  		return -EINVAL;
>  
> -	if (!args->head)
> +	/* If this is a dependency-only job, the job chain head should be NULL,
> +	 * otherwise it should be non-NULL.
> +	 */
> +	if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY))

NIT: There's confusion over NULL vs 0 here - the code is correct
(args->head is a u64 and not a pointer for a kernel) but the comment
makes it seem like it should be a pointer.

We could side-step the mismatch by rewriting as below:

  !args->head == !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)

Although I'm not convinced whether that's more readable or not!

>  		return -EINVAL;
>  
>  	bo_stride = submit_versions[version].bo_ref_stride;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 0367cee8f6df..6d8706d4a096 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>  	u64 jc_head = job->jc;
>  	int ret;
>  
> +	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) {
> +		/* Nothing to execute, signal the fence directly. */
> +		dma_fence_signal_locked(job->done_fence);
> +		return;
> +	}
> +

It took me a while to convince myself that the reference counting for
the PM reference is correct. Before panfrost_job_hw_submit() always
returned with an extra reference, but now we have a case which doesn't.
AFAICT this is probably fine because we dereference on the path where
the hardware has completed the job (which obviously won't happen here).
But I'm still a bit uneasy whether the reference counts are always correct.

Steve

>  	panfrost_devfreq_record_busy(&pfdev->pfdevfreq);
>  
>  	ret = pm_runtime_get_sync(pfdev->dev);
> diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> index 5e3f8a344f41..b9df066970f6 100644
> --- a/include/uapi/drm/panfrost_drm.h
> +++ b/include/uapi/drm/panfrost_drm.h
> @@ -46,6 +46,13 @@ extern "C" {
>  #define DRM_IOCTL_PANFROST_PERFCNT_DUMP		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
>  
>  #define PANFROST_JD_REQ_FS (1 << 0)
> +
> +/*
> + * Dependency only job. The job chain head should be set to 0 when this flag
> + * is set.
> + */
> +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1)
> +
>  /**
>   * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
>   * engine.
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 6/8] drm/panfrost: Support synchronization jobs
  2021-10-04 11:30   ` Steven Price
@ 2021-10-04 12:24     ` Boris Brezillon
  2021-10-04 13:05       ` Steven Price
  0 siblings, 1 reply; 13+ messages in thread
From: Boris Brezillon @ 2021-10-04 12:24 UTC (permalink / raw)
  To: Steven Price
  Cc: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Robin Murphy,
	dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König

On Mon, 4 Oct 2021 12:30:42 +0100
Steven Price <steven.price@arm.com> wrote:

> On 30/09/2021 20:09, Boris Brezillon wrote:
> > Sometimes, all the user wants to do is add a synchronization point.
> > Userspace can already do that by submitting a NULL job, but this implies
> > submitting something to the GPU when we could simply skip the job and
> > signal the done fence directly.
> > 
> > v5:
> > * New patch
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>  
> 
> I had thought we would be fine without kbase's "dependency only atom"
> because we don't have the fan-{in,out} problems that kbase's atoms
> produce. But if we're ending up with real hardware NULL jobs then this
> is clearly better.
> 
> A couple of minor points below, but as far as I can tell this is
> functionally correct.
> 
> Reviewed-by: Steven Price <steven.price@arm.com>
> 
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_drv.c | 9 +++++++--
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 6 ++++++
> >  include/uapi/drm/panfrost_drm.h         | 7 +++++++
> >  3 files changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > index 30dc158d56e6..89a0c484310c 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> > @@ -542,7 +542,9 @@ static const struct panfrost_submit_ioctl_version_info submit_versions[] = {
> >  	[1] = { 48, 8, 16 },
> >  };
> >  
> > -#define PANFROST_JD_ALLOWED_REQS PANFROST_JD_REQ_FS
> > +#define PANFROST_JD_ALLOWED_REQS \
> > +	(PANFROST_JD_REQ_FS | \
> > +	 PANFROST_JD_REQ_DEP_ONLY)
> >  
> >  static int
> >  panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> > @@ -559,7 +561,10 @@ panfrost_submit_job(struct drm_device *dev, struct drm_file *file_priv,
> >  	if (args->requirements & ~PANFROST_JD_ALLOWED_REQS)
> >  		return -EINVAL;
> >  
> > -	if (!args->head)
> > +	/* If this is a dependency-only job, the job chain head should be NULL,
> > +	 * otherwise it should be non-NULL.
> > +	 */
> > +	if ((args->head != 0) != !(args->requirements & PANFROST_JD_REQ_DEP_ONLY))  
> 
> NIT: There's confusion over NULL vs 0 here - the code is correct
> (args->head is a u64 and not a pointer for a kernel) but the comment
> makes it seem like it should be a pointer.
> 
> We could side-step the mismatch by rewriting as below:
> 
>   !args->head == !(args->requirements & PANFROST_JD_REQ_DEP_ONLY)
> 
> Although I'm not convinced whether that's more readable or not!

I'll replace 'NULL' by 'zero' in the comment.

> 
> >  		return -EINVAL;
> >  
> >  	bo_stride = submit_versions[version].bo_ref_stride;
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 0367cee8f6df..6d8706d4a096 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -192,6 +192,12 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
> >  	u64 jc_head = job->jc;
> >  	int ret;
> >  
> > +	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY) {
> > +		/* Nothing to execute, signal the fence directly. */
> > +		dma_fence_signal_locked(job->done_fence);
> > +		return;
> > +	}
> > +  
> 
> It took me a while to convince myself that the reference counting for
> the PM reference is correct. Before panfrost_job_hw_submit() always
> returned with an extra reference, but now we have a case which doesn't.
> AFAICT this is probably fine because we dereference on the path where
> the hardware has completed the job (which obviously won't happen here).
> But I'm still a bit uneasy whether the reference counts are always correct.

I think it is. We only decrement the PM count in the interrupt handler
path, and as you pointed out, we won't reach that path here. But if
that helps, I can move this if to `panfrost_job_run()`:

	/* Nothing to execute, signal the fence directly. */
	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY)
		dma_fence_signal_locked(job->done_fence);
	else
		panfrost_job_hw_submit(job, slot);

> 
> Steve
> 
> >  	panfrost_devfreq_record_busy(&pfdev->pfdevfreq);
> >  
> >  	ret = pm_runtime_get_sync(pfdev->dev);
> > diff --git a/include/uapi/drm/panfrost_drm.h b/include/uapi/drm/panfrost_drm.h
> > index 5e3f8a344f41..b9df066970f6 100644
> > --- a/include/uapi/drm/panfrost_drm.h
> > +++ b/include/uapi/drm/panfrost_drm.h
> > @@ -46,6 +46,13 @@ extern "C" {
> >  #define DRM_IOCTL_PANFROST_PERFCNT_DUMP		DRM_IOW(DRM_COMMAND_BASE + DRM_PANFROST_PERFCNT_DUMP, struct drm_panfrost_perfcnt_dump)
> >  
> >  #define PANFROST_JD_REQ_FS (1 << 0)
> > +
> > +/*
> > + * Dependency only job. The job chain head should be set to 0 when this flag
> > + * is set.
> > + */
> > +#define PANFROST_JD_REQ_DEP_ONLY (1 << 1)
> > +
> >  /**
> >   * struct drm_panfrost_submit - ioctl argument for submitting commands to the 3D
> >   * engine.
> >   
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v5 6/8] drm/panfrost: Support synchronization jobs
  2021-10-04 12:24     ` Boris Brezillon
@ 2021-10-04 13:05       ` Steven Price
  0 siblings, 0 replies; 13+ messages in thread
From: Steven Price @ 2021-10-04 13:05 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig, Robin Murphy,
	dri-devel, Daniel Vetter, Jason Ekstrand, Daniel Stone,
	Christian König

On 04/10/2021 13:24, Boris Brezillon wrote:
> On Mon, 4 Oct 2021 12:30:42 +0100
> Steven Price <steven.price@arm.com> wrote:
[...]
>>
>> It took me a while to convince myself that the reference counting for
>> the PM reference is correct. Before panfrost_job_hw_submit() always
>> returned with an extra reference, but now we have a case which doesn't.
>> AFAICT this is probably fine because we dereference on the path where
>> the hardware has completed the job (which obviously won't happen here).
>> But I'm still a bit uneasy whether the reference counts are always correct.
> 
> I think it is. We only decrement the PM count in the interrupt handler
> path, and as you pointed out, we won't reach that path here. But if
> that helps, I can move this if to `panfrost_job_run()`:
> 
> 	/* Nothing to execute, signal the fence directly. */
> 	if (job->requirements & PANFROST_JD_REQ_DEP_ONLY)
> 		dma_fence_signal_locked(job->done_fence);
> 	else
> 		panfrost_job_hw_submit(job, slot);
> 

I think that would make it a bit more readable - really
panfrost_job_hw_submit() needs a bit of TLC, I did post a patch ages
ago[1] but it didn't get any feedback and then I forgot about it. Things
have moved on so it would need a little bit of rework.

Thanks,

Steve

[1]
https://lore.kernel.org/dri-devel/20210512152419.30003-1-steven.price@arm.com/

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-10-04 13:05 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30 19:09 [PATCH v5 0/8] drm/panfrost: drm/panfrost: Add a new submit ioctl Boris Brezillon
2021-09-30 19:09 ` [PATCH v5 1/8] drm/panfrost: Pass a job to panfrost_{acquire, attach}_object_fences() Boris Brezillon
2021-09-30 19:09 ` [PATCH v5 2/8] drm/panfrost: Move the mappings collection out of panfrost_lookup_bos() Boris Brezillon
2021-09-30 19:09 ` [PATCH v5 3/8] drm/panfrost: Add BO access flags to relax dependencies between jobs Boris Brezillon
2021-09-30 19:09 ` [PATCH v5 4/8] drm/panfrost: Add the ability to create submit queues Boris Brezillon
2021-09-30 19:09 ` [PATCH v5 5/8] drm/panfrost: Add a new ioctl to submit batches Boris Brezillon
2021-10-04 11:30   ` Steven Price
2021-09-30 19:09 ` [PATCH v5 6/8] drm/panfrost: Support synchronization jobs Boris Brezillon
2021-10-04 11:30   ` Steven Price
2021-10-04 12:24     ` Boris Brezillon
2021-10-04 13:05       ` Steven Price
2021-09-30 19:09 ` [PATCH v5 7/8] drm/panfrost: Advertise the SYNCOBJ_TIMELINE feature Boris Brezillon
2021-09-30 19:09 ` [PATCH v5 8/8] drm/panfrost: Bump minor version to reflect the feature additions Boris Brezillon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.