dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
@ 2020-08-28 10:40 Thierry Reding
  2020-08-28 10:40 ` [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync() Thierry Reding
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

Hi,

This series implements a new IOCTL to submit push buffers that can
optionally return a sync FD or sync object to userspace. This is useful
in cases where userspace wants to synchronize operations between the GPU
and another driver (such as KMS for display). Among other things this
allows extensions such as eglDupNativeFenceFDANDROID to be implemented.

Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
sync objects to be passed rather than only sync FDs. It also allows any
number of sync FDs/objects to be passed in or emitted. I think those are
useful features, but I left them in a separate patch in case everybody
else thinks that this won't be needed. If we decide to merge the new ABI
then patch 4 should be squashed into patch 3.

The corresponding userspace changes can be found here:

  libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
  mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/

I've verified that this works with kmscube's --atomic mode and Weston.

Thierry

Thierry Reding (6):
  drm/nouveau: Split nouveau_fence_sync()
  drm/nouveau: Add nouveau_fence_ref()
  drm/nouveau: Support fence FDs at kickoff
  drm/nouveau: Support sync FDs and syncobjs
  drm/nouveau: Support DMA fence arrays
  drm/nouveau: Allow zero pushbuffer submits

 drivers/gpu/drm/nouveau/dispnv04/crtc.c |   4 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c    |  38 ++-
 drivers/gpu/drm/nouveau/nouveau_bo.h    |   2 +
 drivers/gpu/drm/nouveau/nouveau_drm.c   |   1 +
 drivers/gpu/drm/nouveau/nouveau_fence.c |  90 +++---
 drivers/gpu/drm/nouveau/nouveau_fence.h |   3 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c   | 402 ++++++++++++++++++------
 drivers/gpu/drm/nouveau/nouveau_gem.h   |   2 +
 include/uapi/drm/nouveau_drm.h          |  23 ++
 9 files changed, 410 insertions(+), 155 deletions(-)

-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync()
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
@ 2020-08-28 10:40 ` Thierry Reding
  2020-08-28 10:40 ` [PATCH 2/6] drm/nouveau: Add nouveau_fence_ref() Thierry Reding
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

Turn nouveau_fence_sync() into a low-level helper that adds fence waits
to the channel command stream. The new nouveau_bo_sync() helper replaces
the previous nouveau_fence_sync() implementation. It passes each of the
buffer object's fences to nouveau_fence_sync() in turn.

This provides more fine-grained control over fences which is needed by
subsequent patches for sync fd support.

Heavily based on work by Lauri Peltonen <lpeltonen@nvidia.com>.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/nouveau/dispnv04/crtc.c |  4 +-
 drivers/gpu/drm/nouveau/nouveau_bo.c    | 38 +++++++++++++-
 drivers/gpu/drm/nouveau/nouveau_bo.h    |  2 +
 drivers/gpu/drm/nouveau/nouveau_fence.c | 68 +++++--------------------
 drivers/gpu/drm/nouveau/nouveau_fence.h |  2 +-
 drivers/gpu/drm/nouveau/nouveau_gem.c   |  2 +-
 6 files changed, 57 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 6416b6907aeb..4af702d0d6bf 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -1117,7 +1117,7 @@ nv04_page_flip_emit(struct nouveau_channel *chan,
 	spin_unlock_irqrestore(&dev->event_lock, flags);
 
 	/* Synchronize with the old framebuffer */
-	ret = nouveau_fence_sync(old_bo, chan, false, false);
+	ret = nouveau_bo_sync(old_bo, chan, false, false);
 	if (ret)
 		goto fail;
 
@@ -1183,7 +1183,7 @@ nv04_crtc_page_flip(struct drm_crtc *crtc, struct drm_framebuffer *fb,
 		goto fail_unpin;
 
 	/* synchronise rendering channel with the kernel's channel */
-	ret = nouveau_fence_sync(new_bo, chan, false, true);
+	ret = nouveau_bo_sync(new_bo, chan, false, true);
 	if (ret) {
 		ttm_bo_unreserve(&new_bo->bo);
 		goto fail_unpin;
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 9140387f30dc..25ceabfa741c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -574,6 +574,42 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
 					PAGE_SIZE, DMA_FROM_DEVICE);
 }
 
+int
+nouveau_bo_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
+		bool exclusive, bool intr)
+{
+	struct dma_resv *resv = nvbo->bo.base.resv;
+	struct dma_resv_list *fobj;
+	struct dma_fence *fence;
+	int ret = 0, i;
+
+	if (!exclusive) {
+		ret = dma_resv_reserve_shared(resv, 1);
+		if (ret < 0)
+			return ret;
+	}
+
+	fobj = dma_resv_get_list(resv);
+	fence = dma_resv_get_excl(resv);
+
+	if (fence && (!exclusive || !fobj || !fobj->shared_count))
+		return nouveau_fence_sync(fence, chan, intr);
+
+	if (!exclusive || !fobj)
+		return ret;
+
+	for (i = 0; i < fobj->shared_count && !ret; ++i) {
+		fence = rcu_dereference_protected(fobj->shared[i],
+						  dma_resv_held(resv));
+
+		ret = nouveau_fence_sync(fence, chan, intr);
+		if (ret < 0)
+			break;
+	}
+
+	return ret;
+}
+
 int
 nouveau_bo_validate(struct nouveau_bo *nvbo, bool interruptible,
 		    bool no_wait_gpu)
@@ -717,7 +753,7 @@ nouveau_bo_move_m2mf(struct ttm_buffer_object *bo, int evict, bool intr,
 	}
 
 	mutex_lock_nested(&cli->mutex, SINGLE_DEPTH_NESTING);
-	ret = nouveau_fence_sync(nouveau_bo(bo), chan, true, intr);
+	ret = nouveau_bo_sync(nouveau_bo(bo), chan, true, intr);
 	if (ret == 0) {
 		ret = drm->ttm.move(chan, bo, &bo->mem, new_reg);
 		if (ret == 0) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.h b/drivers/gpu/drm/nouveau/nouveau_bo.h
index aecb7481df0d..93d1706619a1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.h
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.h
@@ -96,6 +96,8 @@ int  nouveau_bo_validate(struct nouveau_bo *, bool interruptible,
 			 bool no_wait_gpu);
 void nouveau_bo_sync_for_device(struct nouveau_bo *nvbo);
 void nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo);
+int nouveau_bo_sync(struct nouveau_bo *nvbo, struct nouveau_channel *channel,
+		    bool exclusive, bool intr);
 
 /* TODO: submit equivalent to TTM generic API upstream? */
 static inline void __iomem *
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index e5dcbf67de7e..8e7550553584 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -339,66 +339,26 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 }
 
 int
-nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool exclusive, bool intr)
+nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
+		   bool intr)
 {
 	struct nouveau_fence_chan *fctx = chan->fence;
-	struct dma_fence *fence;
-	struct dma_resv *resv = nvbo->bo.base.resv;
-	struct dma_resv_list *fobj;
+	struct nouveau_channel *prev = NULL;
 	struct nouveau_fence *f;
-	int ret = 0, i;
-
-	if (!exclusive) {
-		ret = dma_resv_reserve_shared(resv, 1);
+	bool must_wait = true;
+	int ret = 0;
 
-		if (ret)
-			return ret;
+	f = nouveau_local_fence(fence, chan->drm);
+	if (f) {
+		rcu_read_lock();
+		prev = rcu_dereference(f->channel);
+		if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
+			must_wait = false;
+		rcu_read_unlock();
 	}
 
-	fobj = dma_resv_get_list(resv);
-	fence = dma_resv_get_excl(resv);
-
-	if (fence && (!exclusive || !fobj || !fobj->shared_count)) {
-		struct nouveau_channel *prev = NULL;
-		bool must_wait = true;
-
-		f = nouveau_local_fence(fence, chan->drm);
-		if (f) {
-			rcu_read_lock();
-			prev = rcu_dereference(f->channel);
-			if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
-				must_wait = false;
-			rcu_read_unlock();
-		}
-
-		if (must_wait)
-			ret = dma_fence_wait(fence, intr);
-
-		return ret;
-	}
-
-	if (!exclusive || !fobj)
-		return ret;
-
-	for (i = 0; i < fobj->shared_count && !ret; ++i) {
-		struct nouveau_channel *prev = NULL;
-		bool must_wait = true;
-
-		fence = rcu_dereference_protected(fobj->shared[i],
-						dma_resv_held(resv));
-
-		f = nouveau_local_fence(fence, chan->drm);
-		if (f) {
-			rcu_read_lock();
-			prev = rcu_dereference(f->channel);
-			if (prev && (prev == chan || fctx->sync(f, prev, chan) == 0))
-				must_wait = false;
-			rcu_read_unlock();
-		}
-
-		if (must_wait)
-			ret = dma_fence_wait(fence, intr);
-	}
+	if (must_wait)
+		ret = dma_fence_wait(fence, intr);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 4887caa69c65..76cbf0c27a30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -24,7 +24,7 @@ void nouveau_fence_unref(struct nouveau_fence **);
 int  nouveau_fence_emit(struct nouveau_fence *, struct nouveau_channel *);
 bool nouveau_fence_done(struct nouveau_fence *);
 int  nouveau_fence_wait(struct nouveau_fence *, bool lazy, bool intr);
-int  nouveau_fence_sync(struct nouveau_bo *, struct nouveau_channel *, bool exclusive, bool intr);
+int  nouveau_fence_sync(struct dma_fence *, struct nouveau_channel *, bool intr);
 
 struct nouveau_fence_chan {
 	spinlock_t lock;
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 81f111ad3f4f..590e4c1d2e8a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -513,7 +513,7 @@ validate_list(struct nouveau_channel *chan, struct nouveau_cli *cli,
 			return ret;
 		}
 
-		ret = nouveau_fence_sync(nvbo, chan, !!b->write_domains, true);
+		ret = nouveau_bo_sync(nvbo, chan, !!b->write_domains, true);
 		if (unlikely(ret)) {
 			if (ret != -ERESTARTSYS)
 				NV_PRINTK(err, cli, "fail post-validate sync\n");
-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/6] drm/nouveau: Add nouveau_fence_ref()
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
  2020-08-28 10:40 ` [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync() Thierry Reding
@ 2020-08-28 10:40 ` Thierry Reding
  2020-08-28 10:40 ` [PATCH 3/6] drm/nouveau: Support fence FDs at kickoff Thierry Reding
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

This is a simple wrapper that increments the reference count of the
backing DMA fence.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 9 +++++++++
 drivers/gpu/drm/nouveau/nouveau_fence.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 8e7550553584..8530c2684832 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -363,6 +363,15 @@ nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
 	return ret;
 }
 
+struct nouveau_fence *
+nouveau_fence_ref(struct nouveau_fence *fence)
+{
+	if (fence)
+		dma_fence_get(&fence->base);
+
+	return fence;
+}
+
 void
 nouveau_fence_unref(struct nouveau_fence **pfence)
 {
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.h b/drivers/gpu/drm/nouveau/nouveau_fence.h
index 76cbf0c27a30..b8afd4b06445 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.h
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.h
@@ -19,6 +19,7 @@ struct nouveau_fence {
 
 int  nouveau_fence_new(struct nouveau_channel *, bool sysmem,
 		       struct nouveau_fence **);
+struct nouveau_fence *nouveau_fence_ref(struct nouveau_fence *);
 void nouveau_fence_unref(struct nouveau_fence **);
 
 int  nouveau_fence_emit(struct nouveau_fence *, struct nouveau_channel *);
-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/6] drm/nouveau: Support fence FDs at kickoff
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
  2020-08-28 10:40 ` [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync() Thierry Reding
  2020-08-28 10:40 ` [PATCH 2/6] drm/nouveau: Add nouveau_fence_ref() Thierry Reding
@ 2020-08-28 10:40 ` Thierry Reding
  2020-08-28 10:40 ` [PATCH 4/6] drm/nouveau: Support sync FDs and syncobjs Thierry Reding
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

Add a new NOUVEAU_GEM_PUSHBUF2 IOCTL that accepts and emits a sync fence
FD from/to userspace if requested by the corresponding flags.

Based heavily on work by Lauri Peltonen <lpeltonen@nvidia.com>

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_drm.c |  1 +
 drivers/gpu/drm/nouveau/nouveau_gem.c | 79 +++++++++++++++++++++++++--
 drivers/gpu/drm/nouveau/nouveau_gem.h |  2 +
 include/uapi/drm/nouveau_drm.h        | 14 +++++
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 22d246acc5e5..c9cb2648a28b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -1140,6 +1140,7 @@ nouveau_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_PREP, nouveau_gem_ioctl_cpu_prep, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_CPU_FINI, nouveau_gem_ioctl_cpu_fini, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_INFO, nouveau_gem_ioctl_info, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(NOUVEAU_GEM_PUSHBUF2, nouveau_gem_ioctl_pushbuf2, DRM_RENDER_ALLOW),
 };
 
 long
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 590e4c1d2e8a..039f244c0a00 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -24,6 +24,9 @@
  *
  */
 
+#include <linux/file.h>
+#include <linux/sync_file.h>
+
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
 #include "nouveau_fence.h"
@@ -677,24 +680,30 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli,
 	return ret;
 }
 
-int
-nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
-			  struct drm_file *file_priv)
+static int
+__nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
+			    struct drm_nouveau_gem_pushbuf2 *request,
+			    struct drm_file *file_priv)
 {
 	struct nouveau_abi16 *abi16 = nouveau_abi16_get(file_priv);
 	struct nouveau_cli *cli = nouveau_cli(file_priv);
 	struct nouveau_abi16_chan *temp;
 	struct nouveau_drm *drm = nouveau_drm(dev);
-	struct drm_nouveau_gem_pushbuf *req = data;
+	struct drm_nouveau_gem_pushbuf *req = &request->base;
 	struct drm_nouveau_gem_pushbuf_push *push;
 	struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL;
 	struct drm_nouveau_gem_pushbuf_bo *bo;
 	struct nouveau_channel *chan = NULL;
 	struct validate_op op;
 	struct nouveau_fence *fence = NULL;
+	struct dma_fence *prefence = NULL;
 	int i, j, ret = 0;
 	bool do_reloc = false, sync = false;
 
+	/* check for unrecognized flags */
+	if (request->flags & ~NOUVEAU_GEM_PUSHBUF_FLAGS)
+		return -EINVAL;
+
 	if (unlikely(!abi16))
 		return -ENOMEM;
 
@@ -764,6 +773,15 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
 		goto out_prevalid;
 	}
 
+	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_WAIT) {
+		prefence = sync_file_get_fence(request->fence);
+		if (prefence) {
+			ret = nouveau_fence_sync(prefence, chan, true);
+			if (ret < 0)
+				goto out;
+		}
+	}
+
 	/* Apply any relocations that are required */
 	if (do_reloc) {
 		if (!reloc) {
@@ -865,7 +883,30 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
 		}
 	}
 
+	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_EMIT) {
+		struct sync_file *file;
+		int fd;
+
+		fd = get_unused_fd_flags(O_CLOEXEC);
+		if (fd < 0) {
+			ret = fd;
+			goto out;
+		}
+
+		file = sync_file_create(&fence->base);
+		if (!file) {
+			put_unused_fd(fd);
+			goto out;
+		}
+
+		fd_install(fd, file->file);
+		request->fence = fd;
+	}
+
 out:
+	if (prefence)
+		dma_fence_put(prefence);
+
 	validate_fini(&op, chan, fence, bo);
 	nouveau_fence_unref(&fence);
 
@@ -906,6 +947,27 @@ nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
 	return nouveau_abi16_put(abi16, ret);
 }
 
+int
+nouveau_gem_ioctl_pushbuf(struct drm_device *dev, void *data,
+			  struct drm_file *file_priv)
+{
+	struct drm_nouveau_gem_pushbuf *request = data;
+	struct drm_nouveau_gem_pushbuf2 req;
+	int ret;
+
+	memset(&req, 0, sizeof(req));
+	memcpy(&req.base, request, sizeof(*request));
+
+	ret = __nouveau_gem_ioctl_pushbuf(dev, &req, file_priv);
+
+	request->gart_available = req.base.gart_available;
+	request->vram_available = req.base.vram_available;
+	request->suffix1 = req.base.suffix1;
+	request->suffix0 = req.base.suffix0;
+
+	return ret;
+}
+
 int
 nouveau_gem_ioctl_cpu_prep(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv)
@@ -973,3 +1035,12 @@ nouveau_gem_ioctl_info(struct drm_device *dev, void *data,
 	return ret;
 }
 
+int
+nouveau_gem_ioctl_pushbuf2(struct drm_device *dev, void *data,
+			   struct drm_file *file_priv)
+{
+	struct drm_nouveau_gem_pushbuf2 *req = data;
+
+	return __nouveau_gem_ioctl_pushbuf(dev, req, file_priv);
+}
+
diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.h b/drivers/gpu/drm/nouveau/nouveau_gem.h
index 978e07591990..652a63242303 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.h
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.h
@@ -29,6 +29,8 @@ extern int nouveau_gem_ioctl_cpu_fini(struct drm_device *, void *,
 				      struct drm_file *);
 extern int nouveau_gem_ioctl_info(struct drm_device *, void *,
 				  struct drm_file *);
+extern int nouveau_gem_ioctl_pushbuf2(struct drm_device *, void *,
+				      struct drm_file *);
 
 extern int nouveau_gem_prime_pin(struct drm_gem_object *);
 extern void nouveau_gem_prime_unpin(struct drm_gem_object *);
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 853a327433d3..85425dc90301 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -115,6 +115,18 @@ struct drm_nouveau_gem_pushbuf {
 	__u64 gart_available;
 };
 
+#define NOUVEAU_GEM_PUSHBUF_FENCE_WAIT (1 << 0)
+#define NOUVEAU_GEM_PUSHBUF_FENCE_EMIT (1 << 1)
+#define NOUVEAU_GEM_PUSHBUF_FLAGS (NOUVEAU_GEM_PUSHBUF_FENCE_WAIT | \
+				   NOUVEAU_GEM_PUSHBUF_FENCE_EMIT)
+
+struct drm_nouveau_gem_pushbuf2 {
+	struct drm_nouveau_gem_pushbuf base;
+	__u32 flags;
+	__s32 fence;
+	__u64 reserved;
+};
+
 #define NOUVEAU_GEM_CPU_PREP_NOWAIT                                  0x00000001
 #define NOUVEAU_GEM_CPU_PREP_WRITE                                   0x00000004
 struct drm_nouveau_gem_cpu_prep {
@@ -141,6 +153,7 @@ struct drm_nouveau_gem_cpu_fini {
 #define DRM_NOUVEAU_GEM_CPU_PREP       0x42
 #define DRM_NOUVEAU_GEM_CPU_FINI       0x43
 #define DRM_NOUVEAU_GEM_INFO           0x44
+#define DRM_NOUVEAU_GEM_PUSHBUF2       0x45
 
 struct drm_nouveau_svm_init {
 	__u64 unmanaged_addr;
@@ -196,6 +209,7 @@ struct drm_nouveau_svm_bind {
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_PREP       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_PREP, struct drm_nouveau_gem_cpu_prep)
 #define DRM_IOCTL_NOUVEAU_GEM_CPU_FINI       DRM_IOW (DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_CPU_FINI, struct drm_nouveau_gem_cpu_fini)
 #define DRM_IOCTL_NOUVEAU_GEM_INFO           DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_INFO, struct drm_nouveau_gem_info)
+#define DRM_IOCTL_NOUVEAU_GEM_PUSHBUF2       DRM_IOWR(DRM_COMMAND_BASE + DRM_NOUVEAU_GEM_PUSHBUF2, struct drm_nouveau_gem_pushbuf2)
 
 #if defined(__cplusplus)
 }
-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/6] drm/nouveau: Support sync FDs and syncobjs
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
                   ` (2 preceding siblings ...)
  2020-08-28 10:40 ` [PATCH 3/6] drm/nouveau: Support fence FDs at kickoff Thierry Reding
@ 2020-08-28 10:40 ` Thierry Reding
  2020-08-28 10:40 ` [PATCH 5/6] drm/nouveau: Support DMA fence arrays Thierry Reding
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

Extends the new NOUVEAU_GEM_PUSHBUF2 IOCTL to accept and emit one or
more sync FDs and/or DRM native sync objects.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
Note: If acceptable, this should be merged into the previous patch that
adds the new IOCTL.

 drivers/gpu/drm/nouveau/nouveau_gem.c | 180 ++++++++++++++++++++++----
 include/uapi/drm/nouveau_drm.h        |  21 ++-
 2 files changed, 167 insertions(+), 34 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 039f244c0a00..b3ece731e4e1 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -26,6 +26,7 @@
 
 #include <linux/file.h>
 #include <linux/sync_file.h>
+#include <drm/drm_syncobj.h>
 
 #include "nouveau_drv.h"
 #include "nouveau_dma.h"
@@ -680,12 +681,137 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli,
 	return ret;
 }
 
+static int nouveau_channel_wait_fence(struct nouveau_channel *channel,
+				      struct drm_file *file_priv,
+				      struct drm_nouveau_gem_fence *f)
+{
+	struct dma_fence *fence;
+
+	if (f->flags & NOUVEAU_GEM_FENCE_FD) {
+		fence = sync_file_get_fence(f->handle);
+		if (!fence)
+			return -ENOENT;
+	} else {
+		struct drm_syncobj *syncobj;
+
+		syncobj = drm_syncobj_find(file_priv, f->handle);
+		if (!syncobj)
+			return -ENOENT;
+
+		fence = drm_syncobj_fence_get(syncobj);
+		drm_syncobj_put(syncobj);
+	}
+
+	return nouveau_fence_sync(fence, channel, true);
+}
+
+static int nouveau_channel_wait_fences(struct nouveau_channel *channel,
+				       struct drm_file *file_priv,
+				       struct drm_nouveau_gem_fence *fences,
+				       unsigned int num_fences)
+{
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < num_fences; i++) {
+		if (fences[i].flags & NOUVEAU_GEM_FENCE_WAIT) {
+			ret = nouveau_channel_wait_fence(channel, file_priv,
+							 &fences[i]);
+			if (ret < 0)
+				return ret;
+		}
+	}
+
+	return 0;
+}
+
+static struct nouveau_fence *
+nouveau_channel_emit_fence(struct nouveau_channel *channel,
+			   struct drm_file *file_priv,
+			   struct drm_nouveau_gem_fence *f)
+{
+	struct nouveau_fence *fence;
+	int ret;
+
+	ret = nouveau_fence_new(channel, false, &fence);
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	if (f->flags & NOUVEAU_GEM_FENCE_FD) {
+		struct sync_file *file;
+		int fd;
+
+		fd = get_unused_fd_flags(O_CLOEXEC);
+		if (fd < 0) {
+			ret = fd;
+			goto put;
+		}
+
+		file = sync_file_create(&fence->base);
+		if (!file) {
+			put_unused_fd(fd);
+			ret = -ENOMEM;
+			goto put;
+		}
+
+		fd_install(fd, file->file);
+		f->handle = fd;
+	} else {
+		struct drm_syncobj *syncobj;
+
+		ret = drm_syncobj_create(&syncobj, 0, &fence->base);
+		if (ret < 0)
+			goto put;
+
+		ret = drm_syncobj_get_handle(file_priv, syncobj, &f->handle);
+		drm_syncobj_put(syncobj);
+	}
+
+put:
+	nouveau_fence_unref(&fence);
+	return ERR_PTR(ret);
+}
+
+static struct nouveau_fence *
+nouveau_channel_emit_fences(struct nouveau_channel *channel,
+			    struct drm_file *file_priv,
+			    struct drm_nouveau_gem_fence *fences,
+			    unsigned int num_fences)
+{
+	struct nouveau_fence *fence = NULL, *f;
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < num_fences; i++) {
+		if (fences[i].flags & NOUVEAU_GEM_FENCE_EMIT) {
+			f = nouveau_channel_emit_fence(channel, file_priv,
+						        &fences[i]);
+			if (IS_ERR(f))
+				return f;
+
+			if (!fence)
+				fence = f;
+		}
+	}
+
+	if (!fence) {
+		ret = nouveau_fence_new(channel, false, &fence);
+		if (ret)
+			fence = ERR_PTR(ret);
+	} else {
+		nouveau_fence_ref(fence);
+	}
+
+	return fence;
+}
+
 static int
 __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 			    struct drm_nouveau_gem_pushbuf2 *request,
 			    struct drm_file *file_priv)
 {
 	struct nouveau_abi16 *abi16 = nouveau_abi16_get(file_priv);
+	struct drm_nouveau_gem_fence __user *user_fences;
 	struct nouveau_cli *cli = nouveau_cli(file_priv);
 	struct nouveau_abi16_chan *temp;
 	struct nouveau_drm *drm = nouveau_drm(dev);
@@ -693,12 +819,13 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	struct drm_nouveau_gem_pushbuf_push *push;
 	struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL;
 	struct drm_nouveau_gem_pushbuf_bo *bo;
+	struct drm_nouveau_gem_fence *fences = NULL;
 	struct nouveau_channel *chan = NULL;
 	struct validate_op op;
 	struct nouveau_fence *fence = NULL;
-	struct dma_fence *prefence = NULL;
 	int i, j, ret = 0;
 	bool do_reloc = false, sync = false;
+	size_t size;
 
 	/* check for unrecognized flags */
 	if (request->flags & ~NOUVEAU_GEM_PUSHBUF_FLAGS)
@@ -773,13 +900,18 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		goto out_prevalid;
 	}
 
-	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_WAIT) {
-		prefence = sync_file_get_fence(request->fence);
-		if (prefence) {
-			ret = nouveau_fence_sync(prefence, chan, true);
-			if (ret < 0)
-				goto out;
+	if (request->num_fences > 0) {
+		fences = u_memcpya(request->fences, request->num_fences,
+				   sizeof(*fences));
+		if (IS_ERR(fences)) {
+			ret = PTR_ERR(fences);
+			goto out;
 		}
+
+		ret = nouveau_channel_wait_fences(chan, file_priv, fences,
+						  request->num_fences);
+		if (ret < 0)
+			goto out;
 	}
 
 	/* Apply any relocations that are required */
@@ -869,10 +1001,13 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		}
 	}
 
-	ret = nouveau_fence_new(chan, false, &fence);
-	if (ret) {
+	fence = nouveau_channel_emit_fences(chan, file_priv, fences,
+					    request->num_fences);
+	if (IS_ERR(fence)) {
+		ret = PTR_ERR(fence);
 		NV_PRINTK(err, cli, "error fencing pushbuf: %d\n", ret);
 		WIND_RING(chan);
+		fence = NULL;
 		goto out;
 	}
 
@@ -883,29 +1018,18 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		}
 	}
 
-	if (request->flags & NOUVEAU_GEM_PUSHBUF_FENCE_EMIT) {
-		struct sync_file *file;
-		int fd;
-
-		fd = get_unused_fd_flags(O_CLOEXEC);
-		if (fd < 0) {
-			ret = fd;
-			goto out;
-		}
-
-		file = sync_file_create(&fence->base);
-		if (!file) {
-			put_unused_fd(fd);
-			goto out;
-		}
+	user_fences = u64_to_user_ptr(request->fences);
+	size = sizeof(*fences) * request->num_fences;
 
-		fd_install(fd, file->file);
-		request->fence = fd;
+	if (copy_to_user(user_fences, fences, size)) {
+		WIND_RING(chan);
+		ret = -EFAULT;
+		fence = NULL;
+		goto out;
 	}
 
 out:
-	if (prefence)
-		dma_fence_put(prefence);
+	u_free(fences);
 
 	validate_fini(&op, chan, fence, bo);
 	nouveau_fence_unref(&fence);
diff --git a/include/uapi/drm/nouveau_drm.h b/include/uapi/drm/nouveau_drm.h
index 85425dc90301..5b8d40228a1b 100644
--- a/include/uapi/drm/nouveau_drm.h
+++ b/include/uapi/drm/nouveau_drm.h
@@ -115,16 +115,25 @@ struct drm_nouveau_gem_pushbuf {
 	__u64 gart_available;
 };
 
-#define NOUVEAU_GEM_PUSHBUF_FENCE_WAIT (1 << 0)
-#define NOUVEAU_GEM_PUSHBUF_FENCE_EMIT (1 << 1)
-#define NOUVEAU_GEM_PUSHBUF_FLAGS (NOUVEAU_GEM_PUSHBUF_FENCE_WAIT | \
-				   NOUVEAU_GEM_PUSHBUF_FENCE_EMIT)
+#define NOUVEAU_GEM_FENCE_WAIT	(1 << 0)
+#define NOUVEAU_GEM_FENCE_EMIT	(1 << 1)
+#define NOUVEAU_GEM_FENCE_FD	(1 << 2)
+#define NOUVEAU_GEM_FENCE_FLAGS	(NOUVEAU_GEM_FENCE_WAIT | \
+				 NOUVEAU_GEM_FENCE_EMIT | \
+				 NOUVEAU_GEM_FENCE_FD)
+
+struct drm_nouveau_gem_fence {
+	__u32 handle;
+	__u32 flags;
+};
+
+#define NOUVEAU_GEM_PUSHBUF_FLAGS	0
 
 struct drm_nouveau_gem_pushbuf2 {
 	struct drm_nouveau_gem_pushbuf base;
 	__u32 flags;
-	__s32 fence;
-	__u64 reserved;
+	__u32 num_fences;
+	__u64 fences;
 };
 
 #define NOUVEAU_GEM_CPU_PREP_NOWAIT                                  0x00000001
-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 5/6] drm/nouveau: Support DMA fence arrays
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
                   ` (3 preceding siblings ...)
  2020-08-28 10:40 ` [PATCH 4/6] drm/nouveau: Support sync FDs and syncobjs Thierry Reding
@ 2020-08-28 10:40 ` Thierry Reding
  2020-08-28 10:40 ` [PATCH 6/6] drm/nouveau: Allow zero pushbuffer submits Thierry Reding
  2020-09-23  9:18 ` [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
  6 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

A DMA fence can be composed of multiple fences in an array. Support this
in the Nouveau driver by iteratively synchronizing to each DMA fence in
the array.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 31 ++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 8530c2684832..c0849e09279c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -24,6 +24,7 @@
  *
  */
 
+#include <linux/dma-fence-array.h>
 #include <linux/ktime.h>
 #include <linux/hrtimer.h>
 #include <linux/sched/signal.h>
@@ -338,9 +339,9 @@ nouveau_fence_wait(struct nouveau_fence *fence, bool lazy, bool intr)
 		return 0;
 }
 
-int
-nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
-		   bool intr)
+static int
+__nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
+		     bool intr)
 {
 	struct nouveau_fence_chan *fctx = chan->fence;
 	struct nouveau_channel *prev = NULL;
@@ -363,6 +364,30 @@ nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
 	return ret;
 }
 
+int
+nouveau_fence_sync(struct dma_fence *fence, struct nouveau_channel *chan,
+		   bool intr)
+{
+	int ret = 0;
+
+	if (dma_fence_is_array(fence)) {
+		struct dma_fence_array *array = to_dma_fence_array(fence);
+		unsigned int i;
+
+		for (i = 0; i < array->num_fences; i++) {
+			struct dma_fence *f = array->fences[i];
+
+			ret = __nouveau_fence_sync(f, chan, intr);
+			if (ret < 0)
+				break;
+		}
+	} else {
+		ret = __nouveau_fence_sync(fence, chan, intr);
+	}
+
+	return ret;
+}
+
 struct nouveau_fence *
 nouveau_fence_ref(struct nouveau_fence *fence)
 {
-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 6/6] drm/nouveau: Allow zero pushbuffer submits
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
                   ` (4 preceding siblings ...)
  2020-08-28 10:40 ` [PATCH 5/6] drm/nouveau: Support DMA fence arrays Thierry Reding
@ 2020-08-28 10:40 ` Thierry Reding
  2020-09-23  9:18 ` [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
  6 siblings, 0 replies; 11+ messages in thread
From: Thierry Reding @ 2020-08-28 10:40 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel

From: Thierry Reding <treding@nvidia.com>

These are useful in cases where only a fence is to be created to wait
for existing jobs in the command stream.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 drivers/gpu/drm/nouveau/nouveau_gem.c | 197 +++++++++++++-------------
 1 file changed, 99 insertions(+), 98 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c
index b3ece731e4e1..c70a045d7141 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -816,9 +816,9 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	struct nouveau_abi16_chan *temp;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct drm_nouveau_gem_pushbuf *req = &request->base;
-	struct drm_nouveau_gem_pushbuf_push *push;
 	struct drm_nouveau_gem_pushbuf_reloc *reloc = NULL;
-	struct drm_nouveau_gem_pushbuf_bo *bo;
+	struct drm_nouveau_gem_pushbuf_push *push = NULL;
+	struct drm_nouveau_gem_pushbuf_bo *bo = NULL;
 	struct drm_nouveau_gem_fence *fences = NULL;
 	struct nouveau_channel *chan = NULL;
 	struct validate_op op;
@@ -850,8 +850,6 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 
 	req->vram_available = drm->gem.vram_available;
 	req->gart_available = drm->gem.gart_available;
-	if (unlikely(req->nr_push == 0))
-		goto out_next;
 
 	if (unlikely(req->nr_push > NOUVEAU_GEM_MAX_PUSH)) {
 		NV_PRINTK(err, cli, "pushbuf push count exceeds limit: %d max %d\n",
@@ -871,33 +869,35 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 		return nouveau_abi16_put(abi16, -EINVAL);
 	}
 
-	push = u_memcpya(req->push, req->nr_push, sizeof(*push));
-	if (IS_ERR(push))
-		return nouveau_abi16_put(abi16, PTR_ERR(push));
+	if (req->nr_push > 0) {
+		push = u_memcpya(req->push, req->nr_push, sizeof(*push));
+		if (IS_ERR(push))
+			return nouveau_abi16_put(abi16, PTR_ERR(push));
 
-	bo = u_memcpya(req->buffers, req->nr_buffers, sizeof(*bo));
-	if (IS_ERR(bo)) {
-		u_free(push);
-		return nouveau_abi16_put(abi16, PTR_ERR(bo));
-	}
+		bo = u_memcpya(req->buffers, req->nr_buffers, sizeof(*bo));
+		if (IS_ERR(bo)) {
+			u_free(push);
+			return nouveau_abi16_put(abi16, PTR_ERR(bo));
+		}
 
-	/* Ensure all push buffers are on validate list */
-	for (i = 0; i < req->nr_push; i++) {
-		if (push[i].bo_index >= req->nr_buffers) {
-			NV_PRINTK(err, cli, "push %d buffer not in list\n", i);
-			ret = -EINVAL;
-			goto out_prevalid;
+		/* Ensure all push buffers are on validate list */
+		for (i = 0; i < req->nr_push; i++) {
+			if (push[i].bo_index >= req->nr_buffers) {
+				NV_PRINTK(err, cli, "push %d buffer not in list\n", i);
+				ret = -EINVAL;
+				goto out_prevalid;
+			}
 		}
-	}
 
-	/* Validate buffer list */
+		/* Validate buffer list */
 revalidate:
-	ret = nouveau_gem_pushbuf_validate(chan, file_priv, bo,
-					   req->nr_buffers, &op, &do_reloc);
-	if (ret) {
-		if (ret != -ERESTARTSYS)
-			NV_PRINTK(err, cli, "validate: %d\n", ret);
-		goto out_prevalid;
+		ret = nouveau_gem_pushbuf_validate(chan, file_priv, bo,
+						   req->nr_buffers, &op, &do_reloc);
+		if (ret) {
+			if (ret != -ERESTARTSYS)
+				NV_PRINTK(err, cli, "validate: %d\n", ret);
+			goto out_prevalid;
+		}
 	}
 
 	if (request->num_fences > 0) {
@@ -915,89 +915,89 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	}
 
 	/* Apply any relocations that are required */
-	if (do_reloc) {
-		if (!reloc) {
-			validate_fini(&op, chan, NULL, bo);
-			reloc = u_memcpya(req->relocs, req->nr_relocs, sizeof(*reloc));
-			if (IS_ERR(reloc)) {
-				ret = PTR_ERR(reloc);
-				goto out_prevalid;
-			}
+	if (req->nr_push > 0) {
+		if (do_reloc) {
+			if (!reloc) {
+				validate_fini(&op, chan, NULL, bo);
+				reloc = u_memcpya(req->relocs, req->nr_relocs, sizeof(*reloc));
+				if (IS_ERR(reloc)) {
+					ret = PTR_ERR(reloc);
+					goto out_prevalid;
+				}
 
-			goto revalidate;
-		}
+				goto revalidate;
+			}
 
-		ret = nouveau_gem_pushbuf_reloc_apply(cli, req, reloc, bo);
-		if (ret) {
-			NV_PRINTK(err, cli, "reloc apply: %d\n", ret);
-			goto out;
+			ret = nouveau_gem_pushbuf_reloc_apply(cli, req, reloc, bo);
+			if (ret) {
+				NV_PRINTK(err, cli, "reloc apply: %d\n", ret);
+				goto out;
+			}
 		}
-	}
 
-	if (chan->dma.ib_max) {
-		ret = nouveau_dma_wait(chan, req->nr_push + 1, 16);
-		if (ret) {
-			NV_PRINTK(err, cli, "nv50cal_space: %d\n", ret);
-			goto out;
-		}
+		if (chan->dma.ib_max) {
+			ret = nouveau_dma_wait(chan, req->nr_push + 1, 16);
+			if (ret) {
+				NV_PRINTK(err, cli, "nv50cal_space: %d\n", ret);
+				goto out;
+			}
 
-		for (i = 0; i < req->nr_push; i++) {
-			struct nouveau_vma *vma = (void *)(unsigned long)
-				bo[push[i].bo_index].user_priv;
+			for (i = 0; i < req->nr_push; i++) {
+				struct nouveau_vma *vma = (void *)(unsigned long)
+					bo[push[i].bo_index].user_priv;
 
-			nv50_dma_push(chan, vma->addr + push[i].offset,
-				      push[i].length);
-		}
-	} else
-	if (drm->client.device.info.chipset >= 0x25) {
-		ret = PUSH_WAIT(chan->chan.push, req->nr_push * 2);
-		if (ret) {
-			NV_PRINTK(err, cli, "cal_space: %d\n", ret);
-			goto out;
-		}
+				nv50_dma_push(chan, vma->addr + push[i].offset,
+					      push[i].length);
+			}
+		} else if (drm->client.device.info.chipset >= 0x25) {
+			ret = PUSH_WAIT(chan->chan.push, req->nr_push * 2);
+			if (ret) {
+				NV_PRINTK(err, cli, "cal_space: %d\n", ret);
+				goto out;
+			}
 
-		for (i = 0; i < req->nr_push; i++) {
-			struct nouveau_bo *nvbo = (void *)(unsigned long)
-				bo[push[i].bo_index].user_priv;
+			for (i = 0; i < req->nr_push; i++) {
+				struct nouveau_bo *nvbo = (void *)(unsigned long)
+					bo[push[i].bo_index].user_priv;
 
-			PUSH_CALL(chan->chan.push, nvbo->offset + push[i].offset);
-			PUSH_DATA(chan->chan.push, 0);
-		}
-	} else {
-		ret = PUSH_WAIT(chan->chan.push, req->nr_push * (2 + NOUVEAU_DMA_SKIPS));
-		if (ret) {
-			NV_PRINTK(err, cli, "jmp_space: %d\n", ret);
-			goto out;
-		}
+				PUSH_CALL(chan->chan.push, nvbo->offset + push[i].offset);
+				PUSH_DATA(chan->chan.push, 0);
+			}
+		} else {
+			ret = PUSH_WAIT(chan->chan.push, req->nr_push * (2 + NOUVEAU_DMA_SKIPS));
+			if (ret) {
+				NV_PRINTK(err, cli, "jmp_space: %d\n", ret);
+				goto out;
+			}
 
-		for (i = 0; i < req->nr_push; i++) {
-			struct nouveau_bo *nvbo = (void *)(unsigned long)
-				bo[push[i].bo_index].user_priv;
-			uint32_t cmd;
-
-			cmd = chan->push.addr + ((chan->dma.cur + 2) << 2);
-			cmd |= 0x20000000;
-			if (unlikely(cmd != req->suffix0)) {
-				if (!nvbo->kmap.virtual) {
-					ret = ttm_bo_kmap(&nvbo->bo, 0,
-							  nvbo->bo.mem.
-							  num_pages,
-							  &nvbo->kmap);
-					if (ret) {
-						WIND_RING(chan);
-						goto out;
+			for (i = 0; i < req->nr_push; i++) {
+				struct nouveau_bo *nvbo = (void *)(unsigned long)
+					bo[push[i].bo_index].user_priv;
+				uint32_t cmd;
+
+				cmd = chan->push.addr + ((chan->dma.cur + 2) << 2);
+				cmd |= 0x20000000;
+				if (unlikely(cmd != req->suffix0)) {
+					if (!nvbo->kmap.virtual) {
+						ret = ttm_bo_kmap(&nvbo->bo, 0,
+								  nvbo->bo.mem.num_pages,
+								  &nvbo->kmap);
+						if (ret) {
+							WIND_RING(chan);
+							goto out;
+						}
+						nvbo->validate_mapped = true;
 					}
-					nvbo->validate_mapped = true;
-				}
 
-				nouveau_bo_wr32(nvbo, (push[i].offset +
-						push[i].length - 8) / 4, cmd);
-			}
+					nouveau_bo_wr32(nvbo, (push[i].offset +
+							push[i].length - 8) / 4, cmd);
+				}
 
-			PUSH_JUMP(chan->chan.push, nvbo->offset + push[i].offset);
-			PUSH_DATA(chan->chan.push, 0);
-			for (j = 0; j < NOUVEAU_DMA_SKIPS; j++)
+				PUSH_JUMP(chan->chan.push, nvbo->offset + push[i].offset);
 				PUSH_DATA(chan->chan.push, 0);
+				for (j = 0; j < NOUVEAU_DMA_SKIPS; j++)
+					PUSH_DATA(chan->chan.push, 0);
+			}
 		}
 	}
 
@@ -1031,7 +1031,9 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 out:
 	u_free(fences);
 
-	validate_fini(&op, chan, fence, bo);
+	if (req->nr_push > 0)
+		validate_fini(&op, chan, fence, bo);
+
 	nouveau_fence_unref(&fence);
 
 	if (do_reloc) {
@@ -1054,7 +1056,6 @@ __nouveau_gem_ioctl_pushbuf(struct drm_device *dev,
 	u_free(bo);
 	u_free(push);
 
-out_next:
 	if (chan->dma.ib_max) {
 		req->suffix0 = 0x00000000;
 		req->suffix1 = 0x00000000;
-- 
2.28.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
  2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
                   ` (5 preceding siblings ...)
  2020-08-28 10:40 ` [PATCH 6/6] drm/nouveau: Allow zero pushbuffer submits Thierry Reding
@ 2020-09-23  9:18 ` Thierry Reding
  2020-09-23 15:21   ` [Nouveau] " Daniel Vetter
  6 siblings, 1 reply; 11+ messages in thread
From: Thierry Reding @ 2020-09-23  9:18 UTC (permalink / raw)
  To: Ben Skeggs; +Cc: nouveau, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 1624 bytes --]

On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Hi,
> 
> This series implements a new IOCTL to submit push buffers that can
> optionally return a sync FD or sync object to userspace. This is useful
> in cases where userspace wants to synchronize operations between the GPU
> and another driver (such as KMS for display). Among other things this
> allows extensions such as eglDupNativeFenceFDANDROID to be implemented.
> 
> Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
> sync objects to be passed rather than only sync FDs. It also allows any
> number of sync FDs/objects to be passed in or emitted. I think those are
> useful features, but I left them in a separate patch in case everybody
> else thinks that this won't be needed. If we decide to merge the new ABI
> then patch 4 should be squashed into patch 3.
> 
> The corresponding userspace changes can be found here:
> 
>   libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
>   mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/
> 
> I've verified that this works with kmscube's --atomic mode and Weston.

Hi Ben,

any thoughts on this series? I realize that this is somewhat suboptimal
because we're effectively adding a duplicate of the existing IOCTL with
only the "minor" extension of adding sync FDs/objects, but at the same
time I don't have any good ideas on what else to add to make this more
appealing or if you have any plans of your own to address this in the
future.

Thierry

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
  2020-09-23  9:18 ` [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
@ 2020-09-23 15:21   ` Daniel Vetter
  2020-09-24 10:05     ` Thierry Reding
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2020-09-23 15:21 UTC (permalink / raw)
  To: Thierry Reding; +Cc: nouveau, Ben Skeggs, dri-devel

On Wed, Sep 23, 2020 at 11:18:53AM +0200, Thierry Reding wrote:
> On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Hi,
> > 
> > This series implements a new IOCTL to submit push buffers that can
> > optionally return a sync FD or sync object to userspace. This is useful
> > in cases where userspace wants to synchronize operations between the GPU
> > and another driver (such as KMS for display). Among other things this
> > allows extensions such as eglDupNativeFenceFDANDROID to be implemented.
> > 
> > Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
> > sync objects to be passed rather than only sync FDs. It also allows any
> > number of sync FDs/objects to be passed in or emitted. I think those are
> > useful features, but I left them in a separate patch in case everybody
> > else thinks that this won't be needed. If we decide to merge the new ABI
> > then patch 4 should be squashed into patch 3.
> > 
> > The corresponding userspace changes can be found here:
> > 
> >   libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
> >   mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/
> > 
> > I've verified that this works with kmscube's --atomic mode and Weston.
> 
> Hi Ben,
> 
> any thoughts on this series? I realize that this is somewhat suboptimal
> because we're effectively adding a duplicate of the existing IOCTL with
> only the "minor" extension of adding sync FDs/objects, but at the same
> time I don't have any good ideas on what else to add to make this more
> appealing or if you have any plans of your own to address this in the
> future.

drm core automatically zero-extends ioctl structs both ways, so if all you
do is add more stuff to the top level ioctl struct at the bottom, there's
no need to duplicate any code. At least as long as you guarantee that 0 ==
old behaviour for both in and out parameters.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
  2020-09-23 15:21   ` [Nouveau] " Daniel Vetter
@ 2020-09-24 10:05     ` Thierry Reding
  2020-09-24 11:16       ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Thierry Reding @ 2020-09-24 10:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: nouveau, Ben Skeggs, dri-devel


[-- Attachment #1.1: Type: text/plain, Size: 2402 bytes --]

On Wed, Sep 23, 2020 at 05:21:24PM +0200, Daniel Vetter wrote:
> On Wed, Sep 23, 2020 at 11:18:53AM +0200, Thierry Reding wrote:
> > On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:
> > > From: Thierry Reding <treding@nvidia.com>
> > > 
> > > Hi,
> > > 
> > > This series implements a new IOCTL to submit push buffers that can
> > > optionally return a sync FD or sync object to userspace. This is useful
> > > in cases where userspace wants to synchronize operations between the GPU
> > > and another driver (such as KMS for display). Among other things this
> > > allows extensions such as eglDupNativeFenceFDANDROID to be implemented.
> > > 
> > > Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
> > > sync objects to be passed rather than only sync FDs. It also allows any
> > > number of sync FDs/objects to be passed in or emitted. I think those are
> > > useful features, but I left them in a separate patch in case everybody
> > > else thinks that this won't be needed. If we decide to merge the new ABI
> > > then patch 4 should be squashed into patch 3.
> > > 
> > > The corresponding userspace changes can be found here:
> > > 
> > >   libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
> > >   mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/
> > > 
> > > I've verified that this works with kmscube's --atomic mode and Weston.
> > 
> > Hi Ben,
> > 
> > any thoughts on this series? I realize that this is somewhat suboptimal
> > because we're effectively adding a duplicate of the existing IOCTL with
> > only the "minor" extension of adding sync FDs/objects, but at the same
> > time I don't have any good ideas on what else to add to make this more
> > appealing or if you have any plans of your own to address this in the
> > future.
> 
> drm core automatically zero-extends ioctl structs both ways, so if all you
> do is add more stuff to the top level ioctl struct at the bottom, there's
> no need to duplicate any code. At least as long as you guarantee that 0 ==
> old behaviour for both in and out parameters.

But that only works if the structure size remains fixed, right? In this
case, however, we have to extend the structure with additional fields,
so the size is going to change and therefore the IOCTL number will also
change.

Thierry

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Nouveau] [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects
  2020-09-24 10:05     ` Thierry Reding
@ 2020-09-24 11:16       ` Daniel Vetter
  0 siblings, 0 replies; 11+ messages in thread
From: Daniel Vetter @ 2020-09-24 11:16 UTC (permalink / raw)
  To: Thierry Reding; +Cc: Nouveau Dev, Ben Skeggs, dri-devel

On Thu, Sep 24, 2020 at 12:05 PM Thierry Reding
<thierry.reding@gmail.com> wrote:
>
> On Wed, Sep 23, 2020 at 05:21:24PM +0200, Daniel Vetter wrote:
> > On Wed, Sep 23, 2020 at 11:18:53AM +0200, Thierry Reding wrote:
> > > On Fri, Aug 28, 2020 at 12:40:10PM +0200, Thierry Reding wrote:
> > > > From: Thierry Reding <treding@nvidia.com>
> > > >
> > > > Hi,
> > > >
> > > > This series implements a new IOCTL to submit push buffers that can
> > > > optionally return a sync FD or sync object to userspace. This is useful
> > > > in cases where userspace wants to synchronize operations between the GPU
> > > > and another driver (such as KMS for display). Among other things this
> > > > allows extensions such as eglDupNativeFenceFDANDROID to be implemented.
> > > >
> > > > Note that patch 4 modifies the ABI introduced in patch 3 by allowing DRM
> > > > sync objects to be passed rather than only sync FDs. It also allows any
> > > > number of sync FDs/objects to be passed in or emitted. I think those are
> > > > useful features, but I left them in a separate patch in case everybody
> > > > else thinks that this won't be needed. If we decide to merge the new ABI
> > > > then patch 4 should be squashed into patch 3.
> > > >
> > > > The corresponding userspace changes can be found here:
> > > >
> > > >   libdrm: https://gitlab.freedesktop.org/tagr/drm/-/commits/nouveau-sync-fd-v2/
> > > >   mesa: https://gitlab.freedesktop.org/tagr/mesa/-/commits/nouveau-sync-fd/
> > > >
> > > > I've verified that this works with kmscube's --atomic mode and Weston.
> > >
> > > Hi Ben,
> > >
> > > any thoughts on this series? I realize that this is somewhat suboptimal
> > > because we're effectively adding a duplicate of the existing IOCTL with
> > > only the "minor" extension of adding sync FDs/objects, but at the same
> > > time I don't have any good ideas on what else to add to make this more
> > > appealing or if you have any plans of your own to address this in the
> > > future.
> >
> > drm core automatically zero-extends ioctl structs both ways, so if all you
> > do is add more stuff to the top level ioctl struct at the bottom, there's
> > no need to duplicate any code. At least as long as you guarantee that 0 ==
> > old behaviour for both in and out parameters.
>
> But that only works if the structure size remains fixed, right? In this
> case, however, we have to extend the structure with additional fields,
> so the size is going to change and therefore the IOCTL number will also
> change.

Nope, drm_ioctl() is pretty much magic, and will zero-extend size
mismatches in both ways. Which means you can run userspace compile
against old kernels (so user_sz > kernel_sz) and you can run old
userspace on new kernels (so user_sz < kernel_sz) and it will all work
correctly. No need to allocate new ioctl numbers for this case, just
extend at the bottom. We're doing this pretty much all the time.

You might still want a getparam (or explicit flag, if all versions of
that ioctl validated the flags correctly) since doing since a dummy
pushbuf on an old kernel won't result in anything getting rejected.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-09-24 11:16 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-28 10:40 [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
2020-08-28 10:40 ` [PATCH 1/6] drm/nouveau: Split nouveau_fence_sync() Thierry Reding
2020-08-28 10:40 ` [PATCH 2/6] drm/nouveau: Add nouveau_fence_ref() Thierry Reding
2020-08-28 10:40 ` [PATCH 3/6] drm/nouveau: Support fence FDs at kickoff Thierry Reding
2020-08-28 10:40 ` [PATCH 4/6] drm/nouveau: Support sync FDs and syncobjs Thierry Reding
2020-08-28 10:40 ` [PATCH 5/6] drm/nouveau: Support DMA fence arrays Thierry Reding
2020-08-28 10:40 ` [PATCH 6/6] drm/nouveau: Allow zero pushbuffer submits Thierry Reding
2020-09-23  9:18 ` [PATCH 0/6] drm/nouveau: Support sync FDs and sync objects Thierry Reding
2020-09-23 15:21   ` [Nouveau] " Daniel Vetter
2020-09-24 10:05     ` Thierry Reding
2020-09-24 11:16       ` Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).