* [PATCH v4 0/2] Add sync object UAPI support to VirtIO-GPU driver
@ 2023-03-23 23:07 ` Dmitry Osipenko
  0 siblings, 0 replies; 22+ messages in thread
From: Dmitry Osipenko @ 2023-03-23 23:07 UTC (permalink / raw)
  To: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, Emil Velikov
  Cc: dri-devel, linux-kernel, kernel, virtualization

We have multiple Vulkan context types awaiting the addition of DRM sync
object UAPI support to the VirtIO-GPU kernel driver:

 1. Venus context
 2. Native contexts (virtio-freedreno, virtio-intel, virtio-amdgpu)

Mesa core supports DRM sync object UAPI, providing Vulkan drivers with a
generic fencing implementation that we want to utilize.

This series adds initial sync object support, laying the foundation for
further fencing improvements. Later on we will want to extend the VirtIO-GPU
fencing API to pass fence IDs to the host for waiting, which will require a
new VirTIO-GPU IOCTL, among other changes. Today we have several VirtIO-GPU
context drivers in the works that require VirtIO-GPU to support the sync
object UAPI.

The patch is heavily inspired by the sync object UAPI implementation of the
MSM driver.

Changelog:

v4: - Added r-b from Rob Clark to the "refactoring" patch.

    - Replaced for/while(ptr && itr) with if (ptr), as suggested by
      Rob Clark.

    - Dropped NOWARN and NORETRY GFP flags and switched syncobj patch
      to use kvmalloc.

    - Removed unused variables from the syncobj patch that were accidentally
      carried over from another (upcoming) patch during one of the git
      rebases.

v3: - Switched to dma_fence_unwrap_for_each(), as suggested by
      Rob Clark.

    - Fixed a missing dma_fence_put() in an error code path, spotted by
      Rob Clark.

    - Removed an obsolete comment from virtio_gpu_execbuffer_ioctl(), as
      suggested by Rob Clark.

v2: - Fixed chain-fence context matching by making use of
      dma_fence_chain_contained().

    - Fixed potential use of an uninitialized variable in the error code
      path of parse_post_deps(). The MSM driver had a similar issue that is
      already fixed upstream.

    - Added a new patch that refactors the job submission code path. I
      found it very difficult to add the new/upcoming host-waits feature
      because of how variables are passed around the code; virtgpu_ioctl.c
      was also growing to an unmanageable size.

Dmitry Osipenko (2):
  drm/virtio: Refactor job submission code path
  drm/virtio: Support sync objects

 drivers/gpu/drm/virtio/Makefile         |   2 +-
 drivers/gpu/drm/virtio/virtgpu_drv.c    |   3 +-
 drivers/gpu/drm/virtio/virtgpu_drv.h    |   4 +
 drivers/gpu/drm/virtio/virtgpu_ioctl.c  | 182 ---------
 drivers/gpu/drm/virtio/virtgpu_submit.c | 521 ++++++++++++++++++++++++
 include/uapi/drm/virtgpu_drm.h          |  16 +-
 6 files changed, 543 insertions(+), 185 deletions(-)
 create mode 100644 drivers/gpu/drm/virtio/virtgpu_submit.c

-- 
2.39.2



* [PATCH v4 1/2] drm/virtio: Refactor job submission code path
@ 2023-03-23 23:07   ` Dmitry Osipenko
From: Dmitry Osipenko @ 2023-03-23 23:07 UTC (permalink / raw)
  To: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, Emil Velikov
  Cc: dri-devel, linux-kernel, kernel, virtualization

Move virtio_gpu_execbuffer_ioctl() into a separate virtgpu_submit.c file
and refactor the code along the way to ease the addition of new features
to the ioctl.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 drivers/gpu/drm/virtio/Makefile         |   2 +-
 drivers/gpu/drm/virtio/virtgpu_drv.h    |   4 +
 drivers/gpu/drm/virtio/virtgpu_ioctl.c  | 182 --------------
 drivers/gpu/drm/virtio/virtgpu_submit.c | 302 ++++++++++++++++++++++++
 4 files changed, 307 insertions(+), 183 deletions(-)
 create mode 100644 drivers/gpu/drm/virtio/virtgpu_submit.c

diff --git a/drivers/gpu/drm/virtio/Makefile b/drivers/gpu/drm/virtio/Makefile
index b99fa4a73b68..d2e1788a8227 100644
--- a/drivers/gpu/drm/virtio/Makefile
+++ b/drivers/gpu/drm/virtio/Makefile
@@ -6,6 +6,6 @@
 virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_gem.o virtgpu_vram.o \
 	virtgpu_display.o virtgpu_vq.o \
 	virtgpu_fence.o virtgpu_object.o virtgpu_debugfs.o virtgpu_plane.o \
-	virtgpu_ioctl.o virtgpu_prime.o virtgpu_trace_points.o
+	virtgpu_ioctl.o virtgpu_prime.o virtgpu_trace_points.o virtgpu_submit.o
 
 obj-$(CONFIG_DRM_VIRTIO_GPU) += virtio-gpu.o
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index af6ffb696086..4126c384286b 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -486,4 +486,8 @@ void virtio_gpu_vram_unmap_dma_buf(struct device *dev,
 				   struct sg_table *sgt,
 				   enum dma_data_direction dir);
 
+/* virtgpu_submit.c */
+int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
+				struct drm_file *file);
+
 #endif
diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index da45215a933d..b24b11f25197 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -38,36 +38,6 @@
 				    VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \
 				    VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE)
 
-static int virtio_gpu_fence_event_create(struct drm_device *dev,
-					 struct drm_file *file,
-					 struct virtio_gpu_fence *fence,
-					 uint32_t ring_idx)
-{
-	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
-	struct virtio_gpu_fence_event *e = NULL;
-	int ret;
-
-	if (!(vfpriv->ring_idx_mask & BIT_ULL(ring_idx)))
-		return 0;
-
-	e = kzalloc(sizeof(*e), GFP_KERNEL);
-	if (!e)
-		return -ENOMEM;
-
-	e->event.type = VIRTGPU_EVENT_FENCE_SIGNALED;
-	e->event.length = sizeof(e->event);
-
-	ret = drm_event_reserve_init(dev, file, &e->base, &e->event);
-	if (ret)
-		goto free;
-
-	fence->e = e;
-	return 0;
-free:
-	kfree(e);
-	return ret;
-}
-
 /* Must be called with &virtio_gpu_fpriv.struct_mutex held. */
 static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev,
 					     struct virtio_gpu_fpriv *vfpriv)
@@ -108,158 +78,6 @@ static int virtio_gpu_map_ioctl(struct drm_device *dev, void *data,
 					 &virtio_gpu_map->offset);
 }
 
-/*
- * Usage of execbuffer:
- * Relocations need to take into account the full VIRTIO_GPUDrawable size.
- * However, the command as passed from user space must *not* contain the initial
- * VIRTIO_GPUReleaseInfo struct (first XXX bytes)
- */
-static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
-				 struct drm_file *file)
-{
-	struct drm_virtgpu_execbuffer *exbuf = data;
-	struct virtio_gpu_device *vgdev = dev->dev_private;
-	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
-	struct virtio_gpu_fence *out_fence;
-	int ret;
-	uint32_t *bo_handles = NULL;
-	void __user *user_bo_handles = NULL;
-	struct virtio_gpu_object_array *buflist = NULL;
-	struct sync_file *sync_file;
-	int out_fence_fd = -1;
-	void *buf;
-	uint64_t fence_ctx;
-	uint32_t ring_idx;
-
-	fence_ctx = vgdev->fence_drv.context;
-	ring_idx = 0;
-
-	if (vgdev->has_virgl_3d == false)
-		return -ENOSYS;
-
-	if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
-		return -EINVAL;
-
-	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
-		if (exbuf->ring_idx >= vfpriv->num_rings)
-			return -EINVAL;
-
-		if (!vfpriv->base_fence_ctx)
-			return -EINVAL;
-
-		fence_ctx = vfpriv->base_fence_ctx;
-		ring_idx = exbuf->ring_idx;
-	}
-
-	virtio_gpu_create_context(dev, file);
-	if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_IN) {
-		struct dma_fence *in_fence;
-
-		in_fence = sync_file_get_fence(exbuf->fence_fd);
-
-		if (!in_fence)
-			return -EINVAL;
-
-		/*
-		 * Wait if the fence is from a foreign context, or if the fence
-		 * array contains any fence from a foreign context.
-		 */
-		ret = 0;
-		if (!dma_fence_match_context(in_fence, fence_ctx + ring_idx))
-			ret = dma_fence_wait(in_fence, true);
-
-		dma_fence_put(in_fence);
-		if (ret)
-			return ret;
-	}
-
-	if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_OUT) {
-		out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
-		if (out_fence_fd < 0)
-			return out_fence_fd;
-	}
-
-	if (exbuf->num_bo_handles) {
-		bo_handles = kvmalloc_array(exbuf->num_bo_handles,
-					    sizeof(uint32_t), GFP_KERNEL);
-		if (!bo_handles) {
-			ret = -ENOMEM;
-			goto out_unused_fd;
-		}
-
-		user_bo_handles = u64_to_user_ptr(exbuf->bo_handles);
-		if (copy_from_user(bo_handles, user_bo_handles,
-				   exbuf->num_bo_handles * sizeof(uint32_t))) {
-			ret = -EFAULT;
-			goto out_unused_fd;
-		}
-
-		buflist = virtio_gpu_array_from_handles(file, bo_handles,
-							exbuf->num_bo_handles);
-		if (!buflist) {
-			ret = -ENOENT;
-			goto out_unused_fd;
-		}
-		kvfree(bo_handles);
-		bo_handles = NULL;
-	}
-
-	buf = vmemdup_user(u64_to_user_ptr(exbuf->command), exbuf->size);
-	if (IS_ERR(buf)) {
-		ret = PTR_ERR(buf);
-		goto out_unused_fd;
-	}
-
-	if (buflist) {
-		ret = virtio_gpu_array_lock_resv(buflist);
-		if (ret)
-			goto out_memdup;
-	}
-
-	out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
-	if(!out_fence) {
-		ret = -ENOMEM;
-		goto out_unresv;
-	}
-
-	ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx);
-	if (ret)
-		goto out_unresv;
-
-	if (out_fence_fd >= 0) {
-		sync_file = sync_file_create(&out_fence->f);
-		if (!sync_file) {
-			dma_fence_put(&out_fence->f);
-			ret = -ENOMEM;
-			goto out_unresv;
-		}
-
-		exbuf->fence_fd = out_fence_fd;
-		fd_install(out_fence_fd, sync_file->file);
-	}
-
-	virtio_gpu_cmd_submit(vgdev, buf, exbuf->size,
-			      vfpriv->ctx_id, buflist, out_fence);
-	dma_fence_put(&out_fence->f);
-	virtio_gpu_notify(vgdev);
-	return 0;
-
-out_unresv:
-	if (buflist)
-		virtio_gpu_array_unlock_resv(buflist);
-out_memdup:
-	kvfree(buf);
-out_unused_fd:
-	kvfree(bo_handles);
-	if (buflist)
-		virtio_gpu_array_put_free(buflist);
-
-	if (out_fence_fd >= 0)
-		put_unused_fd(out_fence_fd);
-
-	return ret;
-}
-
 static int virtio_gpu_getparam_ioctl(struct drm_device *dev, void *data,
 				     struct drm_file *file)
 {
diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c b/drivers/gpu/drm/virtio/virtgpu_submit.c
new file mode 100644
index 000000000000..42c79869f192
--- /dev/null
+++ b/drivers/gpu/drm/virtio/virtgpu_submit.c
@@ -0,0 +1,302 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright (C) 2015 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * Authors:
+ *    Dave Airlie
+ *    Alon Levy
+ */
+
+#include <linux/dma-fence-unwrap.h>
+#include <linux/file.h>
+#include <linux/sync_file.h>
+#include <linux/uaccess.h>
+
+#include <drm/drm_file.h>
+#include <drm/virtgpu_drm.h>
+
+#include "virtgpu_drv.h"
+
+struct virtio_gpu_submit {
+	struct virtio_gpu_object_array *buflist;
+	struct drm_virtgpu_execbuffer *exbuf;
+	struct virtio_gpu_fence *out_fence;
+	struct virtio_gpu_fpriv *vfpriv;
+	struct virtio_gpu_device *vgdev;
+	struct drm_file *file;
+	uint64_t fence_ctx;
+	uint32_t ring_idx;
+	int out_fence_fd;
+	void *buf;
+};
+
+static int virtio_gpu_do_fence_wait(struct virtio_gpu_submit *submit,
+				    struct dma_fence *in_fence)
+{
+	uint32_t context = submit->fence_ctx + submit->ring_idx;
+
+	if (dma_fence_match_context(in_fence, context))
+		return 0;
+
+	return dma_fence_wait(in_fence, true);
+}
+
+static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit,
+				     struct dma_fence *fence)
+{
+	struct dma_fence_unwrap itr;
+	struct dma_fence *f;
+	int err;
+
+	dma_fence_unwrap_for_each(f, &itr, fence) {
+		err = virtio_gpu_do_fence_wait(submit, f);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+static int virtio_gpu_fence_event_create(struct drm_device *dev,
+					 struct drm_file *file,
+					 struct virtio_gpu_fence *fence,
+					 uint32_t ring_idx)
+{
+	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
+	struct virtio_gpu_fence_event *e = NULL;
+	int ret;
+
+	if (!(vfpriv->ring_idx_mask & BIT_ULL(ring_idx)))
+		return 0;
+
+	e = kzalloc(sizeof(*e), GFP_KERNEL);
+	if (!e)
+		return -ENOMEM;
+
+	e->event.type = VIRTGPU_EVENT_FENCE_SIGNALED;
+	e->event.length = sizeof(e->event);
+
+	ret = drm_event_reserve_init(dev, file, &e->base, &e->event);
+	if (ret) {
+		kfree(e);
+		return ret;
+	}
+
+	fence->e = e;
+
+	return 0;
+}
+
+static int virtio_gpu_init_submit_buflist(struct virtio_gpu_submit *submit)
+{
+	struct drm_virtgpu_execbuffer *exbuf = submit->exbuf;
+	uint32_t *bo_handles;
+
+	if (!exbuf->num_bo_handles)
+		return 0;
+
+	bo_handles = kvmalloc_array(exbuf->num_bo_handles, sizeof(*bo_handles),
+				    GFP_KERNEL);
+	if (!bo_handles)
+		return -ENOMEM;
+
+	if (copy_from_user(bo_handles, u64_to_user_ptr(exbuf->bo_handles),
+			   exbuf->num_bo_handles * sizeof(*bo_handles))) {
+		kvfree(bo_handles);
+		return -EFAULT;
+	}
+
+	submit->buflist = virtio_gpu_array_from_handles(submit->file, bo_handles,
+							exbuf->num_bo_handles);
+	if (!submit->buflist) {
+		kvfree(bo_handles);
+		return -ENOENT;
+	}
+
+	kvfree(bo_handles);
+
+	return 0;
+}
+
+static void virtio_gpu_cleanup_submit(struct virtio_gpu_submit *submit)
+{
+	if (!IS_ERR(submit->buf))
+		kvfree(submit->buf);
+
+	if (submit->buflist)
+		virtio_gpu_array_put_free(submit->buflist);
+
+	if (submit->out_fence_fd >= 0)
+		put_unused_fd(submit->out_fence_fd);
+
+	if (submit->out_fence)
+		dma_fence_put(&submit->out_fence->f);
+}
+
+static void virtio_gpu_submit(struct virtio_gpu_submit *submit)
+{
+	virtio_gpu_cmd_submit(submit->vgdev, submit->buf, submit->exbuf->size,
+			      submit->vfpriv->ctx_id, submit->buflist,
+			      submit->out_fence);
+	virtio_gpu_notify(submit->vgdev);
+}
+
+static void virtio_gpu_complete_submit(struct virtio_gpu_submit *submit)
+{
+	submit->buf = NULL;
+	submit->buflist = NULL;
+	submit->out_fence = NULL;
+	submit->out_fence_fd = -1;
+}
+
+static int virtio_gpu_init_submit(struct virtio_gpu_submit *submit,
+				  struct drm_virtgpu_execbuffer *exbuf,
+				  struct drm_device *dev,
+				  struct drm_file *file,
+				  uint64_t fence_ctx, uint32_t ring_idx)
+{
+	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
+	struct virtio_gpu_device *vgdev = dev->dev_private;
+	struct virtio_gpu_fence *out_fence;
+	int err;
+
+	memset(submit, 0, sizeof(*submit));
+
+	out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx);
+	if (!out_fence)
+		return -ENOMEM;
+
+	err = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx);
+	if (err) {
+		dma_fence_put(&out_fence->f);
+		return err;
+	}
+
+	submit->out_fence = out_fence;
+	submit->fence_ctx = fence_ctx;
+	submit->ring_idx = ring_idx;
+	submit->out_fence_fd = -1;
+	submit->vfpriv = vfpriv;
+	submit->vgdev = vgdev;
+	submit->exbuf = exbuf;
+	submit->file = file;
+
+	err = virtio_gpu_init_submit_buflist(submit);
+	if (err)
+		return err;
+
+	submit->buf = vmemdup_user(u64_to_user_ptr(exbuf->command), exbuf->size);
+	if (IS_ERR(submit->buf))
+		return PTR_ERR(submit->buf);
+
+	if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_OUT) {
+		err = get_unused_fd_flags(O_CLOEXEC);
+		if (err < 0)
+			return err;
+
+		submit->out_fence_fd = err;
+	}
+
+	return 0;
+}
+
+static int virtio_gpu_wait_in_fence(struct virtio_gpu_submit *submit)
+{
+	int ret = 0;
+
+	if (submit->exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_IN) {
+		struct dma_fence *in_fence =
+				sync_file_get_fence(submit->exbuf->fence_fd);
+		if (!in_fence)
+			return -EINVAL;
+
+		/*
+		 * Wait if the fence is from a foreign context, or if the fence
+		 * array contains any fence from a foreign context.
+		 */
+		ret = virtio_gpu_dma_fence_wait(submit, in_fence);
+
+		dma_fence_put(in_fence);
+	}
+
+	return ret;
+}
+
+static int virtio_gpu_install_out_fence_fd(struct virtio_gpu_submit *submit)
+{
+	if (submit->out_fence_fd >= 0) {
+		struct sync_file *sync_file =
+					sync_file_create(&submit->out_fence->f);
+		if (!sync_file)
+			return -ENOMEM;
+
+		submit->exbuf->fence_fd = submit->out_fence_fd;
+		fd_install(submit->out_fence_fd, sync_file->file);
+	}
+
+	return 0;
+}
+
+static int virtio_gpu_lock_buflist(struct virtio_gpu_submit *submit)
+{
+	if (submit->buflist)
+		return virtio_gpu_array_lock_resv(submit->buflist);
+
+	return 0;
+}
+
+int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
+				struct drm_file *file)
+{
+	struct virtio_gpu_device *vgdev = dev->dev_private;
+	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
+	uint64_t fence_ctx = vgdev->fence_drv.context;
+	struct drm_virtgpu_execbuffer *exbuf = data;
+	struct virtio_gpu_submit submit;
+	uint32_t ring_idx = 0;
+	int ret = -EINVAL;
+
+	if (vgdev->has_virgl_3d == false)
+		return -ENOSYS;
+
+	if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
+		return ret;
+
+	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
+		if (exbuf->ring_idx >= vfpriv->num_rings)
+			return ret;
+
+		if (!vfpriv->base_fence_ctx)
+			return ret;
+
+		fence_ctx = vfpriv->base_fence_ctx;
+		ring_idx = exbuf->ring_idx;
+	}
+
+	virtio_gpu_create_context(dev, file);
+
+	ret = virtio_gpu_init_submit(&submit, exbuf, dev, file,
+				     fence_ctx, ring_idx);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_wait_in_fence(&submit);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_install_out_fence_fd(&submit);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_lock_buflist(&submit);
+	if (ret)
+		goto cleanup;
+
+	virtio_gpu_submit(&submit);
+	virtio_gpu_complete_submit(&submit);
+cleanup:
+	virtio_gpu_cleanup_submit(&submit);
+
+	return ret;
+}
-- 
2.39.2


+{
+	if (submit->buflist)
+		return virtio_gpu_array_lock_resv(submit->buflist);
+
+	return 0;
+}
+
+int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
+				struct drm_file *file)
+{
+	struct virtio_gpu_device *vgdev = dev->dev_private;
+	struct virtio_gpu_fpriv *vfpriv = file->driver_priv;
+	uint64_t fence_ctx = vgdev->fence_drv.context;
+	struct drm_virtgpu_execbuffer *exbuf = data;
+	struct virtio_gpu_submit submit;
+	uint32_t ring_idx = 0;
+	int ret = -EINVAL;
+
+	if (vgdev->has_virgl_3d == false)
+		return -ENOSYS;
+
+	if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS))
+		return ret;
+
+	if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) {
+		if (exbuf->ring_idx >= vfpriv->num_rings)
+			return ret;
+
+		if (!vfpriv->base_fence_ctx)
+			return ret;
+
+		fence_ctx = vfpriv->base_fence_ctx;
+		ring_idx = exbuf->ring_idx;
+	}
+
+	virtio_gpu_create_context(dev, file);
+
+	ret = virtio_gpu_init_submit(&submit, exbuf, dev, file,
+				     fence_ctx, ring_idx);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_wait_in_fence(&submit);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_install_out_fence_fd(&submit);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_lock_buflist(&submit);
+	if (ret)
+		goto cleanup;
+
+	virtio_gpu_submit(&submit);
+	virtio_gpu_complete_submit(&submit);
+cleanup:
+	virtio_gpu_cleanup_submit(&submit);
+
+	return ret;
+}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 2/2] drm/virtio: Support sync objects
  2023-03-23 23:07 ` Dmitry Osipenko
@ 2023-03-23 23:07   ` Dmitry Osipenko
  -1 siblings, 0 replies; 22+ messages in thread
From: Dmitry Osipenko @ 2023-03-23 23:07 UTC (permalink / raw)
  To: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, Emil Velikov
  Cc: dri-devel, linux-kernel, kernel, virtualization

Add sync object DRM UAPI support to the VirtIO-GPU driver. It's required
to enable full-featured Vulkan fencing in the Venus and native-context
VirtIO-GPU Mesa drivers.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
---
 drivers/gpu/drm/virtio/virtgpu_drv.c    |   3 +-
 drivers/gpu/drm/virtio/virtgpu_submit.c | 219 ++++++++++++++++++++++++
 include/uapi/drm/virtgpu_drm.h          |  16 +-
 3 files changed, 236 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c
index add075681e18..a22155577152 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -176,7 +176,8 @@ static const struct drm_driver driver = {
 	 * If KMS is disabled DRIVER_MODESET and DRIVER_ATOMIC are masked
 	 * out via drm_device::driver_features:
 	 */
-	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC,
+	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC |
+			   DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE,
 	.open = virtio_gpu_driver_open,
 	.postclose = virtio_gpu_driver_postclose,
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c b/drivers/gpu/drm/virtio/virtgpu_submit.c
index 42c79869f192..a18b21f9d07a 100644
--- a/drivers/gpu/drm/virtio/virtgpu_submit.c
+++ b/drivers/gpu/drm/virtio/virtgpu_submit.c
@@ -14,11 +14,24 @@
 #include <linux/uaccess.h>
 
 #include <drm/drm_file.h>
+#include <drm/drm_syncobj.h>
 #include <drm/virtgpu_drm.h>
 
 #include "virtgpu_drv.h"
 
+struct virtio_gpu_submit_post_dep {
+	struct drm_syncobj *syncobj;
+	struct dma_fence_chain *chain;
+	uint64_t point;
+};
+
 struct virtio_gpu_submit {
+	struct virtio_gpu_submit_post_dep *post_deps;
+	unsigned int num_out_syncobjs;
+
+	struct drm_syncobj **in_syncobjs;
+	unsigned int num_in_syncobjs;
+
 	struct virtio_gpu_object_array *buflist;
 	struct drm_virtgpu_execbuffer *exbuf;
 	struct virtio_gpu_fence *out_fence;
@@ -58,6 +71,189 @@ static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit,
 	return 0;
 }
 
+static void virtio_gpu_free_syncobjs(struct drm_syncobj **syncobjs,
+				     uint32_t nr_syncobjs)
+{
+	uint32_t i = nr_syncobjs;
+
+	while (i--) {
+		if (syncobjs[i])
+			drm_syncobj_put(syncobjs[i]);
+	}
+
+	kvfree(syncobjs);
+}
+
+static int
+virtio_gpu_parse_deps(struct virtio_gpu_submit *submit)
+{
+	struct drm_virtgpu_execbuffer *exbuf = submit->exbuf;
+	struct drm_virtgpu_execbuffer_syncobj syncobj_desc;
+	size_t syncobj_stride = exbuf->syncobj_stride;
+	struct drm_syncobj **syncobjs;
+	int ret = 0, i;
+
+	if (!submit->num_in_syncobjs)
+		return 0;
+
+	syncobjs = kvcalloc(submit->num_in_syncobjs, sizeof(*syncobjs),
+			    GFP_KERNEL);
+	if (!syncobjs)
+		return -ENOMEM;
+
+	for (i = 0; i < submit->num_in_syncobjs; i++) {
+		uint64_t address = exbuf->in_syncobjs + i * syncobj_stride;
+		struct dma_fence *fence;
+
+		if (copy_from_user(&syncobj_desc,
+				   u64_to_user_ptr(address),
+				   min(syncobj_stride, sizeof(syncobj_desc)))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		if (syncobj_desc.flags & ~VIRTGPU_EXECBUF_SYNCOBJ_FLAGS) {
+			ret = -EINVAL;
+			break;
+		}
+
+		ret = drm_syncobj_find_fence(submit->file, syncobj_desc.handle,
+					     syncobj_desc.point, 0, &fence);
+		if (ret)
+			break;
+
+		ret = virtio_gpu_dma_fence_wait(submit, fence);
+
+		dma_fence_put(fence);
+		if (ret)
+			break;
+
+		if (syncobj_desc.flags & VIRTGPU_EXECBUF_SYNCOBJ_RESET) {
+			syncobjs[i] =
+				drm_syncobj_find(submit->file, syncobj_desc.handle);
+			if (!syncobjs[i]) {
+				ret = -EINVAL;
+				break;
+			}
+		}
+	}
+
+	if (ret) {
+		virtio_gpu_free_syncobjs(syncobjs, i);
+		return ret;
+	}
+
+	submit->in_syncobjs = syncobjs;
+
+	return ret;
+}
+
+static void virtio_gpu_reset_syncobjs(struct drm_syncobj **syncobjs,
+				      uint32_t nr_syncobjs)
+{
+	uint32_t i;
+
+	for (i = 0; i < nr_syncobjs; i++) {
+		if (syncobjs[i])
+			drm_syncobj_replace_fence(syncobjs[i], NULL);
+	}
+}
+
+static void
+virtio_gpu_free_post_deps(struct virtio_gpu_submit_post_dep *post_deps,
+			  uint32_t nr_syncobjs)
+{
+	uint32_t i = nr_syncobjs;
+
+	while (i--) {
+		kfree(post_deps[i].chain);
+		drm_syncobj_put(post_deps[i].syncobj);
+	}
+
+	kvfree(post_deps);
+}
+
+static int virtio_gpu_parse_post_deps(struct virtio_gpu_submit *submit)
+{
+	struct drm_virtgpu_execbuffer *exbuf = submit->exbuf;
+	struct drm_virtgpu_execbuffer_syncobj syncobj_desc;
+	struct virtio_gpu_submit_post_dep *post_deps;
+	size_t syncobj_stride = exbuf->syncobj_stride;
+	int ret = 0, i;
+
+	if (!submit->num_out_syncobjs)
+		return 0;
+
+	post_deps = kvcalloc(submit->num_out_syncobjs, sizeof(*post_deps),
+			     GFP_KERNEL);
+	if (!post_deps)
+		return -ENOMEM;
+
+	for (i = 0; i < submit->num_out_syncobjs; i++) {
+		uint64_t address = exbuf->out_syncobjs + i * syncobj_stride;
+
+		if (copy_from_user(&syncobj_desc,
+				   u64_to_user_ptr(address),
+				   min(syncobj_stride, sizeof(syncobj_desc)))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		post_deps[i].point = syncobj_desc.point;
+
+		if (syncobj_desc.flags) {
+			ret = -EINVAL;
+			break;
+		}
+
+		if (syncobj_desc.point) {
+			post_deps[i].chain = dma_fence_chain_alloc();
+			if (!post_deps[i].chain) {
+				ret = -ENOMEM;
+				break;
+			}
+		}
+
+		post_deps[i].syncobj =
+			drm_syncobj_find(submit->file, syncobj_desc.handle);
+		if (!post_deps[i].syncobj) {
+			ret = -EINVAL;
+			break;
+		}
+	}
+
+	if (ret) {
+		virtio_gpu_free_post_deps(post_deps, i);
+		return ret;
+	}
+
+	submit->post_deps = post_deps;
+
+	return 0;
+}
+
+static void
+virtio_gpu_process_post_deps(struct virtio_gpu_submit *submit)
+{
+	struct virtio_gpu_submit_post_dep *post_deps = submit->post_deps;
+	struct dma_fence *fence = &submit->out_fence->f;
+	uint32_t i;
+
+	if (!post_deps)
+		return;
+
+	for (i = 0; i < submit->num_out_syncobjs; i++) {
+		if (post_deps[i].chain) {
+			drm_syncobj_add_point(post_deps[i].syncobj,
+					      post_deps[i].chain,
+					      fence, post_deps[i].point);
+			post_deps[i].chain = NULL;
+		} else {
+			drm_syncobj_replace_fence(post_deps[i].syncobj, fence);
+		}
+	}
+}
+
 static int virtio_gpu_fence_event_create(struct drm_device *dev,
 					 struct drm_file *file,
 					 struct virtio_gpu_fence *fence,
@@ -121,6 +317,18 @@ static int virtio_gpu_init_submit_buflist(struct virtio_gpu_submit *submit)
 
 static void virtio_gpu_cleanup_submit(struct virtio_gpu_submit *submit)
 {
+	if (submit->in_syncobjs) {
+		virtio_gpu_reset_syncobjs(submit->in_syncobjs,
+					  submit->num_in_syncobjs);
+
+		virtio_gpu_free_syncobjs(submit->in_syncobjs,
+					 submit->num_in_syncobjs);
+	}
+
+	if (submit->post_deps)
+		virtio_gpu_free_post_deps(submit->post_deps,
+					  submit->num_out_syncobjs);
+
 	if (!IS_ERR(submit->buf))
 		kvfree(submit->buf);
 
@@ -173,6 +381,8 @@ static int virtio_gpu_init_submit(struct virtio_gpu_submit *submit,
 		return err;
 	}
 
+	submit->num_out_syncobjs = exbuf->num_out_syncobjs;
+	submit->num_in_syncobjs = exbuf->num_in_syncobjs;
 	submit->out_fence = out_fence;
 	submit->fence_ctx = fence_ctx;
 	submit->ring_idx = ring_idx;
@@ -285,6 +495,14 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto cleanup;
 
+	ret = virtio_gpu_parse_deps(&submit);
+	if (ret)
+		goto cleanup;
+
+	ret = virtio_gpu_parse_post_deps(&submit);
+	if (ret)
+		goto cleanup;
+
 	ret = virtio_gpu_install_out_fence_fd(&submit);
 	if (ret)
 		goto cleanup;
@@ -294,6 +512,7 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
 		goto cleanup;
 
 	virtio_gpu_submit(&submit);
+	virtio_gpu_process_post_deps(&submit);
 	virtio_gpu_complete_submit(&submit);
 cleanup:
 	virtio_gpu_cleanup_submit(&submit);
diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
index 7b158fcb02b4..ce4948aacafd 100644
--- a/include/uapi/drm/virtgpu_drm.h
+++ b/include/uapi/drm/virtgpu_drm.h
@@ -64,6 +64,16 @@ struct drm_virtgpu_map {
 	__u32 pad;
 };
 
+#define VIRTGPU_EXECBUF_SYNCOBJ_RESET		0x01
+#define VIRTGPU_EXECBUF_SYNCOBJ_FLAGS ( \
+		VIRTGPU_EXECBUF_SYNCOBJ_RESET | \
+		0)
+struct drm_virtgpu_execbuffer_syncobj {
+	__u32 handle;
+	__u32 flags;
+	__u64 point;
+};
+
 /* fence_fd is modified on success if VIRTGPU_EXECBUF_FENCE_FD_OUT flag is set. */
 struct drm_virtgpu_execbuffer {
 	__u32 flags;
@@ -73,7 +83,11 @@ struct drm_virtgpu_execbuffer {
 	__u32 num_bo_handles;
 	__s32 fence_fd; /* in/out fence fd (see VIRTGPU_EXECBUF_FENCE_FD_IN/OUT) */
 	__u32 ring_idx; /* command ring index (see VIRTGPU_EXECBUF_RING_IDX) */
-	__u32 pad;
+	__u32 syncobj_stride;
+	__u32 num_in_syncobjs;
+	__u32 num_out_syncobjs;
+	__u64 in_syncobjs;
+	__u64 out_syncobjs;
 };
 
 #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 1/2] drm/virtio: Refactor job submission code path
  2023-03-23 23:07   ` Dmitry Osipenko
@ 2023-03-30 15:32     ` Emil Velikov
  -1 siblings, 0 replies; 22+ messages in thread
From: Emil Velikov @ 2023-03-30 15:32 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, dri-devel, linux-kernel, kernel,
	virtualization

Hey Dmitry,

On 2023/03/24, Dmitry Osipenko wrote:
> Move virtio_gpu_execbuffer_ioctl() into separate virtgpu_submit.c file
> and refactor the code along the way to ease addition of new features to
> the ioctl.
> 

At a glance, we have a handful of no-op changes as well as some functional
ones - let's split those up into separate patches.

> Reviewed-by: Rob Clark <robdclark@gmail.com>
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> ---


> +static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit,
> +				     struct dma_fence *fence)
> +{
> +	struct dma_fence_unwrap itr;
> +	struct dma_fence *f;
> +	int err;
> +
> +	dma_fence_unwrap_for_each(f, &itr, fence) {

The dma_fence_unwrap_for_each() change should be a separate patch,
highlighting why we want it.

> +		err = virtio_gpu_do_fence_wait(submit, f);
> +		if (err)
> +			return err;
> +	}
> +
> +	return 0;
> +}
> +


> +	ret = virtio_gpu_init_submit(&submit, exbuf, dev, file,
> +				     fence_ctx, ring_idx);
> +	if (ret)
> +		goto cleanup;
> +
> +	ret = virtio_gpu_wait_in_fence(&submit);
> +	if (ret)
> +		goto cleanup;
> +

We have reshuffled the order around in_fence waiting, out_fence install,
handles, cmdbuf, drm events, etc. Can we get that split up a bit, with
some comments?

If it were me, I would keep the wait_in_fence early and inline
virtio_gpu_init_submit (the nesting/abstraction seems a bit much). This
means one can omit virtio_gpu_submit::exbuf altogether.


HTH
Emil

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/2] drm/virtio: Support sync objects
  2023-03-23 23:07   ` Dmitry Osipenko
@ 2023-03-30 17:24     ` Emil Velikov
  -1 siblings, 0 replies; 22+ messages in thread
From: Emil Velikov @ 2023-03-30 17:24 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, dri-devel, linux-kernel, kernel,
	virtualization

Hi Dmitry,

Have you considered creating a few DRM helpers for this functionality?

AFAICT this is the third driver which supports syncobj timelines and
looking at one of the implementations ... it is not great. Note that
this suggestion is _not_ a blocker.

On 2023/03/24, Dmitry Osipenko wrote:
> Add sync object DRM UAPI support to the VirtIO-GPU driver. It's required
> to enable full-featured Vulkan fencing in the Venus and native-context
> VirtIO-GPU Mesa drivers.
> 
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> ---

> +static int
> +virtio_gpu_parse_deps(struct virtio_gpu_submit *submit)
> +{
> +	struct drm_virtgpu_execbuffer *exbuf = submit->exbuf;
> +	struct drm_virtgpu_execbuffer_syncobj syncobj_desc;
> +	size_t syncobj_stride = exbuf->syncobj_stride;
> +	struct drm_syncobj **syncobjs;
> +	int ret = 0, i;
> +
> +	if (!submit->num_in_syncobjs)
> +		return 0;
> +
> +	syncobjs = kvcalloc(submit->num_in_syncobjs, sizeof(*syncobjs),
> +			    GFP_KERNEL);

Please add an inline note about the decision behind the allocators used,
both here and in the parse_post_deps below. IIRC there was some nice
discussion between you and Rob in earlier revisions.


> +	if (!syncobjs)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < submit->num_in_syncobjs; i++) {
> +		uint64_t address = exbuf->in_syncobjs + i * syncobj_stride;
> +		struct dma_fence *fence;
> +
> +		if (copy_from_user(&syncobj_desc,
> +				   u64_to_user_ptr(address),
> +				   min(syncobj_stride, sizeof(syncobj_desc)))) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (syncobj_desc.flags & ~VIRTGPU_EXECBUF_SYNCOBJ_FLAGS) {
> +			ret = -EINVAL;
> +			break;
> +		}
> +
> +		ret = drm_syncobj_find_fence(submit->file, syncobj_desc.handle,
> +					     syncobj_desc.point, 0, &fence);
> +		if (ret)
> +			break;
> +

> +		ret = virtio_gpu_dma_fence_wait(submit, fence);
> +
> +		dma_fence_put(fence);
> +		if (ret)
> +			break;

If going the DRM helpers route:

The above two are effectively the only variance across vendors - a
simple function pointer as an argument should suffice. Might want to use internal
flags, but that's also trivial.

> +	submit->in_syncobjs = syncobjs;
> +
> +	return ret;
> +}
> +
> +static void virtio_gpu_reset_syncobjs(struct drm_syncobj **syncobjs,
> +				      uint32_t nr_syncobjs)
> +{
> +	uint32_t i;
> +
> +	for (i = 0; i < nr_syncobjs; i++) {
> +		if (syncobjs[i])
> +			drm_syncobj_replace_fence(syncobjs[i], NULL);

Side note: the drm_syncobj_put() called immediately after also resets
the fence internally. Although, reading the docs, I'm not sure whether
relying on that is a wise move.

Just thought I'd point it out.


>  
> +	ret = virtio_gpu_parse_deps(&submit);
> +	if (ret)
> +		goto cleanup;
> +
> +	ret = virtio_gpu_parse_post_deps(&submit);
> +	if (ret)
> +		goto cleanup;
> +

I think we should zero num_(in|out)_syncobjs when the respective parse
fails. Otherwise we get one "cleanup" within the parse function itself
and a second during the cleanup_submit. Haven't looked at it too closely,
but I suspect that will trigger a UAF or two.

>  	ret = virtio_gpu_install_out_fence_fd(&submit);
>  	if (ret)
>  		goto cleanup;
> @@ -294,6 +512,7 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>  		goto cleanup;
>  
>  	virtio_gpu_submit(&submit);
> +	virtio_gpu_process_post_deps(&submit);

Any particular reason why virtio_gpu_reset_syncobjs() is deferred to
virtio_gpu_cleanup_submit()? Having it just above the process_post_deps
(similar to msm) allows the reader to get closure about the in syncobjs.

This is just personal preference, so don't read too much into it.

HTH
Emil

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/2] drm/virtio: Support sync objects
@ 2023-03-30 17:24     ` Emil Velikov
  0 siblings, 0 replies; 22+ messages in thread
From: Emil Velikov @ 2023-03-30 17:24 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Pierre-Eric Pelloux-Prayer, Marek Olšák, linux-kernel,
	dri-devel, Gurchetan Singh, Gerd Hoffmann, David Airlie, kernel,
	virtualization

Hi Dmitry,

Have you considered creating a few DRM helpers for this functionality?

AFAICT this is the third driver which supports syncobj timelines and
looking at one of the implementations ... it is not great. Note that
this suggestion is _not_ a blocker.

On 2023/03/24, Dmitry Osipenko wrote:
> Add sync object DRM UAPI support to VirtIO-GPU driver. It's required
> for enabling a full-featured Vulkan fencing by Venus and native context
> VirtIO-GPU Mesa drivers.
> 
> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
> ---

> +static int
> +virtio_gpu_parse_deps(struct virtio_gpu_submit *submit)
> +{
> +	struct drm_virtgpu_execbuffer *exbuf = submit->exbuf;
> +	struct drm_virtgpu_execbuffer_syncobj syncobj_desc;
> +	size_t syncobj_stride = exbuf->syncobj_stride;
> +	struct drm_syncobj **syncobjs;
> +	int ret = 0, i;
> +
> +	if (!submit->num_in_syncobjs)
> +		return 0;
> +
> +	syncobjs = kvcalloc(submit->num_in_syncobjs, sizeof(*syncobjs),
> +			    GFP_KERNEL);

Please add an inline note about the decision behind the allocators used,
both here and in the parse_post_deps below. IIRC there was some nice
discussion between you and Rob in earlier revisions.


> +	if (!syncobjs)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < submit->num_in_syncobjs; i++) {
> +		uint64_t address = exbuf->in_syncobjs + i * syncobj_stride;
> +		struct dma_fence *fence;
> +
> +		if (copy_from_user(&syncobj_desc,
> +				   u64_to_user_ptr(address),
> +				   min(syncobj_stride, sizeof(syncobj_desc)))) {
> +			ret = -EFAULT;
> +			break;
> +		}
> +
> +		if (syncobj_desc.flags & ~VIRTGPU_EXECBUF_SYNCOBJ_FLAGS) {
> +			ret = -EINVAL;
> +			break;
> +		}
> +
> +		ret = drm_syncobj_find_fence(submit->file, syncobj_desc.handle,
> +					     syncobj_desc.point, 0, &fence);
> +		if (ret)
> +			break;
> +

> +		ret = virtio_gpu_dma_fence_wait(submit, fence);
> +
> +		dma_fence_put(fence);
> +		if (ret)
> +			break;

If going the DRM helpers route:

The above two are effectively the only variance across vendors - a
simple function pointer as an arg should suffice. Might want to use
internal flags, but that's also trivial.
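A shared helper along those lines could look roughly like this - a standalone sketch with hypothetical names and stubbed types, not the actual DRM API: the common validate loop lives in the helper, and the driver-specific fence lookup/wait is passed in as a function pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for drm_virtgpu_execbuffer_syncobj; real fields differ. */
struct syncobj_desc {
	uint32_t handle;
	uint32_t flags;
	uint64_t point;
};

/* Driver-specific part: look up the fence behind the descriptor and wait. */
typedef int (*syncobj_wait_cb)(void *priv, const struct syncobj_desc *desc);

/* Common loop a DRM helper could own: validate flags, delegate the wait. */
static int parse_in_syncobjs(const struct syncobj_desc *descs, size_t count,
			     uint32_t valid_flags, syncobj_wait_cb wait,
			     void *priv)
{
	size_t i;
	int ret;

	for (i = 0; i < count; i++) {
		if (descs[i].flags & ~valid_flags)
			return -EINVAL;		/* reject unknown flags */
		ret = wait(priv, &descs[i]);	/* per-driver variance */
		if (ret)
			return ret;
	}
	return 0;
}

/* Trivial callback used only for illustration. */
static int noop_wait(void *priv, const struct syncobj_desc *desc)
{
	(void)priv;
	(void)desc;
	return 0;
}
```

The copy_from_user()/stride handling could live in the same helper; only the wait callback (and possibly some internal flags) would differ per driver.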

> +	submit->in_syncobjs = syncobjs;
> +
> +	return ret;
> +}
> +
> +static void virtio_gpu_reset_syncobjs(struct drm_syncobj **syncobjs,
> +				      uint32_t nr_syncobjs)
> +{
> +	uint32_t i;
> +
> +	for (i = 0; i < nr_syncobjs; i++) {
> +		if (syncobjs[i])
> +			drm_syncobj_replace_fence(syncobjs[i], NULL);

Side note: the drm_syncobj_put() called immediately after also calls
replace/reset fence internally. Although reading from the docs, I'm not
sure if relying on that is a wise move.

Just thought I'd point it out.


>  
> +	ret = virtio_gpu_parse_deps(&submit);
> +	if (ret)
> +		goto cleanup;
> +
> +	ret = virtio_gpu_parse_post_deps(&submit);
> +	if (ret)
> +		goto cleanup;
> +

I think we should zero num_(in|out)_syncobjs when the respective parse
fails. Otherwise we get one "cleanup" within the parse function itself
and a second during the cleanup_submit. Haven't looked at it too closely
but I suspect that will trigger an UAF or two.
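For illustration only, here is a minimal userspace model of the hazard and the suggested fix (all names hypothetical, syncobjs stubbed with malloc/free): the parse function releases everything it took on failure and zeroes its count, so the later cleanup pass has nothing left to double-handle.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative stand-in for drm_syncobj. */
struct syncobj {
	int dummy;
};

struct submit {
	struct syncobj **in_syncobjs;	/* stays NULL until parse succeeds */
	uint32_t num_in_syncobjs;
};

/* On failure partway through, release what was taken, zero the count, and
 * never publish the partially filled array to the submit struct. */
static int parse_deps(struct submit *s, int fail_at)
{
	struct syncobj **objs;
	uint32_t i;

	if (!s->num_in_syncobjs)
		return 0;

	objs = calloc(s->num_in_syncobjs, sizeof(*objs));
	if (!objs)
		return -ENOMEM;

	for (i = 0; i < s->num_in_syncobjs; i++) {
		if ((int)i == fail_at) {	/* simulated lookup failure */
			while (i--)
				free(objs[i]);
			free(objs);
			s->num_in_syncobjs = 0;	/* the suggested zeroing */
			return -EINVAL;
		}
		objs[i] = calloc(1, sizeof(**objs));
	}

	s->in_syncobjs = objs;
	return 0;
}

/* Cleanup now iterates zero entries instead of touching a NULL array. */
static void cleanup_submit(struct submit *s)
{
	uint32_t i;

	for (i = 0; i < s->num_in_syncobjs; i++)
		free(s->in_syncobjs[i]);
	free(s->in_syncobjs);
}
```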

>  	ret = virtio_gpu_install_out_fence_fd(&submit);
>  	if (ret)
>  		goto cleanup;
> @@ -294,6 +512,7 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>  		goto cleanup;
>  
>  	virtio_gpu_submit(&submit);
> +	virtio_gpu_process_post_deps(&submit);

Any particular reason why the virtio_gpu_reset_syncobjs is deferred to
virtio_gpu_cleanup_submit(). Having it just above the process_post_deps
(similar to msm) allows the reader to get closure about the in syncobjs.

This is just personal preference, so don't read too much into it.

HTH
Emil


* Re: [PATCH v4 2/2] drm/virtio: Support sync objects
  2023-03-30 17:24     ` Emil Velikov
@ 2023-04-02 17:45       ` Dmitry Osipenko
  -1 siblings, 0 replies; 22+ messages in thread
From: Dmitry Osipenko @ 2023-04-02 17:45 UTC (permalink / raw)
  To: Emil Velikov
  Cc: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, dri-devel, linux-kernel, kernel,
	virtualization

On 3/30/23 20:24, Emil Velikov wrote:
> Hi Dmitry,
> 
> Have you considered creating a few DRM helpers for this functionality?
> 
> AFAICT this is the third driver which supports syncobj timelines and
> looking at one of the implementations ... it is not great. Note that
> this suggestion is _not_ a blocker.

I'd like to see a third driver actually using the exact same
drm_execbuffer_syncobj struct first, because the UABI part isn't generic,
though it's a replica of the MSM driver for now.

The virtio-gpu driver is only at the beginning of using sync objects,
compared to the MSM driver. It will be better to defer the generalization
until virtio-gpu becomes more mature, maybe a year after virtio userspace
starts using sync objects, IMO.

...
>> +static void virtio_gpu_reset_syncobjs(struct drm_syncobj **syncobjs,
>> +				      uint32_t nr_syncobjs)
>> +{
>> +	uint32_t i;
>> +
>> +	for (i = 0; i < nr_syncobjs; i++) {
>> +		if (syncobjs[i])
>> +			drm_syncobj_replace_fence(syncobjs[i], NULL);
> 
> Side note: the drm_syncobj_put() called immediately after also calls
> replace/reset fence internally. Although reading from the docs, I'm not
> sure if relying on that is a wise move.
> 
> Just thought I'd point it out.

The drm_syncobj_put() doesn't call replace/reset fence until the syncobj is
freed. We drop the old fence for an active/alive in-syncobj here after
handling the fence-wait; this makes the syncobj reusable, otherwise
userspace would have to re-create syncobjs after each submission.

>>  
>> +	ret = virtio_gpu_parse_deps(&submit);
>> +	if (ret)
>> +		goto cleanup;
>> +
>> +	ret = virtio_gpu_parse_post_deps(&submit);
>> +	if (ret)
>> +		goto cleanup;
>> +
> 
> I think we should zero num_(in|out)_syncobjs when the respective parse
> fails. Otherwise we get one "cleanup" within the parse function itself
> and a second during the cleanup_submit. Haven't looked at it too closely
> but I suspect that will trigger an UAF or two.

There are checks for NULL pointers in the code that will prevent the
UAF. I'll add zeroing of the nums for more consistency.

>>  	ret = virtio_gpu_install_out_fence_fd(&submit);
>>  	if (ret)
>>  		goto cleanup;
>> @@ -294,6 +512,7 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
>>  		goto cleanup;
>>  
>>  	virtio_gpu_submit(&submit);
>> +	virtio_gpu_process_post_deps(&submit);
> 
> Any particular reason why the virtio_gpu_reset_syncobjs is deferred to
> virtio_gpu_cleanup_submit(). Having it just above the process_post_deps
> (similar to msm) allows the reader to get closure about the in syncobjs.
> 
> This is just personal preference, so don't read too much into it.

The job submission path should be as short as possible in general.
Technically, virtio_gpu_process_post_deps() should be fast, but since
I'm not 100% sure about all the corner cases, it's better to hold off
until the job is sent out.

Thank you very much for the review! I'll address the rest of comments in v5.

-- 
Best regards,
Dmitry




* Re: [PATCH v4 2/2] drm/virtio: Support sync objects
  2023-04-02 17:45       ` Dmitry Osipenko
  (?)
@ 2023-04-03 13:00         ` Emil Velikov
  -1 siblings, 0 replies; 22+ messages in thread
From: Emil Velikov @ 2023-04-03 13:00 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Emil Velikov, Pierre-Eric Pelloux-Prayer, Marek Olšák,
	linux-kernel, dri-devel, Gurchetan Singh, Gerd Hoffmann,
	David Airlie, kernel, virtualization

On Sun, 2 Apr 2023 at 18:45, Dmitry Osipenko
<dmitry.osipenko@collabora.com> wrote:
>
> On 3/30/23 20:24, Emil Velikov wrote:
> > Hi Dmitry,
> >
> > Have you considered creating a few DRM helpers for this functionality?
> >
> > AFAICT this is the third driver which supports syncobj timelines and
> > looking at one of the implementations ... it is not great. Note that
> > this suggestion is _not_ a blocker.
>
> I'd like to see a third driver actually using the exact same
> drm_execbuffer_syncobj struct first, because the UABI part isn't generic,
> though it's a replica of the MSM driver for now.
> 
> The virtio-gpu driver is only at the beginning of using sync objects,
> compared to the MSM driver. It will be better to defer the generalization
> until virtio-gpu becomes more mature, maybe a year after virtio userspace
> starts using sync objects, IMO.
>

I wasn't talking about generic UAPI, but having drm helpers instead.
The former (as you pointed out) would need time to crystallize. While
the latter can be done even today.

> ...
> >> +static void virtio_gpu_reset_syncobjs(struct drm_syncobj **syncobjs,
> >> +                                  uint32_t nr_syncobjs)
> >> +{
> >> +    uint32_t i;
> >> +
> >> +    for (i = 0; i < nr_syncobjs; i++) {
> >> +            if (syncobjs[i])
> >> +                    drm_syncobj_replace_fence(syncobjs[i], NULL);
> >
> > Side note: the drm_syncobj_put() called immediately after also calls
> > replace/reset fence internally. Although reading from the docs, I'm not
> > sure if relying on that is a wise move.
> >
> > Just thought I'd point it out.
>
> The drm_syncobj_put() doesn't call replace/reset fence until the syncobj is
> freed. We drop the old fence for an active/alive in-syncobj here after
> handling the fence-wait; this makes the syncobj reusable, otherwise
> userspace would have to re-create syncobjs after each submission.
>

I see, thanks.

> >>
> >> +    ret = virtio_gpu_parse_deps(&submit);
> >> +    if (ret)
> >> +            goto cleanup;
> >> +
> >> +    ret = virtio_gpu_parse_post_deps(&submit);
> >> +    if (ret)
> >> +            goto cleanup;
> >> +
> >
> > I think we should zero num_(in|out)_syncobjs when the respective parse
> > fails. Otherwise we get one "cleanup" within the parse function itself
> > and a second during the cleanup_submit. Haven't looked at it too closely
> > but I suspect that will trigger an UAF or two.
>
> There are checks for NULL pointers in the code that will prevent the
> UAF.  I'll add zeroing of the nums for more consistency.
>

Riiiight the drm_syncobj is attached to the encapsulating struct
virtio_gpu_submit _only_ on success.
By clearing the num variables, the NULL checks will no longer be
needed ... in case you'd want to drop that.

Either way - even as-is the code is safe.

> >>      ret = virtio_gpu_install_out_fence_fd(&submit);
> >>      if (ret)
> >>              goto cleanup;
> >> @@ -294,6 +512,7 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data,
> >>              goto cleanup;
> >>
> >>      virtio_gpu_submit(&submit);
> >> +    virtio_gpu_process_post_deps(&submit);
> >
> > Any particular reason why the virtio_gpu_reset_syncobjs is deferred to
> > virtio_gpu_cleanup_submit(). Having it just above the process_post_deps
> > (similar to msm) allows the reader to get closure about the in syncobjs.
> >
> > This is just personal preference, so don't read too much into it.
>
> The job submission path should be as short as possible in general.
> Technically, virtio_gpu_process_post_deps() should be fast, but since
> I'm not 100% sure about all the corner cases, it's better to hold off
> until the job is sent out.
>

Ack, thanks again

-Emil



* Re: [PATCH v4 2/2] drm/virtio: Support sync objects
  2023-04-03 13:00         ` Emil Velikov
  (?)
@ 2023-04-03 13:22           ` Emil Velikov
  -1 siblings, 0 replies; 22+ messages in thread
From: Emil Velikov @ 2023-04-03 13:22 UTC (permalink / raw)
  To: Dmitry Osipenko
  Cc: Emil Velikov, Pierre-Eric Pelloux-Prayer, Marek Olšák,
	linux-kernel, dri-devel, Gurchetan Singh, Gerd Hoffmann,
	David Airlie, kernel, virtualization

On Mon, 3 Apr 2023 at 14:00, Emil Velikov <emil.l.velikov@gmail.com> wrote:

> > > I think we should zero num_(in|out)_syncobjs when the respective parse
> > > fails. Otherwise we get one "cleanup" within the parse function itself
> > > and a second during the cleanup_submit. Haven't looked at it too closely
> > > but I suspect that will trigger an UAF or two.
> >
> > There are checks for NULL pointers in the code that will prevent the
> > UAF.  I'll add zeroing of the nums for more consistency.
> >
>
> Riiiight the drm_syncobj is attached to the encapsulating struct
> virtio_gpu_submit _only_ on success.
> By clearing the num variables,  the NULL checks will no longer be
> needed ... in case you'd want to drop that.
>
> Either way - even as-is the code is safe.
>

Err or not. The NULL check itself will cause NULL pointer deref.

In more detail: in/out syncobjs are memset() to NULL in
virtio_gpu_init_submit(). The virtio_gpu_parse_(|post_)deps() will
fail and leave them unchanged. Then virtio_gpu_free_syncobjs() and
virtio_gpu_reset_syncobjs() will trigger a NULL ptr deref, because
they are accessing the elements of the (NULL) array.

Apart from the num_(in|out)_syncobjs=0, I would drop the NULL checks
- they give a false sense of security IMHO.
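As a standalone illustration of that trap (hypothetical names, plain C stand-ins for the kernel types): guarding the elements with `if (syncobjs[i])` does not help when the array pointer itself is NULL, because the subscript already dereferences it - the pointer itself (or a zeroed count) has to be checked first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for drm_syncobj. */
struct syncobj {
	int dummy;
};

static int count_live(struct syncobj **syncobjs, uint32_t n)
{
	uint32_t i, live = 0;

	/* Guard the array pointer, not just the elements; without this,
	 * syncobjs[i] below dereferences NULL when parse failed early. */
	if (!syncobjs)
		return 0;

	for (i = 0; i < n; i++)
		if (syncobjs[i])	/* element check is fine once the
					 * array itself is known valid */
			live++;
	return live;
}
```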

-Emil



* Re: [PATCH v4 2/2] drm/virtio: Support sync objects
  2023-04-03 13:22           ` Emil Velikov
@ 2023-04-03 15:07             ` Dmitry Osipenko
  -1 siblings, 0 replies; 22+ messages in thread
From: Dmitry Osipenko @ 2023-04-03 15:07 UTC (permalink / raw)
  To: Emil Velikov
  Cc: Emil Velikov, Pierre-Eric Pelloux-Prayer, Marek Olšák,
	linux-kernel, dri-devel, Gurchetan Singh, Gerd Hoffmann,
	David Airlie, kernel, virtualization

On 4/3/23 16:22, Emil Velikov wrote:
> On Mon, 3 Apr 2023 at 14:00, Emil Velikov <emil.l.velikov@gmail.com> wrote:
> 
>>>> I think we should zero num_(in|out)_syncobjs when the respective parse
>>>> fails. Otherwise we get one "cleanup" within the parse function itself
>>>> and a second during the cleanup_submit. Haven't looked at it too closely
>>>> but I suspect that will trigger an UAF or two.
>>>
>>> There are checks for NULL pointers in the code that will prevent the
>>> UAF.  I'll add zeroing of the nums for more consistency.
>>>
>>
>> Riiiight the drm_syncobj is attached to the encapsulating struct
>> virtio_gpu_submit _only_ on success.
>> By clearing the num variables,  the NULL checks will no longer be
>> needed ... in case you'd want to drop that.
>>
>> Either way - even as-is the code is safe.
>>
> 
> Err or not. The NULL check itself will cause NULL pointer deref.
> 
> In more detail: in/out syncobjs are memset() to NULL in
> virtio_gpu_init_submit(). The virtio_gpu_parse_(|post_)deps() will
> fail and leave them unchanged. Then virtio_gpu_free_syncobjs() and
> virtio_gpu_reset_syncobjs() will trigger a NULL ptr deref, because
> they are accessing the elements of the (NULL) array.
> 
> Apart from the num_(in|out)_syncobjs=0, I would drop the NULL checks
> - they give a false sense of security IMHO.

The reset/free are both under the NULL check on cleanup, so I think it
should work okay on error. Will improve it anyway to make it more
intuitive. Thanks!

static void virtio_gpu_cleanup_submit(struct virtio_gpu_submit *submit)
{
	if (submit->in_syncobjs) {
		virtio_gpu_reset_syncobjs(submit->in_syncobjs,
					  submit->num_in_syncobjs);

		virtio_gpu_free_syncobjs(submit->in_syncobjs,
					 submit->num_in_syncobjs);
	}

-- 
Best regards,
Dmitry




* Re: [PATCH v4 1/2] drm/virtio: Refactor job submission code path
  2023-03-30 15:32     ` Emil Velikov
@ 2023-04-03 19:59       ` Dmitry Osipenko
  -1 siblings, 0 replies; 22+ messages in thread
From: Dmitry Osipenko @ 2023-04-03 19:59 UTC (permalink / raw)
  To: Emil Velikov
  Cc: David Airlie, Gerd Hoffmann, Gurchetan Singh, Chia-I Wu,
	Daniel Vetter, Rob Clark, Marek Olšák,
	Pierre-Eric Pelloux-Prayer, dri-devel, linux-kernel, kernel,
	virtualization

On 3/30/23 18:32, Emil Velikov wrote:
>> +static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit,
>> +				     struct dma_fence *fence)
>> +{
>> +	struct dma_fence_unwrap itr;
>> +	struct dma_fence *f;
>> +	int err;
>> +
>> +	dma_fence_unwrap_for_each(f, &itr, fence) {
> The dma_fence_unwrap_for_each() change should be a separate patch,
> highlighting why we want it.

Good point. It is actually a potential optimization for the in-fence
waiting.
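Why unwrapping helps as an optimization can be illustrated with a toy model (the `toy_*` names are illustrative only, not the kernel's dma_fence API): instead of waiting on a composite fence as a whole, each leaf fence is inspected individually, and already-signaled leaves are skipped so only the outstanding ones cost a wait.

```c
#include <stddef.h>

struct toy_fence { int signaled; };

static int toy_wait_unwrapped(struct toy_fence **fences, size_t n,
			      int *waits_performed)
{
	*waits_performed = 0;
	for (size_t i = 0; i < n; i++) {
		if (fences[i]->signaled)
			continue;	/* nothing to wait for on this leaf */
		(*waits_performed)++;	/* a real driver would block here */
		fences[i]->signaled = 1;
	}
	return 0;
}
```

In the real code, `dma_fence_unwrap_for_each()` plays the role of the loop over leaves, walking through fence chains and arrays.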

>> +	ret = virtio_gpu_init_submit(&submit, exbuf, dev, file,
>> +				     fence_ctx, ring_idx);
>> +	if (ret)
>> +		goto cleanup;
>> +
>> +	ret = virtio_gpu_wait_in_fence(&submit);
>> +	if (ret)
>> +		goto cleanup;
>> +
> We have reshuffled the order around in_fence waiting, out_fence install,
> handles, cmdbuf, drm events, etc. Can we get that split up a bit, with
> some comments.
> 
> If it were me, I would keep the wait_in_fence early and inline
> virtio_gpu_init_submit (the nesting/abstraction seems a bit much). This
> means one can omit the virtio_gpu_submit::exbuf all together.

I tried inlining it, and that variant makes the code much less readable
to me.

The point of having wait_in_fence after submit_init is that it makes the
submit code path shorter: if we have to wait for the in-fence, then once
the fence signals there is no need to init, and we move directly to the
next submission step.

Perhaps it won't hurt to also factor out the wait_fence from parse_deps
in the second patch and do all the waits right before locking the
buflist.
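The init-before-wait ordering being defended here can be sketched as a minimal control-flow model (all names and error codes are illustrative, not the real virtio-gpu functions):

```c
enum { STEP_NONE, STEP_INIT, STEP_WAIT, STEP_SUBMIT };

static int sketch_submit(int init_err, int wait_err, int *last_step)
{
	int ret;

	*last_step = STEP_NONE;

	ret = init_err;		/* stands in for virtio_gpu_init_submit() */
	if (ret)
		goto cleanup;
	*last_step = STEP_INIT;

	ret = wait_err;		/* stands in for virtio_gpu_wait_in_fence() */
	if (ret)
		goto cleanup;
	*last_step = STEP_WAIT;

	/* init is already done, so after the fence signals we proceed
	 * directly to the next submission step */
	*last_step = STEP_SUBMIT;
	return 0;

cleanup:
	/* single cleanup label, safe for partially initialized state */
	return ret;
}
```

Because init runs before the wait, the post-wait path has nothing left to set up and falls straight through to submission.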

-- 
Best regards,
Dmitry


^ permalink raw reply	[flat|nested] 22+ messages in thread


end of thread, other threads:[~2023-04-03 19:59 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-23 23:07 [PATCH v4 0/2] Add sync object UAPI support to VirtIO-GPU driver Dmitry Osipenko
2023-03-23 23:07 ` Dmitry Osipenko
2023-03-23 23:07 ` [PATCH v4 1/2] drm/virtio: Refactor job submission code path Dmitry Osipenko
2023-03-23 23:07   ` Dmitry Osipenko
2023-03-30 15:32   ` Emil Velikov
2023-03-30 15:32     ` Emil Velikov
2023-04-03 19:59     ` Dmitry Osipenko
2023-04-03 19:59       ` Dmitry Osipenko
2023-03-23 23:07 ` [PATCH v4 2/2] drm/virtio: Support sync objects Dmitry Osipenko
2023-03-23 23:07   ` Dmitry Osipenko
2023-03-30 17:24   ` Emil Velikov
2023-03-30 17:24     ` Emil Velikov
2023-04-02 17:45     ` Dmitry Osipenko
2023-04-02 17:45       ` Dmitry Osipenko
2023-04-03 13:00       ` Emil Velikov
2023-04-03 13:00         ` Emil Velikov
2023-04-03 13:00         ` Emil Velikov
2023-04-03 13:22         ` Emil Velikov
2023-04-03 13:22           ` Emil Velikov
2023-04-03 13:22           ` Emil Velikov
2023-04-03 15:07           ` Dmitry Osipenko
2023-04-03 15:07             ` Dmitry Osipenko
