* [PATCH 00/15] implicit fencing/dma-resv rules for shared buffers
From: Daniel Vetter @ 2021-06-22 16:54 UTC (permalink / raw)
  To: DRI Development; +Cc: Daniel Vetter, Intel Graphics Development

Hi all,

After many bits have been spilled discussing this on dri-devel, I think
we're converging on a consensus understanding of where we are, and it's
time to resubmit patches.

This is essentially v2 of

https://lore.kernel.org/dri-devel/20210521090959.1663703-7-daniel.vetter@ffwll.ch/

but a lot has changed:

- Christian fixed up amdgpu with a much more competent patch.

- I used the full audit I did in that patch to improve the
  documentation instead. That's the first 3 patches.

- panfrost patches fixed (hopefully, testing would be appreciated)

- drm/tiny patch fixed

- I've also thrown an RFC on top at the end for what I think amdgpu should
  be doing. Probably really, really buggy, so beware :-)

Review on the entire pile, except the very last RFC, would be very much
appreciated.

Note that this does not, by far, fix all the various issues in handling
dma_buf.resv fences. This is just the part I had mostly ready already, and
which didn't take long to refresh and rebase. The other part is checking
whether drivers do anything funny that breaks the cross-driver contract in
how they handle the dependencies they get from the dma_buf.resv. I know they
do, but the full audit is not yet done.

Cheers, Daniel

Daniel Vetter (15):
  dma-resv: Fix kerneldoc
  dma-buf: Switch to inline kerneldoc
  dma-buf: Document dma-buf implicit fencing/resv fencing rules
  drm/panfrost: Shrink sched_lock
  drm/panfrost: Use xarray and helpers for dependency tracking
  drm/panfrost: Fix implicit sync
  drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
  drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
  drm/armada: Remove prepare/cleanup_fb hooks
  drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
  drm/omap: Follow implicit fencing in prepare_fb
  drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
  drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
  drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  RFC: drm/amdgpu: Implement a proper implicit fencing uapi

 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   7 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c       |  21 +++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   6 +
 drivers/gpu/drm/armada/armada_overlay.c       |   2 -
 drivers/gpu/drm/armada/armada_plane.c         |  29 ----
 drivers/gpu/drm/armada/armada_plane.h         |   2 -
 drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c      |   1 -
 drivers/gpu/drm/ast/ast_mode.c                |   3 +-
 drivers/gpu/drm/drm_atomic_helper.c           |  10 ++
 drivers/gpu/drm/drm_gem.c                     |   3 +
 drivers/gpu/drm/drm_gem_atomic_helper.c       |   3 +
 drivers/gpu/drm/drm_simple_kms_helper.c       |  12 +-
 drivers/gpu/drm/gud/gud_drv.c                 |   1 -
 .../gpu/drm/hisilicon/hibmc/hibmc_drm_de.c    |   3 +-
 drivers/gpu/drm/imx/dcss/dcss-plane.c         |   1 -
 drivers/gpu/drm/imx/ipuv3-plane.c             |   1 -
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c     |   1 -
 drivers/gpu/drm/ingenic/ingenic-ipu.c         |   1 -
 drivers/gpu/drm/mcde/mcde_display.c           |   1 -
 drivers/gpu/drm/mediatek/mtk_drm_plane.c      |   1 -
 drivers/gpu/drm/meson/meson_overlay.c         |   1 -
 drivers/gpu/drm/meson/meson_plane.c           |   1 -
 drivers/gpu/drm/mxsfb/mxsfb_kms.c             |   2 -
 drivers/gpu/drm/omapdrm/omap_plane.c          |   3 +
 drivers/gpu/drm/panfrost/panfrost_drv.c       |  41 +++--
 drivers/gpu/drm/panfrost/panfrost_job.c       |  71 ++++-----
 drivers/gpu/drm/panfrost/panfrost_job.h       |   8 +-
 drivers/gpu/drm/pl111/pl111_display.c         |   1 -
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c   |   1 -
 drivers/gpu/drm/stm/ltdc.c                    |   1 -
 drivers/gpu/drm/sun4i/sun4i_layer.c           |   1 -
 drivers/gpu/drm/sun4i/sun8i_ui_layer.c        |   1 -
 drivers/gpu/drm/sun4i/sun8i_vi_layer.c        |   1 -
 drivers/gpu/drm/tidss/tidss_plane.c           |   1 -
 drivers/gpu/drm/tiny/hx8357d.c                |   1 -
 drivers/gpu/drm/tiny/ili9225.c                |   1 -
 drivers/gpu/drm/tiny/ili9341.c                |   1 -
 drivers/gpu/drm/tiny/ili9486.c                |   1 -
 drivers/gpu/drm/tiny/mi0283qt.c               |   1 -
 drivers/gpu/drm/tiny/repaper.c                |   1 -
 drivers/gpu/drm/tiny/st7586.c                 |   1 -
 drivers/gpu/drm/tiny/st7735r.c                |   1 -
 drivers/gpu/drm/tve200/tve200_display.c       |   1 -
 drivers/gpu/drm/vboxvideo/vbox_mode.c         |   3 +-
 drivers/gpu/drm/xen/xen_drm_front_kms.c       |   1 -
 include/drm/drm_gem_vram_helper.h             |  12 ++
 include/drm/drm_modeset_helper_vtables.h      |   7 +-
 include/drm/drm_simple_kms_helper.h           |   7 +-
 include/linux/dma-buf.h                       | 146 +++++++++++++++---
 include/linux/dma-resv.h                      |   2 +-
 include/uapi/drm/amdgpu_drm.h                 |  10 ++
 51 files changed, 270 insertions(+), 170 deletions(-)

-- 
2.32.0.rc2


* [PATCH 01/15] dma-resv: Fix kerneldoc
From: Daniel Vetter @ 2021-06-22 16:54 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Daniel Vetter,
	Sumit Semwal, Christian König, linux-media, linaro-mm-sig

Oversight from

commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
Author: Christian König <christian.koenig@amd.com>
Date:   Mon May 10 16:14:09 2021 +0200

    dma-buf: rename and cleanup dma_resv_get_excl v3

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-resv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index 562b885cf9c3..e1ca2080a1ff 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
 }
 
 /**
- * dma_resv_exclusive - return the object's exclusive fence
+ * dma_resv_excl_fence - return the object's exclusive fence
  * @obj: the reservation object
  *
  * Returns the exclusive fence (if any). Caller must either hold the objects
-- 
2.32.0.rc2


* [PATCH 02/15] dma-buf: Switch to inline kerneldoc
From: Daniel Vetter @ 2021-06-22 16:54 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Daniel Vetter,
	Sumit Semwal, Christian König, Alex Deucher, Dave Airlie,
	Nirmoy Das, Deepak R Varma, Chen Li, Kevin Wang, linux-media,
	linaro-mm-sig

Also review & update everything while we're at it.

This is prep work to smash a ton of stuff into the kerneldoc for
@resv.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
 1 file changed, 83 insertions(+), 24 deletions(-)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 92eec38a03aa..6d18b9e448b9 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -289,28 +289,6 @@ struct dma_buf_ops {
 
 /**
  * struct dma_buf - shared buffer object
- * @size: size of the buffer; invariant over the lifetime of the buffer.
- * @file: file pointer used for sharing buffers across, and for refcounting.
- * @attachments: list of dma_buf_attachment that denotes all devices attached,
- *               protected by dma_resv lock.
- * @ops: dma_buf_ops associated with this buffer object.
- * @lock: used internally to serialize list manipulation, attach/detach and
- *        vmap/unmap
- * @vmapping_counter: used internally to refcnt the vmaps
- * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
- * @exp_name: name of the exporter; useful for debugging.
- * @name: userspace-provided name; useful for accounting and debugging,
- *        protected by @resv.
- * @name_lock: spinlock to protect name access
- * @owner: pointer to exporter module; used for refcounting when exporter is a
- *         kernel module.
- * @list_node: node for dma_buf accounting and debugging.
- * @priv: exporter specific private data for this buffer object.
- * @resv: reservation object linked to this dma-buf
- * @poll: for userspace poll support
- * @cb_excl: for userspace poll support
- * @cb_shared: for userspace poll support
- * @sysfs_entry: for exposing information about this buffer in sysfs.
  * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
  * and is incremented on each attach.
  *
@@ -324,24 +302,100 @@ struct dma_buf_ops {
  * Device DMA access is handled by the separate &struct dma_buf_attachment.
  */
 struct dma_buf {
+	/**
+	 * @size:
+	 *
+	 * Size of the buffer; invariant over the lifetime of the buffer.
+	 */
 	size_t size;
+
+	/**
+	 * @file:
+	 *
+	 * File pointer used for sharing buffers across, and for refcounting.
+	 * See dma_buf_get() and dma_buf_put().
+	 */
 	struct file *file;
+
+	/**
+	 * @attachments:
+	 *
+	 * List of dma_buf_attachment that denotes all devices attached,
+	 * protected by &dma_resv lock @resv.
+	 */
 	struct list_head attachments;
+
+	/** @ops: dma_buf_ops associated with this buffer object. */
 	const struct dma_buf_ops *ops;
+
+	/**
+	 * @lock:
+	 *
+	 * Used internally to serialize list manipulation, attach/detach and
+	 * vmap/unmap. Note that in many cases this is superseded by
+	 * dma_resv_lock() on @resv.
+	 */
 	struct mutex lock;
+
+	/**
+	 * @vmapping_counter:
+	 *
+	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
+	 * Protected by @lock.
+	 */
 	unsigned vmapping_counter;
+
+	/**
+	 * @vmap_ptr:
+	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
+	 */
 	struct dma_buf_map vmap_ptr;
+
+	/**
+	 * @exp_name:
+	 *
+	 * Name of the exporter; useful for debugging. See the
+	 * DMA_BUF_SET_NAME IOCTL.
+	 */
 	const char *exp_name;
+
+	/**
+	 * @name:
+	 *
+	 * Userspace-provided name; useful for accounting and debugging,
+	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
+	 */
 	const char *name;
+
+	/** @name_lock: Spinlock to protect name access for read access. */
 	spinlock_t name_lock;
+
+	/**
+	 * @owner:
+	 *
+	 * Pointer to exporter module; used for refcounting when exporter is a
+	 * kernel module.
+	 */
 	struct module *owner;
+
+	/** @list_node: node for dma_buf accounting and debugging. */
 	struct list_head list_node;
+
+	/** @priv: exporter specific private data for this buffer object. */
 	void *priv;
+
+	/**
+	 * @resv:
+	 *
+	 * Reservation object linked to this dma-buf.
+	 */
 	struct dma_resv *resv;
 
-	/* poll support */
+	/** @poll: for userspace poll support */
 	wait_queue_head_t poll;
 
+	/** @cb_excl: for userspace poll support */
+	/** @cb_shared: for userspace poll support */
 	struct dma_buf_poll_cb_t {
 		struct dma_fence_cb cb;
 		wait_queue_head_t *poll;
@@ -349,7 +403,12 @@ struct dma_buf {
 		__poll_t active;
 	} cb_excl, cb_shared;
 #ifdef CONFIG_DMABUF_SYSFS_STATS
-	/* for sysfs stats */
+	/**
+	 * @sysfs_entry:
+	 *
+	 * For exposing information about this buffer in sysfs. See also
+	 * `DMA-BUF statistics`_ for the uapi this enables.
+	 */
 	struct dma_buf_sysfs_entry {
 		struct kobject kobj;
 		struct dma_buf *dmabuf;
-- 
2.32.0.rc2


* [PATCH 03/15] dma-buf: Document dma-buf implicit fencing/resv fencing rules
From: Daniel Vetter @ 2021-06-22 16:54 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, Daniel Stone, Christian König, Daniel Vetter,
	Daniel Vetter, Intel Graphics Development, Kevin Wang,
	linaro-mm-sig, Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Alex Deucher, mesa-dev, Michel Dänzer, Dennis Li,
	Deepak R Varma

Docs for struct dma_resv are fairly clear:

"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

Furthermore, here is a review across all of upstream.

First, the render drivers and how they set implicit fences:

- nouveau follows this contract, see in validate_fini_no_ticket()

			nouveau_bo_fence(nvbo, fence, !!b->write_domains);

  and that last boolean controls whether the exclusive or shared fence
  slot is used.

- radeon follows this contract by setting

		p->relocs[i].tv.num_shared = !r->write_domain;

  in radeon_cs_parser_relocs(), which ensures that the call to
  ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
  right thing.

- vmwgfx seems to follow this contract with the shotgun approach of
  always setting ttm_val_buf->num_shared = 0, which means
  ttm_eu_fence_buffer_objects() will only use the exclusive slot.

- etnaviv follows this contract, as can be trivially seen by looking
  at submit_attach_object_fences()

- i915 is a bit of a convoluted maze with multiple paths leading to
  i915_vma_move_to_active(), which sets the exclusive flag if
  EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
  softpin mode, or through the write_domain when using relocations. It
  follows this contract.

- lima follows this contract, see lima_gem_submit() which sets the
  exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
  bo

- msm follows this contract, see msm_gpu_submit() which sets the
  exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer

- panfrost follows this contract with the shotgun approach of just
  always setting the exclusive fence, see
  panfrost_attach_object_fences(). Benefits of a single engine I guess

- v3d follows this contract with the same shotgun approach in
  v3d_attach_fences_and_unlock_reservation(), but it has at least an
  XXX comment that maybe this should be improved

- vc4 uses the same shotgun approach of always setting an exclusive
  fence, see vc4_update_bo_seqnos()

- vgem also follows this contract, see vgem_fence_attach_ioctl() and
  the VGEM_FENCE_WRITE. This is used in some igts to validate prime
  sharing with i915.ko without the need of a 2nd gpu

- virtio follows this contract again with the shotgun approach of
  always setting an exclusive fence, see virtio_gpu_array_add_fence()

This covers the setting of the exclusive fences when writing.
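
As a rough sketch (not from any one driver; obj, fence, write and ret
are placeholder names), the pattern all of the above boils down to,
with the bo's dma_resv lock already held:

	if (write) {
		dma_resv_add_excl_fence(obj->resv, fence);
	} else {
		/* shared slots must be reserved before a fence is added */
		ret = dma_resv_reserve_shared(obj->resv, 1);
		if (ret)
			return ret;
		dma_resv_add_shared_fence(obj->resv, fence);
	}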

Synchronizing against the exclusive fence is a lot trickier, and I
only spot-checked a few:

- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
  implicit dependencies (which is used by vulkan)

- etnaviv does this. Implicit dependencies are collected in
  submit_fence_sync(), again with an opt-out flag
  ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
  etnaviv_sched_dependency which is the
  drm_sched_backend_ops->dependency callback.

- vc4 seems to not do much here; maybe it gets away with it by not having
  a scheduler and only a single engine. Since all newer broadcom chips than
  the OG vc4 use v3d for rendering, which follows this contract, the
  impact of this issue is fairly small.

- v3d does this using the drm_gem_fence_array_add_implicit() helper,
  which its drm_sched_backend_ops->dependency callback
  v3d_job_dependency() then picks up (see the sketch after this list).

- panfrost is nice here and tracks the implicit fences in
  panfrost_job->implicit_fences, which again the
  drm_sched_backend_ops->dependency callback panfrost_job_dependency()
  picks up. It is mildly questionable though since it only picks up
  exclusive fences in panfrost_acquire_object_fences(), but not buggy
  in practice because it also always sets the exclusive fence. It
  should pick up both sets of fences, just in case there's ever going
  to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
  pcie port and a real gpu, which might actually happen eventually. A
  bug, but easy to fix. Should probably use the
  drm_gem_fence_array_add_implicit() helper.

- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
  the same schema as v3d.

- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
  but because it doesn't use the drm/scheduler it handles fences from
  the wrong context with a synchronous dma_fence_wait. See
  submit_fence_sync() leading to msm_gem_sync_object(). Investing into
  a scheduler might be a good idea.

- all the remaining drivers are ttm based, where I hope they do
  appropriately obey implicit fences already. I didn't do the full
  audit there because a) not following the contract would confuse ttm
  quite badly and b) reading non-standard scheduler and submit code
  which isn't based on drm/scheduler is a pain.
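
As a sketch of that helper-based pattern (per bo in the submit ioctl;
job->deps as the dependency xarray and the SUBMIT_NO_IMPLICIT opt-out
flag are placeholder names, the helper itself is real):

	if (!(bo_flags & SUBMIT_NO_IMPLICIT)) {
		ret = drm_gem_fence_array_add_implicit(&job->deps, obj, write);
		if (ret)
			return ret;
	}

The scheduler's ->dependency callback can then pop the fences back out
of that xarray one by one.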

Onwards to the display side.

- Any driver using the drm_gem_plane_helper_prepare_fb() helper will do
  this correctly. The overwhelming majority of drivers get this right,
  except a few that totally don't. I'll follow up with a patch to make
  this the default and avoid a bunch of bugs.

- I didn't audit the ttm drivers, but given that dma_resv started
  there I hope they get this right.

In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.

Amdgpu tried to fix this already in

commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Sep 19 16:54:35 2018 +0200

    drm/amdgpu: fix using shared fence for exported BOs v2

but this fix falls short on a number of areas:

- It's racy: by the time the buffer is shared it might be too late. To
  make sure there's definitely never a problem we need to set the
  fences correctly for any buffer that's potentially exportable.

- It's breaking uapi: dma-buf fds support poll() and differentiate
  between read and write access (a sketch of this uapi follows this
  list), which was introduced in

	commit 9b495a5887994a6d74d5c261d012083a92b94738
	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
	Date:   Tue Jul 1 12:57:43 2014 +0200

	    dma-buf: add poll support, v3

- Christian König wants to nack new uapi building further on this
  dma_resv contract because it breaks amdgpu, quoting

  "Yeah, and that is exactly the reason why I will NAK this uAPI change.

  "This doesn't works for amdgpu at all for the reasons outlined above."

  https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/

  Rejecting new development because your own driver is broken and
  violates established cross driver contracts and uapi is really not
  how upstream works.
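
For reference, the poll() uapi mentioned above looks roughly like this
from userspace (a sketch; dmabuf_fd is an assumed, already exported
dma-buf fd):

	struct pollfd pfd = { .fd = dmabuf_fd, .events = POLLOUT };

	/*
	 * POLLIN only waits for the exclusive (write) fence, i.e. until
	 * the buffer is safe to read; POLLOUT waits for all fences, i.e.
	 * until it is safe to write.
	 */
	poll(&pfd, 1, -1);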

Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit of a plan:

- amdgpu needs a proper uapi for handling implicit fencing. The funny
  thing is that to do it correctly, implicit fencing must be treated
  as a very strange IPC mechanism for transporting fences, where both
  setting the fence and dependency intercepts must be handled
  explicitly. Current best practice is a per-bo flag to indicate
  writes, and a per-bo flag to skip implicit fencing in the CS
  ioctl as a new chunk.

- Since amdgpu has been shipping with broken behaviour we need an
  opt-out flag from the butchered implicit fencing model to enable the
  proper explicit implicit fencing model.

- for kernel memory fences due to bo moves at least the i915 idea is
  to use ttm_bo->moving. amdgpu probably needs the same.

- since the current p2p dma-buf interface assumes the kernel memory
  fence is in the exclusive dma_resv fence slot we need to add a new
  fence slot for kernel fences, which must never be ignored. Since
  currently only amdgpu supports this there's no real problem here
  yet, until amdgpu gains a NO_IMPLICIT CS flag.

- New userspace needs to ship in enough desktop distros so that users
  won't notice the perf impact. I think we can ignore LTS distros who
  upgrade their kernels but not their mesa3d snapshot.

- Then when this is all in place we can merge this patch here.

What is not a solution to this problem is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.

Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first; once that's achieved we can tackle
the remaining issues in a consistent way across drivers.

v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.

This is great because it means the amdgpu-specific piece for proper
implicit fence handling already exists, and has for a while. The
only things now missing are:
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.

Jason has a patch series to fill that gap with a bunch of generic
ioctls on the dma-buf fd:

https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/

v3: Since Christian has fixed amdgpu now in

commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Jun 9 13:51:36 2021 +0200

    drm/amdgpu: rework dma_resv handling v3

Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.

Since dynamic importers have different rules also hammer these in
again while we're at it.

Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/dma-buf.h | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 6d18b9e448b9..4807cefe81f5 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -388,6 +388,45 @@ struct dma_buf {
 	 * @resv:
 	 *
 	 * Reservation object linked to this dma-buf.
+	 *
+	 * IMPLICIT SYNCHRONIZATION RULES:
+	 *
+	 * Drivers which support implicit synchronization of buffer access as
+	 * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
+	 * below rules.
+	 *
+	 * - Drivers should add a shared fence through
+	 *   dma_resv_add_shared_fence() for anything the userspace API
+	 *   considers a read access. This highly depends upon the API and
+	 *   window system: E.g. OpenGL is generally implicitly synchronized on
+	 *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
+	 *   generally explicitly synchronized for everything, and window system
+	 *   buffers have explicit API calls (which then need to make sure the
+	 *   implicit fences stored here in @resv are updated correctly).
+	 *
+	 * - Similarly drivers should set the exclusive fence through
+	 *   dma_resv_add_excl_fence() for anything the userspace API considers
+	 *   write access.
+	 *
+	 * - Drivers may just always set the exclusive fence, since that only
+	 *   causes unnecessary synchronization, but no correctness issues.
+	 *
+	 * - Some drivers only expose a synchronous userspace API with no
+	 *   pipelining across drivers. These do not set any fences for their
+	 *   access. An example here is v4l.
+	 *
+	 * DYNAMIC IMPORTER RULES:
+	 *
+	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
+	 * additional constraints on how they set up fences:
+	 *
+	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 *   signal before allowing access to the buffer's underlying storage
+	 *   through the device.
+	 *
+	 * - Dynamic importers should set fences for any access that they can't
+	 *   disable immediately from their @dma_buf_attach_ops.move_notify
+	 *   callback.
 	 */
 	struct dma_resv *resv;
 
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH 03/15] dma-buf: Document dma-buf implicit fencing/resv fencing rules
@ 2021-06-22 16:54   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:54 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, Daniel Stone, Christian König, Daniel Vetter,
	Daniel Vetter, Intel Graphics Development, Kevin Wang,
	Sumit Semwal, linaro-mm-sig, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Bas Nieuwenhuizen,
	Alex Deucher, mesa-dev, Michel Dänzer, Dennis Li,
	Deepak R Varma

Docs for struct dma_resv are fairly clear:

"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

Furthermore a review across all of upstream.

First of render drivers and how they set implicit fences:

- nouveau follows this contract, see in validate_fini_no_ticket()

			nouveau_bo_fence(nvbo, fence, !!b->write_domains);

  and that last boolean controls whether the exclusive or shared fence
  slot is used.

- radeon follows this contract by setting

		p->relocs[i].tv.num_shared = !r->write_domain;

  in radeon_cs_parser_relocs(), which ensures that the call to
  ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
  right thing.

- vmwgfx seems to follow this contract with the shotgun approach of
  always setting ttm_val_buf->num_shared = 0, which means
  ttm_eu_fence_buffer_objects() will only use the exclusive slot.

- etnaviv follows this contract, as can be trivially seen by looking
  at submit_attach_object_fences()

- i915 is a bit a convoluted maze with multiple paths leading to
  i915_vma_move_to_active(). Which sets the exclusive flag if
  EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
  softpin mode, or through the write_domain when using relocations. It
  follows this contract.

- lima follows this contract, see lima_gem_submit() which sets the
  exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
  bo

- msm follows this contract, see msm_gpu_submit() which sets the
  exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer

- panfrost follows this contract with the shotgun approach of just
  always setting the exclusive fence, see
  panfrost_attach_object_fences(). Benefits of a single engine I guess

- v3d follows this contract with the same shotgun approach in
  v3d_attach_fences_and_unlock_reservation(), but it has at least an
  XXX comment that maybe this should be improved

- v4c uses the same shotgun approach of always setting an exclusive
  fence, see vc4_update_bo_seqnos()

- vgem also follows this contract, see vgem_fence_attach_ioctl() and
  the VGEM_FENCE_WRITE. This is used in some igts to validate prime
  sharing with i915.ko without the need of a 2nd gpu

- vritio follows this contract again with the shotgun approach of
  always setting an exclusive fence, see virtio_gpu_array_add_fence()

This covers the setting of the exclusive fences when writing.

Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few:

- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
  implicit dependencies (which is used by vulkan)

- etnaviv does this. Implicit dependencies are collected in
  submit_fence_sync(), again with an opt-out flag
  ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
  etnaviv_sched_dependency which is the
  drm_sched_backend_ops->dependency callback.

- v4c seems to not do much here, maybe gets away with it by not having
  a scheduler and only a single engine. Since all newer broadcom chips than
  the OG vc4 use v3d for rendering, which follows this contract, the
  impact of this issue is fairly small.

- v3d does this using the drm_gem_fence_array_add_implicit() helper,
  which then it's drm_sched_backend_ops->dependency callback
  v3d_job_dependency() picks up.

- panfrost is nice here and tracks the implicit fences in
  panfrost_job->implicit_fences, which again the
  drm_sched_backend_ops->dependency callback panfrost_job_dependency()
  picks up. It is mildly questionable though since it only picks up
  exclusive fences in panfrost_acquire_object_fences(), but not buggy
  in practice because it also always sets the exclusive fence. It
  should pick up both sets of fences, just in case there's ever going
  to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
  pcie port and a real gpu, which might actually happen eventually. A
  bug, but easy to fix. Should probably use the
  drm_gem_fence_array_add_implicit() helper.

- lima is nice an easy, uses drm_gem_fence_array_add_implicit() and
  the same schema as v3d.

- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
  but because it doesn't use the drm/scheduler it handles fences from
  the wrong context with a synchronous dma_fence_wait. See
  submit_fence_sync() leading to msm_gem_sync_object(). Investing into
  a scheduler might be a good idea.

- all the remaining drivers are ttm based, where I hope they do
  appropriately obey implicit fences already. I didn't do the full
  audit there because a) not follow the contract would confuse ttm
  quite well and b) reading non-standard scheduler and submit code
  which isn't based on drm/scheduler is a pain.

Onwards to the display side.

- Any driver using the drm_gem_plane_helper_prepare_fb() helper will
  correctly. Overwhelmingly most drivers get this right, except a few
  totally dont. I'll follow up with a patch to make this the default
  and avoid a bunch of bugs.

- I didn't audit the ttm drivers, but given that dma_resv started
  there I hope they get this right.

In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specically as implemented by all render
drivers except amdgpu.

Amdgpu tried to fix this already in

commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Sep 19 16:54:35 2018 +0200

    drm/amdgpu: fix using shared fence for exported BOs v2

but this fix falls short on a number of areas:

- It's racy, by the time the buffer is shared it might be too late. To
  make sure there's definitely never a problem we need to set the
  fences correctly for any buffer that's potentially exportable.

- It's breaking uapi, dma-buf fds support poll() and differentitiate
  between, which was introduced in

	commit 9b495a5887994a6d74d5c261d012083a92b94738
	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
	Date:   Tue Jul 1 12:57:43 2014 +0200

	    dma-buf: add poll support, v3

- Christian König wants to nack new uapi building further on this
  dma_resv contract because it breaks amdgpu, quoting

  "Yeah, and that is exactly the reason why I will NAK this uAPI change.

  "This doesn't works for amdgpu at all for the reasons outlined above."

  https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/

  Rejecting new development because your own driver is broken and
  violates established cross driver contracts and uapi is really not
  how upstream works.

Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit a plan:

- amdgpu needs a proper uapi for handling implicit fencing. The funny
  thing is that to do it correctly, implicit fencing must be treated
  as a very strange IPC mechanism for transporting fences, where both
  setting the fence and dependency intercepts must be handled
  explicitly. Current best practices is a per-bo flag to indicate
  writes, and a per-bo flag to to skip implicit fencing in the CS
  ioctl as a new chunk.

- Since amdgpu has been shipping with broken behaviour we need an
  opt-out flag from the butchered implicit fencing model to enable the
  proper explicit implicit fencing model.

- for kernel memory fences due to bo moves at least the i915 idea is
  to use ttm_bo->moving. amdgpu probably needs the same.

- since the current p2p dma-buf interface assumes the kernel memory
  fence is in the exclusive dma_resv fence slot we need to add a new
  fence slot for kernel fences, which must never be ignored. Since
  currently only amdgpu supports this there's no real problem here
  yet, until amdgpu gains a NO_IMPLICIT CS flag.

- New userspace needs to ship in enough desktop distros so that users
  wont notice the perf impact. I think we can ignore LTS distros who
  upgrade their kernels but not their mesa3d snapshot.

- Then when this is all in place we can merge this patch here.

What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.

Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.

v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.

This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and that since a while. The
only thing that's now missing is
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.

Jason has a patch series to fill that gap with a bunch of generic
ioctl on the dma-buf fd:

https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/

v3: Since Christian has fixed amdgpu now in

commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Jun 9 13:51:36 2021 +0200

    drm/amdgpu: rework dma_resv handling v3

Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.

Since dynamic importers have different rules also hammer these in
again while we're at it.

Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/dma-buf.h | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 6d18b9e448b9..4807cefe81f5 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -388,6 +388,45 @@ struct dma_buf {
 	 * @resv:
 	 *
 	 * Reservation object linked to this dma-buf.
+	 *
+	 * IMPLICIT SYNCHRONIZATION RULES:
+	 *
+	 * Drivers which support implicit synchronization of buffer access as
+	 * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
+	 * below rules.
+	 *
+	 * - Drivers should add a shared fence through
+	 *   dma_resv_add_shared_fence() for anything the userspace API
+	 *   considers a read access. This highly depends upon the API and
+	 *   window system: E.g. OpenGL is generally implicitly synchronized on
+	 *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
+	 *   generally explicitly synchronized for everything, and window system
+	 *   buffers have explicit API calls (which then need to make sure the
+	 *   implicit fences stored here in @resv are updated correctly).
+	 *
+	 * - Similarly drivers should set the exclusive fence through
+	 *   dma_resv_add_excl_fence() for anything the userspace API considers
+	 *   write access.
+	 *
+	 * - Drivers may just always set the exclusive fence, since that only
+	 *   causes unnecessary synchronization, but no correctness issues.
+	 *
+	 * - Some drivers only expose a synchronous userspace API with no
+	 *   pipelining across drivers. These do not set any fences for their
+	 *   access. An example here is v4l.
+	 *
+	 * DYNAMIC IMPORTER RULES:
+	 *
+	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
+	 * additional constraints on how they set up fences:
+	 *
+	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 *   signal before allowing access to the buffer's underlying storage
+	 *   through the device.
+	 *
+	 * - Dynamic importers should set fences for any access that they can't
+	 *   disable immediately from their @dma_buf_attach_ops.move_notify
+	 *   callback.
 	 */
 	struct dma_resv *resv;
 
-- 
2.32.0.rc2

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 04/15] drm/panfrost: Shrink sched_lock
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Tomeu Vizoso, Daniel Vetter, Intel Graphics Development,
	Steven Price, Alyssa Rosenzweig, Daniel Vetter

drm/scheduler requires a lock to be held between _init and _push_job,
but the reservation lock dance doesn't need to be inside that critical
section. So shrink it a notch.

v2: Lucas pointed out how this should really work, I got it all wrong
in v1.
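
With this the resulting ordering in panfrost_job_push() is roughly the
following (heavily simplified, no error handling shown):

	drm_gem_lock_reservations(job->bos, job->bo_count, &acquire_ctx);

	mutex_lock(&pfdev->sched_lock);	/* only covers _init ... _push_job */
	drm_sched_job_init(&job->base, entity, NULL);
	/* ... fence setup ... */
	drm_sched_entity_push_job(&job->base, entity);
	mutex_unlock(&pfdev->sched_lock);

	panfrost_attach_object_fences(job->bos, job->bo_count,
				      job->render_done_fence);
	drm_gem_unlock_reservations(job->bos, job->bo_count, &acquire_ctx);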

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 2df3e999a38d..38f8580c19f1 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -224,14 +224,13 @@ int panfrost_job_push(struct panfrost_job *job)
 	struct ww_acquire_ctx acquire_ctx;
 	int ret = 0;
 
-	mutex_lock(&pfdev->sched_lock);
 
 	ret = drm_gem_lock_reservations(job->bos, job->bo_count,
 					    &acquire_ctx);
-	if (ret) {
-		mutex_unlock(&pfdev->sched_lock);
+	if (ret)
 		return ret;
-	}
+
+	mutex_lock(&pfdev->sched_lock);
 
 	ret = drm_sched_job_init(&job->base, entity, NULL);
 	if (ret) {
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 05/15] drm/panfrost: Use xarray and helpers for dependency tracking
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
  (?)
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Lucas Stach,
	Christian König, Luben Tuikov, Alex Deucher, Lee Jones,
	Steven Price, Rob Herring, Tomeu Vizoso, Alyssa Rosenzweig,
	Sumit Semwal, linux-media, linaro-mm-sig, Daniel Vetter

More consistency and prep work for the next patch.

Aside: I wonder whether we shouldn't just move this entire xarray
business into the scheduler so that not everyone has to reinvent the
same wheels. Cc'ing some scheduler people for this too.

v2: Correctly handle sched_lock since Lucas pointed out it's needed.

v3: Rebase, dma_resv_get_excl_unlocked got renamed

v4: Don't leak job references on failure (Steven).
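
For reference, the xarray pattern used as the dependency container
looks roughly like this sketch (the foo_* names are made up here; the
real drm_gem_fence_array_add() additionally deduplicates fences by
context before storing them):

	static int foo_add_dep(struct xarray *deps, struct dma_fence *fence)
	{
		u32 id;

		if (!fence)
			return 0;

		/* append at the next free index; xarray must be XA_FLAGS_ALLOC */
		return xa_alloc(deps, &id, fence, xa_limit_32b, GFP_KERNEL);
	}

	static struct dma_fence *foo_next_dep(struct xarray *deps,
					      unsigned long *last_dep)
	{
		/* pop dependencies in insertion order, as the scheduler
		 * ->dependency callback does */
		if (xa_empty(deps))
			return NULL;

		return xa_erase(deps, (*last_dep)++);
	}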

Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/panfrost/panfrost_drv.c | 41 +++++++---------
 drivers/gpu/drm/panfrost/panfrost_job.c | 65 +++++++++++--------------
 drivers/gpu/drm/panfrost/panfrost_job.h |  8 ++-
 3 files changed, 49 insertions(+), 65 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 075ec0ef746c..3ee828f1e7a5 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -138,12 +138,6 @@ panfrost_lookup_bos(struct drm_device *dev,
 	if (!job->bo_count)
 		return 0;
 
-	job->implicit_fences = kvmalloc_array(job->bo_count,
-				  sizeof(struct dma_fence *),
-				  GFP_KERNEL | __GFP_ZERO);
-	if (!job->implicit_fences)
-		return -ENOMEM;
-
 	ret = drm_gem_objects_lookup(file_priv,
 				     (void __user *)(uintptr_t)args->bo_handles,
 				     job->bo_count, &job->bos);
@@ -174,7 +168,7 @@ panfrost_lookup_bos(struct drm_device *dev,
 }
 
 /**
- * panfrost_copy_in_sync() - Sets up job->in_fences[] with the sync objects
+ * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
  * referenced by the job.
  * @dev: DRM device
  * @file_priv: DRM file for this fd
@@ -194,22 +188,14 @@ panfrost_copy_in_sync(struct drm_device *dev,
 {
 	u32 *handles;
 	int ret = 0;
-	int i;
+	int i, in_fence_count;
 
-	job->in_fence_count = args->in_sync_count;
+	in_fence_count = args->in_sync_count;
 
-	if (!job->in_fence_count)
+	if (!in_fence_count)
 		return 0;
 
-	job->in_fences = kvmalloc_array(job->in_fence_count,
-					sizeof(struct dma_fence *),
-					GFP_KERNEL | __GFP_ZERO);
-	if (!job->in_fences) {
-		DRM_DEBUG("Failed to allocate job in fences\n");
-		return -ENOMEM;
-	}
-
-	handles = kvmalloc_array(job->in_fence_count, sizeof(u32), GFP_KERNEL);
+	handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
 	if (!handles) {
 		ret = -ENOMEM;
 		DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
@@ -218,16 +204,23 @@ panfrost_copy_in_sync(struct drm_device *dev,
 
 	if (copy_from_user(handles,
 			   (void __user *)(uintptr_t)args->in_syncs,
-			   job->in_fence_count * sizeof(u32))) {
+			   in_fence_count * sizeof(u32))) {
 		ret = -EFAULT;
 		DRM_DEBUG("Failed to copy in syncobj handles\n");
 		goto fail;
 	}
 
-	for (i = 0; i < job->in_fence_count; i++) {
+	for (i = 0; i < in_fence_count; i++) {
+		struct dma_fence *fence;
+
 		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
-					     &job->in_fences[i]);
-		if (ret == -EINVAL)
+					     &fence);
+		if (ret)
+			goto fail;
+
+		ret = drm_gem_fence_array_add(&job->deps, fence);
+
+		if (ret)
 			goto fail;
 	}
 
@@ -265,6 +258,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
 
 	kref_init(&job->refcount);
 
+	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
+
 	job->pfdev = pfdev;
 	job->jc = args->jc;
 	job->requirements = args->requirements;
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 38f8580c19f1..71cd43fa1b36 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -196,14 +196,21 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
 	job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START);
 }
 
-static void panfrost_acquire_object_fences(struct drm_gem_object **bos,
-					   int bo_count,
-					   struct dma_fence **implicit_fences)
+static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
+					  int bo_count,
+					  struct xarray *deps)
 {
-	int i;
+	int i, ret;
 
-	for (i = 0; i < bo_count; i++)
-		implicit_fences[i] = dma_resv_get_excl_unlocked(bos[i]->resv);
+	for (i = 0; i < bo_count; i++) {
+		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
+
+		ret = drm_gem_fence_array_add(deps, fence);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
 }
 
 static void panfrost_attach_object_fences(struct drm_gem_object **bos,
@@ -240,10 +247,14 @@ int panfrost_job_push(struct panfrost_job *job)
 
 	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
 
-	kref_get(&job->refcount); /* put by scheduler job completion */
+	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
+					     &job->deps);
+	if (ret) {
+		mutex_unlock(&pfdev->sched_lock);
+		goto unlock;
+	}
 
-	panfrost_acquire_object_fences(job->bos, job->bo_count,
-				       job->implicit_fences);
+	kref_get(&job->refcount); /* put by scheduler job completion */
 
 	drm_sched_entity_push_job(&job->base, entity);
 
@@ -262,18 +273,15 @@ static void panfrost_job_cleanup(struct kref *ref)
 {
 	struct panfrost_job *job = container_of(ref, struct panfrost_job,
 						refcount);
+	struct dma_fence *fence;
+	unsigned long index;
 	unsigned int i;
 
-	if (job->in_fences) {
-		for (i = 0; i < job->in_fence_count; i++)
-			dma_fence_put(job->in_fences[i]);
-		kvfree(job->in_fences);
-	}
-	if (job->implicit_fences) {
-		for (i = 0; i < job->bo_count; i++)
-			dma_fence_put(job->implicit_fences[i]);
-		kvfree(job->implicit_fences);
+	xa_for_each(&job->deps, index, fence) {
+		dma_fence_put(fence);
 	}
+	xa_destroy(&job->deps);
+
 	dma_fence_put(job->done_fence);
 	dma_fence_put(job->render_done_fence);
 
@@ -316,26 +324,9 @@ static struct dma_fence *panfrost_job_dependency(struct drm_sched_job *sched_job
 						 struct drm_sched_entity *s_entity)
 {
 	struct panfrost_job *job = to_panfrost_job(sched_job);
-	struct dma_fence *fence;
-	unsigned int i;
-
-	/* Explicit fences */
-	for (i = 0; i < job->in_fence_count; i++) {
-		if (job->in_fences[i]) {
-			fence = job->in_fences[i];
-			job->in_fences[i] = NULL;
-			return fence;
-		}
-	}
 
-	/* Implicit fences, max. one per BO */
-	for (i = 0; i < job->bo_count; i++) {
-		if (job->implicit_fences[i]) {
-			fence = job->implicit_fences[i];
-			job->implicit_fences[i] = NULL;
-			return fence;
-		}
-	}
+	if (!xa_empty(&job->deps))
+		return xa_erase(&job->deps, job->last_dep++);
 
 	return NULL;
 }
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
index bbd3ba97ff67..82306a03b57e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.h
+++ b/drivers/gpu/drm/panfrost/panfrost_job.h
@@ -19,9 +19,9 @@ struct panfrost_job {
 	struct panfrost_device *pfdev;
 	struct panfrost_file_priv *file_priv;
 
-	/* Optional fences userspace can pass in for the job to depend on. */
-	struct dma_fence **in_fences;
-	u32 in_fence_count;
+	/* Contains both explicit and implicit fences */
+	struct xarray deps;
+	unsigned long last_dep;
 
 	/* Fence to be signaled by IRQ handler when the job is complete. */
 	struct dma_fence *done_fence;
@@ -30,8 +30,6 @@ struct panfrost_job {
 	__u32 requirements;
 	__u32 flush_id;
 
-	/* Exclusive fences we have taken from the BOs to wait for */
-	struct dma_fence **implicit_fences;
 	struct panfrost_gem_mapping **mappings;
 	struct drm_gem_object **bos;
 	u32 bo_count;
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 06/15] drm/panfrost: Fix implicit sync
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
  (?)
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Daniel Vetter,
	Rob Herring, Tomeu Vizoso, Steven Price, Alyssa Rosenzweig,
	Sumit Semwal, Christian König, linux-media, linaro-mm-sig

Currently this has no practical relevance I think, because there aren't
many who can pull off a setup with panfrost and another gpu in the
same system. But the rules are that if you're setting an exclusive
fence, indicating a gpu write access in the implicit fencing system,
then you need to wait for all fences, not just the previous exclusive
fence.

panfrost against itself has no problem, because it always sets the
exclusive fence (but that's probably something that will need to be
fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
There's also no problem with that against display.

With the prep work done to switch over to the dependency helpers this
is now a oneliner.
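
For reference, the helper's write flag is what selects between the two
sets of fences; a sketch of the two possible call sites (panfrost
itself always uses the write variant, as the hunk below shows):

	/* read access: only the previous exclusive (write) fence matters */
	ret = drm_gem_fence_array_add_implicit(&job->deps, bos[i], false);

	/* write access: wait for all fences, shared (readers) and exclusive */
	ret = drm_gem_fence_array_add_implicit(&job->deps, bos[i], true);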

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 71cd43fa1b36..ef004d587dc4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -203,9 +203,8 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
 	int i, ret;
 
 	for (i = 0; i < bo_count; i++) {
-		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
-
-		ret = drm_gem_fence_array_add(deps, fence);
+		/* panfrost always uses write mode in its current uapi */
+		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
 		if (ret)
 			return ret;
 	}
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 07/15] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Thomas Zimmermann, Daniel Vetter

There's a bunch of atomic drivers which don't do this quite correctly;
luckily most of them aren't in wide use, or people would have noticed
the tearing.

By making this the default we avoid the constant audit pain and can
additionally remove a ton of lines from vfuncs for a bit more clarity
in smaller drivers.

While at it, complain if there's a cleanup_fb hook but no prepare_fb
hook, because that makes no sense. I haven't found any driver which
violates this, but better safe than sorry.

Subsequent patches will reap the benefits.
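
For drivers that do need extra work in their hook the helper can still
be chained explicitly; a hypothetical example (all foo_* names are made
up):

	static int foo_plane_prepare_fb(struct drm_plane *plane,
					struct drm_plane_state *state)
	{
		int ret;

		/* driver-specific pinning or mapping work */
		ret = foo_plane_pin(plane, state);
		if (ret)
			return ret;

		/* still pick up the implicit fences from the GEM objects */
		return drm_gem_plane_helper_prepare_fb(plane, state);
	}

	static const struct drm_plane_helper_funcs foo_plane_helper_funcs = {
		.prepare_fb	= foo_plane_prepare_fb,
		.cleanup_fb	= foo_plane_cleanup_fb,
		.atomic_check	= foo_plane_atomic_check,
		.atomic_update	= foo_plane_atomic_update,
	};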

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_atomic_helper.c      | 10 ++++++++++
 drivers/gpu/drm/drm_gem_atomic_helper.c  |  3 +++
 include/drm/drm_modeset_helper_vtables.h |  7 +++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 531f2374b072..9f6c5f21c4d6 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -35,6 +35,7 @@
 #include <drm/drm_damage_helper.h>
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_print.h>
 #include <drm/drm_self_refresh_helper.h>
@@ -2408,6 +2409,15 @@ int drm_atomic_helper_prepare_planes(struct drm_device *dev,
 			ret = funcs->prepare_fb(plane, new_plane_state);
 			if (ret)
 				goto fail;
+		} else {
+			WARN_ON_ONCE(funcs->cleanup_fb);
+
+			if (!drm_core_check_feature(dev, DRIVER_GEM))
+				continue;
+
+			ret = drm_gem_plane_helper_prepare_fb(plane, new_plane_state);
+			if (ret)
+				goto fail;
 		}
 	}
 
diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
index a27135084ae5..bc9396f2a0ed 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -135,6 +135,9 @@
  * GEM based framebuffer drivers which have their buffers always pinned in
  * memory.
  *
+ * This function is the default implementation for GEM drivers of
+ * &drm_plane_helper_funcs.prepare_fb if no callback is provided.
+ *
  * See drm_atomic_set_fence_for_plane() for a discussion of implicit and
  * explicit fencing in atomic modeset updates.
  */
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index f3a4b47b3986..4e727261dca5 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -1178,8 +1178,11 @@ struct drm_plane_helper_funcs {
 	 * equivalent functionality should be implemented through private
 	 * members in the plane structure.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_plane_helper_prepare_fb() for this hook.
+	 * For GEM drivers which have neither a @prepare_fb nor a @cleanup_fb
+	 * hook set, drm_gem_plane_helper_prepare_fb() is called automatically to
+	 * implement this. Other drivers which need additional plane processing
+	 * can call drm_gem_plane_helper_prepare_fb() from their @prepare_fb
+	 * hook.
 	 *
 	 * The helpers will call @cleanup_fb with matching arguments for every
 	 * successful call to this hook.
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                     ` (4 preceding siblings ...)
  (?)
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Heiko Stuebner,
	Paul Cercueil, Jernej Skrabec, Chun-Kuang Hu,
	Martin Blumenstingl, Tomi Valkeinen, Philippe Cornu, Lucas Stach,
	Daniel Vetter, Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Philipp Zabel, Matthias Brugger, Neil Armstrong, Kevin Hilman,
	Jerome Brunet, Marek Vasut, Stefan Agner, Sandy Huang,
	Yannick Fertre, Benjamin Gaignard, Maxime Coquelin,
	Alexandre Torgue, Maxime Ripard, Chen-Yu Tsai, Jyri Sarha,
	Tomi Valkeinen, linux-arm-kernel, linux-mips, linux-mediatek,
	linux-amlogic, linux-rockchip, linux-stm32, linux-sunxi

No need to set it explicitly.

Acked-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Acked-by: Philippe Cornu <philippe.cornu@foss.st.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: "Heiko Stübner" <heiko@sntech.de>
Cc: Yannick Fertre <yannick.fertre@foss.st.com>
Cc: Philippe Cornu <philippe.cornu@foss.st.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Jyri Sarha <jyri.sarha@iki.fi>
Cc: Tomi Valkeinen <tomba@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-sunxi@lists.linux.dev
---
 drivers/gpu/drm/imx/dcss/dcss-plane.c       | 1 -
 drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c   | 1 -
 drivers/gpu/drm/ingenic/ingenic-ipu.c       | 1 -
 drivers/gpu/drm/mediatek/mtk_drm_plane.c    | 1 -
 drivers/gpu/drm/meson/meson_overlay.c       | 1 -
 drivers/gpu/drm/meson/meson_plane.c         | 1 -
 drivers/gpu/drm/mxsfb/mxsfb_kms.c           | 2 --
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 1 -
 drivers/gpu/drm/stm/ltdc.c                  | 1 -
 drivers/gpu/drm/sun4i/sun4i_layer.c         | 1 -
 drivers/gpu/drm/sun4i/sun8i_ui_layer.c      | 1 -
 drivers/gpu/drm/sun4i/sun8i_vi_layer.c      | 1 -
 drivers/gpu/drm/tidss/tidss_plane.c         | 1 -
 14 files changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/imx/dcss/dcss-plane.c b/drivers/gpu/drm/imx/dcss/dcss-plane.c
index 044d3bdf313c..ac45d54acd4e 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-plane.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-plane.c
@@ -361,7 +361,6 @@ static void dcss_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs dcss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = dcss_plane_atomic_check,
 	.atomic_update = dcss_plane_atomic_update,
 	.atomic_disable = dcss_plane_atomic_disable,
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index 8710f55d2579..ef114b6aa691 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ipu_plane_atomic_check,
 	.atomic_disable = ipu_plane_atomic_disable,
 	.atomic_update = ipu_plane_atomic_update,
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 5244f4763477..c296472164d9 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -830,7 +830,6 @@ static const struct drm_plane_helper_funcs ingenic_drm_plane_helper_funcs = {
 	.atomic_update		= ingenic_drm_plane_atomic_update,
 	.atomic_check		= ingenic_drm_plane_atomic_check,
 	.atomic_disable		= ingenic_drm_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_crtc_helper_funcs ingenic_drm_crtc_helper_funcs = {
diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index 61b6d9fdbba1..aeb8a757d213 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -625,7 +625,6 @@ static const struct drm_plane_helper_funcs ingenic_ipu_plane_helper_funcs = {
 	.atomic_update		= ingenic_ipu_plane_atomic_update,
 	.atomic_check		= ingenic_ipu_plane_atomic_check,
 	.atomic_disable		= ingenic_ipu_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static int
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
index b5582dcf564c..1667a7e7de38 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
@@ -227,7 +227,6 @@ static void mtk_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mtk_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mtk_plane_atomic_check,
 	.atomic_update = mtk_plane_atomic_update,
 	.atomic_disable = mtk_plane_atomic_disable,
diff --git a/drivers/gpu/drm/meson/meson_overlay.c b/drivers/gpu/drm/meson/meson_overlay.c
index ed063152aecd..dfef8afcc245 100644
--- a/drivers/gpu/drm/meson/meson_overlay.c
+++ b/drivers/gpu/drm/meson/meson_overlay.c
@@ -747,7 +747,6 @@ static const struct drm_plane_helper_funcs meson_overlay_helper_funcs = {
 	.atomic_check	= meson_overlay_atomic_check,
 	.atomic_disable	= meson_overlay_atomic_disable,
 	.atomic_update	= meson_overlay_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_overlay_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/meson/meson_plane.c b/drivers/gpu/drm/meson/meson_plane.c
index a18510dae4c8..8640a8a8a469 100644
--- a/drivers/gpu/drm/meson/meson_plane.c
+++ b/drivers/gpu/drm/meson/meson_plane.c
@@ -422,7 +422,6 @@ static const struct drm_plane_helper_funcs meson_plane_helper_funcs = {
 	.atomic_check	= meson_plane_atomic_check,
 	.atomic_disable	= meson_plane_atomic_disable,
 	.atomic_update	= meson_plane_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_plane_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
index 300e7bab0f43..8797c671d0d5 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
@@ -500,13 +500,11 @@ static bool mxsfb_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mxsfb_plane_primary_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_primary_atomic_update,
 };
 
 static const struct drm_plane_helper_funcs mxsfb_plane_overlay_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_overlay_atomic_update,
 };
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index f5b9028a16a3..ba9e14da41b4 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -1110,7 +1110,6 @@ static const struct drm_plane_helper_funcs plane_helper_funcs = {
 	.atomic_disable = vop_plane_atomic_disable,
 	.atomic_async_check = vop_plane_atomic_async_check,
 	.atomic_async_update = vop_plane_atomic_async_update,
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_plane_funcs vop_plane_funcs = {
diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 08b71248044d..0a6f0239a9f8 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -947,7 +947,6 @@ static const struct drm_plane_funcs ltdc_plane_funcs = {
 };
 
 static const struct drm_plane_helper_funcs ltdc_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ltdc_plane_atomic_check,
 	.atomic_update = ltdc_plane_atomic_update,
 	.atomic_disable = ltdc_plane_atomic_disable,
diff --git a/drivers/gpu/drm/sun4i/sun4i_layer.c b/drivers/gpu/drm/sun4i/sun4i_layer.c
index 11771bdd6e7c..929e95f86b5b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun4i_layer.c
@@ -127,7 +127,6 @@ static bool sun4i_layer_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_disable	= sun4i_backend_layer_atomic_disable,
 	.atomic_update	= sun4i_backend_layer_atomic_update,
 };
diff --git a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
index e779855bcd6e..7845c2a53a7f 100644
--- a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
@@ -332,7 +332,6 @@ static void sun8i_ui_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_ui_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_ui_layer_atomic_check,
 	.atomic_disable	= sun8i_ui_layer_atomic_disable,
 	.atomic_update	= sun8i_ui_layer_atomic_update,
diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
index 1c86c2dd0bbf..bb7c43036dfa 100644
--- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
@@ -436,7 +436,6 @@ static void sun8i_vi_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_vi_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_vi_layer_atomic_check,
 	.atomic_disable	= sun8i_vi_layer_atomic_disable,
 	.atomic_update	= sun8i_vi_layer_atomic_update,
diff --git a/drivers/gpu/drm/tidss/tidss_plane.c b/drivers/gpu/drm/tidss/tidss_plane.c
index 1acd15aa4193..217415ec8eea 100644
--- a/drivers/gpu/drm/tidss/tidss_plane.c
+++ b/drivers/gpu/drm/tidss/tidss_plane.c
@@ -158,7 +158,6 @@ static void drm_plane_destroy(struct drm_plane *plane)
 }
 
 static const struct drm_plane_helper_funcs tidss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = tidss_plane_atomic_check,
 	.atomic_update = tidss_plane_atomic_update,
 	.atomic_disable = tidss_plane_atomic_disable,
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Heiko Stuebner,
	Paul Cercueil, Jernej Skrabec, Chun-Kuang Hu,
	Martin Blumenstingl, Tomi Valkeinen, Philippe Cornu, Lucas Stach,
	Daniel Vetter, Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Philipp Zabel, Matthias Brugger, Neil Armstrong, Kevin Hilman,
	Jerome Brunet, Marek Vasut, Stefan Agner, Sandy Huang,
	Yannick Fertre, Benjamin Gaignard, Maxime Coquelin,
	Alexandre Torgue, Maxime Ripard, Chen-Yu Tsai, Jyri Sarha,
	Tomi Valkeinen, linux-arm-kernel, linux-mips, linux-mediatek,
	linux-amlogic, linux-rockchip, linux-stm32, linux-sunxi

No need to set it explicitly anymore: since the preceding atomic-helper patch
in this series, drm_gem_plane_helper_prepare_fb is used by default whenever a
driver does not supply its own ->prepare_fb hook.
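
For reference, that fallback works roughly like the sketch below (a simplified
illustration, not the literal hunk from this series): the core only calls a
driver's ->prepare_fb if one is set, and otherwise falls back to the GEM
helper, so dropping the hook in the drivers below keeps the implicit
fencing/dma-resv wait intact.

#include <drm/drm_gem_atomic_helper.h>
#include <drm/drm_modeset_helper_vtables.h>
#include <drm/drm_plane.h>

/* Simplified sketch of the per-plane fallback in the atomic helpers. */
static int prepare_plane_fb(struct drm_plane *plane,
			    struct drm_plane_state *new_state)
{
	const struct drm_plane_helper_funcs *funcs = plane->helper_private;

	/* A driver-provided hook still takes precedence. */
	if (funcs->prepare_fb)
		return funcs->prepare_fb(plane, new_state);

	/* Otherwise the GEM helper waits on the dma-resv fences by default. */
	return drm_gem_plane_helper_prepare_fb(plane, new_state);
}

The actual fallback is added to drm_atomic_helper_prepare_planes() by the
atomic-helper patch earlier in this series.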

Acked-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Acked-by: Philippe Cornu <philippe.cornu@foss.st.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: "Heiko Stübner" <heiko@sntech.de>
Cc: Yannick Fertre <yannick.fertre@foss.st.com>
Cc: Philippe Cornu <philippe.cornu@foss.st.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Jyri Sarha <jyri.sarha@iki.fi>
Cc: Tomi Valkeinen <tomba@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-sunxi@lists.linux.dev
---
 drivers/gpu/drm/imx/dcss/dcss-plane.c       | 1 -
 drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c   | 1 -
 drivers/gpu/drm/ingenic/ingenic-ipu.c       | 1 -
 drivers/gpu/drm/mediatek/mtk_drm_plane.c    | 1 -
 drivers/gpu/drm/meson/meson_overlay.c       | 1 -
 drivers/gpu/drm/meson/meson_plane.c         | 1 -
 drivers/gpu/drm/mxsfb/mxsfb_kms.c           | 2 --
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 1 -
 drivers/gpu/drm/stm/ltdc.c                  | 1 -
 drivers/gpu/drm/sun4i/sun4i_layer.c         | 1 -
 drivers/gpu/drm/sun4i/sun8i_ui_layer.c      | 1 -
 drivers/gpu/drm/sun4i/sun8i_vi_layer.c      | 1 -
 drivers/gpu/drm/tidss/tidss_plane.c         | 1 -
 14 files changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/imx/dcss/dcss-plane.c b/drivers/gpu/drm/imx/dcss/dcss-plane.c
index 044d3bdf313c..ac45d54acd4e 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-plane.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-plane.c
@@ -361,7 +361,6 @@ static void dcss_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs dcss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = dcss_plane_atomic_check,
 	.atomic_update = dcss_plane_atomic_update,
 	.atomic_disable = dcss_plane_atomic_disable,
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index 8710f55d2579..ef114b6aa691 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ipu_plane_atomic_check,
 	.atomic_disable = ipu_plane_atomic_disable,
 	.atomic_update = ipu_plane_atomic_update,
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 5244f4763477..c296472164d9 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -830,7 +830,6 @@ static const struct drm_plane_helper_funcs ingenic_drm_plane_helper_funcs = {
 	.atomic_update		= ingenic_drm_plane_atomic_update,
 	.atomic_check		= ingenic_drm_plane_atomic_check,
 	.atomic_disable		= ingenic_drm_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_crtc_helper_funcs ingenic_drm_crtc_helper_funcs = {
diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index 61b6d9fdbba1..aeb8a757d213 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -625,7 +625,6 @@ static const struct drm_plane_helper_funcs ingenic_ipu_plane_helper_funcs = {
 	.atomic_update		= ingenic_ipu_plane_atomic_update,
 	.atomic_check		= ingenic_ipu_plane_atomic_check,
 	.atomic_disable		= ingenic_ipu_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static int
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
index b5582dcf564c..1667a7e7de38 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
@@ -227,7 +227,6 @@ static void mtk_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mtk_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mtk_plane_atomic_check,
 	.atomic_update = mtk_plane_atomic_update,
 	.atomic_disable = mtk_plane_atomic_disable,
diff --git a/drivers/gpu/drm/meson/meson_overlay.c b/drivers/gpu/drm/meson/meson_overlay.c
index ed063152aecd..dfef8afcc245 100644
--- a/drivers/gpu/drm/meson/meson_overlay.c
+++ b/drivers/gpu/drm/meson/meson_overlay.c
@@ -747,7 +747,6 @@ static const struct drm_plane_helper_funcs meson_overlay_helper_funcs = {
 	.atomic_check	= meson_overlay_atomic_check,
 	.atomic_disable	= meson_overlay_atomic_disable,
 	.atomic_update	= meson_overlay_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_overlay_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/meson/meson_plane.c b/drivers/gpu/drm/meson/meson_plane.c
index a18510dae4c8..8640a8a8a469 100644
--- a/drivers/gpu/drm/meson/meson_plane.c
+++ b/drivers/gpu/drm/meson/meson_plane.c
@@ -422,7 +422,6 @@ static const struct drm_plane_helper_funcs meson_plane_helper_funcs = {
 	.atomic_check	= meson_plane_atomic_check,
 	.atomic_disable	= meson_plane_atomic_disable,
 	.atomic_update	= meson_plane_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_plane_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
index 300e7bab0f43..8797c671d0d5 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
@@ -500,13 +500,11 @@ static bool mxsfb_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mxsfb_plane_primary_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_primary_atomic_update,
 };
 
 static const struct drm_plane_helper_funcs mxsfb_plane_overlay_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_overlay_atomic_update,
 };
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index f5b9028a16a3..ba9e14da41b4 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -1110,7 +1110,6 @@ static const struct drm_plane_helper_funcs plane_helper_funcs = {
 	.atomic_disable = vop_plane_atomic_disable,
 	.atomic_async_check = vop_plane_atomic_async_check,
 	.atomic_async_update = vop_plane_atomic_async_update,
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_plane_funcs vop_plane_funcs = {
diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 08b71248044d..0a6f0239a9f8 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -947,7 +947,6 @@ static const struct drm_plane_funcs ltdc_plane_funcs = {
 };
 
 static const struct drm_plane_helper_funcs ltdc_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ltdc_plane_atomic_check,
 	.atomic_update = ltdc_plane_atomic_update,
 	.atomic_disable = ltdc_plane_atomic_disable,
diff --git a/drivers/gpu/drm/sun4i/sun4i_layer.c b/drivers/gpu/drm/sun4i/sun4i_layer.c
index 11771bdd6e7c..929e95f86b5b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun4i_layer.c
@@ -127,7 +127,6 @@ static bool sun4i_layer_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_disable	= sun4i_backend_layer_atomic_disable,
 	.atomic_update	= sun4i_backend_layer_atomic_update,
 };
diff --git a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
index e779855bcd6e..7845c2a53a7f 100644
--- a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
@@ -332,7 +332,6 @@ static void sun8i_ui_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_ui_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_ui_layer_atomic_check,
 	.atomic_disable	= sun8i_ui_layer_atomic_disable,
 	.atomic_update	= sun8i_ui_layer_atomic_update,
diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
index 1c86c2dd0bbf..bb7c43036dfa 100644
--- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
@@ -436,7 +436,6 @@ static void sun8i_vi_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_vi_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_vi_layer_atomic_check,
 	.atomic_disable	= sun8i_vi_layer_atomic_disable,
 	.atomic_update	= sun8i_vi_layer_atomic_update,
diff --git a/drivers/gpu/drm/tidss/tidss_plane.c b/drivers/gpu/drm/tidss/tidss_plane.c
index 1acd15aa4193..217415ec8eea 100644
--- a/drivers/gpu/drm/tidss/tidss_plane.c
+++ b/drivers/gpu/drm/tidss/tidss_plane.c
@@ -158,7 +158,6 @@ static void drm_plane_destroy(struct drm_plane *plane)
 }
 
 static const struct drm_plane_helper_funcs tidss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = tidss_plane_atomic_check,
 	.atomic_update = tidss_plane_atomic_update,
 	.atomic_disable = tidss_plane_atomic_disable,
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Heiko Stuebner, Neil Armstrong, Daniel Vetter, Alexandre Torgue,
	Stefan Agner, linux-mips, Paul Cercueil, Benjamin Gaignard,
	Daniel Vetter, Fabio Estevam, linux-stm32, Jerome Brunet,
	Marek Vasut, Kevin Hilman, Jernej Skrabec, linux-rockchip,
	Chen-Yu Tsai, NXP Linux Team, Tomi Valkeinen, Sascha Hauer,
	Chun-Kuang Hu, Pengutronix Kernel Team, Martin Blumenstingl,
	Intel Graphics Development, Maxime Ripard, linux-mediatek,
	Laurentiu Palcu, Matthias Brugger, linux-amlogic,
	linux-arm-kernel, Maxime Coquelin, Tomi Valkeinen, Jyri Sarha,
	Yannick Fertre, Sandy Huang, linux-sunxi, Philippe Cornu,
	Philipp Zabel, Shawn Guo, Lucas Stach

No need to set it explicitly.

Acked-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Acked-by: Philippe Cornu <philippe.cornu@foss.st.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: "Heiko Stübner" <heiko@sntech.de>
Cc: Yannick Fertre <yannick.fertre@foss.st.com>
Cc: Philippe Cornu <philippe.cornu@foss.st.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Jyri Sarha <jyri.sarha@iki.fi>
Cc: Tomi Valkeinen <tomba@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-sunxi@lists.linux.dev
---
 drivers/gpu/drm/imx/dcss/dcss-plane.c       | 1 -
 drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c   | 1 -
 drivers/gpu/drm/ingenic/ingenic-ipu.c       | 1 -
 drivers/gpu/drm/mediatek/mtk_drm_plane.c    | 1 -
 drivers/gpu/drm/meson/meson_overlay.c       | 1 -
 drivers/gpu/drm/meson/meson_plane.c         | 1 -
 drivers/gpu/drm/mxsfb/mxsfb_kms.c           | 2 --
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 1 -
 drivers/gpu/drm/stm/ltdc.c                  | 1 -
 drivers/gpu/drm/sun4i/sun4i_layer.c         | 1 -
 drivers/gpu/drm/sun4i/sun8i_ui_layer.c      | 1 -
 drivers/gpu/drm/sun4i/sun8i_vi_layer.c      | 1 -
 drivers/gpu/drm/tidss/tidss_plane.c         | 1 -
 14 files changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/imx/dcss/dcss-plane.c b/drivers/gpu/drm/imx/dcss/dcss-plane.c
index 044d3bdf313c..ac45d54acd4e 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-plane.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-plane.c
@@ -361,7 +361,6 @@ static void dcss_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs dcss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = dcss_plane_atomic_check,
 	.atomic_update = dcss_plane_atomic_update,
 	.atomic_disable = dcss_plane_atomic_disable,
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index 8710f55d2579..ef114b6aa691 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ipu_plane_atomic_check,
 	.atomic_disable = ipu_plane_atomic_disable,
 	.atomic_update = ipu_plane_atomic_update,
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 5244f4763477..c296472164d9 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -830,7 +830,6 @@ static const struct drm_plane_helper_funcs ingenic_drm_plane_helper_funcs = {
 	.atomic_update		= ingenic_drm_plane_atomic_update,
 	.atomic_check		= ingenic_drm_plane_atomic_check,
 	.atomic_disable		= ingenic_drm_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_crtc_helper_funcs ingenic_drm_crtc_helper_funcs = {
diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index 61b6d9fdbba1..aeb8a757d213 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -625,7 +625,6 @@ static const struct drm_plane_helper_funcs ingenic_ipu_plane_helper_funcs = {
 	.atomic_update		= ingenic_ipu_plane_atomic_update,
 	.atomic_check		= ingenic_ipu_plane_atomic_check,
 	.atomic_disable		= ingenic_ipu_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static int
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
index b5582dcf564c..1667a7e7de38 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
@@ -227,7 +227,6 @@ static void mtk_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mtk_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mtk_plane_atomic_check,
 	.atomic_update = mtk_plane_atomic_update,
 	.atomic_disable = mtk_plane_atomic_disable,
diff --git a/drivers/gpu/drm/meson/meson_overlay.c b/drivers/gpu/drm/meson/meson_overlay.c
index ed063152aecd..dfef8afcc245 100644
--- a/drivers/gpu/drm/meson/meson_overlay.c
+++ b/drivers/gpu/drm/meson/meson_overlay.c
@@ -747,7 +747,6 @@ static const struct drm_plane_helper_funcs meson_overlay_helper_funcs = {
 	.atomic_check	= meson_overlay_atomic_check,
 	.atomic_disable	= meson_overlay_atomic_disable,
 	.atomic_update	= meson_overlay_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_overlay_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/meson/meson_plane.c b/drivers/gpu/drm/meson/meson_plane.c
index a18510dae4c8..8640a8a8a469 100644
--- a/drivers/gpu/drm/meson/meson_plane.c
+++ b/drivers/gpu/drm/meson/meson_plane.c
@@ -422,7 +422,6 @@ static const struct drm_plane_helper_funcs meson_plane_helper_funcs = {
 	.atomic_check	= meson_plane_atomic_check,
 	.atomic_disable	= meson_plane_atomic_disable,
 	.atomic_update	= meson_plane_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_plane_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
index 300e7bab0f43..8797c671d0d5 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
@@ -500,13 +500,11 @@ static bool mxsfb_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mxsfb_plane_primary_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_primary_atomic_update,
 };
 
 static const struct drm_plane_helper_funcs mxsfb_plane_overlay_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_overlay_atomic_update,
 };
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index f5b9028a16a3..ba9e14da41b4 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -1110,7 +1110,6 @@ static const struct drm_plane_helper_funcs plane_helper_funcs = {
 	.atomic_disable = vop_plane_atomic_disable,
 	.atomic_async_check = vop_plane_atomic_async_check,
 	.atomic_async_update = vop_plane_atomic_async_update,
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_plane_funcs vop_plane_funcs = {
diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 08b71248044d..0a6f0239a9f8 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -947,7 +947,6 @@ static const struct drm_plane_funcs ltdc_plane_funcs = {
 };
 
 static const struct drm_plane_helper_funcs ltdc_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ltdc_plane_atomic_check,
 	.atomic_update = ltdc_plane_atomic_update,
 	.atomic_disable = ltdc_plane_atomic_disable,
diff --git a/drivers/gpu/drm/sun4i/sun4i_layer.c b/drivers/gpu/drm/sun4i/sun4i_layer.c
index 11771bdd6e7c..929e95f86b5b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun4i_layer.c
@@ -127,7 +127,6 @@ static bool sun4i_layer_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_disable	= sun4i_backend_layer_atomic_disable,
 	.atomic_update	= sun4i_backend_layer_atomic_update,
 };
diff --git a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
index e779855bcd6e..7845c2a53a7f 100644
--- a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
@@ -332,7 +332,6 @@ static void sun8i_ui_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_ui_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_ui_layer_atomic_check,
 	.atomic_disable	= sun8i_ui_layer_atomic_disable,
 	.atomic_update	= sun8i_ui_layer_atomic_update,
diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
index 1c86c2dd0bbf..bb7c43036dfa 100644
--- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
@@ -436,7 +436,6 @@ static void sun8i_vi_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_vi_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_vi_layer_atomic_check,
 	.atomic_disable	= sun8i_vi_layer_atomic_disable,
 	.atomic_update	= sun8i_vi_layer_atomic_update,
diff --git a/drivers/gpu/drm/tidss/tidss_plane.c b/drivers/gpu/drm/tidss/tidss_plane.c
index 1acd15aa4193..217415ec8eea 100644
--- a/drivers/gpu/drm/tidss/tidss_plane.c
+++ b/drivers/gpu/drm/tidss/tidss_plane.c
@@ -158,7 +158,6 @@ static void drm_plane_destroy(struct drm_plane *plane)
 }
 
 static const struct drm_plane_helper_funcs tidss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = tidss_plane_atomic_check,
 	.atomic_update = tidss_plane_atomic_update,
 	.atomic_disable = tidss_plane_atomic_disable,
-- 
2.32.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Heiko Stuebner,
	Paul Cercueil, Jernej Skrabec, Chun-Kuang Hu,
	Martin Blumenstingl, Tomi Valkeinen, Philippe Cornu, Lucas Stach,
	Daniel Vetter, Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Philipp Zabel, Matthias Brugger, Neil Armstrong, Kevin Hilman,
	Jerome Brunet, Marek Vasut, Stefan Agner, Sandy Huang,
	Yannick Fertre, Benjamin Gaignard, Maxime Coquelin,
	Alexandre Torgue, Maxime Ripard, Chen-Yu Tsai, Jyri Sarha,
	Tomi Valkeinen, linux-arm-kernel, linux-mips, linux-mediatek,
	linux-amlogic, linux-rockchip, linux-stm32, linux-sunxi

No need to set it explicitly.
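
For reference, the fallback that the atomic helpers gained earlier in
this series behaves roughly like the sketch below. This is a simplified
illustration, not the actual helper code - the real logic lives in
drm_atomic_helper_prepare_planes() and also warns about a cleanup_fb
hook without a matching prepare_fb:

#include <drm/drm_drv.h>
#include <drm/drm_gem_atomic_helper.h>
#include <drm/drm_modeset_helper_vtables.h>
#include <drm/drm_plane.h>

/* Rough sketch of the new default, not the real implementation. */
static int sketch_prepare_fb(struct drm_plane *plane,
			     struct drm_plane_state *new_state)
{
	const struct drm_plane_helper_funcs *funcs = plane->helper_private;

	if (funcs && funcs->prepare_fb)
		return funcs->prepare_fb(plane, new_state);

	/* Non-GEM drivers still have to provide their own hook. */
	if (!drm_core_check_feature(plane->dev, DRIVER_GEM))
		return 0;

	/* Default: pick up the implicit fences from the GEM BOs. */
	return drm_gem_plane_helper_prepare_fb(plane, new_state);
}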

Acked-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Acked-by: Philippe Cornu <philippe.cornu@foss.st.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Paul Cercueil <paul@crapouillou.net>
Cc: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Jerome Brunet <jbrunet@baylibre.com>
Cc: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Sandy Huang <hjc@rock-chips.com>
Cc: "Heiko Stübner" <heiko@sntech.de>
Cc: Yannick Fertre <yannick.fertre@foss.st.com>
Cc: Philippe Cornu <philippe.cornu@foss.st.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Jernej Skrabec <jernej.skrabec@gmail.com>
Cc: Jyri Sarha <jyri.sarha@iki.fi>
Cc: Tomi Valkeinen <tomba@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-mips@vger.kernel.org
Cc: linux-mediatek@lists.infradead.org
Cc: linux-amlogic@lists.infradead.org
Cc: linux-rockchip@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Cc: linux-sunxi@lists.linux.dev
---
 drivers/gpu/drm/imx/dcss/dcss-plane.c       | 1 -
 drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
 drivers/gpu/drm/ingenic/ingenic-drm-drv.c   | 1 -
 drivers/gpu/drm/ingenic/ingenic-ipu.c       | 1 -
 drivers/gpu/drm/mediatek/mtk_drm_plane.c    | 1 -
 drivers/gpu/drm/meson/meson_overlay.c       | 1 -
 drivers/gpu/drm/meson/meson_plane.c         | 1 -
 drivers/gpu/drm/mxsfb/mxsfb_kms.c           | 2 --
 drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 1 -
 drivers/gpu/drm/stm/ltdc.c                  | 1 -
 drivers/gpu/drm/sun4i/sun4i_layer.c         | 1 -
 drivers/gpu/drm/sun4i/sun8i_ui_layer.c      | 1 -
 drivers/gpu/drm/sun4i/sun8i_vi_layer.c      | 1 -
 drivers/gpu/drm/tidss/tidss_plane.c         | 1 -
 14 files changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/imx/dcss/dcss-plane.c b/drivers/gpu/drm/imx/dcss/dcss-plane.c
index 044d3bdf313c..ac45d54acd4e 100644
--- a/drivers/gpu/drm/imx/dcss/dcss-plane.c
+++ b/drivers/gpu/drm/imx/dcss/dcss-plane.c
@@ -361,7 +361,6 @@ static void dcss_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs dcss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = dcss_plane_atomic_check,
 	.atomic_update = dcss_plane_atomic_update,
 	.atomic_disable = dcss_plane_atomic_disable,
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index 8710f55d2579..ef114b6aa691 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ipu_plane_atomic_check,
 	.atomic_disable = ipu_plane_atomic_disable,
 	.atomic_update = ipu_plane_atomic_update,
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
index 5244f4763477..c296472164d9 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm-drv.c
@@ -830,7 +830,6 @@ static const struct drm_plane_helper_funcs ingenic_drm_plane_helper_funcs = {
 	.atomic_update		= ingenic_drm_plane_atomic_update,
 	.atomic_check		= ingenic_drm_plane_atomic_check,
 	.atomic_disable		= ingenic_drm_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_crtc_helper_funcs ingenic_drm_crtc_helper_funcs = {
diff --git a/drivers/gpu/drm/ingenic/ingenic-ipu.c b/drivers/gpu/drm/ingenic/ingenic-ipu.c
index 61b6d9fdbba1..aeb8a757d213 100644
--- a/drivers/gpu/drm/ingenic/ingenic-ipu.c
+++ b/drivers/gpu/drm/ingenic/ingenic-ipu.c
@@ -625,7 +625,6 @@ static const struct drm_plane_helper_funcs ingenic_ipu_plane_helper_funcs = {
 	.atomic_update		= ingenic_ipu_plane_atomic_update,
 	.atomic_check		= ingenic_ipu_plane_atomic_check,
 	.atomic_disable		= ingenic_ipu_plane_atomic_disable,
-	.prepare_fb		= drm_gem_plane_helper_prepare_fb,
 };
 
 static int
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
index b5582dcf564c..1667a7e7de38 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
@@ -227,7 +227,6 @@ static void mtk_plane_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mtk_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mtk_plane_atomic_check,
 	.atomic_update = mtk_plane_atomic_update,
 	.atomic_disable = mtk_plane_atomic_disable,
diff --git a/drivers/gpu/drm/meson/meson_overlay.c b/drivers/gpu/drm/meson/meson_overlay.c
index ed063152aecd..dfef8afcc245 100644
--- a/drivers/gpu/drm/meson/meson_overlay.c
+++ b/drivers/gpu/drm/meson/meson_overlay.c
@@ -747,7 +747,6 @@ static const struct drm_plane_helper_funcs meson_overlay_helper_funcs = {
 	.atomic_check	= meson_overlay_atomic_check,
 	.atomic_disable	= meson_overlay_atomic_disable,
 	.atomic_update	= meson_overlay_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_overlay_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/meson/meson_plane.c b/drivers/gpu/drm/meson/meson_plane.c
index a18510dae4c8..8640a8a8a469 100644
--- a/drivers/gpu/drm/meson/meson_plane.c
+++ b/drivers/gpu/drm/meson/meson_plane.c
@@ -422,7 +422,6 @@ static const struct drm_plane_helper_funcs meson_plane_helper_funcs = {
 	.atomic_check	= meson_plane_atomic_check,
 	.atomic_disable	= meson_plane_atomic_disable,
 	.atomic_update	= meson_plane_atomic_update,
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 };
 
 static bool meson_plane_format_mod_supported(struct drm_plane *plane,
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_kms.c b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
index 300e7bab0f43..8797c671d0d5 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_kms.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_kms.c
@@ -500,13 +500,11 @@ static bool mxsfb_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs mxsfb_plane_primary_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_primary_atomic_update,
 };
 
 static const struct drm_plane_helper_funcs mxsfb_plane_overlay_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = mxsfb_plane_atomic_check,
 	.atomic_update = mxsfb_plane_overlay_atomic_update,
 };
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index f5b9028a16a3..ba9e14da41b4 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -1110,7 +1110,6 @@ static const struct drm_plane_helper_funcs plane_helper_funcs = {
 	.atomic_disable = vop_plane_atomic_disable,
 	.atomic_async_check = vop_plane_atomic_async_check,
 	.atomic_async_update = vop_plane_atomic_async_update,
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 };
 
 static const struct drm_plane_funcs vop_plane_funcs = {
diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 08b71248044d..0a6f0239a9f8 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -947,7 +947,6 @@ static const struct drm_plane_funcs ltdc_plane_funcs = {
 };
 
 static const struct drm_plane_helper_funcs ltdc_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = ltdc_plane_atomic_check,
 	.atomic_update = ltdc_plane_atomic_update,
 	.atomic_disable = ltdc_plane_atomic_disable,
diff --git a/drivers/gpu/drm/sun4i/sun4i_layer.c b/drivers/gpu/drm/sun4i/sun4i_layer.c
index 11771bdd6e7c..929e95f86b5b 100644
--- a/drivers/gpu/drm/sun4i/sun4i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun4i_layer.c
@@ -127,7 +127,6 @@ static bool sun4i_layer_format_mod_supported(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_disable	= sun4i_backend_layer_atomic_disable,
 	.atomic_update	= sun4i_backend_layer_atomic_update,
 };
diff --git a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
index e779855bcd6e..7845c2a53a7f 100644
--- a/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_ui_layer.c
@@ -332,7 +332,6 @@ static void sun8i_ui_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_ui_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_ui_layer_atomic_check,
 	.atomic_disable	= sun8i_ui_layer_atomic_disable,
 	.atomic_update	= sun8i_ui_layer_atomic_update,
diff --git a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
index 1c86c2dd0bbf..bb7c43036dfa 100644
--- a/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_vi_layer.c
@@ -436,7 +436,6 @@ static void sun8i_vi_layer_atomic_update(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs sun8i_vi_layer_helper_funcs = {
-	.prepare_fb	= drm_gem_plane_helper_prepare_fb,
 	.atomic_check	= sun8i_vi_layer_atomic_check,
 	.atomic_disable	= sun8i_vi_layer_atomic_disable,
 	.atomic_update	= sun8i_vi_layer_atomic_update,
diff --git a/drivers/gpu/drm/tidss/tidss_plane.c b/drivers/gpu/drm/tidss/tidss_plane.c
index 1acd15aa4193..217415ec8eea 100644
--- a/drivers/gpu/drm/tidss/tidss_plane.c
+++ b/drivers/gpu/drm/tidss/tidss_plane.c
@@ -158,7 +158,6 @@ static void drm_plane_destroy(struct drm_plane *plane)
 }
 
 static const struct drm_plane_helper_funcs tidss_plane_helper_funcs = {
-	.prepare_fb = drm_gem_plane_helper_prepare_fb,
 	.atomic_check = tidss_plane_atomic_check,
 	.atomic_update = tidss_plane_atomic_update,
 	.atomic_disable = tidss_plane_atomic_disable,
-- 
2.32.0.rc2


_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 09/15] drm/armada: Remove prepare/cleanup_fb hooks
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, Russell King, Daniel Vetter

All they do is refcount the fb, which the atomic helpers already do.

This was necessary with the legacy helpers and I guess it just carried
over in the conversion. drm_plane_state always holds a full reference
for its ->fb pointer during its entire lifetime, see
__drm_atomic_helper_plane_destroy_state().
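
A sketch of the refcounting the atomic core already does - simplified,
the real code is drm_atomic_set_fb_for_plane() and
__drm_atomic_helper_plane_destroy_state():

#include <drm/drm_framebuffer.h>
#include <drm/drm_plane.h>

/* Assigning an fb to a plane state grabs a reference ... */
static void sketch_set_fb_for_plane(struct drm_plane_state *state,
				    struct drm_framebuffer *fb)
{
	if (fb)
		drm_framebuffer_get(fb);
	if (state->fb)
		drm_framebuffer_put(state->fb);
	state->fb = fb;
}

/* ... and the last reference is only dropped together with the state. */
static void sketch_destroy_state(struct drm_plane_state *state)
{
	if (state->fb)
		drm_framebuffer_put(state->fb);
}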

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Russell King <linux@armlinux.org.uk>
---
 drivers/gpu/drm/armada/armada_overlay.c |  2 --
 drivers/gpu/drm/armada/armada_plane.c   | 29 -------------------------
 drivers/gpu/drm/armada/armada_plane.h   |  2 --
 3 files changed, 33 deletions(-)

diff --git a/drivers/gpu/drm/armada/armada_overlay.c b/drivers/gpu/drm/armada/armada_overlay.c
index d3e3e5fdc390..424250535fed 100644
--- a/drivers/gpu/drm/armada/armada_overlay.c
+++ b/drivers/gpu/drm/armada/armada_overlay.c
@@ -247,8 +247,6 @@ static void armada_drm_overlay_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs armada_overlay_plane_helper_funcs = {
-	.prepare_fb	= armada_drm_plane_prepare_fb,
-	.cleanup_fb	= armada_drm_plane_cleanup_fb,
 	.atomic_check	= armada_drm_plane_atomic_check,
 	.atomic_update	= armada_drm_overlay_plane_atomic_update,
 	.atomic_disable	= armada_drm_overlay_plane_atomic_disable,
diff --git a/drivers/gpu/drm/armada/armada_plane.c b/drivers/gpu/drm/armada/armada_plane.c
index 40f5c34fb4d8..1c56a2883b91 100644
--- a/drivers/gpu/drm/armada/armada_plane.c
+++ b/drivers/gpu/drm/armada/armada_plane.c
@@ -78,33 +78,6 @@ void armada_drm_plane_calc(struct drm_plane_state *state, u32 addrs[2][3],
 	}
 }
 
-int armada_drm_plane_prepare_fb(struct drm_plane *plane,
-	struct drm_plane_state *state)
-{
-	DRM_DEBUG_KMS("[PLANE:%d:%s] [FB:%d]\n",
-		plane->base.id, plane->name,
-		state->fb ? state->fb->base.id : 0);
-
-	/*
-	 * Take a reference on the new framebuffer - we want to
-	 * hold on to it while the hardware is displaying it.
-	 */
-	if (state->fb)
-		drm_framebuffer_get(state->fb);
-	return 0;
-}
-
-void armada_drm_plane_cleanup_fb(struct drm_plane *plane,
-	struct drm_plane_state *old_state)
-{
-	DRM_DEBUG_KMS("[PLANE:%d:%s] [FB:%d]\n",
-		plane->base.id, plane->name,
-		old_state->fb ? old_state->fb->base.id : 0);
-
-	if (old_state->fb)
-		drm_framebuffer_put(old_state->fb);
-}
-
 int armada_drm_plane_atomic_check(struct drm_plane *plane,
 	struct drm_atomic_state *state)
 {
@@ -282,8 +255,6 @@ static void armada_drm_primary_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs armada_primary_plane_helper_funcs = {
-	.prepare_fb	= armada_drm_plane_prepare_fb,
-	.cleanup_fb	= armada_drm_plane_cleanup_fb,
 	.atomic_check	= armada_drm_plane_atomic_check,
 	.atomic_update	= armada_drm_primary_plane_atomic_update,
 	.atomic_disable	= armada_drm_primary_plane_atomic_disable,
diff --git a/drivers/gpu/drm/armada/armada_plane.h b/drivers/gpu/drm/armada/armada_plane.h
index 51dab8d8da22..368415c609a6 100644
--- a/drivers/gpu/drm/armada/armada_plane.h
+++ b/drivers/gpu/drm/armada/armada_plane.h
@@ -21,8 +21,6 @@ struct armada_plane_state {
 
 void armada_drm_plane_calc(struct drm_plane_state *state, u32 addrs[2][3],
 	u16 pitches[3], bool interlaced);
-int armada_drm_plane_prepare_fb(struct drm_plane *plane,
-	struct drm_plane_state *state);
 void armada_drm_plane_cleanup_fb(struct drm_plane *plane,
 	struct drm_plane_state *old_state);
 int armada_drm_plane_atomic_check(struct drm_plane *plane,
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH 09/15] drm/armada: Remove prepare/cleanup_fb hooks
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, Russell King, Daniel Vetter

All they do is refcount the fb, which the atomic helpers already do.

This was necessary with the legacy helpers and I guess it just carried
over in the conversion. drm_plane_state always holds a full reference
for its ->fb pointer during its entire lifetime, see
__drm_atomic_helper_plane_destroy_state().

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Russell King <linux@armlinux.org.uk>
---
 drivers/gpu/drm/armada/armada_overlay.c |  2 --
 drivers/gpu/drm/armada/armada_plane.c   | 29 -------------------------
 drivers/gpu/drm/armada/armada_plane.h   |  2 --
 3 files changed, 33 deletions(-)

diff --git a/drivers/gpu/drm/armada/armada_overlay.c b/drivers/gpu/drm/armada/armada_overlay.c
index d3e3e5fdc390..424250535fed 100644
--- a/drivers/gpu/drm/armada/armada_overlay.c
+++ b/drivers/gpu/drm/armada/armada_overlay.c
@@ -247,8 +247,6 @@ static void armada_drm_overlay_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs armada_overlay_plane_helper_funcs = {
-	.prepare_fb	= armada_drm_plane_prepare_fb,
-	.cleanup_fb	= armada_drm_plane_cleanup_fb,
 	.atomic_check	= armada_drm_plane_atomic_check,
 	.atomic_update	= armada_drm_overlay_plane_atomic_update,
 	.atomic_disable	= armada_drm_overlay_plane_atomic_disable,
diff --git a/drivers/gpu/drm/armada/armada_plane.c b/drivers/gpu/drm/armada/armada_plane.c
index 40f5c34fb4d8..1c56a2883b91 100644
--- a/drivers/gpu/drm/armada/armada_plane.c
+++ b/drivers/gpu/drm/armada/armada_plane.c
@@ -78,33 +78,6 @@ void armada_drm_plane_calc(struct drm_plane_state *state, u32 addrs[2][3],
 	}
 }
 
-int armada_drm_plane_prepare_fb(struct drm_plane *plane,
-	struct drm_plane_state *state)
-{
-	DRM_DEBUG_KMS("[PLANE:%d:%s] [FB:%d]\n",
-		plane->base.id, plane->name,
-		state->fb ? state->fb->base.id : 0);
-
-	/*
-	 * Take a reference on the new framebuffer - we want to
-	 * hold on to it while the hardware is displaying it.
-	 */
-	if (state->fb)
-		drm_framebuffer_get(state->fb);
-	return 0;
-}
-
-void armada_drm_plane_cleanup_fb(struct drm_plane *plane,
-	struct drm_plane_state *old_state)
-{
-	DRM_DEBUG_KMS("[PLANE:%d:%s] [FB:%d]\n",
-		plane->base.id, plane->name,
-		old_state->fb ? old_state->fb->base.id : 0);
-
-	if (old_state->fb)
-		drm_framebuffer_put(old_state->fb);
-}
-
 int armada_drm_plane_atomic_check(struct drm_plane *plane,
 	struct drm_atomic_state *state)
 {
@@ -282,8 +255,6 @@ static void armada_drm_primary_plane_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs armada_primary_plane_helper_funcs = {
-	.prepare_fb	= armada_drm_plane_prepare_fb,
-	.cleanup_fb	= armada_drm_plane_cleanup_fb,
 	.atomic_check	= armada_drm_plane_atomic_check,
 	.atomic_update	= armada_drm_primary_plane_atomic_update,
 	.atomic_disable	= armada_drm_primary_plane_atomic_disable,
diff --git a/drivers/gpu/drm/armada/armada_plane.h b/drivers/gpu/drm/armada/armada_plane.h
index 51dab8d8da22..368415c609a6 100644
--- a/drivers/gpu/drm/armada/armada_plane.h
+++ b/drivers/gpu/drm/armada/armada_plane.h
@@ -21,8 +21,6 @@ struct armada_plane_state {
 
 void armada_drm_plane_calc(struct drm_plane_state *state, u32 addrs[2][3],
 	u16 pitches[3], bool interlaced);
-int armada_drm_plane_prepare_fb(struct drm_plane *plane,
-	struct drm_plane_state *state);
 void armada_drm_plane_cleanup_fb(struct drm_plane *plane,
 	struct drm_plane_state *old_state);
 int armada_drm_plane_atomic_check(struct drm_plane *plane,
-- 
2.32.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Tian Tao, Hans de Goede, Laurent Pinchart, Thomas Zimmermann,
	Dave Airlie, Daniel Vetter

Add DRM_GEM_VRAM_PLANE_HELPER_FUNCS, like we already have for the
shadow helpers, and roll it out to the drivers.

Acked-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
 drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
 include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
 4 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index e5996ae03c49..f5d58c3088fe 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
-	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
-	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
+	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
 	.atomic_check = ast_primary_plane_helper_atomic_check,
 	.atomic_update = ast_primary_plane_helper_atomic_update,
 	.atomic_disable = ast_primary_plane_helper_atomic_disable,
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
index 29b8332b2bca..ccf80e369b4b 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
@@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
 };
 
 static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
-	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
-	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
+	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
 	.atomic_check = hibmc_plane_atomic_check,
 	.atomic_update = hibmc_plane_atomic_update,
 };
diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
index 964381d55fc1..972c83b720aa 100644
--- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
@@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
 	.atomic_check = vbox_primary_atomic_check,
 	.atomic_update = vbox_primary_atomic_update,
 	.atomic_disable = vbox_primary_atomic_disable,
-	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
-	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
+	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
 };
 
 static const struct drm_plane_funcs vbox_primary_plane_funcs = {
diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
index 27ed7e9243b9..f48d181c824b 100644
--- a/include/drm/drm_gem_vram_helper.h
+++ b/include/drm/drm_gem_vram_helper.h
@@ -124,6 +124,18 @@ void
 drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
 				     struct drm_plane_state *old_state);
 
+/**
+ * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
+ *	Initializes struct drm_plane_helper_funcs for VRAM handling
+ *
+ * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
+ * macro initializes struct drm_plane_helper_funcs to use the respective helper
+ * functions.
+ */
+#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
+	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
+	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
+
 /*
  * Helpers for struct drm_simple_display_pipe_funcs
  */
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Maxime Ripard, Tian Tao, Laurent Pinchart, Thomas Zimmermann,
	Dave Airlie, Daniel Vetter

Add DRM_GEM_VRAM_PLANE_HELPER_FUNCS, like we already have for the
shadow helpers, and roll it out to the drivers.

Acked-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
---
 drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
 drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
 drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
 include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
 4 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index e5996ae03c49..f5d58c3088fe 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
 }
 
 static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
-	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
-	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
+	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
 	.atomic_check = ast_primary_plane_helper_atomic_check,
 	.atomic_update = ast_primary_plane_helper_atomic_update,
 	.atomic_disable = ast_primary_plane_helper_atomic_disable,
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
index 29b8332b2bca..ccf80e369b4b 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
@@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
 };
 
 static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
-	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
-	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
+	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
 	.atomic_check = hibmc_plane_atomic_check,
 	.atomic_update = hibmc_plane_atomic_update,
 };
diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
index 964381d55fc1..972c83b720aa 100644
--- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
@@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
 	.atomic_check = vbox_primary_atomic_check,
 	.atomic_update = vbox_primary_atomic_update,
 	.atomic_disable = vbox_primary_atomic_disable,
-	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
-	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
+	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
 };
 
 static const struct drm_plane_funcs vbox_primary_plane_funcs = {
diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
index 27ed7e9243b9..f48d181c824b 100644
--- a/include/drm/drm_gem_vram_helper.h
+++ b/include/drm/drm_gem_vram_helper.h
@@ -124,6 +124,18 @@ void
 drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
 				     struct drm_plane_state *old_state);
 
+/**
+ * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
+ *	Initializes struct drm_plane_helper_funcs for VRAM handling
+ *
+ * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
+ * macro initializes struct drm_plane_helper_funcs to use the respective helper
+ * functions.
+ */
+#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
+	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
+	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
+
 /*
  * Helpers for struct drm_simple_display_pipe_funcs
  */
-- 
2.32.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 11/15] drm/omap: Follow implicit fencing in prepare_fb
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Intel Graphics Development,
	Tomi Valkeinen, Tomi Valkeinen

I guess no one ever tried running omap together with lima or panfrost;
I'm not even sure that's possible. Anyway, for consistency, fix omap's
prepare_fb to follow the implicit fencing rules.

Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tomi Valkeinen <tomba@kernel.org>
---
 drivers/gpu/drm/omapdrm/omap_plane.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c b/drivers/gpu/drm/omapdrm/omap_plane.c
index 801da917507d..512af976b7e9 100644
--- a/drivers/gpu/drm/omapdrm/omap_plane.c
+++ b/drivers/gpu/drm/omapdrm/omap_plane.c
@@ -6,6 +6,7 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_plane_helper.h>
 
 #include "omap_dmm_tiler.h"
@@ -29,6 +30,8 @@ static int omap_plane_prepare_fb(struct drm_plane *plane,
 	if (!new_state->fb)
 		return 0;
 
+	drm_gem_plane_helper_prepare_fb(plane, new_state);
+
 	return omap_framebuffer_pin(new_state->fb);
 }
 
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH 11/15] drm/omap: Follow implicit fencing in prepare_fb
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Daniel Vetter, Intel Graphics Development,
	Tomi Valkeinen, Tomi Valkeinen

I guess no one ever tried running omap together with lima or panfrost;
I'm not even sure that's possible. Anyway, for consistency, fix omap's
prepare_fb to follow the implicit fencing rules.

Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Tomi Valkeinen <tomba@kernel.org>
---
 drivers/gpu/drm/omapdrm/omap_plane.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c b/drivers/gpu/drm/omapdrm/omap_plane.c
index 801da917507d..512af976b7e9 100644
--- a/drivers/gpu/drm/omapdrm/omap_plane.c
+++ b/drivers/gpu/drm/omapdrm/omap_plane.c
@@ -6,6 +6,7 @@
 
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_plane_helper.h>
 
 #include "omap_dmm_tiler.h"
@@ -29,6 +30,8 @@ static int omap_plane_prepare_fb(struct drm_plane *plane,
 	if (!new_state->fb)
 		return 0;
 
+	drm_gem_plane_helper_prepare_fb(plane, new_state);
+
 	return omap_framebuffer_pin(new_state->fb);
 }
 
-- 
2.32.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 12/15] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Noralf Trønnes, Thomas Zimmermann, Daniel Vetter

It's tedious to review this all the time, and my audit showed that
arcpgu actually forgot to set this.

Make this the default and stop worrying.

Again I sprinkled WARN_ON_ONCE on top to make sure we don't have
strange combinations of hooks: cleanup_fb without prepare_fb doesn't
make sense, and since simple-pipe drivers are all new they had better
be GEM based drivers.

v2: Warn and bail when it's _not_ a GEM driver (Noralf)
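
Drivers that do need extra work in prepare_fb can keep their hook and
chain the GEM helper themselves, as the updated kerneldoc below
suggests. A hypothetical example, with made-up driver names:

#include <drm/drm_framebuffer.h>
#include <drm/drm_gem_atomic_helper.h>
#include <drm/drm_simple_kms_helper.h>

/* Made-up driver-specific pin helper, standing in for whatever the
 * driver needs to do on top of the fencing. */
static int foo_pin_framebuffer(struct drm_framebuffer *fb);

static int foo_pipe_prepare_fb(struct drm_simple_display_pipe *pipe,
			       struct drm_plane_state *plane_state)
{
	int ret;

	/* Still pick up the implicit fences from the GEM BOs ... */
	ret = drm_gem_simple_display_pipe_prepare_fb(pipe, plane_state);
	if (ret)
		return ret;

	/* ... then do the driver-specific preparation. */
	return foo_pin_framebuffer(plane_state->fb);
}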

Cc: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_simple_kms_helper.c | 12 ++++++++++--
 include/drm/drm_simple_kms_helper.h     |  7 +++++--
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
index 0b095a313c44..735f4f34bcc4 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -9,6 +9,8 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_managed.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_probe_helper.h>
@@ -225,8 +227,14 @@ static int drm_simple_kms_plane_prepare_fb(struct drm_plane *plane,
 	struct drm_simple_display_pipe *pipe;
 
 	pipe = container_of(plane, struct drm_simple_display_pipe, plane);
-	if (!pipe->funcs || !pipe->funcs->prepare_fb)
-		return 0;
+	if (!pipe->funcs || !pipe->funcs->prepare_fb) {
+		if (WARN_ON_ONCE(!drm_core_check_feature(plane->dev, DRIVER_GEM)))
+			return 0;
+
+		WARN_ON_ONCE(pipe->funcs && pipe->funcs->cleanup_fb);
+
+		return drm_gem_simple_display_pipe_prepare_fb(pipe, state);
+	}
 
 	return pipe->funcs->prepare_fb(pipe, state);
 }
diff --git a/include/drm/drm_simple_kms_helper.h b/include/drm/drm_simple_kms_helper.h
index ef9944e9c5fc..363a9a8c3587 100644
--- a/include/drm/drm_simple_kms_helper.h
+++ b/include/drm/drm_simple_kms_helper.h
@@ -116,8 +116,11 @@ struct drm_simple_display_pipe_funcs {
 	 * the documentation for the &drm_plane_helper_funcs.prepare_fb hook for
 	 * more details.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_simple_display_pipe_prepare_fb() for this hook.
+	 * For GEM drivers which have neither a @prepare_fb nor a @cleanup_fb hook
+	 * set, drm_gem_simple_display_pipe_prepare_fb() is called automatically
+	 * to implement this. Other drivers which need additional plane
+	 * processing can call drm_gem_simple_display_pipe_prepare_fb() from
+	 * their @prepare_fb hook.
 	 */
 	int (*prepare_fb)(struct drm_simple_display_pipe *pipe,
 			  struct drm_plane_state *plane_state);
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH 12/15] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Maxime Ripard, Noralf Trønnes, Thomas Zimmermann,
	Daniel Vetter

It's tedious to review this all the time, and my audit showed that
arcpgu actually forgot to set this.

Make this the default and stop worrying.

Again I sprinkled WARN_ON_ONCE on top to make sure we don't have
strange combinations of hooks: cleanup_fb without prepare_fb doesn't
make sense, and since simple-pipe drivers are all new they had better
be GEM based drivers.

v2: Warn and bail when it's _not_ a GEM driver (Noralf)

Cc: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_simple_kms_helper.c | 12 ++++++++++--
 include/drm/drm_simple_kms_helper.h     |  7 +++++--
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
index 0b095a313c44..735f4f34bcc4 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -9,6 +9,8 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_managed.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_probe_helper.h>
@@ -225,8 +227,14 @@ static int drm_simple_kms_plane_prepare_fb(struct drm_plane *plane,
 	struct drm_simple_display_pipe *pipe;
 
 	pipe = container_of(plane, struct drm_simple_display_pipe, plane);
-	if (!pipe->funcs || !pipe->funcs->prepare_fb)
-		return 0;
+	if (!pipe->funcs || !pipe->funcs->prepare_fb) {
+		if (WARN_ON_ONCE(!drm_core_check_feature(plane->dev, DRIVER_GEM)))
+			return 0;
+
+		WARN_ON_ONCE(pipe->funcs && pipe->funcs->cleanup_fb);
+
+		return drm_gem_simple_display_pipe_prepare_fb(pipe, state);
+	}
 
 	return pipe->funcs->prepare_fb(pipe, state);
 }
diff --git a/include/drm/drm_simple_kms_helper.h b/include/drm/drm_simple_kms_helper.h
index ef9944e9c5fc..363a9a8c3587 100644
--- a/include/drm/drm_simple_kms_helper.h
+++ b/include/drm/drm_simple_kms_helper.h
@@ -116,8 +116,11 @@ struct drm_simple_display_pipe_funcs {
 	 * the documentation for the &drm_plane_helper_funcs.prepare_fb hook for
 	 * more details.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_simple_display_pipe_prepare_fb() for this hook.
+	 * For GEM drivers which have neither a @prepare_fb nor a @cleanup_fb hook
+	 * set, drm_gem_simple_display_pipe_prepare_fb() is called automatically
+	 * to implement this. Other drivers which need additional plane
+	 * processing can call drm_gem_simple_display_pipe_prepare_fb() from
+	 * their @prepare_fb hook.
 	 */
 	int (*prepare_fb)(struct drm_simple_display_pipe *pipe,
 			  struct drm_plane_state *plane_state);
-- 
2.32.0.rc2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 13/15] drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
  (?)
  (?)
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, David Lechner,
	Noralf Trønnes, Oleksandr Andrushchenko, Linus Walleij,
	Daniel Vetter, Joel Stanley, Andrew Jeffery, Emma Anholt,
	Kamlesh Gurudasani, Maxime Ripard, Thomas Zimmermann,
	Sam Ravnborg, Alex Deucher, Andy Shevchenko, linux-aspeed,
	linux-arm-kernel, xen-devel

Go through all the drivers and delete the explicit prepare_fb hook
assignment, since it's the default now.

Acked-by: David Lechner <david@lechnology.com>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Lechner <david@lechnology.com>
Cc: Kamlesh Gurudasani <kamlesh.gurudasani@gmail.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: linux-aspeed@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: xen-devel@lists.xenproject.org
---
 drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 1 -
 drivers/gpu/drm/gud/gud_drv.c            | 1 -
 drivers/gpu/drm/mcde/mcde_display.c      | 1 -
 drivers/gpu/drm/pl111/pl111_display.c    | 1 -
 drivers/gpu/drm/tiny/hx8357d.c           | 1 -
 drivers/gpu/drm/tiny/ili9225.c           | 1 -
 drivers/gpu/drm/tiny/ili9341.c           | 1 -
 drivers/gpu/drm/tiny/ili9486.c           | 1 -
 drivers/gpu/drm/tiny/mi0283qt.c          | 1 -
 drivers/gpu/drm/tiny/repaper.c           | 1 -
 drivers/gpu/drm/tiny/st7586.c            | 1 -
 drivers/gpu/drm/tiny/st7735r.c           | 1 -
 drivers/gpu/drm/tve200/tve200_display.c  | 1 -
 drivers/gpu/drm/xen/xen_drm_front_kms.c  | 1 -
 14 files changed, 14 deletions(-)

diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
index 098f96d4d50d..827e62c1daba 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
@@ -220,7 +220,6 @@ static const struct drm_simple_display_pipe_funcs aspeed_gfx_funcs = {
 	.enable		= aspeed_gfx_pipe_enable,
 	.disable	= aspeed_gfx_pipe_disable,
 	.update		= aspeed_gfx_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 	.enable_vblank	= aspeed_gfx_enable_vblank,
 	.disable_vblank	= aspeed_gfx_disable_vblank,
 };
diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index e8b672dc9832..1925df9c0fb7 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -364,7 +364,6 @@ static void gud_debugfs_init(struct drm_minor *minor)
 static const struct drm_simple_display_pipe_funcs gud_pipe_funcs = {
 	.check      = gud_pipe_check,
 	.update	    = gud_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_mode_config_funcs gud_mode_config_funcs = {
diff --git a/drivers/gpu/drm/mcde/mcde_display.c b/drivers/gpu/drm/mcde/mcde_display.c
index 4ddc55d58f38..ce12a36e2db4 100644
--- a/drivers/gpu/drm/mcde/mcde_display.c
+++ b/drivers/gpu/drm/mcde/mcde_display.c
@@ -1479,7 +1479,6 @@ static struct drm_simple_display_pipe_funcs mcde_display_funcs = {
 	.update = mcde_display_update,
 	.enable_vblank = mcde_display_enable_vblank,
 	.disable_vblank = mcde_display_disable_vblank,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 int mcde_display_init(struct drm_device *drm)
diff --git a/drivers/gpu/drm/pl111/pl111_display.c b/drivers/gpu/drm/pl111/pl111_display.c
index 6fd7f13f1aca..b5a8859739a2 100644
--- a/drivers/gpu/drm/pl111/pl111_display.c
+++ b/drivers/gpu/drm/pl111/pl111_display.c
@@ -440,7 +440,6 @@ static struct drm_simple_display_pipe_funcs pl111_display_funcs = {
 	.enable = pl111_display_enable,
 	.disable = pl111_display_disable,
 	.update = pl111_display_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static int pl111_clk_div_choose_div(struct clk_hw *hw, unsigned long rate,
diff --git a/drivers/gpu/drm/tiny/hx8357d.c b/drivers/gpu/drm/tiny/hx8357d.c
index da5df93450de..9b33c05732aa 100644
--- a/drivers/gpu/drm/tiny/hx8357d.c
+++ b/drivers/gpu/drm/tiny/hx8357d.c
@@ -184,7 +184,6 @@ static const struct drm_simple_display_pipe_funcs hx8357d_pipe_funcs = {
 	.enable = yx240qv29_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode yx350hv15_mode = {
diff --git a/drivers/gpu/drm/tiny/ili9225.c b/drivers/gpu/drm/tiny/ili9225.c
index 69265d8a3beb..976d3209f164 100644
--- a/drivers/gpu/drm/tiny/ili9225.c
+++ b/drivers/gpu/drm/tiny/ili9225.c
@@ -328,7 +328,6 @@ static const struct drm_simple_display_pipe_funcs ili9225_pipe_funcs = {
 	.enable		= ili9225_pipe_enable,
 	.disable	= ili9225_pipe_disable,
 	.update		= ili9225_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode ili9225_mode = {
diff --git a/drivers/gpu/drm/tiny/ili9341.c b/drivers/gpu/drm/tiny/ili9341.c
index ad9ce7b4f76f..37e0c33399c8 100644
--- a/drivers/gpu/drm/tiny/ili9341.c
+++ b/drivers/gpu/drm/tiny/ili9341.c
@@ -140,7 +140,6 @@ static const struct drm_simple_display_pipe_funcs ili9341_pipe_funcs = {
 	.enable = yx240qv29_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode yx240qv29_mode = {
diff --git a/drivers/gpu/drm/tiny/ili9486.c b/drivers/gpu/drm/tiny/ili9486.c
index 75aa1476c66c..e9a63f4b2993 100644
--- a/drivers/gpu/drm/tiny/ili9486.c
+++ b/drivers/gpu/drm/tiny/ili9486.c
@@ -153,7 +153,6 @@ static const struct drm_simple_display_pipe_funcs waveshare_pipe_funcs = {
 	.enable = waveshare_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode waveshare_mode = {
diff --git a/drivers/gpu/drm/tiny/mi0283qt.c b/drivers/gpu/drm/tiny/mi0283qt.c
index 82fd1ad3413f..023de49e7a8e 100644
--- a/drivers/gpu/drm/tiny/mi0283qt.c
+++ b/drivers/gpu/drm/tiny/mi0283qt.c
@@ -144,7 +144,6 @@ static const struct drm_simple_display_pipe_funcs mi0283qt_pipe_funcs = {
 	.enable = mi0283qt_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode mi0283qt_mode = {
diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
index 2cee07a2e00b..007d9d59f01c 100644
--- a/drivers/gpu/drm/tiny/repaper.c
+++ b/drivers/gpu/drm/tiny/repaper.c
@@ -861,7 +861,6 @@ static const struct drm_simple_display_pipe_funcs repaper_pipe_funcs = {
 	.enable = repaper_pipe_enable,
 	.disable = repaper_pipe_disable,
 	.update = repaper_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static int repaper_connector_get_modes(struct drm_connector *connector)
diff --git a/drivers/gpu/drm/tiny/st7586.c b/drivers/gpu/drm/tiny/st7586.c
index 05db980cc047..1be55bed609a 100644
--- a/drivers/gpu/drm/tiny/st7586.c
+++ b/drivers/gpu/drm/tiny/st7586.c
@@ -268,7 +268,6 @@ static const struct drm_simple_display_pipe_funcs st7586_pipe_funcs = {
 	.enable		= st7586_pipe_enable,
 	.disable	= st7586_pipe_disable,
 	.update		= st7586_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode st7586_mode = {
diff --git a/drivers/gpu/drm/tiny/st7735r.c b/drivers/gpu/drm/tiny/st7735r.c
index ec9dc817a2cc..122320db5d38 100644
--- a/drivers/gpu/drm/tiny/st7735r.c
+++ b/drivers/gpu/drm/tiny/st7735r.c
@@ -136,7 +136,6 @@ static const struct drm_simple_display_pipe_funcs st7735r_pipe_funcs = {
 	.enable		= st7735r_pipe_enable,
 	.disable	= mipi_dbi_pipe_disable,
 	.update		= mipi_dbi_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct st7735r_cfg jd_t18003_t01_cfg = {
diff --git a/drivers/gpu/drm/tve200/tve200_display.c b/drivers/gpu/drm/tve200/tve200_display.c
index 50e1fb71869f..17b8c8dd169d 100644
--- a/drivers/gpu/drm/tve200/tve200_display.c
+++ b/drivers/gpu/drm/tve200/tve200_display.c
@@ -316,7 +316,6 @@ static const struct drm_simple_display_pipe_funcs tve200_display_funcs = {
 	.enable = tve200_display_enable,
 	.disable = tve200_display_disable,
 	.update = tve200_display_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 	.enable_vblank = tve200_display_enable_vblank,
 	.disable_vblank = tve200_display_disable_vblank,
 };
diff --git a/drivers/gpu/drm/xen/xen_drm_front_kms.c b/drivers/gpu/drm/xen/xen_drm_front_kms.c
index 371202ebe900..cfda74490765 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_kms.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_kms.c
@@ -302,7 +302,6 @@ static const struct drm_simple_display_pipe_funcs display_funcs = {
 	.mode_valid = display_mode_valid,
 	.enable = display_enable,
 	.disable = display_disable,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 	.check = display_check,
 	.update = display_update,
 };
-- 
2.32.0.rc2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 13/15] drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
@ 2021-06-22 16:55   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: linux-arm-kernel, Andy Shevchenko, David Lechner, Emma Anholt,
	Oleksandr Andrushchenko, Andrew Jeffery, Daniel Vetter,
	Intel Graphics Development, Noralf Trønnes, Joel Stanley,
	Thomas Zimmermann, xen-devel, Alex Deucher, Daniel Vetter,
	Kamlesh Gurudasani, Sam Ravnborg, linux-aspeed

Go through all the drivers and delete the explicit prepare_fb hook
assignment, since it's the default now.

Acked-by: David Lechner <david@lechnology.com>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: "Noralf Trønnes" <noralf@tronnes.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Emma Anholt <emma@anholt.net>
Cc: David Lechner <david@lechnology.com>
Cc: Kamlesh Gurudasani <kamlesh.gurudasani@gmail.com>
Cc: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: linux-aspeed@lists.ozlabs.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: xen-devel@lists.xenproject.org
---
 drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c | 1 -
 drivers/gpu/drm/gud/gud_drv.c            | 1 -
 drivers/gpu/drm/mcde/mcde_display.c      | 1 -
 drivers/gpu/drm/pl111/pl111_display.c    | 1 -
 drivers/gpu/drm/tiny/hx8357d.c           | 1 -
 drivers/gpu/drm/tiny/ili9225.c           | 1 -
 drivers/gpu/drm/tiny/ili9341.c           | 1 -
 drivers/gpu/drm/tiny/ili9486.c           | 1 -
 drivers/gpu/drm/tiny/mi0283qt.c          | 1 -
 drivers/gpu/drm/tiny/repaper.c           | 1 -
 drivers/gpu/drm/tiny/st7586.c            | 1 -
 drivers/gpu/drm/tiny/st7735r.c           | 1 -
 drivers/gpu/drm/tve200/tve200_display.c  | 1 -
 drivers/gpu/drm/xen/xen_drm_front_kms.c  | 1 -
 14 files changed, 14 deletions(-)

diff --git a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
index 098f96d4d50d..827e62c1daba 100644
--- a/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
+++ b/drivers/gpu/drm/aspeed/aspeed_gfx_crtc.c
@@ -220,7 +220,6 @@ static const struct drm_simple_display_pipe_funcs aspeed_gfx_funcs = {
 	.enable		= aspeed_gfx_pipe_enable,
 	.disable	= aspeed_gfx_pipe_disable,
 	.update		= aspeed_gfx_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 	.enable_vblank	= aspeed_gfx_enable_vblank,
 	.disable_vblank	= aspeed_gfx_disable_vblank,
 };
diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c
index e8b672dc9832..1925df9c0fb7 100644
--- a/drivers/gpu/drm/gud/gud_drv.c
+++ b/drivers/gpu/drm/gud/gud_drv.c
@@ -364,7 +364,6 @@ static void gud_debugfs_init(struct drm_minor *minor)
 static const struct drm_simple_display_pipe_funcs gud_pipe_funcs = {
 	.check      = gud_pipe_check,
 	.update	    = gud_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_mode_config_funcs gud_mode_config_funcs = {
diff --git a/drivers/gpu/drm/mcde/mcde_display.c b/drivers/gpu/drm/mcde/mcde_display.c
index 4ddc55d58f38..ce12a36e2db4 100644
--- a/drivers/gpu/drm/mcde/mcde_display.c
+++ b/drivers/gpu/drm/mcde/mcde_display.c
@@ -1479,7 +1479,6 @@ static struct drm_simple_display_pipe_funcs mcde_display_funcs = {
 	.update = mcde_display_update,
 	.enable_vblank = mcde_display_enable_vblank,
 	.disable_vblank = mcde_display_disable_vblank,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 int mcde_display_init(struct drm_device *drm)
diff --git a/drivers/gpu/drm/pl111/pl111_display.c b/drivers/gpu/drm/pl111/pl111_display.c
index 6fd7f13f1aca..b5a8859739a2 100644
--- a/drivers/gpu/drm/pl111/pl111_display.c
+++ b/drivers/gpu/drm/pl111/pl111_display.c
@@ -440,7 +440,6 @@ static struct drm_simple_display_pipe_funcs pl111_display_funcs = {
 	.enable = pl111_display_enable,
 	.disable = pl111_display_disable,
 	.update = pl111_display_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static int pl111_clk_div_choose_div(struct clk_hw *hw, unsigned long rate,
diff --git a/drivers/gpu/drm/tiny/hx8357d.c b/drivers/gpu/drm/tiny/hx8357d.c
index da5df93450de..9b33c05732aa 100644
--- a/drivers/gpu/drm/tiny/hx8357d.c
+++ b/drivers/gpu/drm/tiny/hx8357d.c
@@ -184,7 +184,6 @@ static const struct drm_simple_display_pipe_funcs hx8357d_pipe_funcs = {
 	.enable = yx240qv29_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode yx350hv15_mode = {
diff --git a/drivers/gpu/drm/tiny/ili9225.c b/drivers/gpu/drm/tiny/ili9225.c
index 69265d8a3beb..976d3209f164 100644
--- a/drivers/gpu/drm/tiny/ili9225.c
+++ b/drivers/gpu/drm/tiny/ili9225.c
@@ -328,7 +328,6 @@ static const struct drm_simple_display_pipe_funcs ili9225_pipe_funcs = {
 	.enable		= ili9225_pipe_enable,
 	.disable	= ili9225_pipe_disable,
 	.update		= ili9225_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode ili9225_mode = {
diff --git a/drivers/gpu/drm/tiny/ili9341.c b/drivers/gpu/drm/tiny/ili9341.c
index ad9ce7b4f76f..37e0c33399c8 100644
--- a/drivers/gpu/drm/tiny/ili9341.c
+++ b/drivers/gpu/drm/tiny/ili9341.c
@@ -140,7 +140,6 @@ static const struct drm_simple_display_pipe_funcs ili9341_pipe_funcs = {
 	.enable = yx240qv29_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode yx240qv29_mode = {
diff --git a/drivers/gpu/drm/tiny/ili9486.c b/drivers/gpu/drm/tiny/ili9486.c
index 75aa1476c66c..e9a63f4b2993 100644
--- a/drivers/gpu/drm/tiny/ili9486.c
+++ b/drivers/gpu/drm/tiny/ili9486.c
@@ -153,7 +153,6 @@ static const struct drm_simple_display_pipe_funcs waveshare_pipe_funcs = {
 	.enable = waveshare_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode waveshare_mode = {
diff --git a/drivers/gpu/drm/tiny/mi0283qt.c b/drivers/gpu/drm/tiny/mi0283qt.c
index 82fd1ad3413f..023de49e7a8e 100644
--- a/drivers/gpu/drm/tiny/mi0283qt.c
+++ b/drivers/gpu/drm/tiny/mi0283qt.c
@@ -144,7 +144,6 @@ static const struct drm_simple_display_pipe_funcs mi0283qt_pipe_funcs = {
 	.enable = mi0283qt_enable,
 	.disable = mipi_dbi_pipe_disable,
 	.update = mipi_dbi_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode mi0283qt_mode = {
diff --git a/drivers/gpu/drm/tiny/repaper.c b/drivers/gpu/drm/tiny/repaper.c
index 2cee07a2e00b..007d9d59f01c 100644
--- a/drivers/gpu/drm/tiny/repaper.c
+++ b/drivers/gpu/drm/tiny/repaper.c
@@ -861,7 +861,6 @@ static const struct drm_simple_display_pipe_funcs repaper_pipe_funcs = {
 	.enable = repaper_pipe_enable,
 	.disable = repaper_pipe_disable,
 	.update = repaper_pipe_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static int repaper_connector_get_modes(struct drm_connector *connector)
diff --git a/drivers/gpu/drm/tiny/st7586.c b/drivers/gpu/drm/tiny/st7586.c
index 05db980cc047..1be55bed609a 100644
--- a/drivers/gpu/drm/tiny/st7586.c
+++ b/drivers/gpu/drm/tiny/st7586.c
@@ -268,7 +268,6 @@ static const struct drm_simple_display_pipe_funcs st7586_pipe_funcs = {
 	.enable		= st7586_pipe_enable,
 	.disable	= st7586_pipe_disable,
 	.update		= st7586_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct drm_display_mode st7586_mode = {
diff --git a/drivers/gpu/drm/tiny/st7735r.c b/drivers/gpu/drm/tiny/st7735r.c
index ec9dc817a2cc..122320db5d38 100644
--- a/drivers/gpu/drm/tiny/st7735r.c
+++ b/drivers/gpu/drm/tiny/st7735r.c
@@ -136,7 +136,6 @@ static const struct drm_simple_display_pipe_funcs st7735r_pipe_funcs = {
 	.enable		= st7735r_pipe_enable,
 	.disable	= mipi_dbi_pipe_disable,
 	.update		= mipi_dbi_pipe_update,
-	.prepare_fb	= drm_gem_simple_display_pipe_prepare_fb,
 };
 
 static const struct st7735r_cfg jd_t18003_t01_cfg = {
diff --git a/drivers/gpu/drm/tve200/tve200_display.c b/drivers/gpu/drm/tve200/tve200_display.c
index 50e1fb71869f..17b8c8dd169d 100644
--- a/drivers/gpu/drm/tve200/tve200_display.c
+++ b/drivers/gpu/drm/tve200/tve200_display.c
@@ -316,7 +316,6 @@ static const struct drm_simple_display_pipe_funcs tve200_display_funcs = {
 	.enable = tve200_display_enable,
 	.disable = tve200_display_disable,
 	.update = tve200_display_update,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 	.enable_vblank = tve200_display_enable_vblank,
 	.disable_vblank = tve200_display_disable_vblank,
 };
diff --git a/drivers/gpu/drm/xen/xen_drm_front_kms.c b/drivers/gpu/drm/xen/xen_drm_front_kms.c
index 371202ebe900..cfda74490765 100644
--- a/drivers/gpu/drm/xen/xen_drm_front_kms.c
+++ b/drivers/gpu/drm/xen/xen_drm_front_kms.c
@@ -302,7 +302,6 @@ static const struct drm_simple_display_pipe_funcs display_funcs = {
 	.mode_valid = display_mode_valid,
 	.enable = display_enable,
 	.disable = display_disable,
-	.prepare_fb = drm_gem_simple_display_pipe_prepare_fb,
 	.check = display_check,
 	.update = display_update,
 };
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Thomas Zimmermann, Daniel Vetter, Christian König

Spotted while trying to convert panfrost over to these helpers.
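
For illustration, a caller then looks like this minimal sketch (not taken
from this series); the point is that there is no dma_fence_put() on either
the success or the error path:

  static int job_add_dependency(struct xarray *deps, struct dma_fence *fence)
  {
          /* Hand a full reference to the helper: drm_gem_fence_array_add()
           * consumes it whether it succeeds or fails, so the caller must
           * not drop it again afterwards.
           */
          return drm_gem_fence_array_add(deps, dma_fence_get(fence));
  }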

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_gem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index ba2e64ed8b47..68deb1de8235 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
  * @fence_array: array of dma_fence * for the job to block on.
  * @fence: the dma_fence to add to the list of dependencies.
  *
+ * This function consumes the reference for @fence both on success and on
+ * error.
+ *
  * Returns:
  * 0 on success, or an error on failing to expand the array.
  */
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 16:55   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 16:55 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, Daniel Stone, Christian König, Daniel Vetter,
	Daniel Vetter, Intel Graphics Development, Kevin Wang,
	linaro-mm-sig, Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Alex Deucher, mesa-dev, Michel Dänzer, Dennis Li,
	Deepak R Varma

WARNING: Absolutely untested beyond "gcc isn't dying in agony".

Implicit fencing done properly needs to treat the implicit fencing
slots like a funny kind of IPC mailbox. In other words it needs to be
managed explicitly. This is the only way it will mesh well with explicit
fencing userspace like vk, and it's also the bare minimum required to
be able to manage anything else that wants to use the same buffer on
multiple engines in parallel, and still be able to share it through
implicit sync.

amdgpu completely lacks such an uapi. Fix this.

Luckily the concept of ignoring implicit fences exists already, and
takes care of all the complexities of making sure that non-optional
fences (like bo moves) are not ignored. This support was added in

commit 177ae09b5d699a5ebd1cafcee78889db968abf54
Author: Andres Rodriguez <andresx7@gmail.com>
Date:   Fri Sep 15 20:44:06 2017 -0400

    drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2

Unfortunately it's the wrong semantics, because it's a bo flag and
disables implicit sync on an allocated buffer completely.

We _do_ want implicit sync, but control it explicitly. For this we
need a flag on the drm_file, so that a given userspace (like vulkan)
can manage the implicit sync slots explicitly. The other side of the
pipeline (compositor, other process or just different stage in a media
pipeline in the same process) can then either do the same, or fully
participate in the implicit sync as implemented by the kernel by
default.

By building on the existing flag for buffers we avoid any issues with
opening up additional security concerns - anything this new flag here
allows is already possible today.

All drivers which support this concept of a userspace-specific
opt-out of implicit sync have a flag in their CS ioctl, but in reality
that turned out to be a bit too inflexible. See the discussion below;
let's try to do a bit better for amdgpu.

This alone only allows us to completely avoid any stalls due to
implicit sync; it does not yet allow us to use implicit sync as a
strange form of IPC for sync_file.

For that we need two more pieces:

- a way to get the current implicit sync fences out of a buffer. Could
  be done in a driver ioctl, but everyone needs this, and generally a
  dma-buf is involved anyway to establish the sharing. So an ioctl on
  the dma-buf makes a ton more sense:

  https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/

  Current drivers in upstream solve this by having the opt-out flag
  on their CS ioctl. This has the downside that very often the CS
  which must actually stall for the implicit fence is run a while
  after the implicit fence point was logically sampled per the api
  spec (vk passes an explicit syncobj around for that afaiui), and so
  results in oversync. Converting the implicit sync fences into a
  snap-shot sync_file is actually accurate.

- Similarly, we need to be able to set the exclusive implicit fence.
  Current drivers again do this with a CS ioctl flag, with again the
  same problem that by the time the CS happens, additional dependencies
  have been added. An explicit ioctl to only insert a sync_file (while
  respecting the rules for how exclusive and shared fence slots must
  be updated in struct dma_resv) is much better. This is proposed here:

  https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/

These three pieces together allow userspace to fully control implicit
fencing and remove all unnecessary stall points due to them.
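
To make the flow concrete, a rough userspace sketch of that sync_file
round-trip is below. It assumes the dma-buf ioctls land more or less as
proposed in the two links above; struct and ioctl names are taken from
that proposal and are not final uapi:

  #include <sys/ioctl.h>
  #include <linux/dma-buf.h>

  /* Producer: snapshot the buffer's current implicit fences into a
   * sync_file at the point where the api says the buffer is handed off.
   */
  static int snapshot_implicit_fences(int dmabuf_fd)
  {
          struct dma_buf_export_sync_file args = { .flags = DMA_BUF_SYNC_RW };

          if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &args) < 0)
                  return -1;

          return args.fd; /* sync_file fd to pass along with the buffer */
  }

  /* Consumer: install its render-done sync_file as the new exclusive
   * (write) fence before handing the buffer back.
   */
  static int set_implicit_write_fence(int dmabuf_fd, int sync_file_fd)
  {
          struct dma_buf_import_sync_file args = {
                  .flags = DMA_BUF_SYNC_WRITE,
                  .fd = sync_file_fd,
          };

          return ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &args);
  }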

Well, as much as the implicit fencing model fundamentally allows:
There is only one set of fences, you can only choose to sync against
just the writers (exclusive slot), or against everyone. Hence suballocating
multiple buffers or anything else like this is fundamentally not
possible, and can only be fixed by a proper explicit fencing model.

Aside from that caveat this model gets implicit fencing as close to
explicit fencing semantics as possible.

On the actual implementation I opted for a simple setparam ioctl, no
locking (just atomic reads/writes) for simplicity. There is a nice
flag parameter in the VM ioctl which we could use, except:
- it's not checked, so userspace likely passes garbage
- there's already a comment that userspace _does_ pass garbage in the
  priority field
So yeah unfortunately this flag parameter for setting vm flags is
useless, and we need to hack up a new one.
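
With the setparam uapi added below, the userspace opt-out then is a
single call per drm_file, roughly like this untested sketch (built
against the uapi header added in this patch):

  #include <sys/ioctl.h>
  #include <drm/amdgpu_drm.h>

  /* Untested sketch: a vulkan driver opts this drm_file out of kernel
   * implicit sync once at device init, and from then on manages the
   * dma_resv slots explicitly via the sync_file export/import ioctls.
   */
  static int amdgpu_disable_implicit_sync(int drm_fd)
  {
          struct drm_amdgpu_setparam sp = {
                  .param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
                  .value = 1,
          };

          return ioctl(drm_fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp);
  }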

v2: Explain why a new SETPARAM (Jason)

v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
need both, or this doesn't do much.

v4: Rebase over the amdgpu patch to always set the implicit sync
fences.

Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
 include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 65df34c17264..c5386d13eb4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	struct amdgpu_bo *gds;
 	struct amdgpu_bo *gws;
 	struct amdgpu_bo *oa;
+	bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
 	int r;
 
 	INIT_LIST_HEAD(&p->validated);
@@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 
 		e->bo_va = amdgpu_vm_bo_find(vm, bo);
 
-		if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
+		if (bo->tbo.base.dma_buf &&
+		    !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
 			e->chain = dma_fence_chain_alloc();
 			if (!e->chain) {
 				r = -ENOMEM;
@@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
 {
 	struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
 	struct amdgpu_bo_list_entry *e;
+	bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
 	int r;
 
 	list_for_each_entry(e, &p->validated, tv.head) {
@@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
 		struct dma_resv *resv = bo->tbo.base.resv;
 		enum amdgpu_sync_mode sync_mode;
 
-		sync_mode = amdgpu_bo_explicit_sync(bo) ?
+		sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
 			AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
 		r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
 				     &fpriv->vm);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index c080ba15ae77..f982626b5328 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
 	return 0;
 }
 
+int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *filp)
+{
+	struct drm_amdgpu_setparam *setparam = data;
+	struct amdgpu_fpriv *fpriv = filp->driver_priv;
+
+	switch (setparam->param) {
+	case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
+		if (setparam->value)
+			WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
+		else
+			WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
 	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
@@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
 	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
 };
 
 static const struct drm_driver amdgpu_kms_driver = {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index ddb85a85cbba..0e8c440c6303 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -321,6 +321,12 @@ struct amdgpu_vm {
 	bool			bulk_moveable;
 	/* Flag to indicate if VM is used for compute */
 	bool			is_compute_context;
+	/*
+	 * Flag to indicate whether implicit sync should always be skipped on
+	 * this context. We do not care about races at all, userspace is allowed
+	 * to shoot itself with implicit sync to its fullest liking.
+	 */
+	bool no_implicit_sync;
 };
 
 struct amdgpu_vm_manager {
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index 0cbd1540aeac..9eae245c14d6 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -54,6 +54,7 @@ extern "C" {
 #define DRM_AMDGPU_VM			0x13
 #define DRM_AMDGPU_FENCE_TO_HANDLE	0x14
 #define DRM_AMDGPU_SCHED		0x15
+#define DRM_AMDGPU_SETPARAM		0x16
 
 #define DRM_IOCTL_AMDGPU_GEM_CREATE	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
 #define DRM_IOCTL_AMDGPU_GEM_MMAP	DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
@@ -71,6 +72,7 @@ extern "C" {
 #define DRM_IOCTL_AMDGPU_VM		DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
 #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
 #define DRM_IOCTL_AMDGPU_SCHED		DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
+#define DRM_IOCTL_AMDGPU_SETPARAM	DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
 
 /**
  * DOC: memory domains
@@ -306,6 +308,14 @@ union drm_amdgpu_sched {
 	struct drm_amdgpu_sched_in in;
 };
 
+#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC	1
+
+struct drm_amdgpu_setparam {
+	/* AMDGPU_SETPARAM_* */
+	__u32	param;
+	__u32	value;
+};
+
 /*
  * This is not a reliable API and you should expect it to fail for any
  * number of reasons and have fallback path that do not use userptr to
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for implicit fencing/dma-resv rules for shared buffers
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (15 preceding siblings ...)
  (?)
@ 2021-06-22 17:08 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-22 17:08 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers
URL   : https://patchwork.freedesktop.org/series/91789/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
fa13cbf232a0 dma-resv: Fix kerneldoc
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 6edbd6abb783 ("dma-buf: rename and cleanup dma_resv_get_excl v3")'
#11: 
commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9

-:35: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 1 errors, 1 warnings, 0 checks, 8 lines checked
d3ab5dc6443a dma-buf: Switch to inline kerneldoc
-:94: WARNING:TYPO_SPELLING: 'superseeded' may be misspelled - perhaps 'superseded'?
#94: FILE: include/linux/dma-buf.h:335:
+	 * vmap/unmap. Note that in many cases this is superseeded by
 	                                               ^^^^^^^^^^^

-:175: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 2 warnings, 0 checks, 142 lines checked
0589bfb59341 dma-buf: Document dma-buf implicit fencing/resv fencing rules
-:15: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#15: 
https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

-:140: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 049aca4363d8 ("drm/amdgpu: fix using shared fence for exported BOs v2")'
#140: 
commit 049aca4363d8af87cab8d53de5401602db3b9999

-:155: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 9b495a588799 ("dma-buf: add poll support, v3")'
#155: 
	commit 9b495a5887994a6d74d5c261d012083a92b94738

-:183: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#183: 
  writes, and a per-bo flag to to skip implicit fencing in the CS

-:200: WARNING:TYPO_SPELLING: 'wont' may be misspelled - perhaps 'won't'?
#200: 
  wont notice the perf impact. I think we can ignore LTS distros who
  ^^^^

-:233: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 8c505bdc9c8b ("drm/amdgpu: rework dma_resv handling v3")'
#233: 
commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)

-:313: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 3 errors, 4 warnings, 0 checks, 45 lines checked
2662cbf1df91 drm/panfrost: Shrink sched_lock
-:41: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 17 lines checked
804e59e3e0f4 drm/panfrost: Use xarray and helpers for depedency tracking
-:254: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 197 lines checked
370f17f25d5e drm/panfrost: Fix implicit sync
-:49: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 11 lines checked
d7729f8bfaf7 drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
-:87: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 44 lines checked
c761bfef9a8f drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
-:231: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 104 lines checked
a8e5d51acf68 drm/armada: Remove prepare/cleanup_fb hooks
-:88: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 57 lines checked
829aa12253aa drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
-:84: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 45 lines checked
07b7f877328f drm/omap: Follow implicit fencing in prepare_fb
-:33: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 15 lines checked
c2d4255f903b drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
-:78: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 37 lines checked
2ac97cd30e0f drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
-:203: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 98 lines checked
15977ff8b76f drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
-:34: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 9 lines checked
76d518e4d7e7 RFC: drm/amdgpu: Implement a proper implicit fencing uapi
-:25: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 177ae09b5d69 ("drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2")'
#25: 
commit 177ae09b5d699a5ebd1cafcee78889db968abf54

-:62: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#62: 
  https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/

-:82: WARNING:TYPO_SPELLING: 'unecessary' may be misspelled - perhaps 'unnecessary'?
#82: 
fencing and remove all unecessary stall points due to them.
                       ^^^^^^^^^^

-:203: CHECK:SPACING: spaces preferred around that '|' (ctx:VxV)
#203: FILE: drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:1765:
+	DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
 	                                                                  ^

-:240: WARNING:LONG_LINE: line length of 115 exceeds 100 columns
#240: FILE: include/uapi/drm/amdgpu_drm.h:75:
+#define DRM_IOCTL_AMDGPU_SETPARAM	DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)

-:258: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 1 errors, 4 warnings, 1 checks, 104 lines checked


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for implicit fencing/dma-resv rules for shared buffers
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (16 preceding siblings ...)
  (?)
@ 2021-06-22 17:11 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-22 17:11 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers
URL   : https://patchwork.freedesktop.org/series/91789/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/display/intel_display.c:1893:21:    expected struct i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1893:21:    got void [noderef] __iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1893:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_ttm.c:733:38: warning: symbol 'i915_gem_ttm_obj_ops' was not declared. Should it be static?
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1207:24: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/selftests/test-drm_damage_helper.c:244:25: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/selftests/test-drm_damage_helper.c:268:23: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/ttm/ttm_bo.c:1157:9: warning: context imbalance in 'ttm_bo_swapout' - unexpected unlock
+drivers/gpu/drm/ttm/ttm_bo.c:309:28: warning: context imbalance in 'ttm_bo_cleanup_refs' - unexpected unlock
+drivers/gpu/drm/ttm/ttm_bo.c:367:27: warning: context imbalance in 'ttm_bo_delayed_delete' - different lock contexts for basic block
+drivers/gpu/drm/ttm/ttm_bo.c:633:5: warning: context imbalance in 'ttm_mem_evict_first' - wrong count at exit
+drivers/gpu/drm/ttm/ttm_bo_util.c:281:38:    expected void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:281:38:    got void [noderef] __iomem *
+drivers/gpu/drm/ttm/ttm_bo_util.c:281:38: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/ttm/ttm_bo_util.c:284:38:    expected void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:284:38:    got void [noderef] __iomem *
+drivers/gpu/drm/ttm/ttm_bo_util.c:284:38: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/ttm/ttm_bo_util.c:287:38:    expected void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:287:38:    got void [noderef] __iomem *
+drivers/gpu/drm/ttm/ttm_bo_util.c:287:38: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/ttm/ttm_bo_util.c:367:28:    expected void volatile [noderef] __iomem *addr
+drivers/gpu/drm/ttm/ttm_bo_util.c:367:28:    got void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:367:28: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/ttm/ttm_device.c:130:5: warning: context imbalance in 'ttm_device_swapout' - wrong count at exit
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative (-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative (-262080)
+./include/linux/seqlock.h:840:24: warning: trying to copy expression type 31
+./include/linux/seqlock.h:840:24: warning: trying to copy expression type 31
+./include/linux/seqlock.h:866:16: warning: trying to copy expression type 31
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [Intel-gfx] ✓ Fi.CI.BAT: success for implicit fencing/dma-resv rules for shared buffers
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (17 preceding siblings ...)
  (?)
@ 2021-06-22 17:38 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-22 17:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 3347 bytes --]

== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers
URL   : https://patchwork.freedesktop.org/series/91789/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10263 -> Patchwork_20431
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/index.html

Known issues
------------

  Here are the changes found in Patchwork_20431 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-gfx:
    - fi-kbl-soraka:      NOTRUN -> [SKIP][1] ([fdo#109271]) +15 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/fi-kbl-soraka/igt@amdgpu/amd_basic@cs-gfx.html

  * igt@i915_selftest@live@gt_heartbeat:
    - fi-kbl-7567u:       [PASS][2] -> [DMESG-FAIL][3] ([i915#2291] / [i915#541])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/fi-kbl-7567u/igt@i915_selftest@live@gt_heartbeat.html
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/fi-kbl-7567u/igt@i915_selftest@live@gt_heartbeat.html

  * igt@runner@aborted:
    - fi-skl-guc:         NOTRUN -> [FAIL][4] ([i915#2426] / [i915#3363])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/fi-skl-guc/igt@runner@aborted.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [i915#2291]: https://gitlab.freedesktop.org/drm/intel/issues/2291
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#541]: https://gitlab.freedesktop.org/drm/intel/issues/541


Participating hosts (43 -> 39)
------------------------------

  Missing    (4): fi-ilk-m540 fi-bsw-cyan fi-bdw-samus fi-hsw-4200u 


Build changes
-------------

  * Linux: CI_DRM_10263 -> Patchwork_20431

  CI-20190529: 20190529
  CI_DRM_10263: 5b5e458879485ea4eb87d4208b95a33ee5437fcc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6117: 3ba0a02404f243d6d8f232c6215163cc4b0fd699 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20431: 76d518e4d7e7790bd832f6d103b8e6a309750710 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

76d518e4d7e7 RFC: drm/amdgpu: Implement a proper implicit fencing uapi
15977ff8b76f drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
2ac97cd30e0f drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
c2d4255f903b drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
07b7f877328f drm/omap: Follow implicit fencing in prepare_fb
829aa12253aa drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
a8e5d51acf68 drm/armada: Remove prepare/cleanup_fb hooks
c761bfef9a8f drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
d7729f8bfaf7 drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
370f17f25d5e drm/panfrost: Fix implicit sync
804e59e3e0f4 drm/panfrost: Use xarray and helpers for depedency tracking
2662cbf1df91 drm/panfrost: Shrink sched_lock
0589bfb59341 dma-buf: Document dma-buf implicit fencing/resv fencing rules
d3ab5dc6443a dma-buf: Switch to inline kerneldoc
fa13cbf232a0 dma-resv: Fix kerneldoc

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/index.html

[-- Attachment #1.2: Type: text/html, Size: 4082 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
  (?)
@ 2021-06-22 18:19     ` Alex Deucher
  -1 siblings, 0 replies; 175+ messages in thread
From: Alex Deucher @ 2021-06-22 18:19 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Daniel Vetter,
	Christian König, linux-media

On Tue, Jun 22, 2021 at 12:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Oversight from
>
> commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> Author: Christian König <christian.koenig@amd.com>
> Date:   Mon May 10 16:14:09 2021 +0200
>
>     dma-buf: rename and cleanup dma_resv_get_excl v3
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

> ---
>  include/linux/dma-resv.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 562b885cf9c3..e1ca2080a1ff 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
>  }
>
>  /**
> - * dma_resv_exclusive - return the object's exclusive fence
> + * dma_resv_excl_fence - return the object's exclusive fence
>   * @obj: the reservation object
>   *
>   * Returns the exclusive fence (if any). Caller must either hold the objects
> --
> 2.32.0.rc2
>

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 02/15] dma-buf: Switch to inline kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
  (?)
@ 2021-06-22 18:24     ` Alex Deucher
  -1 siblings, 0 replies; 175+ messages in thread
From: Alex Deucher @ 2021-06-22 18:24 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Deepak R Varma, Intel Graphics Development,
	Kevin Wang, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Nirmoy Das, Chen Li, Dave Airlie, Alex Deucher, Daniel Vetter,
	Christian König, linux-media

On Tue, Jun 22, 2021 at 12:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Also review & update everything while we're at it.
>
> This is prep work to smash a ton of stuff into the kerneldoc for
> @resv.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
>  1 file changed, 83 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 92eec38a03aa..6d18b9e448b9 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -289,28 +289,6 @@ struct dma_buf_ops {
>
>  /**
>   * struct dma_buf - shared buffer object
> - * @size: size of the buffer; invariant over the lifetime of the buffer.
> - * @file: file pointer used for sharing buffers across, and for refcounting.
> - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> - *               protected by dma_resv lock.
> - * @ops: dma_buf_ops associated with this buffer object.
> - * @lock: used internally to serialize list manipulation, attach/detach and
> - *        vmap/unmap
> - * @vmapping_counter: used internally to refcnt the vmaps
> - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> - * @exp_name: name of the exporter; useful for debugging.
> - * @name: userspace-provided name; useful for accounting and debugging,
> - *        protected by @resv.
> - * @name_lock: spinlock to protect name access
> - * @owner: pointer to exporter module; used for refcounting when exporter is a
> - *         kernel module.
> - * @list_node: node for dma_buf accounting and debugging.
> - * @priv: exporter specific private data for this buffer object.
> - * @resv: reservation object linked to this dma-buf
> - * @poll: for userspace poll support
> - * @cb_excl: for userspace poll support
> - * @cb_shared: for userspace poll support
> - * @sysfs_entry: for exposing information about this buffer in sysfs.
>   * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
>   * and is incremented on each attach.
>   *
> @@ -324,24 +302,100 @@ struct dma_buf_ops {
>   * Device DMA access is handled by the separate &struct dma_buf_attachment.
>   */
>  struct dma_buf {
> +       /**
> +        * @size:
> +        *
> +        * Size of the buffer; invariant over the lifetime of the buffer.
> +        */
>         size_t size;
> +
> +       /**
> +        * @file:
> +        *
> +        * File pointer used for sharing buffers across, and for refcounting.
> +        * See dma_buf_get() and dma_buf_put().
> +        */
>         struct file *file;
> +
> +       /**
> +        * @attachments:
> +        *
> +        * List of dma_buf_attachment that denotes all devices attached,
> +        * protected by &dma_resv lock @resv.
> +        */
>         struct list_head attachments;
> +
> +       /** @ops: dma_buf_ops associated with this buffer object. */

For consistency you may want to format this like:
/**
  * @ops:
  *
  * dma_buf_ops associated with this buffer object.
  */

>         const struct dma_buf_ops *ops;
> +
> +       /**
> +        * @lock:
> +        *
> +        * Used internally to serialize list manipulation, attach/detach and
> +        * vmap/unmap. Note that in many cases this is superseeded by
> +        * dma_resv_lock() on @resv.
> +        */
>         struct mutex lock;
> +
> +       /**
> +        * @vmapping_counter:
> +        *
> +        * Used internally to refcnt the vmaps returned by dma_buf_vmap().
> +        * Protected by @lock.
> +        */
>         unsigned vmapping_counter;
> +
> +       /**
> +        * @vmap_ptr:
> +        * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
> +        */

Same comment as above.

>         struct dma_buf_map vmap_ptr;
> +
> +       /**
> +        * @exp_name:
> +        *
> +        * Name of the exporter; useful for debugging. See the
> +        * DMA_BUF_SET_NAME IOCTL.
> +        */
>         const char *exp_name;
> +
> +       /**
> +        * @name:
> +        *
> +        * Userspace-provided name; useful for accounting and debugging,
> +        * protected by dma_resv_lock() on @resv and @name_lock for read access.
> +        */
>         const char *name;
> +
> +       /** @name_lock: Spinlock to protect name acces for read access. */
>         spinlock_t name_lock;
> +
> +       /**
> +        * @owner:
> +        *
> +        * Pointer to exporter module; used for refcounting when exporter is a
> +        * kernel module.
> +        */
>         struct module *owner;
> +
> +       /** @list_node: node for dma_buf accounting and debugging. */

and here.

>         struct list_head list_node;
> +
> +       /** @priv: exporter specific private data for this buffer object. */

and here.

>         void *priv;
> +
> +       /**
> +        * @resv:
> +        *
> +        * Reservation object linked to this dma-buf.
> +        */
>         struct dma_resv *resv;
>
> -       /* poll support */
> +       /** @poll: for userspace poll support */

here.

>         wait_queue_head_t poll;
>
> +       /** @cb_excl: for userspace poll support */
> +       /** @cb_shared: for userspace poll support */

Here.

Either way,
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>

>         struct dma_buf_poll_cb_t {
>                 struct dma_fence_cb cb;
>                 wait_queue_head_t *poll;
> @@ -349,7 +403,12 @@ struct dma_buf {
>                 __poll_t active;
>         } cb_excl, cb_shared;
>  #ifdef CONFIG_DMABUF_SYSFS_STATS
> -       /* for sysfs stats */
> +       /**
> +        * @sysfs_entry:
> +        *
> +        * For exposing information about this buffer in sysfs. See also
> +        * `DMA-BUF statistics`_ for the uapi this enables.
> +        */
>         struct dma_buf_sysfs_entry {
>                 struct kobject kobj;
>                 struct dma_buf *dmabuf;
> --
> 2.32.0.rc2
>

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
@ 2021-06-22 18:49     ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-22 18:49 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, DRI Development, linaro-mm-sig,
	Daniel Vetter, Christian König, linux-media

Hi Daniel,

On Tue, Jun 22, 2021 at 06:54:57PM +0200, Daniel Vetter wrote:
> Oversight from
> 
> commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> Author: Christian König <christian.koenig@amd.com>
> Date:   Mon May 10 16:14:09 2021 +0200

this is what we use Fixes: ... for.

It looks wrong to hide it in the description.
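
For reference, as a tag that would be roughly (using the commit quoted
above):

	Fixes: 6edbd6abb783 ("dma-buf: rename and cleanup dma_resv_get_excl v3")

placed next to the Signed-off-by, instead of the commit/Author/Date block
in the commit message body.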

	Sam

> 
>     dma-buf: rename and cleanup dma_resv_get_excl v3
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  include/linux/dma-resv.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 562b885cf9c3..e1ca2080a1ff 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
>  }
>  
>  /**
> - * dma_resv_exclusive - return the object's exclusive fence
> + * dma_resv_excl_fence - return the object's exclusive fence
>   * @obj: the reservation object
>   *
>   * Returns the exclusive fence (if any). Caller must either hold the objects
> -- 
> 2.32.0.rc2

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 02/15] dma-buf: Switch to inline kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
@ 2021-06-22 19:01     ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-22 19:01 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Deepak R Varma, Intel Graphics Development, Kevin Wang,
	DRI Development, linaro-mm-sig, Nirmoy Das, Chen Li,
	Daniel Vetter, Alex Deucher, Dave Airlie, Christian König,
	linux-media

Hi Daniel.

On Tue, Jun 22, 2021 at 06:54:58PM +0200, Daniel Vetter wrote:
> Also review & update everything while we're at it.
> 
> This is prep work to smash a ton of stuff into the kerneldoc for
> @resv.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
>  1 file changed, 83 insertions(+), 24 deletions(-)
> 
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 92eec38a03aa..6d18b9e448b9 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -289,28 +289,6 @@ struct dma_buf_ops {
>  
>  /**
>   * struct dma_buf - shared buffer object
> - * @size: size of the buffer; invariant over the lifetime of the buffer.
> - * @file: file pointer used for sharing buffers across, and for refcounting.
> - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> - *               protected by dma_resv lock.
> - * @ops: dma_buf_ops associated with this buffer object.
> - * @lock: used internally to serialize list manipulation, attach/detach and
> - *        vmap/unmap
> - * @vmapping_counter: used internally to refcnt the vmaps
> - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> - * @exp_name: name of the exporter; useful for debugging.
> - * @name: userspace-provided name; useful for accounting and debugging,
> - *        protected by @resv.
> - * @name_lock: spinlock to protect name access
> - * @owner: pointer to exporter module; used for refcounting when exporter is a
> - *         kernel module.
> - * @list_node: node for dma_buf accounting and debugging.
> - * @priv: exporter specific private data for this buffer object.
> - * @resv: reservation object linked to this dma-buf
> - * @poll: for userspace poll support
> - * @cb_excl: for userspace poll support
> - * @cb_shared: for userspace poll support
> - * @sysfs_entry: for exposing information about this buffer in sysfs.

This sentence
>   * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
>   * and is incremented on each attach.
belongs to the paragraph describing sysfs_entry and should be moved too.
Or maybe reworded and then document all fields in dma_buf_sysfs_entry?
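
Something like this (just a sketch reusing the existing wording), so the
note sits next to the member it talks about:

	/**
	 * @sysfs_entry:
	 *
	 * For exposing information about this buffer in sysfs. The
	 * attachment_uid member is protected by the dma_resv lock and is
	 * incremented on each attach.
	 */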

With this fixed:
Acked-by: Sam Ravnborg <sam@ravnborg.org>

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 07/15] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 19:10     ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-22 19:10 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Thomas Zimmermann, DRI Development

Hi Daniel,

On Tue, Jun 22, 2021 at 06:55:03PM +0200, Daniel Vetter wrote:
> There's a bunch of atomic drivers who don't do this quite correctly,
> luckily most of them aren't in wide use or people would have noticed
> the tearing.
> 
> By making this the default we avoid the constant audit pain and can
> additionally remove a ton of lines from vfuncs for a bit more clarity
> in smaller drivers.
> 
> While at it complain if there's a cleanup_fb hook but no prepare_fb
> hook, because that makes no sense. I haven't found any driver which
> violates this, but better safe than sorry.
> 
> Subsequent patches will reap the benefits.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/drm_atomic_helper.c      | 10 ++++++++++
>  drivers/gpu/drm/drm_gem_atomic_helper.c  |  3 +++
>  include/drm/drm_modeset_helper_vtables.h |  7 +++++--
>  3 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> index 531f2374b072..9f6c5f21c4d6 100644
> --- a/drivers/gpu/drm/drm_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_atomic_helper.c
> @@ -35,6 +35,7 @@
>  #include <drm/drm_damage_helper.h>
>  #include <drm/drm_device.h>
>  #include <drm/drm_drv.h>
> +#include <drm/drm_gem_atomic_helper.h>
>  #include <drm/drm_plane_helper.h>
>  #include <drm/drm_print.h>
>  #include <drm/drm_self_refresh_helper.h>
> @@ -2408,6 +2409,15 @@ int drm_atomic_helper_prepare_planes(struct drm_device *dev,
>  			ret = funcs->prepare_fb(plane, new_plane_state);
>  			if (ret)
>  				goto fail;
> +		} else {
> +			WARN_ON_ONCE(funcs->cleanup_fb);
> +
> +			if (!drm_core_check_feature(dev, DRIVER_GEM))
> +				continue;
> +
> +			ret = drm_gem_plane_helper_prepare_fb(plane, new_plane_state);
> +			if (ret)
> +				goto fail;
>  		}
>  	}
>  
> diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
> index a27135084ae5..bc9396f2a0ed 100644
> --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> @@ -135,6 +135,9 @@
>   * GEM based framebuffer drivers which have their buffers always pinned in
>   * memory.
>   *
> + * This function is the default implementation for GEM drivers of
> + * &drm_plane_helper_funcs.prepare_fb if no callback is provided.
> + *
>   * See drm_atomic_set_fence_for_plane() for a discussion of implicit and
>   * explicit fencing in atomic modeset updates.
>   */
> diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
> index f3a4b47b3986..4e727261dca5 100644
> --- a/include/drm/drm_modeset_helper_vtables.h
> +++ b/include/drm/drm_modeset_helper_vtables.h
> @@ -1178,8 +1178,11 @@ struct drm_plane_helper_funcs {
>  	 * equivalent functionality should be implemented through private
>  	 * members in the plane structure.
>  	 *
> -	 * Drivers which always have their buffers pinned should use
> -	 * drm_gem_plane_helper_prepare_fb() for this hook.
> +	 * For GEM drivers who neither have a @prepare_fb not @cleanup_fb hook
s/not/nor/ ??
> +	 * set drm_gem_plane_helper_prepare_fb() is called automatically to
              ^add comma?
> +	 * implement this.


Leave cleanup_fb out of the description to make it more readable.
In the description of cleanup_fb you can document that it is wrong to
have it without a matching prepare_fb if you feel like it.

	Sam
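
For reference, what the new default means on the driver side -- hypothetical
GEM-based driver, all foo_* names purely illustrative:

	static const struct drm_plane_helper_funcs foo_plane_helper_funcs = {
		/* .prepare_fb and .cleanup_fb left unset: the atomic helpers
		 * now fall back to drm_gem_plane_helper_prepare_fb(), which
		 * handles the implicit fencing.
		 */
		.atomic_check	= foo_plane_atomic_check,
		.atomic_update	= foo_plane_atomic_update,
	};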


         * Other drivers which need additional plane processing
> +	 * can call drm_gem_plane_helper_prepare_fb() from their @prepare_fb
> +	 * hook.
>  	 *
>  	 * The helpers will call @cleanup_fb with matching arguments for every
>  	 * successful call to this hook.
> -- 
> 2.32.0.rc2

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [Intel-gfx] ✗ Fi.CI.IGT: failure for implicit fencing/dma-resv rules for shared buffers
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (18 preceding siblings ...)
  (?)
@ 2021-06-22 19:12 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-22 19:12 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


[-- Attachment #1.1: Type: text/plain, Size: 26190 bytes --]

== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers
URL   : https://patchwork.freedesktop.org/series/91789/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_10263_full -> Patchwork_20431_full
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_20431_full absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_20431_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_20431_full:

### IGT changes ###

#### Possible regressions ####

  * igt@kms_cursor_legacy@all-pipes-torture-bo:
    - shard-tglb:         [PASS][1] -> [INCOMPLETE][2]
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-tglb3/igt@kms_cursor_legacy@all-pipes-torture-bo.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-tglb6/igt@kms_cursor_legacy@all-pipes-torture-bo.html

  
Known issues
------------

  Here are the changes found in Patchwork_20431_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ctx_isolation@preservation-s3@vcs0:
    - shard-kbl:          NOTRUN -> [DMESG-WARN][3] ([i915#180]) +2 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl7/igt@gem_ctx_isolation@preservation-s3@vcs0.html

  * igt@gem_ctx_persistence@process:
    - shard-snb:          NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#1099]) +6 similar issues
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-snb6/igt@gem_ctx_persistence@process.html

  * igt@gem_exec_fair@basic-none-vip@rcs0:
    - shard-kbl:          [PASS][5] -> [FAIL][6] ([i915#2842]) +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl4/igt@gem_exec_fair@basic-none-vip@rcs0.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl2/igt@gem_exec_fair@basic-none-vip@rcs0.html

  * igt@gem_exec_fair@basic-none@vcs1:
    - shard-iclb:         NOTRUN -> [FAIL][7] ([i915#2842])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb4/igt@gem_exec_fair@basic-none@vcs1.html

  * igt@gem_exec_fair@basic-pace@rcs0:
    - shard-glk:          [PASS][8] -> [FAIL][9] ([i915#2842])
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-glk7/igt@gem_exec_fair@basic-pace@rcs0.html
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk1/igt@gem_exec_fair@basic-pace@rcs0.html

  * igt@gem_mmap_gtt@big-copy-xy:
    - shard-skl:          [PASS][10] -> [FAIL][11] ([i915#307])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl2/igt@gem_mmap_gtt@big-copy-xy.html
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl8/igt@gem_mmap_gtt@big-copy-xy.html

  * igt@gem_mmap_gtt@cpuset-big-copy:
    - shard-iclb:         [PASS][12] -> [FAIL][13] ([i915#307])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb1/igt@gem_mmap_gtt@cpuset-big-copy.html
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb1/igt@gem_mmap_gtt@cpuset-big-copy.html

  * igt@gem_pread@exhaustion:
    - shard-apl:          NOTRUN -> [WARN][14] ([i915#2658])
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl3/igt@gem_pread@exhaustion.html
    - shard-snb:          NOTRUN -> [WARN][15] ([i915#2658])
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-snb7/igt@gem_pread@exhaustion.html

  * igt@gen7_exec_parse@basic-offset:
    - shard-apl:          NOTRUN -> [SKIP][16] ([fdo#109271]) +226 similar issues
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl1/igt@gen7_exec_parse@basic-offset.html

  * igt@i915_pm_rc6_residency@rc6-fence:
    - shard-iclb:         NOTRUN -> [WARN][17] ([i915#1804] / [i915#2684])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@i915_pm_rc6_residency@rc6-fence.html

  * igt@i915_suspend@debugfs-reader:
    - shard-kbl:          [PASS][18] -> [DMESG-WARN][19] ([i915#180]) +2 similar issues
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl7/igt@i915_suspend@debugfs-reader.html
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl2/igt@i915_suspend@debugfs-reader.html

  * igt@kms_async_flips@alternate-sync-async-flip:
    - shard-skl:          [PASS][20] -> [FAIL][21] ([i915#2521])
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl10/igt@kms_async_flips@alternate-sync-async-flip.html
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl10/igt@kms_async_flips@alternate-sync-async-flip.html

  * igt@kms_big_fb@linear-32bpp-rotate-180:
    - shard-glk:          [PASS][22] -> [DMESG-WARN][23] ([i915#118] / [i915#95])
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-glk7/igt@kms_big_fb@linear-32bpp-rotate-180.html
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk1/igt@kms_big_fb@linear-32bpp-rotate-180.html

  * igt@kms_chamelium@dp-crc-fast:
    - shard-iclb:         NOTRUN -> [SKIP][24] ([fdo#109284] / [fdo#111827]) +1 similar issue
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_chamelium@dp-crc-fast.html
    - shard-glk:          NOTRUN -> [SKIP][25] ([fdo#109271] / [fdo#111827]) +2 similar issues
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk3/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_chamelium@hdmi-edid-change-during-suspend:
    - shard-apl:          NOTRUN -> [SKIP][26] ([fdo#109271] / [fdo#111827]) +22 similar issues
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl8/igt@kms_chamelium@hdmi-edid-change-during-suspend.html

  * igt@kms_color@pipe-d-ctm-max:
    - shard-skl:          NOTRUN -> [SKIP][27] ([fdo#109271]) +64 similar issues
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_color@pipe-d-ctm-max.html

  * igt@kms_color_chamelium@pipe-a-ctm-blue-to-red:
    - shard-snb:          NOTRUN -> [SKIP][28] ([fdo#109271] / [fdo#111827]) +16 similar issues
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-snb7/igt@kms_color_chamelium@pipe-a-ctm-blue-to-red.html

  * igt@kms_color_chamelium@pipe-c-degamma:
    - shard-skl:          NOTRUN -> [SKIP][29] ([fdo#109271] / [fdo#111827]) +6 similar issues
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_color_chamelium@pipe-c-degamma.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-iclb:         NOTRUN -> [SKIP][30] ([fdo#109300] / [fdo#111066])
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@srm:
    - shard-apl:          NOTRUN -> [TIMEOUT][31] ([i915#1319])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl7/igt@kms_content_protection@srm.html

  * igt@kms_cursor_crc@pipe-b-cursor-512x170-sliding:
    - shard-iclb:         NOTRUN -> [SKIP][32] ([fdo#109278] / [fdo#109279])
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_cursor_crc@pipe-b-cursor-512x170-sliding.html

  * igt@kms_cursor_crc@pipe-c-cursor-256x256-random:
    - shard-skl:          [PASS][33] -> [FAIL][34] ([i915#3444])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl3/igt@kms_cursor_crc@pipe-c-cursor-256x256-random.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl4/igt@kms_cursor_crc@pipe-c-cursor-256x256-random.html

  * igt@kms_cursor_edge_walk@pipe-d-128x128-right-edge:
    - shard-snb:          NOTRUN -> [SKIP][35] ([fdo#109271]) +315 similar issues
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-snb5/igt@kms_cursor_edge_walk@pipe-d-128x128-right-edge.html

  * igt@kms_cursor_legacy@cursorb-vs-flipb-atomic-transitions-varying-size:
    - shard-iclb:         NOTRUN -> [SKIP][36] ([fdo#109274] / [fdo#109278]) +1 similar issue
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_cursor_legacy@cursorb-vs-flipb-atomic-transitions-varying-size.html

  * igt@kms_flip@flip-vs-expired-vblank@a-edp1:
    - shard-skl:          NOTRUN -> [FAIL][37] ([i915#79])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_flip@flip-vs-expired-vblank@a-edp1.html

  * igt@kms_flip@plain-flip-fb-recreate@b-edp1:
    - shard-skl:          [PASS][38] -> [FAIL][39] ([i915#2122]) +2 similar issues
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl7/igt@kms_flip@plain-flip-fb-recreate@b-edp1.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl2/igt@kms_flip@plain-flip-fb-recreate@b-edp1.html

  * igt@kms_flip@plain-flip-ts-check@a-edp1:
    - shard-skl:          NOTRUN -> [FAIL][40] ([i915#2122])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_flip@plain-flip-ts-check@a-edp1.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-shrfb-fliptrack-mmap-gtt:
    - shard-iclb:         NOTRUN -> [SKIP][41] ([fdo#109280]) +5 similar issues
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_frontbuffer_tracking@fbcpsr-2p-shrfb-fliptrack-mmap-gtt.html
    - shard-glk:          NOTRUN -> [SKIP][42] ([fdo#109271]) +21 similar issues
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk3/igt@kms_frontbuffer_tracking@fbcpsr-2p-shrfb-fliptrack-mmap-gtt.html

  * igt@kms_invalid_dotclock:
    - shard-iclb:         NOTRUN -> [SKIP][43] ([fdo#109310])
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_invalid_dotclock.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
    - shard-apl:          NOTRUN -> [SKIP][44] ([fdo#109271] / [i915#533])
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl8/igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-7efc:
    - shard-skl:          NOTRUN -> [FAIL][45] ([fdo#108145] / [i915#265]) +1 similar issue
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_plane_alpha_blend@pipe-a-alpha-7efc.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb:
    - shard-apl:          NOTRUN -> [FAIL][46] ([fdo#108145] / [i915#265]) +4 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl3/igt@kms_plane_alpha_blend@pipe-a-alpha-opaque-fb.html

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          [PASS][47] -> [FAIL][48] ([fdo#108145] / [i915#265]) +1 similar issue
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl3/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl1/igt@kms_plane_alpha_blend@pipe-b-coverage-7efc.html

  * igt@kms_plane_alpha_blend@pipe-d-coverage-vs-premult-vs-constant:
    - shard-iclb:         NOTRUN -> [SKIP][49] ([fdo#109278]) +3 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb7/igt@kms_plane_alpha_blend@pipe-d-coverage-vs-premult-vs-constant.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1:
    - shard-apl:          NOTRUN -> [SKIP][50] ([fdo#109271] / [i915#658]) +5 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl3/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-1.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-5:
    - shard-skl:          NOTRUN -> [SKIP][51] ([fdo#109271] / [i915#658]) +1 similar issue
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-5.html

  * igt@kms_psr@psr2_sprite_plane_move:
    - shard-iclb:         [PASS][52] -> [SKIP][53] ([fdo#109441]) +1 similar issue
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb2/igt@kms_psr@psr2_sprite_plane_move.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb5/igt@kms_psr@psr2_sprite_plane_move.html

  * igt@kms_writeback@writeback-invalid-parameters:
    - shard-apl:          NOTRUN -> [SKIP][54] ([fdo#109271] / [i915#2437])
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl7/igt@kms_writeback@writeback-invalid-parameters.html

  * igt@perf@polling:
    - shard-skl:          [PASS][55] -> [FAIL][56] ([i915#1542]) +1 similar issue
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl9/igt@perf@polling.html
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl9/igt@perf@polling.html

  * igt@sysfs_clients@create:
    - shard-apl:          NOTRUN -> [SKIP][57] ([fdo#109271] / [i915#2994]) +2 similar issues
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl2/igt@sysfs_clients@create.html

  
#### Possible fixes ####

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
    - shard-kbl:          [INCOMPLETE][58] ([i915#794]) -> [PASS][59]
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl3/igt@gem_ctx_isolation@preservation-s3@rcs0.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl7/igt@gem_ctx_isolation@preservation-s3@rcs0.html

  * igt@gem_exec_fair@basic-none-share@rcs0:
    - shard-iclb:         [FAIL][60] ([i915#2842]) -> [PASS][61]
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb6/igt@gem_exec_fair@basic-none-share@rcs0.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb3/igt@gem_exec_fair@basic-none-share@rcs0.html
    - shard-tglb:         [FAIL][62] ([i915#2842]) -> [PASS][63] +1 similar issue
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-tglb8/igt@gem_exec_fair@basic-none-share@rcs0.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-tglb7/igt@gem_exec_fair@basic-none-share@rcs0.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-kbl:          [FAIL][64] ([i915#2842]) -> [PASS][65] +3 similar issues
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl1/igt@gem_exec_fair@basic-none-solo@rcs0.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl1/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_exec_parallel@engines@basic:
    - shard-glk:          [DMESG-WARN][66] ([i915#118] / [i915#95]) -> [PASS][67] +1 similar issue
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-glk3/igt@gem_exec_parallel@engines@basic.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk7/igt@gem_exec_parallel@engines@basic.html

  * igt@gem_workarounds@suspend-resume-context:
    - shard-apl:          [DMESG-WARN][68] ([i915#180]) -> [PASS][69] +3 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-apl6/igt@gem_workarounds@suspend-resume-context.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-apl2/igt@gem_workarounds@suspend-resume-context.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-glk:          [DMESG-WARN][70] ([i915#1436] / [i915#716]) -> [PASS][71]
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-glk3/igt@gen9_exec_parse@allowed-all.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk3/igt@gen9_exec_parse@allowed-all.html

  * igt@gen9_exec_parse@allowed-single:
    - shard-skl:          [DMESG-WARN][72] ([i915#1436] / [i915#716]) -> [PASS][73]
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl1/igt@gen9_exec_parse@allowed-single.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl3/igt@gen9_exec_parse@allowed-single.html

  * igt@kms_color@pipe-a-ctm-0-5:
    - shard-skl:          [DMESG-WARN][74] ([i915#1982]) -> [PASS][75] +1 similar issue
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl10/igt@kms_color@pipe-a-ctm-0-5.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl7/igt@kms_color@pipe-a-ctm-0-5.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [FAIL][76] ([i915#2346]) -> [PASS][77]
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl4/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl6/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-skl:          [FAIL][78] ([i915#2346] / [i915#533]) -> [PASS][79]
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl9/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl1/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_fbcon_fbt@psr-suspend:
    - shard-skl:          [INCOMPLETE][80] ([i915#198]) -> [PASS][81]
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl9/igt@kms_fbcon_fbt@psr-suspend.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl5/igt@kms_fbcon_fbt@psr-suspend.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1:
    - shard-skl:          [FAIL][82] ([i915#79]) -> [PASS][83]
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl5/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl4/igt@kms_flip@flip-vs-expired-vblank-interruptible@a-edp1.html

  * igt@kms_flip@flip-vs-expired-vblank@a-hdmi-a2:
    - shard-glk:          [FAIL][84] ([i915#79]) -> [PASS][85]
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-glk7/igt@kms_flip@flip-vs-expired-vblank@a-hdmi-a2.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-glk1/igt@kms_flip@flip-vs-expired-vblank@a-hdmi-a2.html

  * igt@kms_psr@psr2_cursor_mmap_gtt:
    - shard-iclb:         [SKIP][86] ([fdo#109441]) -> [PASS][87]
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb1/igt@kms_psr@psr2_cursor_mmap_gtt.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb2/igt@kms_psr@psr2_cursor_mmap_gtt.html

  
#### Warnings ####

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4:
    - shard-iclb:         [SKIP][88] ([i915#2920]) -> [SKIP][89] ([i915#658])
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb2/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb5/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-4.html

  * igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-5:
    - shard-iclb:         [SKIP][90] ([i915#658]) -> [SKIP][91] ([i915#2920]) +1 similar issue
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb4/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-5.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb2/igt@kms_psr2_sf@primary-plane-update-sf-dmg-area-5.html

  * igt@runner@aborted:
    - shard-kbl:          ([FAIL][92], [FAIL][93], [FAIL][94], [FAIL][95]) ([i915#1814] / [i915#3002] / [i915#3363]) -> ([FAIL][96], [FAIL][97], [FAIL][98], [FAIL][99], [FAIL][100], [FAIL][101], [FAIL][102]) ([i915#1436] / [i915#180] / [i915#1814] / [i915#3002] / [i915#3363] / [i915#602])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl3/igt@runner@aborted.html
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl7/igt@runner@aborted.html
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl7/igt@runner@aborted.html
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-kbl7/igt@runner@aborted.html
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl1/igt@runner@aborted.html
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl7/igt@runner@aborted.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl4/igt@runner@aborted.html
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl7/igt@runner@aborted.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl7/igt@runner@aborted.html
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl2/igt@runner@aborted.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-kbl2/igt@runner@aborted.html
    - shard-iclb:         ([FAIL][103], [FAIL][104], [FAIL][105]) ([i915#1814] / [i915#3002]) -> ([FAIL][106], [FAIL][107]) ([i915#3002])
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb3/igt@runner@aborted.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb2/igt@runner@aborted.html
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-iclb1/igt@runner@aborted.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb3/igt@runner@aborted.html
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-iclb5/igt@runner@aborted.html
    - shard-skl:          ([FAIL][108], [FAIL][109], [FAIL][110]) ([i915#1436] / [i915#3002] / [i915#3363]) -> ([FAIL][111], [FAIL][112]) ([i915#3002] / [i915#3363])
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl1/igt@runner@aborted.html
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl7/igt@runner@aborted.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10263/shard-skl6/igt@runner@aborted.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl10/igt@runner@aborted.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/shard-skl2/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#109274]: https://bugs.freedesktop.org/show_bug.cgi?id=109274
  [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278
  [fdo#109279]: https://bugs.freedesktop.org/show_bug.cgi?id=109279
  [fdo#109280]: https://bugs.freedesktop.org/show_bug.cgi?id=109280
  [fdo#109284]: https://bugs.freedesktop.org/show_bug.cgi?id=109284
  [fdo#109300]: https://bugs.freedesktop.org/show_bug.cgi?id=109300
  [fdo#109310]: https://bugs.freedesktop.org/show_bug.cgi?id=109310
  [fdo#109441]: https://bugs.freedesktop.org/show_bug.cgi?id=109441
  [fdo#111066]: https://bugs.freedesktop.org/show_bug.cgi?id=111066
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1099]: https://gitlab.freedesktop.org/drm/intel/issues/1099
  [i915#118]: https://gitlab.freedesktop.org/drm/intel/issues/118
  [i915#1319]: https://gitlab.freedesktop.org/drm/intel/issues/1319
  [i915#1436]: https://gitlab.freedesktop.org/drm/intel/issues/1436
  [i915#1542]: https://gitlab.freedesktop.org/drm/intel/issues/1542
  [i915#180]: https://gitlab.freedesktop.org/drm/intel/issues/180
  [i915#1804]: https://gitlab.freedesktop.org/drm/intel/issues/1804
  [i915#1814]: https://gitlab.freedesktop.org/drm/intel/issues/1814
  [i915#198]: https://gitlab.freedesktop.org/drm/intel/issues/198
  [i915#1982]: https://gitlab.freedesktop.org/drm/intel/issues/1982
  [i915#2122]: https://gitlab.freedesktop.org/drm/intel/issues/2122
  [i915#2346]: https://gitlab.freedesktop.org/drm/intel/issues/2346
  [i915#2437]: https://gitlab.freedesktop.org/drm/intel/issues/2437
  [i915#2521]: https://gitlab.freedesktop.org/drm/intel/issues/2521
  [i915#265]: https://gitlab.freedesktop.org/drm/intel/issues/265
  [i915#2658]: https://gitlab.freedesktop.org/drm/intel/issues/2658
  [i915#2684]: https://gitlab.freedesktop.org/drm/intel/issues/2684
  [i915#2842]: https://gitlab.freedesktop.org/drm/intel/issues/2842
  [i915#2920]: https://gitlab.freedesktop.org/drm/intel/issues/2920
  [i915#2994]: https://gitlab.freedesktop.org/drm/intel/issues/2994
  [i915#3002]: https://gitlab.freedesktop.org/drm/intel/issues/3002
  [i915#307]: https://gitlab.freedesktop.org/drm/intel/issues/307
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#3444]: https://gitlab.freedesktop.org/drm/intel/issues/3444
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#602]: https://gitlab.freedesktop.org/drm/intel/issues/602
  [i915#658]: https://gitlab.freedesktop.org/drm/intel/issues/658
  [i915#716]: https://gitlab.freedesktop.org/drm/intel/issues/716
  [i915#79]: https://gitlab.freedesktop.org/drm/intel/issues/79
  [i915#794]: https://gitlab.freedesktop.org/drm/intel/issues/794
  [i915#95]: https://gitlab.freedesktop.org/drm/intel/issues/95


Participating hosts (10 -> 10)
------------------------------

  No changes in participating hosts


Build changes
-------------

  * Linux: CI_DRM_10263 -> Patchwork_20431

  CI-20190529: 20190529
  CI_DRM_10263: 5b5e458879485ea4eb87d4208b95a33ee5437fcc @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6117: 3ba0a02404f243d6d8f232c6215163cc4b0fd699 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20431: 76d518e4d7e7790bd832f6d103b8e6a309750710 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20431/index.html

[-- Attachment #1.2: Type: text/html, Size: 32643 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 12/15] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-22 19:15     ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-22 19:15 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Intel Graphics Development, DRI Development,
	Noralf Trønnes, Thomas Zimmermann, Daniel Vetter

Hi Daniel,

On Tue, Jun 22, 2021 at 06:55:08PM +0200, Daniel Vetter wrote:
> It's tedious to review this all the time, and my audit showed that
> arcpgu actually forgot to set this.
> 
> Make this the default and stop worrying.
> 
> Again I sprinkled WARN_ON_ONCE on top to make sure we don't have
> strange combinations of hooks: cleanup_fb without prepare_fb doesn't
> make sense, and since simpler drivers are all new they better be GEM
> based drivers.
> 
> v2: Warn and bail when it's _not_ a GEM driver (Noralf)
> 
> Cc: Noralf Trønnes <noralf@tronnes.org>
> Acked-by: Noralf Trønnes <noralf@tronnes.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>  drivers/gpu/drm/drm_simple_kms_helper.c | 12 ++++++++++--
>  include/drm/drm_simple_kms_helper.h     |  7 +++++--
>  2 files changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
> index 0b095a313c44..735f4f34bcc4 100644
> --- a/drivers/gpu/drm/drm_simple_kms_helper.c
> +++ b/drivers/gpu/drm/drm_simple_kms_helper.c
> @@ -9,6 +9,8 @@
>  #include <drm/drm_atomic.h>
>  #include <drm/drm_atomic_helper.h>
>  #include <drm/drm_bridge.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_gem_atomic_helper.h>
>  #include <drm/drm_managed.h>
>  #include <drm/drm_plane_helper.h>
>  #include <drm/drm_probe_helper.h>
> @@ -225,8 +227,14 @@ static int drm_simple_kms_plane_prepare_fb(struct drm_plane *plane,
>  	struct drm_simple_display_pipe *pipe;
>  
>  	pipe = container_of(plane, struct drm_simple_display_pipe, plane);
> -	if (!pipe->funcs || !pipe->funcs->prepare_fb)
> -		return 0;
> +	if (!pipe->funcs || !pipe->funcs->prepare_fb) {
> +		if (WARN_ON_ONCE(!drm_core_check_feature(plane->dev, DRIVER_GEM)))
> +			return 0;
> +
> +		WARN_ON_ONCE(pipe->funcs && pipe->funcs->cleanup_fb);
> +
> +		return drm_gem_simple_display_pipe_prepare_fb(pipe, state);
> +	}
>  
>  	return pipe->funcs->prepare_fb(pipe, state);
>  }
> diff --git a/include/drm/drm_simple_kms_helper.h b/include/drm/drm_simple_kms_helper.h
> index ef9944e9c5fc..363a9a8c3587 100644
> --- a/include/drm/drm_simple_kms_helper.h
> +++ b/include/drm/drm_simple_kms_helper.h
> @@ -116,8 +116,11 @@ struct drm_simple_display_pipe_funcs {
>  	 * the documentation for the &drm_plane_helper_funcs.prepare_fb hook for
>  	 * more details.
>  	 *
> -	 * Drivers which always have their buffers pinned should use
> -	 * drm_gem_simple_display_pipe_prepare_fb() for this hook.
> +	 * For GEM drivers who neither have a @prepare_fb not @cleanup_fb hook
> +	 * set drm_gem_simple_display_pipe_prepare_fb() is called automatically
> +	 * to implement this.
Same comments as before.

	Sam
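
For reference, a driver that does need extra work can keep its own hook and
chain up to the GEM helper for the fencing part -- hypothetical driver, the
foo_* names are purely illustrative:

	static int foo_pipe_prepare_fb(struct drm_simple_display_pipe *pipe,
				       struct drm_plane_state *state)
	{
		/* driver-specific pinning/bookkeeping would go here */

		return drm_gem_simple_display_pipe_prepare_fb(pipe, state);
	}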

         * Other drivers which need additional plane
> +	 * processing can call drm_gem_simple_display_pipe_prepare_fb() from
> +	 * their @prepare_fb hook.
>  	 */
>  	int (*prepare_fb)(struct drm_simple_display_pipe *pipe,
>  			  struct drm_plane_state *plane_state);
> -- 
> 2.32.0.rc2

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
  2021-06-22 18:49     ` [Intel-gfx] " Sam Ravnborg
  (?)
@ 2021-06-22 19:19       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 19:19 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: DRI Development, Intel Graphics Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Daniel Vetter,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Tue, Jun 22, 2021 at 8:50 PM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Daniel,
>
> On Tue, Jun 22, 2021 at 06:54:57PM +0200, Daniel Vetter wrote:
> > Oversight from
> >
> > commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> > Author: Christian König <christian.koenig@amd.com>
> > Date:   Mon May 10 16:14:09 2021 +0200
>
> this is what we use Fixes: ... for.
>
> It looks wrong to hide it in the description.

I've honestly become a bit wary of using Fixes: for docs/comments
because the stable autoselect bots are _really_ keen on picking up
anything with a Fixes: line in it. And that feels a bit like nonsense.
-Daniel
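
(For reference, the trailer form under discussion would be, using the commit
cited in the patch description:

	Fixes: 6edbd6abb783 ("dma-buf: rename and cleanup dma_resv_get_excl v3")

as opposed to only naming the commit in the body of the change description,
which is what this patch does for the reason given above.)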

>
>         Sam
>
> >
> >     dma-buf: rename and cleanup dma_resv_get_excl v3
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  include/linux/dma-resv.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index 562b885cf9c3..e1ca2080a1ff 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
> >  }
> >
> >  /**
> > - * dma_resv_exclusive - return the object's exclusive fence
> > + * dma_resv_excl_fence - return the object's exclusive fence
> >   * @obj: the reservation object
> >   *
> >   * Returns the exclusive fence (if any). Caller must either hold the objects
> > --
> > 2.32.0.rc2



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 02/15] dma-buf: Switch to inline kerneldoc
  2021-06-22 19:01     ` [Intel-gfx] " Sam Ravnborg
  (?)
@ 2021-06-22 19:21       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 19:21 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: DRI Development, Deepak R Varma, Intel Graphics Development,
	Kevin Wang, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Nirmoy Das, Chen Li, Dave Airlie, Alex Deucher, Daniel Vetter,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Tue, Jun 22, 2021 at 9:01 PM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Daniel.
>
> On Tue, Jun 22, 2021 at 06:54:58PM +0200, Daniel Vetter wrote:
> > Also review & update everything while we're at it.
> >
> > This is prep work to smash a ton of stuff into the kerneldoc for
> > @resv.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 83 insertions(+), 24 deletions(-)
> >
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index 92eec38a03aa..6d18b9e448b9 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -289,28 +289,6 @@ struct dma_buf_ops {
> >
> >  /**
> >   * struct dma_buf - shared buffer object
> > - * @size: size of the buffer; invariant over the lifetime of the buffer.
> > - * @file: file pointer used for sharing buffers across, and for refcounting.
> > - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> > - *               protected by dma_resv lock.
> > - * @ops: dma_buf_ops associated with this buffer object.
> > - * @lock: used internally to serialize list manipulation, attach/detach and
> > - *        vmap/unmap
> > - * @vmapping_counter: used internally to refcnt the vmaps
> > - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> > - * @exp_name: name of the exporter; useful for debugging.
> > - * @name: userspace-provided name; useful for accounting and debugging,
> > - *        protected by @resv.
> > - * @name_lock: spinlock to protect name access
> > - * @owner: pointer to exporter module; used for refcounting when exporter is a
> > - *         kernel module.
> > - * @list_node: node for dma_buf accounting and debugging.
> > - * @priv: exporter specific private data for this buffer object.
> > - * @resv: reservation object linked to this dma-buf
> > - * @poll: for userspace poll support
> > - * @cb_excl: for userspace poll support
> > - * @cb_shared: for userspace poll support
> > - * @sysfs_entry: for exposing information about this buffer in sysfs.
>
> This sentence
> >   * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
> >   * and is incremented on each attach.
> belongs to the paragraph describing sysfs_entry and should be moved too.
> Or maybe reworded and then document all fields in dma_buf_sysfs_entry?

Unfortunately kerneldoc lost the ability to document embedded
structs/unions. At least last time I checked, it's a bit of a bikeshed.
So I'd need to pull the entire struct out. I'll just move it since
it's indeed misplaced.
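
For reference, pulling the struct out would look roughly like this -- a sketch
only, the field list here is illustrative rather than exact:

	/**
	 * struct dma_buf_sysfs_entry - sysfs bookkeeping for a &dma_buf
	 * @kobj: kobject exposing this buffer's statistics in sysfs
	 * @dmabuf: back pointer to the &dma_buf this entry describes
	 */
	struct dma_buf_sysfs_entry {
		struct kobject kobj;
		struct dma_buf *dmabuf;
	};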

> With this fixed:
> Acked-by: Sam Ravnborg <sam@ravnborg.org>

Thanks for taking a look.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 02/15] dma-buf: Switch to inline kerneldoc
@ 2021-06-22 19:21       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 19:21 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: Deepak R Varma, Intel Graphics Development, Kevin Wang,
	DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Nirmoy Das, Chen Li, Daniel Vetter, Alex Deucher, Dave Airlie,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Tue, Jun 22, 2021 at 9:01 PM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Daniel.
>
> On Tue, Jun 22, 2021 at 06:54:58PM +0200, Daniel Vetter wrote:
> > Also review & update everything while we're at it.
> >
> > This is prep work to smash a ton of stuff into the kerneldoc for
> > @resv.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 83 insertions(+), 24 deletions(-)
> >
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index 92eec38a03aa..6d18b9e448b9 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -289,28 +289,6 @@ struct dma_buf_ops {
> >
> >  /**
> >   * struct dma_buf - shared buffer object
> > - * @size: size of the buffer; invariant over the lifetime of the buffer.
> > - * @file: file pointer used for sharing buffers across, and for refcounting.
> > - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> > - *               protected by dma_resv lock.
> > - * @ops: dma_buf_ops associated with this buffer object.
> > - * @lock: used internally to serialize list manipulation, attach/detach and
> > - *        vmap/unmap
> > - * @vmapping_counter: used internally to refcnt the vmaps
> > - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> > - * @exp_name: name of the exporter; useful for debugging.
> > - * @name: userspace-provided name; useful for accounting and debugging,
> > - *        protected by @resv.
> > - * @name_lock: spinlock to protect name access
> > - * @owner: pointer to exporter module; used for refcounting when exporter is a
> > - *         kernel module.
> > - * @list_node: node for dma_buf accounting and debugging.
> > - * @priv: exporter specific private data for this buffer object.
> > - * @resv: reservation object linked to this dma-buf
> > - * @poll: for userspace poll support
> > - * @cb_excl: for userspace poll support
> > - * @cb_shared: for userspace poll support
> > - * @sysfs_entry: for exposing information about this buffer in sysfs.
>
> This sentence
> >   * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
> >   * and is incremented on each attach.
> belongs to the paragraph describing sysfs_entry and should be moved too.
> Or maybe reworded and then document all fields in dma_buf_sysfs_entry?

Unfortunately kerneldoc lost the ability to document embedded
structs/unions. At least last time I checked, it's a bit a bikeshed.
So I'd need to pull the entire struct out. I'll just move it since
it's indeed misplaced.

> With this fixed:
> Acked-by: Sam Ravnborg <sam@ravnborg.org>

Thanks for taking a look.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 02/15] dma-buf: Switch to inline kerneldoc
@ 2021-06-22 19:21       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 19:21 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: Deepak R Varma, Intel Graphics Development, Kevin Wang,
	DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Nirmoy Das, Chen Li, Daniel Vetter, Alex Deucher, Dave Airlie,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Tue, Jun 22, 2021 at 9:01 PM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Daniel.
>
> On Tue, Jun 22, 2021 at 06:54:58PM +0200, Daniel Vetter wrote:
> > Also review & update everything while we're at it.
> >
> > This is prep work to smash a ton of stuff into the kerneldoc for
> > @resv.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Nirmoy Das <nirmoy.das@amd.com>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
> >  1 file changed, 83 insertions(+), 24 deletions(-)
> >
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index 92eec38a03aa..6d18b9e448b9 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -289,28 +289,6 @@ struct dma_buf_ops {
> >
> >  /**
> >   * struct dma_buf - shared buffer object
> > - * @size: size of the buffer; invariant over the lifetime of the buffer.
> > - * @file: file pointer used for sharing buffers across, and for refcounting.
> > - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> > - *               protected by dma_resv lock.
> > - * @ops: dma_buf_ops associated with this buffer object.
> > - * @lock: used internally to serialize list manipulation, attach/detach and
> > - *        vmap/unmap
> > - * @vmapping_counter: used internally to refcnt the vmaps
> > - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> > - * @exp_name: name of the exporter; useful for debugging.
> > - * @name: userspace-provided name; useful for accounting and debugging,
> > - *        protected by @resv.
> > - * @name_lock: spinlock to protect name access
> > - * @owner: pointer to exporter module; used for refcounting when exporter is a
> > - *         kernel module.
> > - * @list_node: node for dma_buf accounting and debugging.
> > - * @priv: exporter specific private data for this buffer object.
> > - * @resv: reservation object linked to this dma-buf
> > - * @poll: for userspace poll support
> > - * @cb_excl: for userspace poll support
> > - * @cb_shared: for userspace poll support
> > - * @sysfs_entry: for exposing information about this buffer in sysfs.
>
> This sentence
> >   * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
> >   * and is incremented on each attach.
> belongs to the paragraph describing sysfs_entry and should be moved too.
> Or maybe reworded and then document all fields in dma_buf_sysfs_entry?

Unfortunately kerneldoc lost the ability to document embedded
structs/unions. At least last time I checked, it's a bit of a bikeshed.
So I'd need to pull the entire struct out. I'll just move it since
it's indeed misplaced.
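
A tiny illustration of what I mean (names made up, this isn't the actual
dma-buf code): per-member kerneldoc on a standalone struct

	/**
	 * struct foo_entry - standalone struct, members documentable
	 * @kobj: embedded kobject
	 */
	struct foo_entry {
		struct kobject kobj;
	};

is picked up fine, but once the definition is embedded inline inside
another struct there's nowhere to hang per-member kerneldoc, short of
pulling the definition out as above.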

> With this fixed:
> Acked-by: Sam Ravnborg <sam@ravnborg.org>

Thanks for taking a look.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 07/15] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
  2021-06-22 19:10     ` [Intel-gfx] " Sam Ravnborg
@ 2021-06-22 20:20       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 20:20 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Thomas Zimmermann, DRI Development

On Tue, Jun 22, 2021 at 9:10 PM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Daniel,
>
> On Tue, Jun 22, 2021 at 06:55:03PM +0200, Daniel Vetter wrote:
> > There's a bunch of atomic drivers who don't do this quite correctly,
> > luckily most of them aren't in wide use or people would have noticed
> > the tearing.
> >
> > By making this the default we avoid the constant audit pain and can
> > additionally remove a ton of lines from vfuncs for a bit more clarity
> > in smaller drivers.
> >
> > While at it complain if there's a cleanup_fb hook but no prepare_fb
> > hook, because that makes no sense. I haven't found any driver which
> > violates this, but better safe than sorry.
> >
> > Subsequent patches will reap the benefits.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/drm_atomic_helper.c      | 10 ++++++++++
> >  drivers/gpu/drm/drm_gem_atomic_helper.c  |  3 +++
> >  include/drm/drm_modeset_helper_vtables.h |  7 +++++--
> >  3 files changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 531f2374b072..9f6c5f21c4d6 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -35,6 +35,7 @@
> >  #include <drm/drm_damage_helper.h>
> >  #include <drm/drm_device.h>
> >  #include <drm/drm_drv.h>
> > +#include <drm/drm_gem_atomic_helper.h>
> >  #include <drm/drm_plane_helper.h>
> >  #include <drm/drm_print.h>
> >  #include <drm/drm_self_refresh_helper.h>
> > @@ -2408,6 +2409,15 @@ int drm_atomic_helper_prepare_planes(struct drm_device *dev,
> >                       ret = funcs->prepare_fb(plane, new_plane_state);
> >                       if (ret)
> >                               goto fail;
> > +             } else {
> > +                     WARN_ON_ONCE(funcs->cleanup_fb);
> > +
> > +                     if (!drm_core_check_feature(dev, DRIVER_GEM))
> > +                             continue;
> > +
> > +                     ret = drm_gem_plane_helper_prepare_fb(plane, new_plane_state);
> > +                     if (ret)
> > +                             goto fail;
> >               }
> >       }
> >
> > diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
> > index a27135084ae5..bc9396f2a0ed 100644
> > --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> > @@ -135,6 +135,9 @@
> >   * GEM based framebuffer drivers which have their buffers always pinned in
> >   * memory.
> >   *
> > + * This function is the default implementation for GEM drivers of
> > + * &drm_plane_helper_funcs.prepare_fb if no callback is provided.
> > + *
> >   * See drm_atomic_set_fence_for_plane() for a discussion of implicit and
> >   * explicit fencing in atomic modeset updates.
> >   */
> > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
> > index f3a4b47b3986..4e727261dca5 100644
> > --- a/include/drm/drm_modeset_helper_vtables.h
> > +++ b/include/drm/drm_modeset_helper_vtables.h
> > @@ -1178,8 +1178,11 @@ struct drm_plane_helper_funcs {
> >        * equivalent functionality should be implemented through private
> >        * members in the plane structure.
> >        *
> > -      * Drivers which always have their buffers pinned should use
> > -      * drm_gem_plane_helper_prepare_fb() for this hook.
> > +      * For GEM drivers who neither have a @prepare_fb not @cleanup_fb hook
> s/not/nor/ ??

Yup.

> > +      * set drm_gem_plane_helper_prepare_fb() is called automatically to
>               ^add comma?
> > +      * implement this.
>
>
> Leave cleanup_fb out of the description to make it more readable.

With the not->nor typo fixed, why does this make it more readable?
Afaiui neither ... nor ... is fairly standard English, and I really
want to make this the default only if you specify absolutely no plane
fb handling of your own.
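
To make the win concrete, a quick sketch of the driver side (names are
illustrative, not from any real driver): with the fallback in place a GEM
driver that only needs the default implicit-fencing behaviour drops the
hooks entirely

	static const struct drm_plane_helper_funcs foo_plane_helper_funcs = {
		/* no .prepare_fb/.cleanup_fb: the atomic helpers fall back to
		 * drm_gem_plane_helper_prepare_fb() automatically */
		.atomic_check	= foo_plane_atomic_check,
		.atomic_update	= foo_plane_atomic_update,
	};

and only drivers doing their own pinning or fence handling keep a custom
prepare_fb.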

> In the description of cleanup_fb you can document that it is wrong to
> have it without a matching prepare_fb if you feel like it.

So the reason I didn't document things that way is that imo the
"cleanup_fb  but not prepare_fb" case is just nonsense. But I also
didn't want to accidentally paper over bugs where people set only
cleanup_fb and forget to hook up the other one, hence the warning. But
if you think we should explain that in docs, I guess I can shuffle it
around. Just feel like specifying everything in the comments doesn't
help the readability of the docs.
-Daniel

>
>         Sam
>
>
>          * Other drivers which need additional plane processing
> > +      * can call drm_gem_plane_helper_prepare_fb() from their @prepare_fb
> > +      * hook.
> >        *
> >        * The helpers will call @cleanup_fb with matching arguments for every
> >        * successful call to this hook.
> > --
> > 2.32.0.rc2



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 07/15] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
@ 2021-06-22 20:20       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-22 20:20 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Thomas Zimmermann, DRI Development

On Tue, Jun 22, 2021 at 9:10 PM Sam Ravnborg <sam@ravnborg.org> wrote:
>
> Hi Daniel,
>
> On Tue, Jun 22, 2021 at 06:55:03PM +0200, Daniel Vetter wrote:
> > There's a bunch of atomic drivers who don't do this quite correctly,
> > luckily most of them aren't in wide use or people would have noticed
> > the tearing.
> >
> > By making this the default we avoid the constant audit pain and can
> > additionally remove a ton of lines from vfuncs for a bit more clarity
> > in smaller drivers.
> >
> > While at it complain if there's a cleanup_fb hook but no prepare_fb
> > hook, because that makes no sense. I haven't found any driver which
> > violates this, but better safe than sorry.
> >
> > Subsequent patches will reap the benefits.
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >  drivers/gpu/drm/drm_atomic_helper.c      | 10 ++++++++++
> >  drivers/gpu/drm/drm_gem_atomic_helper.c  |  3 +++
> >  include/drm/drm_modeset_helper_vtables.h |  7 +++++--
> >  3 files changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > index 531f2374b072..9f6c5f21c4d6 100644
> > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > @@ -35,6 +35,7 @@
> >  #include <drm/drm_damage_helper.h>
> >  #include <drm/drm_device.h>
> >  #include <drm/drm_drv.h>
> > +#include <drm/drm_gem_atomic_helper.h>
> >  #include <drm/drm_plane_helper.h>
> >  #include <drm/drm_print.h>
> >  #include <drm/drm_self_refresh_helper.h>
> > @@ -2408,6 +2409,15 @@ int drm_atomic_helper_prepare_planes(struct drm_device *dev,
> >                       ret = funcs->prepare_fb(plane, new_plane_state);
> >                       if (ret)
> >                               goto fail;
> > +             } else {
> > +                     WARN_ON_ONCE(funcs->cleanup_fb);
> > +
> > +                     if (!drm_core_check_feature(dev, DRIVER_GEM))
> > +                             continue;
> > +
> > +                     ret = drm_gem_plane_helper_prepare_fb(plane, new_plane_state);
> > +                     if (ret)
> > +                             goto fail;
> >               }
> >       }
> >
> > diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
> > index a27135084ae5..bc9396f2a0ed 100644
> > --- a/drivers/gpu/drm/drm_gem_atomic_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
> > @@ -135,6 +135,9 @@
> >   * GEM based framebuffer drivers which have their buffers always pinned in
> >   * memory.
> >   *
> > + * This function is the default implementation for GEM drivers of
> > + * &drm_plane_helper_funcs.prepare_fb if no callback is provided.
> > + *
> >   * See drm_atomic_set_fence_for_plane() for a discussion of implicit and
> >   * explicit fencing in atomic modeset updates.
> >   */
> > diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
> > index f3a4b47b3986..4e727261dca5 100644
> > --- a/include/drm/drm_modeset_helper_vtables.h
> > +++ b/include/drm/drm_modeset_helper_vtables.h
> > @@ -1178,8 +1178,11 @@ struct drm_plane_helper_funcs {
> >        * equivalent functionality should be implemented through private
> >        * members in the plane structure.
> >        *
> > -      * Drivers which always have their buffers pinned should use
> > -      * drm_gem_plane_helper_prepare_fb() for this hook.
> > +      * For GEM drivers who neither have a @prepare_fb not @cleanup_fb hook
> s/not/nor/ ??

Yup.

> > +      * set drm_gem_plane_helper_prepare_fb() is called automatically to
>               ^add comma?
> > +      * implement this.
>
>
> Leave cleanup_fb out of the description to make it more readable.

With the not->nor typo fixed, why does this make it more readable?
Afaiui neither ... nor ... is fairly standard English, and I really
want to make this the default only if you specify absolutely no plane
fb handling of your own.

> In the description of cleanup_fb you can document that it is wrong to
> have it without a matching prepare_fb if you feel like it.

So the reason I didn't document things that way is that imo the
"cleanup_fb  but not prepare_fb" case is just nonsense. But I also
didn't want to accidentally paper over bugs where people set only
cleanup_fb and forget to hook up the other one, hence the warning. But
if you think we should explain that in docs, I guess I can shuffle it
around. Just feel like specifying everything in the comments doesn't
help the readability of the docs.
-Daniel

>
>         Sam
>
>
>          * Other drivers which need additional plane processing
> > +      * can call drm_gem_plane_helper_prepare_fb() from their @prepare_fb
> > +      * hook.
> >        *
> >        * The helpers will call @cleanup_fb with matching arguments for every
> >        * successful call to this hook.
> > --
> > 2.32.0.rc2



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
  (?)
@ 2021-06-22 23:56   ` kernel test robot
  -1 siblings, 0 replies; 175+ messages in thread
From: kernel test robot @ 2021-06-22 23:56 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2558 bytes --]

Hi Daniel,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on drm-tip/drm-tip]
[cannot apply to sunxi/sunxi/for-next drm-intel/for-linux-next linus/master linux-arm/drm-armada-devel linux-arm/drm-armada-fixes v5.13-rc7 next-20210622]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Daniel-Vetter/implicit-fencing-dma-resv-rules-for-shared-buffers/20210623-005623
base:   git://anongit.freedesktop.org/drm/drm-tip drm-tip
config: x86_64-randconfig-s031-20210622 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-341-g8af24329-dirty
        # https://github.com/0day-ci/linux/commit/42de2bd7635cf7c6d79494a3a35512c53196524f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Daniel-Vetter/implicit-fencing-dma-resv-rules-for-shared-buffers/20210623-005623
        git checkout 42de2bd7635cf7c6d79494a3a35512c53196524f
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:1727:5: warning: no previous prototype for 'amdgpu_setparam_ioctl' [-Wmissing-prototypes]
    1727 | int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
         |     ^~~~~~~~~~~~~~~~~~~~~


vim +/amdgpu_setparam_ioctl +1727 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

  1726	
> 1727	int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
  1728				  struct drm_file *filp)
  1729	{
  1730		struct drm_amdgpu_setparam *setparam = data;
  1731		struct amdgpu_fpriv *fpriv = filp->driver_priv;
  1732	
  1733		switch (setparam->param) {
  1734		case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
  1735			if (setparam->value)
  1736				WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
  1737			else
  1738				WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
  1739			break;
  1740		default:
  1741			return -EINVAL;
  1742		}
  1743	
  1744		return 0;
  1745	}
  1746	
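
The usual ways to address this class of -Wmissing-prototypes warning, as a
sketch (where the declaration should live is an assumption here, and the
final patch may wire the handler up differently):

	/* either declare the handler in a header that amdgpu_drv.c includes ... */
	int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
				  struct drm_file *filp);

	/* ... or, if it is only referenced via the ioctl table in the same file,
	 * make the definition static so no external prototype is expected */
	static int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
					 struct drm_file *filp);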

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 47142 bytes --]

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
  (?)
@ 2021-06-23  8:31     ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:31 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Sumit Semwal,
	linux-media, linaro-mm-sig

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Oversight from
>
> commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> Author: Christian König <christian.koenig@amd.com>
> Date:   Mon May 10 16:14:09 2021 +0200
>
>      dma-buf: rename and cleanup dma_resv_get_excl v3
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-resv.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 562b885cf9c3..e1ca2080a1ff 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
>   }
>   
>   /**
> - * dma_resv_exclusive - return the object's exclusive fence
> + * dma_resv_excl_fence - return the object's exclusive fence
>    * @obj: the reservation object
>    *
>    * Returns the exclusive fence (if any). Caller must either hold the objects


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
@ 2021-06-23  8:31     ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:31 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: linaro-mm-sig, Daniel Vetter, Intel Graphics Development, linux-media

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Oversight from
>
> commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> Author: Christian König <christian.koenig@amd.com>
> Date:   Mon May 10 16:14:09 2021 +0200
>
>      dma-buf: rename and cleanup dma_resv_get_excl v3
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-resv.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 562b885cf9c3..e1ca2080a1ff 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
>   }
>   
>   /**
> - * dma_resv_exclusive - return the object's exclusive fence
> + * dma_resv_excl_fence - return the object's exclusive fence
>    * @obj: the reservation object
>    *
>    * Returns the exclusive fence (if any). Caller must either hold the objects


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 01/15] dma-resv: Fix kerneldoc
@ 2021-06-23  8:31     ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:31 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: linaro-mm-sig, Daniel Vetter, Intel Graphics Development,
	Sumit Semwal, linux-media

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Oversight from
>
> commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> Author: Christian König <christian.koenig@amd.com>
> Date:   Mon May 10 16:14:09 2021 +0200
>
>      dma-buf: rename and cleanup dma_resv_get_excl v3
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-resv.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 562b885cf9c3..e1ca2080a1ff 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
>   }
>   
>   /**
> - * dma_resv_exclusive - return the object's exclusive fence
> + * dma_resv_excl_fence - return the object's exclusive fence
>    * @obj: the reservation object
>    *
>    * Returns the exclusive fence (if any). Caller must either hold the objects

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 02/15] dma-buf: Switch to inline kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
  (?)
@ 2021-06-23  8:32     ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Sumit Semwal,
	Alex Deucher, Dave Airlie, Nirmoy Das, Deepak R Varma, Chen Li,
	Kevin Wang, linux-media, linaro-mm-sig

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Also review & update everything while we're at it.
>
> This is prep work to smash a ton of stuff into the kerneldoc for
> @resv.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
>   1 file changed, 83 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 92eec38a03aa..6d18b9e448b9 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -289,28 +289,6 @@ struct dma_buf_ops {
>   
>   /**
>    * struct dma_buf - shared buffer object
> - * @size: size of the buffer; invariant over the lifetime of the buffer.
> - * @file: file pointer used for sharing buffers across, and for refcounting.
> - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> - *               protected by dma_resv lock.
> - * @ops: dma_buf_ops associated with this buffer object.
> - * @lock: used internally to serialize list manipulation, attach/detach and
> - *        vmap/unmap
> - * @vmapping_counter: used internally to refcnt the vmaps
> - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> - * @exp_name: name of the exporter; useful for debugging.
> - * @name: userspace-provided name; useful for accounting and debugging,
> - *        protected by @resv.
> - * @name_lock: spinlock to protect name access
> - * @owner: pointer to exporter module; used for refcounting when exporter is a
> - *         kernel module.
> - * @list_node: node for dma_buf accounting and debugging.
> - * @priv: exporter specific private data for this buffer object.
> - * @resv: reservation object linked to this dma-buf
> - * @poll: for userspace poll support
> - * @cb_excl: for userspace poll support
> - * @cb_shared: for userspace poll support
> - * @sysfs_entry: for exposing information about this buffer in sysfs.
>    * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
>    * and is incremented on each attach.
>    *
> @@ -324,24 +302,100 @@ struct dma_buf_ops {
>    * Device DMA access is handled by the separate &struct dma_buf_attachment.
>    */
>   struct dma_buf {
> +	/**
> +	 * @size:
> +	 *
> +	 * Size of the buffer; invariant over the lifetime of the buffer.
> +	 */
>   	size_t size;
> +
> +	/**
> +	 * @file:
> +	 *
> +	 * File pointer used for sharing buffers across, and for refcounting.
> +	 * See dma_buf_get() and dma_buf_put().
> +	 */
>   	struct file *file;
> +
> +	/**
> +	 * @attachments:
> +	 *
> +	 * List of dma_buf_attachment that denotes all devices attached,
> +	 * protected by &dma_resv lock @resv.
> +	 */
>   	struct list_head attachments;
> +
> +	/** @ops: dma_buf_ops associated with this buffer object. */
>   	const struct dma_buf_ops *ops;
> +
> +	/**
> +	 * @lock:
> +	 *
> +	 * Used internally to serialize list manipulation, attach/detach and
> +	 * vmap/unmap. Note that in many cases this is superseded by
> +	 * dma_resv_lock() on @resv.
> +	 */
>   	struct mutex lock;
> +
> +	/**
> +	 * @vmapping_counter:
> +	 *
> +	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
> +	 * Protected by @lock.
> +	 */
>   	unsigned vmapping_counter;
> +
> +	/**
> +	 * @vmap_ptr:
> +	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
> +	 */
>   	struct dma_buf_map vmap_ptr;
> +
> +	/**
> +	 * @exp_name:
> +	 *
> +	 * Name of the exporter; useful for debugging. See the
> +	 * DMA_BUF_SET_NAME IOCTL.
> +	 */
>   	const char *exp_name;
> +
> +	/**
> +	 * @name:
> +	 *
> +	 * Userspace-provided name; useful for accounting and debugging,
> +	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
> +	 */
>   	const char *name;
> +
> +	/** @name_lock: Spinlock to protect name access for read access. */
>   	spinlock_t name_lock;
> +
> +	/**
> +	 * @owner:
> +	 *
> +	 * Pointer to exporter module; used for refcounting when exporter is a
> +	 * kernel module.
> +	 */
>   	struct module *owner;
> +
> +	/** @list_node: node for dma_buf accounting and debugging. */
>   	struct list_head list_node;
> +
> +	/** @priv: exporter specific private data for this buffer object. */
>   	void *priv;
> +
> +	/**
> +	 * @resv:
> +	 *
> +	 * Reservation object linked to this dma-buf.
> +	 */
>   	struct dma_resv *resv;
>   
> -	/* poll support */
> +	/** @poll: for userspace poll support */
>   	wait_queue_head_t poll;
>   
> +	/** @cb_excl: for userspace poll support */
> +	/** @cb_shared: for userspace poll support */
>   	struct dma_buf_poll_cb_t {
>   		struct dma_fence_cb cb;
>   		wait_queue_head_t *poll;
> @@ -349,7 +403,12 @@ struct dma_buf {
>   		__poll_t active;
>   	} cb_excl, cb_shared;
>   #ifdef CONFIG_DMABUF_SYSFS_STATS
> -	/* for sysfs stats */
> +	/**
> +	 * @sysfs_entry:
> +	 *
> +	 * For exposing information about this buffer in sysfs. See also
> +	 * `DMA-BUF statistics`_ for the uapi this enables.
> +	 */
>   	struct dma_buf_sysfs_entry {
>   		struct kobject kobj;
>   		struct dma_buf *dmabuf;


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 02/15] dma-buf: Switch to inline kerneldoc
@ 2021-06-23  8:32     ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Deepak R Varma, Intel Graphics Development, Kevin Wang,
	linaro-mm-sig, Nirmoy Das, Chen Li, Dave Airlie, Alex Deucher,
	Daniel Vetter, linux-media

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Also review & update everything while we're at it.
>
> This is prep work to smash a ton of stuff into the kerneldoc for
> @resv.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
>   1 file changed, 83 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 92eec38a03aa..6d18b9e448b9 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -289,28 +289,6 @@ struct dma_buf_ops {
>   
>   /**
>    * struct dma_buf - shared buffer object
> - * @size: size of the buffer; invariant over the lifetime of the buffer.
> - * @file: file pointer used for sharing buffers across, and for refcounting.
> - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> - *               protected by dma_resv lock.
> - * @ops: dma_buf_ops associated with this buffer object.
> - * @lock: used internally to serialize list manipulation, attach/detach and
> - *        vmap/unmap
> - * @vmapping_counter: used internally to refcnt the vmaps
> - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> - * @exp_name: name of the exporter; useful for debugging.
> - * @name: userspace-provided name; useful for accounting and debugging,
> - *        protected by @resv.
> - * @name_lock: spinlock to protect name access
> - * @owner: pointer to exporter module; used for refcounting when exporter is a
> - *         kernel module.
> - * @list_node: node for dma_buf accounting and debugging.
> - * @priv: exporter specific private data for this buffer object.
> - * @resv: reservation object linked to this dma-buf
> - * @poll: for userspace poll support
> - * @cb_excl: for userspace poll support
> - * @cb_shared: for userspace poll support
> - * @sysfs_entry: for exposing information about this buffer in sysfs.
>    * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
>    * and is incremented on each attach.
>    *
> @@ -324,24 +302,100 @@ struct dma_buf_ops {
>    * Device DMA access is handled by the separate &struct dma_buf_attachment.
>    */
>   struct dma_buf {
> +	/**
> +	 * @size:
> +	 *
> +	 * Size of the buffer; invariant over the lifetime of the buffer.
> +	 */
>   	size_t size;
> +
> +	/**
> +	 * @file:
> +	 *
> +	 * File pointer used for sharing buffers across, and for refcounting.
> +	 * See dma_buf_get() and dma_buf_put().
> +	 */
>   	struct file *file;
> +
> +	/**
> +	 * @attachments:
> +	 *
> +	 * List of dma_buf_attachment that denotes all devices attached,
> +	 * protected by &dma_resv lock @resv.
> +	 */
>   	struct list_head attachments;
> +
> +	/** @ops: dma_buf_ops associated with this buffer object. */
>   	const struct dma_buf_ops *ops;
> +
> +	/**
> +	 * @lock:
> +	 *
> +	 * Used internally to serialize list manipulation, attach/detach and
> +	 * vmap/unmap. Note that in many cases this is superseded by
> +	 * dma_resv_lock() on @resv.
> +	 */
>   	struct mutex lock;
> +
> +	/**
> +	 * @vmapping_counter:
> +	 *
> +	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
> +	 * Protected by @lock.
> +	 */
>   	unsigned vmapping_counter;
> +
> +	/**
> +	 * @vmap_ptr:
> +	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
> +	 */
>   	struct dma_buf_map vmap_ptr;
> +
> +	/**
> +	 * @exp_name:
> +	 *
> +	 * Name of the exporter; useful for debugging. See the
> +	 * DMA_BUF_SET_NAME IOCTL.
> +	 */
>   	const char *exp_name;
> +
> +	/**
> +	 * @name:
> +	 *
> +	 * Userspace-provided name; useful for accounting and debugging,
> +	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
> +	 */
>   	const char *name;
> +
> +	/** @name_lock: Spinlock to protect name access for read access. */
>   	spinlock_t name_lock;
> +
> +	/**
> +	 * @owner:
> +	 *
> +	 * Pointer to exporter module; used for refcounting when exporter is a
> +	 * kernel module.
> +	 */
>   	struct module *owner;
> +
> +	/** @list_node: node for dma_buf accounting and debugging. */
>   	struct list_head list_node;
> +
> +	/** @priv: exporter specific private data for this buffer object. */
>   	void *priv;
> +
> +	/**
> +	 * @resv:
> +	 *
> +	 * Reservation object linked to this dma-buf.
> +	 */
>   	struct dma_resv *resv;
>   
> -	/* poll support */
> +	/** @poll: for userspace poll support */
>   	wait_queue_head_t poll;
>   
> +	/** @cb_excl: for userspace poll support */
> +	/** @cb_shared: for userspace poll support */
>   	struct dma_buf_poll_cb_t {
>   		struct dma_fence_cb cb;
>   		wait_queue_head_t *poll;
> @@ -349,7 +403,12 @@ struct dma_buf {
>   		__poll_t active;
>   	} cb_excl, cb_shared;
>   #ifdef CONFIG_DMABUF_SYSFS_STATS
> -	/* for sysfs stats */
> +	/**
> +	 * @sysfs_entry:
> +	 *
> +	 * For exposing information about this buffer in sysfs. See also
> +	 * `DMA-BUF statistics`_ for the uapi this enables.
> +	 */
>   	struct dma_buf_sysfs_entry {
>   		struct kobject kobj;
>   		struct dma_buf *dmabuf;


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 02/15] dma-buf: Switch to inline kerneldoc
@ 2021-06-23  8:32     ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Deepak R Varma, Intel Graphics Development, Kevin Wang,
	linaro-mm-sig, Nirmoy Das, Chen Li, Dave Airlie, Alex Deucher,
	Daniel Vetter, Sumit Semwal, linux-media

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Also review & update everything while we're at it.
>
> This is prep work to smash a ton of stuff into the kerneldoc for
> @resv.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org

Acked-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-buf.h | 107 +++++++++++++++++++++++++++++++---------
>   1 file changed, 83 insertions(+), 24 deletions(-)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 92eec38a03aa..6d18b9e448b9 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -289,28 +289,6 @@ struct dma_buf_ops {
>   
>   /**
>    * struct dma_buf - shared buffer object
> - * @size: size of the buffer; invariant over the lifetime of the buffer.
> - * @file: file pointer used for sharing buffers across, and for refcounting.
> - * @attachments: list of dma_buf_attachment that denotes all devices attached,
> - *               protected by dma_resv lock.
> - * @ops: dma_buf_ops associated with this buffer object.
> - * @lock: used internally to serialize list manipulation, attach/detach and
> - *        vmap/unmap
> - * @vmapping_counter: used internally to refcnt the vmaps
> - * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
> - * @exp_name: name of the exporter; useful for debugging.
> - * @name: userspace-provided name; useful for accounting and debugging,
> - *        protected by @resv.
> - * @name_lock: spinlock to protect name access
> - * @owner: pointer to exporter module; used for refcounting when exporter is a
> - *         kernel module.
> - * @list_node: node for dma_buf accounting and debugging.
> - * @priv: exporter specific private data for this buffer object.
> - * @resv: reservation object linked to this dma-buf
> - * @poll: for userspace poll support
> - * @cb_excl: for userspace poll support
> - * @cb_shared: for userspace poll support
> - * @sysfs_entry: for exposing information about this buffer in sysfs.
>    * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
>    * and is incremented on each attach.
>    *
> @@ -324,24 +302,100 @@ struct dma_buf_ops {
>    * Device DMA access is handled by the separate &struct dma_buf_attachment.
>    */
>   struct dma_buf {
> +	/**
> +	 * @size:
> +	 *
> +	 * Size of the buffer; invariant over the lifetime of the buffer.
> +	 */
>   	size_t size;
> +
> +	/**
> +	 * @file:
> +	 *
> +	 * File pointer used for sharing buffers across, and for refcounting.
> +	 * See dma_buf_get() and dma_buf_put().
> +	 */
>   	struct file *file;
> +
> +	/**
> +	 * @attachments:
> +	 *
> +	 * List of dma_buf_attachment that denotes all devices attached,
> +	 * protected by &dma_resv lock @resv.
> +	 */
>   	struct list_head attachments;
> +
> +	/** @ops: dma_buf_ops associated with this buffer object. */
>   	const struct dma_buf_ops *ops;
> +
> +	/**
> +	 * @lock:
> +	 *
> +	 * Used internally to serialize list manipulation, attach/detach and
> +	 * vmap/unmap. Note that in many cases this is superseded by
> +	 * dma_resv_lock() on @resv.
> +	 */
>   	struct mutex lock;
> +
> +	/**
> +	 * @vmapping_counter:
> +	 *
> +	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
> +	 * Protected by @lock.
> +	 */
>   	unsigned vmapping_counter;
> +
> +	/**
> +	 * @vmap_ptr:
> +	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
> +	 */
>   	struct dma_buf_map vmap_ptr;
> +
> +	/**
> +	 * @exp_name:
> +	 *
> +	 * Name of the exporter; useful for debugging. See the
> +	 * DMA_BUF_SET_NAME IOCTL.
> +	 */
>   	const char *exp_name;
> +
> +	/**
> +	 * @name:
> +	 *
> +	 * Userspace-provided name; useful for accounting and debugging,
> +	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
> +	 */
>   	const char *name;
> +
> +	/** @name_lock: Spinlock to protect name access for read access. */
>   	spinlock_t name_lock;
> +
> +	/**
> +	 * @owner:
> +	 *
> +	 * Pointer to exporter module; used for refcounting when exporter is a
> +	 * kernel module.
> +	 */
>   	struct module *owner;
> +
> +	/** @list_node: node for dma_buf accounting and debugging. */
>   	struct list_head list_node;
> +
> +	/** @priv: exporter specific private data for this buffer object. */
>   	void *priv;
> +
> +	/**
> +	 * @resv:
> +	 *
> +	 * Reservation object linked to this dma-buf.
> +	 */
>   	struct dma_resv *resv;
>   
> -	/* poll support */
> +	/** @poll: for userspace poll support */
>   	wait_queue_head_t poll;
>   
> +	/** @cb_excl: for userspace poll support */
> +	/** @cb_shared: for userspace poll support */
>   	struct dma_buf_poll_cb_t {
>   		struct dma_fence_cb cb;
>   		wait_queue_head_t *poll;
> @@ -349,7 +403,12 @@ struct dma_buf {
>   		__poll_t active;
>   	} cb_excl, cb_shared;
>   #ifdef CONFIG_DMABUF_SYSFS_STATS
> -	/* for sysfs stats */
> +	/**
> +	 * @sysfs_entry:
> +	 *
> +	 * For exposing information about this buffer in sysfs. See also
> +	 * `DMA-BUF statistics`_ for the uapi this enables.
> +	 */
>   	struct dma_buf_sysfs_entry {
>   		struct kobject kobj;
>   		struct dma_buf *dmabuf;

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 03/15] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-22 16:54   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23  8:41     ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:41 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, linaro-mm-sig,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Alex Deucher,
	mesa-dev, Michel Dänzer, Dennis Li, Deepak R Varma

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Docs for struct dma_resv are fairly clear:
>
> "A reservation object can have attached one exclusive fence (normally
> associated with write operations) or N shared fences (read
> operations)."
>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
>
> Furthermore, a review across all of upstream follows.
>
> First off, render drivers and how they set implicit fences:
>
> - nouveau follows this contract, see in validate_fini_no_ticket()
>
> 			nouveau_bo_fence(nvbo, fence, !!b->write_domains);
>
>    and that last boolean controls whether the exclusive or shared fence
>    slot is used.
>
> - radeon follows this contract by setting
>
> 		p->relocs[i].tv.num_shared = !r->write_domain;
>
>    in radeon_cs_parser_relocs(), which ensures that the call to
>    ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
>    right thing.
>
> - vmwgfx seems to follow this contract with the shotgun approach of
>    always setting ttm_val_buf->num_shared = 0, which means
>    ttm_eu_fence_buffer_objects() will only use the exclusive slot.
>
> - etnaviv follows this contract, as can be trivially seen by looking
>    at submit_attach_object_fences()
>
> - i915 is a bit of a convoluted maze with multiple paths leading to
>    i915_vma_move_to_active(). Which sets the exclusive flag if
>    EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
>    softpin mode, or through the write_domain when using relocations. It
>    follows this contract.
>
> - lima follows this contract, see lima_gem_submit() which sets the
>    exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
>    bo
>
> - msm follows this contract, see msm_gpu_submit() which sets the
>    exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
>
> - panfrost follows this contract with the shotgun approach of just
>    always setting the exclusive fence, see
>    panfrost_attach_object_fences(). Benefits of a single engine I guess
>
> - v3d follows this contract with the same shotgun approach in
>    v3d_attach_fences_and_unlock_reservation(), but it has at least an
>    XXX comment that maybe this should be improved
>
> - vc4 uses the same shotgun approach of always setting an exclusive
>    fence, see vc4_update_bo_seqnos()
>
> - vgem also follows this contract, see vgem_fence_attach_ioctl() and
>    the VGEM_FENCE_WRITE. This is used in some igts to validate prime
>    sharing with i915.ko without the need of a 2nd gpu
>
> - virtio follows this contract again with the shotgun approach of
>    always setting an exclusive fence, see virtio_gpu_array_add_fence()
>
> This covers the setting of the exclusive fences when writing.
>
> Synchronizing against the exclusive fence is a lot more tricky, and I
> only spot checked a few:
>
> - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
>    implicit dependencies (which is used by vulkan)
>
> - etnaviv does this. Implicit dependencies are collected in
>    submit_fence_sync(), again with an opt-out flag
>    ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
>    etnaviv_sched_dependency which is the
>    drm_sched_backend_ops->dependency callback.
>
> - vc4 seems to not do much here, maybe gets away with it by not having
>    a scheduler and only a single engine. Since all newer broadcom chips than
>    the OG vc4 use v3d for rendering, which follows this contract, the
>    impact of this issue is fairly small.
>
> - v3d does this using the drm_gem_fence_array_add_implicit() helper,
>    which its drm_sched_backend_ops->dependency callback
>    v3d_job_dependency() picks up.
>
> - panfrost is nice here and tracks the implicit fences in
>    panfrost_job->implicit_fences, which again the
>    drm_sched_backend_ops->dependency callback panfrost_job_dependency()
>    picks up. It is mildly questionable though since it only picks up
>    exclusive fences in panfrost_acquire_object_fences(), but not buggy
>    in practice because it also always sets the exclusive fence. It
>    should pick up both sets of fences, just in case there's ever going
>    to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
>    pcie port and a real gpu, which might actually happen eventually. A
>    bug, but easy to fix. Should probably use the
>    drm_gem_fence_array_add_implicit() helper.
>
> - lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
>    the same schema as v3d.
>
> - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
>    but because it doesn't use the drm/scheduler it handles fences from
>    the wrong context with a synchronous dma_fence_wait. See
>    submit_fence_sync() leading to msm_gem_sync_object(). Investing into
>    a scheduler might be a good idea.
>
> - all the remaining drivers are ttm based, where I hope they do
>    appropriately obey implicit fences already. I didn't do the full
>    audit there because a) not following the contract would confuse ttm
>    quite well and b) reading non-standard scheduler and submit code
>    which isn't based on drm/scheduler is a pain.
>
> Onwards to the display side.
>
> - Any driver using the drm_gem_plane_helper_prepare_fb() helper will
>    handle this correctly. Overwhelmingly most drivers get this right,
>    except a few totally don't. I'll follow up with a patch to make this the default
>    and avoid a bunch of bugs.
>
> - I didn't audit the ttm drivers, but given that dma_resv started
>    there I hope they get this right.
>
> In conclusion this IS the contract, both as documented and
> overwhelmingly implemented, specifically as implemented by all render
> drivers except amdgpu.
>
> Amdgpu tried to fix this already in
>
> commit 049aca4363d8af87cab8d53de5401602db3b9999
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Sep 19 16:54:35 2018 +0200
>
>      drm/amdgpu: fix using shared fence for exported BOs v2
>
> but this fix falls short on a number of areas:
>
> - It's racy, by the time the buffer is shared it might be too late. To
>    make sure there's definitely never a problem we need to set the
>    fences correctly for any buffer that's potentially exportable.
>
> - It's breaking uapi: dma-buf fds support poll() and differentiate
>    between read and write access, which was introduced in
>
> 	commit 9b495a5887994a6d74d5c261d012083a92b94738
> 	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> 	Date:   Tue Jul 1 12:57:43 2014 +0200
>
> 	    dma-buf: add poll support, v3
>
> - Christian König wants to nack new uapi building further on this
>    dma_resv contract because it breaks amdgpu, quoting
>
>    "Yeah, and that is exactly the reason why I will NAK this uAPI change.
>
>    "This doesn't works for amdgpu at all for the reasons outlined above."
>
>    https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
>
>    Rejecting new development because your own driver is broken and
>    violates established cross driver contracts and uapi is really not
>    how upstream works.
>
> Now this patch will have a severe performance impact on anything that
> runs on multiple engines. So we can't just merge it outright, but need
> a bit of a plan:
>
> - amdgpu needs a proper uapi for handling implicit fencing. The funny
>    thing is that to do it correctly, implicit fencing must be treated
>    as a very strange IPC mechanism for transporting fences, where both
>    setting the fence and dependency intercepts must be handled
>    explicitly. Current best practices is a per-bo flag to indicate
>    writes, and a per-bo flag to skip implicit fencing in the CS
>    ioctl as a new chunk.
>
> - Since amdgpu has been shipping with broken behaviour we need an
>    opt-out flag from the butchered implicit fencing model to enable the
>    proper explicit implicit fencing model.
>
> - for kernel memory fences due to bo moves at least the i915 idea is
>    to use ttm_bo->moving. amdgpu probably needs the same.
>
> - since the current p2p dma-buf interface assumes the kernel memory
>    fence is in the exclusive dma_resv fence slot we need to add a new
>    fence slot for kernel fences, which must never be ignored. Since
>    currently only amdgpu supports this there's no real problem here
>    yet, until amdgpu gains a NO_IMPLICIT CS flag.
>
> - New userspace needs to ship in enough desktop distros so that users
>    won't notice the perf impact. I think we can ignore LTS distros who
>    upgrade their kernels but not their mesa3d snapshot.
>
> - Then when this is all in place we can merge this patch here.
>
> What is not a solution to this problem here is trying to make the
> dma_resv rules in the kernel more clever. The fundamental issue here
> is that the amdgpu CS uapi is the least expressive one across all
> drivers (only equalled by panfrost, which has an actual excuse) by not
> allowing any userspace control over how implicit sync is conducted.
>
> Until this is fixed it's completely pointless to make the kernel more
> clever to improve amdgpu, because all we're doing is papering over
> this uapi design issue. amdgpu needs to attain the status quo
> established by other drivers first, once that's achieved we can tackle
> the remaining issues in a consistent way across drivers.
>
> v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
> entirely missed.
>
> This is great because it means the amdgpu specific piece for proper
> implicit fence handling exists already, and that since a while. The
> only thing that's now missing is
> - fishing the implicit fences out of a shared object at the right time
> - setting the exclusive implicit fence slot at the right time.
>
> Jason has a patch series to fill that gap with a bunch of generic
> ioctl on the dma-buf fd:
>
> https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
>
> v3: Since Christian has fixed amdgpu now in
>
> commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Jun 9 13:51:36 2021 +0200
>
>      drm/amdgpu: rework dma_resv handling v3
>
> Use the audit covered in this commit message as the excuse to update
> the dma-buf docs around dma_buf.resv usage across drivers.
>
> Since dynamic importers have different rules also hammer these in
> again while we're at it.
>
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Rob Clark <robdclark@chromium.org>
> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> Cc: Michel Dänzer <michel@daenzer.net>
> Cc: Daniel Stone <daniels@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: linaro-mm-sig@lists.linaro.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-buf.h | 39 +++++++++++++++++++++++++++++++++++++++
>   1 file changed, 39 insertions(+)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 6d18b9e448b9..4807cefe81f5 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -388,6 +388,45 @@ struct dma_buf {
>   	 * @resv:
>   	 *
>   	 * Reservation object linked to this dma-buf.
> +	 *
> +	 * IMPLICIT SYNCHRONIZATION RULES:
> +	 *
> +	 * Drivers which support implicit synchronization of buffer access as
> +	 * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
> +	 * below rules.
> +	 *
> +	 * - Drivers should add a shared fence through
> +	 *   dma_resv_add_shared_fence() for anything the userspace API
> +	 *   considers a read access. This highly depends upon the API and
> +	 *   window system: E.g. OpenGL is generally implicitly synchronized on
> +	 *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
> +	 *   generally explicitly synchronized for everything, and window system
> +	 *   buffers have explicit API calls (which then need to make sure the
> +	 *   implicit fences stored here in @resv are updated correctly).
> +	 *
> +	 * - Similarly drivers should set the exclusive fence through
> +	 *   dma_resv_add_excl_fence() for anything the userspace API considers
> +	 *   write access.
> +	 *
> +	 * - Drivers may just always set the exclusive fence, since that only
> +	 *   causes unnecessary synchronization, but no correctness issues.
> +	 *
> +	 * - Some drivers only expose a synchronous userspace API with no
> +	 *   pipelining across drivers. These do not set any fences for their
> +	 *   access. An example here is v4l.
> +	 *
> +	 * DYNAMIC IMPORTER RULES:
> +	 *
> +	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
> +	 * additional constraints on how they set up fences:
> +	 *
> +	 * - Dynamic importers must obey the exclusive fence and wait for it to
> +	 *   signal before allowing access to the buffer's underlying storage
> +	 *   through the device.
> +	 *
> +	 * - Dynamic importers should set fences for any access that they can't
> +	 *   disable immediately from their @dma_buf_attach_ops.move_notify
> +	 *   callback.
>   	 */
>   	struct dma_resv *resv;
>   

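For readers following along, here is a minimal sketch of the submission-side
half of these rules (the helper below is illustrative, not taken from any
driver):

	/* Attach a job fence following the implicit sync contract:
	 * shared slot for reads, exclusive slot for writes. */
	static int foo_attach_implicit_fence(struct dma_resv *resv,
					     struct dma_fence *fence,
					     bool write)
	{
		int ret;

		ret = dma_resv_lock(resv, NULL);
		if (ret)
			return ret;

		/* make sure a shared fence slot is available before adding */
		ret = dma_resv_reserve_shared(resv, 1);
		if (!ret) {
			if (write)
				dma_resv_add_excl_fence(resv, fence);
			else
				dma_resv_add_shared_fence(resv, fence);
		}

		dma_resv_unlock(resv);
		return ret;
	}

Importers that honour implicit sync then fish the exclusive fence (and, for
their own writes, the shared fences too) out of the same dma_resv as
dependencies before scheduling their access.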

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 03/15] dma-buf: Document dma-buf implicit fencing/resv fencing rules
@ 2021-06-23  8:41     ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:41 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, Sumit Semwal,
	linaro-mm-sig, Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Bas Nieuwenhuizen, Alex Deucher, mesa-dev, Michel Dänzer,
	Dennis Li, Deepak R Varma

Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> Docs for struct dma_resv are fairly clear:
>
> "A reservation object can have attached one exclusive fence (normally
> associated with write operations) or N shared fences (read
> operations)."
>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
>
> Furthermore, a review across all of upstream follows.
>
> First off, render drivers and how they set implicit fences:
>
> - nouveau follows this contract, see in validate_fini_no_ticket()
>
> 			nouveau_bo_fence(nvbo, fence, !!b->write_domains);
>
>    and that last boolean controls whether the exclusive or shared fence
>    slot is used.
>
> - radeon follows this contract by setting
>
> 		p->relocs[i].tv.num_shared = !r->write_domain;
>
>    in radeon_cs_parser_relocs(), which ensures that the call to
>    ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
>    right thing.
>
> - vmwgfx seems to follow this contract with the shotgun approach of
>    always setting ttm_val_buf->num_shared = 0, which means
>    ttm_eu_fence_buffer_objects() will only use the exclusive slot.
>
> - etnaviv follows this contract, as can be trivially seen by looking
>    at submit_attach_object_fences()
>
> - i915 is a bit of a convoluted maze with multiple paths leading to
>    i915_vma_move_to_active(). Which sets the exclusive flag if
>    EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
>    softpin mode, or through the write_domain when using relocations. It
>    follows this contract.
>
> - lima follows this contract, see lima_gem_submit() which sets the
>    exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
>    bo
>
> - msm follows this contract, see msm_gpu_submit() which sets the
>    exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
>
> - panfrost follows this contract with the shotgun approach of just
>    always setting the exclusive fence, see
>    panfrost_attach_object_fences(). Benefits of a single engine I guess
>
> - v3d follows this contract with the same shotgun approach in
>    v3d_attach_fences_and_unlock_reservation(), but it has at least an
>    XXX comment that maybe this should be improved
>
> - vc4 uses the same shotgun approach of always setting an exclusive
>    fence, see vc4_update_bo_seqnos()
>
> - vgem also follows this contract, see vgem_fence_attach_ioctl() and
>    the VGEM_FENCE_WRITE. This is used in some igts to validate prime
>    sharing with i915.ko without the need of a 2nd gpu
>
> - virtio follows this contract again with the shotgun approach of
>    always setting an exclusive fence, see virtio_gpu_array_add_fence()
>
> This covers the setting of the exclusive fences when writing.
>
> Synchronizing against the exclusive fence is a lot more tricky, and I
> only spot checked a few:
>
> - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
>    implicit dependencies (which is used by vulkan)
>
> - etnaviv does this. Implicit dependencies are collected in
>    submit_fence_sync(), again with an opt-out flag
>    ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
>    etnaviv_sched_dependency which is the
>    drm_sched_backend_ops->dependency callback.
>
> - vc4 seems to not do much here, maybe gets away with it by not having
>    a scheduler and only a single engine. Since all newer broadcom chips than
>    the OG vc4 use v3d for rendering, which follows this contract, the
>    impact of this issue is fairly small.
>
> - v3d does this using the drm_gem_fence_array_add_implicit() helper,
>    which its drm_sched_backend_ops->dependency callback
>    v3d_job_dependency() then picks up.
>
> - panfrost is nice here and tracks the implicit fences in
>    panfrost_job->implicit_fences, which again the
>    drm_sched_backend_ops->dependency callback panfrost_job_dependency()
>    picks up. It is mildly questionable though since it only picks up
>    exclusive fences in panfrost_acquire_object_fences(), but not buggy
>    in practice because it also always sets the exclusive fence. It
>    should pick up both sets of fences, just in case there's ever going
>    to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
>    pcie port and a real gpu, which might actually happen eventually. A
>    bug, but easy to fix. Should probably use the
>    drm_gem_fence_array_add_implicit() helper.
>
> - lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
>    the same schema as v3d.
>
> - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
>    but because it doesn't use the drm/scheduler it handles fences from
>    the wrong context with a synchronous dma_fence_wait. See
>    submit_fence_sync() leading to msm_gem_sync_object(). Investing into
>    a scheduler might be a good idea.
>
> - all the remaining drivers are ttm based, where I hope they do
>    appropriately obey implicit fences already. I didn't do the full
>    audit there because a) not following the contract would confuse ttm
>    quite well and b) reading non-standard scheduler and submit code
>    which isn't based on drm/scheduler is a pain.
>
> Onwards to the display side.
>
> - Any driver using the drm_gem_plane_helper_prepare_fb() helper will
>    do this correctly. Overwhelmingly most drivers get this right, except
>    a few totally don't. I'll follow up with a patch to make this the default
>    and avoid a bunch of bugs.
>
> - I didn't audit the ttm drivers, but given that dma_resv started
>    there I hope they get this right.
>
> In conclusion this IS the contract, both as documented and
> overwhelmingly implemented, specifically as implemented by all render
> drivers except amdgpu.
>
> Amdgpu tried to fix this already in
>
> commit 049aca4363d8af87cab8d53de5401602db3b9999
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Sep 19 16:54:35 2018 +0200
>
>      drm/amdgpu: fix using shared fence for exported BOs v2
>
> but this fix falls short on a number of areas:
>
> - It's racy, by the time the buffer is shared it might be too late. To
>    make sure there's definitely never a problem we need to set the
>    fences correctly for any buffer that's potentially exportable.
>
> - It's breaking uapi, dma-buf fds support poll() and differentiate
>    between read and write access, which was introduced in
>
> 	commit 9b495a5887994a6d74d5c261d012083a92b94738
> 	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> 	Date:   Tue Jul 1 12:57:43 2014 +0200
>
> 	    dma-buf: add poll support, v3
>
> - Christian König wants to nack new uapi building further on this
>    dma_resv contract because it breaks amdgpu, quoting
>
>    "Yeah, and that is exactly the reason why I will NAK this uAPI change.
>
>    "This doesn't works for amdgpu at all for the reasons outlined above."
>
>    https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
>
>    Rejecting new development because your own driver is broken and
>    violates established cross driver contracts and uapi is really not
>    how upstream works.
>
> Now this patch will have a severe performance impact on anything that
> runs on multiple engines. So we can't just merge it outright, but need
> a bit of a plan:
>
> - amdgpu needs a proper uapi for handling implicit fencing. The funny
>    thing is that to do it correctly, implicit fencing must be treated
>    as a very strange IPC mechanism for transporting fences, where both
>    setting the fence and dependency intercepts must be handled
>    explicitly. Current best practice is a per-bo flag to indicate
>    writes, and a per-bo flag to skip implicit fencing in the CS
>    ioctl as a new chunk.
>
> - Since amdgpu has been shipping with broken behaviour we need an
>    opt-out flag from the butchered implicit fencing model to enable the
>    proper explicit implicit fencing model.
>
> - for kernel memory fences due to bo moves at least the i915 idea is
>    to use ttm_bo->moving. amdgpu probably needs the same.
>
> - since the current p2p dma-buf interface assumes the kernel memory
>    fence is in the exclusive dma_resv fence slot we need to add a new
>    fence slot for kernel fences, which must never be ignored. Since
>    currently only amdgpu supports this there's no real problem here
>    yet, until amdgpu gains a NO_IMPLICIT CS flag.
>
> - New userspace needs to ship in enough desktop distros so that users
>    won't notice the perf impact. I think we can ignore LTS distros who
>    upgrade their kernels but not their mesa3d snapshot.
>
> - Then when this is all in place we can merge this patch here.
>
> What is not a solution to this problem here is trying to make the
> dma_resv rules in the kernel more clever. The fundamental issue here
> is that the amdgpu CS uapi is the least expressive one across all
> drivers (only equalled by panfrost, which has an actual excuse) by not
> allowing any userspace control over how implicit sync is conducted.
>
> Until this is fixed it's completely pointless to make the kernel more
> clever to improve amdgpu, because all we're doing is papering over
> this uapi design issue. amdgpu needs to attain the status quo
> established by other drivers first, once that's achieved we can tackle
> the remaining issues in a consistent way across drivers.
>
> v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
> entirely missed.
>
> This is great because it means the amdgpu specific piece for proper
> implicit fence handling exists already, and has for a while. The
> only thing that's now missing is
> - fishing the implicit fences out of a shared object at the right time
> - setting the exclusive implicit fence slot at the right time.
>
> Jason has a patch series to fill that gap with a bunch of generic
> ioctl on the dma-buf fd:
>
> https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
>
> v3: Since Christian has fixed amdgpu now in
>
> commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Jun 9 13:51:36 2021 +0200
>
>      drm/amdgpu: rework dma_resv handling v3
>
> Use the audit covered in this commit message as the excuse to update
> the dma-buf docs around dma_buf.resv usage across drivers.
>
> Since dynamic importers have different rules also hammer these in
> again while we're at it.
>
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Rob Clark <robdclark@chromium.org>
> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> Cc: Michel Dänzer <michel@daenzer.net>
> Cc: Daniel Stone <daniels@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: linaro-mm-sig@lists.linaro.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   include/linux/dma-buf.h | 39 +++++++++++++++++++++++++++++++++++++++
>   1 file changed, 39 insertions(+)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 6d18b9e448b9..4807cefe81f5 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -388,6 +388,45 @@ struct dma_buf {
>   	 * @resv:
>   	 *
>   	 * Reservation object linked to this dma-buf.
> +	 *
> +	 * IMPLICIT SYNCHRONIZATION RULES:
> +	 *
> +	 * Drivers which support implicit synchronization of buffer access as
> +	 * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
> +	 * below rules.
> +	 *
> +	 * - Drivers should add a shared fence through
> +	 *   dma_resv_add_shared_fence() for anything the userspace API
> +	 *   considers a read access. This highly depends upon the API and
> +	 *   window system: E.g. OpenGL is generally implicitly synchronized on
> +	 *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
> +	 *   generally explicitly synchronized for everything, and window system
> +	 *   buffers have explicit API calls (which then need to make sure the
> +	 *   implicit fences stored here in @resv are updated correctly).
> +	 *
> +	 * - Similarly drivers should set the exclusive fence through
> +	 *   dma_resv_add_excl_fence() for anything the userspace API considers
> +	 *   write access.
> +	 *
> +	 * - Drivers may just always set the exclusive fence, since that only
> +	 *   causes unnecessary synchronization, but no correctness issues.
> +	 *
> +	 * - Some drivers only expose a synchronous userspace API with no
> +	 *   pipelining across drivers. These do not set any fences for their
> +	 *   access. An example here is v4l.
> +	 *
> +	 * DYNAMIC IMPORTER RULES:
> +	 *
> +	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
> +	 * additional constraints on how they set up fences:
> +	 *
> +	 * - Dynamic importers must obey the exclusive fence and wait for it to
> +	 *   signal before allowing access to the buffer's underlying storage
> +	 *   through the device.
> +	 *
> +	 * - Dynamic importers should set fences for any access that they can't
> +	 *   disable immediately from their @dma_buf_attach_ops.move_notify
> +	 *   callback.
>   	 */
>   	struct dma_resv *resv;
>   
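
To make the read/write rule above concrete, here is a minimal sketch of the slot selection in a render driver's submission path (attach_job_fence is a made-up name; only the dma_resv_* calls are real API, and real drivers reserve the shared slot much earlier in the CS path and handle the errors):

  #include <linux/dma-resv.h>
  #include <linux/dma-fence.h>

  /* Exclusive slot for what the userspace API considers a write, shared
   * slot for a read; @resv must already be locked. */
  static int attach_job_fence(struct dma_resv *resv,
  			    struct dma_fence *fence, bool write)
  {
  	int ret = 0;

  	dma_resv_assert_held(resv);

  	if (write) {
  		dma_resv_add_excl_fence(resv, fence);
  	} else {
  		ret = dma_resv_reserve_shared(resv, 1);
  		if (!ret)
  			dma_resv_add_shared_fence(resv, fence);
  	}

  	return ret;
  }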

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23  8:42     ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:42 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: David Airlie, Intel Graphics Development, Thomas Zimmermann,
	Daniel Vetter

On 22.06.21 18:55, Daniel Vetter wrote:
> Spotted while trying to convert panfrost to these.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>   drivers/gpu/drm/drm_gem.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index ba2e64ed8b47..68deb1de8235 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>    * @fence_array: array of dma_fence * for the job to block on.
>    * @fence: the dma_fence to add to the list of dependencies.
>    *
> + * This function consumes the reference for @fence both on success and error
> + * cases.
> + *

Oh, the latter is a bit ugly I think. But good to know.

Reviewed-by: Christian König <christian.koenig@amd.com>

>    * Returns:
>    * 0 on success, or an error on failing to expand the array.
>    */
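
Put differently, a hedged sketch of what this means for callers (the caller_* name is made up; drm_gem_fence_array_add() is the helper documented above): grab an extra reference if the fence is still needed afterwards, and never put the reference that was handed to the helper, even on failure.

  #include <drm/drm_gem.h>
  #include <linux/dma-fence.h>
  #include <linux/xarray.h>

  /* drm_gem_fence_array_add() takes over the reference passed in, on
   * success and on error alike, so there is no dma_fence_put() here. */
  static int caller_add_dependency(struct xarray *deps,
  				 struct dma_fence *fence)
  {
  	/* keep our own reference, hand a second one to the helper */
  	dma_fence_get(fence);
  	return drm_gem_fence_array_add(deps, fence);
  }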


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
@ 2021-06-23  8:42     ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23  8:42 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: David Airlie, Intel Graphics Development, Maxime Ripard,
	Thomas Zimmermann, Daniel Vetter, Lucas Stach

On 22.06.21 18:55, Daniel Vetter wrote:
> Spotted while trying to convert panfrost to these.
>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> ---
>   drivers/gpu/drm/drm_gem.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index ba2e64ed8b47..68deb1de8235 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>    * @fence_array: array of dma_fence * for the job to block on.
>    * @fence: the dma_fence to add to the list of dependencies.
>    *
> + * This function consumes the reference for @fence both on success and error
> + * cases.
> + *

Oh, the latter is a bit ugly I think. But good to know.

Reviewed-by: Christian König <christian.koenig@amd.com>

>    * Returns:
>    * 0 on success, or an error on failing to expand the array.
>    */

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23  9:45     ` Bas Nieuwenhuizen
  -1 siblings, 0 replies; 175+ messages in thread
From: Bas Nieuwenhuizen @ 2021-06-23  9:45 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, Daniel Stone, Christian König,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Michel Dänzer, Dennis Li, Deepak R Varma

On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>
> Implicit fencing done properly needs to treat the implicit fencing
> slots like a funny kind of IPC mailbox. In other words it needs to be
> managed explicitly. This is the only way it will mesh well with explicit
> fencing userspace like vk, and it's also the bare minimum required to
> be able to manage anything else that wants to use the same buffer on
> multiple engines in parallel, and still be able to share it through
> implicit sync.
>
> amdgpu completely lacks such an uapi. Fix this.
>
> Luckily the concept of ignoring implicit fences exists already, and
> takes care of all the complexities of making sure that non-optional
> fences (like bo moves) are not ignored. This support was added in
>
> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> Author: Andres Rodriguez <andresx7@gmail.com>
> Date:   Fri Sep 15 20:44:06 2017 -0400
>
>     drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>
> Unfortunately it's the wrong semantics, because it's a bo flag and
> disables implicit sync on an allocated buffer completely.
>
> We _do_ want implicit sync, but control it explicitly. For this we
> need a flag on the drm_file, so that a given userspace (like vulkan)
> can manage the implicit sync slots explicitly. The other side of the
> pipeline (compositor, other process or just different stage in a media
> pipeline in the same process) can then either do the same, or fully
> participate in the implicit sync as implemented by the kernel by
> default.
>
> By building on the existing flag for buffers we avoid any issues with
> opening up additional security concerns - anything this new flag here
> allows is already possible.
>
> All drivers which support this concept of a userspace-specific
> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> that turned out to be a bit too inflexible. See the discussion below,
> let's try to do a bit better for amdgpu.
>
> This alone only allows us to completely avoid any stalls due to
> implicit sync, it does not yet allow us to use implicit sync as a
> strange form of IPC for sync_file.
>
> For that we need two more pieces:
>
> - a way to get the current implicit sync fences out of a buffer. Could
>   be done in a driver ioctl, but everyone needs this, and generally a
>   dma-buf is involved anyway to establish the sharing. So an ioctl on
>   the dma-buf makes a ton more sense:
>
>   https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>
>   Current drivers in upstream solve this by having the opt-out flag
>   on their CS ioctl. This has the downside that very often the CS
>   which must actually stall for the implicit fence is run a while
>   after the implicit fence point was logically sampled per the api
>   spec (vk passes an explicit syncobj around for that afaiui), and so
>   results in oversync. Converting the implicit sync fences into a
>   snap-shot sync_file is actually accurate.
>
> - Similarly we need to be able to set the exclusive implicit fence.
>   Current drivers again do this with a CS ioctl flag, with again the
>   same problem that by the time the CS happens additional dependencies
>   have been added. An explicit ioctl to only insert a sync_file (while
>   respecting the rules for how exclusive and shared fence slots must
>   be updated in struct dma_resv) is much better. This is proposed here:
>
>   https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>
> These three pieces together allow userspace to fully control implicit
> fencing and remove all unnecessary stall points due to them.
>
> Well, as much as the implicit fencing model fundamentally allows:
> There is only one set of fences, you can only choose to sync against
> only writers (exclusive slot), or everyone. Hence suballocating
> multiple buffers or anything else like this is fundamentally not
> possible, and can only be fixed by a proper explicit fencing model.
>
> Aside from that caveat this model gets implicit fencing as close to
> explicit fencing semantics as possible:
>
> On the actual implementation I opted for a simple setparam ioctl, no
> locking (just atomic reads/writes) for simplicity. There is a nice
> flag parameter in the VM ioctl which we could use, except:
> - it's not checked, so userspace likely passes garbage
> - there's already a comment that userspace _does_ pass garbage in the
>   priority field
> So yeah unfortunately this flag parameter for setting vm flags is
> useless, and we need to hack up a new one.
>
> v2: Explain why a new SETPARAM (Jason)
>
> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> need both, or this doesn't do much.
>
> v4: Rebase over the amdgpu patch to always set the implicit sync
> fences.

So I think there is still a case missing in this implementation.
Consider these 3 cases

(format: a->b: b waits on a. Yes, I know arrows are hard)

explicit->explicit: This doesn't wait now, which is good
Implicit->explicit: This doesn't wait now, which is good
explicit->implicit : This still waits as the explicit submission still
adds shared fences and most things that set an exclusive fence for
implicit sync will hence wait on it.

This is probably good enough for what radv needs now but also sounds
like a risk wrt baking in new uapi behavior that we don't want to be
the end result.

Within AMDGPU this is probably solvable in two ways:

1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
2) Have an EXPLICIT fence owner that is used for explicit submissions
that is ignored by AMDGPU_SYNC_NE_OWNER.

But this doesn't solve cross-driver interactions here.

>
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Rob Clark <robdclark@chromium.org>
> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> Cc: Michel Dänzer <michel@daenzer.net>
> Cc: Daniel Stone <daniels@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: linaro-mm-sig@lists.linaro.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>  include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>  4 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 65df34c17264..c5386d13eb4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>         struct amdgpu_bo *gds;
>         struct amdgpu_bo *gws;
>         struct amdgpu_bo *oa;
> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>         int r;
>
>         INIT_LIST_HEAD(&p->validated);
> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>
>                 e->bo_va = amdgpu_vm_bo_find(vm, bo);
>
> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> +               if (bo->tbo.base.dma_buf &&
> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>                         e->chain = dma_fence_chain_alloc();
>                         if (!e->chain) {
>                                 r = -ENOMEM;
> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>  {
>         struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>         struct amdgpu_bo_list_entry *e;
> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>         int r;
>
>         list_for_each_entry(e, &p->validated, tv.head) {
> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>                 struct dma_resv *resv = bo->tbo.base.resv;
>                 enum amdgpu_sync_mode sync_mode;
>
> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>                         AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>                 r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>                                      &fpriv->vm);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index c080ba15ae77..f982626b5328 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>         return 0;
>  }
>
> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> +                         struct drm_file *filp)
> +{
> +       struct drm_amdgpu_setparam *setparam = data;
> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> +
> +       switch (setparam->param) {
> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> +               if (setparam->value)
> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> +               else
> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
>  const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>         DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>  };
>
>  static const struct drm_driver amdgpu_kms_driver = {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index ddb85a85cbba..0e8c440c6303 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>         bool                    bulk_moveable;
>         /* Flag to indicate if VM is used for compute */
>         bool                    is_compute_context;
> +       /*
> +        * Flag to indicate whether implicit sync should always be skipped on
> +        * this context. We do not care about races at all, userspace is allowed
> +        * to shoot itself with implicit sync to its fullest liking.
> +        */
> +       bool no_implicit_sync;
>  };
>
>  struct amdgpu_vm_manager {
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 0cbd1540aeac..9eae245c14d6 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -54,6 +54,7 @@ extern "C" {
>  #define DRM_AMDGPU_VM                  0x13
>  #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>  #define DRM_AMDGPU_SCHED               0x15
> +#define DRM_AMDGPU_SETPARAM            0x16
>
>  #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>  #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> @@ -71,6 +72,7 @@ extern "C" {
>  #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>  #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>
>  /**
>   * DOC: memory domains
> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>         struct drm_amdgpu_sched_in in;
>  };
>
> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> +
> +struct drm_amdgpu_setparam {
> +       /* AMDGPU_SETPARAM_* */
> +       __u32   param;
> +       __u32   value;
> +};
> +
>  /*
>   * This is not a reliable API and you should expect it to fail for any
>   * number of reasons and have fallback path that do not use userptr to
> --
> 2.32.0.rc2
>
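
For reference, userspace usage of this proposed (RFC, unmerged) uapi would look roughly like the sketch below, assuming the uapi header additions from this patch are available; the function name is made up and the header path may vary:

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <drm/amdgpu_drm.h>	/* needs the additions from this patch */

  /* Opt this drm_file out of kernel-managed implicit sync, so implicit
   * fences can instead be imported/exported explicitly via dma-buf. */
  static int amdgpu_set_no_implicit_sync(int drm_fd, int enable)
  {
  	struct drm_amdgpu_setparam sp = {
  		.param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
  		.value = enable ? 1 : 0,
  	};

  	return ioctl(drm_fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp);
  }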

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23  9:45     ` Bas Nieuwenhuizen
  0 siblings, 0 replies; 175+ messages in thread
From: Bas Nieuwenhuizen @ 2021-06-23  9:45 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, Daniel Stone, Christian König,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Daniel Vetter,
	Alex Deucher, mesa-dev, Dave Airlie, Michel Dänzer,
	Dennis Li, Deepak R Varma

On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>
> Implicit fencing done properly needs to treat the implicit fencing
> slots like a funny kind of IPC mailbox. In other words it needs to be
> explicitly. This is the only way it will mesh well with explicit
> fencing userspace like vk, and it's also the bare minimum required to
> be able to manage anything else that wants to use the same buffer on
> multiple engines in parallel, and still be able to share it through
> implicit sync.
>
> amdgpu completely lacks such an uapi. Fix this.
>
> Luckily the concept of ignoring implicit fences exists already, and
> takes care of all the complexities of making sure that non-optional
> fences (like bo moves) are not ignored. This support was added in
>
> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> Author: Andres Rodriguez <andresx7@gmail.com>
> Date:   Fri Sep 15 20:44:06 2017 -0400
>
>     drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>
> Unfortunately it's the wrong semantics, because it's a bo flag and
> disables implicit sync on an allocated buffer completely.
>
> We _do_ want implicit sync, but control it explicitly. For this we
> need a flag on the drm_file, so that a given userspace (like vulkan)
> can manage the implicit sync slots explicitly. The other side of the
> pipeline (compositor, other process or just different stage in a media
> pipeline in the same process) can then either do the same, or fully
> participate in the implicit sync as implemented by the kernel by
> default.
>
> By building on the existing flag for buffers we avoid any issues with
> opening up additional security concerns - anything this new flag here
> allows is already.
>
> All drivers which support this concept of a userspace-specific
> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> that turned out to be a bit too inflexible. See the discussion below,
> let's try to do a bit better for amdgpu.
>
> This alone only allows us to completely avoid any stalls due to
> implicit sync, it does not yet allow us to use implicit sync as a
> strange form of IPC for sync_file.
>
> For that we need two more pieces:
>
> - a way to get the current implicit sync fences out of a buffer. Could
>   be done in a driver ioctl, but everyone needs this, and generally a
>   dma-buf is involved anyway to establish the sharing. So an ioctl on
>   the dma-buf makes a ton more sense:
>
>   https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>
>   Current drivers in upstream solve this by having the opt-out flag
>   on their CS ioctl. This has the downside that very often the CS
>   which must actually stall for the implicit fence is run a while
>   after the implicit fence point was logically sampled per the api
>   spec (vk passes an explicit syncobj around for that afaiui), and so
>   results in oversync. Converting the implicit sync fences into a
>   snap-shot sync_file is actually accurate.
>
> - Similarly we need to be able to set the exclusive implicit fence.
>   Current drivers again do this with a CS ioctl flag, with again the
>   same problems that the time the CS happens additional dependencies
>   have been added. An explicit ioctl to only insert a sync_file (while
>   respecting the rules for how exclusive and shared fence slots must
>   be updated in struct dma_resv) is much better. This is proposed here:
>
>   https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>
> These three pieces together allow userspace to fully control implicit
> fencing and remove all unnecessary stall points due to them.
>
> Well, as much as the implicit fencing model fundamentally allows:
> There is only one set of fences, you can only choose to sync against
> only writers (exclusive slot), or everyone. Hence suballocating
> multiple buffers or anything else like this is fundamentally not
> possible, and can only be fixed by a proper explicit fencing model.
>
> Aside from that caveat this model gets implicit fencing as closely to
> explicit fencing semantics as possible:
>
> On the actual implementation I opted for a simple setparam ioctl, no
> locking (just atomic reads/writes) for simplicity. There is a nice
> flag parameter in the VM ioctl which we could use, except:
> - it's not checked, so userspace likely passes garbage
> - there's already a comment that userspace _does_ pass garbage in the
>   priority field
> So yeah unfortunately this flag parameter for setting vm flags is
> useless, and we need to hack up a new one.
>
> v2: Explain why a new SETPARAM (Jason)
>
> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> need both, or this doesn't do much.
>
> v4: Rebase over the amdgpu patch to always set the implicit sync
> fences.

So I think there is still a case missing in this implementation.
Consider these 3 cases

(format: a->b: b waits on a. Yes, I know arrows are hard)

explicit->explicit: This doesn't wait now, which is good
Implicit->explicit: This doesn't wait now, which is good
explicit->implicit : This still waits as the explicit submission still
adds shared fences and most things that set an exclusive fence for
implicit sync will hence wait on it.

This is probably good enough for what radv needs now but also sounds
like a risk wrt baking in new uapi behavior that we don't want to be
the end result.

Within AMDGPU this is probably solvable in two ways:

1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
2) Have an EXPLICIT fence owner that is used for explicit submissions
that is ignored by AMDGPU_SYNC_NE_OWNER.

But this doesn't solve cross-driver interactions here.

>
> Cc: mesa-dev@lists.freedesktop.org
> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> Cc: Dave Airlie <airlied@gmail.com>
> Cc: Rob Clark <robdclark@chromium.org>
> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> Cc: Michel Dänzer <michel@daenzer.net>
> Cc: Daniel Stone <daniels@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: Dennis Li <Dennis.Li@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: linaro-mm-sig@lists.linaro.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>  include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>  4 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index 65df34c17264..c5386d13eb4a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>         struct amdgpu_bo *gds;
>         struct amdgpu_bo *gws;
>         struct amdgpu_bo *oa;
> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>         int r;
>
>         INIT_LIST_HEAD(&p->validated);
> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>
>                 e->bo_va = amdgpu_vm_bo_find(vm, bo);
>
> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> +               if (bo->tbo.base.dma_buf &&
> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>                         e->chain = dma_fence_chain_alloc();
>                         if (!e->chain) {
>                                 r = -ENOMEM;
> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>  {
>         struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>         struct amdgpu_bo_list_entry *e;
> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>         int r;
>
>         list_for_each_entry(e, &p->validated, tv.head) {
> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>                 struct dma_resv *resv = bo->tbo.base.resv;
>                 enum amdgpu_sync_mode sync_mode;
>
> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>                         AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>                 r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>                                      &fpriv->vm);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> index c080ba15ae77..f982626b5328 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>         return 0;
>  }
>
> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> +                         struct drm_file *filp)
> +{
> +       struct drm_amdgpu_setparam *setparam = data;
> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> +
> +       switch (setparam->param) {
> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> +               if (setparam->value)
> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> +               else
> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> +               break;
> +       default:
> +               return -EINVAL;
> +       }
> +
> +       return 0;
> +}
> +
>  const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>         DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>  };
>
>  static const struct drm_driver amdgpu_kms_driver = {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index ddb85a85cbba..0e8c440c6303 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>         bool                    bulk_moveable;
>         /* Flag to indicate if VM is used for compute */
>         bool                    is_compute_context;
> +       /*
> +        * Flag to indicate whether implicit sync should always be skipped on
> +        * this context. We do not care about races at all, userspace is allowed
> +        * to shoot itself with implicit sync to its fullest liking.
> +        */
> +       bool no_implicit_sync;
>  };
>
>  struct amdgpu_vm_manager {
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index 0cbd1540aeac..9eae245c14d6 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -54,6 +54,7 @@ extern "C" {
>  #define DRM_AMDGPU_VM                  0x13
>  #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>  #define DRM_AMDGPU_SCHED               0x15
> +#define DRM_AMDGPU_SETPARAM            0x16
>
>  #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>  #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> @@ -71,6 +72,7 @@ extern "C" {
>  #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>  #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>
>  /**
>   * DOC: memory domains
> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>         struct drm_amdgpu_sched_in in;
>  };
>
> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> +
> +struct drm_amdgpu_setparam {
> +       /* AMDGPU_SETPARAM_* */
> +       __u32   param;
> +       __u32   value;
> +};
> +
>  /*
>   * This is not a reliable API and you should expect it to fail for any
>   * number of reasons and have fallback path that do not use userptr to
> --
> 2.32.0.rc2
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23  9:45     ` [Intel-gfx] " Bas Nieuwenhuizen
@ 2021-06-23 12:18       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 12:18 UTC (permalink / raw)
  To: Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Christian König,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
<bas@basnieuwenhuizen.nl> wrote:
>
> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >
> > Implicit fencing done properly needs to treat the implicit fencing
> > slots like a funny kind of IPC mailbox. In other words it needs to be
> > explicitly. This is the only way it will mesh well with explicit
> > fencing userspace like vk, and it's also the bare minimum required to
> > be able to manage anything else that wants to use the same buffer on
> > multiple engines in parallel, and still be able to share it through
> > implicit sync.
> >
> > amdgpu completely lacks such an uapi. Fix this.
> >
> > Luckily the concept of ignoring implicit fences exists already, and
> > takes care of all the complexities of making sure that non-optional
> > fences (like bo moves) are not ignored. This support was added in
> >
> > commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > Author: Andres Rodriguez <andresx7@gmail.com>
> > Date:   Fri Sep 15 20:44:06 2017 -0400
> >
> >     drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >
> > Unfortunately it's the wrong semantics, because it's a bo flag and
> > disables implicit sync on an allocated buffer completely.
> >
> > We _do_ want implicit sync, but control it explicitly. For this we
> > need a flag on the drm_file, so that a given userspace (like vulkan)
> > can manage the implicit sync slots explicitly. The other side of the
> > pipeline (compositor, other process or just different stage in a media
> > pipeline in the same process) can then either do the same, or fully
> > participate in the implicit sync as implemented by the kernel by
> > default.
> >
> > By building on the existing flag for buffers we avoid any issues with
> > opening up additional security concerns - anything this new flag here
> > allows is already.
> >
> > All drivers which support this concept of a userspace-specific
> > opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > that turned out to be a bit too inflexible. See the discussion below,
> > let's try to do a bit better for amdgpu.
> >
> > This alone only allows us to completely avoid any stalls due to
> > implicit sync, it does not yet allow us to use implicit sync as a
> > strange form of IPC for sync_file.
> >
> > For that we need two more pieces:
> >
> > - a way to get the current implicit sync fences out of a buffer. Could
> >   be done in a driver ioctl, but everyone needs this, and generally a
> >   dma-buf is involved anyway to establish the sharing. So an ioctl on
> >   the dma-buf makes a ton more sense:
> >
> >   https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >
> >   Current drivers in upstream solve this by having the opt-out flag
> >   on their CS ioctl. This has the downside that very often the CS
> >   which must actually stall for the implicit fence is run a while
> >   after the implicit fence point was logically sampled per the api
> >   spec (vk passes an explicit syncobj around for that afaiui), and so
> >   results in oversync. Converting the implicit sync fences into a
> >   snap-shot sync_file is actually accurate.
> >
> > - Similarly we need to be able to set the exclusive implicit fence.
> >   Current drivers again do this with a CS ioctl flag, with again the
> >   same problems that the time the CS happens additional dependencies
> >   have been added. An explicit ioctl to only insert a sync_file (while
> >   respecting the rules for how exclusive and shared fence slots must
> >   be updated in struct dma_resv) is much better. This is proposed here:
> >
> >   https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >
> > These three pieces together allow userspace to fully control implicit
> > fencing and remove all unnecessary stall points due to them.
> >
> > Well, as much as the implicit fencing model fundamentally allows:
> > There is only one set of fences, you can only choose to sync against
> > only writers (exclusive slot), or everyone. Hence suballocating
> > multiple buffers or anything else like this is fundamentally not
> > possible, and can only be fixed by a proper explicit fencing model.
> >
> > Aside from that caveat this model gets implicit fencing as closely to
> > explicit fencing semantics as possible:
> >
> > On the actual implementation I opted for a simple setparam ioctl, no
> > locking (just atomic reads/writes) for simplicity. There is a nice
> > flag parameter in the VM ioctl which we could use, except:
> > - it's not checked, so userspace likely passes garbage
> > - there's already a comment that userspace _does_ pass garbage in the
> >   priority field
> > So yeah unfortunately this flag parameter for setting vm flags is
> > useless, and we need to hack up a new one.
> >
> > v2: Explain why a new SETPARAM (Jason)
> >
> > v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> > need both, or this doesn't do much.
> >
> > v4: Rebase over the amdgpu patch to always set the implicit sync
> > fences.
>
> So I think there is still a case missing in this implementation.
> Consider these 3 cases
>
> (format: a->b: b waits on a. Yes, I know arrows are hard)
>
> explicit->explicit: This doesn't wait now, which is good
> Implicit->explicit: This doesn't wait now, which is good
> explicit->implicit : This still waits as the explicit submission still
> adds shared fences and most things that set an exclusive fence for
> implicit sync will hence wait on it.
>
> This is probably good enough for what radv needs now but also sounds
> like a risk wrt baking in new uapi behavior that we don't want to be
> the end result.
>
> Within AMDGPU this is probably solvable in two ways:
>
> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.

I'm not sure that works. I think the right fix is that radeonsi also
switches to this model, with maybe a per-bo CS flag to set indicate
write access, to cut down on the number of ioctls that are needed
otherwise on shared buffers. This per-bo flag would essentially select
between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
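
Concretely (a purely hypothetical sketch of that idea, nothing below exists in any patch; only the enum values are real), the sync mode selection in amdgpu_cs_sync_rings() would then look at a per-BO CS flag rather than, or in addition to, the per-drm_file setparam:

  #include "amdgpu_sync.h"	/* for enum amdgpu_sync_mode */

  /* Hypothetical per-buffer selection: an explicit-sync CS flag on the BO
   * picks AMDGPU_SYNC_EXPLICIT, everything else keeps the current
   * AMDGPU_SYNC_NE_OWNER behaviour. */
  static enum amdgpu_sync_mode pick_sync_mode(bool bo_wants_explicit_sync)
  {
  	return bo_wants_explicit_sync ? AMDGPU_SYNC_EXPLICIT :
  					AMDGPU_SYNC_NE_OWNER;
  }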

The current amdgpu uapi just doesn't allow any other model without an
explicit opt-in. So current implicit sync userspace just has to
oversync, there's not much choice.

> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> that is ignored by AMDGPU_SYNC_NE_OWNER.
>
> But this doesn't solve cross-driver interactions here.

Yeah cross-driver is still entirely unsolved, because
amdgpu_bo_explicit_sync() on the bo didn't solve that either.
-Daniel

>
> >
> > Cc: mesa-dev@lists.freedesktop.org
> > Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Rob Clark <robdclark@chromium.org>
> > Cc: Kristian H. Kristensen <hoegsberg@google.com>
> > Cc: Michel Dänzer <michel@daenzer.net>
> > Cc: Daniel Stone <daniels@collabora.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: Dennis Li <Dennis.Li@amd.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: linaro-mm-sig@lists.linaro.org
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >  include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >  4 files changed, 42 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 65df34c17264..c5386d13eb4a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >         struct amdgpu_bo *gds;
> >         struct amdgpu_bo *gws;
> >         struct amdgpu_bo *oa;
> > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >         int r;
> >
> >         INIT_LIST_HEAD(&p->validated);
> > @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >
> >                 e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >
> > -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> > +               if (bo->tbo.base.dma_buf &&
> > +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >                         e->chain = dma_fence_chain_alloc();
> >                         if (!e->chain) {
> >                                 r = -ENOMEM;
> > @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >  {
> >         struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >         struct amdgpu_bo_list_entry *e;
> > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >         int r;
> >
> >         list_for_each_entry(e, &p->validated, tv.head) {
> > @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >                 struct dma_resv *resv = bo->tbo.base.resv;
> >                 enum amdgpu_sync_mode sync_mode;
> >
> > -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> > +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >                         AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >                 r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >                                      &fpriv->vm);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index c080ba15ae77..f982626b5328 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >         return 0;
> >  }
> >
> > +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> > +                         struct drm_file *filp)
> > +{
> > +       struct drm_amdgpu_setparam *setparam = data;
> > +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> > +
> > +       switch (setparam->param) {
> > +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> > +               if (setparam->value)
> > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> > +               else
> > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> > +               break;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >         DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >  };
> >
> >  static const struct drm_driver amdgpu_kms_driver = {
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > index ddb85a85cbba..0e8c440c6303 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >         bool                    bulk_moveable;
> >         /* Flag to indicate if VM is used for compute */
> >         bool                    is_compute_context;
> > +       /*
> > +        * Flag to indicate whether implicit sync should always be skipped on
> > +        * this context. We do not care about races at all, userspace is allowed
> > +        * to shoot itself with implicit sync to its fullest liking.
> > +        */
> > +       bool no_implicit_sync;
> >  };
> >
> >  struct amdgpu_vm_manager {
> > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > index 0cbd1540aeac..9eae245c14d6 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -54,6 +54,7 @@ extern "C" {
> >  #define DRM_AMDGPU_VM                  0x13
> >  #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >  #define DRM_AMDGPU_SCHED               0x15
> > +#define DRM_AMDGPU_SETPARAM            0x16
> >
> >  #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >  #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > @@ -71,6 +72,7 @@ extern "C" {
> >  #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >  #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >
> >  /**
> >   * DOC: memory domains
> > @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >         struct drm_amdgpu_sched_in in;
> >  };
> >
> > +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> > +
> > +struct drm_amdgpu_setparam {
> > +       /* AMDGPU_SETPARAM_* */
> > +       __u32   param;
> > +       __u32   value;
> > +};
> > +
> >  /*
> >   * This is not a reliable API and you should expect it to fail for any
> >   * number of reasons and have fallback path that do not use userptr to
> > --
> > 2.32.0.rc2
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 12:18       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 12:18 UTC (permalink / raw)
  To: Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Christian König,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Daniel Vetter,
	Alex Deucher, mesa-dev, Dave Airlie, Michel Dänzer,
	Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
<bas@basnieuwenhuizen.nl> wrote:
>
> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >
> > Implicit fencing done properly needs to treat the implicit fencing
> > slots like a funny kind of IPC mailbox. In other words it needs to be
> > explicit. This is the only way it will mesh well with explicit
> > fencing userspace like vk, and it's also the bare minimum required to
> > be able to manage anything else that wants to use the same buffer on
> > multiple engines in parallel, and still be able to share it through
> > implicit sync.
> >
> > amdgpu completely lacks such an uapi. Fix this.
> >
> > Luckily the concept of ignoring implicit fences exists already, and
> > takes care of all the complexities of making sure that non-optional
> > fences (like bo moves) are not ignored. This support was added in
> >
> > commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > Author: Andres Rodriguez <andresx7@gmail.com>
> > Date:   Fri Sep 15 20:44:06 2017 -0400
> >
> >     drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >
> > Unfortunately it's the wrong semantics, because it's a bo flag and
> > disables implicit sync on an allocated buffer completely.
> >
> > We _do_ want implicit sync, but control it explicitly. For this we
> > need a flag on the drm_file, so that a given userspace (like vulkan)
> > can manage the implicit sync slots explicitly. The other side of the
> > pipeline (compositor, other process or just different stage in a media
> > pipeline in the same process) can then either do the same, or fully
> > participate in the implicit sync as implemented by the kernel by
> > default.
> >
> > By building on the existing flag for buffers we avoid any issues with
> > opening up additional security concerns - anything this new flag here
> > allows is already possible.
> >
> > All drivers which support this concept of a userspace-specific
> > opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > that turned out to be a bit too inflexible. See the discussion below,
> > let's try to do a bit better for amdgpu.
> >
> > This alone only allows us to completely avoid any stalls due to
> > implicit sync, it does not yet allow us to use implicit sync as a
> > strange form of IPC for sync_file.
> >
> > For that we need two more pieces:
> >
> > - a way to get the current implicit sync fences out of a buffer. Could
> >   be done in a driver ioctl, but everyone needs this, and generally a
> >   dma-buf is involved anyway to establish the sharing. So an ioctl on
> >   the dma-buf makes a ton more sense:
> >
> >   https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >
> >   Current drivers in upstream solve this by having the opt-out flag
> >   on their CS ioctl. This has the downside that very often the CS
> >   which must actually stall for the implicit fence is run a while
> >   after the implicit fence point was logically sampled per the api
> >   spec (vk passes an explicit syncobj around for that afaiui), and so
> >   results in oversync. Converting the implicit sync fences into a
> >   snap-shot sync_file is actually accurate.
> >
> > - Similarly we need to be able to set the exclusive implicit fence.
> >   Current drivers again do this with a CS ioctl flag, with again the
> >   same problem that by the time the CS happens additional dependencies
> >   have been added. An explicit ioctl to only insert a sync_file (while
> >   respecting the rules for how exclusive and shared fence slots must
> >   be updated in struct dma_resv) is much better. This is proposed here:
> >
> >   https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >
> > These three pieces together allow userspace to fully control implicit
> > fencing and remove all unnecessary stall points due to them.
> >
> > Well, as much as the implicit fencing model fundamentally allows:
> > There is only one set of fences, you can only choose to sync against
> > only writers (exclusive slot), or everyone. Hence suballocating
> > multiple buffers or anything else like this is fundamentally not
> > possible, and can only be fixed by a proper explicit fencing model.
> >
> > Aside from that caveat this model gets implicit fencing as close to
> > explicit fencing semantics as possible:
> >
> > On the actual implementation I opted for a simple setparam ioctl, no
> > locking (just atomic reads/writes) for simplicity. There is a nice
> > flag parameter in the VM ioctl which we could use, except:
> > - it's not checked, so userspace likely passes garbage
> > - there's already a comment that userspace _does_ pass garbage in the
> >   priority field
> > So yeah unfortunately this flag parameter for setting vm flags is
> > useless, and we need to hack up a new one.
> >
> > v2: Explain why a new SETPARAM (Jason)
> >
> > v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> > need both, or this doesn't do much.
> >
> > v4: Rebase over the amdgpu patch to always set the implicit sync
> > fences.
>
> So I think there is still a case missing in this implementation.
> Consider these 3 cases
>
> (format: a->b: b waits on a. Yes, I know arrows are hard)
>
> explicit->explicit: This doesn't wait now, which is good
> Implicit->explicit: This doesn't wait now, which is good
> explicit->implicit : This still waits as the explicit submission still
> adds shared fences and most things that set an exclusive fence for
> implicit sync will hence wait on it.
>
> This is probably good enough for what radv needs now but also sounds
> like a risk wrt baking in new uapi behavior that we don't want to be
> the end result.
>
> Within AMDGPU this is probably solvable in two ways:
>
> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.

I'm not sure that works. I think the right fix is that radeonsi also
switches to this model, with maybe a per-bo CS flag to indicate
write access, to cut down on the number of ioctls that are needed
otherwise on shared buffers. This per-bo flag would essentially select
between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
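
Very rough sketch of what I mean, entirely untested and on top of this
RFC - note that the per-entry AMDGPU_BO_LIST_FLAG_IMPLICIT_SYNC flag is
made up purely for illustration, nothing like it exists in the bo list
uapi yet:

	list_for_each_entry(e, &p->validated, tv.head) {
		struct amdgpu_bo *bo = ttm_to_amdgpu_bo(e->tv.bo);
		struct dma_resv *resv = bo->tbo.base.resv;
		enum amdgpu_sync_mode sync_mode;
		/* hypothetical per-bo opt back into implicit sync */
		bool bo_implicit = e->flags & AMDGPU_BO_LIST_FLAG_IMPLICIT_SYNC;

		if (no_implicit_sync && !bo_implicit)
			sync_mode = AMDGPU_SYNC_EXPLICIT;
		else
			sync_mode = amdgpu_bo_explicit_sync(bo) ?
				AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;

		r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
				     &fpriv->vm);
		if (r)
			return r;
	}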

The current amdgpu uapi just doesn't allow any other model without an
explicit opt-in. So current implicit sync userspace just has to
oversync, there's not much choice.

> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> that is ignored by AMDGPU_SYNC_NE_OWNER.
>
> But this doesn't solve cross-driver interactions here.

Yeah cross-driver is still entirely unsolved, because
amdgpu_bo_explicit_sync() on the bo didn't solve that either.
-Daniel

>
> >
> > Cc: mesa-dev@lists.freedesktop.org
> > Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > Cc: Dave Airlie <airlied@gmail.com>
> > Cc: Rob Clark <robdclark@chromium.org>
> > Cc: Kristian H. Kristensen <hoegsberg@google.com>
> > Cc: Michel Dänzer <michel@daenzer.net>
> > Cc: Daniel Stone <daniels@collabora.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Alex Deucher <alexander.deucher@amd.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > Cc: Chen Li <chenli@uniontech.com>
> > Cc: Kevin Wang <kevin1.wang@amd.com>
> > Cc: Dennis Li <Dennis.Li@amd.com>
> > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > Cc: linaro-mm-sig@lists.linaro.org
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >  include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >  4 files changed, 42 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > index 65df34c17264..c5386d13eb4a 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >         struct amdgpu_bo *gds;
> >         struct amdgpu_bo *gws;
> >         struct amdgpu_bo *oa;
> > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >         int r;
> >
> >         INIT_LIST_HEAD(&p->validated);
> > @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >
> >                 e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >
> > -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> > +               if (bo->tbo.base.dma_buf &&
> > +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >                         e->chain = dma_fence_chain_alloc();
> >                         if (!e->chain) {
> >                                 r = -ENOMEM;
> > @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >  {
> >         struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >         struct amdgpu_bo_list_entry *e;
> > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >         int r;
> >
> >         list_for_each_entry(e, &p->validated, tv.head) {
> > @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >                 struct dma_resv *resv = bo->tbo.base.resv;
> >                 enum amdgpu_sync_mode sync_mode;
> >
> > -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> > +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >                         AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >                 r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >                                      &fpriv->vm);
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > index c080ba15ae77..f982626b5328 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >         return 0;
> >  }
> >
> > +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> > +                         struct drm_file *filp)
> > +{
> > +       struct drm_amdgpu_setparam *setparam = data;
> > +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> > +
> > +       switch (setparam->param) {
> > +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> > +               if (setparam->value)
> > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> > +               else
> > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> > +               break;
> > +       default:
> > +               return -EINVAL;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> >  const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >         DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >         DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >  };
> >
> >  static const struct drm_driver amdgpu_kms_driver = {
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > index ddb85a85cbba..0e8c440c6303 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >         bool                    bulk_moveable;
> >         /* Flag to indicate if VM is used for compute */
> >         bool                    is_compute_context;
> > +       /*
> > +        * Flag to indicate whether implicit sync should always be skipped on
> > +        * this context. We do not care about races at all, userspace is allowed
> > +        * to shoot itself with implicit sync to its fullest liking.
> > +        */
> > +       bool no_implicit_sync;
> >  };
> >
> >  struct amdgpu_vm_manager {
> > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > index 0cbd1540aeac..9eae245c14d6 100644
> > --- a/include/uapi/drm/amdgpu_drm.h
> > +++ b/include/uapi/drm/amdgpu_drm.h
> > @@ -54,6 +54,7 @@ extern "C" {
> >  #define DRM_AMDGPU_VM                  0x13
> >  #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >  #define DRM_AMDGPU_SCHED               0x15
> > +#define DRM_AMDGPU_SETPARAM            0x16
> >
> >  #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >  #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > @@ -71,6 +72,7 @@ extern "C" {
> >  #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >  #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >  #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >
> >  /**
> >   * DOC: memory domains
> > @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >         struct drm_amdgpu_sched_in in;
> >  };
> >
> > +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> > +
> > +struct drm_amdgpu_setparam {
> > +       /* AMDGPU_SETPARAM_* */
> > +       __u32   param;
> > +       __u32   value;
> > +};
> > +
> >  /*
> >   * This is not a reliable API and you should expect it to fail for any
> >   * number of reasons and have fallback path that do not use userptr to
> > --
> > 2.32.0.rc2
> >



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 12:18       ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 12:59         ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 12:59 UTC (permalink / raw)
  To: Daniel Vetter, Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Intel Graphics Development, Kevin Wang,
	DRI Development, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Daniel Vetter,
	Alex Deucher, mesa-dev, Michel Dänzer, Dennis Li,
	Deepak R Varma

Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> <bas@basnieuwenhuizen.nl> wrote:
>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>
>>> Implicit fencing done properly needs to treat the implicit fencing
>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>> explicit. This is the only way it will mesh well with explicit
>>> fencing userspace like vk, and it's also the bare minimum required to
>>> be able to manage anything else that wants to use the same buffer on
>>> multiple engines in parallel, and still be able to share it through
>>> implicit sync.
>>>
>>> amdgpu completely lacks such an uapi. Fix this.
>>>
>>> Luckily the concept of ignoring implicit fences exists already, and
>>> takes care of all the complexities of making sure that non-optional
>>> fences (like bo moves) are not ignored. This support was added in
>>>
>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>
>>>      drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>
>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>> disables implicit sync on an allocated buffer completely.
>>>
>>> We _do_ want implicit sync, but control it explicitly. For this we
>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>> can manage the implicit sync slots explicitly. The other side of the
>>> pipeline (compositor, other process or just different stage in a media
>>> pipeline in the same process) can then either do the same, or fully
>>> participate in the implicit sync as implemented by the kernel by
>>> default.
>>>
>>> By building on the existing flag for buffers we avoid any issues with
>>> opening up additional security concerns - anything this new flag here
>>> allows is already possible.
>>>
>>> All drivers which support this concept of a userspace-specific
>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>> that turned out to be a bit too inflexible. See the discussion below,
>>> let's try to do a bit better for amdgpu.
>>>
>>> This alone only allows us to completely avoid any stalls due to
>>> implicit sync, it does not yet allow us to use implicit sync as a
>>> strange form of IPC for sync_file.
>>>
>>> For that we need two more pieces:
>>>
>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>    be done in a driver ioctl, but everyone needs this, and generally a
>>>    dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>    the dma-buf makes a ton more sense:
>>>
>>>    https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>
>>>    Current drivers in upstream solve this by having the opt-out flag
>>>    on their CS ioctl. This has the downside that very often the CS
>>>    which must actually stall for the implicit fence is run a while
>>>    after the implicit fence point was logically sampled per the api
>>>    spec (vk passes an explicit syncobj around for that afaiui), and so
>>>    results in oversync. Converting the implicit sync fences into a
>>>    snap-shot sync_file is actually accurate.
>>>
>>> - Similarly we need to be able to set the exclusive implicit fence.
>>>    Current drivers again do this with a CS ioctl flag, with again the
>>>    same problem that by the time the CS happens additional dependencies
>>>    have been added. An explicit ioctl to only insert a sync_file (while
>>>    respecting the rules for how exclusive and shared fence slots must
>>>    be updated in struct dma_resv) is much better. This is proposed here:
>>>
>>>    https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>
>>> These three pieces together allow userspace to fully control implicit
>>> fencing and remove all unnecessary stall points due to them.
>>>
>>> Well, as much as the implicit fencing model fundamentally allows:
>>> There is only one set of fences, you can only choose to sync against
>>> only writers (exclusive slot), or everyone. Hence suballocating
>>> multiple buffers or anything else like this is fundamentally not
>>> possible, and can only be fixed by a proper explicit fencing model.
>>>
>>> Aside from that caveat this model gets implicit fencing as close to
>>> explicit fencing semantics as possible:
>>>
>>> On the actual implementation I opted for a simple setparam ioctl, no
>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>> flag parameter in the VM ioctl which we could use, except:
>>> - it's not checked, so userspace likely passes garbage
>>> - there's already a comment that userspace _does_ pass garbage in the
>>>    priority field
>>> So yeah unfortunately this flag parameter for setting vm flags is
>>> useless, and we need to hack up a new one.
>>>
>>> v2: Explain why a new SETPARAM (Jason)
>>>
>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>> need both, or this doesn't do much.
>>>
>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>> fences.
>> So I think there is still a case missing in this implementation.
>> Consider these 3 cases
>>
>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>
>> explicit->explicit: This doesn't wait now, which is good
>> Implicit->explicit: This doesn't wait now, which is good
>> explicit->implicit : This still waits as the explicit submission still
>> adds shared fences and most things that set an exclusive fence for
>> implicit sync will hence wait on it.
>>
>> This is probably good enough for what radv needs now but also sounds
>> like a risk wrt baking in new uapi behavior that we don't want to be
>> the end result.
>>
>> Within AMDGPU this is probably solvable in two ways:
>>
>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> I'm not sure that works. I think the right fix is that radeonsi also
> switches to this model, with maybe a per-bo CS flag to indicate
> write access, to cut down on the number of ioctls that are needed
> otherwise on shared buffers. This per-bo flag would essentially select
> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.

Yeah, but I'm still not entirely sure why that approach isn't sufficient?

Problem with the per context or per vm flag is that you then don't get 
any implicit synchronization any more when another process starts using 
the buffer.

> The current amdgpu uapi just doesn't allow any other model without an
> explicit opt-in. So current implicit sync userspace just has to
> oversync, there's not much choice.
>
>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>
>> But this doesn't solve cross-driver interactions here.
> Yeah cross-driver is still entirely unsolved, because
> amdgpu_bo_explicit_sync() on the bo didn't solve that either.

Hui? You have lost me. Why is that still unsolved?

Regards,
Christian.

> -Daniel
>
>>> Cc: mesa-dev@lists.freedesktop.org
>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>> Cc: Dave Airlie <airlied@gmail.com>
>>> Cc: Rob Clark <robdclark@chromium.org>
>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>> Cc: Michel Dänzer <michel@daenzer.net>
>>> Cc: Daniel Stone <daniels@collabora.com>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>> Cc: Chen Li <chenli@uniontech.com>
>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>   include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>   4 files changed, 42 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 65df34c17264..c5386d13eb4a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>          struct amdgpu_bo *gds;
>>>          struct amdgpu_bo *gws;
>>>          struct amdgpu_bo *oa;
>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>          int r;
>>>
>>>          INIT_LIST_HEAD(&p->validated);
>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>
>>>                  e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>
>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>> +               if (bo->tbo.base.dma_buf &&
>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>                          e->chain = dma_fence_chain_alloc();
>>>                          if (!e->chain) {
>>>                                  r = -ENOMEM;
>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>   {
>>>          struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>          struct amdgpu_bo_list_entry *e;
>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>          int r;
>>>
>>>          list_for_each_entry(e, &p->validated, tv.head) {
>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>                  struct dma_resv *resv = bo->tbo.base.resv;
>>>                  enum amdgpu_sync_mode sync_mode;
>>>
>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>                          AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>                  r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>                                       &fpriv->vm);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index c080ba15ae77..f982626b5328 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>          return 0;
>>>   }
>>>
>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>> +                         struct drm_file *filp)
>>> +{
>>> +       struct drm_amdgpu_setparam *setparam = data;
>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>> +
>>> +       switch (setparam->param) {
>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>> +               if (setparam->value)
>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>> +               else
>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>> +               break;
>>> +       default:
>>> +               return -EINVAL;
>>> +       }
>>> +
>>> +       return 0;
>>> +}
>>> +
>>>   const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>   };
>>>
>>>   static const struct drm_driver amdgpu_kms_driver = {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index ddb85a85cbba..0e8c440c6303 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>          bool                    bulk_moveable;
>>>          /* Flag to indicate if VM is used for compute */
>>>          bool                    is_compute_context;
>>> +       /*
>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>> +        * this context. We do not care about races at all, userspace is allowed
>>> +        * to shoot itself with implicit sync to its fullest liking.
>>> +        */
>>> +       bool no_implicit_sync;
>>>   };
>>>
>>>   struct amdgpu_vm_manager {
>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>> index 0cbd1540aeac..9eae245c14d6 100644
>>> --- a/include/uapi/drm/amdgpu_drm.h
>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>> @@ -54,6 +54,7 @@ extern "C" {
>>>   #define DRM_AMDGPU_VM                  0x13
>>>   #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>   #define DRM_AMDGPU_SCHED               0x15
>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>
>>>   #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>   #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>> @@ -71,6 +72,7 @@ extern "C" {
>>>   #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>   #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>
>>>   /**
>>>    * DOC: memory domains
>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>          struct drm_amdgpu_sched_in in;
>>>   };
>>>
>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>> +
>>> +struct drm_amdgpu_setparam {
>>> +       /* AMDGPU_SETPARAM_* */
>>> +       __u32   param;
>>> +       __u32   value;
>>> +};
>>> +
>>>   /*
>>>    * This is not a reliable API and you should expect it to fail for any
>>>    * number of reasons and have fallback path that do not use userptr to
>>> --
>>> 2.32.0.rc2
>>>
>
>


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 12:59         ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 12:59 UTC (permalink / raw)
  To: Daniel Vetter, Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Intel Graphics Development, Kevin Wang,
	DRI Development, Sumit Semwal,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Dave Airlie, Michel Dänzer, Dennis Li,
	Deepak R Varma

Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> <bas@basnieuwenhuizen.nl> wrote:
>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>
>>> Implicit fencing done properly needs to treat the implicit fencing
>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>> explicit. This is the only way it will mesh well with explicit
>>> fencing userspace like vk, and it's also the bare minimum required to
>>> be able to manage anything else that wants to use the same buffer on
>>> multiple engines in parallel, and still be able to share it through
>>> implicit sync.
>>>
>>> amdgpu completely lacks such an uapi. Fix this.
>>>
>>> Luckily the concept of ignoring implicit fences exists already, and
>>> takes care of all the complexities of making sure that non-optional
>>> fences (like bo moves) are not ignored. This support was added in
>>>
>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>
>>>      drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>
>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>> disables implicit sync on an allocated buffer completely.
>>>
>>> We _do_ want implicit sync, but control it explicitly. For this we
>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>> can manage the implicit sync slots explicitly. The other side of the
>>> pipeline (compositor, other process or just different stage in a media
>>> pipeline in the same process) can then either do the same, or fully
>>> participate in the implicit sync as implemented by the kernel by
>>> default.
>>>
>>> By building on the existing flag for buffers we avoid any issues with
>>> opening up additional security concerns - anything this new flag here
>>> allows is already possible.
>>>
>>> All drivers which support this concept of a userspace-specific
>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>> that turned out to be a bit too inflexible. See the discussion below,
>>> let's try to do a bit better for amdgpu.
>>>
>>> This alone only allows us to completely avoid any stalls due to
>>> implicit sync, it does not yet allow us to use implicit sync as a
>>> strange form of IPC for sync_file.
>>>
>>> For that we need two more pieces:
>>>
>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>    be done in a driver ioctl, but everyone needs this, and generally a
>>>    dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>    the dma-buf makes a ton more sense:
>>>
>>>    https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>
>>>    Current drivers in upstream solve this by having the opt-out flag
>>>    on their CS ioctl. This has the downside that very often the CS
>>>    which must actually stall for the implicit fence is run a while
>>>    after the implicit fence point was logically sampled per the api
>>>    spec (vk passes an explicit syncobj around for that afaiui), and so
>>>    results in oversync. Converting the implicit sync fences into a
>>>    snap-shot sync_file is actually accurate.
>>>
>>> - Similarly we need to be able to set the exclusive implicit fence.
>>>    Current drivers again do this with a CS ioctl flag, with again the
>>>    same problem that by the time the CS happens additional dependencies
>>>    have been added. An explicit ioctl to only insert a sync_file (while
>>>    respecting the rules for how exclusive and shared fence slots must
>>>    be updated in struct dma_resv) is much better. This is proposed here:
>>>
>>>    https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>
>>> These three pieces together allow userspace to fully control implicit
>>> fencing and remove all unnecessary stall points due to them.
>>>
>>> Well, as much as the implicit fencing model fundamentally allows:
>>> There is only one set of fences, you can only choose to sync against
>>> only writers (exclusive slot), or everyone. Hence suballocating
>>> multiple buffers or anything else like this is fundamentally not
>>> possible, and can only be fixed by a proper explicit fencing model.
>>>
>>> Aside from that caveat this model gets implicit fencing as close to
>>> explicit fencing semantics as possible:
>>>
>>> On the actual implementation I opted for a simple setparam ioctl, no
>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>> flag parameter in the VM ioctl which we could use, except:
>>> - it's not checked, so userspace likely passes garbage
>>> - there's already a comment that userspace _does_ pass garbage in the
>>>    priority field
>>> So yeah unfortunately this flag parameter for setting vm flags is
>>> useless, and we need to hack up a new one.
>>>
>>> v2: Explain why a new SETPARAM (Jason)
>>>
>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>> need both, or this doesn't do much.
>>>
>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>> fences.
>> So I think there is still a case missing in this implementation.
>> Consider these 3 cases
>>
>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>
>> explicit->explicit: This doesn't wait now, which is good
>> Implicit->explicit: This doesn't wait now, which is good
>> explicit->implicit : This still waits as the explicit submission still
>> adds shared fences and most things that set an exclusive fence for
>> implicit sync will hence wait on it.
>>
>> This is probably good enough for what radv needs now but also sounds
>> like a risk wrt baking in new uapi behavior that we don't want to be
>> the end result.
>>
>> Within AMDGPU this is probably solvable in two ways:
>>
>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> I'm not sure that works. I think the right fix is that radeonsi also
> switches to this model, with maybe a per-bo CS flag to indicate
> write access, to cut down on the number of ioctls that are needed
> otherwise on shared buffers. This per-bo flag would essentially select
> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.

Yeah, but I'm still not entirely sure why that approach isn't sufficient?

Problem with the per context or per vm flag is that you then don't get 
any implicit synchronization any more when another process starts using 
the buffer.

> The current amdgpu uapi just doesn't allow any other model without an
> explicit opt-in. So current implicit sync userspace just has to
> oversync, there's not much choice.
>
>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>
>> But this doesn't solve cross-driver interactions here.
> Yeah cross-driver is still entirely unsolved, because
> amdgpu_bo_explicit_sync() on the bo didn't solve that either.

Hui? You have lost me. Why is that still unsolved?

Regards,
Christian.

> -Daniel
>
>>> Cc: mesa-dev@lists.freedesktop.org
>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>> Cc: Dave Airlie <airlied@gmail.com>
>>> Cc: Rob Clark <robdclark@chromium.org>
>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>> Cc: Michel Dänzer <michel@daenzer.net>
>>> Cc: Daniel Stone <daniels@collabora.com>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>> Cc: Chen Li <chenli@uniontech.com>
>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>> Cc: linaro-mm-sig@lists.linaro.org
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>   include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>   4 files changed, 42 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> index 65df34c17264..c5386d13eb4a 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>          struct amdgpu_bo *gds;
>>>          struct amdgpu_bo *gws;
>>>          struct amdgpu_bo *oa;
>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>          int r;
>>>
>>>          INIT_LIST_HEAD(&p->validated);
>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>
>>>                  e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>
>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>> +               if (bo->tbo.base.dma_buf &&
>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>                          e->chain = dma_fence_chain_alloc();
>>>                          if (!e->chain) {
>>>                                  r = -ENOMEM;
>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>   {
>>>          struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>          struct amdgpu_bo_list_entry *e;
>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>          int r;
>>>
>>>          list_for_each_entry(e, &p->validated, tv.head) {
>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>                  struct dma_resv *resv = bo->tbo.base.resv;
>>>                  enum amdgpu_sync_mode sync_mode;
>>>
>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>                          AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>                  r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>                                       &fpriv->vm);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> index c080ba15ae77..f982626b5328 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>          return 0;
>>>   }
>>>
>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>> +                         struct drm_file *filp)
>>> +{
>>> +       struct drm_amdgpu_setparam *setparam = data;
>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>> +
>>> +       switch (setparam->param) {
>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>> +               if (setparam->value)
>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>> +               else
>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>> +               break;
>>> +       default:
>>> +               return -EINVAL;
>>> +       }
>>> +
>>> +       return 0;
>>> +}
>>> +
>>>   const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>   };
>>>
>>>   static const struct drm_driver amdgpu_kms_driver = {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> index ddb85a85cbba..0e8c440c6303 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>          bool                    bulk_moveable;
>>>          /* Flag to indicate if VM is used for compute */
>>>          bool                    is_compute_context;
>>> +       /*
>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>> +        * this context. We do not care about races at all, userspace is allowed
>>> +        * to shoot itself with implicit sync to its fullest liking.
>>> +        */
>>> +       bool no_implicit_sync;
>>>   };
>>>
>>>   struct amdgpu_vm_manager {
>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>> index 0cbd1540aeac..9eae245c14d6 100644
>>> --- a/include/uapi/drm/amdgpu_drm.h
>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>> @@ -54,6 +54,7 @@ extern "C" {
>>>   #define DRM_AMDGPU_VM                  0x13
>>>   #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>   #define DRM_AMDGPU_SCHED               0x15
>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>
>>>   #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>   #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>> @@ -71,6 +72,7 @@ extern "C" {
>>>   #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>   #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>
>>>   /**
>>>    * DOC: memory domains
>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>          struct drm_amdgpu_sched_in in;
>>>   };
>>>
>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>> +
>>> +struct drm_amdgpu_setparam {
>>> +       /* AMDGPU_SETPARAM_* */
>>> +       __u32   param;
>>> +       __u32   value;
>>> +};
>>> +
>>>   /*
>>>    * This is not a reliable API and you should expect it to fail for any
>>>    * number of reasons and have fallback path that do not use userptr to
>>> --
>>> 2.32.0.rc2
>>>
>
>

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 12:59         ` [Intel-gfx] " Christian König
@ 2021-06-23 13:38           ` Bas Nieuwenhuizen
  -1 siblings, 0 replies; 175+ messages in thread
From: Bas Nieuwenhuizen @ 2021-06-23 13:38 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 2:59 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > <bas@basnieuwenhuizen.nl> wrote:
> >> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>
> >>> Implicit fencing done properly needs to treat the implicit fencing
> >>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>> explicit. This is the only way it will mesh well with explicit
> >>> fencing userspace like vk, and it's also the bare minimum required to
> >>> be able to manage anything else that wants to use the same buffer on
> >>> multiple engines in parallel, and still be able to share it through
> >>> implicit sync.
> >>>
> >>> amdgpu completely lacks such an uapi. Fix this.
> >>>
> >>> Luckily the concept of ignoring implicit fences exists already, and
> >>> takes care of all the complexities of making sure that non-optional
> >>> fences (like bo moves) are not ignored. This support was added in
> >>>
> >>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>> Author: Andres Rodriguez <andresx7@gmail.com>
> >>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>
> >>>      drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>
> >>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>> disables implicit sync on an allocated buffer completely.
> >>>
> >>> We _do_ want implicit sync, but control it explicitly. For this we
> >>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>> can manage the implicit sync slots explicitly. The other side of the
> >>> pipeline (compositor, other process or just different stage in a media
> >>> pipeline in the same process) can then either do the same, or fully
> >>> participate in the implicit sync as implemented by the kernel by
> >>> default.
> >>>
> >>> By building on the existing flag for buffers we avoid any issues with
> >>> opening up additional security concerns - anything this new flag here
> >>> allows is already possible.
> >>>
> >>> All drivers which support this concept of a userspace-specific
> >>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>> that turned out to be a bit too inflexible. See the discussion below,
> >>> let's try to do a bit better for amdgpu.
> >>>
> >>> This alone only allows us to completely avoid any stalls due to
> >>> implicit sync, it does not yet allow us to use implicit sync as a
> >>> strange form of IPC for sync_file.
> >>>
> >>> For that we need two more pieces:
> >>>
> >>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>    be done in a driver ioctl, but everyone needs this, and generally a
> >>>    dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>    the dma-buf makes a ton more sense:
> >>>
> >>>    https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>
> >>>    Current drivers in upstream solve this by having the opt-out flag
> >>>    on their CS ioctl. This has the downside that very often the CS
> >>>    which must actually stall for the implicit fence is run a while
> >>>    after the implicit fence point was logically sampled per the api
> >>>    spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>    results in oversync. Converting the implicit sync fences into a
> >>>    snap-shot sync_file is actually accurate.
> >>>
> >>> - Similarly we need to be able to set the exclusive implicit fence.
> >>>    Current drivers again do this with a CS ioctl flag, with again the
> >>>    same problem that by the time the CS happens additional dependencies
> >>>    have been added. An explicit ioctl to only insert a sync_file (while
> >>>    respecting the rules for how exclusive and shared fence slots must
> >>>    be updated in struct dma_resv) is much better. This is proposed here:
> >>>
> >>>    https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >>>
> >>> These three pieces together allow userspace to fully control implicit
> >>> fencing and remove all unnecessary stall points due to them.
> >>>
> >>> Well, as much as the implicit fencing model fundamentally allows:
> >>> There is only one set of fences, you can only choose to sync against
> >>> only writers (exclusive slot), or everyone. Hence suballocating
> >>> multiple buffers or anything else like this is fundamentally not
> >>> possible, and can only be fixed by a proper explicit fencing model.
> >>>
> >>> Aside from that caveat this model gets implicit fencing as close to
> >>> explicit fencing semantics as possible:
> >>>
> >>> On the actual implementation I opted for a simple setparam ioctl, no
> >>> locking (just atomic reads/writes) for simplicity. There is a nice
> >>> flag parameter in the VM ioctl which we could use, except:
> >>> - it's not checked, so userspace likely passes garbage
> >>> - there's already a comment that userspace _does_ pass garbage in the
> >>>    priority field
> >>> So yeah unfortunately this flag parameter for setting vm flags is
> >>> useless, and we need to hack up a new one.
> >>>
> >>> v2: Explain why a new SETPARAM (Jason)
> >>>
> >>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> >>> need both, or this doesn't do much.
> >>>
> >>> v4: Rebase over the amdgpu patch to always set the implicit sync
> >>> fences.
> >> So I think there is still a case missing in this implementation.
> >> Consider these 3 cases
> >>
> >> (format: a->b: b waits on a. Yes, I know arrows are hard)
> >>
> >> explicit->explicit: This doesn't wait now, which is good
> >> Implicit->explicit: This doesn't wait now, which is good
> >> explicit->implicit : This still waits as the explicit submission still
> >> adds shared fences and most things that set an exclusive fence for
> >> implicit sync will hence wait on it.
> >>
> >> This is probably good enough for what radv needs now but also sounds
> >> like a risk wrt baking in new uapi behavior that we don't want to be
> >> the end result.
> >>
> >> Within AMDGPU this is probably solvable in two ways:
> >>
> >> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> > I'm not sure that works. I think the right fix is that radeonsi also
> > switches to this model, with maybe a per-bo CS flag to indicate
> > write access, to cut down on the number of ioctls that are needed
> > otherwise on shared buffers. This per-bo flag would essentially select
> > between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>
> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>
> Problem with the per context or per vm flag is that you then don't get
> any implicit synchronization any more when another process starts using
> the buffer.

That is exactly what I want for Vulkan :)
>
> > The current amdgpu uapi just doesn't allow any other model without an
> > explicit opt-in. So current implicit sync userspace just has to
> > oversync, there's not much choice.
> >
> >> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> >> that is ignored by AMDGPU_SYNC_NE_OWNER.
> >>
> >> But this doesn't solve cross-driver interactions here.
> > Yeah cross-driver is still entirely unsolved, because
> > amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>
> Hui? You have lost me. Why is that still unsolved?

The part we're trying to solve with this patch is that Vulkan should not
participate in any implicit sync at all wrt submissions (and then
handle the implicit sync for WSI explicitly using the fence
import/export stuff that Jason wrote). As long as we add shared fences to
the dma_resv we participate in implicit sync (at the level of an
implicit sync read) still, at least from the perspective of later jobs
waiting on these fences.
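For concreteness, the WSI-side handoff mentioned here could look roughly like the sketch below. It assumes the dma-buf sync_file export/import ioctls from Jason's series (linked in the patch description) land with these names and structs; it is an illustration, not the actual radv/WSI code:

#include <sys/ioctl.h>
#include <linux/dma-buf.h>      /* assumes the proposed sync_file ioctls are merged */

/* Acquire: snapshot the current implicit fences so the app can wait explicitly. */
static int wsi_export_acquire_fence(int dmabuf_fd)
{
        struct dma_buf_export_sync_file args = {
                .flags = DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE,
                .fd = -1,
        };

        if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &args))
                return -1;
        return args.fd;         /* sync_file fd to wait on / import into vk */
}

/* Present: publish the app's render-complete fence as the implicit write fence. */
static int wsi_import_release_fence(int dmabuf_fd, int sync_file_fd)
{
        struct dma_buf_import_sync_file args = {
                .flags = DMA_BUF_SYNC_WRITE,
                .fd = sync_file_fd,
        };

        return ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &args);
}

Acquire and present are then the only two points where the Vulkan driver touches the implicit sync fences; every other submission stays purely explicit.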

>
> Regards,
> Christian.
>
> > -Daniel
> >
> >>> Cc: mesa-dev@lists.freedesktop.org
> >>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> >>> Cc: Dave Airlie <airlied@gmail.com>
> >>> Cc: Rob Clark <robdclark@chromium.org>
> >>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> >>> Cc: Michel Dänzer <michel@daenzer.net>
> >>> Cc: Daniel Stone <daniels@collabora.com>
> >>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>> Cc: "Christian König" <christian.koenig@amd.com>
> >>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>> Cc: Chen Li <chenli@uniontech.com>
> >>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>> Cc: linaro-mm-sig@lists.linaro.org
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >>>   include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >>>   4 files changed, 42 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> index 65df34c17264..c5386d13eb4a 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>          struct amdgpu_bo *gds;
> >>>          struct amdgpu_bo *gws;
> >>>          struct amdgpu_bo *oa;
> >>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>          int r;
> >>>
> >>>          INIT_LIST_HEAD(&p->validated);
> >>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>
> >>>                  e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >>>
> >>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> >>> +               if (bo->tbo.base.dma_buf &&
> >>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >>>                          e->chain = dma_fence_chain_alloc();
> >>>                          if (!e->chain) {
> >>>                                  r = -ENOMEM;
> >>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>   {
> >>>          struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >>>          struct amdgpu_bo_list_entry *e;
> >>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>          int r;
> >>>
> >>>          list_for_each_entry(e, &p->validated, tv.head) {
> >>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>                  struct dma_resv *resv = bo->tbo.base.resv;
> >>>                  enum amdgpu_sync_mode sync_mode;
> >>>
> >>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> >>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >>>                          AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >>>                  r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >>>                                       &fpriv->vm);
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>> index c080ba15ae77..f982626b5328 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >>>          return 0;
> >>>   }
> >>>
> >>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> >>> +                         struct drm_file *filp)
> >>> +{
> >>> +       struct drm_amdgpu_setparam *setparam = data;
> >>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> >>> +
> >>> +       switch (setparam->param) {
> >>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> >>> +               if (setparam->value)
> >>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> >>> +               else
> >>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> >>> +               break;
> >>> +       default:
> >>> +               return -EINVAL;
> >>> +       }
> >>> +
> >>> +       return 0;
> >>> +}
> >>> +
> >>>   const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>   };
> >>>
> >>>   static const struct drm_driver amdgpu_kms_driver = {
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>> index ddb85a85cbba..0e8c440c6303 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >>>          bool                    bulk_moveable;
> >>>          /* Flag to indicate if VM is used for compute */
> >>>          bool                    is_compute_context;
> >>> +       /*
> >>> +        * Flag to indicate whether implicit sync should always be skipped on
> >>> +        * this context. We do not care about races at all, userspace is allowed
> >>> +        * to shoot itself with implicit sync to its fullest liking.
> >>> +        */
> >>> +       bool no_implicit_sync;
> >>>   };
> >>>
> >>>   struct amdgpu_vm_manager {
> >>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>> index 0cbd1540aeac..9eae245c14d6 100644
> >>> --- a/include/uapi/drm/amdgpu_drm.h
> >>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>> @@ -54,6 +54,7 @@ extern "C" {
> >>>   #define DRM_AMDGPU_VM                  0x13
> >>>   #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >>>   #define DRM_AMDGPU_SCHED               0x15
> >>> +#define DRM_AMDGPU_SETPARAM            0x16
> >>>
> >>>   #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>   #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>> @@ -71,6 +72,7 @@ extern "C" {
> >>>   #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>   #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >>>
> >>>   /**
> >>>    * DOC: memory domains
> >>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >>>          struct drm_amdgpu_sched_in in;
> >>>   };
> >>>
> >>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> >>> +
> >>> +struct drm_amdgpu_setparam {
> >>> +       /* AMDGPU_SETPARAM_* */
> >>> +       __u32   param;
> >>> +       __u32   value;
> >>> +};
> >>> +
> >>>   /*
> >>>    * This is not a reliable API and you should expect it to fail for any
> >>>    * number of reasons and have fallback path that do not use userptr to
> >>> --
> >>> 2.32.0.rc2
> >>>
> >
> >
>

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 13:38           ` Bas Nieuwenhuizen
  0 siblings, 0 replies; 175+ messages in thread
From: Bas Nieuwenhuizen @ 2021-06-23 13:38 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Daniel Vetter,
	Alex Deucher, mesa-dev, Dave Airlie, Michel Dänzer,
	Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 2:59 PM Christian König
<christian.koenig@amd.com> wrote:
>
> > On 23.06.21 at 14:18, Daniel Vetter wrote:
> > On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > <bas@basnieuwenhuizen.nl> wrote:
> >> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>
> >>> Implicit fencing done properly needs to treat the implicit fencing
> >>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>> managed explicitly. This is the only way it will mesh well with explicit
> >>> fencing userspace like vk, and it's also the bare minimum required to
> >>> be able to manage anything else that wants to use the same buffer on
> >>> multiple engines in parallel, and still be able to share it through
> >>> implicit sync.
> >>>
> >>> amdgpu completely lacks such an uapi. Fix this.
> >>>
> >>> Luckily the concept of ignoring implicit fences exists already, and
> >>> takes care of all the complexities of making sure that non-optional
> >>> fences (like bo moves) are not ignored. This support was added in
> >>>
> >>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>> Author: Andres Rodriguez <andresx7@gmail.com>
> >>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>
> >>>      drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>
> >>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>> disables implicit sync on an allocated buffer completely.
> >>>
> >>> We _do_ want implicit sync, but control it explicitly. For this we
> >>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>> can manage the implicit sync slots explicitly. The other side of the
> >>> pipeline (compositor, other process or just different stage in a media
> >>> pipeline in the same process) can then either do the same, or fully
> >>> participate in the implicit sync as implemented by the kernel by
> >>> default.
> >>>
> >>> By building on the existing flag for buffers we avoid any issues with
> >>> opening up additional security concerns - anything this new flag here
> >>> allows is already possible.
> >>>
> >>> All drivers which support this concept of a userspace-specific
> >>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>> that turned out to be a bit too inflexible. See the discussion below,
> >>> let's try to do a bit better for amdgpu.
> >>>
> >>> This alone only allows us to completely avoid any stalls due to
> >>> implicit sync, it does not yet allow us to use implicit sync as a
> >>> strange form of IPC for sync_file.
> >>>
> >>> For that we need two more pieces:
> >>>
> >>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>    be done in a driver ioctl, but everyone needs this, and generally a
> >>>    dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>    the dma-buf makes a ton more sense:
> >>>
> >>>    https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>
> >>>    Current drivers in upstream solve this by having the opt-out flag
> >>>    on their CS ioctl. This has the downside that very often the CS
> >>>    which must actually stall for the implicit fence is run a while
> >>>    after the implicit fence point was logically sampled per the api
> >>>    spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>    results in oversync. Converting the implicit sync fences into a
> >>>    snap-shot sync_file is actually accurate.
> >>>
> >>> - Similarly, we need to be able to set the exclusive implicit fence.
> >>>    Current drivers again do this with a CS ioctl flag, with again the
> >>>    same problem that by the time the CS happens, additional dependencies
> >>>    have been added. An explicit ioctl to only insert a sync_file (while
> >>>    respecting the rules for how exclusive and shared fence slots must
> >>>    be updated in struct dma_resv) is much better. This is proposed here:
> >>>
> >>>    https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >>>
> >>> These three pieces together allow userspace to fully control implicit
> >>> fencing and remove all unnecessary stall points due to them.
> >>>
> >>> Well, as much as the implicit fencing model fundamentally allows:
> >>> There is only one set of fences, you can only choose to sync against
> >>> only writers (exclusive slot), or everyone. Hence suballocating
> >>> multiple buffers or anything else like this is fundamentally not
> >>> possible, and can only be fixed by a proper explicit fencing model.
> >>>
> >>> Aside from that caveat this model gets implicit fencing as close to
> >>> explicit fencing semantics as possible:
> >>>
> >>> On the actual implementation I opted for a simple setparam ioctl, no
> >>> locking (just atomic reads/writes) for simplicity. There is a nice
> >>> flag parameter in the VM ioctl which we could use, except:
> >>> - it's not checked, so userspace likely passes garbage
> >>> - there's already a comment that userspace _does_ pass garbage in the
> >>>    priority field
> >>> So yeah unfortunately this flag parameter for setting vm flags is
> >>> useless, and we need to hack up a new one.
> >>>
> >>> v2: Explain why a new SETPARAM (Jason)
> >>>
> >>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> >>> need both, or this doesn't do much.
> >>>
> >>> v4: Rebase over the amdgpu patch to always set the implicit sync
> >>> fences.
> >> So I think there is still a case missing in this implementation.
> >> Consider these 3 cases
> >>
> >> (format: a->b: b waits on a. Yes, I know arrows are hard)
> >>
> >> explicit->explicit: This doesn't wait now, which is good
> >> Implicit->explicit: This doesn't wait now, which is good
> >> explicit->implicit : This still waits as the explicit submission still
> >> adds shared fences and most things that set an exclusive fence for
> >> implicit sync will hence wait on it.
> >>
> >> This is probably good enough for what radv needs now but also sounds
> >> like a risk wrt baking in new uapi behavior that we don't want to be
> >> the end result.
> >>
> >> Within AMDGPU this is probably solvable in two ways:
> >>
> >> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> > I'm not sure that works. I think the right fix is that radeonsi also
> > switches to this model, with maybe a per-bo CS flag to indicate
> > write access, to cut down on the number of ioctls that are needed
> > otherwise on shared buffers. This per-bo flag would essentially select
> > between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>
> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>
> Problem with the per context or per vm flag is that you then don't get
> any implicit synchronization any more when another process starts using
> the buffer.

That is exactly what I want for Vulkan :)
>
> > The current amdgpu uapi just doesn't allow any other model without an
> > explicit opt-in. So current implicit sync userspace just has to
> > oversync, there's not much choice.
> >
> >> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> >> that is ignored by AMDGPU_SYNC_NE_OWNER.
> >>
> >> But this doesn't solve cross-driver interactions here.
> > Yeah cross-driver is still entirely unsolved, because
> > amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>
> Hui? You have lost me. Why is that still unsolved?

The part we're trying to solve with this patch is that Vulkan should not
participate in any implicit sync at all wrt submissions (and then
handle the implicit sync for WSI explicitly using the fence
import/export stuff that Jason wrote). As long as we add shared fences to
the dma_resv we participate in implicit sync (at the level of an
implicit sync read) still, at least from the perspective of later jobs
waiting on these fences.

>
> Regards,
> Christian.
>
> > -Daniel
> >
> >>> Cc: mesa-dev@lists.freedesktop.org
> >>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> >>> Cc: Dave Airlie <airlied@gmail.com>
> >>> Cc: Rob Clark <robdclark@chromium.org>
> >>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> >>> Cc: Michel Dänzer <michel@daenzer.net>
> >>> Cc: Daniel Stone <daniels@collabora.com>
> >>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>> Cc: "Christian König" <christian.koenig@amd.com>
> >>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>> Cc: Chen Li <chenli@uniontech.com>
> >>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>> Cc: linaro-mm-sig@lists.linaro.org
> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>> ---
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >>>   include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >>>   4 files changed, 42 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> index 65df34c17264..c5386d13eb4a 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>          struct amdgpu_bo *gds;
> >>>          struct amdgpu_bo *gws;
> >>>          struct amdgpu_bo *oa;
> >>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>          int r;
> >>>
> >>>          INIT_LIST_HEAD(&p->validated);
> >>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>
> >>>                  e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >>>
> >>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> >>> +               if (bo->tbo.base.dma_buf &&
> >>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >>>                          e->chain = dma_fence_chain_alloc();
> >>>                          if (!e->chain) {
> >>>                                  r = -ENOMEM;
> >>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>   {
> >>>          struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >>>          struct amdgpu_bo_list_entry *e;
> >>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>          int r;
> >>>
> >>>          list_for_each_entry(e, &p->validated, tv.head) {
> >>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>                  struct dma_resv *resv = bo->tbo.base.resv;
> >>>                  enum amdgpu_sync_mode sync_mode;
> >>>
> >>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> >>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >>>                          AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >>>                  r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >>>                                       &fpriv->vm);
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>> index c080ba15ae77..f982626b5328 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >>>          return 0;
> >>>   }
> >>>
> >>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> >>> +                         struct drm_file *filp)
> >>> +{
> >>> +       struct drm_amdgpu_setparam *setparam = data;
> >>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> >>> +
> >>> +       switch (setparam->param) {
> >>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> >>> +               if (setparam->value)
> >>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> >>> +               else
> >>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> >>> +               break;
> >>> +       default:
> >>> +               return -EINVAL;
> >>> +       }
> >>> +
> >>> +       return 0;
> >>> +}
> >>> +
> >>>   const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>          DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>   };
> >>>
> >>>   static const struct drm_driver amdgpu_kms_driver = {
> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>> index ddb85a85cbba..0e8c440c6303 100644
> >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >>>          bool                    bulk_moveable;
> >>>          /* Flag to indicate if VM is used for compute */
> >>>          bool                    is_compute_context;
> >>> +       /*
> >>> +        * Flag to indicate whether implicit sync should always be skipped on
> >>> +        * this context. We do not care about races at all, userspace is allowed
> >>> +        * to shoot itself with implicit sync to its fullest liking.
> >>> +        */
> >>> +       bool no_implicit_sync;
> >>>   };
> >>>
> >>>   struct amdgpu_vm_manager {
> >>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>> index 0cbd1540aeac..9eae245c14d6 100644
> >>> --- a/include/uapi/drm/amdgpu_drm.h
> >>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>> @@ -54,6 +54,7 @@ extern "C" {
> >>>   #define DRM_AMDGPU_VM                  0x13
> >>>   #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >>>   #define DRM_AMDGPU_SCHED               0x15
> >>> +#define DRM_AMDGPU_SETPARAM            0x16
> >>>
> >>>   #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>   #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>> @@ -71,6 +72,7 @@ extern "C" {
> >>>   #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>   #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>   #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >>>
> >>>   /**
> >>>    * DOC: memory domains
> >>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >>>          struct drm_amdgpu_sched_in in;
> >>>   };
> >>>
> >>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> >>> +
> >>> +struct drm_amdgpu_setparam {
> >>> +       /* AMDGPU_SETPARAM_* */
> >>> +       __u32   param;
> >>> +       __u32   value;
> >>> +};
> >>> +
> >>>   /*
> >>>    * This is not a reliable API and you should expect it to fail for any
> >>>    * number of reasons and have fallback path that do not use userptr to
> >>> --
> >>> 2.32.0.rc2
> >>>
> >
> >
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 13:38           ` [Intel-gfx] " Bas Nieuwenhuizen
@ 2021-06-23 13:44             ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 13:44 UTC (permalink / raw)
  To: Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Michel Dänzer, Dennis Li, Deepak R Varma

On 23.06.21 at 15:38, Bas Nieuwenhuizen wrote:
> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> <christian.koenig@amd.com> wrote:
>> On 23.06.21 at 14:18, Daniel Vetter wrote:
>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
>>> <bas@basnieuwenhuizen.nl> wrote:
>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>>>
>>>>> Implicit fencing done properly needs to treat the implicit fencing
>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>>>> managed explicitly. This is the only way it will mesh well with explicit
>>>>> fencing userspace like vk, and it's also the bare minimum required to
>>>>> be able to manage anything else that wants to use the same buffer on
>>>>> multiple engines in parallel, and still be able to share it through
>>>>> implicit sync.
>>>>>
>>>>> amdgpu completely lacks such an uapi. Fix this.
>>>>>
>>>>> Luckily the concept of ignoring implicit fences exists already, and
>>>>> takes care of all the complexities of making sure that non-optional
>>>>> fences (like bo moves) are not ignored. This support was added in
>>>>>
>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>>>
>>>>>       drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>>>
>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>>>> disables implicit sync on an allocated buffer completely.
>>>>>
>>>>> We _do_ want implicit sync, but control it explicitly. For this we
>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>>>> can manage the implicit sync slots explicitly. The other side of the
>>>>> pipeline (compositor, other process or just different stage in a media
>>>>> pipeline in the same process) can then either do the same, or fully
>>>>> participate in the implicit sync as implemented by the kernel by
>>>>> default.
>>>>>
>>>>> By building on the existing flag for buffers we avoid any issues with
>>>>> opening up additional security concerns - anything this new flag here
>>>>> allows is already possible.
>>>>>
>>>>> All drivers which support this concept of a userspace-specific
>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>>>> that turned out to be a bit too inflexible. See the discussion below,
>>>>> let's try to do a bit better for amdgpu.
>>>>>
>>>>> This alone only allows us to completely avoid any stalls due to
>>>>> implicit sync, it does not yet allow us to use implicit sync as a
>>>>> strange form of IPC for sync_file.
>>>>>
>>>>> For that we need two more pieces:
>>>>>
>>>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>>>     be done in a driver ioctl, but everyone needs this, and generally a
>>>>>     dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>>>     the dma-buf makes a ton more sense:
>>>>>
>>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>>>
>>>>>     Current drivers in upstream solve this by having the opt-out flag
>>>>>     on their CS ioctl. This has the downside that very often the CS
>>>>>     which must actually stall for the implicit fence is run a while
>>>>>     after the implicit fence point was logically sampled per the api
>>>>>     spec (vk passes an explicit syncobj around for that afaiui), and so
>>>>>     results in oversync. Converting the implicit sync fences into a
>>>>>     snap-shot sync_file is actually accurate.
>>>>>
>>>>> - Similarly, we need to be able to set the exclusive implicit fence.
>>>>>     Current drivers again do this with a CS ioctl flag, with again the
>>>>>     same problem that by the time the CS happens, additional dependencies
>>>>>     have been added. An explicit ioctl to only insert a sync_file (while
>>>>>     respecting the rules for how exclusive and shared fence slots must
>>>>>     be updated in struct dma_resv) is much better. This is proposed here:
>>>>>
>>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>>>
>>>>> These three pieces together allow userspace to fully control implicit
>>>>> fencing and remove all unnecessary stall points due to them.
>>>>>
>>>>> Well, as much as the implicit fencing model fundamentally allows:
>>>>> There is only one set of fences, you can only choose to sync against
>>>>> only writers (exclusive slot), or everyone. Hence suballocating
>>>>> multiple buffers or anything else like this is fundamentally not
>>>>> possible, and can only be fixed by a proper explicit fencing model.
>>>>>
>>>>> Aside from that caveat this model gets implicit fencing as close to
>>>>> explicit fencing semantics as possible:
>>>>>
>>>>> On the actual implementation I opted for a simple setparam ioctl, no
>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>>>> flag parameter in the VM ioctl which we could use, except:
>>>>> - it's not checked, so userspace likely passes garbage
>>>>> - there's already a comment that userspace _does_ pass garbage in the
>>>>>     priority field
>>>>> So yeah unfortunately this flag parameter for setting vm flags is
>>>>> useless, and we need to hack up a new one.
>>>>>
>>>>> v2: Explain why a new SETPARAM (Jason)
>>>>>
>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>>>> need both, or this doesn't do much.
>>>>>
>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>>>> fences.
>>>> So I think there is still a case missing in this implementation.
>>>> Consider these 3 cases
>>>>
>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>>>
>>>> explicit->explicit: This doesn't wait now, which is good
>>>> Implicit->explicit: This doesn't wait now, which is good
>>>> explicit->implicit : This still waits as the explicit submission still
>>>> adds shared fences and most things that set an exclusive fence for
>>>> implicit sync will hence wait on it.
>>>>
>>>> This is probably good enough for what radv needs now but also sounds
>>>> like a risk wrt baking in new uapi behavior that we don't want to be
>>>> the end result.
>>>>
>>>> Within AMDGPU this is probably solvable in two ways:
>>>>
>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
>>> I'm not sure that works. I think the right fix is that radeonsi also
>>> switches to this model, with maybe a per-bo CS flag to indicate
>>> write access, to cut down on the number of ioctls that are needed
>>> otherwise on shared buffers. This per-bo flag would essentially select
>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>>
>> Problem with the per context or per vm flag is that you then don't get
>> any implicit synchronization any more when another process starts using
>> the buffer.
> That is exactly what I want for Vulkan :)

Yeah, but as far as I know this is not something we can do.

See, we have use cases like screen capture and debug which rely on that
behavior.

The only thing we can do is to say, via a per-buffer flag, that a buffer
should not participate in implicit sync at all.
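That per-buffer flag is the existing AMDGPU_GEM_CREATE_EXPLICIT_SYNC from the 2017 commit quoted in the patch description; from userspace it is set at allocation time, roughly as in this sketch (illustrative only, not taken from real driver code):

#include <string.h>
#include <sys/ioctl.h>
#include <drm/amdgpu_drm.h>

/* Allocate a BO that opts out of implicit sync via the per-buffer flag. */
static int alloc_explicit_sync_bo(int drm_fd, __u64 size, __u32 *handle)
{
        union drm_amdgpu_gem_create args;

        memset(&args, 0, sizeof(args));
        args.in.bo_size = size;
        args.in.alignment = 4096;
        args.in.domains = AMDGPU_GEM_DOMAIN_VRAM;
        args.in.domain_flags = AMDGPU_GEM_CREATE_EXPLICIT_SYNC;

        if (ioctl(drm_fd, DRM_IOCTL_AMDGPU_GEM_CREATE, &args))
                return -1;

        *handle = args.out.handle;
        return 0;
}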

Regards,
Christian.

>>> The current amdgpu uapi just doesn't allow any other model without an
>>> explicit opt-in. So current implicit sync userspace just has to
>>> oversync, there's not much choice.
>>>
>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>>>
>>>> But this doesn't solve cross-driver interactions here.
>>> Yeah cross-driver is still entirely unsolved, because
>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>> Hui? You have lost me. Why is that still unsolved?
> The part we're trying to solve with this patch is that Vulkan should not
> participate in any implicit sync at all wrt submissions (and then
> handle the implicit sync for WSI explicitly using the fence
> import/export stuff that Jason wrote). As long as we add shared fences to
> the dma_resv we participate in implicit sync (at the level of an
> implicit sync read) still, at least from the perspective of later jobs
> waiting on these fences.
>
>> Regards,
>> Christian.
>>
>>> -Daniel
>>>
>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>> Cc: Rob Clark <robdclark@chromium.org>
>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>> Cc: Daniel Stone <daniels@collabora.com>
>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>>>    include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>>>    4 files changed, 42 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> index 65df34c17264..c5386d13eb4a 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>           struct amdgpu_bo *gds;
>>>>>           struct amdgpu_bo *gws;
>>>>>           struct amdgpu_bo *oa;
>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>           int r;
>>>>>
>>>>>           INIT_LIST_HEAD(&p->validated);
>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>
>>>>>                   e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>>>
>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>>>> +               if (bo->tbo.base.dma_buf &&
>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>>>                           e->chain = dma_fence_chain_alloc();
>>>>>                           if (!e->chain) {
>>>>>                                   r = -ENOMEM;
>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>    {
>>>>>           struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>>>           struct amdgpu_bo_list_entry *e;
>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>           int r;
>>>>>
>>>>>           list_for_each_entry(e, &p->validated, tv.head) {
>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>                   struct dma_resv *resv = bo->tbo.base.resv;
>>>>>                   enum amdgpu_sync_mode sync_mode;
>>>>>
>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>>>                           AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>>>                   r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>>>                                        &fpriv->vm);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> index c080ba15ae77..f982626b5328 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>>>           return 0;
>>>>>    }
>>>>>
>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>>>> +                         struct drm_file *filp)
>>>>> +{
>>>>> +       struct drm_amdgpu_setparam *setparam = data;
>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>> +
>>>>> +       switch (setparam->param) {
>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>>>> +               if (setparam->value)
>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>>>> +               else
>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>>>> +               break;
>>>>> +       default:
>>>>> +               return -EINVAL;
>>>>> +       }
>>>>> +
>>>>> +       return 0;
>>>>> +}
>>>>> +
>>>>>    const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>    };
>>>>>
>>>>>    static const struct drm_driver amdgpu_kms_driver = {
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> index ddb85a85cbba..0e8c440c6303 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>>>           bool                    bulk_moveable;
>>>>>           /* Flag to indicate if VM is used for compute */
>>>>>           bool                    is_compute_context;
>>>>> +       /*
>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>>>> +        * this context. We do not care about races at all, userspace is allowed
>>>>> +        * to shoot itself with implicit sync to its fullest liking.
>>>>> +        */
>>>>> +       bool no_implicit_sync;
>>>>>    };
>>>>>
>>>>>    struct amdgpu_vm_manager {
>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>>> index 0cbd1540aeac..9eae245c14d6 100644
>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>> @@ -54,6 +54,7 @@ extern "C" {
>>>>>    #define DRM_AMDGPU_VM                  0x13
>>>>>    #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>>>    #define DRM_AMDGPU_SCHED               0x15
>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>>>
>>>>>    #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>>>    #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>>>> @@ -71,6 +72,7 @@ extern "C" {
>>>>>    #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>>>    #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>>>    #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>>>
>>>>>    /**
>>>>>     * DOC: memory domains
>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>>>           struct drm_amdgpu_sched_in in;
>>>>>    };
>>>>>
>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>>>> +
>>>>> +struct drm_amdgpu_setparam {
>>>>> +       /* AMDGPU_SETPARAM_* */
>>>>> +       __u32   param;
>>>>> +       __u32   value;
>>>>> +};
>>>>> +
>>>>>    /*
>>>>>     * This is not a reliable API and you should expect it to fail for any
>>>>>     * number of reasons and have fallback path that do not use userptr to
>>>>> --
>>>>> 2.32.0.rc2
>>>>>
>>>


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 13:44             ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 13:44 UTC (permalink / raw)
  To: Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Daniel Vetter,
	Alex Deucher, mesa-dev, Dave Airlie, Michel Dänzer,
	Dennis Li, Deepak R Varma

On 23.06.21 at 15:38, Bas Nieuwenhuizen wrote:
> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> <christian.koenig@amd.com> wrote:
>> On 23.06.21 at 14:18, Daniel Vetter wrote:
>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
>>> <bas@basnieuwenhuizen.nl> wrote:
>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>>>
>>>>> Implicit fencing done properly needs to treat the implicit fencing
>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>>>> managed explicitly. This is the only way it will mesh well with explicit
>>>>> fencing userspace like vk, and it's also the bare minimum required to
>>>>> be able to manage anything else that wants to use the same buffer on
>>>>> multiple engines in parallel, and still be able to share it through
>>>>> implicit sync.
>>>>>
>>>>> amdgpu completely lacks such an uapi. Fix this.
>>>>>
>>>>> Luckily the concept of ignoring implicit fences exists already, and
>>>>> takes care of all the complexities of making sure that non-optional
>>>>> fences (like bo moves) are not ignored. This support was added in
>>>>>
>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>>>
>>>>>       drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>>>
>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>>>> disables implicit sync on an allocated buffer completely.
>>>>>
>>>>> We _do_ want implicit sync, but control it explicitly. For this we
>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>>>> can manage the implicit sync slots explicitly. The other side of the
>>>>> pipeline (compositor, other process or just different stage in a media
>>>>> pipeline in the same process) can then either do the same, or fully
>>>>> participate in the implicit sync as implemented by the kernel by
>>>>> default.
>>>>>
>>>>> By building on the existing flag for buffers we avoid any issues with
>>>>> opening up additional security concerns - anything this new flag here
>>>>> allows is already possible.
>>>>>
>>>>> All drivers which support this concept of a userspace-specific
>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>>>> that turned out to be a bit too inflexible. See the discussion below,
>>>>> let's try to do a bit better for amdgpu.
>>>>>
>>>>> This alone only allows us to completely avoid any stalls due to
>>>>> implicit sync, it does not yet allow us to use implicit sync as a
>>>>> strange form of IPC for sync_file.
>>>>>
>>>>> For that we need two more pieces:
>>>>>
>>>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>>>     be done in a driver ioctl, but everyone needs this, and generally a
>>>>>     dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>>>     the dma-buf makes a ton more sense:
>>>>>
>>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>>>
>>>>>     Current drivers in upstream solve this by having the opt-out flag
>>>>>     on their CS ioctl. This has the downside that very often the CS
>>>>>     which must actually stall for the implicit fence is run a while
>>>>>     after the implicit fence point was logically sampled per the api
>>>>>     spec (vk passes an explicit syncobj around for that afaiui), and so
>>>>>     results in oversync. Converting the implicit sync fences into a
>>>>>     snap-shot sync_file is actually accurate.
>>>>>
>>>>> - Similarly, we need to be able to set the exclusive implicit fence.
>>>>>     Current drivers again do this with a CS ioctl flag, with again the
>>>>>     same problem that by the time the CS happens, additional dependencies
>>>>>     have been added. An explicit ioctl to only insert a sync_file (while
>>>>>     respecting the rules for how exclusive and shared fence slots must
>>>>>     be updated in struct dma_resv) is much better. This is proposed here:
>>>>>
>>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>>>
>>>>> These three pieces together allow userspace to fully control implicit
>>>>> fencing and remove all unnecessary stall points due to them.
>>>>>
>>>>> Well, as much as the implicit fencing model fundamentally allows:
>>>>> There is only one set of fences, you can only choose to sync against
>>>>> only writers (exclusive slot), or everyone. Hence suballocating
>>>>> multiple buffers or anything else like this is fundamentally not
>>>>> possible, and can only be fixed by a proper explicit fencing model.
>>>>>
>>>>> Aside from that caveat this model gets implicit fencing as close to
>>>>> explicit fencing semantics as possible:
>>>>>
>>>>> On the actual implementation I opted for a simple setparam ioctl, no
>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>>>> flag parameter in the VM ioctl which we could use, except:
>>>>> - it's not checked, so userspace likely passes garbage
>>>>> - there's already a comment that userspace _does_ pass garbage in the
>>>>>     priority field
>>>>> So yeah unfortunately this flag parameter for setting vm flags is
>>>>> useless, and we need to hack up a new one.
>>>>>
>>>>> v2: Explain why a new SETPARAM (Jason)
>>>>>
>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>>>> need both, or this doesn't do much.
>>>>>
>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>>>> fences.
>>>> So I think there is still a case missing in this implementation.
>>>> Consider these 3 cases
>>>>
>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>>>
>>>> explicit->explicit: This doesn't wait now, which is good
>>>> Implicit->explicit: This doesn't wait now, which is good
>>>> explicit->implicit : This still waits as the explicit submission still
>>>> adds shared fences and most things that set an exclusive fence for
>>>> implicit sync will hence wait on it.
>>>>
>>>> This is probably good enough for what radv needs now but also sounds
>>>> like a risk wrt baking in new uapi behavior that we don't want to be
>>>> the end result.
>>>>
>>>> Within AMDGPU this is probably solvable in two ways:
>>>>
>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
>>> I'm not sure that works. I think the right fix is that radeonsi also
>>> switches to this model, with maybe a per-bo CS flag to indicate
>>> write access, to cut down on the number of ioctls that are needed
>>> otherwise on shared buffers. This per-bo flag would essentially select
>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>>
>> Problem with the per context or per vm flag is that you then don't get
>> any implicit synchronization any more when another process starts using
>> the buffer.
> That is exactly what I want for Vulkan :)

Yeah, but as far as I know this is not something we can do.

See, we have use cases like screen capture and debug which rely on that
behavior.

The only thing we can do is to say, via a per-buffer flag, that a buffer
should not participate in implicit sync at all.

Regards,
Christian.

>>> The current amdgpu uapi just doesn't allow any other model without an
>>> explicit opt-in. So current implicit sync userspace just has to
>>> oversync, there's not much choice.
>>>
>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>>>
>>>> But this doesn't solve cross-driver interactions here.
>>> Yeah cross-driver is still entirely unsolved, because
>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>> Hui? You have lost me. Why is that still unsolved?
> The part we're trying to solve with this patch is that Vulkan should not
> participate in any implicit sync at all wrt submissions (and then
> handle the implicit sync for WSI explicitly using the fence
> import/export stuff that Jason wrote). As long as we add shared fences to
> the dma_resv we participate in implicit sync (at the level of an
> implicit sync read) still, at least from the perspective of later jobs
> waiting on these fences.
>
>> Regards,
>> Christian.
>>
>>> -Daniel
>>>
>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>> Cc: Rob Clark <robdclark@chromium.org>
>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>> Cc: Daniel Stone <daniels@collabora.com>
>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> ---
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>>>    include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>>>    4 files changed, 42 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> index 65df34c17264..c5386d13eb4a 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>           struct amdgpu_bo *gds;
>>>>>           struct amdgpu_bo *gws;
>>>>>           struct amdgpu_bo *oa;
>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>           int r;
>>>>>
>>>>>           INIT_LIST_HEAD(&p->validated);
>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>
>>>>>                   e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>>>
>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>>>> +               if (bo->tbo.base.dma_buf &&
>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>>>                           e->chain = dma_fence_chain_alloc();
>>>>>                           if (!e->chain) {
>>>>>                                   r = -ENOMEM;
>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>    {
>>>>>           struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>>>           struct amdgpu_bo_list_entry *e;
>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>           int r;
>>>>>
>>>>>           list_for_each_entry(e, &p->validated, tv.head) {
>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>                   struct dma_resv *resv = bo->tbo.base.resv;
>>>>>                   enum amdgpu_sync_mode sync_mode;
>>>>>
>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>>>                           AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>>>                   r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>>>                                        &fpriv->vm);
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> index c080ba15ae77..f982626b5328 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>>>           return 0;
>>>>>    }
>>>>>
>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>>>> +                         struct drm_file *filp)
>>>>> +{
>>>>> +       struct drm_amdgpu_setparam *setparam = data;
>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>> +
>>>>> +       switch (setparam->param) {
>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>>>> +               if (setparam->value)
>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>>>> +               else
>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>>>> +               break;
>>>>> +       default:
>>>>> +               return -EINVAL;
>>>>> +       }
>>>>> +
>>>>> +       return 0;
>>>>> +}
>>>>> +
>>>>>    const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>    };
>>>>>
>>>>>    static const struct drm_driver amdgpu_kms_driver = {
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> index ddb85a85cbba..0e8c440c6303 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>>>           bool                    bulk_moveable;
>>>>>           /* Flag to indicate if VM is used for compute */
>>>>>           bool                    is_compute_context;
>>>>> +       /*
>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>>>> +        * this context. We do not care about races at all, userspace is allowed
>>>>> +        * to shoot itself with implicit sync to its fullest liking.
>>>>> +        */
>>>>> +       bool no_implicit_sync;
>>>>>    };
>>>>>
>>>>>    struct amdgpu_vm_manager {
>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>>> index 0cbd1540aeac..9eae245c14d6 100644
>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>> @@ -54,6 +54,7 @@ extern "C" {
>>>>>    #define DRM_AMDGPU_VM                  0x13
>>>>>    #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>>>    #define DRM_AMDGPU_SCHED               0x15
>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>>>
>>>>>    #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>>>    #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>>>> @@ -71,6 +72,7 @@ extern "C" {
>>>>>    #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>>>    #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>>>    #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>>>
>>>>>    /**
>>>>>     * DOC: memory domains
>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>>>           struct drm_amdgpu_sched_in in;
>>>>>    };
>>>>>
>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>>>> +
>>>>> +struct drm_amdgpu_setparam {
>>>>> +       /* AMDGPU_SETPARAM_* */
>>>>> +       __u32   param;
>>>>> +       __u32   value;
>>>>> +};
>>>>> +
>>>>>    /*
>>>>>     * This is not a reliable API and you should expect it to fail for any
>>>>>     * number of reasons and have fallback path that do not use userptr to
>>>>> --
>>>>> 2.32.0.rc2
>>>>>
>>>
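As a concrete reference for the uapi under discussion, here is a minimal
userspace sketch of how a driver like radv could flip the proposed
NO_IMPLICIT_SYNC knob on its render fd. It only uses the definitions from the
RFC patch quoted above (DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam,
AMDGPU_SETPARAM_NO_IMPLICIT_SYNC), re-declared locally since they are not in
any released header; treat it as an illustration of the intended usage, not as
final uapi.

#include <linux/types.h>
#include <xf86drm.h>    /* drmIoctl(); pulls in drm.h for DRM_IOW/DRM_COMMAND_BASE */

/* Copied from the RFC patch above; not part of any released amdgpu_drm.h. */
#define DRM_AMDGPU_SETPARAM                 0x16
#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC    1

struct drm_amdgpu_setparam {
        __u32   param;  /* AMDGPU_SETPARAM_* */
        __u32   value;
};

#define DRM_IOCTL_AMDGPU_SETPARAM \
        DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)

/* Opt this drm_file out of implicit sync for all later CS submissions. */
static int amdgpu_set_no_implicit_sync(int drm_fd, int enable)
{
        struct drm_amdgpu_setparam sp = {
                .param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
                .value = enable ? 1 : 0,
        };

        return drmIoctl(drm_fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp);
}

A Vulkan driver would call this once right after opening its render node; an
implicit-sync client (GL, or VA-API on its own fd) simply never does and keeps
the kernel's default behaviour.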

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 13:44             ` [Intel-gfx] " Christian König
@ 2021-06-23 13:49               ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 13:49 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Alex Deucher, mesa-dev,
	Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 3:44 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> > On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> >>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> >>> <bas@basnieuwenhuizen.nl> wrote:
> >>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>>>
> >>>>> Implicit fencing done properly needs to treat the implicit fencing
> >>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>>>> explicitly managed. This is the only way it will mesh well with explicit
> >>>>> fencing userspace like vk, and it's also the bare minimum required to
> >>>>> be able to manage anything else that wants to use the same buffer on
> >>>>> multiple engines in parallel, and still be able to share it through
> >>>>> implicit sync.
> >>>>>
> >>>>> amdgpu completely lacks such an uapi. Fix this.
> >>>>>
> >>>>> Luckily the concept of ignoring implicit fences exists already, and
> >>>>> takes care of all the complexities of making sure that non-optional
> >>>>> fences (like bo moves) are not ignored. This support was added in
> >>>>>
> >>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>>>> Author: Andres Rodriguez <andresx7@gmail.com>
> >>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>>>
> >>>>>       drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>>>
> >>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>>>> disables implicit sync on an allocated buffer completely.
> >>>>>
> >>>>> We _do_ want implicit sync, but control it explicitly. For this we
> >>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>>>> can manage the implicit sync slots explicitly. The other side of the
> >>>>> pipeline (compositor, other process or just different stage in a media
> >>>>> pipeline in the same process) can then either do the same, or fully
> >>>>> participate in the implicit sync as implemented by the kernel by
> >>>>> default.
> >>>>>
> >>>>> By building on the existing flag for buffers we avoid any issues with
> >>>>> opening up additional security concerns - anything this new flag here
> >>>>> allows is already possible.
> >>>>>
> >>>>> All drivers which support this concept of a userspace-specific
> >>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>>>> that turned out to be a bit too inflexible. See the discussion below,
> >>>>> let's try to do a bit better for amdgpu.
> >>>>>
> >>>>> This alone only allows us to completely avoid any stalls due to
> >>>>> implicit sync; it does not yet allow us to use implicit sync as a
> >>>>> strange form of IPC for sync_file.
> >>>>>
> >>>>> For that we need two more pieces:
> >>>>>
> >>>>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>>>     be done in a driver ioctl, but everyone needs this, and generally a
> >>>>>     dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>>>     the dma-buf makes a ton more sense:
> >>>>>
> >>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>>>
> >>>>>     Current drivers in upstream solve this by having the opt-out flag
> >>>>>     on their CS ioctl. This has the downside that very often the CS
> >>>>>     which must actually stall for the implicit fence is run a while
> >>>>>     after the implicit fence point was logically sampled per the api
> >>>>>     spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>>>     results in oversync. Converting the implicit sync fences into a
> >>>>>     snap-shot sync_file is actually accurate.
> >>>>>
> >>>>> - Similarly we need to be able to set the exclusive implicit fence.
> >>>>>     Current drivers again do this with a CS ioctl flag, with again the
> >>>>>     same problem that by the time the CS happens additional dependencies
> >>>>>     have been added. An explicit ioctl to only insert a sync_file (while
> >>>>>     respecting the rules for how exclusive and shared fence slots must
> >>>>>     be updated in struct dma_resv) is much better. This is proposed here:
> >>>>>
> >>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >>>>>
> >>>>> These three pieces together allow userspace to fully control implicit
> >>>>> fencing and remove all unnecessary stall points due to them.
> >>>>>
> >>>>> Well, as much as the implicit fencing model fundamentally allows:
> >>>>> There is only one set of fences, you can only choose to sync against
> >>>>> only writers (exclusive slot), or everyone. Hence suballocating
> >>>>> multiple buffers or anything else like this is fundamentally not
> >>>>> possible, and can only be fixed by a proper explicit fencing model.
> >>>>>
> >>>>> Aside from that caveat this model gets implicit fencing as close to
> >>>>> explicit fencing semantics as possible:
> >>>>>
> >>>>> On the actual implementation I opted for a simple setparam ioctl, no
> >>>>> locking (just atomic reads/writes) for simplicity. There is a nice
> >>>>> flag parameter in the VM ioctl which we could use, except:
> >>>>> - it's not checked, so userspace likely passes garbage
> >>>>> - there's already a comment that userspace _does_ pass garbage in the
> >>>>>     priority field
> >>>>> So yeah unfortunately this flag parameter for setting vm flags is
> >>>>> useless, and we need to hack up a new one.
> >>>>>
> >>>>> v2: Explain why a new SETPARAM (Jason)
> >>>>>
> >>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> >>>>> need both, or this doesn't do much.
> >>>>>
> >>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
> >>>>> fences.
> >>>> So I think there is still a case missing in this implementation.
> >>>> Consider these 3 cases
> >>>>
> >>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
> >>>>
> >>>> explicit->explicit: This doesn't wait now, which is good
> >>>> Implicit->explicit: This doesn't wait now, which is good
> >>>> explicit->implicit : This still waits as the explicit submission still
> >>>> adds shared fences and most things that set an exclusive fence for
> >>>> implicit sync will hence wait on it.
> >>>>
> >>>> This is probably good enough for what radv needs now but also sounds
> >>>> like a risk wrt baking in new uapi behavior that we don't want to be
> >>>> the end result.
> >>>>
> >>>> Within AMDGPU this is probably solvable in two ways:
> >>>>
> >>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> >>> I'm not sure that works. I think the right fix is that radeonsi also
> >>> switches to this model, with maybe a per-bo CS flag to indicate
> >>> write access, to cut down on the number of ioctls that are needed
> >>> otherwise on shared buffers. This per-bo flag would essentially select
> >>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> >> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> >>
> >> Problem with the per context or per vm flag is that you then don't get
> >> any implicit synchronization any more when another process starts using
> >> the buffer.
> > That is exactly what I want for Vulkan :)
>
> Yeah, but as far as I know this is not something we can do.
>
> See we have use cases like screen capture and debug which rely on that
> behavior.

They will keep working, if (and only if) the vulkan side sets the
winsys fences correctly. Also, everything else in vulkan aside from
winsys is explicitly not synced at all; you have to import a drm syncobj
timeline on the gl side.

> The only thing we can do is to say on a per buffer flag that a buffer
> should not participate in implicit sync at all.

Nah, this doesn't work. Because it's not a global decision, it's a local
decision for the renderer. Vulkan wants to control implicit sync
explicitly, and the kernel can't force more synchronization. If a
buffer is shared as a winsys buffer between a vulkan client and a
gl-using compositor, then you _have_ to use implicit sync on it. But vk
needs to set the fences directly (and if the app gets it wrong, you get
misrendering, but that is the specified behaviour of vulkan).
-Daniel
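
To make the WSI side of this concrete: the explicit handling described above
would look roughly like the sketch below in the Vulkan winsys code. It assumes
the dma-buf sync_file export/import ioctls from Jason's series linked in the
commit message (DMA_BUF_IOCTL_EXPORT_SYNC_FILE / DMA_BUF_IOCTL_IMPORT_SYNC_FILE
and the matching structs); those names and flag semantics are taken from that
proposal and could still change, so this is an illustration of the flow,
nothing more.

#include <sys/ioctl.h>
#include <linux/dma-buf.h>      /* assumes headers with the proposed sync_file ioctls */

/*
 * Acquire: snapshot the buffer's current implicit fences as a sync_file
 * before rendering into the winsys image. The resulting fd can be turned
 * into a syncobj/semaphore that the explicit-sync submit waits on.
 */
static int wsi_export_acquire_fence(int dmabuf_fd)
{
        struct dma_buf_export_sync_file args = {
                .flags = DMA_BUF_SYNC_WRITE,    /* everything we must wait for before writing */
                .fd = -1,
        };

        if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &args) < 0)
                return -1;
        return args.fd;
}

/*
 * Release: install the render-complete fence back into the dma-buf so
 * implicit-sync consumers (a GL compositor, screen capture) wait for it.
 */
static int wsi_import_release_fence(int dmabuf_fd, int sync_file_fd)
{
        struct dma_buf_import_sync_file args = {
                .flags = DMA_BUF_SYNC_WRITE,    /* set as the write/exclusive fence */
                .fd = sync_file_fd,
        };

        return ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &args);
}

Everything between those two points stays fully explicit, which is the
"implicit fencing slots as an IPC mailbox" model the commit message argues
for; if the application violates the WSI rules you get misrendering, exactly
as noted above.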

>
> Regards,
> Christian.
>
> >>> The current amdgpu uapi just doesn't allow any other model without an
> >>> explicit opt-in. So current implicit sync userspace just has to
> >>> oversync, there's not much choice.
> >>>
> >>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> >>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
> >>>>
> >>>> But this doesn't solve cross-driver interactions here.
> >>> Yeah cross-driver is still entirely unsolved, because
> >>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> >> Hui? You have lost me. Why is that still unsolved?
> > The part we're trying to solve with this patch is that Vulkan should not
> > participate in any implicit sync at all wrt submissions (and then
> > handle the implicit sync for WSI explicitly using the fence
> > import/export stuff that Jason wrote). As long as we add shared fences to
> > the dma_resv we still participate in implicit sync (at the level of an
> > implicit sync read), at least from the perspective of later jobs
> > waiting on these fences.
> >
> >> Regards,
> >> Christian.
> >>
> >>> -Daniel
> >>>
> >>>>> Cc: mesa-dev@lists.freedesktop.org
> >>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> >>>>> Cc: Dave Airlie <airlied@gmail.com>
> >>>>> Cc: Rob Clark <robdclark@chromium.org>
> >>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> >>>>> Cc: Michel Dänzer <michel@daenzer.net>
> >>>>> Cc: Daniel Stone <daniels@collabora.com>
> >>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>> ---
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >>>>>    include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >>>>>    4 files changed, 42 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> index 65df34c17264..c5386d13eb4a 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>           struct amdgpu_bo *gds;
> >>>>>           struct amdgpu_bo *gws;
> >>>>>           struct amdgpu_bo *oa;
> >>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>           int r;
> >>>>>
> >>>>>           INIT_LIST_HEAD(&p->validated);
> >>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>
> >>>>>                   e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >>>>>
> >>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> >>>>> +               if (bo->tbo.base.dma_buf &&
> >>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >>>>>                           e->chain = dma_fence_chain_alloc();
> >>>>>                           if (!e->chain) {
> >>>>>                                   r = -ENOMEM;
> >>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>    {
> >>>>>           struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >>>>>           struct amdgpu_bo_list_entry *e;
> >>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>           int r;
> >>>>>
> >>>>>           list_for_each_entry(e, &p->validated, tv.head) {
> >>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>                   struct dma_resv *resv = bo->tbo.base.resv;
> >>>>>                   enum amdgpu_sync_mode sync_mode;
> >>>>>
> >>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> >>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >>>>>                           AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >>>>>                   r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >>>>>                                        &fpriv->vm);
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>> index c080ba15ae77..f982626b5328 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >>>>>           return 0;
> >>>>>    }
> >>>>>
> >>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> >>>>> +                         struct drm_file *filp)
> >>>>> +{
> >>>>> +       struct drm_amdgpu_setparam *setparam = data;
> >>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> >>>>> +
> >>>>> +       switch (setparam->param) {
> >>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> >>>>> +               if (setparam->value)
> >>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> >>>>> +               else
> >>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> >>>>> +               break;
> >>>>> +       default:
> >>>>> +               return -EINVAL;
> >>>>> +       }
> >>>>> +
> >>>>> +       return 0;
> >>>>> +}
> >>>>> +
> >>>>>    const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>    };
> >>>>>
> >>>>>    static const struct drm_driver amdgpu_kms_driver = {
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>> index ddb85a85cbba..0e8c440c6303 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >>>>>           bool                    bulk_moveable;
> >>>>>           /* Flag to indicate if VM is used for compute */
> >>>>>           bool                    is_compute_context;
> >>>>> +       /*
> >>>>> +        * Flag to indicate whether implicit sync should always be skipped on
> >>>>> +        * this context. We do not care about races at all, userspace is allowed
> >>>>> +        * to shoot itself with implicit sync to its fullest liking.
> >>>>> +        */
> >>>>> +       bool no_implicit_sync;
> >>>>>    };
> >>>>>
> >>>>>    struct amdgpu_vm_manager {
> >>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>>>> index 0cbd1540aeac..9eae245c14d6 100644
> >>>>> --- a/include/uapi/drm/amdgpu_drm.h
> >>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>>>> @@ -54,6 +54,7 @@ extern "C" {
> >>>>>    #define DRM_AMDGPU_VM                  0x13
> >>>>>    #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >>>>>    #define DRM_AMDGPU_SCHED               0x15
> >>>>> +#define DRM_AMDGPU_SETPARAM            0x16
> >>>>>
> >>>>>    #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>>>    #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>>>> @@ -71,6 +72,7 @@ extern "C" {
> >>>>>    #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>>>    #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>>>    #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >>>>>
> >>>>>    /**
> >>>>>     * DOC: memory domains
> >>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >>>>>           struct drm_amdgpu_sched_in in;
> >>>>>    };
> >>>>>
> >>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> >>>>> +
> >>>>> +struct drm_amdgpu_setparam {
> >>>>> +       /* AMDGPU_SETPARAM_* */
> >>>>> +       __u32   param;
> >>>>> +       __u32   value;
> >>>>> +};
> >>>>> +
> >>>>>    /*
> >>>>>     * This is not a reliable API and you should expect it to fail for any
> >>>>>     * number of reasons and have fallback path that do not use userptr to
> >>>>> --
> >>>>> 2.32.0.rc2
> >>>>>
> >>>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 13:49               ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 13:49 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Bas Nieuwenhuizen, Alex Deucher, mesa-dev, Dave Airlie,
	Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 3:44 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> > On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> >>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> >>> <bas@basnieuwenhuizen.nl> wrote:
> >>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>>>
> >>>>> Implicit fencing done properly needs to treat the implicit fencing
> >>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>>>> explicitly managed. This is the only way it will mesh well with explicit
> >>>>> fencing userspace like vk, and it's also the bare minimum required to
> >>>>> be able to manage anything else that wants to use the same buffer on
> >>>>> multiple engines in parallel, and still be able to share it through
> >>>>> implicit sync.
> >>>>>
> >>>>> amdgpu completely lacks such an uapi. Fix this.
> >>>>>
> >>>>> Luckily the concept of ignoring implicit fences exists already, and
> >>>>> takes care of all the complexities of making sure that non-optional
> >>>>> fences (like bo moves) are not ignored. This support was added in
> >>>>>
> >>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>>>> Author: Andres Rodriguez <andresx7@gmail.com>
> >>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>>>
> >>>>>       drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>>>
> >>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>>>> disables implicit sync on an allocated buffer completely.
> >>>>>
> >>>>> We _do_ want implicit sync, but control it explicitly. For this we
> >>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>>>> can manage the implicit sync slots explicitly. The other side of the
> >>>>> pipeline (compositor, other process or just different stage in a media
> >>>>> pipeline in the same process) can then either do the same, or fully
> >>>>> participate in the implicit sync as implemented by the kernel by
> >>>>> default.
> >>>>>
> >>>>> By building on the existing flag for buffers we avoid any issues with
> >>>>> opening up additional security concerns - anything this new flag here
> >>>>> allows is already possible.
> >>>>>
> >>>>> All drivers which support this concept of a userspace-specific
> >>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>>>> that turned out to be a bit too inflexible. See the discussion below,
> >>>>> let's try to do a bit better for amdgpu.
> >>>>>
> >>>>> This alone only allows us to completely avoid any stalls due to
> >>>>> implicit sync; it does not yet allow us to use implicit sync as a
> >>>>> strange form of IPC for sync_file.
> >>>>>
> >>>>> For that we need two more pieces:
> >>>>>
> >>>>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>>>     be done in a driver ioctl, but everyone needs this, and generally a
> >>>>>     dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>>>     the dma-buf makes a ton more sense:
> >>>>>
> >>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>>>
> >>>>>     Current drivers in upstream solve this by having the opt-out flag
> >>>>>     on their CS ioctl. This has the downside that very often the CS
> >>>>>     which must actually stall for the implicit fence is run a while
> >>>>>     after the implicit fence point was logically sampled per the api
> >>>>>     spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>>>     results in oversync. Converting the implicit sync fences into a
> >>>>>     snap-shot sync_file is actually accurate.
> >>>>>
> >>>>> - Similarly we need to be able to set the exclusive implicit fence.
> >>>>>     Current drivers again do this with a CS ioctl flag, with again the
> >>>>>     same problem that by the time the CS happens additional dependencies
> >>>>>     have been added. An explicit ioctl to only insert a sync_file (while
> >>>>>     respecting the rules for how exclusive and shared fence slots must
> >>>>>     be updated in struct dma_resv) is much better. This is proposed here:
> >>>>>
> >>>>>     https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >>>>>
> >>>>> These three pieces together allow userspace to fully control implicit
> >>>>> fencing and remove all unnecessary stall points due to them.
> >>>>>
> >>>>> Well, as much as the implicit fencing model fundamentally allows:
> >>>>> There is only one set of fences, you can only choose to sync against
> >>>>> only writers (exclusive slot), or everyone. Hence suballocating
> >>>>> multiple buffers or anything else like this is fundamentally not
> >>>>> possible, and can only be fixed by a proper explicit fencing model.
> >>>>>
> >>>>> Aside from that caveat this model gets implicit fencing as close to
> >>>>> explicit fencing semantics as possible:
> >>>>>
> >>>>> On the actual implementation I opted for a simple setparam ioctl, no
> >>>>> locking (just atomic reads/writes) for simplicity. There is a nice
> >>>>> flag parameter in the VM ioctl which we could use, except:
> >>>>> - it's not checked, so userspace likely passes garbage
> >>>>> - there's already a comment that userspace _does_ pass garbage in the
> >>>>>     priority field
> >>>>> So yeah unfortunately this flag parameter for setting vm flags is
> >>>>> useless, and we need to hack up a new one.
> >>>>>
> >>>>> v2: Explain why a new SETPARAM (Jason)
> >>>>>
> >>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> >>>>> need both, or this doesn't do much.
> >>>>>
> >>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
> >>>>> fences.
> >>>> So I think there is still a case missing in this implementation.
> >>>> Consider these 3 cases
> >>>>
> >>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
> >>>>
> >>>> explicit->explicit: This doesn't wait now, which is good
> >>>> Implicit->explicit: This doesn't wait now, which is good
> >>>> explicit->implicit : This still waits as the explicit submission still
> >>>> adds shared fences and most things that set an exclusive fence for
> >>>> implicit sync will hence wait on it.
> >>>>
> >>>> This is probably good enough for what radv needs now but also sounds
> >>>> like a risk wrt baking in new uapi behavior that we don't want to be
> >>>> the end result.
> >>>>
> >>>> Within AMDGPU this is probably solvable in two ways:
> >>>>
> >>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> >>> I'm not sure that works. I think the right fix is that radeonsi also
> >>> switches to this model, with maybe a per-bo CS flag to indicate
> >>> write access, to cut down on the number of ioctls that are needed
> >>> otherwise on shared buffers. This per-bo flag would essentially select
> >>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> >> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> >>
> >> Problem with the per context or per vm flag is that you then don't get
> >> any implicit synchronization any more when another process starts using
> >> the buffer.
> > That is exactly what I want for Vulkan :)
>
> Yeah, but as far as I know this is not something we can do.
>
> See we have use cases like screen capture and debug which rely on that
> behavior.

They will keep working, if (and only if) the vulkan side sets the
winsys fences correctly. Also, everything else in vulkan aside from
winsys is explicitly not synced at all; you have to import a drm syncobj
timeline on the gl side.

> The only thing we can do is to say on a per buffer flag that a buffer
> should not participate in implicit sync at all.

Nah, this doesn't work. Because it's not a global decision, it's a local
decision for the renderer. Vulkan wants to control implicit sync
explicitly, and the kernel can't force more synchronization. If a
buffer is shared as a winsys buffer between a vulkan client and a
gl-using compositor, then you _have_ to use implicit sync on it. But vk
needs to set the fences directly (and if the app gets it wrong, you get
misrendering, but that is the specified behaviour of vulkan).
-Daniel

>
> Regards,
> Christian.
>
> >>> The current amdgpu uapi just doesn't allow any other model without an
> >>> explicit opt-in. So current implicit sync userspace just has to
> >>> oversync, there's not much choice.
> >>>
> >>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> >>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
> >>>>
> >>>> But this doesn't solve cross-driver interactions here.
> >>> Yeah cross-driver is still entirely unsolved, because
> >>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> >> Hui? You have lost me. Why is that still unsolved?
> > The part we're trying to solve with this patch is that Vulkan should not
> > participate in any implicit sync at all wrt submissions (and then
> > handle the implicit sync for WSI explicitly using the fence
> > import/export stuff that Jason wrote). As long as we add shared fences to
> > the dma_resv we still participate in implicit sync (at the level of an
> > implicit sync read), at least from the perspective of later jobs
> > waiting on these fences.
> >
> >> Regards,
> >> Christian.
> >>
> >>> -Daniel
> >>>
> >>>>> Cc: mesa-dev@lists.freedesktop.org
> >>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> >>>>> Cc: Dave Airlie <airlied@gmail.com>
> >>>>> Cc: Rob Clark <robdclark@chromium.org>
> >>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> >>>>> Cc: Michel Dänzer <michel@daenzer.net>
> >>>>> Cc: Daniel Stone <daniels@collabora.com>
> >>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>> ---
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >>>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >>>>>    include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >>>>>    4 files changed, 42 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> index 65df34c17264..c5386d13eb4a 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>           struct amdgpu_bo *gds;
> >>>>>           struct amdgpu_bo *gws;
> >>>>>           struct amdgpu_bo *oa;
> >>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>           int r;
> >>>>>
> >>>>>           INIT_LIST_HEAD(&p->validated);
> >>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>
> >>>>>                   e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >>>>>
> >>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> >>>>> +               if (bo->tbo.base.dma_buf &&
> >>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >>>>>                           e->chain = dma_fence_chain_alloc();
> >>>>>                           if (!e->chain) {
> >>>>>                                   r = -ENOMEM;
> >>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>    {
> >>>>>           struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >>>>>           struct amdgpu_bo_list_entry *e;
> >>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>           int r;
> >>>>>
> >>>>>           list_for_each_entry(e, &p->validated, tv.head) {
> >>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>                   struct dma_resv *resv = bo->tbo.base.resv;
> >>>>>                   enum amdgpu_sync_mode sync_mode;
> >>>>>
> >>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> >>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >>>>>                           AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >>>>>                   r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >>>>>                                        &fpriv->vm);
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>> index c080ba15ae77..f982626b5328 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >>>>>           return 0;
> >>>>>    }
> >>>>>
> >>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> >>>>> +                         struct drm_file *filp)
> >>>>> +{
> >>>>> +       struct drm_amdgpu_setparam *setparam = data;
> >>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> >>>>> +
> >>>>> +       switch (setparam->param) {
> >>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> >>>>> +               if (setparam->value)
> >>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> >>>>> +               else
> >>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> >>>>> +               break;
> >>>>> +       default:
> >>>>> +               return -EINVAL;
> >>>>> +       }
> >>>>> +
> >>>>> +       return 0;
> >>>>> +}
> >>>>> +
> >>>>>    const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>           DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>    };
> >>>>>
> >>>>>    static const struct drm_driver amdgpu_kms_driver = {
> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>> index ddb85a85cbba..0e8c440c6303 100644
> >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >>>>>           bool                    bulk_moveable;
> >>>>>           /* Flag to indicate if VM is used for compute */
> >>>>>           bool                    is_compute_context;
> >>>>> +       /*
> >>>>> +        * Flag to indicate whether implicit sync should always be skipped on
> >>>>> +        * this context. We do not care about races at all, userspace is allowed
> >>>>> +        * to shoot itself with implicit sync to its fullest liking.
> >>>>> +        */
> >>>>> +       bool no_implicit_sync;
> >>>>>    };
> >>>>>
> >>>>>    struct amdgpu_vm_manager {
> >>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>>>> index 0cbd1540aeac..9eae245c14d6 100644
> >>>>> --- a/include/uapi/drm/amdgpu_drm.h
> >>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>>>> @@ -54,6 +54,7 @@ extern "C" {
> >>>>>    #define DRM_AMDGPU_VM                  0x13
> >>>>>    #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >>>>>    #define DRM_AMDGPU_SCHED               0x15
> >>>>> +#define DRM_AMDGPU_SETPARAM            0x16
> >>>>>
> >>>>>    #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>>>    #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>>>> @@ -71,6 +72,7 @@ extern "C" {
> >>>>>    #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>>>    #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>>>    #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >>>>>
> >>>>>    /**
> >>>>>     * DOC: memory domains
> >>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >>>>>           struct drm_amdgpu_sched_in in;
> >>>>>    };
> >>>>>
> >>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> >>>>> +
> >>>>> +struct drm_amdgpu_setparam {
> >>>>> +       /* AMDGPU_SETPARAM_* */
> >>>>> +       __u32   param;
> >>>>> +       __u32   value;
> >>>>> +};
> >>>>> +
> >>>>>    /*
> >>>>>     * This is not a reliable API and you should expect it to fail for any
> >>>>>     * number of reasons and have fallback path that do not use userptr to
> >>>>> --
> >>>>> 2.32.0.rc2
> >>>>>
> >>>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 13:49               ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 14:02                 ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 14:02 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Alex Deucher, mesa-dev,
	Michel Dänzer, Dennis Li, Deepak R Varma

Am 23.06.21 um 15:49 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 3:44 PM Christian König
> <christian.koenig@amd.com> wrote:
>> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
>>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
>>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
>>>>> <bas@basnieuwenhuizen.nl> wrote:
>>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>>>>>
>>>>>>> Implicit fencing done properly needs to treat the implicit fencing
>>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>>>>>> explicitly managed. This is the only way it will mesh well with explicit
>>>>>>> fencing userspace like vk, and it's also the bare minimum required to
>>>>>>> be able to manage anything else that wants to use the same buffer on
>>>>>>> multiple engines in parallel, and still be able to share it through
>>>>>>> implicit sync.
>>>>>>>
>>>>>>> amdgpu completely lacks such an uapi. Fix this.
>>>>>>>
>>>>>>> Luckily the concept of ignoring implicit fences exists already, and
>>>>>>> takes care of all the complexities of making sure that non-optional
>>>>>>> fences (like bo moves) are not ignored. This support was added in
>>>>>>>
>>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>>>>>
>>>>>>>        drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>>>>>
>>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>>>>>> disables implicit sync on an allocated buffer completely.
>>>>>>>
>>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
>>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>>>>>> can manage the implicit sync slots explicitly. The other side of the
>>>>>>> pipeline (compositor, other process or just different stage in a media
>>>>>>> pipeline in the same process) can then either do the same, or fully
>>>>>>> participate in the implicit sync as implemented by the kernel by
>>>>>>> default.
>>>>>>>
>>>>>>> By building on the existing flag for buffers we avoid any issues with
>>>>>>> opening up additional security concerns - anything this new flag here
>>>>>>> allows is already possible.
>>>>>>>
>>>>>>> All drivers which support this concept of a userspace-specific
>>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>>>>>> that turned out to be a bit too inflexible. See the discussion below,
>>>>>>> let's try to do a bit better for amdgpu.
>>>>>>>
>>>>>>> This alone only allows us to completely avoid any stalls due to
>>>>>>> implicit sync; it does not yet allow us to use implicit sync as a
>>>>>>> strange form of IPC for sync_file.
>>>>>>>
>>>>>>> For that we need two more pieces:
>>>>>>>
>>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>>>>>      be done in a driver ioctl, but everyone needs this, and generally a
>>>>>>>      dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>>>>>      the dma-buf makes a ton more sense:
>>>>>>>
>>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>>>>>
>>>>>>>      Current drivers in upstream solve this by having the opt-out flag
>>>>>>>      on their CS ioctl. This has the downside that very often the CS
>>>>>>>      which must actually stall for the implicit fence is run a while
>>>>>>>      after the implicit fence point was logically sampled per the api
>>>>>>>      spec (vk passes an explicit syncobj around for that afaiui), and so
>>>>>>>      results in oversync. Converting the implicit sync fences into a
>>>>>>>      snap-shot sync_file is actually accurate.
>>>>>>>
>>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
>>>>>>>      Current drivers again do this with a CS ioctl flag, with again the
>>>>>>>      same problem that by the time the CS happens additional dependencies
>>>>>>>      have been added. An explicit ioctl to only insert a sync_file (while
>>>>>>>      respecting the rules for how exclusive and shared fence slots must
>>>>>>>      be updated in struct dma_resv) is much better. This is proposed here:
>>>>>>>
>>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>>>>>
>>>>>>> These three pieces together allow userspace to fully control implicit
>>>>>>> fencing and remove all unnecessary stall points due to them.
>>>>>>>
>>>>>>> Well, as much as the implicit fencing model fundamentally allows:
>>>>>>> There is only one set of fences, you can only choose to sync against
>>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
>>>>>>> multiple buffers or anything else like this is fundamentally not
>>>>>>> possible, and can only be fixed by a proper explicit fencing model.
>>>>>>>
>>>>>>> Aside from that caveat this model gets implicit fencing as close to
>>>>>>> explicit fencing semantics as possible:
>>>>>>>
>>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
>>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>>>>>> flag parameter in the VM ioctl which we could use, except:
>>>>>>> - it's not checked, so userspace likely passes garbage
>>>>>>> - there's already a comment that userspace _does_ pass garbage in the
>>>>>>>      priority field
>>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
>>>>>>> useless, and we need to hack up a new one.
>>>>>>>
>>>>>>> v2: Explain why a new SETPARAM (Jason)
>>>>>>>
>>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>>>>>> need both, or this doesn't do much.
>>>>>>>
>>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>>>>>> fences.
>>>>>> So I think there is still a case missing in this implementation.
>>>>>> Consider these 3 cases
>>>>>>
>>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>>>>>
>>>>>> explicit->explicit: This doesn't wait now, which is good
>>>>>> Implicit->explicit: This doesn't wait now, which is good
>>>>>> explicit->implicit : This still waits as the explicit submission still
>>>>>> adds shared fences and most things that set an exclusive fence for
>>>>>> implicit sync will hence wait on it.
>>>>>>
>>>>>> This is probably good enough for what radv needs now but also sounds
>>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
>>>>>> the end result.
>>>>>>
>>>>>> Within AMDGPU this is probably solvable in two ways:
>>>>>>
>>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
>>>>> I'm not sure that works. I think the right fix is that radeonsi also
>>>>> switches to this model, with maybe a per-bo CS flag to indicate
>>>>> write access, to cut down on the number of ioctls that are needed
>>>>> otherwise on shared buffers. This per-bo flag would essentially select
>>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>>>>
>>>> Problem with the per context or per vm flag is that you then don't get
>>>> any implicit synchronization any more when another process starts using
>>>> the buffer.
>>> That is exactly what I want for Vulkan :)
>> Yeah, but as far as I know this is not something we can do.
>>
>> See we have use cases like screen capture and debug which rely on that
>> behavior.
> They will keep working, if (and only if) the vulkan side sets the
> winsys fences correctly. Also, everything else in vulkan aside from
> winsys is explicitly not synced at all; you have to import a drm syncobj
> timeline on the gl side.
>
>> The only thing we can do is to say on a per buffer flag that a buffer
>> should not participate in implicit sync at all.
> Nah, this doesn't work. Because it's not a global decision, it's a local
> decision for the renderer. Vulkan wants to control implicit sync
> explicitly, and the kernel can't force more synchronization. If a
> buffer is shared as a winsys buffer between a vulkan client and a
> gl-using compositor, then you _have_ to use implicit sync on it. But vk
> needs to set the fences directly (and if the app gets it wrong, you get
> misrendering, but that is the specified behaviour of vulkan).

Yeah, but that's exactly what we tried to avoid.

Mhm, when we attach the flag to the process/VM, this would break the
use case of VA-API and Vulkan in the same process.

But I think if you attach the flag to the context, that should indeed
work fine.

Christian.
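
Purely to illustrate that per-context variant, and explicitly not part of
this series: the same knob could hang off amdgpu_ctx instead of the drm_file,
along the lines of the kernel-side sketch below. Every name and value in it
is made up for illustration.

/* Hypothetical kernel-side sketch, all names invented for illustration. */

#define AMDGPU_CTX_OP_SET_IMPLICIT_SYNC 5       /* made-up op for the existing CTX ioctl */

struct amdgpu_ctx_sketch {
        /* ...existing amdgpu_ctx members elided... */
        bool    no_implicit_sync;       /* per submission context, not per drm_file/vm */
};

/* Would be reached from the DRM_IOCTL_AMDGPU_CTX handler for the new op. */
static void amdgpu_ctx_set_no_implicit_sync(struct amdgpu_ctx_sketch *ctx,
                                            bool value)
{
        /* Same "no locking, userspace owns any races" policy as the RFC. */
        WRITE_ONCE(ctx->no_implicit_sync, value);
}

The CS path would then test the context's flag instead of
fpriv->vm.no_implicit_sync, which keeps VA-API and Vulkan contexts in one
process independent of each other.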

> -Daniel
>
>> Regards,
>> Christian.
>>
>>>>> The current amdgpu uapi just doesn't allow any other model without an
>>>>> explicit opt-in. So current implicit sync userspace just has to
>>>>> oversync, there's not much choice.
>>>>>
>>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>>>>>
>>>>>> But this doesn't solve cross-driver interactions here.
>>>>> Yeah cross-driver is still entirely unsolved, because
>>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>>>> Hui? You have lost me. Why is that still unsolved?
>>> The part we're trying to solve with this patch is that Vulkan should not
>>> participate in any implicit sync at all wrt submissions (and then
>>> handle the implicit sync for WSI explicitly using the fence
>>> import/export stuff that Jason wrote). As long as we add shared fences to
>>> the dma_resv we still participate in implicit sync (at the level of an
>>> implicit sync read), at least from the perspective of later jobs
>>> waiting on these fences.
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> -Daniel
>>>>>
>>>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>>>> Cc: Rob Clark <robdclark@chromium.org>
>>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>>>> Cc: Daniel Stone <daniels@collabora.com>
>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>> ---
>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>>>>>     include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>>>>>     4 files changed, 42 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> index 65df34c17264..c5386d13eb4a 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>            struct amdgpu_bo *gds;
>>>>>>>            struct amdgpu_bo *gws;
>>>>>>>            struct amdgpu_bo *oa;
>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>            int r;
>>>>>>>
>>>>>>>            INIT_LIST_HEAD(&p->validated);
>>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>
>>>>>>>                    e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>>>>>
>>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>>>>>> +               if (bo->tbo.base.dma_buf &&
>>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>>>>>                            e->chain = dma_fence_chain_alloc();
>>>>>>>                            if (!e->chain) {
>>>>>>>                                    r = -ENOMEM;
>>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>     {
>>>>>>>            struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>>>>>            struct amdgpu_bo_list_entry *e;
>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>            int r;
>>>>>>>
>>>>>>>            list_for_each_entry(e, &p->validated, tv.head) {
>>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>                    struct dma_resv *resv = bo->tbo.base.resv;
>>>>>>>                    enum amdgpu_sync_mode sync_mode;
>>>>>>>
>>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>>>>>                            AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>>>>>                    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>>>>>                                         &fpriv->vm);
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> index c080ba15ae77..f982626b5328 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>>>>>            return 0;
>>>>>>>     }
>>>>>>>
>>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>>>>>> +                         struct drm_file *filp)
>>>>>>> +{
>>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
>>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>>>> +
>>>>>>> +       switch (setparam->param) {
>>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>>>>>> +               if (setparam->value)
>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>>>>>> +               else
>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>>>>>> +               break;
>>>>>>> +       default:
>>>>>>> +               return -EINVAL;
>>>>>>> +       }
>>>>>>> +
>>>>>>> +       return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>>     const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>     };
>>>>>>>
>>>>>>>     static const struct drm_driver amdgpu_kms_driver = {
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>> index ddb85a85cbba..0e8c440c6303 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>>>>>            bool                    bulk_moveable;
>>>>>>>            /* Flag to indicate if VM is used for compute */
>>>>>>>            bool                    is_compute_context;
>>>>>>> +       /*
>>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>>>>>> +        * this context. We do not care about races at all, userspace is allowed
>>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
>>>>>>> +        */
>>>>>>> +       bool no_implicit_sync;
>>>>>>>     };
>>>>>>>
>>>>>>>     struct amdgpu_vm_manager {
>>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
>>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>>>> @@ -54,6 +54,7 @@ extern "C" {
>>>>>>>     #define DRM_AMDGPU_VM                  0x13
>>>>>>>     #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>>>>>     #define DRM_AMDGPU_SCHED               0x15
>>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>>>>>
>>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>>>>>> @@ -71,6 +72,7 @@ extern "C" {
>>>>>>>     #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>>>>>     #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>>>>>     #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>>>>>
>>>>>>>     /**
>>>>>>>      * DOC: memory domains
>>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>>>>>            struct drm_amdgpu_sched_in in;
>>>>>>>     };
>>>>>>>
>>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>>>>>> +
>>>>>>> +struct drm_amdgpu_setparam {
>>>>>>> +       /* AMDGPU_SETPARAM_* */
>>>>>>> +       __u32   param;
>>>>>>> +       __u32   value;
>>>>>>> +};
>>>>>>> +
>>>>>>>     /*
>>>>>>>      * This is not a reliable API and you should expect it to fail for any
>>>>>>>      * number of reasons and have fallback path that do not use userptr to
>>>>>>> --
>>>>>>> 2.32.0.rc2
>>>>>>>
>


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 14:02                 ` [Intel-gfx] " Christian König
@ 2021-06-23 14:50                   ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 14:50 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Alex Deucher, mesa-dev,
	Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 4:02 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 23.06.21 um 15:49 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> >>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> >>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> >>>>> <bas@basnieuwenhuizen.nl> wrote:
> >>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>>>>>
> >>>>>>> Implicit fencing done properly needs to treat the implicit fencing
> >>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>>>>>> managed explicitly. This is the only way it will mesh well with explicit
> >>>>>>> fencing userspace like vk, and it's also the bare minimum required to
> >>>>>>> be able to manage anything else that wants to use the same buffer on
> >>>>>>> multiple engines in parallel, and still be able to share it through
> >>>>>>> implicit sync.
> >>>>>>>
> >>>>>>> amdgpu completely lacks such an uapi. Fix this.
> >>>>>>>
> >>>>>>> Luckily the concept of ignoring implicit fences exists already, and
> >>>>>>> takes care of all the complexities of making sure that non-optional
> >>>>>>> fences (like bo moves) are not ignored. This support was added in
> >>>>>>>
> >>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
> >>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>>>>>
> >>>>>>>        drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>>>>>
> >>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>>>>>> disables implicit sync on an allocated buffer completely.
> >>>>>>>
> >>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
> >>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>>>>>> can manage the implicit sync slots explicitly. The other side of the
> >>>>>>> pipeline (compositor, other process or just different stage in a media
> >>>>>>> pipeline in the same process) can then either do the same, or fully
> >>>>>>> participate in the implicit sync as implemented by the kernel by
> >>>>>>> default.
> >>>>>>>
> >>>>>>> By building on the existing flag for buffers we avoid any issues with
> >>>>>>> opening up additional security concerns - anything this new flag here
> >>>>>>> allows is already allowed.
> >>>>>>>
> >>>>>>> All drivers which support this concept of a userspace-specific
> >>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>>>>>> that turned out to be a bit too inflexible. See the discussion below,
> >>>>>>> let's try to do a bit better for amdgpu.
> >>>>>>>
> >>>>>>> This alone only allows us to completely avoid any stalls due to
> >>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
> >>>>>>> strange form of IPC for sync_file.
> >>>>>>>
> >>>>>>> For that we need two more pieces:
> >>>>>>>
> >>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>>>>>      be done in a driver ioctl, but everyone needs this, and generally a
> >>>>>>>      dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>>>>>      the dma-buf makes a ton more sense:
> >>>>>>>
> >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>>>>>
> >>>>>>>      Current drivers in upstream solve this by having the opt-out flag
> >>>>>>>      on their CS ioctl. This has the downside that very often the CS
> >>>>>>>      which must actually stall for the implicit fence is run a while
> >>>>>>>      after the implicit fence point was logically sampled per the api
> >>>>>>>      spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>>>>>      results in oversync. Converting the implicit sync fences into a
> >>>>>>>      snap-shot sync_file is actually accurate.
> >>>>>>>
> >>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
> >>>>>>>      Current drivers again do this with a CS ioctl flag, with again the
> >>>>>>>      same problems that the time the CS happens additional dependencies
> >>>>>>>      have been added. An explicit ioctl to only insert a sync_file (while
> >>>>>>>      respecting the rules for how exclusive and shared fence slots must
> >>>>>>>      be updated in struct dma_resv) is much better. This is proposed here:
> >>>>>>>
> >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >>>>>>>
> >>>>>>> These three pieces together allow userspace to fully control implicit
> >>>>>>> fencing and remove all unnecessary stall points due to them.
> >>>>>>>
> >>>>>>> Well, as much as the implicit fencing model fundamentally allows:
> >>>>>>> There is only one set of fences, you can only choose to sync against
> >>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
> >>>>>>> multiple buffers or anything else like this is fundamentally not
> >>>>>>> possible, and can only be fixed by a proper explicit fencing model.
> >>>>>>>
> >>>>>>> Aside from that caveat this model gets implicit fencing as close to
> >>>>>>> explicit fencing semantics as possible:
> >>>>>>>
> >>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
> >>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
> >>>>>>> flag parameter in the VM ioctl which we could use, except:
> >>>>>>> - it's not checked, so userspace likely passes garbage
> >>>>>>> - there's already a comment that userspace _does_ pass garbage in the
> >>>>>>>      priority field
> >>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
> >>>>>>> useless, and we need to hack up a new one.
> >>>>>>>
> >>>>>>> v2: Explain why a new SETPARAM (Jason)
> >>>>>>>
> >>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> >>>>>>> need both, or this doesn't do much.
> >>>>>>>
> >>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
> >>>>>>> fences.
> >>>>>> So I think there is still a case missing in this implementation.
> >>>>>> Consider these 3 cases
> >>>>>>
> >>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
> >>>>>>
> >>>>>> explicit->explicit: This doesn't wait now, which is good
> >>>>>> Implicit->explicit: This doesn't wait now, which is good
> >>>>>> explicit->implicit : This still waits as the explicit submission still
> >>>>>> adds shared fences and most things that set an exclusive fence for
> >>>>>> implicit sync will hence wait on it.
> >>>>>>
> >>>>>> This is probably good enough for what radv needs now but also sounds
> >>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
> >>>>>> the end result.
> >>>>>>
> >>>>>> Within AMDGPU this is probably solvable in two ways:
> >>>>>>
> >>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> >>>>> I'm not sure that works. I think the right fix is that radeonsi also
> >>>>> switches to this model, with maybe a per-bo CS flag to indicate
> >>>>> write access, to cut down on the number of ioctls that are needed
> >>>>> otherwise on shared buffers. This per-bo flag would essentially select
> >>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> >>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> >>>>
> >>>> Problem with the per context or per vm flag is that you then don't get
> >>>> any implicit synchronization any more when another process starts using
> >>>> the buffer.
> >>> That is exactly what I want for Vulkan :)
> >> Yeah, but as far as I know this is not something we can do.
> >>
> >> See we have use cases like screen capture and debug which rely on that
> >> behavior.
> > They will keep working, if (and only if) the vulkan side sets the
> > winsys fences correctly. Also, everything else in vulkan aside from
> > winsys is explicitly not synced at all, you have to import drm syncobj
> > timeline on the gl side.
> >
> >> The only thing we can do is to say on a per buffer flag that a buffer
> >> should not participate in implicit sync at all.
> > Nah, this doesn't work. Because it's not a global decision, it's a local
> > decision for the renderer. Vulkan wants to control implicit sync
> > explicitly, and the kernel can't force more synchronization. If a
> > buffer is shared as a winsys buffer between a vulkan client and a
> > gl-using compositor, then you _have_ to use implicit sync on it. But vk
> > needs to set the fences directly (and if the app gets it wrong, you get
> > misrendering, but that is the specified behaviour of vulkan).
>
> Yeah, but that's exactly what we tried to avoid.
>
> Mhm, when we attach the flag to the process/VM then this would break the
> use case of VA-API and Vulkan in the same process.
>
> But I think if you attach the flag to the context that should indeed
> work fine.

Yeah that's a question I have, whether the drm_file is shared within
one process among everything, or whether radeonsi/libva/vk each have
their own. If each has its own drm_file, then we should be fine,
otherwise we need to figure out another place to put this (worst case
as a CS extension that vk just sets on every submit).

Also, yes, this risks that a vk app which was violating the winsys
spec will now break, which is why I think we should do this sooner
than later. Otherwise the list of w/a we might need to apply in vk
userspace will become very long :-( At least since this is purely
opt-in from userspace, we only need to have the w/a list in userspace,
where mesa has the infrastructure for that already.
-Daniel
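
[Note: for reference, a minimal userspace sketch of the opt-in discussed
above, assuming the setparam uapi lands exactly as in the quoted RFC; the
helper name is made up and the defines simply restate the RFC values,
since no released amdgpu_drm.h carries them.]

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>

    #define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC        1

    struct drm_amdgpu_setparam {
            uint32_t param;         /* AMDGPU_SETPARAM_* */
            uint32_t value;
    };

    /* DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, ...) from the RFC:
     * 'd' is DRM_IOCTL_BASE, 0x40 + 0x16 = 0x56.
     */
    #define DRM_IOCTL_AMDGPU_SETPARAM \
            _IOW('d', 0x56, struct drm_amdgpu_setparam)

    /* Call once on the render node fd, before the first CS ioctl. */
    static int vk_disable_implicit_sync(int fd)
    {
            struct drm_amdgpu_setparam sp = {
                    .param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
                    .value = 1,
            };

            return ioctl(fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp);
    }

Per the quoted diff this makes the CS path sync with AMDGPU_SYNC_EXPLICIT
and stop setting the exclusive fence on shared dma-bufs, so WSI
synchronization then has to go through the dma-buf sync_file
export/import ioctls referenced in the commit message.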

>
> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>>>> The current amdgpu uapi just doesn't allow any other model without an
> >>>>> explicit opt-in. So current implicit sync userspace just has to
> >>>>> oversync, there's not much choice.
> >>>>>
> >>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> >>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
> >>>>>>
> >>>>>> But this doesn't solve cross-driver interactions here.
> >>>>> Yeah cross-driver is still entirely unsolved, because
> >>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> >>>> Hui? You have lost me. Why is that still unsolved?
> >>> The part we're trying to solve with this patch is that Vulkan should
> >>> not participate in any implicit sync at all wrt submissions (and then
> >>> handle the implicit sync for WSI explicitly using the fence
> >>> import/export stuff that Jason wrote). As long as we add shared fences
> >>> to the dma_resv, we participate in implicit sync (at the level of an
> >>> implicit sync read) still, at least from the perspective of later jobs
> >>> waiting on these fences.
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> -Daniel
> >>>>>
> >>>>>>> Cc: mesa-dev@lists.freedesktop.org
> >>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> >>>>>>> Cc: Dave Airlie <airlied@gmail.com>
> >>>>>>> Cc: Rob Clark <robdclark@chromium.org>
> >>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> >>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
> >>>>>>> Cc: Daniel Stone <daniels@collabora.com>
> >>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>> ---
> >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >>>>>>>     include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >>>>>>>     4 files changed, 42 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> index 65df34c17264..c5386d13eb4a 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>>>            struct amdgpu_bo *gds;
> >>>>>>>            struct amdgpu_bo *gws;
> >>>>>>>            struct amdgpu_bo *oa;
> >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>>>            int r;
> >>>>>>>
> >>>>>>>            INIT_LIST_HEAD(&p->validated);
> >>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>>>
> >>>>>>>                    e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >>>>>>>
> >>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> >>>>>>> +               if (bo->tbo.base.dma_buf &&
> >>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >>>>>>>                            e->chain = dma_fence_chain_alloc();
> >>>>>>>                            if (!e->chain) {
> >>>>>>>                                    r = -ENOMEM;
> >>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>>>     {
> >>>>>>>            struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >>>>>>>            struct amdgpu_bo_list_entry *e;
> >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>>>            int r;
> >>>>>>>
> >>>>>>>            list_for_each_entry(e, &p->validated, tv.head) {
> >>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>>>                    struct dma_resv *resv = bo->tbo.base.resv;
> >>>>>>>                    enum amdgpu_sync_mode sync_mode;
> >>>>>>>
> >>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> >>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >>>>>>>                            AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >>>>>>>                    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >>>>>>>                                         &fpriv->vm);
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>> index c080ba15ae77..f982626b5328 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >>>>>>>            return 0;
> >>>>>>>     }
> >>>>>>>
> >>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> >>>>>>> +                         struct drm_file *filp)
> >>>>>>> +{
> >>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
> >>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> >>>>>>> +
> >>>>>>> +       switch (setparam->param) {
> >>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> >>>>>>> +               if (setparam->value)
> >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> >>>>>>> +               else
> >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> >>>>>>> +               break;
> >>>>>>> +       default:
> >>>>>>> +               return -EINVAL;
> >>>>>>> +       }
> >>>>>>> +
> >>>>>>> +       return 0;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>     const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>     };
> >>>>>>>
> >>>>>>>     static const struct drm_driver amdgpu_kms_driver = {
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>> index ddb85a85cbba..0e8c440c6303 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >>>>>>>            bool                    bulk_moveable;
> >>>>>>>            /* Flag to indicate if VM is used for compute */
> >>>>>>>            bool                    is_compute_context;
> >>>>>>> +       /*
> >>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
> >>>>>>> +        * this context. We do not care about races at all, userspace is allowed
> >>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
> >>>>>>> +        */
> >>>>>>> +       bool no_implicit_sync;
> >>>>>>>     };
> >>>>>>>
> >>>>>>>     struct amdgpu_vm_manager {
> >>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
> >>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
> >>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>>>>>> @@ -54,6 +54,7 @@ extern "C" {
> >>>>>>>     #define DRM_AMDGPU_VM                  0x13
> >>>>>>>     #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >>>>>>>     #define DRM_AMDGPU_SCHED               0x15
> >>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
> >>>>>>>
> >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>>>>>> @@ -71,6 +72,7 @@ extern "C" {
> >>>>>>>     #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>>>>>     #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>>>>>     #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >>>>>>>
> >>>>>>>     /**
> >>>>>>>      * DOC: memory domains
> >>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >>>>>>>            struct drm_amdgpu_sched_in in;
> >>>>>>>     };
> >>>>>>>
> >>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> >>>>>>> +
> >>>>>>> +struct drm_amdgpu_setparam {
> >>>>>>> +       /* AMDGPU_SETPARAM_* */
> >>>>>>> +       __u32   param;
> >>>>>>> +       __u32   value;
> >>>>>>> +};
> >>>>>>> +
> >>>>>>>     /*
> >>>>>>>      * This is not a reliable API and you should expect it to fail for any
> >>>>>>>      * number of reasons and have fallback path that do not use userptr to
> >>>>>>> --
> >>>>>>> 2.32.0.rc2
> >>>>>>>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 14:50                   ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 14:50 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Bas Nieuwenhuizen, Alex Deucher, mesa-dev, Dave Airlie,
	Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 4:02 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 23.06.21 um 15:49 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> >>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> >>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> >>>>> <bas@basnieuwenhuizen.nl> wrote:
> >>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> >>>>>>>
> >>>>>>> Implicit fencing done properly needs to treat the implicit fencing
> >>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> >>>>>>> managed explicitly. This is the only way it will mesh well with explicit
> >>>>>>> fencing userspace like vk, and it's also the bare minimum required to
> >>>>>>> be able to manage anything else that wants to use the same buffer on
> >>>>>>> multiple engines in parallel, and still be able to share it through
> >>>>>>> implicit sync.
> >>>>>>>
> >>>>>>> amdgpu completely lacks such an uapi. Fix this.
> >>>>>>>
> >>>>>>> Luckily the concept of ignoring implicit fences exists already, and
> >>>>>>> takes care of all the complexities of making sure that non-optional
> >>>>>>> fences (like bo moves) are not ignored. This support was added in
> >>>>>>>
> >>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> >>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
> >>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> >>>>>>>
> >>>>>>>        drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> >>>>>>>
> >>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
> >>>>>>> disables implicit sync on an allocated buffer completely.
> >>>>>>>
> >>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
> >>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> >>>>>>> can manage the implicit sync slots explicitly. The other side of the
> >>>>>>> pipeline (compositor, other process or just different stage in a media
> >>>>>>> pipeline in the same process) can then either do the same, or fully
> >>>>>>> participate in the implicit sync as implemented by the kernel by
> >>>>>>> default.
> >>>>>>>
> >>>>>>> By building on the existing flag for buffers we avoid any issues with
> >>>>>>> opening up additional security concerns - anything this new flag here
> >>>>>>> allows is already allowed.
> >>>>>>>
> >>>>>>> All drivers which support this concept of a userspace-specific
> >>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> >>>>>>> that turned out to be a bit too inflexible. See the discussion below,
> >>>>>>> let's try to do a bit better for amdgpu.
> >>>>>>>
> >>>>>>> This alone only allows us to completely avoid any stalls due to
> >>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
> >>>>>>> strange form of IPC for sync_file.
> >>>>>>>
> >>>>>>> For that we need two more pieces:
> >>>>>>>
> >>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
> >>>>>>>      be done in a driver ioctl, but everyone needs this, and generally a
> >>>>>>>      dma-buf is involved anyway to establish the sharing. So an ioctl on
> >>>>>>>      the dma-buf makes a ton more sense:
> >>>>>>>
> >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> >>>>>>>
> >>>>>>>      Current drivers in upstream solve this by having the opt-out flag
> >>>>>>>      on their CS ioctl. This has the downside that very often the CS
> >>>>>>>      which must actually stall for the implicit fence is run a while
> >>>>>>>      after the implicit fence point was logically sampled per the api
> >>>>>>>      spec (vk passes an explicit syncobj around for that afaiui), and so
> >>>>>>>      results in oversync. Converting the implicit sync fences into a
> >>>>>>>      snap-shot sync_file is actually accurate.
> >>>>>>>
> >>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
> >>>>>>>      Current drivers again do this with a CS ioctl flag, with again the
> >>>>>>>      same problems that the time the CS happens additional dependencies
> >>>>>>>      have been added. An explicit ioctl to only insert a sync_file (while
> >>>>>>>      respecting the rules for how exclusive and shared fence slots must
> >>>>>>>      be updated in struct dma_resv) is much better. This is proposed here:
> >>>>>>>
> >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> >>>>>>>
> >>>>>>> These three pieces together allow userspace to fully control implicit
> >>>>>>> fencing and remove all unnecessary stall points due to them.
> >>>>>>>
> >>>>>>> Well, as much as the implicit fencing model fundamentally allows:
> >>>>>>> There is only one set of fences, you can only choose to sync against
> >>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
> >>>>>>> multiple buffers or anything else like this is fundamentally not
> >>>>>>> possible, and can only be fixed by a proper explicit fencing model.
> >>>>>>>
> >>>>>>> Aside from that caveat this model gets implicit fencing as close to
> >>>>>>> explicit fencing semantics as possible:
> >>>>>>>
> >>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
> >>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
> >>>>>>> flag parameter in the VM ioctl which we could use, except:
> >>>>>>> - it's not checked, so userspace likely passes garbage
> >>>>>>> - there's already a comment that userspace _does_ pass garbage in the
> >>>>>>>      priority field
> >>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
> >>>>>>> useless, and we need to hack up a new one.
> >>>>>>>
> >>>>>>> v2: Explain why a new SETPARAM (Jason)
> >>>>>>>
> >>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> >>>>>>> need both, or this doesn't do much.
> >>>>>>>
> >>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
> >>>>>>> fences.
> >>>>>> So I think there is still a case missing in this implementation.
> >>>>>> Consider these 3 cases
> >>>>>>
> >>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
> >>>>>>
> >>>>>> explicit->explicit: This doesn't wait now, which is good
> >>>>>> Implicit->explicit: This doesn't wait now, which is good
> >>>>>> explicit->implicit : This still waits as the explicit submission still
> >>>>>> adds shared fences and most things that set an exclusive fence for
> >>>>>> implicit sync will hence wait on it.
> >>>>>>
> >>>>>> This is probably good enough for what radv needs now but also sounds
> >>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
> >>>>>> the end result.
> >>>>>>
> >>>>>> Within AMDGPU this is probably solvable in two ways:
> >>>>>>
> >>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> >>>>> I'm not sure that works. I think the right fix is that radeonsi also
> >>>>> switches to this model, with maybe a per-bo CS flag to indicate
> >>>>> write access, to cut down on the number of ioctls that are needed
> >>>>> otherwise on shared buffers. This per-bo flag would essentially select
> >>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> >>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> >>>>
> >>>> Problem with the per context or per vm flag is that you then don't get
> >>>> any implicit synchronization any more when another process starts using
> >>>> the buffer.
> >>> That is exactly what I want for Vulkan :)
> >> Yeah, but as far as I know this is not something we can do.
> >>
> >> See we have use cases like screen capture and debug which rely on that
> >> behavior.
> > They will keep working, if (and only if) the vulkan side sets the
> > winsys fences correctly. Also, everything else in vulkan aside from
> > winsys is explicitly not synced at all, you have to import drm syncobj
> > timeline on the gl side.
> >
> >> The only thing we can do is to say on a per buffer flag that a buffer
> >> should not participate in implicit sync at all.
> > Nah, this doesn't work. Because it's not a global decision, it's a local
> > decision for the renderer. Vulkan wants to control implicit sync
> > explicitly, and the kernel can't force more synchronization. If a
> > buffer is shared as a winsys buffer between vulkan client and gl using
> > compositor, then you _have_ to use implicit sync on it. But vk needs
> > to set the fences directly (and if the app gets it wrong, you get
> > misrendering, but that is the specified behaviour of vulkan).
>
> Yeah, but that's exactly what we tried to avoid.
>
> Mhm, when we attach the flag to the process/VM then this would break the
> use case of VA-API and Vulkan in the same process.
>
> But I think if you attach the flag to the context that should indeed
> work fine.

Yeah that's a question I have, whether the drm_file is shared within
one process among everything, or whether radeonsi/libva/vk each have
their own. If each has its own drm_file, then we should be fine,
otherwise we need to figure out another place to put this (worst case
as a CS extension that vk just sets on every submit).
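
For concreteness, the opt-in on the vk side would then just be a one-time
call at device init. Rough sketch against the SETPARAM uapi from the patch
quoted below (function name, fd handling and the fallback flag are all made
up for illustration, untested):

  #include <stdbool.h>
  #include <xf86drm.h>
  #include <amdgpu_drm.h>

  static bool radv_opt_out_of_implicit_sync(int fd)
  {
          struct drm_amdgpu_setparam sp = {
                  .param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
                  .value = 1,
          };

          /* Old kernels reject the ioctl; callers then keep the current
           * oversyncing behaviour as a fallback. */
          return drmIoctl(fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp) == 0;
  }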

Also yes this risks that a vk app which was violating the winsys
spec will now break, which is why I think we should do this sooner
rather than later. Otherwise the list of w/a we might need to apply in vk
userspace will become very long :-( At least since this is purely
opt-in from userspace, we only need to have the w/a list in userspace,
where mesa has the infrastructure for that already.
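
And just to spell out how the other two pieces tie in on the winsys path,
roughly (using the dma-buf sync_file export/import ioctls from Jason's
series linked in the commit message; struct and flag names are per that
proposal and hence illustrative only, dmabuf_fd/render_done_fd likewise):

  #include <sys/ioctl.h>
  #include <linux/dma-buf.h>

  /* acquire: snapshot the buffer's current implicit fences into a
   * sync_file and import that into the acquire semaphore/syncobj */
  struct dma_buf_export_sync_file exp = {
          .flags = DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE,
  };
  ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &exp);
  /* ... exp.fd -> vkImportSemaphoreFdKHR ... */

  /* present: publish the render-done fence back into the dma-buf, so a
   * compositor relying on implicit sync still waits for rendering */
  struct dma_buf_import_sync_file imp = {
          .flags = DMA_BUF_SYNC_WRITE,
          .fd    = render_done_fd,
  };
  ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &imp);
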
-Daniel

>
> Christian.
>
> > -Daniel
> >
> >> Regards,
> >> Christian.
> >>
> >>>>> The current amdgpu uapi just doesn't allow any other model without an
> >>>>> explicit opt-in. So current implicit sync userspace just has to
> >>>>> oversync, there's not much choice.
> >>>>>
> >>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> >>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
> >>>>>>
> >>>>>> But this doesn't solve cross-driver interactions here.
> >>>>> Yeah cross-driver is still entirely unsolved, because
> >>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> >>>> Hui? You have lost me. Why is that still unsolved?
> >>> The part we're trying to solve with this patch is Vulkan should not
> >>> participate in any implicit sync at all wrt submissions (and then
> >>> handle the implicit sync for WSI explicitly using the fence
> >>> import/export stuff that Jason wrote). As long we add shared fences to
> >>> the dma_resv we participate in implicit sync (at the level of an
> >>> implicit sync read) still, at least from the perspective of later jobs
> >>> waiting on these fences.
> >>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> -Daniel
> >>>>>
> >>>>>>> Cc: mesa-dev@lists.freedesktop.org
> >>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> >>>>>>> Cc: Dave Airlie <airlied@gmail.com>
> >>>>>>> Cc: Rob Clark <robdclark@chromium.org>
> >>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> >>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
> >>>>>>> Cc: Daniel Stone <daniels@collabora.com>
> >>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> >>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> >>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> >>>>>>> Cc: Chen Li <chenli@uniontech.com>
> >>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> >>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> >>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> >>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> >>>>>>> ---
> >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> >>>>>>>     include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> >>>>>>>     4 files changed, 42 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> index 65df34c17264..c5386d13eb4a 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> >>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>>>            struct amdgpu_bo *gds;
> >>>>>>>            struct amdgpu_bo *gws;
> >>>>>>>            struct amdgpu_bo *oa;
> >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>>>            int r;
> >>>>>>>
> >>>>>>>            INIT_LIST_HEAD(&p->validated);
> >>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> >>>>>>>
> >>>>>>>                    e->bo_va = amdgpu_vm_bo_find(vm, bo);
> >>>>>>>
> >>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> >>>>>>> +               if (bo->tbo.base.dma_buf &&
> >>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> >>>>>>>                            e->chain = dma_fence_chain_alloc();
> >>>>>>>                            if (!e->chain) {
> >>>>>>>                                    r = -ENOMEM;
> >>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>>>     {
> >>>>>>>            struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> >>>>>>>            struct amdgpu_bo_list_entry *e;
> >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> >>>>>>>            int r;
> >>>>>>>
> >>>>>>>            list_for_each_entry(e, &p->validated, tv.head) {
> >>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> >>>>>>>                    struct dma_resv *resv = bo->tbo.base.resv;
> >>>>>>>                    enum amdgpu_sync_mode sync_mode;
> >>>>>>>
> >>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> >>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> >>>>>>>                            AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> >>>>>>>                    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> >>>>>>>                                         &fpriv->vm);
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>> index c080ba15ae77..f982626b5328 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> >>>>>>>            return 0;
> >>>>>>>     }
> >>>>>>>
> >>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> >>>>>>> +                         struct drm_file *filp)
> >>>>>>> +{
> >>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
> >>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> >>>>>>> +
> >>>>>>> +       switch (setparam->param) {
> >>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> >>>>>>> +               if (setparam->value)
> >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> >>>>>>> +               else
> >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> >>>>>>> +               break;
> >>>>>>> +       default:
> >>>>>>> +               return -EINVAL;
> >>>>>>> +       }
> >>>>>>> +
> >>>>>>> +       return 0;
> >>>>>>> +}
> >>>>>>> +
> >>>>>>>     const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> >>>>>>>     };
> >>>>>>>
> >>>>>>>     static const struct drm_driver amdgpu_kms_driver = {
> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>> index ddb85a85cbba..0e8c440c6303 100644
> >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> >>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> >>>>>>>            bool                    bulk_moveable;
> >>>>>>>            /* Flag to indicate if VM is used for compute */
> >>>>>>>            bool                    is_compute_context;
> >>>>>>> +       /*
> >>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
> >>>>>>> +        * this context. We do not care about races at all, userspace is allowed
> >>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
> >>>>>>> +        */
> >>>>>>> +       bool no_implicit_sync;
> >>>>>>>     };
> >>>>>>>
> >>>>>>>     struct amdgpu_vm_manager {
> >>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> >>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
> >>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
> >>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> >>>>>>> @@ -54,6 +54,7 @@ extern "C" {
> >>>>>>>     #define DRM_AMDGPU_VM                  0x13
> >>>>>>>     #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> >>>>>>>     #define DRM_AMDGPU_SCHED               0x15
> >>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
> >>>>>>>
> >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> >>>>>>> @@ -71,6 +72,7 @@ extern "C" {
> >>>>>>>     #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> >>>>>>>     #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> >>>>>>>     #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> >>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> >>>>>>>
> >>>>>>>     /**
> >>>>>>>      * DOC: memory domains
> >>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> >>>>>>>            struct drm_amdgpu_sched_in in;
> >>>>>>>     };
> >>>>>>>
> >>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> >>>>>>> +
> >>>>>>> +struct drm_amdgpu_setparam {
> >>>>>>> +       /* AMDGPU_SETPARAM_* */
> >>>>>>> +       __u32   param;
> >>>>>>> +       __u32   value;
> >>>>>>> +};
> >>>>>>> +
> >>>>>>>     /*
> >>>>>>>      * This is not a reliable API and you should expect it to fail for any
> >>>>>>>      * number of reasons and have fallback path that do not use userptr to
> >>>>>>> --
> >>>>>>> 2.32.0.rc2
> >>>>>>>
> >
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 14:50                   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 14:58                     ` Bas Nieuwenhuizen
  -1 siblings, 0 replies; 175+ messages in thread
From: Bas Nieuwenhuizen @ 2021-06-23 14:58 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Daniel Stone, Intel Graphics Development, Kevin Wang,
	DRI Development, Michel Dänzer, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Christian König, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> On Wed, Jun 23, 2021 at 4:02 PM Christian König
> <christian.koenig@amd.com> wrote:
> >
> > Am 23.06.21 um 15:49 schrieb Daniel Vetter:
> > > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > > <christian.koenig@amd.com> wrote:
> > >> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> > >>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > >>> <christian.koenig@amd.com> wrote:
> > >>>> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> > >>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > >>>>> <bas@basnieuwenhuizen.nl> wrote:
> > >>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> > >>>>>>>
> > >>>>>>> Implicit fencing done properly needs to treat the implicit fencing
> > >>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> > >>>>>>> explicitly. This is the only way it will mesh well with explicit
> > >>>>>>> fencing userspace like vk, and it's also the bare minimum required to
> > >>>>>>> be able to manage anything else that wants to use the same buffer on
> > >>>>>>> multiple engines in parallel, and still be able to share it through
> > >>>>>>> implicit sync.
> > >>>>>>>
> > >>>>>>> amdgpu completely lacks such an uapi. Fix this.
> > >>>>>>>
> > >>>>>>> Luckily the concept of ignoring implicit fences exists already, and
> > >>>>>>> takes care of all the complexities of making sure that non-optional
> > >>>>>>> fences (like bo moves) are not ignored. This support was added in
> > >>>>>>>
> > >>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > >>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
> > >>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> > >>>>>>>
> > >>>>>>>        drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> > >>>>>>>
> > >>>>>>> Unfortuantely it's the wrong semantics, because it's a bo flag and
> > >>>>>>> disables implicit sync on an allocated buffer completely.
> > >>>>>>>
> > >>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
> > >>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> > >>>>>>> can manage the implicit sync slots explicitly. The other side of the
> > >>>>>>> pipeline (compositor, other process or just different stage in a media
> > >>>>>>> pipeline in the same process) can then either do the same, or fully
> > >>>>>>> participate in the implicit sync as implemented by the kernel by
> > >>>>>>> default.
> > >>>>>>>
> > >>>>>>> By building on the existing flag for buffers we avoid any issues with
> > >>>>>>> opening up additional security concerns - anything this new flag here
> > >>>>>>> allows is already.
> > >>>>>>>
> > >>>>>>> All drivers which supports this concept of a userspace-specific
> > >>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > >>>>>>> that turned out to be a bit too inflexible. See the discussion below,
> > >>>>>>> let's try to do a bit better for amdgpu.
> > >>>>>>>
> > >>>>>>> This alone only allows us to completely avoid any stalls due to
> > >>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
> > >>>>>>> strange form of IPC for sync_file.
> > >>>>>>>
> > >>>>>>> For that we need two more pieces:
> > >>>>>>>
> > >>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
> > >>>>>>>      be done in a driver ioctl, but everyone needs this, and generally a
> > >>>>>>>      dma-buf is involved anyway to establish the sharing. So an ioctl on
> > >>>>>>>      the dma-buf makes a ton more sense:
> > >>>>>>>
> > >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> > >>>>>>>
> > >>>>>>>      Current drivers in upstream solves this by having the opt-out flag
> > >>>>>>>      on their CS ioctl. This has the downside that very often the CS
> > >>>>>>>      which must actually stall for the implicit fence is run a while
> > >>>>>>>      after the implicit fence point was logically sampled per the api
> > >>>>>>>      spec (vk passes an explicit syncobj around for that afaiui), and so
> > >>>>>>>      results in oversync. Converting the implicit sync fences into a
> > >>>>>>>      snap-shot sync_file is actually accurate.
> > >>>>>>>
> > >>>>>>> - Simillar we need to be able to set the exclusive implicit fence.
> > >>>>>>>      Current drivers again do this with a CS ioctl flag, with again the
> > >>>>>>>      same problems that the time the CS happens additional dependencies
> > >>>>>>>      have been added. An explicit ioctl to only insert a sync_file (while
> > >>>>>>>      respecting the rules for how exclusive and shared fence slots must
> > >>>>>>>      be update in struct dma_resv) is much better. This is proposed here:
> > >>>>>>>
> > >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> > >>>>>>>
> > >>>>>>> These three pieces together allow userspace to fully control implicit
> > >>>>>>> fencing and remove all unecessary stall points due to them.
> > >>>>>>>
> > >>>>>>> Well, as much as the implicit fencing model fundamentally allows:
> > >>>>>>> There is only one set of fences, you can only choose to sync against
> > >>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
> > >>>>>>> multiple buffers or anything else like this is fundamentally not
> > >>>>>>> possible, and can only be fixed by a proper explicit fencing model.
> > >>>>>>>
> > >>>>>>> Aside from that caveat this model gets implicit fencing as closely to
> > >>>>>>> explicit fencing semantics as possible:
> > >>>>>>>
> > >>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
> > >>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
> > >>>>>>> flag parameter in the VM ioctl which we could use, except:
> > >>>>>>> - it's not checked, so userspace likely passes garbage
> > >>>>>>> - there's already a comment that userspace _does_ pass garbage in the
> > >>>>>>>      priority field
> > >>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
> > >>>>>>> useless, and we need to hack up a new one.
> > >>>>>>>
> > >>>>>>> v2: Explain why a new SETPARAM (Jason)
> > >>>>>>>
> > >>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> > >>>>>>> need both, or this doesn't do much.
> > >>>>>>>
> > >>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
> > >>>>>>> fences.
> > >>>>>> So I think there is still a case missing in this implementation.
> > >>>>>> Consider these 3 cases
> > >>>>>>
> > >>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
> > >>>>>>
> > >>>>>> explicit->explicit: This doesn't wait now, which is good
> > >>>>>> Implicit->explicit: This doesn't wait now, which is good
> > >>>>>> explicit->implicit : This still waits as the explicit submission still
> > >>>>>> adds shared fences and most things that set an exclusive fence for
> > >>>>>> implicit sync will hence wait on it.
> > >>>>>>
> > >>>>>> This is probably good enough for what radv needs now but also sounds
> > >>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
> > >>>>>> the end result.
> > >>>>>>
> > >>>>>> Within AMDGPU this is probably solvable in two ways:
> > >>>>>>
> > >>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> > >>>>> I'm not sure that works. I think the right fix is that radeonsi also
> > >>>>> switches to this model, with maybe a per-bo CS flag to indicate
> > >>>>> write access, to cut down on the number of ioctls that are needed
> > >>>>> otherwise on shared buffers. This per-bo flag would essentially select
> > >>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> > >>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> > >>>>
> > >>>> Problem with the per context or per vm flag is that you then don't get
> > >>>> any implicit synchronization any more when another process starts using
> > >>>> the buffer.
> > >>> That is exactly what I want for Vulkan :)
> > >> Yeah, but as far as I know this is not something we can do.
> > >>
> > >> See we have use cases like screen capture and debug which rely on that
> > >> behavior.
> > > They will keep working, if (and only if) the vulkan side sets the
> > > winsys fences correctly. Also, everything else in vulkan aside from
> > > winsys is explicitly not synced at all, you have to import drm syncobj
> > > timeline on the gl side.
> > >
> > >> The only thing we can do is to say on a per buffer flag that a buffer
> > >> should not participate in implicit sync at all.
> > > Nah, this doesn't work. Because it's not a global decision, it's a local
> > > decision for the renderer. Vulkan wants to control implicit sync
> > > explicitly, and the kernel can't force more synchronization. If a
> > > buffer is shared as a winsys buffer between vulkan client and gl using
> > > compositor, then you _have_ to use implicit sync on it. But vk needs
> > > to set the fences directly (and if the app gets it wrong, you get
> > > misrendering, but that is the specified behaviour of vulkan).
> >
> > Yeah, but that's exactly what we tried to avoid.
> >
> > Mhm, when we attach the flag to the process/VM then this would break the
> > use case of VA-API and Vulkan in the same process.
> >
> > But I think if you attach the flag to the context that should indeed
> > work fine.
>
> Yeah that's a question I have, whether the drm_file is shared within
> one process among everything, or whether radeonsi/libva/vk each have
> their own. If each has its own drm_file, then we should be fine,
> otherwise we need to figure out another place to put this (worst case
> as a CS extension that vk just sets on every submit).

libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
process (modulo minigbm on chromeos and modulo a master fd).
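
To illustrate what that dedup means in practice (rough sketch, vk_fd/va_fd
are whatever fds the two components opened themselves; this is just the
existing libdrm_amdgpu behaviour, not new code):

  #include <amdgpu.h>

  amdgpu_device_handle vk_dev, va_dev;
  uint32_t drm_major, drm_minor;

  /* libdrm_amdgpu dedupes by device, so both calls typically return the
   * same handle and all CS goes through one underlying drm_file */
  amdgpu_device_initialize(vk_fd, &drm_major, &drm_minor, &vk_dev);
  amdgpu_device_initialize(va_fd, &drm_major, &drm_minor, &va_dev);
  /* vk_dev == va_dev in the common case */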

That said, the current proposal is for the context, right? And on the
context this should pretty much work? So I'm not sure why this is the
part we are discussing?

>
> Also yes this risks that a vk app which was violating the winsys
> spec will now break, which is why I think we should do this sooner
> rather than later. Otherwise the list of w/a we might need to apply in vk
> userspace will become very long :-( At least since this is purely
> opt-in from userspace, we only need to have the w/a list in userspace,
> where mesa has the infrastructure for that already.
> -Daniel
>
> >
> > Christian.
> >
> > > -Daniel
> > >
> > >> Regards,
> > >> Christian.
> > >>
> > >>>>> The current amdgpu uapi just doesn't allow any other model without an
> > >>>>> explicit opt-in. So current implicit sync userspace just has to
> > >>>>> oversync, there's not much choice.
> > >>>>>
> > >>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> > >>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
> > >>>>>>
> > >>>>>> But this doesn't solve cross-driver interactions here.
> > >>>>> Yeah cross-driver is still entirely unsolved, because
> > >>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> > >>>> Hui? You have lost me. Why is that still unsolved?
> > >>> The part we're trying to solve with this patch is Vulkan should not
> > >>> participate in any implicit sync at all wrt submissions (and then
> > >>> handle the implicit sync for WSI explicitly using the fence
> > >>> import/export stuff that Jason wrote). As long as we add shared fences to
> > >>> the dma_resv we participate in implicit sync (at the level of an
> > >>> implicit sync read) still, at least from the perspective of later jobs
> > >>> waiting on these fences.
> > >>>
> > >>>> Regards,
> > >>>> Christian.
> > >>>>
> > >>>>> -Daniel
> > >>>>>
> > >>>>>>> Cc: mesa-dev@lists.freedesktop.org
> > >>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > >>>>>>> Cc: Dave Airlie <airlied@gmail.com>
> > >>>>>>> Cc: Rob Clark <robdclark@chromium.org>
> > >>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> > >>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
> > >>>>>>> Cc: Daniel Stone <daniels@collabora.com>
> > >>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > >>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> > >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> > >>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > >>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > >>>>>>> Cc: Chen Li <chenli@uniontech.com>
> > >>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> > >>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> > >>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> > >>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> > >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > >>>>>>> ---
> > >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> > >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> > >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> > >>>>>>>     include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> > >>>>>>>     4 files changed, 42 insertions(+), 2 deletions(-)
> > >>>>>>>
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > >>>>>>> index 65df34c17264..c5386d13eb4a 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > >>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > >>>>>>>            struct amdgpu_bo *gds;
> > >>>>>>>            struct amdgpu_bo *gws;
> > >>>>>>>            struct amdgpu_bo *oa;
> > >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > >>>>>>>            int r;
> > >>>>>>>
> > >>>>>>>            INIT_LIST_HEAD(&p->validated);
> > >>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > >>>>>>>
> > >>>>>>>                    e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > >>>>>>>
> > >>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> > >>>>>>> +               if (bo->tbo.base.dma_buf &&
> > >>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> > >>>>>>>                            e->chain = dma_fence_chain_alloc();
> > >>>>>>>                            if (!e->chain) {
> > >>>>>>>                                    r = -ENOMEM;
> > >>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > >>>>>>>     {
> > >>>>>>>            struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> > >>>>>>>            struct amdgpu_bo_list_entry *e;
> > >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > >>>>>>>            int r;
> > >>>>>>>
> > >>>>>>>            list_for_each_entry(e, &p->validated, tv.head) {
> > >>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > >>>>>>>                    struct dma_resv *resv = bo->tbo.base.resv;
> > >>>>>>>                    enum amdgpu_sync_mode sync_mode;
> > >>>>>>>
> > >>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> > >>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> > >>>>>>>                            AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> > >>>>>>>                    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> > >>>>>>>                                         &fpriv->vm);
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > >>>>>>> index c080ba15ae77..f982626b5328 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > >>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> > >>>>>>>            return 0;
> > >>>>>>>     }
> > >>>>>>>
> > >>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> > >>>>>>> +                         struct drm_file *filp)
> > >>>>>>> +{
> > >>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
> > >>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> > >>>>>>> +
> > >>>>>>> +       switch (setparam->param) {
> > >>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> > >>>>>>> +               if (setparam->value)
> > >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> > >>>>>>> +               else
> > >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> > >>>>>>> +               break;
> > >>>>>>> +       default:
> > >>>>>>> +               return -EINVAL;
> > >>>>>>> +       }
> > >>>>>>> +
> > >>>>>>> +       return 0;
> > >>>>>>> +}
> > >>>>>>> +
> > >>>>>>>     const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > >>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > >>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > >>>>>>>     };
> > >>>>>>>
> > >>>>>>>     static const struct drm_driver amdgpu_kms_driver = {
> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> index ddb85a85cbba..0e8c440c6303 100644
> > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > >>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> > >>>>>>>            bool                    bulk_moveable;
> > >>>>>>>            /* Flag to indicate if VM is used for compute */
> > >>>>>>>            bool                    is_compute_context;
> > >>>>>>> +       /*
> > >>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
> > >>>>>>> +        * this context. We do not care about races at all, userspace is allowed
> > >>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
> > >>>>>>> +        */
> > >>>>>>> +       bool no_implicit_sync;
> > >>>>>>>     };
> > >>>>>>>
> > >>>>>>>     struct amdgpu_vm_manager {
> > >>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > >>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
> > >>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
> > >>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> > >>>>>>> @@ -54,6 +54,7 @@ extern "C" {
> > >>>>>>>     #define DRM_AMDGPU_VM                  0x13
> > >>>>>>>     #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> > >>>>>>>     #define DRM_AMDGPU_SCHED               0x15
> > >>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
> > >>>>>>>
> > >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> > >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > >>>>>>> @@ -71,6 +72,7 @@ extern "C" {
> > >>>>>>>     #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> > >>>>>>>     #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> > >>>>>>>     #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > >>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> > >>>>>>>
> > >>>>>>>     /**
> > >>>>>>>      * DOC: memory domains
> > >>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> > >>>>>>>            struct drm_amdgpu_sched_in in;
> > >>>>>>>     };
> > >>>>>>>
> > >>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> > >>>>>>> +
> > >>>>>>> +struct drm_amdgpu_setparam {
> > >>>>>>> +       /* AMDGPU_SETPARAM_* */
> > >>>>>>> +       __u32   param;
> > >>>>>>> +       __u32   value;
> > >>>>>>> +};
> > >>>>>>> +
> > >>>>>>>     /*
> > >>>>>>>      * This is not a reliable API and you should expect it to fail for any
> > >>>>>>>      * number of reasons and have fallback path that do not use userptr to
> > >>>>>>> --
> > >>>>>>> 2.32.0.rc2
> > >>>>>>>
> > >
> >
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread


* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 14:58                     ` [Intel-gfx] " Bas Nieuwenhuizen
@ 2021-06-23 15:03                       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 15:03 UTC (permalink / raw)
  To: Bas Nieuwenhuizen
  Cc: Rob Clark, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Daniel Stone, Daniel Vetter, Intel Graphics Development,
	Kevin Wang, DRI Development, Michel Dänzer, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Christian König, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 04:58:27PM +0200, Bas Nieuwenhuizen wrote:
> On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > On Wed, Jun 23, 2021 at 4:02 PM Christian König
> > <christian.koenig@amd.com> wrote:
> > >
> > > On 23.06.21 at 15:49, Daniel Vetter wrote:
> > > > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > > > <christian.koenig@amd.com> wrote:
> > > >> On 23.06.21 at 15:38, Bas Nieuwenhuizen wrote:
> > > >>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > > >>> <christian.koenig@amd.com> wrote:
> > > >>>> On 23.06.21 at 14:18, Daniel Vetter wrote:
> > > >>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > > >>>>> <bas@basnieuwenhuizen.nl> wrote:
> > > >>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> > > >>>>>>>
> > > >>>>>>> Implicit fencing done properly needs to treat the implicit fencing
> > > >>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
> > > >>>>>>> managed explicitly. This is the only way it will mesh well with explicit
> > > >>>>>>> fencing userspace like vk, and it's also the bare minimum required to
> > > >>>>>>> be able to manage anything else that wants to use the same buffer on
> > > >>>>>>> multiple engines in parallel, and still be able to share it through
> > > >>>>>>> implicit sync.
> > > >>>>>>>
> > > >>>>>>> amdgpu completely lacks such an uapi. Fix this.
> > > >>>>>>>
> > > >>>>>>> Luckily the concept of ignoring implicit fences exists already, and
> > > >>>>>>> takes care of all the complexities of making sure that non-optional
> > > >>>>>>> fences (like bo moves) are not ignored. This support was added in
> > > >>>>>>>
> > > >>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > > >>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
> > > >>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
> > > >>>>>>>
> > > >>>>>>>        drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> > > >>>>>>>
> > > >>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
> > > >>>>>>> disables implicit sync on an allocated buffer completely.
> > > >>>>>>>
> > > >>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
> > > >>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
> > > >>>>>>> can manage the implicit sync slots explicitly. The other side of the
> > > >>>>>>> pipeline (compositor, other process or just different stage in a media
> > > >>>>>>> pipeline in the same process) can then either do the same, or fully
> > > >>>>>>> participate in the implicit sync as implemented by the kernel by
> > > >>>>>>> default.
> > > >>>>>>>
> > > >>>>>>> By building on the existing flag for buffers we avoid any issues with
> > > >>>>>>> opening up additional security concerns - anything this new flag here
> > > >>>>>>> allows is already.
> > > >>>>>>>
> > > >>>>>>> All drivers which support this concept of a userspace-specific
> > > >>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > > >>>>>>> that turned out to be a bit too inflexible. See the discussion below,
> > > >>>>>>> let's try to do a bit better for amdgpu.
> > > >>>>>>>
> > > >>>>>>> This alone only allows us to completely avoid any stalls due to
> > > >>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
> > > >>>>>>> strange form of IPC for sync_file.
> > > >>>>>>>
> > > >>>>>>> For that we need two more pieces:
> > > >>>>>>>
> > > >>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
> > > >>>>>>>      be done in a driver ioctl, but everyone needs this, and generally a
> > > >>>>>>>      dma-buf is involved anyway to establish the sharing. So an ioctl on
> > > >>>>>>>      the dma-buf makes a ton more sense:
> > > >>>>>>>
> > > >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> > > >>>>>>>
> > > >>>>>>>      Current drivers in upstream solve this by having the opt-out flag
> > > >>>>>>>      on their CS ioctl. This has the downside that very often the CS
> > > >>>>>>>      which must actually stall for the implicit fence is run a while
> > > >>>>>>>      after the implicit fence point was logically sampled per the api
> > > >>>>>>>      spec (vk passes an explicit syncobj around for that afaiui), and so
> > > >>>>>>>      results in oversync. Converting the implicit sync fences into a
> > > >>>>>>>      snap-shot sync_file is actually accurate.
> > > >>>>>>>
> > > >>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
> > > >>>>>>>      Current drivers again do this with a CS ioctl flag, with again the
> > > >>>>>>>      same problems that the time the CS happens additional dependencies
> > > >>>>>>>      have been added. An explicit ioctl to only insert a sync_file (while
> > > >>>>>>>      respecting the rules for how exclusive and shared fence slots must
> > > >>>>>>>      be updated in struct dma_resv) is much better. This is proposed here:
> > > >>>>>>>
> > > >>>>>>>      https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> > > >>>>>>>
> > > >>>>>>> These three pieces together allow userspace to fully control implicit
> > > >>>>>>> fencing and remove all unnecessary stall points due to them.
> > > >>>>>>>
> > > >>>>>>> Well, as much as the implicit fencing model fundamentally allows:
> > > >>>>>>> There is only one set of fences, you can only choose to sync against
> > > >>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
> > > >>>>>>> multiple buffers or anything else like this is fundamentally not
> > > >>>>>>> possible, and can only be fixed by a proper explicit fencing model.
> > > >>>>>>>
> > > >>>>>>> Aside from that caveat this model gets implicit fencing as closely to
> > > >>>>>>> explicit fencing semantics as possible:
> > > >>>>>>>
> > > >>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
> > > >>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
> > > >>>>>>> flag parameter in the VM ioctl which we could use, except:
> > > >>>>>>> - it's not checked, so userspace likely passes garbage
> > > >>>>>>> - there's already a comment that userspace _does_ pass garbage in the
> > > >>>>>>>      priority field
> > > >>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
> > > >>>>>>> useless, and we need to hack up a new one.
> > > >>>>>>>
> > > >>>>>>> v2: Explain why a new SETPARAM (Jason)
> > > >>>>>>>
> > > >>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> > > >>>>>>> need both, or this doesn't do much.
> > > >>>>>>>
> > > >>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
> > > >>>>>>> fences.
> > > >>>>>> So I think there is still a case missing in this implementation.
> > > >>>>>> Consider these 3 cases
> > > >>>>>>
> > > >>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
> > > >>>>>>
> > > >>>>>> explicit->explicit: This doesn't wait now, which is good
> > > >>>>>> Implicit->explicit: This doesn't wait now, which is good
> > > >>>>>> explicit->implicit : This still waits as the explicit submission still
> > > >>>>>> adds shared fences and most things that set an exclusive fence for
> > > >>>>>> implicit sync will hence wait on it.
> > > >>>>>>
> > > >>>>>> This is probably good enough for what radv needs now but also sounds
> > > >>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
> > > >>>>>> the end result.
> > > >>>>>>
> > > >>>>>> Within AMDGPU this is probably solvable in two ways:
> > > >>>>>>
> > > >>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> > > >>>>> I'm not sure that works. I think the right fix is that radeonsi also
> > > >>>>> switches to this model, with maybe a per-bo CS flag to indicate
> > > >>>>> write access, to cut down on the number of ioctls that are needed
> > > >>>>> otherwise on shared buffers. This per-bo flag would essentially select
> > > >>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> > > >>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> > > >>>>
> > > >>>> Problem with the per context or per vm flag is that you then don't get
> > > >>>> any implicit synchronization any more when another process starts using
> > > >>>> the buffer.
> > > >>> That is exactly what I want for Vulkan :)
> > > >> Yeah, but as far as I know this is not something we can do.
> > > >>
> > > >> See we have use cases like screen capture and debug which rely on that
> > > >> behavior.
> > > > They will keep working, if (and only if) the vulkan side sets the
> > > > winsys fences correctly. Also, everything else in vulkan aside from
> > > > winsys is explicitly not synced at all, you have to import drm syncobj
> > > > timeline on the gl side.
> > > >
> > > >> The only thing we can do is to say on a per buffer flag that a buffer
> > > >> should not participate in implicit sync at all.
> > > > Nah, this doesn't work. Because it's not a global decision, it's a local
> > > > decision for the renderer. Vulkan wants to control implicit sync
> > > > explicitly, and the kernel can't force more synchronization. If a
> > > > buffer is shared as a winsys buffer between vulkan client and gl using
> > > > compositor, then you _have_ to use implicit sync on it. But vk needs
> > > > to set the fences directly (and if the app gets it wrong, you get
> > > > misrendering, but that is the specified behaviour of vulkan).
> > >
> > > Yeah, but that's exactly what we tried to avoid.
> > >
> > > Mhm, when we attach the flag to the process/VM then this would break the
> > > use case of VA-API and Vulkan in the same process.
> > >
> > > But I think if you attach the flag to the context that should indeed
> > > work fine.
> >
> > Yeah that's a question I have, whether the drm_file is shared within
> > one process among everything, or whether radeonsi/libva/vk each have
> > their own. If each have their own drm_file, then we should be fine,
> > otherwise we need to figure out another place to put this (worst case
> > as a CS extension that vk just sets on every submit).
> 
> libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
> process (modulo minigbm on chromeos and modulo a master fd).
> 
> That said the current proposal is for the context right? And on the
> context this should pretty much work? So I'm not sure why this is the
> part we are discussing?

It's on the fpriv->vm, so on the FD. I assumed vulkan at least would want
to have its private VM for this. And at a quick glance I didn't see any other
way to create a VM than to have an FD of your own.

If there's something else that means "gpu context with its own vm" then
the flag would need to be moved there, pointers appreciated (but maybe
someone with hw + userspace can do that quicker).
-Daniel
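
For illustration, a minimal userspace sketch of how a driver like radv could
flip the per-FD opt-out proposed in this RFC. It assumes the uapi additions
from the patch (struct drm_amdgpu_setparam, AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
DRM_IOCTL_AMDGPU_SETPARAM) are present in the installed headers; the helper
name is made up.

#include <stdbool.h>
#include <sys/ioctl.h>
#include <drm/amdgpu_drm.h>

/* Opt the whole render-node FD in or out of implicit sync, per the RFC.
 * Returns 0 on success, -1 with errno set on failure. */
static int amdgpu_set_no_implicit_sync(int drm_fd, bool enable)
{
        struct drm_amdgpu_setparam sp = {
                .param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
                .value = enable ? 1 : 0,
        };

        return ioctl(drm_fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp);
}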

> 
> >
> > Also yes this risks that a vk app which was violating the winsys
> > spec will now break, which is why I think we should do this sooner
> > than later. Otherwise the list of w/a we might need to apply in vk
> > userspace will become very long :-( At least since this is purely
> > opt-in from userspace, we only need to have the w/a list in userspace,
> > where mesa has the infrastructure for that already.
> > -Daniel
> >
> > >
> > > Christian.
> > >
> > > > -Daniel
> > > >
> > > >> Regards,
> > > >> Christian.
> > > >>
> > > >>>>> The current amdgpu uapi just doesn't allow any other model without an
> > > >>>>> explicit opt-in. So current implicit sync userspace just has to
> > > >>>>> oversync, there's not much choice.
> > > >>>>>
> > > >>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
> > > >>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
> > > >>>>>>
> > > >>>>>> But this doesn't solve cross-driver interactions here.
> > > >>>>> Yeah cross-driver is still entirely unsolved, because
> > > >>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> > > >>>> Hui? You have lost me. Why is that still unsolved?
> > > >>> The part we're trying to solve with this patch is Vulkan should not
> > > >>> participate in any implicit sync at all wrt submissions (and then
> > > >>> handle the implicit sync for WSI explicitly using the fence
> > > >>> import/export stuff that Jason wrote). As long as we add shared fences to
> > > >>> the dma_resv we participate in implicit sync (at the level of an
> > > >>> implicit sync read) still, at least from the perspective of later jobs
> > > >>> waiting on these fences.
> > > >>>
> > > >>>> Regards,
> > > >>>> Christian.
> > > >>>>
> > > >>>>> -Daniel
> > > >>>>>
> > > >>>>>>> Cc: mesa-dev@lists.freedesktop.org
> > > >>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > > >>>>>>> Cc: Dave Airlie <airlied@gmail.com>
> > > >>>>>>> Cc: Rob Clark <robdclark@chromium.org>
> > > >>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
> > > >>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
> > > >>>>>>> Cc: Daniel Stone <daniels@collabora.com>
> > > >>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > >>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
> > > >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
> > > >>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > >>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > > >>>>>>> Cc: Chen Li <chenli@uniontech.com>
> > > >>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
> > > >>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
> > > >>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
> > > >>>>>>> Cc: linaro-mm-sig@lists.linaro.org
> > > >>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > >>>>>>> ---
> > > >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> > > >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> > > >>>>>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> > > >>>>>>>     include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> > > >>>>>>>     4 files changed, 42 insertions(+), 2 deletions(-)
> > > >>>>>>>
> > > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > >>>>>>> index 65df34c17264..c5386d13eb4a 100644
> > > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > >>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > >>>>>>>            struct amdgpu_bo *gds;
> > > >>>>>>>            struct amdgpu_bo *gws;
> > > >>>>>>>            struct amdgpu_bo *oa;
> > > >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > > >>>>>>>            int r;
> > > >>>>>>>
> > > >>>>>>>            INIT_LIST_HEAD(&p->validated);
> > > >>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > >>>>>>>
> > > >>>>>>>                    e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > > >>>>>>>
> > > >>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> > > >>>>>>> +               if (bo->tbo.base.dma_buf &&
> > > >>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> > > >>>>>>>                            e->chain = dma_fence_chain_alloc();
> > > >>>>>>>                            if (!e->chain) {
> > > >>>>>>>                                    r = -ENOMEM;
> > > >>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > > >>>>>>>     {
> > > >>>>>>>            struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> > > >>>>>>>            struct amdgpu_bo_list_entry *e;
> > > >>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > > >>>>>>>            int r;
> > > >>>>>>>
> > > >>>>>>>            list_for_each_entry(e, &p->validated, tv.head) {
> > > >>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > > >>>>>>>                    struct dma_resv *resv = bo->tbo.base.resv;
> > > >>>>>>>                    enum amdgpu_sync_mode sync_mode;
> > > >>>>>>>
> > > >>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> > > >>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> > > >>>>>>>                            AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> > > >>>>>>>                    r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> > > >>>>>>>                                         &fpriv->vm);
> > > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > >>>>>>> index c080ba15ae77..f982626b5328 100644
> > > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > >>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> > > >>>>>>>            return 0;
> > > >>>>>>>     }
> > > >>>>>>>
> > > >>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> > > >>>>>>> +                         struct drm_file *filp)
> > > >>>>>>> +{
> > > >>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
> > > >>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> > > >>>>>>> +
> > > >>>>>>> +       switch (setparam->param) {
> > > >>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> > > >>>>>>> +               if (setparam->value)
> > > >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> > > >>>>>>> +               else
> > > >>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> > > >>>>>>> +               break;
> > > >>>>>>> +       default:
> > > >>>>>>> +               return -EINVAL;
> > > >>>>>>> +       }
> > > >>>>>>> +
> > > >>>>>>> +       return 0;
> > > >>>>>>> +}
> > > >>>>>>> +
> > > >>>>>>>     const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > >>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > >>>>>>>            DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > >>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > >>>>>>>     };
> > > >>>>>>>
> > > >>>>>>>     static const struct drm_driver amdgpu_kms_driver = {
> > > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > >>>>>>> index ddb85a85cbba..0e8c440c6303 100644
> > > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > >>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
> > > >>>>>>>            bool                    bulk_moveable;
> > > >>>>>>>            /* Flag to indicate if VM is used for compute */
> > > >>>>>>>            bool                    is_compute_context;
> > > >>>>>>> +       /*
> > > >>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
> > > >>>>>>> +        * this context. We do not care about races at all, userspace is allowed
> > > >>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
> > > >>>>>>> +        */
> > > >>>>>>> +       bool no_implicit_sync;
> > > >>>>>>>     };
> > > >>>>>>>
> > > >>>>>>>     struct amdgpu_vm_manager {
> > > >>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > > >>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
> > > >>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
> > > >>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
> > > >>>>>>> @@ -54,6 +54,7 @@ extern "C" {
> > > >>>>>>>     #define DRM_AMDGPU_VM                  0x13
> > > >>>>>>>     #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> > > >>>>>>>     #define DRM_AMDGPU_SCHED               0x15
> > > >>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
> > > >>>>>>>
> > > >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> > > >>>>>>>     #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > > >>>>>>> @@ -71,6 +72,7 @@ extern "C" {
> > > >>>>>>>     #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> > > >>>>>>>     #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> > > >>>>>>>     #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > > >>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> > > >>>>>>>
> > > >>>>>>>     /**
> > > >>>>>>>      * DOC: memory domains
> > > >>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> > > >>>>>>>            struct drm_amdgpu_sched_in in;
> > > >>>>>>>     };
> > > >>>>>>>
> > > >>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> > > >>>>>>> +
> > > >>>>>>> +struct drm_amdgpu_setparam {
> > > >>>>>>> +       /* AMDGPU_SETPARAM_* */
> > > >>>>>>> +       __u32   param;
> > > >>>>>>> +       __u32   value;
> > > >>>>>>> +};
> > > >>>>>>> +
> > > >>>>>>>     /*
> > > >>>>>>>      * This is not a reliable API and you should expect it to fail for any
> > > >>>>>>>      * number of reasons and have fallback path that do not use userptr to
> > > >>>>>>> --
> > > >>>>>>> 2.32.0.rc2
> > > >>>>>>>
> > > >
> > >
> >
> >
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread
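
To make the "three pieces" from the commit message above concrete, here is a
rough sketch of the window-system hand-off a Vulkan driver could do once it
has opted out of implicit sync: snapshot the buffer's current implicit fences
into a sync_file before using the image, and publish its own render-complete
fence as the implicit write fence before handing the image to the compositor.
The dma-buf ioctls used here (DMA_BUF_IOCTL_EXPORT_SYNC_FILE and
DMA_BUF_IOCTL_IMPORT_SYNC_FILE) were only proposed in the series linked above
at the time of this thread, so their names and struct layouts are assumptions.

#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/* Snapshot the buffer's current implicit fences as a sync_file fd, which can
 * then be waited on (or imported into a VkSemaphore) explicitly. */
static int dmabuf_export_fences(int dmabuf_fd)
{
        struct dma_buf_export_sync_file args = {
                .flags = DMA_BUF_SYNC_RW,       /* readers and the writer */
                .fd = -1,
        };

        if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &args) < 0)
                return -1;
        return args.fd;
}

/* Publish our render-complete fence as the buffer's implicit write fence, so
 * an implicit-sync consumer (GL compositor, KMS) keeps working. */
static int dmabuf_import_write_fence(int dmabuf_fd, int sync_file_fd)
{
        struct dma_buf_import_sync_file args = {
                .flags = DMA_BUF_SYNC_WRITE,
                .fd = sync_file_fd,
        };

        return ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &args);
}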

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 15:03                       ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 15:07                         ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 15:07 UTC (permalink / raw)
  To: Daniel Vetter, Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Daniel Vetter, Alex Deucher,
	mesa-dev, Michel Dänzer, Dennis Li, Deepak R Varma

On 23.06.21 at 17:03, Daniel Vetter wrote:
> On Wed, Jun 23, 2021 at 04:58:27PM +0200, Bas Nieuwenhuizen wrote:
>> On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>> On Wed, Jun 23, 2021 at 4:02 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> On 23.06.21 at 15:49, Daniel Vetter wrote:
>>>>> On Wed, Jun 23, 2021 at 3:44 PM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> On 23.06.21 at 15:38, Bas Nieuwenhuizen wrote:
>>>>>>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>> On 23.06.21 at 14:18, Daniel Vetter wrote:
>>>>>>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
>>>>>>>>> <bas@basnieuwenhuizen.nl> wrote:
>>>>>>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>>>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>>>>>>>>>
>>>>>>>>>>> Implicit fencing done properly needs to treat the implicit fencing
>>>>>>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>>>>>>>>>> managed explicitly. This is the only way it will mesh well with explicit
>>>>>>>>>>> fencing userspace like vk, and it's also the bare minimum required to
>>>>>>>>>>> be able to manage anything else that wants to use the same buffer on
>>>>>>>>>>> multiple engines in parallel, and still be able to share it through
>>>>>>>>>>> implicit sync.
>>>>>>>>>>>
>>>>>>>>>>> amdgpu completely lacks such an uapi. Fix this.
>>>>>>>>>>>
>>>>>>>>>>> Luckily the concept of ignoring implicit fences exists already, and
>>>>>>>>>>> takes care of all the complexities of making sure that non-optional
>>>>>>>>>>> fences (like bo moves) are not ignored. This support was added in
>>>>>>>>>>>
>>>>>>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>>>>>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>>>>>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>>>>>>>>>
>>>>>>>>>>>         drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>>>>>>>>>> disables implicit sync on an allocated buffer completely.
>>>>>>>>>>>
>>>>>>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
>>>>>>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>>>>>>>>>> can manage the implicit sync slots explicitly. The other side of the
>>>>>>>>>>> pipeline (compositor, other process or just different stage in a media
>>>>>>>>>>> pipeline in the same process) can then either do the same, or fully
>>>>>>>>>>> participate in the implicit sync as implemented by the kernel by
>>>>>>>>>>> default.
>>>>>>>>>>>
>>>>>>>>>>> By building on the existing flag for buffers we avoid any issues with
>>>>>>>>>>> opening up additional security concerns - anything this new flag here
>>>>>>>>>>> allows is already.
>>>>>>>>>>>
>>>>>>>>>>> All drivers which support this concept of a userspace-specific
>>>>>>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>>>>>>>>>> that turned out to be a bit too inflexible. See the discussion below,
>>>>>>>>>>> let's try to do a bit better for amdgpu.
>>>>>>>>>>>
>>>>>>>>>>> This alone only allows us to completely avoid any stalls due to
>>>>>>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
>>>>>>>>>>> strange form of IPC for sync_file.
>>>>>>>>>>>
>>>>>>>>>>> For that we need two more pieces:
>>>>>>>>>>>
>>>>>>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>>>>>>>>>       be done in a driver ioctl, but everyone needs this, and generally a
>>>>>>>>>>>       dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>>>>>>>>>       the dma-buf makes a ton more sense:
>>>>>>>>>>>
>>>>>>>>>>>       https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>>>>>>>>>
>>>>>>>>>>>       Current drivers in upstream solve this by having the opt-out flag
>>>>>>>>>>>       on their CS ioctl. This has the downside that very often the CS
>>>>>>>>>>>       which must actually stall for the implicit fence is run a while
>>>>>>>>>>>       after the implicit fence point was logically sampled per the api
>>>>>>>>>>>       spec (vk passes an explicit syncobj around for that afaiui), and so
>>>>>>>>>>>       results in oversync. Converting the implicit sync fences into a
>>>>>>>>>>>       snap-shot sync_file is actually accurate.
>>>>>>>>>>>
>>>>>>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
>>>>>>>>>>>       Current drivers again do this with a CS ioctl flag, with again the
>>>>>>>>>>>       same problems that the time the CS happens additional dependencies
>>>>>>>>>>>       have been added. An explicit ioctl to only insert a sync_file (while
>>>>>>>>>>>       respecting the rules for how exclusive and shared fence slots must
>>>>>>>>>>>       be updated in struct dma_resv) is much better. This is proposed here:
>>>>>>>>>>>
>>>>>>>>>>>       https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>>>>>>>>>
>>>>>>>>>>> These three pieces together allow userspace to fully control implicit
>>>>>>>>>>> fencing and remove all unnecessary stall points due to them.
>>>>>>>>>>>
>>>>>>>>>>> Well, as much as the implicit fencing model fundamentally allows:
>>>>>>>>>>> There is only one set of fences, you can only choose to sync against
>>>>>>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
>>>>>>>>>>> multiple buffers or anything else like this is fundamentally not
>>>>>>>>>>> possible, and can only be fixed by a proper explicit fencing model.
>>>>>>>>>>>
>>>>>>>>>>> Aside from that caveat this model gets implicit fencing as closely to
>>>>>>>>>>> explicit fencing semantics as possible:
>>>>>>>>>>>
>>>>>>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
>>>>>>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>>>>>>>>>> flag parameter in the VM ioctl which we could use, except:
>>>>>>>>>>> - it's not checked, so userspace likely passes garbage
>>>>>>>>>>> - there's already a comment that userspace _does_ pass garbage in the
>>>>>>>>>>>       priority field
>>>>>>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
>>>>>>>>>>> useless, and we need to hack up a new one.
>>>>>>>>>>>
>>>>>>>>>>> v2: Explain why a new SETPARAM (Jason)
>>>>>>>>>>>
>>>>>>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>>>>>>>>>> need both, or this doesn't do much.
>>>>>>>>>>>
>>>>>>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>>>>>>>>>> fences.
>>>>>>>>>> So I think there is still a case missing in this implementation.
>>>>>>>>>> Consider these 3 cases
>>>>>>>>>>
>>>>>>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>>>>>>>>>
>>>>>>>>>> explicit->explicit: This doesn't wait now, which is good
>>>>>>>>>> Implicit->explicit: This doesn't wait now, which is good
>>>>>>>>>> explicit->implicit : This still waits as the explicit submission still
>>>>>>>>>> adds shared fences and most things that set an exclusive fence for
>>>>>>>>>> implicit sync will hence wait on it.
>>>>>>>>>>
>>>>>>>>>> This is probably good enough for what radv needs now but also sounds
>>>>>>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
>>>>>>>>>> the end result.
>>>>>>>>>>
>>>>>>>>>> Within AMDGPU this is probably solvable in two ways:
>>>>>>>>>>
>>>>>>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
>>>>>>>>> I'm not sure that works. I think the right fix is that radeonsi also
>>>>>>>>> switches to this model, with maybe a per-bo CS flag to indicate
>>>>>>>>> write access, to cut down on the number of ioctls that are needed
>>>>>>>>> otherwise on shared buffers. This per-bo flag would essentially select
>>>>>>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>>>>>>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>>>>>>>>
>>>>>>>> Problem with the per context or per vm flag is that you then don't get
>>>>>>>> any implicit synchronization any more when another process starts using
>>>>>>>> the buffer.
>>>>>>> That is exactly what I want for Vulkan :)
>>>>>> Yeah, but as far as I know this is not something we can do.
>>>>>>
>>>>>> See we have use cases like screen capture and debug which rely on that
>>>>>> behavior.
>>>>> They will keep working, if (and only if) the vulkan side sets the
>>>>> winsys fences correctly. Also, everything else in vulkan aside from
>>>>> winsys is explicitly not synced at all, you have to import drm syncobj
>>>>> timeline on the gl side.
>>>>>
>>>>>> The only thing we can do is to say on a per buffer flag that a buffer
>>>>>> should not participate in implicit sync at all.
>>>>> Nah, this doesn't work. Because it's not a global decision, it's a local
>>>>> decision for the renderer. Vulkan wants to control implicit sync
>>>>> explicitly, and the kernel can't force more synchronization. If a
>>>>> buffer is shared as a winsys buffer between vulkan client and gl using
>>>>> compositor, then you _have_ to use implicit sync on it. But vk needs
>>>>> to set the fences directly (and if the app gets it wrong, you get
>>>>> misrendering, but that is the specified behaviour of vulkan).
>>>> Yeah, but that's exactly what we tried to avoid.
>>>>
>>>> Mhm, when we attach the flag to the process/VM then this would break the
>>>> use case of VA-API and Vulkan in the same process.
>>>>
>>>> But I think if you attach the flag to the context that should indeed
>>>> work fine.
>>> Yeah that's a question I have, whether the drm_file is shared within
>>> one process among everything, or whether radeonsi/libva/vk each have
>>> their own. If each have their own drm_file, then we should be fine,
>>> otherwise we need to figure out another place to put this (worst case
>>> as a CS extension that vk just sets on every submit).
>> libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
>> process (modulo minigbm on chromeos and modulo a master fd).
>>
>> That said the current proposal is for the context right? And on the
>> context this should pretty much work? So I'm not sure why this is the
>> part we are discussing?
> It's on the fpriv->vm, so on the FD. I assumed vulkan at least would want
> to have its private VM for this. And on the quick I didn't see any other
> way to create a VM than to have an FD of your own.

You can't have your own FD in libdrm_amdgpu userspace. We had a pretty 
hard design discussion about that already.

What you could do is to load your own copy of libdrm_amdgpu, but I won't 
recommend that.

Just putting the flag on the context instead of the VM is much cleaner 
as far as I can see anyway.
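
For illustration, roughly what that could look like on the kernel side. The
per-context field and its placement are hypothetical and not part of the
posted patch, this is just a sketch of the idea:

    /* hypothetical: move the opt-out from amdgpu_vm to the submission context */
    struct amdgpu_ctx {
            /* ... existing members ... */
            bool    no_implicit_sync;
    };

    /* in amdgpu_cs_sync_rings(), sketch only: read the flag from the
     * parser's context instead of fpriv->vm
     */
    bool no_implicit_sync = READ_ONCE(p->ctx->no_implicit_sync);

    sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
                    AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;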

Christian.

> If there's something else that means "gpu context with its own vm" then
> the flag would need to be moved there, pointers appreciated (but maybe
> someone with hw + userspace can do that quicker).
> -Daniel
>
>>> Also yes this risks that a vk app which was violating the winsys
>>> spec will now break, which is why I think we should do this sooner
>>> than later. Otherwise the list of w/a we might need to apply in vk
>>> userspace will become very long :-( At least since this is purely
>>> opt-in from userspace, we only need to have the w/a list in userspace,
>>> where mesa has the infrastructure for that already.
>>> -Daniel
>>>
>>>> Christian.
>>>>
>>>>> -Daniel
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>>>> The current amdgpu uapi just doesn't allow any other model without an
>>>>>>>>> explicit opt-in. So current implicit sync userspace just has to
>>>>>>>>> oversync, there's not much choice.
>>>>>>>>>
>>>>>>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>>>>>>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>>>>>>>>>
>>>>>>>>>> But this doesn't solve cross-driver interactions here.
>>>>>>>>> Yeah cross-driver is still entirely unsolved, because
>>>>>>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>>>>>>>> Hui? You have lost me. Why is that still unsolved?
>>>>>>> The part we're trying to solve with this patch is Vulkan should not
>>>>>>> participate in any implicit sync at all wrt submissions (and then
>>>>>>> handle the implicit sync for WSI explicitly using the fence
>>>>>>> import/export stuff that Jason wrote). As long as we add shared fences to
>>>>>>> the dma_resv we participate in implicit sync (at the level of an
>>>>>>> implicit sync read) still, at least from the perspective of later jobs
>>>>>>> waiting on these fences.
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> -Daniel
>>>>>>>>>
>>>>>>>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>>>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>>>>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>>>>>>>> Cc: Rob Clark <robdclark@chromium.org>
>>>>>>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>>>>>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>>>>>>>> Cc: Daniel Stone <daniels@collabora.com>
>>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>>>>>>>>>      include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>>>>>>>>>      4 files changed, 42 insertions(+), 2 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>> index 65df34c17264..c5386d13eb4a 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>>>>>             struct amdgpu_bo *gds;
>>>>>>>>>>>             struct amdgpu_bo *gws;
>>>>>>>>>>>             struct amdgpu_bo *oa;
>>>>>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>>>>>             int r;
>>>>>>>>>>>
>>>>>>>>>>>             INIT_LIST_HEAD(&p->validated);
>>>>>>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>>>>>
>>>>>>>>>>>                     e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>>>>>>>>>
>>>>>>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>>>>>>>>>> +               if (bo->tbo.base.dma_buf &&
>>>>>>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>>>>>>>>>                             e->chain = dma_fence_chain_alloc();
>>>>>>>>>>>                             if (!e->chain) {
>>>>>>>>>>>                                     r = -ENOMEM;
>>>>>>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>>>>>      {
>>>>>>>>>>>             struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>>>>>>>>>             struct amdgpu_bo_list_entry *e;
>>>>>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>>>>>             int r;
>>>>>>>>>>>
>>>>>>>>>>>             list_for_each_entry(e, &p->validated, tv.head) {
>>>>>>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>>>>>                     struct dma_resv *resv = bo->tbo.base.resv;
>>>>>>>>>>>                     enum amdgpu_sync_mode sync_mode;
>>>>>>>>>>>
>>>>>>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>>>>>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>>>>>>>>>                             AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>>>>>>>>>                     r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>>>>>>>>>                                          &fpriv->vm);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>> index c080ba15ae77..f982626b5328 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>>>>>>>>>             return 0;
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>>>>>>>>>> +                         struct drm_file *filp)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
>>>>>>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>>>>>>>> +
>>>>>>>>>>> +       switch (setparam->param) {
>>>>>>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>>>>>>>>>> +               if (setparam->value)
>>>>>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>>>>>>>>>> +               else
>>>>>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>>>>>>>>>> +               break;
>>>>>>>>>>> +       default:
>>>>>>>>>>> +               return -EINVAL;
>>>>>>>>>>> +       }
>>>>>>>>>>> +
>>>>>>>>>>> +       return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>      const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>      };
>>>>>>>>>>>
>>>>>>>>>>>      static const struct drm_driver amdgpu_kms_driver = {
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>> index ddb85a85cbba..0e8c440c6303 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>>>>>>>>>             bool                    bulk_moveable;
>>>>>>>>>>>             /* Flag to indicate if VM is used for compute */
>>>>>>>>>>>             bool                    is_compute_context;
>>>>>>>>>>> +       /*
>>>>>>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>>>>>>>>>> +        * this context. We do not care about races at all, userspace is allowed
>>>>>>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
>>>>>>>>>>> +        */
>>>>>>>>>>> +       bool no_implicit_sync;
>>>>>>>>>>>      };
>>>>>>>>>>>
>>>>>>>>>>>      struct amdgpu_vm_manager {
>>>>>>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
>>>>>>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>> @@ -54,6 +54,7 @@ extern "C" {
>>>>>>>>>>>      #define DRM_AMDGPU_VM                  0x13
>>>>>>>>>>>      #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>>>>>>>>>      #define DRM_AMDGPU_SCHED               0x15
>>>>>>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>>>>>>>>>
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>>>>>>>>>> @@ -71,6 +72,7 @@ extern "C" {
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>>>>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>>>>>>>>>
>>>>>>>>>>>      /**
>>>>>>>>>>>       * DOC: memory domains
>>>>>>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>>>>>>>>>             struct drm_amdgpu_sched_in in;
>>>>>>>>>>>      };
>>>>>>>>>>>
>>>>>>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>>>>>>>>>> +
>>>>>>>>>>> +struct drm_amdgpu_setparam {
>>>>>>>>>>> +       /* AMDGPU_SETPARAM_* */
>>>>>>>>>>> +       __u32   param;
>>>>>>>>>>> +       __u32   value;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>>      /*
>>>>>>>>>>>       * This is not a reliable API and you should expect it to fail for any
>>>>>>>>>>>       * number of reasons and have fallback path that do not use userptr to
>>>>>>>>>>> --
>>>>>>>>>>> 2.32.0.rc2
>>>>>>>>>>>
>>>
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> http://blog.ffwll.ch/


^ permalink raw reply	[flat|nested] 175+ messages in thread
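
A minimal userspace sketch of how the proposed opt-out would be used, with the
uapi names taken from the RFC patch quoted above (illustrative only, the
interface is still an RFC; error handling elided):

    #include <sys/ioctl.h>
    #include "amdgpu_drm.h"   /* amdgpu_drm.h as extended by the RFC above */

    static int amdgpu_disable_implicit_sync(int drm_fd)
    {
            struct drm_amdgpu_setparam sp = {
                    .param = AMDGPU_SETPARAM_NO_IMPLICIT_SYNC,
                    .value = 1,     /* opt this drm_file out of implicit sync */
            };

            return ioctl(drm_fd, DRM_IOCTL_AMDGPU_SETPARAM, &sp);
    }

A Vulkan driver would call this once per device fd right after opening it, and
from then on manage the winsys fences explicitly through the dma-buf sync_file
ioctls referenced in the commit message.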

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 15:07                         ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 15:07 UTC (permalink / raw)
  To: Daniel Vetter, Bas Nieuwenhuizen
  Cc: Rob Clark, Daniel Stone, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Daniel Vetter,
	Alex Deucher, mesa-dev, Dave Airlie, Michel Dänzer,
	Dennis Li, Deepak R Varma

On 23.06.21 at 17:03, Daniel Vetter wrote:
> On Wed, Jun 23, 2021 at 04:58:27PM +0200, Bas Nieuwenhuizen wrote:
>> On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>> On Wed, Jun 23, 2021 at 4:02 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> On 23.06.21 at 15:49, Daniel Vetter wrote:
>>>>> On Wed, Jun 23, 2021 at 3:44 PM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> On 23.06.21 at 15:38, Bas Nieuwenhuizen wrote:
>>>>>>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>> On 23.06.21 at 14:18, Daniel Vetter wrote:
>>>>>>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
>>>>>>>>> <bas@basnieuwenhuizen.nl> wrote:
>>>>>>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>>>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>>>>>>>>>
>>>>>>>>>>> Implicit fencing done properly needs to treat the implicit fencing
>>>>>>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>>>>>>>>>> done explicitly. This is the only way it will mesh well with explicit
>>>>>>>>>>> fencing userspace like vk, and it's also the bare minimum required to
>>>>>>>>>>> be able to manage anything else that wants to use the same buffer on
>>>>>>>>>>> multiple engines in parallel, and still be able to share it through
>>>>>>>>>>> implicit sync.
>>>>>>>>>>>
>>>>>>>>>>> amdgpu completely lacks such an uapi. Fix this.
>>>>>>>>>>>
>>>>>>>>>>> Luckily the concept of ignoring implicit fences exists already, and
>>>>>>>>>>> takes care of all the complexities of making sure that non-optional
>>>>>>>>>>> fences (like bo moves) are not ignored. This support was added in
>>>>>>>>>>>
>>>>>>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>>>>>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>>>>>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>>>>>>>>>
>>>>>>>>>>>         drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>>>>>>>>>> disables implicit sync on an allocated buffer completely.
>>>>>>>>>>>
>>>>>>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
>>>>>>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>>>>>>>>>> can manage the implicit sync slots explicitly. The other side of the
>>>>>>>>>>> pipeline (compositor, other process or just different stage in a media
>>>>>>>>>>> pipeline in the same process) can then either do the same, or fully
>>>>>>>>>>> participate in the implicit sync as implemented by the kernel by
>>>>>>>>>>> default.
>>>>>>>>>>>
>>>>>>>>>>> By building on the existing flag for buffers we avoid any issues with
>>>>>>>>>>> opening up additional security concerns - anything this new flag here
>>>>>>>>>>> allows is already possible.
>>>>>>>>>>>
>>>>>>>>>>> All drivers which support this concept of a userspace-specific
>>>>>>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>>>>>>>>>> that turned out to be a bit too inflexible. See the discussion below,
>>>>>>>>>>> let's try to do a bit better for amdgpu.
>>>>>>>>>>>
>>>>>>>>>>> This alone only allows us to completely avoid any stalls due to
>>>>>>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
>>>>>>>>>>> strange form of IPC for sync_file.
>>>>>>>>>>>
>>>>>>>>>>> For that we need two more pieces:
>>>>>>>>>>>
>>>>>>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>>>>>>>>>       be done in a driver ioctl, but everyone needs this, and generally a
>>>>>>>>>>>       dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>>>>>>>>>       the dma-buf makes a ton more sense:
>>>>>>>>>>>
>>>>>>>>>>>       https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>>>>>>>>>
>>>>>>>>>>>       Current drivers in upstream solve this by having the opt-out flag
>>>>>>>>>>>       on their CS ioctl. This has the downside that very often the CS
>>>>>>>>>>>       which must actually stall for the implicit fence is run a while
>>>>>>>>>>>       after the implicit fence point was logically sampled per the api
>>>>>>>>>>>       spec (vk passes an explicit syncobj around for that afaiui), and so
>>>>>>>>>>>       results in oversync. Converting the implicit sync fences into a
>>>>>>>>>>>       snap-shot sync_file is actually accurate.
>>>>>>>>>>>
>>>>>>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
>>>>>>>>>>>       Current drivers again do this with a CS ioctl flag, with again the
>>>>>>>>>>>       same problem that by the time the CS happens additional dependencies
>>>>>>>>>>>       have been added. An explicit ioctl to only insert a sync_file (while
>>>>>>>>>>>       respecting the rules for how exclusive and shared fence slots must
>>>>>>>>>>>       be updated in struct dma_resv) is much better. This is proposed here:
>>>>>>>>>>>
>>>>>>>>>>>       https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>>>>>>>>>
>>>>>>>>>>> These three pieces together allow userspace to fully control implicit
>>>>>>>>>>> fencing and remove all unnecessary stall points due to them.
>>>>>>>>>>>
>>>>>>>>>>> Well, as much as the implicit fencing model fundamentally allows:
>>>>>>>>>>> There is only one set of fences, you can only choose to sync against
>>>>>>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
>>>>>>>>>>> multiple buffers or anything else like this is fundamentally not
>>>>>>>>>>> possible, and can only be fixed by a proper explicit fencing model.
>>>>>>>>>>>
>>>>>>>>>>> Aside from that caveat this model gets implicit fencing as close to
>>>>>>>>>>> explicit fencing semantics as possible:
>>>>>>>>>>>
>>>>>>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
>>>>>>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>>>>>>>>>> flag parameter in the VM ioctl which we could use, except:
>>>>>>>>>>> - it's not checked, so userspace likely passes garbage
>>>>>>>>>>> - there's already a comment that userspace _does_ pass garbage in the
>>>>>>>>>>>       priority field
>>>>>>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
>>>>>>>>>>> useless, and we need to hack up a new one.
>>>>>>>>>>>
>>>>>>>>>>> v2: Explain why a new SETPARAM (Jason)
>>>>>>>>>>>
>>>>>>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>>>>>>>>>> need both, or this doesn't do much.
>>>>>>>>>>>
>>>>>>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>>>>>>>>>> fences.
>>>>>>>>>> So I think there is still a case missing in this implementation.
>>>>>>>>>> Consider these 3 cases
>>>>>>>>>>
>>>>>>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>>>>>>>>>
>>>>>>>>>> explicit->explicit: This doesn't wait now, which is good
>>>>>>>>>> Implicit->explicit: This doesn't wait now, which is good
>>>>>>>>>> explicit->implicit : This still waits as the explicit submission still
>>>>>>>>>> adds shared fences and most things that set an exclusive fence for
>>>>>>>>>> implicit sync will hence wait on it.
>>>>>>>>>>
>>>>>>>>>> This is probably good enough for what radv needs now but also sounds
>>>>>>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
>>>>>>>>>> the end result.
>>>>>>>>>>
>>>>>>>>>> Within AMDGPU this is probably solvable in two ways:
>>>>>>>>>>
>>>>>>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
>>>>>>>>> I'm not sure that works. I think the right fix is that radeonsi also
>>>>>>>>> switches to this model, with maybe a per-bo CS flag to indicate
>>>>>>>>> write access, to cut down on the number of ioctls that are needed
>>>>>>>>> otherwise on shared buffers. This per-bo flag would essentially select
>>>>>>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>>>>>>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>>>>>>>>
>>>>>>>> Problem with the per context or per vm flag is that you then don't get
>>>>>>>> any implicit synchronization any more when another process starts using
>>>>>>>> the buffer.
>>>>>>> That is exactly what I want for Vulkan :)
>>>>>> Yeah, but as far as I know this is not something we can do.
>>>>>>
>>>>>> See we have use cases like screen capture and debug which rely on that
>>>>>> behavior.
>>>>> They will keep working, if (and only if) the vulkan side sets the
>>>>> winsys fences correctly. Also, everything else in vulkan aside from
>>>>> winsys is explicitly not synced at all, you have to import drm syncobj
>>>>> timeline on the gl side.
>>>>>
>>>>>> The only thing we can do is to say on a per buffer flag that a buffer
>>>>>> should not participate in implicit sync at all.
>>>>> Nah, this doesn't work. Because it's not a global decision, it's a local
>>>>> decision for the renderer. Vulkan wants to control implicit sync
>>>>> explicitly, and the kernel can't force more synchronization. If a
>>>>> buffer is shared as a winsys buffer between vulkan client and gl using
>>>>> compositor, then you _have_ to use implicit sync on it. But vk needs
>>>>> to set the fences directly (and if the app gets it wrong, you get
>>>>> misrendering, but that is the specified behaviour of vulkan).
>>>> Yeah, but that's exactly what we tried to avoid.
>>>>
>>>> Mhm, when we attach the flag to the process/VM then this would break the
>>>> use case of VA-API and Vulkan in the same process.
>>>>
>>>> But I think if you attach the flag to the context that should indeed
>>>> work fine.
>>> Yeah that's a question I have, whether the drm_file is shared within
>>> one process among everything, or whether radeonsi/libva/vk each have
>>> their own. If each have their own drm_file, then we should be fine,
>>> otherwise we need to figure out another place to put this (worst case
>>> as a CS extension that vk just sets on every submit).
>> libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
>> process (modulo minigbm on chromeos and modulo a master fd).
>>
>> That said the current proposal is for the context right? And on the
>> context this should pretty much work? So I'm not sure why this is the
>> part we are discussing?
> It's on the fpriv->vm, so on the FD. I assumed vulkan at least would want
> to have its private VM for this. And on the quick I didn't see any other
> way to create a VM than to have an FD of your own.

You can't have your own FD in libdrm_amdgpu userspace. We had a pretty 
hard design discussion about that already.

What you could do is to load your own copy of libdrm_amdgpu, but I won't 
recommend that.

Just putting the flag on the context instead of the VM is much cleaner 
as far as I can see anyway.

Christian.

> If there's something else that means "gpu context with its own vm" then
> the flag would need to be moved there, pointers appreciated (but maybe
> someone with hw + userspace can do that quicker).
> -Daniel
>
>>> Also yes this risks that a vk app which was violating the winsys
>>> spec will now break, which is why I think we should do this sooner
>>> than later. Otherwise the list of w/a we might need to apply in vk
>>> userspace will become very long :-( At least since this is purely
>>> opt-in from userspace, we only need to have the w/a list in userspace,
>>> where mesa has the infrastructure for that already.
>>> -Daniel
>>>
>>>> Christian.
>>>>
>>>>> -Daniel
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>>>>> The current amdgpu uapi just doesn't allow any other model without an
>>>>>>>>> explicit opt-in. So current implicit sync userspace just has to
>>>>>>>>> oversync, there's not much choice.
>>>>>>>>>
>>>>>>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>>>>>>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>>>>>>>>>
>>>>>>>>>> But this doesn't solve cross-driver interactions here.
>>>>>>>>> Yeah cross-driver is still entirely unsolved, because
>>>>>>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>>>>>>>> Hui? You have lost me. Why is that still unsolved?
>>>>>>> The part we're trying to solve with this patch is Vulkan should not
>>>>>>> participate in any implicit sync at all wrt submissions (and then
>>>>>>> handle the implicit sync for WSI explicitly using the fence
>>>>>>> import/export stuff that Jason wrote). As long as we add shared fences to
>>>>>>> the dma_resv we participate in implicit sync (at the level of an
>>>>>>> implicit sync read) still, at least from the perspective of later jobs
>>>>>>> waiting on these fences.
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>> -Daniel
>>>>>>>>>
>>>>>>>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>>>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>>>>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>>>>>>>> Cc: Rob Clark <robdclark@chromium.org>
>>>>>>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>>>>>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>>>>>>>> Cc: Daniel Stone <daniels@collabora.com>
>>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>>>>>>>>>      include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>>>>>>>>>      4 files changed, 42 insertions(+), 2 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>> index 65df34c17264..c5386d13eb4a 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>>>>>             struct amdgpu_bo *gds;
>>>>>>>>>>>             struct amdgpu_bo *gws;
>>>>>>>>>>>             struct amdgpu_bo *oa;
>>>>>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>>>>>             int r;
>>>>>>>>>>>
>>>>>>>>>>>             INIT_LIST_HEAD(&p->validated);
>>>>>>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>>>>>
>>>>>>>>>>>                     e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>>>>>>>>>
>>>>>>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>>>>>>>>>> +               if (bo->tbo.base.dma_buf &&
>>>>>>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>>>>>>>>>                             e->chain = dma_fence_chain_alloc();
>>>>>>>>>>>                             if (!e->chain) {
>>>>>>>>>>>                                     r = -ENOMEM;
>>>>>>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>>>>>      {
>>>>>>>>>>>             struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>>>>>>>>>             struct amdgpu_bo_list_entry *e;
>>>>>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>>>>>             int r;
>>>>>>>>>>>
>>>>>>>>>>>             list_for_each_entry(e, &p->validated, tv.head) {
>>>>>>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>>>>>                     struct dma_resv *resv = bo->tbo.base.resv;
>>>>>>>>>>>                     enum amdgpu_sync_mode sync_mode;
>>>>>>>>>>>
>>>>>>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>>>>>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>>>>>>>>>                             AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>>>>>>>>>                     r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>>>>>>>>>                                          &fpriv->vm);
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>> index c080ba15ae77..f982626b5328 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>>>>>>>>>             return 0;
>>>>>>>>>>>      }
>>>>>>>>>>>
>>>>>>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>>>>>>>>>> +                         struct drm_file *filp)
>>>>>>>>>>> +{
>>>>>>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
>>>>>>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>>>>>>>> +
>>>>>>>>>>> +       switch (setparam->param) {
>>>>>>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>>>>>>>>>> +               if (setparam->value)
>>>>>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>>>>>>>>>> +               else
>>>>>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>>>>>>>>>> +               break;
>>>>>>>>>>> +       default:
>>>>>>>>>>> +               return -EINVAL;
>>>>>>>>>>> +       }
>>>>>>>>>>> +
>>>>>>>>>>> +       return 0;
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>      const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>      };
>>>>>>>>>>>
>>>>>>>>>>>      static const struct drm_driver amdgpu_kms_driver = {
>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>> index ddb85a85cbba..0e8c440c6303 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>>>>>>>>>             bool                    bulk_moveable;
>>>>>>>>>>>             /* Flag to indicate if VM is used for compute */
>>>>>>>>>>>             bool                    is_compute_context;
>>>>>>>>>>> +       /*
>>>>>>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>>>>>>>>>> +        * this context. We do not care about races at all, userspace is allowed
>>>>>>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
>>>>>>>>>>> +        */
>>>>>>>>>>> +       bool no_implicit_sync;
>>>>>>>>>>>      };
>>>>>>>>>>>
>>>>>>>>>>>      struct amdgpu_vm_manager {
>>>>>>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
>>>>>>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>> @@ -54,6 +54,7 @@ extern "C" {
>>>>>>>>>>>      #define DRM_AMDGPU_VM                  0x13
>>>>>>>>>>>      #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>>>>>>>>>      #define DRM_AMDGPU_SCHED               0x15
>>>>>>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>>>>>>>>>
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>>>>>>>>>> @@ -71,6 +72,7 @@ extern "C" {
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>>>>>>>>>      #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>>>>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>>>>>>>>>
>>>>>>>>>>>      /**
>>>>>>>>>>>       * DOC: memory domains
>>>>>>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>>>>>>>>>             struct drm_amdgpu_sched_in in;
>>>>>>>>>>>      };
>>>>>>>>>>>
>>>>>>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>>>>>>>>>> +
>>>>>>>>>>> +struct drm_amdgpu_setparam {
>>>>>>>>>>> +       /* AMDGPU_SETPARAM_* */
>>>>>>>>>>> +       __u32   param;
>>>>>>>>>>> +       __u32   value;
>>>>>>>>>>> +};
>>>>>>>>>>> +
>>>>>>>>>>>      /*
>>>>>>>>>>>       * This is not a reliable API and you should expect it to fail for any
>>>>>>>>>>>       * number of reasons and have fallback path that do not use userptr to
>>>>>>>>>>> --
>>>>>>>>>>> 2.32.0.rc2
>>>>>>>>>>>
>>>
>>> --
>>> Daniel Vetter
>>> Software Engineer, Intel Corporation
>>> http://blog.ffwll.ch/

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread
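
For completeness, a rough sketch of the WSI-side flow the commit message refers
to, using the dma-buf sync_file export/import interface from the series linked
above (struct and ioctl names follow that proposal and may differ in the final
uapi; error handling elided):

    #include <sys/ioctl.h>
    #include <linux/dma-buf.h>

    /* before handing the buffer to the compositor: publish the render
     * fence as the implicit write fence of the dma-buf
     */
    static int publish_render_fence(int dmabuf_fd, int render_sync_file)
    {
            struct dma_buf_import_sync_file args = {
                    .flags = DMA_BUF_SYNC_WRITE,
                    .fd = render_sync_file,
            };

            return ioctl(dmabuf_fd, DMA_BUF_IOCTL_IMPORT_SYNC_FILE, &args);
    }

    /* before writing to the buffer again: snapshot all implicit fences
     * currently attached to the dma-buf as a sync_file
     */
    static int snapshot_implicit_fences(int dmabuf_fd)
    {
            struct dma_buf_export_sync_file args = {
                    .flags = DMA_BUF_SYNC_WRITE,
            };

            if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &args))
                    return -1;

            return args.fd;   /* sync_file fd to wait on or import into a syncobj */
    }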

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 15:07                         ` [Intel-gfx] " Christian König
@ 2021-06-23 15:12                           ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 15:12 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Alex Deucher, mesa-dev,
	Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 05:07:17PM +0200, Christian König wrote:
> On 23.06.21 at 17:03, Daniel Vetter wrote:
> > On Wed, Jun 23, 2021 at 04:58:27PM +0200, Bas Nieuwenhuizen wrote:
> > > On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > On Wed, Jun 23, 2021 at 4:02 PM Christian König
> > > > <christian.koenig@amd.com> wrote:
> > > > > On 23.06.21 at 15:49, Daniel Vetter wrote:
> > > > > > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > > > > > <christian.koenig@amd.com> wrote:
> > > > > > > On 23.06.21 at 15:38, Bas Nieuwenhuizen wrote:
> > > > > > > > On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > > > > > > > <christian.koenig@amd.com> wrote:
> > > > > > > > > On 23.06.21 at 14:18, Daniel Vetter wrote:
> > > > > > > > > > On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > > > > > > > > > <bas@basnieuwenhuizen.nl> wrote:
> > > > > > > > > > > On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > > > > > > WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> > > > > > > > > > > > 
> > > > > > > > > > > > Implicit fencing done properly needs to treat the implicit fencing
> > > > > > > > > > > > slots like a funny kind of IPC mailbox. In other words it needs to be
> > > > > > > > > > > > done explicitly. This is the only way it will mesh well with explicit
> > > > > > > > > > > > fencing userspace like vk, and it's also the bare minimum required to
> > > > > > > > > > > > be able to manage anything else that wants to use the same buffer on
> > > > > > > > > > > > multiple engines in parallel, and still be able to share it through
> > > > > > > > > > > > implicit sync.
> > > > > > > > > > > > 
> > > > > > > > > > > > amdgpu completely lacks such an uapi. Fix this.
> > > > > > > > > > > > 
> > > > > > > > > > > > Luckily the concept of ignoring implicit fences exists already, and
> > > > > > > > > > > > takes care of all the complexities of making sure that non-optional
> > > > > > > > > > > > fences (like bo moves) are not ignored. This support was added in
> > > > > > > > > > > > 
> > > > > > > > > > > > commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > > > > > > > > > > > Author: Andres Rodriguez <andresx7@gmail.com>
> > > > > > > > > > > > Date:   Fri Sep 15 20:44:06 2017 -0400
> > > > > > > > > > > > 
> > > > > > > > > > > >         drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> > > > > > > > > > > > 
> > > > > > > > > > > > Unfortunately it's the wrong semantics, because it's a bo flag and
> > > > > > > > > > > > disables implicit sync on an allocated buffer completely.
> > > > > > > > > > > > 
> > > > > > > > > > > > We _do_ want implicit sync, but control it explicitly. For this we
> > > > > > > > > > > > need a flag on the drm_file, so that a given userspace (like vulkan)
> > > > > > > > > > > > can manage the implicit sync slots explicitly. The other side of the
> > > > > > > > > > > > pipeline (compositor, other process or just different stage in a media
> > > > > > > > > > > > pipeline in the same process) can then either do the same, or fully
> > > > > > > > > > > > participate in the implicit sync as implemented by the kernel by
> > > > > > > > > > > > default.
> > > > > > > > > > > > 
> > > > > > > > > > > > By building on the existing flag for buffers we avoid any issues with
> > > > > > > > > > > > opening up additional security concerns - anything this new flag here
> > > > > > > > > > > > allows is already possible.
> > > > > > > > > > > > 
> > > > > > > > > > > > All drivers which support this concept of a userspace-specific
> > > > > > > > > > > > opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > > > > > > > > > > > that turned out to be a bit too inflexible. See the discussion below,
> > > > > > > > > > > > let's try to do a bit better for amdgpu.
> > > > > > > > > > > > 
> > > > > > > > > > > > This alone only allows us to completely avoid any stalls due to
> > > > > > > > > > > > implicit sync, it does not yet allow us to use implicit sync as a
> > > > > > > > > > > > strange form of IPC for sync_file.
> > > > > > > > > > > > 
> > > > > > > > > > > > For that we need two more pieces:
> > > > > > > > > > > > 
> > > > > > > > > > > > - a way to get the current implicit sync fences out of a buffer. Could
> > > > > > > > > > > >       be done in a driver ioctl, but everyone needs this, and generally a
> > > > > > > > > > > >       dma-buf is involved anyway to establish the sharing. So an ioctl on
> > > > > > > > > > > >       the dma-buf makes a ton more sense:
> > > > > > > > > > > > 
> > > > > > > > > > > >       https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> > > > > > > > > > > > 
> > > > > > > > > > > >       Current drivers in upstream solve this by having the opt-out flag
> > > > > > > > > > > >       on their CS ioctl. This has the downside that very often the CS
> > > > > > > > > > > >       which must actually stall for the implicit fence is run a while
> > > > > > > > > > > >       after the implicit fence point was logically sampled per the api
> > > > > > > > > > > >       spec (vk passes an explicit syncobj around for that afaiui), and so
> > > > > > > > > > > >       results in oversync. Converting the implicit sync fences into a
> > > > > > > > > > > >       snap-shot sync_file is actually accurate.
> > > > > > > > > > > > 
> > > > > > > > > > > > - Similarly we need to be able to set the exclusive implicit fence.
> > > > > > > > > > > >       Current drivers again do this with a CS ioctl flag, with again the
> > > > > > > > > > > >       same problem that by the time the CS happens additional dependencies
> > > > > > > > > > > >       have been added. An explicit ioctl to only insert a sync_file (while
> > > > > > > > > > > >       respecting the rules for how exclusive and shared fence slots must
> > > > > > > > > > > >       be updated in struct dma_resv) is much better. This is proposed here:
> > > > > > > > > > > > 
> > > > > > > > > > > >       https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> > > > > > > > > > > > 
> > > > > > > > > > > > These three pieces together allow userspace to fully control implicit
> > > > > > > > > > > > fencing and remove all unnecessary stall points due to them.
> > > > > > > > > > > > 
> > > > > > > > > > > > Well, as much as the implicit fencing model fundamentally allows:
> > > > > > > > > > > > There is only one set of fences, you can only choose to sync against
> > > > > > > > > > > > only writers (exclusive slot), or everyone. Hence suballocating
> > > > > > > > > > > > multiple buffers or anything else like this is fundamentally not
> > > > > > > > > > > > possible, and can only be fixed by a proper explicit fencing model.
> > > > > > > > > > > > 
> > > > > > > > > > > > Aside from that caveat this model gets implicit fencing as close to
> > > > > > > > > > > > explicit fencing semantics as possible:
> > > > > > > > > > > > 
> > > > > > > > > > > > On the actual implementation I opted for a simple setparam ioctl, no
> > > > > > > > > > > > locking (just atomic reads/writes) for simplicity. There is a nice
> > > > > > > > > > > > flag parameter in the VM ioctl which we could use, except:
> > > > > > > > > > > > - it's not checked, so userspace likely passes garbage
> > > > > > > > > > > > - there's already a comment that userspace _does_ pass garbage in the
> > > > > > > > > > > >       priority field
> > > > > > > > > > > > So yeah unfortunately this flag parameter for setting vm flags is
> > > > > > > > > > > > useless, and we need to hack up a new one.
> > > > > > > > > > > > 
> > > > > > > > > > > > v2: Explain why a new SETPARAM (Jason)
> > > > > > > > > > > > 
> > > > > > > > > > > > v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> > > > > > > > > > > > need both, or this doesn't do much.
> > > > > > > > > > > > 
> > > > > > > > > > > > v4: Rebase over the amdgpu patch to always set the implicit sync
> > > > > > > > > > > > fences.
> > > > > > > > > > > So I think there is still a case missing in this implementation.
> > > > > > > > > > > Consider these 3 cases
> > > > > > > > > > > 
> > > > > > > > > > > (format: a->b: b waits on a. Yes, I know arrows are hard)
> > > > > > > > > > > 
> > > > > > > > > > > explicit->explicit: This doesn't wait now, which is good
> > > > > > > > > > > Implicit->explicit: This doesn't wait now, which is good
> > > > > > > > > > > explicit->implicit : This still waits as the explicit submission still
> > > > > > > > > > > adds shared fences and most things that set an exclusive fence for
> > > > > > > > > > > implicit sync will hence wait on it.
> > > > > > > > > > > 
> > > > > > > > > > > This is probably good enough for what radv needs now but also sounds
> > > > > > > > > > > like a risk wrt baking in new uapi behavior that we don't want to be
> > > > > > > > > > > the end result.
> > > > > > > > > > > 
> > > > > > > > > > > Within AMDGPU this is probably solvable in two ways:
> > > > > > > > > > > 
> > > > > > > > > > > 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> > > > > > > > > > I'm not sure that works. I think the right fix is that radeonsi also
> > > > > > > > > > switches to this model, with maybe a per-bo CS flag to indicate
> > > > > > > > > > write access, to cut down on the number of ioctls that are needed
> > > > > > > > > > otherwise on shared buffers. This per-bo flag would essentially select
> > > > > > > > > > between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> > > > > > > > > Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> > > > > > > > > 
> > > > > > > > > Problem with the per context or per vm flag is that you then don't get
> > > > > > > > > any implicit synchronization any more when another process starts using
> > > > > > > > > the buffer.
> > > > > > > > That is exactly what I want for Vulkan :)
> > > > > > > Yeah, but as far as I know this is not something we can do.
> > > > > > > 
> > > > > > > See we have use cases like screen capture and debug which rely on that
> > > > > > > behavior.
> > > > > > They will keep working, if (and only if) the vulkan side sets the
> > > > > > winsys fences correctly. Also, everything else in vulkan aside from
> > > > > > winsys is explicitly not synced at all, you have to import drm syncobj
> > > > > > timeline on the gl side.
> > > > > > 
> > > > > > > The only thing we can do is to say on a per buffer flag that a buffer
> > > > > > > should not participate in implicit sync at all.
> > > > > > Nah, this doesn't work. Because it's not a global decision, it's a local
> > > > > > decision for the renderer. Vulkan wants to control implicit sync
> > > > > > explicitly, and the kernel can't force more synchronization. If a
> > > > > > buffer is shared as a winsys buffer between vulkan client and gl using
> > > > > > compositor, then you _have_ to use implicit sync on it. But vk needs
> > > > > > to set the fences directly (and if the app gets it wrong, you get
> > > > > > misrendering, but that is the specified behaviour of vulkan).
> > > > > Yeah, but that's exactly what we tried to avoid.
> > > > > 
> > > > > Mhm, when we attach the flag to the process/VM then this would break the
> > > > > use case of VA-API and Vulkan in the same process.
> > > > > 
> > > > > But I think if you attach the flag to the context that should indeed
> > > > > work fine.
> > > > Yeah that's a question I have, whether the drm_file is shared within
> > > > one process among everything, or whether radeonsi/libva/vk each have
> > > > their own. If each have their own drm_file, then we should be fine,
> > > > otherwise we need to figure out another place to put this (worst case
> > > > as a CS extension that vk just sets on every submit).
> > > libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
> > > process (modulo minigbm on chromeos and modulo a master fd).
> > > 
> > > That said the current proposal is for the context right? And on the
> > > context this should pretty much work? So I'm not sure why this is the
> > > part we are discussing?
> > It's on the fpriv->vm, so on the FD. I assumed vulkan at least would want
> > to have its private VM for this. And on the quick I didn't see any other
> > way to create a VM than to have an FD of your own.
> 
> You can't have your own FD in libdrm_amdgpu userspace. We had a pretty hard
> design discussion about that already.
> 
> What you could do is to load your own copy of libdrm_amdgpu, but I won't
> recommend that.
> 
> Just putting the flag on the context instead of the VM is much cleaner as
> far as I can see anyway.

Helper for the blind? If you guys expect me to move that myself ...
-Daniel

> 
> Christian.
> 
> > If there's something else that means "gpu context with its own vm" then
> > the flag would need to be moved there, pointers appreciated (but maybe
> > someone with hw + userspace can do that quicker).
> > -Daniel
> > 
> > > > Also yes this risks that a vk app which was violating the winsys
> > > > spec will now break, which is why I think we should do this sooner
> > > > than later. Otherwise the list of w/a we might need to apply in vk
> > > > userspace will become very long :-( At least since this is purely
> > > > opt-in from userspace, we only need to have the w/a list in userspace,
> > > > where mesa has the infrastructure for that already.
> > > > -Daniel
> > > > 
> > > > > Christian.
> > > > > 
> > > > > > -Daniel
> > > > > > 
> > > > > > > Regards,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > > > > The current amdgpu uapi just doesn't allow any other model without an
> > > > > > > > > > explicit opt-in. So current implicit sync userspace just has to
> > > > > > > > > > oversync, there's not much choice.
> > > > > > > > > > 
> > > > > > > > > > > 2) Have an EXPLICIT fence owner that is used for explicit submissions
> > > > > > > > > > > that is ignored by AMDGPU_SYNC_NE_OWNER.
> > > > > > > > > > > 
> > > > > > > > > > > But this doesn't solve cross-driver interactions here.
> > > > > > > > > > Yeah cross-driver is still entirely unsolved, because
> > > > > > > > > > amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> > > > > > > > > Hui? You have lost me. Why is that still unsolved?
> > > > > > > > The part we're trying to solve with this patch is Vulkan should not
> > > > > > > > participate in any implicit sync at all wrt submissions (and then
> > > > > > > > handle the implicit sync for WSI explicitly using the fence
> > > > > > > > import/export stuff that Jason wrote). As long as we add shared fences to
> > > > > > > > the dma_resv we participate in implicit sync (at the level of an
> > > > > > > > implicit sync read) still, at least from the perspective of later jobs
> > > > > > > > waiting on these fences.
> > > > > > > > 
> > > > > > > > > Regards,
> > > > > > > > > Christian.
> > > > > > > > > 
> > > > > > > > > > -Daniel
> > > > > > > > > > 
> > > > > > > > > > > > Cc: mesa-dev@lists.freedesktop.org
> > > > > > > > > > > > Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > > > > > > > > > > > Cc: Dave Airlie <airlied@gmail.com>
> > > > > > > > > > > > Cc: Rob Clark <robdclark@chromium.org>
> > > > > > > > > > > > Cc: Kristian H. Kristensen <hoegsberg@google.com>
> > > > > > > > > > > > Cc: Michel Dänzer <michel@daenzer.net>
> > > > > > > > > > > > Cc: Daniel Stone <daniels@collabora.com>
> > > > > > > > > > > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > > > > > > > > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > > > > > > > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > > > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > > > > > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > > > > > > > > > > > Cc: Chen Li <chenli@uniontech.com>
> > > > > > > > > > > > Cc: Kevin Wang <kevin1.wang@amd.com>
> > > > > > > > > > > > Cc: Dennis Li <Dennis.Li@amd.com>
> > > > > > > > > > > > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > > > > > > > > > > > Cc: linaro-mm-sig@lists.linaro.org
> > > > > > > > > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> > > > > > > > > > > >      include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> > > > > > > > > > > >      4 files changed, 42 insertions(+), 2 deletions(-)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > > index 65df34c17264..c5386d13eb4a 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > > @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > > > > > > > > > >             struct amdgpu_bo *gds;
> > > > > > > > > > > >             struct amdgpu_bo *gws;
> > > > > > > > > > > >             struct amdgpu_bo *oa;
> > > > > > > > > > > > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > > > > > > > > > > >             int r;
> > > > > > > > > > > > 
> > > > > > > > > > > >             INIT_LIST_HEAD(&p->validated);
> > > > > > > > > > > > @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > > > > > > > > > > 
> > > > > > > > > > > >                     e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > > > > > > > > > > > 
> > > > > > > > > > > > -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> > > > > > > > > > > > +               if (bo->tbo.base.dma_buf &&
> > > > > > > > > > > > +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> > > > > > > > > > > >                             e->chain = dma_fence_chain_alloc();
> > > > > > > > > > > >                             if (!e->chain) {
> > > > > > > > > > > >                                     r = -ENOMEM;
> > > > > > > > > > > > @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > > > > > > > > > > >      {
> > > > > > > > > > > >             struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> > > > > > > > > > > >             struct amdgpu_bo_list_entry *e;
> > > > > > > > > > > > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > > > > > > > > > > >             int r;
> > > > > > > > > > > > 
> > > > > > > > > > > >             list_for_each_entry(e, &p->validated, tv.head) {
> > > > > > > > > > > > @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > > > > > > > > > > >                     struct dma_resv *resv = bo->tbo.base.resv;
> > > > > > > > > > > >                     enum amdgpu_sync_mode sync_mode;
> > > > > > > > > > > > 
> > > > > > > > > > > > -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> > > > > > > > > > > > +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> > > > > > > > > > > >                             AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> > > > > > > > > > > >                     r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> > > > > > > > > > > >                                          &fpriv->vm);
> > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > > > > > > > index c080ba15ae77..f982626b5328 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > > > > > > > @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> > > > > > > > > > > >             return 0;
> > > > > > > > > > > >      }
> > > > > > > > > > > > 
> > > > > > > > > > > > +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> > > > > > > > > > > > +                         struct drm_file *filp)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +       struct drm_amdgpu_setparam *setparam = data;
> > > > > > > > > > > > +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> > > > > > > > > > > > +
> > > > > > > > > > > > +       switch (setparam->param) {
> > > > > > > > > > > > +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> > > > > > > > > > > > +               if (setparam->value)
> > > > > > > > > > > > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> > > > > > > > > > > > +               else
> > > > > > > > > > > > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> > > > > > > > > > > > +               break;
> > > > > > > > > > > > +       default:
> > > > > > > > > > > > +               return -EINVAL;
> > > > > > > > > > > > +       }
> > > > > > > > > > > > +
> > > > > > > > > > > > +       return 0;
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > >      const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > > @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > > +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >      };
> > > > > > > > > > > > 
> > > > > > > > > > > >      static const struct drm_driver amdgpu_kms_driver = {
> > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > > > > > > > > > > index ddb85a85cbba..0e8c440c6303 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > > > > > > > > > > @@ -321,6 +321,12 @@ struct amdgpu_vm {
> > > > > > > > > > > >             bool                    bulk_moveable;
> > > > > > > > > > > >             /* Flag to indicate if VM is used for compute */
> > > > > > > > > > > >             bool                    is_compute_context;
> > > > > > > > > > > > +       /*
> > > > > > > > > > > > +        * Flag to indicate whether implicit sync should always be skipped on
> > > > > > > > > > > > +        * this context. We do not care about races at all, userspace is allowed
> > > > > > > > > > > > +        * to shoot itself with implicit sync to its fullest liking.
> > > > > > > > > > > > +        */
> > > > > > > > > > > > +       bool no_implicit_sync;
> > > > > > > > > > > >      };
> > > > > > > > > > > > 
> > > > > > > > > > > >      struct amdgpu_vm_manager {
> > > > > > > > > > > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > > > > > > > > > > > index 0cbd1540aeac..9eae245c14d6 100644
> > > > > > > > > > > > --- a/include/uapi/drm/amdgpu_drm.h
> > > > > > > > > > > > +++ b/include/uapi/drm/amdgpu_drm.h
> > > > > > > > > > > > @@ -54,6 +54,7 @@ extern "C" {
> > > > > > > > > > > >      #define DRM_AMDGPU_VM                  0x13
> > > > > > > > > > > >      #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> > > > > > > > > > > >      #define DRM_AMDGPU_SCHED               0x15
> > > > > > > > > > > > +#define DRM_AMDGPU_SETPARAM            0x16
> > > > > > > > > > > > 
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > > > > > > > > > > > @@ -71,6 +72,7 @@ extern "C" {
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > > > > > > > > > > > +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> > > > > > > > > > > > 
> > > > > > > > > > > >      /**
> > > > > > > > > > > >       * DOC: memory domains
> > > > > > > > > > > > @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> > > > > > > > > > > >             struct drm_amdgpu_sched_in in;
> > > > > > > > > > > >      };
> > > > > > > > > > > > 
> > > > > > > > > > > > +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> > > > > > > > > > > > +
> > > > > > > > > > > > +struct drm_amdgpu_setparam {
> > > > > > > > > > > > +       /* AMDGPU_SETPARAM_* */
> > > > > > > > > > > > +       __u32   param;
> > > > > > > > > > > > +       __u32   value;
> > > > > > > > > > > > +};
> > > > > > > > > > > > +
> > > > > > > > > > > >      /*
> > > > > > > > > > > >       * This is not a reliable API and you should expect it to fail for any
> > > > > > > > > > > >       * number of reasons and have fallback path that do not use userptr to
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.32.0.rc2
> > > > > > > > > > > > 
> > > > 
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
@ 2021-06-23 15:12                           ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 15:12 UTC (permalink / raw)
  To: Christian König
  Cc: Rob Clark, Daniel Stone, Daniel Vetter, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Sumit Semwal, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Bas Nieuwenhuizen, Alex Deucher, mesa-dev, Dave Airlie,
	Michel Dänzer, Dennis Li, Deepak R Varma

On Wed, Jun 23, 2021 at 05:07:17PM +0200, Christian König wrote:
> Am 23.06.21 um 17:03 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 04:58:27PM +0200, Bas Nieuwenhuizen wrote:
> > > On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > On Wed, Jun 23, 2021 at 4:02 PM Christian König
> > > > <christian.koenig@amd.com> wrote:
> > > > > Am 23.06.21 um 15:49 schrieb Daniel Vetter:
> > > > > > On Wed, Jun 23, 2021 at 3:44 PM Christian König
> > > > > > <christian.koenig@amd.com> wrote:
> > > > > > > Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
> > > > > > > > On Wed, Jun 23, 2021 at 2:59 PM Christian König
> > > > > > > > <christian.koenig@amd.com> wrote:
> > > > > > > > > Am 23.06.21 um 14:18 schrieb Daniel Vetter:
> > > > > > > > > > On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
> > > > > > > > > > <bas@basnieuwenhuizen.nl> wrote:
> > > > > > > > > > > On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > > > > > > > > > > WARNING: Absolutely untested beyond "gcc isn't dying in agony".
> > > > > > > > > > > > 
> > > > > > > > > > > > Implicit fencing done properly needs to treat the implicit fencing
> > > > > > > > > > > > slots like a funny kind of IPC mailbox. In other words it needs to be
> > > > > > > > > > > > managed explicitly. This is the only way it will mesh well with explicit
> > > > > > > > > > > > fencing userspace like vk, and it's also the bare minimum required to
> > > > > > > > > > > > be able to manage anything else that wants to use the same buffer on
> > > > > > > > > > > > multiple engines in parallel, and still be able to share it through
> > > > > > > > > > > > implicit sync.
> > > > > > > > > > > > 
> > > > > > > > > > > > amdgpu completely lacks such an uapi. Fix this.
> > > > > > > > > > > > 
> > > > > > > > > > > > Luckily the concept of ignoring implicit fences exists already, and
> > > > > > > > > > > > takes care of all the complexities of making sure that non-optional
> > > > > > > > > > > > fences (like bo moves) are not ignored. This support was added in
> > > > > > > > > > > > 
> > > > > > > > > > > > commit 177ae09b5d699a5ebd1cafcee78889db968abf54
> > > > > > > > > > > > Author: Andres Rodriguez <andresx7@gmail.com>
> > > > > > > > > > > > Date:   Fri Sep 15 20:44:06 2017 -0400
> > > > > > > > > > > > 
> > > > > > > > > > > >         drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
> > > > > > > > > > > > 
> > > > > > > > > > > > Unfortunately it's the wrong semantics, because it's a bo flag and
> > > > > > > > > > > > disables implicit sync on an allocated buffer completely.
> > > > > > > > > > > > 
> > > > > > > > > > > > We _do_ want implicit sync, but control it explicitly. For this we
> > > > > > > > > > > > need a flag on the drm_file, so that a given userspace (like vulkan)
> > > > > > > > > > > > can manage the implicit sync slots explicitly. The other side of the
> > > > > > > > > > > > pipeline (compositor, other process or just different stage in a media
> > > > > > > > > > > > pipeline in the same process) can then either do the same, or fully
> > > > > > > > > > > > participate in the implicit sync as implemented by the kernel by
> > > > > > > > > > > > default.
> > > > > > > > > > > > 
> > > > > > > > > > > > By building on the existing flag for buffers we avoid any issues with
> > > > > > > > > > > > opening up additional security concerns - anything this new flag here
> > > > > > > > > > > > allows is already possible.
> > > > > > > > > > > > 
> > > > > > > > > > > > All drivers which support this concept of a userspace-specific
> > > > > > > > > > > > opt-out of implicit sync have a flag in their CS ioctl, but in reality
> > > > > > > > > > > > that turned out to be a bit too inflexible. See the discussion below,
> > > > > > > > > > > > let's try to do a bit better for amdgpu.
> > > > > > > > > > > > 
> > > > > > > > > > > > This alone only allows us to completely avoid any stalls due to
> > > > > > > > > > > > implicit sync, it does not yet allow us to use implicit sync as a
> > > > > > > > > > > > strange form of IPC for sync_file.
> > > > > > > > > > > > 
> > > > > > > > > > > > For that we need two more pieces:
> > > > > > > > > > > > 
> > > > > > > > > > > > - a way to get the current implicit sync fences out of a buffer. Could
> > > > > > > > > > > >       be done in a driver ioctl, but everyone needs this, and generally a
> > > > > > > > > > > >       dma-buf is involved anyway to establish the sharing. So an ioctl on
> > > > > > > > > > > >       the dma-buf makes a ton more sense:
> > > > > > > > > > > > 
> > > > > > > > > > > >       https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
> > > > > > > > > > > > 
> > > > > > > > > > > >       Current drivers in upstream solve this by having the opt-out flag
> > > > > > > > > > > >       on their CS ioctl. This has the downside that very often the CS
> > > > > > > > > > > >       which must actually stall for the implicit fence is run a while
> > > > > > > > > > > >       after the implicit fence point was logically sampled per the api
> > > > > > > > > > > >       spec (vk passes an explicit syncobj around for that afaiui), and so
> > > > > > > > > > > >       results in oversync. Converting the implicit sync fences into a
> > > > > > > > > > > >       snap-shot sync_file is actually accurate.
> > > > > > > > > > > > 
> > > > > > > > > > > > - Similarly we need to be able to set the exclusive implicit fence.
> > > > > > > > > > > >       Current drivers again do this with a CS ioctl flag, with again the
> > > > > > > > > > > >       same problems that the time the CS happens additional dependencies
> > > > > > > > > > > >       have been added. An explicit ioctl to only insert a sync_file (while
> > > > > > > > > > > >       respecting the rules for how exclusive and shared fence slots must
> > > > > > > > > > > >       be updated in struct dma_resv) is much better. This is proposed here:
> > > > > > > > > > > > 
> > > > > > > > > > > >       https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
> > > > > > > > > > > > 
> > > > > > > > > > > > These three pieces together allow userspace to fully control implicit
> > > > > > > > > > > > fencing and remove all unnecessary stall points due to them.
> > > > > > > > > > > > 
> > > > > > > > > > > > Well, as much as the implicit fencing model fundamentally allows:
> > > > > > > > > > > > There is only one set of fences, you can only choose to sync against
> > > > > > > > > > > > only writers (exclusive slot), or everyone. Hence suballocating
> > > > > > > > > > > > multiple buffers or anything else like this is fundamentally not
> > > > > > > > > > > > possible, and can only be fixed by a proper explicit fencing model.
> > > > > > > > > > > > 
> > > > > > > > > > > > Aside from that caveat this model gets implicit fencing as close to
> > > > > > > > > > > > explicit fencing semantics as possible:
> > > > > > > > > > > > 
> > > > > > > > > > > > On the actual implementation I opted for a simple setparam ioctl, no
> > > > > > > > > > > > locking (just atomic reads/writes) for simplicity. There is a nice
> > > > > > > > > > > > flag parameter in the VM ioctl which we could use, except:
> > > > > > > > > > > > - it's not checked, so userspace likely passes garbage
> > > > > > > > > > > > - there's already a comment that userspace _does_ pass garbage in the
> > > > > > > > > > > >       priority field
> > > > > > > > > > > > So yeah unfortunately this flag parameter for setting vm flags is
> > > > > > > > > > > > useless, and we need to hack up a new one.
> > > > > > > > > > > > 
> > > > > > > > > > > > v2: Explain why a new SETPARAM (Jason)
> > > > > > > > > > > > 
> > > > > > > > > > > > v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
> > > > > > > > > > > > need both, or this doesn't do much.
> > > > > > > > > > > > 
> > > > > > > > > > > > v4: Rebase over the amdgpu patch to always set the implicit sync
> > > > > > > > > > > > fences.
> > > > > > > > > > > So I think there is still a case missing in this implementation.
> > > > > > > > > > > Consider these 3 cases
> > > > > > > > > > > 
> > > > > > > > > > > (format: a->b: b waits on a. Yes, I know arrows are hard)
> > > > > > > > > > > 
> > > > > > > > > > > explicit->explicit: This doesn't wait now, which is good
> > > > > > > > > > > Implicit->explicit: This doesn't wait now, which is good
> > > > > > > > > > > explicit->implicit : This still waits as the explicit submission still
> > > > > > > > > > > adds shared fences and most things that set an exclusive fence for
> > > > > > > > > > > implicit sync will hence wait on it.
> > > > > > > > > > > 
> > > > > > > > > > > This is probably good enough for what radv needs now but also sounds
> > > > > > > > > > > like a risk wrt baking in new uapi behavior that we don't want to be
> > > > > > > > > > > the end result.
> > > > > > > > > > > 
> > > > > > > > > > > Within AMDGPU this is probably solvable in two ways:
> > > > > > > > > > > 
> > > > > > > > > > > 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
> > > > > > > > > > I'm not sure that works. I think the right fix is that radeonsi also
> > > > > > > > > > switches to this model, with maybe a per-bo CS flag to indicate
> > > > > > > > > > write access, to cut down on the number of ioctls that are needed
> > > > > > > > > > otherwise on shared buffers. This per-bo flag would essentially select
> > > > > > > > > > between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
> > > > > > > > > Yeah, but I'm still not entirely sure why that approach isn't sufficient?
> > > > > > > > > 
> > > > > > > > > Problem with the per context or per vm flag is that you then don't get
> > > > > > > > > any implicit synchronization any more when another process starts using
> > > > > > > > > the buffer.
> > > > > > > > That is exactly what I want for Vulkan :)
> > > > > > > Yeah, but as far as I know this is not something we can do.
> > > > > > > 
> > > > > > > See we have use cases like screen capture and debug which rely on that
> > > > > > > behavior.
> > > > > > They will keep working, if (and only if) the vulkan side sets the
> > > > > > winsys fences correctly. Also, everything else in vulkan aside from
> > > > > > winsys is explicitly not synced at all, you have to import drm syncobj
> > > > > > timeline on the gl side.
> > > > > > 
> > > > > > > The only thing we can do is to say on a per buffer flag that a buffer
> > > > > > > should not participate in implicit sync at all.
> > > > > > Nah, this doesn't work. Because it's not a global decision, it's a local
> > > > > > decision for the renderer. Vulkan wants to control implicit sync
> > > > > > explicitly, and the kernel can't force more synchronization. If a
> > > > > > buffer is shared as a winsys buffer between vulkan client and gl using
> > > > > > compositor, then you _have_ to use implicit sync on it. But vk needs
> > > > > > to set the fences directly (and if the app gets it wrong, you get
> > > > > > misrendering, but that is the specified behaviour of vulkan).
> > > > > Yeah, but that's exactly what we tried to avoid.
> > > > > 
> > > > > Mhm, when we attach the flag to the process/VM then this would break the
> > > > > use case of VA-API and Vulkan in the same process.
> > > > > 
> > > > > But I think if you attach the flag to the context that should indeed
> > > > > work fine.
> > > > Yeah that's a question I have, whether the drm_file is shared within
> > > > one process among everything, or whether radeonsi/libva/vk each have
> > > > their own. If each have their own drm_file, then we should be fine,
> > > > otherwise we need to figure out another place to put this (worst case
> > > > as a CS extension that vk just sets on every submit).
> > > libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
> > > process (modulo minigbm on chromeos and modulo a master fd).
> > > 
> > > That said the current proposal is for the context right? And on the
> > > context this should pretty much work? So I'm not sure why this is the
> > > part we are discussing?
> > It's on the fpriv->vm, so on the FD. I assumed vulkan at least would want
> > to have its private VM for this. And on the quick I didn't see any other
> > way to create a VM than to have an FD of your own.
> 
> You can't have your own FD in libdrm_amdgpu userspace. We had a pretty hard
> design discussion about that already.
> 
> What you could do is to load your own copy of libdrm_amdgpu, but I won't
> recommend that.
> 
> Just putting the flag on the context instead of the VM is much cleaner as
> far as I can see anyway.

Helper for the blind? If you guys expect me to move that myself ...
-Daniel

> 
> Christian.
> 
> > If there's something else that means "gpu context with its own vm" then
> > the flag would need to be moved there, pointers appreciated (but maybe
> > someone with hw + userspace can do that quicker).
> > -Daniel
> > 
> > > > Also yes this risks that a vk app which was violating the winsys
> > > > spec will now break, which is why I think we should do this sooner
> > > > than later. Otherwise the list of w/a we might need to apply in vk
> > > > userspace will become very long :-( At least since this is purely
> > > > opt-in from userspace, we only need to have the w/a list in userspace,
> > > > where mesa has the infrastructure for that already.
> > > > -Daniel
> > > > 
> > > > > Christian.
> > > > > 
> > > > > > -Daniel
> > > > > > 
> > > > > > > Regards,
> > > > > > > Christian.
> > > > > > > 
> > > > > > > > > > The current amdgpu uapi just doesn't allow any other model without an
> > > > > > > > > > explicit opt-in. So current implicit sync userspace just has to
> > > > > > > > > > oversync, there's not much choice.
> > > > > > > > > > 
> > > > > > > > > > > 2) Have an EXPLICIT fence owner that is used for explicit submissions
> > > > > > > > > > > that is ignored by AMDGPU_SYNC_NE_OWNER.
> > > > > > > > > > > 
> > > > > > > > > > > But this doesn't solve cross-driver interactions here.
> > > > > > > > > > Yeah cross-driver is still entirely unsolved, because
> > > > > > > > > > amdgpu_bo_explicit_sync() on the bo didn't solve that either.
> > > > > > > > > Hui? You have lost me. Why is that still unsolved?
> > > > > > > > The part we're trying to solve with this patch is Vulkan should not
> > > > > > > > participate in any implicit sync at all wrt submissions (and then
> > > > > > > > handle the implicit sync for WSI explicitly using the fence
> > > > > > > > import/export stuff that Jason wrote). As long as we add shared fences to
> > > > > > > > the dma_resv we participate in implicit sync (at the level of an
> > > > > > > > implicit sync read) still, at least from the perspective of later jobs
> > > > > > > > waiting on these fences.
> > > > > > > > 
> > > > > > > > > Regards,
> > > > > > > > > Christian.
> > > > > > > > > 
> > > > > > > > > > -Daniel
> > > > > > > > > > 
> > > > > > > > > > > > Cc: mesa-dev@lists.freedesktop.org
> > > > > > > > > > > > Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
> > > > > > > > > > > > Cc: Dave Airlie <airlied@gmail.com>
> > > > > > > > > > > > Cc: Rob Clark <robdclark@chromium.org>
> > > > > > > > > > > > Cc: Kristian H. Kristensen <hoegsberg@google.com>
> > > > > > > > > > > > Cc: Michel Dänzer <michel@daenzer.net>
> > > > > > > > > > > > Cc: Daniel Stone <daniels@collabora.com>
> > > > > > > > > > > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > > > > > > > > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > > > > > > > > > Cc: Alex Deucher <alexander.deucher@amd.com>
> > > > > > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > > > > > > > > > > > Cc: Deepak R Varma <mh12gx2825@gmail.com>
> > > > > > > > > > > > Cc: Chen Li <chenli@uniontech.com>
> > > > > > > > > > > > Cc: Kevin Wang <kevin1.wang@amd.com>
> > > > > > > > > > > > Cc: Dennis Li <Dennis.Li@amd.com>
> > > > > > > > > > > > Cc: Luben Tuikov <luben.tuikov@amd.com>
> > > > > > > > > > > > Cc: linaro-mm-sig@lists.linaro.org
> > > > > > > > > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > > > > > > > > > ---
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
> > > > > > > > > > > >      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
> > > > > > > > > > > >      include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
> > > > > > > > > > > >      4 files changed, 42 insertions(+), 2 deletions(-)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > > index 65df34c17264..c5386d13eb4a 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> > > > > > > > > > > > @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > > > > > > > > > >             struct amdgpu_bo *gds;
> > > > > > > > > > > >             struct amdgpu_bo *gws;
> > > > > > > > > > > >             struct amdgpu_bo *oa;
> > > > > > > > > > > > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > > > > > > > > > > >             int r;
> > > > > > > > > > > > 
> > > > > > > > > > > >             INIT_LIST_HEAD(&p->validated);
> > > > > > > > > > > > @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
> > > > > > > > > > > > 
> > > > > > > > > > > >                     e->bo_va = amdgpu_vm_bo_find(vm, bo);
> > > > > > > > > > > > 
> > > > > > > > > > > > -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
> > > > > > > > > > > > +               if (bo->tbo.base.dma_buf &&
> > > > > > > > > > > > +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
> > > > > > > > > > > >                             e->chain = dma_fence_chain_alloc();
> > > > > > > > > > > >                             if (!e->chain) {
> > > > > > > > > > > >                                     r = -ENOMEM;
> > > > > > > > > > > > @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > > > > > > > > > > >      {
> > > > > > > > > > > >             struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
> > > > > > > > > > > >             struct amdgpu_bo_list_entry *e;
> > > > > > > > > > > > +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
> > > > > > > > > > > >             int r;
> > > > > > > > > > > > 
> > > > > > > > > > > >             list_for_each_entry(e, &p->validated, tv.head) {
> > > > > > > > > > > > @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
> > > > > > > > > > > >                     struct dma_resv *resv = bo->tbo.base.resv;
> > > > > > > > > > > >                     enum amdgpu_sync_mode sync_mode;
> > > > > > > > > > > > 
> > > > > > > > > > > > -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
> > > > > > > > > > > > +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
> > > > > > > > > > > >                             AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
> > > > > > > > > > > >                     r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
> > > > > > > > > > > >                                          &fpriv->vm);
> > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > > > > > > > index c080ba15ae77..f982626b5328 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> > > > > > > > > > > > @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
> > > > > > > > > > > >             return 0;
> > > > > > > > > > > >      }
> > > > > > > > > > > > 
> > > > > > > > > > > > +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
> > > > > > > > > > > > +                         struct drm_file *filp)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +       struct drm_amdgpu_setparam *setparam = data;
> > > > > > > > > > > > +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
> > > > > > > > > > > > +
> > > > > > > > > > > > +       switch (setparam->param) {
> > > > > > > > > > > > +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
> > > > > > > > > > > > +               if (setparam->value)
> > > > > > > > > > > > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
> > > > > > > > > > > > +               else
> > > > > > > > > > > > +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
> > > > > > > > > > > > +               break;
> > > > > > > > > > > > +       default:
> > > > > > > > > > > > +               return -EINVAL;
> > > > > > > > > > > > +       }
> > > > > > > > > > > > +
> > > > > > > > > > > > +       return 0;
> > > > > > > > > > > > +}
> > > > > > > > > > > > +
> > > > > > > > > > > >      const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > > @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >             DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > > +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
> > > > > > > > > > > >      };
> > > > > > > > > > > > 
> > > > > > > > > > > >      static const struct drm_driver amdgpu_kms_driver = {
> > > > > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > > > > > > > > > > index ddb85a85cbba..0e8c440c6303 100644
> > > > > > > > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > > > > > > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> > > > > > > > > > > > @@ -321,6 +321,12 @@ struct amdgpu_vm {
> > > > > > > > > > > >             bool                    bulk_moveable;
> > > > > > > > > > > >             /* Flag to indicate if VM is used for compute */
> > > > > > > > > > > >             bool                    is_compute_context;
> > > > > > > > > > > > +       /*
> > > > > > > > > > > > +        * Flag to indicate whether implicit sync should always be skipped on
> > > > > > > > > > > > +        * this context. We do not care about races at all, userspace is allowed
> > > > > > > > > > > > +        * to shoot itself with implicit sync to its fullest liking.
> > > > > > > > > > > > +        */
> > > > > > > > > > > > +       bool no_implicit_sync;
> > > > > > > > > > > >      };
> > > > > > > > > > > > 
> > > > > > > > > > > >      struct amdgpu_vm_manager {
> > > > > > > > > > > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> > > > > > > > > > > > index 0cbd1540aeac..9eae245c14d6 100644
> > > > > > > > > > > > --- a/include/uapi/drm/amdgpu_drm.h
> > > > > > > > > > > > +++ b/include/uapi/drm/amdgpu_drm.h
> > > > > > > > > > > > @@ -54,6 +54,7 @@ extern "C" {
> > > > > > > > > > > >      #define DRM_AMDGPU_VM                  0x13
> > > > > > > > > > > >      #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
> > > > > > > > > > > >      #define DRM_AMDGPU_SCHED               0x15
> > > > > > > > > > > > +#define DRM_AMDGPU_SETPARAM            0x16
> > > > > > > > > > > > 
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
> > > > > > > > > > > > @@ -71,6 +72,7 @@ extern "C" {
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
> > > > > > > > > > > >      #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
> > > > > > > > > > > > +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
> > > > > > > > > > > > 
> > > > > > > > > > > >      /**
> > > > > > > > > > > >       * DOC: memory domains
> > > > > > > > > > > > @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
> > > > > > > > > > > >             struct drm_amdgpu_sched_in in;
> > > > > > > > > > > >      };
> > > > > > > > > > > > 
> > > > > > > > > > > > +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
> > > > > > > > > > > > +
> > > > > > > > > > > > +struct drm_amdgpu_setparam {
> > > > > > > > > > > > +       /* AMDGPU_SETPARAM_* */
> > > > > > > > > > > > +       __u32   param;
> > > > > > > > > > > > +       __u32   value;
> > > > > > > > > > > > +};
> > > > > > > > > > > > +
> > > > > > > > > > > >      /*
> > > > > > > > > > > >       * This is not a reliable API and you should expect it to fail for any
> > > > > > > > > > > >       * number of reasons and have fallback path that do not use userptr to
> > > > > > > > > > > > --
> > > > > > > > > > > > 2.32.0.rc2
> > > > > > > > > > > > 
> > > > 
> > > > --
> > > > Daniel Vetter
> > > > Software Engineer, Intel Corporation
> > > > http://blog.ffwll.ch
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
  2021-06-23  8:31     ` Christian König
  (?)
@ 2021-06-23 15:15       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 15:15 UTC (permalink / raw)
  To: Christian König
  Cc: Daniel Vetter, DRI Development, Intel Graphics Development,
	Daniel Vetter, Sumit Semwal, linux-media, linaro-mm-sig

On Wed, Jun 23, 2021 at 10:31:18AM +0200, Christian König wrote:
> Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> > Oversight from
> > 
> > commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> > Author: Christian König <christian.koenig@amd.com>
> > Date:   Mon May 10 16:14:09 2021 +0200
> > 
> >      dma-buf: rename and cleanup dma_resv_get_excl v3
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>

Pushed to drm-misc-next.
-Daniel

> 
> > ---
> >   include/linux/dma-resv.h | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index 562b885cf9c3..e1ca2080a1ff 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
> >   }
> >   /**
> > - * dma_resv_exclusive - return the object's exclusive fence
> > + * dma_resv_excl_fence - return the object's exclusive fence
> >    * @obj: the reservation object
> >    *
> >    * Returns the exclusive fence (if any). Caller must either hold the objects
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 01/15] dma-resv: Fix kerneldoc
@ 2021-06-23 15:15       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 15:15 UTC (permalink / raw)
  To: Christian König
  Cc: Daniel Vetter, Intel Graphics Development, DRI Development,
	linaro-mm-sig, Daniel Vetter, linux-media

On Wed, Jun 23, 2021 at 10:31:18AM +0200, Christian König wrote:
> Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> > Oversight from
> > 
> > commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> > Author: Christian König <christian.koenig@amd.com>
> > Date:   Mon May 10 16:14:09 2021 +0200
> > 
> >      dma-buf: rename and cleanup dma_resv_get_excl v3
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>

Pushed to drm-misc-next.
-Daniel

> 
> > ---
> >   include/linux/dma-resv.h | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index 562b885cf9c3..e1ca2080a1ff 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
> >   }
> >   /**
> > - * dma_resv_exclusive - return the object's exclusive fence
> > + * dma_resv_excl_fence - return the object's exclusive fence
> >    * @obj: the reservation object
> >    *
> >    * Returns the exclusive fence (if any). Caller must either hold the objects
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 01/15] dma-resv: Fix kerneldoc
@ 2021-06-23 15:15       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 15:15 UTC (permalink / raw)
  To: Christian König
  Cc: Daniel Vetter, Intel Graphics Development, DRI Development,
	linaro-mm-sig, Daniel Vetter, Sumit Semwal, linux-media

On Wed, Jun 23, 2021 at 10:31:18AM +0200, Christian König wrote:
> Am 22.06.21 um 18:54 schrieb Daniel Vetter:
> > Oversight from
> > 
> > commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9
> > Author: Christian König <christian.koenig@amd.com>
> > Date:   Mon May 10 16:14:09 2021 +0200
> > 
> >      dma-buf: rename and cleanup dma_resv_get_excl v3
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>

Pushed to drm-misc-next.
-Daniel

> 
> > ---
> >   include/linux/dma-resv.h | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> > index 562b885cf9c3..e1ca2080a1ff 100644
> > --- a/include/linux/dma-resv.h
> > +++ b/include/linux/dma-resv.h
> > @@ -212,7 +212,7 @@ static inline void dma_resv_unlock(struct dma_resv *obj)
> >   }
> >   /**
> > - * dma_resv_exclusive - return the object's exclusive fence
> > + * dma_resv_excl_fence - return the object's exclusive fence
> >    * @obj: the reservation object
> >    *
> >    * Returns the exclusive fence (if any). Caller must either hold the objects
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi
  2021-06-23 15:12                           ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 15:15                             ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-23 15:15 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, Daniel Stone, Daniel Vetter, Daniel Vetter,
	Intel Graphics Development, Kevin Wang, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Alex Deucher, mesa-dev,
	Michel Dänzer, Dennis Li, Deepak R Varma

Am 23.06.21 um 17:12 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 05:07:17PM +0200, Christian König wrote:
>> Am 23.06.21 um 17:03 schrieb Daniel Vetter:
>>> On Wed, Jun 23, 2021 at 04:58:27PM +0200, Bas Nieuwenhuizen wrote:
>>>> On Wed, Jun 23, 2021 at 4:50 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>> On Wed, Jun 23, 2021 at 4:02 PM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 23.06.21 um 15:49 schrieb Daniel Vetter:
>>>>>>> On Wed, Jun 23, 2021 at 3:44 PM Christian König
>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>> Am 23.06.21 um 15:38 schrieb Bas Nieuwenhuizen:
>>>>>>>>> On Wed, Jun 23, 2021 at 2:59 PM Christian König
>>>>>>>>> <christian.koenig@amd.com> wrote:
>>>>>>>>>> Am 23.06.21 um 14:18 schrieb Daniel Vetter:
>>>>>>>>>>> On Wed, Jun 23, 2021 at 11:45 AM Bas Nieuwenhuizen
>>>>>>>>>>> <bas@basnieuwenhuizen.nl> wrote:
>>>>>>>>>>>> On Tue, Jun 22, 2021 at 6:55 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>>>>>>>>>>>>> WARNING: Absolutely untested beyond "gcc isn't dying in agony".
>>>>>>>>>>>>>
>>>>>>>>>>>>> Implicit fencing done properly needs to treat the implicit fencing
>>>>>>>>>>>>> slots like a funny kind of IPC mailbox. In other words it needs to be
>>>>>>>>>>>>> managed explicitly. This is the only way it will mesh well with explicit
>>>>>>>>>>>>> fencing userspace like vk, and it's also the bare minimum required to
>>>>>>>>>>>>> be able to manage anything else that wants to use the same buffer on
>>>>>>>>>>>>> multiple engines in parallel, and still be able to share it through
>>>>>>>>>>>>> implicit sync.
>>>>>>>>>>>>>
>>>>>>>>>>>>> amdgpu completely lacks such an uapi. Fix this.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Luckily the concept of ignoring implicit fences exists already, and
>>>>>>>>>>>>> takes care of all the complexities of making sure that non-optional
>>>>>>>>>>>>> fences (like bo moves) are not ignored. This support was added in
>>>>>>>>>>>>>
>>>>>>>>>>>>> commit 177ae09b5d699a5ebd1cafcee78889db968abf54
>>>>>>>>>>>>> Author: Andres Rodriguez <andresx7@gmail.com>
>>>>>>>>>>>>> Date:   Fri Sep 15 20:44:06 2017 -0400
>>>>>>>>>>>>>
>>>>>>>>>>>>>          drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately it's the wrong semantics, because it's a bo flag and
>>>>>>>>>>>>> disables implicit sync on an allocated buffer completely.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We _do_ want implicit sync, but control it explicitly. For this we
>>>>>>>>>>>>> need a flag on the drm_file, so that a given userspace (like vulkan)
>>>>>>>>>>>>> can manage the implicit sync slots explicitly. The other side of the
>>>>>>>>>>>>> pipeline (compositor, other process or just different stage in a media
>>>>>>>>>>>>> pipeline in the same process) can then either do the same, or fully
>>>>>>>>>>>>> participate in the implicit sync as implemented by the kernel by
>>>>>>>>>>>>> default.
>>>>>>>>>>>>>
>>>>>>>>>>>>> By building on the existing flag for buffers we avoid any issues with
>>>>>>>>>>>>> opening up additional security concerns - anything this new flag here
>>>>>>>>>>>>> allows is already possible.
>>>>>>>>>>>>>
>>>>>>>>>>>>> All drivers which support this concept of a userspace-specific
>>>>>>>>>>>>> opt-out of implicit sync have a flag in their CS ioctl, but in reality
>>>>>>>>>>>>> that turned out to be a bit too inflexible. See the discussion below,
>>>>>>>>>>>>> let's try to do a bit better for amdgpu.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This alone only allows us to completely avoid any stalls due to
>>>>>>>>>>>>> implicit sync, it does not yet allow us to use implicit sync as a
>>>>>>>>>>>>> strange form of IPC for sync_file.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For that we need two more pieces:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - a way to get the current implicit sync fences out of a buffer. Could
>>>>>>>>>>>>>        be done in a driver ioctl, but everyone needs this, and generally a
>>>>>>>>>>>>>        dma-buf is involved anyway to establish the sharing. So an ioctl on
>>>>>>>>>>>>>        the dma-buf makes a ton more sense:
>>>>>>>>>>>>>
>>>>>>>>>>>>>        https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/
>>>>>>>>>>>>>
>>>>>>>>>>>>>        Current drivers in upstream solve this by having the opt-out flag
>>>>>>>>>>>>>        on their CS ioctl. This has the downside that very often the CS
>>>>>>>>>>>>>        which must actually stall for the implicit fence is run a while
>>>>>>>>>>>>>        after the implicit fence point was logically sampled per the api
>>>>>>>>>>>>>        spec (vk passes an explicit syncobj around for that afaiui), and so
>>>>>>>>>>>>>        results in oversync. Converting the implicit sync fences into a
>>>>>>>>>>>>>        snap-shot sync_file is actually accurate.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Similarly we need to be able to set the exclusive implicit fence.
>>>>>>>>>>>>>        Current drivers again do this with a CS ioctl flag, with again the
>>>>>>>>>>>>>        same problems that the time the CS happens additional dependencies
>>>>>>>>>>>>>        have been added. An explicit ioctl to only insert a sync_file (while
>>>>>>>>>>>>>        respecting the rules for how exclusive and shared fence slots must
>>>>>>>>>>>>>        be updated in struct dma_resv) is much better. This is proposed here:
>>>>>>>>>>>>>
>>>>>>>>>>>>>        https://lore.kernel.org/dri-devel/20210520190007.534046-5-jason@jlekstrand.net/
>>>>>>>>>>>>>
>>>>>>>>>>>>> These three pieces together allow userspace to fully control implicit
>>>>>>>>>>>>> fencing and remove all unnecessary stall points due to them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Well, as much as the implicit fencing model fundamentally allows:
>>>>>>>>>>>>> There is only one set of fences, you can only choose to sync against
>>>>>>>>>>>>> only writers (exclusive slot), or everyone. Hence suballocating
>>>>>>>>>>>>> multiple buffers or anything else like this is fundamentally not
>>>>>>>>>>>>> possible, and can only be fixed by a proper explicit fencing model.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Aside from that caveat this model gets implicit fencing as close to
>>>>>>>>>>>>> explicit fencing semantics as possible:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On the actual implementation I opted for a simple setparam ioctl, no
>>>>>>>>>>>>> locking (just atomic reads/writes) for simplicity. There is a nice
>>>>>>>>>>>>> flag parameter in the VM ioctl which we could use, except:
>>>>>>>>>>>>> - it's not checked, so userspace likely passes garbage
>>>>>>>>>>>>> - there's already a comment that userspace _does_ pass garbage in the
>>>>>>>>>>>>>        priority field
>>>>>>>>>>>>> So yeah unfortunately this flag parameter for setting vm flags is
>>>>>>>>>>>>> useless, and we need to hack up a new one.
>>>>>>>>>>>>>
>>>>>>>>>>>>> v2: Explain why a new SETPARAM (Jason)
>>>>>>>>>>>>>
>>>>>>>>>>>>> v3: Bas noticed I forgot to hook up the dependency-side shortcut. We
>>>>>>>>>>>>> need both, or this doesn't do much.
>>>>>>>>>>>>>
>>>>>>>>>>>>> v4: Rebase over the amdgpu patch to always set the implicit sync
>>>>>>>>>>>>> fences.
>>>>>>>>>>>> So I think there is still a case missing in this implementation.
>>>>>>>>>>>> Consider these 3 cases
>>>>>>>>>>>>
>>>>>>>>>>>> (format: a->b: b waits on a. Yes, I know arrows are hard)
>>>>>>>>>>>>
>>>>>>>>>>>> explicit->explicit: This doesn't wait now, which is good
>>>>>>>>>>>> Implicit->explicit: This doesn't wait now, which is good
>>>>>>>>>>>> explicit->implicit : This still waits as the explicit submission still
>>>>>>>>>>>> adds shared fences and most things that set an exclusive fence for
>>>>>>>>>>>> implicit sync will hence wait on it.
>>>>>>>>>>>>
>>>>>>>>>>>> This is probably good enough for what radv needs now but also sounds
>>>>>>>>>>>> like a risk wrt baking in new uapi behavior that we don't want to be
>>>>>>>>>>>> the end result.
>>>>>>>>>>>>
>>>>>>>>>>>> Within AMDGPU this is probably solvable in two ways:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) Downgrade AMDGPU_SYNC_NE_OWNER to AMDGPU_SYNC_EXPLICIT for shared fences.
>>>>>>>>>>> I'm not sure that works. I think the right fix is that radeonsi also
>>>>>>>>>>> switches to this model, with maybe a per-bo CS flag to indicate
>>>>>>>>>>> write access, to cut down on the number of ioctls that are needed
>>>>>>>>>>> otherwise on shared buffers. This per-bo flag would essentially select
>>>>>>>>>>> between SYNC_NE_OWNER and SYNC_EXPLICIT on a per-buffer basis.
>>>>>>>>>> Yeah, but I'm still not entirely sure why that approach isn't sufficient?
>>>>>>>>>>
>>>>>>>>>> Problem with the per context or per vm flag is that you then don't get
>>>>>>>>>> any implicit synchronization any more when another process starts using
>>>>>>>>>> the buffer.
>>>>>>>>> That is exactly what I want for Vulkan :)
>>>>>>>> Yeah, but as far as I know this is not something we can do.
>>>>>>>>
>>>>>>>> See we have use cases like screen capture and debug which rely on that
>>>>>>>> behavior.
>>>>>>> They will keep working, if (and only if) the vulkan side sets the
>>>>>>> winsys fences correctly. Also, everything else in vulkan aside from
>>>>>>> winsys is explicitly not synced at all, you have to import drm syncobj
>>>>>>> timeline on the gl side.
>>>>>>>
>>>>>>>> The only thing we can do is to say on a per buffer flag that a buffer
>>>>>>>> should not participate in implicit sync at all.
>>>>>>> Nah, this doesn't work. Because it's not a global decision, it's a local
>>>>>>> decision for the renderer. Vulkan wants to control implicit sync
>>>>>>> explicitly, and the kernel can't force more synchronization. If a
>>>>>>> buffer is shared as a winsys buffer between vulkan client and gl using
>>>>>>> compositor, then you _have_ to use implicit sync on it. But vk needs
>>>>>>> to set the fences directly (and if the app gets it wrong, you get
>>>>>>> misrendering, but that is the specified behaviour of vulkan).
>>>>>> Yeah, but that's exactly what we tried to avoid.
>>>>>>
>>>>>> Mhm, when we attach the flag to the process/VM then this would break the
>>>>>> use case of VA-API and Vulkan in the same process.
>>>>>>
>>>>>> But I think if you attach the flag to the context that should indeed
>>>>>> work fine.
>>>>> Yeah that's a question I have, whether the drm_file is shared within
>>>>> one process among everything, or whether radeonsi/libva/vk each have
>>>>> their own. If each have their own drm_file, then we should be fine,
>>>>> otherwise we need to figure out another place to put this (worst case
>>>>> as a CS extension that vk just sets on every submit).
>>>> libdrm_amdgpu dedupes it all so we mostly end up with one drm_file per
>>>> process (modulo minigbm on chromeos and modulo a master fd).
>>>>
>>>> That said the current proposal is for the context right? And on the
>>>> context this should pretty much work? So I'm not sure why this is the
>>>> part we are discussing?
>>> It's on the fpriv->vm, so on the FD. I assumed vulkan at least would want
>>> to have its private VM for this. And on the quick I didn't see any other
>>> way to create a VM than to have an FD of your own.
>> You can't have your own FD in libdrm_amdgpu userspace. We had a pretty hard
>> design discussion about that already.
>>
>> What you could do is to load your own copy of libdrm_amdgpu, but I won't
>> recommend that.
>>
>> Just putting the flag on the context instead of the VM is much cleaner as
>> far as I can see anyway.
> Helper for the blind? If you guys expect me to move that myself ...

Add the flag to struct amdgpu_ctx, you can use amdgpu_ctx_ioctl() to set 
it. Then during CS that is available as p->ctx.
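
Very rough sketch of what that could look like, purely to illustrate the
idea (the ctx op name and the exact plumbing below are made up, this is
not actual uapi):

  /* amdgpu_ctx.h: per-context opt-out, analogous to vm.no_implicit_sync
   * from this patch */
  struct amdgpu_ctx {
  	...
  	bool	no_implicit_sync;
  };

  /* amdgpu_ctx.c: new case in the existing amdgpu_ctx_ioctl() switch,
   * AMDGPU_CTX_OP_SET_IMPLICIT_SYNC is a hypothetical op, ctx looked up
   * from args->in.ctx_id; flags != 0 means skip implicit sync, mirroring
   * the SETPARAM semantics */
  case AMDGPU_CTX_OP_SET_IMPLICIT_SYNC:
  	ctx->no_implicit_sync = !!args->in.flags;
  	r = 0;
  	break;

  /* amdgpu_cs.c: the two checks from this patch then read the flag from
   * the context instead of the VM */
  bool no_implicit_sync = READ_ONCE(p->ctx->no_implicit_sync);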

If I'm not totally mistaken that is also what Bas had in mind with his 
comment.

Christian.

> -Daniel
>
>> Christian.
>>
>>> If there's something else that means "gpu context with its own vm" then
>>> the flag would need to be moved there, pointers appreciated (but maybe
>>> someone with hw + userspace can do that quicker).
>>> -Daniel
>>>
>>>>> Also yes this risks that a vk app which was violating the winsys
>>>>> spec will now break, which is why I think we should do this sooner
>>>>> than later. Otherwise the list of w/a we might need to apply in vk
>>>>> userspace will become very long :-( At least since this is purely
>>>>> opt-in from userspace, we only need to have the w/a list in userspace,
>>>>> where mesa has the infrastructure for that already.
>>>>> -Daniel
>>>>>
>>>>>> Christian.
>>>>>>
>>>>>>> -Daniel
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Christian.
>>>>>>>>
>>>>>>>>>>> The current amdgpu uapi just doesn't allow any other model without an
>>>>>>>>>>> explicit opt-in. So current implicit sync userspace just has to
>>>>>>>>>>> oversync, there's not much choice.
>>>>>>>>>>>
>>>>>>>>>>>> 2) Have an EXPLICIT fence owner that is used for explicit submissions
>>>>>>>>>>>> that is ignored by AMDGPU_SYNC_NE_OWNER.
>>>>>>>>>>>>
>>>>>>>>>>>> But this doesn't solve cross-driver interactions here.
>>>>>>>>>>> Yeah cross-driver is still entirely unsolved, because
>>>>>>>>>>> amdgpu_bo_explicit_sync() on the bo didn't solve that either.
>>>>>>>>>> Hui? You have lost me. Why is that still unsolved?
>>>>>>>>> The part we're trying to solve with this patch is Vulkan should not
>>>>>>>>> participate in any implicit sync at all wrt submissions (and then
>>>>>>>>> handle the implicit sync for WSI explicitly using the fence
>>>>>>>>> import/export stuff that Jason wrote). As long as we add shared fences to
>>>>>>>>> the dma_resv we participate in implicit sync (at the level of an
>>>>>>>>> implicit sync read) still, at least from the perspective of later jobs
>>>>>>>>> waiting on these fences.
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>> -Daniel
>>>>>>>>>>>
>>>>>>>>>>>>> Cc: mesa-dev@lists.freedesktop.org
>>>>>>>>>>>>> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
>>>>>>>>>>>>> Cc: Dave Airlie <airlied@gmail.com>
>>>>>>>>>>>>> Cc: Rob Clark <robdclark@chromium.org>
>>>>>>>>>>>>> Cc: Kristian H. Kristensen <hoegsberg@google.com>
>>>>>>>>>>>>> Cc: Michel Dänzer <michel@daenzer.net>
>>>>>>>>>>>>> Cc: Daniel Stone <daniels@collabora.com>
>>>>>>>>>>>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>>>>>>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>>>>>>>>> Cc: Deepak R Varma <mh12gx2825@gmail.com>
>>>>>>>>>>>>> Cc: Chen Li <chenli@uniontech.com>
>>>>>>>>>>>>> Cc: Kevin Wang <kevin1.wang@amd.com>
>>>>>>>>>>>>> Cc: Dennis Li <Dennis.Li@amd.com>
>>>>>>>>>>>>> Cc: Luben Tuikov <luben.tuikov@amd.com>
>>>>>>>>>>>>> Cc: linaro-mm-sig@lists.linaro.org
>>>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c  |  7 +++++--
>>>>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 21 +++++++++++++++++++++
>>>>>>>>>>>>>       drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h  |  6 ++++++
>>>>>>>>>>>>>       include/uapi/drm/amdgpu_drm.h           | 10 ++++++++++
>>>>>>>>>>>>>       4 files changed, 42 insertions(+), 2 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>>> index 65df34c17264..c5386d13eb4a 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
>>>>>>>>>>>>> @@ -498,6 +498,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>>>>>>>              struct amdgpu_bo *gds;
>>>>>>>>>>>>>              struct amdgpu_bo *gws;
>>>>>>>>>>>>>              struct amdgpu_bo *oa;
>>>>>>>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>>>>>>>              int r;
>>>>>>>>>>>>>
>>>>>>>>>>>>>              INIT_LIST_HEAD(&p->validated);
>>>>>>>>>>>>> @@ -577,7 +578,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>>>>>>>>>>>>>
>>>>>>>>>>>>>                      e->bo_va = amdgpu_vm_bo_find(vm, bo);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -               if (bo->tbo.base.dma_buf && !amdgpu_bo_explicit_sync(bo)) {
>>>>>>>>>>>>> +               if (bo->tbo.base.dma_buf &&
>>>>>>>>>>>>> +                   !(no_implicit_sync || amdgpu_bo_explicit_sync(bo))) {
>>>>>>>>>>>>>                              e->chain = dma_fence_chain_alloc();
>>>>>>>>>>>>>                              if (!e->chain) {
>>>>>>>>>>>>>                                      r = -ENOMEM;
>>>>>>>>>>>>> @@ -649,6 +651,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>>>>>>>       {
>>>>>>>>>>>>>              struct amdgpu_fpriv *fpriv = p->filp->driver_priv;
>>>>>>>>>>>>>              struct amdgpu_bo_list_entry *e;
>>>>>>>>>>>>> +       bool no_implicit_sync = READ_ONCE(fpriv->vm.no_implicit_sync);
>>>>>>>>>>>>>              int r;
>>>>>>>>>>>>>
>>>>>>>>>>>>>              list_for_each_entry(e, &p->validated, tv.head) {
>>>>>>>>>>>>> @@ -656,7 +659,7 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
>>>>>>>>>>>>>                      struct dma_resv *resv = bo->tbo.base.resv;
>>>>>>>>>>>>>                      enum amdgpu_sync_mode sync_mode;
>>>>>>>>>>>>>
>>>>>>>>>>>>> -               sync_mode = amdgpu_bo_explicit_sync(bo) ?
>>>>>>>>>>>>> +               sync_mode = no_implicit_sync || amdgpu_bo_explicit_sync(bo) ?
>>>>>>>>>>>>>                              AMDGPU_SYNC_EXPLICIT : AMDGPU_SYNC_NE_OWNER;
>>>>>>>>>>>>>                      r = amdgpu_sync_resv(p->adev, &p->job->sync, resv, sync_mode,
>>>>>>>>>>>>>                                           &fpriv->vm);
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>>>> index c080ba15ae77..f982626b5328 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>>>>>>>>>>>>> @@ -1724,6 +1724,26 @@ int amdgpu_file_to_fpriv(struct file *filp, struct amdgpu_fpriv **fpriv)
>>>>>>>>>>>>>              return 0;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>> +int amdgpu_setparam_ioctl(struct drm_device *dev, void *data,
>>>>>>>>>>>>> +                         struct drm_file *filp)
>>>>>>>>>>>>> +{
>>>>>>>>>>>>> +       struct drm_amdgpu_setparam *setparam = data;
>>>>>>>>>>>>> +       struct amdgpu_fpriv *fpriv = filp->driver_priv;
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       switch (setparam->param) {
>>>>>>>>>>>>> +       case AMDGPU_SETPARAM_NO_IMPLICIT_SYNC:
>>>>>>>>>>>>> +               if (setparam->value)
>>>>>>>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, true);
>>>>>>>>>>>>> +               else
>>>>>>>>>>>>> +                       WRITE_ONCE(fpriv->vm.no_implicit_sync, false);
>>>>>>>>>>>>> +               break;
>>>>>>>>>>>>> +       default:
>>>>>>>>>>>>> +               return -EINVAL;
>>>>>>>>>>>>> +       }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +       return 0;
>>>>>>>>>>>>> +}
>>>>>>>>>>>>> +
>>>>>>>>>>>>>       const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>>>>>>>              DRM_IOCTL_DEF_DRV(AMDGPU_GEM_CREATE, amdgpu_gem_create_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>>>              DRM_IOCTL_DEF_DRV(AMDGPU_CTX, amdgpu_ctx_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>>> @@ -1742,6 +1762,7 @@ const struct drm_ioctl_desc amdgpu_ioctls_kms[] = {
>>>>>>>>>>>>>              DRM_IOCTL_DEF_DRV(AMDGPU_GEM_VA, amdgpu_gem_va_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>>>              DRM_IOCTL_DEF_DRV(AMDGPU_GEM_OP, amdgpu_gem_op_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>>>              DRM_IOCTL_DEF_DRV(AMDGPU_GEM_USERPTR, amdgpu_gem_userptr_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>>> +       DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
>>>>>>>>>>>>>       };
>>>>>>>>>>>>>
>>>>>>>>>>>>>       static const struct drm_driver amdgpu_kms_driver = {
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> index ddb85a85cbba..0e8c440c6303 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> @@ -321,6 +321,12 @@ struct amdgpu_vm {
>>>>>>>>>>>>>              bool                    bulk_moveable;
>>>>>>>>>>>>>              /* Flag to indicate if VM is used for compute */
>>>>>>>>>>>>>              bool                    is_compute_context;
>>>>>>>>>>>>> +       /*
>>>>>>>>>>>>> +        * Flag to indicate whether implicit sync should always be skipped on
>>>>>>>>>>>>> +        * this context. We do not care about races at all, userspace is allowed
>>>>>>>>>>>>> +        * to shoot itself with implicit sync to its fullest liking.
>>>>>>>>>>>>> +        */
>>>>>>>>>>>>> +       bool no_implicit_sync;
>>>>>>>>>>>>>       };
>>>>>>>>>>>>>
>>>>>>>>>>>>>       struct amdgpu_vm_manager {
>>>>>>>>>>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>>>> index 0cbd1540aeac..9eae245c14d6 100644
>>>>>>>>>>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>>>>>>>>>>> @@ -54,6 +54,7 @@ extern "C" {
>>>>>>>>>>>>>       #define DRM_AMDGPU_VM                  0x13
>>>>>>>>>>>>>       #define DRM_AMDGPU_FENCE_TO_HANDLE     0x14
>>>>>>>>>>>>>       #define DRM_AMDGPU_SCHED               0x15
>>>>>>>>>>>>> +#define DRM_AMDGPU_SETPARAM            0x16
>>>>>>>>>>>>>
>>>>>>>>>>>>>       #define DRM_IOCTL_AMDGPU_GEM_CREATE    DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_CREATE, union drm_amdgpu_gem_create)
>>>>>>>>>>>>>       #define DRM_IOCTL_AMDGPU_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_GEM_MMAP, union drm_amdgpu_gem_mmap)
>>>>>>>>>>>>> @@ -71,6 +72,7 @@ extern "C" {
>>>>>>>>>>>>>       #define DRM_IOCTL_AMDGPU_VM            DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_VM, union drm_amdgpu_vm)
>>>>>>>>>>>>>       #define DRM_IOCTL_AMDGPU_FENCE_TO_HANDLE DRM_IOWR(DRM_COMMAND_BASE + DRM_AMDGPU_FENCE_TO_HANDLE, union drm_amdgpu_fence_to_handle)
>>>>>>>>>>>>>       #define DRM_IOCTL_AMDGPU_SCHED         DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SCHED, union drm_amdgpu_sched)
>>>>>>>>>>>>> +#define DRM_IOCTL_AMDGPU_SETPARAM      DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)
>>>>>>>>>>>>>
>>>>>>>>>>>>>       /**
>>>>>>>>>>>>>        * DOC: memory domains
>>>>>>>>>>>>> @@ -306,6 +308,14 @@ union drm_amdgpu_sched {
>>>>>>>>>>>>>              struct drm_amdgpu_sched_in in;
>>>>>>>>>>>>>       };
>>>>>>>>>>>>>
>>>>>>>>>>>>> +#define AMDGPU_SETPARAM_NO_IMPLICIT_SYNC       1
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +struct drm_amdgpu_setparam {
>>>>>>>>>>>>> +       /* AMDGPU_SETPARAM_* */
>>>>>>>>>>>>> +       __u32   param;
>>>>>>>>>>>>> +       __u32   value;
>>>>>>>>>>>>> +};
>>>>>>>>>>>>> +
>>>>>>>>>>>>>       /*
>>>>>>>>>>>>>        * This is not a reliable API and you should expect it to fail for any
>>>>>>>>>>>>>        * number of reasons and have fallback path that do not use userptr to
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 2.32.0.rc2
>>>>>>>>>>>>>
>>>>> --
>>>>> Daniel Vetter
>>>>> Software Engineer, Intel Corporation
>>>>> http://blog.ffwll.ch/


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 07/15] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
  2021-06-22 20:20       ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 15:39         ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-23 15:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	Thomas Zimmermann, DRI Development

Hi Daniel,

> > >        * equivalent functionality should be implemented through private
> > >        * members in the plane structure.
> > >        *
> > > -      * Drivers which always have their buffers pinned should use
> > > -      * drm_gem_plane_helper_prepare_fb() for this hook.
> > > +      * For GEM drivers who neither have a @prepare_fb not @cleanup_fb hook
> > s/not/nor/ ??
> 
> Yup.
> 
> > > +      * set drm_gem_plane_helper_prepare_fb() is called automatically to
> >               ^add comma?
> > > +      * implement this.
> >
> >
> > Leave cleanup_fb out of the description to make it more readable.
> 
> With the not->nor typo fixed, why does this make it more readable?
> Afaiui neither ... nor ... is fairly standard English, and I really
> want to make this the default only if you specify absolutely no plane
> fb handling of your own.

What I tried to suggest was something like this:

"
Drivers which always have their buffers pinned should use
drm_gem_plane_helper_prepare_fb() for this hook.
For GEM drivers who do not have a @prepare_fb hook set,
drm_gem_plane_helper_prepare_fb() is called automatically to
implement this.
"

But anyway this is fine, and with the typo fixed:
Acked-by: Sam Ravnborg <sam@ravnborg.org>

	Sam

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH] dma-buf: Switch to inline kerneldoc
  2021-06-22 16:54   ` Daniel Vetter
  (?)
@ 2021-06-23 16:17     ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:17 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Christian König,
	Sam Ravnborg, Alex Deucher, Daniel Vetter, Sumit Semwal,
	Dave Airlie, Nirmoy Das, Deepak R Varma, Chen Li, Kevin Wang,
	linux-media, linaro-mm-sig

Also review & update everything while we're at it.

This is prep work to smash a ton of stuff into the kerneldoc for
@resv.

v2: Move the doc for sysfs_entry.attachment_uid to the right place too
(Sam)

Acked-by: Christian König <christian.koenig@amd.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-buf.h | 116 +++++++++++++++++++++++++++++++---------
 1 file changed, 90 insertions(+), 26 deletions(-)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 92eec38a03aa..81cebf414505 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -289,30 +289,6 @@ struct dma_buf_ops {
 
 /**
  * struct dma_buf - shared buffer object
- * @size: size of the buffer; invariant over the lifetime of the buffer.
- * @file: file pointer used for sharing buffers across, and for refcounting.
- * @attachments: list of dma_buf_attachment that denotes all devices attached,
- *               protected by dma_resv lock.
- * @ops: dma_buf_ops associated with this buffer object.
- * @lock: used internally to serialize list manipulation, attach/detach and
- *        vmap/unmap
- * @vmapping_counter: used internally to refcnt the vmaps
- * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
- * @exp_name: name of the exporter; useful for debugging.
- * @name: userspace-provided name; useful for accounting and debugging,
- *        protected by @resv.
- * @name_lock: spinlock to protect name access
- * @owner: pointer to exporter module; used for refcounting when exporter is a
- *         kernel module.
- * @list_node: node for dma_buf accounting and debugging.
- * @priv: exporter specific private data for this buffer object.
- * @resv: reservation object linked to this dma-buf
- * @poll: for userspace poll support
- * @cb_excl: for userspace poll support
- * @cb_shared: for userspace poll support
- * @sysfs_entry: for exposing information about this buffer in sysfs.
- * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
- * and is incremented on each attach.
  *
  * This represents a shared buffer, created by calling dma_buf_export(). The
  * userspace representation is a normal file descriptor, which can be created by
@@ -324,24 +300,100 @@ struct dma_buf_ops {
  * Device DMA access is handled by the separate &struct dma_buf_attachment.
  */
 struct dma_buf {
+	/**
+	 * @size:
+	 *
+	 * Size of the buffer; invariant over the lifetime of the buffer.
+	 */
 	size_t size;
+
+	/**
+	 * @file:
+	 *
+	 * File pointer used for sharing buffers across, and for refcounting.
+	 * See dma_buf_get() and dma_buf_put().
+	 */
 	struct file *file;
+
+	/**
+	 * @attachments:
+	 *
+	 * List of dma_buf_attachment that denotes all devices attached,
+	 * protected by &dma_resv lock @resv.
+	 */
 	struct list_head attachments;
+
+	/** @ops: dma_buf_ops associated with this buffer object. */
 	const struct dma_buf_ops *ops;
+
+	/**
+	 * @lock:
+	 *
+	 * Used internally to serialize list manipulation, attach/detach and
+	 * vmap/unmap. Note that in many cases this is superseded by
+	 * dma_resv_lock() on @resv.
+	 */
 	struct mutex lock;
+
+	/**
+	 * @vmapping_counter:
+	 *
+	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
+	 * Protected by @lock.
+	 */
 	unsigned vmapping_counter;
+
+	/**
+	 * @vmap_ptr:
+	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
+	 */
 	struct dma_buf_map vmap_ptr;
+
+	/**
+	 * @exp_name:
+	 *
+	 * Name of the exporter; useful for debugging. See the
+	 * DMA_BUF_SET_NAME IOCTL.
+	 */
 	const char *exp_name;
+
+	/**
+	 * @name:
+	 *
+	 * Userspace-provided name; useful for accounting and debugging,
+	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
+	 */
 	const char *name;
+
+	/** @name_lock: Spinlock to protect name access for read access. */
 	spinlock_t name_lock;
+
+	/**
+	 * @owner:
+	 *
+	 * Pointer to exporter module; used for refcounting when exporter is a
+	 * kernel module.
+	 */
 	struct module *owner;
+
+	/** @list_node: node for dma_buf accounting and debugging. */
 	struct list_head list_node;
+
+	/** @priv: exporter specific private data for this buffer object. */
 	void *priv;
+
+	/**
+	 * @resv:
+	 *
+	 * Reservation object linked to this dma-buf.
+	 */
 	struct dma_resv *resv;
 
-	/* poll support */
+	/** @poll: for userspace poll support */
 	wait_queue_head_t poll;
 
+	/** @cb_excl: for userspace poll support */
+	/** @cb_shared: for userspace poll support */
 	struct dma_buf_poll_cb_t {
 		struct dma_fence_cb cb;
 		wait_queue_head_t *poll;
@@ -349,10 +401,22 @@ struct dma_buf {
 		__poll_t active;
 	} cb_excl, cb_shared;
 #ifdef CONFIG_DMABUF_SYSFS_STATS
-	/* for sysfs stats */
+	/**
+	 * @sysfs_entry:
+	 *
+	 * For exposing information about this buffer in sysfs. See also
+	 * `DMA-BUF statistics`_ for the uapi this enables.
+	 */
 	struct dma_buf_sysfs_entry {
 		struct kobject kobj;
 		struct dma_buf *dmabuf;
+
+		/**
+		 * @sysfs_entry.attachment_uid:
+		 *
+		 * This is protected by the dma_resv_lock() on @resv and is
+		 * incremented on each attach.
+		 */
 		unsigned int attachment_uid;
 		struct kset *attach_stats_kset;
 	} *sysfs_entry;
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH] dma-buf: Switch to inline kerneldoc
@ 2021-06-23 16:17     ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:17 UTC (permalink / raw)
  To: DRI Development
  Cc: Deepak R Varma, Daniel Vetter, Intel Graphics Development,
	Kevin Wang, linaro-mm-sig, Nirmoy Das, Chen Li, Dave Airlie,
	Alex Deucher, Daniel Vetter, Sam Ravnborg, Christian König,
	linux-media

Also review & update everything while we're at it.

This is prep work to smash a ton of stuff into the kerneldoc for
@resv.

v2: Move the doc for sysfs_entry.attachment_uid to the right place too
(Sam)

Acked-by: Christian König <christian.koenig@amd.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-buf.h | 116 +++++++++++++++++++++++++++++++---------
 1 file changed, 90 insertions(+), 26 deletions(-)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 92eec38a03aa..81cebf414505 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -289,30 +289,6 @@ struct dma_buf_ops {
 
 /**
  * struct dma_buf - shared buffer object
- * @size: size of the buffer; invariant over the lifetime of the buffer.
- * @file: file pointer used for sharing buffers across, and for refcounting.
- * @attachments: list of dma_buf_attachment that denotes all devices attached,
- *               protected by dma_resv lock.
- * @ops: dma_buf_ops associated with this buffer object.
- * @lock: used internally to serialize list manipulation, attach/detach and
- *        vmap/unmap
- * @vmapping_counter: used internally to refcnt the vmaps
- * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
- * @exp_name: name of the exporter; useful for debugging.
- * @name: userspace-provided name; useful for accounting and debugging,
- *        protected by @resv.
- * @name_lock: spinlock to protect name access
- * @owner: pointer to exporter module; used for refcounting when exporter is a
- *         kernel module.
- * @list_node: node for dma_buf accounting and debugging.
- * @priv: exporter specific private data for this buffer object.
- * @resv: reservation object linked to this dma-buf
- * @poll: for userspace poll support
- * @cb_excl: for userspace poll support
- * @cb_shared: for userspace poll support
- * @sysfs_entry: for exposing information about this buffer in sysfs.
- * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
- * and is incremented on each attach.
  *
  * This represents a shared buffer, created by calling dma_buf_export(). The
  * userspace representation is a normal file descriptor, which can be created by
@@ -324,24 +300,100 @@ struct dma_buf_ops {
  * Device DMA access is handled by the separate &struct dma_buf_attachment.
  */
 struct dma_buf {
+	/**
+	 * @size:
+	 *
+	 * Size of the buffer; invariant over the lifetime of the buffer.
+	 */
 	size_t size;
+
+	/**
+	 * @file:
+	 *
+	 * File pointer used for sharing buffers across, and for refcounting.
+	 * See dma_buf_get() and dma_buf_put().
+	 */
 	struct file *file;
+
+	/**
+	 * @attachments:
+	 *
+	 * List of dma_buf_attachment that denotes all devices attached,
+	 * protected by &dma_resv lock @resv.
+	 */
 	struct list_head attachments;
+
+	/** @ops: dma_buf_ops associated with this buffer object. */
 	const struct dma_buf_ops *ops;
+
+	/**
+	 * @lock:
+	 *
+	 * Used internally to serialize list manipulation, attach/detach and
+	 * vmap/unmap. Note that in many cases this is superseeded by
+	 * dma_resv_lock() on @resv.
+	 */
 	struct mutex lock;
+
+	/**
+	 * @vmapping_counter:
+	 *
+	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
+	 * Protected by @lock.
+	 */
 	unsigned vmapping_counter;
+
+	/**
+	 * @vmap_ptr:
+	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
+	 */
 	struct dma_buf_map vmap_ptr;
+
+	/**
+	 * @exp_name:
+	 *
+	 * Name of the exporter; useful for debugging. See the
+	 * DMA_BUF_SET_NAME IOCTL.
+	 */
 	const char *exp_name;
+
+	/**
+	 * @name:
+	 *
+	 * Userspace-provided name; useful for accounting and debugging,
+	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
+	 */
 	const char *name;
+
+	/** @name_lock: Spinlock to protect name acces for read access. */
 	spinlock_t name_lock;
+
+	/**
+	 * @owner:
+	 *
+	 * Pointer to exporter module; used for refcounting when exporter is a
+	 * kernel module.
+	 */
 	struct module *owner;
+
+	/** @list_node: node for dma_buf accounting and debugging. */
 	struct list_head list_node;
+
+	/** @priv: exporter specific private data for this buffer object. */
 	void *priv;
+
+	/**
+	 * @resv:
+	 *
+	 * Reservation object linked to this dma-buf.
+	 */
 	struct dma_resv *resv;
 
-	/* poll support */
+	/** @poll: for userspace poll support */
 	wait_queue_head_t poll;
 
+	/** @cb_excl: for userspace poll support */
+	/** @cb_shared: for userspace poll support */
 	struct dma_buf_poll_cb_t {
 		struct dma_fence_cb cb;
 		wait_queue_head_t *poll;
@@ -349,10 +401,22 @@ struct dma_buf {
 		__poll_t active;
 	} cb_excl, cb_shared;
 #ifdef CONFIG_DMABUF_SYSFS_STATS
-	/* for sysfs stats */
+	/**
+	 * @sysfs_entry:
+	 *
+	 * For exposing information about this buffer in sysfs. See also
+	 * `DMA-BUF statistics`_ for the uapi this enables.
+	 */
 	struct dma_buf_sysfs_entry {
 		struct kobject kobj;
 		struct dma_buf *dmabuf;
+
+		/**
+		 * @sysfs_entry.attachment_uid:
+		 *
+		 * This is protected by the dma_resv_lock() on @resv and is
+		 * incremented on each attach.
+		 */
 		unsigned int attachment_uid;
 		struct kset *attach_stats_kset;
 	} *sysfs_entry;
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH] dma-buf: Switch to inline kerneldoc
@ 2021-06-23 16:17     ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:17 UTC (permalink / raw)
  To: DRI Development
  Cc: Deepak R Varma, Daniel Vetter, Intel Graphics Development,
	Kevin Wang, Sumit Semwal, linaro-mm-sig, Nirmoy Das, Chen Li,
	Dave Airlie, Alex Deucher, Daniel Vetter, Sam Ravnborg,
	Christian König, linux-media

Also review & update everything while we're at it.

This is prep work to smash a ton of stuff into the kerneldoc for
@resv.

v2: Move the doc for sysfs_entry.attachment_uid to the right place too
(Sam)

Acked-by: Christian König <christian.koenig@amd.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Nirmoy Das <nirmoy.das@amd.com>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 include/linux/dma-buf.h | 116 +++++++++++++++++++++++++++++++---------
 1 file changed, 90 insertions(+), 26 deletions(-)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 92eec38a03aa..81cebf414505 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -289,30 +289,6 @@ struct dma_buf_ops {
 
 /**
  * struct dma_buf - shared buffer object
- * @size: size of the buffer; invariant over the lifetime of the buffer.
- * @file: file pointer used for sharing buffers across, and for refcounting.
- * @attachments: list of dma_buf_attachment that denotes all devices attached,
- *               protected by dma_resv lock.
- * @ops: dma_buf_ops associated with this buffer object.
- * @lock: used internally to serialize list manipulation, attach/detach and
- *        vmap/unmap
- * @vmapping_counter: used internally to refcnt the vmaps
- * @vmap_ptr: the current vmap ptr if vmapping_counter > 0
- * @exp_name: name of the exporter; useful for debugging.
- * @name: userspace-provided name; useful for accounting and debugging,
- *        protected by @resv.
- * @name_lock: spinlock to protect name access
- * @owner: pointer to exporter module; used for refcounting when exporter is a
- *         kernel module.
- * @list_node: node for dma_buf accounting and debugging.
- * @priv: exporter specific private data for this buffer object.
- * @resv: reservation object linked to this dma-buf
- * @poll: for userspace poll support
- * @cb_excl: for userspace poll support
- * @cb_shared: for userspace poll support
- * @sysfs_entry: for exposing information about this buffer in sysfs.
- * The attachment_uid member of @sysfs_entry is protected by dma_resv lock
- * and is incremented on each attach.
  *
  * This represents a shared buffer, created by calling dma_buf_export(). The
  * userspace representation is a normal file descriptor, which can be created by
@@ -324,24 +300,100 @@ struct dma_buf_ops {
  * Device DMA access is handled by the separate &struct dma_buf_attachment.
  */
 struct dma_buf {
+	/**
+	 * @size:
+	 *
+	 * Size of the buffer; invariant over the lifetime of the buffer.
+	 */
 	size_t size;
+
+	/**
+	 * @file:
+	 *
+	 * File pointer used for sharing buffers across, and for refcounting.
+	 * See dma_buf_get() and dma_buf_put().
+	 */
 	struct file *file;
+
+	/**
+	 * @attachments:
+	 *
+	 * List of dma_buf_attachment that denotes all devices attached,
+	 * protected by &dma_resv lock @resv.
+	 */
 	struct list_head attachments;
+
+	/** @ops: dma_buf_ops associated with this buffer object. */
 	const struct dma_buf_ops *ops;
+
+	/**
+	 * @lock:
+	 *
+	 * Used internally to serialize list manipulation, attach/detach and
+	 * vmap/unmap. Note that in many cases this is superseeded by
+	 * dma_resv_lock() on @resv.
+	 */
 	struct mutex lock;
+
+	/**
+	 * @vmapping_counter:
+	 *
+	 * Used internally to refcnt the vmaps returned by dma_buf_vmap().
+	 * Protected by @lock.
+	 */
 	unsigned vmapping_counter;
+
+	/**
+	 * @vmap_ptr:
+	 * The current vmap ptr if @vmapping_counter > 0. Protected by @lock.
+	 */
 	struct dma_buf_map vmap_ptr;
+
+	/**
+	 * @exp_name:
+	 *
+	 * Name of the exporter; useful for debugging. See the
+	 * DMA_BUF_SET_NAME IOCTL.
+	 */
 	const char *exp_name;
+
+	/**
+	 * @name:
+	 *
+	 * Userspace-provided name; useful for accounting and debugging,
+	 * protected by dma_resv_lock() on @resv and @name_lock for read access.
+	 */
 	const char *name;
+
+	/** @name_lock: Spinlock to protect name acces for read access. */
 	spinlock_t name_lock;
+
+	/**
+	 * @owner:
+	 *
+	 * Pointer to exporter module; used for refcounting when exporter is a
+	 * kernel module.
+	 */
 	struct module *owner;
+
+	/** @list_node: node for dma_buf accounting and debugging. */
 	struct list_head list_node;
+
+	/** @priv: exporter specific private data for this buffer object. */
 	void *priv;
+
+	/**
+	 * @resv:
+	 *
+	 * Reservation object linked to this dma-buf.
+	 */
 	struct dma_resv *resv;
 
-	/* poll support */
+	/** @poll: for userspace poll support */
 	wait_queue_head_t poll;
 
+	/** @cb_excl: for userspace poll support */
+	/** @cb_shared: for userspace poll support */
 	struct dma_buf_poll_cb_t {
 		struct dma_fence_cb cb;
 		wait_queue_head_t *poll;
@@ -349,10 +401,22 @@ struct dma_buf {
 		__poll_t active;
 	} cb_excl, cb_shared;
 #ifdef CONFIG_DMABUF_SYSFS_STATS
-	/* for sysfs stats */
+	/**
+	 * @sysfs_entry:
+	 *
+	 * For exposing information about this buffer in sysfs. See also
+	 * `DMA-BUF statistics`_ for the uapi this enables.
+	 */
 	struct dma_buf_sysfs_entry {
 		struct kobject kobj;
 		struct dma_buf *dmabuf;
+
+		/**
+		 * @sysfs_entry.attachment_uid:
+		 *
+		 * This is protected by the dma_resv_lock() on @resv and is
+		 * incremented on each attach.
+		 */
 		unsigned int attachment_uid;
 		struct kset *attach_stats_kset;
 	} *sysfs_entry;
-- 
2.32.0.rc2

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-22 16:54   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 16:19     ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:19 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, linaro-mm-sig, Daniel Stone, Daniel Vetter,
	Daniel Vetter, Intel Graphics Development, Kevin Wang,
	Michel Dänzer, Luben Tuikov, Kristian H . Kristensen,
	Chen Li, Alex Deucher, mesa-dev, Christian König, Dennis Li,
	Deepak R Varma

Docs for struct dma_resv are fairly clear:

"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

Furthermore, here is a review across all of upstream.

First off, render drivers and how they set implicit fences:

- nouveau follows this contract, see in validate_fini_no_ticket()

			nouveau_bo_fence(nvbo, fence, !!b->write_domains);

  and that last boolean controls whether the exclusive or shared fence
  slot is used.

- radeon follows this contract by setting

		p->relocs[i].tv.num_shared = !r->write_domain;

  in radeon_cs_parser_relocs(), which ensures that the call to
  ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
  right thing.

- vmwgfx seems to follow this contract with the shotgun approach of
  always setting ttm_val_buf->num_shared = 0, which means
  ttm_eu_fence_buffer_objects() will only use the exclusive slot.

- etnaviv follows this contract, as can be trivially seen by looking
  at submit_attach_object_fences()

- i915 is a bit of a convoluted maze with multiple paths leading to
  i915_vma_move_to_active(). Which sets the exclusive flag if
  EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
  softpin mode, or through the write_domain when using relocations. It
  follows this contract.

- lima follows this contract, see lima_gem_submit() which sets the
  exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
  bo

- msm follows this contract, see msm_gpu_submit() which sets the
  exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer

- panfrost follows this contract with the shotgun approach of just
  always setting the exclusive fence, see
  panfrost_attach_object_fences(). Benefits of a single engine I guess

- v3d follows this contract with the same shotgun approach in
  v3d_attach_fences_and_unlock_reservation(), but it has at least an
  XXX comment that maybe this should be improved

- vc4 uses the same shotgun approach of always setting an exclusive
  fence, see vc4_update_bo_seqnos()

- vgem also follows this contract, see vgem_fence_attach_ioctl() and
  the VGEM_FENCE_WRITE. This is used in some igts to validate prime
  sharing with i915.ko without the need of a 2nd gpu

- virtio follows this contract again with the shotgun approach of
  always setting an exclusive fence, see virtio_gpu_array_add_fence()

This covers the setting of the exclusive fences when writing.
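
To make the contract concrete, here is a minimal sketch of the attach
side for a hypothetical driver (foo_attach_fence() and the write flag
are made up for illustration, the dma_resv calls are the documented
interface; the caller must hold the dma_resv lock and have reserved a
shared slot with dma_resv_reserve_shared()):

	static void foo_attach_fence(struct drm_gem_object *obj,
				     struct dma_fence *fence, bool write)
	{
		/* read access gets a shared fence, write access the
		 * exclusive fence */
		if (write)
			dma_resv_add_excl_fence(obj->resv, fence);
		else
			dma_resv_add_shared_fence(obj->resv, fence);
	}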

Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few (a minimal sketch of the dependency side
follows this list):

- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
  implicit dependencies (which is used by vulkan)

- etnaviv does this. Implicit dependencies are collected in
  submit_fence_sync(), again with an opt-out flag
  ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
  etnaviv_sched_dependency which is the
  drm_sched_backend_ops->dependency callback.

- vc4 seems to not do much here, maybe gets away with it by not having
  a scheduler and only a single engine. Since all newer broadcom chips than
  the OG vc4 use v3d for rendering, which follows this contract, the
  impact of this issue is fairly small.

- v3d does this using the drm_gem_fence_array_add_implicit() helper,
  which its drm_sched_backend_ops->dependency callback
  v3d_job_dependency() then picks up.

- panfrost is nice here and tracks the implicit fences in
  panfrost_job->implicit_fences, which again the
  drm_sched_backend_ops->dependency callback panfrost_job_dependency()
  picks up. It is mildly questionable though since it only picks up
  exclusive fences in panfrost_acquire_object_fences(), but not buggy
  in practice because it also always sets the exclusive fence. It
  should pick up both sets of fences, just in case there's ever going
  to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
  pcie port and a real gpu, which might actually happen eventually. A
  bug, but easy to fix. Should probably use the
  drm_gem_fence_array_add_implicit() helper.

- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
  the same schema as v3d.

- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
  but because it doesn't use the drm/scheduler it handles fences from
  the wrong context with a synchronous dma_fence_wait. See
  submit_fence_sync() leading to msm_gem_sync_object(). Investing into
  a scheduler might be a good idea.

- all the remaining drivers are ttm based, where I hope they do
  appropriately obey implicit fences already. I didn't do the full
  audit there because a) not following the contract would confuse ttm
  quite thoroughly and b) reading non-standard scheduler and submit code
  which isn't based on drm/scheduler is a pain.
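
A minimal sketch of that dependency side, roughly as the drm/scheduler
based drivers above do it (foo_collect_deps() is a made-up name,
drm_gem_fence_array_add_implicit() is the existing helper):

	static int foo_collect_deps(struct xarray *deps,
				    struct drm_gem_object **bos,
				    int bo_count, bool write)
	{
		int i, ret;

		for (i = 0; i < bo_count; i++) {
			/* write: depend on all fences, read: only on
			 * the exclusive fence */
			ret = drm_gem_fence_array_add_implicit(deps, bos[i],
							       write);
			if (ret)
				return ret;
		}

		return 0;
	}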

Onwards to the display side.

- Any driver using the drm_gem_plane_helper_prepare_fb() helper will
  handle this correctly. Overwhelmingly most drivers get this right,
  except a few totally don't. I'll follow up with a patch to make this
  the default and avoid a bunch of bugs.

- I didn't audit the ttm drivers, but given that dma_resv started
  there I hope they get this right.

In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.

Amdgpu tried to fix this already in

commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Sep 19 16:54:35 2018 +0200

    drm/amdgpu: fix using shared fence for exported BOs v2

but this fix falls short in a number of areas:

- It's racy, by the time the buffer is shared it might be too late. To
  make sure there's definitely never a problem we need to set the
  fences correctly for any buffer that's potentially exportable.

- It's breaking uapi: dma-buf fds support poll() and differentiate
  between reads and writes (see the userspace sketch after this list),
  which was introduced in

	commit 9b495a5887994a6d74d5c261d012083a92b94738
	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
	Date:   Tue Jul 1 12:57:43 2014 +0200

	    dma-buf: add poll support, v3

- Christian König wants to nack new uapi building further on this
  dma_resv contract because it breaks amdgpu, quoting

  "Yeah, and that is exactly the reason why I will NAK this uAPI change.

  "This doesn't works for amdgpu at all for the reasons outlined above."

  https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/

  Rejecting new development because your own driver is broken and
  violates established cross driver contracts and uapi is really not
  how upstream works.
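
As a reminder of what that poll() uapi looks like from userspace, a
rough sketch (the helper name is made up; the POLLIN/POLLOUT meaning is
the one documented for dma-buf implicit fence poll support):

	#include <poll.h>

	/* POLLIN becomes ready once pending writes (the exclusive fence)
	 * have signalled, POLLOUT once all pending access (shared and
	 * exclusive fences) has signalled. */
	static int wait_for_implicit_fences(int dmabuf_fd, short events)
	{
		struct pollfd pfd = { .fd = dmabuf_fd, .events = events };

		return poll(&pfd, 1, -1);
	}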

Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit of a plan:

- amdgpu needs a proper uapi for handling implicit fencing. The funny
  thing is that to do it correctly, implicit fencing must be treated
  as a very strange IPC mechanism for transporting fences, where both
  setting the fence and dependency intercepts must be handled
  explicitly. Current best practice is a per-bo flag to indicate
  writes, and a per-bo flag to skip implicit fencing in the CS
  ioctl as a new chunk.

- Since amdgpu has been shipping with broken behaviour we need an
  opt-out flag from the butchered implicit fencing model to enable the
  proper explicit implicit fencing model.

- for kernel memory fences due to bo moves at least the i915 idea is
  to use ttm_bo->moving. amdgpu probably needs the same.

- since the current p2p dma-buf interface assumes the kernel memory
  fence is in the exclusive dma_resv fence slot we need to add a new
  fence slot for kernel fences, which must never be ignored. Since
  currently only amdgpu supports this there's no real problem here
  yet, until amdgpu gains a NO_IMPLICIT CS flag.

- New userspace needs to ship in enough desktop distros so that users
  won't notice the perf impact. I think we can ignore LTS distros who
  upgrade their kernels but not their mesa3d snapshot.

- Then when this is all in place we can merge this patch here.

What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.

Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.

v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.

This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and has for a while. The
only things now missing are:
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.

Jason has a patch series to fill that gap with a bunch of generic
ioctls on the dma-buf fd:

https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/

v3: Since Christian has fixed amdgpu now in

commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Jun 9 13:51:36 2021 +0200

    drm/amdgpu: rework dma_resv handling v3

Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.

Since dynamic importers have different rules also hammer these in
again while we're at it.

v4:
- Add the missing "through the device" in the dynamic section that I
  overlooked.
- Fix a kerneldoc markup mistake, the link didn't connect

Reviewed-by: Christian König <christian.koenig@amd.com> (v3)

Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/dma-buf.h | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 81cebf414505..494f639ee486 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -386,6 +386,45 @@ struct dma_buf {
 	 * @resv:
 	 *
 	 * Reservation object linked to this dma-buf.
+	 *
+	 * IMPLICIT SYNCHRONIZATION RULES:
+	 *
+	 * Drivers which support implicit synchronization of buffer access as
+	 * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
+	 * below rules.
+	 *
+	 * - Drivers should add a shared fence through
+	 *   dma_resv_add_shared_fence() for anything the userspace API
+	 *   considers a read access. This highly depends upon the API and
+	 *   window system: E.g. OpenGL is generally implicitly synchronized on
+	 *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
+	 *   generally explicitly synchronized for everything, and window system
+	 *   buffers have explicit API calls (which then need to make sure the
+	 *   implicit fences stored here in @resv are updated correctly).
+	 *
+	 * - Similarly drivers should set the exclusive fence through
+	 *   dma_resv_add_excl_fence() for anything the userspace API considers
+	 *   write access.
+	 *
+	 * - Drivers may just always set the exclusive fence, since that only
+	 *   causes unnecessary synchronization, but no correctness issues.
+	 *
+	 * - Some drivers only expose a synchronous userspace API with no
+	 *   pipelining across drivers. These do not set any fences for their
+	 *   access. An example here is v4l.
+	 *
+	 * DYNAMIC IMPORTER RULES:
+	 *
+	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
+	 * additional constraints on how they set up fences:
+	 *
+	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 *   signal before allowing access to the buffer's underlying storage
+	 *   through the device.
+	 *
+	 * - Dynamic importers should set fences for any access that they can't
+	 *   disable immediately from their &dma_buf_attach_ops.move_notify
+	 *   callback.
 	 */
 	struct dma_resv *resv;
 
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
@ 2021-06-23 16:19     ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:19 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, linaro-mm-sig, Daniel Stone, Daniel Vetter,
	Daniel Vetter, Intel Graphics Development, Kevin Wang,
	Sumit Semwal, Michel Dänzer, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Bas Nieuwenhuizen,
	Alex Deucher, mesa-dev, Dave Airlie, Christian König,
	Dennis Li, Deepak R Varma

Docs for struct dma_resv are fairly clear:

"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

Furthermore, here is a review across all of upstream.

First off, render drivers and how they set implicit fences:

- nouveau follows this contract, see in validate_fini_no_ticket()

			nouveau_bo_fence(nvbo, fence, !!b->write_domains);

  and that last boolean controls whether the exclusive or shared fence
  slot is used.

- radeon follows this contract by setting

		p->relocs[i].tv.num_shared = !r->write_domain;

  in radeon_cs_parser_relocs(), which ensures that the call to
  ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
  right thing.

- vmwgfx seems to follow this contract with the shotgun approach of
  always setting ttm_val_buf->num_shared = 0, which means
  ttm_eu_fence_buffer_objects() will only use the exclusive slot.

- etnaviv follows this contract, as can be trivially seen by looking
  at submit_attach_object_fences()

- i915 is a bit of a convoluted maze with multiple paths leading to
  i915_vma_move_to_active(). Which sets the exclusive flag if
  EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
  softpin mode, or through the write_domain when using relocations. It
  follows this contract.

- lima follows this contract, see lima_gem_submit() which sets the
  exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
  bo

- msm follows this contract, see msm_gpu_submit() which sets the
  exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer

- panfrost follows this contract with the shotgun approach of just
  always setting the exclusive fence, see
  panfrost_attach_object_fences(). Benefits of a single engine I guess

- v3d follows this contract with the same shotgun approach in
  v3d_attach_fences_and_unlock_reservation(), but it has at least an
  XXX comment that maybe this should be improved

- vc4 uses the same shotgun approach of always setting an exclusive
  fence, see vc4_update_bo_seqnos()

- vgem also follows this contract, see vgem_fence_attach_ioctl() and
  the VGEM_FENCE_WRITE. This is used in some igts to validate prime
  sharing with i915.ko without the need of a 2nd gpu

- virtio follows this contract again with the shotgun approach of
  always setting an exclusive fence, see virtio_gpu_array_add_fence()

This covers the setting of the exclusive fences when writing.

Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few:

- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
  implicit dependencies (which is used by vulkan)

- etnaviv does this. Implicit dependencies are collected in
  submit_fence_sync(), again with an opt-out flag
  ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
  etnaviv_sched_dependency which is the
  drm_sched_backend_ops->dependency callback.

- vc4 seems to not do much here, maybe gets away with it by not having
  a scheduler and only a single engine. Since all newer broadcom chips than
  the OG vc4 use v3d for rendering, which follows this contract, the
  impact of this issue is fairly small.

- v3d does this using the drm_gem_fence_array_add_implicit() helper,
  which its drm_sched_backend_ops->dependency callback
  v3d_job_dependency() then picks up.

- panfrost is nice here and tracks the implicit fences in
  panfrost_job->implicit_fences, which again the
  drm_sched_backend_ops->dependency callback panfrost_job_dependency()
  picks up. It is mildly questionable though since it only picks up
  exclusive fences in panfrost_acquire_object_fences(), but not buggy
  in practice because it also always sets the exclusive fence. It
  should pick up both sets of fences, just in case there's ever going
  to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
  pcie port and a real gpu, which might actually happen eventually. A
  bug, but easy to fix. Should probably use the
  drm_gem_fence_array_add_implicit() helper.

- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
  the same schema as v3d.

- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
  but because it doesn't use the drm/scheduler it handles fences from
  the wrong context with a synchronous dma_fence_wait. See
  submit_fence_sync() leading to msm_gem_sync_object(). Investing into
  a scheduler might be a good idea.

- all the remaining drivers are ttm based, where I hope they do
  appropriately obey implicit fences already. I didn't do the full
  audit there because a) not following the contract would confuse ttm
  quite thoroughly and b) reading non-standard scheduler and submit code
  which isn't based on drm/scheduler is a pain.

Onwards to the display side.

- Any driver using the drm_gem_plane_helper_prepare_fb() helper will
  handle this correctly. Overwhelmingly most drivers get this right,
  except a few totally don't. I'll follow up with a patch to make this
  the default and avoid a bunch of bugs.

- I didn't audit the ttm drivers, but given that dma_resv started
  there I hope they get this right.

In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.

Amdgpu tried to fix this already in

commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Sep 19 16:54:35 2018 +0200

    drm/amdgpu: fix using shared fence for exported BOs v2

but this fix falls short in a number of areas:

- It's racy, by the time the buffer is shared it might be too late. To
  make sure there's definitely never a problem we need to set the
  fences correctly for any buffer that's potentially exportable.

- It's breaking uapi: dma-buf fds support poll() and differentiate
  between reads and writes, which was introduced in

	commit 9b495a5887994a6d74d5c261d012083a92b94738
	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
	Date:   Tue Jul 1 12:57:43 2014 +0200

	    dma-buf: add poll support, v3

- Christian König wants to nack new uapi building further on this
  dma_resv contract because it breaks amdgpu, quoting

  "Yeah, and that is exactly the reason why I will NAK this uAPI change.

  "This doesn't works for amdgpu at all for the reasons outlined above."

  https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/

  Rejecting new development because your own driver is broken and
  violates established cross driver contracts and uapi is really not
  how upstream works.

Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit of a plan:

- amdgpu needs a proper uapi for handling implicit fencing. The funny
  thing is that to do it correctly, implicit fencing must be treated
  as a very strange IPC mechanism for transporting fences, where both
  setting the fence and dependency intercepts must be handled
  explicitly. Current best practice is a per-bo flag to indicate
  writes, and a per-bo flag to skip implicit fencing in the CS
  ioctl as a new chunk.

- Since amdgpu has been shipping with broken behaviour we need an
  opt-out flag from the butchered implicit fencing model to enable the
  proper explicit implicit fencing model.

- for kernel memory fences due to bo moves at least the i915 idea is
  to use ttm_bo->moving. amdgpu probably needs the same.

- since the current p2p dma-buf interface assumes the kernel memory
  fence is in the exclusive dma_resv fence slot we need to add a new
  fence slot for kernel fences, which must never be ignored. Since
  currently only amdgpu supports this there's no real problem here
  yet, until amdgpu gains a NO_IMPLICIT CS flag.

- New userspace needs to ship in enough desktop distros so that users
  won't notice the perf impact. I think we can ignore LTS distros who
  upgrade their kernels but not their mesa3d snapshot.

- Then when this is all in place we can merge this patch here.

What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.

Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.

v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.

This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and has for a while. The
only things now missing are:
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.

Jason has a patch series to fill that gap with a bunch of generic
ioctls on the dma-buf fd:

https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/

v3: Since Christian has fixed amdgpu now in

commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Jun 9 13:51:36 2021 +0200

    drm/amdgpu: rework dma_resv handling v3

Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.

Since dynamic importers have different rules also hammer these in
again while we're at it.

v4:
- Add the missing "through the device" in the dynamic section that I
  overlooked.
- Fix a kerneldoc markup mistake, the link didn't connect

Reviewed-by: Christian König <christian.koenig@amd.com> (v3)

Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/dma-buf.h | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 81cebf414505..494f639ee486 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -386,6 +386,45 @@ struct dma_buf {
 	 * @resv:
 	 *
 	 * Reservation object linked to this dma-buf.
+	 *
+	 * IMPLICIT SYNCHRONIZATION RULES:
+	 *
+	 * Drivers which support implicit synchronization of buffer access as
+	 * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
+	 * below rules.
+	 *
+	 * - Drivers should add a shared fence through
+	 *   dma_resv_add_shared_fence() for anything the userspace API
+	 *   considers a read access. This highly depends upon the API and
+	 *   window system: E.g. OpenGL is generally implicitly synchronized on
+	 *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
+	 *   generally explicitly synchronized for everything, and window system
+	 *   buffers have explicit API calls (which then need to make sure the
+	 *   implicit fences stored here in @resv are updated correctly).
+	 *
+	 * - Similarly drivers should set the exclusive fence through
+	 *   dma_resv_add_excl_fence() for anything the userspace API considers
+	 *   write access.
+	 *
+	 * - Drivers may just always set the exclusive fence, since that only
+	 *   causes unnecessary synchronization, but no correctness issues.
+	 *
+	 * - Some drivers only expose a synchronous userspace API with no
+	 *   pipelining across drivers. These do not set any fences for their
+	 *   access. An example here is v4l.
+	 *
+	 * DYNAMIC IMPORTER RULES:
+	 *
+	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
+	 * additional constraints on how they set up fences:
+	 *
+	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 *   signal before allowing access to the buffer's underlying storage
+	 *   through the device.
+	 *
+	 * - Dynamic importers should set fences for any access that they can't
+	 *   disable immediately from their &dma_buf_attach_ops.move_notify
+	 *   callback.
 	 */
 	struct dma_resv *resv;
 
-- 
2.32.0.rc2

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 16:22     ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:22 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: David Airlie, Daniel Vetter, DRI Development, Thomas Zimmermann,
	Daniel Vetter, Sam Ravnborg

There are a bunch of atomic drivers that don't do this quite correctly,
luckily most of them aren't in wide use or people would have noticed
the tearing.

By making this the default we avoid the constant audit pain and can
additionally remove a ton of lines from vfuncs for a bit more clarity
in smaller drivers.

While at it complain if there's a cleanup_fb hook but no prepare_fb
hook, because that makes no sense. I haven't found any driver which
violates this, but better safe than sorry.

Subsequent patches will reap the benefits.
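
For a typical GEM driver this means the plane helper vtable can simply
drop the hook; a rough sketch with made-up foo_* names (not any real
driver):

	static const struct drm_plane_helper_funcs foo_plane_helper_funcs = {
		.atomic_check = foo_plane_atomic_check,
		.atomic_update = foo_plane_atomic_update,
		/* no .prepare_fb/.cleanup_fb: the atomic helpers now fall
		 * back to drm_gem_plane_helper_prepare_fb() automatically */
	};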

v2: It's neither ... nor, not not (Sam)

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_atomic_helper.c      | 10 ++++++++++
 drivers/gpu/drm/drm_gem_atomic_helper.c  |  3 +++
 include/drm/drm_modeset_helper_vtables.h |  7 +++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 531f2374b072..9f6c5f21c4d6 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -35,6 +35,7 @@
 #include <drm/drm_damage_helper.h>
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_print.h>
 #include <drm/drm_self_refresh_helper.h>
@@ -2408,6 +2409,15 @@ int drm_atomic_helper_prepare_planes(struct drm_device *dev,
 			ret = funcs->prepare_fb(plane, new_plane_state);
 			if (ret)
 				goto fail;
+		} else {
+			WARN_ON_ONCE(funcs->cleanup_fb);
+
+			if (!drm_core_check_feature(dev, DRIVER_GEM))
+				continue;
+
+			ret = drm_gem_plane_helper_prepare_fb(plane, new_plane_state);
+			if (ret)
+				goto fail;
 		}
 	}
 
diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
index a27135084ae5..bc9396f2a0ed 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -135,6 +135,9 @@
  * GEM based framebuffer drivers which have their buffers always pinned in
  * memory.
  *
+ * This function is the default implementation for GEM drivers of
+ * &drm_plane_helper_funcs.prepare_fb if no callback is provided.
+ *
  * See drm_atomic_set_fence_for_plane() for a discussion of implicit and
  * explicit fencing in atomic modeset updates.
  */
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index f3a4b47b3986..fdfa9f37ce05 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -1178,8 +1178,11 @@ struct drm_plane_helper_funcs {
 	 * equivalent functionality should be implemented through private
 	 * members in the plane structure.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_plane_helper_prepare_fb() for this hook.
+	 * For GEM drivers that have neither a @prepare_fb nor a @cleanup_fb
+	 * hook set, drm_gem_plane_helper_prepare_fb() is called automatically
+	 * to implement this. Other drivers which need additional plane
+	 * processing can call drm_gem_plane_helper_prepare_fb() from their
+	 * @prepare_fb hook.
 	 *
 	 * The helpers will call @cleanup_fb with matching arguments for every
 	 * successful call to this hook.
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
@ 2021-06-23 16:22     ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:22 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: David Airlie, Daniel Vetter, Maxime Ripard, DRI Development,
	Thomas Zimmermann, Daniel Vetter, Sam Ravnborg

There are a bunch of atomic drivers that don't do this quite correctly,
luckily most of them aren't in wide use or people would have noticed
the tearing.

By making this the default we avoid the constant audit pain and can
additionally remove a ton of lines from vfuncs for a bit more clarity
in smaller drivers.

While at it complain if there's a cleanup_fb hook but no prepare_fb
hook, because that makes no sense. I haven't found any driver which
violates this, but better safe than sorry.

Subsequent patches will reap the benefits.

v2: It's neither ... nor, not not (Sam)

Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_atomic_helper.c      | 10 ++++++++++
 drivers/gpu/drm/drm_gem_atomic_helper.c  |  3 +++
 include/drm/drm_modeset_helper_vtables.h |  7 +++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 531f2374b072..9f6c5f21c4d6 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -35,6 +35,7 @@
 #include <drm/drm_damage_helper.h>
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_print.h>
 #include <drm/drm_self_refresh_helper.h>
@@ -2408,6 +2409,15 @@ int drm_atomic_helper_prepare_planes(struct drm_device *dev,
 			ret = funcs->prepare_fb(plane, new_plane_state);
 			if (ret)
 				goto fail;
+		} else {
+			WARN_ON_ONCE(funcs->cleanup_fb);
+
+			if (!drm_core_check_feature(dev, DRIVER_GEM))
+				continue;
+
+			ret = drm_gem_plane_helper_prepare_fb(plane, new_plane_state);
+			if (ret)
+				goto fail;
 		}
 	}
 
diff --git a/drivers/gpu/drm/drm_gem_atomic_helper.c b/drivers/gpu/drm/drm_gem_atomic_helper.c
index a27135084ae5..bc9396f2a0ed 100644
--- a/drivers/gpu/drm/drm_gem_atomic_helper.c
+++ b/drivers/gpu/drm/drm_gem_atomic_helper.c
@@ -135,6 +135,9 @@
  * GEM based framebuffer drivers which have their buffers always pinned in
  * memory.
  *
+ * This function is the default implementation for GEM drivers of
+ * &drm_plane_helper_funcs.prepare_fb if no callback is provided.
+ *
  * See drm_atomic_set_fence_for_plane() for a discussion of implicit and
  * explicit fencing in atomic modeset updates.
  */
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index f3a4b47b3986..fdfa9f37ce05 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -1178,8 +1178,11 @@ struct drm_plane_helper_funcs {
 	 * equivalent functionality should be implemented through private
 	 * members in the plane structure.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_plane_helper_prepare_fb() for this hook.
+	 * For GEM drivers that have neither a @prepare_fb nor a @cleanup_fb
+	 * hook set, drm_gem_plane_helper_prepare_fb() is called automatically
+	 * to implement this. Other drivers which need additional plane
+	 * processing can call drm_gem_plane_helper_prepare_fb() from their
+	 * @prepare_fb hook.
 	 *
 	 * The helpers will call @cleanup_fb with matching arguments for every
 	 * successful call to this hook.
-- 
2.32.0.rc2

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [PATCH] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 16:24     ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:24 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: David Airlie, Daniel Vetter, DRI Development,
	Noralf Trønnes, Thomas Zimmermann, Daniel Vetter,
	Sam Ravnborg

It's tedious to review this all the time, and my audit showed that
arcpgu actually forgot to set this.

Make this the default and stop worrying.

Again I sprinkled WARN_ON_ONCE on top to make sure we don't have
strange combinations of hooks: cleanup_fb without prepare_fb doesn't
make sense, and since simpler drivers are all new they had better be GEM
based drivers.
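
For drivers on the simple display pipe helpers the same simplification
applies; a rough sketch with made-up foo_* names:

	static const struct drm_simple_display_pipe_funcs foo_pipe_funcs = {
		.enable = foo_pipe_enable,
		.disable = foo_pipe_disable,
		.update = foo_pipe_update,
		/* no .prepare_fb: drm_gem_simple_display_pipe_prepare_fb()
		 * is now used automatically */
	};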

v2: Warn and bail when it's _not_ a GEM driver (Noralf)

v3: It's neither ... nor, not not (Sam)

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_simple_kms_helper.c | 12 ++++++++++--
 include/drm/drm_simple_kms_helper.h     |  7 +++++--
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
index 0b095a313c44..735f4f34bcc4 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -9,6 +9,8 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_managed.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_probe_helper.h>
@@ -225,8 +227,14 @@ static int drm_simple_kms_plane_prepare_fb(struct drm_plane *plane,
 	struct drm_simple_display_pipe *pipe;
 
 	pipe = container_of(plane, struct drm_simple_display_pipe, plane);
-	if (!pipe->funcs || !pipe->funcs->prepare_fb)
-		return 0;
+	if (!pipe->funcs || !pipe->funcs->prepare_fb) {
+		if (WARN_ON_ONCE(!drm_core_check_feature(plane->dev, DRIVER_GEM)))
+			return 0;
+
+		WARN_ON_ONCE(pipe->funcs && pipe->funcs->cleanup_fb);
+
+		return drm_gem_simple_display_pipe_prepare_fb(pipe, state);
+	}
 
 	return pipe->funcs->prepare_fb(pipe, state);
 }
diff --git a/include/drm/drm_simple_kms_helper.h b/include/drm/drm_simple_kms_helper.h
index ef9944e9c5fc..cf07132d4ee8 100644
--- a/include/drm/drm_simple_kms_helper.h
+++ b/include/drm/drm_simple_kms_helper.h
@@ -116,8 +116,11 @@ struct drm_simple_display_pipe_funcs {
 	 * the documentation for the &drm_plane_helper_funcs.prepare_fb hook for
 	 * more details.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_simple_display_pipe_prepare_fb() for this hook.
+	 * For GEM drivers that have neither a @prepare_fb nor a @cleanup_fb
+	 * hook set, drm_gem_simple_display_pipe_prepare_fb() is called
+	 * automatically to implement this. Other drivers which need additional
+	 * plane processing can call drm_gem_simple_display_pipe_prepare_fb()
+	 * from their @prepare_fb hook.
 	 */
 	int (*prepare_fb)(struct drm_simple_display_pipe *pipe,
 			  struct drm_plane_state *plane_state);
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* [Intel-gfx] [PATCH] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
@ 2021-06-23 16:24     ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 16:24 UTC (permalink / raw)
  To: Intel Graphics Development
  Cc: David Airlie, Daniel Vetter, DRI Development,
	Noralf Trønnes, Maxime Ripard, Thomas Zimmermann,
	Daniel Vetter, Sam Ravnborg

It's tedious to review this all the time, and my audit showed that
arcpgu actually forgot to set this.

Make this the default and stop worrying.

Again I sprinkled WARN_ON_ONCE on top to make sure we don't have
strange combinations of hooks: cleanup_fb without prepare_fb doesn't
make sense, and since simpler drivers are all new they had better be GEM
based drivers.

v2: Warn and bail when it's _not_ a GEM driver (Noralf)

v3: It's neither ... nor, not not (Sam)

Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/drm_simple_kms_helper.c | 12 ++++++++++--
 include/drm/drm_simple_kms_helper.h     |  7 +++++--
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
index 0b095a313c44..735f4f34bcc4 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -9,6 +9,8 @@
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_bridge.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_gem_atomic_helper.h>
 #include <drm/drm_managed.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_probe_helper.h>
@@ -225,8 +227,14 @@ static int drm_simple_kms_plane_prepare_fb(struct drm_plane *plane,
 	struct drm_simple_display_pipe *pipe;
 
 	pipe = container_of(plane, struct drm_simple_display_pipe, plane);
-	if (!pipe->funcs || !pipe->funcs->prepare_fb)
-		return 0;
+	if (!pipe->funcs || !pipe->funcs->prepare_fb) {
+		if (WARN_ON_ONCE(!drm_core_check_feature(plane->dev, DRIVER_GEM)))
+			return 0;
+
+		WARN_ON_ONCE(pipe->funcs && pipe->funcs->cleanup_fb);
+
+		return drm_gem_simple_display_pipe_prepare_fb(pipe, state);
+	}
 
 	return pipe->funcs->prepare_fb(pipe, state);
 }
diff --git a/include/drm/drm_simple_kms_helper.h b/include/drm/drm_simple_kms_helper.h
index ef9944e9c5fc..cf07132d4ee8 100644
--- a/include/drm/drm_simple_kms_helper.h
+++ b/include/drm/drm_simple_kms_helper.h
@@ -116,8 +116,11 @@ struct drm_simple_display_pipe_funcs {
 	 * the documentation for the &drm_plane_helper_funcs.prepare_fb hook for
 	 * more details.
 	 *
-	 * Drivers which always have their buffers pinned should use
-	 * drm_gem_simple_display_pipe_prepare_fb() for this hook.
+	 * For GEM drivers that have neither a @prepare_fb nor a @cleanup_fb
+	 * hook set, drm_gem_simple_display_pipe_prepare_fb() is called
+	 * automatically to implement this. Other drivers which need additional
+	 * plane processing can call drm_gem_simple_display_pipe_prepare_fb()
+	 * from their @prepare_fb hook.
 	 */
 	int (*prepare_fb)(struct drm_simple_display_pipe *pipe,
 			  struct drm_plane_state *plane_state);
-- 
2.32.0.rc2

^ permalink raw reply related	[flat|nested] 175+ messages in thread

* Re: [PATCH 06/15] drm/panfrost: Fix implicit sync
  2021-06-22 16:55   ` Daniel Vetter
  (?)
@ 2021-06-23 16:47     ` Boris Brezillon
  -1 siblings, 0 replies; 175+ messages in thread
From: Boris Brezillon @ 2021-06-23 16:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Tomeu Vizoso, Christian König,
	Intel Graphics Development, Steven Price, linaro-mm-sig,
	Alyssa Rosenzweig, Daniel Vetter, linux-media

On Tue, 22 Jun 2021 18:55:02 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Currently this has no practical relevance I think because there's not
> many who can pull off a setup with panfrost and another gpu in the
> same system. But the rules are that if you're setting an exclusive
> fence, indicating a gpu write access in the implicit fencing system,
> then you need to wait for all fences, not just the previous exclusive
> fence.
> 
> panfrost against itself has no problem, because it always sets the
> exclusive fence (but that's probably something that will need to be
> fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
> Also no problem with that against display.
> 
> With the prep work done to switch over to the dependency helpers this
> is now a oneliner.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 71cd43fa1b36..ef004d587dc4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -203,9 +203,8 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
>  	int i, ret;
>  
>  	for (i = 0; i < bo_count; i++) {
> -		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
> -
> -		ret = drm_gem_fence_array_add(deps, fence);
> +		/* panfrost always uses write mode in its current uapi */
> +		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
>  		if (ret)
>  			return ret;
>  	}


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 06/15] drm/panfrost: Fix implicit sync
@ 2021-06-23 16:47     ` Boris Brezillon
  0 siblings, 0 replies; 175+ messages in thread
From: Boris Brezillon @ 2021-06-23 16:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Tomeu Vizoso, Intel Graphics Development, DRI Development,
	Steven Price, linaro-mm-sig, Alyssa Rosenzweig, Daniel Vetter,
	Christian König, linux-media

On Tue, 22 Jun 2021 18:55:02 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Currently this has no practical relevance I think because there's not
> many who can pull off a setup with panfrost and another gpu in the
> same system. But the rules are that if you're setting an exclusive
> fence, indicating a gpu write access in the implicit fencing system,
> then you need to wait for all fences, not just the previous exclusive
> fence.
> 
> panfrost against itself has no problem, because it always sets the
> exclusive fence (but that's probably something that will need to be
> fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
> Also no problem with that against display.
> 
> With the prep work done to switch over to the dependency helpers this
> is now a oneliner.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 71cd43fa1b36..ef004d587dc4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -203,9 +203,8 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
>  	int i, ret;
>  
>  	for (i = 0; i < bo_count; i++) {
> -		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
> -
> -		ret = drm_gem_fence_array_add(deps, fence);
> +		/* panfrost always uses write mode in its current uapi */
> +		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
>  		if (ret)
>  			return ret;
>  	}


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 06/15] drm/panfrost: Fix implicit sync
@ 2021-06-23 16:47     ` Boris Brezillon
  0 siblings, 0 replies; 175+ messages in thread
From: Boris Brezillon @ 2021-06-23 16:47 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Tomeu Vizoso, Intel Graphics Development, DRI Development,
	Steven Price, linaro-mm-sig, Alyssa Rosenzweig, Daniel Vetter,
	Christian König, linux-media

On Tue, 22 Jun 2021 18:55:02 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Currently this has no practical relevance I think because there's not
> many who can pull off a setup with panfrost and another gpu in the
> same system. But the rules are that if you're setting an exclusive
> fence, indicating a gpu write access in the implicit fencing system,
> then you need to wait for all fences, not just the previous exclusive
> fence.
> 
> panfrost against itself has no problem, because it always sets the
> exclusive fence (but that's probably something that will need to be
> fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
> Also no problem with that against display.
> 
> With the prep work done to switch over to the dependency helpers this
> is now a oneliner.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 71cd43fa1b36..ef004d587dc4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -203,9 +203,8 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
>  	int i, ret;
>  
>  	for (i = 0; i < bo_count; i++) {
> -		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
> -
> -		ret = drm_gem_fence_array_add(deps, fence);
> +		/* panfrost always uses write mode in its current uapi */
> +		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
>  		if (ret)
>  			return ret;
>  	}

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 05/15] drm/panfrost: Use xarray and helpers for depedency tracking
  2021-06-22 16:55   ` Daniel Vetter
  (?)
@ 2021-06-23 16:51     ` Boris Brezillon
  -1 siblings, 0 replies; 175+ messages in thread
From: Boris Brezillon @ 2021-06-23 16:51 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Tomeu Vizoso, Intel Graphics Development,
	Steven Price, linaro-mm-sig, Luben Tuikov, Alyssa Rosenzweig,
	Alex Deucher, Daniel Vetter, linux-media, Lee Jones,
	Christian König

On Tue, 22 Jun 2021 18:55:01 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> More consistency and prep work for the next patch.
> 
> Aside: I wonder whether we shouldn't just move this entire xarray
> business into the scheduler so that not everyone has to reinvent the
> same wheels. Cc'ing some scheduler people for this too.
> 
> v2: Correctly handle sched_lock since Lucas pointed out it's needed.
> 
> v3: Rebase, dma_resv_get_excl_unlocked got renamed
> 
> v4: Don't leak job references on failure (Steven).

Hehe, I had pretty much the same patch here [1].

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

[1]https://patchwork.kernel.org/project/dri-devel/patch/20210311092539.2405596-3-boris.brezillon@collabora.com/

> 
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 41 +++++++---------
>  drivers/gpu/drm/panfrost/panfrost_job.c | 65 +++++++++++--------------
>  drivers/gpu/drm/panfrost/panfrost_job.h |  8 ++-
>  3 files changed, 49 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 075ec0ef746c..3ee828f1e7a5 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -138,12 +138,6 @@ panfrost_lookup_bos(struct drm_device *dev,
>  	if (!job->bo_count)
>  		return 0;
>  
> -	job->implicit_fences = kvmalloc_array(job->bo_count,
> -				  sizeof(struct dma_fence *),
> -				  GFP_KERNEL | __GFP_ZERO);
> -	if (!job->implicit_fences)
> -		return -ENOMEM;
> -
>  	ret = drm_gem_objects_lookup(file_priv,
>  				     (void __user *)(uintptr_t)args->bo_handles,
>  				     job->bo_count, &job->bos);
> @@ -174,7 +168,7 @@ panfrost_lookup_bos(struct drm_device *dev,
>  }
>  
>  /**
> - * panfrost_copy_in_sync() - Sets up job->in_fences[] with the sync objects
> + * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
>   * referenced by the job.
>   * @dev: DRM device
>   * @file_priv: DRM file for this fd
> @@ -194,22 +188,14 @@ panfrost_copy_in_sync(struct drm_device *dev,
>  {
>  	u32 *handles;
>  	int ret = 0;
> -	int i;
> +	int i, in_fence_count;
>  
> -	job->in_fence_count = args->in_sync_count;
> +	in_fence_count = args->in_sync_count;
>  
> -	if (!job->in_fence_count)
> +	if (!in_fence_count)
>  		return 0;
>  
> -	job->in_fences = kvmalloc_array(job->in_fence_count,
> -					sizeof(struct dma_fence *),
> -					GFP_KERNEL | __GFP_ZERO);
> -	if (!job->in_fences) {
> -		DRM_DEBUG("Failed to allocate job in fences\n");
> -		return -ENOMEM;
> -	}
> -
> -	handles = kvmalloc_array(job->in_fence_count, sizeof(u32), GFP_KERNEL);
> +	handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
>  	if (!handles) {
>  		ret = -ENOMEM;
>  		DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
> @@ -218,16 +204,23 @@ panfrost_copy_in_sync(struct drm_device *dev,
>  
>  	if (copy_from_user(handles,
>  			   (void __user *)(uintptr_t)args->in_syncs,
> -			   job->in_fence_count * sizeof(u32))) {
> +			   in_fence_count * sizeof(u32))) {
>  		ret = -EFAULT;
>  		DRM_DEBUG("Failed to copy in syncobj handles\n");
>  		goto fail;
>  	}
>  
> -	for (i = 0; i < job->in_fence_count; i++) {
> +	for (i = 0; i < in_fence_count; i++) {
> +		struct dma_fence *fence;
> +
>  		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
> -					     &job->in_fences[i]);
> -		if (ret == -EINVAL)
> +					     &fence);
> +		if (ret)
> +			goto fail;
> +
> +		ret = drm_gem_fence_array_add(&job->deps, fence);
> +
> +		if (ret)
>  			goto fail;
>  	}
>  
> @@ -265,6 +258,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
>  
>  	kref_init(&job->refcount);
>  
> +	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
> +
>  	job->pfdev = pfdev;
>  	job->jc = args->jc;
>  	job->requirements = args->requirements;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 38f8580c19f1..71cd43fa1b36 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -196,14 +196,21 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>  	job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START);
>  }
>  
> -static void panfrost_acquire_object_fences(struct drm_gem_object **bos,
> -					   int bo_count,
> -					   struct dma_fence **implicit_fences)
> +static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
> +					  int bo_count,
> +					  struct xarray *deps)
>  {
> -	int i;
> +	int i, ret;
>  
> -	for (i = 0; i < bo_count; i++)
> -		implicit_fences[i] = dma_resv_get_excl_unlocked(bos[i]->resv);
> +	for (i = 0; i < bo_count; i++) {
> +		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
> +
> +		ret = drm_gem_fence_array_add(deps, fence);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
>  }
>  
>  static void panfrost_attach_object_fences(struct drm_gem_object **bos,
> @@ -240,10 +247,14 @@ int panfrost_job_push(struct panfrost_job *job)
>  
>  	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>  
> -	kref_get(&job->refcount); /* put by scheduler job completion */
> +	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> +					     &job->deps);
> +	if (ret) {
> +		mutex_unlock(&pfdev->sched_lock);
> +		goto unlock;
> +	}
>  
> -	panfrost_acquire_object_fences(job->bos, job->bo_count,
> -				       job->implicit_fences);
> +	kref_get(&job->refcount); /* put by scheduler job completion */
>  
>  	drm_sched_entity_push_job(&job->base, entity);
>  
> @@ -262,18 +273,15 @@ static void panfrost_job_cleanup(struct kref *ref)
>  {
>  	struct panfrost_job *job = container_of(ref, struct panfrost_job,
>  						refcount);
> +	struct dma_fence *fence;
> +	unsigned long index;
>  	unsigned int i;
>  
> -	if (job->in_fences) {
> -		for (i = 0; i < job->in_fence_count; i++)
> -			dma_fence_put(job->in_fences[i]);
> -		kvfree(job->in_fences);
> -	}
> -	if (job->implicit_fences) {
> -		for (i = 0; i < job->bo_count; i++)
> -			dma_fence_put(job->implicit_fences[i]);
> -		kvfree(job->implicit_fences);
> +	xa_for_each(&job->deps, index, fence) {
> +		dma_fence_put(fence);
>  	}
> +	xa_destroy(&job->deps);
> +
>  	dma_fence_put(job->done_fence);
>  	dma_fence_put(job->render_done_fence);
>  
> @@ -316,26 +324,9 @@ static struct dma_fence *panfrost_job_dependency(struct drm_sched_job *sched_job
>  						 struct drm_sched_entity *s_entity)
>  {
>  	struct panfrost_job *job = to_panfrost_job(sched_job);
> -	struct dma_fence *fence;
> -	unsigned int i;
> -
> -	/* Explicit fences */
> -	for (i = 0; i < job->in_fence_count; i++) {
> -		if (job->in_fences[i]) {
> -			fence = job->in_fences[i];
> -			job->in_fences[i] = NULL;
> -			return fence;
> -		}
> -	}
>  
> -	/* Implicit fences, max. one per BO */
> -	for (i = 0; i < job->bo_count; i++) {
> -		if (job->implicit_fences[i]) {
> -			fence = job->implicit_fences[i];
> -			job->implicit_fences[i] = NULL;
> -			return fence;
> -		}
> -	}
> +	if (!xa_empty(&job->deps))
> +		return xa_erase(&job->deps, job->last_dep++);
>  
>  	return NULL;
>  }
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
> index bbd3ba97ff67..82306a03b57e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.h
> @@ -19,9 +19,9 @@ struct panfrost_job {
>  	struct panfrost_device *pfdev;
>  	struct panfrost_file_priv *file_priv;
>  
> -	/* Optional fences userspace can pass in for the job to depend on. */
> -	struct dma_fence **in_fences;
> -	u32 in_fence_count;
> +	/* Contains both explicit and implicit fences */
> +	struct xarray deps;
> +	unsigned long last_dep;
>  
>  	/* Fence to be signaled by IRQ handler when the job is complete. */
>  	struct dma_fence *done_fence;
> @@ -30,8 +30,6 @@ struct panfrost_job {
>  	__u32 requirements;
>  	__u32 flush_id;
>  
> -	/* Exclusive fences we have taken from the BOs to wait for */
> -	struct dma_fence **implicit_fences;
>  	struct panfrost_gem_mapping **mappings;
>  	struct drm_gem_object **bos;
>  	u32 bo_count;


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 05/15] drm/panfrost: Use xarray and helpers for depedency tracking
@ 2021-06-23 16:51     ` Boris Brezillon
  0 siblings, 0 replies; 175+ messages in thread
From: Boris Brezillon @ 2021-06-23 16:51 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Tomeu Vizoso, Intel Graphics Development, DRI Development,
	Steven Price, linaro-mm-sig, Luben Tuikov, Alyssa Rosenzweig,
	Alex Deucher, Daniel Vetter, Lee Jones, Christian König,
	linux-media

On Tue, 22 Jun 2021 18:55:01 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> More consistency and prep work for the next patch.
> 
> Aside: I wonder whether we shouldn't just move this entire xarray
> business into the scheduler so that not everyone has to reinvent the
> same wheels. Cc'ing some scheduler people for this too.
> 
> v2: Correctly handle sched_lock since Lucas pointed out it's needed.
> 
> v3: Rebase, dma_resv_get_excl_unlocked got renamed
> 
> v4: Don't leak job references on failure (Steven).

Hehe, I had pretty much the same patch here [1].

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

[1]https://patchwork.kernel.org/project/dri-devel/patch/20210311092539.2405596-3-boris.brezillon@collabora.com/

> 
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Luben Tuikov <luben.tuikov@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Lee Jones <lee.jones@linaro.org>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_drv.c | 41 +++++++---------
>  drivers/gpu/drm/panfrost/panfrost_job.c | 65 +++++++++++--------------
>  drivers/gpu/drm/panfrost/panfrost_job.h |  8 ++-
>  3 files changed, 49 insertions(+), 65 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
> index 075ec0ef746c..3ee828f1e7a5 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_drv.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
> @@ -138,12 +138,6 @@ panfrost_lookup_bos(struct drm_device *dev,
>  	if (!job->bo_count)
>  		return 0;
>  
> -	job->implicit_fences = kvmalloc_array(job->bo_count,
> -				  sizeof(struct dma_fence *),
> -				  GFP_KERNEL | __GFP_ZERO);
> -	if (!job->implicit_fences)
> -		return -ENOMEM;
> -
>  	ret = drm_gem_objects_lookup(file_priv,
>  				     (void __user *)(uintptr_t)args->bo_handles,
>  				     job->bo_count, &job->bos);
> @@ -174,7 +168,7 @@ panfrost_lookup_bos(struct drm_device *dev,
>  }
>  
>  /**
> - * panfrost_copy_in_sync() - Sets up job->in_fences[] with the sync objects
> + * panfrost_copy_in_sync() - Sets up job->deps with the sync objects
>   * referenced by the job.
>   * @dev: DRM device
>   * @file_priv: DRM file for this fd
> @@ -194,22 +188,14 @@ panfrost_copy_in_sync(struct drm_device *dev,
>  {
>  	u32 *handles;
>  	int ret = 0;
> -	int i;
> +	int i, in_fence_count;
>  
> -	job->in_fence_count = args->in_sync_count;
> +	in_fence_count = args->in_sync_count;
>  
> -	if (!job->in_fence_count)
> +	if (!in_fence_count)
>  		return 0;
>  
> -	job->in_fences = kvmalloc_array(job->in_fence_count,
> -					sizeof(struct dma_fence *),
> -					GFP_KERNEL | __GFP_ZERO);
> -	if (!job->in_fences) {
> -		DRM_DEBUG("Failed to allocate job in fences\n");
> -		return -ENOMEM;
> -	}
> -
> -	handles = kvmalloc_array(job->in_fence_count, sizeof(u32), GFP_KERNEL);
> +	handles = kvmalloc_array(in_fence_count, sizeof(u32), GFP_KERNEL);
>  	if (!handles) {
>  		ret = -ENOMEM;
>  		DRM_DEBUG("Failed to allocate incoming syncobj handles\n");
> @@ -218,16 +204,23 @@ panfrost_copy_in_sync(struct drm_device *dev,
>  
>  	if (copy_from_user(handles,
>  			   (void __user *)(uintptr_t)args->in_syncs,
> -			   job->in_fence_count * sizeof(u32))) {
> +			   in_fence_count * sizeof(u32))) {
>  		ret = -EFAULT;
>  		DRM_DEBUG("Failed to copy in syncobj handles\n");
>  		goto fail;
>  	}
>  
> -	for (i = 0; i < job->in_fence_count; i++) {
> +	for (i = 0; i < in_fence_count; i++) {
> +		struct dma_fence *fence;
> +
>  		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0,
> -					     &job->in_fences[i]);
> -		if (ret == -EINVAL)
> +					     &fence);
> +		if (ret)
> +			goto fail;
> +
> +		ret = drm_gem_fence_array_add(&job->deps, fence);
> +
> +		if (ret)
>  			goto fail;
>  	}
>  
> @@ -265,6 +258,8 @@ static int panfrost_ioctl_submit(struct drm_device *dev, void *data,
>  
>  	kref_init(&job->refcount);
>  
> +	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
> +
>  	job->pfdev = pfdev;
>  	job->jc = args->jc;
>  	job->requirements = args->requirements;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 38f8580c19f1..71cd43fa1b36 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -196,14 +196,21 @@ static void panfrost_job_hw_submit(struct panfrost_job *job, int js)
>  	job_write(pfdev, JS_COMMAND_NEXT(js), JS_COMMAND_START);
>  }
>  
> -static void panfrost_acquire_object_fences(struct drm_gem_object **bos,
> -					   int bo_count,
> -					   struct dma_fence **implicit_fences)
> +static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
> +					  int bo_count,
> +					  struct xarray *deps)
>  {
> -	int i;
> +	int i, ret;
>  
> -	for (i = 0; i < bo_count; i++)
> -		implicit_fences[i] = dma_resv_get_excl_unlocked(bos[i]->resv);
> +	for (i = 0; i < bo_count; i++) {
> +		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
> +
> +		ret = drm_gem_fence_array_add(deps, fence);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
>  }
>  
>  static void panfrost_attach_object_fences(struct drm_gem_object **bos,
> @@ -240,10 +247,14 @@ int panfrost_job_push(struct panfrost_job *job)
>  
>  	job->render_done_fence = dma_fence_get(&job->base.s_fence->finished);
>  
> -	kref_get(&job->refcount); /* put by scheduler job completion */
> +	ret = panfrost_acquire_object_fences(job->bos, job->bo_count,
> +					     &job->deps);
> +	if (ret) {
> +		mutex_unlock(&pfdev->sched_lock);
> +		goto unlock;
> +	}
>  
> -	panfrost_acquire_object_fences(job->bos, job->bo_count,
> -				       job->implicit_fences);
> +	kref_get(&job->refcount); /* put by scheduler job completion */
>  
>  	drm_sched_entity_push_job(&job->base, entity);
>  
> @@ -262,18 +273,15 @@ static void panfrost_job_cleanup(struct kref *ref)
>  {
>  	struct panfrost_job *job = container_of(ref, struct panfrost_job,
>  						refcount);
> +	struct dma_fence *fence;
> +	unsigned long index;
>  	unsigned int i;
>  
> -	if (job->in_fences) {
> -		for (i = 0; i < job->in_fence_count; i++)
> -			dma_fence_put(job->in_fences[i]);
> -		kvfree(job->in_fences);
> -	}
> -	if (job->implicit_fences) {
> -		for (i = 0; i < job->bo_count; i++)
> -			dma_fence_put(job->implicit_fences[i]);
> -		kvfree(job->implicit_fences);
> +	xa_for_each(&job->deps, index, fence) {
> +		dma_fence_put(fence);
>  	}
> +	xa_destroy(&job->deps);
> +
>  	dma_fence_put(job->done_fence);
>  	dma_fence_put(job->render_done_fence);
>  
> @@ -316,26 +324,9 @@ static struct dma_fence *panfrost_job_dependency(struct drm_sched_job *sched_job
>  						 struct drm_sched_entity *s_entity)
>  {
>  	struct panfrost_job *job = to_panfrost_job(sched_job);
> -	struct dma_fence *fence;
> -	unsigned int i;
> -
> -	/* Explicit fences */
> -	for (i = 0; i < job->in_fence_count; i++) {
> -		if (job->in_fences[i]) {
> -			fence = job->in_fences[i];
> -			job->in_fences[i] = NULL;
> -			return fence;
> -		}
> -	}
>  
> -	/* Implicit fences, max. one per BO */
> -	for (i = 0; i < job->bo_count; i++) {
> -		if (job->implicit_fences[i]) {
> -			fence = job->implicit_fences[i];
> -			job->implicit_fences[i] = NULL;
> -			return fence;
> -		}
> -	}
> +	if (!xa_empty(&job->deps))
> +		return xa_erase(&job->deps, job->last_dep++);
>  
>  	return NULL;
>  }
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.h b/drivers/gpu/drm/panfrost/panfrost_job.h
> index bbd3ba97ff67..82306a03b57e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.h
> @@ -19,9 +19,9 @@ struct panfrost_job {
>  	struct panfrost_device *pfdev;
>  	struct panfrost_file_priv *file_priv;
>  
> -	/* Optional fences userspace can pass in for the job to depend on. */
> -	struct dma_fence **in_fences;
> -	u32 in_fence_count;
> +	/* Contains both explicit and implicit fences */
> +	struct xarray deps;
> +	unsigned long last_dep;
>  
>  	/* Fence to be signaled by IRQ handler when the job is complete. */
>  	struct dma_fence *done_fence;
> @@ -30,8 +30,6 @@ struct panfrost_job {
>  	__u32 requirements;
>  	__u32 flush_id;
>  
> -	/* Exclusive fences we have taken from the BOs to wait for */
> -	struct dma_fence **implicit_fences;
>  	struct panfrost_gem_mapping **mappings;
>  	struct drm_gem_object **bos;
>  	u32 bo_count;


^ permalink raw reply	[flat|nested] 175+ messages in thread
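
The core pattern the patch above converts panfrost to is worth spelling out, since the commit message's aside suggests the same xarray-of-fences bookkeeping could eventually live in the scheduler itself. The following is a minimal, illustrative sketch only: the example_* struct and function names are made up, while the xarray handling and the drm_gem_fence_array_add() call mirror the hunks quoted above.

#include <linux/dma-fence.h>
#include <linux/xarray.h>
#include <drm/drm_gem.h>

struct example_job {
	struct xarray deps;	/* dma_fence pointers at allocated indices */
	unsigned long last_dep;
};

static void example_job_init_deps(struct example_job *job)
{
	xa_init_flags(&job->deps, XA_FLAGS_ALLOC);
	job->last_dep = 0;
}

/* One call per dependency, explicit (syncobj) or implicit (dma_resv). */
static int example_job_add_dep(struct example_job *job,
			       struct dma_fence *fence)
{
	return drm_gem_fence_array_add(&job->deps, fence);
}

/* ->dependency() style callback: return one fence at a time, NULL when done. */
static struct dma_fence *example_job_next_dep(struct example_job *job)
{
	if (!xa_empty(&job->deps))
		return xa_erase(&job->deps, job->last_dep++);

	return NULL;
}

/* Cleanup: drop the fence references held in the xarray. */
static void example_job_release_deps(struct example_job *job)
{
	struct dma_fence *fence;
	unsigned long index;

	xa_for_each(&job->deps, index, fence)
		dma_fence_put(fence);
	xa_destroy(&job->deps);
}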

* Re: [PATCH 04/15] drm/panfrost: Shrink sched_lock
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 16:52     ` Boris Brezillon
  -1 siblings, 0 replies; 175+ messages in thread
From: Boris Brezillon @ 2021-06-23 16:52 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Tomeu Vizoso, Intel Graphics Development, DRI Development,
	Steven Price, Alyssa Rosenzweig, Daniel Vetter

On Tue, 22 Jun 2021 18:55:00 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> drm/scheduler requires a lock between _init and _push_job, but the
> reservation lock dance doesn't. So shrink the critical section a
> notch.
> 
> v2: Lucas pointed out how this should really work, I got it all wrong
> in v1.
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Lucas Stach <l.stach@pengutronix.de>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> Cc: Steven Price <steven.price@arm.com>
> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 2df3e999a38d..38f8580c19f1 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -224,14 +224,13 @@ int panfrost_job_push(struct panfrost_job *job)
>  	struct ww_acquire_ctx acquire_ctx;
>  	int ret = 0;
>  
> -	mutex_lock(&pfdev->sched_lock);
>  
>  	ret = drm_gem_lock_reservations(job->bos, job->bo_count,
>  					    &acquire_ctx);
> -	if (ret) {
> -		mutex_unlock(&pfdev->sched_lock);
> +	if (ret)
>  		return ret;
> -	}
> +
> +	mutex_lock(&pfdev->sched_lock);
>  
>  	ret = drm_sched_job_init(&job->base, entity, NULL);
>  	if (ret) {


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for implicit fencing/dma-resv rules for shared buffers (rev5)
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (19 preceding siblings ...)
  (?)
@ 2021-06-23 17:05 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-23 17:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers (rev5)
URL   : https://patchwork.freedesktop.org/series/91789/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
ec4dbdd1b8f4 dma-resv: Fix kerneldoc
-:11: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 6edbd6abb783 ("dma-buf: rename and cleanup dma_resv_get_excl v3")'
#11: 
commit 6edbd6abb783d54f6ac4c3ed5cd9e50cff6c15e9

-:37: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 1 errors, 1 warnings, 0 checks, 8 lines checked
2757dca29ffc dma-buf: Switch to inline kerneldoc
-:102: WARNING:TYPO_SPELLING: 'superseeded' may be misspelled - perhaps 'superseded'?
#102: FILE: include/linux/dma-buf.h:333:
+	 * vmap/unmap. Note that in many cases this is superseeded by
 	                                               ^^^^^^^^^^^

-:193: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 2 warnings, 0 checks, 154 lines checked
ee5ebd9f360f dma-buf: Document dma-buf implicit fencing/resv fencing rules
-:15: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#15: 
https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

-:140: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 049aca4363d8 ("drm/amdgpu: fix using shared fence for exported BOs v2")'
#140: 
commit 049aca4363d8af87cab8d53de5401602db3b9999

-:155: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 9b495a588799 ("dma-buf: add poll support, v3")'
#155: 
	commit 9b495a5887994a6d74d5c261d012083a92b94738

-:183: WARNING:REPEATED_WORD: Possible repeated word: 'to'
#183: 
  writes, and a per-bo flag to to skip implicit fencing in the CS

-:200: WARNING:TYPO_SPELLING: 'wont' may be misspelled - perhaps 'won't'?
#200: 
  wont notice the perf impact. I think we can ignore LTS distros who
  ^^^^

-:233: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 8c505bdc9c8b ("drm/amdgpu: rework dma_resv handling v3")'
#233: 
commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)

-:320: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 3 errors, 4 warnings, 0 checks, 45 lines checked
f4a64c4c4a3c drm/panfrost: Shrink sched_lock
-:42: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 17 lines checked
9a0f2341b633 drm/panfrost: Use xarray and helpers for depedency tracking
-:255: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 197 lines checked
c234633ef5d4 drm/panfrost: Fix implicit sync
-:50: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 11 lines checked
8660ea15e738 drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
-:21: WARNING:REPEATED_WORD: Possible repeated word: 'not'
#21: 
v2: It's neither ... nor, not not (Sam)

-:90: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 2 warnings, 0 checks, 44 lines checked
7e0e4eba2802 drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
-:231: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 104 lines checked
b580cc0434b9 drm/armada: Remove prepare/cleanup_fb hooks
-:88: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 57 lines checked
1248a0b5d17a drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
-:84: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 45 lines checked
2d0b037b1bdb drm/omap: Follow implicit fencing in prepare_fb
-:33: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 15 lines checked
dd7d69020b37 drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
-:22: WARNING:REPEATED_WORD: Possible repeated word: 'not'
#22: 
v3: It's neither ... nor, not not (Sam)

-:81: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 2 warnings, 0 checks, 37 lines checked
a59a7001144a drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
-:203: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 98 lines checked
24934c721047 drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
-:35: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 9 lines checked
ab7b3daba258 RFC: drm/amdgpu: Implement a proper implicit fencing uapi
-:25: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 177ae09b5d69 ("drm/amdgpu: introduce AMDGPU_GEM_CREATE_EXPLICIT_SYNC v2")'
#25: 
commit 177ae09b5d699a5ebd1cafcee78889db968abf54

-:62: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#62: 
  https://lore.kernel.org/dri-devel/20210520190007.534046-4-jason@jlekstrand.net/

-:82: WARNING:TYPO_SPELLING: 'unecessary' may be misspelled - perhaps 'unnecessary'?
#82: 
fencing and remove all unecessary stall points due to them.
                       ^^^^^^^^^^

-:203: CHECK:SPACING: spaces preferred around that '|' (ctx:VxV)
#203: FILE: drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:1765:
+	DRM_IOCTL_DEF_DRV(AMDGPU_SETPARAM, amdgpu_setparam_ioctl, DRM_AUTH|DRM_RENDER_ALLOW),
 	                                                                  ^

-:240: WARNING:LONG_LINE: line length of 115 exceeds 100 columns
#240: FILE: include/uapi/drm/amdgpu_drm.h:75:
+#define DRM_IOCTL_AMDGPU_SETPARAM	DRM_IOW(DRM_COMMAND_BASE + DRM_AMDGPU_SETPARAM, struct drm_amdgpu_setparam)

-:258: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 1 errors, 4 warnings, 1 checks, 104 lines checked


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [Intel-gfx] ✗ Fi.CI.SPARSE: warning for implicit fencing/dma-resv rules for shared buffers (rev5)
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (20 preceding siblings ...)
  (?)
@ 2021-06-23 17:07 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-23 17:07 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers (rev5)
URL   : https://patchwork.freedesktop.org/series/91789/
State : warning

== Summary ==

$ dim sparse --fast origin/drm-tip
Sparse version: v0.6.2
Fast mode used, each commit won't be checked separately.
+drivers/gpu/drm/i915/display/intel_display.c:1893:21:    expected struct i915_vma *[assigned] vma
+drivers/gpu/drm/i915/display/intel_display.c:1893:21:    got void [noderef] __iomem *[assigned] iomem
+drivers/gpu/drm/i915/display/intel_display.c:1893:21: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/i915/gem/i915_gem_ttm.c:733:38: warning: symbol 'i915_gem_ttm_obj_ops' was not declared. Should it be static?
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:27:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:32:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:49:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_engine_stats.h:56:9: warning: trying to copy expression type 31
+drivers/gpu/drm/i915/gt/intel_reset.c:1396:5: warning: context imbalance in 'intel_gt_reset_trylock' - different lock contexts for basic block
+drivers/gpu/drm/i915/gt/intel_ring_submission.c:1207:24: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/i915/i915_perf.c:1434:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/i915/i915_perf.c:1488:15: warning: memset with byte count of 16777216
+drivers/gpu/drm/selftests/test-drm_damage_helper.c:244:25: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/selftests/test-drm_damage_helper.c:268:23: warning: Using plain integer as NULL pointer
+drivers/gpu/drm/ttm/ttm_bo.c:1157:9: warning: context imbalance in 'ttm_bo_swapout' - unexpected unlock
+drivers/gpu/drm/ttm/ttm_bo.c:309:28: warning: context imbalance in 'ttm_bo_cleanup_refs' - unexpected unlock
+drivers/gpu/drm/ttm/ttm_bo.c:367:27: warning: context imbalance in 'ttm_bo_delayed_delete' - different lock contexts for basic block
+drivers/gpu/drm/ttm/ttm_bo.c:633:5: warning: context imbalance in 'ttm_mem_evict_first' - wrong count at exit
+drivers/gpu/drm/ttm/ttm_bo_util.c:281:38:    expected void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:281:38:    got void [noderef] __iomem *
+drivers/gpu/drm/ttm/ttm_bo_util.c:281:38: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/ttm/ttm_bo_util.c:284:38:    expected void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:284:38:    got void [noderef] __iomem *
+drivers/gpu/drm/ttm/ttm_bo_util.c:284:38: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/ttm/ttm_bo_util.c:287:38:    expected void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:287:38:    got void [noderef] __iomem *
+drivers/gpu/drm/ttm/ttm_bo_util.c:287:38: warning: incorrect type in assignment (different address spaces)
+drivers/gpu/drm/ttm/ttm_bo_util.c:367:28:    expected void volatile [noderef] __iomem *addr
+drivers/gpu/drm/ttm/ttm_bo_util.c:367:28:    got void *virtual
+drivers/gpu/drm/ttm/ttm_bo_util.c:367:28: warning: incorrect type in argument 1 (different address spaces)
+drivers/gpu/drm/ttm/ttm_device.c:130:5: warning: context imbalance in 'ttm_device_swapout' - wrong count at exit
+./include/asm-generic/bitops/find.h:112:45: warning: shift count is negative (-262080)
+./include/asm-generic/bitops/find.h:32:31: warning: shift count is negative (-262080)
+./include/linux/seqlock.h:840:24: warning: trying to copy expression type 31
+./include/linux/seqlock.h:840:24: warning: trying to copy expression type 31
+./include/linux/seqlock.h:866:16: warning: trying to copy expression type 31
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen11_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen12_fwtable_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read64' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_read8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen6_write8' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write16' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write32' - different lock contexts for basic block
+./include/linux/spinlock.h:409:9: warning: context imbalance in 'gen8_write8' - different lock contexts for basic block


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH] dma-buf: Switch to inline kerneldoc
  2021-06-23 16:17     ` Daniel Vetter
@ 2021-06-23 17:33       ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-23 17:33 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Deepak R Varma, Intel Graphics Development, Kevin Wang,
	DRI Development, linaro-mm-sig, Nirmoy Das, Chen Li, Dave Airlie,
	Alex Deucher, Daniel Vetter, Christian König, linux-media

Hi Daniel, looks good.

On Wed, Jun 23, 2021 at 06:17:12PM +0200, Daniel Vetter wrote:
> Also review & update everything while we're at it.
> 
> This is prep work to smash a ton of stuff into the kerneldoc for
> @resv.
> 
> v2: Move the doc for sysfs_entry.attachment_uid to the right place too
> (Sam)
> 
> Acked-by: Christian König <christian.koenig@amd.com>
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Nirmoy Das <nirmoy.das@amd.com>
> Cc: Deepak R Varma <mh12gx2825@gmail.com>
> Cc: Chen Li <chenli@uniontech.com>
> Cc: Kevin Wang <kevin1.wang@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
  2021-06-23 16:24     ` [Intel-gfx] " Daniel Vetter
@ 2021-06-23 17:34       ` Sam Ravnborg
  -1 siblings, 0 replies; 175+ messages in thread
From: Sam Ravnborg @ 2021-06-23 17:34 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Intel Graphics Development, Noralf Trønnes,
	DRI Development, Thomas Zimmermann, Daniel Vetter

Hi Daniel, looks good.

On Wed, Jun 23, 2021 at 06:24:56PM +0200, Daniel Vetter wrote:
> It's tedious to review this all the time, and my audit showed that
> arcpgu actually forgot to set this.
> 
> Make this the default and stop worrying.
> 
> Again I sprinkled WARN_ON_ONCE on top to make sure we don't have
> strange combinations of hooks: cleanup_fb without prepare_fb doesn't
> make sense, and since simpler drivers are all new they better be GEM
> based drivers.
> 
> v2: Warn and bail when it's _not_ a GEM driver (Noralf)
> 
> v3: It's neither ... nor, not not (Sam)
> 
> Cc: Sam Ravnborg <sam@ravnborg.org>
> Cc: Noralf Trønnes <noralf@tronnes.org>
> Acked-by: Noralf Trønnes <noralf@tronnes.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Sam Ravnborg <sam@ravnborg.org>

^ permalink raw reply	[flat|nested] 175+ messages in thread
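
For a driver author, the practical upshot of the patch acked above is that a simple display pipe no longer needs to wire up .prepare_fb by hand. A minimal, illustrative sketch (hypothetical example_* names, stubbed-out callbacks) of what such a pipe funcs table can now look like:

#include <drm/drm_simple_kms_helper.h>

static void example_pipe_enable(struct drm_simple_display_pipe *pipe,
				struct drm_crtc_state *crtc_state,
				struct drm_plane_state *plane_state)
{
	/* stub: program the hardware for scanout */
}

static void example_pipe_update(struct drm_simple_display_pipe *pipe,
				struct drm_plane_state *old_state)
{
	/* stub: push the new framebuffer out */
}

static const struct drm_simple_display_pipe_funcs example_pipe_funcs = {
	.enable = example_pipe_enable,
	.update = example_pipe_update,
	/*
	 * No .prepare_fb hook: with the patch above,
	 * drm_gem_simple_display_pipe_prepare_fb() is applied by default
	 * for GEM-based drivers, which takes care of implicit fencing of
	 * the framebuffer BOs.
	 */
};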

* [Intel-gfx] ✓ Fi.CI.BAT: success for implicit fencing/dma-resv rules for shared buffers (rev5)
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (21 preceding siblings ...)
  (?)
@ 2021-06-23 17:35 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-23 17:35 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx



== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers (rev5)
URL   : https://patchwork.freedesktop.org/series/91789/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10268 -> Patchwork_20444
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/index.html

Known issues
------------

  Here are the changes found in Patchwork_20444 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@amdgpu/amd_basic@cs-sdma:
    - fi-kbl-guc:         NOTRUN -> [SKIP][1] ([fdo#109271]) +59 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/fi-kbl-guc/igt@amdgpu/amd_basic@cs-sdma.html

  * igt@gem_huc_copy@huc-copy:
    - fi-kbl-guc:         NOTRUN -> [SKIP][2] ([fdo#109271] / [i915#2190])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/fi-kbl-guc/igt@gem_huc_copy@huc-copy.html

  * igt@kms_chamelium@vga-hpd-fast:
    - fi-kbl-guc:         NOTRUN -> [SKIP][3] ([fdo#109271] / [fdo#111827]) +8 similar issues
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/fi-kbl-guc/igt@kms_chamelium@vga-hpd-fast.html

  * igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d:
    - fi-kbl-guc:         NOTRUN -> [SKIP][4] ([fdo#109271] / [i915#533])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/fi-kbl-guc/igt@kms_pipe_crc_basic@compare-crc-sanitycheck-pipe-d.html

  * igt@runner@aborted:
    - fi-bxt-dsi:         NOTRUN -> [FAIL][5] ([i915#2426] / [i915#3363])
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/fi-bxt-dsi/igt@runner@aborted.html

  
  [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2426]: https://gitlab.freedesktop.org/drm/intel/issues/2426
  [i915#3363]: https://gitlab.freedesktop.org/drm/intel/issues/3363
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533


Participating hosts (41 -> 38)
------------------------------

  Additional (1): fi-kbl-guc 
  Missing    (4): fi-ilk-m540 fi-bdw-samus fi-bsw-cyan bat-adlp-4 


Build changes
-------------

  * Linux: CI_DRM_10268 -> Patchwork_20444

  CI-20190529: 20190529
  CI_DRM_10268: 0e0529132a50160a0e8bd0aa9608226445a3299b @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6117: 3ba0a02404f243d6d8f232c6215163cc4b0fd699 @ https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
  Patchwork_20444: ab7b3daba258b2b06f00e003a8d68c0795d91ae3 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

ab7b3daba258 RFC: drm/amdgpu: Implement a proper implicit fencing uapi
24934c721047 drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
a59a7001144a drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default
dd7d69020b37 drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default
2d0b037b1bdb drm/omap: Follow implicit fencing in prepare_fb
1248a0b5d17a drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
b580cc0434b9 drm/armada: Remove prepare/cleanup_fb hooks
7e0e4eba2802 drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
8660ea15e738 drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default
c234633ef5d4 drm/panfrost: Fix implicit sync
9a0f2341b633 drm/panfrost: Use xarray and helpers for depedency tracking
f4a64c4c4a3c drm/panfrost: Shrink sched_lock
ee5ebd9f360f dma-buf: Document dma-buf implicit fencing/resv fencing rules
2757dca29ffc dma-buf: Switch to inline kerneldoc
ec4dbdd1b8f4 dma-resv: Fix kerneldoc

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/index.html


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 06/15] drm/panfrost: Fix implicit sync
  2021-06-23 16:47     ` Boris Brezillon
  (?)
@ 2021-06-23 19:17       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-23 19:17 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Daniel Vetter, DRI Development, Tomeu Vizoso,
	Christian König, Intel Graphics Development, Steven Price,
	linaro-mm-sig, Alyssa Rosenzweig, Daniel Vetter, linux-media

On Wed, Jun 23, 2021 at 06:47:37PM +0200, Boris Brezillon wrote:
> On Tue, 22 Jun 2021 18:55:02 +0200
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
> > Currently this has no practical relevance I think because there's not
> > many who can pull off a setup with panfrost and another gpu in the
> > same system. But the rules are that if you're setting an exclusive
> > fence, indicating a gpu write access in the implicit fencing system,
> > then you need to wait for all fences, not just the previous exclusive
> > fence.
> > 
> > panfrost against itself has no problem, because it always sets the
> > exclusive fence (but that's probably something that will need to be
> > fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
> > Also no problem with that against display.
> > 
> > With the prep work done to switch over to the dependency helpers this
> > is now a oneliner.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Rob Herring <robh@kernel.org>
> > Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> 
> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>

Pushed the 3 panfrost patches to drm-misc-next, thanks for reviewing them.
-Daniel

> 
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: linux-media@vger.kernel.org
> > Cc: linaro-mm-sig@lists.linaro.org
> > ---
> >  drivers/gpu/drm/panfrost/panfrost_job.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> > index 71cd43fa1b36..ef004d587dc4 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> > @@ -203,9 +203,8 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
> >  	int i, ret;
> >  
> >  	for (i = 0; i < bo_count; i++) {
> > -		struct dma_fence *fence = dma_resv_get_excl_unlocked(bos[i]->resv);
> > -
> > -		ret = drm_gem_fence_array_add(deps, fence);
> > +		/* panfrost always uses write mode in its current uapi */
> > +		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
> >  		if (ret)
> >  			return ret;
> >  	}
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread
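
For reference, a rough sketch of the dependency rule discussed in the patch
above, loosely modelled on the in-tree drm_gem_fence_array_add_implicit()
helper that panfrost now uses: a job that writes a BO must wait for every
fence attached to its dma_resv, while a read-only job only needs the
exclusive (writer) fence. The function name here is illustrative, and error
and locking details are simplified; this is not a drop-in implementation.

  #include <linux/dma-resv.h>
  #include <linux/slab.h>
  #include <linux/xarray.h>
  #include <drm/drm_gem.h>

  static int sketch_add_implicit_deps(struct xarray *deps,
                                      struct drm_gem_object *obj,
                                      bool write)
  {
          struct dma_fence *excl;
          struct dma_fence **shared = NULL;
          unsigned int count, i;
          int ret;

          if (!write) {
                  /* Readers only serialize against the last writer. */
                  excl = dma_resv_get_excl_unlocked(obj->resv);
                  return drm_gem_fence_array_add(deps, excl);
          }

          /* Writers wait for every reader and the previous writer. */
          ret = dma_resv_get_fences_rcu(obj->resv, &excl, &count, &shared);
          if (ret)
                  return ret;

          ret = drm_gem_fence_array_add(deps, excl);
          for (i = 0; i < count; i++) {
                  if (ret)
                          dma_fence_put(shared[i]); /* drop refs we can't queue */
                  else
                          ret = drm_gem_fence_array_add(deps, shared[i]);
          }

          kfree(shared);
          return ret;
  }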

* [Intel-gfx] ✓ Fi.CI.IGT: success for implicit fencing/dma-resv rules for shared buffers (rev5)
  2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
                   ` (22 preceding siblings ...)
  (?)
@ 2021-06-23 21:04 ` Patchwork
  -1 siblings, 0 replies; 175+ messages in thread
From: Patchwork @ 2021-06-23 21:04 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx


== Series Details ==

Series: implicit fencing/dma-resv rules for shared buffers (rev5)
URL   : https://patchwork.freedesktop.org/series/91789/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_10268_full -> Patchwork_20444_full
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  

Known issues
------------

  Here are the changes found in Patchwork_20444_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_create@create-massive:
    - shard-snb:          NOTRUN -> [DMESG-WARN][1] ([i915#3002]) +1 similar issue
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb5/igt@gem_create@create-massive.html
    - shard-kbl:          NOTRUN -> [DMESG-WARN][2] ([i915#3002])
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@gem_create@create-massive.html

  * igt@gem_ctx_isolation@preservation-s3@rcs0:
    - shard-apl:          NOTRUN -> [DMESG-WARN][3] ([i915#180])
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@gem_ctx_isolation@preservation-s3@rcs0.html

  * igt@gem_ctx_persistence@many-contexts:
    - shard-tglb:         [PASS][4] -> [FAIL][5] ([i915#2410])
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-tglb3/igt@gem_ctx_persistence@many-contexts.html
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-tglb7/igt@gem_ctx_persistence@many-contexts.html

  * igt@gem_ctx_persistence@process:
    - shard-snb:          NOTRUN -> [SKIP][6] ([fdo#109271] / [i915#1099]) +3 similar issues
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb2/igt@gem_ctx_persistence@process.html

  * igt@gem_eio@in-flight-contexts-1us:
    - shard-apl:          [PASS][7] -> [TIMEOUT][8] ([i915#3063])
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-apl7/igt@gem_eio@in-flight-contexts-1us.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl3/igt@gem_eio@in-flight-contexts-1us.html

  * igt@gem_eio@unwedge-stress:
    - shard-snb:          NOTRUN -> [FAIL][9] ([i915#3354])
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb5/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_fair@basic-deadline:
    - shard-skl:          NOTRUN -> [FAIL][10] ([i915#2846])
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl7/igt@gem_exec_fair@basic-deadline.html

  * igt@gem_exec_fair@basic-none-solo@rcs0:
    - shard-kbl:          NOTRUN -> [FAIL][11] ([i915#2842]) +1 similar issue
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@gem_exec_fair@basic-none-solo@rcs0.html

  * igt@gem_exec_fair@basic-none@vecs0:
    - shard-apl:          NOTRUN -> [FAIL][12] ([i915#2842] / [i915#3468])
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl1/igt@gem_exec_fair@basic-none@vecs0.html

  * igt@gem_exec_fair@basic-pace@vecs0:
    - shard-kbl:          [PASS][13] -> [FAIL][14] ([i915#2842])
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-kbl7/igt@gem_exec_fair@basic-pace@vecs0.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl4/igt@gem_exec_fair@basic-pace@vecs0.html

  * igt@gem_exec_fair@basic-sync@rcs0:
    - shard-kbl:          [PASS][15] -> [SKIP][16] ([fdo#109271]) +1 similar issue
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-kbl4/igt@gem_exec_fair@basic-sync@rcs0.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl3/igt@gem_exec_fair@basic-sync@rcs0.html

  * igt@gem_exec_fair@basic-throttle@rcs0:
    - shard-glk:          [PASS][17] -> [FAIL][18] ([i915#2842])
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk5/igt@gem_exec_fair@basic-throttle@rcs0.html
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk3/igt@gem_exec_fair@basic-throttle@rcs0.html
    - shard-iclb:         [PASS][19] -> [FAIL][20] ([i915#2849])
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb8/igt@gem_exec_fair@basic-throttle@rcs0.html
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb3/igt@gem_exec_fair@basic-throttle@rcs0.html

  * igt@gem_exec_reloc@basic-wide-active@bcs0:
    - shard-apl:          NOTRUN -> [FAIL][21] ([i915#3633]) +3 similar issues
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@gem_exec_reloc@basic-wide-active@bcs0.html

  * igt@gem_exec_reloc@basic-wide-active@rcs0:
    - shard-snb:          NOTRUN -> [FAIL][22] ([i915#3633]) +2 similar issues
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb6/igt@gem_exec_reloc@basic-wide-active@rcs0.html

  * igt@gem_exec_reloc@basic-wide-active@vcs1:
    - shard-iclb:         NOTRUN -> [FAIL][23] ([i915#3633])
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb2/igt@gem_exec_reloc@basic-wide-active@vcs1.html

  * igt@gem_exec_whisper@basic-contexts-priority:
    - shard-glk:          [PASS][24] -> [DMESG-WARN][25] ([i915#118] / [i915#95])
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk5/igt@gem_exec_whisper@basic-contexts-priority.html
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk3/igt@gem_exec_whisper@basic-contexts-priority.html

  * igt@gem_exec_whisper@basic-queues-priority-all:
    - shard-iclb:         [PASS][26] -> [INCOMPLETE][27] ([i915#1895])
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb8/igt@gem_exec_whisper@basic-queues-priority-all.html
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb3/igt@gem_exec_whisper@basic-queues-priority-all.html

  * igt@gem_huc_copy@huc-copy:
    - shard-skl:          NOTRUN -> [SKIP][28] ([fdo#109271] / [i915#2190])
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl7/igt@gem_huc_copy@huc-copy.html

  * igt@gem_mmap_gtt@big-copy-odd:
    - shard-skl:          [PASS][29] -> [FAIL][30] ([i915#307])
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl10/igt@gem_mmap_gtt@big-copy-odd.html
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl3/igt@gem_mmap_gtt@big-copy-odd.html
    - shard-glk:          [PASS][31] -> [FAIL][32] ([i915#307])
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk8/igt@gem_mmap_gtt@big-copy-odd.html
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk2/igt@gem_mmap_gtt@big-copy-odd.html

  * igt@gem_mmap_gtt@cpuset-big-copy-odd:
    - shard-iclb:         [PASS][33] -> [FAIL][34] ([i915#2428])
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb2/igt@gem_mmap_gtt@cpuset-big-copy-odd.html
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb2/igt@gem_mmap_gtt@cpuset-big-copy-odd.html

  * igt@gem_userptr_blits@input-checking:
    - shard-apl:          NOTRUN -> [DMESG-WARN][35] ([i915#3002])
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@gem_userptr_blits@input-checking.html

  * igt@gem_userptr_blits@vma-merge:
    - shard-kbl:          NOTRUN -> [FAIL][36] ([i915#3318])
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@gem_userptr_blits@vma-merge.html

  * igt@gen9_exec_parse@allowed-all:
    - shard-glk:          [PASS][37] -> [DMESG-WARN][38] ([i915#1436] / [i915#716])
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk4/igt@gen9_exec_parse@allowed-all.html
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk5/igt@gen9_exec_parse@allowed-all.html

  * igt@gen9_exec_parse@bb-large:
    - shard-apl:          NOTRUN -> [FAIL][39] ([i915#3296])
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@gen9_exec_parse@bb-large.html

  * igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp:
    - shard-apl:          NOTRUN -> [SKIP][40] ([fdo#109271] / [i915#1937])
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@i915_pm_lpsp@kms-lpsp@kms-lpsp-dp.html

  * igt@i915_pm_rc6_residency@rc6-fence:
    - shard-iclb:         NOTRUN -> [WARN][41] ([i915#1804] / [i915#2684])
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@i915_pm_rc6_residency@rc6-fence.html

  * igt@kms_big_fb@x-tiled-64bpp-rotate-0:
    - shard-iclb:         [PASS][42] -> [DMESG-WARN][43] ([i915#3621])
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb5/igt@kms_big_fb@x-tiled-64bpp-rotate-0.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb1/igt@kms_big_fb@x-tiled-64bpp-rotate-0.html

  * igt@kms_chamelium@dp-crc-fast:
    - shard-iclb:         NOTRUN -> [SKIP][44] ([fdo#109284] / [fdo#111827]) +1 similar issue
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_chamelium@dp-crc-fast.html

  * igt@kms_chamelium@dp-mode-timings:
    - shard-apl:          NOTRUN -> [SKIP][45] ([fdo#109271] / [fdo#111827]) +24 similar issues
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl1/igt@kms_chamelium@dp-mode-timings.html

  * igt@kms_chamelium@hdmi-hpd-fast:
    - shard-snb:          NOTRUN -> [SKIP][46] ([fdo#109271] / [fdo#111827]) +22 similar issues
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb5/igt@kms_chamelium@hdmi-hpd-fast.html

  * igt@kms_color@pipe-a-ctm-0-25:
    - shard-skl:          [PASS][47] -> [DMESG-WARN][48] ([i915#1982])
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl5/igt@kms_color@pipe-a-ctm-0-25.html
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl4/igt@kms_color@pipe-a-ctm-0-25.html

  * igt@kms_color_chamelium@pipe-b-ctm-max:
    - shard-skl:          NOTRUN -> [SKIP][49] ([fdo#109271] / [fdo#111827]) +9 similar issues
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl7/igt@kms_color_chamelium@pipe-b-ctm-max.html

  * igt@kms_color_chamelium@pipe-c-ctm-negative:
    - shard-kbl:          NOTRUN -> [SKIP][50] ([fdo#109271] / [fdo#111827]) +4 similar issues
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@kms_color_chamelium@pipe-c-ctm-negative.html

  * igt@kms_content_protection@atomic-dpms:
    - shard-iclb:         NOTRUN -> [SKIP][51] ([fdo#109300] / [fdo#111066])
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_content_protection@atomic-dpms.html

  * igt@kms_content_protection@uevent:
    - shard-apl:          NOTRUN -> [FAIL][52] ([i915#2105])
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@kms_content_protection@uevent.html

  * igt@kms_cursor_crc@pipe-b-cursor-512x170-sliding:
    - shard-iclb:         NOTRUN -> [SKIP][53] ([fdo#109278] / [fdo#109279])
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_cursor_crc@pipe-b-cursor-512x170-sliding.html

  * igt@kms_cursor_edge_walk@pipe-d-128x128-right-edge:
    - shard-skl:          NOTRUN -> [SKIP][54] ([fdo#109271]) +106 similar issues
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl2/igt@kms_cursor_edge_walk@pipe-d-128x128-right-edge.html

  * igt@kms_cursor_legacy@cursorb-vs-flipb-atomic-transitions-varying-size:
    - shard-iclb:         NOTRUN -> [SKIP][55] ([fdo#109274] / [fdo#109278]) +1 similar issue
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_cursor_legacy@cursorb-vs-flipb-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size:
    - shard-skl:          [PASS][56] -> [FAIL][57] ([i915#2346] / [i915#533])
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl5/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl4/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions-varying-size.html

  * igt@kms_cursor_legacy@pipe-d-single-bo:
    - shard-kbl:          NOTRUN -> [SKIP][58] ([fdo#109271] / [i915#533]) +1 similar issue
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl3/igt@kms_cursor_legacy@pipe-d-single-bo.html

  * igt@kms_cursor_legacy@pipe-d-torture-bo:
    - shard-apl:          NOTRUN -> [SKIP][59] ([fdo#109271] / [i915#533]) +4 similar issues
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@kms_cursor_legacy@pipe-d-torture-bo.html

  * igt@kms_flip@2x-flip-vs-expired-vblank-interruptible@bc-hdmi-a1-hdmi-a2:
    - shard-glk:          [PASS][60] -> [FAIL][61] ([i915#79])
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk4/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible@bc-hdmi-a1-hdmi-a2.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk4/igt@kms_flip@2x-flip-vs-expired-vblank-interruptible@bc-hdmi-a1-hdmi-a2.html

  * igt@kms_flip@flip-vs-suspend-interruptible@c-dp1:
    - shard-apl:          [PASS][62] -> [DMESG-WARN][63] ([i915#180])
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-apl8/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl3/igt@kms_flip@flip-vs-suspend-interruptible@c-dp1.html

  * igt@kms_flip@plain-flip-fb-recreate-interruptible@a-edp1:
    - shard-skl:          [PASS][64] -> [FAIL][65] ([i915#2122])
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl7/igt@kms_flip@plain-flip-fb-recreate-interruptible@a-edp1.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl5/igt@kms_flip@plain-flip-fb-recreate-interruptible@a-edp1.html

  * igt@kms_frontbuffer_tracking@fbcpsr-2p-shrfb-fliptrack-mmap-gtt:
    - shard-iclb:         NOTRUN -> [SKIP][66] ([fdo#109280]) +5 similar issues
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_frontbuffer_tracking@fbcpsr-2p-shrfb-fliptrack-mmap-gtt.html

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-onoff:
    - shard-snb:          NOTRUN -> [SKIP][67] ([fdo#109271]) +417 similar issues
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb2/igt@kms_frontbuffer_tracking@psr-1p-primscrn-cur-indfb-onoff.html

  * igt@kms_hdr@bpc-switch-suspend:
    - shard-kbl:          [PASS][68] -> [DMESG-WARN][69] ([i915#180]) +3 similar issues
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-kbl2/igt@kms_hdr@bpc-switch-suspend.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl6/igt@kms_hdr@bpc-switch-suspend.html

  * igt@kms_invalid_dotclock:
    - shard-iclb:         NOTRUN -> [SKIP][70] ([fdo#109310])
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_invalid_dotclock.html

  * igt@kms_pipe_crc_basic@read-crc-pipe-d:
    - shard-skl:          NOTRUN -> [SKIP][71] ([fdo#109271] / [i915#533]) +1 similar issue
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl7/igt@kms_pipe_crc_basic@read-crc-pipe-d.html

  * igt@kms_plane_alpha_blend@pipe-a-alpha-basic:
    - shard-apl:          NOTRUN -> [FAIL][72] ([fdo#108145] / [i915#265]) +4 similar issues
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl1/igt@kms_plane_alpha_blend@pipe-a-alpha-basic.html

  * igt@kms_plane_alpha_blend@pipe-b-alpha-transparent-fb:
    - shard-apl:          NOTRUN -> [FAIL][73] ([i915#265]) +1 similar issue
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@kms_plane_alpha_blend@pipe-b-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-c-alpha-opaque-fb:
    - shard-kbl:          NOTRUN -> [FAIL][74] ([fdo#108145] / [i915#265]) +1 similar issue
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl3/igt@kms_plane_alpha_blend@pipe-c-alpha-opaque-fb.html

  * igt@kms_plane_alpha_blend@pipe-c-alpha-transparent-fb:
    - shard-skl:          NOTRUN -> [FAIL][75] ([i915#265])
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl3/igt@kms_plane_alpha_blend@pipe-c-alpha-transparent-fb.html

  * igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max:
    - shard-skl:          NOTRUN -> [FAIL][76] ([fdo#108145] / [i915#265]) +1 similar issue
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl8/igt@kms_plane_alpha_blend@pipe-c-constant-alpha-max.html

  * igt@kms_plane_alpha_blend@pipe-d-coverage-vs-premult-vs-constant:
    - shard-iclb:         NOTRUN -> [SKIP][77] ([fdo#109278]) +3 similar issues
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@kms_plane_alpha_blend@pipe-d-coverage-vs-premult-vs-constant.html

  * igt@kms_plane_multiple@atomic-pipe-d-tiling-x:
    - shard-kbl:          NOTRUN -> [SKIP][78] ([fdo#109271]) +82 similar issues
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl1/igt@kms_plane_multiple@atomic-pipe-d-tiling-x.html

  * igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping:
    - shard-apl:          NOTRUN -> [SKIP][79] ([fdo#109271] / [i915#2733])
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html
    - shard-skl:          NOTRUN -> [SKIP][80] ([fdo#109271] / [i915#2733])
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl3/igt@kms_plane_scaling@scaler-with-clipping-clamping@pipe-c-scaler-with-clipping-clamping.html

  * igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3:
    - shard-apl:          NOTRUN -> [SKIP][81] ([fdo#109271] / [i915#658]) +5 similar issues
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3.html
    - shard-skl:          NOTRUN -> [SKIP][82] ([fdo#109271] / [i915#658])
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl2/igt@kms_psr2_sf@overlay-plane-update-sf-dmg-area-3.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-3:
    - shard-kbl:          NOTRUN -> [SKIP][83] ([fdo#109271] / [i915#658]) +1 similar issue
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-3.html

  * igt@kms_psr@psr2_sprite_mmap_cpu:
    - shard-iclb:         [PASS][84] -> [SKIP][85] ([fdo#109441])
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb2/igt@kms_psr@psr2_sprite_mmap_cpu.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb4/igt@kms_psr@psr2_sprite_mmap_cpu.html

  * igt@kms_setmode@basic:
    - shard-snb:          NOTRUN -> [FAIL][86] ([i915#31])
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb5/igt@kms_setmode@basic.html

  * igt@kms_sysfs_edid_timing:
    - shard-apl:          NOTRUN -> [FAIL][87] ([IGT#2])
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl1/igt@kms_sysfs_edid_timing.html

  * igt@kms_writeback@writeback-check-output:
    - shard-apl:          NOTRUN -> [SKIP][88] ([fdo#109271] / [i915#2437]) +1 similar issue
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@kms_writeback@writeback-check-output.html

  * igt@nouveau_crc@pipe-b-ctx-flip-skip-current-frame:
    - shard-apl:          NOTRUN -> [SKIP][89] ([fdo#109271]) +273 similar issues
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@nouveau_crc@pipe-b-ctx-flip-skip-current-frame.html

  * igt@perf_pmu@module-unload:
    - shard-skl:          [PASS][90] -> [DMESG-WARN][91] ([i915#1982] / [i915#262])
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl4/igt@perf_pmu@module-unload.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl8/igt@perf_pmu@module-unload.html

  * igt@runner@aborted:
    - shard-snb:          NOTRUN -> ([FAIL][92], [FAIL][93]) ([i915#3002])
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb5/igt@runner@aborted.html
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-snb6/igt@runner@aborted.html

  * igt@sysfs_clients@create:
    - shard-apl:          NOTRUN -> [SKIP][94] ([fdo#109271] / [i915#2994]) +2 similar issues
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl1/igt@sysfs_clients@create.html

  * igt@sysfs_clients@fair-7:
    - shard-skl:          NOTRUN -> [SKIP][95] ([fdo#109271] / [i915#2994])
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl7/igt@sysfs_clients@fair-7.html

  * igt@sysfs_clients@sema-10:
    - shard-kbl:          NOTRUN -> [SKIP][96] ([fdo#109271] / [i915#2994]) +1 similar issue
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@sysfs_clients@sema-10.html

  * igt@sysfs_heartbeat_interval@mixed@vcs0:
    - shard-skl:          [PASS][97] -> [FAIL][98] ([i915#1731])
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl6/igt@sysfs_heartbeat_interval@mixed@vcs0.html
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl8/igt@sysfs_heartbeat_interval@mixed@vcs0.html

  
#### Possible fixes ####

  * igt@gem_eio@unwedge-stress:
    - shard-iclb:         [TIMEOUT][99] ([i915#2369] / [i915#2481] / [i915#3070]) -> [PASS][100]
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb7/igt@gem_eio@unwedge-stress.html
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb8/igt@gem_eio@unwedge-stress.html

  * igt@gem_exec_capture@pi@rcs0:
    - shard-skl:          [INCOMPLETE][101] ([i915#2369]) -> [PASS][102]
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl7/igt@gem_exec_capture@pi@rcs0.html
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl5/igt@gem_exec_capture@pi@rcs0.html

  * igt@gem_exec_fair@basic-none@rcs0:
    - shard-glk:          [FAIL][103] ([i915#2842]) -> [PASS][104]
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk2/igt@gem_exec_fair@basic-none@rcs0.html
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk1/igt@gem_exec_fair@basic-none@rcs0.html

  * igt@gem_exec_fair@basic-pace-solo@rcs0:
    - shard-kbl:          [FAIL][105] ([i915#2842]) -> [PASS][106]
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-kbl1/igt@gem_exec_fair@basic-pace-solo@rcs0.html
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl6/igt@gem_exec_fair@basic-pace-solo@rcs0.html

  * igt@gem_mmap_offset@clear:
    - shard-skl:          [FAIL][107] ([i915#3160]) -> [PASS][108]
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl1/igt@gem_mmap_offset@clear.html
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl3/igt@gem_mmap_offset@clear.html
    - shard-iclb:         [FAIL][109] ([i915#3160]) -> [PASS][110]
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb1/igt@gem_mmap_offset@clear.html
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb7/igt@gem_mmap_offset@clear.html

  * igt@gem_workarounds@suspend-resume-fd:
    - shard-skl:          [INCOMPLETE][111] ([i915#198] / [i915#2405]) -> [PASS][112]
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl2/igt@gem_workarounds@suspend-resume-fd.html
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl2/igt@gem_workarounds@suspend-resume-fd.html

  * igt@i915_pm_dc@dc6-psr:
    - shard-iclb:         [FAIL][113] ([i915#454]) -> [PASS][114]
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb8/igt@i915_pm_dc@dc6-psr.html
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb3/igt@i915_pm_dc@dc6-psr.html

  * igt@kms_big_fb@x-tiled-32bpp-rotate-180:
    - shard-glk:          [DMESG-WARN][115] ([i915#118] / [i915#95]) -> [PASS][116]
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk9/igt@kms_big_fb@x-tiled-32bpp-rotate-180.html
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk8/igt@kms_big_fb@x-tiled-32bpp-rotate-180.html

  * igt@kms_color@pipe-c-ctm-0-75:
    - shard-skl:          [DMESG-WARN][117] ([i915#1982]) -> [PASS][118]
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl1/igt@kms_color@pipe-c-ctm-0-75.html
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl9/igt@kms_color@pipe-c-ctm-0-75.html

  * igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions:
    - shard-skl:          [FAIL][119] ([i915#2346]) -> [PASS][120]
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl9/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl10/igt@kms_cursor_legacy@flip-vs-cursor-atomic-transitions.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1:
    - shard-skl:          [FAIL][121] ([i915#79]) -> [PASS][122] +1 similar issue
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl7/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1.html
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl5/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-edp1.html

  * igt@kms_flip@flip-vs-expired-vblank-interruptible@b-hdmi-a2:
    - shard-glk:          [FAIL][123] ([i915#79]) -> [PASS][124] +1 similar issue
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-glk8/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-hdmi-a2.html
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-glk1/igt@kms_flip@flip-vs-expired-vblank-interruptible@b-hdmi-a2.html

  * igt@kms_flip@flip-vs-suspend-interruptible@a-dp1:
    - shard-kbl:          [DMESG-WARN][125] ([i915#180]) -> [PASS][126] +4 similar issues
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-kbl4/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl2/igt@kms_flip@flip-vs-suspend-interruptible@a-dp1.html

  * igt@kms_flip@flip-vs-suspend@c-edp1:
    - shard-skl:          [INCOMPLETE][127] ([i915#146] / [i915#198]) -> [PASS][128]
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl3/igt@kms_flip@flip-vs-suspend@c-edp1.html
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl7/igt@kms_flip@flip-vs-suspend@c-edp1.html

  * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c:
    - shard-apl:          [DMESG-WARN][129] ([i915#180]) -> [PASS][130]
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-apl7/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-c.html

  * igt@kms_psr@psr2_no_drrs:
    - shard-iclb:         [SKIP][131] ([fdo#109441]) -> [PASS][132]
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb5/igt@kms_psr@psr2_no_drrs.html
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb2/igt@kms_psr@psr2_no_drrs.html

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-kbl:          [DMESG-WARN][133] ([i915#180] / [i915#295]) -> [PASS][134]
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-kbl3/igt@kms_vblank@pipe-a-ts-continuation-suspend.html
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-kbl3/igt@kms_vblank@pipe-a-ts-continuation-suspend.html

  * igt@sysfs_heartbeat_interval@mixed@bcs0:
    - shard-skl:          [FAIL][135] ([i915#1731]) -> [PASS][136]
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-skl6/igt@sysfs_heartbeat_interval@mixed@bcs0.html
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-skl8/igt@sysfs_heartbeat_interval@mixed@bcs0.html

  
#### Warnings ####

  * igt@i915_pm_rc6_residency@rc6-idle:
    - shard-iclb:         [WARN][137] ([i915#2684]) -> [WARN][138] ([i915#1804] / [i915#2684])
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb8/igt@i915_pm_rc6_residency@rc6-idle.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb6/igt@i915_pm_rc6_residency@rc6-idle.html

  * igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-2:
    - shard-iclb:         [SKIP][139] ([i915#658]) -> [SKIP][140] ([i915#2920])
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-iclb5/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-2.html
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-iclb2/igt@kms_psr2_sf@overlay-primary-update-sf-dmg-area-2.html

  * igt@runner@aborted:
    - shard-apl:          ([FAIL][141], [FAIL][142]) ([fdo#109271] / [i915#1814] / [i915#3002] / [i915#3363]) -> ([FAIL][143], [FAIL][144], [FAIL][145], [FAIL][146]) ([i915#180] / [i915#3002] / [i915#3363])
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-apl7/igt@runner@aborted.html
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10268/shard-apl8/igt@runner@aborted.html
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl3/igt@runner@aborted.html
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl8/igt@runner@aborted.html
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl7/igt@runner@aborted.html
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/shard-apl3/igt@runner@aborted.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAIL).

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_20444/index.html

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-23 16:19     ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24  6:59       ` Dave Airlie
  -1 siblings, 0 replies; 175+ messages in thread
From: Dave Airlie @ 2021-06-24  6:59 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Daniel Stone, Daniel Vetter, Intel Graphics Development,
	Kevin Wang, DRI Development, Michel Dänzer, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, Alex Deucher, mesa-dev,
	Christian König, Dennis Li, Deepak R Varma

On Thu, 24 Jun 2021 at 02:20, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Docs for struct dma_resv are fairly clear:
>
> "A reservation object can have attached one exclusive fence (normally
> associated with write operations) or N shared fences (read
> operations)."
>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
>
> Furthermore, here is a review across all of upstream.
>
> First, render drivers and how they set implicit fences:
>
> - nouveau follows this contract, see in validate_fini_no_ticket()
>
>                         nouveau_bo_fence(nvbo, fence, !!b->write_domains);
>
>   and that last boolean controls whether the exclusive or shared fence
>   slot is used.
>
> - radeon follows this contract by setting
>
>                 p->relocs[i].tv.num_shared = !r->write_domain;
>
>   in radeon_cs_parser_relocs(), which ensures that the call to
>   ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
>   right thing.
>
> - vmwgfx seems to follow this contract with the shotgun approach of
>   always setting ttm_val_buf->num_shared = 0, which means
>   ttm_eu_fence_buffer_objects() will only use the exclusive slot.
>
> - etnaviv follows this contract, as can be trivially seen by looking
>   at submit_attach_object_fences()
>
> - i915 is a bit of a convoluted maze with multiple paths leading to
>   i915_vma_move_to_active(). Which sets the exclusive flag if
>   EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
>   softpin mode, or through the write_domain when using relocations. It
>   follows this contract.
>
> - lima follows this contract, see lima_gem_submit() which sets the
>   exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
>   bo
>
> - msm follows this contract, see msm_gpu_submit() which sets the
>   exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
>
> - panfrost follows this contract with the shotgun approach of just
>   always setting the exclusive fence, see
>   panfrost_attach_object_fences(). Benefits of a single engine I guess
>
> - v3d follows this contract with the same shotgun approach in
>   v3d_attach_fences_and_unlock_reservation(), but it has at least an
>   XXX comment that maybe this should be improved
>
> - vc4 uses the same shotgun approach of always setting an exclusive
>   fence, see vc4_update_bo_seqnos()
>
> - vgem also follows this contract, see vgem_fence_attach_ioctl() and
>   the VGEM_FENCE_WRITE. This is used in some igts to validate prime
>   sharing with i915.ko without the need of a 2nd gpu
>
> - virtio follows this contract again with the shotgun approach of
>   always setting an exclusive fence, see virtio_gpu_array_add_fence()
>
> This covers the setting of the exclusive fences when writing.
>
> Synchronizing against the exclusive fence is a lot more tricky, and I
> only spot checked a few:
>
> - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
>   implicit dependencies (which is used by vulkan)
>
> - etnaviv does this. Implicit dependencies are collected in
>   submit_fence_sync(), again with an opt-out flag
>   ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
>   etnaviv_sched_dependency which is the
>   drm_sched_backend_ops->dependency callback.
>
> - vc4 seems to not do much here, maybe it gets away with it by not having
>   a scheduler and only a single engine. Since all newer broadcom chips than
>   the OG vc4 use v3d for rendering, which follows this contract, the
>   impact of this issue is fairly small.
>
> - v3d does this using the drm_gem_fence_array_add_implicit() helper,
>   which then it's drm_sched_backend_ops->dependency callback
>   v3d_job_dependency() picks up.
>
> - panfrost is nice here and tracks the implicit fences in
>   panfrost_job->implicit_fences, which again the
>   drm_sched_backend_ops->dependency callback panfrost_job_dependency()
>   picks up. It is mildly questionable though since it only picks up
>   exclusive fences in panfrost_acquire_object_fences(), but not buggy
>   in practice because it also always sets the exclusive fence. It
>   should pick up both sets of fences, just in case there's ever going
>   to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
>   pcie port and a real gpu, which might actually happen eventually. A
>   bug, but easy to fix. Should probably use the
>   drm_gem_fence_array_add_implicit() helper.
>
> - lima is nice and easy, it uses drm_gem_fence_array_add_implicit() and
>   the same schema as v3d.
>
> - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
>   but because it doesn't use the drm/scheduler it handles fences from
>   the wrong context with a synchronous dma_fence_wait. See
>   submit_fence_sync() leading to msm_gem_sync_object(). Investing into
>   a scheduler might be a good idea.
>
> - all the remaining drivers are ttm based, where I hope they do
>   appropriately obey implicit fences already. I didn't do the full
>   audit there because a) not following the contract would confuse ttm
>   quite well and b) reading non-standard scheduler and submit code
>   which isn't based on drm/scheduler is a pain.
>
> Onwards to the display side.
>
> - Any driver using the drm_gem_plane_helper_prepare_fb() helper will
>   do this correctly. Overwhelmingly most drivers get this right, except
>   a few totally don't. I'll follow up with a patch to make this the
>   default and avoid a bunch of bugs.
>
> - I didn't audit the ttm drivers, but given that dma_resv started
>   there I hope they get this right.
>
> In conclusion this IS the contract, both as documented and
> overwhelmingly implemented, specifically as implemented by all render
> drivers except amdgpu.
>
> Amdgpu tried to fix this already in
>
> commit 049aca4363d8af87cab8d53de5401602db3b9999
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Sep 19 16:54:35 2018 +0200
>
>     drm/amdgpu: fix using shared fence for exported BOs v2
>
> but this fix falls short on a number of areas:
>
> - It's racy: by the time the buffer is shared it might be too late. To
>   make sure there's definitely never a problem we need to set the
>   fences correctly for any buffer that's potentially exportable.
>
> - It's breaking uapi: dma-buf fds support poll() and differentiate
>   between read and write access, which was introduced in
>
>         commit 9b495a5887994a6d74d5c261d012083a92b94738
>         Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>         Date:   Tue Jul 1 12:57:43 2014 +0200
>
>             dma-buf: add poll support, v3
>
> - Christian König wants to nack new uapi building further on this
>   dma_resv contract because it breaks amdgpu, quoting
>
>   "Yeah, and that is exactly the reason why I will NAK this uAPI change.
>
>   "This doesn't works for amdgpu at all for the reasons outlined above."
>
>   https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
>
>   Rejecting new development because your own driver is broken and
>   violates established cross driver contracts and uapi is really not
>   how upstream works.
>
> Now this patch will have a severe performance impact on anything that
> runs on multiple engines. So we can't just merge it outright, but need
> a bit of a plan:
>
> - amdgpu needs a proper uapi for handling implicit fencing. The funny
>   thing is that to do it correctly, implicit fencing must be treated
>   as a very strange IPC mechanism for transporting fences, where both
>   setting the fence and dependency intercepts must be handled
>   explicitly. Current best practice is a per-bo flag to indicate
>   writes, and a per-bo flag to skip implicit fencing in the CS
>   ioctl as a new chunk.
>
> - Since amdgpu has been shipping with broken behaviour we need an
>   opt-out flag from the butchered implicit fencing model to enable the
>   proper explicit implicit fencing model.
>
> - for kernel memory fences due to bo moves at least the i915 idea is
>   to use ttm_bo->moving. amdgpu probably needs the same.
>
> - since the current p2p dma-buf interface assumes the kernel memory
>   fence is in the exclusive dma_resv fence slot we need to add a new
>   fence slot for kernel fences, which must never be ignored. Since
>   currently only amdgpu supports this there's no real problem here
>   yet, until amdgpu gains a NO_IMPLICIT CS flag.
>
> - New userspace needs to ship in enough desktop distros so that users
>   won't notice the perf impact. I think we can ignore LTS distros who
>   upgrade their kernels but not their mesa3d snapshot.
>
> - Then when this is all in place we can merge this patch here.
>
> What is not a solution to this problem here is trying to make the
> dma_resv rules in the kernel more clever. The fundamental issue here
> is that the amdgpu CS uapi is the least expressive one across all
> drivers (only equalled by panfrost, which has an actual excuse) by not
> allowing any userspace control over how implicit sync is conducted.
>
> Until this is fixed it's completely pointless to make the kernel more
> clever to improve amdgpu, because all we're doing is papering over
> this uapi design issue. amdgpu needs to attain the status quo
> established by other drivers first, once that's achieved we can tackle
> the remaining issues in a consistent way across drivers.
>
> v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
> entirely missed.
>
> This is great because it means the amdgpu specific piece for proper
> implicit fence handling exists already, and has for a while. The
> only things now missing are:
> - fishing the implicit fences out of a shared object at the right time
> - setting the exclusive implicit fence slot at the right time.
>
> Jason has a patch series to fill that gap with a bunch of generic
> ioctls on the dma-buf fd:
>
> https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
>
> v3: Since Christian has fixed amdgpu now in
>
> commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Jun 9 13:51:36 2021 +0200
>
>     drm/amdgpu: rework dma_resv handling v3
>
> Use the audit covered in this commit message as the excuse to update
> the dma-buf docs around dma_buf.resv usage across drivers.
>
> Since dynamic importers have different rules also hammer these in
> again while we're at it.
>
> v4:
> - Add the missing "through the device" in the dynamic section that I
>   overlooked.
> - Fix a kerneldoc markup mistake, the link didn't connect
>

This is a pretty epic commit msg; thanks for the investment. The commit
msg should be required reading.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Dave.

^ permalink raw reply	[flat|nested] 175+ messages in thread
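
As a companion to the audit in the quoted commit message, here is a minimal
sketch of the producer side of that contract: after the CS ioctl the driver
attaches its job fence to each BO's dma_resv, in the exclusive slot for
writes and in a shared slot for reads. The function and parameter names are
illustrative only (they are not taken from any particular driver), and the
reservation/locking flow is reduced to the bare minimum.

  #include <linux/dma-resv.h>
  #include <drm/drm_gem.h>

  static int sketch_attach_job_fence(struct drm_gem_object *obj,
                                     struct dma_fence *job_fence,
                                     bool write)
  {
          int ret;

          dma_resv_assert_held(obj->resv);

          if (write) {
                  /* A write replaces the single exclusive fence slot. */
                  dma_resv_add_excl_fence(obj->resv, job_fence);
                  return 0;
          }

          /*
           * Reads go into one of the N shared slots. Real drivers reserve
           * the slot before the job fence exists, because this step must
           * not fail once the job has been committed.
           */
          ret = dma_resv_reserve_shared(obj->resv, 1);
          if (ret)
                  return ret;

          dma_resv_add_shared_fence(obj->resv, job_fence);
          return 0;
  }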

* Re: [Intel-gfx] [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
@ 2021-06-24  6:59       ` Dave Airlie
  0 siblings, 0 replies; 175+ messages in thread
From: Dave Airlie @ 2021-06-24  6:59 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Daniel Stone, Daniel Vetter, Intel Graphics Development,
	Kevin Wang, DRI Development, Sumit Semwal, Michel Dänzer,
	Luben Tuikov, Kristian H . Kristensen, Chen Li,
	Bas Nieuwenhuizen, Alex Deucher, mesa-dev, Christian König,
	Dennis Li, Deepak R Varma

On Thu, 24 Jun 2021 at 02:20, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> Docs for struct dma_resv are fairly clear:
>
> "A reservation object can have attached one exclusive fence (normally
> associated with write operations) or N shared fences (read
> operations)."
>
> https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects
>
> Furthermore, here is a review across all of upstream.
>
> First, render drivers and how they set implicit fences:
>
> - nouveau follows this contract, see in validate_fini_no_ticket()
>
>                         nouveau_bo_fence(nvbo, fence, !!b->write_domains);
>
>   and that last boolean controls whether the exclusive or shared fence
>   slot is used.
>
> - radeon follows this contract by setting
>
>                 p->relocs[i].tv.num_shared = !r->write_domain;
>
>   in radeon_cs_parser_relocs(), which ensures that the call to
>   ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
>   right thing.
>
> - vmwgfx seems to follow this contract with the shotgun approach of
>   always setting ttm_val_buf->num_shared = 0, which means
>   ttm_eu_fence_buffer_objects() will only use the exclusive slot.
>
> - etnaviv follows this contract, as can be trivially seen by looking
>   at submit_attach_object_fences()
>
> - i915 is a bit of a convoluted maze with multiple paths leading to
>   i915_vma_move_to_active(). Which sets the exclusive flag if
>   EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
>   softpin mode, or through the write_domain when using relocations. It
>   follows this contract.
>
> - lima follows this contract, see lima_gem_submit() which sets the
>   exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
>   bo
>
> - msm follows this contract, see msm_gpu_submit() which sets the
>   exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer
>
> - panfrost follows this contract with the shotgun approach of just
>   always setting the exclusive fence, see
>   panfrost_attach_object_fences(). Benefits of a single engine I guess
>
> - v3d follows this contract with the same shotgun approach in
>   v3d_attach_fences_and_unlock_reservation(), but it has at least an
>   XXX comment that maybe this should be improved
>
> - vc4 uses the same shotgun approach of always setting an exclusive
>   fence, see vc4_update_bo_seqnos()
>
> - vgem also follows this contract, see vgem_fence_attach_ioctl() and
>   the VGEM_FENCE_WRITE. This is used in some igts to validate prime
>   sharing with i915.ko without the need of a 2nd gpu
>
> - virtio follows this contract again with the shotgun approach of
>   always setting an exclusive fence, see virtio_gpu_array_add_fence()
>
> This covers the setting of the exclusive fences when writing.
>
> Synchronizing against the exclusive fence is a lot more tricky, and I
> only spot checked a few:
>
> - i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
>   implicit dependencies (which is used by vulkan)
>
> - etnaviv does this. Implicit dependencies are collected in
>   submit_fence_sync(), again with an opt-out flag
>   ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
>   etnaviv_sched_dependency which is the
>   drm_sched_backend_ops->dependency callback.
>
> - vc4 seems to not do much here, maybe it gets away with it by not having
>   a scheduler and only a single engine. Since all newer broadcom chips than
>   the OG vc4 use v3d for rendering, which follows this contract, the
>   impact of this issue is fairly small.
>
> - v3d does this using the drm_gem_fence_array_add_implicit() helper,
>   which then it's drm_sched_backend_ops->dependency callback
>   v3d_job_dependency() picks up.
>
> - panfrost is nice here and tracks the implicit fences in
>   panfrost_job->implicit_fences, which again the
>   drm_sched_backend_ops->dependency callback panfrost_job_dependency()
>   picks up. It is mildly questionable though since it only picks up
>   exclusive fences in panfrost_acquire_object_fences(), but not buggy
>   in practice because it also always sets the exclusive fence. It
>   should pick up both sets of fences, just in case there's ever going
>   to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
>   pcie port and a real gpu, which might actually happen eventually. A
>   bug, but easy to fix. Should probably use the
>   drm_gem_fence_array_add_implicit() helper.
>
> - lima is nice and easy, it uses drm_gem_fence_array_add_implicit() and
>   the same schema as v3d.
>
> - msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
>   but because it doesn't use the drm/scheduler it handles fences from
>   the wrong context with a synchronous dma_fence_wait. See
>   submit_fence_sync() leading to msm_gem_sync_object(). Investing into
>   a scheduler might be a good idea.
>
> - all the remaining drivers are ttm based, where I hope they do
>   appropriately obey implicit fences already. I didn't do the full
>   audit there because a) not following the contract would confuse ttm
>   quite well and b) reading non-standard scheduler and submit code
>   which isn't based on drm/scheduler is a pain.
>
> Onwards to the display side.
>
> - Any driver using the drm_gem_plane_helper_prepare_fb() helper will
>   do this correctly. Overwhelmingly most drivers get this right, except
>   a few totally don't. I'll follow up with a patch to make this the
>   default and avoid a bunch of bugs.
>
> - I didn't audit the ttm drivers, but given that dma_resv started
>   there I hope they get this right.
>
> In conclusion this IS the contract, both as documented and
> overwhelmingly implemented, specifically as implemented by all render
> drivers except amdgpu.
>
> Amdgpu tried to fix this already in
>
> commit 049aca4363d8af87cab8d53de5401602db3b9999
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Sep 19 16:54:35 2018 +0200
>
>     drm/amdgpu: fix using shared fence for exported BOs v2
>
> but this fix falls short on a number of areas:
>
> - It's racy: by the time the buffer is shared it might be too late. To
>   make sure there's definitely never a problem we need to set the
>   fences correctly for any buffer that's potentially exportable.
>
> - It's breaking uapi: dma-buf fds support poll() and differentiate
>   between read and write access, which was introduced in
>
>         commit 9b495a5887994a6d74d5c261d012083a92b94738
>         Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>         Date:   Tue Jul 1 12:57:43 2014 +0200
>
>             dma-buf: add poll support, v3
>
> - Christian König wants to nack new uapi building further on this
>   dma_resv contract because it breaks amdgpu, quoting
>
>   "Yeah, and that is exactly the reason why I will NAK this uAPI change.
>
>   "This doesn't works for amdgpu at all for the reasons outlined above."
>
>   https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/
>
>   Rejecting new development because your own driver is broken and
>   violates established cross driver contracts and uapi is really not
>   how upstream works.
>
> Now this patch will have a severe performance impact on anything that
> runs on multiple engines. So we can't just merge it outright, but need
> a bit of a plan:
>
> - amdgpu needs a proper uapi for handling implicit fencing. The funny
>   thing is that to do it correctly, implicit fencing must be treated
>   as a very strange IPC mechanism for transporting fences, where both
>   setting the fence and dependency intercepts must be handled
>   explicitly. Current best practice is a per-bo flag to indicate
>   writes, and a per-bo flag to skip implicit fencing in the CS
>   ioctl as a new chunk.
>
> - Since amdgpu has been shipping with broken behaviour we need an
>   opt-out flag from the butchered implicit fencing model to enable the
>   proper explicit implicit fencing model.
>
> - for kernel memory fences due to bo moves at least the i915 idea is
>   to use ttm_bo->moving. amdgpu probably needs the same.
>
> - since the current p2p dma-buf interface assumes the kernel memory
>   fence is in the exclusive dma_resv fence slot we need to add a new
>   fence slot for kernel fences, which must never be ignored. Since
>   currently only amdgpu supports this there's no real problem here
>   yet, until amdgpu gains a NO_IMPLICIT CS flag.
>
> - New userspace needs to ship in enough desktop distros so that users
>   won't notice the perf impact. I think we can ignore LTS distros who
>   upgrade their kernels but not their mesa3d snapshot.
>
> - Then when this is all in place we can merge this patch here.
>
> What is not a solution to this problem is trying to make the
> dma_resv rules in the kernel more clever. The fundamental issue here
> is that the amdgpu CS uapi is the least expressive one across all
> drivers (only equalled by panfrost, which has an actual excuse) by not
> allowing any userspace control over how implicit sync is conducted.
>
> Until this is fixed it's completely pointless to make the kernel more
> clever to improve amdgpu, because all we're doing is papering over
> this uapi design issue. amdgpu needs to attain the status quo
> established by other drivers first, once that's achieved we can tackle
> the remaining issues in a consistent way across drivers.
>
> v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
> entirely missed.
>
> This is great because it means the amdgpu specific piece for proper
> implicit fence handling exists already, and has for a while. The
> only things now missing are:
> - fishing the implicit fences out of a shared object at the right time
> - setting the exclusive implicit fence slot at the right time.
>
> Jason has a patch series to fill that gap with a bunch of generic
> ioctls on the dma-buf fd:
>
> https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/
>
> v3: Since Christian has fixed amdgpu now in
>
> commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
> Author: Christian König <christian.koenig@amd.com>
> Date:   Wed Jun 9 13:51:36 2021 +0200
>
>     drm/amdgpu: rework dma_resv handling v3
>
> Use the audit covered in this commit message as the excuse to update
> the dma-buf docs around dma_buf.resv usage across drivers.
>
> Since dynamic importers have different rules, also hammer these in
> again while we're at it.
>
> v4:
> - Add the missing "through the device" in the dynamic section that I
>   overlooked.
> - Fix a kerneldoc markup mistake, the link didn't connect
>

This is a pretty epic commit msg, thanks for the investment; the commit
msg should be required reading.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24  7:38     ` Thomas Zimmermann
  -1 siblings, 0 replies; 175+ messages in thread
From: Thomas Zimmermann @ 2021-06-24  7:38 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: David Airlie, Intel Graphics Development, Daniel Vetter,
	Hans de Goede, Laurent Pinchart, Dave Airlie, Tian Tao


[-- Attachment #1.1: Type: text/plain, Size: 4281 bytes --]



Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> Like we have for the shadow helpers too, and roll it out to drivers.
> 
> Acked-by: Tian Tao <tiantao6@hisilicon.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Hans de Goede <hdegoede@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Acked-by: Thomas Zimmermann <tzimmermann@suse.de>

> ---
>   drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
>   drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
>   include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
>   4 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> index e5996ae03c49..f5d58c3088fe 100644
> --- a/drivers/gpu/drm/ast/ast_mode.c
> +++ b/drivers/gpu/drm/ast/ast_mode.c
> @@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
>   }
>   
>   static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = ast_primary_plane_helper_atomic_check,
>   	.atomic_update = ast_primary_plane_helper_atomic_update,
>   	.atomic_disable = ast_primary_plane_helper_atomic_disable,
> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> index 29b8332b2bca..ccf80e369b4b 100644
> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> @@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
>   };
>   
>   static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = hibmc_plane_atomic_check,
>   	.atomic_update = hibmc_plane_atomic_update,
>   };
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> index 964381d55fc1..972c83b720aa 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> @@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
>   	.atomic_check = vbox_primary_atomic_check,
>   	.atomic_update = vbox_primary_atomic_update,
>   	.atomic_disable = vbox_primary_atomic_disable,
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   };
>   
>   static const struct drm_plane_funcs vbox_primary_plane_funcs = {
> diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
> index 27ed7e9243b9..f48d181c824b 100644
> --- a/include/drm/drm_gem_vram_helper.h
> +++ b/include/drm/drm_gem_vram_helper.h
> @@ -124,6 +124,18 @@ void
>   drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
>   				     struct drm_plane_state *old_state);
>   
> +/**
> + * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
> + *	Initializes struct drm_plane_helper_funcs for VRAM handling
> + *
> + * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
> + * macro initializes struct drm_plane_helper_funcs to use the respective helper
> + * functions.
> + */
> +#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
> +	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
> +	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
> +
>   /*
>    * Helpers for struct drm_simple_display_pipe_funcs
>    */
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
@ 2021-06-24  7:38     ` Thomas Zimmermann
  0 siblings, 0 replies; 175+ messages in thread
From: Thomas Zimmermann @ 2021-06-24  7:38 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: David Airlie, Intel Graphics Development, Daniel Vetter,
	Laurent Pinchart, Dave Airlie, Tian Tao


[-- Attachment #1.1.1: Type: text/plain, Size: 4281 bytes --]



Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> Like we have for the shadow helpers too, and roll it out to drivers.
> 
> Acked-by: Tian Tao <tiantao6@hisilicon.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Hans de Goede <hdegoede@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

Acked-by: Thomas Zimmermann <tzimmermann@suse.de>

> ---
>   drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
>   drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
>   include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
>   4 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> index e5996ae03c49..f5d58c3088fe 100644
> --- a/drivers/gpu/drm/ast/ast_mode.c
> +++ b/drivers/gpu/drm/ast/ast_mode.c
> @@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
>   }
>   
>   static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = ast_primary_plane_helper_atomic_check,
>   	.atomic_update = ast_primary_plane_helper_atomic_update,
>   	.atomic_disable = ast_primary_plane_helper_atomic_disable,
> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> index 29b8332b2bca..ccf80e369b4b 100644
> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> @@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
>   };
>   
>   static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = hibmc_plane_atomic_check,
>   	.atomic_update = hibmc_plane_atomic_update,
>   };
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> index 964381d55fc1..972c83b720aa 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> @@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
>   	.atomic_check = vbox_primary_atomic_check,
>   	.atomic_update = vbox_primary_atomic_update,
>   	.atomic_disable = vbox_primary_atomic_disable,
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   };
>   
>   static const struct drm_plane_funcs vbox_primary_plane_funcs = {
> diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
> index 27ed7e9243b9..f48d181c824b 100644
> --- a/include/drm/drm_gem_vram_helper.h
> +++ b/include/drm/drm_gem_vram_helper.h
> @@ -124,6 +124,18 @@ void
>   drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
>   				     struct drm_plane_state *old_state);
>   
> +/**
> + * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
> + *	Initializes struct drm_plane_helper_funcs for VRAM handling
> + *
> + * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
> + * macro initializes struct drm_plane_helper_funcs to use the respective helper
> + * functions.
> + */
> +#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
> +	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
> +	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
> +
>   /*
>    * Helpers for struct drm_simple_display_pipe_funcs
>    */
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24  7:46     ` Thomas Zimmermann
  -1 siblings, 0 replies; 175+ messages in thread
From: Thomas Zimmermann @ 2021-06-24  7:46 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: David Airlie, Intel Graphics Development, Daniel Vetter,
	Hans de Goede, Laurent Pinchart, Dave Airlie, Tian Tao


[-- Attachment #1.1: Type: text/plain, Size: 4383 bytes --]

Hi

Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> Like we have for the shadow helpers too, and roll it out to drivers.

In addition to the plane-helper macro, you may also want to add 
DRM_GEM_VRAM_SIMPLE_DISPLAY_PIPE_FUNCS and use it in bochs.
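
E.g. roughly like this (untested sketch, assuming the existing
drm_gem_vram_simple_display_pipe_prepare_fb()/_cleanup_fb() helpers):

#define DRM_GEM_VRAM_SIMPLE_DISPLAY_PIPE_FUNCS \
	.prepare_fb = drm_gem_vram_simple_display_pipe_prepare_fb, \
	.cleanup_fb = drm_gem_vram_simple_display_pipe_cleanup_fb

bochs' struct drm_simple_display_pipe_funcs could then replace its two
.prepare_fb/.cleanup_fb assignments with the macro, same as the planes
converted in this patch.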

Best regards
Thomas

> 
> Acked-by: Tian Tao <tiantao6@hisilicon.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Hans de Goede <hdegoede@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>   drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
>   drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
>   include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
>   4 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> index e5996ae03c49..f5d58c3088fe 100644
> --- a/drivers/gpu/drm/ast/ast_mode.c
> +++ b/drivers/gpu/drm/ast/ast_mode.c
> @@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
>   }
>   
>   static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = ast_primary_plane_helper_atomic_check,
>   	.atomic_update = ast_primary_plane_helper_atomic_update,
>   	.atomic_disable = ast_primary_plane_helper_atomic_disable,
> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> index 29b8332b2bca..ccf80e369b4b 100644
> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> @@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
>   };
>   
>   static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = hibmc_plane_atomic_check,
>   	.atomic_update = hibmc_plane_atomic_update,
>   };
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> index 964381d55fc1..972c83b720aa 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> @@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
>   	.atomic_check = vbox_primary_atomic_check,
>   	.atomic_update = vbox_primary_atomic_update,
>   	.atomic_disable = vbox_primary_atomic_disable,
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   };
>   
>   static const struct drm_plane_funcs vbox_primary_plane_funcs = {
> diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
> index 27ed7e9243b9..f48d181c824b 100644
> --- a/include/drm/drm_gem_vram_helper.h
> +++ b/include/drm/drm_gem_vram_helper.h
> @@ -124,6 +124,18 @@ void
>   drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
>   				     struct drm_plane_state *old_state);
>   
> +/**
> + * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
> + *	Initializes struct drm_plane_helper_funcs for VRAM handling
> + *
> + * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
> + * macro initializes struct drm_plane_helper_funcs to use the respective helper
> + * functions.
> + */
> +#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
> +	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
> +	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
> +
>   /*
>    * Helpers for struct drm_simple_display_pipe_funcs
>    */
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
@ 2021-06-24  7:46     ` Thomas Zimmermann
  0 siblings, 0 replies; 175+ messages in thread
From: Thomas Zimmermann @ 2021-06-24  7:46 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: David Airlie, Intel Graphics Development, Daniel Vetter,
	Laurent Pinchart, Dave Airlie, Tian Tao


[-- Attachment #1.1.1: Type: text/plain, Size: 4383 bytes --]

Hi

Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> Like we have for the shadow helpers too, and roll it out to drivers.

In addition to the plane-helper macro, you may also want to add 
DRM_GEM_VRAM_SIMPLE_DISPLAY_PIPE_FUNCS and use it in bochs.

Best regards
Thomas

> 
> Acked-by: Tian Tao <tiantao6@hisilicon.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Dave Airlie <airlied@redhat.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Hans de Goede <hdegoede@redhat.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Tian Tao <tiantao6@hisilicon.com>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> ---
>   drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
>   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
>   drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
>   include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
>   4 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> index e5996ae03c49..f5d58c3088fe 100644
> --- a/drivers/gpu/drm/ast/ast_mode.c
> +++ b/drivers/gpu/drm/ast/ast_mode.c
> @@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
>   }
>   
>   static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = ast_primary_plane_helper_atomic_check,
>   	.atomic_update = ast_primary_plane_helper_atomic_update,
>   	.atomic_disable = ast_primary_plane_helper_atomic_disable,
> diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> index 29b8332b2bca..ccf80e369b4b 100644
> --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> @@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
>   };
>   
>   static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   	.atomic_check = hibmc_plane_atomic_check,
>   	.atomic_update = hibmc_plane_atomic_update,
>   };
> diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> index 964381d55fc1..972c83b720aa 100644
> --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
> +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> @@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
>   	.atomic_check = vbox_primary_atomic_check,
>   	.atomic_update = vbox_primary_atomic_update,
>   	.atomic_disable = vbox_primary_atomic_disable,
> -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
>   };
>   
>   static const struct drm_plane_funcs vbox_primary_plane_funcs = {
> diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
> index 27ed7e9243b9..f48d181c824b 100644
> --- a/include/drm/drm_gem_vram_helper.h
> +++ b/include/drm/drm_gem_vram_helper.h
> @@ -124,6 +124,18 @@ void
>   drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
>   				     struct drm_plane_state *old_state);
>   
> +/**
> + * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
> + *	Initializes struct drm_plane_helper_funcs for VRAM handling
> + *
> + * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
> + * macro initializes struct drm_plane_helper_funcs to use the respective helper
> + * functions.
> + */
> +#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
> +	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
> +	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
> +
>   /*
>    * Helpers for struct drm_simple_display_pipe_funcs
>    */
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 840 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
  2021-06-22 16:55   ` Daniel Vetter
                       ` (5 preceding siblings ...)
  (?)
@ 2021-06-24  8:32     ` Philipp Zabel
  -1 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Heiko Stuebner, Paul Cercueil,
	Jernej Skrabec, Chun-Kuang Hu, Martin Blumenstingl,
	Tomi Valkeinen, Philippe Cornu, Lucas Stach, Daniel Vetter,
	Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Matthias Brugger, Neil Armstrong, Kevin Hilman, Jerome Brunet,
	Marek Vasut, Stefan Agner, Sandy Huang, Yannick Fertre,
	Benjamin Gaignard, Maxime Coquelin, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, Jyri Sarha, Tomi Valkeinen,
	linux-arm-kernel, linux-mips, linux-mediatek, linux-amlogic,
	linux-rockchip, linux-stm32, linux-sunxi

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Heiko Stuebner, Paul Cercueil,
	Jernej Skrabec, Chun-Kuang Hu, Martin Blumenstingl,
	Tomi Valkeinen, Philippe Cornu, Lucas Stach, Daniel Vetter,
	Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Matthias Brugger, Neil Armstrong, Kevin Hilman, Jerome Brunet,
	Marek Vasut, Stefan Agner, Sandy Huang, Yannick Fertre,
	Benjamin Gaignard, Maxime Coquelin, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, Jyri Sarha, Tomi Valkeinen,
	linux-arm-kernel, linux-mips, linux-mediatek, linux-amlogic,
	linux-rockchip, linux-stm32, linux-sunxi

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Heiko Stuebner, Paul Cercueil,
	Jernej Skrabec, Chun-Kuang Hu, Martin Blumenstingl,
	Tomi Valkeinen, Philippe Cornu, Lucas Stach, Daniel Vetter,
	Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Matthias Brugger, Neil Armstrong, Kevin Hilman, Jerome Brunet,
	Marek Vasut, Stefan Agner, Sandy Huang, Yannick Fertre,
	Benjamin Gaignard, Maxime Coquelin, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, Jyri Sarha, Tomi Valkeinen,
	linux-arm-kernel, linux-mips, linux-mediatek, linux-amlogic,
	linux-rockchip, linux-stm32, linux-sunxi

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Heiko Stuebner, Paul Cercueil,
	Jernej Skrabec, Chun-Kuang Hu, Martin Blumenstingl,
	Tomi Valkeinen, Philippe Cornu, Lucas Stach, Daniel Vetter,
	Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Matthias Brugger, Neil Armstrong, Kevin Hilman, Jerome Brunet,
	Marek Vasut, Stefan Agner, Sandy Huang, Yannick Fertre,
	Benjamin Gaignard, Maxime Coquelin, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, Jyri Sarha, Tomi Valkeinen,
	linux-arm-kernel, linux-mips, linux-mediatek, linux-amlogic,
	linux-rockchip, linux-stm32, linux-sunxi

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

_______________________________________________
Linux-mediatek mailing list
Linux-mediatek@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-mediatek

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Heiko Stuebner, Paul Cercueil,
	Jernej Skrabec, Chun-Kuang Hu, Martin Blumenstingl,
	Tomi Valkeinen, Philippe Cornu, Lucas Stach, Daniel Vetter,
	Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Matthias Brugger, Neil Armstrong, Kevin Hilman, Jerome Brunet,
	Marek Vasut, Stefan Agner, Sandy Huang, Yannick Fertre,
	Benjamin Gaignard, Maxime Coquelin, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, Jyri Sarha, Tomi Valkeinen,
	linux-arm-kernel, linux-mips, linux-mediatek, linux-amlogic,
	linux-rockchip, linux-stm32, linux-sunxi

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Neil Armstrong, Tomi Valkeinen, Alexandre Torgue, linux-mips,
	Paul Cercueil, Benjamin Gaignard, Daniel Vetter, linux-stm32,
	Jerome Brunet, Marek Vasut, Kevin Hilman, Jernej Skrabec,
	linux-rockchip, Chen-Yu Tsai, NXP Linux Team, Sascha Hauer,
	Chun-Kuang Hu, Maxime Coquelin, Martin Blumenstingl,
	Intel Graphics Development, linux-mediatek, Laurentiu Palcu,
	Matthias Brugger, linux-amlogic, linux-arm-kernel,
	Tomi Valkeinen, Jyri Sarha, Yannick Fertre, Sandy Huang,
	linux-sunxi, Philippe Cornu, Pengutronix Kernel Team, Shawn Guo

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Heiko Stuebner, Neil Armstrong, Tomi Valkeinen, Alexandre Torgue,
	Stefan Agner, linux-mips, Paul Cercueil, Benjamin Gaignard,
	Daniel Vetter, Fabio Estevam, linux-stm32, Jerome Brunet,
	Marek Vasut, Kevin Hilman, Jernej Skrabec, linux-rockchip,
	Chen-Yu Tsai, NXP Linux Team, Sascha Hauer, Chun-Kuang Hu,
	Maxime Coquelin, Martin Blumenstingl, Intel Graphics Development,
	Maxime Ripard, linux-mediatek, Laurentiu Palcu, Matthias Brugger,
	linux-amlogic, linux-arm-kernel, Tomi Valkeinen, Jyri Sarha,
	Yannick Fertre, Sandy Huang, linux-sunxi, Philippe Cornu,
	Pengutronix Kernel Team, Shawn Guo, Lucas Stach

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now the default
@ 2021-06-24  8:32     ` Philipp Zabel
  0 siblings, 0 replies; 175+ messages in thread
From: Philipp Zabel @ 2021-06-24  8:32 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Heiko Stuebner, Paul Cercueil,
	Jernej Skrabec, Chun-Kuang Hu, Martin Blumenstingl,
	Tomi Valkeinen, Philippe Cornu, Lucas Stach, Daniel Vetter,
	Laurentiu Palcu, Shawn Guo, Sascha Hauer,
	Pengutronix Kernel Team, Fabio Estevam, NXP Linux Team,
	Matthias Brugger, Neil Armstrong, Kevin Hilman, Jerome Brunet,
	Marek Vasut, Stefan Agner, Sandy Huang, Yannick Fertre,
	Benjamin Gaignard, Maxime Coquelin, Alexandre Torgue,
	Maxime Ripard, Chen-Yu Tsai, Jyri Sarha, Tomi Valkeinen,
	linux-arm-kernel, linux-mips, linux-mediatek, linux-amlogic,
	linux-rockchip, linux-stm32, linux-sunxi

On Tue, 2021-06-22 at 18:55 +0200, Daniel Vetter wrote:
> No need to set it explicitly.
> 
[...]
>  drivers/gpu/drm/imx/ipuv3-plane.c           | 1 -
>  14 files changed, 15 deletions(-)
> 
[...]
> diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
> index 8710f55d2579..ef114b6aa691 100644
> --- a/drivers/gpu/drm/imx/ipuv3-plane.c
> +++ b/drivers/gpu/drm/imx/ipuv3-plane.c
> @@ -772,7 +772,6 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
>  }
>  
>  static const struct drm_plane_helper_funcs ipu_plane_helper_funcs = {
> -	.prepare_fb = drm_gem_plane_helper_prepare_fb,
>  	.atomic_check = ipu_plane_atomic_check,
>  	.atomic_disable = ipu_plane_atomic_disable,
>  	.atomic_update = ipu_plane_atomic_update,

Acked-by: Philipp Zabel <p.zabel@pengutronix.de>

regards
Philipp

_______________________________________________
linux-amlogic mailing list
linux-amlogic@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-amlogic

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Mesa-dev] [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-23 16:19     ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24 11:08       ` Daniel Stone
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Stone @ 2021-06-24 11:08 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, Daniel Stone, Michel Dänzer,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Christian König,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, ML mesa-dev, Alex Deucher,
	Daniel Vetter, Dennis Li, Deepak R Varma

Hi,

On Wed, 23 Jun 2021 at 17:20, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> +        *
> +        * IMPLICIT SYNCHRONIZATION RULES:
> +        *
> +        * Drivers which support implicit synchronization of buffer access as
> +        * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
> +        * below rules.

'Should' ... ? Must.

> +        * - Drivers should add a shared fence through
> +        *   dma_resv_add_shared_fence() for anything the userspace API
> +        *   considers a read access. This highly depends upon the API and
> +        *   window system: E.g. OpenGL is generally implicitly synchronized on
> +        *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
> +        *   generally explicitly synchronized for everything, and window system
> +        *   buffers have explicit API calls (which then need to make sure the
> +        *   implicit fences store here in @resv are updated correctly).
> +        *
> +        * - [...]

Mmm, I think this is all right, but it could be worded much more
clearly. Right now it's a bunch of points all smashed into one, and
there's a lot of room for misinterpretation.

Here's a strawman, starting with most basic and restrictive, working
through to when you're allowed to wriggle your way out:

Rule 1: Drivers must add a shared fence through
dma_resv_add_shared_fence() for any read accesses against that buffer.
This appends a fence to the shared array, ensuring that any future
non-read access will be synchronised against this operation to only
begin after it has completed.

Rule 2: Drivers must add an exclusive fence through
dma_resv_add_excl_fence() for any write accesses against that buffer.
This replaces the exclusive fence with the new operation, ensuring
that all future access will be synchronised against this operation to
only begin after it has completed.

Rule 3: Drivers must synchronise all accesses to buffers against
existing implicit fences. Read accesses must synchronise against the
exclusive fence (read-after-write), and write accesses must
synchronise against both the exclusive (write-after-write) and shared
(write-after-read) fences.
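
As a minimal, purely illustrative sketch (foo_* is a made-up driver, the
dma_resv lock is assumed to be held), Rules 1 and 2 boil down to
something like this at a driver's submit path:

static int foo_publish_job_fence(struct dma_resv *resv,
				 struct dma_fence *job_done, bool write)
{
	int ret;

	if (write) {
		/* Rule 2: a write replaces the exclusive fence */
		dma_resv_add_excl_fence(resv, job_done);
		return 0;
	}

	/* Rule 1: a read appends to the shared fence array */
	ret = dma_resv_reserve_shared(resv, 1);
	if (ret)
		return ret;
	dma_resv_add_shared_fence(resv, job_done);
	return 0;
}

Rule 3 is the flip side at the same point: before the job runs, the
existing exclusive fence (and, for writes, the shared fences too) must
be collected as dependencies, e.g. with a helper like
drm_gem_fence_array_add_implicit().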

Note 1: Users like OpenGL and window systems on non-Android userspace
are generally implicitly synchronised. An implicitly-synchronised
userspace is unaware of fences from prior operations, so the kernel
mediates scheduling to create the illusion that GPU work is FIFO. For
example, an application will flush and schedule GPU write work to
render its image, then immediately tell the window system to display
that image; the window system may immediately flush and schedule GPU
read work to display that image, with neither waiting for the write to
have completed. The kernel provides coherence by synchronising the
read access against the write fence in the exclusive slot, so that the
image displayed is correct.

Note 2: Users like Vulkan and Android window system are generally
explicitly synchronised. An explicitly-synchronised userspace is
responsible for tracking its own read and write access and providing
the kernel with synchronisation barriers. For instance, a Vulkan
application rendering to a buffer and subsequently using it as a read
texture, must annotate the read operation with a read-after-write
synchronisation barrier.

Note 3: Implicit and explicit userspace can coexist. For instance, an
explicitly-synchronised Vulkan application may be running as a client
of an implicitly-synchronised window system which uses OpenGL for
composition; an implicitly-synchronised OpenGL application may be
running as a client of a window system which uses Vulkan for
composition.

Note 4: Some subsystems, for example V4L2, do not pipeline operations,
and instead only return to userspace when the scheduled work against a
buffer has fully retired.

Exemption 1: Fully self-coherent userspace may skip implicit
synchronisation barriers. For instance, accesses between two
Vulkan-internal buffers allocated by a single application do not need
to synchronise against each other's implicit fences, as the client is
responsible for explicitly providing barriers for access. A
self-contained OpenGL userspace also has no need to implicitly
synchronise its access if the driver instead tracks all access and
inserts the appropriate synchronisation barriers.

Exemption 2: When implicit and explicit userspace coexist, the
explicit side may skip intermediate synchronisation, and only place
synchronisation barriers at transition points. For example, a Vulkan
compositor displaying a buffer from an OpenGL application would need
to synchronise its first access against the fence placed in the
exclusive implicit-synchronisation slot. Once this read has fully
retired, the compositor has no need to participate in implicit
synchronisation until it is ready to return the buffer to the
application, at which point it must insert all its non-retired
accesses into the shared slot, which the application will then
synchronise future write accesses against.

Cheers,
Daniel

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [Mesa-dev] [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
@ 2021-06-24 11:08       ` Daniel Stone
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Stone @ 2021-06-24 11:08 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Rob Clark, Daniel Stone, Michel Dänzer,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Christian König,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, ML mesa-dev, Alex Deucher,
	Daniel Vetter, Sumit Semwal, Dennis Li, Deepak R Varma

Hi,

On Wed, 23 Jun 2021 at 17:20, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> +        *
> +        * IMPLICIT SYNCHRONIZATION RULES:
> +        *
> +        * Drivers which support implicit synchronization of buffer access as
> +        * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
> +        * below rules.

'Should' ... ? Must.

> +        * - Drivers should add a shared fence through
> +        *   dma_resv_add_shared_fence() for anything the userspace API
> +        *   considers a read access. This highly depends upon the API and
> +        *   window system: E.g. OpenGL is generally implicitly synchronized on
> +        *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
> +        *   generally explicitly synchronized for everything, and window system
> +        *   buffers have explicit API calls (which then need to make sure the
> +        *   implicit fences store here in @resv are updated correctly).
> +        *
> +        * - [...]

Mmm, I think this is all right, but it could be worded much more
clearly. Right now it's a bunch of points all smashed into one, and
there's a lot of room for misinterpretation.

Here's a strawman, starting with most basic and restrictive, working
through to when you're allowed to wriggle your way out:

Rule 1: Drivers must add a shared fence through
dma_resv_add_shared_fence() for any read accesses against that buffer.
This appends a fence to the shared array, ensuring that any future
non-read access will be synchronised against this operation to only
begin after it has completed.

Rule 2: Drivers must add an exclusive fence through
dma_resv_add_excl_fence() for any write accesses against that buffer.
This replaces the exclusive fence with the new operation, ensuring
that all future access will be synchronised against this operation to
only begin after it has completed.

Rule 3: Drivers must synchronise all accesses to buffers against
existing implicit fences. Read accesses must synchronise against the
exclusive fence (read-after-write), and write accesses must
synchronise against both the exclusive (write-after-write) and shared
(write-after-read) fences.

Note 1: Users like OpenGL and window systems on non-Android userspace
are generally implicitly synchronised. An implicitly-synchronised
userspace is unaware of fences from prior operations, so the kernel
mediates scheduling to create the illusion that GPU work is FIFO. For
example, an application will flush and schedule GPU write work to
render its image, then immediately tell the window system to display
that image; the window system may immediately flush and schedule GPU
read work to display that image, with neither waiting for the write to
have completed. The kernel provides coherence by synchronising the
read access against the write fence in the exclusive slot, so that the
image displayed is correct.

Note 2: Users like Vulkan and Android window system are generally
explicitly synchronised. An explicitly-synchronised userspace is
responsible for tracking its own read and write access and providing
the kernel with synchronisation barriers. For instance, a Vulkan
application rendering to a buffer and subsequently using it as a read
texture, must annotate the read operation with a read-after-write
synchronisation barrier.

Note 3: Implicit and explicit userspace can coexist. For instance, an
explicitly-synchronised Vulkan application may be running as a client
of an implicitly-synchronised window system which uses OpenGL for
composition; an implicitly-synchronised OpenGL application may be
running as a client of a window system which uses Vulkan for
composition.

Note 4: Some subsystems, for example V4L2, do not pipeline operations,
and instead only return to userspace when the scheduled work against a
buffer has fully retired.

Exemption 1: Fully self-coherent userspace may skip implicit
synchronisation barriers. For instance, accesses between two
Vulkan-internal buffers allocated by a single application do not need
to synchronise against each other's implicit fences, as the client is
responsible for explicitly providing barriers for access. A
self-contained OpenGL userspace also has no need to implicitly
synchronise its access if the driver instead tracks all access and
inserts the appropriate synchronisation barriers.

Exemption 2: When implicit and explicit userspace coexist, the
explicit side may skip intermediate synchronisation, and only place
synchronisation barriers at transition points. For example, a Vulkan
compositor displaying a buffer from an OpenGL application would need
to synchronise its first access against the fence placed in the
exclusive implicit-synchronisation slot. Once this read has fully
retired, the compositor has no need to participate in implicit
synchronisation until it is ready to return the buffer to the
application, at which point it must insert all its non-retired
accesses into the shared slot, which the application will then
synchronise future write accesses against.

Cheers,
Daniel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Mesa-dev] [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-24 11:08       ` [Intel-gfx] " Daniel Stone
@ 2021-06-24 11:23         ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 11:23 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Rob Clark, Daniel Stone, Michel Dänzer,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Christian König,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, ML mesa-dev, Alex Deucher,
	Daniel Vetter, Dennis Li, Deepak R Varma

On Thu, Jun 24, 2021 at 1:08 PM Daniel Stone <daniel@fooishbar.org> wrote:
>
> Hi,
>
> On Wed, 23 Jun 2021 at 17:20, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > +        *
> > +        * IMPLICIT SYNCHRONIZATION RULES:
> > +        *
> > +        * Drivers which support implicit synchronization of buffer access as
> > +        * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
> > +        * below rules.
>
> 'Should' ... ? Must.

Yeah  I guess I can upgrade a bunch of them.

> > +        * - Drivers should add a shared fence through
> > +        *   dma_resv_add_shared_fence() for anything the userspace API
> > +        *   considers a read access. This highly depends upon the API and
> > +        *   window system: E.g. OpenGL is generally implicitly synchronized on
> > +        *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
> > +        *   generally explicitly synchronized for everything, and window system
> > +        *   buffers have explicit API calls (which then need to make sure the
> > +        *   implicit fences store here in @resv are updated correctly).
> > +        *
> > +        * - [...]
>
> Mmm, I think this is all right, but it could be worded much more
> clearly. Right now it's a bunch of points all smashed into one, and
> there's a lot of room for misinterpretation.
>
> Here's a strawman, starting with most basic and restrictive, working
> through to when you're allowed to wriggle your way out:
>
> Rule 1: Drivers must add a shared fence through
> dma_resv_add_shared_fence() for any read accesses against that buffer.
> This appends a fence to the shared array, ensuring that any future
> non-read access will be synchronised against this operation to only
> begin after it has completed.
>
> Rule 2: Drivers must add an exclusive fence through
> dma_resv_add_excl_fence() for any write accesses against that buffer.
> This replaces the exclusive fence with the new operation, ensuring
> that all future access will be synchronised against this operation to
> only begin after it has completed.
>
> Rule 3: Drivers must synchronise all accesses to buffers against
> existing implicit fences. Read accesses must synchronise against the
> exclusive fence (read-after-write), and write accesses must
> synchronise against both the exclusive (write-after-write) and shared
> (write-after-read) fences.
>
> Note 1: Users like OpenGL and window systems on non-Android userspace
> are generally implicitly synchronised. An implicitly-synchronised
> userspace is unaware of fences from prior operations, so the kernel
> mediates scheduling to create the illusion that GPU work is FIFO. For
> example, an application will flush and schedule GPU write work to
> render its image, then immediately tell the window system to display
> that image; the window system may immediately flush and schedule GPU
> read work to display that image, with neither waiting for the write to
> have completed. The kernel provides coherence by synchronising the
> read access against the write fence in the exclusive slot, so that the
> image displayed is correct.
>
> Note 2: Users like Vulkan and Android window system are generally
> explicitly synchronised. An explicitly-synchronised userspace is
> responsible for tracking its own read and write access and providing
> the kernel with synchronisation barriers. For instance, a Vulkan
> application rendering to a buffer and subsequently using it as a read
> texture, must annotate the read operation with a read-after-write
> synchronisation barrier.
>
> Note 3: Implicit and explicit userspace can coexist. For instance, an
> explicitly-synchronised Vulkan application may be running as a client
> of an implicitly-synchronised window system which uses OpenGL for
> composition; an implicitly-synchronised OpenGL application may be
> running as a client of a window system which uses Vulkan for
> composition.
>
> Note 4: Some subsystems, for example V4L2, do not pipeline operations,
> and instead only return to userspace when the scheduled work against a
> buffer has fully retired.
>
> Exemption 1: Fully self-coherent userspace may skip implicit
> synchronisation barriers. For instance, accesses between two
> Vulkan-internal buffers allocated by a single application do not need
> to synchronise against each other's implicit fences, as the client is
> responsible for explicitly providing barriers for access. A
> self-contained OpenGL userspace also has no need to implicitly
> synchronise its access if the driver instead tracks all access and
> inserts the appropriate synchronisation barriers.
>
> Exemption 2: When implicit and explicit userspace coexist, the
> explicit side may skip intermediate synchronisation, and only place
> synchronisation barriers at transition points. For example, a Vulkan
> compositor displaying a buffer from an OpenGL application would need
> to synchronise its first access against the fence placed in the
> exclusive implicit-synchronisation slot. Once this read has fully
> retired, the compositor has no need to participate in implicit
> synchronisation until it is ready to return the buffer to the
> application, at which point it must insert all its non-retired
> accesses into the shared slot, which the application will then
> synchronise future write accesses against.

So I think this is excellent, but maybe better suited in the uapi
section as a sperate chapter? Essentially keep your rules in the
driver-internal docs, but move the Note/excemptions into the uapi
section under a "Implicit Sync Mode of Operation" or whatever heading?

The other thing to keep in mind is that this is very much incomplete:
I'm silent on what drivers should do exactly with these fences. That's
largely because I haven't fully completed that audit, and there's a
pile of bugs there still.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [Mesa-dev] [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
@ 2021-06-24 11:23         ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 11:23 UTC (permalink / raw)
  To: Daniel Stone
  Cc: Rob Clark, Daniel Stone, Michel Dänzer,
	Intel Graphics Development, Kevin Wang, DRI Development,
	Christian König,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Luben Tuikov,
	Kristian H . Kristensen, Chen Li, ML mesa-dev, Alex Deucher,
	Daniel Vetter, Sumit Semwal, Dennis Li, Deepak R Varma

On Thu, Jun 24, 2021 at 1:08 PM Daniel Stone <daniel@fooishbar.org> wrote:
>
> Hi,
>
> On Wed, 23 Jun 2021 at 17:20, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > +        *
> > +        * IMPLICIT SYNCHRONIZATION RULES:
> > +        *
> > +        * Drivers which support implicit synchronization of buffer access as
> > +        * e.g. exposed in `Implicit Fence Poll Support`_ should follow the
> > +        * below rules.
>
> 'Should' ... ? Must.

Yeah, I guess I can upgrade a bunch of them.

> > +        * - Drivers should add a shared fence through
> > +        *   dma_resv_add_shared_fence() for anything the userspace API
> > +        *   considers a read access. This highly depends upon the API and
> > +        *   window system: E.g. OpenGL is generally implicitly synchronized on
> > +        *   Linux, but explicitly synchronized on Android. Whereas Vulkan is
> > +        *   generally explicitly synchronized for everything, and window system
> > +        *   buffers have explicit API calls (which then need to make sure the
> > +        *   implicit fences store here in @resv are updated correctly).
> > +        *
> > +        * - [...]
>
> Mmm, I think this is all right, but it could be worded much more
> clearly. Right now it's a bunch of points all smashed into one, and
> there's a lot of room for misinterpretation.
>
> Here's a strawman, starting with the most basic and restrictive, working
> through to when you're allowed to wriggle your way out:
>
> Rule 1: Drivers must add a shared fence through
> dma_resv_add_shared_fence() for any read accesses against that buffer.
> This appends a fence to the shared array, ensuring that any future
> non-read access will be synchronised against this operation to only
> begin after it has completed.
>
> Rule 2: Drivers must add an exclusive fence through
> dma_resv_add_excl_fence() for any write accesses against that buffer.
> This replaces the exclusive fence with the new operation, ensuring
> that all future access will be synchronised against this operation to
> only begin after it has completed.
>
> Rule 3: Drivers must synchronise all accesses to buffers against
> existing implicit fences. Read accesses must synchronise against the
> exclusive fence (read-after-write), and write accesses must
> synchronise against both the exclusive (write-after-write) and shared
> (write-after-read) fences.
>
> Note 1: Users like OpenGL and window systems on non-Android userspace
> are generally implicitly synchronised. An implicitly-synchronised
> userspace is unaware of fences from prior operations, so the kernel
> mediates scheduling to create the illusion that GPU work is FIFO. For
> example, an application will flush and schedule GPU write work to
> render its image, then immediately tell the window system to display
> that image; the window system may immediately flush and schedule GPU
> read work to display that image, with neither waiting for the write to
> have completed. The kernel provides coherence by synchronising the
> read access against the write fence in the exclusive slot, so that the
> image displayed is correct.
>
> Note 2: Users like Vulkan and Android window system are generally
> explicitly synchronised. An explicitly-synchronised userspace is
> responsible for tracking its own read and write access and providing
> the kernel with synchronisation barriers. For instance, a Vulkan
> application rendering to a buffer and subsequently using it as a read
> texture, must annotate the read operation with a read-after-write
> synchronisation barrier.
>
> Note 3: Implicit and explicit userspace can coexist. For instance, an
> explicitly-synchronised Vulkan application may be running as a client
> of an implicitly-synchronised window system which uses OpenGL for
> composition; an implicitly-synchronised OpenGL application may be
> running as a client of a window system which uses Vulkan for
> composition.
>
> Note 4: Some subsystems, for example V4L2, do not pipeline operations,
> and instead only return to userspace when the scheduled work against a
> buffer has fully retired.
>
> Exemption 1: Fully self-coherent userspace may skip implicit
> synchronisation barriers. For instance, accesses between two
> Vulkan-internal buffers allocated by a single application do not need
> to synchronise against each other's implicit fences, as the client is
> responsible for explicitly providing barriers for access. A
> self-contained OpenGL userspace also has no need to implicitly
> synchronise its access if the driver instead tracks all access and
> inserts the appropriate synchronisation barriers.
>
> Exemption 2: When implicit and explicit userspace coexist, the
> explicit side may skip intermediate synchronisation, and only place
> synchronisation barriers at transition points. For example, a Vulkan
> compositor displaying a buffer from an OpenGL application would need
> to synchronise its first access against the fence placed in the
> exclusive implicit-synchronisation slot. Once this read has fully
> retired, the compositor has no need to participate in implicit
> synchronisation until it is ready to return the buffer to the
> application, at which point it must insert all its non-retired
> accesses into the shared slot, which the application will then
> synchronise future write accesses against.
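
Purely as an illustration of Rules 1-3 above in driver code, here is a
minimal sketch of a per-BO submit helper, loosely modelled on the
panfrost/v3d pattern; the foo_* naming, the @write flag and the xarray of
dependencies are all made up for this sketch and are not code from the
series:

#include <linux/dma-fence.h>
#include <linux/dma-resv.h>
#include <linux/xarray.h>
#include <drm/drm_gem.h>

/* Called with obj->resv locked; @write mirrors a hypothetical per-BO
 * write flag from the driver's CS/submit uapi. */
static int foo_job_attach_resv_fences(struct drm_gem_object *obj,
                                      struct xarray *deps,
                                      struct dma_fence *job_done,
                                      bool write)
{
        int ret;

        dma_resv_assert_held(obj->resv);

        /* Rule 3: collect the implicit fences this job must wait for. */
        ret = drm_gem_fence_array_add_implicit(deps, obj, write);
        if (ret)
                return ret;

        if (write) {
                /* Rule 2: a write replaces the exclusive fence. */
                dma_resv_add_excl_fence(obj->resv, job_done);
                return 0;
        }

        /* Rule 1: a read appends to the shared fences. */
        ret = dma_resv_reserve_shared(obj->resv, 1);
        if (ret)
                return ret;
        dma_resv_add_shared_fence(obj->resv, job_done);

        return 0;
}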

So I think this is excellent, but maybe better suited in the uapi
section as a separate chapter? Essentially keep your rules in the
driver-internal docs, but move the Notes/exemptions into the uapi
section under an "Implicit Sync Mode of Operation" or whatever heading?

The other thing to keep in mind is that this is very much incomplete:
I'm silent on what drivers should do exactly with these fences. That's
largely because I haven't fully completed that audit, and there's a
pile of bugs there still.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-23  8:42     ` [Intel-gfx] " Christian König
@ 2021-06-24 12:41       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 12:41 UTC (permalink / raw)
  To: Christian König
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Thomas Zimmermann, Daniel Vetter

On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> > Spotted while trying to convert panfrost to these.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >   drivers/gpu/drm/drm_gem.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > index ba2e64ed8b47..68deb1de8235 100644
> > --- a/drivers/gpu/drm/drm_gem.c
> > +++ b/drivers/gpu/drm/drm_gem.c
> > @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
> >    * @fence_array: array of dma_fence * for the job to block on.
> >    * @fence: the dma_fence to add to the list of dependencies.
> >    *
> > + * This functions consumes the reference for @fence both on success and error
> > + * cases.
> > + *
> 
> Oh, the latter is a bit ugly I think. But good to know.
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>

Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
look at the drm/armada patch too, then I think I have reviews/acks for all
of them?

Thanks, Daniel

> 
> >    * Returns:
> >    * 0 on success, or an error on failing to expand the array.
> >    */
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
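
For what the consume-on-error semantics documented above mean for callers,
a hypothetical caller looks like this (the foo_job struct with its xarray
of dependencies is made up, following the panfrost/v3d pattern):

static int foo_job_add_dep(struct foo_job *job, struct dma_fence *in_fence)
{
        /* drm_gem_fence_array_add() takes over the reference, on both the
         * success and the error path, so take one here and never put it. */
        return drm_gem_fence_array_add(&job->deps, dma_fence_get(in_fence));
}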

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
@ 2021-06-24 12:41       ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 12:41 UTC (permalink / raw)
  To: Christian König
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Maxime Ripard, Thomas Zimmermann, Daniel Vetter,
	Lucas Stach

On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> > Spotted while trying to convert panfrost to these.
> > 
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Cc: Lucas Stach <l.stach@pengutronix.de>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > ---
> >   drivers/gpu/drm/drm_gem.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > index ba2e64ed8b47..68deb1de8235 100644
> > --- a/drivers/gpu/drm/drm_gem.c
> > +++ b/drivers/gpu/drm/drm_gem.c
> > @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
> >    * @fence_array: array of dma_fence * for the job to block on.
> >    * @fence: the dma_fence to add to the list of dependencies.
> >    *
> > + * This functions consumes the reference for @fence both on success and error
> > + * cases.
> > + *
> 
> Oh, the latter is a bit ugly I think. But good to know.
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>

Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
look at the drm/armada patch too, then I think I have reviews/acks for all
of them?

Thanks, Daniel

> 
> >    * Returns:
> >    * 0 on success, or an error on failing to expand the array.
> >    */
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 09/15] drm/armada: Remove prepare/cleanup_fb hooks
  2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24 12:46     ` Maxime Ripard
  -1 siblings, 0 replies; 175+ messages in thread
From: Maxime Ripard @ 2021-06-24 12:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, Intel Graphics Development, Russell King, DRI Development

On Tue, Jun 22, 2021 at 06:55:05PM +0200, Daniel Vetter wrote:
> All they do is refcount the fb, which the atomic helpers already do.
> 
> This was necessary with the legacy helpers and I guess it just carried
> over in the conversion. drm_plane_state always has a full reference
> for its ->fb pointer during its entire lifetime,
> see __drm_atomic_helper_plane_destroy_state()
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Russell King <linux@armlinux.org.uk>

Acked-by: Maxime Ripard <maxime@cerno.tech>

Maxime

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 09/15] drm/armada: Remove prepare/cleanup_fb hooks
@ 2021-06-24 12:46     ` Maxime Ripard
  0 siblings, 0 replies; 175+ messages in thread
From: Maxime Ripard @ 2021-06-24 12:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, Intel Graphics Development, Russell King, DRI Development

On Tue, Jun 22, 2021 at 06:55:05PM +0200, Daniel Vetter wrote:
> All they do is refcount the fb, which the atomic helpers already do.
> 
> This was necessary with the legacy helpers and I guess it just carried
> over in the conversion. drm_plane_state always has a full reference
> for its ->fb pointer during its entire lifetime,
> see __drm_atomic_helper_plane_destroy_state()
> 
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Russell King <linux@armlinux.org.uk>

Acked-by: Maxime Ripard <maxime@cerno.tech>

Maxime

^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-23 16:19     ` [Intel-gfx] " Daniel Vetter
                       ` (2 preceding siblings ...)
  (?)
@ 2021-06-24 12:48     ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 12:48 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, Deepak R Varma, Dave Airlie, Daniel Vetter,
	Daniel Vetter, Michel Dänzer, Kevin Wang, linaro-mm-sig,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Alex Deucher,
	mesa-dev, Christian König, Dennis Li, Daniel Stone

Docs for struct dma_resv are fairly clear:

"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

Furthermore, here is a review across all of upstream.

First off, render drivers and how they set implicit fences:

- nouveau follows this contract, see in validate_fini_no_ticket()

			nouveau_bo_fence(nvbo, fence, !!b->write_domains);

  and that last boolean controls whether the exclusive or shared fence
  slot is used.

- radeon follows this contract by setting

		p->relocs[i].tv.num_shared = !r->write_domain;

  in radeon_cs_parser_relocs(), which ensures that the call to
  ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
  right thing.

- vmwgfx seems to follow this contract with the shotgun approach of
  always setting ttm_val_buf->num_shared = 0, which means
  ttm_eu_fence_buffer_objects() will only use the exclusive slot.

- etnaviv follows this contract, as can be trivially seen by looking
  at submit_attach_object_fences()

- i915 is a bit of a convoluted maze with multiple paths leading to
  i915_vma_move_to_active(). Which sets the exclusive flag if
  EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
  softpin mode, or through the write_domain when using relocations. It
  follows this contract.

- lima follows this contract, see lima_gem_submit() which sets the
  exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
  bo

- msm follows this contract, see msm_gpu_submit() which sets the
  exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer

- panfrost follows this contract with the shotgun approach of just
  always setting the exclusive fence, see
  panfrost_attach_object_fences(). Benefits of a single engine I guess

- v3d follows this contract with the same shotgun approach in
  v3d_attach_fences_and_unlock_reservation(), but it has at least an
  XXX comment that maybe this should be improved

- vc4 uses the same shotgun approach of always setting an exclusive
  fence, see vc4_update_bo_seqnos()

- vgem also follows this contract, see vgem_fence_attach_ioctl() and
  the VGEM_FENCE_WRITE. This is used in some igts to validate prime
  sharing with i915.ko without the need of a 2nd gpu

- virtio follows this contract again with the shotgun approach of
  always setting an exclusive fence, see virtio_gpu_array_add_fence()

This covers the setting of the exclusive fences when writing.

Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few:

- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
  implicit dependencies (which is used by vulkan)

- etnaviv does this. Implicit dependencies are collected in
  submit_fence_sync(), again with an opt-out flag
  ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
  etnaviv_sched_dependency which is the
  drm_sched_backend_ops->dependency callback.

- vc4 seems to not do much here, maybe gets away with it by not having
  a scheduler and only a single engine. Since all newer broadcom chips than
  the OG vc4 use v3d for rendering, which follows this contract, the
  impact of this issue is fairly small.

- v3d does this using the drm_gem_fence_array_add_implicit() helper,
  which its drm_sched_backend_ops->dependency callback
  v3d_job_dependency() then picks up.

- panfrost is nice here and tracks the implicit fences in
  panfrost_job->implicit_fences, which again the
  drm_sched_backend_ops->dependency callback panfrost_job_dependency()
  picks up. It is mildly questionable though since it only picks up
  exclusive fences in panfrost_acquire_object_fences(), but not buggy
  in practice because it also always sets the exclusive fence. It
  should pick up both sets of fences, just in case there's ever going
  to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
  pcie port and a real gpu, which might actually happen eventually. A
  bug, but easy to fix. Should probably use the
  drm_gem_fence_array_add_implicit() helper.

- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
  the same schema as v3d.

- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
  but because it doesn't use the drm/scheduler it handles fences from
  the wrong context with a synchronous dma_fence_wait. See
  submit_fence_sync() leading to msm_gem_sync_object(). Investing into
  a scheduler might be a good idea.

- all the remaining drivers are ttm based, where I hope they do
  appropriately obey implicit fences already. I didn't do the full
  audit there because a) not following the contract would confuse ttm
  quite well and b) reading non-standard scheduler and submit code
  which isn't based on drm/scheduler is a pain.

Onwards to the display side.

- Any driver using the drm_gem_plane_helper_prepare_fb() helper will do
  this correctly. Overwhelmingly most drivers get this right, except a
  few totally don't. I'll follow up with a patch to make this the default
  and avoid a bunch of bugs.

- I didn't audit the ttm drivers, but given that dma_resv started
  there I hope they get this right.

In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.

Amdgpu tried to fix this already in

commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Sep 19 16:54:35 2018 +0200

    drm/amdgpu: fix using shared fence for exported BOs v2

but this fix falls short on a number of areas:

- It's racy, by the time the buffer is shared it might be too late. To
  make sure there's definitely never a problem we need to set the
  fences correctly for any buffer that's potentially exportable.

- It's breaking uapi: dma-buf fds support poll() and differentiate
  between read and write access, which was introduced in

	commit 9b495a5887994a6d74d5c261d012083a92b94738
	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
	Date:   Tue Jul 1 12:57:43 2014 +0200

	    dma-buf: add poll support, v3

- Christian König wants to nack new uapi building further on this
  dma_resv contract because it breaks amdgpu, quoting

  "Yeah, and that is exactly the reason why I will NAK this uAPI change.

  "This doesn't works for amdgpu at all for the reasons outlined above."

  https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/

  Rejecting new development because your own driver is broken and
  violates established cross driver contracts and uapi is really not
  how upstream works.

Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit of a plan:

- amdgpu needs a proper uapi for handling implicit fencing. The funny
  thing is that to do it correctly, implicit fencing must be treated
  as a very strange IPC mechanism for transporting fences, where both
  setting the fence and dependency intercepts must be handled
  explicitly. Current best practice is a per-bo flag to indicate
  writes, and a per-bo flag to skip implicit fencing in the CS
  ioctl as a new chunk.

- Since amdgpu has been shipping with broken behaviour we need an
  opt-out flag from the butchered implicit fencing model to enable the
  proper explicit implicit fencing model.

- for kernel memory fences due to bo moves at least the i915 idea is
  to use ttm_bo->moving. amdgpu probably needs the same.

- since the current p2p dma-buf interface assumes the kernel memory
  fence is in the exclusive dma_resv fence slot we need to add a new
  fence slot for kernel fences, which must never be ignored. Since
  currently only amdgpu supports this there's no real problem here
  yet, until amdgpu gains a NO_IMPLICIT CS flag.

- New userspace needs to ship in enough desktop distros so that users
  won't notice the perf impact. I think we can ignore LTS distros who
  upgrade their kernels but not their mesa3d snapshot.

- Then when this is all in place we can merge this patch here.

What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.

Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.

v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.

This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and has for a while. The
only thing that's now missing is
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.

Jason has a patch series to fill that gap with a bunch of generic
ioctl on the dma-buf fd:

https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/

v3: Since Christian has fixed amdgpu now in

commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Jun 9 13:51:36 2021 +0200

    drm/amdgpu: rework dma_resv handling v3

Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.

Since dynamic importers have different rules also hammer these in
again while we're at it.

v4:
- Add the missing "through the device" in the dynamic section that I
  overlooked.
- Fix a kerneldoc markup mistake, the link didn't connect

v5:
- A few s/should/must/ to make clear what must be done (if the driver
  does implicit sync) and what's more a maybe (Daniel Stone)
- drop all the example api discussion, that needs to be expanded,
  clarified and put into a new chapter in drm-uapi.rst (Daniel Stone)

Cc: Daniel Stone <daniel@fooishbar.org>
Acked-by: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Dave Airlie <airlied@redhat.com> (v4)
Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/dma-buf.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 81cebf414505..2b814fde0d11 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -386,6 +386,40 @@ struct dma_buf {
 	 * @resv:
 	 *
 	 * Reservation object linked to this dma-buf.
+	 *
+	 * IMPLICIT SYNCHRONIZATION RULES:
+	 *
+	 * Drivers which support implicit synchronization of buffer access as
+	 * e.g. exposed in `Implicit Fence Poll Support`_ must follow the
+	 * below rules.
+	 *
+	 * - Drivers must add a shared fence through dma_resv_add_shared_fence()
+	 *   for anything the userspace API considers a read access. This highly
+	 *   depends upon the API and window system.
+	 *
+	 * - Similarly drivers must set the exclusive fence through
+	 *   dma_resv_add_excl_fence() for anything the userspace API considers
+	 *   write access.
+	 *
+	 * - Drivers may just always set the exclusive fence, since that only
+	 *   causes unecessarily synchronization, but no correctness issues.
+	 *
+	 * - Some drivers only expose a synchronous userspace API with no
+	 *   pipelining across drivers. These do not set any fences for their
+	 *   access. An example here is v4l.
+	 *
+	 * DYNAMIC IMPORTER RULES:
+	 *
+	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
+	 * additional constraints on how they set up fences:
+	 *
+	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 *   signal before allowing access to the buffer's underlying storage
+	 *   through the device.
+	 *
+	 * - Dynamic importers should set fences for any access that they can't
+	 *   disable immediately from their &dma_buf_attach_ops.move_notify
+	 *   callback.
 	 */
 	struct dma_resv *resv;
 
-- 
2.32.0.rc2
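
As an aside, a sketch of what the first dynamic-importer rule above boils
down to on the importer side; the foo_* name is hypothetical, and the wait
helper is the pre-rename name in use around this period (it later became
dma_resv_wait_timeout()):

#include <linux/dma-buf.h>
#include <linux/dma-resv.h>
#include <linux/sched.h>

/* Hypothetical: called before the importing device starts using the
 * mapping set up through @attach. */
static int foo_importer_begin_device_access(struct dma_buf_attachment *attach)
{
        long ret;

        /* wait_all == false waits on the exclusive fence only */
        ret = dma_resv_wait_timeout_rcu(attach->dmabuf->resv, false, true,
                                        MAX_SCHEDULE_TIMEOUT);

        return ret < 0 ? ret : 0;
}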


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-24 12:41       ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24 12:48         ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-24 12:48 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Thomas Zimmermann, Daniel Vetter



Am 24.06.21 um 14:41 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
>> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
>>> Spotted while trying to convert panfrost to these.
>>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Maxime Ripard <mripard@kernel.org>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: David Airlie <airlied@linux.ie>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> ---
>>>    drivers/gpu/drm/drm_gem.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>>> index ba2e64ed8b47..68deb1de8235 100644
>>> --- a/drivers/gpu/drm/drm_gem.c
>>> +++ b/drivers/gpu/drm/drm_gem.c
>>> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>>>     * @fence_array: array of dma_fence * for the job to block on.
>>>     * @fence: the dma_fence to add to the list of dependencies.
>>>     *
>>> + * This functions consumes the reference for @fence both on success and error
>>> + * cases.
>>> + *
> > Oh, the latter is a bit ugly I think. But good to know.
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
> Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
> look at the drm/armada patch too, then I think I have reviews/acks for all
> of them?

What are you talking about? I only see drm/armada patches for the irq 
stuff Thomas is working on.

Christian.

>
> Thanks, Daniel
>
>>>     * Returns:
>>>     * 0 on success, or an error on failing to expand the array.
>>>     */


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
@ 2021-06-24 12:48         ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-24 12:48 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Maxime Ripard, Thomas Zimmermann, Daniel Vetter,
	Lucas Stach



Am 24.06.21 um 14:41 schrieb Daniel Vetter:
> On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
>> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
>>> Spotted while trying to convert panfrost to these.
>>>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Maxime Ripard <mripard@kernel.org>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: David Airlie <airlied@linux.ie>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> ---
>>>    drivers/gpu/drm/drm_gem.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>>> index ba2e64ed8b47..68deb1de8235 100644
>>> --- a/drivers/gpu/drm/drm_gem.c
>>> +++ b/drivers/gpu/drm/drm_gem.c
>>> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>>>     * @fence_array: array of dma_fence * for the job to block on.
>>>     * @fence: the dma_fence to add to the list of dependencies.
>>>     *
>>> + * This functions consumes the reference for @fence both on success and error
>>> + * cases.
>>> + *
> > Oh, the latter is a bit ugly I think. But good to know.
>>
>> Reviewed-by: Christian König <christian.koenig@amd.com>
> Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
> look at the drm/armada patch too, then I think I have reviews/acks for all
> of them?

What are you talking about? I only see drm/armada patches for the irq 
stuff Thomas is working on.

Christian.

>
> Thanks, Daniel
>
>>>     * Returns:
>>>     * 0 on success, or an error on failing to expand the array.
>>>     */


^ permalink raw reply	[flat|nested] 175+ messages in thread

* [PATCH] dma-buf: Document dma-buf implicit fencing/resv fencing rules
  2021-06-23 16:19     ` [Intel-gfx] " Daniel Vetter
                       ` (3 preceding siblings ...)
  (?)
@ 2021-06-24 12:52     ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 12:52 UTC (permalink / raw)
  To: DRI Development
  Cc: Rob Clark, Deepak R Varma, Dave Airlie, Daniel Vetter,
	Daniel Vetter, Michel Dänzer, Kevin Wang, linaro-mm-sig,
	Luben Tuikov, Kristian H . Kristensen, Chen Li, Alex Deucher,
	mesa-dev, Christian König, Dennis Li, Daniel Stone

Docs for struct dma_resv are fairly clear:

"A reservation object can have attached one exclusive fence (normally
associated with write operations) or N shared fences (read
operations)."

https://dri.freedesktop.org/docs/drm/driver-api/dma-buf.html#reservation-objects

Furthermore, here is a review across all of upstream.

First off, render drivers and how they set implicit fences:

- nouveau follows this contract, see in validate_fini_no_ticket()

			nouveau_bo_fence(nvbo, fence, !!b->write_domains);

  and that last boolean controls whether the exclusive or shared fence
  slot is used.

- radeon follows this contract by setting

		p->relocs[i].tv.num_shared = !r->write_domain;

  in radeon_cs_parser_relocs(), which ensures that the call to
  ttm_eu_fence_buffer_objects() in radeon_cs_parser_fini() will do the
  right thing.

- vmwgfx seems to follow this contract with the shotgun approach of
  always setting ttm_val_buf->num_shared = 0, which means
  ttm_eu_fence_buffer_objects() will only use the exclusive slot.

- etnaviv follows this contract, as can be trivially seen by looking
  at submit_attach_object_fences()

- i915 is a bit of a convoluted maze with multiple paths leading to
  i915_vma_move_to_active(). Which sets the exclusive flag if
  EXEC_OBJECT_WRITE is set. This can either come as a buffer flag for
  softpin mode, or through the write_domain when using relocations. It
  follows this contract.

- lima follows this contract, see lima_gem_submit() which sets the
  exclusive fence when the LIMA_SUBMIT_BO_WRITE flag is set for that
  bo

- msm follows this contract, see msm_gpu_submit() which sets the
  exclusive flag when the MSM_SUBMIT_BO_WRITE is set for that buffer

- panfrost follows this contract with the shotgun approach of just
  always setting the exclusive fence, see
  panfrost_attach_object_fences(). Benefits of a single engine I guess

- v3d follows this contract with the same shotgun approach in
  v3d_attach_fences_and_unlock_reservation(), but it has at least an
  XXX comment that maybe this should be improved

- vc4 uses the same shotgun approach of always setting an exclusive
  fence, see vc4_update_bo_seqnos()

- vgem also follows this contract, see vgem_fence_attach_ioctl() and
  the VGEM_FENCE_WRITE. This is used in some igts to validate prime
  sharing with i915.ko without the need of a 2nd gpu

- virtio follows this contract again with the shotgun approach of
  always setting an exclusive fence, see virtio_gpu_array_add_fence()

This covers the setting of the exclusive fences when writing.

Synchronizing against the exclusive fence is a lot more tricky, and I
only spot checked a few:

- i915 does it, with the optional EXEC_OBJECT_ASYNC to skip all
  implicit dependencies (which is used by vulkan)

- etnaviv does this. Implicit dependencies are collected in
  submit_fence_sync(), again with an opt-out flag
  ETNA_SUBMIT_NO_IMPLICIT. These are then picked up in
  etnaviv_sched_dependency which is the
  drm_sched_backend_ops->dependency callback.

- vc4 seems to not do much here, maybe gets away with it by not having
  a scheduler and only a single engine. Since all newer broadcom chips than
  the OG vc4 use v3d for rendering, which follows this contract, the
  impact of this issue is fairly small.

- v3d does this using the drm_gem_fence_array_add_implicit() helper,
  which its drm_sched_backend_ops->dependency callback
  v3d_job_dependency() then picks up.

- panfrost is nice here and tracks the implicit fences in
  panfrost_job->implicit_fences, which again the
  drm_sched_backend_ops->dependency callback panfrost_job_dependency()
  picks up. It is mildly questionable though since it only picks up
  exclusive fences in panfrost_acquire_object_fences(), but not buggy
  in practice because it also always sets the exclusive fence. It
  should pick up both sets of fences, just in case there's ever going
  to be a 2nd gpu in a SoC with a mali gpu. Or maybe a mali SoC with a
  pcie port and a real gpu, which might actually happen eventually. A
  bug, but easy to fix. Should probably use the
  drm_gem_fence_array_add_implicit() helper.

- lima is nice and easy, uses drm_gem_fence_array_add_implicit() and
  the same schema as v3d.

- msm is mildly entertaining. It also supports MSM_SUBMIT_NO_IMPLICIT,
  but because it doesn't use the drm/scheduler it handles fences from
  the wrong context with a synchronous dma_fence_wait. See
  submit_fence_sync() leading to msm_gem_sync_object(). Investing into
  a scheduler might be a good idea.

- all the remaining drivers are ttm based, where I hope they do
  appropriately obey implicit fences already. I didn't do the full
  audit there because a) not following the contract would confuse ttm
  quite well and b) reading non-standard scheduler and submit code
  which isn't based on drm/scheduler is a pain.

Onwards to the display side.

- Any driver using the drm_gem_plane_helper_prepare_fb() helper will do
  this correctly. Overwhelmingly most drivers get this right, except a
  few totally don't. I'll follow up with a patch to make this the default
  and avoid a bunch of bugs.

- I didn't audit the ttm drivers, but given that dma_resv started
  there I hope they get this right.

In conclusion this IS the contract, both as documented and
overwhelmingly implemented, specifically as implemented by all render
drivers except amdgpu.

Amdgpu tried to fix this already in

commit 049aca4363d8af87cab8d53de5401602db3b9999
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Sep 19 16:54:35 2018 +0200

    drm/amdgpu: fix using shared fence for exported BOs v2

but this fix falls short on a number of areas:

- It's racy, by the time the buffer is shared it might be too late. To
  make sure there's definitely never a problem we need to set the
  fences correctly for any buffer that's potentially exportable.

- It's breaking uapi: dma-buf fds support poll() and differentiate
  between read and write access, which was introduced in

	commit 9b495a5887994a6d74d5c261d012083a92b94738
	Author: Maarten Lankhorst <maarten.lankhorst@canonical.com>
	Date:   Tue Jul 1 12:57:43 2014 +0200

	    dma-buf: add poll support, v3

- Christian König wants to nack new uapi building further on this
  dma_resv contract because it breaks amdgpu, quoting

  "Yeah, and that is exactly the reason why I will NAK this uAPI change.

  "This doesn't works for amdgpu at all for the reasons outlined above."

  https://lore.kernel.org/dri-devel/f2eb6751-2f82-9b23-f57e-548de5b729de@gmail.com/

  Rejecting new development because your own driver is broken and
  violates established cross driver contracts and uapi is really not
  how upstream works.

Now this patch will have a severe performance impact on anything that
runs on multiple engines. So we can't just merge it outright, but need
a bit of a plan:

- amdgpu needs a proper uapi for handling implicit fencing. The funny
  thing is that to do it correctly, implicit fencing must be treated
  as a very strange IPC mechanism for transporting fences, where both
  setting the fence and dependency intercepts must be handled
  explicitly. Current best practice is a per-bo flag to indicate
  writes, and a per-bo flag to skip implicit fencing in the CS
  ioctl as a new chunk.

- Since amdgpu has been shipping with broken behaviour we need an
  opt-out flag from the butchered implicit fencing model to enable the
  proper explicit implicit fencing model.

- for kernel memory fences due to bo moves at least the i915 idea is
  to use ttm_bo->moving. amdgpu probably needs the same.

- since the current p2p dma-buf interface assumes the kernel memory
  fence is in the exclusive dma_resv fence slot we need to add a new
  fence slot for kernel fences, which must never be ignored. Since
  currently only amdgpu supports this there's no real problem here
  yet, until amdgpu gains a NO_IMPLICIT CS flag.

- New userspace needs to ship in enough desktop distros so that users
  won't notice the perf impact. I think we can ignore LTS distros who
  upgrade their kernels but not their mesa3d snapshot.

- Then when this is all in place we can merge this patch here.

What is not a solution to this problem here is trying to make the
dma_resv rules in the kernel more clever. The fundamental issue here
is that the amdgpu CS uapi is the least expressive one across all
drivers (only equalled by panfrost, which has an actual excuse) by not
allowing any userspace control over how implicit sync is conducted.

Until this is fixed it's completely pointless to make the kernel more
clever to improve amdgpu, because all we're doing is papering over
this uapi design issue. amdgpu needs to attain the status quo
established by other drivers first, once that's achieved we can tackle
the remaining issues in a consistent way across drivers.

v2: Bas pointed me at AMDGPU_GEM_CREATE_EXPLICIT_SYNC, which I
entirely missed.

This is great because it means the amdgpu specific piece for proper
implicit fence handling exists already, and has for a while. The
only thing that's now missing is
- fishing the implicit fences out of a shared object at the right time
- setting the exclusive implicit fence slot at the right time.

Jason has a patch series to fill that gap with a bunch of generic
ioctl on the dma-buf fd:

https://lore.kernel.org/dri-devel/20210520190007.534046-1-jason@jlekstrand.net/

v3: Since Christian has fixed amdgpu now in

commit 8c505bdc9c8b955223b054e34a0be9c3d841cd20 (drm-misc/drm-misc-next)
Author: Christian König <christian.koenig@amd.com>
Date:   Wed Jun 9 13:51:36 2021 +0200

    drm/amdgpu: rework dma_resv handling v3

Use the audit covered in this commit message as the excuse to update
the dma-buf docs around dma_buf.resv usage across drivers.

Since dynamic importers have different rules also hammer these in
again while we're at it.

v4:
- Add the missing "through the device" in the dynamic section that I
  overlooked.
- Fix a kerneldoc markup mistake, the link didn't connect

v5:
- A few s/should/must/ to make clear what must be done (if the driver
  does implicit sync) and what's more a maybe (Daniel Stone)
- drop all the example api discussion, that needs to be expanded,
  clarified and put into a new chapter in drm-uapi.rst (Daniel Stone)

Cc: Daniel Stone <daniel@fooishbar.org>
Acked-by: Daniel Stone <daniel@fooishbar.org>
Reviewed-by: Dave Airlie <airlied@redhat.com> (v4)
Reviewed-by: Christian König <christian.koenig@amd.com> (v3)
Cc: mesa-dev@lists.freedesktop.org
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Kristian H. Kristensen <hoegsberg@google.com>
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Daniel Stone <daniels@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Deepak R Varma <mh12gx2825@gmail.com>
Cc: Chen Li <chenli@uniontech.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: Luben Tuikov <luben.tuikov@amd.com>
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
---
 include/linux/dma-buf.h | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 81cebf414505..2b814fde0d11 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -386,6 +386,40 @@ struct dma_buf {
 	 * @resv:
 	 *
 	 * Reservation object linked to this dma-buf.
+	 *
+	 * IMPLICIT SYNCHRONIZATION RULES:
+	 *
+	 * Drivers which support implicit synchronization of buffer access as
+	 * e.g. exposed in `Implicit Fence Poll Support`_ must follow the
+	 * below rules.
+	 *
+	 * - Drivers must add a shared fence through dma_resv_add_shared_fence()
+	 *   for anything the userspace API considers a read access. This highly
+	 *   depends upon the API and window system.
+	 *
+	 * - Similarly drivers must set the exclusive fence through
+	 *   dma_resv_add_excl_fence() for anything the userspace API considers
+	 *   write access.
+	 *
+	 * - Drivers may just always set the exclusive fence, since that only
+	 *   causes unecessarily synchronization, but no correctness issues.
+	 *
+	 * - Some drivers only expose a synchronous userspace API with no
+	 *   pipelining across drivers. These do not set any fences for their
+	 *   access. An example here is v4l.
+	 *
+	 * DYNAMIC IMPORTER RULES:
+	 *
+	 * Dynamic importers, see dma_buf_attachment_is_dynamic(), have
+	 * additional constraints on how they set up fences:
+	 *
+	 * - Dynamic importers must obey the exclusive fence and wait for it to
+	 *   signal before allowing access to the buffer's underlying storage
+	 *   through the device.
+	 *
+	 * - Dynamic importers should set fences for any access that they can't
+	 *   disable immediately from their &dma_buf_attach_ops.move_notify
+	 *   callback.
 	 */
 	struct dma_resv *resv;
 
-- 
2.32.0.rc2


^ permalink raw reply related	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-24 12:48         ` [Intel-gfx] " Christian König
@ 2021-06-24 13:32           ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 13:32 UTC (permalink / raw)
  To: Christian König
  Cc: Thomas Zimmermann, David Airlie, Daniel Vetter,
	Intel Graphics Development, DRI Development, Daniel Vetter

On Thu, Jun 24, 2021 at 02:48:54PM +0200, Christian König wrote:
> 
> 
> Am 24.06.21 um 14:41 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
> > > Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> > > > Spotted while trying to convert panfrost to these.
> > > > 
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Cc: Lucas Stach <l.stach@pengutronix.de>
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > Cc: Maxime Ripard <mripard@kernel.org>
> > > > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Cc: David Airlie <airlied@linux.ie>
> > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > ---
> > > >    drivers/gpu/drm/drm_gem.c | 3 +++
> > > >    1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > > > index ba2e64ed8b47..68deb1de8235 100644
> > > > --- a/drivers/gpu/drm/drm_gem.c
> > > > +++ b/drivers/gpu/drm/drm_gem.c
> > > > @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
> > > >     * @fence_array: array of dma_fence * for the job to block on.
> > > >     * @fence: the dma_fence to add to the list of dependencies.
> > > >     *
> > > > + * This functions consumes the reference for @fence both on success and error
> > > > + * cases.
> > > > + *
> > > Oh, the latter is a bit ugly I think. But good to know.
> > > 
> > > Reviewed-by: Christian König <christian.koenig@amd.com>
> > Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
> > look at the drm/armada patch too, then I think I have reviews/acks for all
> > of them?
> 
> What are you talking about? I only see drm/armada patches for the irq stuff
> Thomas is working on.

There was one in this series, but Maxime was quicker. I'm going to apply
all the remaining ones now. After that I'll send out a patch set to add
some dependency tracking to drm_sched_job so that there's not so much
copypasta going on there. I stumbled over that when reviewing how we
handle dependencies.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
@ 2021-06-24 13:32           ` Daniel Vetter
  0 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 13:32 UTC (permalink / raw)
  To: Christian König
  Cc: Thomas Zimmermann, David Airlie, Daniel Vetter,
	Intel Graphics Development, DRI Development, Maxime Ripard,
	Daniel Vetter, Lucas Stach

On Thu, Jun 24, 2021 at 02:48:54PM +0200, Christian König wrote:
> 
> 
> Am 24.06.21 um 14:41 schrieb Daniel Vetter:
> > On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
> > > Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> > > > Spotted while trying to convert panfrost to these.
> > > > 
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Cc: Lucas Stach <l.stach@pengutronix.de>
> > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > Cc: Maxime Ripard <mripard@kernel.org>
> > > > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Cc: David Airlie <airlied@linux.ie>
> > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > ---
> > > >    drivers/gpu/drm/drm_gem.c | 3 +++
> > > >    1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > > > index ba2e64ed8b47..68deb1de8235 100644
> > > > --- a/drivers/gpu/drm/drm_gem.c
> > > > +++ b/drivers/gpu/drm/drm_gem.c
> > > > @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
> > > >     * @fence_array: array of dma_fence * for the job to block on.
> > > >     * @fence: the dma_fence to add to the list of dependencies.
> > > >     *
> > > > + * This functions consumes the reference for @fence both on success and error
> > > > + * cases.
> > > > + *
> > > Oh, the latter is a bit ugly I think. But good to know.
> > > 
> > > Reviewed-by: Christian König <christian.koenig@amd.com>
> > Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
> > look at the drm/armada patch too, then I think I have reviews/acks for all
> > of them?
> 
> What are you talking about? I only see drm/armada patches for the irq stuff
> Thomas is working on.

There was one in this series, but Maxime was quicker. I'm going to apply
all the remaining ones now. After that I'll send out a patch set to add
some dependency tracking to drm_sched_job so that there's not so much
copypasta going on there. I stumbled over that when reviewing how we
handle dependencies.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-24 13:32           ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24 13:35             ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-24 13:35 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Thomas Zimmermann, Daniel Vetter

Am 24.06.21 um 15:32 schrieb Daniel Vetter:
> On Thu, Jun 24, 2021 at 02:48:54PM +0200, Christian König wrote:
>>
>> Am 24.06.21 um 14:41 schrieb Daniel Vetter:
>>> On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
>>>> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
>>>>> Spotted while trying to convert panfrost to these.
>>>>>
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>> Cc: Maxime Ripard <mripard@kernel.org>
>>>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>> ---
>>>>>     drivers/gpu/drm/drm_gem.c | 3 +++
>>>>>     1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>>>>> index ba2e64ed8b47..68deb1de8235 100644
>>>>> --- a/drivers/gpu/drm/drm_gem.c
>>>>> +++ b/drivers/gpu/drm/drm_gem.c
>>>>> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>>>>>      * @fence_array: array of dma_fence * for the job to block on.
>>>>>      * @fence: the dma_fence to add to the list of dependencies.
>>>>>      *
>>>>> + * This functions consumes the reference for @fence both on success and error
>>>>> + * cases.
>>>>> + *
>>>> Oh, the latter is a bit ugly I think. But good to know.
>>>>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
>>> look at the drm/armada patch too, then I think I have reviews/acks for all
>>> of them?
>> What are you talking about? I only see drm/armada patches for the irq stuff
>> Thomas is working on.
> There was one in this series, but Maxime was quicker. I'm going to apply
> all the remaining ones now. After that I'll send out a patch set to add
> some dependency tracking to drm_sched_job so that there's not so much
> copypasta going on there. I stumbled over that when reviewing how we
> handle dependencies.

Do you mean a common container for dma_fence objects a drm_sched_job 
depends on?

Thanks,
Christian.

> -Daniel
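
A hypothetical sketch of what such a common container could look like: an
xarray of dma_fence pointers in the job plus the matching ->dependency
callback, which is essentially what v3d/panfrost/lima hand-roll today (the
foo_* names are made up):

#include <linux/dma-fence.h>
#include <linux/xarray.h>
#include <drm/gpu_scheduler.h>

struct foo_job {
        struct drm_sched_job base;
        struct xarray deps;     /* struct dma_fence *, allocated 0..n */
        unsigned long last_dep;
};

static struct dma_fence *foo_job_dependency(struct drm_sched_job *sched_job,
                                            struct drm_sched_entity *s_entity)
{
        struct foo_job *job = container_of(sched_job, struct foo_job, base);

        /* hand the scheduler one dependency fence at a time */
        if (!xa_empty(&job->deps))
                return xa_erase(&job->deps, job->last_dep++);

        return NULL;
}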


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [Intel-gfx] [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
@ 2021-06-24 13:35             ` Christian König
  0 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-24 13:35 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Maxime Ripard, Thomas Zimmermann, Daniel Vetter,
	Lucas Stach

Am 24.06.21 um 15:32 schrieb Daniel Vetter:
> On Thu, Jun 24, 2021 at 02:48:54PM +0200, Christian König wrote:
>>
>> Am 24.06.21 um 14:41 schrieb Daniel Vetter:
>>> On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
>>>> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
>>>>> Spotted while trying to convert panfrost to these.
>>>>>
>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>> Cc: Maxime Ripard <mripard@kernel.org>
>>>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>> ---
>>>>>     drivers/gpu/drm/drm_gem.c | 3 +++
>>>>>     1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>>>>> index ba2e64ed8b47..68deb1de8235 100644
>>>>> --- a/drivers/gpu/drm/drm_gem.c
>>>>> +++ b/drivers/gpu/drm/drm_gem.c
>>>>> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>>>>>      * @fence_array: array of dma_fence * for the job to block on.
>>>>>      * @fence: the dma_fence to add to the list of dependencies.
>>>>>      *
>>>>> + * This functions consumes the reference for @fence both on success and error
>>>>> + * cases.
>>>>> + *
>>>> Oh, the latter is a bit ugly I think. But good to know.
>>>>
>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>> Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
>>> look at the drm/armada patch too, then I think I have reviews/acks for all
>>> of them?
>> What are you talking about? I only see drm/armada patches for the irq stuff
>> Thomas is working on.
> There was one in this series, but Maxime was quicker. I'm going to apply
> all the remaining ones now. After that I'll send out a patch set to add
> some dependency tracking to drm_sched_job so that there's not so much
> copypasta going on there. I stumbled over that when reviewing how we
> handle dependencies.

Do you mean a common container for dma_fence objects a drm_sched_job 
depends on?

Thanks,
Christian.

> -Daniel


^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS
  2021-06-24  7:46     ` [Intel-gfx] " Thomas Zimmermann
@ 2021-06-24 13:39       ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 13:39 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Daniel Vetter, David Airlie, Daniel Vetter,
	Intel Graphics Development, DRI Development, Hans de Goede,
	Laurent Pinchart, Dave Airlie, Tian Tao

On Thu, Jun 24, 2021 at 09:46:20AM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> > Like we have for the shadow helpers too, and roll it out to drivers.
> 
> In addition to the plane-helper macro, you may also want to add
> DRM_GEM_VRAM_SIMPLE_DISPLAY_PIPE_FUNCS and use it in bochs.

Hm, I guess we can do that when we have a second such case. In this series
I was aiming more at making it as frictionless as possible for drivers to
end up with a prepare_fb implementation that fishes out the implicit
fences as needed.
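
If we do grow a second user, I'd expect it to look pretty much like the
plane-helper macro. Untested sketch, assuming the existing
drm_gem_vram_simple_display_pipe_* helpers and not part of this series:

/*
 * Untested sketch only: initializes struct drm_simple_display_pipe_funcs
 * with the VRAM helpers, the same way DRM_GEM_VRAM_PLANE_HELPER_FUNCS in
 * this patch does for struct drm_plane_helper_funcs.
 */
#define DRM_GEM_VRAM_SIMPLE_DISPLAY_PIPE_FUNCS \
	.prepare_fb = drm_gem_vram_simple_display_pipe_prepare_fb, \
	.cleanup_fb = drm_gem_vram_simple_display_pipe_cleanup_fb

bochs could then drop that macro into its struct
drm_simple_display_pipe_funcs initializer instead of spelling out the two
hooks by hand.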

Thanks for looking at this patch, I'm merging them all to drm-misc-next
now.
-Daniel

> 
> Best regards
> Thomas
> 
> > 
> > Acked-by: Tian Tao <tiantao6@hisilicon.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Dave Airlie <airlied@redhat.com>
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: Hans de Goede <hdegoede@redhat.com>
> > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > Cc: Maxime Ripard <mripard@kernel.org>
> > Cc: David Airlie <airlied@linux.ie>
> > Cc: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Tian Tao <tiantao6@hisilicon.com>
> > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > ---
> >   drivers/gpu/drm/ast/ast_mode.c                 |  3 +--
> >   drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c |  3 +--
> >   drivers/gpu/drm/vboxvideo/vbox_mode.c          |  3 +--
> >   include/drm/drm_gem_vram_helper.h              | 12 ++++++++++++
> >   4 files changed, 15 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
> > index e5996ae03c49..f5d58c3088fe 100644
> > --- a/drivers/gpu/drm/ast/ast_mode.c
> > +++ b/drivers/gpu/drm/ast/ast_mode.c
> > @@ -612,8 +612,7 @@ ast_primary_plane_helper_atomic_disable(struct drm_plane *plane,
> >   }
> >   static const struct drm_plane_helper_funcs ast_primary_plane_helper_funcs = {
> > -	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb,
> > -	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb,
> > +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
> >   	.atomic_check = ast_primary_plane_helper_atomic_check,
> >   	.atomic_update = ast_primary_plane_helper_atomic_update,
> >   	.atomic_disable = ast_primary_plane_helper_atomic_disable,
> > diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> > index 29b8332b2bca..ccf80e369b4b 100644
> > --- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> > +++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
> > @@ -158,8 +158,7 @@ static const struct drm_plane_funcs hibmc_plane_funcs = {
> >   };
> >   static const struct drm_plane_helper_funcs hibmc_plane_helper_funcs = {
> > -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> > -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> > +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
> >   	.atomic_check = hibmc_plane_atomic_check,
> >   	.atomic_update = hibmc_plane_atomic_update,
> >   };
> > diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> > index 964381d55fc1..972c83b720aa 100644
> > --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
> > +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
> > @@ -488,8 +488,7 @@ static const struct drm_plane_helper_funcs vbox_primary_helper_funcs = {
> >   	.atomic_check = vbox_primary_atomic_check,
> >   	.atomic_update = vbox_primary_atomic_update,
> >   	.atomic_disable = vbox_primary_atomic_disable,
> > -	.prepare_fb	= drm_gem_vram_plane_helper_prepare_fb,
> > -	.cleanup_fb	= drm_gem_vram_plane_helper_cleanup_fb,
> > +	DRM_GEM_VRAM_PLANE_HELPER_FUNCS,
> >   };
> >   static const struct drm_plane_funcs vbox_primary_plane_funcs = {
> > diff --git a/include/drm/drm_gem_vram_helper.h b/include/drm/drm_gem_vram_helper.h
> > index 27ed7e9243b9..f48d181c824b 100644
> > --- a/include/drm/drm_gem_vram_helper.h
> > +++ b/include/drm/drm_gem_vram_helper.h
> > @@ -124,6 +124,18 @@ void
> >   drm_gem_vram_plane_helper_cleanup_fb(struct drm_plane *plane,
> >   				     struct drm_plane_state *old_state);
> > +/**
> > + * DRM_GEM_VRAM_PLANE_HELPER_FUNCS -
> > + *	Initializes struct drm_plane_helper_funcs for VRAM handling
> > + *
> > + * Drivers may use GEM BOs as VRAM helpers for the framebuffer memory. This
> > + * macro initializes struct drm_plane_helper_funcs to use the respective helper
> > + * functions.
> > + */
> > +#define DRM_GEM_VRAM_PLANE_HELPER_FUNCS \
> > +	.prepare_fb = drm_gem_vram_plane_helper_prepare_fb, \
> > +	.cleanup_fb = drm_gem_vram_plane_helper_cleanup_fb
> > +
> >   /*
> >    * Helpers for struct drm_simple_display_pipe_funcs
> >    */
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-24 13:35             ` [Intel-gfx] " Christian König
@ 2021-06-24 13:41               ` Daniel Vetter
  -1 siblings, 0 replies; 175+ messages in thread
From: Daniel Vetter @ 2021-06-24 13:41 UTC (permalink / raw)
  To: Christian König
  Cc: Thomas Zimmermann, David Airlie, Daniel Vetter,
	Intel Graphics Development, DRI Development, Daniel Vetter

On Thu, Jun 24, 2021 at 03:35:19PM +0200, Christian König wrote:
> Am 24.06.21 um 15:32 schrieb Daniel Vetter:
> > On Thu, Jun 24, 2021 at 02:48:54PM +0200, Christian König wrote:
> > > 
> > > Am 24.06.21 um 14:41 schrieb Daniel Vetter:
> > > > On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
> > > > > Am 22.06.21 um 18:55 schrieb Daniel Vetter:
> > > > > > Spotted while trying to convert panfrost to these.
> > > > > > 
> > > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > > > Cc: Lucas Stach <l.stach@pengutronix.de>
> > > > > > Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> > > > > > Cc: Maxime Ripard <mripard@kernel.org>
> > > > > > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > > > > > Cc: David Airlie <airlied@linux.ie>
> > > > > > Cc: Daniel Vetter <daniel@ffwll.ch>
> > > > > > ---
> > > > > >     drivers/gpu/drm/drm_gem.c | 3 +++
> > > > > >     1 file changed, 3 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > > > > > index ba2e64ed8b47..68deb1de8235 100644
> > > > > > --- a/drivers/gpu/drm/drm_gem.c
> > > > > > +++ b/drivers/gpu/drm/drm_gem.c
> > > > > > @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
> > > > > >      * @fence_array: array of dma_fence * for the job to block on.
> > > > > >      * @fence: the dma_fence to add to the list of dependencies.
> > > > > >      *
> > > > > > + * This functions consumes the reference for @fence both on success and error
> > > > > > + * cases.
> > > > > > + *
> > > > > Oh, the later is a bit ugly I think. But good to know.
> > > > > 
> > > > > Reviewed-by: Christian König <christian.koenig@amd.com>
> > > > Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
> > > > look at the drm/armada patch too, then I think I have reviews/acks for all
> > > > of them?
> > > What are you talking about? I only see drm/armada patches for the irq stuff
> > > Thomas is working on.
> > There was one in this series, but Maxime was quicker. I'm going to apply
> > all the remaining ones now. After that I'll send out a patch set to add
> > some dependency tracking to drm_sched_job so that there's not so much
> > copypasta going on there. I stumbled over that when reviewing how we
> > handle dependencies.
> 
> Do you mean a common container for dma_fence objects a drm_sched_job depends
> on?

Yup. Well the real usefulness is the interfaces, so that you can just grep
for those when trying to figure out htf a driver handles its implicit
dependencies. And amdgpu is unfortunately going to be a bit in the cold
because it's special (but should be able to benefit too, just more than
1-2 patches to convert it over).
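
To make the direction a bit more concrete, here's an entirely tentative
sketch (made-up name, and it assumes drm_sched_job grows an xarray
dependencies member along the lines of what the panfrost patches in this
series do):

#include <drm/gpu_scheduler.h>
#include <linux/dma-fence.h>
#include <linux/xarray.h>

/* Sketch only: one grep-able entry point for job dependencies. */
int drm_sched_job_await_fence(struct drm_sched_job *job,
			      struct dma_fence *fence)
{
	u32 id = 0;
	int ret;

	if (!fence)
		return 0;

	/* Consume the fence reference on success and error, matching the
	 * drm_gem_fence_array_add() semantics documented in patch 14. */
	ret = xa_alloc(&job->dependencies, &id, fence,
		       xa_limit_32b, GFP_KERNEL);
	if (ret)
		dma_fence_put(fence);

	return ret;
}

Plus some drm_sched_job_await_implicit() variant that pulls the fences out
of obj->resv, so the implicit-sync handling becomes trivially greppable
per driver.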

Anyway I'm going to type the cover letter rsn.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 175+ messages in thread

* Re: [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add
  2021-06-24 13:41               ` [Intel-gfx] " Daniel Vetter
@ 2021-06-24 13:45                 ` Christian König
  -1 siblings, 0 replies; 175+ messages in thread
From: Christian König @ 2021-06-24 13:45 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: David Airlie, Daniel Vetter, Intel Graphics Development,
	DRI Development, Thomas Zimmermann, Daniel Vetter

Am 24.06.21 um 15:41 schrieb Daniel Vetter:
> On Thu, Jun 24, 2021 at 03:35:19PM +0200, Christian König wrote:
>> Am 24.06.21 um 15:32 schrieb Daniel Vetter:
>>> On Thu, Jun 24, 2021 at 02:48:54PM +0200, Christian König wrote:
>>>> Am 24.06.21 um 14:41 schrieb Daniel Vetter:
>>>>> On Wed, Jun 23, 2021 at 10:42:50AM +0200, Christian König wrote:
>>>>>> Am 22.06.21 um 18:55 schrieb Daniel Vetter:
>>>>>>> Spotted while trying to convert panfrost to these.
>>>>>>>
>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>>>>>> Cc: "Christian König" <christian.koenig@amd.com>
>>>>>>> Cc: Lucas Stach <l.stach@pengutronix.de>
>>>>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>>>>> Cc: Maxime Ripard <mripard@kernel.org>
>>>>>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>>>>>> Cc: David Airlie <airlied@linux.ie>
>>>>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/drm_gem.c | 3 +++
>>>>>>>      1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
>>>>>>> index ba2e64ed8b47..68deb1de8235 100644
>>>>>>> --- a/drivers/gpu/drm/drm_gem.c
>>>>>>> +++ b/drivers/gpu/drm/drm_gem.c
>>>>>>> @@ -1302,6 +1302,9 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations);
>>>>>>>       * @fence_array: array of dma_fence * for the job to block on.
>>>>>>>       * @fence: the dma_fence to add to the list of dependencies.
>>>>>>>       *
>>>>>>> + * This functions consumes the reference for @fence both on success and error
>>>>>>> + * cases.
>>>>>>> + *
>>>>>> Oh, the later is a bit ugly I think. But good to know.
>>>>>>
>>>>>> Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>> Merged to drm-misc-next, thanks for taking a look. Can you perhaps take a
>>>>> look at the drm/armada patch too, then I think I have reviews/acks for all
>>>>> of them?
>>>> What are you talking about? I only see drm/armada patches for the irq stuff
>>>> Thomas is working on.
>>> There was one in this series, but Maxime was quicker. I'm going to apply
>>> all the remaining ones now. After that I'll send out a patch set to add
>>> some dependency tracking to drm_sched_job so that there's not so much
>>> copypasta going on there. I stumbled over that when reviewing how we
>>> handle dependencies.
>> Do you mean a common container for dma_fence objects a drm_sched_job depends
>> on?
> Yup. Well the real usefulness is the interfaces, so that you can just grep
> for those when trying to figure out htf a driver handles its implicit
> dependencies. And amdgpu is unfortunately going to be a bit in the cold
> because it's special (but should be able to benefit too, just more than
> 1-2 patches to convert it over).

I had that on the TODO list for quite a while as well.

Essentially it would mean extracting what the dma_resv_list object does 
into a new object (but maybe without the RCU).
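
Just to sketch the rough shape (completely made-up names, untested):

#include <linux/dma-fence.h>
#include <linux/overflow.h>
#include <linux/slab.h>

/* Sketch only: a dma_resv_list-like fence container, minus the RCU. */
struct dma_fence_list {
	u32 num_fences;
	u32 max_fences;
	struct dma_fence *fences[];
};

static int dma_fence_list_add(struct dma_fence_list **list,
			      struct dma_fence *fence)
{
	struct dma_fence_list *l = *list;

	if (!l || l->num_fences == l->max_fences) {
		u32 max = l ? l->max_fences * 2 : 4;
		bool first = !l;

		l = krealloc(l, struct_size(l, fences, max), GFP_KERNEL);
		if (!l)
			return -ENOMEM;
		if (first)
			l->num_fences = 0;
		l->max_fences = max;
		*list = l;
	}

	/* Grabs its own reference; callers keep theirs. */
	l->fences[l->num_fences++] = dma_fence_get(fence);
	return 0;
}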

Christian.

>
> Anyway I'm going to type the cover letter rsn.
> -Daniel


^ permalink raw reply	[flat|nested] 175+ messages in thread

end of thread, other threads:[~2021-06-24 13:45 UTC | newest]

Thread overview: 175+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-22 16:54 [PATCH 00/15] implicit fencing/dma-resv rules for shared buffers Daniel Vetter
2021-06-22 16:54 ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:54 ` [PATCH 01/15] dma-resv: Fix kerneldoc Daniel Vetter
2021-06-22 16:54   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:54   ` Daniel Vetter
2021-06-22 18:19   ` Alex Deucher
2021-06-22 18:19     ` [Intel-gfx] " Alex Deucher
2021-06-22 18:19     ` Alex Deucher
2021-06-22 18:49   ` Sam Ravnborg
2021-06-22 18:49     ` [Intel-gfx] " Sam Ravnborg
2021-06-22 19:19     ` Daniel Vetter
2021-06-22 19:19       ` [Intel-gfx] " Daniel Vetter
2021-06-22 19:19       ` Daniel Vetter
2021-06-23  8:31   ` Christian König
2021-06-23  8:31     ` [Intel-gfx] " Christian König
2021-06-23  8:31     ` Christian König
2021-06-23 15:15     ` Daniel Vetter
2021-06-23 15:15       ` [Intel-gfx] " Daniel Vetter
2021-06-23 15:15       ` Daniel Vetter
2021-06-22 16:54 ` [PATCH 02/15] dma-buf: Switch to inline kerneldoc Daniel Vetter
2021-06-22 16:54   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:54   ` Daniel Vetter
2021-06-22 18:24   ` Alex Deucher
2021-06-22 18:24     ` [Intel-gfx] " Alex Deucher
2021-06-22 18:24     ` Alex Deucher
2021-06-22 19:01   ` Sam Ravnborg
2021-06-22 19:01     ` [Intel-gfx] " Sam Ravnborg
2021-06-22 19:21     ` Daniel Vetter
2021-06-22 19:21       ` [Intel-gfx] " Daniel Vetter
2021-06-22 19:21       ` Daniel Vetter
2021-06-23  8:32   ` Christian König
2021-06-23  8:32     ` [Intel-gfx] " Christian König
2021-06-23  8:32     ` Christian König
2021-06-23 16:17   ` [PATCH] " Daniel Vetter
2021-06-23 16:17     ` [Intel-gfx] " Daniel Vetter
2021-06-23 16:17     ` Daniel Vetter
2021-06-23 17:33     ` Sam Ravnborg
2021-06-23 17:33       ` [Intel-gfx] " Sam Ravnborg
2021-06-22 16:54 ` [PATCH 03/15] dma-buf: Document dma-buf implicit fencing/resv fencing rules Daniel Vetter
2021-06-22 16:54   ` [Intel-gfx] " Daniel Vetter
2021-06-23  8:41   ` Christian König
2021-06-23  8:41     ` [Intel-gfx] " Christian König
2021-06-23 16:19   ` [PATCH] " Daniel Vetter
2021-06-23 16:19     ` [Intel-gfx] " Daniel Vetter
2021-06-24  6:59     ` Dave Airlie
2021-06-24  6:59       ` [Intel-gfx] " Dave Airlie
2021-06-24 11:08     ` [Mesa-dev] " Daniel Stone
2021-06-24 11:08       ` [Intel-gfx] " Daniel Stone
2021-06-24 11:23       ` Daniel Vetter
2021-06-24 11:23         ` [Intel-gfx] " Daniel Vetter
2021-06-24 12:48     ` Daniel Vetter
2021-06-24 12:52     ` Daniel Vetter
2021-06-22 16:55 ` [PATCH 04/15] drm/panfrost: Shrink sched_lock Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-23 16:52   ` Boris Brezillon
2021-06-23 16:52     ` [Intel-gfx] " Boris Brezillon
2021-06-22 16:55 ` [PATCH 05/15] drm/panfrost: Use xarray and helpers for depedency tracking Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-23 16:51   ` Boris Brezillon
2021-06-23 16:51     ` [Intel-gfx] " Boris Brezillon
2021-06-23 16:51     ` Boris Brezillon
2021-06-22 16:55 ` [PATCH 06/15] drm/panfrost: Fix implicit sync Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-23 16:47   ` Boris Brezillon
2021-06-23 16:47     ` [Intel-gfx] " Boris Brezillon
2021-06-23 16:47     ` Boris Brezillon
2021-06-23 19:17     ` Daniel Vetter
2021-06-23 19:17       ` [Intel-gfx] " Daniel Vetter
2021-06-23 19:17       ` Daniel Vetter
2021-06-22 16:55 ` [PATCH 07/15] drm/atomic-helper: make drm_gem_plane_helper_prepare_fb the default Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 19:10   ` Sam Ravnborg
2021-06-22 19:10     ` [Intel-gfx] " Sam Ravnborg
2021-06-22 20:20     ` Daniel Vetter
2021-06-22 20:20       ` [Intel-gfx] " Daniel Vetter
2021-06-23 15:39       ` Sam Ravnborg
2021-06-23 15:39         ` [Intel-gfx] " Sam Ravnborg
2021-06-23 16:22   ` [PATCH] " Daniel Vetter
2021-06-23 16:22     ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55 ` [PATCH 08/15] drm/<driver>: drm_gem_plane_helper_prepare_fb is now " Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-24  8:32   ` Philipp Zabel
2021-06-24  8:32     ` Philipp Zabel
2021-06-24  8:32     ` [Intel-gfx] " Philipp Zabel
2021-06-24  8:32     ` Philipp Zabel
2021-06-24  8:32     ` Philipp Zabel
2021-06-24  8:32     ` Philipp Zabel
2021-06-24  8:32     ` Philipp Zabel
2021-06-24  8:32     ` Philipp Zabel
2021-06-22 16:55 ` [PATCH 09/15] drm/armada: Remove prepare/cleanup_fb hooks Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-24 12:46   ` Maxime Ripard
2021-06-24 12:46     ` [Intel-gfx] " Maxime Ripard
2021-06-22 16:55 ` [PATCH 10/15] drm/vram-helpers: Create DRM_GEM_VRAM_PLANE_HELPER_FUNCS Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-24  7:38   ` Thomas Zimmermann
2021-06-24  7:38     ` [Intel-gfx] " Thomas Zimmermann
2021-06-24  7:46   ` Thomas Zimmermann
2021-06-24  7:46     ` [Intel-gfx] " Thomas Zimmermann
2021-06-24 13:39     ` Daniel Vetter
2021-06-24 13:39       ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55 ` [PATCH 11/15] drm/omap: Follow implicit fencing in prepare_fb Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55 ` [PATCH 12/15] drm/simple-helper: drm_gem_simple_display_pipe_prepare_fb as default Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 19:15   ` Sam Ravnborg
2021-06-22 19:15     ` [Intel-gfx] " Sam Ravnborg
2021-06-23 16:24   ` [PATCH] " Daniel Vetter
2021-06-23 16:24     ` [Intel-gfx] " Daniel Vetter
2021-06-23 17:34     ` Sam Ravnborg
2021-06-23 17:34       ` [Intel-gfx] " Sam Ravnborg
2021-06-22 16:55 ` [PATCH 13/15] drm/tiny: drm_gem_simple_display_pipe_prepare_fb is the default Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 16:55   ` Daniel Vetter
2021-06-22 16:55 ` [PATCH 14/15] drm/gem: Tiny kernel clarification for drm_gem_fence_array_add Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-23  8:42   ` Christian König
2021-06-23  8:42     ` [Intel-gfx] " Christian König
2021-06-24 12:41     ` Daniel Vetter
2021-06-24 12:41       ` [Intel-gfx] " Daniel Vetter
2021-06-24 12:48       ` Christian König
2021-06-24 12:48         ` [Intel-gfx] " Christian König
2021-06-24 13:32         ` Daniel Vetter
2021-06-24 13:32           ` [Intel-gfx] " Daniel Vetter
2021-06-24 13:35           ` Christian König
2021-06-24 13:35             ` [Intel-gfx] " Christian König
2021-06-24 13:41             ` Daniel Vetter
2021-06-24 13:41               ` [Intel-gfx] " Daniel Vetter
2021-06-24 13:45               ` Christian König
2021-06-24 13:45                 ` [Intel-gfx] " Christian König
2021-06-22 16:55 ` [PATCH 15/15] RFC: drm/amdgpu: Implement a proper implicit fencing uapi Daniel Vetter
2021-06-22 16:55   ` [Intel-gfx] " Daniel Vetter
2021-06-22 23:56   ` kernel test robot
2021-06-23  9:45   ` Bas Nieuwenhuizen
2021-06-23  9:45     ` [Intel-gfx] " Bas Nieuwenhuizen
2021-06-23 12:18     ` Daniel Vetter
2021-06-23 12:18       ` [Intel-gfx] " Daniel Vetter
2021-06-23 12:59       ` Christian König
2021-06-23 12:59         ` [Intel-gfx] " Christian König
2021-06-23 13:38         ` Bas Nieuwenhuizen
2021-06-23 13:38           ` [Intel-gfx] " Bas Nieuwenhuizen
2021-06-23 13:44           ` Christian König
2021-06-23 13:44             ` [Intel-gfx] " Christian König
2021-06-23 13:49             ` Daniel Vetter
2021-06-23 13:49               ` [Intel-gfx] " Daniel Vetter
2021-06-23 14:02               ` Christian König
2021-06-23 14:02                 ` [Intel-gfx] " Christian König
2021-06-23 14:50                 ` Daniel Vetter
2021-06-23 14:50                   ` [Intel-gfx] " Daniel Vetter
2021-06-23 14:58                   ` Bas Nieuwenhuizen
2021-06-23 14:58                     ` [Intel-gfx] " Bas Nieuwenhuizen
2021-06-23 15:03                     ` Daniel Vetter
2021-06-23 15:03                       ` [Intel-gfx] " Daniel Vetter
2021-06-23 15:07                       ` Christian König
2021-06-23 15:07                         ` [Intel-gfx] " Christian König
2021-06-23 15:12                         ` Daniel Vetter
2021-06-23 15:12                           ` [Intel-gfx] " Daniel Vetter
2021-06-23 15:15                           ` Christian König
2021-06-23 15:15                             ` [Intel-gfx] " Christian König
2021-06-22 17:08 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for implicit fencing/dma-resv rules for shared buffers Patchwork
2021-06-22 17:11 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-06-22 17:38 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-06-22 19:12 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-06-23 17:05 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for implicit fencing/dma-resv rules for shared buffers (rev5) Patchwork
2021-06-23 17:07 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-06-23 17:35 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-06-23 21:04 ` [Intel-gfx] ✓ Fi.CI.IGT: " Patchwork
