All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events
@ 2024-02-16 15:09 Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event Pierre-Eric Pelloux-Prayer
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

This series adds new events to make it easier for tools
like gpuvis or umr to graph the GPUs, kernel and applications
activity.

UMR patches using these events can be found here:
https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37

V1:
https://patchwork.kernel.org/project/linux-media/patch/20240117184329.479554-1-pierre-eric.pelloux-prayer@amd.com/

Changes from V1:
* uses trace_dma_fence_sync_to from dma-fence-chain.c
* new amdgpu events
* new drm plane commit event

Changes from V2:
* uses trace_dma_fence_used_as_dependency from drm_sched_job_add_dependency
* add devname attribute to the trace_amdgpu_sched_run_job event
* addressed review comments

Pierre-Eric Pelloux-Prayer (8):
  tracing, dma-buf: add a trace_dma_fence_sync_to event
  dma-buf/fence-chain: use trace_dma_fence_sync_to
  amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync
  drm/amdgpu: add a amdgpu_bo_fill trace event
  drm/amdgpu: add a amdgpu_cs_start trace event
  drm: add drm_mode_atomic_commit event
  drm/sched: use trace_dma_fence_used_as_dependency
  drm/amdgpu: add devname to trace_amdgpu_sched_run_job

 drivers/dma-buf/dma-fence-chain.c         |  4 +++
 drivers/dma-buf/dma-fence.c               |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c    |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c  |  9 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h  |  4 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 42 ++++++++++++++++++++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  2 ++
 drivers/gpu/drm/drm_atomic_uapi.c         | 21 ++++++++++++
 drivers/gpu/drm/drm_trace.h               | 23 +++++++++++++
 drivers/gpu/drm/scheduler/sched_main.c    |  4 +++
 include/trace/events/dma_fence.h          | 27 +++++++++++++++
 12 files changed, 133 insertions(+), 8 deletions(-)

-- 
2.40.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:28   ` Christian König
  2024-02-16 15:09 ` [PATCH v3 2/8] dma-buf/fence-chain: use trace_dma_fence_sync_to Pierre-Eric Pelloux-Prayer
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

This new event can be used to trace where a given dma_fence is added
as a dependency of some other work.

I plan to use it in amdgpu.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/dma-buf/dma-fence.c      |  1 +
 include/trace/events/dma_fence.h | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0393a9bba3a8..e7276c043984 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -23,6 +23,7 @@
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
 EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
+EXPORT_TRACEPOINT_SYMBOL(dma_fence_used_as_dependency);
 
 static DEFINE_SPINLOCK(dma_fence_stub_lock);
 static struct dma_fence dma_fence_stub;
diff --git a/include/trace/events/dma_fence.h b/include/trace/events/dma_fence.h
index 3963e79ca7b4..5a5d272031ce 100644
--- a/include/trace/events/dma_fence.h
+++ b/include/trace/events/dma_fence.h
@@ -83,6 +83,33 @@ DEFINE_EVENT(dma_fence, dma_fence_wait_end,
 	TP_ARGS(fence)
 );
 
+TRACE_EVENT(dma_fence_used_as_dependency,
+
+	TP_PROTO(struct dma_fence *fence, const char *reason),
+
+	TP_ARGS(fence, reason),
+
+	TP_STRUCT__entry(
+		__string(driver, fence->ops->get_driver_name(fence))
+		__string(timeline, fence->ops->get_timeline_name(fence))
+		__field(unsigned int, context)
+		__field(unsigned int, seqno)
+		__string(reason, reason)
+	),
+
+	TP_fast_assign(
+		__assign_str(driver, fence->ops->get_driver_name(fence));
+		__assign_str(timeline, fence->ops->get_timeline_name(fence));
+		__entry->context = fence->context;
+		__entry->seqno = fence->seqno;
+		__assign_str(reason, reason);
+	),
+
+	TP_printk("driver=%s timeline=%s context=%u seqno=%u reason=%s",
+		  __get_str(driver), __get_str(timeline), __entry->context,
+		  __entry->seqno, __get_str(reason))
+);
+
 #endif /*  _TRACE_DMA_FENCE_H */
 
 /* This part must be outside protection */
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 2/8] dma-buf/fence-chain: use trace_dma_fence_sync_to
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:30   ` Christian König
  2024-02-16 15:09 ` [PATCH v3 3/8] amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync Pierre-Eric Pelloux-Prayer
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

To inform tools about the relationship between the fences.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/dma-buf/dma-fence-chain.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
index 9663ba1bb6ac..3435078c45b7 100644
--- a/drivers/dma-buf/dma-fence-chain.c
+++ b/drivers/dma-buf/dma-fence-chain.c
@@ -9,6 +9,8 @@
 
 #include <linux/dma-fence-chain.h>
 
+#include "trace/events/dma_fence.h"
+
 static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
 
 /**
@@ -251,6 +253,8 @@ void dma_fence_chain_init(struct dma_fence_chain *chain,
 	chain->fence = fence;
 	chain->prev_seqno = 0;
 
+	trace_dma_fence_used_as_dependency(fence, __func__);
+
 	/* Try to reuse the context of the previous chain node. */
 	if (prev_chain && __dma_fence_is_later(seqno, prev->seqno, prev->ops)) {
 		context = prev->context;
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 3/8] amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 2/8] dma-buf/fence-chain: use trace_dma_fence_sync_to Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:56   ` Christian König
  2024-02-16 15:09 ` [PATCH v3 4/8] drm/amdgpu: add a amdgpu_bo_fill trace event Pierre-Eric Pelloux-Prayer
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

This makes it possible to understand the dependencies between jobs.
Possible usage of this trace:
* stuttering issues like Mesa !9189
* incorrect synchronization: I don't have a link for this one, but having
  these events was very useful to debug a virtio-gpu / native-context /
  radeonsi sync issue

I have prototype code using this in UMR, as can be see here:
   https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37

v2: add a macro since every caller passes __func__ as the reason parameter

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 9 +++++++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h | 4 +++-
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index 1b013a44ca99..9a3fdc4be51e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -30,6 +30,7 @@
  */
 
 #include <linux/dma-fence-chain.h>
+#include <trace/events/dma_fence.h>
 
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
@@ -145,14 +146,16 @@ static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f)
 }
 
 /**
- * amdgpu_sync_fence - remember to sync to this fence
+ * amdgpu_sync_fence_with_reason - remember to sync to this fence
  *
  * @sync: sync object to add fence to
  * @f: fence to sync to
+ * @reason: why do we sync to this fence
  *
  * Add the fence to the sync object.
  */
-int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
+int amdgpu_sync_fence_with_reason(struct amdgpu_sync *sync, struct dma_fence *f,
+				  const char *reason)
 {
 	struct amdgpu_sync_entry *e;
 
@@ -166,6 +169,8 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
 	if (!e)
 		return -ENOMEM;
 
+	trace_dma_fence_used_as_dependency(f, reason);
+
 	hash_add(sync->fences, &e->node, f->context);
 	e->fence = dma_fence_get(f);
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
index cf1e9e858efd..52e7306801de 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
@@ -47,7 +47,9 @@ struct amdgpu_sync {
 };
 
 void amdgpu_sync_create(struct amdgpu_sync *sync);
-int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f);
+int amdgpu_sync_fence_with_reason(struct amdgpu_sync *sync, struct dma_fence *f,
+				  const char *reason);
+#define amdgpu_sync_fence(s, f) amdgpu_sync_fence_with_reason(s, f, __func__)
 int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 		     struct dma_resv *resv, enum amdgpu_sync_mode mode,
 		     void *owner);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 4/8] drm/amdgpu: add a amdgpu_bo_fill trace event
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
                   ` (2 preceding siblings ...)
  2024-02-16 15:09 ` [PATCH v3 3/8] amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 5/8] drm/amdgpu: add a amdgpu_cs_start " Pierre-Eric Pelloux-Prayer
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

Useful to identify why sdma jobs are submitted.

v2: moved from amdgpu_bo_create to amdgpu_bo_fill

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 18 ++++++++++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  2 ++
 2 files changed, 20 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index f539b1d00234..0e47cbe7e0a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -514,6 +514,24 @@ TRACE_EVENT(amdgpu_bo_move,
 			__entry->new_placement, __entry->bo_size)
 );
 
+TRACE_EVENT(amdgpu_bo_fill,
+	    TP_PROTO(struct amdgpu_bo *bo, uint32_t value),
+	    TP_ARGS(bo, value),
+	    TP_STRUCT__entry(
+			__field(struct amdgpu_bo *, bo)
+			__field(u64, bo_size)
+			__field(u32, value)
+			),
+
+	    TP_fast_assign(
+			__entry->bo      = bo;
+			__entry->bo_size = amdgpu_bo_size(bo);
+			__entry->value   = value;
+			),
+	    TP_printk("bo=%p, size=%lld, value=0x%08x",
+			__entry->bo, __entry->bo_size, __entry->value)
+);
+
 TRACE_EVENT(amdgpu_ib_pipe_sync,
 	    TP_PROTO(struct amdgpu_job *sched_job, struct dma_fence *fence),
 	    TP_ARGS(sched_job, fence),
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 8722beba494e..7d0b2c1191f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -2231,6 +2231,8 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
 		return -EINVAL;
 	}
 
+	trace_amdgpu_bo_fill(bo, src_data);
+
 	amdgpu_res_first(bo->tbo.resource, 0, amdgpu_bo_size(bo), &dst);
 
 	mutex_lock(&adev->mman.gtt_window_lock);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 5/8] drm/amdgpu: add a amdgpu_cs_start trace event
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
                   ` (3 preceding siblings ...)
  2024-02-16 15:09 ` [PATCH v3 4/8] drm/amdgpu: add a amdgpu_bo_fill trace event Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 6/8] drm: add drm_mode_atomic_commit event Pierre-Eric Pelloux-Prayer
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

amdgpu_cs_ioctl already exists but serves a different
purpose.

amdgpu_cs_start marks the beginning of the kernel processing of
the ioctl which is useful for tools to map which events belong to
the same submission (without this, the first event would be the
amdgpu_bo_set_list ones).

v2: renamed to amdgpu_cs_start

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c    |  2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 0a4b09709cfb..f3369cd0d9a3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1402,6 +1402,8 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void *data, struct drm_file *filp)
 		return r;
 	}
 
+	trace_amdgpu_cs_start(data);
+
 	r = amdgpu_cs_pass1(&parser, data);
 	if (r)
 		goto error_fini;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 0e47cbe7e0a9..3f18f570e5ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -189,6 +189,18 @@ TRACE_EVENT(amdgpu_cs_ioctl,
 		      __entry->seqno, __get_str(ring), __entry->num_ibs)
 );
 
+TRACE_EVENT(amdgpu_cs_start,
+	    TP_PROTO(union drm_amdgpu_cs *cs),
+	    TP_ARGS(cs),
+	    TP_STRUCT__entry(
+			     __field(uint32_t, ctx_id)
+	    ),
+	    TP_fast_assign(
+			   __entry->ctx_id = cs->in.ctx_id;
+	    ),
+	    TP_printk("context=%u", __entry->ctx_id)
+);
+
 TRACE_EVENT(amdgpu_sched_run_job,
 	    TP_PROTO(struct amdgpu_job *job),
 	    TP_ARGS(job),
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 6/8] drm: add drm_mode_atomic_commit event
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
                   ` (4 preceding siblings ...)
  2024-02-16 15:09 ` [PATCH v3 5/8] drm/amdgpu: add a amdgpu_cs_start " Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:59   ` Steven Rostedt
  2024-02-16 16:24   ` Ville Syrjälä
  2024-02-16 15:09 ` [PATCH v3 7/8] drm/sched: use trace_dma_fence_used_as_dependency Pierre-Eric Pelloux-Prayer
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

With this and the dma_fence_used_as_dependency event, a tool can draw the
relationship between the compositing draw, the atomic commit, and vblank.

An example on a 2 monitors system look like this:

gnome-shell-1638    [018] .....  2571.905124: drm_mode_atomic_commit: file=00000000245c3f0c, pid=    1165, flags=00000201, crtcs={0x1}
gnome-shell-1638    [018] .....  2571.905147: dma_fence_used_as_dependency: driver=drm_sched timeline=gfx_0.0.0 context=270 seqno=73240 reason=dma_fence_chain_init
gnome-shell-1638    [018] .....  2571.913226: drm_mode_atomic_commit: file=00000000245c3f0c, pid=    1165, flags=00000201, crtcs={0x0}
gnome-shell-1638    [018] .....  2571.913250: dma_fence_used_as_dependency: driver=drm_sched timeline=gfx_0.0.0 context=270 seqno=73241 reason=dma_fence_chain_init
    <idle>-0       [018] d.h3.  2571.915687: drm_vblank_event: crtc=1, seq=155747, time=2571916093743, high-prec=true
    <idle>-0       [018] d.h3.  2571.915968: drm_vblank_event: crtc=0, seq=153862, time=2571916377180, high-prec=true

v2: fix unchecked memory allocation

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/drm_atomic_uapi.c | 21 +++++++++++++++++++++
 drivers/gpu/drm/drm_trace.h       | 23 +++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
index 29d4940188d4..f31b5c6f870b 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -41,6 +41,7 @@
 #include <linux/file.h>
 
 #include "drm_crtc_internal.h"
+#include "drm_trace.h"
 
 /**
  * DOC: overview
@@ -1503,6 +1504,26 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
 		drm_mode_object_put(obj);
 	}
 
+	if (trace_drm_mode_atomic_commit_enabled()) {
+		struct drm_crtc_state *crtc_state;
+		struct drm_crtc *crtc;
+		int *crtcs;
+		int i, num_crtcs;
+
+		crtcs = kcalloc(dev->mode_config.num_crtc, sizeof(int),
+				GFP_KERNEL);
+
+		if (crtcs) {
+			num_crtcs = 0;
+			for_each_new_crtc_in_state(state, crtc, crtc_state, i)
+				crtcs[num_crtcs++] = drm_crtc_index(crtc);
+
+			trace_drm_mode_atomic_commit(file_priv, crtcs, num_crtcs, arg->flags);
+
+			kfree(crtcs);
+		}
+	}
+
 	ret = prepare_signaling(dev, state, arg, file_priv, &fence_state,
 				&num_fences);
 	if (ret)
diff --git a/drivers/gpu/drm/drm_trace.h b/drivers/gpu/drm/drm_trace.h
index 11c6dd577e8e..63489923c289 100644
--- a/drivers/gpu/drm/drm_trace.h
+++ b/drivers/gpu/drm/drm_trace.h
@@ -66,6 +66,29 @@ TRACE_EVENT(drm_vblank_event_delivered,
 		      __entry->seq)
 );
 
+TRACE_EVENT(drm_mode_atomic_commit,
+	    TP_PROTO(struct drm_file *file, int *crtcs, int ncrtcs, uint32_t flags),
+	    TP_ARGS(file, crtcs, ncrtcs, flags),
+	    TP_STRUCT__entry(
+		    __field(struct drm_file *, file)
+		    __dynamic_array(u32, crtcs, ncrtcs)
+		    __field(uint32_t, ncrtcs)
+		    __field(uint32_t, flags)
+		    ),
+	    TP_fast_assign(
+		    unsigned int i;
+
+		    __entry->file = file;
+		    for (i = 0; i < ncrtcs; i++)
+			((u32 *)__get_dynamic_array(crtcs))[i] = crtcs[i];
+		    __entry->ncrtcs = ncrtcs;
+		    __entry->flags = flags;
+		    ),
+	    TP_printk("file=%p, pid=%8d, flags=%08x, crtcs=%s", __entry->file,
+		      pid_nr(__entry->file->pid), __entry->flags,
+		      __print_array(__get_dynamic_array(crtcs), __entry->ncrtcs, 4))
+);
+
 #endif /* _DRM_TRACE_H_ */
 
 /* This part must be outside protection */
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 7/8] drm/sched: use trace_dma_fence_used_as_dependency
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
                   ` (5 preceding siblings ...)
  2024-02-16 15:09 ` [PATCH v3 6/8] drm: add drm_mode_atomic_commit event Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:09 ` [PATCH v3 8/8] drm/amdgpu: add devname to trace_amdgpu_sched_run_job Pierre-Eric Pelloux-Prayer
  2024-02-16 15:20 ` [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Christian König
  8 siblings, 0 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

drm_sched_job_add_dependency adds dependencies so use the new
trace event.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/scheduler/sched_main.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 7e90c9f95611..6ee49f70d319 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -84,6 +84,8 @@
 #include <drm/gpu_scheduler.h>
 #include <drm/spsc_queue.h>
 
+#include <trace/events/dma_fence.h>
+
 #define CREATE_TRACE_POINTS
 #include "gpu_scheduler_trace.h"
 
@@ -879,6 +881,8 @@ int drm_sched_job_add_dependency(struct drm_sched_job *job,
 		if (entry->context != fence->context)
 			continue;
 
+		trace_dma_fence_used_as_dependency(fence, __func__);
+
 		if (dma_fence_is_later(fence, entry)) {
 			dma_fence_put(entry);
 			xa_store(&job->dependencies, index, fence, GFP_KERNEL);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 8/8] drm/amdgpu: add devname to trace_amdgpu_sched_run_job
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
                   ` (6 preceding siblings ...)
  2024-02-16 15:09 ` [PATCH v3 7/8] drm/sched: use trace_dma_fence_used_as_dependency Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:09 ` Pierre-Eric Pelloux-Prayer
  2024-02-16 15:20 ` [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Christian König
  8 siblings, 0 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-16 15:09 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Christian König, Steven Rostedt, Masami Hiramatsu,
	Mathieu Desnoyers, dri-devel, linux-media, linux-trace-kernel,
	Alex Deucher, amd-gfx

With the move to work queues for the drm scheduler it becomes
impossible for a tool to match the events to the GPU.

Before this move, the event source was fixed (eg: gfx_0.0.0-598),
so even if the system had multiple GPUs with identical queue names
it was possible to map the events using the PID.

With work queues, the source is now something like: "kworker/u64:0-15248"
(and the PID isn't stable), so the "timeline=gfx_0.0.0" attribute
isn't enough in multi-GPU setups.

This commit adds a dev=devname attribute to resolve this issue.

Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 12 ++++++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 71a5cf37b472..657866a498f1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -292,7 +292,7 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job)
 	job = to_amdgpu_job(sched_job);
 	finished = &job->base.s_fence->finished;
 
-	trace_amdgpu_sched_run_job(job);
+	trace_amdgpu_sched_run_job(job, adev);
 
 	/* Skip job if VRAM is lost and never resubmit gangs */
 	if (job->generation != amdgpu_vm_generation(adev, job->vm) ||
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 3f18f570e5ac..1aea1b78747d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -202,8 +202,8 @@ TRACE_EVENT(amdgpu_cs_start,
 );
 
 TRACE_EVENT(amdgpu_sched_run_job,
-	    TP_PROTO(struct amdgpu_job *job),
-	    TP_ARGS(job),
+	    TP_PROTO(struct amdgpu_job *job, struct amdgpu_device *adev),
+	    TP_ARGS(job, adev),
 	    TP_STRUCT__entry(
 			     __field(uint64_t, sched_job_id)
 			     __string(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
@@ -211,6 +211,7 @@ TRACE_EVENT(amdgpu_sched_run_job,
 			     __field(unsigned int, seqno)
 			     __string(ring, to_amdgpu_ring(job->base.sched)->name)
 			     __field(u32, num_ibs)
+			     __string(dname, dev_name(adev->dev))
 			     ),
 
 	    TP_fast_assign(
@@ -220,10 +221,13 @@ TRACE_EVENT(amdgpu_sched_run_job,
 			   __entry->seqno = job->base.s_fence->finished.seqno;
 			   __assign_str(ring, to_amdgpu_ring(job->base.sched)->name);
 			   __entry->num_ibs = job->num_ibs;
+			   __assign_str(dname, dev_name(adev->dev));
 			   ),
-	    TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
+	    TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, "
+		      "ring_name=%s, num_ibs=%u, dev=%s",
 		      __entry->sched_job_id, __get_str(timeline), __entry->context,
-		      __entry->seqno, __get_str(ring), __entry->num_ibs)
+		      __entry->seqno, __get_str(ring), __entry->num_ibs, __get_str(dname))
+
 );
 
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events
  2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
                   ` (7 preceding siblings ...)
  2024-02-16 15:09 ` [PATCH v3 8/8] drm/amdgpu: add devname to trace_amdgpu_sched_run_job Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:20 ` Christian König
  8 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2024-02-16 15:20 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, dri-devel,
	linux-media, linux-trace-kernel, Alex Deucher, amd-gfx

Am 16.02.24 um 16:09 schrieb Pierre-Eric Pelloux-Prayer:
> This series adds new events to make it easier for tools
> like gpuvis or umr to graph the GPUs, kernel and applications
> activity.
>
> UMR patches using these events can be found here:
> https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
>
> V1:
> https://patchwork.kernel.org/project/linux-media/patch/20240117184329.479554-1-pierre-eric.pelloux-prayer@amd.com/

I need to separate this patch set a bit. The DMA-buf stuff usually goes 
upstream through drm-misc-next while the amdgpu only patches go upstream 
through our internal branch.

I will keep you looped in which patch I pick from this set to which branch.

Oh, that's going to be fun.

Christian.

>
> Changes from V1:
> * uses trace_dma_fence_sync_to from dma-fence-chain.c
> * new amdgpu events
> * new drm plane commit event
>
> Changes from V2:
> * uses trace_dma_fence_used_as_dependency from drm_sched_job_add_dependency
> * add devname attribute to the trace_amdgpu_sched_run_job event
> * addressed review comments
>
> Pierre-Eric Pelloux-Prayer (8):
>    tracing, dma-buf: add a trace_dma_fence_sync_to event
>    dma-buf/fence-chain: use trace_dma_fence_sync_to
>    amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync
>    drm/amdgpu: add a amdgpu_bo_fill trace event
>    drm/amdgpu: add a amdgpu_cs_start trace event
>    drm: add drm_mode_atomic_commit event
>    drm/sched: use trace_dma_fence_used_as_dependency
>    drm/amdgpu: add devname to trace_amdgpu_sched_run_job
>
>   drivers/dma-buf/dma-fence-chain.c         |  4 +++
>   drivers/dma-buf/dma-fence.c               |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c    |  2 ++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_job.c   |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c  |  9 +++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h  |  4 ++-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h | 42 ++++++++++++++++++++---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c   |  2 ++
>   drivers/gpu/drm/drm_atomic_uapi.c         | 21 ++++++++++++
>   drivers/gpu/drm/drm_trace.h               | 23 +++++++++++++
>   drivers/gpu/drm/scheduler/sched_main.c    |  4 +++
>   include/trace/events/dma_fence.h          | 27 +++++++++++++++
>   12 files changed, 133 insertions(+), 8 deletions(-)
>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event
  2024-02-16 15:09 ` [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:28   ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2024-02-16 15:28 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, dri-devel,
	linux-media, linux-trace-kernel, Alex Deucher, amd-gfx

Am 16.02.24 um 16:09 schrieb Pierre-Eric Pelloux-Prayer:
> This new event can be used to trace where a given dma_fence is added
> as a dependency of some other work.
>
> I plan to use it in amdgpu.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>   drivers/dma-buf/dma-fence.c      |  1 +
>   include/trace/events/dma_fence.h | 27 +++++++++++++++++++++++++++
>   2 files changed, 28 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0393a9bba3a8..e7276c043984 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -23,6 +23,7 @@
>   EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit);
>   EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
>   EXPORT_TRACEPOINT_SYMBOL(dma_fence_signaled);
> +EXPORT_TRACEPOINT_SYMBOL(dma_fence_used_as_dependency);
>   
>   static DEFINE_SPINLOCK(dma_fence_stub_lock);
>   static struct dma_fence dma_fence_stub;
> diff --git a/include/trace/events/dma_fence.h b/include/trace/events/dma_fence.h
> index 3963e79ca7b4..5a5d272031ce 100644
> --- a/include/trace/events/dma_fence.h
> +++ b/include/trace/events/dma_fence.h
> @@ -83,6 +83,33 @@ DEFINE_EVENT(dma_fence, dma_fence_wait_end,
>   	TP_ARGS(fence)
>   );
>   
> +TRACE_EVENT(dma_fence_used_as_dependency,
> +
> +	TP_PROTO(struct dma_fence *fence, const char *reason),
> +
> +	TP_ARGS(fence, reason),
> +
> +	TP_STRUCT__entry(
> +		__string(driver, fence->ops->get_driver_name(fence))
> +		__string(timeline, fence->ops->get_timeline_name(fence))
> +		__field(unsigned int, context)
> +		__field(unsigned int, seqno)

I noted it before that this needs to be an u64 and not unsigned int. 
Otherwise we will lose the higher 32bits.

The existing trace points have that bug as well, so you might also want 
to provide a patch to fix this.

Christian.

> +		__string(reason, reason)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(driver, fence->ops->get_driver_name(fence));
> +		__assign_str(timeline, fence->ops->get_timeline_name(fence));
> +		__entry->context = fence->context;
> +		__entry->seqno = fence->seqno;
> +		__assign_str(reason, reason);
> +	),
> +
> +	TP_printk("driver=%s timeline=%s context=%u seqno=%u reason=%s",
> +		  __get_str(driver), __get_str(timeline), __entry->context,
> +		  __entry->seqno, __get_str(reason))
> +);
> +
>   #endif /*  _TRACE_DMA_FENCE_H */
>   
>   /* This part must be outside protection */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 2/8] dma-buf/fence-chain: use trace_dma_fence_sync_to
  2024-02-16 15:09 ` [PATCH v3 2/8] dma-buf/fence-chain: use trace_dma_fence_sync_to Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:30   ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2024-02-16 15:30 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, dri-devel,
	linux-media, linux-trace-kernel, Alex Deucher, amd-gfx

Am 16.02.24 um 16:09 schrieb Pierre-Eric Pelloux-Prayer:
> To inform tools about the relationship between the fences.
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/dma-buf/dma-fence-chain.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c
> index 9663ba1bb6ac..3435078c45b7 100644
> --- a/drivers/dma-buf/dma-fence-chain.c
> +++ b/drivers/dma-buf/dma-fence-chain.c
> @@ -9,6 +9,8 @@
>   
>   #include <linux/dma-fence-chain.h>
>   
> +#include "trace/events/dma_fence.h"
> +
>   static bool dma_fence_chain_enable_signaling(struct dma_fence *fence);
>   
>   /**
> @@ -251,6 +253,8 @@ void dma_fence_chain_init(struct dma_fence_chain *chain,
>   	chain->fence = fence;
>   	chain->prev_seqno = 0;
>   
> +	trace_dma_fence_used_as_dependency(fence, __func__);
> +
>   	/* Try to reuse the context of the previous chain node. */
>   	if (prev_chain && __dma_fence_is_later(seqno, prev->seqno, prev->ops)) {
>   		context = prev->context;


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/8] amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync
  2024-02-16 15:09 ` [PATCH v3 3/8] amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:56   ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2024-02-16 15:56 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer, Sumit Semwal, Gustavo Padovan,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, dri-devel,
	linux-media, linux-trace-kernel, Alex Deucher, amd-gfx

Am 16.02.24 um 16:09 schrieb Pierre-Eric Pelloux-Prayer:
> This makes it possible to understand the dependencies between jobs.
> Possible usage of this trace:
> * stuttering issues like Mesa !9189
> * incorrect synchronization: I don't have a link for this one, but having
>    these events was very useful to debug a virtio-gpu / native-context /
>    radeonsi sync issue
>
> I have prototype code using this in UMR, as can be see here:
>     https://gitlab.freedesktop.org/tomstdenis/umr/-/merge_requests/37
>
> v2: add a macro since every caller passes __func__ as the reason parameter
>
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c | 9 +++++++--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h | 4 +++-
>   2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> index 1b013a44ca99..9a3fdc4be51e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
> @@ -30,6 +30,7 @@
>    */
>   
>   #include <linux/dma-fence-chain.h>
> +#include <trace/events/dma_fence.h>
>   
>   #include "amdgpu.h"
>   #include "amdgpu_trace.h"
> @@ -145,14 +146,16 @@ static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f)
>   }
>   
>   /**
> - * amdgpu_sync_fence - remember to sync to this fence
> + * amdgpu_sync_fence_with_reason - remember to sync to this fence
>    *
>    * @sync: sync object to add fence to
>    * @f: fence to sync to
> + * @reason: why do we sync to this fence
>    *
>    * Add the fence to the sync object.
>    */
> -int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
> +int amdgpu_sync_fence_with_reason(struct amdgpu_sync *sync, struct dma_fence *f,
> +				  const char *reason)
>   {
>   	struct amdgpu_sync_entry *e;
>   
> @@ -166,6 +169,8 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
>   	if (!e)
>   		return -ENOMEM;
>   
> +	trace_dma_fence_used_as_dependency(f, reason);
> +
>   	hash_add(sync->fences, &e->node, f->context);
>   	e->fence = dma_fence_get(f);
>   	return 0;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
> index cf1e9e858efd..52e7306801de 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
> @@ -47,7 +47,9 @@ struct amdgpu_sync {
>   };
>   
>   void amdgpu_sync_create(struct amdgpu_sync *sync);
> -int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f);
> +int amdgpu_sync_fence_with_reason(struct amdgpu_sync *sync, struct dma_fence *f,
> +				  const char *reason);
> +#define amdgpu_sync_fence(s, f) amdgpu_sync_fence_with_reason(s, f, __func__)
>   int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
>   		     struct dma_resv *resv, enum amdgpu_sync_mode mode,
>   		     void *owner);


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 6/8] drm: add drm_mode_atomic_commit event
  2024-02-16 15:09 ` [PATCH v3 6/8] drm: add drm_mode_atomic_commit event Pierre-Eric Pelloux-Prayer
@ 2024-02-16 15:59   ` Steven Rostedt
  2024-02-16 16:24   ` Ville Syrjälä
  1 sibling, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2024-02-16 15:59 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer
  Cc: Sumit Semwal, Gustavo Padovan, Christian König,
	Masami Hiramatsu, Mathieu Desnoyers, dri-devel, linux-media,
	linux-trace-kernel, Alex Deucher, amd-gfx

On Fri, 16 Feb 2024 16:09:55 +0100
Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> wrote:
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>  drivers/gpu/drm/drm_atomic_uapi.c | 21 +++++++++++++++++++++
>  drivers/gpu/drm/drm_trace.h       | 23 +++++++++++++++++++++++
>  2 files changed, 44 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> index 29d4940188d4..f31b5c6f870b 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -41,6 +41,7 @@
>  #include <linux/file.h>
>  
>  #include "drm_crtc_internal.h"
> +#include "drm_trace.h"
>  
>  /**
>   * DOC: overview
> @@ -1503,6 +1504,26 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
>  		drm_mode_object_put(obj);
>  	}
>  
> +	if (trace_drm_mode_atomic_commit_enabled()) {
> +		struct drm_crtc_state *crtc_state;
> +		struct drm_crtc *crtc;
> +		int *crtcs;
> +		int i, num_crtcs;
> +
> +		crtcs = kcalloc(dev->mode_config.num_crtc, sizeof(int),
> +				GFP_KERNEL);
> +
> +		if (crtcs) {
> +			num_crtcs = 0;
> +			for_each_new_crtc_in_state(state, crtc, crtc_state, i)
> +				crtcs[num_crtcs++] = drm_crtc_index(crtc);

Hmm, looking deeper into this, could you just do the loop the trace event?

That is how different is the config.num_crtc compared to the final num_crtcs?
That way, we don't need to do this allocation if it's not too different.
That is, pass in the dev->mode_config.num_crtc to the tracepoint instead of
num_crtcs.

> +
> +			trace_drm_mode_atomic_commit(file_priv, crtcs, num_crtcs, arg->flags);
> +
> +			kfree(crtcs);
> +		}
> +	}
> +
>  	ret = prepare_signaling(dev, state, arg, file_priv, &fence_state,
>  				&num_fences);
>  	if (ret)
> diff --git a/drivers/gpu/drm/drm_trace.h b/drivers/gpu/drm/drm_trace.h
> index 11c6dd577e8e..63489923c289 100644
> --- a/drivers/gpu/drm/drm_trace.h
> +++ b/drivers/gpu/drm/drm_trace.h
> @@ -66,6 +66,29 @@ TRACE_EVENT(drm_vblank_event_delivered,
>  		      __entry->seq)
>  );
>  
> +TRACE_EVENT(drm_mode_atomic_commit,
> +	    TP_PROTO(struct drm_file *file, int *crtcs, int ncrtcs, uint32_t flags),
> +	    TP_ARGS(file, crtcs, ncrtcs, flags),
> +	    TP_STRUCT__entry(
> +		    __field(struct drm_file *, file)
> +		    __dynamic_array(u32, crtcs, ncrtcs)

Here the ncrtcs is what is passed in. It will always be allocated to that
size though.

> +		    __field(uint32_t, ncrtcs)
> +		    __field(uint32_t, flags)
> +		    ),
> +	    TP_fast_assign(
> +		    unsigned int i;
> +
> +		    __entry->file = file;
> +		    for (i = 0; i < ncrtcs; i++)
> +			((u32 *)__get_dynamic_array(crtcs))[i] = crtcs[i];

Here we have:

		int n = 0;

		for_each_new_crtc_in_state(state, crtc, crtc_state, i)
			((u32 *)__get_dynamic_array(crtcs))[n++] = drm_crtc_index(crtc);

		__entry->ncrtcs = n;

But this is only viable if the ncrtcs is close to the same size as dev->mode_config.num_crtc,
otherwise it's not worth it.

-- Steve



> +		    __entry->ncrtcs = ncrtcs;
> +		    __entry->flags = flags;
> +		    ),
> +	    TP_printk("file=%p, pid=%8d, flags=%08x, crtcs=%s", __entry->file,
> +		      pid_nr(__entry->file->pid), __entry->flags,
> +		      __print_array(__get_dynamic_array(crtcs), __entry->ncrtcs, 4))
> +);
> +
>  #endif /* _DRM_TRACE_H_ */
>  
>  /* This part must be outside protection */


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 6/8] drm: add drm_mode_atomic_commit event
  2024-02-16 15:09 ` [PATCH v3 6/8] drm: add drm_mode_atomic_commit event Pierre-Eric Pelloux-Prayer
  2024-02-16 15:59   ` Steven Rostedt
@ 2024-02-16 16:24   ` Ville Syrjälä
  2024-02-22 13:05     ` Pierre-Eric Pelloux-Prayer
  1 sibling, 1 reply; 16+ messages in thread
From: Ville Syrjälä @ 2024-02-16 16:24 UTC (permalink / raw)
  To: Pierre-Eric Pelloux-Prayer
  Cc: Sumit Semwal, Gustavo Padovan, Christian König,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, dri-devel,
	linux-media, linux-trace-kernel, Alex Deucher, amd-gfx

On Fri, Feb 16, 2024 at 04:09:55PM +0100, Pierre-Eric Pelloux-Prayer wrote:
> With this and the dma_fence_used_as_dependency event, a tool can draw the
> relationship between the compositing draw, the atomic commit, and vblank.
> 
> An example on a 2 monitors system look like this:
> 
> gnome-shell-1638    [018] .....  2571.905124: drm_mode_atomic_commit: file=00000000245c3f0c, pid=    1165, flags=00000201, crtcs={0x1}
> gnome-shell-1638    [018] .....  2571.905147: dma_fence_used_as_dependency: driver=drm_sched timeline=gfx_0.0.0 context=270 seqno=73240 reason=dma_fence_chain_init
> gnome-shell-1638    [018] .....  2571.913226: drm_mode_atomic_commit: file=00000000245c3f0c, pid=    1165, flags=00000201, crtcs={0x0}
> gnome-shell-1638    [018] .....  2571.913250: dma_fence_used_as_dependency: driver=drm_sched timeline=gfx_0.0.0 context=270 seqno=73241 reason=dma_fence_chain_init
>     <idle>-0       [018] d.h3.  2571.915687: drm_vblank_event: crtc=1, seq=155747, time=2571916093743, high-prec=true
>     <idle>-0       [018] d.h3.  2571.915968: drm_vblank_event: crtc=0, seq=153862, time=2571916377180, high-prec=true
> 
> v2: fix unchecked memory allocation
> 
> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
> ---
>  drivers/gpu/drm/drm_atomic_uapi.c | 21 +++++++++++++++++++++
>  drivers/gpu/drm/drm_trace.h       | 23 +++++++++++++++++++++++
>  2 files changed, 44 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> index 29d4940188d4..f31b5c6f870b 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -41,6 +41,7 @@
>  #include <linux/file.h>
>  
>  #include "drm_crtc_internal.h"
> +#include "drm_trace.h"
>  
>  /**
>   * DOC: overview
> @@ -1503,6 +1504,26 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
>  		drm_mode_object_put(obj);
>  	}
>  
> +	if (trace_drm_mode_atomic_commit_enabled()) {
> +		struct drm_crtc_state *crtc_state;
> +		struct drm_crtc *crtc;
> +		int *crtcs;
> +		int i, num_crtcs;
> +
> +		crtcs = kcalloc(dev->mode_config.num_crtc, sizeof(int),
> +				GFP_KERNEL);
> +
> +		if (crtcs) {
> +			num_crtcs = 0;
> +			for_each_new_crtc_in_state(state, crtc, crtc_state, i)
> +				crtcs[num_crtcs++] = drm_crtc_index(crtc);
> +
> +			trace_drm_mode_atomic_commit(file_priv, crtcs, num_crtcs, arg->flags);
> +
> +			kfree(crtcs);
> +		}
> +	}

I think the current drm trace events are sort of semi-useless.
The problems are:
- no device id in the events so good luck with multi gpu systems
- vblank trace events are only emitted from some vblank
  codepaths but not others

I'm also not sure putting an event straight into the atomic ioctl is
particularly useful.

First of all it means that any commit not initiated by the atomic
ioctl will not be traced.

It would also seem more useful to me if the driver can emit the
trace just before it commits the frame to the hardware, so that
we can also observe the latency between userspace submitting
the frame vs. when the hardware will actually see it.

Also if we want tools to use these I think we're going to have to
make some kind of abi promises about the events, so we should make
sure they are as future proof as we can make them (eg. regarding
mutli-gpu systems/etc.).

> +
>  	ret = prepare_signaling(dev, state, arg, file_priv, &fence_state,
>  				&num_fences);
>  	if (ret)
> diff --git a/drivers/gpu/drm/drm_trace.h b/drivers/gpu/drm/drm_trace.h
> index 11c6dd577e8e..63489923c289 100644
> --- a/drivers/gpu/drm/drm_trace.h
> +++ b/drivers/gpu/drm/drm_trace.h
> @@ -66,6 +66,29 @@ TRACE_EVENT(drm_vblank_event_delivered,
>  		      __entry->seq)
>  );
>  
> +TRACE_EVENT(drm_mode_atomic_commit,
> +	    TP_PROTO(struct drm_file *file, int *crtcs, int ncrtcs, uint32_t flags),
> +	    TP_ARGS(file, crtcs, ncrtcs, flags),
> +	    TP_STRUCT__entry(
> +		    __field(struct drm_file *, file)
> +		    __dynamic_array(u32, crtcs, ncrtcs)
> +		    __field(uint32_t, ncrtcs)
> +		    __field(uint32_t, flags)
> +		    ),
> +	    TP_fast_assign(
> +		    unsigned int i;
> +
> +		    __entry->file = file;
> +		    for (i = 0; i < ncrtcs; i++)
> +			((u32 *)__get_dynamic_array(crtcs))[i] = crtcs[i];
> +		    __entry->ncrtcs = ncrtcs;
> +		    __entry->flags = flags;
> +		    ),
> +	    TP_printk("file=%p, pid=%8d, flags=%08x, crtcs=%s", __entry->file,
> +		      pid_nr(__entry->file->pid), __entry->flags,
> +		      __print_array(__get_dynamic_array(crtcs), __entry->ncrtcs, 4))
> +);
> +
>  #endif /* _DRM_TRACE_H_ */
>  
>  /* This part must be outside protection */
> -- 
> 2.40.1

-- 
Ville Syrjälä
Intel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 6/8] drm: add drm_mode_atomic_commit event
  2024-02-16 16:24   ` Ville Syrjälä
@ 2024-02-22 13:05     ` Pierre-Eric Pelloux-Prayer
  0 siblings, 0 replies; 16+ messages in thread
From: Pierre-Eric Pelloux-Prayer @ 2024-02-22 13:05 UTC (permalink / raw)
  To: Ville Syrjälä, Pierre-Eric Pelloux-Prayer
  Cc: Sumit Semwal, Gustavo Padovan, Christian König,
	Steven Rostedt, Masami Hiramatsu, Mathieu Desnoyers, dri-devel,
	linux-media, linux-trace-kernel, Alex Deucher, amd-gfx



Le 16/02/2024 à 17:24, Ville Syrjälä a écrit :
> On Fri, Feb 16, 2024 at 04:09:55PM +0100, Pierre-Eric Pelloux-Prayer wrote:
>> With this and the dma_fence_used_as_dependency event, a tool can draw the
>> relationship between the compositing draw, the atomic commit, and vblank.
>>
>> An example on a 2 monitors system look like this:
>>
>> gnome-shell-1638    [018] .....  2571.905124: drm_mode_atomic_commit: file=00000000245c3f0c, pid=    1165, flags=00000201, crtcs={0x1}
>> gnome-shell-1638    [018] .....  2571.905147: dma_fence_used_as_dependency: driver=drm_sched timeline=gfx_0.0.0 context=270 seqno=73240 reason=dma_fence_chain_init
>> gnome-shell-1638    [018] .....  2571.913226: drm_mode_atomic_commit: file=00000000245c3f0c, pid=    1165, flags=00000201, crtcs={0x0}
>> gnome-shell-1638    [018] .....  2571.913250: dma_fence_used_as_dependency: driver=drm_sched timeline=gfx_0.0.0 context=270 seqno=73241 reason=dma_fence_chain_init
>>      <idle>-0       [018] d.h3.  2571.915687: drm_vblank_event: crtc=1, seq=155747, time=2571916093743, high-prec=true
>>      <idle>-0       [018] d.h3.  2571.915968: drm_vblank_event: crtc=0, seq=153862, time=2571916377180, high-prec=true
>>
>> v2: fix unchecked memory allocation
>>
>> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
>> ---
>>   drivers/gpu/drm/drm_atomic_uapi.c | 21 +++++++++++++++++++++
>>   drivers/gpu/drm/drm_trace.h       | 23 +++++++++++++++++++++++
>>   2 files changed, 44 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
>> index 29d4940188d4..f31b5c6f870b 100644
>> --- a/drivers/gpu/drm/drm_atomic_uapi.c
>> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
>> @@ -41,6 +41,7 @@
>>   #include <linux/file.h>
>>   
>>   #include "drm_crtc_internal.h"
>> +#include "drm_trace.h"
>>   
>>   /**
>>    * DOC: overview
>> @@ -1503,6 +1504,26 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
>>   		drm_mode_object_put(obj);
>>   	}
>>   
>> +	if (trace_drm_mode_atomic_commit_enabled()) {
>> +		struct drm_crtc_state *crtc_state;
>> +		struct drm_crtc *crtc;
>> +		int *crtcs;
>> +		int i, num_crtcs;
>> +
>> +		crtcs = kcalloc(dev->mode_config.num_crtc, sizeof(int),
>> +				GFP_KERNEL);
>> +
>> +		if (crtcs) {
>> +			num_crtcs = 0;
>> +			for_each_new_crtc_in_state(state, crtc, crtc_state, i)
>> +				crtcs[num_crtcs++] = drm_crtc_index(crtc);
>> +
>> +			trace_drm_mode_atomic_commit(file_priv, crtcs, num_crtcs, arg->flags);
>> +
>> +			kfree(crtcs);
>> +		}
>> +	}
> 
> I think the current drm trace events are sort of semi-useless.
> The problems are:
> - no device id in the events so good luck with multi gpu systems
> - vblank trace events are only emitted from some vblank
>    codepaths but not others
> 
> I'm also not sure putting an event straight into the atomic ioctl is
> particularly useful.
> 
> First of all it means that any commit not initiated by the atomic
> ioctl will not be traced.
> 
> It would also seem more useful to me if the driver can emit the
> trace just before it commits the frame to the hardware, so that
> we can also observe the latency between userspace submitting
> the frame vs. when the hardware will actually see it.
> 
> Also if we want tools to use these I think we're going to have to
> make some kind of abi promises about the events, so we should make
> sure they are as future proof as we can make them (eg. regarding
> mutli-gpu systems/etc.).

Thanks for your feedback.

This series was also discussed on IRC with Sima [1], and the conclusion was
that it would be good to rework the series with the following goals in
mind:
* make sure the events are useful for any drivers using the core drm code,
not just amdgpu
* add new events or extend existing ones so that all the required information is
there (= no guessing needed)
* document the updated tracepoints (as UAPI?): how they should be interpreted
by tools (eg: how to reconstruct fence dependencies? how to measure latency? etc)


Pierre-Eric


[1]: https://dri.freedesktop.org/~cbrill/dri-log/?channel=dri-devel&date=2024-02-16




> 
>> +
>>   	ret = prepare_signaling(dev, state, arg, file_priv, &fence_state,
>>   				&num_fences);
>>   	if (ret)
>> diff --git a/drivers/gpu/drm/drm_trace.h b/drivers/gpu/drm/drm_trace.h
>> index 11c6dd577e8e..63489923c289 100644
>> --- a/drivers/gpu/drm/drm_trace.h
>> +++ b/drivers/gpu/drm/drm_trace.h
>> @@ -66,6 +66,29 @@ TRACE_EVENT(drm_vblank_event_delivered,
>>   		      __entry->seq)
>>   );
>>   
>> +TRACE_EVENT(drm_mode_atomic_commit,
>> +	    TP_PROTO(struct drm_file *file, int *crtcs, int ncrtcs, uint32_t flags),
>> +	    TP_ARGS(file, crtcs, ncrtcs, flags),
>> +	    TP_STRUCT__entry(
>> +		    __field(struct drm_file *, file)
>> +		    __dynamic_array(u32, crtcs, ncrtcs)
>> +		    __field(uint32_t, ncrtcs)
>> +		    __field(uint32_t, flags)
>> +		    ),
>> +	    TP_fast_assign(
>> +		    unsigned int i;
>> +
>> +		    __entry->file = file;
>> +		    for (i = 0; i < ncrtcs; i++)
>> +			((u32 *)__get_dynamic_array(crtcs))[i] = crtcs[i];
>> +		    __entry->ncrtcs = ncrtcs;
>> +		    __entry->flags = flags;
>> +		    ),
>> +	    TP_printk("file=%p, pid=%8d, flags=%08x, crtcs=%s", __entry->file,
>> +		      pid_nr(__entry->file->pid), __entry->flags,
>> +		      __print_array(__get_dynamic_array(crtcs), __entry->ncrtcs, 4))
>> +);
>> +
>>   #endif /* _DRM_TRACE_H_ */
>>   
>>   /* This part must be outside protection */
>> -- 
>> 2.40.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2024-02-22 13:15 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-16 15:09 [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Pierre-Eric Pelloux-Prayer
2024-02-16 15:09 ` [PATCH v3 1/8] tracing, dma-buf: add a trace_dma_fence_sync_to event Pierre-Eric Pelloux-Prayer
2024-02-16 15:28   ` Christian König
2024-02-16 15:09 ` [PATCH v3 2/8] dma-buf/fence-chain: use trace_dma_fence_sync_to Pierre-Eric Pelloux-Prayer
2024-02-16 15:30   ` Christian König
2024-02-16 15:09 ` [PATCH v3 3/8] amdgpu: use trace_dma_fence_sync_to in amdgpu_fence_sync Pierre-Eric Pelloux-Prayer
2024-02-16 15:56   ` Christian König
2024-02-16 15:09 ` [PATCH v3 4/8] drm/amdgpu: add a amdgpu_bo_fill trace event Pierre-Eric Pelloux-Prayer
2024-02-16 15:09 ` [PATCH v3 5/8] drm/amdgpu: add a amdgpu_cs_start " Pierre-Eric Pelloux-Prayer
2024-02-16 15:09 ` [PATCH v3 6/8] drm: add drm_mode_atomic_commit event Pierre-Eric Pelloux-Prayer
2024-02-16 15:59   ` Steven Rostedt
2024-02-16 16:24   ` Ville Syrjälä
2024-02-22 13:05     ` Pierre-Eric Pelloux-Prayer
2024-02-16 15:09 ` [PATCH v3 7/8] drm/sched: use trace_dma_fence_used_as_dependency Pierre-Eric Pelloux-Prayer
2024-02-16 15:09 ` [PATCH v3 8/8] drm/amdgpu: add devname to trace_amdgpu_sched_run_job Pierre-Eric Pelloux-Prayer
2024-02-16 15:20 ` [PATCH v3 0/8] dma-fence, drm, amdgpu new trace events Christian König

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.