* [PATCH] dma-buf: Enhance dma-fence tracing @ 2019-01-21 23:20 Chris Wilson 2019-01-21 23:47 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork ` (4 more replies) 0 siblings, 5 replies; 12+ messages in thread From: Chris Wilson @ 2019-01-21 23:20 UTC (permalink / raw) To: dri-devel Cc: Michael Sartain, intel-gfx, Steven Rostedt, Alex Deucher, Christian König, Pierre-Loup Griffais Rather than every backend and GPU driver reinventing the same wheel for user level debugging of HW execution, the common dma-fence framework should include the tracing infrastructure required for most client API level flow visualisation. With these common dma-fence level tracepoints, the userspace tools can establish a detailed view of the client <-> HW flow across different kernels. There is a strong ask to have this available, so that the userspace developer can effectively assess whether they're doing a good job of feeding the beast that is the GPU hardware. In the case of needing to look into more fine-grained details of how kernel internals work towards the goal of feeding the beast, the tools may optionally amend the dma-fence tracing information with driver-implementation-specific tracepoints. But for such cases, the tools should degrade gracefully in case the expected extra tracepoints have changed or their format differs from what is expected, as the kernel implementation internals are not expected to stay the same. It is important to distinguish between tracing for the purpose of client flow visualisation and tracing for the purpose of low-level kernel debugging. The latter is highly implementation specific, tied to a particular HW and driver, whereas the former addresses a common goal of user level tracing and likely a common set of userspace tools. Having made the distinction that these tracepoints will be consumed for client API tooling, we raise the spectre of tracepoint ABI stability.
It is hoped that by defining a common set of dma-fence tracepoints, we avoid the pitfall of exposing low level details and so restrict ourselves only to the high level flow that is applicable to all drivers and hardware. Thus we can guarantee that this set of tracepoints will be stable (with the emphasis on depicting client <-> HW flow as opposed to driver <-> HW). In terms of specific changes to the dma-fence tracing, we remove the emission of the strings for every tracepoint (reserving them for dma_fence_init for cases where they have unique dma_fence_ops, and preferring to have descriptors for the whole fence context). Strings do not pack as well into the ftrace ringbuffer and we would prefer to reduce the amount of indirect callbacks required for frequent tracepoint emission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Eric Anholt <eric@anholt.net> Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> Cc: Michael Sartain <mikesart@fastmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> --- drivers/dma-buf/dma-fence.c | 9 +- drivers/gpu/drm/i915/i915_gem_clflush.c | 5 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 1 - drivers/gpu/drm/i915/i915_request.c | 16 +- drivers/gpu/drm/i915/i915_timeline.c | 5 + drivers/gpu/drm/i915/i915_trace.h | 134 --------------- drivers/gpu/drm/i915/intel_guc_submission.c | 10 ++ drivers/gpu/drm/i915/intel_lrc.c | 6 + drivers/gpu/drm/i915/intel_ringbuffer.h | 2 + include/trace/events/dma_fence.h | 177 +++++++++++++++++++- 10 files changed, 214 insertions(+), 151 deletions(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 3aa8733f832a..5c93ed34b1ff 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -27,8 +27,15 @@ #define CREATE_TRACE_POINTS #include
<trace/events/dma_fence.h> +EXPORT_TRACEPOINT_SYMBOL(dma_fence_context_create); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_context_destroy); + +EXPORT_TRACEPOINT_SYMBOL(dma_fence_await); EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); -EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_execute_start); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_execute_end); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_wait_start); +EXPORT_TRACEPOINT_SYMBOL(dma_fence_wait_end); static DEFINE_SPINLOCK(dma_fence_stub_lock); static struct dma_fence dma_fence_stub; diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/i915_gem_clflush.c index 8e74c23cbd91..435c1303ecc8 100644 --- a/drivers/gpu/drm/i915/i915_gem_clflush.c +++ b/drivers/gpu/drm/i915/i915_gem_clflush.c @@ -22,6 +22,8 @@ * */ +#include <trace/events/dma_fence.h> + #include "i915_drv.h" #include "intel_frontbuffer.h" #include "i915_gem_clflush.h" @@ -73,6 +75,7 @@ static void i915_clflush_work(struct work_struct *work) struct clflush *clflush = container_of(work, typeof(*clflush), work); struct drm_i915_gem_object *obj = clflush->obj; + trace_dma_fence_execute_start(&clflush->dma, smp_processor_id()); if (i915_gem_object_pin_pages(obj)) { DRM_ERROR("Failed to acquire obj->pages for clflushing\n"); goto out; @@ -83,6 +86,7 @@ static void i915_clflush_work(struct work_struct *work) i915_gem_object_unpin_pages(obj); out: + trace_dma_fence_execute_end(&clflush->dma, smp_processor_id()); i915_gem_object_put(obj); dma_fence_signal(&clflush->dma); @@ -97,6 +101,7 @@ i915_clflush_notify(struct i915_sw_fence *fence, switch (state) { case FENCE_COMPLETE: + trace_dma_fence_emit(&clflush->dma); schedule_work(&clflush->work); break; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index f250109e1f66..f10a9e675e2c 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -2397,7 +2397,6 @@ 
i915_gem_do_execbuffer(struct drm_device *dev, */ eb.request->batch = eb.batch; - trace_i915_request_queue(eb.request, eb.batch_flags); err = eb_submit(&eb); err_request: i915_request_add(eb.request); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 5e178f5ac18b..d5bbc8cb8306 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -28,6 +28,8 @@ #include <linux/sched/clock.h> #include <linux/sched/signal.h> +#include <trace/events/dma_fence.h> + #include "i915_drv.h" #include "i915_reset.h" @@ -254,8 +256,6 @@ static void i915_request_retire(struct i915_request *request) GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit)); GEM_BUG_ON(!i915_request_completed(request)); - trace_i915_request_retire(request); - advance_ring(request); free_capture_list(request); @@ -383,8 +383,6 @@ void __i915_request_submit(struct i915_request *request) /* Transfer from per-context onto the global per-engine timeline */ move_to_timeline(request, &engine->timeline); - trace_i915_request_execute(request); - wake_up_all(&request->execute); } @@ -463,7 +461,8 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) switch (state) { case FENCE_COMPLETE: - trace_i915_request_submit(request); + trace_dma_fence_emit(&request->fence); + /* * We need to serialize use of the submit_request() callback * with its hotplugging performed during an emergency @@ -770,6 +769,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) if (ret < 0) return ret; + trace_dma_fence_await(&rq->fence, fence); + /* Record the latest fence used against each timeline */ if (fence->context != rq->i915->mm.unordered_timeline) i915_timeline_sync_set(rq->timeline, fence); @@ -878,7 +879,6 @@ void i915_request_add(struct i915_request *request) engine->name, request->fence.context, request->fence.seqno); lockdep_assert_held(&request->i915->drm.struct_mutex); - trace_i915_request_add(request); 
/* * Make sure that no request gazumped us - if it was allocated after @@ -1139,7 +1139,7 @@ long i915_request_wait(struct i915_request *rq, if (!timeout) return -ETIME; - trace_i915_request_wait_begin(rq, flags); + trace_dma_fence_wait_start(&rq->fence); add_wait_queue(&rq->execute, &exec); if (flags & I915_WAIT_LOCKED) @@ -1245,7 +1245,7 @@ long i915_request_wait(struct i915_request *rq, if (flags & I915_WAIT_LOCKED) remove_wait_queue(errq, &reset); remove_wait_queue(&rq->execute, &exec); - trace_i915_request_wait_end(rq); + trace_dma_fence_wait_end(&rq->fence); return timeout; } diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c index 4667cc08c416..a83ae0b59b1f 100644 --- a/drivers/gpu/drm/i915/i915_timeline.c +++ b/drivers/gpu/drm/i915/i915_timeline.c @@ -4,6 +4,8 @@ * Copyright © 2016-2018 Intel Corporation */ +#include <trace/events/dma_fence.h> + #include "i915_drv.h" #include "i915_timeline.h" @@ -30,6 +32,7 @@ void i915_timeline_init(struct drm_i915_private *i915, /* Called during early_init before we know how many engines there are */ timeline->fence_context = dma_fence_context_alloc(1); + trace_dma_fence_context_create(timeline->fence_context, "i915", name); spin_lock_init(&timeline->lock); @@ -70,6 +73,8 @@ void i915_timeline_fini(struct i915_timeline *timeline) { GEM_BUG_ON(!list_empty(&timeline->requests)); + trace_dma_fence_context_destroy(timeline->fence_context); + i915_syncmap_free(&timeline->sync); list_del(&timeline->link); diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 33d90eca9cdd..538b6efb5a56 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -584,82 +584,7 @@ TRACE_EVENT(i915_gem_evict_vm, TP_printk("dev=%d, vm=%p", __entry->dev, __entry->vm) ); -TRACE_EVENT(i915_request_queue, - TP_PROTO(struct i915_request *rq, u32 flags), - TP_ARGS(rq, flags), - - TP_STRUCT__entry( - __field(u32, dev) - __field(u32, hw_id) - __field(u64, 
ctx) - __field(u16, class) - __field(u16, instance) - __field(u32, seqno) - __field(u32, flags) - ), - - TP_fast_assign( - __entry->dev = rq->i915->drm.primary->index; - __entry->hw_id = rq->gem_context->hw_id; - __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; - __entry->ctx = rq->fence.context; - __entry->seqno = rq->fence.seqno; - __entry->flags = flags; - ), - - TP_printk("dev=%u, engine=%u:%u, hw_id=%u, ctx=%llu, seqno=%u, flags=0x%x", - __entry->dev, __entry->class, __entry->instance, - __entry->hw_id, __entry->ctx, __entry->seqno, - __entry->flags) -); - -DECLARE_EVENT_CLASS(i915_request, - TP_PROTO(struct i915_request *rq), - TP_ARGS(rq), - - TP_STRUCT__entry( - __field(u32, dev) - __field(u32, hw_id) - __field(u64, ctx) - __field(u16, class) - __field(u16, instance) - __field(u32, seqno) - __field(u32, global) - ), - - TP_fast_assign( - __entry->dev = rq->i915->drm.primary->index; - __entry->hw_id = rq->gem_context->hw_id; - __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; - __entry->ctx = rq->fence.context; - __entry->seqno = rq->fence.seqno; - __entry->global = rq->global_seqno; - ), - - TP_printk("dev=%u, engine=%u:%u, hw_id=%u, ctx=%llu, seqno=%u, global=%u", - __entry->dev, __entry->class, __entry->instance, - __entry->hw_id, __entry->ctx, __entry->seqno, - __entry->global) -); - -DEFINE_EVENT(i915_request, i915_request_add, - TP_PROTO(struct i915_request *rq), - TP_ARGS(rq) -); - #if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS) -DEFINE_EVENT(i915_request, i915_request_submit, - TP_PROTO(struct i915_request *rq), - TP_ARGS(rq) -); - -DEFINE_EVENT(i915_request, i915_request_execute, - TP_PROTO(struct i915_request *rq), - TP_ARGS(rq) -); - TRACE_EVENT(i915_request_in, TP_PROTO(struct i915_request *rq, unsigned int port), TP_ARGS(rq, port), @@ -728,16 +653,6 @@ TRACE_EVENT(i915_request_out, #else #if !defined(TRACE_HEADER_MULTI_READ) -static inline void 
-trace_i915_request_submit(struct i915_request *rq) -{ -} - -static inline void -trace_i915_request_execute(struct i915_request *rq) -{ -} - static inline void trace_i915_request_in(struct i915_request *rq, unsigned int port) { @@ -775,55 +690,6 @@ TRACE_EVENT(intel_engine_notify, __entry->seqno, __entry->waiters) ); -DEFINE_EVENT(i915_request, i915_request_retire, - TP_PROTO(struct i915_request *rq), - TP_ARGS(rq) -); - -TRACE_EVENT(i915_request_wait_begin, - TP_PROTO(struct i915_request *rq, unsigned int flags), - TP_ARGS(rq, flags), - - TP_STRUCT__entry( - __field(u32, dev) - __field(u32, hw_id) - __field(u64, ctx) - __field(u16, class) - __field(u16, instance) - __field(u32, seqno) - __field(u32, global) - __field(unsigned int, flags) - ), - - /* NB: the blocking information is racy since mutex_is_locked - * doesn't check that the current thread holds the lock. The only - * other option would be to pass the boolean information of whether - * or not the class was blocking down through the stack which is - * less desirable. 
- */ - TP_fast_assign( - __entry->dev = rq->i915->drm.primary->index; - __entry->hw_id = rq->gem_context->hw_id; - __entry->class = rq->engine->uabi_class; - __entry->instance = rq->engine->instance; - __entry->ctx = rq->fence.context; - __entry->seqno = rq->fence.seqno; - __entry->global = rq->global_seqno; - __entry->flags = flags; - ), - - TP_printk("dev=%u, engine=%u:%u, hw_id=%u, ctx=%llu, seqno=%u, global=%u, blocking=%u, flags=0x%x", - __entry->dev, __entry->class, __entry->instance, - __entry->hw_id, __entry->ctx, __entry->seqno, - __entry->global, !!(__entry->flags & I915_WAIT_LOCKED), - __entry->flags) -); - -DEFINE_EVENT(i915_request, i915_request_wait_end, - TP_PROTO(struct i915_request *rq), - TP_ARGS(rq) -); - TRACE_EVENT_CONDITION(i915_reg_rw, TP_PROTO(bool write, i915_reg_t reg, u64 val, int len, bool trace), diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c index ab1c49b106f2..dda20ddf0418 100644 --- a/drivers/gpu/drm/i915/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/intel_guc_submission.c @@ -535,6 +535,14 @@ static void guc_add_request(struct intel_guc *guc, struct i915_request *rq) client->submissions[engine->id] += 1; spin_unlock(&client->wq_lock); + + /* + * XXX move to HW synthesis: + * - emit SRM(RCS_TIMESTAMP) to per-context journal + * - poll (maybe interrupt) journal from kthread, generating tracepoint + * - notify callback for dma_fence tracepoint register + */ + trace_dma_fence_execute_start(&rq->fence, i915_trace_hwid(rq->engine)); } /* @@ -810,6 +818,8 @@ static void guc_submission_tasklet(unsigned long data) rq = port_request(port); while (rq && i915_request_completed(rq)) { + trace_dma_fence_execute_end(&rq->fence, + i915_trace_hwid(rq->engine)); trace_i915_request_out(rq); i915_request_put(rq); diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c index c0a42afaf177..3fc485925ab1 100644 --- a/drivers/gpu/drm/i915/intel_lrc.c +++ 
b/drivers/gpu/drm/i915/intel_lrc.c @@ -133,6 +133,8 @@ */ #include <linux/interrupt.h> +#include <trace/events/dma_fence.h> + #include <drm/i915_drm.h> #include "i915_drv.h" #include "i915_gem_render_state.h" @@ -351,11 +353,15 @@ execlists_context_schedule_in(struct i915_request *rq) execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); intel_engine_context_in(rq->engine); rq->hw_context->active = rq->engine; + + trace_dma_fence_execute_start(&rq->fence, i915_trace_hwid(rq->engine)); } static inline void execlists_context_schedule_out(struct i915_request *rq, unsigned long status) { + trace_dma_fence_execute_end(&rq->fence, i915_trace_hwid(rq->engine)); + rq->hw_context->active = NULL; intel_engine_context_out(rq->engine); execlists_context_status_change(rq, status); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index c3ef0f9bf321..0fd7fb7fd2cb 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -579,6 +579,8 @@ struct intel_engine_cs { } stats; }; +#define i915_trace_hwid(e) ((e)->uabi_class << 16 | (e)->instance) + static inline bool intel_engine_needs_cmd_parser(const struct intel_engine_cs *engine) { diff --git a/include/trace/events/dma_fence.h b/include/trace/events/dma_fence.h index 2212adda8f77..75d0de3ccc4a 100644 --- a/include/trace/events/dma_fence.h +++ b/include/trace/events/dma_fence.h @@ -9,12 +9,119 @@ struct dma_fence; +/* + * dma-fence tracing *or* How to debug fences + * + * The dma-fence tracing provides insight into the user <-> HW execution + * flow, although dma-fence is not tied to HW and may be used to coordinate + * any execution flow. A dma-fence represents a job along a timeline (the fence + * context), and when the job is complete it is signaled. 
However, for the + * purposes of visualisation of HW execution, we need a little more information + * than that as we need to not only know when the job became ready for execution + * (was passed into the HW queue) but ideally when and on which HW engine that + * job was eventually scheduled. For CPU bound execution flows, similarly + * knowing on which CPU the job was scheduled can be vital information for + * debugging. + * + * The typical flow of events from user to HW for a dma-fence would be: + * + * 1. dma_fence_init + * 2. dma_fence_await (optional) + * - records the dependencies between fences that must be signaled + * before this fence is ready for execution; an asynchronous wait + * 3. dma_fence_emit + * - the fence is ready for execution and passed to the execution queue + * (the user to HW/FW/backend transition) + * 4. dma_fence_execute_start (optional) + * - records the start of execution on the backend (HW) and includes + * a tag to uniquely identify the backend engine so that concurrent + * execution can be traced + * - may only be emitted for the first fence in a context to begin + * execution + * 5. dma_fence_execute_end (optional) + * - records the corresponding completion point of backend execution + * - may only be emitted for the last fence in a context to finish + * execution + * 6. dma_fence_signaled + * - records when the fence was marked as completed and the result + * propagated to the various waiters + * 7. dma_fence_destroy + * + * Note that not all fences are used in conjunction with HW engines, and + * so may exclude the execution tracing. Nor do they all correspond with + * client API, although many may be used as background tasks required + * before HW execution. + * + * The flow of events from HW to user would be: + * + * 1. dma_fence_wait_start + * 2. dma_fence_enable_signal (optional) + * 3. dma_fence_signaled + * 4.
dma_fence_wait_end + * + * For improved visualisation, dma_fence_context_create and + * dma_fence_context_destroy are used to couple the context id to a string. + */ + +TRACE_EVENT(dma_fence_context_create, + TP_PROTO(u64 context, const char *driver, const char *timeline), + TP_ARGS(context, driver, timeline), + + TP_STRUCT__entry( + __field(u64, context) + __string(driver, driver) + __string(timeline, timeline) + ), + + TP_fast_assign( + __entry->context = context; + __assign_str(driver, driver) + __assign_str(timeline, timeline) + ), + + TP_printk("context=%llu, driver=%s, timeline=%s", + __entry->context, __get_str(driver), __get_str(timeline)) +); + +TRACE_EVENT(dma_fence_context_destroy, + TP_PROTO(u64 context), + TP_ARGS(context), + + TP_STRUCT__entry( + __field(u64, context) + ), + + TP_fast_assign( + __entry->context = context; + ), + + TP_printk("context=%llu", __entry->context) +); + DECLARE_EVENT_CLASS(dma_fence, TP_PROTO(struct dma_fence *fence), TP_ARGS(fence), + TP_STRUCT__entry( + __field(u64, context) + __field(u64, seqno) + ), + + TP_fast_assign( + __entry->context = fence->context; + __entry->seqno = fence->seqno; + ), + + TP_printk("context=%llu, seqno=%llu", __entry->context, __entry->seqno) +); + +TRACE_EVENT(dma_fence_init, + TP_PROTO(struct dma_fence *fence), + + TP_ARGS(fence), + TP_STRUCT__entry( __string(driver, fence->ops->get_driver_name(fence)) __string(timeline, fence->ops->get_timeline_name(fence)) @@ -41,13 +148,6 @@ DEFINE_EVENT(dma_fence, dma_fence_emit, TP_ARGS(fence) ); -DEFINE_EVENT(dma_fence, dma_fence_init, - - TP_PROTO(struct dma_fence *fence), - - TP_ARGS(fence) -); - DEFINE_EVENT(dma_fence, dma_fence_destroy, TP_PROTO(struct dma_fence *fence), @@ -83,6 +183,69 @@ DEFINE_EVENT(dma_fence, dma_fence_wait_end, TP_ARGS(fence) ); +TRACE_EVENT(dma_fence_await, + TP_PROTO(struct dma_fence *wait, struct dma_fence *signal), + TP_ARGS(wait, signal), + + TP_STRUCT__entry( + __field(u64, wait_context) + __field(u64, wait_seqno) + 
__field(u64, signal_context) + __field(u64, signal_seqno) + ), + + TP_fast_assign( + __entry->wait_context = wait->context; + __entry->wait_seqno = wait->seqno; + __entry->signal_context = signal->context; + __entry->signal_seqno = signal->seqno; + ), + + TP_printk("wait_context=%llu, wait_seqno=%llu, signal_context=%llu, signal_seqno=%llu", + __entry->wait_context, __entry->wait_seqno, + __entry->signal_context, __entry->signal_seqno) +); + +TRACE_EVENT(dma_fence_execute_start, + TP_PROTO(struct dma_fence *fence, u64 hwid), + TP_ARGS(fence, hwid), + + TP_STRUCT__entry( + __field(u64, context) + __field(u64, seqno) + __field(u64, hwid) + ), + + TP_fast_assign( + __entry->context = fence->context; + __entry->seqno = fence->seqno; + __entry->hwid = hwid; + ), + + TP_printk("context=%llu, seqno=%llu, hwid=%llu", + __entry->context, __entry->seqno, __entry->hwid) +); + +TRACE_EVENT(dma_fence_execute_end, + TP_PROTO(struct dma_fence *fence, u64 hwid), + TP_ARGS(fence, hwid), + + TP_STRUCT__entry( + __field(u64, context) + __field(u64, seqno) + __field(u64, hwid) + ), + + TP_fast_assign( + __entry->context = fence->context; + __entry->seqno = fence->seqno; + __entry->hwid = hwid; + ), + + TP_printk("context=%llu, seqno=%llu, hwid=%llu", + __entry->context, __entry->seqno, __entry->hwid) +); + #endif /* _TRACE_DMA_FENCE_H */ /* This part must be outside protection */ -- 2.20.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 12+ messages in thread
* ✗ Fi.CI.CHECKPATCH: warning for dma-buf: Enhance dma-fence tracing 2019-01-21 23:20 [PATCH] dma-buf: Enhance dma-fence tracing Chris Wilson @ 2019-01-21 23:47 ` Patchwork 2019-01-22 0:06 ` ✓ Fi.CI.BAT: success " Patchwork ` (3 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: Patchwork @ 2019-01-21 23:47 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx == Series Details == Series: dma-buf: Enhance dma-fence tracing URL : https://patchwork.freedesktop.org/series/55531/ State : warning == Summary == $ dim checkpatch origin/drm-tip e620fc0f05f3 dma-buf: Enhance dma-fence tracing -:461: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'e' - possible side-effects? #461: FILE: drivers/gpu/drm/i915/intel_ringbuffer.h:582: +#define i915_trace_hwid(e) ((e)->uabi_class << 16 | (e)->instance) -:532: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #532: FILE: include/trace/events/dma_fence.h:70: + TP_STRUCT__entry( -:538: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #538: FILE: include/trace/events/dma_fence.h:76: + TP_fast_assign( -:552: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #552: FILE: include/trace/events/dma_fence.h:90: + TP_STRUCT__entry( -:556: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #556: FILE: include/trace/events/dma_fence.h:94: + TP_fast_assign( -:569: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #569: FILE: include/trace/events/dma_fence.h:107: + TP_STRUCT__entry( -:574: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #574: FILE: include/trace/events/dma_fence.h:112: + TP_fast_assign( -:583: CHECK:PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis #583: FILE: include/trace/events/dma_fence.h:121: +TRACE_EVENT(dma_fence_init, + TP_PROTO(struct dma_fence *fence), -:612: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #612: FILE: include/trace/events/dma_fence.h:190: + TP_STRUCT__entry( -:619: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #619: FILE: 
include/trace/events/dma_fence.h:197: + TP_fast_assign( -:635: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #635: FILE: include/trace/events/dma_fence.h:213: + TP_STRUCT__entry( -:641: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #641: FILE: include/trace/events/dma_fence.h:219: + TP_fast_assign( -:655: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #655: FILE: include/trace/events/dma_fence.h:233: + TP_STRUCT__entry( -:661: CHECK:OPEN_ENDED_LINE: Lines should not end with a '(' #661: FILE: include/trace/events/dma_fence.h:239: + TP_fast_assign( total: 0 errors, 0 warnings, 14 checks, 546 lines checked _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* ✓ Fi.CI.BAT: success for dma-buf: Enhance dma-fence tracing 2019-01-21 23:20 [PATCH] dma-buf: Enhance dma-fence tracing Chris Wilson 2019-01-21 23:47 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork @ 2019-01-22 0:06 ` Patchwork 2019-01-22 0:55 ` ✓ Fi.CI.IGT: " Patchwork ` (2 subsequent siblings) 4 siblings, 0 replies; 12+ messages in thread From: Patchwork @ 2019-01-22 0:06 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx == Series Details == Series: dma-buf: Enhance dma-fence tracing URL : https://patchwork.freedesktop.org/series/55531/ State : success == Summary == CI Bug Log - changes from CI_DRM_5459 -> Patchwork_12000 ==================================================== Summary ------- **SUCCESS** No regressions found. External URL: https://patchwork.freedesktop.org/api/1.0/series/55531/revisions/1/mbox/ Known issues ------------ Here are the changes found in Patchwork_12000 that come from known issues: ### IGT changes ### #### Issues hit #### * igt@kms_chamelium@hdmi-hpd-fast: - fi-kbl-7500u: PASS -> FAIL [fdo#108767] * igt@kms_pipe_crc_basic@hang-read-crc-pipe-a: - fi-byt-clapper: PASS -> FAIL [fdo#103191] / [fdo#107362] * igt@kms_pipe_crc_basic@suspend-read-crc-pipe-b: - fi-blb-e6850: PASS -> INCOMPLETE [fdo#107718] #### Possible fixes #### * igt@i915_selftest@live_hangcheck: - fi-bwr-2160: DMESG-FAIL [fdo#108735] -> PASS * igt@kms_frontbuffer_tracking@basic: - {fi-icl-u2}: FAIL [fdo#103167] -> PASS * igt@pm_rpm@module-reload: - {fi-icl-u2}: DMESG-WARN [fdo#108654] -> PASS {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). 
[fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167 [fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191 [fdo#107362]: https://bugs.freedesktop.org/show_bug.cgi?id=107362 [fdo#107718]: https://bugs.freedesktop.org/show_bug.cgi?id=107718 [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569 [fdo#108654]: https://bugs.freedesktop.org/show_bug.cgi?id=108654 [fdo#108735]: https://bugs.freedesktop.org/show_bug.cgi?id=108735 [fdo#108767]: https://bugs.freedesktop.org/show_bug.cgi?id=108767 [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [fdo#109315]: https://bugs.freedesktop.org/show_bug.cgi?id=109315 Participating hosts (47 -> 41) ------------------------------ Additional (1): fi-glk-j4005 Missing (7): fi-kbl-soraka fi-ilk-m540 fi-hsw-peppy fi-byt-squawks fi-bsw-cyan fi-pnv-d510 fi-bdw-samus Build changes ------------- * Linux: CI_DRM_5459 -> Patchwork_12000 CI_DRM_5459: 0f693a275dd91391b476ada7481cf08f4fe610aa @ git://anongit.freedesktop.org/gfx-ci/linux IGT_4780: 1c1612bdc36b44a704095e7b0ba5542818ce793f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_12000: e620fc0f05f3e1ebf6b1fb1488e7d2fb38930404 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == e620fc0f05f3 dma-buf: Enhance dma-fence tracing == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12000/ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* ✓ Fi.CI.IGT: success for dma-buf: Enhance dma-fence tracing 2019-01-21 23:20 [PATCH] dma-buf: Enhance dma-fence tracing Chris Wilson 2019-01-21 23:47 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork 2019-01-22 0:06 ` ✓ Fi.CI.BAT: success " Patchwork @ 2019-01-22 0:55 ` Patchwork 2019-01-22 8:49 ` [PATCH] " Koenig, Christian 2019-01-29 1:52 ` Michael Sartain 4 siblings, 0 replies; 12+ messages in thread From: Patchwork @ 2019-01-22 0:55 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx == Series Details == Series: dma-buf: Enhance dma-fence tracing URL : https://patchwork.freedesktop.org/series/55531/ State : success == Summary == CI Bug Log - changes from CI_DRM_5459_full -> Patchwork_12000_full ==================================================== Summary ------- **SUCCESS** No regressions found. Known issues ------------ Here are the changes found in Patchwork_12000_full that come from known issues: ### IGT changes ### #### Issues hit #### * igt@gem_exec_schedule@pi-ringfull-blt: - shard-apl: NOTRUN -> FAIL [fdo#103158] * igt@kms_content_protection@legacy: - shard-apl: NOTRUN -> FAIL [fdo#108597] * igt@kms_cursor_crc@cursor-128x128-onscreen: - shard-apl: PASS -> FAIL [fdo#103232] * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-mmap-wc: - shard-glk: PASS -> FAIL [fdo#103167] * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-pwrite: - shard-apl: PASS -> FAIL [fdo#103167] * igt@kms_plane_multiple@atomic-pipe-b-tiling-none: - shard-apl: PASS -> FAIL [fdo#103166] * igt@kms_plane_multiple@atomic-pipe-c-tiling-y: - shard-glk: PASS -> FAIL [fdo#103166] #### Possible fixes #### * igt@kms_busy@extended-pageflip-hang-oldfb-render-b: - shard-snb: {SKIP} [fdo#109271] / [fdo#109278] -> PASS * igt@kms_ccs@pipe-b-crc-sprite-planes-basic: - shard-glk: FAIL [fdo#108145] -> PASS * igt@kms_cursor_crc@cursor-256x85-sliding: - shard-apl: FAIL [fdo#103232] -> PASS +1 * igt@kms_draw_crc@draw-method-xrgb8888-mmap-gtt-untiled: - shard-snb: {SKIP} [fdo#109271] 
-> PASS +4 * igt@kms_plane@pixel-format-pipe-c-planes: - shard-apl: FAIL [fdo#103166] -> PASS * igt@kms_plane@plane-position-covered-pipe-c-planes: - shard-glk: FAIL [fdo#103166] -> PASS * igt@kms_rotation_crc@multiplane-rotation: - shard-kbl: FAIL -> PASS * igt@kms_setmode@basic: - shard-apl: INCOMPLETE [fdo#103927] -> PASS {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). [fdo#103158]: https://bugs.freedesktop.org/show_bug.cgi?id=103158 [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166 [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167 [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232 [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927 [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145 [fdo#108597]: https://bugs.freedesktop.org/show_bug.cgi?id=108597 [fdo#109271]: https://bugs.freedesktop.org/show_bug.cgi?id=109271 [fdo#109278]: https://bugs.freedesktop.org/show_bug.cgi?id=109278 Participating hosts (7 -> 5) ------------------------------ Missing (2): shard-skl shard-iclb Build changes ------------- * Linux: CI_DRM_5459 -> Patchwork_12000 CI_DRM_5459: 0f693a275dd91391b476ada7481cf08f4fe610aa @ git://anongit.freedesktop.org/gfx-ci/linux IGT_4780: 1c1612bdc36b44a704095e7b0ba5542818ce793f @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_12000: e620fc0f05f3e1ebf6b1fb1488e7d2fb38930404 @ git://anongit.freedesktop.org/gfx-ci/linux piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_12000/ _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-21 23:20 [PATCH] dma-buf: Enhance dma-fence tracing Chris Wilson ` (2 preceding siblings ...) 2019-01-22 0:55 ` ✓ Fi.CI.IGT: " Patchwork @ 2019-01-22 8:49 ` Koenig, Christian 2019-01-22 9:05 ` Chris Wilson 2019-01-29 1:52 ` Michael Sartain 4 siblings, 1 reply; 12+ messages in thread From: Koenig, Christian @ 2019-01-22 8:49 UTC (permalink / raw) To: Chris Wilson, dri-devel Cc: Michael Sartain, intel-gfx, Steven Rostedt, Alex Deucher, Pierre-Loup Griffais Am 22.01.19 um 00:20 schrieb Chris Wilson: > Rather than every backend and GPU driver reinventing the same wheel for > user level debugging of HW execution, the common dma-fence framework > should include the tracing infrastructure required for most client API > level flow visualisation. > > With these common dma-fence level tracepoints, the userspace tools can > establish a detailed view of the client <-> HW flow across different > kernels. There is a strong ask to have this available, so that the > userspace developer can effectively assess if they're doing a good job > about feeding the beast of a GPU hardware. > > In the case of needing to look into more fine-grained details of how > kernel internals work towards the goal of feeding the beast, the tools > may optionally amend the dma-fence tracing information with the driver > implementation specific. But for such cases, the tools should have a > graceful degradation in case the expected extra tracepoints have > changed or their format differs from the expected, as the kernel > implementation internals are not expected to stay the same. > > It is important to distinguish between tracing for the purpose of client > flow visualisation and tracing for the purpose of low-level kernel > debugging. The latter is highly implementation specific, tied to > a particular HW and driver, whereas the former addresses a common goal > of user level tracing and likely a common set of userspace tools. 
> Having made the distinction that these tracepoints will be consumed for > client API tooling, we raise the spectre of tracepoint ABI stability. It > is hoped that by defining a common set of dma-fence tracepoints, we avoid > the pitfall of exposing low level details and so restrict ourselves only > to the high level flow that is applicable to all drivers and hardware. > Thus the reserved guarantee that this set of tracepoints will be stable > (with the emphasis on depicting client <-> HW flow as opposed to > driver <-> HW). > > In terms of specific changes to the dma-fence tracing, we remove the > emission of the strings for every tracepoint (reserving them for > dma_fence_init for cases where they have unique dma_fence_ops, and > preferring to have descriptors for the whole fence context). strings do > not pack as well into the ftrace ringbuffer and we would prefer to > reduce the amount of indirect callbacks required for frequent tracepoint > emission. > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > Cc: Alex Deucher <alexdeucher@gmail.com> > Cc: "Christian König" <christian.koenig@amd.com> > Cc: Eric Anholt <eric@anholt.net> > Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> > Cc: Michael Sartain <mikesart@fastmail.com> > Cc: Steven Rostedt <rostedt@goodmis.org> In general yes please! If possible please separate out the changes to the common dma_fence infrastructure from the i915 changes. One thing I'm wondering is why the enable_signaling trace point doesn't need to be exported any more. Is that only used internally in the common infrastructure? Apart from that I'm on sick leave today, so give me at least a few days to recover and take a closer look. Thanks, Christian. 
> --- > drivers/dma-buf/dma-fence.c | 9 +- > drivers/gpu/drm/i915/i915_gem_clflush.c | 5 + > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 1 - > drivers/gpu/drm/i915/i915_request.c | 16 +- > drivers/gpu/drm/i915/i915_timeline.c | 5 + > drivers/gpu/drm/i915/i915_trace.h | 134 --------------- > drivers/gpu/drm/i915/intel_guc_submission.c | 10 ++ > drivers/gpu/drm/i915/intel_lrc.c | 6 + > drivers/gpu/drm/i915/intel_ringbuffer.h | 2 + > include/trace/events/dma_fence.h | 177 +++++++++++++++++++- > 10 files changed, 214 insertions(+), 151 deletions(-) > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index 3aa8733f832a..5c93ed34b1ff 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -27,8 +27,15 @@ > #define CREATE_TRACE_POINTS > #include <trace/events/dma_fence.h> > > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_context_create); > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_context_destroy); > + > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_await); > EXPORT_TRACEPOINT_SYMBOL(dma_fence_emit); > -EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_execute_start); > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_execute_end); > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_wait_start); > +EXPORT_TRACEPOINT_SYMBOL(dma_fence_wait_end); > > static DEFINE_SPINLOCK(dma_fence_stub_lock); > static struct dma_fence dma_fence_stub; > diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/i915_gem_clflush.c > index 8e74c23cbd91..435c1303ecc8 100644 > --- a/drivers/gpu/drm/i915/i915_gem_clflush.c > +++ b/drivers/gpu/drm/i915/i915_gem_clflush.c > @@ -22,6 +22,8 @@ > * > */ > > +#include <trace/events/dma_fence.h> > + > #include "i915_drv.h" > #include "intel_frontbuffer.h" > #include "i915_gem_clflush.h" > @@ -73,6 +75,7 @@ static void i915_clflush_work(struct work_struct *work) > struct clflush *clflush = container_of(work, typeof(*clflush), work); > struct drm_i915_gem_object *obj = clflush->obj; > > + 
trace_dma_fence_execute_start(&clflush->dma, smp_processor_id()); > if (i915_gem_object_pin_pages(obj)) { > DRM_ERROR("Failed to acquire obj->pages for clflushing\n"); > goto out; > @@ -83,6 +86,7 @@ static void i915_clflush_work(struct work_struct *work) > i915_gem_object_unpin_pages(obj); > > out: > + trace_dma_fence_execute_end(&clflush->dma, smp_processor_id()); > i915_gem_object_put(obj); > > dma_fence_signal(&clflush->dma); > @@ -97,6 +101,7 @@ i915_clflush_notify(struct i915_sw_fence *fence, > > switch (state) { > case FENCE_COMPLETE: > + trace_dma_fence_emit(&clflush->dma); > schedule_work(&clflush->work); > break; > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index f250109e1f66..f10a9e675e2c 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -2397,7 +2397,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, > */ > eb.request->batch = eb.batch; > > - trace_i915_request_queue(eb.request, eb.batch_flags); > err = eb_submit(&eb); > err_request: > i915_request_add(eb.request); > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > index 5e178f5ac18b..d5bbc8cb8306 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -28,6 +28,8 @@ > #include <linux/sched/clock.h> > #include <linux/sched/signal.h> > > +#include <trace/events/dma_fence.h> > + > #include "i915_drv.h" > #include "i915_reset.h" > > @@ -254,8 +256,6 @@ static void i915_request_retire(struct i915_request *request) > GEM_BUG_ON(!i915_sw_fence_signaled(&request->submit)); > GEM_BUG_ON(!i915_request_completed(request)); > > - trace_i915_request_retire(request); > - > advance_ring(request); > free_capture_list(request); > > @@ -383,8 +383,6 @@ void __i915_request_submit(struct i915_request *request) > /* Transfer from per-context onto the global per-engine timeline */ > move_to_timeline(request, 
&engine->timeline); > > - trace_i915_request_execute(request); > - > wake_up_all(&request->execute); > } > > @@ -463,7 +461,8 @@ submit_notify(struct i915_sw_fence *fence, enum i915_sw_fence_notify state) > > switch (state) { > case FENCE_COMPLETE: > - trace_i915_request_submit(request); > + trace_dma_fence_emit(&request->fence); > + > /* > * We need to serialize use of the submit_request() callback > * with its hotplugging performed during an emergency > @@ -770,6 +769,8 @@ i915_request_await_dma_fence(struct i915_request *rq, struct dma_fence *fence) > if (ret < 0) > return ret; > > + trace_dma_fence_await(&rq->fence, fence); > + > /* Record the latest fence used against each timeline */ > if (fence->context != rq->i915->mm.unordered_timeline) > i915_timeline_sync_set(rq->timeline, fence); > @@ -878,7 +879,6 @@ void i915_request_add(struct i915_request *request) > engine->name, request->fence.context, request->fence.seqno); > > lockdep_assert_held(&request->i915->drm.struct_mutex); > - trace_i915_request_add(request); > > /* > * Make sure that no request gazumped us - if it was allocated after > @@ -1139,7 +1139,7 @@ long i915_request_wait(struct i915_request *rq, > if (!timeout) > return -ETIME; > > - trace_i915_request_wait_begin(rq, flags); > + trace_dma_fence_wait_start(&rq->fence); > > add_wait_queue(&rq->execute, &exec); > if (flags & I915_WAIT_LOCKED) > @@ -1245,7 +1245,7 @@ long i915_request_wait(struct i915_request *rq, > if (flags & I915_WAIT_LOCKED) > remove_wait_queue(errq, &reset); > remove_wait_queue(&rq->execute, &exec); > - trace_i915_request_wait_end(rq); > + trace_dma_fence_wait_end(&rq->fence); > > return timeout; > } > diff --git a/drivers/gpu/drm/i915/i915_timeline.c b/drivers/gpu/drm/i915/i915_timeline.c > index 4667cc08c416..a83ae0b59b1f 100644 > --- a/drivers/gpu/drm/i915/i915_timeline.c > +++ b/drivers/gpu/drm/i915/i915_timeline.c > @@ -4,6 +4,8 @@ > * Copyright © 2016-2018 Intel Corporation > */ > > +#include <trace/events/dma_fence.h> > 
+ > #include "i915_drv.h" > > #include "i915_timeline.h" > @@ -30,6 +32,7 @@ void i915_timeline_init(struct drm_i915_private *i915, > /* Called during early_init before we know how many engines there are */ > > timeline->fence_context = dma_fence_context_alloc(1); > + trace_dma_fence_context_create(timeline->fence_context, "i915", name); > > spin_lock_init(&timeline->lock); > > @@ -70,6 +73,8 @@ void i915_timeline_fini(struct i915_timeline *timeline) > { > GEM_BUG_ON(!list_empty(&timeline->requests)); > > + trace_dma_fence_context_destroy(timeline->fence_context); > + > i915_syncmap_free(&timeline->sync); > > list_del(&timeline->link); > diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h > index 33d90eca9cdd..538b6efb5a56 100644 > --- a/drivers/gpu/drm/i915/i915_trace.h > +++ b/drivers/gpu/drm/i915/i915_trace.h > @@ -584,82 +584,7 @@ TRACE_EVENT(i915_gem_evict_vm, > TP_printk("dev=%d, vm=%p", __entry->dev, __entry->vm) > ); > > -TRACE_EVENT(i915_request_queue, > - TP_PROTO(struct i915_request *rq, u32 flags), > - TP_ARGS(rq, flags), > - > - TP_STRUCT__entry( > - __field(u32, dev) > - __field(u32, hw_id) > - __field(u64, ctx) > - __field(u16, class) > - __field(u16, instance) > - __field(u32, seqno) > - __field(u32, flags) > - ), > - > - TP_fast_assign( > - __entry->dev = rq->i915->drm.primary->index; > - __entry->hw_id = rq->gem_context->hw_id; > - __entry->class = rq->engine->uabi_class; > - __entry->instance = rq->engine->instance; > - __entry->ctx = rq->fence.context; > - __entry->seqno = rq->fence.seqno; > - __entry->flags = flags; > - ), > - > - TP_printk("dev=%u, engine=%u:%u, hw_id=%u, ctx=%llu, seqno=%u, flags=0x%x", > - __entry->dev, __entry->class, __entry->instance, > - __entry->hw_id, __entry->ctx, __entry->seqno, > - __entry->flags) > -); > - > -DECLARE_EVENT_CLASS(i915_request, > - TP_PROTO(struct i915_request *rq), > - TP_ARGS(rq), > - > - TP_STRUCT__entry( > - __field(u32, dev) > - __field(u32, hw_id) > - __field(u64, 
ctx) > - __field(u16, class) > - __field(u16, instance) > - __field(u32, seqno) > - __field(u32, global) > - ), > - > - TP_fast_assign( > - __entry->dev = rq->i915->drm.primary->index; > - __entry->hw_id = rq->gem_context->hw_id; > - __entry->class = rq->engine->uabi_class; > - __entry->instance = rq->engine->instance; > - __entry->ctx = rq->fence.context; > - __entry->seqno = rq->fence.seqno; > - __entry->global = rq->global_seqno; > - ), > - > - TP_printk("dev=%u, engine=%u:%u, hw_id=%u, ctx=%llu, seqno=%u, global=%u", > - __entry->dev, __entry->class, __entry->instance, > - __entry->hw_id, __entry->ctx, __entry->seqno, > - __entry->global) > -); > - > -DEFINE_EVENT(i915_request, i915_request_add, > - TP_PROTO(struct i915_request *rq), > - TP_ARGS(rq) > -); > - > #if defined(CONFIG_DRM_I915_LOW_LEVEL_TRACEPOINTS) > -DEFINE_EVENT(i915_request, i915_request_submit, > - TP_PROTO(struct i915_request *rq), > - TP_ARGS(rq) > -); > - > -DEFINE_EVENT(i915_request, i915_request_execute, > - TP_PROTO(struct i915_request *rq), > - TP_ARGS(rq) > -); > - > TRACE_EVENT(i915_request_in, > TP_PROTO(struct i915_request *rq, unsigned int port), > TP_ARGS(rq, port), > @@ -728,16 +653,6 @@ TRACE_EVENT(i915_request_out, > > #else > #if !defined(TRACE_HEADER_MULTI_READ) > -static inline void > -trace_i915_request_submit(struct i915_request *rq) > -{ > -} > - > -static inline void > -trace_i915_request_execute(struct i915_request *rq) > -{ > -} > - > static inline void > trace_i915_request_in(struct i915_request *rq, unsigned int port) > { > @@ -775,55 +690,6 @@ TRACE_EVENT(intel_engine_notify, > __entry->seqno, __entry->waiters) > ); > > -DEFINE_EVENT(i915_request, i915_request_retire, > - TP_PROTO(struct i915_request *rq), > - TP_ARGS(rq) > -); > - > -TRACE_EVENT(i915_request_wait_begin, > - TP_PROTO(struct i915_request *rq, unsigned int flags), > - TP_ARGS(rq, flags), > - > - TP_STRUCT__entry( > - __field(u32, dev) > - __field(u32, hw_id) > - __field(u64, ctx) > - __field(u16, 
class) > - __field(u16, instance) > - __field(u32, seqno) > - __field(u32, global) > - __field(unsigned int, flags) > - ), > - > - /* NB: the blocking information is racy since mutex_is_locked > - * doesn't check that the current thread holds the lock. The only > - * other option would be to pass the boolean information of whether > - * or not the class was blocking down through the stack which is > - * less desirable. > - */ > - TP_fast_assign( > - __entry->dev = rq->i915->drm.primary->index; > - __entry->hw_id = rq->gem_context->hw_id; > - __entry->class = rq->engine->uabi_class; > - __entry->instance = rq->engine->instance; > - __entry->ctx = rq->fence.context; > - __entry->seqno = rq->fence.seqno; > - __entry->global = rq->global_seqno; > - __entry->flags = flags; > - ), > - > - TP_printk("dev=%u, engine=%u:%u, hw_id=%u, ctx=%llu, seqno=%u, global=%u, blocking=%u, flags=0x%x", > - __entry->dev, __entry->class, __entry->instance, > - __entry->hw_id, __entry->ctx, __entry->seqno, > - __entry->global, !!(__entry->flags & I915_WAIT_LOCKED), > - __entry->flags) > -); > - > -DEFINE_EVENT(i915_request, i915_request_wait_end, > - TP_PROTO(struct i915_request *rq), > - TP_ARGS(rq) > -); > - > TRACE_EVENT_CONDITION(i915_reg_rw, > TP_PROTO(bool write, i915_reg_t reg, u64 val, int len, bool trace), > > diff --git a/drivers/gpu/drm/i915/intel_guc_submission.c b/drivers/gpu/drm/i915/intel_guc_submission.c > index ab1c49b106f2..dda20ddf0418 100644 > --- a/drivers/gpu/drm/i915/intel_guc_submission.c > +++ b/drivers/gpu/drm/i915/intel_guc_submission.c > @@ -535,6 +535,14 @@ static void guc_add_request(struct intel_guc *guc, struct i915_request *rq) > client->submissions[engine->id] += 1; > > spin_unlock(&client->wq_lock); > + > + /* > + * XXX move to HW synthesis: > + * - emit SRM(RCS_TIMESTAMP) to per-context journal > + * - poll (maybe interrupt) journal from kthread, generating tracepoint > + * - notify callback for dma_fence tracepoint register > + */ > + 
trace_dma_fence_execute_start(&rq->fence, i915_trace_hwid(rq->engine)); > } > > /* > @@ -810,6 +818,8 @@ static void guc_submission_tasklet(unsigned long data) > > rq = port_request(port); > while (rq && i915_request_completed(rq)) { > + trace_dma_fence_execute_end(&rq->fence, > + i915_trace_hwid(rq->engine)); > trace_i915_request_out(rq); > i915_request_put(rq); > > diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c > index c0a42afaf177..3fc485925ab1 100644 > --- a/drivers/gpu/drm/i915/intel_lrc.c > +++ b/drivers/gpu/drm/i915/intel_lrc.c > @@ -133,6 +133,8 @@ > */ > #include <linux/interrupt.h> > > +#include <trace/events/dma_fence.h> > + > #include <drm/i915_drm.h> > #include "i915_drv.h" > #include "i915_gem_render_state.h" > @@ -351,11 +353,15 @@ execlists_context_schedule_in(struct i915_request *rq) > execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN); > intel_engine_context_in(rq->engine); > rq->hw_context->active = rq->engine; > + > + trace_dma_fence_execute_start(&rq->fence, i915_trace_hwid(rq->engine)); > } > > static inline void > execlists_context_schedule_out(struct i915_request *rq, unsigned long status) > { > + trace_dma_fence_execute_end(&rq->fence, i915_trace_hwid(rq->engine)); > + > rq->hw_context->active = NULL; > intel_engine_context_out(rq->engine); > execlists_context_status_change(rq, status); > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h > index c3ef0f9bf321..0fd7fb7fd2cb 100644 > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h > @@ -579,6 +579,8 @@ struct intel_engine_cs { > } stats; > }; > > +#define i915_trace_hwid(e) ((e)->uabi_class << 16 | (e)->instance) > + > static inline bool > intel_engine_needs_cmd_parser(const struct intel_engine_cs *engine) > { > diff --git a/include/trace/events/dma_fence.h b/include/trace/events/dma_fence.h > index 2212adda8f77..75d0de3ccc4a 100644 > --- 
a/include/trace/events/dma_fence.h > +++ b/include/trace/events/dma_fence.h > @@ -9,12 +9,119 @@ > > struct dma_fence; > > +/* > + * dma-fence tracing *or* How to debug fences > + * > + * The dma-fence tracing provides insight into the user <-> HW execution > + * flow, although dma-fence is not tied to HW and may be used to coordinate > + * any execution flow. A dma-fence represents a job along a timeline (the fence > + * context), and when the job is complete it is signaled. However, for the > + * purposes of visualisation of HW execution, we need a little more information > + * than that as we need to not only know when the job became ready for execution > + * (was passed into the HW queue) but ideally when and on which HW engine that > + * job was eventually scheduled. For CPU bound execution flows, similarly > + * knowing on which CPU the job was scheduled can be vital information for > + * debugging. > + * > + * The typical flow of events from user to HW for a dma-fence would be: > + * > + * 1. dma_fence_init > + * 2. dma_fence_await (optional) > + * - records the dependencies between fences that must be signaled > + * before this fence is ready for execution; an asynchronous wait > + * 3. dma_fence_emit > + * - the fence is ready for execution and passed to the execution queue > + * (the user to HW/FW/backend transition) > + * 4. dma_fence_execute_start (optional) > + * - records the start of execution on the backend (HW) and includes > + * a tag to uniquely identify the backend engine so that concurrent > + * execution can be traced > + * - may only be emitted for the first fence in a context to begin > + * execution > + * 5. dma_fence_execute_end (optional) > + * - records the corresponding completion point of backend execution > + * - may only be emitted for the last fence in a context to finish > + * execution > + * 6. dma_fence_signaled > + * - records when the fence was marked as completed and the result > + * propagated to the various waiters > + * 7. 
dma_fence_destroy > + * > + * Note that not all fences are used in conjunction with HW engines, and > + * so may exclude the execution tracing. Nor do they all correspond with > + * client API, although many may be used as background tasks required > + * before HW execution. > + * > + * The flow of events from HW to user would be: > + * > + * 1. dma_fence_wait_begin > + * 2. dma_fence_enable_signaling (optional) > + * 3. dma_fence_signaled > + * 4. dma_fence_wait_end > + * > + * For improved visualisation, dma_fence_context_create and > + * dma_fence_context_destroy are used to couple the context id to a string. > + */ > + > +TRACE_EVENT(dma_fence_context_create, > + TP_PROTO(u64 context, const char *driver, const char *timeline), > + TP_ARGS(context, driver, timeline), > + > + TP_STRUCT__entry( > + __field(u64, context) > + __string(driver, driver) > + __string(timeline, timeline) > + ), > + > + TP_fast_assign( > + __entry->context = context; > + __assign_str(driver, driver) > + __assign_str(timeline, timeline) > + ), > + > + TP_printk("context=%llu, driver=%s, timeline=%s", > + __entry->context, __get_str(driver), __get_str(timeline)) > +); > + > +TRACE_EVENT(dma_fence_context_destroy, > + TP_PROTO(u64 context), > + TP_ARGS(context), > + > + TP_STRUCT__entry( > + __field(u64, context) > + ), > + > + TP_fast_assign( > + __entry->context = context; > + ), > + > + TP_printk("context=%llu", __entry->context) > +); > + > DECLARE_EVENT_CLASS(dma_fence, > > TP_PROTO(struct dma_fence *fence), > > TP_ARGS(fence), > > + TP_STRUCT__entry( > + __field(u64, context) > + __field(u64, seqno) > + ), > + > + TP_fast_assign( > + __entry->context = fence->context; > + __entry->seqno = fence->seqno; > + ), > + > + TP_printk("context=%llu, seqno=%llu", __entry->context, __entry->seqno) > +); > + > +TRACE_EVENT(dma_fence_init, > + TP_PROTO(struct dma_fence *fence), > + > + TP_ARGS(fence), > + > TP_STRUCT__entry( > __string(driver, fence->ops->get_driver_name(fence)) > 
__string(timeline, fence->ops->get_timeline_name(fence)) > @@ -41,13 +148,6 @@ DEFINE_EVENT(dma_fence, dma_fence_emit, > TP_ARGS(fence) > ); > > -DEFINE_EVENT(dma_fence, dma_fence_init, > - > - TP_PROTO(struct dma_fence *fence), > - > - TP_ARGS(fence) > -); > - > DEFINE_EVENT(dma_fence, dma_fence_destroy, > > TP_PROTO(struct dma_fence *fence), > @@ -83,6 +183,69 @@ DEFINE_EVENT(dma_fence, dma_fence_wait_end, > TP_ARGS(fence) > ); > > +TRACE_EVENT(dma_fence_await, > + TP_PROTO(struct dma_fence *wait, struct dma_fence *signal), > + TP_ARGS(wait, signal), > + > + TP_STRUCT__entry( > + __field(u64, wait_context) > + __field(u64, wait_seqno) > + __field(u64, signal_context) > + __field(u64, signal_seqno) > + ), > + > + TP_fast_assign( > + __entry->wait_context = wait->context; > + __entry->wait_seqno = wait->seqno; > + __entry->signal_context = signal->context; > + __entry->signal_seqno = signal->seqno; > + ), > + > + TP_printk("wait_context=%llu, wait_seqno=%llu, signal_context=%llu, signal_seqno=%llu", > + __entry->wait_context, __entry->wait_seqno, > + __entry->signal_context, __entry->signal_seqno) > +); > + > +TRACE_EVENT(dma_fence_execute_start, > + TP_PROTO(struct dma_fence *fence, u64 hwid), > + TP_ARGS(fence, hwid), > + > + TP_STRUCT__entry( > + __field(u64, context) > + __field(u64, seqno) > + __field(u64, hwid) > + ), > + > + TP_fast_assign( > + __entry->context = fence->context; > + __entry->seqno = fence->seqno; > + __entry->hwid = hwid; > + ), > + > + TP_printk("context=%llu, seqno=%llu, hwid=%llu", > + __entry->context, __entry->seqno, __entry->hwid) > +); > + > +TRACE_EVENT(dma_fence_execute_end, > + TP_PROTO(struct dma_fence *fence, u64 hwid), > + TP_ARGS(fence, hwid), > + > + TP_STRUCT__entry( > + __field(u64, context) > + __field(u64, seqno) > + __field(u64, hwid) > + ), > + > + TP_fast_assign( > + __entry->context = fence->context; > + __entry->seqno = fence->seqno; > + __entry->hwid = hwid; > + ), > + > + TP_printk("context=%llu, seqno=%llu, 
hwid=%llu", > + __entry->context, __entry->seqno, __entry->hwid) > +); > + > #endif /* _TRACE_DMA_FENCE_H */ > > /* This part must be outside protection */
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-22 8:49 ` [PATCH] " Koenig, Christian @ 2019-01-22 9:05 ` Chris Wilson 2019-01-22 9:11 ` Daniel Vetter 2019-01-24 18:45 ` Eric Anholt 0 siblings, 2 replies; 12+ messages in thread From: Chris Wilson @ 2019-01-22 9:05 UTC (permalink / raw) To: Koenig, Christian, dri-devel Cc: Michael Sartain, Tvrtko Ursulin, intel-gfx, Steven Rostedt, Pierre-Loup Griffais Quoting Koenig, Christian (2019-01-22 08:49:30) > Am 22.01.19 um 00:20 schrieb Chris Wilson: > > Rather than every backend and GPU driver reinventing the same wheel for > > user level debugging of HW execution, the common dma-fence framework > > should include the tracing infrastructure required for most client API > > level flow visualisation. > > > > With these common dma-fence level tracepoints, the userspace tools can > > establish a detailed view of the client <-> HW flow across different > > kernels. There is a strong ask to have this available, so that the > > userspace developer can effectively assess if they're doing a good job > > about feeding the beast of a GPU hardware. > > > > In the case of needing to look into more fine-grained details of how > > kernel internals work towards the goal of feeding the beast, the tools > > may optionally amend the dma-fence tracing information with the driver > > implementation specific. But for such cases, the tools should have a > > graceful degradation in case the expected extra tracepoints have > > changed or their format differs from the expected, as the kernel > > implementation internals are not expected to stay the same. > > > > It is important to distinguish between tracing for the purpose of client > > flow visualisation and tracing for the purpose of low-level kernel > > debugging. The latter is highly implementation specific, tied to > > a particular HW and driver, whereas the former addresses a common goal > > of user level tracing and likely a common set of userspace tools. 
> > Having made the distinction that these tracepoints will be consumed for > > client API tooling, we raise the spectre of tracepoint ABI stability. It > > is hoped that by defining a common set of dma-fence tracepoints, we avoid > > the pitfall of exposing low level details and so restrict ourselves only > > to the high level flow that is applicable to all drivers and hardware. > > Thus the reserved guarantee that this set of tracepoints will be stable > > (with the emphasis on depicting client <-> HW flow as opposed to > > driver <-> HW). > > > > In terms of specific changes to the dma-fence tracing, we remove the > > emission of the strings for every tracepoint (reserving them for > > dma_fence_init for cases where they have unique dma_fence_ops, and > > preferring to have descriptors for the whole fence context). strings do > > not pack as well into the ftrace ringbuffer and we would prefer to > > reduce the amount of indirect callbacks required for frequent tracepoint > > emission. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > Cc: Alex Deucher <alexdeucher@gmail.com> > > Cc: "Christian König" <christian.koenig@amd.com> > > Cc: Eric Anholt <eric@anholt.net> > > Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> > > Cc: Michael Sartain <mikesart@fastmail.com> > > Cc: Steven Rostedt <rostedt@goodmis.org> > > In general yes please! If possible please separate out the changes to > the common dma_fence infrastructure from the i915 changes. Sure, I was just stressing the impact: remove some randomly placed internal debugging tracepoints, try to define useful ones instead :) On the list of things to do was to convert at least 2 other drivers (I was thinking nouveau/msm for simplicity, vc4 for a simpler introduction to drm_sched than amdgpu) over to be sure we have the right tracepoints. 
> One thing I'm wondering is why the enable_signaling trace point doesn't > need to be exported any more. Is that only used internally in the common > infrastructure? Right. Only used inside the core, and I don't see much call for making it easy for drivers to fiddle around bypassing the core enable_signaling/signal. (I'm not sure it's useful for client flow either, it feels more like dma-fence debugging, but they can just not listen to that tracepoint.) -Chris _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-22 9:05 ` Chris Wilson @ 2019-01-22 9:11 ` Daniel Vetter 2019-01-22 9:57 ` Chris Wilson 2019-01-24 18:45 ` Eric Anholt 1 sibling, 1 reply; 12+ messages in thread From: Daniel Vetter @ 2019-01-22 9:11 UTC (permalink / raw) To: Chris Wilson Cc: Michael Sartain, intel-gfx, Steven Rostedt, dri-devel, Koenig, Christian, Pierre-Loup Griffais On Tue, Jan 22, 2019 at 10:06 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > Quoting Koenig, Christian (2019-01-22 08:49:30) > > Am 22.01.19 um 00:20 schrieb Chris Wilson: > > > Rather than every backend and GPU driver reinventing the same wheel for > > > user level debugging of HW execution, the common dma-fence framework > > > should include the tracing infrastructure required for most client API > > > level flow visualisation. > > > > > > With these common dma-fence level tracepoints, the userspace tools can > > > establish a detailed view of the client <-> HW flow across different > > > kernels. There is a strong ask to have this available, so that the > > > userspace developer can effectively assess if they're doing a good job > > > about feeding the beast of a GPU hardware. > > > > > > In the case of needing to look into more fine-grained details of how > > > kernel internals work towards the goal of feeding the beast, the tools > > > may optionally amend the dma-fence tracing information with the driver > > > implementation specific. But for such cases, the tools should have a > > > graceful degradation in case the expected extra tracepoints have > > > changed or their format differs from the expected, as the kernel > > > implementation internals are not expected to stay the same. > > > > > > It is important to distinguish between tracing for the purpose of client > > > flow visualisation and tracing for the purpose of low-level kernel > > > debugging. 
The latter is highly implementation specific, tied to > > > a particular HW and driver, whereas the former addresses a common goal > > > of user level tracing and likely a common set of userspace tools. > > > Having made the distinction that these tracepoints will be consumed for > > > client API tooling, we raise the spectre of tracepoint ABI stability. It > > > is hoped that by defining a common set of dma-fence tracepoints, we avoid > > > the pitfall of exposing low level details and so restrict ourselves only > > > to the high level flow that is applicable to all drivers and hardware. > > > Thus the reserved guarantee that this set of tracepoints will be stable > > > (with the emphasis on depicting client <-> HW flow as opposed to > > > driver <-> HW). > > > > > > In terms of specific changes to the dma-fence tracing, we remove the > > > emission of the strings for every tracepoint (reserving them for > > > dma_fence_init for cases where they have unique dma_fence_ops, and > > > preferring to have descriptors for the whole fence context). strings do > > > not pack as well into the ftrace ringbuffer and we would prefer to > > > reduce the amount of indirect callbacks required for frequent tracepoint > > > emission. > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > > Cc: Alex Deucher <alexdeucher@gmail.com> > > > Cc: "Christian König" <christian.koenig@amd.com> > > > Cc: Eric Anholt <eric@anholt.net> > > > Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> > > > Cc: Michael Sartain <mikesart@fastmail.com> > > > Cc: Steven Rostedt <rostedt@goodmis.org> > > > > In general yes please! If possible please separate out the changes to > > the common dma_fence infrastructure from the i915 changes. 
> > Sure, I was just stressing the impact: remove some randomly placed > internal debugging tracepoints, try to define useful ones instead :) > > On the list of things to do was to convert at least 2 other drivers > (I was thinking nouveau/msm for simplicity, vc4 for a simpler > introduction to drm_sched than amdgpu) over to be sure we have the right > tracepoints. I think sprinkling these over the scheduler (maybe just as an opt-in, for the case where the driver doesn't have some additional queueing somewhere) would be good. I haven't checked whether it fits, but would give you a bunch of drivers at once. It might also not cover all the cases (I guess the wait related ones would need to be somewhere else). -Daniel > > One thing I'm wondering is why the enable_signaling trace point doesn't > > need to be exported any more. Is that only used internally in the common > > infrastructure? > > Right. Only used inside the core, and I don't see much call for making > it easy for drivers to fiddle around bypassing the core > enable_signaling/signal. (I'm not sure it's useful for client flow > either, it feels more like dma-fence debugging, but they can just > not listen to that tracepoint.) > -Chris > _______________________________________________ > dri-devel mailing list > dri-devel@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/dri-devel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-22 9:11 ` Daniel Vetter @ 2019-01-22 9:57 ` Chris Wilson 2019-01-22 10:09 ` Daniel Vetter 0 siblings, 1 reply; 12+ messages in thread From: Chris Wilson @ 2019-01-22 9:57 UTC (permalink / raw) To: Daniel Vetter Cc: Michael Sartain, intel-gfx, Steven Rostedt, dri-devel, Koenig, Christian, Pierre-Loup Griffais Quoting Daniel Vetter (2019-01-22 09:11:53) > On Tue, Jan 22, 2019 at 10:06 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > > Quoting Koenig, Christian (2019-01-22 08:49:30) > > > Am 22.01.19 um 00:20 schrieb Chris Wilson: > > > > Rather than every backend and GPU driver reinventing the same wheel for > > > > user level debugging of HW execution, the common dma-fence framework > > > > should include the tracing infrastructure required for most client API > > > > level flow visualisation. > > > > > > > > With these common dma-fence level tracepoints, the userspace tools can > > > > establish a detailed view of the client <-> HW flow across different > > > > kernels. There is a strong ask to have this available, so that the > > > > userspace developer can effectively assess if they're doing a good job > > > > about feeding the beast of a GPU hardware. > > > > > > > > In the case of needing to look into more fine-grained details of how > > > > kernel internals work towards the goal of feeding the beast, the tools > > > > may optionally amend the dma-fence tracing information with the driver > > > > implementation specific. But for such cases, the tools should have a > > > > graceful degradation in case the expected extra tracepoints have > > > > changed or their format differs from the expected, as the kernel > > > > implementation internals are not expected to stay the same. > > > > > > > > It is important to distinguish between tracing for the purpose of client > > > > flow visualisation and tracing for the purpose of low-level kernel > > > > debugging. 
The latter is highly implementation specific, tied to > > > > a particular HW and driver, whereas the former addresses a common goal > > > > of user level tracing and likely a common set of userspace tools. > > > > Having made the distinction that these tracepoints will be consumed for > > > > client API tooling, we raise the spectre of tracepoint ABI stability. It > > > > is hoped that by defining a common set of dma-fence tracepoints, we avoid > > > > the pitfall of exposing low level details and so restrict ourselves only > > > > to the high level flow that is applicable to all drivers and hardware. > > > > Thus the reserved guarantee that this set of tracepoints will be stable > > > > (with the emphasis on depicting client <-> HW flow as opposed to > > > > driver <-> HW). > > > > > > > > In terms of specific changes to the dma-fence tracing, we remove the > > > > emission of the strings for every tracepoint (reserving them for > > > > dma_fence_init for cases where they have unique dma_fence_ops, and > > > > preferring to have descriptors for the whole fence context). strings do > > > > not pack as well into the ftrace ringbuffer and we would prefer to > > > > reduce the amount of indirect callbacks required for frequent tracepoint > > > > emission. > > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > > > Cc: Alex Deucher <alexdeucher@gmail.com> > > > > Cc: "Christian König" <christian.koenig@amd.com> > > > > Cc: Eric Anholt <eric@anholt.net> > > > > Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> > > > > Cc: Michael Sartain <mikesart@fastmail.com> > > > > Cc: Steven Rostedt <rostedt@goodmis.org> > > > > > > In general yes please! If possible please separate out the changes to > > > the common dma_fence infrastructure from the i915 changes. 
> >
> > Sure, I was just stressing the impact: remove some randomly placed
> > internal debugging tracepoints, try to define useful ones instead :)
> >
> > On the list of things to do was to convert at least 2 other drivers
> > (I was thinking nouveau/msm for simplicity, vc4 for a simpler
> > introduction to drm_sched than amdgpu) over to be sure we have the right
> > tracepoints.
>
> I think sprinkling these over the scheduler (maybe just as an opt-in,
> for the case where the driver doesn't have some additional queueing
> somewhere) would be good. I haven't checked whether it fits, but would
> give you a bunch of drivers at once. It might also not cover all the
> cases (I guess the wait related ones would need to be somewhere else).

And the other thing (that got explicitly asked for!) was that we have
some igt to make sure we don't surreptitiously break the tracepoints in
future.

Another task would be to devise the set of tracepoints to describe the
modesetting flow; that more or less is the flow of the atomic helpers, I
guess: prepare; wait-on-fences; commit; signal; cleanup. For system
snooping, the grand plan is knowing a target frame (msc or ust), how late
it was delayed, and the HW execution flow up to the frame, and being able
to tie that back to the GL/VK client.
-Chris
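The igt-style stability check mentioned above can be sketched from userspace: read a tracepoint's ftrace `format` description and assert that the fields a tool depends on are still declared. The sample format text and field layout below are invented stand-ins for illustration, not the actual dma-fence event ABI.

```python
# Sketch of an igt-style regression check for tracepoint stability:
# parse the ftrace "format" description of an event and verify that the
# fields userspace tools rely on are still declared. On a real system the
# text would come from
# /sys/kernel/debug/tracing/events/dma_fence/<event>/format; the sample
# below is an invented stand-in so the check itself can be exercised.
import re

SAMPLE_FORMAT = (
    "name: dma_fence_emit\n"
    "ID: 1005\n"
    "format:\n"
    "\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;\n"
    "\tfield:u64 context;\toffset:8;\tsize:8;\tsigned:0;\n"
    "\tfield:unsigned int seqno;\toffset:16;\tsize:4;\tsigned:0;\n"
)

def format_fields(text):
    # Each declared field ends "... name;" -- capture the last word
    # before the terminating semicolon of every "field:" clause.
    return {m.group(1) for m in re.finditer(r"field:[^;]*?(\w+);", text)}

def check_tracepoint(text, required):
    # Sorted list of required fields that have gone missing; an empty
    # list means the ABI the tool depends on is intact.
    return sorted(set(required) - format_fields(text))

missing = check_tracepoint(SAMPLE_FORMAT, ["context", "seqno"])
```

A test along these lines, run in CI, would catch a tracepoint field being renamed or dropped before userspace tools break.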
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-22 9:57 ` Chris Wilson @ 2019-01-22 10:09 ` Daniel Vetter 0 siblings, 0 replies; 12+ messages in thread From: Daniel Vetter @ 2019-01-22 10:09 UTC (permalink / raw) To: Chris Wilson Cc: Michael Sartain, intel-gfx, Steven Rostedt, dri-devel, Koenig, Christian, Pierre-Loup Griffais On Tue, Jan 22, 2019 at 10:58 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > Quoting Daniel Vetter (2019-01-22 09:11:53) > > On Tue, Jan 22, 2019 at 10:06 AM Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > > > > Quoting Koenig, Christian (2019-01-22 08:49:30) > > > > Am 22.01.19 um 00:20 schrieb Chris Wilson: > > > > > Rather than every backend and GPU driver reinventing the same wheel for > > > > > user level debugging of HW execution, the common dma-fence framework > > > > > should include the tracing infrastructure required for most client API > > > > > level flow visualisation. > > > > > > > > > > With these common dma-fence level tracepoints, the userspace tools can > > > > > establish a detailed view of the client <-> HW flow across different > > > > > kernels. There is a strong ask to have this available, so that the > > > > > userspace developer can effectively assess if they're doing a good job > > > > > about feeding the beast of a GPU hardware. > > > > > > > > > > In the case of needing to look into more fine-grained details of how > > > > > kernel internals work towards the goal of feeding the beast, the tools > > > > > may optionally amend the dma-fence tracing information with the driver > > > > > implementation specific. But for such cases, the tools should have a > > > > > graceful degradation in case the expected extra tracepoints have > > > > > changed or their format differs from the expected, as the kernel > > > > > implementation internals are not expected to stay the same. 
> > > > > > > > > > It is important to distinguish between tracing for the purpose of client > > > > > flow visualisation and tracing for the purpose of low-level kernel > > > > > debugging. The latter is highly implementation specific, tied to > > > > > a particular HW and driver, whereas the former addresses a common goal > > > > > of user level tracing and likely a common set of userspace tools. > > > > > Having made the distinction that these tracepoints will be consumed for > > > > > client API tooling, we raise the spectre of tracepoint ABI stability. It > > > > > is hoped that by defining a common set of dma-fence tracepoints, we avoid > > > > > the pitfall of exposing low level details and so restrict ourselves only > > > > > to the high level flow that is applicable to all drivers and hardware. > > > > > Thus the reserved guarantee that this set of tracepoints will be stable > > > > > (with the emphasis on depicting client <-> HW flow as opposed to > > > > > driver <-> HW). > > > > > > > > > > In terms of specific changes to the dma-fence tracing, we remove the > > > > > emission of the strings for every tracepoint (reserving them for > > > > > dma_fence_init for cases where they have unique dma_fence_ops, and > > > > > preferring to have descriptors for the whole fence context). strings do > > > > > not pack as well into the ftrace ringbuffer and we would prefer to > > > > > reduce the amount of indirect callbacks required for frequent tracepoint > > > > > emission. 
> > > > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > > > > > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> > > > > > Cc: Alex Deucher <alexdeucher@gmail.com> > > > > > Cc: "Christian König" <christian.koenig@amd.com> > > > > > Cc: Eric Anholt <eric@anholt.net> > > > > > Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> > > > > > Cc: Michael Sartain <mikesart@fastmail.com> > > > > > Cc: Steven Rostedt <rostedt@goodmis.org> > > > > > > > > In general yes please! If possible please separate out the changes to > > > > the common dma_fence infrastructure from the i915 changes. > > > > > > Sure, I was just stressing the impact: remove some randomly placed > > > internal debugging tracepoints, try to define useful ones instead :) > > > > > > On the list of things to do was to convert at least 2 other drivers > > > (I was thinking nouveau/msm for simplicity, vc4 for a simpler > > > introduction to drm_sched than amdgpu) over to be sure we have the right > > > tracepoints. > > > > I think sprinkling these over the scheduler (maybe just as an opt-in, > > for the case where the driver doesn't have some additional queueing > > somewhere) would be good. I haven't checked whether it fits, but would > > give you a bunch of drivers at once. It might also not cover all the > > cases (I guess the wait related ones would need to be somewhere else). > > And the other thing (that got explicitly asked for!) was that we have > some igt to make sure we don't surreptitiously break the tracepoints > in future. > > Another task would to devise the set of tracepoints to describe the > modesetting flow; that more or less is the flow of atomic helpers I > guess: prepare; wait-on-fences; commit; signal; cleanup. 
> For system snooping, knowing a target frame (msc or ust) and how late
> it was delayed and the HW execution flow up to the frame and being able
> to tie that back to the GL/VK client is the grand plan.

Yeah, with atomic helpers this should be doable, as long as the driver
uses the commit tracking part of the helpers. That's the stuff done by
drm_atomic_helper_setup_commit(). I think by now that means all atomic
drivers.

I think the following would be good enough to track atomic updates:

- atomic_check
- atomic_commit_begin/end (for the synchronous portion; for nonblocking
  commits that would be mostly lock contention/cache flushing/pte
  wrangling in prepare_fb for fb pinning and stuff like that).
- a flip complete tracepoint stuffed into drm_crtc_send_vblank_event;
  we'll probably want the lower level drm_send_event_locked() with some
  additional metadata stuffed into drm_pending_event for the
  tracepoint's consumption.

For a bonus we probably want to annotate vblank waits/events too, so
that all tracepoint hits in drm_send_event_locked() have a source
somewhere (either atomic commit or vblank wait/event). The waits the
commit thread does should already be covered by your proposal here, but
maybe we need some additional metadata to tie it in with the other
atomic stuff in the UI. Otoh if the generic UI shows the name of the
worker (there's another set of tracepoints for that, iirc) it should be
obvious what's going on.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
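A sketch of how a flow visualiser could fold such atomic tracepoints into per-frame numbers. The event names (atomic_commit_begin, atomic_commit_end, flip_complete) come from the proposal above and are not an existing kernel ABI; the tuple layout is an assumption for illustration.

```python
# Sketch: turn proposed atomic tracepoints into per-frame latencies.
# Events are (timestamp_us, event_name, crtc) tuples, time-ordered;
# both the names and the layout are assumptions from the proposal above.
def frame_latencies(events):
    """Return {crtc: [(commit_duration, commit_to_flip), ...]}."""
    begun = {}    # crtc -> atomic_commit_begin timestamp
    ended = {}    # crtc -> (begin_ts, end_ts) awaiting its flip
    frames = {}
    for ts, name, crtc in events:
        if name == "atomic_commit_begin":
            begun[crtc] = ts
        elif name == "atomic_commit_end" and crtc in begun:
            ended[crtc] = (begun.pop(crtc), ts)
        elif name == "flip_complete" and crtc in ended:
            begin_ts, end_ts = ended.pop(crtc)
            frames.setdefault(crtc, []).append((end_ts - begin_ts, ts - end_ts))
    return frames

trace = [
    (100_000, "atomic_commit_begin", 0),
    (100_400, "atomic_commit_end", 0),   # 400us of synchronous commit work
    (108_000, "flip_complete", 0),       # flip lands 7.6ms later
]
print(frame_latencies(trace))  # {0: [(400, 7600)]}
```

The commit_to_flip term is what a compositor developer would read as "how late was this frame", which is exactly the system-snooping goal described above.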
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-22 9:05 ` Chris Wilson 2019-01-22 9:11 ` Daniel Vetter @ 2019-01-24 18:45 ` Eric Anholt 1 sibling, 0 replies; 12+ messages in thread From: Eric Anholt @ 2019-01-24 18:45 UTC (permalink / raw) To: Chris Wilson, Koenig, Christian, dri-devel Cc: Michael Sartain, Tvrtko Ursulin, intel-gfx, Steven Rostedt, Pierre-Loup Griffais [-- Attachment #1.1: Type: text/plain, Size: 4110 bytes --] Chris Wilson <chris@chris-wilson.co.uk> writes: > Quoting Koenig, Christian (2019-01-22 08:49:30) >> Am 22.01.19 um 00:20 schrieb Chris Wilson: >> > Rather than every backend and GPU driver reinventing the same wheel for >> > user level debugging of HW execution, the common dma-fence framework >> > should include the tracing infrastructure required for most client API >> > level flow visualisation. >> > >> > With these common dma-fence level tracepoints, the userspace tools can >> > establish a detailed view of the client <-> HW flow across different >> > kernels. There is a strong ask to have this available, so that the >> > userspace developer can effectively assess if they're doing a good job >> > about feeding the beast of a GPU hardware. >> > >> > In the case of needing to look into more fine-grained details of how >> > kernel internals work towards the goal of feeding the beast, the tools >> > may optionally amend the dma-fence tracing information with the driver >> > implementation specific. But for such cases, the tools should have a >> > graceful degradation in case the expected extra tracepoints have >> > changed or their format differs from the expected, as the kernel >> > implementation internals are not expected to stay the same. >> > >> > It is important to distinguish between tracing for the purpose of client >> > flow visualisation and tracing for the purpose of low-level kernel >> > debugging. 
The latter is highly implementation specific, tied to >> > a particular HW and driver, whereas the former addresses a common goal >> > of user level tracing and likely a common set of userspace tools. >> > Having made the distinction that these tracepoints will be consumed for >> > client API tooling, we raise the spectre of tracepoint ABI stability. It >> > is hoped that by defining a common set of dma-fence tracepoints, we avoid >> > the pitfall of exposing low level details and so restrict ourselves only >> > to the high level flow that is applicable to all drivers and hardware. >> > Thus the reserved guarantee that this set of tracepoints will be stable >> > (with the emphasis on depicting client <-> HW flow as opposed to >> > driver <-> HW). >> > >> > In terms of specific changes to the dma-fence tracing, we remove the >> > emission of the strings for every tracepoint (reserving them for >> > dma_fence_init for cases where they have unique dma_fence_ops, and >> > preferring to have descriptors for the whole fence context). strings do >> > not pack as well into the ftrace ringbuffer and we would prefer to >> > reduce the amount of indirect callbacks required for frequent tracepoint >> > emission. >> > >> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> >> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> >> > Cc: Alex Deucher <alexdeucher@gmail.com> >> > Cc: "Christian König" <christian.koenig@amd.com> >> > Cc: Eric Anholt <eric@anholt.net> >> > Cc: Pierre-Loup Griffais <pgriffais@valvesoftware.com> >> > Cc: Michael Sartain <mikesart@fastmail.com> >> > Cc: Steven Rostedt <rostedt@goodmis.org> >> >> In general yes please! If possible please separate out the changes to >> the common dma_fence infrastructure from the i915 changes. 
> > Sure, I was just stressing the impact: remove some randomly placed > internal debugging tracepoints, try to define useful ones instead :) > > On the list of things to do was to convert at least 2 other drivers > (I was thinking nouveau/msm for simplicity, vc4 for a simpler > introduction to drm_sched than amdgpu) over to be sure we have the right > tracepoints. v3d is using gpu-scheduler, and I'd love to see it using some shared tracepoints -- I put in some of what we'd need for visualization, but I haven't actually built visualization yet so I'm not sure it's good enough. vc4 isn't using gpu-scheduler yet. I'm interested in it -- there's the user qpu pipeline that we should expose, but supporting another pipeline without the shared scheduler is no fun. [-- Attachment #1.2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] dma-buf: Enhance dma-fence tracing 2019-01-21 23:20 [PATCH] dma-buf: Enhance dma-fence tracing Chris Wilson ` (3 preceding siblings ...) 2019-01-22 8:49 ` [PATCH] " Koenig, Christian @ 2019-01-29 1:52 ` Michael Sartain 2019-01-29 10:08 ` Chris Wilson 4 siblings, 1 reply; 12+ messages in thread From: Michael Sartain @ 2019-01-29 1:52 UTC (permalink / raw) To: Chris Wilson, dri-devel Cc: intel-gfx, Steven Rostedt, Alex Deucher, Christian König, Pierre-Loup Griffais On Mon, Jan 21, 2019, at 4:20 PM, Chris Wilson wrote: > Rather than every backend and GPU driver reinventing the same wheel for > user level debugging of HW execution, the common dma-fence framework > should include the tracing infrastructure required for most client API > level flow visualisation. > > With these common dma-fence level tracepoints, the userspace tools can > establish a detailed view of the client <-> HW flow across different > kernels. There is a strong ask to have this available, so that the > userspace developer can effectively assess if they're doing a good job > about feeding the beast of a GPU hardware. ... I've got a first pass of this visualizing with gpuvis. Screenshots: ; with dma_event tracepoints patch https://imgur.com/a/MwvoAYY ; with old i915 tracepoints https://imgur.com/a/tG2iyHS Couple questions... 
With your new dma_event tracepoints patch, we're still getting these
tracepoints:

    i915_request_in
    i915_request_out
    intel_engine_notify

And the in/out tracepoints line up with dma_fence_executes
(same ctx:seqno and time):

    <idle>-0 [006] 150.376273: dma_fence_execute_start: context=31, seqno=35670, hwid=0
    <idle>-0 [006] 150.413215: dma_fence_execute_end: context=31, seqno=35670, hwid=0

    <idle>-0 [006] 150.376272: i915_request_in: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35670, prio=0, global=41230, port=1
    <idle>-0 [006] 150.413217: i915_request_out: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35670, global=41230, completed?=1

However I'm also seeing several i915_request_in --> intel_engine_notify
tracepoints that don't have dma_fence_execute_* tracepoints:

    RenderThread-1279 [001] 150.341336: dma_fence_init: driver=i915 timeline=ShooterGame[1226]/2 context=31 seqno=35669
    RenderThread-1279 [001] 150.341352: dma_fence_emit: context=31, seqno=35669
    <idle>-0 [006] 150.376271: i915_request_in: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35669, prio=0, global=41229, port=1
    <idle>-0 [006] 150.411525: intel_engine_notify: dev=0, engine=0:0, seqno=41229, waiters=1
    RenderThread-1279 [001] 150.419779: dma_fence_signaled: context=31, seqno=35669
    RenderThread-1279 [001] 150.419838: dma_fence_destroy: context=31, seqno=35669

I assume something is going on at a lower level that we can't get the
information for via dma_fence?

Thanks!
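The pairing a tool like gpuvis performs on these events can be sketched as follows: match dma_fence_execute_start/end by context:seqno to recover per-fence GPU execution intervals. The regex follows the pasted line layout above; a robust parser should degrade gracefully, since field order and whitespace are not guaranteed across kernels.

```python
# Sketch: pair the dma_fence_execute_start/end events quoted above into
# per-fence GPU execution intervals, keyed by (context, seqno).
import re

EVENT = re.compile(
    r"(?P<ts>\d+\.\d+): dma_fence_execute_(?P<edge>start|end): "
    r"context=(?P<ctx>\d+), seqno=(?P<seqno>\d+)")

def execute_intervals(text):
    started = {}   # (context, seqno) -> start timestamp
    spans = {}     # (context, seqno) -> execution time in seconds
    for m in EVENT.finditer(text):
        key = (int(m["ctx"]), int(m["seqno"]))
        if m["edge"] == "start":
            started[key] = float(m["ts"])
        elif key in started:
            spans[key] = float(m["ts"]) - started.pop(key)
    return spans

sample = """\
<idle>-0 [006] 150.376273: dma_fence_execute_start: context=31, seqno=35670, hwid=0
<idle>-0 [006] 150.413215: dma_fence_execute_end: context=31, seqno=35670, hwid=0
"""
spans = execute_intervals(sample)   # fence 31:35670 ran for ~36.9ms
```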
* Re: [PATCH] dma-buf: Enhance dma-fence tracing
  2019-01-29  1:52 ` Michael Sartain
@ 2019-01-29 10:08 ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2019-01-29 10:08 UTC (permalink / raw)
  To: Michael Sartain, dri-devel
  Cc: intel-gfx, Steven Rostedt, Alex Deucher, Christian König,
      Pierre-Loup Griffais

Quoting Michael Sartain (2019-01-29 01:52:12)
> On Mon, Jan 21, 2019, at 4:20 PM, Chris Wilson wrote:
> > Rather than every backend and GPU driver reinventing the same wheel for
> > user level debugging of HW execution, the common dma-fence framework
> > should include the tracing infrastructure required for most client API
> > level flow visualisation.
> >
> > With these common dma-fence level tracepoints, the userspace tools can
> > establish a detailed view of the client <-> HW flow across different
> > kernels. There is a strong ask to have this available, so that the
> > userspace developer can effectively assess if they're doing a good job
> > about feeding the beast of a GPU hardware.
> ...
>
> I've got a first pass of this visualizing with gpuvis. Screenshots:
>
> ; with dma_event tracepoints patch
> https://imgur.com/a/MwvoAYY
>
> ; with old i915 tracepoints
> https://imgur.com/a/tG2iyHS
>
> Couple questions...
>
> With your new dma_event tracepoints patch, we're still getting these
> tracepoints:
>
> i915_request_in
> i915_request_out

These are debugging, not really tracepoints, and should be covered by
trace_printk already. Left in this patch as they are a slightly
different argument to remove (as in they are not directly replaced by
dma-fence tracing).

> intel_engine_notify

To be removed upstream very shortly.
> And the in/out tracepoints line up with dma_fence_executes
> (same ctx:seqno and time):
>
> <idle>-0 [006] 150.376273: dma_fence_execute_start: context=31, seqno=35670, hwid=0
> <idle>-0 [006] 150.413215: dma_fence_execute_end: context=31, seqno=35670, hwid=0
>
> <idle>-0 [006] 150.376272: i915_request_in: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35670, prio=0, global=41230, port=1
> <idle>-0 [006] 150.413217: i915_request_out: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35670, global=41230, completed?=1
>
> However I'm also seeing several i915_request_in --> intel_engine_notify
> tracepoints that don't have dma_fence_execute_* tracepoints:

Yes. I was trying to wean the API off expecting an exact match and just
be happy with context in/out events, not request level details.

> RenderThread-1279 [001] 150.341336: dma_fence_init: driver=i915 timeline=ShooterGame[1226]/2 context=31 seqno=35669
> RenderThread-1279 [001] 150.341352: dma_fence_emit: context=31, seqno=35669
> <idle>-0 [006] 150.376271: i915_request_in: dev=0, engine=0:0, hw_id=4, ctx=31, seqno=35669, prio=0, global=41229, port=1
> <idle>-0 [006] 150.411525: intel_engine_notify: dev=0, engine=0:0, seqno=41229, waiters=1
> RenderThread-1279 [001] 150.419779: dma_fence_signaled: context=31, seqno=35669
> RenderThread-1279 [001] 150.419838: dma_fence_destroy: context=31, seqno=35669
>
> I assume something is going on at a lower level that we can't get the
> information for via dma_fence?

Deliberate obfuscation. It more or less lets us know what client was
running on the GPU at any one time, but you have to work back to
identify exactly which fence by inspecting the signaling timeline.
-Chris
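The "work back by inspecting the signaling timeline" step can be sketched as a simple interval match: a fence plausibly ran in a context execute window if it was emitted before the window ended and signalled after it began. The matching rule is illustrative only, and the second fence's timestamps below are invented to complete the example.

```python
# Sketch of working back from context-level execute windows to fences:
# attribute a fence to a window when its emit precedes the window's end
# and its signal follows the window's start. Illustrative heuristic, not
# a kernel API; some timestamps below are invented.
def fences_in_window(window, fences):
    """window: (context, start, end);
    fences: {(context, seqno): (emit_ts, signal_ts)}."""
    ctx, start, end = window
    return sorted(
        seqno
        for (fctx, seqno), (emit, signal) in fences.items()
        if fctx == ctx and emit <= end and signal >= start
    )

fences = {
    (31, 35669): (150.341352, 150.419779),  # emit/signal from the trace above
    (31, 35670): (150.341360, 150.419800),  # invented timestamps
    (31, 35600): (150.100000, 150.200000),  # signalled before the window
}
# The i915_request_in/out window 150.376271..150.413217 quoted above:
print(fences_in_window((31, 150.376271, 150.413217), fences))  # [35669, 35670]
```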
end of thread, other threads:[~2019-01-29 10:08 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-21 23:20 [PATCH] dma-buf: Enhance dma-fence tracing Chris Wilson
2019-01-21 23:47 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2019-01-22  0:06 ` ✓ Fi.CI.BAT: success " Patchwork
2019-01-22  0:55 ` ✓ Fi.CI.IGT: " Patchwork
2019-01-22  8:49 ` [PATCH] " Koenig, Christian
2019-01-22  9:05   ` Chris Wilson
2019-01-22  9:11     ` Daniel Vetter
2019-01-22  9:57       ` Chris Wilson
2019-01-22 10:09         ` Daniel Vetter
2019-01-24 18:45     ` Eric Anholt
2019-01-29  1:52 ` Michael Sartain
2019-01-29 10:08   ` Chris Wilson