From: Simon Ser <contact@emersion.fr>
To: Pekka Paalanen <ppaalanen@gmail.com>
Cc: jim.cromie@gmail.com, quic_saipraka@quicinc.com,
	Catalin Marinas <catalin.marinas@arm.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Will Deacon <will@kernel.org>,
	maz@kernel.org, Vincent Whitchurch <vincent.whitchurch@axis.com>,
	amd-gfx mailing list <amd-gfx@lists.freedesktop.org>,
	Ingo Molnar <mingo@redhat.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-arm-msm@vger.kernel.org,
	Intel Graphics Development <intel-gfx@lists.freedesktop.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Jason Baron <jbaron@akamai.com>,
	Sean Paul <seanpaul@chromium.org>,
	intel-gvt-dev@lists.freedesktop.org,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Sean Paul <sean@poorly.run>, Greg KH <gregkh@linuxfoundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	quic_psodagud@quicinc.com, mathieu.desnoyers@efficios.com
Subject: Re: [PATCH v10 08/10] dyndbg: add print-to-tracefs, selftest with it - RFC
Date: Tue, 23 Nov 2021 09:32:28 +0000
Message-ID: <-PHBNsA2s0YNaFjE_76_aCTSMbqUpcaqbttDKFOZv0n9VRShPsgC8NDHq_S8KCpNbE32E9LRrw7CHb3pgFzgg99jFb0DX59vpcPVODkYe4Y=@emersion.fr>
In-Reply-To: <20211123104522.7a336773@eldfell>

First off, let me reiterate that this feature would be invaluable to us as
user-space developers. It's often pretty difficult to figure out the cause of
an EINVAL: we have to ask users to follow complicated instructions [1] to grab
DRM logs, and then we have to skim through several megabytes of logs to find
the error.

I have a hack [2] which just calls system("sudo dmesg") after a failed atomic
commit, and it's been pretty handy. But it's really just a hack; a proper
solution would be awesome.

[1]: https://gitlab.freedesktop.org/wlroots/wlroots/-/wikis/DRM-Debugging
[2]: https://gitlab.freedesktop.org/emersion/libliftoff/-/merge_requests/61

> > > > Having a subsystem specific trace buffer would allow subsystem specific
> > > > trace log permissions depending on the sensitivity of the data. But
> > > > doesn't drm output today go to the system log which is typically world
> > > > readable today?

dmesg isn't world-readable these days: it was changed recently-ish (last
year?), at least on my distribution (Arch). I need root to grab dmesg.

(Maybe we can just let the DRM master process grab the logs?)

> > > Yes, and that is exactly the problem. The DRM debug output is so high
> > > traffic it would make the system log both unusable due to cruft and
> > > slow down the whole machine. The debug output is only useful when
> > > something went wrong, and at that point it is too late to enable
> > > debugging. That's why a flight recorder with an over-written circular
> > > in-memory buffer is needed.
> >
> > Seans patch reuses enum drm_debug_category to split the tracing
> > stream into 10 sub-streams
> > - how much traffic from each ?
> > - are some sub-streams more valuable for post-mortem ?
> > - any value from further refinement of categories ?
> > - drop irrelevant callsites individually to reduce clutter, extend
> > buffer time/space ?
>
> I think it's hard to predict which sub-streams you are going to need
> before you have a bug to debug. Hence I would err on the side of
> enabling too much. This also means that better or more refined
> categorisation might not be that much of help - or if it is, then are
> the excluded debug messages worth having in the kernel to begin with.
> Well, we're probably not that interested in GPU debugs but just
> everything related to the KMS side, which on the existing categories
> is... everything except half of CORE and DRIVER, maybe? Not sure.

We've been recommending drm.debug=0x19F so far (see wiki linked above).
KMS + PRIME + ATOMIC + LEASE is definitely something we want in, and
CORE + DRIVER contains other useful info. We definitely don't want VBL.
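
For reference, here is how the 0x19F mask decodes. This is only a sketch: the
bit values are assumed to match enum drm_debug_category in
include/drm/drm_print.h at the time of writing, and the helper name
decode_drm_debug is made up for illustration.

```python
# Bit values assumed from enum drm_debug_category (include/drm/drm_print.h);
# double-check them against your kernel tree.
DRM_DEBUG_CATEGORIES = {
    "CORE": 0x01,
    "DRIVER": 0x02,
    "KMS": 0x04,
    "PRIME": 0x08,
    "ATOMIC": 0x10,
    "VBL": 0x20,
    "STATE": 0x40,
    "LEASE": 0x80,
    "DP": 0x100,
    "DRMRES": 0x200,
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled in mask, lowest bit first."""
    return [name for name, bit in DRM_DEBUG_CATEGORIES.items() if mask & bit]


print(decode_drm_debug(0x19F))
# ['CORE', 'DRIVER', 'KMS', 'PRIME', 'ATOMIC', 'LEASE', 'DP']
```

Under those assumed values, the mask leaves out VBL, STATE and DRMRES.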

> My feeling is that that could mean in the order of hundreds of log
> events at framerate (e.g. 60 times per second) per each enabled output
> individually. And per DRM device, of course. This is with the
> uninteresting GPU debugs already excluded.

Indeed, successful KMS atomic commits already generate a lot of noise. On my
machine, with drm.debug=0x19F set, running the following command:

    sudo dmesg -w | pv >/dev/null

shows 400 KiB/s when idling, and 850 KiB/s when wiggling the cursor.
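
To put the two rates measured above in perspective, here is a
back-of-the-envelope estimate of how much log data one frame produces. It
assumes a 60 Hz refresh rate (the example rate used in the quote above) and a
single output; the function name is made up.

```python
def kib_per_frame(rate_kib_per_s, refresh_hz=60.0):
    """Average amount of log data (in KiB) produced during one frame."""
    return rate_kib_per_s / refresh_hz


for label, rate in (("idle", 400), ("cursor", 850)):
    print(f"{label}: {kib_per_frame(rate):.1f} KiB/frame")
# idle: 6.7 KiB/frame
# cursor: 14.2 KiB/frame
```

Under those assumptions, ten frames' worth of logs is roughly 140 KiB.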

> Still, I don't think the flight recorder buffer would need to be
> massive. I suspect it would be enough to hold a few frames' worth which
> is just a split second under active operation. When something happens,
> the userspace stack is likely going to stop on its tracks immediately
> to collect the debug information, which means the flooding should pause
> and the relevant messages don't get overwritten before we get them. In
> a multi-seat system where each device is controlled by a separate
> display server instance, per-device logs would help with this. OTOH,
> multi-seat is not a very common use case I suppose.

There's also the case of multi-GPU, where GPU B's logs could clutter GPU A's,
making it harder to understand the cause of an atomic commit failure on GPU A.
So per-device logs would be useful, but they are not a hard requirement for me:
having *anything* at all would already be a big win.

In my experiments linked above [2], calling system("sudo dmesg") after an atomic
commit failure worked pretty well, and the bottom of the log contained the cause
of the failure. It was also pretty useful to call system("sudo dmesg -C") before
performing an atomic commit, so that only the part of the log relevant to that
commit gets collected.
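
The clear-then-capture sequence looks roughly like this. It is a Python
sketch of the idea, not the code from [2]: commit is a hypothetical callable
that returns True on success, and the run parameter only exists so the
function can be exercised without root.

```python
import subprocess


def commit_with_kernel_log(commit, run=subprocess.run):
    """Clear the kernel log, run commit(), and return (ok, log).

    log is the dmesg output collected on failure, or None on success.
    """
    # Drop older messages so that only this commit's output is captured.
    run(["sudo", "dmesg", "-C"], check=True)
    if commit():
        return True, None
    # On failure, the cause should be near the bottom of the log.
    result = run(["sudo", "dmesg"], check=True, capture_output=True, text=True)
    return False, result.stdout
```

Note that dmesg -C clears the kernel ring buffer for everyone on the system,
which is one more reason this is only a hack.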

Having some kind of "marker" mechanism could be pretty cool: "mark" the log
stream before performing an atomic commit (ideally that'd just return e.g. a
uint64 offset), then on failure request the logs collected after that mark.
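
To make the idea concrete, here is a toy user-space model of such a marker on
top of a circular buffer. Everything in it is hypothetical (the class and
method names, and the choice of counting messages rather than bytes); it is
not an existing or proposed kernel interface.

```python
from collections import deque


class FlightRecorder:
    """Toy in-memory model of a circular log with mark/read-since support."""

    def __init__(self, capacity):
        self._entries = deque(maxlen=capacity)  # oldest entries get overwritten
        self._next_seq = 0  # sequence number of the next message

    def log(self, message):
        self._entries.append((self._next_seq, message))
        self._next_seq += 1

    def mark(self):
        """Return an offset identifying the current end of the stream."""
        return self._next_seq

    def read_since(self, mark):
        """Return (lost, messages) for everything logged at or after mark.

        lost is the number of messages after the mark that were already
        overwritten by newer ones.
        """
        oldest = self._entries[0][0] if self._entries else self._next_seq
        lost = max(0, oldest - mark)
        return lost, [msg for seq, msg in self._entries if seq >= mark]


rec = FlightRecorder(capacity=4)
rec.log("old noise")
mark = rec.mark()  # taken right before the atomic commit
rec.log("atomic check start")
rec.log("plane 3: invalid src rect")
print(rec.read_since(mark))
# (0, ['atomic check start', 'plane 3: invalid src rect'])
```

Returning a "lost" count lets the caller know when the buffer was too small
to keep everything logged since the mark.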
