All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: "Ahmed S. Darwish" <darwish.07@gmail.com>
Cc: John Ogness <john.ogness@linutronix.de>,
	daniel.vetter@ffwll.ch,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	dri-devel@lists.freedesktop.org, intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v2 1/3] drm: Add support for panic message output
Date: Thu, 14 Mar 2019 10:35:35 +0100	[thread overview]
Message-ID: <20190314093535.GN2665@phenom.ffwll.local> (raw)
In-Reply-To: <20190314044532.GA17451@darwi-home-pc>

On Thu, Mar 14, 2019 at 05:45:32AM +0100, Ahmed S. Darwish wrote:
> Hi,
> 
> On Wed, Mar 13, 2019 at 09:35:05AM +0100, Daniel Vetter wrote:
> > On Tue, Mar 12, 2019 at 11:13:03PM +0100, Ahmed S. Darwish wrote:
> > > On Mon, Mar 11, 2019 at 11:33:15PM +0100, Noralf Trønnes wrote:
> > > > Den 11.03.2019 20.23, skrev Daniel Vetter:
> [……]
> > > > >
> > > > > class_for_each_device uses klist, which only uses an irqsave spinlock. I
> > > > > think that's good enough. Comment to that effect would be good e.g.
> > > > >
> > > > > 	/* based on klist, which uses only a spin_lock_irqsave, which we
> > > > > 	 * assume still works */
> > > > >
> > > > > If we aim for perfect this should be a trylock still, maybe using our own
> > > > > device list.
> > > > >
> > >
> > > I definitely agree here.
> > >
> > > The lock may already be locked either by a stopped CPU, or by the
> > > very same CPU we execute panic() on (e.g. NMI panic() on the
> > > printing CPU).
> > >
> > > This is why it's very common for example in serial consoles, which
> > > are usually careful about re-entrance and panic contexts, to do:
> > >
> > >   xxxxxx_console_write(...) {
> > > 	if (oops_in_progress)
> > > 		locked = spin_trylock_irqsave(&port->lock, flags);
> > > 	else
> > > 		spin_lock_irqsave(&port->lock, flags);
> > >   }
> > >
> > > I'm quite positive we should do the same for panic drm drivers.
> >
> > Yeah Ideally all the locking in the drm path would be trylock only.
> >
> > I wonder whether lockdep could help us validate this, with some "don't
> > allow anything except trylocks in this context". It's easy to audit the
> > core code with review, but drivers are much tougher. And often end up with
> > really deep callchains to get at the backing buffers.
> [……]
> > > > > Instead what I had in mind is to recreate the worst
> > > > > possible panic context as much as feasible (disabling interrupts should be
> > > > > a good start, maybe we can even do an nmi callback), and then call our
> > > > > panic implementation. That way we can test the panic handler in a
> > > > > non-destructive way (i.e. aside from last dmesg lines printed to the
> > > > > screen nothing bad happens to the kernel: No real panic, no oops, no
> > > > > tainting).
> > > >
> > > > The interrupt case I can do, nmi I have no idea.
> > > >
> > >
> > > I agree too. Disabling interrupts + CONFIG_DEBUG_ATOMIC_SLEEP
> > > would be a nice non-destructive test-case emulation.
> >
> > See above, if we can somehow emulate "all locks are held, only allow
> > trylock" with lockdep that would be great too.
> >
> 
> Hmmmm, to prototype things faster, I'll do it as a hook:
> 
>    spin_lock(spinlock_t *lock)
>    {
>         might_spin_forever_under_panic_path();
>         ...
>    }
> 
> Just like how mutexes and DEBUG_ATOMIC_SLEEP do it today:
> 
>    void mutex_lock(struct mutex *lock)
>    {
>         might_sleep();
>         ...
>    }
> 
> Hopefully the locking maintainers can accept something like this.
> If not, let's play with lockdep (which is a bit complex).
> 
> ===> Thanks to Sebastian for suggesting this!

Ah yes this is a very nice idea. We only need to make sure that we don't
enable any of this for a real oops. Or we just scroll away the real oops
with tons of warnings.
-Daniel

> 
> > Plus nmi context, once/if that somehow becomes relevant.
> >
> > The thing that killed the old drm panic handling code definitely was that
> > we flat out couldnt' test it except with real oopses. And that's just
> > whack-a-mole and bug reporter frustration if you first have a few patch
> > iterations around "oh sry, it's still some oops in the panic handler, not
> > the first one that we're seeing" :-/
> >
> 
> Oh, that's a critical point:
> 
>    full data > some data > no data > invalid/useless data
> 
> First impressions for the feature will definitely count. Hopefully
> the hooks above and some heavy DRM CI testing __beforehand__ can
> improve things before publishing to a wider audience..

Yup, let's get this right!
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2019-03-14  9:35 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-11 17:42 [PATCH v2 0/3] drm: Add panic handling Noralf Trønnes
2019-03-11 17:42 ` [PATCH v2 1/3] drm: Add support for panic message output Noralf Trønnes
2019-03-11 19:23   ` Daniel Vetter
2019-03-11 19:29     ` Daniel Vetter
2019-03-11 22:40       ` Noralf Trønnes
2019-03-12  9:53         ` Daniel Vetter
2019-03-12  9:59           ` Daniel Vetter
2019-03-11 22:33     ` Noralf Trønnes
2019-03-12 10:58       ` Daniel Vetter
2019-03-12 13:29         ` Noralf Trønnes
2019-03-13  3:53         ` Ahmed S. Darwish
2019-03-12 22:13       ` Ahmed S. Darwish
2019-03-13  7:49         ` John Ogness
2019-03-13  8:37           ` Daniel Vetter
2019-03-14  2:51             ` Ahmed S. Darwish
2019-03-14  9:32               ` Daniel Vetter
2019-03-14  9:43                 ` John Ogness
2019-03-14  9:52                   ` John Ogness
2019-03-15 10:56                     ` Daniel Vetter
2019-03-13  8:35         ` Daniel Vetter
2019-03-14  4:45           ` Ahmed S. Darwish
2019-03-14  9:35             ` Daniel Vetter [this message]
2019-03-13 10:24         ` Noralf Trønnes
2019-03-13  4:05       ` Ahmed S. Darwish
2019-03-11 19:55   ` Sam Ravnborg
2019-03-12 10:47   ` Michel Dänzer
2019-03-12 16:17     ` [Intel-gfx] " Ville Syrjälä
2019-03-12 17:15       ` Noralf Trønnes
2019-03-12 17:25         ` Ville Syrjälä
2019-03-12 17:37           ` Noralf Trønnes
2019-03-12 17:44             ` Noralf Trønnes
2019-03-12 18:02             ` [Intel-gfx] " Ville Syrjälä
2019-03-13  8:29               ` Christian König
2019-03-13  8:43               ` [Intel-gfx] " Daniel Vetter
2019-03-13  9:35         ` Michel Dänzer
2019-03-13 13:31           ` [Intel-gfx] " Ville Syrjälä
2019-03-13 13:37             ` Christian König
2019-03-13 15:38               ` Michel Dänzer
2019-03-13 15:54                 ` [Intel-gfx] " Christian König
2019-03-13 16:16                   ` Kazlauskas, Nicholas
2019-03-13 17:30                     ` Koenig, Christian
2019-03-13 17:33                     ` Michel Dänzer
2019-03-13 17:41                       ` Kazlauskas, Nicholas
2019-03-14  9:50                         ` Daniel Vetter
2019-03-14 12:44                           ` Kazlauskas, Nicholas
2019-03-15 10:58                             ` [Intel-gfx] " Daniel Vetter
2019-03-13 17:52                       ` Koenig, Christian
2019-03-14  9:40                   ` [Intel-gfx] " Daniel Vetter
2019-03-11 17:42 ` [PATCH v2 2/3] drm/cma-helper: Add support for panic screen Noralf Trønnes
2019-03-11 17:42 ` [PATCH v2 3/3] drm/vc4: Support " Noralf Trønnes
2019-03-11 18:53 ` [PATCH v2 0/3] drm: Add panic handling Daniel Vetter
2019-03-12  9:09 ` ✗ Fi.CI.CHECKPATCH: warning for " Patchwork
2019-03-12  9:12 ` ✗ Fi.CI.SPARSE: " Patchwork
2019-03-12  9:50 ` ✓ Fi.CI.BAT: success " Patchwork
2019-03-12 10:12 ` ✗ Fi.CI.BAT: failure for drm: Add panic handling (rev2) Patchwork
2019-03-17 23:06 ` [PATCH v2 0/3] drm: Add panic handling Ahmed S. Darwish
2019-03-25  8:42   ` Daniel Vetter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190314093535.GN2665@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=bigeasy@linutronix.de \
    --cc=daniel.vetter@ffwll.ch \
    --cc=darwish.07@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=john.ogness@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.