linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Atomicity in KMS panic notifier
@ 2014-05-05 13:02 Takashi Iwai
  2014-05-05 14:29 ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Takashi Iwai @ 2014-05-05 13:02 UTC (permalink / raw)
  To: dri-devel; +Cc: linux-kernel

Hi,

while debugging a few reported bugs, I noticed that
drm_fb_helper_force_kernel_mode() that is called in the KMS panic
notifier isn't really atomic-safe.  It invokes crtc's set_config(),
and all implementations seem to involve with page allocations (kmalloc
with GFP_KERNEL, via some ttm ops, etc).  I've actually seen the Oops
with cirrus KMS during panic due to this.

Does anyone have an idea to fix this?  I thought of re-using
drm_fb_helper_debug_enter(), but this won't work with many drivers
that don't have crtc->mode_set_base_atomic(), either (yeah, this is
another bug).


thanks,

Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-05 13:02 Atomicity in KMS panic notifier Takashi Iwai
@ 2014-05-05 14:29 ` Daniel Vetter
  2014-05-05 14:48   ` Takashi Iwai
  2014-05-07 16:15   ` One Thousand Gnomes
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Vetter @ 2014-05-05 14:29 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: dri-devel, Linux Kernel Mailing List, David Herrmann

On Mon, May 5, 2014 at 3:02 PM, Takashi Iwai <tiwai@suse.de> wrote:
> Hi,
>
> while debugging a few reported bugs, I noticed that
> drm_fb_helper_force_kernel_mode() that is called in the KMS panic
> notifier isn't really atomic-safe.  It invokes crtc's set_config(),
> and all implementations seem to involve with page allocations (kmalloc
> with GFP_KERNEL, via some ttm ops, etc).  I've actually seen the Oops
> with cirrus KMS during panic due to this.
>
> Does anyone have an idea to fix this?  I thought of re-using
> drm_fb_helper_debug_enter(), but this won't work with many drivers
> that don't have crtc->mode_set_base_atomic(), either (yeah, this is
> another bug).

David Herrmann has a long-term plan to address this, using a much more
minimal panic console (so that we can avoid all the fbcon madness) and
adding a new driver callback to deliver a pointer to whatever
framebuffers are currently displayed. If it's possible to obtain such
a pointer in atomic contexts. Then we'd rip out all the existing panic
notifier and handler stuff in fbcon. We'd also need to disable the
panic notifier fbcon registers itself (since that ends up calling down
into our ->set_par which again ends up in ->set_config).

That should leave us with some very minimal panic code to audit.

Imo trying to fix the current mess and making ->set_config work in
atomic contexts is pointless. drm_can_sleep is trying to make that
possible in some ways, and it's horrible since using it means
busy-loops in atomic contexts outside of panic handlers won't get
reported any more. Also the interactions with the console_lock (which
due to some bonghits is protecting almost everything in fbcon/fbdev
nowadays) would also be almost completely removed.

Unfortunately doing all this is a lot of work.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-05 14:29 ` Daniel Vetter
@ 2014-05-05 14:48   ` Takashi Iwai
  2014-05-05 14:52     ` Daniel Vetter
  2014-05-07 16:15   ` One Thousand Gnomes
  1 sibling, 1 reply; 11+ messages in thread
From: Takashi Iwai @ 2014-05-05 14:48 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel, Linux Kernel Mailing List, David Herrmann

At Mon, 5 May 2014 16:29:37 +0200,
Daniel Vetter wrote:
> 
> On Mon, May 5, 2014 at 3:02 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > Hi,
> >
> > while debugging a few reported bugs, I noticed that
> > drm_fb_helper_force_kernel_mode() that is called in the KMS panic
> > notifier isn't really atomic-safe.  It invokes crtc's set_config(),
> > and all implementations seem to involve with page allocations (kmalloc
> > with GFP_KERNEL, via some ttm ops, etc).  I've actually seen the Oops
> > with cirrus KMS during panic due to this.
> >
> > Does anyone have an idea to fix this?  I thought of re-using
> > drm_fb_helper_debug_enter(), but this won't work with many drivers
> > that don't have crtc->mode_set_base_atomic(), either (yeah, this is
> > another bug).
> 
> David Herrmann has a long-term plan to address this, using a much more
> minimal panic console (so that we can avoid all the fbcon madness) and
> adding a new driver callback to deliver a pointer to whatever
> framebuffers are currently displayed. If it's possible to obtain such
> a pointer in atomic contexts. Then we'd rip out all the existing panic
> notifier and handler stuff in fbcon. We'd also need to disable the
> panic notifier fbcon registers itself (since that ends up calling down
> into our ->set_par which again ends up in ->set_config).
> 
> That should leave us with some very minimal panic code to audit.
> 
> Imo trying to fix the current mess and making ->set_config work in
> atomic contexts is pointless. drm_can_sleep is trying to make that
> possible in some ways, and it's horrible since using it means
> busy-loops in atomic contexts outside of panic handlers won't get
> reported any more. Also the interactions with the console_lock (which
> due to some bonghits is protecting almost everything in fbcon/fbdev
> nowadays) would also be almost completely removed.
> 
> Unfortunately doing all this is a lot of work.

OK, thanks for clarification!

The current problem I see is that the rest of panic notifier chain
won't be called once when we hit the problem in KMS notifier.  So,
this bug in KMS influences on the rest panic behavior.

Maybe a hackish solution would be to keep KMS notifier at the end of
notifier chain so that it crashes at last.  I don't like this either,
but...


Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-05 14:48   ` Takashi Iwai
@ 2014-05-05 14:52     ` Daniel Vetter
  2014-05-05 15:04       ` Takashi Iwai
  2014-05-06 13:27       ` Takashi Iwai
  0 siblings, 2 replies; 11+ messages in thread
From: Daniel Vetter @ 2014-05-05 14:52 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: dri-devel, Linux Kernel Mailing List, David Herrmann

On Mon, May 5, 2014 at 4:48 PM, Takashi Iwai <tiwai@suse.de> wrote:
>
> The current problem I see is that the rest of panic notifier chain
> won't be called once when we hit the problem in KMS notifier.  So,
> this bug in KMS influences on the rest panic behavior.
>
> Maybe a hackish solution would be to keep KMS notifier at the end of
> notifier chain so that it crashes at last.  I don't like this either,
> but...

You need to do that with both the kms panic notifier in
drm_fb_helper.c and with the fbcon panic notifier. And iirc there's
also a console->unblank call somewhere which _also_ can end up in
->set_par. But I'm not sure anymore when exactly that one is run, I've
tried hard to forget this all ;-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-05 14:52     ` Daniel Vetter
@ 2014-05-05 15:04       ` Takashi Iwai
  2014-05-06 13:27       ` Takashi Iwai
  1 sibling, 0 replies; 11+ messages in thread
From: Takashi Iwai @ 2014-05-05 15:04 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel, Linux Kernel Mailing List, David Herrmann

At Mon, 5 May 2014 16:52:45 +0200,
Daniel Vetter wrote:
> 
> On Mon, May 5, 2014 at 4:48 PM, Takashi Iwai <tiwai@suse.de> wrote:
> >
> > The current problem I see is that the rest of panic notifier chain
> > won't be called once when we hit the problem in KMS notifier.  So,
> > this bug in KMS influences on the rest panic behavior.
> >
> > Maybe a hackish solution would be to keep KMS notifier at the end of
> > notifier chain so that it crashes at last.  I don't like this either,
> > but...
> 
> You need to do that with both the kms panic notifier in
> drm_fb_helper.c and with the fbcon panic notifier. And iirc there's
> also a console->unblank call somewhere which _also_ can end up in
> ->set_par. But I'm not sure anymore when exactly that one is run, I've
> tried hard to forget this all ;-)

Oh well, you're suggesting the need for Propranolol overdose ;)


Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-05 14:52     ` Daniel Vetter
  2014-05-05 15:04       ` Takashi Iwai
@ 2014-05-06 13:27       ` Takashi Iwai
  2014-05-06 13:32         ` David Herrmann
  1 sibling, 1 reply; 11+ messages in thread
From: Takashi Iwai @ 2014-05-06 13:27 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: dri-devel, Linux Kernel Mailing List, David Herrmann

At Mon, 5 May 2014 16:52:45 +0200,
Daniel Vetter wrote:
> 
> On Mon, May 5, 2014 at 4:48 PM, Takashi Iwai <tiwai@suse.de> wrote:
> >
> > The current problem I see is that the rest of panic notifier chain
> > won't be called once when we hit the problem in KMS notifier.  So,
> > this bug in KMS influences on the rest panic behavior.
> >
> > Maybe a hackish solution would be to keep KMS notifier at the end of
> > notifier chain so that it crashes at last.  I don't like this either,
> > but...
> 
> You need to do that with both the kms panic notifier in
> drm_fb_helper.c and with the fbcon panic notifier. And iirc there's
> also a console->unblank call somewhere which _also_ can end up in
> ->set_par. But I'm not sure anymore when exactly that one is run, I've
> tried hard to forget this all ;-)

Looking back at the code again, it seems that fbcon has no panic
notifier.  It has own notifier chain, but it's a private chain that
isn't called by the panic.  So, we can forget about fbcon, at least (I
hope).


Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-06 13:27       ` Takashi Iwai
@ 2014-05-06 13:32         ` David Herrmann
  2014-05-06 13:38           ` Takashi Iwai
  0 siblings, 1 reply; 11+ messages in thread
From: David Herrmann @ 2014-05-06 13:32 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Daniel Vetter, dri-devel, Linux Kernel Mailing List

Hi

On Tue, May 6, 2014 at 3:27 PM, Takashi Iwai <tiwai@suse.de> wrote:
> At Mon, 5 May 2014 16:52:45 +0200,
> Daniel Vetter wrote:
>>
>> On Mon, May 5, 2014 at 4:48 PM, Takashi Iwai <tiwai@suse.de> wrote:
>> >
>> > The current problem I see is that the rest of panic notifier chain
>> > won't be called once when we hit the problem in KMS notifier.  So,
>> > this bug in KMS influences on the rest panic behavior.
>> >
>> > Maybe a hackish solution would be to keep KMS notifier at the end of
>> > notifier chain so that it crashes at last.  I don't like this either,
>> > but...
>>
>> You need to do that with both the kms panic notifier in
>> drm_fb_helper.c and with the fbcon panic notifier. And iirc there's
>> also a console->unblank call somewhere which _also_ can end up in
>> ->set_par. But I'm not sure anymore when exactly that one is run, I've
>> tried hard to forget this all ;-)
>
> Looking back at the code again, it seems that fbcon has no panic
> notifier.  It has own notifier chain, but it's a private chain that
> isn't called by the panic.  So, we can forget about fbcon, at least (I
> hope).

fbcon is called through the VT or fbdev layer, which is called by
bust_spinlocks(1) via either unblank_screen() or console_unblank().

Cheers
David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-06 13:32         ` David Herrmann
@ 2014-05-06 13:38           ` Takashi Iwai
  2014-05-06 13:53             ` David Herrmann
  0 siblings, 1 reply; 11+ messages in thread
From: Takashi Iwai @ 2014-05-06 13:38 UTC (permalink / raw)
  To: David Herrmann; +Cc: Daniel Vetter, dri-devel, Linux Kernel Mailing List

At Tue, 6 May 2014 15:32:21 +0200,
David Herrmann wrote:
> 
> Hi
> 
> On Tue, May 6, 2014 at 3:27 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > At Mon, 5 May 2014 16:52:45 +0200,
> > Daniel Vetter wrote:
> >>
> >> On Mon, May 5, 2014 at 4:48 PM, Takashi Iwai <tiwai@suse.de> wrote:
> >> >
> >> > The current problem I see is that the rest of panic notifier chain
> >> > won't be called once when we hit the problem in KMS notifier.  So,
> >> > this bug in KMS influences on the rest panic behavior.
> >> >
> >> > Maybe a hackish solution would be to keep KMS notifier at the end of
> >> > notifier chain so that it crashes at last.  I don't like this either,
> >> > but...
> >>
> >> You need to do that with both the kms panic notifier in
> >> drm_fb_helper.c and with the fbcon panic notifier. And iirc there's
> >> also a console->unblank call somewhere which _also_ can end up in
> >> ->set_par. But I'm not sure anymore when exactly that one is run, I've
> >> tried hard to forget this all ;-)
> >
> > Looking back at the code again, it seems that fbcon has no panic
> > notifier.  It has own notifier chain, but it's a private chain that
> > isn't called by the panic.  So, we can forget about fbcon, at least (I
> > hope).
> 
> fbcon is called through the VT or fbdev layer, which is called by
> bust_spinlocks(1) via either unblank_screen() or console_unblank().

You mean bust_spinlocks(0), right?

void __attribute__((weak)) bust_spinlocks(int yes)
{
	if (yes) {
		++oops_in_progress;
	} else {
#ifdef CONFIG_VT
		unblank_screen();
#endif
		console_unblank();
		if (--oops_in_progress == 0)
			wake_up_klogd();
	}
}

bust_spinlocks(0) is called after the notifier chain, and it's almost
at the end of panic().


Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-06 13:38           ` Takashi Iwai
@ 2014-05-06 13:53             ` David Herrmann
  2014-05-06 14:07               ` Takashi Iwai
  0 siblings, 1 reply; 11+ messages in thread
From: David Herrmann @ 2014-05-06 13:53 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: Daniel Vetter, dri-devel, Linux Kernel Mailing List

Hi

On Tue, May 6, 2014 at 3:38 PM, Takashi Iwai <tiwai@suse.de> wrote:
> At Tue, 6 May 2014 15:32:21 +0200,
> David Herrmann wrote:
>> fbcon is called through the VT or fbdev layer, which is called by
>> bust_spinlocks(1) via either unblank_screen() or console_unblank().
>
> You mean bust_spinlocks(0), right?
>
> void __attribute__((weak)) bust_spinlocks(int yes)
> {
>         if (yes) {
>                 ++oops_in_progress;
>         } else {
> #ifdef CONFIG_VT
>                 unblank_screen();
> #endif
>                 console_unblank();
>                 if (--oops_in_progress == 0)
>                         wake_up_klogd();
>         }
> }
>
> bust_spinlocks(0) is called after the notifier chain, and it's almost
> at the end of panic().

Yes, it's called _after_ the panic-handlers but _before_
console_unlock() (see console_unblank() in printk.c). Therefore, we
call into set_config() before the serial drivers get the panic-message
(flushed via console_unlock()). If the serial drivers (or whatever you
use for debugging) register their own panic-handlers, then they're
fine of course.

Thanks
David

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-06 13:53             ` David Herrmann
@ 2014-05-06 14:07               ` Takashi Iwai
  0 siblings, 0 replies; 11+ messages in thread
From: Takashi Iwai @ 2014-05-06 14:07 UTC (permalink / raw)
  To: David Herrmann; +Cc: Daniel Vetter, dri-devel, Linux Kernel Mailing List

At Tue, 6 May 2014 15:53:32 +0200,
David Herrmann wrote:
> 
> Hi
> 
> On Tue, May 6, 2014 at 3:38 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > At Tue, 6 May 2014 15:32:21 +0200,
> > David Herrmann wrote:
> >> fbcon is called through the VT or fbdev layer, which is called by
> >> bust_spinlocks(1) via either unblank_screen() or console_unblank().
> >
> > You mean bust_spinlocks(0), right?
> >
> > void __attribute__((weak)) bust_spinlocks(int yes)
> > {
> >         if (yes) {
> >                 ++oops_in_progress;
> >         } else {
> > #ifdef CONFIG_VT
> >                 unblank_screen();
> > #endif
> >                 console_unblank();
> >                 if (--oops_in_progress == 0)
> >                         wake_up_klogd();
> >         }
> > }
> >
> > bust_spinlocks(0) is called after the notifier chain, and it's almost
> > at the end of panic().
> 
> Yes, it's called _after_ the panic-handlers but _before_
> console_unlock() (see console_unblank() in printk.c). Therefore, we
> call into set_config() before the serial drivers get the panic-message
> (flushed via console_unlock()). If the serial drivers (or whatever you
> use for debugging) register their own panic-handlers, then they're
> fine of course.

Thanks for clarification.  I see it's at the sensible place.

FWIW, the problem I'm tackling now is the blockage of other panic
notifiers due to drm_fb.  For example, pvpanic isn't executed reliably
because of this when a KMS (either cirrus or qxl) driver is loaded.
So, for me, it's fine that the system stalls after that point :)


Takashi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Atomicity in KMS panic notifier
  2014-05-05 14:29 ` Daniel Vetter
  2014-05-05 14:48   ` Takashi Iwai
@ 2014-05-07 16:15   ` One Thousand Gnomes
  1 sibling, 0 replies; 11+ messages in thread
From: One Thousand Gnomes @ 2014-05-07 16:15 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Takashi Iwai, dri-devel, Linux Kernel Mailing List, David Herrmann

> Imo trying to fix the current mess and making ->set_config work in
> atomic contexts is pointless. drm_can_sleep is trying to make that
> possible in some ways, and it's horrible since using it means
> busy-loops in atomic contexts outside of panic handlers won't get
> reported any more. Also the interactions with the console_lock (which
> due to some bonghits is protecting almost everything in fbcon/fbdev
> nowadays) would also be almost completely removed.

Unfortunately years ago some Finnish student didn't design his console
driver to be lock friendly and then fbcon/fbdev inherited it.

If you are writing a new console driver please don't use fbcon, use a
text console that writes to a simple n x m text framebuffer with dirty
bits and wakes a waitqueue or work queue of some sort when it changes.
Then just redraw updated bits every vblank frame if its actually changed.

Not only will the lock problems go away it'll blow away the existing
scrolling text performance on any DRM driver except gma500.

Alan

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2014-05-07 16:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-05 13:02 Atomicity in KMS panic notifier Takashi Iwai
2014-05-05 14:29 ` Daniel Vetter
2014-05-05 14:48   ` Takashi Iwai
2014-05-05 14:52     ` Daniel Vetter
2014-05-05 15:04       ` Takashi Iwai
2014-05-06 13:27       ` Takashi Iwai
2014-05-06 13:32         ` David Herrmann
2014-05-06 13:38           ` Takashi Iwai
2014-05-06 13:53             ` David Herrmann
2014-05-06 14:07               ` Takashi Iwai
2014-05-07 16:15   ` One Thousand Gnomes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).