From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
	John Ogness <john.ogness@linutronix.de>,
	Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] printk: disable optimistic spin during panic
Date: Thu, 27 Jan 2022 16:11:08 +0900
Message-ID: <YfJFjHdg/khNXiRd@google.com>
In-Reply-To: <87tudqwegy.fsf@stepbren-lnx.us.oracle.com>

On (22/01/26 10:15), Stephen Brennan wrote:
[..]
> > On (22/01/26 10:51), John Ogness wrote:
> >> > Is there something that prevents the panic CPU from NMI-halting a
> >> > CPU which is in console_trylock() under raw_spin_lock_irqsave()?
> >> >
> >> >  CPU0				CPU1
> >> > 				console_trylock_spinning()
> >> > 				 console_trylock()
> >> > 				  down_trylock()
> >> > 				   raw_spin_lock_irqsave(&sem->lock)
> >> >
> >> >  panic()
> >> >   crash_smp_send_stop()
> >> >    NMI 			-> 		HALT
> >> 
> >> This is a good point. I wonder if console_flush_on_panic() should
> >> perform a sema_init() before it does console_trylock().
> >
> > A long time ago there was a zap_locks() function in printk that used
> > to re-init the console semaphore and the logbuf spin_lock, but _only_
> > in case of printk recursion (which was never reliable)
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/printk/printk.c?h=v4.9.297#n1557
> >
> > This has been superseded by printk_safe per-CPU buffers so we removed
> > that function.
> >
> > So it may be that we want to introduce something similar to
> > zap_locks() again.
> >
> > All reasonable serial console drivers should take oops_in_progress into
> > consideration in ->write(), so we probably don't care about console_drivers
> > spinlocks, etc., but potentially can do a bit better on the printk side.
> 
> I see the concern here. If a CPU is halted while holding
> console_sem.lock spinlock, then the very next printk would hang, since
> each vprintk_emit() does a trylock.

Right. So I also thought about placing panic_in_progress() somewhere in
console_trylock() and making it fail for anything that is not a panic CPU.
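
To make that idea concrete, here is a minimal user-space sketch (C11, not
kernel code). All names with a _model suffix are hypothetical stand-ins:
panic_cpu_model mimics the kernel's panic_cpu, and a single atomic flag
stands in for console_sem. It only shows the early bail-out for non-panic
CPUs; it does not help a CPU that was already inside the sem->lock
critical section when the NMI arrived.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PANIC_CPU_INVALID (-1)

/* Model of panic_cpu: ID of the CPU that called panic(), or INVALID. */
static atomic_int panic_cpu_model = PANIC_CPU_INVALID;

/* Model of console_sem: true while some CPU owns the console. */
static atomic_bool console_owned_model = false;

static bool panic_in_progress_model(void)
{
	return atomic_load(&panic_cpu_model) != PANIC_CPU_INVALID;
}

/*
 * Hypothetical console_trylock() variant: once a panic is in progress,
 * refuse the lock for every CPU except the panic CPU, before touching
 * the lock itself. Returns true if the caller now owns the console.
 */
static bool console_trylock_model(int this_cpu)
{
	bool expected = false;

	assert(this_cpu >= 0);

	if (panic_in_progress_model() &&
	    atomic_load(&panic_cpu_model) != this_cpu)
		return false;

	return atomic_compare_exchange_strong(&console_owned_model,
					      &expected, true);
}

static void console_unlock_model(void)
{
	atomic_store(&console_owned_model, false);
}
```

In this model, a non-panic CPU stops contending for the console as soon as
panic_cpu_model is set, while the panic CPU still goes through the normal
trylock path.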

> Now in my thousands of iterations of tests, I haven't been lucky enough
> to interrupt a CPU in the middle of this critical section. The critical
> section itself is incredibly short and so it's hard to do it. Not
> impossible, I'd imagine.

I can imagine that the race window is really small, and I'm not insisting
on fixing it right now (or ever for that matter).

Basically, we now have two different "something bad is in progress"
indicators that affect two different ends of the call stack. bust_spinlocks()
sets oops_in_progress, which affects console drivers' spinlocks but has
no meaning to any other printk locks. And then we have panic_in_progress(),
which is meaningful to some printk locks, but not to all of them, and is
meaningless to console drivers, because those look at oops_in_progress.

If printk folks are fine with that then I'm also fine.

> We can't fix it in console_flush_on_panic(), because that is called much
> later, after we've called the panic notifiers, which definitely call
> printk(). If we wanted to re-initialize the console_sem, we'd want it
> done earlier in panic(), directly after the NMI was sent.

Right.

> My understanding was that we can't be too cautious regarding the console
> drivers. Sure, they _shouldn't_ have any race conditions, but once we're
> in panic we're better off avoiding the console drivers unless it's our
> last choice. So, is it worth re-initializing the console_sem early in
> panic, which forces all the subsequent printk to go out to the consoles?
> I don't know.
>
> One alternative is to do __printk_safe_enter() at the beginning of
> panic. This effectively guarantees that no printk will hit the console
> drivers or even attempt to grab the console_sem. Then, we can do the
> kmsg_dump, do a crash_kexec if configured, and only when all options
> have been exhausted would we reinitialize the console_sem and flush to
> the console. Maybe this is too cautious, but it is an alternative.

Back in the day we also had this idea of "detaching" non-panic CPUs from
printk() by overwriting their printk function pointers.
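
For illustration only, the "detaching" idea could look roughly like the
user-space model below. Every name here is hypothetical, and the fixed
four-CPU table is an assumption made to keep the sketch small; it is not
the code that was discussed back then.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define NR_CPUS_MODEL 4

typedef int (*printk_func_model_t)(const char *fmt, va_list args);

/* Normal path: actually emit the message (stderr stands in for the log). */
static int vprintk_default_model(const char *fmt, va_list args)
{
	return vfprintf(stderr, fmt, args);
}

/* Detached path: drop the message without touching any lock. */
static int vprintk_detached_model(const char *fmt, va_list args)
{
	(void)fmt;
	(void)args;
	return 0;
}

/* One printk function pointer per (modelled) CPU. */
static printk_func_model_t printk_func_model[NR_CPUS_MODEL] = {
	vprintk_default_model, vprintk_default_model,
	vprintk_default_model, vprintk_default_model,
};

/* Called by the panic CPU: every other CPU's printk becomes a no-op. */
static void detach_non_panic_cpus_model(int panic_cpu)
{
	assert(panic_cpu >= 0 && panic_cpu < NR_CPUS_MODEL);

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (cpu != panic_cpu)
			printk_func_model[cpu] = vprintk_detached_model;
}

/* printk() as seen from a given CPU: dispatch through its pointer. */
static int printk_model(int cpu, const char *fmt, ...)
{
	va_list args;
	int ret;

	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	va_start(args, fmt);
	ret = printk_func_model[cpu](fmt, args);
	va_end(args);
	return ret;
}
```

After detach_non_panic_cpus_model(0), only CPU 0 still reaches the real
output path; messages from the other CPUs are silently dropped.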
