All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: paulmck@kernel.org, linux-kernel@vger.kernel.org,
	frederic@kernel.org,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [BUG] 8e274732115f ("printk: extend console_lock for per-console locking")
Date: Mon, 13 Jun 2022 17:07:47 +0200	[thread overview]
Message-ID: <YqdSw/fJvnkRbjvc@alley> (raw)
In-Reply-To: <YqcSJyTnuRVAPmLI@alley>

On Mon 2022-06-13 12:32:08, Petr Mladek wrote:
> On Mon 2022-06-13 11:10:19, John Ogness wrote:
> > On 2022-06-12, "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > >> As I suspected, the final printk's cannot direct print because the
> > >> kthread was printing. Using the below patch did seem to address your
> > >> problem. But this is probably not the way forward.
> > >
> > > When I apply it, I still lose output, perhaps due to different timing?
> > > Doing the pr_flush(1000, true) just before the call to kernel_power_off()
> > > has been working quite well thus far, though.
> > 
> > @Petr, I like the idea of the kthreads getting out of the way rather
> > than trying to direct print themselves (for this situation). It still
> > isn't optimal because that final pr_emerg("Power down\n") might come
> > before the kthread has finished its current line. But in that case the
> > kthread may not have much a chance to finish the printing anyway.
> 
> I wonder if we could somehow combine it with pr_flush(timeout).
> 
> The kthread might bail-out when pr_flush() is running. It will
> know that someone would continue printing. The timeout might
> help to avoid a deadlock. We could somehow reuse
> console_trylock_spinning() code here.
> > 
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index ea3dd55709e7..45c6c2b0b104 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -3729,7 +3729,9 @@ static bool printer_should_wake(struct console *con, u64 seq)
> >  		return true;
> >  
> >  	if (con->blocked ||
> > -	    console_kthreads_atomically_blocked()) {
> > +	    console_kthreads_atomically_blocked() ||
> > +	    system_state > SYSTEM_RUNNING ||
> > +	    oops_in_progress) {
> >  		return false;
> >  	}
> 
> Also this is an interesting idea. We know that panic() will try
> to flush the messages. Well, panic() is not always called
> after Oops.

I tried to combine the various approaches and findings into
the following patch.

It introduces try_block_console_kthread() function. It tries
to summon console_lock() so that it is later available for
printk().

The function causes that console kthreads will not longer
handle the messages. Also it waits until all kthreads leave
the critical section and allow to take the global console lock.

It can be safely called in atomic context because it uses
busy wait by udelay(). Infinite waiting is prevented by
timeout currectly hardcoded to 10s.

The function is used in panic() and kernel_shutdown_prepare()
when we know that the system is going down and kthreads will
not be usable any longer.


Problems:

  + Just the best effort.
  + Does not help on single CPU system.


Testing:

I tried Paul's RCU torture test few times. It always failed without
this patch and always succeded with this patch. If I got the resutls
corretly.


Possible improvements and alternative approaches:

   + Prevent console kthreads from sleeping in critical
     section. Replace mutex with a spinlock?

   + Allow to steal con->lock similar way like the global
     console_sem.

   + Modify console kthreads process priority to make them
     scheduled immediately on another CPU.


What do you think, please?


diff --git a/include/linux/printk.h b/include/linux/printk.h
index 10ec29bc0135..78f27ca15922 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -173,6 +173,7 @@ extern void printk_prefer_direct_enter(void);
 extern void printk_prefer_direct_exit(void);
 
 extern bool pr_flush(int timeout_ms, bool reset_on_progress);
+extern void try_block_console_kthreads(int timeout_ms);
 
 /*
  * Please don't use printk_ratelimit(), because it shares ratelimiting state
diff --git a/kernel/panic.c b/kernel/panic.c
index a3c758dba15a..4cf13c37bd08 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -297,6 +297,7 @@ void panic(const char *fmt, ...)
 		 * unfortunately means it may not be hardened to work in a
 		 * panic situation.
 		 */
+		try_block_console_kthreads(10000);
 		smp_send_stop();
 	} else {
 		/*
@@ -304,6 +305,7 @@ void panic(const char *fmt, ...)
 		 * kmsg_dump, we will need architecture dependent extra
 		 * works in addition to stopping other CPUs.
 		 */
+		try_block_console_kthreads(10000);
 		crash_smp_send_stop();
 	}
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ea3dd55709e7..0bcd4f5cf2fc 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -250,6 +250,8 @@ static atomic_t console_kthreads_active = ATOMIC_INIT(0);
 #define console_kthread_printing_exit() \
 	atomic_dec(&console_kthreads_active)
 
+static bool block_console_kthreads;
+
 /*
  * Helper macros to handle lockdep when locking/unlocking console_sem. We use
  * macros instead of functions so that _RET_IP_ contains useful information.
@@ -3686,6 +3688,24 @@ bool pr_flush(int timeout_ms, bool reset_on_progress)
 }
 EXPORT_SYMBOL(pr_flush);
 
+void try_block_console_kthreads(int timeout_ms)
+{
+	block_console_kthreads = true;
+
+	while (timeout_ms > 0) {
+		if (console_trylock()) {
+			console_unlock();
+			return;
+		}
+
+		udelay(1000);
+		timeout_ms -= 1;
+	}
+
+	/* Failed to block console kthreads. Let them do the job. */
+	block_console_kthreads = false;
+}
+
 static void __printk_fallback_preferred_direct(void)
 {
 	printk_prefer_direct_enter();
@@ -3729,7 +3749,8 @@ static bool printer_should_wake(struct console *con, u64 seq)
 		return true;
 
 	if (con->blocked ||
-	    console_kthreads_atomically_blocked()) {
+	    console_kthreads_atomically_blocked() ||
+	    block_console_kthreads) {
 		return false;
 	}
 
diff --git a/kernel/reboot.c b/kernel/reboot.c
index a091145ee710..222bdd076cd0 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -270,6 +270,7 @@ static void kernel_shutdown_prepare(enum system_states state)
 	blocking_notifier_call_chain(&reboot_notifier_list,
 		(state == SYSTEM_HALT) ? SYS_HALT : SYS_POWER_OFF, NULL);
 	system_state = state;
+	try_block_console_kthreads(10000);
 	usermodehelper_disable();
 	device_shutdown();
 }

  reply	other threads:[~2022-06-13 18:49 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-10 20:50 [BUG] 8e274732115f ("printk: extend console_lock for per-console locking") Paul E. McKenney
2022-06-12 13:23 ` Paul E. McKenney
2022-06-12 16:03   ` John Ogness
2022-06-12 16:30     ` Paul E. McKenney
2022-06-13 11:01     ` Petr Mladek
2022-06-13  2:12 ` John Ogness
2022-06-13  4:29   ` Paul E. McKenney
2022-06-13  9:04     ` John Ogness
2022-06-13 10:32       ` Petr Mladek
2022-06-13 15:07         ` Petr Mladek [this message]
2022-06-13 15:51           ` Paul E. McKenney
2022-06-13 15:44       ` Paul E. McKenney
2022-06-13 10:50 ` Petr Mladek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YqdSw/fJvnkRbjvc@alley \
    --to=pmladek@suse.com \
    --cc=frederic@kernel.org \
    --cc=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=senozhatsky@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.