From: Petr Mladek <pmladek@suse.com>
To: John Ogness <john.ogness@linutronix.de>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Subject: Re: [PATCH printk v1 11/13] printk: reimplement console_lock for proper kthread support
Date: Wed, 23 Feb 2022 11:19:43 +0100 [thread overview]
Message-ID: <YhYKP/UuSKENGwfj@alley> (raw)
In-Reply-To: <20220207194323.273637-12-john.ogness@linutronix.de>
On Mon 2022-02-07 20:49:21, John Ogness wrote:
> Rather than console_lock() callers holding the per-console mutex
> for the duration of the console lock, the per-console mutex is only
> taken in order to set a new CON_PAUSED flag, which is checked by
> the kthread printers. This avoids any issues due to nested locking
> between the various per-console mutexes.
>
> diff --git a/include/linux/console.h b/include/linux/console.h
> index 0f94b1771df8..c51c7f5507a5 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -16,6 +16,7 @@
>
> #include <linux/atomic.h>
> #include <linux/types.h>
> +#include <linux/mutex.h>
>
> struct vc_data;
> struct console_font_op;
> @@ -136,6 +137,7 @@ static inline int con_debug_leave(void)
> #define CON_ANYTIME (16) /* Safe to call before per-cpu resources ready */
> #define CON_BRL (32) /* Used for a braille device */
> #define CON_EXTENDED (64) /* Use the extended output format a la /dev/kmsg */
> +#define CON_PAUSED (128) /* Sleep while console is locked */
I was few times confused by the name. It sounds like that the console
does not show any messages as all. I mean that it is something as
console_suspend().
It is not easy to find a good short name explaining that it is only
about kthreads. What about CON_DIRECT?
>
> struct console {
> char name[16];
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index e182f31fec58..135fbe647092 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -214,6 +214,26 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> /* Number of registered extended console drivers. */
> static int nr_ext_console_drivers;
>
> +/*
> + * Used to synchronize printing kthreads against direct printing via
> + * console_trylock/console_unlock.
> + *
> + * Values:
> + * -1 = console locked (via trylock), kthreads will not print
> + * 0 = no kthread printing, console not locked (via trylock)
> + * >0 = kthread(s) actively printing
> + *
> + * Note: For synchronizing against direct printing via
> + * console_lock/console_unlock, see the @lock variable in
> + * struct console.
> + */
> +static atomic_t console_lock_count = ATOMIC_INIT(0);
I am getting lost in all the variables called lock.
What about calling this "console_kthreads_active"? The counter is
incremented when a kthread is actively printing.
> +#define console_excl_trylock() (atomic_cmpxchg(&console_lock_count, 0, -1) == 0)
> +#define console_excl_unlock() atomic_cmpxchg(&console_lock_count, -1, 0)
> +#define console_printer_tryenter() atomic_inc_unless_negative(&console_lock_count)
> +#define console_printer_exit() atomic_dec(&console_lock_count)
Similar here. What about the following
#define console_kthreads_atomic_tryblock() (atomic_cmpxchg(&console_kthreads_active, 0, -1) == 0)
#define console_kthreads_atomic_unblock() atomic_cmpxchg(&console_kthreads_active, -1, 0)
#define console_kthread_tryenter() atomic_inc_unless_negative(&console_kthreads_active)
#define console_kthread_exit() atomic_dec(&console_kthreads_active)
It is more clear that it is an "atomic" way to block/pause the "kthreads".
I do not resist on the word "block". I somehow like it more than
"pause". But "pause" is good as well when used together with "kthreads".
> +
> /*
> * Helper macros to handle lockdep when locking/unlocking console_sem. We use
> * macros instead of functions so that _RET_IP_ contains useful information.
> @@ -256,6 +276,37 @@ static void __up_console_sem(unsigned long ip)
> }
> #define up_console_sem() __up_console_sem(_RET_IP_)
>
> +/*
> + * Tracks whether kthread printers are all paused. A value of true implies
> + * that the console is locked via console_lock() or the console is suspended.
> + * Reading and writing to this variable requires holding @console_sem.
> + */
> +static bool consoles_paused;
> +
> +/*
> + * Pause or unpause all kthread printers.
> + *
> + * Requires the console_lock.
> + */
> +static void __pause_all_consoles(bool do_pause)
Again. It feels like no consoles will work anymore. What about
console_kthreads_block().
> +{
> + struct console *con;
> +
> + for_each_console(con) {
> + mutex_lock(&con->lock);
> + if (do_pause)
> + con->flags |= CON_PAUSED;
> + else
> + con->flags &= ~CON_PAUSED;
> + mutex_unlock(&con->lock);
> + }
> +
> + consoles_paused = do_pause;
> +}
> +
> +#define pause_all_consoles() __pause_all_consoles(true)
> +#define unpause_all_consoles() __pause_all_consoles(false)
> +
> /*
> * This is used for debugging the mess that is the VT code by
> * keeping track if we have the console semaphore held. It's
> @@ -2568,15 +2616,45 @@ int console_trylock(void)
> up_console_sem();
> return 0;
> }
> + if (!console_excl_trylock()) {
> + up_console_sem();
> + return 0;
> + }
> console_locked = 1;
> console_may_schedule = 0;
> return 1;
> }
> EXPORT_SYMBOL(console_trylock);
>
> +/*
> + * A variant of console_trylock() that allows specifying if the context may
> + * sleep. If yes, a trylock on @console_sem is attempted and if successful,
> + * the threaded printers are paused. This is important to ensure that
> + * sleepable contexts do not become involved in console_lock handovers and
> + * will call cond_resched() during the printing loop.
> + */
> +static int console_trylock_sched(bool may_schedule)
> +{
> + if (!may_schedule)
> + return console_trylock();
> +
> + might_sleep();
> +
> + if (down_trylock_console_sem())
> + return 0;
> + if (console_suspended) {
> + up_console_sem();
> + return 0;
> + }
> + pause_all_consoles();
This is weird. Any trylock function should be fast and non-blocking.
But pause_all_consoles() uses mutex_lock().
My expectation would be that console_trylock_sched() behaves
excatly the same as console_trylock() except that it will
set console_may_schedule by the given parameter.
I would do it the other way. Rename console_trylock() and
implement:
int console_trylock(void)
{
return console_trylock_sched(false);
}
LATER: I got it. It is used by console_trylock_sched() called
in console_unlock() when "do_cond_resched == true". In this
case, the trylock might wait for the mutexes. It will prevent
transfering console_lock from schedulable to atomic context
by the check in console_emit_next_record().
Hmm, I would still prefer to keep console_trylock_sched()
behavior sane: non-blocking in all situations. It means
that we actually do not need it and console_trylock()
is enough.
It will allow to steal console_lock() from schedulable
context. But it is not a regression. And it is only
a corner case when console_unlock() re-takes the semaphore
after releasing it.
We could do the same optimization in console_unlock() by
calling console_emit_next_record() with NULL handover pointer
when do_cond_resched == true. But we should do it
as a separate patch later.
> + console_locked = 1;
> + console_may_schedule = 1;
> + return 1;
> +}
> +
> int is_console_locked(void)
> {
> - return console_locked;
> + return (console_locked || atomic_read(&console_lock_count));
> }
> EXPORT_SYMBOL(is_console_locked);
>
> @@ -2610,6 +2688,19 @@ static inline bool console_is_usable(struct console *con)
> static void __console_unlock(void)
> {
> console_locked = 0;
> +
> + /*
> + * Depending on whether console_lock() or console_trylock() was used,
> + * appropriately allow the kthread printers to continue.
> + */
> + if (consoles_paused)
> + unpause_all_consoles();
> + else
> + console_excl_unlock();
This is a good example when the meaning of console_excl_lock() is not
obvious. IHMO, the following is easier to understand:
if (console_kthreads_blocked)
console_kthreads_unblock();
else
console_kthreads_atomic_unblock();
> +
> + /* Wake the kthread printers. */
> + wake_up_klogd();
> +
> up_console_sem();
> }
>
> @@ -2632,7 +2723,8 @@ static void __console_unlock(void)
> *
> * @handover will be set to true if a printk waiter has taken over the
> * console_lock, in which case the caller is no longer holding the
> - * console_lock. Otherwise it is set to false.
> + * console_lock. Otherwise it is set to false. A NULL pointer may be provided
> + * to disable allowing the console_lock to be taken over by a printk waiter.
> */
> static bool console_emit_next_record(struct console *con, char *text, char *ext_text,
> char *dropped_text, bool *handover)
> @@ -2640,12 +2732,14 @@ static bool console_emit_next_record(struct console *con, char *text, char *ext_
> struct printk_info info;
> struct printk_record r;
> unsigned long flags;
> + bool allow_handover;
> char *write_text;
> size_t len;
>
> prb_rec_init_rd(&r, &info, text, CONSOLE_LOG_MAX);
>
> - *handover = false;
> + if (handover)
> + *handover = false;
>
> if (!prb_read_valid(prb, con->seq, &r))
> return false;
> @@ -2671,18 +2765,23 @@ static bool console_emit_next_record(struct console *con, char *text, char *ext_
> len = record_print_text(&r, console_msg_format & MSG_FORMAT_SYSLOG, printk_time);
> }
>
> - /*
> - * While actively printing out messages, if another printk()
> - * were to occur on another CPU, it may wait for this one to
> - * finish. This task can not be preempted if there is a
> - * waiter waiting to take over.
> - *
> - * Interrupts are disabled because the hand over to a waiter
> - * must not be interrupted until the hand over is completed
> - * (@console_waiter is cleared).
> - */
> - printk_safe_enter_irqsave(flags);
> - console_lock_spinning_enable();
> + /* Handovers may only happen between trylock contexts. */
> + allow_handover = (handover && atomic_read(&console_lock_count) == -1);
It is far from obvious why the check for console_lock_count is needed.
I found the reason only after removing the check and seeing the
warning about using mutex_unlock() in atomic context.
I suggest something like:
/*
* Allow to handover the lock only from atomic context. The handover
* is used to transfer console owner between more printk() callers.
* They are not allowed to unblock the console kthreads by mutexes
* in atomic context.
*/
allow_handover = (handover && console_kthreads_atomically_blocked());
> +
> + if (allow_handover) {
> + /*
> + * While actively printing out messages, if another printk()
> + * were to occur on another CPU, it may wait for this one to
> + * finish. This task can not be preempted if there is a
> + * waiter waiting to take over.
> + *
> + * Interrupts are disabled because the hand over to a waiter
> + * must not be interrupted until the hand over is completed
> + * (@console_waiter is cleared).
> + */
> + printk_safe_enter_irqsave(flags);
> + console_lock_spinning_enable();
> + }
>
> stop_critical_timings(); /* don't trace print latency */
> call_console_driver(con, write_text, len, dropped_text);
> @@ -2825,7 +2926,7 @@ void console_unlock(void)
> * Re-check if there is a new record to flush. If the trylock
> * fails, another context is already handling the printing.
> */
> - } while (prb_read_valid(prb, next_seq, NULL) && console_trylock());
> + } while (prb_read_valid(prb, next_seq, NULL) && console_trylock_sched(do_cond_resched));
> }
> EXPORT_SYMBOL(console_unlock);
>
> @@ -2856,6 +2957,10 @@ void console_unblank(void)
> if (oops_in_progress) {
> if (down_trylock_console_sem() != 0)
> return;
> + if (!console_excl_trylock()) {
> + up_console_sem();
> + return;
> + }
It would be better to use
if (oops_in_progress) {
if (!console_trylock())
return;
> } else {
> pr_flush(1000, true);
> console_lock();
> @@ -3397,16 +3502,17 @@ static bool printer_should_wake(struct console *con, u64 seq)
> if (kthread_should_stop())
> return true;
>
> - if (console_suspended)
> - return false;
> -
> /*
> * This is an unsafe read to con->flags, but false positives
> * are not an issue as long as they are rare.
> */
> flags = data_race(READ_ONCE(con->flags));
> - if (!(flags & CON_ENABLED))
> +
> + if (!(flags & CON_ENABLED) ||
> + (flags & CON_PAUSED) ||
> + atomic_read(&console_lock_count) == -1) {
> return false;
> + }
I would remove also this and do the after waking up. I think
that it is actually already being discussed in another patch.
>
> return prb_read_valid(prb, seq, NULL);
> }
Otherwise, it looks good.
This must have been hard to invent. Great job!
Best Regards,
Petr
next prev parent reply other threads:[~2022-02-23 10:19 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-07 19:43 [PATCH printk v1 00/13] implement threaded console printing John Ogness
2022-02-07 19:43 ` [PATCH printk v1 01/13] printk: rename cpulock functions John Ogness
2022-02-11 12:44 ` Petr Mladek
2022-02-11 14:42 ` John Ogness
2022-02-11 20:57 ` Steven Rostedt
2022-02-11 21:04 ` Peter Zijlstra
2022-02-15 9:32 ` Petr Mladek
2022-02-15 9:13 ` Petr Mladek
2022-02-14 6:49 ` Sergey Senozhatsky
2022-02-14 9:45 ` John Ogness
2022-02-15 9:29 ` Petr Mladek
2022-02-16 3:27 ` Sergey Senozhatsky
2022-02-17 14:34 ` John Ogness
2022-02-07 19:43 ` [PATCH printk v1 02/13] printk: cpu sync always disable interrupts John Ogness
2022-02-11 12:58 ` Petr Mladek
2022-02-14 6:36 ` Sergey Senozhatsky
2022-02-07 19:43 ` [PATCH printk v1 03/13] printk: use percpu flag instead of cpu_online() John Ogness
2022-02-11 16:05 ` Petr Mladek
2022-02-14 7:08 ` Sergey Senozhatsky
2022-02-14 7:35 ` Sergey Senozhatsky
2022-02-15 10:38 ` Petr Mladek
2022-02-16 3:29 ` Sergey Senozhatsky
2022-03-02 14:21 ` John Ogness
2022-03-04 15:56 ` Petr Mladek
2022-03-05 17:05 ` Jason A. Donenfeld
2022-03-07 16:14 ` Petr Mladek
2022-02-16 13:58 ` two locations: was: " Petr Mladek
2022-03-02 14:49 ` John Ogness
2022-03-04 16:14 ` Petr Mladek
2022-03-07 10:06 ` John Ogness
2022-03-08 16:08 ` Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 04/13] printk: get caller_id/timestamp after migration disable John Ogness
2022-02-15 5:53 ` Sergey Senozhatsky
2022-02-15 11:56 ` Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 05/13] printk: call boot_delay_msec() in printk_delay() John Ogness
2022-02-15 5:58 ` Sergey Senozhatsky
2022-02-15 14:59 ` Petr Mladek
2022-02-16 3:21 ` Sergey Senozhatsky
2022-02-15 15:03 ` Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 06/13] printk: refactor and rework printing logic John Ogness
2022-02-16 15:43 ` Petr Mladek
2022-03-02 16:10 ` John Ogness
2022-02-07 19:43 ` [PATCH printk v1 07/13] printk: move buffer definitions into console_emit_next_record() caller John Ogness
2022-02-16 16:10 ` Petr Mladek
2022-03-02 16:25 ` John Ogness
2022-02-07 19:43 ` [PATCH printk v1 08/13] printk: add pr_flush() John Ogness
2022-02-17 10:11 ` Petr Mladek
2022-03-02 17:23 ` John Ogness
2022-03-04 13:24 ` Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 09/13] printk: add functions to allow direct printing John Ogness
2022-02-17 12:52 ` Petr Mladek
2022-02-18 9:00 ` David Laight
2022-02-18 12:52 ` Petr Mladek
2022-03-03 14:37 ` John Ogness
2022-02-07 19:43 ` [PATCH printk v1 10/13] printk: add kthread console printers John Ogness
2022-02-18 9:00 ` early start: was: " Petr Mladek
2022-02-18 9:04 ` start&stop: " Petr Mladek
2022-02-18 9:08 ` main loop: " Petr Mladek
2022-02-18 9:12 ` wake_up_all: " Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 11/13] printk: reimplement console_lock for proper kthread support John Ogness
2022-02-18 16:20 ` Petr Mladek
2022-02-18 21:41 ` John Ogness
2022-02-18 22:03 ` John Ogness
2022-02-22 11:42 ` Petr Mladek
2022-02-23 17:20 ` John Ogness
2022-02-24 8:27 ` Petr Mladek
2022-02-23 10:19 ` Petr Mladek [this message]
2022-03-09 13:56 ` John Ogness
2022-03-10 14:34 ` Petr Mladek
2022-03-10 16:08 ` John Ogness
2022-03-11 10:26 ` Petr Mladek
2022-03-11 13:28 ` John Ogness
2022-03-11 16:17 ` Petr Mladek
2022-03-11 22:21 ` John Ogness
2022-03-14 14:08 ` Petr Mladek
2022-03-14 14:43 ` John Ogness
2022-03-14 15:53 ` Petr Mladek
2022-03-11 18:41 ` Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 12/13] printk: remove @console_locked John Ogness
2022-02-23 12:17 ` Petr Mladek
2022-02-07 19:43 ` [PATCH printk v1 13/13] console: introduce CON_MIGHT_SLEEP for vt John Ogness
2022-02-23 13:37 ` Petr Mladek
2022-02-23 18:31 ` Greg Kroah-Hartman
[not found] ` <20220208083620.2736-1-hdanton@sina.com>
2022-02-08 11:08 ` [PATCH printk v1 10/13] printk: add kthread console printers John Ogness
2022-02-08 14:53 ` Petr Mladek
2022-02-14 6:12 ` Sergey Senozhatsky
2022-02-14 10:02 ` Petr Mladek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YhYKP/UuSKENGwfj@alley \
--to=pmladek@suse.com \
--cc=gregkh@linuxfoundation.org \
--cc=john.ogness@linutronix.de \
--cc=linux-kernel@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=senozhatsky@chromium.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).