* [PATCH] printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
@ 2020-03-03 4:40 Sergey Senozhatsky
2020-03-03 9:18 ` Petr Mladek
0 siblings, 1 reply; 3+ messages in thread
From: Sergey Senozhatsky @ 2020-03-03 4:40 UTC (permalink / raw)
To: Petr Mladek, Steven Rostedt
Cc: linux-kernel, Sergey Senozhatsky, Lech Perczak,
Greg Kroah-Hartman, Theodore Ts'o, John Ogness
printk_deferred(), similarly to printk_safe/printk_nmi,
does not immediately attempt to print a new message on
the consoles, avoiding calls into non-reentrant kernel
paths, e.g. scheduler or timekeeping, which potentially
can deadlock the system. Those printk() flavors, instead,
rely on per-CPU flush irq_work to print messages from
safer contexts. For same reasons (recursive scheduler or
timekeeping calls) printk() uses per-CPU irq_work in
order to wake up user space syslog/kmsg readers.
However, only printk_safe/printk_nmi do make sure that
per-CPU areas have been initialised and that it's safe
to modify per-CPU irq_work. This means that, for instance,
should printk_deferred() be invoked "too early", that
is before per-CPU areas are initialised, printk_deferred()
will perform illegal per-CPU access.
Lech Perczak [0] reports that after commit 1b710b1b10ef
("char/random: silence a lockdep splat with printk()")
user-space syslog/kmsg readers are not able to read new
kernel messages. The reason is printk_deferred() being
called too early (as was pointed out by Petr and John).
Fix printk_deferred() and do not queue per-CPU irq_work
before per-CPU areas are initialized.
[0] https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Lech Perczak <l.perczak@camlintechnologies.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: John Ogness <john.ogness@linutronix.de>
---
kernel/printk/internal.h | 3 +++
kernel/printk/printk.c | 31 +++++++++++++++++++++++++++++++
kernel/printk/printk_safe.c | 11 +----------
3 files changed, 35 insertions(+), 10 deletions(-)
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index c8e6ab689d42..8ed2b7737063 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -23,6 +23,8 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args);
void __printk_safe_enter(void);
void __printk_safe_exit(void);
+bool printk_percpu_data_ready(void);
+
#define printk_safe_enter_irqsave(flags) \
do { \
local_irq_save(flags); \
@@ -64,4 +66,5 @@ __printf(1, 0) int vprintk_func(const char *fmt, va_list args) { return 0; }
#define printk_safe_enter_irq() local_irq_disable()
#define printk_safe_exit_irq() local_irq_enable()
+bool printk_percpu_data_ready(void) { return false; }
#endif /* CONFIG_PRINTK */
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ad4606234545..d951d35a0786 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -461,6 +461,18 @@ static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
static char *log_buf = __log_buf;
static u32 log_buf_len = __LOG_BUF_LEN;
+/*
+ * We cannot access per-CPU data (e.g. per-CPU flush irq_work) before
+ * per_cpu_areas are initialised. This variable is set to true when
+ * it's safe to access per-CPU data.
+ */
+static bool __printk_percpu_data_ready __read_mostly;
+
+bool printk_percpu_data_ready(void)
+{
+ return __printk_percpu_data_ready;
+}
+
/* Return log buffer address */
char *log_buf_addr_get(void)
{
@@ -1147,12 +1159,25 @@ static void __init log_buf_add_cpu(void)
static inline void log_buf_add_cpu(void) {}
#endif /* CONFIG_SMP */
+static void __init set_percpu_data_ready(void)
+{
+ __printk_percpu_data_ready = true;
+}
+
void __init setup_log_buf(int early)
{
unsigned long flags;
char *new_log_buf;
unsigned int free;
+ /*
+ * Some archs call setup_log_buf() multiple times - first is very
+ * early, e.g. from setup_arch(), and second - when percpu_areas
+ * are initialised.
+ */
+ if (!early)
+ set_percpu_data_ready();
+
if (log_buf != __log_buf)
return;
@@ -3009,6 +3034,9 @@ static DEFINE_PER_CPU(struct irq_work, wake_up_klogd_work) = {
void wake_up_klogd(void)
{
+ if (!printk_percpu_data_ready())
+ return;
+
preempt_disable();
if (waitqueue_active(&log_wait)) {
this_cpu_or(printk_pending, PRINTK_PENDING_WAKEUP);
@@ -3019,6 +3047,9 @@ void wake_up_klogd(void)
void defer_console_output(void)
{
+ if (!printk_percpu_data_ready())
+ return;
+
preempt_disable();
__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
irq_work_queue(this_cpu_ptr(&wake_up_klogd_work));
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index b4045e782743..d9a659a686f3 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -27,7 +27,6 @@
* There are situations when we want to make sure that all buffers
* were handled or when IRQs are blocked.
*/
-static int printk_safe_irq_ready __read_mostly;
#define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) - \
sizeof(atomic_t) - \
@@ -51,7 +50,7 @@ static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
/* Get flushed in a more safe context. */
static void queue_flush_work(struct printk_safe_seq_buf *s)
{
- if (printk_safe_irq_ready)
+ if (printk_percpu_data_ready())
irq_work_queue(&s->work);
}
@@ -402,14 +401,6 @@ void __init printk_safe_init(void)
#endif
}
- /*
- * In the highly unlikely event that a NMI were to trigger at
- * this moment. Make sure IRQ work is set up before this
- * variable is set.
- */
- barrier();
- printk_safe_irq_ready = 1;
-
/* Flush pending messages that did not have scheduled IRQ works. */
printk_safe_flush();
}
--
2.25.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
2020-03-03 4:40 [PATCH] printk: queue wake_up_klogd irq_work only if per-CPU areas are ready Sergey Senozhatsky
@ 2020-03-03 9:18 ` Petr Mladek
2020-03-03 10:27 ` Sergey Senozhatsky
0 siblings, 1 reply; 3+ messages in thread
From: Petr Mladek @ 2020-03-03 9:18 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Steven Rostedt, linux-kernel, Lech Perczak, Greg Kroah-Hartman,
Theodore Ts'o, John Ogness
On Tue 2020-03-03 13:40:59, Sergey Senozhatsky wrote:
> Lech Perczak [0] reports that after commit 1b710b1b10ef
> ("char/random: silence a lockdep splat with printk()")
> user-space syslog/kmsg readers are not able to read new
> kernel messages. The reason is printk_deferred() being
> called too early (as was pointed out by Petr and John).
>
> Fix printk_deferred() and do not queue per-CPU irq_work
> before per-CPU areas are initialized.
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index ad4606234545..d951d35a0786 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1147,12 +1159,25 @@ static void __init log_buf_add_cpu(void)
> static inline void log_buf_add_cpu(void) {}
> #endif /* CONFIG_SMP */
>
> +static void __init set_percpu_data_ready(void)
> +{
> + __printk_percpu_data_ready = true;
> +}
> +
> void __init setup_log_buf(int early)
> {
> unsigned long flags;
> char *new_log_buf;
> unsigned int free;
>
> + /*
> + * Some archs call setup_log_buf() multiple times - first is very
> + * early, e.g. from setup_arch(), and second - when percpu_areas
> + * are initialised.
> + */
> + if (!early)
> + set_percpu_data_ready();
> +
> if (log_buf != __log_buf)
> return;
>
> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index b4045e782743..d9a659a686f3 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -27,7 +27,6 @@
a> * There are situations when we want to make sure that all buffers
> * were handled or when IRQs are blocked.
> */
> -static int printk_safe_irq_ready __read_mostly;
>
> #define SAFE_LOG_BUF_LEN ((1 << CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT) - \
> sizeof(atomic_t) - \
> @@ -51,7 +50,7 @@ static DEFINE_PER_CPU(struct printk_safe_seq_buf, nmi_print_seq);
> /* Get flushed in a more safe context. */
> static void queue_flush_work(struct printk_safe_seq_buf *s)
> {
> - if (printk_safe_irq_ready)
> + if (printk_percpu_data_ready())
> irq_work_queue(&s->work);
This is not safe. printk_percpu_data_ready() returns true even before
s->work gets initialized by printk_safe_init().
Solution would be to call printk_safe_init() from
setup_log_buf() before calling set_percpu_data_ready().
Or I would keep printk_safe code as it. I hope that it will get
removed rather soon anyway.
Otherwise, it looks good to me.
Best Regards,
Petr
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] printk: queue wake_up_klogd irq_work only if per-CPU areas are ready
2020-03-03 9:18 ` Petr Mladek
@ 2020-03-03 10:27 ` Sergey Senozhatsky
0 siblings, 0 replies; 3+ messages in thread
From: Sergey Senozhatsky @ 2020-03-03 10:27 UTC (permalink / raw)
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, linux-kernel, Lech Perczak,
Greg Kroah-Hartman, Theodore Ts'o, John Ogness
On (20/03/03 10:18), Petr Mladek wrote:
[..]
> > static void queue_flush_work(struct printk_safe_seq_buf *s)
> > {
> > - if (printk_safe_irq_ready)
> > + if (printk_percpu_data_ready())
> > irq_work_queue(&s->work);
>
> This is not safe. printk_percpu_data_ready() returns true even before
> s->work gets initialized by printk_safe_init().
Good catch! I'll move printk_safe_init() call from init/main.c to
set_percpu_data_ready().
> Solution would be to call printk_safe_init() from
> setup_log_buf() before calling set_percpu_data_ready().
I'll move the init call. But printk_safe/nmi called too-early
will still write to no-yet-initialised per-cpu data.
-ss
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2020-03-03 10:28 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-03 4:40 [PATCH] printk: queue wake_up_klogd irq_work only if per-CPU areas are ready Sergey Senozhatsky
2020-03-03 9:18 ` Petr Mladek
2020-03-03 10:27 ` Sergey Senozhatsky
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).