linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] make printk work again
@ 2016-10-18 17:08 Peter Zijlstra
  2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
                   ` (4 more replies)
  0 siblings, 5 replies; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-18 17:08 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel

This basically fixes printk by evading everything it does.

There's too many problems with printk, from sleeping locks to broken console
drivers. Stop using it.

The early_console drivers are the most robust and simple, use those.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/3] printk: Fix kdb_trap_printk placement
  2016-10-18 17:08 [PATCH 0/3] make printk work again Peter Zijlstra
@ 2016-10-18 17:08 ` Peter Zijlstra
  2016-10-19 14:41   ` Petr Mladek
                     ` (2 more replies)
  2016-10-18 17:08 ` [PATCH 2/3] early_printk: Add force_early_printk kernel parameter Peter Zijlstra
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-18 17:08 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel,
	Jason Wessel

[-- Attachment #1: peterz-printk-kdb.patch --]
[-- Type: text/plain, Size: 1276 bytes --]

Some people figured vprintk_emit() makes for a nice API and exported
it, bypassing the kdb trap.

This still leaves vprintk_nmi() outside of the kbd reach, should that
be fixed too?

Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/printk/printk.c |   19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1750,6 +1750,13 @@ asmlinkage int vprintk_emit(int facility
 	/* cpu currently holding logbuf_lock in this function */
 	static unsigned int logbuf_cpu = UINT_MAX;
 
+#ifdef CONFIG_KGDB_KDB
+	if (unlikely(kdb_trap_printk)) {
+		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
+		return r;
+	}
+#endif
+
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
 		in_sched = true;
@@ -1932,17 +1939,7 @@ EXPORT_SYMBOL(printk_emit);
 
 int vprintk_default(const char *fmt, va_list args)
 {
-	int r;
-
-#ifdef CONFIG_KGDB_KDB
-	if (unlikely(kdb_trap_printk)) {
-		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
-		return r;
-	}
-#endif
-	r = vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
-
-	return r;
+	return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
 }
 EXPORT_SYMBOL_GPL(vprintk_default);
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 2/3] early_printk: Add force_early_printk kernel parameter
  2016-10-18 17:08 [PATCH 0/3] make printk work again Peter Zijlstra
  2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
@ 2016-10-18 17:08 ` Peter Zijlstra
  2016-11-29 14:02   ` Petr Mladek
  2016-10-18 17:08 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-18 17:08 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel

[-- Attachment #1: peterz-force_early_printk.patch --]
[-- Type: text/plain, Size: 2661 bytes --]

Add add the 'force_early_printk' kernel parameter to override printk()
and force it into early_printk(). This bypasses all the cruft and fail
from printk() and makes things work again.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/printk/printk.c |   74 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 49 insertions(+), 25 deletions(-)

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -344,6 +344,42 @@ __packed __aligned(4)
 #endif
 ;
 
+#ifdef CONFIG_EARLY_PRINTK
+struct console *early_console;
+
+static bool __read_mostly force_early_printk;
+
+static int __init force_early_printk_setup(char *str)
+{
+	force_early_printk = true;
+	return 0;
+}
+early_param("force_early_printk", force_early_printk_setup);
+
+static int early_vprintk(const char *fmt, va_list args)
+{
+	char buf[512];
+	int n;
+
+	n = vscnprintf(buf, sizeof(buf), fmt, args);
+	early_console->write(early_console, buf, n);
+
+	return n;
+}
+
+asmlinkage __visible void early_printk(const char *fmt, ...)
+{
+	va_list ap;
+
+	if (!early_console)
+		return;
+
+	va_start(ap, fmt);
+	early_vprintk(fmt, ap);
+	va_end(ap);
+}
+#endif
+
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters.  This can be taken
  * within the scheduler's rq lock. It must be released before calling
@@ -1751,10 +1787,13 @@ asmlinkage int vprintk_emit(int facility
 	static unsigned int logbuf_cpu = UINT_MAX;
 
 #ifdef CONFIG_KGDB_KDB
-	if (unlikely(kdb_trap_printk)) {
-		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
-		return r;
-	}
+	if (unlikely(kdb_trap_printk))
+		return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
+#endif
+
+#ifdef CONFIG_EARLY_PRINTK
+	if (force_early_printk && early_console)
+		return early_vprintk(fmt, args);
 #endif
 
 	if (level == LOGLEVEL_SCHED) {
@@ -1970,7 +2009,12 @@ asmlinkage __visible int printk(const ch
 	int r;
 
 	va_start(args, fmt);
-	r = vprintk_func(fmt, args);
+#ifdef CONFIG_EARLY_PRINTK
+	if (force_early_printk && early_console)
+		r = vprintk_default(fmt, args);
+	else
+#endif
+		r = vprintk_func(fmt, args);
 	va_end(args);
 
 	return r;
@@ -2020,26 +2064,6 @@ DEFINE_PER_CPU(printk_func_t, printk_fun
 
 #endif /* CONFIG_PRINTK */
 
-#ifdef CONFIG_EARLY_PRINTK
-struct console *early_console;
-
-asmlinkage __visible void early_printk(const char *fmt, ...)
-{
-	va_list ap;
-	char buf[512];
-	int n;
-
-	if (!early_console)
-		return;
-
-	va_start(ap, fmt);
-	n = vscnprintf(buf, sizeof(buf), fmt, ap);
-	va_end(ap);
-
-	early_console->write(early_console, buf, n);
-}
-#endif
-
 static int __add_preferred_console(char *name, int idx, char *options,
 				   char *brl_options)
 {

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2016-10-18 17:08 [PATCH 0/3] make printk work again Peter Zijlstra
  2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
  2016-10-18 17:08 ` [PATCH 2/3] early_printk: Add force_early_printk kernel parameter Peter Zijlstra
@ 2016-10-18 17:08 ` Peter Zijlstra
  2016-10-18 17:19   ` Steven Rostedt
  2016-11-29 14:10   ` Petr Mladek
  2016-10-19  7:04 ` [PATCH 0/3] make printk work again Jan Kara
  2016-10-19 11:48 ` Sergey Senozhatsky
  4 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-18 17:08 UTC (permalink / raw)
  To: Sergey Senozhatsky, Petr Mladek, Andrew Morton
  Cc: Jan Kara, Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, Peter Zijlstra, linux-kernel

[-- Attachment #1: peterz-early_printk-serialize.patch --]
[-- Type: text/plain, Size: 978 bytes --]

In order to avoid multiple CPUs banging on the serial port at the same
time, add simple serialization. This explicitly deals with nested
contexts (like IRQs etc.).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/printk/printk.c |   16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -356,14 +356,28 @@ static int __init force_early_printk_set
 }
 early_param("force_early_printk", force_early_printk_setup);
 
+static int early_printk_cpu = -1;
+
 static int early_vprintk(const char *fmt, va_list args)
 {
+	int n, cpu, old;
 	char buf[512];
-	int n;
+
+	cpu = get_cpu();
+	for (;;) {
+		old = cmpxchg(&early_printk_cpu, -1, cpu);
+		if (old == -1 || old == cpu)
+			break;
+
+		cpu_relax();
+	}
 
 	n = vscnprintf(buf, sizeof(buf), fmt, args);
 	early_console->write(early_console, buf, n);
 
+	smp_store_release(&early_printk_cpu, old);
+	put_cpu();
+
 	return n;
 }
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2016-10-18 17:08 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
@ 2016-10-18 17:19   ` Steven Rostedt
  2016-10-18 17:30     ` Peter Zijlstra
  2016-11-29 14:10   ` Petr Mladek
  1 sibling, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2016-10-18 17:19 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Ingo Molnar, linux-kernel

On Tue, 18 Oct 2016 19:08:33 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> In order to avoid multiple CPUs banging on the serial port at the same
> time, add simple serialization. This explicitly deals with nested
> contexts (like IRQs etc.).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---

hehe, I have the exact same patch for my -rt work.

-- Steve

>  kernel/printk/printk.c |   16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -356,14 +356,28 @@ static int __init force_early_printk_set
>  }
>  early_param("force_early_printk", force_early_printk_setup);
>  
> +static int early_printk_cpu = -1;
> +
>  static int early_vprintk(const char *fmt, va_list args)
>  {
> +	int n, cpu, old;
>  	char buf[512];
> -	int n;
> +
> +	cpu = get_cpu();
> +	for (;;) {
> +		old = cmpxchg(&early_printk_cpu, -1, cpu);
> +		if (old == -1 || old == cpu)
> +			break;
> +
> +		cpu_relax();
> +	}
>  
>  	n = vscnprintf(buf, sizeof(buf), fmt, args);
>  	early_console->write(early_console, buf, n);
>  
> +	smp_store_release(&early_printk_cpu, old);
> +	put_cpu();
> +
>  	return n;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2016-10-18 17:19   ` Steven Rostedt
@ 2016-10-18 17:30     ` Peter Zijlstra
  2016-10-18 17:53       ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-18 17:30 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Ingo Molnar, linux-kernel

On Tue, Oct 18, 2016 at 01:19:51PM -0400, Steven Rostedt wrote:
> 
> hehe, I have the exact same patch for my -rt work.

> > +static int early_printk_cpu = -1;
> > +
> >  static int early_vprintk(const char *fmt, va_list args)
> >  {
> > +	int n, cpu, old;
> >  	char buf[512];
> > +
> > +	cpu = get_cpu();
> > +	for (;;) {
> > +		old = cmpxchg(&early_printk_cpu, -1, cpu);
> > +		if (old == -1 || old == cpu)
> > +			break;

Looking at this again, we really should not spin using cmpxchg(), that
thrashes the cacheline.

The below is slightly better spinning... then again, this isn't
performance code.

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -376,10 +376,16 @@ static int early_vprintk(const char *fmt
 
 	cpu = get_cpu();
 	for (;;) {
-		old = cmpxchg(&early_printk_cpu, -1, cpu);
-		if (old == -1 || old == cpu)
+		old = READ_ONCE(early_printk_cpu);
+		if (old == cpu)
 			break;
 
+		if (old == -1) {
+			old = cmpxchg(&early_printk_cpu, -1, cpu);
+			if (old == -1 || old == cpu)
+				break;
+		}
+
 		cpu_relax();
 	}
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2016-10-18 17:30     ` Peter Zijlstra
@ 2016-10-18 17:53       ` Steven Rostedt
  0 siblings, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2016-10-18 17:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Ingo Molnar, linux-kernel

On Tue, 18 Oct 2016 19:30:46 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Oct 18, 2016 at 01:19:51PM -0400, Steven Rostedt wrote:

> Looking at this again, we really should not spin using cmpxchg(), that
> thrashes the cacheline.
> 
> The below is slightly better spinning... then again, this isn't
> performance code.

Yeah, this is similar to the hack I had.

May want to fold this in your previous version.

-- Steve

> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -376,10 +376,16 @@ static int early_vprintk(const char *fmt
>  
>  	cpu = get_cpu();
>  	for (;;) {
> -		old = cmpxchg(&early_printk_cpu, -1, cpu);
> -		if (old == -1 || old == cpu)
> +		old = READ_ONCE(early_printk_cpu);
> +		if (old == cpu)
>  			break;
>  
> +		if (old == -1) {
> +			old = cmpxchg(&early_printk_cpu, -1, cpu);
> +			if (old == -1 || old == cpu)
> +				break;
> +		}
> +
>  		cpu_relax();
>  	}
>  

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] make printk work again
  2016-10-18 17:08 [PATCH 0/3] make printk work again Peter Zijlstra
                   ` (2 preceding siblings ...)
  2016-10-18 17:08 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
@ 2016-10-19  7:04 ` Jan Kara
  2016-10-19  9:24   ` Peter Zijlstra
  2016-10-19 11:48 ` Sergey Senozhatsky
  4 siblings, 1 reply; 29+ messages in thread
From: Jan Kara @ 2016-10-19  7:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, linux-kernel

On Tue 18-10-16 19:08:30, Peter Zijlstra wrote:
> This basically fixes printk by evading everything it does.
> 
> There's too many problems with printk, from sleeping locks to broken console
> drivers. Stop using it.

I agree that printk is fragile and your patches are likely fine for machine
where you do kernel development. However for production servers with
hundreds of SCSI LUNs assigned I don't think it is a viable solution - I'm
pretty sure those machines would take ages to boot (if they ever boot) with
early_printk implementation. So do you intend this as "the ultimate printk
solution" or just a "kernel developers debugging aid"? :)

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] make printk work again
  2016-10-19  7:04 ` [PATCH 0/3] make printk work again Jan Kara
@ 2016-10-19  9:24   ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-19  9:24 UTC (permalink / raw)
  To: Jan Kara
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel

On Wed, Oct 19, 2016 at 09:04:16AM +0200, Jan Kara wrote:
> On Tue 18-10-16 19:08:30, Peter Zijlstra wrote:
> > This basically fixes printk by evading everything it does.
> > 
> > There's too many problems with printk, from sleeping locks to broken console
> > drivers. Stop using it.
> 
> I agree that printk is fragile and your patches are likely fine for machine
> where you do kernel development. However for production servers with
> hundreds of SCSI LUNs assigned I don't think it is a viable solution - I'm
> pretty sure those machines would take ages to boot (if they ever boot) with
> early_printk implementation. So do you intend this as "the ultimate printk
> solution" or just a "kernel developers debugging aid"? :)

Mostly just to scratch my itch. It also completely kills dmesg, which I
can't see being popular :-) On the plus side, these 3 little patches
seems like something mergable (as opposed to my previous approach which
basically just deleted everything printk). The first also fixes a real,
and long standing, issue with kdb_printk.

But I really do not see how printk and the console drivers as exist
today can ever yield something reliable. We've been stacking band-aids
on it for a while now, and I think we're at the point where we should
say stop this madness.

Either do a complete overhaul and redesign of the entire stack, or just
give up and admit its broken crap and leave it rot.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] make printk work again
  2016-10-18 17:08 [PATCH 0/3] make printk work again Peter Zijlstra
                   ` (3 preceding siblings ...)
  2016-10-19  7:04 ` [PATCH 0/3] make printk work again Jan Kara
@ 2016-10-19 11:48 ` Sergey Senozhatsky
  2016-10-19 12:21   ` Peter Zijlstra
  4 siblings, 1 reply; 29+ messages in thread
From: Sergey Senozhatsky @ 2016-10-19 11:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, linux-kernel

On (10/18/16 19:08), Peter Zijlstra wrote:
> This basically fixes printk by evading everything it does.
> 
> There's too many problems with printk, from sleeping locks to broken console
> drivers. Stop using it.
> 
> The early_console drivers are the most robust and simple, use those.

ACK

Can we also please ban that pre Because it's a bit tricky.emption thing?

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 0/3] make printk work again
  2016-10-19 11:48 ` Sergey Senozhatsky
@ 2016-10-19 12:21   ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-19 12:21 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo, Calvin Owens,
	Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	linux-kernel

On Wed, Oct 19, 2016 at 08:48:28PM +0900, Sergey Senozhatsky wrote:
> On (10/18/16 19:08), Peter Zijlstra wrote:
> > This basically fixes printk by evading everything it does.
> > 
> > There's too many problems with printk, from sleeping locks to broken console
> > drivers. Stop using it.
> > 
> > The early_console drivers are the most robust and simple, use those.
> 
> ACK
> 
> Can we also please ban that pre Because it's a bit tricky.emption thing?

Ha, so that's actually quite complicated. The 3rd patch does what is
simple and robust. Doing more gets very tricky real quick.

So actual preemption is disabled per get_cpu()/put_cpu(), and cross-cpu
interleaving is disabled by that 'lock'. But interrupt (and NMI) nesting
are still possible.

It does sporadically happen, but I've not really found it to be a
problem when reading the output.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] printk: Fix kdb_trap_printk placement
  2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
@ 2016-10-19 14:41   ` Petr Mladek
  2016-10-19 15:18     ` Peter Zijlstra
  2016-10-20 13:02   ` Sergey Senozhatsky
  2016-11-29 13:54   ` Petr Mladek
  2 siblings, 1 reply; 29+ messages in thread
From: Petr Mladek @ 2016-10-19 14:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel, Jason Wessel

On Tue 2016-10-18 19:08:31, Peter Zijlstra wrote:
> Some people figured vprintk_emit() makes for a nice API and exported
> it, bypassing the kdb trap.
> 
> This still leaves vprintk_nmi() outside of the kbd reach, should that
> be fixed too?

Good question! vkdb_printf() tries to avoid a deadlock but the code is racy:

int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
{
[...]
	/* Serialize kdb_printf if multiple cpus try to write at once.
	 * But if any cpu goes recursive in kdb, just print the output,
	 * even if it is interleaved with any other text.
	 */
	if (!KDB_STATE(PRINTF_LOCK)) {
		KDB_STATE_SET(PRINTF_LOCK);
		spin_lock_irqsave(&kdb_printf_lock, flags);
		got_printf_lock = 1;
		atomic_inc(&kdb_event);
	} else {
		__acquire(kdb_printf_lock);
	}


Let's have the following situation:

CPU1					CPU2

if (!KDB_STATE(PRINTF_LOCK)) {
	KDB_STATE_SET(PRINTF_LOCK);

					if (!KDB_STATE(PRINTF_LOCK)) {
					} else {
						__acquire(kdb_printf_lock);
					}

Now, both CPUs are in the critical section and happily writing over each
other, e.g. in

	vsnprintf(next_avail, size_avail, fmt, ap);

I quess that we want to fix this race. But I am not sure if it will
be done an NMI-safe way. I am going to send a patch for this.

Well, vkdb_printf() is called later when the messages are pushed
to the main logbuffer by printk_nmi_flush_line(). It is not perfect
but...


> Cc: Jason Wessel <jason.wessel@windriver.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Otherwise, your patch makes sense:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] printk: Fix kdb_trap_printk placement
  2016-10-19 14:41   ` Petr Mladek
@ 2016-10-19 15:18     ` Peter Zijlstra
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2016-10-19 15:18 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel, Jason Wessel

On Wed, Oct 19, 2016 at 04:41:40PM +0200, Petr Mladek wrote:
> On Tue 2016-10-18 19:08:31, Peter Zijlstra wrote:
> > Some people figured vprintk_emit() makes for a nice API and exported
> > it, bypassing the kdb trap.
> > 
> > This still leaves vprintk_nmi() outside of the kbd reach, should that
> > be fixed too?
> 
> Good question! vkdb_printf() tries to avoid a deadlock but the code is racy:
> 
> int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> {
> [...]
> 	/* Serialize kdb_printf if multiple cpus try to write at once.
> 	 * But if any cpu goes recursive in kdb, just print the output,
> 	 * even if it is interleaved with any other text.
> 	 */
> 	if (!KDB_STATE(PRINTF_LOCK)) {
> 		KDB_STATE_SET(PRINTF_LOCK);
> 		spin_lock_irqsave(&kdb_printf_lock, flags);
> 		got_printf_lock = 1;
> 		atomic_inc(&kdb_event);
> 	} else {
> 		__acquire(kdb_printf_lock);
> 	}
> 
> 
> Let's have the following situation:
> 
> CPU1					CPU2
> 
> if (!KDB_STATE(PRINTF_LOCK)) {
> 	KDB_STATE_SET(PRINTF_LOCK);
> 
> 					if (!KDB_STATE(PRINTF_LOCK)) {
> 					} else {
> 						__acquire(kdb_printf_lock);
> 					}
> 
> Now, both CPUs are in the critical section and happily writing over each
> other, e.g. in
> 
> 	vsnprintf(next_avail, size_avail, fmt, ap);
> 
> I quess that we want to fix this race. But I am not sure if it will
> be done an NMI-safe way. I am going to send a patch for this.

Something like patch 3 in this series should do I suppose. But the
vkdb_printf() thing using spin_lock_irqsave() seems to suggest it was
never meant to be used from NMI context.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] printk: Fix kdb_trap_printk placement
  2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
  2016-10-19 14:41   ` Petr Mladek
@ 2016-10-20 13:02   ` Sergey Senozhatsky
  2016-11-29 13:54   ` Petr Mladek
  2 siblings, 0 replies; 29+ messages in thread
From: Sergey Senozhatsky @ 2016-10-20 13:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara,
	Tejun Heo, Calvin Owens, Thomas Gleixner, Mel Gorman,
	Steven Rostedt, Ingo Molnar, linux-kernel, Jason Wessel

On (10/18/16 19:08), Peter Zijlstra wrote:
> 
> Some people figured vprintk_emit() makes for a nice API and exported
> it, bypassing the kdb trap.
> 
> This still leaves vprintk_nmi() outside of the kbd reach, should that
> be fixed too?
> 
> Cc: Jason Wessel <jason.wessel@windriver.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

looks good to me.

Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

	-ss

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 1/3] printk: Fix kdb_trap_printk placement
  2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
  2016-10-19 14:41   ` Petr Mladek
  2016-10-20 13:02   ` Sergey Senozhatsky
@ 2016-11-29 13:54   ` Petr Mladek
  2 siblings, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2016-11-29 13:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel, Jason Wessel

On Tue 2016-10-18 19:08:31, Peter Zijlstra wrote:
> Some people figured vprintk_emit() makes for a nice API and exported
> it, bypassing the kdb trap.

I think that nobody saw this problem because kdb_trap_printk was
used only for a limited number of printk's. It is just a trick
how to use generic functions to print messages in the kdb context,
e.g. for getting backtraces.

But the patch makes sense.


> This still leaves vprintk_nmi() outside of the kbd reach, should that
> be fixed too?

It is perfectly fine. Messages from NMI context are not meant for
kdb anyway.

> Cc: Jason Wessel <jason.wessel@windriver.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/printk/printk.c |   19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1750,6 +1750,13 @@ asmlinkage int vprintk_emit(int facility
>  	/* cpu currently holding logbuf_lock in this function */
>  	static unsigned int logbuf_cpu = UINT_MAX;
>  
> +#ifdef CONFIG_KGDB_KDB
> +	if (unlikely(kdb_trap_printk)) {
> +		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
> +		return r;

r is not defined. It is fixed in the next patch but it breaks
bisectability.

Please, find below an updated patch that also includes
my Reviewed-by.


>From 5adf18de45ba986ea3ae611446828238f4d65fe0 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@infradead.org>
Date: Tue, 18 Oct 2016 19:08:31 +0200
Subject: [PATCH 1/3] printk: Fix kdb_trap_printk placement

Some people figured vprintk_emit() makes for a nice API and exported
it, bypassing the kdb trap.

This still leaves vprintk_nmi() outside of the kbd reach, should that
be fixed too?

Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f7a55e9ff2f7..541ce7705353 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1781,6 +1781,11 @@ asmlinkage int vprintk_emit(int facility, int level,
 	/* cpu currently holding logbuf_lock in this function */
 	static unsigned int logbuf_cpu = UINT_MAX;
 
+#ifdef CONFIG_KGDB_KDB
+	if (unlikely(kdb_trap_printk))
+		return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
+#endif
+
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
 		in_sched = true;
@@ -1923,17 +1928,7 @@ asmlinkage int printk_emit(int facility, int level,
 
 int vprintk_default(const char *fmt, va_list args)
 {
-	int r;
-
-#ifdef CONFIG_KGDB_KDB
-	if (unlikely(kdb_trap_printk)) {
-		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
-		return r;
-	}
-#endif
-	r = vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
-
-	return r;
+	return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
 }
 EXPORT_SYMBOL_GPL(vprintk_default);
 
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 2/3] early_printk: Add force_early_printk kernel parameter
  2016-10-18 17:08 ` [PATCH 2/3] early_printk: Add force_early_printk kernel parameter Peter Zijlstra
@ 2016-11-29 14:02   ` Petr Mladek
  0 siblings, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2016-11-29 14:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel

On Tue 2016-10-18 19:08:32, Peter Zijlstra wrote:
> Add add the 'force_early_printk' kernel parameter to override printk()
> and force it into early_printk(). This bypasses all the cruft and fail
> from printk() and makes things work again.

IMHO, the patch makes perfect sense and helps with debugging
hard problems.

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1751,10 +1787,13 @@ asmlinkage int vprintk_emit(int facility
>  	static unsigned int logbuf_cpu = UINT_MAX;
>  
>  #ifdef CONFIG_KGDB_KDB
> -	if (unlikely(kdb_trap_printk)) {
> -		r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
> -		return r;
> -	}
> +	if (unlikely(kdb_trap_printk))
> +		return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
> +#endif

Please, find below an updated patch that has also my Reviewed-by.

In particular, the above change was moved to the first patch
to fix bisectability. The result after applying both patches
is still exactly the same.


>From e1c00ae67d07767ec8e5bddb1113c2badf31f4bd Mon Sep 17 00:00:00 2001
From: Petr Mladek <pmladek@suse.com>
Date: Tue, 29 Nov 2016 13:32:56 +0100
Subject: [PATCH 2/3] early_printk: Add force_early_printk kernel parameter

Add add the 'force_early_printk' kernel parameter to override printk()
and force it into early_printk(). This bypasses all the cruft and fail
from printk() and makes things work again.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/printk.c | 68 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 47 insertions(+), 21 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 541ce7705353..bb612e5c2e00 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -344,6 +344,42 @@ __packed __aligned(4)
 #endif
 ;
 
+#ifdef CONFIG_EARLY_PRINTK
+struct console *early_console;
+
+static bool __read_mostly force_early_printk;
+
+static int __init force_early_printk_setup(char *str)
+{
+	force_early_printk = true;
+	return 0;
+}
+early_param("force_early_printk", force_early_printk_setup);
+
+static int early_vprintk(const char *fmt, va_list args)
+{
+	char buf[512];
+	int n;
+
+	n = vscnprintf(buf, sizeof(buf), fmt, args);
+	early_console->write(early_console, buf, n);
+
+	return n;
+}
+
+asmlinkage __visible void early_printk(const char *fmt, ...)
+{
+	va_list ap;
+
+	if (!early_console)
+		return;
+
+	va_start(ap, fmt);
+	early_vprintk(fmt, ap);
+	va_end(ap);
+}
+#endif
+
 /*
  * The logbuf_lock protects kmsg buffer, indices, counters.  This can be taken
  * within the scheduler's rq lock. It must be released before calling
@@ -1786,6 +1822,11 @@ asmlinkage int vprintk_emit(int facility, int level,
 		return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
 #endif
 
+#ifdef CONFIG_EARLY_PRINTK
+	if (force_early_printk && early_console)
+		return early_vprintk(fmt, args);
+#endif
+
 	if (level == LOGLEVEL_SCHED) {
 		level = LOGLEVEL_DEFAULT;
 		in_sched = true;
@@ -1959,7 +2000,12 @@ asmlinkage __visible int printk(const char *fmt, ...)
 	int r;
 
 	va_start(args, fmt);
-	r = vprintk_func(fmt, args);
+#ifdef CONFIG_EARLY_PRINTK
+	if (force_early_printk && early_console)
+		r = vprintk_default(fmt, args);
+	else
+#endif
+		r = vprintk_func(fmt, args);
 	va_end(args);
 
 	return r;
@@ -2009,26 +2055,6 @@ static size_t msg_print_text(const struct printk_log *msg, enum log_flags prev,
 
 #endif /* CONFIG_PRINTK */
 
-#ifdef CONFIG_EARLY_PRINTK
-struct console *early_console;
-
-asmlinkage __visible void early_printk(const char *fmt, ...)
-{
-	va_list ap;
-	char buf[512];
-	int n;
-
-	if (!early_console)
-		return;
-
-	va_start(ap, fmt);
-	n = vscnprintf(buf, sizeof(buf), fmt, ap);
-	va_end(ap);
-
-	early_console->write(early_console, buf, n);
-}
-#endif
-
 static int __add_preferred_console(char *name, int idx, char *options,
 				   char *brl_options)
 {
-- 
1.8.5.6

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2016-10-18 17:08 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
  2016-10-18 17:19   ` Steven Rostedt
@ 2016-11-29 14:10   ` Petr Mladek
  1 sibling, 0 replies; 29+ messages in thread
From: Petr Mladek @ 2016-11-29 14:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Sergey Senozhatsky, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt,
	Ingo Molnar, linux-kernel

On Tue 2016-10-18 19:08:33, Peter Zijlstra wrote:
> In order to avoid multiple CPUs banging on the serial port at the same
> time, add simple serialization. This explicitly deals with nested
> contexts (like IRQs etc.).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Makes sense. Just a small comment below.

Reviewd-by: Petr Mladek <pmladek@suse.com>

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -356,14 +356,28 @@ static int __init force_early_printk_set
>  }
>  early_param("force_early_printk", force_early_printk_setup);
>  
> +static int early_printk_cpu = -1;

[...]

> +	for (;;) {
> +		old = cmpxchg(&early_printk_cpu, -1, cpu);
> +		if (old == -1 || old == cpu)
> +			break;
> +
> +		cpu_relax();
> +	}
>  
>  	n = vscnprintf(buf, sizeof(buf), fmt, args);
>  	early_console->write(early_console, buf, n);
>  
> +	smp_store_release(&early_printk_cpu, old);

checkpatch.pl complains about using a barrier without a comment.
The code is simple but it still might help to add something like:

	/* Releasing early_printk_cpu custom lock. */

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 15:24           ` Peter Zijlstra
@ 2017-10-04 15:38             ` Paul E. McKenney
  0 siblings, 0 replies; 29+ messages in thread
From: Paul E. McKenney @ 2017-10-04 15:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, Oct 04, 2017 at 05:24:23PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 04, 2017 at 07:17:45AM -0700, Paul E. McKenney wrote:
> > If you use READ_ONCE(), then all architectures I know of enforce
> > full ordering for accesses to a single variable.  (If you don't use
> > READ_ONCE(), then in theory Itanium can reorder reads.)  Me, I would
> > argue for WRITE_ONCE() as well to prevent store tearing.
> 
> Note that the stores are either cmpxchg() or smp_store_release() both of
> which imply a WRITE_ONCE().

That works for me!  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 14:17         ` Paul E. McKenney
  2017-10-04 14:43           ` Steven Rostedt
@ 2017-10-04 15:24           ` Peter Zijlstra
  2017-10-04 15:38             ` Paul E. McKenney
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2017-10-04 15:24 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Steven Rostedt, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, Oct 04, 2017 at 07:17:45AM -0700, Paul E. McKenney wrote:
> If you use READ_ONCE(), then all architectures I know of enforce
> full ordering for accesses to a single variable.  (If you don't use
> READ_ONCE(), then in theory Itanium can reorder reads.)  Me, I would
> argue for WRITE_ONCE() as well to prevent store tearing.

Note that the stores are either cmpxchg() or smp_store_release() both of
which imply a WRITE_ONCE().

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 14:52             ` Peter Zijlstra
  2017-10-04 15:02               ` Steven Rostedt
@ 2017-10-04 15:14               ` Paul E. McKenney
  1 sibling, 0 replies; 29+ messages in thread
From: Paul E. McKenney @ 2017-10-04 15:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, Oct 04, 2017 at 04:52:47PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 04, 2017 at 10:43:54AM -0400, Steven Rostedt wrote:
> > 
> > My question is not about ordering, but about coherency. Can you have
> > one CPU read a variable that goes into cache, and keep using the cached
> > variable every time the program asks to read it, instead of going out
> > to memory.
> 
> No, not on a coherent system.

What Peter said.  And if you use READ_ONCE() for the reads, than as
far as I know, all the systems that the Linux kernel supports are
coherent systems.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 14:52             ` Peter Zijlstra
@ 2017-10-04 15:02               ` Steven Rostedt
  2017-10-04 15:14               ` Paul E. McKenney
  1 sibling, 0 replies; 29+ messages in thread
From: Steven Rostedt @ 2017-10-04 15:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Paul E. McKenney, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, 4 Oct 2017 16:52:47 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, Oct 04, 2017 at 10:43:54AM -0400, Steven Rostedt wrote:
> > 
> > My question is not about ordering, but about coherency. Can you have
> > one CPU read a variable that goes into cache, and keep using the cached
> > variable every time the program asks to read it, instead of going out
> > to memory.  
> 
> No, not on a coherent system.

In this case.

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 14:43           ` Steven Rostedt
@ 2017-10-04 14:52             ` Peter Zijlstra
  2017-10-04 15:02               ` Steven Rostedt
  2017-10-04 15:14               ` Paul E. McKenney
  0 siblings, 2 replies; 29+ messages in thread
From: Peter Zijlstra @ 2017-10-04 14:52 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul E. McKenney, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, Oct 04, 2017 at 10:43:54AM -0400, Steven Rostedt wrote:
> 
> My question is not about ordering, but about coherency. Can you have
> one CPU read a variable that goes into cache, and keep using the cached
> variable every time the program asks to read it, instead of going out
> to memory.

No, not on a coherent system.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 14:17         ` Paul E. McKenney
@ 2017-10-04 14:43           ` Steven Rostedt
  2017-10-04 14:52             ` Peter Zijlstra
  2017-10-04 15:24           ` Peter Zijlstra
  1 sibling, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2017-10-04 14:43 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, 4 Oct 2017 07:17:45 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> > I'm more worried about other architectures that don't have as strong of
> > a cache coherency.
> > 
> > [ Added Paul as he knows a lot about odd architectures ]
> > 
> > Is there any architecture that we support that can have the following:
> > 
> > 	CPU0			CPU1
> > 	----			----
> > 			    early_printk_cpu = 1
> >  for (;;)
> >    old = READ_ONCE(early_printk_cpu);
> >    [ old = 1 ]
> > 
> > 			    early_printk_cpu = -1
> > 
> >    [...]
> >    cpu_relax();
> >    old = READ_ONCE(early_printk_cpu);
> > 
> >    [ but the CPU uses the cache and not the memory? ]
> > 
> >    old = 1;  
> 
> If you use READ_ONCE(), then all architectures I know of enforce
> full ordering for accesses to a single variable.  (If you don't use
> READ_ONCE(), then in theory Itanium can reorder reads.)  Me, I would
> argue for WRITE_ONCE() as well to prevent store tearing.
> 
> It is only when you have at least two variables and at least two threads
> than things start getting really "interesting".  ;-)
> 

My question is not about ordering, but about coherency. Can you have
one CPU read a variable that goes into cache, and keep using the cached
variable every time the program asks to read it, instead of going out
to memory.

Also, on the other CPU, if a variable is written, and the cache is not
write-through, could that variable be sitting in cache and not go out
to memory until a flush happens?

Do we support architectures that don't have strong coherency to know
that one CPU is asking for a memory location for something that was
changed in the cache of another CPU. Or a CPU will return old cache data
even though the memory was updated?

I guess my concern is that READ_ONCE() and cpu_relax(), don't actually
do a memory barrier. They are mostly compiler barriers. Do we support
poorly coherent architectures that require some kind of flush to make
sure the communication exists between the two CPUs?

Note, at TimeSys, I had to port Linux to a poorly coherent SMP board
that would trip over this all the time. I don't know if that board is
sufficient to run Linux, as we had to slap in memory barriers all over
the place. But this was a 2.4 kernel at the time.

-- Steve

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 13:04       ` Steven Rostedt
  2017-10-04 13:08         ` Peter Zijlstra
@ 2017-10-04 14:17         ` Paul E. McKenney
  2017-10-04 14:43           ` Steven Rostedt
  2017-10-04 15:24           ` Peter Zijlstra
  1 sibling, 2 replies; 29+ messages in thread
From: Paul E. McKenney @ 2017-10-04 14:17 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Peter Zijlstra, pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Wed, Oct 04, 2017 at 09:04:01AM -0400, Steven Rostedt wrote:
> On Wed, 4 Oct 2017 11:08:30 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Tue, Oct 03, 2017 at 06:24:22PM -0400, Steven Rostedt wrote:
> > > On Thu, 28 Sep 2017 14:18:26 +0200
> > > Peter Zijlstra <peterz@infradead.org> wrote:  
> > > >  static int early_vprintk(const char *fmt, va_list args)
> > > >  {
> > > > +	int n, cpu, old;
> > > >  	char buf[512];
> > > > +
> > > > +	cpu = get_cpu();
> > > > +	/*
> > > > +	 * Test-and-Set inter-cpu spinlock with recursion.
> > > > +	 */
> > > > +	for (;;) {
> > > > +		/*
> > > > +		 * c-cas to avoid the exclusive bouncing on spin.
> > > > +		 * Depends on the memory barrier implied by cmpxchg
> > > > +		 * for ACQUIRE semantics.
> > > > +		 */
> > > > +		old = READ_ONCE(early_printk_cpu);
> > > > +		if (old == -1) {  
> > > 
> > > If old != -1 and old != cpu, is it possible that the CPU could have
> > > fetched an old value, and never try to fetch it again?  
> > 
> > What? If old != -1 and old != cpu, we'll hit the cpu_relax() and do the
> > READ_ONCE() again. The READ_ONCE() guarantees we'll do the load again,
> > as does the barrier() implied by cpu_relax().
> 
> I'm more worried about other architectures that don't have as strong of
> a cache coherency.
> 
> [ Added Paul as he knows a lot about odd architectures ]
> 
> Is there any architecture that we support that can have the following:
> 
> 	CPU0			CPU1
> 	----			----
> 			    early_printk_cpu = 1
>  for (;;)
>    old = READ_ONCE(early_printk_cpu);
>    [ old = 1 ]
> 
> 			    early_printk_cpu = -1
> 
>    [...]
>    cpu_relax();
>    old = READ_ONCE(early_printk_cpu);
> 
>    [ but the CPU uses the cache and not the memory? ]
> 
>    old = 1;

If you use READ_ONCE(), then all architectures I know of enforce
full ordering for accesses to a single variable.  (If you don't use
READ_ONCE(), then in theory Itanium can reorder reads.)  Me, I would
argue for WRITE_ONCE() as well to prevent store tearing.

It is only when you have at least two variables and at least two threads
than things start getting really "interesting".  ;-)

							Thanx, Paul

> -- Steve
> 
> 
> > 
> > > The cmpxchg memory barrier only happens when old == -1.  
> > 
> > Yeah, so?
> > 
> > > > +			old = cmpxchg(&early_printk_cpu, -1, cpu);
> > > > +			if (old == -1)
> > > > +				break;
> > > > +		}
> > > > +		/*
> > > > +		 * Allow recursion for interrupts and the like.
> > > > +		 */
> > > > +		if (old == cpu)
> > > > +			break;
> > > > +
> > > > +		cpu_relax();
> > > > +	}
> > > >  
> > > >  	n = vscnprintf(buf, sizeof(buf), fmt, args);
> > > >  	early_console->write(early_console, buf, n);
> > > >  
> > > > +	/*
> > > > +	 * Unlock -- in case @old == @cpu, this is a no-op.
> > > > +	 */
> > > > +	smp_store_release(&early_printk_cpu, old);
> > > > +	put_cpu();
> > > > +
> > > >  	return n;
> > > >  }  
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04 13:04       ` Steven Rostedt
@ 2017-10-04 13:08         ` Peter Zijlstra
  2017-10-04 14:17         ` Paul E. McKenney
  1 sibling, 0 replies; 29+ messages in thread
From: Peter Zijlstra @ 2017-10-04 13:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx, Paul E. McKenney

On Wed, Oct 04, 2017 at 09:04:01AM -0400, Steven Rostedt wrote:

> > > If old != -1 and old != cpu, is it possible that the CPU could have
> > > fetched an old value, and never try to fetch it again?  
> > 
> > What? If old != -1 and old != cpu, we'll hit the cpu_relax() and do the
> > READ_ONCE() again. The READ_ONCE() guarantees we'll do the load again,
> > as does the barrier() implied by cpu_relax().
> 
> I'm more worried about other architectures that don't have as strong of
> a cache coherency.

Linux mandates cache-coherency, there's no weak or strong there. Memory
ordering can be weak or strong, but coherency not.

If this patch is broken, lots of code would be broken.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-04  9:08     ` Peter Zijlstra
@ 2017-10-04 13:04       ` Steven Rostedt
  2017-10-04 13:08         ` Peter Zijlstra
  2017-10-04 14:17         ` Paul E. McKenney
  0 siblings, 2 replies; 29+ messages in thread
From: Steven Rostedt @ 2017-10-04 13:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx, Paul E. McKenney

On Wed, 4 Oct 2017 11:08:30 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Oct 03, 2017 at 06:24:22PM -0400, Steven Rostedt wrote:
> > On Thu, 28 Sep 2017 14:18:26 +0200
> > Peter Zijlstra <peterz@infradead.org> wrote:  
> > >  static int early_vprintk(const char *fmt, va_list args)
> > >  {
> > > +	int n, cpu, old;
> > >  	char buf[512];
> > > +
> > > +	cpu = get_cpu();
> > > +	/*
> > > +	 * Test-and-Set inter-cpu spinlock with recursion.
> > > +	 */
> > > +	for (;;) {
> > > +		/*
> > > +		 * c-cas to avoid the exclusive bouncing on spin.
> > > +		 * Depends on the memory barrier implied by cmpxchg
> > > +		 * for ACQUIRE semantics.
> > > +		 */
> > > +		old = READ_ONCE(early_printk_cpu);
> > > +		if (old == -1) {  
> > 
> > If old != -1 and old != cpu, is it possible that the CPU could have
> > fetched an old value, and never try to fetch it again?  
> 
> What? If old != -1 and old != cpu, we'll hit the cpu_relax() and do the
> READ_ONCE() again. The READ_ONCE() guarantees we'll do the load again,
> as does the barrier() implied by cpu_relax().

I'm more worried about other architectures that don't have as strong of
a cache coherency.

[ Added Paul as he knows a lot about odd architectures ]

Is there any architecture that we support that can have the following:

	CPU0			CPU1
	----			----
			    early_printk_cpu = 1
 for (;;)
   old = READ_ONCE(early_printk_cpu);
   [ old = 1 ]

			    early_printk_cpu = -1

   [...]
   cpu_relax();
   old = READ_ONCE(early_printk_cpu);

   [ but the CPU uses the cache and not the memory? ]

   old = 1;

-- Steve


> 
> > The cmpxchg memory barrier only happens when old == -1.  
> 
> Yeah, so?
> 
> > > +			old = cmpxchg(&early_printk_cpu, -1, cpu);
> > > +			if (old == -1)
> > > +				break;
> > > +		}
> > > +		/*
> > > +		 * Allow recursion for interrupts and the like.
> > > +		 */
> > > +		if (old == cpu)
> > > +			break;
> > > +
> > > +		cpu_relax();
> > > +	}
> > >  
> > >  	n = vscnprintf(buf, sizeof(buf), fmt, args);
> > >  	early_console->write(early_console, buf, n);
> > >  
> > > +	/*
> > > +	 * Unlock -- in case @old == @cpu, this is a no-op.
> > > +	 */
> > > +	smp_store_release(&early_printk_cpu, old);
> > > +	put_cpu();
> > > +
> > >  	return n;
> > >  }  

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-10-03 22:24   ` Steven Rostedt
@ 2017-10-04  9:08     ` Peter Zijlstra
  2017-10-04 13:04       ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2017-10-04  9:08 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Tue, Oct 03, 2017 at 06:24:22PM -0400, Steven Rostedt wrote:
> On Thu, 28 Sep 2017 14:18:26 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> >  static int early_vprintk(const char *fmt, va_list args)
> >  {
> > +	int n, cpu, old;
> >  	char buf[512];
> > +
> > +	cpu = get_cpu();
> > +	/*
> > +	 * Test-and-Set inter-cpu spinlock with recursion.
> > +	 */
> > +	for (;;) {
> > +		/*
> > +		 * c-cas to avoid the exclusive bouncing on spin.
> > +		 * Depends on the memory barrier implied by cmpxchg
> > +		 * for ACQUIRE semantics.
> > +		 */
> > +		old = READ_ONCE(early_printk_cpu);
> > +		if (old == -1) {
> 
> If old != -1 and old != cpu, is it possible that the CPU could have
> fetched an old value, and never try to fetch it again?

What? If old != -1 and old != cpu, we'll hit the cpu_relax() and do the
READ_ONCE() again. The READ_ONCE() guarantees we'll do the load again,
as does the barrier() implied by cpu_relax().

> The cmpxchg memory barrier only happens when old == -1.

Yeah, so?

> > +			old = cmpxchg(&early_printk_cpu, -1, cpu);
> > +			if (old == -1)
> > +				break;
> > +		}
> > +		/*
> > +		 * Allow recursion for interrupts and the like.
> > +		 */
> > +		if (old == cpu)
> > +			break;
> > +
> > +		cpu_relax();
> > +	}
> >  
> >  	n = vscnprintf(buf, sizeof(buf), fmt, args);
> >  	early_console->write(early_console, buf, n);
> >  
> > +	/*
> > +	 * Unlock -- in case @old == @cpu, this is a no-op.
> > +	 */
> > +	smp_store_release(&early_printk_cpu, old);
> > +	put_cpu();
> > +
> >  	return n;
> >  }

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-09-28 12:18 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
@ 2017-10-03 22:24   ` Steven Rostedt
  2017-10-04  9:08     ` Peter Zijlstra
  0 siblings, 1 reply; 29+ messages in thread
From: Steven Rostedt @ 2017-10-03 22:24 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: pmladek, sergey.senozhatsky, linux-kernel, mingo, tglx

On Thu, 28 Sep 2017 14:18:26 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> In order to avoid multiple CPUs banging on the serial port at the same
> time, add simple serialization. This explicitly deals with nested
> contexts (like IRQs etc.).
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/printk/printk.c |   35 ++++++++++++++++++++++++++++++++++-
>  1 file changed, 34 insertions(+), 1 deletion(-)
> 
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -378,14 +378,47 @@ static int __init force_early_printk_set
>  }
>  early_param("force_early_printk", force_early_printk_setup);
>  
> +static int early_printk_cpu = -1;
> +
>  static int early_vprintk(const char *fmt, va_list args)
>  {
> +	int n, cpu, old;
>  	char buf[512];
> -	int n;
> +
> +	cpu = get_cpu();
> +	/*
> +	 * Test-and-Set inter-cpu spinlock with recursion.
> +	 */
> +	for (;;) {
> +		/*
> +		 * c-cas to avoid the exclusive bouncing on spin.
> +		 * Depends on the memory barrier implied by cmpxchg
> +		 * for ACQUIRE semantics.
> +		 */
> +		old = READ_ONCE(early_printk_cpu);
> +		if (old == -1) {

If old != -1 and old != cpu, is it possible that the CPU could have
fetched an old value, and never try to fetch it again?

The cmpxchg memory barrier only happens when old == -1.

-- Steve

> +			old = cmpxchg(&early_printk_cpu, -1, cpu);
> +			if (old == -1)
> +				break;
> +		}
> +		/*
> +		 * Allow recursion for interrupts and the like.
> +		 */
> +		if (old == cpu)
> +			break;
> +
> +		cpu_relax();
> +	}
>  
>  	n = vscnprintf(buf, sizeof(buf), fmt, args);
>  	early_console->write(early_console, buf, n);
>  
> +	/*
> +	 * Unlock -- in case @old == @cpu, this is a no-op.
> +	 */
> +	smp_store_release(&early_printk_cpu, old);
> +	put_cpu();
> +
>  	return n;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 3/3] early_printk: Add simple serialization to early_vprintk()
  2017-09-28 12:18 [PATCH 0/3] printk: Add force_early_printk boot param Peter Zijlstra
@ 2017-09-28 12:18 ` Peter Zijlstra
  2017-10-03 22:24   ` Steven Rostedt
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2017-09-28 12:18 UTC (permalink / raw)
  To: pmladek, sergey.senozhatsky; +Cc: linux-kernel, rostedt, mingo, tglx, peterz

[-- Attachment #1: peterz-early_printk-serialize.patch --]
[-- Type: text/plain, Size: 1422 bytes --]

In order to avoid multiple CPUs banging on the serial port at the same
time, add simple serialization. This explicitly deals with nested
contexts (like IRQs etc.).

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/printk/printk.c |   35 ++++++++++++++++++++++++++++++++++-
 1 file changed, 34 insertions(+), 1 deletion(-)

--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -378,14 +378,47 @@ static int __init force_early_printk_set
 }
 early_param("force_early_printk", force_early_printk_setup);
 
+static int early_printk_cpu = -1;
+
 static int early_vprintk(const char *fmt, va_list args)
 {
+	int n, cpu, old;
 	char buf[512];
-	int n;
+
+	cpu = get_cpu();
+	/*
+	 * Test-and-Set inter-cpu spinlock with recursion.
+	 */
+	for (;;) {
+		/*
+		 * c-cas to avoid the exclusive bouncing on spin.
+		 * Depends on the memory barrier implied by cmpxchg
+		 * for ACQUIRE semantics.
+		 */
+		old = READ_ONCE(early_printk_cpu);
+		if (old == -1) {
+			old = cmpxchg(&early_printk_cpu, -1, cpu);
+			if (old == -1)
+				break;
+		}
+		/*
+		 * Allow recursion for interrupts and the like.
+		 */
+		if (old == cpu)
+			break;
+
+		cpu_relax();
+	}
 
 	n = vscnprintf(buf, sizeof(buf), fmt, args);
 	early_console->write(early_console, buf, n);
 
+	/*
+	 * Unlock -- in case @old == @cpu, this is a no-op.
+	 */
+	smp_store_release(&early_printk_cpu, old);
+	put_cpu();
+
 	return n;
 }
 

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2017-10-04 15:38 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-18 17:08 [PATCH 0/3] make printk work again Peter Zijlstra
2016-10-18 17:08 ` [PATCH 1/3] printk: Fix kdb_trap_printk placement Peter Zijlstra
2016-10-19 14:41   ` Petr Mladek
2016-10-19 15:18     ` Peter Zijlstra
2016-10-20 13:02   ` Sergey Senozhatsky
2016-11-29 13:54   ` Petr Mladek
2016-10-18 17:08 ` [PATCH 2/3] early_printk: Add force_early_printk kernel parameter Peter Zijlstra
2016-11-29 14:02   ` Petr Mladek
2016-10-18 17:08 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
2016-10-18 17:19   ` Steven Rostedt
2016-10-18 17:30     ` Peter Zijlstra
2016-10-18 17:53       ` Steven Rostedt
2016-11-29 14:10   ` Petr Mladek
2016-10-19  7:04 ` [PATCH 0/3] make printk work again Jan Kara
2016-10-19  9:24   ` Peter Zijlstra
2016-10-19 11:48 ` Sergey Senozhatsky
2016-10-19 12:21   ` Peter Zijlstra
2017-09-28 12:18 [PATCH 0/3] printk: Add force_early_printk boot param Peter Zijlstra
2017-09-28 12:18 ` [PATCH 3/3] early_printk: Add simple serialization to early_vprintk() Peter Zijlstra
2017-10-03 22:24   ` Steven Rostedt
2017-10-04  9:08     ` Peter Zijlstra
2017-10-04 13:04       ` Steven Rostedt
2017-10-04 13:08         ` Peter Zijlstra
2017-10-04 14:17         ` Paul E. McKenney
2017-10-04 14:43           ` Steven Rostedt
2017-10-04 14:52             ` Peter Zijlstra
2017-10-04 15:02               ` Steven Rostedt
2017-10-04 15:14               ` Paul E. McKenney
2017-10-04 15:24           ` Peter Zijlstra
2017-10-04 15:38             ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).