All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ksummit-discuss] [TECH TOPIC] asynchronous printk
@ 2016-07-19  3:47 Sergey Senozhatsky
  2016-07-19  3:56 ` Viresh Kumar
                   ` (3 more replies)
  0 siblings, 4 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-19  3:47 UTC (permalink / raw)
  To: ksummit-discuss
  Cc: Petr Mladek, Tetsuo Handa, Viresh Kumar, Tejun Heo, Jiri Kosina

Hello,

Wondering if anyone will be interested in printk-related topics
(or we can handle it in the mailing list).

What I have on my list is:


- synchronous printk()

printk() prints messages from kernel printk buffer until the buffer
is empty. When serial console is attached, printing is slow and thus
other CPUs in the system have plenty of time to append new messages to
the buffer while one CPU is printing. Thus the CPU can spend unbounded
amount of time doing printing in console_unlock().  This is especially
serious problem if the printk() calling console_unlock() was called with
interrupts disabled, or from IRQ, or from spin_lock protected section
(if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
function to call in some cases, it can cause different types of lockups
(soft, hard, spinlock), stalls and so on.

we have some progress on this side. printk() can offload printing from
sensitive and unsafe contexts to a schedulable printk_kthread context (a
special purpose printing kthread).
but "The whole idea remains worrisome", per Andrew :)


- synchronous console_unlock()

there are many places that need to lock and unlock console semaphore
for purposes other than flushing the log_buf. apart from 'in-kernel'
users (e.g. video, etc.) some of those console_lock/console_unlock
are done by user-space apps in system calls. examples:

= open /dev/ttyX

  console_unlock()
  ...
  console_lock()
  tty_open()
  console_device()
  chrdev_open()
  SyS_open()

= cat /proc/consoles

   console_unlock()
   c_stop()
  ...
   console_lock()
   c_start()
  seq_read()
  proc_reg_read()
  vfs_read()
  SyS_read()

= and so on.


so user-space apps can print kernel messages. which is not really nice,
I think. I have a patch that makes console_unlock() async as well; but
we still have console_lock() that is sync.

thus


- console semaphore

not every console_lock() caller has an intention to modify console driver
state, or console drivers list. so we probably can start distinguish READ
console drivers list access and WRITE access.


- KERN_CONT handling

the comment for KERN_CONT in include/linux/kern_levels.h suggests that

 /*
  * Annotation for a "continued" line of log printout (only done after a
  * line that had no enclosing \n). Only to be used by core/arch code
  * during early bootup (a continued line is not SMP-safe otherwise).
  */
 #define KERN_CONT→      ""

the thing is that people want cont lines in SMP-mode, and people do
use KERN_CONT in SMP mode. e.g. cgroups, ACPI, etc.

internally printk maintains a single cont buffer and, thus, there are
numerous if-s and else-s to cope with the fact that it can be used in
SMP-mode. so the question is -- is it the right time to make KERN_CONT
SMP-safe?

may be we can move cont buffers into per-CPU and add a new API for
cont printing -- pr_cont_begin()/pr_cont_end(). so the usage will be
something like this:

+	pr_cont_begin();  /* preempt disable */

	for (...)
		pr_cont("foo .... " .....);

+	pr_cont_end();    /* preempt enable */


Petr Mladek has another idea:
http://marc.info/?l=linux-kernel&m=146860197621876




The list of potential attendees
(not in any particular order, most likely incomplete)

Jan Kara
Petr Mladek
Andrew Morton
Jiri Kosina
Tejun Heo
Hannes Reinecke
Viresh Kumar
Steven Rostedt
Sergey Senozhatsky

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  3:47 [Ksummit-discuss] [TECH TOPIC] asynchronous printk Sergey Senozhatsky
@ 2016-07-19  3:56 ` Viresh Kumar
  2016-07-19  6:17 ` Hannes Reinecke
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 60+ messages in thread
From: Viresh Kumar @ 2016-07-19  3:56 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: ksummit-discuss, Tetsuo Handa, gregkh, Tejun Heo, Jiri Kosina

+Greg.

On 19-07-16, 12:47, Sergey Senozhatsky wrote:
> Hello,
> 
> Wondering if anyone will be interested in printk-related topics
> (or we can handle it in the mailing list).
> 
> What I have on my list is:
> 
> 
> - synchronous printk()
> 
> printk() prints messages from kernel printk buffer until the buffer
> is empty. When serial console is attached, printing is slow and thus
> other CPUs in the system have plenty of time to append new messages to
> the buffer while one CPU is printing. Thus the CPU can spend unbounded
> amount of time doing printing in console_unlock().  This is especially
> serious problem if the printk() calling console_unlock() was called with
> interrupts disabled, or from IRQ, or from spin_lock protected section
> (if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
> function to call in some cases, it can cause different types of lockups
> (soft, hard, spinlock), stalls and so on.
> 
> we have some progress on this side. printk() can offload printing from
> sensitive and unsafe contexts to a schedulable printk_kthread context (a
> special purpose printing kthread).
> but "The whole idea remains worrisome", per Andrew :)
> 
> 
> - synchronous console_unlock()
> 
> there are many places that need to lock and unlock console semaphore
> for purposes other than flushing the log_buf. apart from 'in-kernel'
> users (e.g. video, etc.) some of those console_lock/console_unlock
> are done by user-space apps in system calls. examples:
> 
> = open /dev/ttyX
> 
>   console_unlock()
>   ...
>   console_lock()
>   tty_open()
>   console_device()
>   chrdev_open()
>   SyS_open()
> 
> = cat /proc/consoles
> 
>    console_unlock()
>    c_stop()
>   ...
>    console_lock()
>    c_start()
>   seq_read()
>   proc_reg_read()
>   vfs_read()
>   SyS_read()
> 
> = and so on.
> 
> 
> so user-space apps can print kernel messages. which is not really nice,
> I think. I have a patch that makes console_unlock() async as well; but
> we still have console_lock() that is sync.
> 
> thus
> 
> 
> - console semaphore
> 
> not every console_lock() caller has an intention to modify console driver
> state, or console drivers list. so we probably can start distinguish READ
> console drivers list access and WRITE access.
> 
> 
> - KERN_CONT handling
> 
> the comment for KERN_CONT in include/linux/kern_levels.h suggests that
> 
>  /*
>   * Annotation for a "continued" line of log printout (only done after a
>   * line that had no enclosing \n). Only to be used by core/arch code
>   * during early bootup (a continued line is not SMP-safe otherwise).
>   */
>  #define KERN_CONT→      ""
> 
> the thing is that people want cont lines in SMP-mode, and people do
> use KERN_CONT in SMP mode. e.g. cgroups, ACPI, etc.
> 
> internally printk maintains a single cont buffer and, thus, there are
> numerous if-s and else-s to cope with the fact that it can be used in
> SMP-mode. so the question is -- is it the right time to make KERN_CONT
> SMP-safe?
> 
> may be we can move cont buffers into per-CPU and add a new API for
> cont printing -- pr_cont_begin()/pr_cont_end(). so the usage will be
> something like this:
> 
> +	pr_cont_begin();  /* preempt disable */
> 
> 	for (...)
> 		pr_cont("foo .... " .....);
> 
> +	pr_cont_end();    /* preempt enable */
> 
> 
> Petr Mladek has another idea:
> http://marc.info/?l=linux-kernel&m=146860197621876
> 
> 
> 
> 
> The list of potential attendees
> (not in any particular order, most likely incomplete)
> 
> Jan Kara
> Petr Mladek
> Andrew Morton
> Jiri Kosina
> Tejun Heo
> Hannes Reinecke
> Viresh Kumar
> Steven Rostedt
> Sergey Senozhatsky
> Greg KH
> 
> 	-ss

-- 
viresh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  3:47 [Ksummit-discuss] [TECH TOPIC] asynchronous printk Sergey Senozhatsky
  2016-07-19  3:56 ` Viresh Kumar
@ 2016-07-19  6:17 ` Hannes Reinecke
  2016-07-19  6:49   ` Josh Triplett
                     ` (3 more replies)
  2016-07-19 14:45 ` James Bottomley
  2016-09-21  4:41 ` Sergey Senozhatsky
  3 siblings, 4 replies; 60+ messages in thread
From: Hannes Reinecke @ 2016-07-19  6:17 UTC (permalink / raw)
  To: ksummit-discuss

On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
> Hello,
> 
> Wondering if anyone will be interested in printk-related topics
> (or we can handle it in the mailing list).
> 
> What I have on my list is:
> 
> 
> - synchronous printk()
> 
> printk() prints messages from kernel printk buffer until the buffer
> is empty. When serial console is attached, printing is slow and thus
> other CPUs in the system have plenty of time to append new messages to
> the buffer while one CPU is printing. Thus the CPU can spend unbounded
> amount of time doing printing in console_unlock().  This is especially
> serious problem if the printk() calling console_unlock() was called with
> interrupts disabled, or from IRQ, or from spin_lock protected section
> (if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
> function to call in some cases, it can cause different types of lockups
> (soft, hard, spinlock), stalls and so on.
> 
> we have some progress on this side. printk() can offload printing from
> sensitive and unsafe contexts to a schedulable printk_kthread context (a
> special purpose printing kthread).
> but "The whole idea remains worrisome", per Andrew :)
> 
Yes. The main problem stems from the fact that printk has two different
and conflicting use-cases:
- Really urgent, 'I am about to die' messages. Which obviously need to
  be printed out as fast as possible.
- Rather largish, information/logging 'what I always wanted to tell you'
  type of messages. These messages tend to be very large, but at the end
  it doesn't really matter _when_ they'll be printed as they are
  time-stamped anyway.

For the first use-case you absolutely need a synchronous printk, but
this is a complete killer for the second case.
And OTOH having a separate thread is really the way to go for the second
case, but an absolute no-go for the first.

So I really wonder if it does make sense to lump both use-cases into one
call, or whether it wouldn't be better to have two distinct calls
for that (or, for the sake of argument, use KERN_EMERG to trigger
synchronous printks).

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  6:17 ` Hannes Reinecke
@ 2016-07-19  6:49   ` Josh Triplett
  2016-07-19  7:02     ` Hannes Reinecke
  2016-07-19  7:33   ` Sergey Senozhatsky
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 60+ messages in thread
From: Josh Triplett @ 2016-07-19  6:49 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 08:17:19AM +0200, Hannes Reinecke wrote:
> Yes. The main problem stems from the fact that printk has two different
> and conflicting use-cases:
> - Really urgent, 'I am about to die' messages. Which obviously need to
>   be printed out as fast as possible.
> - Rather largish, information/logging 'what I always wanted to tell you'
>   type of messages. These messages tend to be very large, but at the end
>   it doesn't really matter _when_ they'll be printed as they are
>   time-stamped anyway.
> 
> For the first use-case you absolutely need a synchronous printk, but
> this is a complete killer for the second case.
> And OTOH having a separate thread is really the way to go for the second
> case, but an absolute no-go for the first.
> 
> So I really wonder if it does make sense to lump both use-cases into one
> call, or whether it wouldn't be better to have two distinct calls
> for that (or, for the sake of argument, use KERN_EMERG to trigger
> synchronous printks).

For the sake of argument: what about using loglevel to distinguish the
two cases by default?

If the printk would show up on the console, handle it inline immediately
before returning, so that the user sees it on the console immediately in
case the very next line hangs the system.  That also helps with the
debugging approach of copy/pasting many instances of pr_alert("%s:%d:
here\n", __func__, __LINE__) and looking for the last one that shows up.

If the printk would *not* show up on the console, and would only show up
asynchronously in dmesg or a log somewhere, then go ahead and throw it
to the asynchronous printk_kthread context to handle and return, because
if the next line crashes, userspace wouldn't get the opportunity to read
and log it anyway.

Combined with a mechanism like "if the kernel panics, try as hard as
possible to dump out all the pending printks before dying", that seems
like a reasonable default behavior that shouldn't result in surprises.
If the kernel is alive enough that userspace can still log things (such
as if the display hangs but the kernel and userspace are still running),
then the kernel should also still be alive enough to process the pending
printks.

- Josh Triplett

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  6:49   ` Josh Triplett
@ 2016-07-19  7:02     ` Hannes Reinecke
  2016-07-19  7:11       ` Geert Uytterhoeven
  2016-07-20 22:54       ` Josh Triplett
  0 siblings, 2 replies; 60+ messages in thread
From: Hannes Reinecke @ 2016-07-19  7:02 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

On 07/19/2016 08:49 AM, Josh Triplett wrote:
> On Tue, Jul 19, 2016 at 08:17:19AM +0200, Hannes Reinecke wrote:
>> Yes. The main problem stems from the fact that printk has two different
>> and conflicting use-cases:
>> - Really urgent, 'I am about to die' messages. Which obviously need to
>>   be printed out as fast as possible.
>> - Rather largish, information/logging 'what I always wanted to tell you'
>>   type of messages. These messages tend to be very large, but at the end
>>   it doesn't really matter _when_ they'll be printed as they are
>>   time-stamped anyway.
>>
>> For the first use-case you absolutely need a synchronous printk, but
>> this is a complete killer for the second case.
>> And OTOH having a separate thread is really the way to go for the second
>> case, but an absolute no-go for the first.
>>
>> So I really wonder if it does make sense to lump both use-cases into one
>> call, or whether it wouldn't be better to have two distinct calls
>> for that (or, for the sake of argument, use KERN_EMERG to trigger
>> synchronous printks).
> 
> For the sake of argument: what about using loglevel to distinguish the
> two cases by default?
> 
That's what I've tried to infer by the above statement; KERN_EMERG could
easily used for that sort of thing.
Also I'd like to have everything on board with this idea, namely that
using printk for instantaneous but really long messages it _NOT_ going
to work.
(And I'd love to kill printk line continuation, too, but that's another
story)

> If the printk would show up on the console, handle it inline immediately
> before returning, so that the user sees it on the console immediately in
> case the very next line hangs the system.  That also helps with the
> debugging approach of copy/pasting many instances of pr_alert("%s:%d:
> here\n", __func__, __LINE__) and looking for the last one that shows up.
> 
> If the printk would *not* show up on the console, and would only show up
> asynchronously in dmesg or a log somewhere, then go ahead and throw it
> to the asynchronous printk_kthread context to handle and return, because
> if the next line crashes, userspace wouldn't get the opportunity to read
> and log it anyway.
> 
> Combined with a mechanism like "if the kernel panics, try as hard as
> possible to dump out all the pending printks before dying", that seems
> like a reasonable default behavior that shouldn't result in surprises.
> If the kernel is alive enough that userspace can still log things (such
> as if the display hangs but the kernel and userspace are still running),
> then the kernel should also still be alive enough to process the pending
> printks.
> 
But it still leaves us with a possible priority inversion.
How should we deal with situations where the async thread is running and
someone is issuing a synchronous printk?
Should we skip the asynchonous ones and print the synchronous one first,
risking out-of-order messages but a higher probability that the urgent
message is actually printed ?
Or should we elevate everything to synchronous, preserving ordering
but increasing the risk that the synchronous message never ever makes it
to the console?

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:02     ` Hannes Reinecke
@ 2016-07-19  7:11       ` Geert Uytterhoeven
  2016-07-20  6:02         ` Jan Kara
  2016-07-20 22:54       ` Josh Triplett
  1 sibling, 1 reply; 60+ messages in thread
From: Geert Uytterhoeven @ 2016-07-19  7:11 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 9:02 AM, Hannes Reinecke <hare@suse.com> wrote:
>> If the printk would show up on the console, handle it inline immediately
>> before returning, so that the user sees it on the console immediately in
>> case the very next line hangs the system.  That also helps with the
>> debugging approach of copy/pasting many instances of pr_alert("%s:%d:
>> here\n", __func__, __LINE__) and looking for the last one that shows up.
>>
>> If the printk would *not* show up on the console, and would only show up
>> asynchronously in dmesg or a log somewhere, then go ahead and throw it
>> to the asynchronous printk_kthread context to handle and return, because
>> if the next line crashes, userspace wouldn't get the opportunity to read
>> and log it anyway.
>>
>> Combined with a mechanism like "if the kernel panics, try as hard as
>> possible to dump out all the pending printks before dying", that seems
>> like a reasonable default behavior that shouldn't result in surprises.
>> If the kernel is alive enough that userspace can still log things (such
>> as if the display hangs but the kernel and userspace are still running),
>> then the kernel should also still be alive enough to process the pending
>> printks.
>>
> But it still leaves us with a possible priority inversion.
> How should we deal with situations where the async thread is running and
> someone is issuing a synchronous printk?
> Should we skip the asynchonous ones and print the synchronous one first,
> risking out-of-order messages but a higher probability that the urgent
> message is actually printed ?

There are already out-of-order messages for SMP and continuation,
some of these we want to fix, but always for the same loglevel.
If two messages have a different log level, I see no reason to (try to) keep
ordering. If keeping them together is required, the user should use pr_cont.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  6:17 ` Hannes Reinecke
  2016-07-19  6:49   ` Josh Triplett
@ 2016-07-19  7:33   ` Sergey Senozhatsky
  2016-07-19  7:38     ` Hannes Reinecke
  2016-07-19  7:46   ` Christian Borntraeger
  2016-07-20  3:35   ` Wangnan (F)
  3 siblings, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-19  7:33 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On (07/19/16 08:17), Hannes Reinecke wrote:
[..]
> Yes. The main problem stems from the fact that printk has two different
> and conflicting use-cases:
> - Really urgent, 'I am about to die' messages. Which obviously need to
>   be printed out as fast as possible.
> - Rather largish, information/logging 'what I always wanted to tell you'
>   type of messages. These messages tend to be very large, but at the end
>   it doesn't really matter _when_ they'll be printed as they are
>   time-stamped anyway.
> 
> For the first use-case you absolutely need a synchronous printk, but
> this is a complete killer for the second case.
> And OTOH having a separate thread is really the way to go for the second
> case, but an absolute no-go for the first.
> 
> So I really wonder if it does make sense to lump both use-cases into one
> call, or whether it wouldn't be better to have two distinct calls
> for that (or, for the sake of argument, use KERN_EMERG to trigger
> synchronous printks).

yes, panic() must be in sync printk mode. but we do it a
bit differently: console_verbose() forces printk to switch
to sync mode.

so panic() goes like this:

	panic()
	{
		console_verbose(); # switch to sync printk. forever.
		bust_spinlocks();
		pr_emerg("Kernel panic - not syncing...");

		....
		debug_locks_off();
		console_flush_on_panic();  # flushes kernel log_bug in
					   # sync mode

		// the rest of panic()

	}

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:33   ` Sergey Senozhatsky
@ 2016-07-19  7:38     ` Hannes Reinecke
  2016-07-19  7:46       ` Sergey Senozhatsky
  2016-07-20  6:09       ` Jan Kara
  0 siblings, 2 replies; 60+ messages in thread
From: Hannes Reinecke @ 2016-07-19  7:38 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: ksummit-discuss

On 07/19/2016 09:33 AM, Sergey Senozhatsky wrote:
> On (07/19/16 08:17), Hannes Reinecke wrote:
> [..]
>> Yes. The main problem stems from the fact that printk has two different
>> and conflicting use-cases:
>> - Really urgent, 'I am about to die' messages. Which obviously need to
>>   be printed out as fast as possible.
>> - Rather largish, information/logging 'what I always wanted to tell you'
>>   type of messages. These messages tend to be very large, but at the end
>>   it doesn't really matter _when_ they'll be printed as they are
>>   time-stamped anyway.
>>
>> For the first use-case you absolutely need a synchronous printk, but
>> this is a complete killer for the second case.
>> And OTOH having a separate thread is really the way to go for the second
>> case, but an absolute no-go for the first.
>>
>> So I really wonder if it does make sense to lump both use-cases into one
>> call, or whether it wouldn't be better to have two distinct calls
>> for that (or, for the sake of argument, use KERN_EMERG to trigger
>> synchronous printks).
> 
> yes, panic() must be in sync printk mode. but we do it a
> bit differently: console_verbose() forces printk to switch
> to sync mode.
> 
> so panic() goes like this:
> 
> 	panic()
> 	{
> 		console_verbose(); # switch to sync printk. forever.
> 		bust_spinlocks();
> 		pr_emerg("Kernel panic - not syncing...");
> 
> 		....
> 		debug_locks_off();
> 		console_flush_on_panic();  # flushes kernel log_bug in
> 					   # sync mode
> 
> 		// the rest of panic()
> 
> 	}
> 
> 	-ss
> 
but this is precisely what I meant by priority inversion:
If there are lots of messages in the printk buffer we might not be
_able_ to print out everything as the machine died before the entire
printk buffer could be printed.
Which means we will never see the panic message, at which point we could
as well drop the whole exercise.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:38     ` Hannes Reinecke
@ 2016-07-19  7:46       ` Sergey Senozhatsky
  2016-07-19  8:02         ` Hannes Reinecke
  2016-07-20  6:09       ` Jan Kara
  1 sibling, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-19  7:46 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On (07/19/16 09:38), Hannes Reinecke wrote:
[..]
> > yes, panic() must be in sync printk mode. but we do it a
> > bit differently: console_verbose() forces printk to switch
> > to sync mode.
> > 
> > so panic() goes like this:
> > 
> > 	panic()
> > 	{
> > 		console_verbose(); # switch to sync printk. forever.
> > 		bust_spinlocks();
> > 		pr_emerg("Kernel panic - not syncing...");
> > 
> > 		....
> > 		debug_locks_off();
> > 		console_flush_on_panic();  # flushes kernel log_bug in
> > 					   # sync mode
> > 
> > 		// the rest of panic()
> > 
> > 	}
> > 
> > 	-ss
> > 
> but this is precisely what I meant by priority inversion:
> If there are lots of messages in the printk buffer we might not be
> _able_ to print out everything as the machine died before the entire
> printk buffer could be printed.

why would it die? the same CPU that panics the system flushes the
log_buf in sync mode. console_flush_on_panic() is invoked after
panic_smp_send_stop(), which is supposed to IPI_STOP other CPUs on the
system, but before panic() reboots the system (if it must reboot it).
and it does not really care whether console_sem is available, it goes
to console_unlock() in any case

void console_flush_on_panic(void)
{
	/*
	 * If someone else is holding the console lock, trylock will fail
	 * and may_schedule may be set.  Ignore and proceed to unlock so
	 * that messages are flushed out.  As this can be called from any
	 * context and we don't want to get preempted while flushing,
	 * ensure may_schedule is cleared.
	 */
	console_trylock();
	console_may_schedule = 0;
	console_unlock();
}

so why would the system die before we console_unlock()?

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  6:17 ` Hannes Reinecke
  2016-07-19  6:49   ` Josh Triplett
  2016-07-19  7:33   ` Sergey Senozhatsky
@ 2016-07-19  7:46   ` Christian Borntraeger
  2016-07-19  7:53     ` Christian Borntraeger
  2016-07-20  3:35   ` Wangnan (F)
  3 siblings, 1 reply; 60+ messages in thread
From: Christian Borntraeger @ 2016-07-19  7:46 UTC (permalink / raw)
  To: Hannes Reinecke, ksummit-discuss

On 07/19/2016 08:17 AM, Hannes Reinecke wrote:
> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
>> Hello,
>>
>> Wondering if anyone will be interested in printk-related topics
>> (or we can handle it in the mailing list).
>>
>> What I have on my list is:
>>
>>
>> - synchronous printk()
>>
>> printk() prints messages from kernel printk buffer until the buffer
>> is empty. When serial console is attached, printing is slow and thus
>> other CPUs in the system have plenty of time to append new messages to
>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
>> amount of time doing printing in console_unlock().  This is especially
>> serious problem if the printk() calling console_unlock() was called with
>> interrupts disabled, or from IRQ, or from spin_lock protected section
>> (if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
>> function to call in some cases, it can cause different types of lockups
>> (soft, hard, spinlock), stalls and so on.
>>
>> we have some progress on this side. printk() can offload printing from
>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
>> special purpose printing kthread).
>> but "The whole idea remains worrisome", per Andrew :)
>>
> Yes. The main problem stems from the fact that printk has two different
> and conflicting use-cases:
> - Really urgent, 'I am about to die' messages. Which obviously need to
>   be printed out as fast as possible.
> - Rather largish, information/logging 'what I always wanted to tell you'
>   type of messages. These messages tend to be very large, but at the end
>   it doesn't really matter _when_ they'll be printed as they are
>   time-stamped anyway.

I think you brought up this topic last year in a lightning talk, correct?
Didn't Linus say "fix the console then"?


> 
> For the first use-case you absolutely need a synchronous printk, but
> this is a complete killer for the second case.
> And OTOH having a separate thread is really the way to go for the second
> case, but an absolute no-go for the first.
> 
> So I really wonder if it does make sense to lump both use-cases into one
> call, or whether it wouldn't be better to have two distinct calls
> for that (or, for the sake of argument, use KERN_EMERG to trigger
> synchronous printks).
> 
> Cheers,
> 
> Hannes
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:46   ` Christian Borntraeger
@ 2016-07-19  7:53     ` Christian Borntraeger
  2016-07-19 13:55       ` Jan Kara
  0 siblings, 1 reply; 60+ messages in thread
From: Christian Borntraeger @ 2016-07-19  7:53 UTC (permalink / raw)
  To: Hannes Reinecke, ksummit-discuss

On 07/19/2016 09:46 AM, Christian Borntraeger wrote:
> On 07/19/2016 08:17 AM, Hannes Reinecke wrote:
>> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
>>> Hello,
>>>
>>> Wondering if anyone will be interested in printk-related topics
>>> (or we can handle it in the mailing list).
>>>
>>> What I have on my list is:
>>>
>>>
>>> - synchronous printk()
>>>
>>> printk() prints messages from kernel printk buffer until the buffer
>>> is empty. When serial console is attached, printing is slow and thus
>>> other CPUs in the system have plenty of time to append new messages to
>>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
>>> amount of time doing printing in console_unlock().  This is especially
>>> serious problem if the printk() calling console_unlock() was called with
>>> interrupts disabled, or from IRQ, or from spin_lock protected section
>>> (if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
>>> function to call in some cases, it can cause different types of lockups
>>> (soft, hard, spinlock), stalls and so on.
>>>
>>> we have some progress on this side. printk() can offload printing from
>>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
>>> special purpose printing kthread).
>>> but "The whole idea remains worrisome", per Andrew :)
>>>
>> Yes. The main problem stems from the fact that printk has two different
>> and conflicting use-cases:
>> - Really urgent, 'I am about to die' messages. Which obviously need to
>>   be printed out as fast as possible.
>> - Rather largish, information/logging 'what I always wanted to tell you'
>>   type of messages. These messages tend to be very large, but at the end
>>   it doesn't really matter _when_ they'll be printed as they are
>>   time-stamped anyway.
> 
> I think you brought up this topic last year in a lightning talk, correct?
> Didn't Linus say "fix the console then"?

Which does not make the problem go away, though, so yes - a session to
discuss this is probably a good idea - it really depends on how strong
Linus opinion on this topic is ;-)

On s390 we have for LPAR the relatively slow operating system messages console,
which goes out to the service element.
In 
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit?id=25b41a7b67ee4f4d12cee8a4b8b5929e36c27e29
we provided a console specific fix^Whack to allow to tune the buffer size
and to allow to drop messages. Better solutions are certainly welcome.

Christian

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:46       ` Sergey Senozhatsky
@ 2016-07-19  8:02         ` Hannes Reinecke
  2016-07-19  8:23           ` Sergey Senozhatsky
  2016-07-21 10:36           ` David Woodhouse
  0 siblings, 2 replies; 60+ messages in thread
From: Hannes Reinecke @ 2016-07-19  8:02 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: ksummit-discuss

On 07/19/2016 09:46 AM, Sergey Senozhatsky wrote:
> On (07/19/16 09:38), Hannes Reinecke wrote:
> [..]
>>> yes, panic() must be in sync printk mode. but we do it a
>>> bit differently: console_verbose() forces printk to switch
>>> to sync mode.
>>>
>>> so panic() goes like this:
>>>
>>> 	panic()
>>> 	{
>>> 		console_verbose(); # switch to sync printk. forever.
>>> 		bust_spinlocks();
>>> 		pr_emerg("Kernel panic - not syncing...");
>>>
>>> 		....
>>> 		debug_locks_off();
>>> 		console_flush_on_panic();  # flushes kernel log_bug in
>>> 					   # sync mode
>>>
>>> 		// the rest of panic()
>>>
>>> 	}
>>>
>>> 	-ss
>>>
>> but this is precisely what I meant by priority inversion:
>> If there are lots of messages in the printk buffer we might not be
>> _able_ to print out everything as the machine died before the entire
>> printk buffer could be printed.
> 
> why would it die? the same CPU that panics the system flushes the
> log_buf in sync mode. console_flush_on_panic() is invoked after
> panic_smp_send_stop(), which is supposed to IPI_STOP other CPUs on the
> system, but before panic() reboots the system (if it must reboot it).
> and it does not really care whether console_sem is available, it goes
> to console_unlock() in any case
> 
> void console_flush_on_panic(void)
> {
> 	/*
> 	 * If someone else is holding the console lock, trylock will fail
> 	 * and may_schedule may be set.  Ignore and proceed to unlock so
> 	 * that messages are flushed out.  As this can be called from any
> 	 * context and we don't want to get preempted while flushing,
> 	 * ensure may_schedule is cleared.
> 	 */
> 	console_trylock();
> 	console_may_schedule = 0;
> 	console_unlock();
> }
> 
> so why would the system die before we console_unlock()?
> 
Errm.
Because it doesn't have any other chance?
Like, hard lockup?
Power down?
Hardware dead?

Slightly puzzled,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  8:02         ` Hannes Reinecke
@ 2016-07-19  8:23           ` Sergey Senozhatsky
  2016-07-21 10:36           ` David Woodhouse
  1 sibling, 0 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-19  8:23 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On (07/19/16 10:02), Hannes Reinecke wrote:
[..]
> > so why would the system die before we console_unlock()?
> > 
> Errm.
> Because it doesn't have any other chance?
> Like, hard lockup?
> Power down?
> Hardware dead?

ah, ok. so mostly for the reasons outside of printk/panic code.
hard lockup, case. hm, I think nmi_panic() is still going to
call panic() at some point. so we have chances to flush log_buf
messages.

power down or h/w failure, however, can't be fixed up neither
by sync nor by async printk. well, we can't help here.


the thing that really can stop us from flushing the messages
is the fact that console driver may be in some unknown state.
for example, have some internal locks owned by non-panic CPU,
which we IPI_STOP-ed before console_flush_on_panic(). console
drivers are normally unaware of the fact that the system is in
panic now. so some sort of console->write_on_panic(), or
console->reset() would probably be helpful, but that's a bit
different topic.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:53     ` Christian Borntraeger
@ 2016-07-19 13:55       ` Jan Kara
  2016-07-28  2:59         ` Steven Rostedt
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Kara @ 2016-07-19 13:55 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: ksummit-discuss

On Tue 19-07-16 09:53:01, Christian Borntraeger wrote:
> On 07/19/2016 09:46 AM, Christian Borntraeger wrote:
> > On 07/19/2016 08:17 AM, Hannes Reinecke wrote:
> >> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
> >>> Hello,
> >>>
> >>> Wondering if anyone will be interested in printk-related topics
> >>> (or we can handle it in the mailing list).
> >>>
> >>> What I have on my list is:
> >>>
> >>>
> >>> - synchronous printk()
> >>>
> >>> printk() prints messages from kernel printk buffer until the buffer
> >>> is empty. When serial console is attached, printing is slow and thus
> >>> other CPUs in the system have plenty of time to append new messages to
> >>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
> >>> amount of time doing printing in console_unlock().  This is especially
> >>> serious problem if the printk() calling console_unlock() was called with
> >>> interrupts disabled, or from IRQ, or from spin_lock protected section
> >>> (if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
> >>> function to call in some cases, it can cause different types of lockups
> >>> (soft, hard, spinlock), stalls and so on.
> >>>
> >>> we have some progress on this side. printk() can offload printing from
> >>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
> >>> special purpose printing kthread).
> >>> but "The whole idea remains worrisome", per Andrew :)
> >>>
> >> Yes. The main problem stems from the fact that printk has two different
> >> and conflicting use-cases:
> >> - Really urgent, 'I am about to die' messages. Which obviously need to
> >>   be printed out as fast as possible.
> >> - Rather largish, information/logging 'what I always wanted to tell you'
> >>   type of messages. These messages tend to be very large, but at the end
> >>   it doesn't really matter _when_ they'll be printed as they are
> >>   time-stamped anyway.
> > 
> > I think you brought up this topic last year in a lightning talk, correct?
> > Didn't Linus say "fix the console then"?
> 
> Which does not make the problem go away, though, so yes - a session to
> discuss this is probably a good idea - it really depends on how strong
> Linus opinion on this topic is ;-)

Well, he backed up from that statement later a bit. So when I was
discussing this problem with him at last KS he agreed that making printk
async is the way to go and he specifically dismissed attempts to print
synchronously for a while and fall back to async only later. He just wanted
some way to get things to console synchronously once things go bad (oops,
panic,...) which is discussed elsewhere in this thread.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  3:47 [Ksummit-discuss] [TECH TOPIC] asynchronous printk Sergey Senozhatsky
  2016-07-19  3:56 ` Viresh Kumar
  2016-07-19  6:17 ` Hannes Reinecke
@ 2016-07-19 14:45 ` James Bottomley
  2016-07-19 14:55   ` Sergey Senozhatsky
  2016-09-21  4:41 ` Sergey Senozhatsky
  3 siblings, 1 reply; 60+ messages in thread
From: James Bottomley @ 2016-07-19 14:45 UTC (permalink / raw)
  To: Sergey Senozhatsky, ksummit-discuss
  Cc: Tetsuo Handa, Tejun Heo, Jiri Kosina, Viresh Kumar

On Tue, 2016-07-19 at 12:47 +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> Wondering if anyone will be interested in printk-related topics
> (or we can handle it in the mailing list).
> 
> What I have on my list is:

Are there any bug reports for these cases?  If there are, I'd use the
bug report as an opportunity to fix on the relevant list and if there
aren't, I'd say that the problem is then largely theoretical and
there's not much point solving it until it becomes a real problem.

For instance on the KERN_CONT we've been debating the problems of pre
-empt and continuation lines for years but no-ones cared enough to fix
it.  This means either the problem isn't seen often enough in the field
or that when it is seen there hasn't been much difficulty disentangling
the log.  For the former, there's no need to fix something that isn't
seen in practice and for the latter, if no-one cares enough then it's
not a big enough problem to fix.

James

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19 14:45 ` James Bottomley
@ 2016-07-19 14:55   ` Sergey Senozhatsky
  2016-07-19 17:58     ` James Bottomley
  2016-07-20  6:14     ` Jan Kara
  0 siblings, 2 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-19 14:55 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jiri Kosina, ksummit-discuss, Tetsuo Handa, Viresh Kumar, Tejun Heo

Hello,

On (07/19/16 07:45), James Bottomley wrote:
> On Tue, 2016-07-19 at 12:47 +0900, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > Wondering if anyone will be interested in printk-related topics
> > (or we can handle it in the mailing list).
> > 
> > What I have on my list is:
> 
> Are there any bug reports for these cases?  If there are, I'd use the
> bug report as an opportunity to fix on the relevant list and if there
> aren't, I'd say that the problem is then largely theoretical and
> there's not much point solving it until it becomes a real problem.

yes, there are reports. for instance,
http://marc.info/?l=linux-kernel&m=146823278316663

I do have same problems with pintk (lockups, stalls, etc.)
and even more.


> For instance on the KERN_CONT we've been debating the problems of pre
> -empt and continuation lines for years but no-ones cared enough to fix
> it.  This means either the problem isn't seen often enough in the field
> or that when it is seen there hasn't been much difficulty disentangling
> the log.  For the former, there's no need to fix something that isn't
> seen in practice and for the latter, if no-one cares enough then it's
> not a big enough problem to fix.

well, I agree that it doesn't make it impossible to read the logs.
how often does it happen... on my laptop sometimes KERN_CONT lines
are not always really continuous. so I observe it, in some sense.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19 14:55   ` Sergey Senozhatsky
@ 2016-07-19 17:58     ` James Bottomley
  2016-07-19 18:24       ` Viresh Kumar
  2016-07-20  2:08       ` Sergey Senozhatsky
  2016-07-20  6:14     ` Jan Kara
  1 sibling, 2 replies; 60+ messages in thread
From: James Bottomley @ 2016-07-19 17:58 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Tetsuo Handa, Viresh Kumar, Tejun Heo, Jiri Kosina, ksummit-discuss

On Tue, 2016-07-19 at 23:55 +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/19/16 07:45), James Bottomley wrote:
> > On Tue, 2016-07-19 at 12:47 +0900, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > Wondering if anyone will be interested in printk-related topics
> > > (or we can handle it in the mailing list).
> > > 
> > > What I have on my list is:
> > 
> > Are there any bug reports for these cases?  If there are, I'd use 
> > the bug report as an opportunity to fix on the relevant list and if
> > there aren't, I'd say that the problem is then largely theoretical 
> > and there's not much point solving it until it becomes a real
> > problem.
> 
> yes, there are reports. for instance,
> http://marc.info/?l=linux-kernel&m=146823278316663

That's about backporting a fix from upstream to 3.12 to fix the
particular issue, so it seems to be solved for the reporter as far as
it goes.

> I do have same problems with pintk (lockups, stalls, etc.)
> and even more.

OK, but given that the bugs are fixed as they're reported, the only
issue seems to be that some people think the fix is incomplete and
Andrew is sitting on it because he's unsure if the patches actually
solve the problem (or even if we have a problem).

The only comments from him I can find are in Jan's series:

http://thread.gmane.org/gmane.linux.kernel/1619692

The concern seems to be "prink is fragile you look to be making it
differently fragile, how is that of benefit".

So the problem is there's no overriding need driving this and it's
blocked by "vague concerns" about fragility.  Is there a process
problem that there's no effective way to move these patches forward
without finding an overriding need or addressing the concerns?

> > For instance on the KERN_CONT we've been debating the problems of 
> > pre-empt and continuation lines for years but no-ones cared enough 
> > to fix it.  This means either the problem isn't seen often enough 
> > in the field or that when it is seen there hasn't been much difficu
> > lty disentangling the log.  For the former, there's no need to fix 
> > something that isn't seen in practice and for the latter, if no-one 
> > cares enough then it's not a big enough problem to fix.
> 
> well, I agree that it doesn't make it impossible to read the logs.
> how often does it happen... on my laptop sometimes KERN_CONT lines
> are not always really continuous. so I observe it, in some sense.

OK, so it's unsightly but not necessarily a problem for reading the
logs.  Again, this means we have no overriding need to fix it.

James

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19 17:58     ` James Bottomley
@ 2016-07-19 18:24       ` Viresh Kumar
  2016-07-20  2:08       ` Sergey Senozhatsky
  1 sibling, 0 replies; 60+ messages in thread
From: Viresh Kumar @ 2016-07-19 18:24 UTC (permalink / raw)
  To: James Bottomley; +Cc: Tetsuo Handa, Tejun Heo, Jiri Kosina, ksummit-discuss

On 19-07-16, 10:58, James Bottomley wrote:
> On Tue, 2016-07-19 at 23:55 +0900, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > On (07/19/16 07:45), James Bottomley wrote:
> > > On Tue, 2016-07-19 at 12:47 +0900, Sergey Senozhatsky wrote:
> > > > Hello,
> > > > 
> > > > Wondering if anyone will be interested in printk-related topics
> > > > (or we can handle it in the mailing list).
> > > > 
> > > > What I have on my list is:
> > > 
> > > Are there any bug reports for these cases?  If there are, I'd use 
> > > the bug report as an opportunity to fix on the relevant list and if
> > > there aren't, I'd say that the problem is then largely theoretical 
> > > and there's not much point solving it until it becomes a real
> > > problem.
> > 
> > yes, there are reports. for instance,
> > http://marc.info/?l=linux-kernel&m=146823278316663
> 
> That's about backporting a fix from upstream to 3.12 to fix the
> particular issue, so it seems to be solved for the reporter as far as
> it goes.

Not really, we need patches from Sergey and Jan for async printing to make it
work.

-- 
viresh

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19 17:58     ` James Bottomley
  2016-07-19 18:24       ` Viresh Kumar
@ 2016-07-20  2:08       ` Sergey Senozhatsky
  1 sibling, 0 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-20  2:08 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jiri Kosina, ksummit-discuss, Tetsuo Handa, Viresh Kumar, Tejun Heo

Hello,

On (07/19/16 10:58), James Bottomley wrote:
[..]
> > yes, there are reports. for instance,
> > http://marc.info/?l=linux-kernel&m=146823278316663
> 
> That's about backporting a fix from upstream to 3.12 to fix the
> particular issue, so it seems to be solved for the reporter as far as
> it goes.

but there is no upstream fix. if the one you are talking about is
'5874af2003b1 ("printk: enable interrupts before callin console_trylock_for_printk()")'

then it can't and won't solve every single problem with printk.
we still have cases like:

spin_lock()
 printk() -> console_unlock()
spin_unlock()

spin_lock_irqsafe()
 printk() -> console_unlock()
spin_unlock_irqrestore()

rcu_read_lock()
 printk() -> cosnole_unlock()
rcu_read_unlock()

preempt_disable()
 printk() -> console_unlock()
preempt_enable()

local_irq_save()
 printk() -> console_unlock()
local_irq_restore()

in IRQ
printk() -> console_unlock()

in IRQ (again)
printk_deferred() -> irq_work -> IRQ -> wake_up_klogd_work_func -> console_unlock()

and so on.

console_unlock() does
{
 for (;;) {
   text = get_message_from_log_buf();
   call_console_drivers(text) -> UART_write(text);     // for instance
 }
}

while other CPUs on the system do

printk(text)
{
   append_message_to_log_buf(text);
   return;
}


so aync printk attempts to address those problems. what it does
not address, though, are direct console_lock() calls. a driver
or an IRQ handler still can do

	if (console_trylock())
		console_unlock();     -> loop

that is scheduled to be addressed in the future.


> > I do have same problems with pintk (lockups, stalls, etc.)
> > and even more.
> 
> OK, but given that the bugs are fixed as they're reported, the only
> issue seems to be that some people think the fix is incomplete and
> Andrew is sitting on it because he's unsure if the patches actually
> solve the problem (or even if we have a problem).
> 
> The only comments from him I can find are in Jan's series:
> 
> http://thread.gmane.org/gmane.linux.kernel/1619692
> 
> The concern seems to be "prink is fragile you look to be making it
> differently fragile, how is that of benefit".
>
> So the problem is there's no overriding need driving this and it's
> blocked by "vague concerns" about fragility.  Is there a process
> problem that there's no effective way to move these patches forward
> without finding an overriding need or addressing the concerns?

later comments:
http://marc.info/?l=linux-kernel&m=145981032029352

> The concern seems to be "prink is fragile you look to be making it
> differently fragile, how is that of benefit".

well, yes, more or less, that seems to be the concern. probably,
there is no agreement that the patch set is moving in the right
direction (from implementation POV) in the first place.

> > well, I agree that it doesn't make it impossible to read the logs.
> > how often does it happen... on my laptop sometimes KERN_CONT lines
> > are not always really continuous. so I observe it, in some sense.
> 
> OK, so it's unsightly but not necessarily a problem for reading the
> logs.  Again, this means we have no overriding need to fix it.

well, both agree and disagree. once you have cont lines from, say, 3
CPUs mixed in peculiar ways reading the logs is not so simple any longer.
what may be would help here (if `fixing' cont lines is not an option)
is CPU number attached to every line.

for example, this

[12.2324] 0xffff8888
[12.2324] 0xc3234898
[12.2324] 0x00000002
[12.2324] 0xffff1233
[12.2325] 0x00000113
[12.2325] 0xc3248764

becomes this

[0][12.2324] 0xffff8888
[0][12.2324] 0xc3234898
[1][12.2324] 0x00000002
[2][12.2324] 0xffff1233
[2][12.2325] 0x00000113
[0][12.2325] 0xc3248764

just an idea.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  6:17 ` Hannes Reinecke
                     ` (2 preceding siblings ...)
  2016-07-19  7:46   ` Christian Borntraeger
@ 2016-07-20  3:35   ` Wangnan (F)
  2016-07-21  1:16     ` Andy Lutomirski
  3 siblings, 1 reply; 60+ messages in thread
From: Wangnan (F) @ 2016-07-20  3:35 UTC (permalink / raw)
  To: Hannes Reinecke, ksummit-discuss



On 2016/7/19 14:17, Hannes Reinecke wrote:
> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
>> Hello,
>>
>> Wondering if anyone will be interested in printk-related topics
>> (or we can handle it in the mailing list).
>>
>> What I have on my list is:
>>
>>
>> - synchronous printk()
>>
>> printk() prints messages from kernel printk buffer until the buffer
>> is empty. When serial console is attached, printing is slow and thus
>> other CPUs in the system have plenty of time to append new messages to
>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
>> amount of time doing printing in console_unlock().  This is especially
>> serious problem if the printk() calling console_unlock() was called with
>> interrupts disabled, or from IRQ, or from spin_lock protected section
>> (if the spinlock is contended), etc. etc. IOW, printk() is quite dangerous
>> function to call in some cases, it can cause different types of lockups
>> (soft, hard, spinlock), stalls and so on.
>>
>> we have some progress on this side. printk() can offload printing from
>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
>> special purpose printing kthread).
>> but "The whole idea remains worrisome", per Andrew :)
>>
> Yes. The main problem stems from the fact that printk has two different
> and conflicting use-cases:
> - Really urgent, 'I am about to die' messages. Which obviously need to
>    be printed out as fast as possible.
> - Rather largish, information/logging 'what I always wanted to tell you'
>    type of messages. These messages tend to be very large, but at the end
>    it doesn't really matter _when_ they'll be printed as they are
>    time-stamped anyway.
>

Actually, there are 3 types of messages:

  1. Urgent: I'm going to die.
  2. information/logging.
  3. Trace.

If you look at some customized kernel you can find lots of trace from
kernel. Most of the time they are really meaningless, but if kernel is
buggy or crash they become key information. When debugging, it is really
frustrating to see something is missed in kmsg, so developer tend to output
maything from printk. However, in production system they have to turns those
trace off, so debugging production systems is really painful.

I have an idea on those trace level printk. Since they are meaningless most
of the time, let it output to another ring buffer, unconditionally overwrite
old message without outputing to console or log file. When kernel detect 
something
happen or before kernel die, dump the ring buffer to the console. Then we
can put as many as tracing information through printk without console 
problem.

Since we already have tracepoints and overwritable perf ring buffer, we can
encapsulate trace level printk using tracepoints, and let it output to perf
ring buffer.

Make sence?

Thanks.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:11       ` Geert Uytterhoeven
@ 2016-07-20  6:02         ` Jan Kara
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Kara @ 2016-07-20  6:02 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: ksummit-discuss

On Tue 19-07-16 09:11:40, Geert Uytterhoeven wrote:
> On Tue, Jul 19, 2016 at 9:02 AM, Hannes Reinecke <hare@suse.com> wrote:
> >> If the printk would show up on the console, handle it inline immediately
> >> before returning, so that the user sees it on the console immediately in
> >> case the very next line hangs the system.  That also helps with the
> >> debugging approach of copy/pasting many instances of pr_alert("%s:%d:
> >> here\n", __func__, __LINE__) and looking for the last one that shows up.
> >>
> >> If the printk would *not* show up on the console, and would only show up
> >> asynchronously in dmesg or a log somewhere, then go ahead and throw it
> >> to the asynchronous printk_kthread context to handle and return, because
> >> if the next line crashes, userspace wouldn't get the opportunity to read
> >> and log it anyway.
> >>
> >> Combined with a mechanism like "if the kernel panics, try as hard as
> >> possible to dump out all the pending printks before dying", that seems
> >> like a reasonable default behavior that shouldn't result in surprises.
> >> If the kernel is alive enough that userspace can still log things (such
> >> as if the display hangs but the kernel and userspace are still running),
> >> then the kernel should also still be alive enough to process the pending
> >> printks.
> >>
> > But it still leaves us with a possible priority inversion.
> > How should we deal with situations where the async thread is running and
> > someone is issuing a synchronous printk?
> > Should we skip the asynchonous ones and print the synchronous one first,
> > risking out-of-order messages but a higher probability that the urgent
> > message is actually printed ?
> 
> There are already out-of-order messages for SMP and continuation,
> some of these we want to fix, but always for the same loglevel.
> If two messages have a different log level, I see no reason to (try to) keep
> ordering. If keeping them together is required, the user should use pr_cont.

This does not really match reality. Just look at how Oops is printed. It is
a mix of different loglevels and you certainly want to print them together
and in the right order. Generally I don't think reordering messages is
acceptable because less severe messages before a critical message can be
very important in analyzing what has happened.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:38     ` Hannes Reinecke
  2016-07-19  7:46       ` Sergey Senozhatsky
@ 2016-07-20  6:09       ` Jan Kara
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Kara @ 2016-07-20  6:09 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Tue 19-07-16 09:38:10, Hannes Reinecke wrote:
> On 07/19/2016 09:33 AM, Sergey Senozhatsky wrote:
> > On (07/19/16 08:17), Hannes Reinecke wrote:
> > [..]
> >> Yes. The main problem stems from the fact that printk has two different
> >> and conflicting use-cases:
> >> - Really urgent, 'I am about to die' messages. Which obviously need to
> >>   be printed out as fast as possible.
> >> - Rather largish, information/logging 'what I always wanted to tell you'
> >>   type of messages. These messages tend to be very large, but at the end
> >>   it doesn't really matter _when_ they'll be printed as they are
> >>   time-stamped anyway.
> >>
> >> For the first use-case you absolutely need a synchronous printk, but
> >> this is a complete killer for the second case.
> >> And OTOH having a separate thread is really the way to go for the second
> >> case, but an absolute no-go for the first.
> >>
> >> So I really wonder if it does make sense to lump both use-cases into one
> >> call, or whether it wouldn't be better to have two distinct calls
> >> for that (or, for the sake of argument, use KERN_EMERG to trigger
> >> synchronous printks).
> > 
> > yes, panic() must be in sync printk mode. but we do it a
> > bit differently: console_verbose() forces printk to switch
> > to sync mode.
> > 
> > so panic() goes like this:
> > 
> > 	panic()
> > 	{
> > 		console_verbose(); # switch to sync printk. forever.
> > 		bust_spinlocks();
> > 		pr_emerg("Kernel panic - not syncing...");
> > 
> > 		....
> > 		debug_locks_off();
> > 		console_flush_on_panic();  # flushes kernel log_bug in
> > 					   # sync mode
> > 
> > 		// the rest of panic()
> > 
> > 	}
> > 
> > 	-ss
> > 
> but this is precisely what I meant by priority inversion:
> If there are lots of messages in the printk buffer we might not be
> _able_ to print out everything as the machine died before the entire
> printk buffer could be printed.
> Which means we will never see the panic message, at which point we could
> as well drop the whole exercise.

I like Linus's suggestion how to fix this problem: skip part of buffered
messages and print just last 100 of them before the critical event. I have
a patch to do this but for now Sergey removed it from the patch set because
it is mostly independent and there were some technical issues with it.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19 14:55   ` Sergey Senozhatsky
  2016-07-19 17:58     ` James Bottomley
@ 2016-07-20  6:14     ` Jan Kara
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Kara @ 2016-07-20  6:14 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jiri Kosina, ksummit-discuss, Tetsuo Handa, Viresh Kumar,
	James Bottomley, Tejun Heo

On Tue 19-07-16 23:55:09, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/19/16 07:45), James Bottomley wrote:
> > On Tue, 2016-07-19 at 12:47 +0900, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > Wondering if anyone will be interested in printk-related topics
> > > (or we can handle it in the mailing list).
> > > 
> > > What I have on my list is:
> > 
> > Are there any bug reports for these cases?  If there are, I'd use the
> > bug report as an opportunity to fix on the relevant list and if there
> > aren't, I'd say that the problem is then largely theoretical and
> > there's not much point solving it until it becomes a real problem.
> 
> yes, there are reports. for instance,
> http://marc.info/?l=linux-kernel&m=146823278316663
> 
> I do have same problems with pintk (lockups, stalls, etc.)
> and even more.

Yeah, and there are much more reports of these problems. Last reporter of
these issues was Vishal with his phone not being able to suspend due to too
many messages being printed, Hyper-V people reported these issues, in SUSE
we have machines which are *unable to boot* without async printk patches,
and there are more.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  7:02     ` Hannes Reinecke
  2016-07-19  7:11       ` Geert Uytterhoeven
@ 2016-07-20 22:54       ` Josh Triplett
  2016-07-21  0:46         ` Sergey Senozhatsky
  1 sibling, 1 reply; 60+ messages in thread
From: Josh Triplett @ 2016-07-20 22:54 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 09:02:17AM +0200, Hannes Reinecke wrote:
> On 07/19/2016 08:49 AM, Josh Triplett wrote:
> > On Tue, Jul 19, 2016 at 08:17:19AM +0200, Hannes Reinecke wrote:
> >> Yes. The main problem stems from the fact that printk has two different
> >> and conflicting use-cases:
> >> - Really urgent, 'I am about to die' messages. Which obviously need to
> >>   be printed out as fast as possible.
> >> - Rather largish, information/logging 'what I always wanted to tell you'
> >>   type of messages. These messages tend to be very large, but at the end
> >>   it doesn't really matter _when_ they'll be printed as they are
> >>   time-stamped anyway.
> >>
> >> For the first use-case you absolutely need a synchronous printk, but
> >> this is a complete killer for the second case.
> >> And OTOH having a separate thread is really the way to go for the second
> >> case, but an absolute no-go for the first.
> >>
> >> So I really wonder if it does make sense to lump both use-cases into one
> >> call, or whether it wouldn't be better to have two distinct calls
> >> for that (or, for the sake of argument, use KERN_EMERG to trigger
> >> synchronous printks).
> > 
> > For the sake of argument: what about using loglevel to distinguish the
> > two cases by default?
> > 
> That's what I've tried to infer by the above statement; KERN_EMERG could
> easily used for that sort of thing.

I don't mean using just the priority level of the printk call.  I mean
using the current kernel loglevel, as in what level it displays to the
console, as set on the kernel command line with the loglevel= parameter.
printk could quickly check the priority level of the call versus the
current kernel loglevel to determine if the message would go to the
console or not, and use that to decide whether to handle it
synchronously or asynchronously.

- Josh Triplett

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-20 22:54       ` Josh Triplett
@ 2016-07-21  0:46         ` Sergey Senozhatsky
  2016-07-21  1:12           ` Josh Triplett
  0 siblings, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-21  0:46 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

Hello,

On (07/20/16 15:54), Josh Triplett wrote:
[..]
> > That's what I've tried to infer by the above statement; KERN_EMERG could
> > easily used for that sort of thing.
> 
> I don't mean using just the priority level of the printk call.  I mean
> using the current kernel loglevel, as in what level it displays to the
> console, as set on the kernel command line with the loglevel= parameter.
> printk could quickly check the priority level of the call versus the
> current kernel loglevel to determine if the message would go to the
> console or not, and use that to decide whether to handle it
> synchronously or asynchronously.

between loglevel check in printk() and actual printing console loglevel
may change. thus printk() does not make this (severity level filtering)
decision. console_unlock() does, on per-log record basis:

console_unlock()
{
	for (;;) {
		spin_lock(logbuf_lock)

skip:
		msg = log_from_idx(console_idx);
		if (msg->level >= console_loglevel)
			goto skip;

		spin_unlock(logbuf_lock)

		call_console_drivers(text)
	}
}

and by the time we call console_unlock() we better already be
either in async mode, or sync mode. unless we want to rewrite
console_unlock().

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21  0:46         ` Sergey Senozhatsky
@ 2016-07-21  1:12           ` Josh Triplett
  0 siblings, 0 replies; 60+ messages in thread
From: Josh Triplett @ 2016-07-21  1:12 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: ksummit-discuss

On Thu, Jul 21, 2016 at 09:46:16AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/20/16 15:54), Josh Triplett wrote:
> [..]
> > > That's what I've tried to infer by the above statement; KERN_EMERG could
> > > easily used for that sort of thing.
> > 
> > I don't mean using just the priority level of the printk call.  I mean
> > using the current kernel loglevel, as in what level it displays to the
> > console, as set on the kernel command line with the loglevel= parameter.
> > printk could quickly check the priority level of the call versus the
> > current kernel loglevel to determine if the message would go to the
> > console or not, and use that to decide whether to handle it
> > synchronously or asynchronously.
> 
> between loglevel check in printk() and actual printing console loglevel
> may change. thus printk() does not make this (severity level filtering)
> decision. console_unlock() does, on per-log record basis:
[...]
> and by the time we call console_unlock() we better already be
> either in async mode, or sync mode. unless we want to rewrite
> console_unlock().

That's exactly why I was suggesting this filter.  Yes, loglevel can
change at any time, but I don't think it's worth worrying about whether
messages that happen to race with the change to loglevel get handled
synchronously or asynchronously.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-20  3:35   ` Wangnan (F)
@ 2016-07-21  1:16     ` Andy Lutomirski
  2016-07-21  1:52       ` Wangnan (F)
                         ` (2 more replies)
  0 siblings, 3 replies; 60+ messages in thread
From: Andy Lutomirski @ 2016-07-21  1:16 UTC (permalink / raw)
  To: Wangnan (F); +Cc: ksummit-discuss

On Tue, Jul 19, 2016 at 8:35 PM, Wangnan (F) <wangnan0@huawei.com> wrote:
>
>
> On 2016/7/19 14:17, Hannes Reinecke wrote:
>>
>> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
>>>
>>> Hello,
>>>
>>> Wondering if anyone will be interested in printk-related topics
>>> (or we can handle it in the mailing list).
>>>
>>> What I have on my list is:
>>>
>>>
>>> - synchronous printk()
>>>
>>> printk() prints messages from kernel printk buffer until the buffer
>>> is empty. When serial console is attached, printing is slow and thus
>>> other CPUs in the system have plenty of time to append new messages to
>>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
>>> amount of time doing printing in console_unlock().  This is especially
>>> serious problem if the printk() calling console_unlock() was called with
>>> interrupts disabled, or from IRQ, or from spin_lock protected section
>>> (if the spinlock is contended), etc. etc. IOW, printk() is quite
>>> dangerous
>>> function to call in some cases, it can cause different types of lockups
>>> (soft, hard, spinlock), stalls and so on.
>>>
>>> we have some progress on this side. printk() can offload printing from
>>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
>>> special purpose printing kthread).
>>> but "The whole idea remains worrisome", per Andrew :)
>>>
>> Yes. The main problem stems from the fact that printk has two different
>> and conflicting use-cases:
>> - Really urgent, 'I am about to die' messages. Which obviously need to
>>    be printed out as fast as possible.
>> - Rather largish, information/logging 'what I always wanted to tell you'
>>    type of messages. These messages tend to be very large, but at the end
>>    it doesn't really matter _when_ they'll be printed as they are
>>    time-stamped anyway.
>>
>
> Actually, there are 3 types of messages:
>
>  1. Urgent: I'm going to die.
>  2. information/logging.
>  3. Trace.
>

If we do all this stuff, can we also try to clean up earlyprintk a
bit?  The whole earlyconsole mechanism is a mess, and switching over
to the non-early console is only somewhat functional.  I'd love to see
this all simplified: before there's any console at all available, just
buffer messages.  Then, when a console shows up, write the buffer out.
Then earlyprintk can work just like regular printk.

--Andy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21  1:16     ` Andy Lutomirski
@ 2016-07-21  1:52       ` Wangnan (F)
  2016-07-21  5:59       ` Hannes Reinecke
  2016-07-21 10:28       ` David Woodhouse
  2 siblings, 0 replies; 60+ messages in thread
From: Wangnan (F) @ 2016-07-21  1:52 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: ksummit-discuss



On 2016/7/21 9:16, Andy Lutomirski wrote:
> On Tue, Jul 19, 2016 at 8:35 PM, Wangnan (F) <wangnan0@huawei.com> wrote:
>>
>> On 2016/7/19 14:17, Hannes Reinecke wrote:
>>> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
>>>> Hello,
>>>>
>>>> Wondering if anyone will be interested in printk-related topics
>>>> (or we can handle it in the mailing list).
>>>>
>>>> What I have on my list is:
>>>>
>>>>
>>>> - synchronous printk()
>>>>
>>>> printk() prints messages from kernel printk buffer until the buffer
>>>> is empty. When serial console is attached, printing is slow and thus
>>>> other CPUs in the system have plenty of time to append new messages to
>>>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
>>>> amount of time doing printing in console_unlock().  This is especially
>>>> serious problem if the printk() calling console_unlock() was called with
>>>> interrupts disabled, or from IRQ, or from spin_lock protected section
>>>> (if the spinlock is contended), etc. etc. IOW, printk() is quite
>>>> dangerous
>>>> function to call in some cases, it can cause different types of lockups
>>>> (soft, hard, spinlock), stalls and so on.
>>>>
>>>> we have some progress on this side. printk() can offload printing from
>>>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
>>>> special purpose printing kthread).
>>>> but "The whole idea remains worrisome", per Andrew :)
>>>>
>>> Yes. The main problem stems from the fact that printk has two different
>>> and conflicting use-cases:
>>> - Really urgent, 'I am about to die' messages. Which obviously need to
>>>     be printed out as fast as possible.
>>> - Rather largish, information/logging 'what I always wanted to tell you'
>>>     type of messages. These messages tend to be very large, but at the end
>>>     it doesn't really matter _when_ they'll be printed as they are
>>>     time-stamped anyway.
>>>
>> Actually, there are 3 types of messages:
>>
>>   1. Urgent: I'm going to die.
>>   2. information/logging.
>>   3. Trace.
>>
> If we do all this stuff, can we also try to clean up earlyprintk a
> bit?  The whole earlyconsole mechanism is a mess, and switching over
> to the non-early console is only somewhat functional.  I'd love to see
> this all simplified: before there's any console at all available, just
> buffer messages.  Then, when a console shows up, write the buffer out.
> Then earlyprintk can work just like regular printk.

I think kernel need earlyprintk because at very early stage kernel
can't use memory so can't buffer messages?

Thank you.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21  1:16     ` Andy Lutomirski
  2016-07-21  1:52       ` Wangnan (F)
@ 2016-07-21  5:59       ` Hannes Reinecke
  2016-07-21 10:31         ` David Woodhouse
  2016-07-21 10:28       ` David Woodhouse
  2 siblings, 1 reply; 60+ messages in thread
From: Hannes Reinecke @ 2016-07-21  5:59 UTC (permalink / raw)
  To: Andy Lutomirski, Wangnan (F); +Cc: ksummit-discuss

On 07/21/2016 03:16 AM, Andy Lutomirski wrote:
> On Tue, Jul 19, 2016 at 8:35 PM, Wangnan (F) <wangnan0@huawei.com> wrote:
>>
>>
>> On 2016/7/19 14:17, Hannes Reinecke wrote:
>>>
>>> On 07/19/2016 05:47 AM, Sergey Senozhatsky wrote:
>>>>
>>>> Hello,
>>>>
>>>> Wondering if anyone will be interested in printk-related topics
>>>> (or we can handle it in the mailing list).
>>>>
>>>> What I have on my list is:
>>>>
>>>>
>>>> - synchronous printk()
>>>>
>>>> printk() prints messages from kernel printk buffer until the buffer
>>>> is empty. When serial console is attached, printing is slow and thus
>>>> other CPUs in the system have plenty of time to append new messages to
>>>> the buffer while one CPU is printing. Thus the CPU can spend unbounded
>>>> amount of time doing printing in console_unlock().  This is especially
>>>> serious problem if the printk() calling console_unlock() was called with
>>>> interrupts disabled, or from IRQ, or from spin_lock protected section
>>>> (if the spinlock is contended), etc. etc. IOW, printk() is quite
>>>> dangerous
>>>> function to call in some cases, it can cause different types of lockups
>>>> (soft, hard, spinlock), stalls and so on.
>>>>
>>>> we have some progress on this side. printk() can offload printing from
>>>> sensitive and unsafe contexts to a schedulable printk_kthread context (a
>>>> special purpose printing kthread).
>>>> but "The whole idea remains worrisome", per Andrew :)
>>>>
>>> Yes. The main problem stems from the fact that printk has two different
>>> and conflicting use-cases:
>>> - Really urgent, 'I am about to die' messages. Which obviously need to
>>>    be printed out as fast as possible.
>>> - Rather largish, information/logging 'what I always wanted to tell you'
>>>    type of messages. These messages tend to be very large, but at the end
>>>    it doesn't really matter _when_ they'll be printed as they are
>>>    time-stamped anyway.
>>>
>>
>> Actually, there are 3 types of messages:
>>
>>  1. Urgent: I'm going to die.
>>  2. information/logging.
>>  3. Trace.
>>
> 
> If we do all this stuff, can we also try to clean up earlyprintk a
> bit?  The whole earlyconsole mechanism is a mess, and switching over
> to the non-early console is only somewhat functional.  I'd love to see
> this all simplified: before there's any console at all available, just
> buffer messages.  Then, when a console shows up, write the buffer out.
> Then earlyprintk can work just like regular printk.
> 
Begging the question how one would debug failures during that time.
The current earlyprintk stuff is at least able to print out _something_,
so that you have some idea what went wrong.
Without that things will become _really_ hard during board bringup.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		               zSeries & Storage
hare@suse.com			               +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21  1:16     ` Andy Lutomirski
  2016-07-21  1:52       ` Wangnan (F)
  2016-07-21  5:59       ` Hannes Reinecke
@ 2016-07-21 10:28       ` David Woodhouse
  2 siblings, 0 replies; 60+ messages in thread
From: David Woodhouse @ 2016-07-21 10:28 UTC (permalink / raw)
  To: Andy Lutomirski, Wangnan (F); +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 968 bytes --]

On Wed, 2016-07-20 at 18:16 -0700, Andy Lutomirski wrote:
> If we do all this stuff, can we also try to clean up earlyprintk a
> bit?  The whole earlyconsole mechanism is a mess, and switching over
> to the non-early console is only somewhat functional.  I'd love to see
> this all simplified: before there's any console at all available, just
> buffer messages.  Then, when a console shows up, write the buffer out.
> Then earlyprintk can work just like regular printk.

That's exactly what we already do. Look at the CON_PRINTBUFFER flag.

I never did understand why "earlyprintk" was introduced as such a nasty
special-case hack in the first place. There were various minor excuses
for doing so, but nothing insurmountable IIRC.

I would *love* to kill it completely and simplify the whole thing.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21  5:59       ` Hannes Reinecke
@ 2016-07-21 10:31         ` David Woodhouse
  2016-07-21 11:19           ` Josh Triplett
  2016-07-21 15:05           ` Andy Lutomirski
  0 siblings, 2 replies; 60+ messages in thread
From: David Woodhouse @ 2016-07-21 10:31 UTC (permalink / raw)
  To: Hannes Reinecke, Andy Lutomirski, Wangnan (F); +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1216 bytes --]

On Thu, 2016-07-21 at 07:59 +0200, Hannes Reinecke wrote:
> > If we do all this stuff, can we also try to clean up earlyprintk a
> > bit?  The whole earlyconsole mechanism is a mess, and switching over
> > to the non-early console is only somewhat functional.  I'd love to see
> > this all simplified: before there's any console at all available, just
> > buffer messages.  Then, when a console shows up, write the buffer out.
> > Then earlyprintk can work just like regular printk.
> > 
> Begging the question how one would debug failures during that time.
> The current earlyprintk stuff is at least able to print out _something_,
> so that you have some idea what went wrong.
> Without that things will become _really_ hard during board bringup.

No, you don't actually *delay* the console output. You still register
the console as soon as you physically can, but you just don't have all
that nasty special-casing of the "earlyprintk" nonsense; you just make
sure the standard console output is capable of working as early as you
need it to.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  8:02         ` Hannes Reinecke
  2016-07-19  8:23           ` Sergey Senozhatsky
@ 2016-07-21 10:36           ` David Woodhouse
  2016-07-21 12:31             ` Jan Kara
  2016-07-28  2:55             ` Steven Rostedt
  1 sibling, 2 replies; 60+ messages in thread
From: David Woodhouse @ 2016-07-21 10:36 UTC (permalink / raw)
  To: Hannes Reinecke, Sergey Senozhatsky; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1547 bytes --]

On Tue, 2016-07-19 at 10:02 +0200, Hannes Reinecke wrote:
> > so why would the system die before we console_unlock()?
> > 
> Errm.
> Because it doesn't have any other chance?
> Like, hard lockup?
> Power down?
> Hardware dead?
> 
> Slightly puzzled,

Right. This was exactly the kind of hang I was chasing shortly before
last year's KS — an interrupt storm killing the box (because of a
tendency in network drivers to return IRQ_HANDLED and not really care
if we *had* done so, which ISTR arguing with DaveM about separately).

Sometimes you don't get a nice clean panic. Sometimes you just get a
lockup or a hard reset.

Which is also why it doesn't help much to try to use the level of an
individual printk to determine whether it should be synchronous or not.
In this case it was all KERN_DEBUG messages from the network driver,
which I was logging to the serial port so I could see what was
happening... but which weren't making it out the port before the
lockup.

A viable solution to fix this might be a 'synchronous' flag on the
console itself — so I could boot with 'console=ttyS0,synchronous' and
get a debuggable system again, Or maybe it would be simpler to have a
system-wide control which makes all consoles synchronous, if that's
easier. Either way, we do need the option, and we need it to apply to
*all* output, not just KERN_EMERG messages.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 10:31         ` David Woodhouse
@ 2016-07-21 11:19           ` Josh Triplett
  2016-07-21 11:59             ` David Woodhouse
  2016-07-21 15:05           ` Andy Lutomirski
  1 sibling, 1 reply; 60+ messages in thread
From: Josh Triplett @ 2016-07-21 11:19 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu, Jul 21, 2016 at 11:31:05AM +0100, David Woodhouse wrote:
> On Thu, 2016-07-21 at 07:59 +0200, Hannes Reinecke wrote:
> > > If we do all this stuff, can we also try to clean up earlyprintk a
> > > bit?  The whole earlyconsole mechanism is a mess, and switching over
> > > to the non-early console is only somewhat functional.  I'd love to see
> > > this all simplified: before there's any console at all available, just
> > > buffer messages.  Then, when a console shows up, write the buffer out.
> > > Then earlyprintk can work just like regular printk.
> > > 
> > Begging the question how one would debug failures during that time.
> > The current earlyprintk stuff is at least able to print out _something_,
> > so that you have some idea what went wrong.
> > Without that things will become _really_ hard during board bringup.
> 
> No, you don't actually *delay* the console output. You still register
> the console as soon as you physically can, but you just don't have all
> that nasty special-casing of the "earlyprintk" nonsense; you just make
> sure the standard console output is capable of working as early as you
> need it to.

I've run into various cases where "console=ttyS0,115200" didn't produce
any serial output at all, but "earlyprintk=ttyS0,115200,keep" did.
Perhaps "earlyprintk" expects less from the serial port than "console"
does, but that suggests that the differences involve more than just how
early the driver can work.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 11:19           ` Josh Triplett
@ 2016-07-21 11:59             ` David Woodhouse
  2016-07-21 14:21               ` Josh Triplett
  2016-08-02 11:59               ` Petr Mladek
  0 siblings, 2 replies; 60+ messages in thread
From: David Woodhouse @ 2016-07-21 11:59 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1309 bytes --]

On Thu, 2016-07-21 at 04:19 -0700, Josh Triplett wrote:
> 
> I've run into various cases where "console=ttyS0,115200" didn't produce
> any serial output at all, but "earlyprintk=ttyS0,115200,keep" did.

Note that those are very different things. Because 'ttyS0' for
earlyprintk really means '16x50 UART at 0x3f8'. While the later one
actually depends on it being discovered by the various options we have
for finding them, some of which are only available quite late.

> Perhaps "earlyprintk" expects less from the serial port than "console"
> does, but that suggests that the differences involve more than just how
> early the driver can work.

There *are* differences, yes. But they are almost entirely gratuitous.
Fundamentally, the only interesting thing is how early the driver can
shovel bits out the port, and thus how early it can be registered.

In the case of serial ports, we can register the console early and just
need to sure it keeps working when the driver is *later* fully
initialised.

It's slightly more fun in cases like the bootx console which actually
does stop working when the graphics hardware is properly set up, but
that's not really hard to cope with in the generic code either; it
doesn't need to be *such* a horrid mess of special handling.

-- 
dwmw2

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 10:36           ` David Woodhouse
@ 2016-07-21 12:31             ` Jan Kara
  2016-07-28  2:55             ` Steven Rostedt
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Kara @ 2016-07-21 12:31 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu 21-07-16 11:36:42, David Woodhouse wrote:
> On Tue, 2016-07-19 at 10:02 +0200, Hannes Reinecke wrote:
> > > so why would the system die before we console_unlock()?
> > > 
> > Errm.
> > Because it doesn't have any other chance?
> > Like, hard lockup?
> > Power down?
> > Hardware dead?
> > 
> > Slightly puzzled,
> 
> Right. This was exactly the kind of hang I was chasing shortly before
> last year's KS — an interrupt storm killing the box (because of a
> tendency in network drivers to return IRQ_HANDLED and not really care
> if we *had* done so, which ISTR arguing with DaveM about separately).
> 
> Sometimes you don't get a nice clean panic. Sometimes you just get a
> lockup or a hard reset.
> 
> Which is also why it doesn't help much to try to use the level of an
> individual printk to determine whether it should be synchronous or not.
> In this case it was all KERN_DEBUG messages from the network driver,
> which I was logging to the serial port so I could see what was
> happening... but which weren't making it out the port before the
> lockup.
> 
> A viable solution to fix this might be a 'synchronous' flag on the
> console itself — so I could boot with 'console=ttyS0,synchronous' and
> get a debuggable system again, Or maybe it would be simpler to have a
> system-wide control which makes all consoles synchronous, if that's
> easier. Either way, we do need the option, and we need it to apply to
> *all* output, not just KERN_EMERG messages.

Yes, and something like this is already implemented in the patchset - you
have /sys/kernel/printk/synchronous tunable and you can switch its value
anytime between 0 and 1 (or specify its value as a kernel parameter) and
the printk behavior changes. So for debugging the patchset already supports
all the necessary tuning.

Of course there are cases where you run a fleet of production machines and
you don't know in advance which and when fails. Then async printk may
somewhat reduce debuggability of this. But the above tunable still gives
reasonable handle to userspace to cater for cases like that - e.g. you can
switch to synchronous printk after the booting is finished and the printk
load is lower or based on whatever other heuristic that you are able to
invent in userspace...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 11:59             ` David Woodhouse
@ 2016-07-21 14:21               ` Josh Triplett
  2016-07-21 14:40                 ` David Woodhouse
  2016-07-28  3:05                 ` Steven Rostedt
  2016-08-02 11:59               ` Petr Mladek
  1 sibling, 2 replies; 60+ messages in thread
From: Josh Triplett @ 2016-07-21 14:21 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu, Jul 21, 2016 at 12:59:18PM +0100, David Woodhouse wrote:
> On Thu, 2016-07-21 at 04:19 -0700, Josh Triplett wrote:
> > I've run into various cases where "console=ttyS0,115200" didn't produce
> > any serial output at all, but "earlyprintk=ttyS0,115200,keep" did.
> 
> Note that those are very different things. Because 'ttyS0' for
> earlyprintk really means '16x50 UART at 0x3f8'. While the later one
> actually depends on it being discovered by the various options we have
> for finding them, some of which are only available quite late.

Given the discussion in this thread about wanting to eliminate
earlyprintk in favor of a unified mechanism that works as early as
possible, how would this case work on the system I described above where
earlyprintk works and console doesn't?

- Josh Triplett

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 14:21               ` Josh Triplett
@ 2016-07-21 14:40                 ` David Woodhouse
  2016-07-28  3:05                 ` Steven Rostedt
  1 sibling, 0 replies; 60+ messages in thread
From: David Woodhouse @ 2016-07-21 14:40 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1019 bytes --]

On Thu, 2016-07-21 at 07:21 -0700, Josh Triplett wrote:
> On Thu, Jul 21, 2016 at 12:59:18PM +0100, David Woodhouse wrote:
> > On Thu, 2016-07-21 at 04:19 -0700, Josh Triplett wrote:
> > > I've run into various cases where "console=ttyS0,115200" didn't produce
> > > any serial output at all, but "earlyprintk=ttyS0,115200,keep" did.
> > 
> > Note that those are very different things. Because 'ttyS0' for
> > earlyprintk really means '16x50 UART at 0x3f8'. While the later one
> > actually depends on it being discovered by the various options we have
> > for finding them, some of which are only available quite late.
> 
> Given the discussion in this thread about wanting to eliminate
> earlyprintk in favor of a unified mechanism that works as early as
> possible, how would this case work on the system I described above where
> earlyprintk works and console doesn't?

We'd need to look at why, and fix the driver so that it *does* work.

But that should be a driver-specific issue.

-- 
dwmw2

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 10:31         ` David Woodhouse
  2016-07-21 11:19           ` Josh Triplett
@ 2016-07-21 15:05           ` Andy Lutomirski
  2016-07-26 14:40             ` David Woodhouse
  1 sibling, 1 reply; 60+ messages in thread
From: Andy Lutomirski @ 2016-07-21 15:05 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu, Jul 21, 2016 at 3:31 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2016-07-21 at 07:59 +0200, Hannes Reinecke wrote:
>> > If we do all this stuff, can we also try to clean up earlyprintk a
>> > bit?  The whole earlyconsole mechanism is a mess, and switching over
>> > to the non-early console is only somewhat functional.  I'd love to see
>> > this all simplified: before there's any console at all available, just
>> > buffer messages.  Then, when a console shows up, write the buffer out.
>> > Then earlyprintk can work just like regular printk.
>> >
>> Begging the question how one would debug failures during that time.
>> The current earlyprintk stuff is at least able to print out _something_,
>> so that you have some idea what went wrong.
>> Without that things will become _really_ hard during board bringup.
>
> No, you don't actually *delay* the console output. You still register
> the console as soon as you physically can, but you just don't have all
> that nasty special-casing of the "earlyprintk" nonsense; you just make
> sure the standard console output is capable of working as early as you
> need it to.

Agreed.

If we could convince fbcon to work early (which it really ought to),
then we could get rid of the extra graphical console implementations
floating around, e.g. the mediocre one sitting in the earlyefi console
code.  We could also get rid of the handover problem when we
initialize the "real" efifb -- all we'd have to do is initialize efifb
very, very early, and I see no reason at all that we can't do that.
(Well, I see one silly reason: if we initialize it before PAT, then we
could end up with a UC mapping instead of WC, and I'm not sure that
the x86 PAT code is smart enough to change the mapping type after the
fact.  This could relatively easily be special-cased for efifb by
teaching it to unmap and remap itself after PAT init, or we could try
to initialize PAT earlier.)

But doing this really nicely does tie in to the async printk
discussion for a different reason.  I think that we should always
initialize the obvious "early" consoles that can be cleanly enumerated
early (e.g. EFI).  What we shouldn't do by default is *display*
anything to them unless we crash, because the distro people want a
clean boot to look pretty.  ISTM the condition for flushing the async
printk buffers could be the same as the condition for actually writing
to pre-VT consoles.

--Andy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 15:05           ` Andy Lutomirski
@ 2016-07-26 14:40             ` David Woodhouse
  2016-07-26 15:44               ` Benjamin Herrenschmidt
  2016-07-26 21:00               ` Andy Lutomirski
  0 siblings, 2 replies; 60+ messages in thread
From: David Woodhouse @ 2016-07-26 14:40 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 2523 bytes --]

On Thu, 2016-07-21 at 08:05 -0700, Andy Lutomirski wrote:
> On Thu, Jul 21, 2016 at 3:31 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> > On Thu, 2016-07-21 at 07:59 +0200, Hannes Reinecke wrote:
> > > > If we do all this stuff, can we also try to clean up earlyprintk a
> > > > bit?  The whole earlyconsole mechanism is a mess, and switching over
> > > > to the non-early console is only somewhat functional.  I'd love to see
> > > > this all simplified: before there's any console at all available, just
> > > > buffer messages.  Then, when a console shows up, write the buffer out.
> > > > Then earlyprintk can work just like regular printk.
> > > > 
> > > Begging the question how one would debug failures during that time.
> > > The current earlyprintk stuff is at least able to print out _something_,
> > > so that you have some idea what went wrong.
> > > Without that things will become _really_ hard during board bringup.
> > 
> > No, you don't actually *delay* the console output. You still register
> > the console as soon as you physically can, but you just don't have all
> > that nasty special-casing of the "earlyprintk" nonsense; you just make
> > sure the standard console output is capable of working as early as you
> > need it to.
> 
> Agreed.
> 
> If we could convince fbcon to work early (which it really ought to),

We had bootx on PowerPC which worked from extremely early.

> But doing this really nicely does tie in to the async printk
> discussion for a different reason.  I think that we should always
> initialize the obvious "early" consoles that can be cleanly enumerated
> early (e.g. EFI).  What we shouldn't do by default is *display*
> anything to them unless we crash, because the distro people want a
> clean boot to look pretty.  ISTM the condition for flushing the async
> printk buffers could be the same as the condition for actually writing
> to pre-VT consoles.

Me no like.

That makes even *less* sense for early boot than it does during later
runtime. Because even *more* of the 'we crash' scenarios during early
boot are going to be uncontrolled "oh shit" happenings, not "oh dear,
it looks like we're going to crash so let's helpfully dump out all the
console messages before we do so".

We already have *really* early command line parsing; we can use that to
determine the console loglevel if we need to. No need to be printing
debug messages to the console unconditionally and upsetting the
graphical boot...

-- 
dwmw2

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-26 14:40             ` David Woodhouse
@ 2016-07-26 15:44               ` Benjamin Herrenschmidt
  2016-07-26 21:00               ` Andy Lutomirski
  1 sibling, 0 replies; 60+ messages in thread
From: Benjamin Herrenschmidt @ 2016-07-26 15:44 UTC (permalink / raw)
  To: David Woodhouse, Andy Lutomirski; +Cc: ksummit-discuss

On Tue, 2016-07-26 at 15:40 +0100, David Woodhouse wrote:
> > If we could convince fbcon to work early (which it really ought to),
> 
> We had bootx on PowerPC which worked from extremely early.

We still do ;-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-26 14:40             ` David Woodhouse
  2016-07-26 15:44               ` Benjamin Herrenschmidt
@ 2016-07-26 21:00               ` Andy Lutomirski
  2016-07-27  0:03                 ` David Woodhouse
  1 sibling, 1 reply; 60+ messages in thread
From: Andy Lutomirski @ 2016-07-26 21:00 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Tue, Jul 26, 2016 at 7:40 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2016-07-21 at 08:05 -0700, Andy Lutomirski wrote:
>> On Thu, Jul 21, 2016 at 3:31 AM, David Woodhouse <dwmw2@infradead.org> wrote:
>> > On Thu, 2016-07-21 at 07:59 +0200, Hannes Reinecke wrote:
>> > > > If we do all this stuff, can we also try to clean up earlyprintk a
>> > > > bit?  The whole earlyconsole mechanism is a mess, and switching over
>> > > > to the non-early console is only somewhat functional.  I'd love to see
>> > > > this all simplified: before there's any console at all available, just
>> > > > buffer messages.  Then, when a console shows up, write the buffer out.
>> > > > Then earlyprintk can work just like regular printk.
>> > > >
>> > > Begging the question how one would debug failures during that time.
>> > > The current earlyprintk stuff is at least able to print out _something_,
>> > > so that you have some idea what went wrong.
>> > > Without that things will become _really_ hard during board bringup.
>> >
>> > No, you don't actually *delay* the console output. You still register
>> > the console as soon as you physically can, but you just don't have all
>> > that nasty special-casing of the "earlyprintk" nonsense; you just make
>> > sure the standard console output is capable of working as early as you
>> > need it to.
>>
>> Agreed.
>>
>> If we could convince fbcon to work early (which it really ought to),
>
> We had bootx on PowerPC which worked from extremely early.
>
>> But doing this really nicely does tie in to the async printk
>> discussion for a different reason.  I think that we should always
>> initialize the obvious "early" consoles that can be cleanly enumerated
>> early (e.g. EFI).  What we shouldn't do by default is *display*
>> anything to them unless we crash, because the distro people want a
>> clean boot to look pretty.  ISTM the condition for flushing the async
>> printk buffers could be the same as the condition for actually writing
>> to pre-VT consoles.
>
> Me no like.
>
> That makes even *less* sense for early boot than it does during later
> runtime. Because even *more* of the 'we crash' scenarios during early
> boot are going to be uncontrolled "oh shit" happenings, not "oh dear,
> it looks like we're going to crash so let's helpfully dump out all the
> console messages before we do so".

I only meant in the cases where we're currently silent.  If we boot
with "quiet" and we panic, couldn't we at least try to flush out the
printk buffer onto the early console?  Currently we just sit there
with a blank screen.  (I debugged a case exactly like that last week.)

--Andy

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-26 21:00               ` Andy Lutomirski
@ 2016-07-27  0:03                 ` David Woodhouse
  2016-07-27  1:16                   ` Sergey Senozhatsky
  0 siblings, 1 reply; 60+ messages in thread
From: David Woodhouse @ 2016-07-27  0:03 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 2331 bytes --]

On Tue, 2016-07-26 at 14:00 -0700, Andy Lutomirski wrote:
> 
> >> But doing this really nicely does tie in to the async printk
> >> discussion for a different reason.  I think that we should always
> >> initialize the obvious "early" consoles that can be cleanly enumerated
> >> early (e.g. EFI).  What we shouldn't do by default is *display*
> >> anything to them unless we crash, because the distro people want a
> >> clean boot to look pretty.  ISTM the condition for flushing the async
> >> printk buffers could be the same as the condition for actually writing
> >> to pre-VT consoles.
> >
> > Me no like.
> >
> > That makes even *less* sense for early boot than it does during later
> > runtime. Because even *more* of the 'we crash' scenarios during early
> > boot are going to be uncontrolled "oh shit" happenings, not "oh dear,
> > it looks like we're going to crash so let's helpfully dump out all the
> > console messages before we do so".
> 
> I only meant in the cases where we're currently silent.  If we boot
> with "quiet" and we panic, couldn't we at least try to flush out the
> printk buffer onto the early console?  Currently we just sit there
> with a blank screen.  (I debugged a case exactly like that last week.)

Sure, *that* part makes some sense if we can work out the details.

One of the issues is that you can end up with out-of-order messages —
for example if you've already printed everything above KERN_CRIT as it
happened, and you then go back and re-print the *whole* logbuffer when
you realise you're going to panic. If you *haven't* actually printed
any lines yet then that's kind of OK, but are you really intending to
keep track of that and behave differently if you have?

And if you spew *everything* on panic to a limited output like the
bootx or VGA text-mode console, you might find that the KERN_DEBUG
messages obscure what actually happened, which would be visible if you
only printed KERN_NOTICE and above... so maybe the best option is just
to honour the loglevel on the command line, but to do it *early*.
Heuristics for 'dump the logbuffer when you panic' still aren't a
panacea.

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation

[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5760 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-27  0:03                 ` David Woodhouse
@ 2016-07-27  1:16                   ` Sergey Senozhatsky
  0 siblings, 0 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-27  1:16 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On (07/27/16 01:03), David Woodhouse wrote:
[..]
> And if you spew *everything* on panic to a limited output like the
> bootx or VGA text-mode console, you might find that the KERN_DEBUG
> messages obscure what actually happened, which would be visible if you
> only printed KERN_NOTICE and above...

yes, limited output is a problem on its own. sometimes backtrace is just too
big and even critical messages don't fit. I ended up once forcing panic() to
re-print that "bug at __FILE__, __LINE__, __func__" at the bottom of panic
output. schematically, something like this:

 #define BUG() do { \
-       printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \
-       panic("BUG!"); \
+       panic("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__); \
 } while (0)

and pr_emerg() that string in panic() instead of atomic_notifier_call_chain()
output.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 10:36           ` David Woodhouse
  2016-07-21 12:31             ` Jan Kara
@ 2016-07-28  2:55             ` Steven Rostedt
  1 sibling, 0 replies; 60+ messages in thread
From: Steven Rostedt @ 2016-07-28  2:55 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu, 21 Jul 2016 11:36:42 +0100
David Woodhouse <dwmw2@infradead.org> wrote:


> A viable solution to fix this might be a 'synchronous' flag on the
> console itself — so I could boot with 'console=ttyS0,synchronous' and
> get a debuggable system again, Or maybe it would be simpler to have a
> system-wide control which makes all consoles synchronous, if that's
> easier. Either way, we do need the option, and we need it to apply to
> *all* output, not just KERN_EMERG messages.
> 

For these cases I use, earlyprintk=ttyS0,keep

;-)

Actually, the RT patch has a option to force all printk()s to be
earlyprintk directly. I have a separate patch that has a special
spinlock to keep SMP from stomping all over the output. It's special
because it locks per CPU, if an NMI comes in while the current context
has the lock and is printing, it will corrupt the buffer being printed,
but I don't care at that moment. If NMI is doing printks, its message
will be more important anyway.

-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19 13:55       ` Jan Kara
@ 2016-07-28  2:59         ` Steven Rostedt
  2016-07-28  4:12           ` Sergey Senozhatsky
  0 siblings, 1 reply; 60+ messages in thread
From: Steven Rostedt @ 2016-07-28  2:59 UTC (permalink / raw)
  To: Jan Kara; +Cc: ksummit-discuss

On Tue, 19 Jul 2016 15:55:29 +0200
Jan Kara <jack@suse.cz> wrote:

> Well, he backed up from that statement later a bit. So when I was
> discussing this problem with him at last KS he agreed that making printk
> async is the way to go and he specifically dismissed attempts to print
> synchronously for a while and fall back to async only later. He just wanted
> some way to get things to console synchronously once things go bad (oops,
> panic,...) which is discussed elsewhere in this thread.

Note, my selftests require synchronous printing, so I hope this feature
still allows locations to force that without being in oops mode. The
tracing selftests can cause triple faults, and seeing where it happened
is a must. I already had issues once when printk changed (temporarily)
to only dump out lines when a \n was seen. That caused me to think the
wrong test was rebooting the box.

-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 14:21               ` Josh Triplett
  2016-07-21 14:40                 ` David Woodhouse
@ 2016-07-28  3:05                 ` Steven Rostedt
  1 sibling, 0 replies; 60+ messages in thread
From: Steven Rostedt @ 2016-07-28  3:05 UTC (permalink / raw)
  To: Josh Triplett; +Cc: ksummit-discuss

On Thu, 21 Jul 2016 07:21:28 -0700
Josh Triplett <josh@joshtriplett.org> wrote:
> 
> Given the discussion in this thread about wanting to eliminate
> earlyprintk in favor of a unified mechanism that works as early as
> possible, how would this case work on the system I described above where
> earlyprintk works and console doesn't?

No no no!!! Please do not eliminate earlyprintk(). It is critical to
the PREEMPT_RT project. It requires no sleeping locks, and can be used
in pretty much any context (with slight modifications). printk() is a
beast, and in -rt it calls sleeping locks. That is, if interrupts are
disabled, we prevent printk from print anything. (Note PREEMPT_RT
keeps interrupts enabled much more than the normal kernel does). If a
bug happens where interrupts are disabled, we fall back to earlyprintk
(and tracing), to debug it.

-- Steve

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-28  2:59         ` Steven Rostedt
@ 2016-07-28  4:12           ` Sergey Senozhatsky
  2016-07-28 13:02             ` Steven Rostedt
  0 siblings, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-07-28  4:12 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: ksummit-discuss

Hello Steven,

On (07/27/16 22:59), Steven Rostedt wrote:
> > Well, he backed up from that statement later a bit. So when I was
> > discussing this problem with him at last KS he agreed that making printk
> > async is the way to go and he specifically dismissed attempts to print
> > synchronously for a while and fall back to async only later. He just wanted
> > some way to get things to console synchronously once things go bad (oops,
> > panic,...) which is discussed elsewhere in this thread.
> 
> Note, my selftests require synchronous printing, so I hope this feature
> still allows locations to force that without being in oops mode. The
> tracing selftests can cause triple faults, and seeing where it happened
> is a must. I already had issues once when printk changed (temporarily)
> to only dump out lines when a \n was seen. That caused me to think the
> wrong test was rebooting the box.

if you run the tests from user space, then there is a way to change
printk behavior at will. if from kernel, then... there is no really
nice way, as of yet.

you can do something like (ugly)

	console_lock();
	test();
	console_unlock();

or change console_loglevel to CONSOLE_LOGLEVEL_MOTORMOUTH temporarily

	console_verbose();
	test();

or boot with async printk (if needed) then change to sync mode from
user space and then start the test...

by default printk is in sync (old) mode; we don't want to enforce printk
changes on the systems that have never seen any issues with printk. so unless
you pass aync printk boot param or change printk to async from user space,
it must work just as before.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-28  4:12           ` Sergey Senozhatsky
@ 2016-07-28 13:02             ` Steven Rostedt
  0 siblings, 0 replies; 60+ messages in thread
From: Steven Rostedt @ 2016-07-28 13:02 UTC (permalink / raw)
  To: Sergey Senozhatsky; +Cc: ksummit-discuss

On Thu, 28 Jul 2016 13:12:51 +0900
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:

> Hello Steven,
> 
> On (07/27/16 22:59), Steven Rostedt wrote:
> > > Well, he backed up from that statement later a bit. So when I was
> > > discussing this problem with him at last KS he agreed that making printk
> > > async is the way to go and he specifically dismissed attempts to print
> > > synchronously for a while and fall back to async only later. He just wanted
> > > some way to get things to console synchronously once things go bad (oops,
> > > panic,...) which is discussed elsewhere in this thread.  
> > 
> > Note, my selftests require synchronous printing, so I hope this feature
> > still allows locations to force that without being in oops mode. The
> > tracing selftests can cause triple faults, and seeing where it happened
> > is a must. I already had issues once when printk changed (temporarily)
> > to only dump out lines when a \n was seen. That caused me to think the
> > wrong test was rebooting the box.  
> 
> if you run the tests from user space, then there is a way to change
> printk behavior at will. if from kernel, then... there is no really
> nice way, as of yet.

The tests are run by the kernel at boot up.

> 
> you can do something like (ugly)
> 
> 	console_lock();
> 	test();
> 	console_unlock();
> 
> or change console_loglevel to CONSOLE_LOGLEVEL_MOTORMOUTH temporarily

A temporary force also works.

-- Steve

> 
> 	console_verbose();
> 	test();
> 
> or boot with async printk (if needed) then change to sync mode from
> user space and then start the test...
> 
> by default printk is in sync (old) mode; we don't want to enforce printk
> changes on the systems that have never seen any issues with printk. so unless
> you pass aync printk boot param or change printk to async from user space,
> it must work just as before.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-21 11:59             ` David Woodhouse
  2016-07-21 14:21               ` Josh Triplett
@ 2016-08-02 11:59               ` Petr Mladek
  1 sibling, 0 replies; 60+ messages in thread
From: Petr Mladek @ 2016-08-02 11:59 UTC (permalink / raw)
  To: David Woodhouse; +Cc: ksummit-discuss

On Thu 2016-07-21 12:59:18, David Woodhouse wrote:
> On Thu, 2016-07-21 at 04:19 -0700, Josh Triplett wrote:
> > 
> > I've run into various cases where "console=ttyS0,115200" didn't produce
> > any serial output at all, but "earlyprintk=ttyS0,115200,keep" did.
> 
> Note that those are very different things. Because 'ttyS0' for
> earlyprintk really means '16x50 UART at 0x3f8'. While the later one
> actually depends on it being discovered by the various options we have
> for finding them, some of which are only available quite late.
> 
> > Perhaps "earlyprintk" expects less from the serial port than "console"
> > does, but that suggests that the differences involve more than just how
> > early the driver can work.
> 
> There *are* differences, yes. But they are almost entirely gratuitous.
> Fundamentally, the only interesting thing is how early the driver can
> shovel bits out the port, and thus how early it can be registered.

Also earlyprintk allows to flush the characters directly to the serial
port without any locking. It is racy and hacky but it makes it work
even on very borked system when the normal printk would get blocked.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-07-19  3:47 [Ksummit-discuss] [TECH TOPIC] asynchronous printk Sergey Senozhatsky
                   ` (2 preceding siblings ...)
  2016-07-19 14:45 ` James Bottomley
@ 2016-09-21  4:41 ` Sergey Senozhatsky
  2016-10-31  6:54   ` Sergey Senozhatsky
  3 siblings, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-09-21  4:41 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: Tetsuo Handa, Viresh Kumar, Tejun Heo, Jiri Kosina

Hello,

a small addition to [1]

On (07/19/16 12:47), Sergey Senozhatsky wrote:
> What I have on my list is:
> - synchronous printk()
> - synchronous console_unlock()
> - console semaphore

Petr Mladek came up with a nice summary [2] of the problems that we
discussed (with a small update from my side [3]). So I would like to
add one more issue to the list, that is, however, not related to async
printk: recursive printk calls. There is a tiny set of cases when
printk can correctly handle recursive printk calls, the rest lead to
deadlock. Those problems, may be, not commonly seen, we had two or three
reports via the public mailing list in the past year; and I've seen 4-5
reports in internal bug tracker (not all of those errors were due to
incorrect printk usage. An unfortunate ARM imprecise abort [4] delivered
while the CPU is in printk->console_unlock->try_to_wake_up under the p->pi_lock
is enough to deadlock the system).

There a several ideas how to try to address those issues that I'm aware
of [5],[6],[7] and, I'm quite sure, even more ideas that I'm not aware of.
Would be interesting to discuss it.

[1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-July/002740.html
[2] https://marc.info/?l=linux-kernel&m=147263639304079
[3] https://marc.info/?l=linux-kernel&m=147264799407718&w=2
[4] http://infocenter.arm.com/help/topic/com.arm.doc.faqs/14809.html
[5] https://marc.info/?l=linux-kernel&m=147158843319944
[6] https://marc.info/?l=linux-kernel&m=147183933104154
[7] https://marc.info/?l=linux-kernel&m=145440084915204

> The list of potential attendees
> (not in any particular order, most likely incomplete)
> 
> Jan Kara
> Petr Mladek
> Andrew Morton
> Jiri Kosina
> Tejun Heo
> Hannes Reinecke
> Viresh Kumar
> Steven Rostedt
> Sergey Senozhatsky

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-09-21  4:41 ` Sergey Senozhatsky
@ 2016-10-31  6:54   ` Sergey Senozhatsky
  2016-10-31 13:56     ` Theodore Ts'o
  0 siblings, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-10-31  6:54 UTC (permalink / raw)
  To: ksummit-discuss
  Cc: Tetsuo Handa, Viresh Kumar, James Bottomley, Tejun Heo, Jiri Kosina

Hello,

On (09/21/16 13:41), Sergey Senozhatsky wrote:
> a small addition to [1]
> 
> On (07/19/16 12:47), Sergey Senozhatsky wrote:
> > What I have on my list is:
> > - synchronous printk()
> > - synchronous console_unlock()
> > - console semaphore
> 
> Petr Mladek came up with a nice summary [2] of the problems that we
> discussed (with a small update from my side [3]). So I would like to
> add one more issue to the list, that is, however, not related to async
> printk: recursive printk calls. There is a tiny set of cases when
> printk can correctly handle recursive printk calls, the rest lead to
> deadlock. Those problems, may be, not commonly seen, we had two or three
> reports via the public mailing list in the past year; and I've seen 4-5
> reports in internal bug tracker (not all of those errors were due to
> incorrect printk usage. An unfortunate ARM imprecise abort [4] delivered
> while the CPU is in printk->console_unlock->try_to_wake_up under the p->pi_lock
> is enough to deadlock the system).
> 
> There a several ideas how to try to address those issues that I'm aware
> of [5],[6],[7] and, I'm quite sure, even more ideas that I'm not aware of.
> Would be interesting to discuss it.
> 
> [1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-July/002740.html
> [2] https://marc.info/?l=linux-kernel&m=147263639304079
> [3] https://marc.info/?l=linux-kernel&m=147264799407718&w=2
> [4] http://infocenter.arm.com/help/topic/com.arm.doc.faqs/14809.html
> [5] https://marc.info/?l=linux-kernel&m=147158843319944
> [6] https://marc.info/?l=linux-kernel&m=147183933104154
> [7] https://marc.info/?l=linux-kernel&m=145440084915204


Any chance to do this session on Tuesday?
I prepared some slides; there are some patches circulating
and there are some things to discuss.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-10-31  6:54   ` Sergey Senozhatsky
@ 2016-10-31 13:56     ` Theodore Ts'o
  2016-10-31 13:59       ` Jiri Kosina
  2016-10-31 14:56       ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
  0 siblings, 2 replies; 60+ messages in thread
From: Theodore Ts'o @ 2016-10-31 13:56 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: ksummit-discuss, Tetsuo Handa, Viresh Kumar, James Bottomley,
	Tejun Heo, Jiri Kosina

On Mon, Oct 31, 2016 at 03:54:35PM +0900, Sergey Senozhatsky wrote:
> 
> Any chance to do this session on Tuesday?
> I prepared some slides; there are some patches circulating
> and there are some things to discuss.

Unless there are strong reasons why this needs to be done on Tuesday,
my preference would be to try to schedule tech topics on Wednesday or
Thursday.  But sure, I'm happy to schedule such a topic if folks are
interested.  Let me know what your potential conflicts might be and/or
a preferred time slot.

Cheers,

					- Ted
>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] asynchronous printk
  2016-10-31 13:56     ` Theodore Ts'o
@ 2016-10-31 13:59       ` Jiri Kosina
  2016-10-31 14:56       ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
  1 sibling, 0 replies; 60+ messages in thread
From: Jiri Kosina @ 2016-10-31 13:59 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: ksummit-discuss, Tetsuo Handa, Viresh Kumar, Petr Mladek,
	James Bottomley, Tejun Heo

On Mon, 31 Oct 2016, Theodore Ts'o wrote:

> > Any chance to do this session on Tuesday?
> > I prepared some slides; there are some patches circulating
> > and there are some things to discuss.
> 
> Unless there are strong reasons why this needs to be done on Tuesday,
> my preference would be to try to schedule tech topics on Wednesday or
> Thursday.  But sure, I'm happy to schedule such a topic if folks are
> interested.  Let me know what your potential conflicts might be and/or
> a preferred time slot.

(speaking for Petr Mladek, who is currently travelling): please don't 
schedule this for wednesday afternoon, as that'd create a LPC miniconf 
conflict.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk)
  2016-10-31 13:56     ` Theodore Ts'o
  2016-10-31 13:59       ` Jiri Kosina
@ 2016-10-31 14:56       ` Sergey Senozhatsky
  2016-10-31 16:18         ` Theodore Ts'o
  2016-11-01 17:50         ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
  1 sibling, 2 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-10-31 14:56 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: ksummit-discuss, Tetsuo Handa, Viresh Kumar, James Bottomley,
	Tejun Heo, Jiri Kosina

Hello,

On (10/31/16 09:56), Theodore Ts'o wrote:
> On Mon, Oct 31, 2016 at 03:54:35PM +0900, Sergey Senozhatsky wrote:
> > 
> > Any chance to do this session on Tuesday?
> > I prepared some slides; there are some patches circulating
> > and there are some things to discuss.
> 
> Unless there are strong reasons why this needs to be done on Tuesday,
> my preference would be to try to schedule tech topics on Wednesday or
> Thursday.  But sure, I'm happy to schedule such a topic if folks are
> interested.  Let me know what your potential conflicts might be and/or
> a preferred time slot.

ok, thanks. I personally will have some conflicts on Wed - "Open
panel for discussing MM problems" and some of LPC microconfs.

and I'd love to change the topic name since async printk is just
25% of the issue we want to discuss. the list I'm looking at now
is as follows:

1) deadlocks in printk and printk recursion
2) async printk
3) pr_cont
4) console semaphore scalability and problems

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk)
  2016-10-31 14:56       ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
@ 2016-10-31 16:18         ` Theodore Ts'o
  2016-10-31 18:21           ` Sergey Senozhatsky
  2016-10-31 20:28           ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Jan Kara
  2016-11-01 17:50         ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
  1 sibling, 2 replies; 60+ messages in thread
From: Theodore Ts'o @ 2016-10-31 16:18 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: ksummit-discuss, Tetsuo Handa, Viresh Kumar, James Bottomley,
	Tejun Heo, Jiri Kosina

On Mon, Oct 31, 2016 at 11:56:55PM +0900, Sergey Senozhatsky wrote:
> ok, thanks. I personally will have some conflicts on Wed - "Open
> panel for discussing MM problems" and some of LPC microconfs.

would 10:00am (after the MM panel) work for you and others?

> and I'd love to change the topic name since async printk is just
> 25% of the issue we want to discuss. the list I'm looking at now
> is as follows:
> 
> 1) deadlocks in printk and printk recursion
> 2) async printk
> 3) pr_cont
> 4) console semaphore scalability and problems

So a topic name like "printk improvements" would be better?

     	   	     	     		   	 - Ted

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk)
  2016-10-31 16:18         ` Theodore Ts'o
@ 2016-10-31 18:21           ` Sergey Senozhatsky
  2016-10-31 18:26             ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful Hannes Reinecke
  2016-10-31 20:28           ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Jan Kara
  1 sibling, 1 reply; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-10-31 18:21 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: ksummit-discuss, Tetsuo Handa, Viresh Kumar, James Bottomley,
	Tejun Heo, Jiri Kosina

Hello,

On (10/31/16 12:18), Theodore Ts'o wrote:
> On Mon, Oct 31, 2016 at 11:56:55PM +0900, Sergey Senozhatsky wrote:
> > ok, thanks. I personally will have some conflicts on Wed - "Open
> > panel for discussing MM problems" and some of LPC microconfs.
> 
> would 10:00am (after the MM panel) work for you and others?

speaking for myself only,
if there are no better options, then I'm OK with Wed 10am. I'd like
to be in Sweeney AB starting from 11:20am, but can sacrifice it, if
need be (in case if 10am doesn't work for someone. or if 1hr won't
be enough (which is quite likely)). so let's hear from others.

> > and I'd love to change the topic name since async printk is just
> > 25% of the issue we want to discuss. the list I'm looking at now
> > is as follows:
> > 
> > 1) deadlocks in printk and printk recursion
> > 2) async printk
> > 3) pr_cont
> > 4) console semaphore scalability and problems
> 
> So a topic name like "printk improvements" would be better?

"printk considered harmful" sounds good as well :) but "printk improvements"
can surely work.

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful
  2016-10-31 18:21           ` Sergey Senozhatsky
@ 2016-10-31 18:26             ` Hannes Reinecke
  0 siblings, 0 replies; 60+ messages in thread
From: Hannes Reinecke @ 2016-10-31 18:26 UTC (permalink / raw)
  To: ksummit-discuss

On 10/31/2016 07:21 PM, Sergey Senozhatsky wrote:
> Hello,
>
> On (10/31/16 12:18), Theodore Ts'o wrote:
>> On Mon, Oct 31, 2016 at 11:56:55PM +0900, Sergey Senozhatsky wrote:
>>> ok, thanks. I personally will have some conflicts on Wed - "Open
>>> panel for discussing MM problems" and some of LPC microconfs.
>>
>> would 10:00am (after the MM panel) work for you and others?
>
> speaking for myself only,
> if there are no better options, then I'm OK with Wed 10am. I'd like
> to be in Sweeney AB starting from 11:20am, but can sacrifice it, if
> need be (in case if 10am doesn't work for someone. or if 1hr won't
> be enough (which is quite likely)). so let's hear from others.
>
>>> and I'd love to change the topic name since async printk is just
>>> 25% of the issue we want to discuss. the list I'm looking at now
>>> is as follows:
>>>
>>> 1) deadlocks in printk and printk recursion
>>> 2) async printk
>>> 3) pr_cont
>>> 4) console semaphore scalability and problems
>>
>> So a topic name like "printk improvements" would be better?
>
> "printk considered harmful" sounds good as well :) but "printk improvements"
> can surely work.
>
I'd like to join, too.
Personally I would prefer Tue, but the proposed time on Wed would work 
for me, too.

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk)
  2016-10-31 16:18         ` Theodore Ts'o
  2016-10-31 18:21           ` Sergey Senozhatsky
@ 2016-10-31 20:28           ` Jan Kara
  2016-11-01 12:27             ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful Hannes Reinecke
  1 sibling, 1 reply; 60+ messages in thread
From: Jan Kara @ 2016-10-31 20:28 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: ksummit-discuss, Tetsuo Handa, Viresh Kumar, James Bottomley,
	Tejun Heo, Jiri Kosina

On Mon 31-10-16 12:18:24, Ted Tso wrote:
> On Mon, Oct 31, 2016 at 11:56:55PM +0900, Sergey Senozhatsky wrote:
> > ok, thanks. I personally will have some conflicts on Wed - "Open
> > panel for discussing MM problems" and some of LPC microconfs.
> 
> would 10:00am (after the MM panel) work for you and others?

Works for me.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful
  2016-10-31 20:28           ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Jan Kara
@ 2016-11-01 12:27             ` Hannes Reinecke
  0 siblings, 0 replies; 60+ messages in thread
From: Hannes Reinecke @ 2016-11-01 12:27 UTC (permalink / raw)
  To: ksummit-discuss

On 10/31/2016 09:28 PM, Jan Kara wrote:
> On Mon 31-10-16 12:18:24, Ted Tso wrote:
>> On Mon, Oct 31, 2016 at 11:56:55PM +0900, Sergey Senozhatsky wrote:
>>> ok, thanks. I personally will have some conflicts on Wed - "Open
>>> panel for discussing MM problems" and some of LPC microconfs.
>>
>> would 10:00am (after the MM panel) work for you and others?
>
> Works for me.
>
Yep, that's fine for me, too.

Cheers,

Hannes

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk)
  2016-10-31 14:56       ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
  2016-10-31 16:18         ` Theodore Ts'o
@ 2016-11-01 17:50         ` Sergey Senozhatsky
  1 sibling, 0 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2016-11-01 17:50 UTC (permalink / raw)
  To: ksummit-discuss

Hello,

a small update

On (10/31/16 23:56), Sergey Senozhatsky wrote:
[..]
> 1) deadlocks in printk and printk recursion
> 2) async printk
> 3) pr_cont
> 4) console semaphore scalability and problems

the session is now scheduled (thanks Ted):


 Title: Printk improvements
 Microconference: Kernel Summit Track
 Room: Coronado
 Time: 10:00am

	-ss

^ permalink raw reply	[flat|nested] 60+ messages in thread

end of thread, other threads:[~2016-11-01 17:51 UTC | newest]

Thread overview: 60+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-19  3:47 [Ksummit-discuss] [TECH TOPIC] asynchronous printk Sergey Senozhatsky
2016-07-19  3:56 ` Viresh Kumar
2016-07-19  6:17 ` Hannes Reinecke
2016-07-19  6:49   ` Josh Triplett
2016-07-19  7:02     ` Hannes Reinecke
2016-07-19  7:11       ` Geert Uytterhoeven
2016-07-20  6:02         ` Jan Kara
2016-07-20 22:54       ` Josh Triplett
2016-07-21  0:46         ` Sergey Senozhatsky
2016-07-21  1:12           ` Josh Triplett
2016-07-19  7:33   ` Sergey Senozhatsky
2016-07-19  7:38     ` Hannes Reinecke
2016-07-19  7:46       ` Sergey Senozhatsky
2016-07-19  8:02         ` Hannes Reinecke
2016-07-19  8:23           ` Sergey Senozhatsky
2016-07-21 10:36           ` David Woodhouse
2016-07-21 12:31             ` Jan Kara
2016-07-28  2:55             ` Steven Rostedt
2016-07-20  6:09       ` Jan Kara
2016-07-19  7:46   ` Christian Borntraeger
2016-07-19  7:53     ` Christian Borntraeger
2016-07-19 13:55       ` Jan Kara
2016-07-28  2:59         ` Steven Rostedt
2016-07-28  4:12           ` Sergey Senozhatsky
2016-07-28 13:02             ` Steven Rostedt
2016-07-20  3:35   ` Wangnan (F)
2016-07-21  1:16     ` Andy Lutomirski
2016-07-21  1:52       ` Wangnan (F)
2016-07-21  5:59       ` Hannes Reinecke
2016-07-21 10:31         ` David Woodhouse
2016-07-21 11:19           ` Josh Triplett
2016-07-21 11:59             ` David Woodhouse
2016-07-21 14:21               ` Josh Triplett
2016-07-21 14:40                 ` David Woodhouse
2016-07-28  3:05                 ` Steven Rostedt
2016-08-02 11:59               ` Petr Mladek
2016-07-21 15:05           ` Andy Lutomirski
2016-07-26 14:40             ` David Woodhouse
2016-07-26 15:44               ` Benjamin Herrenschmidt
2016-07-26 21:00               ` Andy Lutomirski
2016-07-27  0:03                 ` David Woodhouse
2016-07-27  1:16                   ` Sergey Senozhatsky
2016-07-21 10:28       ` David Woodhouse
2016-07-19 14:45 ` James Bottomley
2016-07-19 14:55   ` Sergey Senozhatsky
2016-07-19 17:58     ` James Bottomley
2016-07-19 18:24       ` Viresh Kumar
2016-07-20  2:08       ` Sergey Senozhatsky
2016-07-20  6:14     ` Jan Kara
2016-09-21  4:41 ` Sergey Senozhatsky
2016-10-31  6:54   ` Sergey Senozhatsky
2016-10-31 13:56     ` Theodore Ts'o
2016-10-31 13:59       ` Jiri Kosina
2016-10-31 14:56       ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky
2016-10-31 16:18         ` Theodore Ts'o
2016-10-31 18:21           ` Sergey Senozhatsky
2016-10-31 18:26             ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful Hannes Reinecke
2016-10-31 20:28           ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Jan Kara
2016-11-01 12:27             ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful Hannes Reinecke
2016-11-01 17:50         ` [Ksummit-discuss] [TECH TOPIC] printk considered harmful (was: [TECH TOPIC] asynchronous printk) Sergey Senozhatsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.