linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Ogness <john.ogness@linutronix.de>
To: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Daniel Wang <wonderfly@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Alan Cox <gnomes@lxorguk.ukuu.org.uk>,
	Jiri Slaby <jslaby@suse.com>, Peter Feiner <pfeiner@google.com>,
	linux-serial@vger.kernel.org,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Subject: Re: [RFC PATCH v1 00/25] printk: new implementation
Date: Wed, 13 Feb 2019 14:43:48 +0100	[thread overview]
Message-ID: <87d0nv248b.fsf@linutronix.de> (raw)
In-Reply-To: <20190213013101.GA8097@jagdpanzerIV> (Sergey Senozhatsky's message of "Wed, 13 Feb 2019 10:31:01 +0900")

Hi Sergey,

I am glad to see that you are getting involved here. Your previous
talks, work, and discussions were a large part of my research when
preparing for this work.

My response to your comments inline...

On 2019-02-13, Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
> On (02/12/19 15:29), John Ogness wrote:
>>
>> 1. The printk buffer is protected by a global raw spinlock for readers
>>    and writers. This restricts the contexts that are allowed to
>>   access the buffer.
>
> [..]
>
>> 2. Because of #1, NMI and recursive contexts are handled by deferring
>>    logging/printing to a spinlock-safe context. This means that
>>    messages will not be visible if (for example) the kernel dies in
>>    NMI context and the irq_work mechanism does not survive.
>
> panic() calls printk_safe_flush_on_panic(), which iterates all per-CPU
> buffers and moves data to the main logbuf; so then we can flush pending
> logbuf message
>
> 	panic()
> 	  printk_safe_flush_on_panic();
> 	  console_flush_on_panic();

If we are talking about an SMP system where logbuf_lock is locked, the
call chain is actually:

    panic()
      crash_smp_send_stop()
        ... wait for "num_online_cpus() == 1" ...
      printk_safe_flush_on_panic();
      console_flush_on_panic();

Is it guaranteed that the kernel will successfully stop the other CPUs
so that it can print to the console?

And then there is console_flush_on_panic(), which will ignore locks and
write to the consoles, expecting them to check "oops_in_progress" and
ignore their own internal locks.

Is it guaranteed that locks can just be ignored and backtraces will be
seen and legible to the user?

With the proposed emergency messages, panic() can write immediately to
the guaranteed NMI-safe write_atomic console without having to first do
anything with other CPUs (IPIs, NMIs, waiting, whatever) and without
ignoring locks.

> We don't really use irq_work mechanism for that.

Sorry. I didn't mean to imply that panic() uses irq_work. For non-panic
NMI situations irq_work is required (for example, WARN_ON). As an
example, if a WARN_ON occurred but the hardware locked up before the
irq_work engaged, the message is not seen.

Obviously I am talking about rare situations. But these situations do
occur and it is quite misfortunate (and frustrating!) when the kernel
has the important messages ready, but could not get them out.

>> 3. Because of #1, when *not* using features such as PREEMPT_RT, large
>>    latencies exist when printing to slow consoles.
>
> Because of #1? I'm not familiar with PREEMPT_RT; but logbuf spinlock
> should be unlocked while we print messages to slow consoles
> (call_consoles_drivers() is protected by console_sem, not logbuf
> lock).
>
> So it's
>
> 	spin_lock_irqsave(logbuf);
> 	vsprintf();
> 	memcpy();
> 	spin_unlock_irqrestore(logbuf);
>
> 	console_trylock();
> 	for (;;)
> 	    call_console_drivers();
> 	    				// console_owner handover
> 	console_unlock();
>
> Do you see large latencies because of logbuf spinlock?

Sorry. As you point out, this issue is not because of #1. It is because
of the call sequence:

    vprintk_emit()
      preempt_disable()
      console_unlock()
        call_console_drivers()
      preempt_enable()

For slow consoles, this can cause large latencies for some misfortunate
tasks.

>> 5. Printing to consoles is the responsibility of the printk caller
>>    and that caller may be required to print many messages that other
>>    printk callers inserted. Because of this there can be enormous
>>    variance in the runtime of a printk call.
>
> That's complicated. Steven's console_owner handover patch makes
> printk() more fair. We can have "winner takes it all" scenarios,
> but significantly less often, IMO. Do you have any data that
> suggest otherwise?

Steven's console_owner handover was a definite improvement. But as you
said, we can have "winner takes it all" scenarios (or rather "the last
one to leave the bar pays the bill"). This still should not be
acceptable. Let's let some some other _preemptible_ task pay the bill.

I am proposing to change printk so that a single pr_info() call can be
made without the fear that it might have to print thousands of messages
to multiple consoles, all with preemption disabled.

>> 7. Loglevel INFO is handled the same as ERR. There seems to be an
>>    endless effort to get printk to show _all_ messages as quickly as
>>    possible in case of a panic (i.e. printing from any context), but
>>    at the same time try not to have printk be too intrusive for the
>>    callers. These are conflicting requirements that lead to a printk
>>    implementation that does a sub-optimal job of satisfying both
>>    sides.
>
> Per my experience, fully preemptible "print it sometime maybe"
> printk() does not work equally well for everyone.

(I'm also including your previous relevant comment[0].)

> One thing that I have learned is that preemptible printk does not work
> as expected; it wants to be 'atomic' and just stay busy as long as it
> can.
> We tried preemptible printk at Samsung and the result was just bad:
>    preempted printk kthread + slow serial console = lots of lost
> messages

As long as all critical messages are print directly and immediately to
an emergency console, why is it is problem if the informational messages
to consoles are sometimes delayed or lost? And if those informational
messages _are_ so important, there are things the user can do. For
example, create a realtime userspace task to read /dev/kmsg.

> We also had preemptile printk in the upstream kernel and reverted the
> patch (see fd5f7cde1b85d4c8e09); same reasons - we had reports that
> preemptible printk could "stall" for minutes.

But in this case the preemptible task was used for printing critical
tasks as well. Then the stall really is a problem. I am proposing to
rely on emergency consoles for critical messages. By changing printk to
support 2 different channels (emergency and non-emergency), we can focus
on making each of those channels optimal.

This is exactly what point #7 is talking about: How we currently have
only 1 channel to try and satisfy all needs (whether critical or console
noise).

John Ogness

[0] http://lkml.kernel.org/r/20190212154736.GA3160@tigerII.localdomain

  reply	other threads:[~2019-02-13 13:44 UTC|newest]

Thread overview: 147+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-12 14:29 [RFC PATCH v1 00/25] printk: new implementation John Ogness
2019-02-12 14:29 ` [RFC PATCH v1 01/25] printk-rb: add printk ring buffer documentation John Ogness
2019-02-12 14:45   ` Greg Kroah-Hartman
2019-02-12 14:29 ` [RFC PATCH v1 02/25] printk-rb: add prb locking functions John Ogness
2019-02-13 15:45   ` Petr Mladek
2019-02-13 21:39     ` John Ogness
2019-02-14 10:33       ` Petr Mladek
2019-02-14 12:10         ` John Ogness
2019-02-15 10:26           ` Petr Mladek
2019-02-15 10:56             ` John Ogness
2019-03-07  2:12   ` Sergey Senozhatsky
2019-02-12 14:29 ` [RFC PATCH v1 03/25] printk-rb: define ring buffer struct and initializer John Ogness
2019-02-12 14:46   ` Greg Kroah-Hartman
2019-02-14 12:46     ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 04/25] printk-rb: add writer interface John Ogness
2019-02-14 15:16   ` Petr Mladek
2019-02-14 23:36     ` John Ogness
2019-02-15  1:19       ` John Ogness
2019-02-15 13:47       ` Petr Mladek
2019-02-17  1:32         ` John Ogness
2019-02-21 13:51           ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 05/25] printk-rb: add basic non-blocking reading interface John Ogness
2019-02-18 12:54   ` Petr Mladek
2019-02-19 21:44     ` John Ogness
2019-02-21 16:22       ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 06/25] printk-rb: add blocking reader support John Ogness
2019-02-18 14:05   ` Petr Mladek
2019-02-19 21:47     ` John Ogness
2019-02-12 14:29 ` [RFC PATCH v1 07/25] printk-rb: add functionality required by printk John Ogness
2019-02-12 17:15   ` Linus Torvalds
2019-02-13  9:20     ` John Ogness
2019-02-18 15:59   ` Petr Mladek
2019-02-19 22:08     ` John Ogness
2019-02-22  9:58       ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 08/25] printk: add ring buffer and kthread John Ogness
2019-02-12 15:47   ` Sergey Senozhatsky
2019-02-19 13:54   ` Petr Mladek
2019-03-04  7:38   ` Sergey Senozhatsky
2019-03-04 10:00     ` Sergey Senozhatsky
2019-03-04 11:07       ` Sergey Senozhatsky
2019-03-05 21:00         ` John Ogness
2019-03-06 15:57           ` Petr Mladek
2019-03-06 21:17             ` John Ogness
2019-03-06 22:22               ` John Ogness
2019-03-07  6:41                 ` Sergey Senozhatsky
2019-03-07  6:51                   ` Sergey Senozhatsky
2019-03-07 12:50               ` Petr Mladek
2019-03-07  5:15           ` Sergey Senozhatsky
2019-03-11 10:51             ` John Ogness
2019-03-12  9:58               ` Sergey Senozhatsky
2019-03-12 10:30               ` Petr Mladek
2019-03-07 12:06     ` John Ogness
2019-03-08  1:31       ` Sergey Senozhatsky
2019-03-08 10:04         ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 09/25] printk: remove exclusive console hack John Ogness
2019-02-19 14:03   ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 10/25] printk: redirect emit/store to new ringbuffer John Ogness
2019-02-20  9:01   ` Petr Mladek
2019-02-20 21:25     ` John Ogness
2019-02-22 14:43       ` Petr Mladek
2019-02-22 15:06         ` John Ogness
2019-02-22 15:25           ` Petr Mladek
2019-02-25 12:11       ` Petr Mladek
2019-02-25 16:41         ` John Ogness
2019-02-26  9:45           ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 11/25] printk_safe: remove printk safe code John Ogness
2019-02-22 10:37   ` Petr Mladek
2019-02-22 13:38     ` John Ogness
2019-02-22 15:15       ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 12/25] printk: minimize console locking implementation John Ogness
2019-02-25 13:44   ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 13/25] printk: track seq per console John Ogness
2019-02-25 14:59   ` Petr Mladek
2019-02-26  8:45     ` John Ogness
2019-02-26 13:11       ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 14/25] printk: do boot_delay_msec inside printk_delay John Ogness
2019-02-12 14:29 ` [RFC PATCH v1 15/25] printk: print history for new consoles John Ogness
2019-02-26 14:58   ` Petr Mladek
2019-02-26 15:22     ` John Ogness
2019-02-27  9:02       ` Petr Mladek
2019-02-27 10:02         ` John Ogness
2019-02-27 13:12           ` Petr Mladek
2019-03-04  9:24       ` Sergey Senozhatsky
2019-02-12 14:29 ` [RFC PATCH v1 16/25] printk: implement CON_PRINTBUFFER John Ogness
2019-02-26 15:38   ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 17/25] printk: add processor number to output John Ogness
2019-02-13 22:29   ` John Ogness
2019-02-12 14:29 ` [RFC PATCH v1 18/25] console: add write_atomic interface John Ogness
2019-02-12 14:29 ` [RFC PATCH v1 19/25] printk: introduce emergency messages John Ogness
2019-03-07  7:30   ` Sergey Senozhatsky
2019-03-08 10:31     ` Petr Mladek
2019-03-11 12:04       ` John Ogness
2019-03-12  2:51         ` Sergey Senozhatsky
2019-03-12  2:58       ` Sergey Senozhatsky
2019-02-12 14:29 ` [RFC PATCH v1 20/25] serial: 8250: implement write_atomic John Ogness
2019-02-27  9:46   ` Petr Mladek
2019-02-27 10:32     ` John Ogness
2019-02-27 13:55       ` Petr Mladek
2019-03-08  4:05         ` John Ogness
2019-03-08  4:17           ` John Ogness
2019-03-08 10:28           ` Petr Mladek
2019-02-12 14:29 ` [RFC PATCH v1 21/25] printk: implement KERN_CONT John Ogness
2019-02-12 14:30 ` [RFC PATCH v1 22/25] printk: implement /dev/kmsg John Ogness
2019-02-12 14:30 ` [RFC PATCH v1 23/25] printk: implement syslog John Ogness
2019-02-12 14:30 ` [RFC PATCH v1 24/25] printk: implement kmsg_dump John Ogness
2019-02-12 14:30 ` [RFC PATCH v1 25/25] printk: remove unused code John Ogness
2019-03-08 14:02   ` Sebastian Andrzej Siewior
2019-03-11  2:46     ` Sergey Senozhatsky
2019-03-11  8:18       ` Sebastian Andrzej Siewior
2019-03-12  9:38         ` Petr Mladek
2019-02-13  1:31 ` [RFC PATCH v1 00/25] printk: new implementation Sergey Senozhatsky
2019-02-13 13:43   ` John Ogness [this message]
2019-03-04  6:39     ` Sergey Senozhatsky
2019-02-13  1:41 ` Sergey Senozhatsky
2019-02-13 14:15   ` John Ogness
2019-03-04  5:31     ` Sergey Senozhatsky
2019-02-13  2:55 ` Sergey Senozhatsky
2019-02-13 14:43   ` John Ogness
2019-03-04  5:23     ` Sergey Senozhatsky
2019-03-07  9:53       ` John Ogness
2019-03-08 10:00         ` Petr Mladek
2019-03-11 10:54         ` Sergey Senozhatsky
2019-03-12 12:38           ` Petr Mladek
2019-03-12 15:15             ` John Ogness
2019-03-13  2:15               ` Sergey Senozhatsky
2019-03-13  8:19                 ` John Ogness
2019-03-13  8:40                   ` Sebastian Siewior
2019-03-13  9:27                     ` Sergey Senozhatsky
2019-03-13 10:06                       ` Sergey Senozhatsky
2019-03-14  9:27                       ` Petr Mladek
2019-03-13  8:46                   ` Sergey Senozhatsky
2019-03-14  9:14               ` Petr Mladek
2019-03-14  9:35                 ` John Ogness
2019-03-13  2:00             ` Sergey Senozhatsky
2019-02-13 16:54 ` David Laight
2019-02-13 22:20   ` John Ogness
2020-01-20 23:05 ` Eugeniu Rosca
2020-01-21 23:56   ` John Ogness
2020-01-22  2:34     ` Eugeniu Rosca
2020-01-22  7:31       ` Geert Uytterhoeven
2020-01-22 16:58         ` Eugeniu Rosca
2020-01-22 19:48           ` Geert Uytterhoeven
2020-01-24 16:09             ` Eugeniu Rosca
2020-01-27 12:32               ` Petr Mladek
2020-01-27 13:45                 ` Eugeniu Rosca
2020-01-22 10:33       ` John Ogness
2020-01-24 12:13         ` Eugeniu Rosca

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87d0nv248b.fsf@linutronix.de \
    --to=john.ogness@linutronix.de \
    --cc=akpm@linux-foundation.org \
    --cc=gnomes@lxorguk.ukuu.org.uk \
    --cc=gregkh@linuxfoundation.org \
    --cc=jslaby@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-serial@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=pfeiner@google.com \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=torvalds@linux-foundation.org \
    --cc=wonderfly@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).