linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Petr Mladek <pmladek@suse.com>, Tejun Heo <tj@kernel.org>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>, Michal Hocko <mhocko@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	rostedt@rostedt.homelinux.com,
	Byungchul Park <byungchul.park@lge.com>,
	Pavel Machek <pavel@ucw.cz>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Date: Wed, 24 Jan 2018 11:11:33 +0900	[thread overview]
Message-ID: <20180124021034.GA651@jagdpanzerIV> (raw)
In-Reply-To: <20180123112436.0c94bc2e@gandalf.local.home>

Hello,

On (01/23/18 11:24), Steven Rostedt wrote:
[..]
> > With WQ we don't lockup the kernel, because we flush printk_safe in
> > preemptible context. And people are very much expected to fix the
> > misbehaving consoles. But that should not be printk_safe problem.
> 
> Right, but now you just made printk safe unreliable to get information
> out, because you need to wait for a schedule to occur, and if there's
> issues, like a deadlock, that thread will never run. And you just lost
> you lockdep splat.

Yes and No.

printk_safe and printk_nmi are unreliable - both need irq_work. That's
why we forcibly flush those buffers in panic(). At least for printk_safe
case, and I'm pretty sure the same stands for printk_nmi, we never said
that we will store all the messages that were printed from unsafe context
(recursion or NMI). The only thing we said - we will try not to deadlock
the system.

Now we are adding one more thing to printk_safe - we will also try not to
lockup the system.

Default printk_safe buffer size might not be enough to store a very large
lockdep splat. And we will report that the buffer is too small and that we
lost some of the lines: "here is what we have, we lost N lines, but at least
we didn't deadlock the system". See f975237b76827956fe13ecfe993a319158e2c303
for more details, it contains a list of recursive-printk deadlock scenarios
that printk_safe was meant to handle.

It is possible and OK to lose messages in printk_safe/printk_nmi

printk_safe_enter_irqsave()
  printk
  printk
  ...
  ...
  printk
  printk
printk_safe_exit_irqrestore()

No flush will take place as long as there is no IRQ on that CPU.
But printk_safe and printk_nmi are solving different problem in
the first place.

> > I'll re-read this one tomorrow. Not quite following it.
> 
> I'll add more capitals next time ;-)

Ha-ha-ha ;)

[..]
> > pintk_safe was designed to be recursive. It was never designed to be
> > used to troubleshoot or debug consoles. But it was designed to be
> > recursive - because that's the sort of the problems it was meant to
> > handle: recursive printks that would otherwise deadlock us. That's why
> > we have it in the first place.
> 
> So printk safe is only triggered when at the same context? If we can
> guarantee that printk safe is triggered only when its because a printk
> is happening at the same context (not because of an interrupt, but
> really at the same context, using my context check), then I'm fine with
> delaying them to a work queue.

printk_safe is for printk recursion only. It happens in the same context
only. When we switch to printk_safe we disable local IRQs, NMIs have their
own printk_nmi thing. And the way we flush printk_safe is mostly recursive.
Because we flush when we know that we will not deadlock [as much as we can;
we can't control any 3rd party locks which might be involved; thus
printk_deferred() usage].

Usually it's something like

   printk
    spin_lock_irqsave(logbuf_lock)
     printk
      spin_lock_irqsave(logbuf_lock) << deadlock

What we have with printk_safe is

  printk
   local_irq_save
   printk_safe_enter
   spin_lock(logbuf_lock)
    printk
     vprintk_safe
      queue irq work
   spin_unlock(logbuf_lock)
   printk_safe_exit
   local_irq_restore
   >>> IRQ work
       printk_safe_flush
        printk
	 spin_lock_irqsave(logbuf_lock)
	 log_store()
	 spin_unlock_irqrestore(logbuf_lock)

So we flush printk_safe ASAP, which usually (unless originally we were
not in IRQ) means that the flush is recursive, but safe - we don't
deadlock.

> That is, if we have this:
> 
> 	printk()
> 		console_lock()
> 			<interrupt>
> 				printk()
> 					add to log buffer
> 		<print irq printk too>
> 		console_unlock();

Right. This is what we have right now. Every time we enable local IRQs in
the console_unlock() printing loop - we flush printk_safe. And that's the
problem.

> 	printk()
> 		console_lock()
> 			<console does a printk>
> 				put in printk safe buffer
> 				trigger work queue
> 		console_unlock()
> 	<work queue>
> 		flush safe buffer
> 		printk()

Right. This is what we will have with WQ. We don't flush printk_safe until
we return from console_unlock(). Because printk() disables preemption for
the duration of console_unlock(), we can't schedule WQ on that CPU. And we
schedule flushing work only on the CPU that has triggered the recursion.

Another thing:

console_lock()
 blah blah
console_unlock()

In this case we will flush printk_safe withing the printing loop.
Immediately. But we don't care - the CPU is preemptible, we don't
lock up the kernel.


> Then I'm fine with that.
> 
> I have to look at the latest code. If this is indeed what we have, then
> I admit I misunderstood the problem you want to solve.
> 
> I only want recursive printks (those that are actually triggered by
> doing a printk) to be allowed to be delayed.
> 
> Make sense?

Please take a look.

	-ss

  reply	other threads:[~2018-01-24  2:11 UTC|newest]

Thread overview: 140+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-10 13:24 [PATCH v5 0/2] printk: Console owner and waiter logic cleanup Petr Mladek
2018-01-10 13:24 ` [PATCH v5 1/2] printk: Add console owner and waiter logic to load balance console writes Petr Mladek
2018-01-10 16:50   ` Steven Rostedt
2018-01-12 16:54   ` Steven Rostedt
2018-01-12 17:11     ` Steven Rostedt
2018-01-17 19:13       ` Rasmus Villemoes
2018-01-17 19:33         ` Steven Rostedt
2018-01-19  9:51         ` Sergey Senozhatsky
2018-01-18 22:03     ` Pavel Machek
2018-01-19  0:20       ` Steven Rostedt
2018-01-17  2:19   ` Byungchul Park
2018-01-17  4:54     ` Byungchul Park
2018-01-17  7:34     ` Byungchul Park
2018-01-17 12:04     ` Petr Mladek
2018-01-18  1:53       ` Byungchul Park
2018-01-18  1:57         ` Byungchul Park
2018-01-18  2:19         ` Steven Rostedt
2018-01-18  4:01           ` Byungchul Park
2018-01-18 15:21             ` Steven Rostedt
2018-01-19  2:37               ` Byungchul Park
2018-01-19  3:27                 ` Steven Rostedt
2018-01-22  2:31                   ` Byungchul Park
2018-01-10 13:24 ` [PATCH v5 2/2] printk: Hide console waiter logic into helpers Petr Mladek
2018-01-10 17:52   ` Steven Rostedt
2018-01-11 12:03     ` Petr Mladek
2018-01-12 15:37       ` Steven Rostedt
2018-01-12 16:08         ` Petr Mladek
2018-01-12 16:36           ` Steven Rostedt
2018-01-15 16:08             ` Petr Mladek
2018-01-16  5:05               ` Sergey Senozhatsky
2018-01-10 14:05 ` [PATCH v5 0/2] printk: Console owner and waiter logic cleanup Tejun Heo
2018-01-10 16:29   ` Petr Mladek
2018-01-10 17:02     ` Tejun Heo
2018-01-10 18:21       ` Peter Zijlstra
2018-01-10 18:30         ` Tejun Heo
2018-01-10 18:41           ` Peter Zijlstra
2018-01-10 19:05             ` Tejun Heo
2018-01-11  5:15         ` Sergey Senozhatsky
2018-01-10 18:22       ` Steven Rostedt
2018-01-10 18:36         ` Tejun Heo
2018-01-10 18:40       ` Mathieu Desnoyers
2018-01-11  7:36         ` Sergey Senozhatsky
2018-01-11 11:24           ` Petr Mladek
2018-01-11 13:19             ` Sergey Senozhatsky
2018-01-24  9:36       ` Peter Zijlstra
2018-01-24 18:46         ` Tejun Heo
2018-05-09  8:58       ` Sergey Senozhatsky
2018-01-10 18:54     ` Steven Rostedt
2018-01-11  5:10     ` Sergey Senozhatsky
2018-01-10 18:05   ` Steven Rostedt
2018-01-10 18:12     ` Tejun Heo
2018-01-10 18:14       ` Tejun Heo
2018-01-10 18:45         ` Steven Rostedt
2018-01-10 18:41       ` Steven Rostedt
2018-01-10 18:57         ` Tejun Heo
2018-01-10 19:17           ` Steven Rostedt
2018-01-10 19:34             ` Tejun Heo
2018-01-10 19:44               ` Steven Rostedt
2018-01-10 22:44                 ` Tejun Heo
2018-01-11  5:35             ` Sergey Senozhatsky
2018-01-11  4:58     ` Sergey Senozhatsky
2018-01-11  9:34       ` Petr Mladek
2018-01-11 10:38         ` Sergey Senozhatsky
2018-01-11 11:50           ` Petr Mladek
2018-01-11 16:29           ` Steven Rostedt
2018-01-12  1:30             ` Steven Rostedt
2018-01-12  2:55               ` Steven Rostedt
2018-01-12  4:20                 ` Steven Rostedt
2018-01-16 19:44                 ` Tejun Heo
2018-01-17  9:12                   ` Petr Mladek
2018-01-17 15:15                     ` Tejun Heo
2018-01-17 17:12                       ` Steven Rostedt
2018-01-17 18:42                         ` Steven Rostedt
2018-01-19 18:20                           ` Steven Rostedt
2018-01-20  7:14                             ` Sergey Senozhatsky
2018-01-20 15:49                               ` Steven Rostedt
2018-01-21 14:15                                 ` Sergey Senozhatsky
2018-01-21 21:04                                   ` Steven Rostedt
2018-01-22  8:56                                     ` Sergey Senozhatsky
2018-01-22 10:28                                       ` Sergey Senozhatsky
2018-01-22 10:36                                         ` Sergey Senozhatsky
2018-01-23  6:40                                   ` Sergey Senozhatsky
2018-01-23  7:05                                     ` Sergey Senozhatsky
2018-01-23  7:31                                     ` Sergey Senozhatsky
2018-01-23 14:56                                     ` Steven Rostedt
2018-01-23 15:21                                       ` Sergey Senozhatsky
2018-01-23 15:41                                         ` Steven Rostedt
2018-01-23 15:43                                           ` Tejun Heo
2018-01-23 16:12                                             ` Sergey Senozhatsky
2018-01-23 16:13                                             ` Steven Rostedt
2018-01-23 17:21                                               ` Tejun Heo
2018-04-23  5:35                                             ` Sergey Senozhatsky
2018-01-23 16:01                                           ` Sergey Senozhatsky
2018-01-23 16:24                                             ` Steven Rostedt
2018-01-24  2:11                                               ` Sergey Senozhatsky [this message]
2018-01-24  2:52                                                 ` Steven Rostedt
2018-01-24  4:44                                                   ` Sergey Senozhatsky
2018-01-23 17:22                                             ` Tejun Heo
2018-01-20 12:19                             ` Tejun Heo
2018-01-20 14:51                               ` Steven Rostedt
2018-01-17 20:05                         ` Tejun Heo
2018-01-18  5:43                           ` Sergey Senozhatsky
2018-01-18 11:51                           ` Petr Mladek
2018-01-18  5:42                         ` Sergey Senozhatsky
2018-01-12  3:12               ` Sergey Senozhatsky
2018-01-12  2:56             ` Sergey Senozhatsky
2018-01-12  3:21               ` Steven Rostedt
2018-01-12 10:05                 ` Sergey Senozhatsky
2018-01-12 12:21                   ` Steven Rostedt
2018-01-12 12:55                     ` Petr Mladek
2018-01-13  7:31                       ` Sergey Senozhatsky
2018-01-15  8:51                         ` Petr Mladek
2018-01-15  9:48                           ` Sergey Senozhatsky
2018-01-16  5:16                           ` Sergey Senozhatsky
2018-01-16  9:08                             ` Petr Mladek
2018-01-15 12:08                       ` Steven Rostedt
2018-01-16  4:51                         ` Sergey Senozhatsky
2018-01-13  7:28                     ` Sergey Senozhatsky
2018-01-15 10:17                       ` Petr Mladek
2018-01-15 11:50                         ` Petr Mladek
2018-01-16  6:10                           ` Sergey Senozhatsky
2018-01-16  9:36                             ` Petr Mladek
2018-01-16 10:10                               ` Sergey Senozhatsky
2018-01-16 16:06                             ` Steven Rostedt
2018-01-16  5:23                         ` Sergey Senozhatsky
2018-01-15 12:06                       ` Steven Rostedt
2018-01-15 14:45                         ` Petr Mladek
2018-01-16  2:23                           ` Sergey Senozhatsky
2018-01-16  4:47                             ` Sergey Senozhatsky
2018-01-16 10:19                               ` Petr Mladek
2018-01-17  2:24                                 ` Sergey Senozhatsky
2018-01-16 15:45                               ` Steven Rostedt
2018-01-17  2:18                                 ` Sergey Senozhatsky
2018-01-17 13:04                                   ` Petr Mladek
2018-01-17 15:24                                     ` Steven Rostedt
2018-01-18  4:31                                     ` Sergey Senozhatsky
2018-01-18 15:22                                       ` Steven Rostedt
2018-01-16 10:13                             ` Petr Mladek
2018-01-17  6:29                               ` Sergey Senozhatsky
2018-01-16  1:46                         ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180124021034.GA651@jagdpanzerIV \
    --to=sergey.senozhatsky.work@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=byungchul.park@lge.com \
    --cc=dave.hansen@intel.com \
    --cc=hannes@cmpxchg.org \
    --cc=jack@suse.cz \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=pavel@ucw.cz \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=rostedt@rostedt.homelinux.com \
    --cc=sergey.senozhatsky@gmail.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).