All of lore.kernel.org
 help / color / mirror / Atom feed
From: Petr Mladek <pmladek@suse.com>
To: Qian Cai <cai@lca.pw>
Cc: Michal Hocko <mhocko@kernel.org>,
	sergey.senozhatsky.work@gmail.com, rostedt@goodmis.org,
	peterz@infradead.org, linux-mm@kvack.org,
	john.ogness@linutronix.de, akpm@linux-foundation.org,
	david@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()
Date: Tue, 8 Oct 2019 10:15:10 +0200	[thread overview]
Message-ID: <20191008081510.ptwmb7zflqiup5py@pathway.suse.cz> (raw)
In-Reply-To: <1570462407.5576.292.camel@lca.pw>

On Mon 2019-10-07 11:33:27, Qian Cai wrote:
> On Mon, 2019-10-07 at 17:12 +0200, Michal Hocko wrote:
> > On Mon 07-10-19 10:59:10, Qian Cai wrote:
> > [...]
> > > It is almost impossible to eliminate all the indirect call chains from
> > > console_sem/console_owner_lock to zone->lock because it is too normal that
> > > something later needs to allocate some memory dynamically, so as long as it
> > > directly call printk() with zone->lock held, it will be in trouble.
> > 
> > Do you have any example where the console driver really _has_ to
> > allocate. Because I have hard time to believe this is going to work at
> > all as the atomic context doesn't allow to do any memory reclaim and
> > such an allocation would be too easy to fail so the allocation cannot
> > really rely on it.
> 
> I don't know how to explain to you clearly, but let me repeat again one last
> time. There is no necessary for console driver directly to allocate considering
> this example,
> 
> CPU0:              CPU1:    CPU2:       CPU3:
> console_sem->lock                       zone->lock
>                    pi->lock
> pi->lock                    rq_lock
>                    rq->lock
>                             zone->lock
>                                         console_sem->lock

I am curious about CPU2. Does scheduler need to allocate memory?

> Here it only need someone held the rq_lock and allocate some memory. There is
> also true for port_lock. Since the deadlock could involve a lot of CPUs and a
> longer lock chain, it is impossible to predict which one to allocate some memory
> while held a lock could end up with the same problematic lock chain.
> 
> This is just a tip of iceberg to show the lock dependency,
> 
> console_owner --> port_lock_key
> 
> which could easily happen everywhere with a simple printk().

We have got several lockdep reports about possible deadlocks between
console_lock and port_lock caused by printk() called from console
code.

First note that they have been there for years. They were well hidden
until 4.11 released in April 2017. Where the commit
f975237b76827956fe13e ("printk: use printk_safe buffers in printk")
allowed recursive printk() and lockdep.

We believe that these deadlocks are really hard to hit. Console
drivers call printk() only in very critical and rare situations.
This is why nobody invested too much time into fixing these
so far.

There are basically three possibilities:

1. Do crazy exercises with locks all around the kernel to
   avoid the deadlocks. It is usually not worth it. And
   it is a "whack a mole" approach.

2. Use printk_deferred() in problematic code paths. It is
   a "whack a mole" approach as well. And we would end up
   with printk_deferred() used almost everywhere.

3. Always deffer the console handling in printk(). This would
   help also to avoid soft lockups. Several people pushed
   against this last few years because it might reduce
   the chance to see the message in case of system crash.

As I said, there has finally been agreement to always do
the offload few weeks ago. John Ogness is working on it.
So we might have the systematic solution for these deadlocks
rather sooner than later.

Feel free to ask John to CC you on the patches if you want
to help with review.

Best Regards,
Petr

  reply	other threads:[~2019-10-08  8:15 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-04 22:26 [PATCH v2] mm/page_isolation: fix a deadlock with printk() Qian Cai
2019-10-07  8:07 ` Michal Hocko
2019-10-07  9:05   ` Petr Mladek
2019-10-07 11:33     ` Michal Hocko
2019-10-07 12:34     ` Qian Cai
2019-10-07 12:34       ` Qian Cai
2019-10-07 11:04   ` Qian Cai
2019-10-07 11:37     ` Michal Hocko
2019-10-07 12:11       ` Qian Cai
2019-10-07 12:11         ` Qian Cai
2019-10-07 12:43         ` Michal Hocko
2019-10-07 13:07           ` Qian Cai
2019-10-07 13:07             ` Qian Cai
2019-10-07 14:10             ` Petr Mladek
2019-10-07 14:30 ` Petr Mladek
2019-10-07 14:49   ` Michal Hocko
2019-10-08  7:43     ` Petr Mladek
2019-10-08  8:27       ` Michal Hocko
2019-10-08 12:56         ` Christian Borntraeger
2019-10-08 16:08           ` Qian Cai
2019-10-08 16:08             ` Qian Cai
2019-10-08 18:35             ` Michal Hocko
2019-10-08 19:06               ` Qian Cai
2019-10-08 19:06                 ` Qian Cai
2019-10-08 19:17                 ` Michal Hocko
2019-10-08 19:35                   ` Qian Cai
2019-10-08 19:35                     ` Qian Cai
2019-10-09 11:49                     ` Petr Mladek
2019-10-09 13:06                       ` Qian Cai
2019-10-09 13:06                         ` Qian Cai
2019-10-09 13:27                         ` Michal Hocko
2019-10-09 13:43                           ` Qian Cai
2019-10-09 13:43                             ` Qian Cai
2019-10-09 13:51                             ` Michal Hocko
2019-10-09 14:19                               ` Qian Cai
2019-10-09 14:19                                 ` Qian Cai
2019-10-09 14:34                                 ` Michal Hocko
2019-10-09 15:08                                   ` Qian Cai
2019-10-09 15:08                                     ` Qian Cai
2019-10-09 16:23                                     ` Michal Hocko
2019-10-09 16:23                                       ` Michal Hocko
2019-10-10  9:01                                       ` Qian Cai
2019-10-10 10:59                                         ` Michal Hocko
2019-10-10 13:11                                           ` Qian Cai
2019-10-10 13:11                                             ` Qian Cai
2019-10-10 14:18                                             ` Michal Hocko
2019-10-10 14:47                                               ` Qian Cai
2019-10-10 14:47                                                 ` Qian Cai
2019-10-10 17:30                                                 ` Michal Hocko
2019-10-10 17:48                                                   ` Qian Cai
2019-10-10 17:48                                                     ` Qian Cai
2019-10-10 18:06                                                     ` Michal Hocko
2019-10-10 18:59                                                       ` David Hildenbrand
2019-10-09 14:24                             ` Petr Mladek
2019-10-09 14:46                               ` Qian Cai
2019-10-09 14:46                                 ` Qian Cai
2019-10-10  7:57                                 ` Petr Mladek
2019-10-09 11:39                 ` Petr Mladek
2019-10-09 13:56             ` Peter Oberparleiter
2019-10-09 14:26               ` Michal Hocko
2019-10-10  5:12                 ` Sergey Senozhatsky
2019-10-10  7:40                   ` Michal Hocko
2019-10-10  8:16                     ` Sergey Senozhatsky
2019-10-10  8:37                       ` Michal Hocko
2019-10-10  8:21                   ` Petr Mladek
2019-10-10  8:39                     ` Sergey Senozhatsky
2019-10-10 11:11                       ` Petr Mladek
2019-10-09 15:25               ` Qian Cai
2019-10-09 15:25                 ` Qian Cai
2019-10-09 15:25                 ` Qian Cai
2019-10-07 14:59   ` Qian Cai
2019-10-07 14:59     ` Qian Cai
2019-10-07 15:12     ` Michal Hocko
2019-10-07 15:33       ` Qian Cai
2019-10-07 15:33         ` Qian Cai
2019-10-08  8:15         ` Petr Mladek [this message]
2019-10-08  9:32           ` Qian Cai
2019-10-08 13:13           ` Steven Rostedt
2019-10-08 13:23             ` Qian Cai
2019-10-08 13:23               ` Qian Cai
2019-10-08 13:33               ` Steven Rostedt
2019-10-08 13:42               ` Petr Mladek
2019-10-08 13:48                 ` Michal Hocko
2019-10-08 14:03                 ` Qian Cai
2019-10-08 14:03                   ` Qian Cai
2019-10-08 14:08                   ` Michal Hocko
2019-10-08  8:40         ` Michal Hocko
2019-10-08 10:04           ` Qian Cai
2019-10-08 10:39             ` Michal Hocko
2019-10-08 12:00               ` Qian Cai
2019-10-08 12:39                 ` Michal Hocko
2019-10-08 13:06                   ` Qian Cai
2019-10-08 13:06                     ` Qian Cai
2019-10-08 13:37                     ` Michal Hocko
2019-10-08 13:08     ` Petr Mladek
2019-10-08 13:33       ` Qian Cai
2019-10-08 13:33         ` Qian Cai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191008081510.ptwmb7zflqiup5py@pathway.suse.cz \
    --to=pmladek@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=cai@lca.pw \
    --cc=david@redhat.com \
    --cc=john.ogness@linutronix.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.