linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Qian Cai <cai@lca.pw>
To: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
	Michal Hocko <mhocko@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	davem@davemloft.net,  netdev@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH] net/skbuff: silence warnings under memory pressure
Date: Wed, 04 Sep 2019 16:42:17 -0400	[thread overview]
Message-ID: <1567629737.5576.87.camel@lca.pw> (raw)
In-Reply-To: <20190904144850.GA8296@tigerII.localdomain>

On Wed, 2019-09-04 at 23:48 +0900, Sergey Senozhatsky wrote:
> On (09/04/19 08:14), Qian Cai wrote:
> > > Plus one more check - waitqueue_active(&log_wait). printk() adds
> > > pending irq_work only if there is a user-space process sleeping on
> > > log_wait and irq_work is not already scheduled. If the syslog is
> > > active or there is noone to wakeup then we don't queue irq_work.
> > 
> > Another possibility for this potential livelock is that those printk() from
> > warn_alloc(), dump_stack() and show_mem() increase the time it needs to
> > process
> > build_skb() allocation failures significantly under memory pressure. As the
> > result, ksoftirqd() could be rescheduled during that time via a different
> > CPU
> > (this is a large x86 NUMA system anyway),
> > 
> > [83605.577256][   C31]  run_ksoftirqd+0x1f/0x40
> > [83605.577256][   C31]  smpboot_thread_fn+0x255/0x440
> > [83605.577256][   C31]  kthread+0x1df/0x200
> > [83605.577256][   C31]  ret_from_fork+0x35/0x40
> 
> Hum hum hum...
> 
> So I can, _probably_, think of several patches.
> 
> First, move wake_up_klogd() back to console_unlock().
> 
> Second, move `printk_pending' out of per-CPU region and make it global.
> So we will have just one printk irq_work scheduled across all CPUs;
> currently we have one irq_work per CPU. I think I sent a patch a long
> long time ago, but we never discussed it, as far as I remember.
> 
> > In addition, those printk() will deal with console drivers or even a
> > networking
> > console, so it is probably not unusual that it could call irq_exit()-
> > __do_softirq() at one point and then this livelock.
> 
> Do you use netcon? Because this, theoretically, can open up one more
> vector. netcon allocates skbs from ->write() path. We call con drivers'
> ->write() from printk_safe context, so should netcon skb allocation
> warn we will scedule one more irq_work on that CPU to flush per-CPU
> printk_safe buffer.
> 
> If this is the case, then we can stop calling console_driver() under
> printk_safe. I sent a patch a while ago, but we agreed to keep the
> things the way they are, fot the time being.
> 
> Let me think more.

To summary, those look to me are all good long-term improvement that would
reduce the likelihood of this kind of livelock in general especially for other
unknown allocations that happen while processing softirqs, but it is still up to
the air if it fixes it 100% in all situations as printk() is going to take more
time and could deal with console hardware that involve irq_exit() anyway.

On the other hand, adding __GPF_NOWARN in the build_skb() allocation will fix
this known NET_TX_SOFTIRQ case which is common when softirqd involved at least
in short-term. It even have a benefit to reduce the overall warn_alloc() noise
out there.

I can resubmit with an update changelog. Does it make any sense?


  parent reply	other threads:[~2019-09-04 20:42 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30 14:57 [PATCH] net/skbuff: silence warnings under memory pressure Qian Cai
2019-08-30 15:11 ` Eric Dumazet
2019-08-30 15:25   ` Qian Cai
2019-08-30 16:15     ` Eric Dumazet
2019-08-30 18:06       ` Qian Cai
2019-09-03 13:22       ` Michal Hocko
2019-09-03 15:42         ` Qian Cai
2019-09-03 18:53           ` Michal Hocko
2019-09-03 21:42             ` Qian Cai
2019-09-04  6:15               ` Michal Hocko
2019-09-04  6:41                 ` Sergey Senozhatsky
2019-09-04  6:54                   ` Michal Hocko
2019-09-04  7:19                     ` Sergey Senozhatsky
2019-09-04  7:43                       ` Sergey Senozhatsky
2019-09-04 12:14                         ` Qian Cai
2019-09-04 14:48                           ` Sergey Senozhatsky
2019-09-04 15:07                             ` Qian Cai
2019-09-04 20:42                             ` Qian Cai [this message]
2019-09-05  8:32                               ` Eric Dumazet
2019-09-05 14:09                                 ` Qian Cai
2019-09-05 15:06                                   ` Eric Dumazet
2019-09-05 15:14                                   ` Eric Dumazet
2019-09-05 11:32                               ` Sergey Senozhatsky
2019-09-05 16:03                                 ` Qian Cai
2019-09-05 17:14                                   ` Steven Rostedt
2019-09-06  2:50                                     ` Sergey Senozhatsky
2019-09-06  4:32                                   ` Sergey Senozhatsky
2019-09-06 21:17                                     ` Qian Cai
2019-09-05 17:23                                 ` Steven Rostedt
2019-09-06  3:39                                   ` Sergey Senozhatsky
2019-09-06 15:32                                     ` Petr Mladek
2019-09-09  1:10                                       ` Sergey Senozhatsky
2019-09-06 14:55                                 ` Petr Mladek
2019-09-06 19:51                                   ` Sergey Senozhatsky
2019-11-14 17:12                                 ` Qian Cai
2019-11-18 15:27                                   ` Petr Mladek
2019-11-19  0:41                                     ` Sergey Senozhatsky
2019-11-19  9:41                                       ` Petr Mladek
2019-11-19 15:58                                         ` Qian Cai
2019-11-20  1:30                                         ` Sergey Senozhatsky
2019-11-20 16:13                                           ` Petr Mladek
2019-11-21  1:05                                             ` Sergey Senozhatsky
2019-11-21  9:15                                               ` Petr Mladek
2019-09-04  7:00                   ` Sergey Senozhatsky
2019-09-04  8:25                     ` Michal Hocko
2019-09-04 11:59                       ` Qian Cai
2019-09-04 12:07                         ` Michal Hocko
2019-09-04 12:28                           ` Qian Cai
2019-09-04  6:15               ` Michal Hocko
2019-09-02 14:24     ` Vlastimil Babka
2019-09-03 10:23       ` Qian Cai

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1567629737.5576.87.camel@lca.pw \
    --to=cai@lca.pw \
    --cc=davem@davemloft.net \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pmladek@suse.com \
    --cc=rostedt@goodmis.org \
    --cc=sergey.senozhatsky.work@gmail.com \
    --cc=sergey.senozhatsky@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).