From: Matthew Wilcox <willy@infradead.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: Cong Wang <xiyou.wangcong@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>, Mel Gorman <mgorman@suse.de>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd
Date: Tue, 28 Jan 2020 00:30:44 -0800
Message-ID: <20200128083044.GB6615@bombadil.infradead.org>
In-Reply-To: <20200128081712.GA18145@dhcp22.suse.cz>

On Tue, Jan 28, 2020 at 09:17:12AM +0100, Michal Hocko wrote:
> On Mon 27-01-20 11:06:53, Matthew Wilcox wrote:
> > On Mon, Jan 27, 2020 at 04:00:24PM +0100, Michal Hocko wrote:
> > > On Sun 26-01-20 15:39:35, Matthew Wilcox wrote:
> > > > On Sun, Jan 26, 2020 at 11:53:55AM -0800, Cong Wang wrote:
> > > > > I suspect the process gets stuck in the retry loop in try_charge(), as
> > > > > the _shortest_ stacktrace of the perf samples indicated:
> > > > > 
> > > > > cycles:ppp:
> > > > >         ffffffffa72963db mem_cgroup_iter
> > > > >         ffffffffa72980ca mem_cgroup_oom_unlock
> > > > >         ffffffffa7298c15 try_charge
> > > > >         ffffffffa729a886 mem_cgroup_try_charge
> > > > >         ffffffffa720ec03 __add_to_page_cache_locked
> > > > >         ffffffffa720ee3a add_to_page_cache_lru
> > > > >         ffffffffa7312ddb iomap_readpages_actor
> > > > >         ffffffffa73133f7 iomap_apply
> > > > >         ffffffffa73135da iomap_readpages
> > > > >         ffffffffa722062e read_pages
> > > > >         ffffffffa7220b3f __do_page_cache_readahead
> > > > >         ffffffffa7210554 filemap_fault
> > > > >         ffffffffc039e41f __xfs_filemap_fault
> > > > >         ffffffffa724f5e7 __do_fault
> > > > >         ffffffffa724c5f2 __handle_mm_fault
> > > > >         ffffffffa724cbc6 handle_mm_fault
> > > > >         ffffffffa70a313e __do_page_fault
> > > > >         ffffffffa7a00dfe page_fault
> > > > > 
> > > > > But I don't see how it could be, the only possible case is when
> > > > > mem_cgroup_oom() returns OOM_SUCCESS. However I can't
> > > > > find any clue in dmesg pointing to OOM. These processes in the
> > > > > same memcg are either running or sleeping (that is not exiting or
> > > > > coredump'ing), I don't see how and why they could be selected as
> > > > > a victim of OOM killer. I don't see any signal pending either from
> > > > > their /proc/X/status.
> > > > 
> > > > I think this is a situation where we might end up with a genuine deadlock
> > > > if we're not trylocking the pages.  readahead allocates a batch of
> > > > locked pages and adds them to the pagecache.  If it has allocated,
> > > > say, 5 pages, successfully inserted the first three into i_pages, then
> > > > needs to allocate memory to insert the fourth one into i_pages, and
> > > > the process then attempts to migrate the pages which are still locked,
> > > > they will never come unlocked because they haven't yet been submitted
> > > > to the filesystem for reading.
> > > 
> > > Just to make sure I understand. Do you mean this?
> > > lock_page(A)
> > > alloc_pages
> > >   try_to_compact_pages
> > >     compact_zone_order
> > >       compact_zone(MIGRATE_SYNC_LIGHT)
> > >         migrate_pages
> > > 	  unmap_and_move
> > > 	    __unmap_and_move
> > > 	      lock_page(A)
> > 
> > Yes.  There's a little more to it than that, eg slab is involved, but
> > you have it in a nutshell.
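
The recursion in the call chain quoted above can be modelled in userspace. This is only a rough sketch, not kernel code: `page_lock`, `migrate_page()` and `readahead_allocate()` are hypothetical stand-ins for the page lock on A, `__unmap_and_move()` and the readahead path, and a short timeout stands in for a `lock_page()` that would never return.

```python
import threading

# Hypothetical stand-in for the lock on page A. threading.Lock is not
# reentrant, so re-acquiring it from the holder blocks, like a page lock.
page_lock = threading.Lock()


def migrate_page(blocking: bool) -> str:
    """Model __unmap_and_move(): the page lock is needed before migrating."""
    if blocking:
        # lock_page() analogue. A real blocking acquire would sleep forever
        # if the caller already holds the lock; the timeout is an assumption
        # added so that this sketch terminates.
        got = page_lock.acquire(timeout=0.1)
        failure = "would deadlock"
    else:
        # trylock_page() analogue: give up immediately if the lock is held.
        got = page_lock.acquire(blocking=False)
        failure = "skipped"
    if not got:
        return failure
    page_lock.release()
    return "migrated"


def readahead_allocate(blocking: bool) -> str:
    """Model readahead: hold page A locked, then allocate, which compacts."""
    with page_lock:                    # lock_page(A)
        return migrate_page(blocking)  # alloc_pages -> ... -> lock_page(A)


print(readahead_allocate(blocking=True))
print(readahead_allocate(blocking=False))
print(migrate_page(blocking=False))
```

With the blocking variant the holder waits on itself; with the trylock variant the still-locked page is simply skipped and the allocation can make progress.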
> 
> I am not deeply familiar with the readahead code. But is there really a
> high order allocation (order > 1) that would trigger compaction in the
> phase when pages are locked?

Thanks to sl*b, yes:

radix_tree_node    80890 102536    584   28    4 : tunables    0    0    0 : slabdata   3662   3662      0

so it's allocating 4 pages (an order-2 allocation) for a 576-byte node.
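
For anyone wanting to check the arithmetic, here is a small sketch that pulls the relevant columns out of that slabinfo line. It assumes the slabinfo 2.x column order (name, active_objs, num_objs, objsize, objperslab, pagesperslab) and 4 KiB pages; `slab_order()` is a hypothetical helper, not an existing tool.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86-64

# The line quoted above, copied from /proc/slabinfo.
SLABINFO_LINE = (
    "radix_tree_node    80890 102536    584   28    4 : tunables    0    0    0"
    " : slabdata   3662   3662      0"
)


def slab_order(line: str, page_size: int = PAGE_SIZE) -> dict:
    """Return the page order and byte usage of one slab for this cache."""
    fields = line.split()
    # Columns 3..5 are objsize, objperslab and pagesperslab.
    objsize, objperslab, pagesperslab = (int(f) for f in fields[3:6])
    return {
        "name": fields[0],
        # Smallest order whose 2**order pages cover pagesperslab.
        "order": (pagesperslab - 1).bit_length(),
        "used_bytes": objsize * objperslab,
        "slab_bytes": pagesperslab * page_size,
    }


info = slab_order(SLABINFO_LINE)
print(info)
```

For this line, 28 objects of 584 bytes occupy 16352 of the 16384 bytes in a 4-page slab, so each new slab is an order-2 request.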

> Btw. compaction refuses to consider file backed pages when __GFP_FS
> is not present AFAIR.

Ah, that would save us.
