From: Vlastimil Babka <vbabka@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH 4/4] mm: zero-seek shrinkers
Date: Fri, 12 Oct 2018 15:48:52 +0200
Message-ID: <ce593e78-e245-6c89-eb91-2ba61b1be855@suse.cz>
In-Reply-To: <20181009184732.762-5-hannes@cmpxchg.org>

On 10/9/18 8:47 PM, Johannes Weiner wrote:
> The page cache and most shrinkable slab caches hold data that has been
> read from disk, but there are some caches that only cache CPU work,
> such as the dentry and inode caches of procfs and sysfs, as well as
> the subset of radix tree nodes that track non-resident page cache.
> 
> Currently, all these are shrunk at the same rate: using DEFAULT_SEEKS
> for the shrinker's seeks setting tells the reclaim algorithm that for
> every two page cache pages scanned it should scan one slab object.
> 
> This is a bogus setting. A virtual inode that required no IO to create
> is not twice as valuable as a page cache page; shadow cache entries
> with eviction distances beyond the size of memory aren't either.
> 
> In most cases, the behavior in practice is still fine. Such virtual
> caches don't tend to grow and assert themselves aggressively, and
> usually get picked up before they cause problems. But there are
> scenarios where that's not true.
> 
> Our database workloads suffer from two of those. For one, their file
> workingset is several times bigger than available memory, which has
> the kernel aggressively create shadow page cache entries for the
> non-resident parts of it. The workingset code does tell the VM that
> most of these are expendable, but the VM ends up balancing them 2:1 to
> cache pages as per the seeks setting. This is a huge waste of memory.
> 
> These workloads also deal with tens of thousands of open files and use
> /proc for introspection, which ends up growing the proc_inode_cache to
> absurdly large sizes - again at the cost of valuable cache space,
> which isn't a reasonable trade-off, given that proc inodes can be
> re-created without involving the disk.
> 
> This patch implements a "zero-seek" setting for shrinkers that results
> in a target ratio of 0:1 between their objects and IO-backed
> caches. This allows such virtual caches to grow when memory is
> available (they do cache/avoid CPU work after all), but effectively
> disables them as soon as IO-backed objects are under pressure.
> 
> It then switches the shrinkers for procfs and sysfs metadata, as well
> as excess page cache shadow nodes, to the new zero-seek setting.
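
To make the ratio argument above concrete, here is a minimal userspace
sketch of how the seeks value could feed into the per-pass scan target.
It is a model, not the kernel code: scan_target() is a hypothetical
helper, the arithmetic follows my reading of the slab shrink path, and
the "scan half of the freeable objects" branch for seeks == 0 is an
assumption about how the patch handles the new setting.

```c
#include <assert.h>
#include <stdint.h>

/* Value used by shrinkers that state no special preference. */
#define DEFAULT_SEEKS 2

/*
 * Userspace model of the per-pass scan target (hypothetical helper).
 * freeable: number of objects the shrinker reports as freeable
 * priority: reclaim priority, lower means more pressure
 * seeks:    relative cost of recreating an object, 0 = no IO needed
 */
static uint64_t scan_target(uint64_t freeable, int priority, unsigned int seeks)
{
	assert(priority >= 0 && priority < 64);

	/* Assumed zero-seek behavior: trim hard, nothing needs IO to rebuild. */
	if (seeks == 0)
		return freeable / 2;

	/* Scale by reclaim priority, then weight by recreation cost. */
	return (freeable >> priority) * 4 / seeks;
}
```

Under these assumptions, a cache with 1048576 freeable objects at
priority 12 gets a target of 512 objects with DEFAULT_SEEKS, but 524288
with seeks == 0, which is what lets IO-backed caches win under pressure.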

AFAIU procfs and sysfs metadata have exclusive slab caches, while the
shadow nodes share the 'radix_tree_node' cache with non-shadow ones, right?
To avoid fragmentation, it would be better if they also had a separate
cache, since their lifetimes now differ. That is, if it's feasible:
do non-shadow nodes change into shadow nodes and vice versa? I guess
they do, and that would require reallocation in the other cache.
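
The fragmentation concern can be illustrated with a toy userspace model.
This is only a sketch under loud assumptions: the slab size of 8
objects, the strictly alternating allocation pattern (a worst case), and
all function names are made up for illustration and are not taken from
the slab allocator. It counts how many slabs become completely empty
once every short-lived object (think: aggressively reclaimed shadow
nodes) has been freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJS_PER_SLAB 8		/* arbitrary, for illustration only */
#define MAX_PAIRS 64

/*
 * Objects are packed into slabs in allocation order. A slab can be
 * returned to the page allocator only if every object in it was
 * short-lived and has therefore been freed.
 */
static size_t reclaimable_slabs(const bool *is_short, size_t nr_objs)
{
	size_t empty = 0;

	for (size_t base = 0; base < nr_objs; base += OBJS_PER_SLAB) {
		bool all_short = true;

		for (size_t i = base; i < base + OBJS_PER_SLAB && i < nr_objs; i++)
			if (!is_short[i])
				all_short = false;
		if (all_short)
			empty++;
	}
	return empty;
}

/* One shared cache: long- and short-lived objects interleave. */
static size_t shared_cache_reclaimable(size_t nr_pairs)
{
	bool objs[2 * MAX_PAIRS];

	assert(nr_pairs <= MAX_PAIRS);
	for (size_t i = 0; i < 2 * nr_pairs; i++)
		objs[i] = (i % 2 == 1);	/* every other object is short-lived */
	return reclaimable_slabs(objs, 2 * nr_pairs);
}

/* Two caches: each lifetime class gets its own slabs. */
static size_t split_caches_reclaimable(size_t nr_pairs)
{
	bool longs[MAX_PAIRS], shorts[MAX_PAIRS];

	assert(nr_pairs <= MAX_PAIRS);
	for (size_t i = 0; i < nr_pairs; i++) {
		longs[i] = false;
		shorts[i] = true;
	}
	return reclaimable_slabs(longs, nr_pairs) +
	       reclaimable_slabs(shorts, nr_pairs);
}
```

With 32 objects of each kind, the shared cache frees no slabs at all,
because every slab still pins a long-lived object, while the split
layout frees 4 of its 8 slabs. Real allocation patterns are less
regular, so the actual gap would depend on the workload.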

Vlastimil

Thread overview:
2018-10-09 18:47 [PATCH 0/4] mm: workingset & shrinker fixes Johannes Weiner
2018-10-09 18:47 ` [PATCH 1/4] mm: workingset: don't drop refault information prematurely fix Johannes Weiner
2018-10-10  0:55   ` Rik van Riel
2018-10-09 18:47 ` [PATCH 2/4] mm: workingset: use cheaper __inc_lruvec_state in irqsafe node reclaim Johannes Weiner
2018-10-10  0:55   ` Rik van Riel
2018-10-09 18:47 ` [PATCH 3/4] mm: workingset: add vmstat counter for shadow nodes Johannes Weiner
2018-10-09 22:04   ` Andrew Morton
2018-10-10 14:02     ` Johannes Weiner
2018-10-09 22:08   ` Andrew Morton
2018-10-10 15:05     ` Johannes Weiner
2018-10-16  8:49     ` Mel Gorman
2018-10-16 22:27       ` Andrew Morton
2018-10-09 18:47 ` [PATCH 4/4] mm: zero-seek shrinkers Johannes Weiner
2018-10-09 22:15   ` Andrew Morton
2018-10-09 22:17     ` Andrew Morton
2018-10-10  1:03   ` Rik van Riel
2018-10-10 15:15     ` Johannes Weiner
2018-10-12 13:48   ` Vlastimil Babka [this message]