linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable
@ 2014-01-17 19:25 Vladimir Davydov
  2014-01-17 19:25 ` [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic Vladimir Davydov
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Vladimir Davydov @ 2014-01-17 19:25 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

The name `max_pass' is misleading, because this variable actually keeps
the estimate number of freeable objects, not the maximal number of
objects we can scan in this pass, which can be twice that. Rename it to
reflect its actual meaning.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Glauber Costa <glommer@gmail.com>
---
 mm/vmscan.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index eea668d..31aa997 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -224,15 +224,15 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 	unsigned long freed = 0;
 	unsigned long long delta;
 	long total_scan;
-	long max_pass;
+	long freeable;
 	long nr;
 	long new_nr;
 	int nid = shrinkctl->nid;
 	long batch_size = shrinker->batch ? shrinker->batch
 					  : SHRINK_BATCH;
 
-	max_pass = shrinker->count_objects(shrinker, shrinkctl);
-	if (max_pass == 0)
+	freeable = shrinker->count_objects(shrinker, shrinkctl);
+	if (freeable == 0)
 		return 0;
 
 	/*
@@ -244,14 +244,14 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 
 	total_scan = nr;
 	delta = (4 * nr_pages_scanned) / shrinker->seeks;
-	delta *= max_pass;
+	delta *= freeable;
 	do_div(delta, lru_pages + 1);
 	total_scan += delta;
 	if (total_scan < 0) {
 		printk(KERN_ERR
 		"shrink_slab: %pF negative objects to delete nr=%ld\n",
 		       shrinker->scan_objects, total_scan);
-		total_scan = max_pass;
+		total_scan = freeable;
 	}
 
 	/*
@@ -260,26 +260,26 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 	 * shrinkers to return -1 all the time. This results in a large
 	 * nr being built up so when a shrink that can do some work
 	 * comes along it empties the entire cache due to nr >>>
-	 * max_pass.  This is bad for sustaining a working set in
+	 * freeable. This is bad for sustaining a working set in
 	 * memory.
 	 *
 	 * Hence only allow the shrinker to scan the entire cache when
 	 * a large delta change is calculated directly.
 	 */
-	if (delta < max_pass / 4)
-		total_scan = min(total_scan, max_pass / 2);
+	if (delta < freeable / 4)
+		total_scan = min(total_scan, freeable / 2);
 
 	/*
 	 * Avoid risking looping forever due to too large nr value:
 	 * never try to free more than twice the estimate number of
 	 * freeable entries.
 	 */
-	if (total_scan > max_pass * 2)
-		total_scan = max_pass * 2;
+	if (total_scan > freeable * 2)
+		total_scan = freeable * 2;
 
 	trace_mm_shrink_slab_start(shrinker, shrinkctl, nr,
 				nr_pages_scanned, lru_pages,
-				max_pass, delta, total_scan);
+				freeable, delta, total_scan);
 
 	while (total_scan >= batch_size) {
 		unsigned long ret;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
  2014-01-17 19:25 [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable Vladimir Davydov
@ 2014-01-17 19:25 ` Vladimir Davydov
  2014-02-04 21:58   ` Andrew Morton
  2014-01-17 19:25 ` [PATCH 3/3] mm: vmscan: shrink_slab: do not skip caches with < batch_size objects Vladimir Davydov
  2014-01-21 22:22 ` [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable David Rientjes
  2 siblings, 1 reply; 9+ messages in thread
From: Vladimir Davydov @ 2014-01-17 19:25 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

Each shrinker must define the number of seeks it takes to recreate a
shrinkable cache object. It is used to balance slab reclaim vs page
reclaim: assuming it costs one seek to replace an LRU page, we age equal
percentages of the LRU and ageable caches. So far, everything sounds
clear, but the code implementing this behavior is rather confusing.

First, there is the DEFAULT_SEEKS constant, which equals 2 for some
reason:

  #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */

Most shrinkers define `seeks' to be equal to DEFAULT_SEEKS, some use
DEFAULT_SEEKS*N, and there are a few that totally ignore it. What is
peculiar, dcache and icache shrinkers have seeks=DEFAULT_SEEKS although
recreating an inode typically requires one seek. Does this mean that we
scan twice more inodes than we should?

Actually, no. The point is that vmscan handles DEFAULT_SEEKS as if it
were 1 (`delta' is the number of objects we are going to scan):

  shrink_slab_node():
    delta = (4 * nr_pages_scanned) / shrinker->seeks;
    delta *= freeable;
    do_div(delta, lru_pages + 1);

i.e.

            2 * nr_pages_scanned    DEFAULT_SEEKS
    delta = -------------------- * --------------- * freeable;
                 lru_pages         shrinker->seeks

Here we double the number of pages scanned in order to take into account
moves of on-LRU pages from the inactive list to the active list, which
we do not count in nr_pages_scanned.

That said, shrinker->seeks=DEFAULT_SEEKS*N is equivalent to N seeks, so
why on the hell do we need it?

IMO, the existence of the DEFAULT_SEEKS constant only causes confusion
for both users of the shrinker interface and those trying to understand
how slab shrinking works. The meaning of the `seeks' is perfectly
explained by the comment to it and there is no need in any obscure
constants for using it.

That's why I'm sending this patch which completely removes DEFAULT_SEEKS
and makes all shrinkers use N instead of N*DEFAULT_SEEKS, documenting
the idea lying behind shrink_slab() in the meanwhile.

Unfortunately, there are a few shrinkers that define seeks=1, which is
impossible to transfer to the new interface intact, namely:

  nfsd_reply_cache_shrinker
  ttm_pool_manager::mm_shrink
  ttm_pool_manager::mm_shrink
  dm_bufio_client::shrinker

It seems to me their authors were simply deceived by this mysterious
DEFAULT_SEEKS constant, because I've found no documentation why these
particular caches should be scanned more aggressively than the page and
other slab caches. For them, this patch leaves seeks=1. Thus, it DOES
introduce a functional change: the shrinkers enumerated above will be
scanned twice less intensively than they are now. I do not think that
this will cause any problems though.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Glauber Costa <glommer@gmail.com>
---
 arch/x86/kvm/mmu.c                                 |    2 +-
 drivers/gpu/drm/i915/i915_gem.c                    |    2 +-
 drivers/md/bcache/btree.c                          |    2 +-
 drivers/staging/android/ashmem.c                   |    2 +-
 drivers/staging/android/lowmemorykiller.c          |    2 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_pool.c     |    4 +--
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c    |    2 +-
 fs/ext4/extents_status.c                           |    2 +-
 fs/gfs2/glock.c                                    |    2 +-
 fs/gfs2/quota.c                                    |    2 +-
 fs/mbcache.c                                       |    2 +-
 fs/nfs/super.c                                     |    2 +-
 fs/quota/dquot.c                                   |    2 +-
 fs/super.c                                         |    2 +-
 fs/ubifs/super.c                                   |    2 +-
 fs/xfs/xfs_buf.c                                   |    2 +-
 fs/xfs/xfs_qm.c                                    |    2 +-
 include/linux/shrinker.h                           |    1 -
 mm/huge_memory.c                                   |    2 +-
 mm/vmscan.c                                        |   31 ++++++++++----------
 net/sunrpc/auth.c                                  |    2 +-
 22 files changed, 36 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 40772ef..b092ccc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4445,7 +4445,7 @@ mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 static struct shrinker mmu_shrinker = {
 	.count_objects = mmu_shrink_count,
 	.scan_objects = mmu_shrink_scan,
-	.seeks = DEFAULT_SEEKS * 10,
+	.seeks = 10,
 };
 
 static void mmu_destroy_caches(void)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 76d3d1a..c779221 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4674,7 +4674,7 @@ i915_gem_load(struct drm_device *dev)
 
 	dev_priv->mm.inactive_shrinker.scan_objects = i915_gem_inactive_scan;
 	dev_priv->mm.inactive_shrinker.count_objects = i915_gem_inactive_count;
-	dev_priv->mm.inactive_shrinker.seeks = DEFAULT_SEEKS;
+	dev_priv->mm.inactive_shrinker.seeks = 1;
 	register_shrinker(&dev_priv->mm.inactive_shrinker);
 }
 
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 31bb53f..a359351 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -818,7 +818,7 @@ int bch_btree_cache_alloc(struct cache_set *c)
 
 	c->shrink.count_objects = bch_mca_count;
 	c->shrink.scan_objects = bch_mca_scan;
-	c->shrink.seeks = 4;
+	c->shrink.seeks = 2;
 	c->shrink.batch = c->btree_pages * 2;
 	register_shrinker(&c->shrink);
 
diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index 23948f1..dbb6128 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -470,7 +470,7 @@ static struct shrinker ashmem_shrinker = {
 	 * XXX (dchinner): I wish people would comment on why they need on
 	 * significant changes to the default value here
 	 */
-	.seeks = DEFAULT_SEEKS * 4,
+	.seeks = 4,
 };
 
 static int set_prot_mask(struct ashmem_area *asma, unsigned long prot)
diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
index 6f094b3..0cfd62c 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -173,7 +173,7 @@ static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
 static struct shrinker lowmem_shrinker = {
 	.scan_objects = lowmem_scan,
 	.count_objects = lowmem_count,
-	.seeks = DEFAULT_SEEKS * 16
+	.seeks = 16
 };
 
 static int __init lowmem_init(void)
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
index 0025ee6..29bf615 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
@@ -1394,13 +1394,13 @@ static void ldlm_pools_thread_stop(void)
 static struct shrinker ldlm_pools_srv_shrinker = {
 	.count_objects	= ldlm_pools_srv_count,
 	.scan_objects	= ldlm_pools_srv_scan,
-	.seeks		= DEFAULT_SEEKS,
+	.seeks		= 1,
 };
 
 static struct shrinker ldlm_pools_cli_shrinker = {
 	.count_objects	= ldlm_pools_cli_count,
 	.scan_objects	= ldlm_pools_cli_scan,
-	.seeks		= DEFAULT_SEEKS,
+	.seeks		= 1,
 };
 
 int ldlm_pools_init(void)
diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 212823a..7799321 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -1923,7 +1923,7 @@ int lu_printk_printer(const struct lu_env *env,
 static struct shrinker lu_site_shrinker = {
 	.count_objects	= lu_cache_shrink_count,
 	.scan_objects	= lu_cache_shrink_scan,
-	.seeks 		= DEFAULT_SEEKS,
+	.seeks 		= 1,
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c b/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
index 316103a..6019213 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec_bulk.c
@@ -710,7 +710,7 @@ static inline void enc_pools_free(void)
 static struct shrinker pools_shrinker = {
 	.count_objects	= enc_pools_shrink_count,
 	.scan_objects	= enc_pools_shrink_scan,
-	.seeks		= DEFAULT_SEEKS,
+	.seeks		= 1,
 };
 
 int sptlrpc_enc_pool_init(void)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 3981ff7..2a49cca 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1048,7 +1048,7 @@ void ext4_es_register_shrinker(struct ext4_sb_info *sbi)
 	sbi->s_es_last_sorted = 0;
 	sbi->s_es_shrinker.scan_objects = ext4_es_scan;
 	sbi->s_es_shrinker.count_objects = ext4_es_count;
-	sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
+	sbi->s_es_shrinker.seeks = 1;
 	register_shrinker(&sbi->s_es_shrinker);
 }
 
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 6f7a47c..dfd37c2 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1468,7 +1468,7 @@ static unsigned long gfs2_glock_shrink_count(struct shrinker *shrink,
 }
 
 static struct shrinker glock_shrinker = {
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 	.count_objects = gfs2_glock_shrink_count,
 	.scan_objects = gfs2_glock_shrink_scan,
 };
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 98236d0..0e4ad87 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -149,7 +149,7 @@ static unsigned long gfs2_qd_shrink_count(struct shrinker *shrink,
 struct shrinker gfs2_qd_shrinker = {
 	.count_objects = gfs2_qd_shrink_count,
 	.scan_objects = gfs2_qd_shrink_scan,
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 	.flags = SHRINKER_NUMA_AWARE,
 };
 
diff --git a/fs/mbcache.c b/fs/mbcache.c
index e519e45..a273ebd 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -195,7 +195,7 @@ mb_cache_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 static struct shrinker mb_cache_shrinker = {
 	.count_objects = mb_cache_shrink_count,
 	.scan_objects = mb_cache_shrink_scan,
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 };
 
 /*
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 910ed90..e96a512 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -362,7 +362,7 @@ static void unregister_nfs4_fs(void)
 static struct shrinker acl_shrinker = {
 	.count_objects	= nfs_access_cache_count,
 	.scan_objects	= nfs_access_cache_scan,
-	.seeks		= DEFAULT_SEEKS,
+	.seeks		= 1,
 };
 
 /*
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 831d49a..ff5e0e1 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -718,7 +718,7 @@ dqcache_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 static struct shrinker dqcache_shrinker = {
 	.count_objects = dqcache_shrink_count,
 	.scan_objects = dqcache_shrink_scan,
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 };
 
 /*
diff --git a/fs/super.c b/fs/super.c
index e5f6c2c..6010ac4 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -219,7 +219,7 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
 	s->s_time_gran = 1000000000;
 	s->cleancache_poolid = -1;
 
-	s->s_shrink.seeks = DEFAULT_SEEKS;
+	s->s_shrink.seeks = 1;
 	s->s_shrink.scan_objects = super_cache_scan;
 	s->s_shrink.count_objects = super_cache_count;
 	s->s_shrink.batch = 1024;
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index f69daa5..62a4703 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -51,7 +51,7 @@ struct kmem_cache *ubifs_inode_slab;
 static struct shrinker ubifs_shrinker_info = {
 	.scan_objects = ubifs_shrink_scan,
 	.count_objects = ubifs_shrink_count,
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 };
 
 /**
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index afe7645..bfd982f 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1667,7 +1667,7 @@ xfs_alloc_buftarg(
 
 	btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
 	btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
-	btp->bt_shrinker.seeks = DEFAULT_SEEKS;
+	btp->bt_shrinker.seeks = 1;
 	btp->bt_shrinker.flags = SHRINKER_NUMA_AWARE;
 	register_shrinker(&btp->bt_shrinker);
 	return btp;
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index dd88f0e..4ff8536 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -931,7 +931,7 @@ xfs_qm_init_quotainfo(
 
 	qinf->qi_shrinker.count_objects = xfs_qm_shrink_count;
 	qinf->qi_shrinker.scan_objects = xfs_qm_shrink_scan;
-	qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
+	qinf->qi_shrinker.seeks = 1;
 	qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
 	register_shrinker(&qinf->qi_shrinker);
 	return 0;
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 68c0970..0cd3257 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -60,7 +60,6 @@ struct shrinker {
 	/* objs pending delete, per node */
 	atomic_long_t *nr_deferred;
 };
-#define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
 
 /* Flags */
 #define SHRINKER_NUMA_AWARE (1 << 0)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 95d1acb..0c2379b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -235,7 +235,7 @@ static unsigned long shrink_huge_zero_page_scan(struct shrinker *shrink,
 static struct shrinker huge_zero_page_shrinker = {
 	.count_objects = shrink_huge_zero_page_count,
 	.scan_objects = shrink_huge_zero_page_scan,
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 };
 
 #ifdef CONFIG_SYSFS
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31aa997..f6d716d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -242,11 +242,19 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 	 */
 	nr = atomic_long_xchg(&shrinker->nr_deferred[nid], 0);
 
-	total_scan = nr;
-	delta = (4 * nr_pages_scanned) / shrinker->seeks;
+	/*
+	 * Assuming it costs one seek to replace an LRU page, we age
+	 * equal percentages of the LRU and ageable caches. This should
+	 * balance the seeks generated by these structures.
+	 *
+	 * To scan an LRU page, we have to move it to an inactive list
+	 * first. Take this into account by doubling nr_pages_scanned.
+	 */
+	delta = (2 * nr_pages_scanned) / shrinker->seeks;
 	delta *= freeable;
 	do_div(delta, lru_pages + 1);
-	total_scan += delta;
+
+	total_scan = nr + delta;
 	if (total_scan < 0) {
 		printk(KERN_ERR
 		"shrink_slab: %pF negative objects to delete nr=%ld\n",
@@ -314,19 +322,10 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 /*
  * Call the shrink functions to age shrinkable caches
  *
- * Here we assume it costs one seek to replace a lru page and that it also
- * takes a seek to recreate a cache object.  With this in mind we age equal
- * percentages of the lru and ageable caches.  This should balance the seeks
- * generated by these structures.
- *
- * If the vm encountered mapped pages on the LRU it increase the pressure on
- * slab to avoid swapping.
- *
- * We do weird things to avoid (scanned*seeks*entries) overflowing 32 bits.
- *
- * `lru_pages' represents the number of on-LRU pages in all the zones which
- * are eligible for the caller's allocation attempt.  It is used for balancing
- * slab reclaim versus page reclaim.
+ * `nr_pages_scanned' and `lru_pages' represent the number of scanned pages and
+ * the total number of on-LRU pages in all the zones which are eligible for the
+ * caller's allocation attempt respectively. They are used for balancing slab
+ * reclaim vs page reclaim.
  *
  * Returns the number of slab objects which we shrunk.
  */
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 5285ead..72a1665 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -819,7 +819,7 @@ rpcauth_uptodatecred(struct rpc_task *task)
 static struct shrinker rpc_cred_shrinker = {
 	.count_objects = rpcauth_cache_shrink_count,
 	.scan_objects = rpcauth_cache_shrink_scan,
-	.seeks = DEFAULT_SEEKS,
+	.seeks = 1,
 };
 
 int __init rpcauth_init_module(void)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] mm: vmscan: shrink_slab: do not skip caches with < batch_size objects
  2014-01-17 19:25 [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable Vladimir Davydov
  2014-01-17 19:25 ` [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic Vladimir Davydov
@ 2014-01-17 19:25 ` Vladimir Davydov
  2014-01-21 22:22 ` [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable David Rientjes
  2 siblings, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2014-01-17 19:25 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

In its current implementation, shrink_slab() won't scan caches that have
less than batch_size objects. If there are only a few shrinkers
available, such a behavior won't cause any problems, because the
batch_size is usually small, but if we have a lot of slab shrinkers,
which is perfectly possible since FS shrinkers are now per-superblock,
we can end up with hundreds of megabytes of practically unreclaimable
kmem objects. For instance, mounting a thousand of ext2 FS images with a
hundred of files in each and iterating over all the files using du(1)
will result in about 200 Mb of FS caches that cannot be dropped even
with the aid of the vm.drop_caches sysctl! Fix this.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Glauber Costa <glommer@gmail.com>
---
 mm/vmscan.c |   25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index f6d716d..2e710d4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -275,7 +275,7 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 	 * a large delta change is calculated directly.
 	 */
 	if (delta < freeable / 4)
-		total_scan = min(total_scan, freeable / 2);
+		total_scan = min(total_scan, max(freeable / 2, batch_size));
 
 	/*
 	 * Avoid risking looping forever due to too large nr value:
@@ -289,21 +289,34 @@ shrink_slab_node(struct shrink_control *shrinkctl, struct shrinker *shrinker,
 				nr_pages_scanned, lru_pages,
 				freeable, delta, total_scan);
 
-	while (total_scan >= batch_size) {
+	/*
+	 * To avoid CPU cache thrashing, we should not scan less than
+	 * batch_size objects in one pass, but if the cache has less
+	 * than batch_size objects in total, and we really want to
+	 * shrink them all, go ahead and scan what we have, otherwise
+	 * last batch_size objects will never get reclaimed.
+	 */
+	if (total_scan < batch_size &&
+	    !(freeable < batch_size && total_scan >= freeable))
+		goto out;
+
+	do {
 		unsigned long ret;
+		unsigned long nr_to_scan = min(total_scan, batch_size);
 
-		shrinkctl->nr_to_scan = batch_size;
+		shrinkctl->nr_to_scan = nr_to_scan;
 		ret = shrinker->scan_objects(shrinker, shrinkctl);
 		if (ret == SHRINK_STOP)
 			break;
 		freed += ret;
 
-		count_vm_events(SLABS_SCANNED, batch_size);
-		total_scan -= batch_size;
+		count_vm_events(SLABS_SCANNED, nr_to_scan);
+		total_scan -= nr_to_scan;
 
 		cond_resched();
-	}
+	} while (total_scan >= batch_size);
 
+out:
 	/*
 	 * move the unused scan count back into the shrinker in a
 	 * manner that handles concurrent updates. If we exhausted the
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable
  2014-01-17 19:25 [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable Vladimir Davydov
  2014-01-17 19:25 ` [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic Vladimir Davydov
  2014-01-17 19:25 ` [PATCH 3/3] mm: vmscan: shrink_slab: do not skip caches with < batch_size objects Vladimir Davydov
@ 2014-01-21 22:22 ` David Rientjes
  2014-01-22  6:11   ` Vladimir Davydov
  2 siblings, 1 reply; 9+ messages in thread
From: David Rientjes @ 2014-01-21 22:22 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: akpm, linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1136 bytes --]

On Fri, 17 Jan 2014, Vladimir Davydov wrote:

> The name `max_pass' is misleading, because this variable actually keeps
> the estimate number of freeable objects, not the maximal number of
> objects we can scan in this pass, which can be twice that. Rename it to
> reflect its actual meaning.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Michal Hocko <mhocko@suse.cz>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Dave Chinner <dchinner@redhat.com>
> Cc: Glauber Costa <glommer@gmail.com>

This doesn't compile on linux-next:

mm/vmscan.c: In function ‘shrink_slab_node’:
mm/vmscan.c:300:23: error: ‘max_pass’ undeclared (first use in this function)
mm/vmscan.c:300:23: note: each undeclared identifier is reported only once for each function it appears in

because of b01fa2357bca ("mm: vmscan: shrink all slab objects if tight on 
memory") from an author with a name remarkably similar to yours.  Could 
you rebase this series on top of your previous work that is already in 
-mm?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable
  2014-01-21 22:22 ` [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable David Rientjes
@ 2014-01-22  6:11   ` Vladimir Davydov
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2014-01-22  6:11 UTC (permalink / raw)
  To: David Rientjes
  Cc: akpm, linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

On 01/22/2014 02:22 AM, David Rientjes wrote:
> On Fri, 17 Jan 2014, Vladimir Davydov wrote:
>
>> The name `max_pass' is misleading, because this variable actually keeps
>> the estimate number of freeable objects, not the maximal number of
>> objects we can scan in this pass, which can be twice that. Rename it to
>> reflect its actual meaning.
>>
>> Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Mel Gorman <mgorman@suse.de>
>> Cc: Michal Hocko <mhocko@suse.cz>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Rik van Riel <riel@redhat.com>
>> Cc: Dave Chinner <dchinner@redhat.com>
>> Cc: Glauber Costa <glommer@gmail.com>
> This doesn't compile on linux-next:
>
> mm/vmscan.c: In function ‘shrink_slab_node’:
> mm/vmscan.c:300:23: error: ‘max_pass’ undeclared (first use in this function)
> mm/vmscan.c:300:23: note: each undeclared identifier is reported only once for each function it appears in
>
> because of b01fa2357bca ("mm: vmscan: shrink all slab objects if tight on 
> memory") from an author with a name remarkably similar to yours.

Oh, sorry. I thought it hadn't been committed there yet.

> Could you rebase this series on top of your previous work that is already in -mm?

Sure.

Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
  2014-01-17 19:25 ` [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic Vladimir Davydov
@ 2014-02-04 21:58   ` Andrew Morton
  2014-02-05  7:16     ` Vladimir Davydov
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2014-02-04 21:58 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

On Fri, 17 Jan 2014 23:25:30 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:

> Each shrinker must define the number of seeks it takes to recreate a
> shrinkable cache object. It is used to balance slab reclaim vs page
> reclaim: assuming it costs one seek to replace an LRU page, we age equal
> percentages of the LRU and ageable caches. So far, everything sounds
> clear, but the code implementing this behavior is rather confusing.
> 
> First, there is the DEFAULT_SEEKS constant, which equals 2 for some
> reason:
> 
>   #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
> 
> Most shrinkers define `seeks' to be equal to DEFAULT_SEEKS, some use
> DEFAULT_SEEKS*N, and there are a few that totally ignore it. What is
> peculiar, dcache and icache shrinkers have seeks=DEFAULT_SEEKS although
> recreating an inode typically requires one seek. Does this mean that we
> scan twice more inodes than we should?
> 
> Actually, no. The point is that vmscan handles DEFAULT_SEEKS as if it
> were 1 (`delta' is the number of objects we are going to scan):
> 
>   shrink_slab_node():
>     delta = (4 * nr_pages_scanned) / shrinker->seeks;
>     delta *= freeable;
>     do_div(delta, lru_pages + 1);
> 
> i.e.
> 
>             2 * nr_pages_scanned    DEFAULT_SEEKS
>     delta = -------------------- * --------------- * freeable;
>                  lru_pages         shrinker->seeks
> 
> Here we double the number of pages scanned in order to take into account
> moves of on-LRU pages from the inactive list to the active list, which
> we do not count in nr_pages_scanned.
> 
> That said, shrinker->seeks=DEFAULT_SEEKS*N is equivalent to N seeks, so
> why on the hell do we need it?
> 
> IMO, the existence of the DEFAULT_SEEKS constant only causes confusion
> for both users of the shrinker interface and those trying to understand
> how slab shrinking works. The meaning of the `seeks' is perfectly
> explained by the comment to it and there is no need in any obscure
> constants for using it.
> 
> That's why I'm sending this patch which completely removes DEFAULT_SEEKS
> and makes all shrinkers use N instead of N*DEFAULT_SEEKS, documenting
> the idea lying behind shrink_slab() in the meanwhile.
> 
> Unfortunately, there are a few shrinkers that define seeks=1, which is
> impossible to transfer to the new interface intact, namely:
> 
>   nfsd_reply_cache_shrinker
>   ttm_pool_manager::mm_shrink
>   ttm_pool_manager::mm_shrink
>   dm_bufio_client::shrinker
> 
> It seems to me their authors were simply deceived by this mysterious
> DEFAULT_SEEKS constant, because I've found no documentation why these
> particular caches should be scanned more aggressively than the page and
> other slab caches. For them, this patch leaves seeks=1. Thus, it DOES
> introduce a functional change: the shrinkers enumerated above will be
> scanned twice less intensively than they are now. I do not think that
> this will cause any problems though.
> 

um, yes.  DEFAULT_SEEKS is supposed to be "the number of seeks if you
don't know any better".  Using DEFAULT_SEEKS*n is just daft.

So why did I originally make DEFAULT_SEEKS=2?  Because I figured that to
recreate (say) an inode would require a seek to the inode data then a
seek back.  Is it legitimate to include the
seek-back-to-what-you-were-doing-before seek in the cost of an inode
reclaim?  I guess so...

If a filesystem were to require a seek to the superblock for every
inode read (ok, bad example) then the cost of reestablishing that inode
would be 3.

All that being said, why did you go through and halve everything?  The
cost of reestablishing an ext2 inode should be "2 seeks", but the patch
makes it "1".


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
  2014-02-04 21:58   ` Andrew Morton
@ 2014-02-05  7:16     ` Vladimir Davydov
  2014-02-05 20:52       ` Andrew Morton
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Davydov @ 2014-02-05  7:16 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

On 02/05/2014 01:58 AM, Andrew Morton wrote:
> On Fri, 17 Jan 2014 23:25:30 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:
>
>> Each shrinker must define the number of seeks it takes to recreate a
>> shrinkable cache object. It is used to balance slab reclaim vs page
>> reclaim: assuming it costs one seek to replace an LRU page, we age equal
>> percentages of the LRU and ageable caches. So far, everything sounds
>> clear, but the code implementing this behavior is rather confusing.
>>
>> First, there is the DEFAULT_SEEKS constant, which equals 2 for some
>> reason:
>>
>>   #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
>>
>> Most shrinkers define `seeks' to be equal to DEFAULT_SEEKS, some use
>> DEFAULT_SEEKS*N, and there are a few that totally ignore it. What is
>> peculiar, dcache and icache shrinkers have seeks=DEFAULT_SEEKS although
>> recreating an inode typically requires one seek. Does this mean that we
>> scan twice more inodes than we should?
>>
>> Actually, no. The point is that vmscan handles DEFAULT_SEEKS as if it
>> were 1 (`delta' is the number of objects we are going to scan):
>>
>>   shrink_slab_node():
>>     delta = (4 * nr_pages_scanned) / shrinker->seeks;
>>     delta *= freeable;
>>     do_div(delta, lru_pages + 1);
>>
>> i.e.
>>
>>             2 * nr_pages_scanned    DEFAULT_SEEKS
>>     delta = -------------------- * --------------- * freeable;
>>                  lru_pages         shrinker->seeks
>>
>> Here we double the number of pages scanned in order to take into account
>> moves of on-LRU pages from the inactive list to the active list, which
>> we do not count in nr_pages_scanned.
>>
>> That said, shrinker->seeks=DEFAULT_SEEKS*N is equivalent to N seeks, so
>> why on the hell do we need it?
>>
>> IMO, the existence of the DEFAULT_SEEKS constant only causes confusion
>> for both users of the shrinker interface and those trying to understand
>> how slab shrinking works. The meaning of the `seeks' is perfectly
>> explained by the comment to it and there is no need in any obscure
>> constants for using it.
>>
>> That's why I'm sending this patch which completely removes DEFAULT_SEEKS
>> and makes all shrinkers use N instead of N*DEFAULT_SEEKS, documenting
>> the idea lying behind shrink_slab() in the meanwhile.
>>
>> Unfortunately, there are a few shrinkers that define seeks=1, which is
>> impossible to transfer to the new interface intact, namely:
>>
>>   nfsd_reply_cache_shrinker
>>   ttm_pool_manager::mm_shrink
>>   ttm_pool_manager::mm_shrink
>>   dm_bufio_client::shrinker
>>
>> It seems to me their authors were simply deceived by this mysterious
>> DEFAULT_SEEKS constant, because I've found no documentation why these
>> particular caches should be scanned more aggressively than the page and
>> other slab caches. For them, this patch leaves seeks=1. Thus, it DOES
>> introduce a functional change: the shrinkers enumerated above will be
>> scanned twice less intensively than they are now. I do not think that
>> this will cause any problems though.
>>
> um, yes.  DEFAULT_SEEKS is supposed to be "the number of seeks if you
> don't know any better".  Using DEFAULT_SEEKS*n is just daft.
>
> So why did I originally make DEFAULT_SEEKS=2?  Because I figured that to
> recreate (say) an inode would require a seek to the inode data then a
> seek back.  Is it legitimate to include the
> seek-back-to-what-you-were-doing-before seek in the cost of an inode
> reclaim?  I guess so...

Hmm, that explains this 2. Since we typically don't need to "seek back"
when recreating a cache page, as they are usually read in bunches by
readahead, the number of seeks to bring back a user page is 1, while the
number of seeks to recreate an average inode is 2, right?

Then to scan inodes and user pages so that they would generate
approximately the same number of seeks, we should calculate the number
of objects to scan as follows:

nr_objects_to_scan = nr_pages_scanned / lru_pages *
                                        nr_freeable_objects /
shrinker->seeks

where shrinker->seeks = DEFAULT_SEEKS = 2 for inodes. But currently we
have four times that. I can explain why we should multiply this by 2 -
we do not count pages moving from active to inactive lrus in
nr_pages_scanned, and 2*nr_pages_scanned can be a good approximation for
that - but I have no idea why we multiply it by 4...

Thanks.

>
> If a filesystem were to require a seek to the superblock for every
> inode read (ok, bad example) then the cost of reestablishing that inode
> would be 3.
>
> All that being said, why did you go through and halve everything?  The
> cost of reestablishing an ext2 inode should be "2 seeks", but the patch
> makes it "1".
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
  2014-02-05  7:16     ` Vladimir Davydov
@ 2014-02-05 20:52       ` Andrew Morton
  2014-02-06 18:51         ` Vladimir Davydov
  0 siblings, 1 reply; 9+ messages in thread
From: Andrew Morton @ 2014-02-05 20:52 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

On Wed, 5 Feb 2014 11:16:49 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:

> > So why did I originally make DEFAULT_SEEKS=2?  Because I figured that to
> > recreate (say) an inode would require a seek to the inode data then a
> > seek back.  Is it legitimate to include the
> > seek-back-to-what-you-were-doing-before seek in the cost of an inode
> > reclaim?  I guess so...
> 
> Hmm, that explains this 2. Since we typically don't need to "seek back"
> when recreating a cache page, as they are usually read in bunches by
> readahead, the number of seeks to bring back a user page is 1, while the
> number of seeks to recreate an average inode is 2, right?

Sounds right to me.

> Then to scan inodes and user pages so that they would generate
> approximately the same number of seeks, we should calculate the number
> of objects to scan as follows:
> 
> nr_objects_to_scan = nr_pages_scanned / lru_pages *
>                                         nr_freeable_objects /
> shrinker->seeks
> 
> where shrinker->seeks = DEFAULT_SEEKS = 2 for inodes.

hm, I wonder if we should take the size of the object into account. 
Should we be maximizing (memory-reclaimed / seeks-to-reestablish-it).

> But currently we
> have four times that. I can explain why we should multiply this by 2 -
> we do not count pages moving from active to inactive lrus in
> nr_pages_scanned, and 2*nr_pages_scanned can be a good approximation for
> that - but I have no idea why we multiply it by 4...

I don't understand this code at all:

	total_scan = nr;
	delta = (4 * nr_pages_scanned) / shrinker->seeks;
	delta *= freeable;
	do_div(delta, lru_pages + 1);
	total_scan += delta;

If it actually makes any sense, it sorely sorely needs documentation.

David, you touched it last.  Any hints?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic
  2014-02-05 20:52       ` Andrew Morton
@ 2014-02-06 18:51         ` Vladimir Davydov
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Davydov @ 2014-02-06 18:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, devel, Mel Gorman, Michal Hocko,
	Johannes Weiner, Rik van Riel, Dave Chinner, Glauber Costa

On 02/06/2014 12:52 AM, Andrew Morton wrote:
> On Wed, 5 Feb 2014 11:16:49 +0400 Vladimir Davydov <vdavydov@parallels.com> wrote:
>
>>> So why did I originally make DEFAULT_SEEKS=2?  Because I figured that to
>>> recreate (say) an inode would require a seek to the inode data then a
>>> seek back.  Is it legitimate to include the
>>> seek-back-to-what-you-were-doing-before seek in the cost of an inode
>>> reclaim?  I guess so...
>> Hmm, that explains this 2. Since we typically don't need to "seek back"
>> when recreating a cache page, as they are usually read in bunches by
>> readahead, the number of seeks to bring back a user page is 1, while the
>> number of seeks to recreate an average inode is 2, right?
> Sounds right to me.
>
>> Then to scan inodes and user pages so that they would generate
>> approximately the same number of seeks, we should calculate the number
>> of objects to scan as follows:
>>
>> nr_objects_to_scan = nr_pages_scanned / lru_pages *
>>                                         nr_freeable_objects /
>> shrinker->seeks
>>
>> where shrinker->seeks = DEFAULT_SEEKS = 2 for inodes.
> hm, I wonder if we should take the size of the object into account. 
> Should we be maximizing (memory-reclaimed / seeks-to-reestablish-it).

I'm not sure I understand you quite right. You mean that if two slab
caches have obj sizes 1k and 2k and both of them need 2 seeks to
recreate an object, we should scan the 1k (or 2k?) slab cache more
aggressively than the 2k one? Hmm... I don't know. It depends on what we
want to achieve. But this won't balance the seeks, which is our goal for
now, IIUC.

>> But currently we
>> have four times that. I can explain why we should multiply this by 2 -
>> we do not count pages moving from active to inactive lrus in
>> nr_pages_scanned, and 2*nr_pages_scanned can be a good approximation for
>> that - but I have no idea why we multiply it by 4...
> I don't understand this code at all:
>
> 	total_scan = nr;
> 	delta = (4 * nr_pages_scanned) / shrinker->seeks;
> 	delta *= freeable;
> 	do_div(delta, lru_pages + 1);
> 	total_scan += delta;
>
> If it actually makes any sense, it sorely sorely needs documentation.

To find its roots I had to checkout the linux history tree:

commit c3f4656118a78c1c294e0b4d338ac946265a822b
Author: Andrew Morton <akpm@osdl.org>
Date:   Mon Dec 29 23:48:44 2003 -0800

    [PATCH] shrink_slab acounts for seeks incorrectly
   
    wli points out that shrink_slab inverts the sense of
shrinker->seeks: those
    caches which require more seeks to reestablish an object are shrunk
harder.
    That's wrong - they should be shrunk less.
   
    So fix that up, but scaling the result so that the patch is actually
a no-op
    at this time, because all caches use DEFAULT_SEEKS (2).

diff --git a/mm/vmscan.c b/mm/vmscan.c
index b8594827bbac..f2da3c9fb346 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -154,7 +154,7 @@ static int shrink_slab(long scanned, unsigned int
gfp_mask)
        list_for_each_entry(shrinker, &shrinker_list, list) {
                unsigned long long delta;
 
-               delta = scanned * shrinker->seeks;
+               delta = 4 * (scanned / shrinker->seeks);
                delta *= (*shrinker->shrinker)(0, gfp_mask);
                do_div(delta, pages + 1);
                shrinker->nr += delta;


So the idea seemed to be fixing a bug without introducing any functional
changes. Since then we have been living with this "4", which makes no
sense (?). Nobody complained though.

Thanks.

> David, you touched it last.  Any hints?


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-02-06 18:51 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-17 19:25 [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable Vladimir Davydov
2014-01-17 19:25 ` [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic Vladimir Davydov
2014-02-04 21:58   ` Andrew Morton
2014-02-05  7:16     ` Vladimir Davydov
2014-02-05 20:52       ` Andrew Morton
2014-02-06 18:51         ` Vladimir Davydov
2014-01-17 19:25 ` [PATCH 3/3] mm: vmscan: shrink_slab: do not skip caches with < batch_size objects Vladimir Davydov
2014-01-21 22:22 ` [PATCH 1/3] mm: vmscan: shrink_slab: rename max_pass -> freeable David Rientjes
2014-01-22  6:11   ` Vladimir Davydov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).