* [PATCH 00/12] RFC: shrinker API rework and generic LRU lists
@ 2011-08-23  8:56 ` Dave Chinner
  0 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

This series is a work in progress; it:

	- cleans up the shrinker API, fixes a couple of warts and converts all
	  the shrinkers to use it,
	- makes the inode slab cache initialisation consistent across all inode
	  caches,
	- introduces a generic LRU list type and infrastructure,
	- converts the inode cache to use the LRU list infrastructure,
	- converts the xfs buffer cache to use the LRU list infrastructure,
	- converts the dentry cache LRU to per-sb,
	- fixes the dcache select_parent() use-the-LRU-for-disposal abuse,
	- makes the dcache consistent about removing dentries from the LRU
	  before disposing of them,
	- converts the dentry cache to use the LRU list infrastructure.

The basic concept here is to fix the shrinker API to be somewhat
sane and to convert the main slab cache LRUs in the system to use
generic infrastructure. Both the dentry and inode caches use LRU
implementations that are almost-but-not-quite the same. There is no
reason for them to be different - the only real difference is that
the dentry cache LRU has been used directly for disposal purposes
rather than using separate dispose lists.
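
To make the idea concrete, here is a minimal sketch of what such a
generic LRU list type could look like - the names and details here
are illustrative only, not the API the patches actually introduce:

#include <linux/list.h>
#include <linux/spinlock.h>

/* A self-contained LRU: lock, list head and item count kept together. */
struct lru_list {
	spinlock_t		lock;
	struct list_head	list;
	long			nr_items;
};

/* Add an item to the LRU tail if it is not already on a list. */
static bool lru_list_add(struct lru_list *lru, struct list_head *item)
{
	bool added = false;

	spin_lock(&lru->lock);
	if (list_empty(item)) {
		list_add_tail(item, &lru->list);
		lru->nr_items++;
		added = true;
	}
	spin_unlock(&lru->lock);
	return added;
}

/* Remove an item from the LRU if it is currently on it. */
static bool lru_list_del(struct lru_list *lru, struct list_head *item)
{
	bool removed = false;

	spin_lock(&lru->lock);
	if (!list_empty(item)) {
		list_del_init(item);
		lru->nr_items--;
		removed = true;
	}
	spin_unlock(&lru->lock);
	return removed;
}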

The dentry cache dispose list also has a problem as a result of the
RCU-ification of the code - the dispose list is implicitly protected
by the LRU lock, and actually forms a disjoint part of the LRU:
dentries on the dispose list are still accounted to the LRU and
require a call to dentry_lru_del() to remove them from the dispose
list and correct the LRU accounting. This only works when there is a
single LRU lock - if the dispose list is made up of dentries
protected by different LRU locks, then it fails with list corruption
pretty quickly. This is another reason for moving to the same
strategy as the inode cache, where inodes are completely removed
from the LRU before being placed on the dispose list...
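
To illustrate that strategy, a rough sketch built on the lru_list
type above (the helper name is hypothetical): items are unlinked
from the LRU and the accounting corrected while the LRU lock is
held, and only then disposed of from a private list that needs no
LRU lock at all.

/* Move up to nr_to_scan items off the LRU, then dispose of them. */
static void lru_list_shrink(struct lru_list *lru, long nr_to_scan,
			    void (*dispose)(struct list_head *item))
{
	LIST_HEAD(dispose_list);
	struct list_head *item, *next;

	spin_lock(&lru->lock);
	list_for_each_safe(item, next, &lru->list) {
		if (nr_to_scan-- <= 0)
			break;
		/* The item leaves the LRU completely before disposal. */
		list_move(item, &dispose_list);
		lru->nr_items--;
	}
	spin_unlock(&lru->lock);

	while (!list_empty(&dispose_list)) {
		item = dispose_list.next;
		list_del_init(item);
		dispose(item);
	}
}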

In case it is not obvious, this is all preparatory work for making
the LRUs and shrinkers node-aware. The new generic LRU lists can be
trivially converted to be node-aware, and with the addition of node
masks to the struct shrink_control propagated from shrink_slab() we
can easily extend all these caches to have node-aware reclaim. We
will then have a generic node-aware LRU implementation that all
subsystems can use to play well with memory reclaim on large NUMA
machines...
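
As a rough sketch of where that is headed - again illustrative only,
the actual node-aware structures come in later patches - the LRU
grows a per-node array and struct shrink_control grows a node mask:

#include <linux/nodemask.h>

/* One LRU per node; a shrinker walks only the nodes it is asked to. */
struct lru_list_node {
	struct lru_list		node[MAX_NUMNODES];
};

/*
 * A shrink_control with a node mask, set up by shrink_slab() from
 * the zones being reclaimed. The nodes_to_scan field is the
 * assumption here, not the current in-tree structure.
 */
struct shrink_control {
	gfp_t		gfp_mask;
	unsigned long	nr_to_scan;
	nodemask_t	nodes_to_scan;
};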


^ permalink raw reply	[flat|nested] 74+ messages in thread


* [PATCH 01/13] fs: Use a common define for inode slab caches
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

All inode slab cache initialisation calls need to use specific flags
so that certain core functionality works correctly (e.g. reclaimable
memory accounting). Some of these flags are used inconsistently
across different filesystems, so inode cache slab behaviour can vary
according to filesystem type.

Wrap all the SLAB_* flags relevant to inode caches up into a single
SLAB_INODE_CACHE flag and convert all the inode caches to use the
new flag.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 Documentation/filesystems/porting |    3 +++
 fs/9p/v9fs.c                      |    3 +--
 fs/adfs/super.c                   |    3 +--
 fs/affs/super.c                   |    3 +--
 fs/afs/super.c                    |    7 +++----
 fs/befs/linuxvfs.c                |    3 +--
 fs/bfs/inode.c                    |    3 +--
 fs/btrfs/inode.c                  |    2 +-
 fs/ceph/super.c                   |    3 +--
 fs/cifs/cifsfs.c                  |    3 +--
 fs/coda/inode.c                   |    3 +--
 fs/efs/super.c                    |    3 +--
 fs/exofs/super.c                  |    3 +--
 fs/ext2/super.c                   |    3 +--
 fs/ext3/super.c                   |    3 +--
 fs/ext4/super.c                   |    3 +--
 fs/fat/inode.c                    |    3 +--
 fs/freevxfs/vxfs_super.c          |    2 +-
 fs/fuse/inode.c                   |    6 +++---
 fs/gfs2/main.c                    |    3 +--
 fs/hfs/super.c                    |    3 ++-
 fs/hfsplus/super.c                |    3 ++-
 fs/hpfs/super.c                   |    3 +--
 fs/hugetlbfs/inode.c              |    2 +-
 fs/inode.c                        |    4 +---
 fs/isofs/inode.c                  |    4 +---
 fs/jffs2/super.c                  |    3 +--
 fs/jfs/super.c                    |    3 +--
 fs/logfs/inode.c                  |    2 +-
 fs/minix/inode.c                  |    3 +--
 fs/ncpfs/inode.c                  |    3 +--
 fs/nfs/inode.c                    |    3 +--
 fs/nilfs2/super.c                 |    4 ++--
 fs/ntfs/super.c                   |    3 ++-
 fs/ocfs2/dlmfs/dlmfs.c            |    3 +--
 fs/ocfs2/super.c                  |    4 +---
 fs/openpromfs/inode.c             |    4 +---
 fs/proc/inode.c                   |    3 +--
 fs/qnx4/inode.c                   |    3 +--
 fs/reiserfs/super.c               |    7 ++-----
 fs/romfs/super.c                  |    3 +--
 fs/squashfs/super.c               |    3 ++-
 fs/sysv/inode.c                   |    3 +--
 fs/ubifs/super.c                  |    2 +-
 fs/udf/super.c                    |    3 +--
 fs/ufs/super.c                    |    3 +--
 fs/xfs/kmem.h                     |    1 +
 fs/xfs/xfs_super.c                |    4 ++--
 include/linux/slab.h              |    7 +++++++
 ipc/mqueue.c                      |    3 ++-
 mm/shmem.c                        |    3 ++-
 net/socket.c                      |    9 +++------
 net/sunrpc/rpc_pipe.c             |    5 ++---
 53 files changed, 77 insertions(+), 104 deletions(-)

diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index b4a3d76..2866bc9 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -352,6 +352,9 @@ protects *all* the dcache state of a given dentry.
 
 --
 [mandatory]
+	Inodes must be allocated via a slab cache created with the
+SLAB_INODE_CACHE flag set. This sets all the necessary slab cache flags for
+correct operation and control of the cache across the system.
 
 	Filesystems must RCU-free their inodes, if they can have been accessed
 via rcu-walk path walk (basically, if the file can have had a path name in the
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index ef96618..e899f1d 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -525,8 +525,7 @@ static int v9fs_init_inode_cache(void)
 {
 	v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
 					  sizeof(struct v9fs_inode),
-					  0, (SLAB_RECLAIM_ACCOUNT|
-					      SLAB_MEM_SPREAD),
+					  0, SLAB_INODE_CACHE,
 					  v9fs_inode_init_once);
 	if (!v9fs_inode_cache)
 		return -ENOMEM;
diff --git a/fs/adfs/super.c b/fs/adfs/super.c
index c8bf36a..a67095f 100644
--- a/fs/adfs/super.c
+++ b/fs/adfs/super.c
@@ -266,8 +266,7 @@ static int init_inodecache(void)
 {
 	adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
 					     sizeof(struct adfs_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (adfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/affs/super.c b/fs/affs/super.c
index b31507d..fa727c1 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -120,8 +120,7 @@ static int init_inodecache(void)
 {
 	affs_inode_cachep = kmem_cache_create("affs_inode_cache",
 					     sizeof(struct affs_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (affs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 356dcf0..ba7566d 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -86,10 +86,9 @@ int __init afs_fs_init(void)
 
 	ret = -ENOMEM;
 	afs_inode_cachep = kmem_cache_create("afs_inode_cache",
-					     sizeof(struct afs_vnode),
-					     0,
-					     SLAB_HWCACHE_ALIGN,
-					     afs_i_init_once);
+					sizeof(struct afs_vnode), 0,
+					SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+					afs_i_init_once);
 	if (!afs_inode_cachep) {
 		printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");
 		return ret;
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 720d885..b62654f 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -436,8 +436,7 @@ befs_init_inodecache(void)
 {
 	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
 					      sizeof (struct befs_inode_info),
-					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					      0, SLAB_INODE_CACHE,
 					      init_once);
 	if (befs_inode_cachep == NULL) {
 		printk(KERN_ERR "befs_init_inodecache: "
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index a8e37f8..6038016 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -271,8 +271,7 @@ static int init_inodecache(void)
 {
 	bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
 					     sizeof(struct bfs_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (bfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0ccc743..3afa9ca 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6878,7 +6878,7 @@ int btrfs_init_cachep(void)
 {
 	btrfs_inode_cachep = kmem_cache_create("btrfs_inode_cache",
 			sizeof(struct btrfs_inode), 0,
-			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once);
+			SLAB_INODE_CACHE, init_once);
 	if (!btrfs_inode_cachep)
 		goto fail;
 
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index d47c5ec..79b7ff3 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -532,8 +532,7 @@ static int __init init_caches(void)
 	ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
 				      sizeof(struct ceph_inode_info),
 				      __alignof__(struct ceph_inode_info),
-				      (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
-				      ceph_inode_init_once);
+				      SLAB_INODE_CACHE, ceph_inode_init_once);
 	if (ceph_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index f93eb94..c33fb7e 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -948,8 +948,7 @@ cifs_init_inodecache(void)
 {
 	cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
 					      sizeof(struct cifsInodeInfo),
-					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					      0, SLAB_INODE_CACHE,
 					      cifs_init_once);
 	if (cifs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index 871b277..0a31da6 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -78,8 +78,7 @@ int coda_init_inodecache(void)
 {
 	coda_inode_cachep = kmem_cache_create("coda_inode_cache",
 				sizeof(struct coda_inode_info),
-				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-				init_once);
+				0, SLAB_INODE_CACHE, init_once);
 	if (coda_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/efs/super.c b/fs/efs/super.c
index 0f31acb..ff87da4 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -88,8 +88,7 @@ static int init_inodecache(void)
 {
 	efs_inode_cachep = kmem_cache_create("efs_inode_cache",
 				sizeof(struct efs_inode_info),
-				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-				init_once);
+				0, SLAB_INODE_CACHE, init_once);
 	if (efs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index 2748940..07f0023 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -194,8 +194,7 @@ static int init_inodecache(void)
 {
 	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
 				sizeof(struct exofs_i_info), 0,
-				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
-				exofs_init_once);
+				SLAB_INODE_CACHE, exofs_init_once);
 	if (exofs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 1dd62ed..9d5d7a7 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -198,8 +198,7 @@ static int init_inodecache(void)
 {
 	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
 					     sizeof(struct ext2_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (ext2_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 7beb69a..8c4c9e1 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -544,8 +544,7 @@ static int init_inodecache(void)
 {
 	ext3_inode_cachep = kmem_cache_create("ext3_inode_cache",
 					     sizeof(struct ext3_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (ext3_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 44d0c8d..738d64a 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -947,8 +947,7 @@ static int init_inodecache(void)
 {
 	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
 					     sizeof(struct ext4_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (ext4_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 1726d73..3316e5d 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -543,8 +543,7 @@ static int __init fat_init_inodecache(void)
 {
 	fat_inode_cachep = kmem_cache_create("fat_inode_cache",
 					     sizeof(struct msdos_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (fat_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/freevxfs/vxfs_super.c b/fs/freevxfs/vxfs_super.c
index 9d1c995..225f18a 100644
--- a/fs/freevxfs/vxfs_super.c
+++ b/fs/freevxfs/vxfs_super.c
@@ -267,7 +267,7 @@ vxfs_init(void)
 
 	vxfs_inode_cachep = kmem_cache_create("vxfs_inode",
 			sizeof(struct vxfs_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, NULL);
+			SLAB_INODE_CACHE, NULL);
 	if (!vxfs_inode_cachep)
 		return -ENOMEM;
 	rv = register_filesystem(&vxfs_fs_type);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 38f84cd..55ad0a1 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1138,9 +1138,9 @@ static int __init fuse_fs_init(void)
 		goto out_unreg;
 
 	fuse_inode_cachep = kmem_cache_create("fuse_inode",
-					      sizeof(struct fuse_inode),
-					      0, SLAB_HWCACHE_ALIGN,
-					      fuse_inode_init_once);
+					sizeof(struct fuse_inode), 0,
+					SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+					fuse_inode_init_once);
 	err = -ENOMEM;
 	if (!fuse_inode_cachep)
 		goto out_unreg2;
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 8a139ff..8ea7747 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -105,8 +105,7 @@ static int __init init_gfs2_fs(void)
 
 	gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
 					      sizeof(struct gfs2_inode),
-					      0,  SLAB_RECLAIM_ACCOUNT|
-					          SLAB_MEM_SPREAD,
+					      0, SLAB_INODE_CACHE,
 					      gfs2_init_inode_once);
 	if (!gfs2_inode_cachep)
 		goto fail;
diff --git a/fs/hfs/super.c b/fs/hfs/super.c
index 1b55f70..789f74c 100644
--- a/fs/hfs/super.c
+++ b/fs/hfs/super.c
@@ -473,7 +473,8 @@ static int __init init_hfs_fs(void)
 	int err;
 
 	hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
-		sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN,
+		sizeof(struct hfs_inode_info), 0,
+		SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
 		hfs_init_once);
 	if (!hfs_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index c106ca2..fc88368 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -590,7 +590,8 @@ static int __init init_hfsplus_fs(void)
 	int err;
 
 	hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
-		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN,
+		HFSPLUS_INODE_SIZE, 0,
+		SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
 		hfsplus_init_once);
 	if (!hfsplus_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
index 98580a3..3de3965 100644
--- a/fs/hpfs/super.c
+++ b/fs/hpfs/super.c
@@ -201,8 +201,7 @@ static int init_inodecache(void)
 {
 	hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
 					     sizeof(struct hpfs_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (hpfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 87b6e04..1644d5f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1002,7 +1002,7 @@ static int __init init_hugetlbfs_fs(void)
 
 	hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
 					sizeof(struct hugetlbfs_inode_info),
-					0, 0, init_once);
+					0, SLAB_INODE_CACHE, init_once);
 	if (hugetlbfs_inode_cachep == NULL)
 		goto out2;
 
diff --git a/fs/inode.c b/fs/inode.c
index 73920d5..848808f 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1591,9 +1591,7 @@ void __init inode_init(void)
 	/* inode slab cache */
 	inode_cachep = kmem_cache_create("inode_cache",
 					 sizeof(struct inode),
-					 0,
-					 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
-					 SLAB_MEM_SPREAD),
+					 0, SLAB_INODE_CACHE | SLAB_PANIC,
 					 init_once);
 
 	/* Hash may have been set up in inode_init_early */
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index a5d0367..237dbc9 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -104,9 +104,7 @@ static int init_inodecache(void)
 {
 	isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
 					sizeof(struct iso_inode_info),
-					0, (SLAB_RECLAIM_ACCOUNT|
-					SLAB_MEM_SPREAD),
-					init_once);
+					0, SLAB_INODE_CACHE, init_once);
 	if (isofs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index 853b8e3..3c9dbe8 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -266,8 +266,7 @@ static int __init init_jffs2_fs(void)
 
 	jffs2_inode_cachep = kmem_cache_create("jffs2_i",
 					     sizeof(struct jffs2_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     jffs2_i_init_once);
 	if (!jffs2_inode_cachep) {
 		printk(KERN_ERR "JFFS2 error: Failed to initialise inode cache\n");
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 06c8a67..31a52a6 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -803,8 +803,7 @@ static int __init init_jfs_fs(void)
 
 	jfs_inode_cachep =
 	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
-			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-			    init_once);
+					     SLAB_INODE_CACHE, init_once);
 	if (jfs_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c
index edfea7a..cb96b72 100644
--- a/fs/logfs/inode.c
+++ b/fs/logfs/inode.c
@@ -392,7 +392,7 @@ const struct super_operations logfs_super_operations = {
 int logfs_init_inode_cache(void)
 {
 	logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
-			sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT,
+			sizeof(struct logfs_inode), 0, SLAB_INODE_CACHE,
 			logfs_init_once);
 	if (!logfs_inode_cache)
 		return -ENOMEM;
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index e7d23e2..4535e83 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -91,8 +91,7 @@ static int init_inodecache(void)
 {
 	minix_inode_cachep = kmem_cache_create("minix_inode_cache",
 					     sizeof(struct minix_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (minix_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
index 202f370..66add22 100644
--- a/fs/ncpfs/inode.c
+++ b/fs/ncpfs/inode.c
@@ -81,8 +81,7 @@ static int init_inodecache(void)
 {
 	ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
 					     sizeof(struct ncp_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (ncp_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index fe12037..ee8bd18 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1506,8 +1506,7 @@ static int __init nfs_init_inodecache(void)
 {
 	nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
 					     sizeof(struct nfs_inode),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (nfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index 8351c44..f6025c1 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -1401,8 +1401,8 @@ static void nilfs_destroy_cachep(void)
 static int __init nilfs_init_cachep(void)
 {
 	nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
-			sizeof(struct nilfs_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once);
+			sizeof(struct nilfs_inode_info), 0, SLAB_INODE_CACHE,
+			nilfs_inode_init_once);
 	if (!nilfs_inode_cachep)
 		goto fail;
 
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index b52706d..97ad840 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -3136,9 +3136,10 @@ static int __init init_ntfs_fs(void)
 		goto inode_err_out;
 	}
 
+	/* ntfs_big_inode_cache is the inode cache used for VFS level inodes */
 	ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
 			sizeof(big_ntfs_inode), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
 			ntfs_big_inode_init_once);
 	if (!ntfs_big_inode_cache) {
 		printk(KERN_CRIT "NTFS: Failed to create %s!\n",
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index b420767..f6c762d 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -676,8 +676,7 @@ static int __init init_dlmfs_fs(void)
 
 	dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
 				sizeof(struct dlmfs_inode_private),
-				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-					SLAB_MEM_SPREAD),
+				0, (SLAB_HWCACHE_ALIGN|SLAB_INODE_CACHE),
 				dlmfs_init_once);
 	if (!dlmfs_inode_cache) {
 		status = -ENOMEM;
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 56f6102..f4d0a0f 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1784,9 +1784,7 @@ static int ocfs2_initialize_mem_caches(void)
 {
 	ocfs2_inode_cachep = kmem_cache_create("ocfs2_inode_cache",
 				       sizeof(struct ocfs2_inode_info),
-				       0,
-				       (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+				       0, (SLAB_HWCACHE_ALIGN|SLAB_INODE_CACHE),
 				       ocfs2_inode_init_once);
 	ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
 					sizeof(struct ocfs2_dquot),
diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index a2a5bff..3aea1e8 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -448,9 +448,7 @@ static int __init init_openprom_fs(void)
 
 	op_inode_cachep = kmem_cache_create("op_inode_cache",
 					    sizeof(struct op_inode_info),
-					    0,
-					    (SLAB_RECLAIM_ACCOUNT |
-					     SLAB_MEM_SPREAD),
+					    0, SLAB_INODE_CACHE,
 					    op_inode_init_once);
 	if (!op_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 7ed72d6..9794661 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -97,8 +97,7 @@ void __init proc_init_inodecache(void)
 {
 	proc_inode_cachep = kmem_cache_create("proc_inode_cache",
 					     sizeof(struct proc_inode),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_PANIC),
+					     0, SLAB_INODE_CACHE | SLAB_PANIC,
 					     init_once);
 }
 
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index 2b06466..7b77dd1 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -447,8 +447,7 @@ static int init_inodecache(void)
 {
 	qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
 					     sizeof(struct qnx4_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (qnx4_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 14363b9..5ba7c0d 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -552,11 +552,8 @@ static void init_once(void *foo)
 static int init_inodecache(void)
 {
 	reiserfs_inode_cachep = kmem_cache_create("reiser_inode_cache",
-						  sizeof(struct
-							 reiserfs_inode_info),
-						  0, (SLAB_RECLAIM_ACCOUNT|
-							SLAB_MEM_SPREAD),
-						  init_once);
+					sizeof(struct reiserfs_inode_info),
+					0, SLAB_INODE_CACHE, init_once);
 	if (reiserfs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index 2305e31..a771db6 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -625,8 +625,7 @@ static int __init init_romfs_fs(void)
 	romfs_inode_cachep =
 		kmem_cache_create("romfs_i",
 				  sizeof(struct romfs_inode_info), 0,
-				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
-				  romfs_i_init_once);
+				  SLAB_INODE_CACHE, romfs_i_init_once);
 
 	if (!romfs_inode_cachep) {
 		printk(KERN_ERR
diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
index 7438850..b21d9ba 100644
--- a/fs/squashfs/super.c
+++ b/fs/squashfs/super.c
@@ -413,7 +413,8 @@ static int __init init_inodecache(void)
 {
 	squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
 		sizeof(struct squashfs_inode_info), 0,
-		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once);
+		SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+		init_once);
 
 	return squashfs_inode_cachep ? 0 : -ENOMEM;
 }
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 0630eb9..e3319db 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -367,8 +367,7 @@ const struct super_operations sysv_sops = {
 int __init sysv_init_icache(void)
 {
 	sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
-			sizeof(struct sysv_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			sizeof(struct sysv_inode_info), 0, SLAB_INODE_CACHE,
 			init_once);
 	if (!sysv_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index b281212..91903f6 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2273,7 +2273,7 @@ static int __init ubifs_init(void)
 	err = -ENOMEM;
 	ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
 				sizeof(struct ubifs_inode), 0,
-				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT,
+				SLAB_INODE_CACHE,
 				&inode_slab_ctor);
 	if (!ubifs_inode_slab)
 		goto out_reg;
diff --git a/fs/udf/super.c b/fs/udf/super.c
index 7b27b06..b6e9969 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -163,8 +163,7 @@ static int init_inodecache(void)
 {
 	udf_inode_cachep = kmem_cache_create("udf_inode_cache",
 					     sizeof(struct udf_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT |
-						 SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (!udf_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 3915ade..0cd3dc0 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1445,8 +1445,7 @@ static int init_inodecache(void)
 {
 	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
 					     sizeof(struct ufs_inode_info),
-					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+					     0, SLAB_INODE_CACHE,
 					     init_once);
 	if (ufs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
index f7c8f7a..4e6b372 100644
--- a/fs/xfs/kmem.h
+++ b/fs/xfs/kmem.h
@@ -82,6 +82,7 @@ extern void *kmem_zalloc_greedy(size_t *, size_t, size_t);
 #define KM_ZONE_HWALIGN	SLAB_HWCACHE_ALIGN
 #define KM_ZONE_RECLAIM	SLAB_RECLAIM_ACCOUNT
 #define KM_ZONE_SPREAD	SLAB_MEM_SPREAD
+#define KM_ZONE_INODES	SLAB_INODE_CACHE
 
 #define kmem_zone	kmem_cache
 #define kmem_zone_t	struct kmem_cache
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 9a72dda..c94ec22 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1589,8 +1589,8 @@ xfs_init_zones(void)
 
 	xfs_inode_zone =
 		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
-			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD,
-			xfs_fs_inode_init_once);
+					KM_ZONE_HWALIGN | KM_ZONE_INODES,
+					xfs_fs_inode_init_once);
 	if (!xfs_inode_zone)
 		goto out_destroy_efi_zone;
 
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 573c809..9d4a5b8 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -93,6 +93,13 @@
 				(unsigned long)ZERO_SIZE_PTR)
 
 /*
+ * Set the default flags necessary for inode caches manipulated by the VFS.
+ */
+#define SLAB_INODE_CACHE	(SLAB_MEM_SPREAD | \
+				 SLAB_DESTROY_BY_RCU | \
+				 SLAB_RECLAIM_ACCOUNT)
+
+/*
  * struct kmem_cache related prototypes
  */
 void __init kmem_cache_init(void);
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ed049ea..512b1b2 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1278,7 +1278,8 @@ static int __init init_mqueue_fs(void)
 
 	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
 				sizeof(struct mqueue_inode_info), 0,
-				SLAB_HWCACHE_ALIGN, init_once);
+				SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+				init_once);
 	if (mqueue_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 32f6763..98bfa2e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2249,7 +2249,8 @@ static int shmem_init_inodecache(void)
 {
 	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
 				sizeof(struct shmem_inode_info),
-				0, SLAB_PANIC, shmem_init_inode);
+				0, SLAB_INODE_CACHE|SLAB_PANIC,
+				shmem_init_inode);
 	return 0;
 }
 
diff --git a/net/socket.c b/net/socket.c
index 24a7740..4ade5bf 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -284,12 +284,9 @@ static void init_once(void *foo)
 static int init_inodecache(void)
 {
 	sock_inode_cachep = kmem_cache_create("sock_inode_cache",
-					      sizeof(struct socket_alloc),
-					      0,
-					      (SLAB_HWCACHE_ALIGN |
-					       SLAB_RECLAIM_ACCOUNT |
-					       SLAB_MEM_SPREAD),
-					      init_once);
+			      sizeof(struct socket_alloc), 0,
+			      SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+			      init_once);
 	if (sock_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index b181e34..53f0dd6 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1064,9 +1064,8 @@ int register_rpc_pipefs(void)
 	int err;
 
 	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
-				sizeof(struct rpc_inode),
-				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+				sizeof(struct rpc_inode), 0,
+				SLAB_HWCACHE_ALIGN|SLAB_INODE_CACHE,
 				init_once);
 	if (!rpc_inode_cachep)
 		return -ENOMEM;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1278,7 +1278,8 @@ static int __init init_mqueue_fs(void)
 
 	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
 				sizeof(struct mqueue_inode_info), 0,
-				SLAB_HWCACHE_ALIGN, init_once);
+				SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+				init_once);
 	if (mqueue_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/mm/shmem.c b/mm/shmem.c
index 32f6763..98bfa2e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2249,7 +2249,8 @@ static int shmem_init_inodecache(void)
 {
 	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
 				sizeof(struct shmem_inode_info),
-				0, SLAB_PANIC, shmem_init_inode);
+				0, SLAB_INODE_CACHE|SLAB_PANIC,
+				shmem_init_inode);
 	return 0;
 }
 
diff --git a/net/socket.c b/net/socket.c
index 24a7740..4ade5bf 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -284,12 +284,9 @@ static void init_once(void *foo)
 static int init_inodecache(void)
 {
 	sock_inode_cachep = kmem_cache_create("sock_inode_cache",
-					      sizeof(struct socket_alloc),
-					      0,
-					      (SLAB_HWCACHE_ALIGN |
-					       SLAB_RECLAIM_ACCOUNT |
-					       SLAB_MEM_SPREAD),
-					      init_once);
+			      sizeof(struct socket_alloc), 0,
+			      SLAB_HWCACHE_ALIGN | SLAB_INODE_CACHE,
+			      init_once);
 	if (sock_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index b181e34..53f0dd6 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1064,9 +1064,8 @@ int register_rpc_pipefs(void)
 	int err;
 
 	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
-				sizeof(struct rpc_inode),
-				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+				sizeof(struct rpc_inode), 0,
+				SLAB_HWCACHE_ALIGN|SLAB_INODE_CACHE,
 				init_once);
 	if (!rpc_inode_cachep)
 		return -ENOMEM;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 02/13] dcache: convert dentry_stat.nr_unused to per-cpu counters
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Before we split up the dcache_lru_lock, the unused dentry counter
needs to be made independent of the global dcache_lru_lock. Convert
it to per-cpu counters to do this.
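
For illustration, the per-cpu counter idiom this moves to looks like the
sketch below (dentry_unused_inc() is just an illustrative wrapper; the
hunks that follow are authoritative and call this_cpu_inc() inline).
Writers touch only their local CPU; readers sum over all CPUs, and since
increments and decrements can land on different CPUs the sum can be
transiently negative, so it is clamped to zero:

	static DEFINE_PER_CPU(unsigned int, nr_dentry_unused);

	/* writer side: lock-free, no cross-CPU traffic */
	static void dentry_unused_inc(void)
	{
		this_cpu_inc(nr_dentry_unused);
	}

	/* reader side: approximate sum over all possible CPUs */
	static int get_nr_dentry_unused(void)
	{
		int i, sum = 0;

		for_each_possible_cpu(i)
			sum += per_cpu(nr_dentry_unused, i);
		return sum < 0 ? 0 : sum;
	}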

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/dcache.c |   17 ++++++++++++++---
 1 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index a88948b..febe701 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -116,6 +116,7 @@ struct dentry_stat_t dentry_stat = {
 };
 
 static DEFINE_PER_CPU(unsigned int, nr_dentry);
+static DEFINE_PER_CPU(unsigned int, nr_dentry_unused);
 
 #if defined(CONFIG_SYSCTL) && defined(CONFIG_PROC_FS)
 static int get_nr_dentry(void)
@@ -127,10 +128,20 @@ static int get_nr_dentry(void)
 	return sum < 0 ? 0 : sum;
 }
 
+static int get_nr_dentry_unused(void)
+{
+	int i;
+	int sum = 0;
+	for_each_possible_cpu(i)
+		sum += per_cpu(nr_dentry_unused, i);
+	return sum < 0 ? 0 : sum;
+}
+
 int proc_nr_dentry(ctl_table *table, int write, void __user *buffer,
 		   size_t *lenp, loff_t *ppos)
 {
 	dentry_stat.nr_dentry = get_nr_dentry();
+	dentry_stat.nr_unused = get_nr_dentry_unused();
 	return proc_dointvec(table, write, buffer, lenp, ppos);
 }
 #endif
@@ -233,7 +244,7 @@ static void dentry_lru_add(struct dentry *dentry)
 		spin_lock(&dcache_lru_lock);
 		list_add(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
 		dentry->d_sb->s_nr_dentry_unused++;
-		dentry_stat.nr_unused++;
+		this_cpu_inc(nr_dentry_unused);
 		spin_unlock(&dcache_lru_lock);
 	}
 }
@@ -242,7 +253,7 @@ static void __dentry_lru_del(struct dentry *dentry)
 {
 	list_del_init(&dentry->d_lru);
 	dentry->d_sb->s_nr_dentry_unused--;
-	dentry_stat.nr_unused--;
+	this_cpu_dec(nr_dentry_unused);
 }
 
 static void dentry_lru_del(struct dentry *dentry)
@@ -260,7 +271,7 @@ static void dentry_lru_move_tail(struct dentry *dentry)
 	if (list_empty(&dentry->d_lru)) {
 		list_add_tail(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
 		dentry->d_sb->s_nr_dentry_unused++;
-		dentry_stat.nr_unused++;
+		this_cpu_inc(nr_dentry_unused);
 	} else {
 		list_move_tail(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
 	}
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 03/13] dentry: move to per-sb LRU locks
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

With the dentry LRUs being per-sb structures, there is no real need
for a global dentry_lru_lock. The locking can be made more
fine-grained by moving to a per-sb LRU lock, isolating the LRU
operations of different filesystems completely from each other.
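
The resulting pattern is sketched below (field names as added by this
patch; per the updated ordering comment, d_lock still nests outside the
new per-sb lock, and dentry_lru_add_sketch() is purely illustrative):

	/* fields added to struct super_block */
	spinlock_t		s_dentry_lru_lock;
	struct list_head	s_dentry_lru;		/* unused dentry lru */
	int			s_nr_dentry_unused;	/* # of dentries on lru */

	/* every dentry LRU operation now takes its own sb's lock */
	static void dentry_lru_add_sketch(struct dentry *dentry)
	{
		struct super_block *sb = dentry->d_sb;

		spin_lock(&sb->s_dentry_lru_lock);
		list_add(&dentry->d_lru, &sb->s_dentry_lru);
		sb->s_nr_dentry_unused++;
		spin_unlock(&sb->s_dentry_lru_lock);
	}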

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/dcache.c        |   33 ++++++++++++++++-----------------
 fs/super.c         |    1 +
 include/linux/fs.h |    4 ++--
 3 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index febe701..5123d71 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -46,7 +46,7 @@
  *   - the dcache hash table
  * s_anon bl list spinlock protects:
  *   - the s_anon list (see __d_drop)
- * dcache_lru_lock protects:
+ * dentry->d_sb->s_dentry_lru_lock protects:
  *   - the dcache lru lists and counters
  * d_lock protects:
  *   - d_flags
@@ -61,7 +61,7 @@
  * Ordering:
  * dentry->d_inode->i_lock
  *   dentry->d_lock
- *     dcache_lru_lock
+ *     dentry->d_sb->s_dentry_lru_lock
  *     dcache_hash_bucket lock
  *     s_anon lock
  *
@@ -79,7 +79,6 @@
 int sysctl_vfs_cache_pressure __read_mostly = 100;
 EXPORT_SYMBOL_GPL(sysctl_vfs_cache_pressure);
 
-static __cacheline_aligned_in_smp DEFINE_SPINLOCK(dcache_lru_lock);
 __cacheline_aligned_in_smp DEFINE_SEQLOCK(rename_lock);
 
 EXPORT_SYMBOL(rename_lock);
@@ -241,11 +240,11 @@ static void dentry_unlink_inode(struct dentry * dentry)
 static void dentry_lru_add(struct dentry *dentry)
 {
 	if (list_empty(&dentry->d_lru)) {
-		spin_lock(&dcache_lru_lock);
+		spin_lock(&dentry->d_sb->s_dentry_lru_lock);
 		list_add(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
 		dentry->d_sb->s_nr_dentry_unused++;
 		this_cpu_inc(nr_dentry_unused);
-		spin_unlock(&dcache_lru_lock);
+		spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
 	}
 }
 
@@ -259,15 +258,15 @@ static void __dentry_lru_del(struct dentry *dentry)
 static void dentry_lru_del(struct dentry *dentry)
 {
 	if (!list_empty(&dentry->d_lru)) {
-		spin_lock(&dcache_lru_lock);
+		spin_lock(&dentry->d_sb->s_dentry_lru_lock);
 		__dentry_lru_del(dentry);
-		spin_unlock(&dcache_lru_lock);
+		spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
 	}
 }
 
 static void dentry_lru_move_tail(struct dentry *dentry)
 {
-	spin_lock(&dcache_lru_lock);
+	spin_lock(&dentry->d_sb->s_dentry_lru_lock);
 	if (list_empty(&dentry->d_lru)) {
 		list_add_tail(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
 		dentry->d_sb->s_nr_dentry_unused++;
@@ -275,7 +274,7 @@ static void dentry_lru_move_tail(struct dentry *dentry)
 	} else {
 		list_move_tail(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
 	}
-	spin_unlock(&dcache_lru_lock);
+	spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
 }
 
 /**
@@ -767,14 +766,14 @@ static void __shrink_dcache_sb(struct super_block *sb, int count, int flags)
 	LIST_HEAD(tmp);
 
 relock:
-	spin_lock(&dcache_lru_lock);
+	spin_lock(&sb->s_dentry_lru_lock);
 	while (!list_empty(&sb->s_dentry_lru)) {
 		dentry = list_entry(sb->s_dentry_lru.prev,
 				struct dentry, d_lru);
 		BUG_ON(dentry->d_sb != sb);
 
 		if (!spin_trylock(&dentry->d_lock)) {
-			spin_unlock(&dcache_lru_lock);
+			spin_unlock(&sb->s_dentry_lru_lock);
 			cpu_relax();
 			goto relock;
 		}
@@ -795,11 +794,11 @@ relock:
 			if (!--count)
 				break;
 		}
-		cond_resched_lock(&dcache_lru_lock);
+		cond_resched_lock(&sb->s_dentry_lru_lock);
 	}
 	if (!list_empty(&referenced))
 		list_splice(&referenced, &sb->s_dentry_lru);
-	spin_unlock(&dcache_lru_lock);
+	spin_unlock(&sb->s_dentry_lru_lock);
 
 	shrink_dentry_list(&tmp);
 }
@@ -832,14 +831,14 @@ void shrink_dcache_sb(struct super_block *sb)
 {
 	LIST_HEAD(tmp);
 
-	spin_lock(&dcache_lru_lock);
+	spin_lock(&sb->s_dentry_lru_lock);
 	while (!list_empty(&sb->s_dentry_lru)) {
 		list_splice_init(&sb->s_dentry_lru, &tmp);
-		spin_unlock(&dcache_lru_lock);
+		spin_unlock(&sb->s_dentry_lru_lock);
 		shrink_dentry_list(&tmp);
-		spin_lock(&dcache_lru_lock);
+		spin_lock(&sb->s_dentry_lru_lock);
 	}
-	spin_unlock(&dcache_lru_lock);
+	spin_unlock(&sb->s_dentry_lru_lock);
 }
 EXPORT_SYMBOL(shrink_dcache_sb);
 
diff --git a/fs/super.c b/fs/super.c
index 3f56a26..6a72693 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -140,6 +140,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
 		INIT_HLIST_BL_HEAD(&s->s_anon);
 		INIT_LIST_HEAD(&s->s_inodes);
 		INIT_LIST_HEAD(&s->s_dentry_lru);
+		spin_lock_init(&s->s_dentry_lru_lock);
 		INIT_LIST_HEAD(&s->s_inode_lru);
 		spin_lock_init(&s->s_inode_lru_lock);
 		init_rwsem(&s->s_umount);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 178cdb4..14be4d8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1410,9 +1410,9 @@ struct super_block {
 #else
 	struct list_head	s_files;
 #endif
-	/* s_dentry_lru, s_nr_dentry_unused protected by dcache.c lru locks */
+	spinlock_t		s_dentry_lru_lock ____cacheline_aligned_in_smp;
 	struct list_head	s_dentry_lru;	/* unused dentry lru */
-	int			s_nr_dentry_unused;	/* # of dentry on lru */
+	int			s_nr_dentry_unused; /* # of dentries on lru */
 
 	/* s_inode_lru_lock protects s_inode_lru and s_nr_inodes_unused */
 	spinlock_t		s_inode_lru_lock ____cacheline_aligned_in_smp;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 04/13] mm: new shrinker API
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

The current shrinker callout API uses a single shrinker call for
multiple functions. To determine the function, a special magical
value is passed in a parameter to change the behaviour. This
complicates the implementation and return value specification for
the different behaviours.

Split the two behaviours into separate operations: one
to return a count of freeable objects in the cache, and another to
scan a certain number of objects in the cache for freeing. In
defining these new operations, ensure the return values and
resultant behaviours are clearly defined and documented.
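
As a usage illustration, a shrinker under the new API splits into a cheap
count and a bounded scan. The sketch below uses a hypothetical "foo" cache
(none of the foo_* names exist in the tree):

	struct foo_object {
		struct list_head	lru;
		/* ... payload ... */
	};

	static LIST_HEAD(foo_lru);
	static DEFINE_SPINLOCK(foo_lock);
	static atomic_long_t foo_nr_objects = ATOMIC_LONG_INIT(0);

	/* cheap: report how many objects could be freed, no deadlock checks */
	static long foo_count(struct shrinker *shrink, struct shrink_control *sc)
	{
		return atomic_long_read(&foo_nr_objects);
	}

	/* bounded: free up to nr_to_scan objects, return the number freed */
	static long foo_scan(struct shrinker *shrink, struct shrink_control *sc)
	{
		long freed = 0;

		if (!(sc->gfp_mask & __GFP_FS))
			return -1;	/* scanning here could deadlock */

		spin_lock(&foo_lock);
		while (freed < sc->nr_to_scan && !list_empty(&foo_lru)) {
			struct foo_object *obj = list_first_entry(&foo_lru,
						struct foo_object, lru);

			list_del_init(&obj->lru);
			atomic_long_dec(&foo_nr_objects);
			kfree(obj);
			freed++;
		}
		spin_unlock(&foo_lock);
		return freed;
	}

	static struct shrinker foo_shrinker = {
		.count_objects	= foo_count,
		.scan_objects	= foo_scan,
		.seeks		= DEFAULT_SEEKS,
	};

Registration is unchanged: register_shrinker(&foo_shrinker).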

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 include/linux/shrinker.h |   38 +++++++++++++++++++++++++++-----------
 1 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 790651b..50f213f 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -3,32 +3,47 @@
 
 /*
  * This struct is used to pass information from page reclaim to the shrinkers.
- * We consolidate the values for easier extention later.
+ *
+ * The 'gfpmask' refers to the allocation we are currently trying to
+ * fulfil.
+ *
+ * Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
+ * querying the cache size, so a fastpath for that case is appropriate.
  */
 struct shrink_control {
 	gfp_t gfp_mask;
 
 	/* How many slab objects shrinker() should scan and try to reclaim */
-	unsigned long nr_to_scan;
+	long nr_to_scan;
 };
 
 /*
  * A callback you can register to apply pressure to ageable caches.
  *
- * 'sc' is passed shrink_control which includes a count 'nr_to_scan'
- * and a 'gfpmask'.  It should look through the least-recently-used
- * 'nr_to_scan' entries and attempt to free them up.  It should return
- * the number of objects which remain in the cache.  If it returns -1, it means
- * it cannot do any scanning at this time (eg. there is a risk of deadlock).
+ * @shrink() should look through the least-recently-used 'nr_to_scan' entries
+ * and attempt to free them up.  It should return the number of objects which
+ * remain in the cache.  If it returns -1, it means it cannot do any scanning at
+ * this time (eg. there is a risk of deadlock).
  *
- * The 'gfpmask' refers to the allocation we are currently trying to
- * fulfil.
+ * @count_objects should return the number of freeable items in the cache. If
+ * there are no objects to free or the number of freeable items cannot be
+ * determined, it should return 0. No deadlock checks should be done during the
+ * count callback - the shrinker infrastructure aggregates the scan counts that
+ * could not be executed due to potential deadlocks and runs them at a later
+ * call, when the deadlock condition is no longer pending.
  *
- * Note that 'shrink' will be passed nr_to_scan == 0 when the VM is
- * querying the cache size, so a fastpath for that case is appropriate.
+ * @scan_objects will only be called if @count_objects returned a positive value
+ * for the number of freeable objects. The callout should scan the cache and
+ * attempt to free items from the cache. It should then return the number of
+ * objects freed during the scan, or -1 if progress cannot be made due to
+ * potential deadlocks. If -1 is returned, then no further attempts to call
+ * @scan_objects will be made from the current reclaim context.
  */
 struct shrinker {
 	int (*shrink)(struct shrinker *, struct shrink_control *sc);
+	long (*count_objects)(struct shrinker *, struct shrink_control *sc);
+	long (*scan_objects)(struct shrinker *, struct shrink_control *sc);
+
 	int seeks;	/* seeks to recreate an obj */
 	long batch;	/* reclaim batch size, 0 = default */
 
@@ -36,6 +51,7 @@ struct shrinker {
 	struct list_head list;
 	long nr;	/* objs pending delete */
 };
+
 #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
 extern void register_shrinker(struct shrinker *);
 extern void unregister_shrinker(struct shrinker *);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 05/13] mm: convert shrinkers to use new API
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Modify shrink_slab() to use the new .count_objects/.scan_objects API
and implement the callouts for all the existing shrinkers.
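
Schematically, each conversion splits the old overloaded callback so the
nr_to_scan == 0 special case disappears (foo_* are placeholders, not code
from this patch; foo_count_freeable() and foo_evict() are hypothetical
helpers standing in for each cache's own count and eviction logic):

	/* before: one callback, mode selected by nr_to_scan == 0 */
	static int foo_shrink(struct shrinker *shrink, struct shrink_control *sc)
	{
		if (sc->nr_to_scan == 0)
			return foo_count_freeable();
		return foo_evict(sc->nr_to_scan);
	}

	/* after: counting and scanning are separate callouts */
	static long foo_count(struct shrinker *shrink, struct shrink_control *sc)
	{
		return foo_count_freeable();
	}

	static long foo_scan(struct shrinker *shrink, struct shrink_control *sc)
	{
		return foo_evict(sc->nr_to_scan);	/* objects freed */
	}

	static struct shrinker foo_shrinker = {
		.count_objects	= foo_count,
		.scan_objects	= foo_scan,
		.seeks		= DEFAULT_SEEKS,
	};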

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 Documentation/filesystems/vfs.txt    |   11 +++--
 arch/x86/kvm/mmu.c                   |   16 ++++---
 drivers/gpu/drm/i915/i915_dma.c      |    4 +-
 drivers/gpu/drm/i915/i915_gem.c      |   49 ++++++++++++++---------
 drivers/gpu/drm/ttm/ttm_page_alloc.c |   14 ++++--
 drivers/staging/zcache/zcache-main.c |   45 ++++++++++++---------
 fs/cifs/cifsacl.c                    |   57 +++++++++++++++++----------
 fs/dcache.c                          |   15 ++++---
 fs/gfs2/glock.c                      |   24 +++++++-----
 fs/gfs2/main.c                       |    3 +-
 fs/gfs2/quota.c                      |   19 +++++----
 fs/gfs2/quota.h                      |    4 +-
 fs/inode.c                           |    7 ++-
 fs/internal.h                        |    3 +
 fs/mbcache.c                         |   37 +++++++++++------
 fs/nfs/dir.c                         |   17 ++++++--
 fs/nfs/internal.h                    |    6 ++-
 fs/nfs/super.c                       |    3 +-
 fs/quota/dquot.c                     |   39 +++++++++----------
 fs/super.c                           |   71 ++++++++++++++++++++--------------
 fs/ubifs/shrinker.c                  |   19 +++++----
 fs/ubifs/super.c                     |    3 +-
 fs/ubifs/ubifs.h                     |    3 +-
 fs/xfs/xfs_buf.c                     |   19 ++++++++-
 fs/xfs/xfs_qm.c                      |   22 +++++++---
 fs/xfs/xfs_super.c                   |    8 ++--
 fs/xfs/xfs_sync.c                    |   17 +++++---
 fs/xfs/xfs_sync.h                    |    4 +-
 include/linux/fs.h                   |    8 +---
 include/trace/events/vmscan.h        |   12 +++---
 mm/vmscan.c                          |   46 +++++++++-------------
 net/sunrpc/auth.c                    |   21 +++++++---
 32 files changed, 369 insertions(+), 257 deletions(-)

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 52d8fb8..4ca3c2d 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -229,8 +229,8 @@ struct super_operations {
 
         ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
         ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
-	int (*nr_cached_objects)(struct super_block *);
-	void (*free_cached_objects)(struct super_block *, int);
+	long (*nr_cached_objects)(struct super_block *);
+	long (*free_cached_objects)(struct super_block *, long);
 };
 
 All methods are called without any locks being held, unless otherwise
@@ -313,9 +313,10 @@ or bottom half).
 	implement ->nr_cached_objects for it to be called correctly.
 
 	We can't do anything with any errors that the filesystem might
-	encountered, hence the void return type. This will never be called if
-	the VM is trying to reclaim under GFP_NOFS conditions, hence this
-	method does not need to handle that situation itself.
+	have encountered, so the return value is the number of objects freed. This
+	will never be called if the VM is trying to reclaim under GFP_NOFS
+	conditions, hence this method does not need to handle that situation
+	itself.
 
 	Implementations must include conditional reschedule calls inside any
 	scanning loop that is done. This allows the VFS to determine
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1c5b693..939e201 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3858,14 +3858,12 @@ static int kvm_mmu_remove_some_alloc_mmu_pages(struct kvm *kvm,
 	return kvm_mmu_prepare_zap_page(kvm, page, invalid_list);
 }
 
-static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
+static long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct kvm *kvm;
 	struct kvm *kvm_freed = NULL;
 	int nr_to_scan = sc->nr_to_scan;
-
-	if (nr_to_scan == 0)
-		goto out;
+	long freed_pages = 0;
 
 	raw_spin_lock(&kvm_lock);
 
@@ -3877,7 +3875,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 		spin_lock(&kvm->mmu_lock);
 		if (!kvm_freed && nr_to_scan > 0 &&
 		    kvm->arch.n_used_mmu_pages > 0) {
-			freed_pages = kvm_mmu_remove_some_alloc_mmu_pages(kvm,
+			freed_pages += kvm_mmu_remove_some_alloc_mmu_pages(kvm,
 							  &invalid_list);
 			kvm_freed = kvm;
 		}
@@ -3891,13 +3889,17 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 		list_move_tail(&kvm_freed->vm_list, &vm_list);
 
 	raw_spin_unlock(&kvm_lock);
+	return freed_pages;
+}
 
-out:
+static long mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
 	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
 }
 
 static struct shrinker mmu_shrinker = {
-	.shrink = mmu_shrink,
+	.scan_objects = mmu_shrink_scan,
+	.count_objects = mmu_shrink_count,
 	.seeks = DEFAULT_SEEKS * 10,
 };
 
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 8a3942c..734ea5e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2074,7 +2074,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	return 0;
 
 out_gem_unload:
-	if (dev_priv->mm.inactive_shrinker.shrink)
+	if (dev_priv->mm.inactive_shrinker.scan_objects)
 		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
 
 	if (dev->pdev->msi_enabled)
@@ -2108,7 +2108,7 @@ int i915_driver_unload(struct drm_device *dev)
 	i915_mch_dev = NULL;
 	spin_unlock(&mchdev_lock);
 
-	if (dev_priv->mm.inactive_shrinker.shrink)
+	if (dev_priv->mm.inactive_shrinker.scan_objects)
 		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
 
 	mutex_lock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a546a71..0647a33 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -56,7 +56,9 @@ static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_file *file);
 static void i915_gem_free_object_tail(struct drm_i915_gem_object *obj);
 
-static int i915_gem_inactive_shrink(struct shrinker *shrinker,
+static long i915_gem_inactive_scan(struct shrinker *shrinker,
+				   struct shrink_control *sc);
+static long i915_gem_inactive_count(struct shrinker *shrinker,
 				    struct shrink_control *sc);
 
 /* some bookkeeping */
@@ -3999,7 +4001,8 @@ i915_gem_load(struct drm_device *dev)
 
 	dev_priv->mm.interruptible = true;
 
-	dev_priv->mm.inactive_shrinker.shrink = i915_gem_inactive_shrink;
+	dev_priv->mm.inactive_shrinker.scan_objects = i915_gem_inactive_scan;
+	dev_priv->mm.inactive_shrinker.count_objects = i915_gem_inactive_count;
 	dev_priv->mm.inactive_shrinker.seeks = DEFAULT_SEEKS;
 	register_shrinker(&dev_priv->mm.inactive_shrinker);
 }
@@ -4221,8 +4224,8 @@ i915_gpu_is_active(struct drm_device *dev)
 	return !lists_empty;
 }
 
-static int
-i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
+static long
+i915_gem_inactive_scan(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct drm_i915_private *dev_priv =
 		container_of(shrinker,
@@ -4231,22 +4234,10 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	struct drm_device *dev = dev_priv->dev;
 	struct drm_i915_gem_object *obj, *next;
 	int nr_to_scan = sc->nr_to_scan;
-	int cnt;
 
 	if (!mutex_trylock(&dev->struct_mutex))
 		return 0;
 
-	/* "fast-path" to count number of available objects */
-	if (nr_to_scan == 0) {
-		cnt = 0;
-		list_for_each_entry(obj,
-				    &dev_priv->mm.inactive_list,
-				    mm_list)
-			cnt++;
-		mutex_unlock(&dev->struct_mutex);
-		return cnt / 100 * sysctl_vfs_cache_pressure;
-	}
-
 rescan:
 	/* first scan for clean buffers */
 	i915_gem_retire_requests(dev);
@@ -4262,15 +4253,12 @@ rescan:
 	}
 
 	/* second pass, evict/count anything still on the inactive list */
-	cnt = 0;
 	list_for_each_entry_safe(obj, next,
 				 &dev_priv->mm.inactive_list,
 				 mm_list) {
 		if (nr_to_scan &&
 		    i915_gem_object_unbind(obj) == 0)
 			nr_to_scan--;
-		else
-			cnt++;
 	}
 
 	if (nr_to_scan && i915_gpu_is_active(dev)) {
@@ -4284,5 +4272,26 @@ rescan:
 			goto rescan;
 	}
 	mutex_unlock(&dev->struct_mutex);
-	return cnt / 100 * sysctl_vfs_cache_pressure;
+	return sc->nr_to_scan - nr_to_scan;
+}
+
+static long
+i915_gem_inactive_count(struct shrinker *shrinker, struct shrink_control *sc)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(shrinker,
+			     struct drm_i915_private,
+			     mm.inactive_shrinker);
+	struct drm_device *dev = dev_priv->dev;
+	struct drm_i915_gem_object *obj;
+	long count = 0;
+
+	if (!mutex_trylock(&dev->struct_mutex))
+		return 0;
+
+	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list)
+		count++;
+
+	mutex_unlock(&dev->struct_mutex);
+	return count;
 }
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 727e93d..3e71c68 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -395,14 +395,13 @@ static int ttm_pool_get_num_unused_pages(void)
 /**
  * Callback for mm to request pool to reduce number of page held.
  */
-static int ttm_pool_mm_shrink(struct shrinker *shrink,
-			      struct shrink_control *sc)
+static long ttm_pool_mm_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	static atomic_t start_pool = ATOMIC_INIT(0);
 	unsigned i;
 	unsigned pool_offset = atomic_add_return(1, &start_pool);
 	struct ttm_page_pool *pool;
-	int shrink_pages = sc->nr_to_scan;
+	long shrink_pages = sc->nr_to_scan;
 
 	pool_offset = pool_offset % NUM_POOLS;
 	/* select start pool in round robin fashion */
@@ -413,13 +412,18 @@ static int ttm_pool_mm_shrink(struct shrinker *shrink,
 		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
 		shrink_pages = ttm_page_pool_free(pool, nr_free);
 	}
-	/* return estimated number of unused pages in pool */
+	return sc->nr_to_scan;
+}
+
+static long ttm_pool_mm_count(struct shrinker *shrink, struct shrink_control *sc)
+{
 	return ttm_pool_get_num_unused_pages();
 }
 
 static void ttm_pool_mm_shrink_init(struct ttm_pool_manager *manager)
 {
-	manager->mm_shrink.shrink = &ttm_pool_mm_shrink;
+	manager->mm_shrink.scan_objects = ttm_pool_mm_scan;
+	manager->mm_shrink.count_objects = ttm_pool_mm_count;
 	manager->mm_shrink.seeks = 1;
 	register_shrinker(&manager->mm_shrink);
 }
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 855a5bb..3ccb723 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -493,9 +493,10 @@ static void zbud_evict_zbpg(struct zbud_page *zbpg)
  * page in use by another cpu, but also to avoid potential deadlock due to
  * lock inversion.
  */
-static void zbud_evict_pages(int nr)
+static int zbud_evict_pages(int nr)
 {
 	struct zbud_page *zbpg;
+	int freed = 0;
 	int i;
 
 	/* first try freeing any pages on unused list */
@@ -511,7 +512,7 @@ retry_unused_list:
 		spin_unlock_bh(&zbpg_unused_list_spinlock);
 		zcache_free_page(zbpg);
 		zcache_evicted_raw_pages++;
-		if (--nr <= 0)
+		if (++freed >= nr)
 			goto out;
 		goto retry_unused_list;
 	}
@@ -535,7 +536,7 @@ retry_unbud_list_i:
 			/* want budlists unlocked when doing zbpg eviction */
 			zbud_evict_zbpg(zbpg);
 			local_bh_enable();
-			if (--nr <= 0)
+			if (++freed >= nr)
 				goto out;
 			goto retry_unbud_list_i;
 		}
@@ -559,13 +560,13 @@ retry_bud_list:
 		/* want budlists unlocked when doing zbpg eviction */
 		zbud_evict_zbpg(zbpg);
 		local_bh_enable();
-		if (--nr <= 0)
+		if (++freed >= nr)
 			goto out;
 		goto retry_bud_list;
 	}
 	spin_unlock_bh(&zbud_budlists_spinlock);
 out:
-	return;
+	return freed;
 }
 
 static void zbud_init(void)
@@ -1496,30 +1497,34 @@ static bool zcache_freeze;
 /*
  * zcache shrinker interface (only useful for ephemeral pages, so zbud only)
  */
-static int shrink_zcache_memory(struct shrinker *shrink,
-				struct shrink_control *sc)
+static long shrink_zcache_scan(struct shrinker *shrink,
+			       struct shrink_control *sc)
 {
 	int ret = -1;
 	int nr = sc->nr_to_scan;
 	gfp_t gfp_mask = sc->gfp_mask;
 
-	if (nr >= 0) {
-		if (!(gfp_mask & __GFP_FS))
-			/* does this case really need to be skipped? */
-			goto out;
-		if (spin_trylock(&zcache_direct_reclaim_lock)) {
-			zbud_evict_pages(nr);
-			spin_unlock(&zcache_direct_reclaim_lock);
-		} else
-			zcache_aborted_shrink++;
-	}
-	ret = (int)atomic_read(&zcache_zbud_curr_raw_pages);
-out:
+	if (!(gfp_mask & __GFP_FS))
+		return -1;
+
+	if (spin_trylock(&zcache_direct_reclaim_lock)) {
+		ret = zbud_evict_pages(nr);
+		spin_unlock(&zcache_direct_reclaim_lock);
+	} else
+		zcache_aborted_shrink++;
+
 	return ret;
 }
 
+static long shrink_zcache_count(struct shrinker *shrink,
+				struct shrink_control *sc)
+{
+	return atomic_read(&zcache_zbud_curr_raw_pages);
+}
+
 static struct shrinker zcache_shrinker = {
-	.shrink = shrink_zcache_memory,
+	.scan_objects = shrink_zcache_scan,
+	.count_objects = shrink_zcache_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index d0f59fa..508a684 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -44,58 +44,73 @@ static const struct cifs_sid sid_user = {1, 2 , {0, 0, 0, 0, 0, 5}, {} };
 
 const struct cred *root_cred;
 
-static void
-shrink_idmap_tree(struct rb_root *root, int nr_to_scan, int *nr_rem,
-			int *nr_del)
+static long
+shrink_idmap_tree(struct rb_root *root, int nr_to_scan)
 {
 	struct rb_node *node;
 	struct rb_node *tmp;
 	struct cifs_sid_id *psidid;
+	long count = 0;
 
 	node = rb_first(root);
 	while (node) {
 		tmp = node;
 		node = rb_next(tmp);
 		psidid = rb_entry(tmp, struct cifs_sid_id, rbnode);
-		if (nr_to_scan == 0 || *nr_del == nr_to_scan)
-			++(*nr_rem);
-		else {
-			if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
-						&& psidid->refcount == 0) {
-				rb_erase(tmp, root);
-				++(*nr_del);
-			} else
-				++(*nr_rem);
+		if (nr_to_scan == 0) {
+			count++;
+			continue;
+		}
+		if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
+					&& psidid->refcount == 0) {
+			rb_erase(tmp, root);
+			if (++count >= nr_to_scan)
+				break;
 		}
 	}
+	return count;
 }
 
 /*
  * Run idmap cache shrinker.
  */
-static int
-cifs_idmap_shrinker(struct shrinker *shrink, struct shrink_control *sc)
+static long
+cifs_idmap_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
-	int nr_to_scan = sc->nr_to_scan;
-	int nr_del = 0;
-	int nr_rem = 0;
 	struct rb_root *root;
+	long freed;
 
 	root = &uidtree;
 	spin_lock(&siduidlock);
-	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
+	freed = shrink_idmap_tree(root, sc->nr_to_scan);
 	spin_unlock(&siduidlock);
 
 	root = &gidtree;
 	spin_lock(&sidgidlock);
-	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
+	freed += shrink_idmap_tree(root, sc->nr_to_scan);
 	spin_unlock(&sidgidlock);
 
-	return nr_rem;
+	return freed;
+}
+
+/*
+ * This still abuses the nr_to_scan == 0 trick to get the common code just to
+ * count objects. There needs to be an external count of the objects in the
+ * caches to avoid this.
+ */
+static long
+cifs_idmap_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct shrink_control dummy_sc = {
+		.nr_to_scan = 0,
+	};
+
+	return cifs_idmap_shrinker_scan(shrink, &dummy_sc);
 }
 
 static struct shrinker cifs_shrinker = {
-	.shrink = cifs_idmap_shrinker,
+	.scan_objects = cifs_idmap_shrinker_scan,
+	.count_objects = cifs_idmap_shrinker_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/dcache.c b/fs/dcache.c
index 5123d71..d19e453 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -759,11 +759,12 @@ static void shrink_dentry_list(struct list_head *list)
  *
  * If flags contains DCACHE_REFERENCED reference dentries will not be pruned.
  */
-static void __shrink_dcache_sb(struct super_block *sb, int count, int flags)
+static long __shrink_dcache_sb(struct super_block *sb, long count, int flags)
 {
 	struct dentry *dentry;
 	LIST_HEAD(referenced);
 	LIST_HEAD(tmp);
+	long freed = 0;
 
 relock:
 	spin_lock(&sb->s_dentry_lru_lock);
@@ -791,6 +792,7 @@ relock:
 		} else {
 			list_move_tail(&dentry->d_lru, &tmp);
 			spin_unlock(&dentry->d_lock);
+			freed++;
 			if (!--count)
 				break;
 		}
@@ -801,6 +803,7 @@ relock:
 	spin_unlock(&sb->s_dentry_lru_lock);
 
 	shrink_dentry_list(&tmp);
+	return freed;
 }
 
 /**
@@ -815,9 +818,9 @@ relock:
  * This function may fail to free any resources if all the dentries are in
  * use.
  */
-void prune_dcache_sb(struct super_block *sb, int nr_to_scan)
+long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
 {
-	__shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
+	return __shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
 }
 
 /**
@@ -1070,12 +1073,12 @@ EXPORT_SYMBOL(have_submounts);
  * drop the lock and return early due to latency
  * constraints.
  */
-static int select_parent(struct dentry * parent)
+static long select_parent(struct dentry * parent)
 {
 	struct dentry *this_parent;
 	struct list_head *next;
 	unsigned seq;
-	int found = 0;
+	long found = 0;
 	int locked = 0;
 
 	seq = read_seqbegin(&rename_lock);
@@ -1163,7 +1166,7 @@ rename_retry:
 void shrink_dcache_parent(struct dentry * parent)
 {
 	struct super_block *sb = parent->d_sb;
-	int found;
+	long found;
 
 	while ((found = select_parent(parent)) != 0)
 		__shrink_dcache_sb(sb, found, 0);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 88e8a23..f9bc88d 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1370,24 +1370,21 @@ void gfs2_glock_complete(struct gfs2_glock *gl, int ret)
 }
 
 
-static int gfs2_shrink_glock_memory(struct shrinker *shrink,
-				    struct shrink_control *sc)
+static long gfs2_shrink_glock_scan(struct shrinker *shrink,
+				   struct shrink_control *sc)
 {
 	struct gfs2_glock *gl;
 	int may_demote;
 	int nr_skipped = 0;
-	int nr = sc->nr_to_scan;
+	int freed = 0;
 	gfp_t gfp_mask = sc->gfp_mask;
 	LIST_HEAD(skipped);
 
-	if (nr == 0)
-		goto out;
-
 	if (!(gfp_mask & __GFP_FS))
 		return -1;
 
 	spin_lock(&lru_lock);
-	while(nr && !list_empty(&lru_list)) {
+	while (freed < sc->nr_to_scan && !list_empty(&lru_list)) {
 		gl = list_entry(lru_list.next, struct gfs2_glock, gl_lru);
 		list_del_init(&gl->gl_lru);
 		clear_bit(GLF_LRU, &gl->gl_flags);
@@ -1401,7 +1398,7 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
 			may_demote = demote_ok(gl);
 			if (may_demote) {
 				handle_callback(gl, LM_ST_UNLOCKED, 0);
-				nr--;
+				freed++;
 			}
 			clear_bit(GLF_LOCK, &gl->gl_flags);
 			smp_mb__after_clear_bit();
@@ -1418,12 +1415,19 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
 	list_splice(&skipped, &lru_list);
 	atomic_add(nr_skipped, &lru_count);
 	spin_unlock(&lru_lock);
-out:
+
+	return freed;
+}
+
+static long gfs2_shrink_glock_count(struct shrinker *shrink,
+				    struct shrink_control *sc)
+{
 	return (atomic_read(&lru_count) / 100) * sysctl_vfs_cache_pressure;
 }
 
 static struct shrinker glock_shrinker = {
-	.shrink = gfs2_shrink_glock_memory,
+	.scan_objects = gfs2_shrink_glock_scan,
+	.count_objects = gfs2_shrink_glock_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 8ea7747..2c21986 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -29,7 +29,8 @@
 #include "dir.h"
 
 static struct shrinker qd_shrinker = {
-	.shrink = gfs2_shrink_qd_memory,
+	.scan_objects = gfs2_shrink_qd_scan,
+	.count_objects = gfs2_shrink_qd_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 42e8d23..5a5f76c 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -78,20 +78,17 @@ static LIST_HEAD(qd_lru_list);
 static atomic_t qd_lru_count = ATOMIC_INIT(0);
 static DEFINE_SPINLOCK(qd_lru_lock);
 
-int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
+long gfs2_shrink_qd_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct gfs2_quota_data *qd;
 	struct gfs2_sbd *sdp;
-	int nr_to_scan = sc->nr_to_scan;
-
-	if (nr_to_scan == 0)
-		goto out;
+	int freed = 0;
 
 	if (!(sc->gfp_mask & __GFP_FS))
 		return -1;
 
 	spin_lock(&qd_lru_lock);
-	while (nr_to_scan && !list_empty(&qd_lru_list)) {
+	while (freed < sc->nr_to_scan && !list_empty(&qd_lru_list)) {
 		qd = list_entry(qd_lru_list.next,
 				struct gfs2_quota_data, qd_reclaim);
 		sdp = qd->qd_gl->gl_sbd;
@@ -112,12 +109,16 @@ int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
 		spin_unlock(&qd_lru_lock);
 		kmem_cache_free(gfs2_quotad_cachep, qd);
 		spin_lock(&qd_lru_lock);
-		nr_to_scan--;
+		freed++;
 	}
 	spin_unlock(&qd_lru_lock);
 
-out:
-	return (atomic_read(&qd_lru_count) * sysctl_vfs_cache_pressure) / 100;
+	return freed;
+}
+
+long gfs2_shrink_qd_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	return (atomic_read(&qd_lru_count) / 100) * sysctl_vfs_cache_pressure;
 }
 
 static u64 qd2offset(struct gfs2_quota_data *qd)
diff --git a/fs/gfs2/quota.h b/fs/gfs2/quota.h
index 90bf1c3..c40fe6d 100644
--- a/fs/gfs2/quota.h
+++ b/fs/gfs2/quota.h
@@ -52,7 +52,9 @@ static inline int gfs2_quota_lock_check(struct gfs2_inode *ip)
 	return ret;
 }
 
-extern int gfs2_shrink_qd_memory(struct shrinker *shrink,
+extern long gfs2_shrink_qd_scan(struct shrinker *shrink,
+				struct shrink_control *sc);
+extern long gfs2_shrink_qd_count(struct shrinker *shrink,
 				 struct shrink_control *sc);
 extern const struct quotactl_ops gfs2_quotactl_ops;
 
diff --git a/fs/inode.c b/fs/inode.c
index 848808f..fee5d9a 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -613,10 +613,11 @@ static int can_unuse(struct inode *inode)
  * LRU does not have strict ordering. Hence we don't want to reclaim inodes
  * with this flag set because they are the inodes that are out of order.
  */
-void prune_icache_sb(struct super_block *sb, int nr_to_scan)
+long prune_icache_sb(struct super_block *sb, long nr_to_scan)
 {
 	LIST_HEAD(freeable);
-	int nr_scanned;
+	long nr_scanned;
+	long freed = 0;
 	unsigned long reap = 0;
 
 	spin_lock(&sb->s_inode_lru_lock);
@@ -686,6 +687,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
 		list_move(&inode->i_lru, &freeable);
 		sb->s_nr_inodes_unused--;
 		this_cpu_dec(nr_unused);
+		freed++;
 	}
 	if (current_is_kswapd())
 		__count_vm_events(KSWAPD_INODESTEAL, reap);
@@ -694,6 +696,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
 	spin_unlock(&sb->s_inode_lru_lock);
 
 	dispose_list(&freeable);
+	return freed;
 }
 
 static void __wait_on_freeing_inode(struct inode *inode);
diff --git a/fs/internal.h b/fs/internal.h
index fe327c2..2662ffa 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -127,6 +127,8 @@ extern long do_handle_open(int mountdirfd,
  * inode.c
  */
 extern spinlock_t inode_sb_list_lock;
+extern long prune_icache_sb(struct super_block *sb, long nr_to_scan);
+
 
 /*
  * fs-writeback.c
@@ -141,3 +143,4 @@ extern int invalidate_inodes(struct super_block *, bool);
  * dcache.c
  */
 extern struct dentry *__d_alloc(struct super_block *, const struct qstr *);
+extern long prune_dcache_sb(struct super_block *sb, long nr_to_scan);
diff --git a/fs/mbcache.c b/fs/mbcache.c
index 8c32ef3..aa3a19a 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -90,11 +90,14 @@ static DEFINE_SPINLOCK(mb_cache_spinlock);
  * What the mbcache registers as to get shrunk dynamically.
  */
 
-static int mb_cache_shrink_fn(struct shrinker *shrink,
-			      struct shrink_control *sc);
+static long mb_cache_shrink_scan(struct shrinker *shrink,
+				 struct shrink_control *sc);
+static long mb_cache_shrink_count(struct shrinker *shrink,
+				  struct shrink_control *sc);
 
 static struct shrinker mb_cache_shrinker = {
-	.shrink = mb_cache_shrink_fn,
+	.scan_objects = mb_cache_shrink_scan,
+	.count_objects = mb_cache_shrink_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
@@ -161,13 +164,12 @@ forget:
  *
  * Returns the number of objects which are present in the cache.
  */
-static int
-mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
+static long
+mb_cache_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	LIST_HEAD(free_list);
-	struct mb_cache *cache;
 	struct mb_cache_entry *entry, *tmp;
-	int count = 0;
+	int freed = 0;
 	int nr_to_scan = sc->nr_to_scan;
 	gfp_t gfp_mask = sc->gfp_mask;
 
@@ -180,18 +182,27 @@ mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
 		list_move_tail(&ce->e_lru_list, &free_list);
 		__mb_cache_entry_unhash(ce);
 	}
-	list_for_each_entry(cache, &mb_cache_list, c_cache_list) {
-		mb_debug("cache %s (%d)", cache->c_name,
-			  atomic_read(&cache->c_entry_count));
-		count += atomic_read(&cache->c_entry_count);
-	}
 	spin_unlock(&mb_cache_spinlock);
 	list_for_each_entry_safe(entry, tmp, &free_list, e_lru_list) {
 		__mb_cache_entry_forget(entry, gfp_mask);
+		freed++;
 	}
-	return (count / 100) * sysctl_vfs_cache_pressure;
+	return freed;
 }
 
+static long
+mb_cache_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct mb_cache *cache;
+	long count = 0;
+
+	spin_lock(&mb_cache_spinlock);
+	list_for_each_entry(cache, &mb_cache_list, c_cache_list)
+		count += atomic_read(&cache->c_entry_count);
+
+	spin_unlock(&mb_cache_spinlock);
+	return (count / 100) * sysctl_vfs_cache_pressure;
+}
 
 /*
  * mb_cache_create()  create a new cache
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index b238d95..a5aefb2 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2057,17 +2057,18 @@ static void nfs_access_free_list(struct list_head *head)
 	}
 }
 
-int nfs_access_cache_shrinker(struct shrinker *shrink,
-			      struct shrink_control *sc)
+long nfs_access_cache_scan(struct shrinker *shrink,
+			   struct shrink_control *sc)
 {
 	LIST_HEAD(head);
 	struct nfs_inode *nfsi, *next;
 	struct nfs_access_entry *cache;
 	int nr_to_scan = sc->nr_to_scan;
+	int freed = 0;
 	gfp_t gfp_mask = sc->gfp_mask;
 
 	if ((gfp_mask & GFP_KERNEL) != GFP_KERNEL)
-		return (nr_to_scan == 0) ? 0 : -1;
+		return -1;
 
 	spin_lock(&nfs_access_lru_lock);
 	list_for_each_entry_safe(nfsi, next, &nfs_access_lru_list, access_cache_inode_lru) {
@@ -2079,6 +2080,7 @@ int nfs_access_cache_shrinker(struct shrinker *shrink,
 		spin_lock(&inode->i_lock);
 		if (list_empty(&nfsi->access_cache_entry_lru))
 			goto remove_lru_entry;
+		freed++;
 		cache = list_entry(nfsi->access_cache_entry_lru.next,
 				struct nfs_access_entry, lru);
 		list_move(&cache->lru, &head);
@@ -2097,7 +2099,14 @@ remove_lru_entry:
 	}
 	spin_unlock(&nfs_access_lru_lock);
 	nfs_access_free_list(&head);
-	return (atomic_long_read(&nfs_access_nr_entries) / 100) * sysctl_vfs_cache_pressure;
+	return freed;
+}
+
+long nfs_access_cache_count(struct shrinker *shrink,
+			    struct shrink_control *sc)
+{
+	return (atomic_long_read(&nfs_access_nr_entries) / 100) *
+						sysctl_vfs_cache_pressure;
 }
 
 static void __nfs_access_zap_cache(struct nfs_inode *nfsi, struct list_head *head)
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index ab12913..9c65e1f 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -244,8 +244,10 @@ extern int nfs_init_client(struct nfs_client *clp,
 			   int noresvport);
 
 /* dir.c */
-extern int nfs_access_cache_shrinker(struct shrinker *shrink,
-					struct shrink_control *sc);
+extern long nfs_access_cache_scan(struct shrinker *shrink,
+				  struct shrink_control *sc);
+extern long nfs_access_cache_count(struct shrinker *shrink,
+				   struct shrink_control *sc);
 
 /* inode.c */
 extern struct workqueue_struct *nfsiod_workqueue;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index b961cea..e088c03 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -380,7 +380,8 @@ static const struct super_operations nfs4_sops = {
 #endif
 
 static struct shrinker acl_shrinker = {
-	.shrink		= nfs_access_cache_shrinker,
+	.scan_objects	= nfs_access_cache_scan,
+	.count_objects	= nfs_access_cache_count,
 	.seeks		= DEFAULT_SEEKS,
 };
 
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 5b572c8..c8724d2 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -669,45 +669,42 @@ int dquot_quota_sync(struct super_block *sb, int type, int wait)
 }
 EXPORT_SYMBOL(dquot_quota_sync);
 
-/* Free unused dquots from cache */
-static void prune_dqcache(int count)
+/*
+ * This is called from kswapd when we think we need some
+ * more memory
+ */
+static long shrink_dqcache_scan(struct shrinker *shrink,
+				 struct shrink_control *sc)
 {
 	struct list_head *head;
 	struct dquot *dquot;
+	int freed = 0;
 
+	spin_lock(&dq_list_lock);
 	head = free_dquots.prev;
-	while (head != &free_dquots && count) {
+	while (head != &free_dquots && freed < sc->nr_to_scan) {
 		dquot = list_entry(head, struct dquot, dq_free);
 		remove_dquot_hash(dquot);
 		remove_free_dquot(dquot);
 		remove_inuse(dquot);
 		do_destroy_dquot(dquot);
-		count--;
+		freed++;
 		head = free_dquots.prev;
 	}
+	spin_unlock(&dq_list_lock);
+
+	return freed;
 }
 
-/*
- * This is called from kswapd when we think we need some
- * more memory
- */
-static int shrink_dqcache_memory(struct shrinker *shrink,
+static long shrink_dqcache_count(struct shrinker *shrink,
 				 struct shrink_control *sc)
 {
-	int nr = sc->nr_to_scan;
-
-	if (nr) {
-		spin_lock(&dq_list_lock);
-		prune_dqcache(nr);
-		spin_unlock(&dq_list_lock);
-	}
-	return ((unsigned)
-		percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
-		/100) * sysctl_vfs_cache_pressure;
+	return (percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
+		/ 100) * sysctl_vfs_cache_pressure;
 }
-
 static struct shrinker dqcache_shrinker = {
-	.shrink = shrink_dqcache_memory,
+	.scan_objects = shrink_dqcache_scan,
+	.count_objects = shrink_dqcache_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/super.c b/fs/super.c
index 6a72693..074abbe 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -45,11 +45,14 @@ DEFINE_SPINLOCK(sb_lock);
  * shrinker path and that leads to deadlock on the shrinker_rwsem. Hence we
  * take a passive reference to the superblock to avoid this from occurring.
  */
-static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
+static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct super_block *sb;
-	int	fs_objects = 0;
-	int	total_objects;
+	long	fs_objects = 0;
+	long	total_objects;
+	long	freed = 0;
+	long	dentries;
+	long	inodes;
 
 	sb = container_of(shrink, struct super_block, s_shrink);
 
@@ -57,7 +60,7 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
 	 * Deadlock avoidance.  We may hold various FS locks, and we don't want
 	 * to recurse into the FS that called us in clear_inode() and friends..
 	 */
-	if (sc->nr_to_scan && !(sc->gfp_mask & __GFP_FS))
+	if (!(sc->gfp_mask & __GFP_FS))
 		return -1;
 
 	if (!grab_super_passive(sb))
@@ -69,33 +72,42 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
 	total_objects = sb->s_nr_dentry_unused +
 			sb->s_nr_inodes_unused + fs_objects + 1;
 
-	if (sc->nr_to_scan) {
-		int	dentries;
-		int	inodes;
-
-		/* proportion the scan between the caches */
-		dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) /
-							total_objects;
-		inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) /
-							total_objects;
-		if (fs_objects)
-			fs_objects = (sc->nr_to_scan * fs_objects) /
-							total_objects;
-		/*
-		 * prune the dcache first as the icache is pinned by it, then
-		 * prune the icache, followed by the filesystem specific caches
-		 */
-		prune_dcache_sb(sb, dentries);
-		prune_icache_sb(sb, inodes);
+	/* proportion the scan between the caches */
+	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
+	inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) / total_objects;
 
-		if (fs_objects && sb->s_op->free_cached_objects) {
-			sb->s_op->free_cached_objects(sb, fs_objects);
-			fs_objects = sb->s_op->nr_cached_objects(sb);
-		}
-		total_objects = sb->s_nr_dentry_unused +
-				sb->s_nr_inodes_unused + fs_objects;
+	/*
+	 * prune the dcache first as the icache is pinned by it, then
+	 * prune the icache, followed by the filesystem specific caches
+	 */
+	freed = prune_dcache_sb(sb, dentries);
+	freed += prune_icache_sb(sb, inodes);
+
+	if (fs_objects) {
+		fs_objects = (sc->nr_to_scan * fs_objects) / total_objects;
+		freed += sb->s_op->free_cached_objects(sb, fs_objects);
 	}
 
+	drop_super(sb);
+	return freed;
+}
+
+static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct super_block *sb;
+	long	total_objects = 0;
+
+	sb = container_of(shrink, struct super_block, s_shrink);
+
+	if (!grab_super_passive(sb))
+		return -1;
+
+	if (sb->s_op && sb->s_op->nr_cached_objects)
+		total_objects = sb->s_op->nr_cached_objects(sb);
+
+	total_objects += sb->s_nr_dentry_unused;
+	total_objects += sb->s_nr_inodes_unused;
+
 	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
 	drop_super(sb);
 	return total_objects;
@@ -182,7 +194,8 @@ static struct super_block *alloc_super(struct file_system_type *type)
 		s->cleancache_poolid = -1;
 
 		s->s_shrink.seeks = DEFAULT_SEEKS;
-		s->s_shrink.shrink = prune_super;
+		s->s_shrink.scan_objects = super_cache_scan;
+		s->s_shrink.count_objects = super_cache_count;
 		s->s_shrink.batch = 1024;
 	}
 out:
diff --git a/fs/ubifs/shrinker.c b/fs/ubifs/shrinker.c
index 9e1d056..78ca7b7 100644
--- a/fs/ubifs/shrinker.c
+++ b/fs/ubifs/shrinker.c
@@ -277,19 +277,12 @@ static int kick_a_thread(void)
 	return 0;
 }
 
-int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc)
+long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	int nr = sc->nr_to_scan;
 	int freed, contention = 0;
 	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
 
-	if (nr == 0)
-		/*
-		 * Due to the way UBIFS updates the clean znode counter it may
-		 * temporarily be negative.
-		 */
-		return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
-
 	if (!clean_zn_cnt) {
 		/*
 		 * No clean znodes, nothing to reap. All we can do in this case
@@ -323,3 +316,13 @@ out:
 	dbg_tnc("%d znodes were freed, requested %d", freed, nr);
 	return freed;
 }
+
+long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
+	/*
+	 * Due to the way UBIFS updates the clean znode counter it may
+	 * temporarily be negative.
+	 */
+	return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
+}
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 91903f6..3d3f3e9 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -49,7 +49,8 @@ struct kmem_cache *ubifs_inode_slab;
 
 /* UBIFS TNC shrinker description */
 static struct shrinker ubifs_shrinker_info = {
-	.shrink = ubifs_shrinker,
+	.scan_objects = ubifs_shrinker_scan,
+	.count_objects = ubifs_shrinker_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 27f2255..2b8f48c 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -1625,7 +1625,8 @@ int ubifs_tnc_start_commit(struct ubifs_info *c, struct ubifs_zbranch *zroot);
 int ubifs_tnc_end_commit(struct ubifs_info *c);
 
 /* shrinker.c */
-int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc);
+long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc);
+long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc);
 
 /* commit.c */
 int ubifs_bg_thread(void *info);
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 7a026cb..b2eea9e 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1456,8 +1456,8 @@ restart:
 	spin_unlock(&btp->bt_lru_lock);
 }
 
-int
-xfs_buftarg_shrink(
+static long
+xfs_buftarg_shrink_scan(
 	struct shrinker		*shrink,
 	struct shrink_control	*sc)
 {
@@ -1465,6 +1465,7 @@ xfs_buftarg_shrink(
 					struct xfs_buftarg, bt_shrinker);
 	struct xfs_buf		*bp;
 	int nr_to_scan = sc->nr_to_scan;
+	int freed = 0;
 	LIST_HEAD(dispose);
 
 	if (!nr_to_scan)
@@ -1493,6 +1494,7 @@ xfs_buftarg_shrink(
 		 */
 		list_move(&bp->b_lru, &dispose);
 		btp->bt_lru_nr--;
+		freed++;
 	}
 	spin_unlock(&btp->bt_lru_lock);
 
@@ -1502,6 +1504,16 @@ xfs_buftarg_shrink(
 		xfs_buf_rele(bp);
 	}
 
+	return freed;
+}
+
+static long
+xfs_buftarg_shrink_count(
+	struct shrinker		*shrink,
+	struct shrink_control	*sc)
+{
+	struct xfs_buftarg	*btp = container_of(shrink,
+					struct xfs_buftarg, bt_shrinker);
 	return btp->bt_lru_nr;
 }
 
@@ -1602,7 +1614,8 @@ xfs_alloc_buftarg(
 		goto error;
 	if (xfs_alloc_delwrite_queue(btp, fsname))
 		goto error;
-	btp->bt_shrinker.shrink = xfs_buftarg_shrink;
+	btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
+	btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
 	btp->bt_shrinker.seeks = DEFAULT_SEEKS;
 	register_shrinker(&btp->bt_shrinker);
 	return btp;
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 9a0aa76..19863a8 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -60,10 +60,12 @@ STATIC void	xfs_qm_list_destroy(xfs_dqlist_t *);
 
 STATIC int	xfs_qm_init_quotainos(xfs_mount_t *);
 STATIC int	xfs_qm_init_quotainfo(xfs_mount_t *);
-STATIC int	xfs_qm_shake(struct shrinker *, struct shrink_control *);
+STATIC long	xfs_qm_shake_scan(struct shrinker *, struct shrink_control *);
+STATIC long	xfs_qm_shake_count(struct shrinker *, struct shrink_control *);
 
 static struct shrinker xfs_qm_shaker = {
-	.shrink = xfs_qm_shake,
+	.scan_objects = xfs_qm_shake_scan,
+	.count_objects = xfs_qm_shake_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
@@ -1963,9 +1965,8 @@ xfs_qm_shake_freelist(
 /*
  * The kmem_shake interface is invoked when memory is running low.
  */
-/* ARGSUSED */
-STATIC int
-xfs_qm_shake(
+STATIC long
+xfs_qm_shake_scan(
 	struct shrinker	*shrink,
 	struct shrink_control *sc)
 {
@@ -1973,9 +1974,9 @@ xfs_qm_shake(
 	gfp_t gfp_mask = sc->gfp_mask;
 
 	if (!kmem_shake_allow(gfp_mask))
-		return 0;
+		return -1;
 	if (!xfs_Gqm)
-		return 0;
+		return -1;
 
 	nfree = xfs_Gqm->qm_dqfrlist_cnt; /* free dquots */
 	/* incore dquots in all f/s's */
@@ -1992,6 +1993,13 @@ xfs_qm_shake(
 	return xfs_qm_shake_freelist(MAX(nfree, n));
 }
 
+STATIC long
+xfs_qm_shake_count(
+	struct shrinker	*shrink,
+	struct shrink_control *sc)
+{
+	return xfs_Gqm ? xfs_Gqm->qm_dqfrlist_cnt : -1;
+}
 
 /*------------------------------------------------------------------*/
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c94ec22..dff4b67 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1473,19 +1473,19 @@ xfs_fs_mount(
 	return mount_bdev(fs_type, flags, dev_name, data, xfs_fs_fill_super);
 }
 
-static int
+static long
 xfs_fs_nr_cached_objects(
 	struct super_block	*sb)
 {
 	return xfs_reclaim_inodes_count(XFS_M(sb));
 }
 
-static void
+static long
 xfs_fs_free_cached_objects(
 	struct super_block	*sb,
-	int			nr_to_scan)
+	long			nr_to_scan)
 {
-	xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
+	return xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
 }
 
 static const struct super_operations xfs_super_operations = {
diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
index 4604f90..5b60a3a 100644
--- a/fs/xfs/xfs_sync.c
+++ b/fs/xfs/xfs_sync.c
@@ -896,7 +896,7 @@ int
 xfs_reclaim_inodes_ag(
 	struct xfs_mount	*mp,
 	int			flags,
-	int			*nr_to_scan)
+	long			*nr_to_scan)
 {
 	struct xfs_perag	*pag;
 	int			error = 0;
@@ -1017,7 +1017,7 @@ xfs_reclaim_inodes(
 	xfs_mount_t	*mp,
 	int		mode)
 {
-	int		nr_to_scan = INT_MAX;
+	long		nr_to_scan = LONG_MAX;
 
 	return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
 }
@@ -1031,29 +1031,32 @@ xfs_reclaim_inodes(
  * them to be cleaned, which we hope will not be very long due to the
  * background walker having already kicked the IO off on those dirty inodes.
  */
-void
+long
 xfs_reclaim_inodes_nr(
 	struct xfs_mount	*mp,
-	int			nr_to_scan)
+	long			nr_to_scan)
 {
+	long nr = nr_to_scan;
+
 	/* kick background reclaimer and push the AIL */
 	xfs_syncd_queue_reclaim(mp);
 	xfs_ail_push_all(mp->m_ail);
 
-	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan);
+	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr);
+	return nr_to_scan - nr;
 }
 
 /*
  * Return the number of reclaimable inodes in the filesystem for
  * the shrinker to determine how much to reclaim.
  */
-int
+long
 xfs_reclaim_inodes_count(
 	struct xfs_mount	*mp)
 {
 	struct xfs_perag	*pag;
 	xfs_agnumber_t		ag = 0;
-	int			reclaimable = 0;
+	long			reclaimable = 0;
 
 	while ((pag = xfs_perag_get_tag(mp, ag, XFS_ICI_RECLAIM_TAG))) {
 		ag = pag->pag_agno + 1;
diff --git a/fs/xfs/xfs_sync.h b/fs/xfs/xfs_sync.h
index 941202e..82e1b1c 100644
--- a/fs/xfs/xfs_sync.h
+++ b/fs/xfs/xfs_sync.h
@@ -35,8 +35,8 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
 void xfs_flush_inodes(struct xfs_inode *ip);
 
 int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
-int xfs_reclaim_inodes_count(struct xfs_mount *mp);
-void xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
+long xfs_reclaim_inodes_count(struct xfs_mount *mp);
+long xfs_reclaim_inodes_nr(struct xfs_mount *mp, long nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 void __xfs_inode_set_reclaim_tag(struct xfs_perag *pag, struct xfs_inode *ip);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 14be4d8..958c025 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1465,10 +1465,6 @@ struct super_block {
 	struct shrinker s_shrink;	/* per-sb shrinker handle */
 };
 
-/* superblock cache pruning functions */
-extern void prune_icache_sb(struct super_block *sb, int nr_to_scan);
-extern void prune_dcache_sb(struct super_block *sb, int nr_to_scan);
-
 extern struct timespec current_fs_time(struct super_block *sb);
 
 /*
@@ -1662,8 +1658,8 @@ struct super_operations {
 	ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
 #endif
 	int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
-	int (*nr_cached_objects)(struct super_block *);
-	void (*free_cached_objects)(struct super_block *, int);
+	long (*nr_cached_objects)(struct super_block *);
+	long (*free_cached_objects)(struct super_block *, long);
 };
 
 /*
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 36851f7..80308ea 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -190,7 +190,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
-		__field(void *, shrink)
+		__field(void *, scan)
 		__field(long, nr_objects_to_shrink)
 		__field(gfp_t, gfp_flags)
 		__field(unsigned long, pgs_scanned)
@@ -202,7 +202,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->shrink = shr->shrink;
+		__entry->scan = shr->scan_objects;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = sc->gfp_mask;
 		__entry->pgs_scanned = pgs_scanned;
@@ -213,7 +213,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 	),
 
 	TP_printk("%pF %p: objects to shrink %ld gfp_flags %s pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
-		__entry->shrink,
+		__entry->scan,
 		__entry->shr,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
@@ -232,7 +232,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
-		__field(void *, shrink)
+		__field(void *, scan)
 		__field(long, unused_scan)
 		__field(long, new_scan)
 		__field(int, retval)
@@ -241,7 +241,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->shrink = shr->shrink;
+		__entry->scan = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
 		__entry->new_scan = new_scan_cnt;
 		__entry->retval = shrinker_retval;
@@ -249,7 +249,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 	),
 
 	TP_printk("%pF %p: unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
-		__entry->shrink,
+		__entry->scan,
 		__entry->shr,
 		__entry->unused_scan,
 		__entry->new_scan,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7ef6912..e32ce2d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -202,14 +202,6 @@ void unregister_shrinker(struct shrinker *shrinker)
 }
 EXPORT_SYMBOL(unregister_shrinker);
 
-static inline int do_shrinker_shrink(struct shrinker *shrinker,
-				     struct shrink_control *sc,
-				     unsigned long nr_to_scan)
-{
-	sc->nr_to_scan = nr_to_scan;
-	return (*shrinker->shrink)(shrinker, sc);
-}
-
 #define SHRINK_BATCH 128
 /*
  * Call the shrink functions to age shrinkable caches
@@ -230,27 +222,26 @@ static inline int do_shrinker_shrink(struct shrinker *shrinker,
  *
  * Returns the number of slab objects which we shrunk.
  */
-unsigned long shrink_slab(struct shrink_control *shrink,
+unsigned long shrink_slab(struct shrink_control *sc,
 			  unsigned long nr_pages_scanned,
 			  unsigned long lru_pages)
 {
 	struct shrinker *shrinker;
-	unsigned long ret = 0;
+	unsigned long freed = 0;
 
 	if (nr_pages_scanned == 0)
 		nr_pages_scanned = SWAP_CLUSTER_MAX;
 
 	if (!down_read_trylock(&shrinker_rwsem)) {
 		/* Assume we'll be able to shrink next time */
-		ret = 1;
+		freed = 1;
 		goto out;
 	}
 
 	list_for_each_entry(shrinker, &shrinker_list, list) {
-		unsigned long long delta;
-		unsigned long total_scan;
-		unsigned long max_pass;
-		int shrink_ret = 0;
+		long long delta;
+		long total_scan;
+		long max_pass;
 		long nr;
 		long new_nr;
 		long batch_size = shrinker->batch ? shrinker->batch
@@ -266,7 +257,9 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 		} while (cmpxchg(&shrinker->nr, nr, 0) != nr);
 
 		total_scan = nr;
-		max_pass = do_shrinker_shrink(shrinker, shrink, 0);
+		max_pass = shrinker->count_objects(shrinker, sc);
+		WARN_ON_ONCE(max_pass < 0);
+
 		delta = (4 * nr_pages_scanned) / shrinker->seeks;
 		delta *= max_pass;
 		do_div(delta, lru_pages + 1);
@@ -274,7 +267,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 		if (total_scan < 0) {
 			printk(KERN_ERR "shrink_slab: %pF negative objects to "
 			       "delete nr=%ld\n",
-			       shrinker->shrink, total_scan);
+			       shrinker->scan_objects, total_scan);
 			total_scan = max_pass;
 		}
 
@@ -301,20 +294,19 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 		if (total_scan > max_pass * 2)
 			total_scan = max_pass * 2;
 
-		trace_mm_shrink_slab_start(shrinker, shrink, nr,
+		trace_mm_shrink_slab_start(shrinker, sc, nr,
 					nr_pages_scanned, lru_pages,
 					max_pass, delta, total_scan);
 
 		while (total_scan >= batch_size) {
-			int nr_before;
+			long ret;
+
+			sc->nr_to_scan = batch_size;
+			ret = shrinker->scan_objects(shrinker, sc);
 
-			nr_before = do_shrinker_shrink(shrinker, shrink, 0);
-			shrink_ret = do_shrinker_shrink(shrinker, shrink,
-							batch_size);
-			if (shrink_ret == -1)
+			if (ret == -1)
 				break;
-			if (shrink_ret < nr_before)
-				ret += nr_before - shrink_ret;
+			freed += ret;
 			count_vm_events(SLABS_SCANNED, batch_size);
 			total_scan -= batch_size;
 
@@ -333,12 +325,12 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 				break;
 		} while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
 
-		trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
+		trace_mm_shrink_slab_end(shrinker, freed, nr, new_nr);
 	}
 	up_read(&shrinker_rwsem);
 out:
 	cond_resched();
-	return ret;
+	return freed;
 }
 
 static void set_reclaim_mode(int priority, struct scan_control *sc,
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 727e506..f5955c3 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -292,6 +292,7 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
 	spinlock_t *cache_lock;
 	struct rpc_cred *cred, *next;
 	unsigned long expired = jiffies - RPC_AUTH_EXPIRY_MORATORIUM;
+	int freed = 0;
 
 	list_for_each_entry_safe(cred, next, &cred_unused, cr_lru) {
 
@@ -303,10 +304,10 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
 		 */
 		if (time_in_range(cred->cr_expire, expired, jiffies) &&
 		    test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
-			return 0;
+			break;
 
-		list_del_init(&cred->cr_lru);
 		number_cred_unused--;
+		list_del_init(&cred->cr_lru);
 		if (atomic_read(&cred->cr_count) != 0)
 			continue;
 
@@ -316,17 +317,18 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
 			get_rpccred(cred);
 			list_add_tail(&cred->cr_lru, free);
 			rpcauth_unhash_cred_locked(cred);
+			freed++;
 		}
 		spin_unlock(cache_lock);
 	}
-	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
+	return freed;
 }
 
 /*
  * Run memory cache shrinker.
  */
-static int
-rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
+static long
+rpcauth_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	LIST_HEAD(free);
 	int res;
@@ -344,6 +346,12 @@ rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
 	return res;
 }
 
+static long
+rpcauth_cache_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
+}
+
 /*
  * Look up a process' credentials in the authentication cache
  */
@@ -658,7 +666,8 @@ rpcauth_uptodatecred(struct rpc_task *task)
 }
 
 static struct shrinker rpc_cred_shrinker = {
-	.shrink = rpcauth_cache_shrinker,
+	.scan_objects = rpcauth_cache_scan,
+	.count_objects = rpcauth_cache_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 05/13] mm: convert shrinkers to use new API
@ 2011-08-23  8:56   ` Dave Chinner
  0 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Modify shrink_slab() to use the new .count_objects/.scan_objects API
and implement the callouts for all the existing shrinkers.
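
For reference, the contract the new API imposes looks roughly like the
sketch below. The struct layouts are simplified stand-ins, not the
actual kernel shrinker definitions: ->count_objects() returns a cheap
(possibly approximate) estimate of how many objects the cache could
free, while ->scan_objects() frees up to sc->nr_to_scan objects and
returns the number actually freed, or -1 if nothing can be done under
the current gfp context.

/*
 * Illustrative sketch only -- simplified stand-in types, not the
 * kernel's real shrinker definitions.
 */
struct shrink_control {
	long nr_to_scan;	/* batch size chosen by shrink_slab() */
};

struct shrinker {
	long (*count_objects)(struct shrinker *, struct shrink_control *);
	long (*scan_objects)(struct shrinker *, struct shrink_control *);
	int seeks;
};

static long example_nr_cached = 1000;	/* pretend cache population */

/* count: cheap estimate, must not attempt any reclaim itself */
static long example_count(struct shrinker *shrink,
			  struct shrink_control *sc)
{
	return example_nr_cached;
}

/* scan: free up to sc->nr_to_scan objects, return the number freed */
static long example_scan(struct shrinker *shrink,
			 struct shrink_control *sc)
{
	long freed = sc->nr_to_scan;

	if (freed > example_nr_cached)
		freed = example_nr_cached;
	example_nr_cached -= freed;
	return freed;
}

static struct shrinker example_shrinker = {
	.count_objects	= example_count,
	.scan_objects	= example_scan,
	.seeks		= 2,		/* stands in for DEFAULT_SEEKS */
};

shrink_slab() calls ->count_objects() once per pass to size the work,
then calls ->scan_objects() in batch-sized chunks until the computed
scan target is consumed or the shrinker returns -1.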

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 Documentation/filesystems/vfs.txt    |   11 +++--
 arch/x86/kvm/mmu.c                   |   16 ++++---
 drivers/gpu/drm/i915/i915_dma.c      |    4 +-
 drivers/gpu/drm/i915/i915_gem.c      |   49 ++++++++++++++---------
 drivers/gpu/drm/ttm/ttm_page_alloc.c |   14 ++++--
 drivers/staging/zcache/zcache-main.c |   45 ++++++++++++---------
 fs/cifs/cifsacl.c                    |   57 +++++++++++++++++----------
 fs/dcache.c                          |   15 ++++---
 fs/gfs2/glock.c                      |   24 +++++++-----
 fs/gfs2/main.c                       |    3 +-
 fs/gfs2/quota.c                      |   19 +++++----
 fs/gfs2/quota.h                      |    4 +-
 fs/inode.c                           |    7 ++-
 fs/internal.h                        |    3 +
 fs/mbcache.c                         |   37 +++++++++++------
 fs/nfs/dir.c                         |   17 ++++++--
 fs/nfs/internal.h                    |    6 ++-
 fs/nfs/super.c                       |    3 +-
 fs/quota/dquot.c                     |   39 +++++++++----------
 fs/super.c                           |   71 ++++++++++++++++++++--------------
 fs/ubifs/shrinker.c                  |   19 +++++----
 fs/ubifs/super.c                     |    3 +-
 fs/ubifs/ubifs.h                     |    3 +-
 fs/xfs/xfs_buf.c                     |   19 ++++++++-
 fs/xfs/xfs_qm.c                      |   22 +++++++---
 fs/xfs/xfs_super.c                   |    8 ++--
 fs/xfs/xfs_sync.c                    |   17 +++++---
 fs/xfs/xfs_sync.h                    |    4 +-
 include/linux/fs.h                   |    8 +---
 include/trace/events/vmscan.h        |   12 +++---
 mm/vmscan.c                          |   46 +++++++++-------------
 net/sunrpc/auth.c                    |   21 +++++++---
 32 files changed, 369 insertions(+), 257 deletions(-)

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 52d8fb8..4ca3c2d 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -229,8 +229,8 @@ struct super_operations {
 
         ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
         ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
-	int (*nr_cached_objects)(struct super_block *);
-	void (*free_cached_objects)(struct super_block *, int);
+	long (*nr_cached_objects)(struct super_block *);
+	long (*free_cached_objects)(struct super_block *, long);
 };
 
 All methods are called without any locks being held, unless otherwise
@@ -313,9 +313,10 @@ or bottom half).
 	implement ->nr_cached_objects for it to be called correctly.
 
 	We can't do anything with any errors that the filesystem might
-	encountered, hence the void return type. This will never be called if
-	the VM is trying to reclaim under GFP_NOFS conditions, hence this
-	method does not need to handle that situation itself.
+	encounter, so the return value is the number of objects freed. This
+	will never be called if the VM is trying to reclaim under GFP_NOFS
+	conditions, hence this method does not need to handle that situation
+	itself.
 
 	Implementations must include conditional reschedule calls inside any
 	scanning loop that is done. This allows the VFS to determine
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1c5b693..939e201 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3858,14 +3858,12 @@ static int kvm_mmu_remove_some_alloc_mmu_pages(struct kvm *kvm,
 	return kvm_mmu_prepare_zap_page(kvm, page, invalid_list);
 }
 
-static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
+static long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct kvm *kvm;
 	struct kvm *kvm_freed = NULL;
 	int nr_to_scan = sc->nr_to_scan;
-
-	if (nr_to_scan == 0)
-		goto out;
+	long freed_pages = 0;
 
 	raw_spin_lock(&kvm_lock);
 
@@ -3877,7 +3875,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 		spin_lock(&kvm->mmu_lock);
 		if (!kvm_freed && nr_to_scan > 0 &&
 		    kvm->arch.n_used_mmu_pages > 0) {
-			freed_pages = kvm_mmu_remove_some_alloc_mmu_pages(kvm,
+			freed_pages += kvm_mmu_remove_some_alloc_mmu_pages(kvm,
 							  &invalid_list);
 			kvm_freed = kvm;
 		}
@@ -3891,13 +3889,17 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 		list_move_tail(&kvm_freed->vm_list, &vm_list);
 
 	raw_spin_unlock(&kvm_lock);
+	return freed_pages;
+}
 
-out:
+static long mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
 	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
 }
 
 static struct shrinker mmu_shrinker = {
-	.shrink = mmu_shrink,
+	.scan_objects = mmu_shrink_scan,
+	.count_objects = mmu_shrink_count,
 	.seeks = DEFAULT_SEEKS * 10,
 };
 
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 8a3942c..734ea5e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -2074,7 +2074,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	return 0;
 
 out_gem_unload:
-	if (dev_priv->mm.inactive_shrinker.shrink)
+	if (dev_priv->mm.inactive_shrinker.scan_objects)
 		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
 
 	if (dev->pdev->msi_enabled)
@@ -2108,7 +2108,7 @@ int i915_driver_unload(struct drm_device *dev)
 	i915_mch_dev = NULL;
 	spin_unlock(&mchdev_lock);
 
-	if (dev_priv->mm.inactive_shrinker.shrink)
+	if (dev_priv->mm.inactive_shrinker.scan_objects)
 		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
 
 	mutex_lock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a546a71..0647a33 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -56,7 +56,9 @@ static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_file *file);
 static void i915_gem_free_object_tail(struct drm_i915_gem_object *obj);
 
-static int i915_gem_inactive_shrink(struct shrinker *shrinker,
+static long i915_gem_inactive_scan(struct shrinker *shrinker,
+				   struct shrink_control *sc);
+static long i915_gem_inactive_count(struct shrinker *shrinker,
 				    struct shrink_control *sc);
 
 /* some bookkeeping */
@@ -3999,7 +4001,8 @@ i915_gem_load(struct drm_device *dev)
 
 	dev_priv->mm.interruptible = true;
 
-	dev_priv->mm.inactive_shrinker.shrink = i915_gem_inactive_shrink;
+	dev_priv->mm.inactive_shrinker.scan_objects = i915_gem_inactive_scan;
+	dev_priv->mm.inactive_shrinker.count_objects = i915_gem_inactive_count;
 	dev_priv->mm.inactive_shrinker.seeks = DEFAULT_SEEKS;
 	register_shrinker(&dev_priv->mm.inactive_shrinker);
 }
@@ -4221,8 +4224,8 @@ i915_gpu_is_active(struct drm_device *dev)
 	return !lists_empty;
 }
 
-static int
-i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
+static long
+i915_gem_inactive_scan(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct drm_i915_private *dev_priv =
 		container_of(shrinker,
@@ -4231,22 +4234,10 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	struct drm_device *dev = dev_priv->dev;
 	struct drm_i915_gem_object *obj, *next;
 	int nr_to_scan = sc->nr_to_scan;
-	int cnt;
 
 	if (!mutex_trylock(&dev->struct_mutex))
 		return 0;
 
-	/* "fast-path" to count number of available objects */
-	if (nr_to_scan == 0) {
-		cnt = 0;
-		list_for_each_entry(obj,
-				    &dev_priv->mm.inactive_list,
-				    mm_list)
-			cnt++;
-		mutex_unlock(&dev->struct_mutex);
-		return cnt / 100 * sysctl_vfs_cache_pressure;
-	}
-
 rescan:
 	/* first scan for clean buffers */
 	i915_gem_retire_requests(dev);
@@ -4262,15 +4253,12 @@ rescan:
 	}
 
 	/* second pass, evict/count anything still on the inactive list */
-	cnt = 0;
 	list_for_each_entry_safe(obj, next,
 				 &dev_priv->mm.inactive_list,
 				 mm_list) {
 		if (nr_to_scan &&
 		    i915_gem_object_unbind(obj) == 0)
 			nr_to_scan--;
-		else
-			cnt++;
 	}
 
 	if (nr_to_scan && i915_gpu_is_active(dev)) {
@@ -4284,5 +4272,26 @@ rescan:
 			goto rescan;
 	}
 	mutex_unlock(&dev->struct_mutex);
-	return cnt / 100 * sysctl_vfs_cache_pressure;
+	return sc->nr_to_scan - nr_to_scan;
+}
+
+static long
+i915_gem_inactive_count(struct shrinker *shrinker, struct shrink_control *sc)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(shrinker,
+			     struct drm_i915_private,
+			     mm.inactive_shrinker);
+	struct drm_device *dev = dev_priv->dev;
+	struct drm_i915_gem_object *obj;
+	long count = 0;
+
+	if (!mutex_trylock(&dev->struct_mutex))
+		return 0;
+
+	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list)
+		count++;
+
+	mutex_unlock(&dev->struct_mutex);
+	return count;
 }
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index 727e93d..3e71c68 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -395,14 +395,13 @@ static int ttm_pool_get_num_unused_pages(void)
 /**
  * Callback for mm to request pool to reduce number of page held.
  */
-static int ttm_pool_mm_shrink(struct shrinker *shrink,
-			      struct shrink_control *sc)
+static long ttm_pool_mm_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	static atomic_t start_pool = ATOMIC_INIT(0);
 	unsigned i;
 	unsigned pool_offset = atomic_add_return(1, &start_pool);
 	struct ttm_page_pool *pool;
-	int shrink_pages = sc->nr_to_scan;
+	long shrink_pages = sc->nr_to_scan;
 
 	pool_offset = pool_offset % NUM_POOLS;
 	/* select start pool in round robin fashion */
@@ -413,13 +412,18 @@ static int ttm_pool_mm_shrink(struct shrinker *shrink,
 		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
 		shrink_pages = ttm_page_pool_free(pool, nr_free);
 	}
-	/* return estimated number of unused pages in pool */
+	return sc->nr_to_scan - shrink_pages;
+}
+
+static long ttm_pool_mm_count(struct shrinker *shrink, struct shrink_control *sc)
+{
 	return ttm_pool_get_num_unused_pages();
 }
 
 static void ttm_pool_mm_shrink_init(struct ttm_pool_manager *manager)
 {
-	manager->mm_shrink.shrink = &ttm_pool_mm_shrink;
+	manager->mm_shrink.scan_objects = ttm_pool_mm_scan;
+	manager->mm_shrink.count_objects = ttm_pool_mm_count;
 	manager->mm_shrink.seeks = 1;
 	register_shrinker(&manager->mm_shrink);
 }
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 855a5bb..3ccb723 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -493,9 +493,10 @@ static void zbud_evict_zbpg(struct zbud_page *zbpg)
  * page in use by another cpu, but also to avoid potential deadlock due to
  * lock inversion.
  */
-static void zbud_evict_pages(int nr)
+static int zbud_evict_pages(int nr)
 {
 	struct zbud_page *zbpg;
+	int freed = 0;
 	int i;
 
 	/* first try freeing any pages on unused list */
@@ -511,7 +512,7 @@ retry_unused_list:
 		spin_unlock_bh(&zbpg_unused_list_spinlock);
 		zcache_free_page(zbpg);
 		zcache_evicted_raw_pages++;
-		if (--nr <= 0)
+		if (++freed >= nr)
 			goto out;
 		goto retry_unused_list;
 	}
@@ -535,7 +536,7 @@ retry_unbud_list_i:
 			/* want budlists unlocked when doing zbpg eviction */
 			zbud_evict_zbpg(zbpg);
 			local_bh_enable();
-			if (--nr <= 0)
+			if (++freed >= nr)
 				goto out;
 			goto retry_unbud_list_i;
 		}
@@ -559,13 +560,13 @@ retry_bud_list:
 		/* want budlists unlocked when doing zbpg eviction */
 		zbud_evict_zbpg(zbpg);
 		local_bh_enable();
-		if (--nr <= 0)
+		if (++freed >= nr)
 			goto out;
 		goto retry_bud_list;
 	}
 	spin_unlock_bh(&zbud_budlists_spinlock);
 out:
-	return;
+	return freed;
 }
 
 static void zbud_init(void)
@@ -1496,30 +1497,34 @@ static bool zcache_freeze;
 /*
  * zcache shrinker interface (only useful for ephemeral pages, so zbud only)
  */
-static int shrink_zcache_memory(struct shrinker *shrink,
-				struct shrink_control *sc)
+static long shrink_zcache_scan(struct shrinker *shrink,
+			       struct shrink_control *sc)
 {
 	int ret = -1;
 	int nr = sc->nr_to_scan;
 	gfp_t gfp_mask = sc->gfp_mask;
 
-	if (nr >= 0) {
-		if (!(gfp_mask & __GFP_FS))
-			/* does this case really need to be skipped? */
-			goto out;
-		if (spin_trylock(&zcache_direct_reclaim_lock)) {
-			zbud_evict_pages(nr);
-			spin_unlock(&zcache_direct_reclaim_lock);
-		} else
-			zcache_aborted_shrink++;
-	}
-	ret = (int)atomic_read(&zcache_zbud_curr_raw_pages);
-out:
+	if (!(gfp_mask & __GFP_FS))
+		return -1;
+
+	if (spin_trylock(&zcache_direct_reclaim_lock)) {
+		ret = zbud_evict_pages(nr);
+		spin_unlock(&zcache_direct_reclaim_lock);
+	} else
+		zcache_aborted_shrink++;
+
 	return ret;
 }
 
+static long shrink_zcache_count(struct shrinker *shrink,
+				struct shrink_control *sc)
+{
+	return atomic_read(&zcache_zbud_curr_raw_pages);
+}
+
 static struct shrinker zcache_shrinker = {
-	.shrink = shrink_zcache_memory,
+	.scan_objects = shrink_zcache_scan,
+	.count_objects = shrink_zcache_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index d0f59fa..508a684 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -44,58 +44,73 @@ static const struct cifs_sid sid_user = {1, 2 , {0, 0, 0, 0, 0, 5}, {} };
 
 const struct cred *root_cred;
 
-static void
-shrink_idmap_tree(struct rb_root *root, int nr_to_scan, int *nr_rem,
-			int *nr_del)
+static long
+shrink_idmap_tree(struct rb_root *root, int nr_to_scan)
 {
 	struct rb_node *node;
 	struct rb_node *tmp;
 	struct cifs_sid_id *psidid;
+	long count = 0;
 
 	node = rb_first(root);
 	while (node) {
 		tmp = node;
 		node = rb_next(tmp);
 		psidid = rb_entry(tmp, struct cifs_sid_id, rbnode);
-		if (nr_to_scan == 0 || *nr_del == nr_to_scan)
-			++(*nr_rem);
-		else {
-			if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
-						&& psidid->refcount == 0) {
-				rb_erase(tmp, root);
-				++(*nr_del);
-			} else
-				++(*nr_rem);
+		if (nr_to_scan == 0) {
+			count++;
+			continue;
+		}
+		if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
+					&& psidid->refcount == 0) {
+			rb_erase(tmp, root);
+			if (++count >= nr_to_scan)
+				break;
 		}
 	}
+	return count;
 }
 
 /*
  * Run idmap cache shrinker.
  */
-static int
-cifs_idmap_shrinker(struct shrinker *shrink, struct shrink_control *sc)
+static long
+cifs_idmap_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
-	int nr_to_scan = sc->nr_to_scan;
-	int nr_del = 0;
-	int nr_rem = 0;
 	struct rb_root *root;
+	long freed;
 
 	root = &uidtree;
 	spin_lock(&siduidlock);
-	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
+	freed = shrink_idmap_tree(root, sc->nr_to_scan);
 	spin_unlock(&siduidlock);
 
 	root = &gidtree;
 	spin_lock(&sidgidlock);
-	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
+	freed += shrink_idmap_tree(root, sc->nr_to_scan);
 	spin_unlock(&sidgidlock);
 
-	return nr_rem;
+	return freed;
+}
+
+/*
+ * This still abuses the nr_to_scan == 0 trick to get the common code just to
+ * count objects. There needs to be an external count of the objects in the
+ * caches to avoid this.
+ */
+static long
+cifs_idmap_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct shrink_control dummy = {
+		.nr_to_scan = 0,
+	};
+
+	return cifs_idmap_shrinker_scan(shrink, &dummy);
 }
 
 static struct shrinker cifs_shrinker = {
-	.shrink = cifs_idmap_shrinker,
+	.scan_objects = cifs_idmap_shrinker_scan,
+	.count_objects = cifs_idmap_shrinker_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/dcache.c b/fs/dcache.c
index 5123d71..d19e453 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -759,11 +759,12 @@ static void shrink_dentry_list(struct list_head *list)
  *
  * If flags contains DCACHE_REFERENCED reference dentries will not be pruned.
  */
-static void __shrink_dcache_sb(struct super_block *sb, int count, int flags)
+static long __shrink_dcache_sb(struct super_block *sb, long count, int flags)
 {
 	struct dentry *dentry;
 	LIST_HEAD(referenced);
 	LIST_HEAD(tmp);
+	long freed = 0;
 
 relock:
 	spin_lock(&sb->s_dentry_lru_lock);
@@ -791,6 +792,7 @@ relock:
 		} else {
 			list_move_tail(&dentry->d_lru, &tmp);
 			spin_unlock(&dentry->d_lock);
+			freed++;
 			if (!--count)
 				break;
 		}
@@ -801,6 +803,7 @@ relock:
 	spin_unlock(&sb->s_dentry_lru_lock);
 
 	shrink_dentry_list(&tmp);
+	return freed;
 }
 
 /**
@@ -815,9 +818,9 @@ relock:
  * This function may fail to free any resources if all the dentries are in
  * use.
  */
-void prune_dcache_sb(struct super_block *sb, int nr_to_scan)
+long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
 {
-	__shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
+	return __shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
 }
 
 /**
@@ -1070,12 +1073,12 @@ EXPORT_SYMBOL(have_submounts);
  * drop the lock and return early due to latency
  * constraints.
  */
-static int select_parent(struct dentry * parent)
+static long select_parent(struct dentry * parent)
 {
 	struct dentry *this_parent;
 	struct list_head *next;
 	unsigned seq;
-	int found = 0;
+	long found = 0;
 	int locked = 0;
 
 	seq = read_seqbegin(&rename_lock);
@@ -1163,7 +1166,7 @@ rename_retry:
 void shrink_dcache_parent(struct dentry * parent)
 {
 	struct super_block *sb = parent->d_sb;
-	int found;
+	long found;
 
 	while ((found = select_parent(parent)) != 0)
 		__shrink_dcache_sb(sb, found, 0);
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 88e8a23..f9bc88d 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1370,24 +1370,21 @@ void gfs2_glock_complete(struct gfs2_glock *gl, int ret)
 }
 
 
-static int gfs2_shrink_glock_memory(struct shrinker *shrink,
-				    struct shrink_control *sc)
+static long gfs2_shrink_glock_scan(struct shrinker *shrink,
+				   struct shrink_control *sc)
 {
 	struct gfs2_glock *gl;
 	int may_demote;
 	int nr_skipped = 0;
-	int nr = sc->nr_to_scan;
+	int freed = 0;
 	gfp_t gfp_mask = sc->gfp_mask;
 	LIST_HEAD(skipped);
 
-	if (nr == 0)
-		goto out;
-
 	if (!(gfp_mask & __GFP_FS))
 		return -1;
 
 	spin_lock(&lru_lock);
-	while(nr && !list_empty(&lru_list)) {
+	while (freed < sc->nr_to_scan && !list_empty(&lru_list)) {
 		gl = list_entry(lru_list.next, struct gfs2_glock, gl_lru);
 		list_del_init(&gl->gl_lru);
 		clear_bit(GLF_LRU, &gl->gl_flags);
@@ -1401,7 +1398,7 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
 			may_demote = demote_ok(gl);
 			if (may_demote) {
 				handle_callback(gl, LM_ST_UNLOCKED, 0);
-				nr--;
+				freed++;
 			}
 			clear_bit(GLF_LOCK, &gl->gl_flags);
 			smp_mb__after_clear_bit();
@@ -1418,12 +1415,19 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
 	list_splice(&skipped, &lru_list);
 	atomic_add(nr_skipped, &lru_count);
 	spin_unlock(&lru_lock);
-out:
+
+	return freed;
+}
+
+static long gfs2_shrink_glock_count(struct shrinker *shrink,
+				    struct shrink_control *sc)
+{
 	return (atomic_read(&lru_count) / 100) * sysctl_vfs_cache_pressure;
 }
 
 static struct shrinker glock_shrinker = {
-	.shrink = gfs2_shrink_glock_memory,
+	.scan_objects = gfs2_shrink_glock_scan,
+	.count_objects = gfs2_shrink_glock_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 8ea7747..2c21986 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -29,7 +29,8 @@
 #include "dir.h"
 
 static struct shrinker qd_shrinker = {
-	.shrink = gfs2_shrink_qd_memory,
+	.scan_objects = gfs2_shrink_qd_scan,
+	.count_objects = gfs2_shrink_qd_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 42e8d23..5a5f76c 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -78,20 +78,17 @@ static LIST_HEAD(qd_lru_list);
 static atomic_t qd_lru_count = ATOMIC_INIT(0);
 static DEFINE_SPINLOCK(qd_lru_lock);
 
-int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
+long gfs2_shrink_qd_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct gfs2_quota_data *qd;
 	struct gfs2_sbd *sdp;
-	int nr_to_scan = sc->nr_to_scan;
-
-	if (nr_to_scan == 0)
-		goto out;
+	int freed = 0;
 
 	if (!(sc->gfp_mask & __GFP_FS))
 		return -1;
 
 	spin_lock(&qd_lru_lock);
-	while (nr_to_scan && !list_empty(&qd_lru_list)) {
+	while (freed < sc->nr_to_scan && !list_empty(&qd_lru_list)) {
 		qd = list_entry(qd_lru_list.next,
 				struct gfs2_quota_data, qd_reclaim);
 		sdp = qd->qd_gl->gl_sbd;
@@ -112,12 +109,16 @@ int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
 		spin_unlock(&qd_lru_lock);
 		kmem_cache_free(gfs2_quotad_cachep, qd);
 		spin_lock(&qd_lru_lock);
-		nr_to_scan--;
+		freed++;
 	}
 	spin_unlock(&qd_lru_lock);
 
-out:
-	return (atomic_read(&qd_lru_count) * sysctl_vfs_cache_pressure) / 100;
+	return freed;
+}
+
+long gfs2_shrink_qd_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	return (atomic_read(&qd_lru_count) / 100) * sysctl_vfs_cache_pressure;
 }
 
 static u64 qd2offset(struct gfs2_quota_data *qd)
diff --git a/fs/gfs2/quota.h b/fs/gfs2/quota.h
index 90bf1c3..c40fe6d 100644
--- a/fs/gfs2/quota.h
+++ b/fs/gfs2/quota.h
@@ -52,7 +52,9 @@ static inline int gfs2_quota_lock_check(struct gfs2_inode *ip)
 	return ret;
 }
 
-extern int gfs2_shrink_qd_memory(struct shrinker *shrink,
+extern long gfs2_shrink_qd_scan(struct shrinker *shrink,
+				struct shrink_control *sc);
+extern long gfs2_shrink_qd_count(struct shrinker *shrink,
 				 struct shrink_control *sc);
 extern const struct quotactl_ops gfs2_quotactl_ops;
 
diff --git a/fs/inode.c b/fs/inode.c
index 848808f..fee5d9a 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -613,10 +613,11 @@ static int can_unuse(struct inode *inode)
  * LRU does not have strict ordering. Hence we don't want to reclaim inodes
  * with this flag set because they are the inodes that are out of order.
  */
-void prune_icache_sb(struct super_block *sb, int nr_to_scan)
+long prune_icache_sb(struct super_block *sb, long nr_to_scan)
 {
 	LIST_HEAD(freeable);
-	int nr_scanned;
+	long nr_scanned;
+	long freed = 0;
 	unsigned long reap = 0;
 
 	spin_lock(&sb->s_inode_lru_lock);
@@ -686,6 +687,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
 		list_move(&inode->i_lru, &freeable);
 		sb->s_nr_inodes_unused--;
 		this_cpu_dec(nr_unused);
+		freed++;
 	}
 	if (current_is_kswapd())
 		__count_vm_events(KSWAPD_INODESTEAL, reap);
@@ -694,6 +696,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
 	spin_unlock(&sb->s_inode_lru_lock);
 
 	dispose_list(&freeable);
+	return freed;
 }
 
 static void __wait_on_freeing_inode(struct inode *inode);
diff --git a/fs/internal.h b/fs/internal.h
index fe327c2..2662ffa 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -127,6 +127,8 @@ extern long do_handle_open(int mountdirfd,
  * inode.c
  */
 extern spinlock_t inode_sb_list_lock;
+extern long prune_icache_sb(struct super_block *sb, long nr_to_scan);
+
 
 /*
  * fs-writeback.c
@@ -141,3 +143,4 @@ extern int invalidate_inodes(struct super_block *, bool);
  * dcache.c
  */
 extern struct dentry *__d_alloc(struct super_block *, const struct qstr *);
+extern long prune_dcache_sb(struct super_block *sb, long nr_to_scan);
diff --git a/fs/mbcache.c b/fs/mbcache.c
index 8c32ef3..aa3a19a 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -90,11 +90,14 @@ static DEFINE_SPINLOCK(mb_cache_spinlock);
  * What the mbcache registers as to get shrunk dynamically.
  */
 
-static int mb_cache_shrink_fn(struct shrinker *shrink,
-			      struct shrink_control *sc);
+static long mb_cache_shrink_scan(struct shrinker *shrink,
+				 struct shrink_control *sc);
+static long mb_cache_shrink_count(struct shrinker *shrink,
+				  struct shrink_control *sc);
 
 static struct shrinker mb_cache_shrinker = {
-	.shrink = mb_cache_shrink_fn,
+	.scan_objects = mb_cache_shrink_scan,
+	.count_objects = mb_cache_shrink_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
@@ -161,13 +164,12 @@ forget:
  *
  * Returns the number of objects which are present in the cache.
  */
-static int
-mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
+static long
+mb_cache_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	LIST_HEAD(free_list);
-	struct mb_cache *cache;
 	struct mb_cache_entry *entry, *tmp;
-	int count = 0;
+	int freed = 0;
 	int nr_to_scan = sc->nr_to_scan;
 	gfp_t gfp_mask = sc->gfp_mask;
 
@@ -180,18 +182,27 @@ mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
 		list_move_tail(&ce->e_lru_list, &free_list);
 		__mb_cache_entry_unhash(ce);
 	}
-	list_for_each_entry(cache, &mb_cache_list, c_cache_list) {
-		mb_debug("cache %s (%d)", cache->c_name,
-			  atomic_read(&cache->c_entry_count));
-		count += atomic_read(&cache->c_entry_count);
-	}
 	spin_unlock(&mb_cache_spinlock);
 	list_for_each_entry_safe(entry, tmp, &free_list, e_lru_list) {
 		__mb_cache_entry_forget(entry, gfp_mask);
+		freed++;
 	}
-	return (count / 100) * sysctl_vfs_cache_pressure;
+	return freed;
 }
 
+static long
+mb_cache_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct mb_cache *cache;
+	long count = 0;
+
+	spin_lock(&mb_cache_spinlock);
+	list_for_each_entry(cache, &mb_cache_list, c_cache_list)
+		count += atomic_read(&cache->c_entry_count);
+
+	spin_unlock(&mb_cache_spinlock);
+	return (count / 100) * sysctl_vfs_cache_pressure;
+}
 
 /*
  * mb_cache_create()  create a new cache
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index b238d95..a5aefb2 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -2057,17 +2057,18 @@ static void nfs_access_free_list(struct list_head *head)
 	}
 }
 
-int nfs_access_cache_shrinker(struct shrinker *shrink,
-			      struct shrink_control *sc)
+long nfs_access_cache_scan(struct shrinker *shrink,
+			   struct shrink_control *sc)
 {
 	LIST_HEAD(head);
 	struct nfs_inode *nfsi, *next;
 	struct nfs_access_entry *cache;
 	int nr_to_scan = sc->nr_to_scan;
+	int freed = 0;
 	gfp_t gfp_mask = sc->gfp_mask;
 
 	if ((gfp_mask & GFP_KERNEL) != GFP_KERNEL)
-		return (nr_to_scan == 0) ? 0 : -1;
+		return -1;
 
 	spin_lock(&nfs_access_lru_lock);
 	list_for_each_entry_safe(nfsi, next, &nfs_access_lru_list, access_cache_inode_lru) {
@@ -2079,6 +2080,7 @@ int nfs_access_cache_shrinker(struct shrinker *shrink,
 		spin_lock(&inode->i_lock);
 		if (list_empty(&nfsi->access_cache_entry_lru))
 			goto remove_lru_entry;
+		freed++;
 		cache = list_entry(nfsi->access_cache_entry_lru.next,
 				struct nfs_access_entry, lru);
 		list_move(&cache->lru, &head);
@@ -2097,7 +2099,14 @@ remove_lru_entry:
 	}
 	spin_unlock(&nfs_access_lru_lock);
 	nfs_access_free_list(&head);
-	return (atomic_long_read(&nfs_access_nr_entries) / 100) * sysctl_vfs_cache_pressure;
+	return freed;
+}
+
+long nfs_access_cache_count(struct shrinker *shrink,
+			    struct shrink_control *sc)
+{
+	return (atomic_long_read(&nfs_access_nr_entries) / 100) *
+						sysctl_vfs_cache_pressure;
 }
 
 static void __nfs_access_zap_cache(struct nfs_inode *nfsi, struct list_head *head)
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index ab12913..9c65e1f 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -244,8 +244,10 @@ extern int nfs_init_client(struct nfs_client *clp,
 			   int noresvport);
 
 /* dir.c */
-extern int nfs_access_cache_shrinker(struct shrinker *shrink,
-					struct shrink_control *sc);
+extern long nfs_access_cache_scan(struct shrinker *shrink,
+				  struct shrink_control *sc);
+extern long nfs_access_cache_count(struct shrinker *shrink,
+				   struct shrink_control *sc);
 
 /* inode.c */
 extern struct workqueue_struct *nfsiod_workqueue;
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index b961cea..e088c03 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -380,7 +380,8 @@ static const struct super_operations nfs4_sops = {
 #endif
 
 static struct shrinker acl_shrinker = {
-	.shrink		= nfs_access_cache_shrinker,
+	.scan_objects	= nfs_access_cache_scan,
+	.count_objects	= nfs_access_cache_count,
 	.seeks		= DEFAULT_SEEKS,
 };
 
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 5b572c8..c8724d2 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -669,45 +669,42 @@ int dquot_quota_sync(struct super_block *sb, int type, int wait)
 }
 EXPORT_SYMBOL(dquot_quota_sync);
 
-/* Free unused dquots from cache */
-static void prune_dqcache(int count)
+/*
+ * This is called from kswapd when we think we need some
+ * more memory
+ */
+static long shrink_dqcache_scan(struct shrinker *shrink,
+				 struct shrink_control *sc)
 {
 	struct list_head *head;
 	struct dquot *dquot;
+	int freed = 0;
 
+	spin_lock(&dq_list_lock);
 	head = free_dquots.prev;
-	while (head != &free_dquots && count) {
+	while (head != &free_dquots && freed < sc->nr_to_scan) {
 		dquot = list_entry(head, struct dquot, dq_free);
 		remove_dquot_hash(dquot);
 		remove_free_dquot(dquot);
 		remove_inuse(dquot);
 		do_destroy_dquot(dquot);
-		count--;
+		freed++;
 		head = free_dquots.prev;
 	}
+	spin_unlock(&dq_list_lock);
+
+	return freed;
 }
 
-/*
- * This is called from kswapd when we think we need some
- * more memory
- */
-static int shrink_dqcache_memory(struct shrinker *shrink,
+static long shrink_dqcache_count(struct shrinker *shrink,
 				 struct shrink_control *sc)
 {
-	int nr = sc->nr_to_scan;
-
-	if (nr) {
-		spin_lock(&dq_list_lock);
-		prune_dqcache(nr);
-		spin_unlock(&dq_list_lock);
-	}
-	return ((unsigned)
-		percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
-		/100) * sysctl_vfs_cache_pressure;
+	return (percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
+		/ 100) * sysctl_vfs_cache_pressure;
 }
-
 static struct shrinker dqcache_shrinker = {
-	.shrink = shrink_dqcache_memory,
+	.scan_objects = shrink_dqcache_scan,
+	.count_objects = shrink_dqcache_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/super.c b/fs/super.c
index 6a72693..074abbe 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -45,11 +45,14 @@ DEFINE_SPINLOCK(sb_lock);
  * shrinker path and that leads to deadlock on the shrinker_rwsem. Hence we
  * take a passive reference to the superblock to avoid this from occurring.
  */
-static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
+static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct super_block *sb;
-	int	fs_objects = 0;
-	int	total_objects;
+	long	fs_objects = 0;
+	long	total_objects;
+	long	freed = 0;
+	long	dentries;
+	long	inodes;
 
 	sb = container_of(shrink, struct super_block, s_shrink);
 
@@ -57,7 +60,7 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
 	 * Deadlock avoidance.  We may hold various FS locks, and we don't want
 	 * to recurse into the FS that called us in clear_inode() and friends..
 	 */
-	if (sc->nr_to_scan && !(sc->gfp_mask & __GFP_FS))
+	if (!(sc->gfp_mask & __GFP_FS))
 		return -1;
 
 	if (!grab_super_passive(sb))
@@ -69,33 +72,42 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
 	total_objects = sb->s_nr_dentry_unused +
 			sb->s_nr_inodes_unused + fs_objects + 1;
 
-	if (sc->nr_to_scan) {
-		int	dentries;
-		int	inodes;
-
-		/* proportion the scan between the caches */
-		dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) /
-							total_objects;
-		inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) /
-							total_objects;
-		if (fs_objects)
-			fs_objects = (sc->nr_to_scan * fs_objects) /
-							total_objects;
-		/*
-		 * prune the dcache first as the icache is pinned by it, then
-		 * prune the icache, followed by the filesystem specific caches
-		 */
-		prune_dcache_sb(sb, dentries);
-		prune_icache_sb(sb, inodes);
+	/* proportion the scan between the caches */
+	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
+	inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) / total_objects;
 
-		if (fs_objects && sb->s_op->free_cached_objects) {
-			sb->s_op->free_cached_objects(sb, fs_objects);
-			fs_objects = sb->s_op->nr_cached_objects(sb);
-		}
-		total_objects = sb->s_nr_dentry_unused +
-				sb->s_nr_inodes_unused + fs_objects;
+	/*
+	 * prune the dcache first as the icache is pinned by it, then
+	 * prune the icache, followed by the filesystem specific caches
+	 */
+	freed = prune_dcache_sb(sb, dentries);
+	freed += prune_icache_sb(sb, inodes);
+
+	if (fs_objects) {
+		fs_objects = (sc->nr_to_scan * fs_objects) / total_objects;
+		freed += sb->s_op->free_cached_objects(sb, fs_objects);
 	}
 
+	drop_super(sb);
+	return freed;
+}
+
+static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	struct super_block *sb;
+	long	total_objects = 0;
+
+	sb = container_of(shrink, struct super_block, s_shrink);
+
+	if (!grab_super_passive(sb))
+		return -1;
+
+	if (sb->s_op && sb->s_op->nr_cached_objects)
+		total_objects = sb->s_op->nr_cached_objects(sb);
+
+	total_objects += sb->s_nr_dentry_unused;
+	total_objects += sb->s_nr_inodes_unused;
+
 	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
 	drop_super(sb);
 	return total_objects;
@@ -182,7 +194,8 @@ static struct super_block *alloc_super(struct file_system_type *type)
 		s->cleancache_poolid = -1;
 
 		s->s_shrink.seeks = DEFAULT_SEEKS;
-		s->s_shrink.shrink = prune_super;
+		s->s_shrink.scan_objects = super_cache_scan;
+		s->s_shrink.count_objects = super_cache_count;
 		s->s_shrink.batch = 1024;
 	}
 out:
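
To make the proportioning in super_cache_scan() above concrete, with
illustrative numbers only: for sc->nr_to_scan = 128, 600 unused
dentries, 300 unused inodes and 100 fs-specific objects, total_objects
is 1001, so the batch splits as dentries = 128 * 600 / 1001 = 76,
inodes = 128 * 300 / 1001 = 38 and fs_objects = 128 * 100 / 1001 = 12.
Each cache is scanned in proportion to its share of the total, and the
"+ 1" in total_objects keeps the divisor from ever being zero.
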
diff --git a/fs/ubifs/shrinker.c b/fs/ubifs/shrinker.c
index 9e1d056..78ca7b7 100644
--- a/fs/ubifs/shrinker.c
+++ b/fs/ubifs/shrinker.c
@@ -277,19 +277,12 @@ static int kick_a_thread(void)
 	return 0;
 }
 
-int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc)
+long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	int nr = sc->nr_to_scan;
 	int freed, contention = 0;
 	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
 
-	if (nr == 0)
-		/*
-		 * Due to the way UBIFS updates the clean znode counter it may
-		 * temporarily be negative.
-		 */
-		return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
-
 	if (!clean_zn_cnt) {
 		/*
 		 * No clean znodes, nothing to reap. All we can do in this case
@@ -323,3 +316,13 @@ out:
 	dbg_tnc("%d znodes were freed, requested %d", freed, nr);
 	return freed;
 }
+
+long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
+	/*
+	 * Due to the way UBIFS updates the clean znode counter it may
+	 * temporarily be negative.
+	 */
+	return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
+}
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 91903f6..3d3f3e9 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -49,7 +49,8 @@ struct kmem_cache *ubifs_inode_slab;
 
 /* UBIFS TNC shrinker description */
 static struct shrinker ubifs_shrinker_info = {
-	.shrink = ubifs_shrinker,
+	.scan_objects = ubifs_shrinker_scan,
+	.count_objects = ubifs_shrinker_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index 27f2255..2b8f48c 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -1625,7 +1625,8 @@ int ubifs_tnc_start_commit(struct ubifs_info *c, struct ubifs_zbranch *zroot);
 int ubifs_tnc_end_commit(struct ubifs_info *c);
 
 /* shrinker.c */
-int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc);
+long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc);
+long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc);
 
 /* commit.c */
 int ubifs_bg_thread(void *info);
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 7a026cb..b2eea9e 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1456,8 +1456,8 @@ restart:
 	spin_unlock(&btp->bt_lru_lock);
 }
 
-int
-xfs_buftarg_shrink(
+static long
+xfs_buftarg_shrink_scan(
 	struct shrinker		*shrink,
 	struct shrink_control	*sc)
 {
@@ -1465,6 +1465,7 @@ xfs_buftarg_shrink(
 					struct xfs_buftarg, bt_shrinker);
 	struct xfs_buf		*bp;
 	int nr_to_scan = sc->nr_to_scan;
+	int freed = 0;
 	LIST_HEAD(dispose);
 
 	if (!nr_to_scan)
@@ -1493,6 +1494,7 @@ xfs_buftarg_shrink(
 		 */
 		list_move(&bp->b_lru, &dispose);
 		btp->bt_lru_nr--;
+		freed++;
 	}
 	spin_unlock(&btp->bt_lru_lock);
 
@@ -1502,6 +1504,16 @@ xfs_buftarg_shrink(
 		xfs_buf_rele(bp);
 	}
 
+	return freed;
+}
+
+static long
+xfs_buftarg_shrink_count(
+	struct shrinker		*shrink,
+	struct shrink_control	*sc)
+{
+	struct xfs_buftarg	*btp = container_of(shrink,
+					struct xfs_buftarg, bt_shrinker);
 	return btp->bt_lru_nr;
 }
 
@@ -1602,7 +1614,8 @@ xfs_alloc_buftarg(
 		goto error;
 	if (xfs_alloc_delwrite_queue(btp, fsname))
 		goto error;
-	btp->bt_shrinker.shrink = xfs_buftarg_shrink;
+	btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
+	btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
 	btp->bt_shrinker.seeks = DEFAULT_SEEKS;
 	register_shrinker(&btp->bt_shrinker);
 	return btp;
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 9a0aa76..19863a8 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -60,10 +60,12 @@ STATIC void	xfs_qm_list_destroy(xfs_dqlist_t *);
 
 STATIC int	xfs_qm_init_quotainos(xfs_mount_t *);
 STATIC int	xfs_qm_init_quotainfo(xfs_mount_t *);
-STATIC int	xfs_qm_shake(struct shrinker *, struct shrink_control *);
+STATIC long	xfs_qm_shake_scan(struct shrinker *, struct shrink_control *);
+STATIC long	xfs_qm_shake_count(struct shrinker *, struct shrink_control *);
 
 static struct shrinker xfs_qm_shaker = {
-	.shrink = xfs_qm_shake,
+	.scan_objects = xfs_qm_shake_scan,
+	.count_objects = xfs_qm_shake_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
@@ -1963,9 +1965,8 @@ xfs_qm_shake_freelist(
 /*
  * The kmem_shake interface is invoked when memory is running low.
  */
-/* ARGSUSED */
-STATIC int
-xfs_qm_shake(
+STATIC long
+xfs_qm_shake_scan(
 	struct shrinker	*shrink,
 	struct shrink_control *sc)
 {
@@ -1973,9 +1974,9 @@ xfs_qm_shake(
 	gfp_t gfp_mask = sc->gfp_mask;
 
 	if (!kmem_shake_allow(gfp_mask))
-		return 0;
+		return -1;
 	if (!xfs_Gqm)
-		return 0;
+		return -1;
 
 	nfree = xfs_Gqm->qm_dqfrlist_cnt; /* free dquots */
 	/* incore dquots in all f/s's */
@@ -1992,6 +1993,13 @@ xfs_qm_shake(
 	return xfs_qm_shake_freelist(MAX(nfree, n));
 }
 
+STATIC long
+xfs_qm_shake_count(
+	struct shrinker	*shrink,
+	struct shrink_control *sc)
+{
+	return xfs_Gqm ? xfs_Gqm->qm_dqfrlist_cnt : -1;
+}
 
 /*------------------------------------------------------------------*/
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c94ec22..dff4b67 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1473,19 +1473,19 @@ xfs_fs_mount(
 	return mount_bdev(fs_type, flags, dev_name, data, xfs_fs_fill_super);
 }
 
-static int
+static long
 xfs_fs_nr_cached_objects(
 	struct super_block	*sb)
 {
 	return xfs_reclaim_inodes_count(XFS_M(sb));
 }
 
-static void
+static long
 xfs_fs_free_cached_objects(
 	struct super_block	*sb,
-	int			nr_to_scan)
+	long			nr_to_scan)
 {
-	xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
+	return xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
 }
 
 static const struct super_operations xfs_super_operations = {
diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
index 4604f90..5b60a3a 100644
--- a/fs/xfs/xfs_sync.c
+++ b/fs/xfs/xfs_sync.c
@@ -896,7 +896,7 @@ int
 xfs_reclaim_inodes_ag(
 	struct xfs_mount	*mp,
 	int			flags,
-	int			*nr_to_scan)
+	long			*nr_to_scan)
 {
 	struct xfs_perag	*pag;
 	int			error = 0;
@@ -1017,7 +1017,7 @@ xfs_reclaim_inodes(
 	xfs_mount_t	*mp,
 	int		mode)
 {
-	int		nr_to_scan = INT_MAX;
+	long		nr_to_scan = LONG_MAX;
 
 	return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
 }
@@ -1031,29 +1031,32 @@ xfs_reclaim_inodes(
  * them to be cleaned, which we hope will not be very long due to the
  * background walker having already kicked the IO off on those dirty inodes.
  */
-void
+long
 xfs_reclaim_inodes_nr(
 	struct xfs_mount	*mp,
-	int			nr_to_scan)
+	long			nr_to_scan)
 {
+	long nr = nr_to_scan;
+
 	/* kick background reclaimer and push the AIL */
 	xfs_syncd_queue_reclaim(mp);
 	xfs_ail_push_all(mp->m_ail);
 
-	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan);
+	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr);
+	return nr_to_scan - nr;
 }
 
 /*
  * Return the number of reclaimable inodes in the filesystem for
  * the shrinker to determine how much to reclaim.
  */
-int
+long
 xfs_reclaim_inodes_count(
 	struct xfs_mount	*mp)
 {
 	struct xfs_perag	*pag;
 	xfs_agnumber_t		ag = 0;
-	int			reclaimable = 0;
+	long			reclaimable = 0;
 
 	while ((pag = xfs_perag_get_tag(mp, ag, XFS_ICI_RECLAIM_TAG))) {
 		ag = pag->pag_agno + 1;
diff --git a/fs/xfs/xfs_sync.h b/fs/xfs/xfs_sync.h
index 941202e..82e1b1c 100644
--- a/fs/xfs/xfs_sync.h
+++ b/fs/xfs/xfs_sync.h
@@ -35,8 +35,8 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
 void xfs_flush_inodes(struct xfs_inode *ip);
 
 int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
-int xfs_reclaim_inodes_count(struct xfs_mount *mp);
-void xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
+long xfs_reclaim_inodes_count(struct xfs_mount *mp);
+long xfs_reclaim_inodes_nr(struct xfs_mount *mp, long nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 void __xfs_inode_set_reclaim_tag(struct xfs_perag *pag, struct xfs_inode *ip);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 14be4d8..958c025 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1465,10 +1465,6 @@ struct super_block {
 	struct shrinker s_shrink;	/* per-sb shrinker handle */
 };
 
-/* superblock cache pruning functions */
-extern void prune_icache_sb(struct super_block *sb, int nr_to_scan);
-extern void prune_dcache_sb(struct super_block *sb, int nr_to_scan);
-
 extern struct timespec current_fs_time(struct super_block *sb);
 
 /*
@@ -1662,8 +1658,8 @@ struct super_operations {
 	ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
 #endif
 	int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
-	int (*nr_cached_objects)(struct super_block *);
-	void (*free_cached_objects)(struct super_block *, int);
+	long (*nr_cached_objects)(struct super_block *);
+	long (*free_cached_objects)(struct super_block *, long);
 };
 
 /*
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 36851f7..80308ea 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -190,7 +190,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
-		__field(void *, shrink)
+		__field(void *, scan)
 		__field(long, nr_objects_to_shrink)
 		__field(gfp_t, gfp_flags)
 		__field(unsigned long, pgs_scanned)
@@ -202,7 +202,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->shrink = shr->shrink;
+		__entry->scan = shr->scan_objects;
 		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
 		__entry->gfp_flags = sc->gfp_mask;
 		__entry->pgs_scanned = pgs_scanned;
@@ -213,7 +213,7 @@ TRACE_EVENT(mm_shrink_slab_start,
 	),
 
 	TP_printk("%pF %p: objects to shrink %ld gfp_flags %s pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
-		__entry->shrink,
+		__entry->scan,
 		__entry->shr,
 		__entry->nr_objects_to_shrink,
 		show_gfp_flags(__entry->gfp_flags),
@@ -232,7 +232,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 
 	TP_STRUCT__entry(
 		__field(struct shrinker *, shr)
-		__field(void *, shrink)
+		__field(void *, scan)
 		__field(long, unused_scan)
 		__field(long, new_scan)
 		__field(int, retval)
@@ -241,7 +241,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 
 	TP_fast_assign(
 		__entry->shr = shr;
-		__entry->shrink = shr->shrink;
+		__entry->scan = shr->scan_objects;
 		__entry->unused_scan = unused_scan_cnt;
 		__entry->new_scan = new_scan_cnt;
 		__entry->retval = shrinker_retval;
@@ -249,7 +249,7 @@ TRACE_EVENT(mm_shrink_slab_end,
 	),
 
 	TP_printk("%pF %p: unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
-		__entry->shrink,
+		__entry->scan,
 		__entry->shr,
 		__entry->unused_scan,
 		__entry->new_scan,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7ef6912..e32ce2d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -202,14 +202,6 @@ void unregister_shrinker(struct shrinker *shrinker)
 }
 EXPORT_SYMBOL(unregister_shrinker);
 
-static inline int do_shrinker_shrink(struct shrinker *shrinker,
-				     struct shrink_control *sc,
-				     unsigned long nr_to_scan)
-{
-	sc->nr_to_scan = nr_to_scan;
-	return (*shrinker->shrink)(shrinker, sc);
-}
-
 #define SHRINK_BATCH 128
 /*
  * Call the shrink functions to age shrinkable caches
@@ -230,27 +222,26 @@ static inline int do_shrinker_shrink(struct shrinker *shrinker,
  *
  * Returns the number of slab objects which we shrunk.
  */
-unsigned long shrink_slab(struct shrink_control *shrink,
+unsigned long shrink_slab(struct shrink_control *sc,
 			  unsigned long nr_pages_scanned,
 			  unsigned long lru_pages)
 {
 	struct shrinker *shrinker;
-	unsigned long ret = 0;
+	unsigned long freed = 0;
 
 	if (nr_pages_scanned == 0)
 		nr_pages_scanned = SWAP_CLUSTER_MAX;
 
 	if (!down_read_trylock(&shrinker_rwsem)) {
 		/* Assume we'll be able to shrink next time */
-		ret = 1;
+		freed = 1;
 		goto out;
 	}
 
 	list_for_each_entry(shrinker, &shrinker_list, list) {
-		unsigned long long delta;
-		unsigned long total_scan;
-		unsigned long max_pass;
-		int shrink_ret = 0;
+		long long delta;
+		long total_scan;
+		long max_pass;
 		long nr;
 		long new_nr;
 		long batch_size = shrinker->batch ? shrinker->batch
@@ -266,7 +257,9 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 		} while (cmpxchg(&shrinker->nr, nr, 0) != nr);
 
 		total_scan = nr;
-		max_pass = do_shrinker_shrink(shrinker, shrink, 0);
+		max_pass = shrinker->count_objects(shrinker, sc);
+		WARN_ON_ONCE(max_pass < 0);
+
 		delta = (4 * nr_pages_scanned) / shrinker->seeks;
 		delta *= max_pass;
 		do_div(delta, lru_pages + 1);
@@ -274,7 +267,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 		if (total_scan < 0) {
 			printk(KERN_ERR "shrink_slab: %pF negative objects to "
 			       "delete nr=%ld\n",
-			       shrinker->shrink, total_scan);
+			       shrinker->scan_objects, total_scan);
 			total_scan = max_pass;
 		}
 
@@ -301,20 +294,19 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 		if (total_scan > max_pass * 2)
 			total_scan = max_pass * 2;
 
-		trace_mm_shrink_slab_start(shrinker, shrink, nr,
+		trace_mm_shrink_slab_start(shrinker, sc, nr,
 					nr_pages_scanned, lru_pages,
 					max_pass, delta, total_scan);
 
 		while (total_scan >= batch_size) {
-			int nr_before;
+			long ret;
+
+			sc->nr_to_scan = batch_size;
+			ret = shrinker->scan_objects(shrinker, sc);
 
-			nr_before = do_shrinker_shrink(shrinker, shrink, 0);
-			shrink_ret = do_shrinker_shrink(shrinker, shrink,
-							batch_size);
-			if (shrink_ret == -1)
+			if (ret == -1)
 				break;
-			if (shrink_ret < nr_before)
-				ret += nr_before - shrink_ret;
+			freed += ret;
 			count_vm_events(SLABS_SCANNED, batch_size);
 			total_scan -= batch_size;
 
@@ -333,12 +325,12 @@ unsigned long shrink_slab(struct shrink_control *shrink,
 				break;
 		} while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
 
-		trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
+		trace_mm_shrink_slab_end(shrinker, freed, nr, new_nr);
 	}
 	up_read(&shrinker_rwsem);
 out:
 	cond_resched();
-	return ret;
+	return freed;
 }
 
 static void set_reclaim_mode(int priority, struct scan_control *sc,
diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index 727e506..f5955c3 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -292,6 +292,7 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
 	spinlock_t *cache_lock;
 	struct rpc_cred *cred, *next;
 	unsigned long expired = jiffies - RPC_AUTH_EXPIRY_MORATORIUM;
+	int freed = 0;
 
 	list_for_each_entry_safe(cred, next, &cred_unused, cr_lru) {
 
@@ -303,10 +304,10 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
 		 */
 		if (time_in_range(cred->cr_expire, expired, jiffies) &&
 		    test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
-			return 0;
+			break;
 
-		list_del_init(&cred->cr_lru);
 		number_cred_unused--;
+		list_del_init(&cred->cr_lru);
 		if (atomic_read(&cred->cr_count) != 0)
 			continue;
 
@@ -316,17 +317,18 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
 			get_rpccred(cred);
 			list_add_tail(&cred->cr_lru, free);
 			rpcauth_unhash_cred_locked(cred);
+			freed++;
 		}
 		spin_unlock(cache_lock);
 	}
-	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
+	return freed;
 }
 
 /*
  * Run memory cache shrinker.
  */
-static int
-rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
+static long
+rpcauth_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 {
 	LIST_HEAD(free);
 	int res;
@@ -344,6 +346,12 @@ rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
 	return res;
 }
 
+static long
+rpcauth_cache_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
+}
+
 /*
  * Look up a process' credentials in the authentication cache
  */
@@ -658,7 +666,8 @@ rpcauth_uptodatecred(struct rpc_task *task)
 }
 
 static struct shrinker rpc_cred_shrinker = {
-	.shrink = rpcauth_cache_shrinker,
+	.scan_objects = rpcauth_cache_scan,
+	.count_objects = rpcauth_cache_count,
 	.seeks = DEFAULT_SEEKS,
 };
 
-- 
1.7.5.4
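
A note on the shape every conversion in this patch follows, for anyone
skimming the diffs: the old multiplexed ->shrink() callback is split in
two. ->count_objects() returns a cheap estimate of the number of
freeable objects (or -1 if that cannot safely be determined right now),
and ->scan_objects() attempts to free up to sc->nr_to_scan objects,
returning the number actually freed (or -1 to abort the scan). A
minimal sketch, assuming a hypothetical "myfs" cache with its own
object counter and pruning helper - none of these names come from the
series:

	static long myfs_cache_count(struct shrinker *shrink,
				     struct shrink_control *sc)
	{
		/* cheap estimate: no locking, no scanning */
		return atomic_long_read(&myfs_nr_cached);
	}

	static long myfs_cache_scan(struct shrinker *shrink,
				    struct shrink_control *sc)
	{
		/* deadlock avoidance: never recurse into the fs */
		if (!(sc->gfp_mask & __GFP_FS))
			return -1;

		/* free up to sc->nr_to_scan objects, report the count */
		return myfs_prune_cache(sc->nr_to_scan);
	}

	static struct shrinker myfs_shrinker = {
		.count_objects	= myfs_cache_count,
		.scan_objects	= myfs_cache_scan,
		.seeks		= DEFAULT_SEEKS,
	};

Registration is unchanged: register_shrinker(&myfs_shrinker) as before.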


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 06/13] shrinker: remove old API now it is unused
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 include/linux/shrinker.h |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 50f213f..ab6c572 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -20,11 +20,6 @@ struct shrink_control {
 /*
  * A callback you can register to apply pressure to ageable caches.
  *
- * @shrink() should look through the least-recently-used 'nr_to_scan' entries
- * and attempt to free them up.  It should return the number of objects which
- * remain in the cache.  If it returns -1, it means it cannot do any scanning at
- * this time (eg. there is a risk of deadlock).
- *
  * @count_objects should return the number of freeable items in the cache. If
  * there are no objects to free or the number of freeable items cannot be
  * determined, it should return 0. No deadlock checks should be done during the
@@ -40,7 +35,6 @@ struct shrink_control {
  * @scan_objects will be made from the current reclaim context.
  */
 struct shrinker {
-	int (*shrink)(struct shrinker *, struct shrink_control *sc);
 	long (*count_objects)(struct shrinker *, struct shrink_control *sc);
 	long (*scan_objects)(struct shrinker *, struct shrink_control *sc);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 07/13] Use atomic-long operations instead of looping around cmpxchg().
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Konstantin Khlebnikov <khlebnikov@openvz.org>

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 include/linux/fs.h       |    2 +-
 include/linux/shrinker.h |    2 +-
 mm/vmscan.c              |   17 +++++++----------
 3 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 958c025..2651059 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -394,8 +394,8 @@ struct inodes_stat_t {
 #include <linux/semaphore.h>
 #include <linux/fiemap.h>
 #include <linux/rculist_bl.h>
-#include <linux/shrinker.h>
 #include <linux/atomic.h>
+#include <linux/shrinker.h>
 
 #include <asm/byteorder.h>
 
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index ab6c572..02b5b6b 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -43,7 +43,7 @@ struct shrinker {
 
 	/* These are for internal use */
 	struct list_head list;
-	long nr;	/* objs pending delete */
+	atomic_long_t nr_pending;	/* objs pending delete */
 };
 
 #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e32ce2d..534ed34 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -184,7 +184,7 @@ static unsigned long zone_nr_lru_pages(struct zone *zone,
  */
 void register_shrinker(struct shrinker *shrinker)
 {
-	shrinker->nr = 0;
+	atomic_long_set(&shrinker->nr_pending, 0);
 	down_write(&shrinker_rwsem);
 	list_add_tail(&shrinker->list, &shrinker_list);
 	up_write(&shrinker_rwsem);
@@ -252,9 +252,7 @@ unsigned long shrink_slab(struct shrink_control *sc,
 		 * and zero it so that other concurrent shrinker invocations
 		 * don't also do this scanning work.
 		 */
-		do {
-			nr = shrinker->nr;
-		} while (cmpxchg(&shrinker->nr, nr, 0) != nr);
+		nr = atomic_long_xchg(&shrinker->nr_pending, 0);
 
 		total_scan = nr;
 		max_pass = shrinker->count_objects(shrinker, sc);
@@ -318,12 +316,11 @@ unsigned long shrink_slab(struct shrink_control *sc,
 		 * manner that handles concurrent updates. If we exhausted the
 		 * scan, there is no need to do an update.
 		 */
-		do {
-			nr = shrinker->nr;
-			new_nr = total_scan + nr;
-			if (total_scan <= 0)
-				break;
-		} while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
+		if (total_scan > 0)
+			new_nr = atomic_long_add_return(total_scan,
+							&shrinker->nr_pending);
+		else
+			new_nr = atomic_long_read(&shrinker->nr_pending);
 
 		trace_mm_shrink_slab_end(shrinker, freed, nr, new_nr);
 	}
-- 
1.7.5.4
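
The transformation is mechanical: an open-coded cmpxchg() retry loop
whose only job is an atomic snapshot-and-zero (or an atomic add) is
exactly what the atomic_long_* primitives already provide. The two
equivalent forms, lifted from the hunks above:

	/* before: open-coded exchange via a cmpxchg() retry loop */
	do {
		nr = shrinker->nr;
	} while (cmpxchg(&shrinker->nr, nr, 0) != nr);

	/* after: the same semantics in a single primitive */
	nr = atomic_long_xchg(&shrinker->nr_pending, 0);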


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 08/13] list: add a new LRU list type
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Several subsystems use the same construct for LRU lists - a list
head, a spin lock and an item count. They also use exactly the same
code for adding and removing items from the LRU. Create a generic
type for these LRU lists.

This is the beginning of generic, node aware LRUs for shrinkers to
work with.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 include/linux/list_lru.h |   30 ++++++++++++
 lib/Makefile             |    3 +-
 lib/list_lru.c           |  112 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/list_lru.h
 create mode 100644 lib/list_lru.c

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
new file mode 100644
index 0000000..b112ca3
--- /dev/null
+++ b/include/linux/list_lru.h
@@ -0,0 +1,30 @@
+#ifndef _LRU_LIST_H
+#define _LRU_LIST_H 0
+
+#include <linux/list.h>
+
+struct list_lru {
+	spinlock_t		lock;
+	struct list_head	list;
+	long			nr_items;
+};
+
+int list_lru_init(struct list_lru *lru);
+int list_lru_add(struct list_lru *lru, struct list_head *item);
+int list_lru_del(struct list_lru *lru, struct list_head *item);
+
+static inline long list_lru_count(struct list_lru *lru)
+{
+	return lru->nr_items;
+}
+
+typedef int (*list_lru_walk_cb)(struct list_head *item, spinlock_t *lock,
+				void *cb_arg);
+typedef void (*list_lru_dispose_cb)(struct list_head *dispose_list);
+
+long list_lru_walk(struct list_lru *lru, list_lru_walk_cb isolate,
+		   void *cb_arg, long nr_to_walk);
+
+long list_lru_dispose_all(struct list_lru *lru, list_lru_dispose_cb dispose);
+
+#endif /* _LRU_LIST_H */
diff --git a/lib/Makefile b/lib/Makefile
index d5d175c..a08212f 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -12,7 +12,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 idr.o int_sqrt.o extable.o prio_tree.o \
 	 sha1.o md5.o irq_regs.o reciprocal_div.o argv_split.o \
 	 proportions.o prio_heap.o ratelimit.o show_mem.o \
-	 is_single_threaded.o plist.o decompress.o find_next_bit.o
+	 is_single_threaded.o plist.o decompress.o find_next_bit.o \
+	 list_lru.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/list_lru.c b/lib/list_lru.c
new file mode 100644
index 0000000..8c18c63
--- /dev/null
+++ b/lib/list_lru.c
@@ -0,0 +1,112 @@
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/list_lru.h>
+
+int
+list_lru_add(
+	struct list_lru	*lru,
+	struct list_head *item)
+{
+	spin_lock(&lru->lock);
+	if (list_empty(item)) {
+		list_add_tail(item, &lru->list);
+		lru->nr_items++;
+		spin_unlock(&lru->lock);
+		return 1;
+	}
+	spin_unlock(&lru->lock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(list_lru_add);
+
+int
+list_lru_del(
+	struct list_lru	*lru,
+	struct list_head *item)
+{
+	spin_lock(&lru->lock);
+	if (!list_empty(item)) {
+		list_del_init(item);
+		lru->nr_items--;
+		spin_unlock(&lru->lock);
+		return 1;
+	}
+	spin_unlock(&lru->lock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(list_lru_del);
+
+long
+list_lru_walk(
+	struct list_lru *lru,
+	list_lru_walk_cb isolate,
+	void		*cb_arg,
+	long		nr_to_walk)
+{
+	struct list_head *item, *n;
+	long removed = 0;
+restart:
+	spin_lock(&lru->lock);
+	list_for_each_safe(item, n, &lru->list) {
+		int ret;
+
+		if (nr_to_walk-- <= 0)
+			break;
+
+		ret = isolate(item, &lru->lock, cb_arg);
+		switch (ret) {
+		case 0:	/* item removed from list */
+			lru->nr_items--;
+			removed++;
+			break;
+		case 1: /* item referenced, give another pass */
+			list_move_tail(item, &lru->list);
+			break;
+		case 2: /* item cannot be locked, skip */
+			break;
+		case 3: /* item not freeable, lock dropped */
+			goto restart;
+		default:
+			BUG();
+		}
+	}
+	spin_unlock(&lru->lock);
+	return removed;
+}
+EXPORT_SYMBOL_GPL(list_lru_walk);
+
+long
+list_lru_dispose_all(
+	struct list_lru *lru,
+	list_lru_dispose_cb dispose)
+{
+	long disposed = 0;
+	LIST_HEAD(dispose_list);
+
+	spin_lock(&lru->lock);
+	while (!list_empty(&lru->list)) {
+		list_splice_init(&lru->list, &dispose_list);
+		disposed += lru->nr_items;
+		lru->nr_items = 0;
+		spin_unlock(&lru->lock);
+
+		dispose(&dispose_list);
+
+		spin_lock(&lru->lock);
+	}
+	spin_unlock(&lru->lock);
+	return disposed;
+}
+
+int
+list_lru_init(
+	struct list_lru	*lru)
+{
+	spin_lock_init(&lru->lock);
+	INIT_LIST_HEAD(&lru->list);
+	lru->nr_items = 0;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(list_lru_init);
-- 
1.7.5.4
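
In use, the new type collapses the usual list/lock/counter triple into
list_lru_init()/list_lru_add()/list_lru_del(), plus a walker that hands
each item to an isolate callback under the LRU lock. The callback's
return value drives the walk: 0 means the callback itself took the item
off the list (the walker fixes up nr_items), 1 rotates the item to the
tail for another pass, 2 skips it, and 3 restarts the walk because the
lock was dropped. A usage sketch with a hypothetical object type - not
taken from any patch in this series:

	struct my_obj {
		struct list_head	lru;
		atomic_t		count;
	};

	static int my_isolate(struct list_head *item, spinlock_t *lock,
			      void *cb_arg)
	{
		struct my_obj *obj = container_of(item, struct my_obj, lru);
		struct list_head *dispose = cb_arg;

		if (atomic_read(&obj->count))
			return 1;	/* still in use: one more pass */

		/* off the LRU; the walker decrements nr_items for us */
		list_move(item, dispose);
		return 0;
	}

	static long my_prune(struct list_lru *lru, long nr_to_scan)
	{
		LIST_HEAD(dispose);
		long freed;

		freed = list_lru_walk(lru, my_isolate, &dispose, nr_to_scan);
		/* free everything on the dispose list outside the lru lock */
		return freed;
	}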


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 09/13] inode: convert inode lru list to generic lru list code.
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/inode.c         |  168 ++++++++++++++++++++-------------------------------
 fs/super.c         |   11 ++--
 include/linux/fs.h |    6 +-
 3 files changed, 73 insertions(+), 112 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index fee5d9a..98ca516 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -26,6 +26,7 @@
 #include <linux/ima.h>
 #include <linux/cred.h>
 #include <linux/buffer_head.h> /* for inode_has_buffers */
+#include <linux/list_lru.h>
 #include "internal.h"
 
 /*
@@ -328,24 +329,14 @@ EXPORT_SYMBOL(ihold);
 
 static void inode_lru_list_add(struct inode *inode)
 {
-	spin_lock(&inode->i_sb->s_inode_lru_lock);
-	if (list_empty(&inode->i_lru)) {
-		list_add(&inode->i_lru, &inode->i_sb->s_inode_lru);
-		inode->i_sb->s_nr_inodes_unused++;
+	if (list_lru_add(&inode->i_sb->s_inode_lru, &inode->i_lru))
 		this_cpu_inc(nr_unused);
-	}
-	spin_unlock(&inode->i_sb->s_inode_lru_lock);
 }
 
 static void inode_lru_list_del(struct inode *inode)
 {
-	spin_lock(&inode->i_sb->s_inode_lru_lock);
-	if (!list_empty(&inode->i_lru)) {
-		list_del_init(&inode->i_lru);
-		inode->i_sb->s_nr_inodes_unused--;
-		this_cpu_dec(nr_unused);
-	}
-	spin_unlock(&inode->i_sb->s_inode_lru_lock);
+	if (list_lru_del(&inode->i_sb->s_inode_lru, &inode->i_lru))
+		this_cpu_dec(nr_unused);
 }
 
 /**
@@ -582,24 +573,8 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	return busy;
 }
 
-static int can_unuse(struct inode *inode)
-{
-	if (inode->i_state & ~I_REFERENCED)
-		return 0;
-	if (inode_has_buffers(inode))
-		return 0;
-	if (atomic_read(&inode->i_count))
-		return 0;
-	if (inode->i_data.nrpages)
-		return 0;
-	return 1;
-}
-
 /*
- * Walk the superblock inode LRU for freeable inodes and attempt to free them.
- * This is called from the superblock shrinker function with a number of inodes
- * to trim from the LRU. Inodes to be freed are moved to a temporary list and
- * then are freed outside inode_lock by dispose_list().
+ * Isolate the inode from the LRU in preparation for freeing it.
  *
  * Any inodes which are pinned purely because of attached pagecache have their
  * pagecache removed.  If the inode has metadata buffers attached to
@@ -613,88 +588,77 @@ static int can_unuse(struct inode *inode)
  * LRU does not have strict ordering. Hence we don't want to reclaim inodes
  * with this flag set because they are the inodes that are out of order.
  */
-long prune_icache_sb(struct super_block *sb, long nr_to_scan)
+static int inode_lru_isolate(struct list_head *item, spinlock_t *lru_lock,
+				void *arg)
 {
-	LIST_HEAD(freeable);
-	long nr_scanned;
-	long freed = 0;
-	unsigned long reap = 0;
+	struct list_head *freeable = arg;
+	struct inode	*inode = container_of(item, struct inode, i_lru);
 
-	spin_lock(&sb->s_inode_lru_lock);
-	for (nr_scanned = nr_to_scan; nr_scanned >= 0; nr_scanned--) {
-		struct inode *inode;
+	/*
+	 * we are inverting the lru lock/inode->i_lock here,
+	 * so use a trylock. If we fail to get the lock, just skip
+	 * it
+	 */
+	if (!spin_trylock(&inode->i_lock))
+		return 2;
 
-		if (list_empty(&sb->s_inode_lru))
-			break;
+	/*
+	 * Referenced or dirty inodes are still in use. Give them
+	 * another pass through the LRU as we cannot reclaim them now.
+	 */
+	if (atomic_read(&inode->i_count) ||
+	    (inode->i_state & ~I_REFERENCED)) {
+		list_del_init(&inode->i_lru);
+		spin_unlock(&inode->i_lock);
+		this_cpu_dec(nr_unused);
+		return 0;
+	}
 
-		inode = list_entry(sb->s_inode_lru.prev, struct inode, i_lru);
+	/* recently referenced inodes get one more pass */
+	if (inode->i_state & I_REFERENCED) {
+		inode->i_state &= ~I_REFERENCED;
+		spin_unlock(&inode->i_lock);
+		return 1;
+	}
 
-		/*
-		 * we are inverting the sb->s_inode_lru_lock/inode->i_lock here,
-		 * so use a trylock. If we fail to get the lock, just move the
-		 * inode to the back of the list so we don't spin on it.
-		 */
-		if (!spin_trylock(&inode->i_lock)) {
-			list_move(&inode->i_lru, &sb->s_inode_lru);
-			continue;
+	if (inode_has_buffers(inode) || inode->i_data.nrpages) {
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+		spin_unlock(lru_lock);
+		if (remove_inode_buffers(inode)) {
+			unsigned long reap;
+			reap = invalidate_mapping_pages(&inode->i_data, 0, -1);
+			if (current_is_kswapd())
+				__count_vm_events(KSWAPD_INODESTEAL, reap);
+			else
+				__count_vm_events(PGINODESTEAL, reap);
 		}
+		iput(inode);
+		return 3;
+	}
 
-		/*
-		 * Referenced or dirty inodes are still in use. Give them
-		 * another pass through the LRU as we canot reclaim them now.
-		 */
-		if (atomic_read(&inode->i_count) ||
-		    (inode->i_state & ~I_REFERENCED)) {
-			list_del_init(&inode->i_lru);
-			spin_unlock(&inode->i_lock);
-			sb->s_nr_inodes_unused--;
-			this_cpu_dec(nr_unused);
-			continue;
-		}
+	WARN_ON(inode->i_state & I_NEW);
+	inode->i_state |= I_FREEING;
+	spin_unlock(&inode->i_lock);
 
-		/* recently referenced inodes get one more pass */
-		if (inode->i_state & I_REFERENCED) {
-			inode->i_state &= ~I_REFERENCED;
-			list_move(&inode->i_lru, &sb->s_inode_lru);
-			spin_unlock(&inode->i_lock);
-			continue;
-		}
-		if (inode_has_buffers(inode) || inode->i_data.nrpages) {
-			__iget(inode);
-			spin_unlock(&inode->i_lock);
-			spin_unlock(&sb->s_inode_lru_lock);
-			if (remove_inode_buffers(inode))
-				reap += invalidate_mapping_pages(&inode->i_data,
-								0, -1);
-			iput(inode);
-			spin_lock(&sb->s_inode_lru_lock);
-
-			if (inode != list_entry(sb->s_inode_lru.next,
-						struct inode, i_lru))
-				continue;	/* wrong inode or list_empty */
-			/* avoid lock inversions with trylock */
-			if (!spin_trylock(&inode->i_lock))
-				continue;
-			if (!can_unuse(inode)) {
-				spin_unlock(&inode->i_lock);
-				continue;
-			}
-		}
-		WARN_ON(inode->i_state & I_NEW);
-		inode->i_state |= I_FREEING;
-		spin_unlock(&inode->i_lock);
+	list_move(&inode->i_lru, freeable);
+	this_cpu_dec(nr_unused);
+	return 0;
+}
 
-		list_move(&inode->i_lru, &freeable);
-		sb->s_nr_inodes_unused--;
-		this_cpu_dec(nr_unused);
-		freed++;
-	}
-	if (current_is_kswapd())
-		__count_vm_events(KSWAPD_INODESTEAL, reap);
-	else
-		__count_vm_events(PGINODESTEAL, reap);
-	spin_unlock(&sb->s_inode_lru_lock);
+/*
+ * Walk the superblock inode LRU for freeable inodes and attempt to free them.
+ * This is called from the superblock shrinker function with a number of inodes
+ * to trim from the LRU. Inodes to be freed are moved to a temporary list and
+ * then are freed outside inode_lock by dispose_list().
+ */
+long prune_icache_sb(struct super_block *sb, long nr_to_scan)
+{
+	LIST_HEAD(freeable);
+	long freed;
 
+	freed = list_lru_walk(&sb->s_inode_lru, inode_lru_isolate,
+						&freeable, nr_to_scan);
 	dispose_list(&freeable);
 	return freed;
 }
diff --git a/fs/super.c b/fs/super.c
index 074abbe..bc0e101 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -69,12 +69,12 @@ static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 	if (sb->s_op && sb->s_op->nr_cached_objects)
 		fs_objects = sb->s_op->nr_cached_objects(sb);
 
-	total_objects = sb->s_nr_dentry_unused +
-			sb->s_nr_inodes_unused + fs_objects + 1;
+	inodes = list_lru_count(&sb->s_inode_lru);
+	total_objects = sb->s_nr_dentry_unused + inodes + fs_objects + 1;
 
 	/* proportion the scan between the caches */
 	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
-	inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) / total_objects;
+	inodes = (sc->nr_to_scan * inodes) / total_objects;
 
 	/*
 	 * prune the dcache first as the icache is pinned by it, then
@@ -106,7 +106,7 @@ static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc
 		total_objects = sb->s_op->nr_cached_objects(sb);
 
 	total_objects += sb->s_nr_dentry_unused;
-	total_objects += sb->s_nr_inodes_unused;
+	total_objects += list_lru_count(&sb->s_inode_lru);
 
 	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
 	drop_super(sb);
@@ -153,8 +153,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
 		INIT_LIST_HEAD(&s->s_inodes);
 		INIT_LIST_HEAD(&s->s_dentry_lru);
 		spin_lock_init(&s->s_dentry_lru_lock);
-		INIT_LIST_HEAD(&s->s_inode_lru);
-		spin_lock_init(&s->s_inode_lru_lock);
+		list_lru_init(&s->s_inode_lru);
 		init_rwsem(&s->s_umount);
 		mutex_init(&s->s_lock);
 		lockdep_set_class(&s->s_umount, &type->s_umount_key);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2651059..80beb62 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -385,6 +385,7 @@ struct inodes_stat_t {
 #include <linux/stat.h>
 #include <linux/cache.h>
 #include <linux/list.h>
+#include <linux/list_lru.h>
 #include <linux/radix-tree.h>
 #include <linux/prio_tree.h>
 #include <linux/init.h>
@@ -1414,10 +1415,7 @@ struct super_block {
 	struct list_head	s_dentry_lru;	/* unused dentry lru */
 	int			s_nr_dentry_unused; /* # of dentries on lru */
 
-	/* s_inode_lru_lock protects s_inode_lru and s_nr_inodes_unused */
-	spinlock_t		s_inode_lru_lock ____cacheline_aligned_in_smp;
-	struct list_head	s_inode_lru;		/* unused inode lru */
-	int			s_nr_inodes_unused;	/* # of inodes on lru */
+	struct list_lru		s_inode_lru ____cacheline_aligned_in_smp;
 
 	struct block_device	*s_bdev;
 	struct backing_dev_info *s_bdi;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 09/13] inode: convert inode lru list to generic lru list code.
@ 2011-08-23  8:56   ` Dave Chinner
  0 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/inode.c         |  168 ++++++++++++++++++++-------------------------------
 fs/super.c         |   11 ++--
 include/linux/fs.h |    6 +-
 3 files changed, 73 insertions(+), 112 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index fee5d9a..98ca516 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -26,6 +26,7 @@
 #include <linux/ima.h>
 #include <linux/cred.h>
 #include <linux/buffer_head.h> /* for inode_has_buffers */
+#include <linux/list_lru.h>
 #include "internal.h"
 
 /*
@@ -328,24 +329,14 @@ EXPORT_SYMBOL(ihold);
 
 static void inode_lru_list_add(struct inode *inode)
 {
-	spin_lock(&inode->i_sb->s_inode_lru_lock);
-	if (list_empty(&inode->i_lru)) {
-		list_add(&inode->i_lru, &inode->i_sb->s_inode_lru);
-		inode->i_sb->s_nr_inodes_unused++;
+	if (list_lru_add(&inode->i_sb->s_inode_lru, &inode->i_lru))
 		this_cpu_inc(nr_unused);
-	}
-	spin_unlock(&inode->i_sb->s_inode_lru_lock);
 }
 
 static void inode_lru_list_del(struct inode *inode)
 {
-	spin_lock(&inode->i_sb->s_inode_lru_lock);
-	if (!list_empty(&inode->i_lru)) {
-		list_del_init(&inode->i_lru);
-		inode->i_sb->s_nr_inodes_unused--;
-		this_cpu_dec(nr_unused);
-	}
-	spin_unlock(&inode->i_sb->s_inode_lru_lock);
+	if (list_lru_del(&inode->i_sb->s_inode_lru, &inode->i_lru))
+		this_cpu_inc(nr_unused);
 }
 
 /**
@@ -582,24 +573,8 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	return busy;
 }
 
-static int can_unuse(struct inode *inode)
-{
-	if (inode->i_state & ~I_REFERENCED)
-		return 0;
-	if (inode_has_buffers(inode))
-		return 0;
-	if (atomic_read(&inode->i_count))
-		return 0;
-	if (inode->i_data.nrpages)
-		return 0;
-	return 1;
-}
-
 /*
- * Walk the superblock inode LRU for freeable inodes and attempt to free them.
- * This is called from the superblock shrinker function with a number of inodes
- * to trim from the LRU. Inodes to be freed are moved to a temporary list and
- * then are freed outside inode_lock by dispose_list().
+ * Isolate the inode from the LRU in preparation for freeing it.
  *
  * Any inodes which are pinned purely because of attached pagecache have their
  * pagecache removed.  If the inode has metadata buffers attached to
@@ -613,88 +588,77 @@ static int can_unuse(struct inode *inode)
  * LRU does not have strict ordering. Hence we don't want to reclaim inodes
  * with this flag set because they are the inodes that are out of order.
  */
-long prune_icache_sb(struct super_block *sb, long nr_to_scan)
+static int inode_lru_isolate(struct list_head *item, spinlock_t *lru_lock,
+				void *arg)
 {
-	LIST_HEAD(freeable);
-	long nr_scanned;
-	long freed = 0;
-	unsigned long reap = 0;
+	struct list_head *freeable = arg;
+	struct inode	*inode = container_of(item, struct inode, i_lru);
 
-	spin_lock(&sb->s_inode_lru_lock);
-	for (nr_scanned = nr_to_scan; nr_scanned >= 0; nr_scanned--) {
-		struct inode *inode;
+	/*
+	 * we are inverting the lru lock/inode->i_lock here,
+	 * so use a trylock. If we fail to get the lock, just skip
+	 * it
+	 */
+	if (!spin_trylock(&inode->i_lock))
+		return 2;
 
-		if (list_empty(&sb->s_inode_lru))
-			break;
+	/*
+	 * Referenced or dirty inodes are still in use. Give them
+	 * another pass through the LRU as we canot reclaim them now.
+	 */
+	if (atomic_read(&inode->i_count) ||
+	    (inode->i_state & ~I_REFERENCED)) {
+		list_del_init(&inode->i_lru);
+		spin_unlock(&inode->i_lock);
+		this_cpu_dec(nr_unused);
+		return 0;
+	}
 
-		inode = list_entry(sb->s_inode_lru.prev, struct inode, i_lru);
+	/* recently referenced inodes get one more pass */
+	if (inode->i_state & I_REFERENCED) {
+		inode->i_state &= ~I_REFERENCED;
+		spin_unlock(&inode->i_lock);
+		return 1;
+	}
 
-		/*
-		 * we are inverting the sb->s_inode_lru_lock/inode->i_lock here,
-		 * so use a trylock. If we fail to get the lock, just move the
-		 * inode to the back of the list so we don't spin on it.
-		 */
-		if (!spin_trylock(&inode->i_lock)) {
-			list_move(&inode->i_lru, &sb->s_inode_lru);
-			continue;
+	if (inode_has_buffers(inode) || inode->i_data.nrpages) {
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+		spin_unlock(lru_lock);
+		if (remove_inode_buffers(inode)) {
+			unsigned long reap;
+			reap = invalidate_mapping_pages(&inode->i_data, 0, -1);
+			if (current_is_kswapd())
+				__count_vm_events(KSWAPD_INODESTEAL, reap);
+			else
+				__count_vm_events(PGINODESTEAL, reap);
 		}
+		iput(inode);
+		return 3;
+	}
 
-		/*
-		 * Referenced or dirty inodes are still in use. Give them
-		 * another pass through the LRU as we canot reclaim them now.
-		 */
-		if (atomic_read(&inode->i_count) ||
-		    (inode->i_state & ~I_REFERENCED)) {
-			list_del_init(&inode->i_lru);
-			spin_unlock(&inode->i_lock);
-			sb->s_nr_inodes_unused--;
-			this_cpu_dec(nr_unused);
-			continue;
-		}
+	WARN_ON(inode->i_state & I_NEW);
+	inode->i_state |= I_FREEING;
+	spin_unlock(&inode->i_lock);
 
-		/* recently referenced inodes get one more pass */
-		if (inode->i_state & I_REFERENCED) {
-			inode->i_state &= ~I_REFERENCED;
-			list_move(&inode->i_lru, &sb->s_inode_lru);
-			spin_unlock(&inode->i_lock);
-			continue;
-		}
-		if (inode_has_buffers(inode) || inode->i_data.nrpages) {
-			__iget(inode);
-			spin_unlock(&inode->i_lock);
-			spin_unlock(&sb->s_inode_lru_lock);
-			if (remove_inode_buffers(inode))
-				reap += invalidate_mapping_pages(&inode->i_data,
-								0, -1);
-			iput(inode);
-			spin_lock(&sb->s_inode_lru_lock);
-
-			if (inode != list_entry(sb->s_inode_lru.next,
-						struct inode, i_lru))
-				continue;	/* wrong inode or list_empty */
-			/* avoid lock inversions with trylock */
-			if (!spin_trylock(&inode->i_lock))
-				continue;
-			if (!can_unuse(inode)) {
-				spin_unlock(&inode->i_lock);
-				continue;
-			}
-		}
-		WARN_ON(inode->i_state & I_NEW);
-		inode->i_state |= I_FREEING;
-		spin_unlock(&inode->i_lock);
+	list_move(&inode->i_lru, freeable);
+	this_cpu_dec(nr_unused);
+	return 0;
+}
 
-		list_move(&inode->i_lru, &freeable);
-		sb->s_nr_inodes_unused--;
-		this_cpu_dec(nr_unused);
-		freed++;
-	}
-	if (current_is_kswapd())
-		__count_vm_events(KSWAPD_INODESTEAL, reap);
-	else
-		__count_vm_events(PGINODESTEAL, reap);
-	spin_unlock(&sb->s_inode_lru_lock);
+/*
+ * Walk the superblock inode LRU for freeable inodes and attempt to free them.
+ * This is called from the superblock shrinker function with a number of inodes
+ * to trim from the LRU. Inodes to be freed are moved to a temporary list and
+ * then are freed outside inode_lock by dispose_list().
+ */
+long prune_icache_sb(struct super_block *sb, long nr_to_scan)
+{
+	LIST_HEAD(freeable);
+	long freed;
 
+	freed = list_lru_walk(&sb->s_inode_lru, inode_lru_isolate,
+						&freeable, nr_to_scan);
 	dispose_list(&freeable);
 	return freed;
 }
diff --git a/fs/super.c b/fs/super.c
index 074abbe..bc0e101 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -69,12 +69,12 @@ static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 	if (sb->s_op && sb->s_op->nr_cached_objects)
 		fs_objects = sb->s_op->nr_cached_objects(sb);
 
-	total_objects = sb->s_nr_dentry_unused +
-			sb->s_nr_inodes_unused + fs_objects + 1;
+	inodes = list_lru_count(&sb->s_inode_lru);
+	total_objects = sb->s_nr_dentry_unused + inodes + fs_objects + 1;
 
 	/* proportion the scan between the caches */
 	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
-	inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) / total_objects;
+	inodes = (sc->nr_to_scan * inodes) / total_objects;
 
 	/*
 	 * prune the dcache first as the icache is pinned by it, then
@@ -106,7 +106,7 @@ static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc
 		total_objects = sb->s_op->nr_cached_objects(sb);
 
 	total_objects += sb->s_nr_dentry_unused;
-	total_objects += sb->s_nr_inodes_unused;
+	total_objects += list_lru_count(&sb->s_inode_lru);
 
 	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
 	drop_super(sb);
@@ -153,8 +153,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
 		INIT_LIST_HEAD(&s->s_inodes);
 		INIT_LIST_HEAD(&s->s_dentry_lru);
 		spin_lock_init(&s->s_dentry_lru_lock);
-		INIT_LIST_HEAD(&s->s_inode_lru);
-		spin_lock_init(&s->s_inode_lru_lock);
+		list_lru_init(&s->s_inode_lru);
 		init_rwsem(&s->s_umount);
 		mutex_init(&s->s_lock);
 		lockdep_set_class(&s->s_umount, &type->s_umount_key);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2651059..80beb62 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -385,6 +385,7 @@ struct inodes_stat_t {
 #include <linux/stat.h>
 #include <linux/cache.h>
 #include <linux/list.h>
+#include <linux/list_lru.h>
 #include <linux/radix-tree.h>
 #include <linux/prio_tree.h>
 #include <linux/init.h>
@@ -1414,10 +1415,7 @@ struct super_block {
 	struct list_head	s_dentry_lru;	/* unused dentry lru */
 	int			s_nr_dentry_unused; /* # of dentries on lru */
 
-	/* s_inode_lru_lock protects s_inode_lru and s_nr_inodes_unused */
-	spinlock_t		s_inode_lru_lock ____cacheline_aligned_in_smp;
-	struct list_head	s_inode_lru;		/* unused inode lru */
-	int			s_nr_inodes_unused;	/* # of inodes on lru */
+	struct list_lru		s_inode_lru ____cacheline_aligned_in_smp;
 
 	struct block_device	*s_bdev;
 	struct backing_dev_info *s_bdi;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread
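
The list_lru_init/add/del/count/walk calls used throughout these
conversions come from the generic LRU list infrastructure introduced
earlier in the series. As a reading aid, here is a minimal sketch of
that interface, reconstructed from the call sites in the diffs; the
field layout, return conventions and locking details below are
assumptions, not the series' actual implementation:

	/* editorial sketch - not the series' implementation */
	#include <linux/list.h>
	#include <linux/spinlock.h>

	struct list_lru {
		spinlock_t		lock;
		struct list_head	list;
		long			nr_items;
	};

	static void list_lru_init(struct list_lru *lru)
	{
		spin_lock_init(&lru->lock);
		INIT_LIST_HEAD(&lru->list);
		lru->nr_items = 0;
	}

	/* returns 1 if @item was added, 0 if it was already on a list */
	static int list_lru_add(struct list_lru *lru, struct list_head *item)
	{
		int ret = 0;

		spin_lock(&lru->lock);
		if (list_empty(item)) {
			list_add_tail(item, &lru->list);
			lru->nr_items++;
			ret = 1;
		}
		spin_unlock(&lru->lock);
		return ret;
	}

	/* returns 1 if @item was removed, 0 if it was not on the list */
	static int list_lru_del(struct list_lru *lru, struct list_head *item)
	{
		int ret = 0;

		spin_lock(&lru->lock);
		if (!list_empty(item)) {
			list_del_init(item);
			lru->nr_items--;
			ret = 1;
		}
		spin_unlock(&lru->lock);
		return ret;
	}

	static long list_lru_count(struct list_lru *lru)
	{
		return lru->nr_items;
	}

	/*
	 * Walk up to @nr_to_walk items, handing each to @isolate. A zero
	 * return from @isolate means the callback took the item off the
	 * list, so the accounting is corrected here. The real walker also
	 * has to cope with callbacks that drop @lru->lock (see the
	 * non-zero return codes used by the XFS and dcache callbacks
	 * below); that handling is omitted from this sketch.
	 */
	static long list_lru_walk(struct list_lru *lru,
				  int (*isolate)(struct list_head *item,
						 spinlock_t *lru_lock,
						 void *arg),
				  void *arg, long nr_to_walk)
	{
		struct list_head *item, *n;
		long removed = 0;

		spin_lock(&lru->lock);
		list_for_each_safe(item, n, &lru->list) {
			if (nr_to_walk-- <= 0)
				break;
			if (isolate(item, &lru->lock, arg) == 0) {
				lru->nr_items--;
				removed++;
			}
		}
		spin_unlock(&lru->lock);
		return removed;
	}

The important property is that add/del are idempotent with respect to
list membership and keep nr_items coherent under a single lock, which
is what lets each caller replace its open-coded lock/list/counter
triple.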

* [PATCH 10/13] xfs: convert buftarg LRU to generic code
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf.c |  143 +++++++++++++++++++++++-------------------------------
 fs/xfs/xfs_buf.h |    5 +-
 2 files changed, 62 insertions(+), 86 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index b2eea9e..25c8ffd 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -98,19 +98,12 @@ xfs_buf_vmap_len(
  * The LRU takes a new reference to the buffer so that it will only be freed
  * once the shrinker takes the buffer off the LRU.
  */
-STATIC void
+static inline void
 xfs_buf_lru_add(
 	struct xfs_buf	*bp)
 {
-	struct xfs_buftarg *btp = bp->b_target;
-
-	spin_lock(&btp->bt_lru_lock);
-	if (list_empty(&bp->b_lru)) {
+	if (list_lru_add(&bp->b_target->bt_lru, &bp->b_lru))
 		atomic_inc(&bp->b_hold);
-		list_add_tail(&bp->b_lru, &btp->bt_lru);
-		btp->bt_lru_nr++;
-	}
-	spin_unlock(&btp->bt_lru_lock);
 }
 
 /*
@@ -119,24 +112,16 @@ xfs_buf_lru_add(
  * The unlocked check is safe here because it only occurs when there are not
  * b_lru_ref counts left on the inode under the pag->pag_buf_lock. it is there
  * to optimise the shrinker removing the buffer from the LRU and calling
- * xfs_buf_free(). i.e. it removes an unnecessary round trip on the
- * bt_lru_lock.
+ * xfs_buf_free().
  */
-STATIC void
+static inline void
 xfs_buf_lru_del(
 	struct xfs_buf	*bp)
 {
-	struct xfs_buftarg *btp = bp->b_target;
-
 	if (list_empty(&bp->b_lru))
 		return;
 
-	spin_lock(&btp->bt_lru_lock);
-	if (!list_empty(&bp->b_lru)) {
-		list_del_init(&bp->b_lru);
-		btp->bt_lru_nr--;
-	}
-	spin_unlock(&btp->bt_lru_lock);
+	list_lru_del(&bp->b_target->bt_lru, &bp->b_lru);
 }
 
 /*
@@ -153,17 +138,10 @@ xfs_buf_stale(
 {
 	bp->b_flags |= XBF_STALE;
 	atomic_set(&(bp)->b_lru_ref, 0);
-	if (!list_empty(&bp->b_lru)) {
-		struct xfs_buftarg *btp = bp->b_target;
+	if (!list_empty(&bp->b_lru) &&
+	    list_lru_del(&bp->b_target->bt_lru, &bp->b_lru))
+		atomic_dec(&bp->b_hold);
 
-		spin_lock(&btp->bt_lru_lock);
-		if (!list_empty(&bp->b_lru)) {
-			list_del_init(&bp->b_lru);
-			btp->bt_lru_nr--;
-			atomic_dec(&bp->b_hold);
-		}
-		spin_unlock(&btp->bt_lru_lock);
-	}
 	ASSERT(atomic_read(&bp->b_hold) >= 1);
 }
 
@@ -1429,31 +1407,59 @@ xfs_buf_iomove(
  * returned. These buffers will have an elevated hold count, so wait on those
  * while freeing all the buffers only held by the LRU.
  */
-void
-xfs_wait_buftarg(
-	struct xfs_buftarg	*btp)
+static int
+xfs_buftarg_wait_rele(
+	struct list_head	*item,
+	spinlock_t		*lru_lock,
+	void			*arg)
 {
-	struct xfs_buf		*bp;
+	struct xfs_buf		*bp = container_of(item, struct xfs_buf, b_lru);
 
-restart:
-	spin_lock(&btp->bt_lru_lock);
-	while (!list_empty(&btp->bt_lru)) {
-		bp = list_first_entry(&btp->bt_lru, struct xfs_buf, b_lru);
-		if (atomic_read(&bp->b_hold) > 1) {
-			spin_unlock(&btp->bt_lru_lock);
-			delay(100);
-			goto restart;
-		}
+	if (atomic_read(&bp->b_hold) > 1) {
+		/* need to wait */
+		spin_unlock(lru_lock);
+		delay(100);
+	} else {
 		/*
-		 * clear the LRU reference count so the bufer doesn't get
+		 * clear the LRU reference count so the buffer doesn't get
 		 * ignored in xfs_buf_rele().
 		 */
 		atomic_set(&bp->b_lru_ref, 0);
-		spin_unlock(&btp->bt_lru_lock);
+		spin_unlock(lru_lock);
 		xfs_buf_rele(bp);
-		spin_lock(&btp->bt_lru_lock);
 	}
-	spin_unlock(&btp->bt_lru_lock);
+	return 3;
+}
+void
+xfs_wait_buftarg(
+	struct xfs_buftarg	*btp)
+{
+	list_lru_walk(&btp->bt_lru, xfs_buftarg_wait_rele, NULL, LONG_MAX);
+}
+
+static int
+xfs_buftarg_isolate(
+	struct list_head	*item,
+	spinlock_t		*lru_lock,
+	void			*arg)
+{
+	struct xfs_buf		*bp = container_of(item, struct xfs_buf, b_lru);
+	struct list_head	*dispose = arg;
+
+	/*
+	 * Decrement the b_lru_ref count unless the value is already
+	 * zero. If the value is already zero, we need to reclaim the
+	 * buffer, otherwise it gets another trip through the LRU.
+	 */
+	if (!atomic_add_unless(&bp->b_lru_ref, -1, 0))
+		return 1;
+
+	/*
+	 * remove the buffer from the LRU now to avoid needing another
+	 * lock round trip inside xfs_buf_rele().
+	 */
+	list_move(item, dispose);
+	return 0;
 }
 
 static long
@@ -1463,42 +1469,14 @@ xfs_buftarg_shrink_scan(
 {
 	struct xfs_buftarg	*btp = container_of(shrink,
 					struct xfs_buftarg, bt_shrinker);
-	struct xfs_buf		*bp;
-	int nr_to_scan = sc->nr_to_scan;
-	int freed = 0;
+	long freed = 0;
 	LIST_HEAD(dispose);
 
-	if (!nr_to_scan)
-		return btp->bt_lru_nr;
-
-	spin_lock(&btp->bt_lru_lock);
-	while (!list_empty(&btp->bt_lru)) {
-		if (nr_to_scan-- <= 0)
-			break;
-
-		bp = list_first_entry(&btp->bt_lru, struct xfs_buf, b_lru);
-
-		/*
-		 * Decrement the b_lru_ref count unless the value is already
-		 * zero. If the value is already zero, we need to reclaim the
-		 * buffer, otherwise it gets another trip through the LRU.
-		 */
-		if (!atomic_add_unless(&bp->b_lru_ref, -1, 0)) {
-			list_move_tail(&bp->b_lru, &btp->bt_lru);
-			continue;
-		}
-
-		/*
-		 * remove the buffer from the LRU now to avoid needing another
-		 * lock round trip inside xfs_buf_rele().
-		 */
-		list_move(&bp->b_lru, &dispose);
-		btp->bt_lru_nr--;
-		freed++;
-	}
-	spin_unlock(&btp->bt_lru_lock);
+	freed = list_lru_walk(&btp->bt_lru, xfs_buftarg_isolate,
+		      &dispose, sc->nr_to_scan);
 
 	while (!list_empty(&dispose)) {
+		struct xfs_buf *bp;
 		bp = list_first_entry(&dispose, struct xfs_buf, b_lru);
 		list_del_init(&bp->b_lru);
 		xfs_buf_rele(bp);
@@ -1514,7 +1492,7 @@ xfs_buftarg_shrink_count(
 {
 	struct xfs_buftarg	*btp = container_of(shrink,
 					struct xfs_buftarg, bt_shrinker);
-	return btp->bt_lru_nr;
+	return list_lru_count(&btp->bt_lru);
 }
 
 void
@@ -1608,8 +1586,7 @@ xfs_alloc_buftarg(
 	if (!btp->bt_bdi)
 		goto error;
 
-	INIT_LIST_HEAD(&btp->bt_lru);
-	spin_lock_init(&btp->bt_lru_lock);
+	list_lru_init(&btp->bt_lru);
 	if (xfs_setsize_buftarg_early(btp, bdev))
 		goto error;
 	if (xfs_alloc_delwrite_queue(btp, fsname))
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 620972b..f8dafde 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -26,6 +26,7 @@
 #include <linux/fs.h>
 #include <linux/buffer_head.h>
 #include <linux/uio.h>
+#include <linux/list_lru.h>
 
 /*
  *	Base types
@@ -111,9 +112,7 @@ typedef struct xfs_buftarg {
 
 	/* LRU control structures */
 	struct shrinker		bt_shrinker;
-	struct list_head	bt_lru;
-	spinlock_t		bt_lru_lock;
-	unsigned int		bt_lru_nr;
+	struct list_lru		bt_lru;
 } xfs_buftarg_t;
 
 struct xfs_buf;
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread
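
The bare integers returned by xfs_buftarg_wait_rele() and
xfs_buftarg_isolate() above are instructions to list_lru_walk(). Their
meanings are not spelled out in this patch; the convention below is
inferred from the call sites here and from the dcache isolate callback
in patch 13, so treat it as a reconstruction rather than the series'
own definition:

	/* inferred list_lru_walk() callback return codes */
	enum {
		LRU_REMOVED	= 0,	/* callback took the item off the LRU */
		LRU_ROTATE	= 1,	/* leave the item for another LRU pass */
		LRU_SKIP	= 2,	/* could not lock the item; skip it */
		LRU_RETRY	= 3,	/* lru_lock was dropped; restart the walk */
	};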


* [PATCH 11/13] dcache: use a dispose list in select_parent
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

select_parent currently abuses the dentry cache LRU to provide
cleanup features for child dentries that need to be freed. It moves
them to the tail of the LRU, then tells shrink_dcache_parent() to
call __shrink_dcache_sb() to unconditionally move them to a dispose
list (as DCACHE_REFERENCED is ignored). __shrink_dcache_sb() has to
relock the dentries to move them off the LRU onto the dispose list,
but otherwise does not touch the dentries that select_parent() moved
to the tail of the LRU. It then passes the dispose list to
shrink_dentry_list(), which tries to free the dentries.

IOWs, the use of __shrink_dcache_sb() is superfluous - we can build
exactly the same list of dentries for disposal directly in
select_parent() and call shrink_dentry_list() on it instead of going
through __shrink_dcache_sb(). This means that we avoid long holds on
the lru lock walking the LRU to move dentries to the dispose list.
We also avoid the need to relock each dentry just to move it off the
LRU, reducing the number of times we lock each dentry to dispose of
it in shrink_dcache_parent() from 3 to 2.

Further, we remove one of the two callers of __shrink_dcache_sb().
This also means that __shrink_dcache_sb() can be moved back into
prune_dcache_sb() and we no longer have to handle referenced
dentries conditionally, simplifying the code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/dcache.c |   65 ++++++++++++++++++++---------------------------------------
 1 files changed, 22 insertions(+), 43 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index d19e453..b931415 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -264,15 +264,15 @@ static void dentry_lru_del(struct dentry *dentry)
 	}
 }
 
-static void dentry_lru_move_tail(struct dentry *dentry)
+static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list)
 {
 	spin_lock(&dentry->d_sb->s_dentry_lru_lock);
 	if (list_empty(&dentry->d_lru)) {
-		list_add_tail(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
+		list_add_tail(&dentry->d_lru, list);
 		dentry->d_sb->s_nr_dentry_unused++;
 		this_cpu_inc(nr_dentry_unused);
 	} else {
-		list_move_tail(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
+		list_move_tail(&dentry->d_lru, list);
 	}
 	spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
 }
@@ -752,14 +752,18 @@ static void shrink_dentry_list(struct list_head *list)
 }
 
 /**
- * __shrink_dcache_sb - shrink the dentry LRU on a given superblock
- * @sb:		superblock to shrink dentry LRU.
- * @count:	number of entries to prune
- * @flags:	flags to control the dentry processing
+ * prune_dcache_sb - shrink the dcache
+ * @sb: superblock
+ * @nr_to_scan: number of entries to try to free
+ *
+ * Attempt to shrink the superblock dcache LRU by @nr_to_scan entries. This is
+ * done when we need more memory and called from the superblock shrinker
+ * function.
  *
- * If flags contains DCACHE_REFERENCED reference dentries will not be pruned.
+ * This function may fail to free any resources if all the dentries are in
+ * use.
  */
-static long __shrink_dcache_sb(struct super_block *sb, long count, int flags)
+long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
 {
 	struct dentry *dentry;
 	LIST_HEAD(referenced);
@@ -779,13 +783,7 @@ relock:
 			goto relock;
 		}
 
-		/*
-		 * If we are honouring the DCACHE_REFERENCED flag and the
-		 * dentry has this flag set, don't free it.  Clear the flag
-		 * and put it back on the LRU.
-		 */
-		if (flags & DCACHE_REFERENCED &&
-				dentry->d_flags & DCACHE_REFERENCED) {
+		if (dentry->d_flags & DCACHE_REFERENCED) {
 			dentry->d_flags &= ~DCACHE_REFERENCED;
 			list_move(&dentry->d_lru, &referenced);
 			spin_unlock(&dentry->d_lock);
@@ -793,7 +791,7 @@ relock:
 			list_move_tail(&dentry->d_lru, &tmp);
 			spin_unlock(&dentry->d_lock);
 			freed++;
-			if (!--count)
+			if (!--nr_to_scan)
 				break;
 		}
 		cond_resched_lock(&sb->s_dentry_lru_lock);
@@ -807,23 +805,6 @@ relock:
 }
 
 /**
- * prune_dcache_sb - shrink the dcache
- * @sb: superblock
- * @nr_to_scan: number of entries to try to free
- *
- * Attempt to shrink the superblock dcache LRU by @nr_to_scan entries. This is
- * done when we need more memory an called from the superblock shrinker
- * function.
- *
- * This function may fail to free any resources if all the dentries are in
- * use.
- */
-long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
-{
-	return __shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
-}
-
-/**
  * shrink_dcache_sb - shrink dcache for a superblock
  * @sb: superblock
  *
@@ -1073,7 +1054,7 @@ EXPORT_SYMBOL(have_submounts);
  * drop the lock and return early due to latency
  * constraints.
  */
-static long select_parent(struct dentry * parent)
+static long select_parent(struct dentry *parent, struct list_head *dispose)
 {
 	struct dentry *this_parent;
 	struct list_head *next;
@@ -1095,12 +1076,11 @@ resume:
 
 		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
 
-		/* 
-		 * move only zero ref count dentries to the end 
-		 * of the unused list for prune_dcache
+		/*
+		 * move only zero ref count dentries to the dispose list.
 		 */
 		if (!dentry->d_count) {
-			dentry_lru_move_tail(dentry);
+			dentry_lru_move_list(dentry, dispose);
 			found++;
 		} else {
 			dentry_lru_del(dentry);
@@ -1162,14 +1142,13 @@ rename_retry:
  *
  * Prune the dcache to remove unused children of the parent dentry.
  */
- 
 void shrink_dcache_parent(struct dentry * parent)
 {
-	struct super_block *sb = parent->d_sb;
+	LIST_HEAD(dispose);
 	long found;
 
-	while ((found = select_parent(parent)) != 0)
-		__shrink_dcache_sb(sb, found, 0);
+	while ((found = select_parent(parent, &dispose)) != 0)
+		shrink_dentry_list(&dispose);
 }
 EXPORT_SYMBOL(shrink_dcache_parent);
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread


* [PATCH 12/13] dcache: remove dentries from LRU before putting on dispose list
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

One of the big problems with modifying the way the dcache shrinker
and LRU implementation works is that the LRU is abused in several
ways. One of these is shrink_dentry_list().

Basically, we can move a dentry off the LRU onto a different list
without doing any accounting changes, and then use dentry_lru_del()
to remove it from whatever list it is now on and do the LRU
accounting at that point.

This makes it -really hard- to change the LRU implementation. The
use of the per-sb LRU lock serialises movement of the dentries
between the different lists and the removal of them, and this is the
only reason that it works. If we want to break up the dentry LRU
lock and lists into, say, per-node lists, we remove the only
serialisation that allows this lru list/dispose list abuse to work.

To make this work effectively, the dispose list has to be isolated
from the LRU list - dentries have to be removed from the LRU
*before* being placed on the dispose list. This means that the LRU
accounting and isolation is completed before disposal is started,
and that means we can change the LRU implementation freely in
future.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/dcache.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index b931415..79bf47c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -269,10 +269,10 @@ static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list)
 	spin_lock(&dentry->d_sb->s_dentry_lru_lock);
 	if (list_empty(&dentry->d_lru)) {
 		list_add_tail(&dentry->d_lru, list);
-		dentry->d_sb->s_nr_dentry_unused++;
-		this_cpu_inc(nr_dentry_unused);
 	} else {
 		list_move_tail(&dentry->d_lru, list);
+		dentry->d_sb->s_nr_dentry_unused--;
+		this_cpu_dec(nr_dentry_unused);
 	}
 	spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
 }
@@ -732,12 +732,17 @@ static void shrink_dentry_list(struct list_head *list)
 		}
 
 		/*
+		 * The dispose list is isolated: dentries on it are no longer
+		 * accounted to the LRU, so we can simply remove each one from
+		 * the list here regardless of whether it is referenced or not.
+		 */
+		list_del_init(&dentry->d_lru);
+
+		/*
 		 * We found an inuse dentry which was not removed from
-		 * the LRU because of laziness during lookup.  Do not free
-		 * it - just keep it off the LRU list.
+		 * the LRU because of laziness during lookup. Do not free it.
 		 */
 		if (dentry->d_count) {
-			dentry_lru_del(dentry);
 			spin_unlock(&dentry->d_lock);
 			continue;
 		}
@@ -789,6 +794,8 @@ relock:
 			spin_unlock(&dentry->d_lock);
 		} else {
 			list_move_tail(&dentry->d_lru, &tmp);
+			this_cpu_dec(nr_dentry_unused);
+			sb->s_nr_dentry_unused--;
 			spin_unlock(&dentry->d_lock);
 			freed++;
 			if (!--nr_to_scan)
@@ -818,6 +825,14 @@ void shrink_dcache_sb(struct super_block *sb)
 	spin_lock(&sb->s_dentry_lru_lock);
 	while (!list_empty(&sb->s_dentry_lru)) {
 		list_splice_init(&sb->s_dentry_lru, &tmp);
+
+		/*
+		 * account for removal here so we don't need to handle it later
+		 * even though the dentry is no longer on the lru list.
+		 */
+		this_cpu_sub(nr_dentry_unused, sb->s_nr_dentry_unused);
+		sb->s_nr_dentry_unused = 0;
+
 		spin_unlock(&sb->s_dentry_lru_lock);
 		shrink_dentry_list(&tmp);
 		spin_lock(&sb->s_dentry_lru_lock);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread
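
The discipline this patch establishes can be boiled down to a short
sketch (illustrative only - isolate_one() is a hypothetical helper,
not code from the patch): LRU accounting is corrected at the moment a
dentry leaves the LRU, so by the time disposal starts the dispose
list is fully private and needs no LRU lock:

	/* illustrative sketch, not code from the patch */
	static void isolate_one(struct super_block *sb, struct dentry *dentry,
				struct list_head *dispose)
	{
		spin_lock(&sb->s_dentry_lru_lock);
		list_move_tail(&dentry->d_lru, dispose);	/* leave the LRU... */
		sb->s_nr_dentry_unused--;			/* ...and account now */
		this_cpu_dec(nr_dentry_unused);
		spin_unlock(&sb->s_dentry_lru_lock);
	}

	/* later, with no LRU lock held, the dispose list can be
	 * processed freely: shrink_dentry_list(&dispose); */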


* [PATCH 13/13] dcache: convert to use new lru list infrastructure
  2011-08-23  8:56 ` Dave Chinner
@ 2011-08-23  8:56   ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  8:56 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, linux-mm, khlebnikov

From: Dave Chinner <dchinner@redhat.com>

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/dcache.c        |  142 +++++++++++++++++++++-------------------------------
 fs/super.c         |   10 ++--
 include/linux/fs.h |   14 +++--
 3 files changed, 71 insertions(+), 95 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 79bf47c..382cd27 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -36,6 +36,7 @@
 #include <linux/bit_spinlock.h>
 #include <linux/rculist_bl.h>
 #include <linux/prefetch.h>
+#include <linux/list_lru.h>
 #include "internal.h"
 
 /*
@@ -239,42 +240,17 @@ static void dentry_unlink_inode(struct dentry * dentry)
  */
 static void dentry_lru_add(struct dentry *dentry)
 {
-	if (list_empty(&dentry->d_lru)) {
-		spin_lock(&dentry->d_sb->s_dentry_lru_lock);
-		list_add(&dentry->d_lru, &dentry->d_sb->s_dentry_lru);
-		dentry->d_sb->s_nr_dentry_unused++;
+	if (list_empty(&dentry->d_lru) &&
+	    list_lru_add(&dentry->d_sb->s_dentry_lru, &dentry->d_lru))
 		this_cpu_inc(nr_dentry_unused);
-		spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
-	}
 }
 
-static void __dentry_lru_del(struct dentry *dentry)
-{
-	list_del_init(&dentry->d_lru);
-	dentry->d_sb->s_nr_dentry_unused--;
-	this_cpu_dec(nr_dentry_unused);
-}
 
 static void dentry_lru_del(struct dentry *dentry)
 {
-	if (!list_empty(&dentry->d_lru)) {
-		spin_lock(&dentry->d_sb->s_dentry_lru_lock);
-		__dentry_lru_del(dentry);
-		spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
-	}
-}
-
-static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list)
-{
-	spin_lock(&dentry->d_sb->s_dentry_lru_lock);
-	if (list_empty(&dentry->d_lru)) {
-		list_add_tail(&dentry->d_lru, list);
-	} else {
-		list_move_tail(&dentry->d_lru, list);
-		dentry->d_sb->s_nr_dentry_unused--;
+	if (!list_empty(&dentry->d_lru) &&
+	    list_lru_del(&dentry->d_sb->s_dentry_lru, &dentry->d_lru))
 		this_cpu_dec(nr_dentry_unused);
-	}
-	spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
 }
 
 /**
@@ -756,6 +732,50 @@ static void shrink_dentry_list(struct list_head *list)
 	rcu_read_unlock();
 }
 
+static int dentry_lru_isolate(struct list_head *item, spinlock_t *lru_lock,
+				void *arg)
+{
+	struct list_head *freeable = arg;
+	struct dentry	*dentry = container_of(item, struct dentry, d_lru);
+
+
+	/*
+	 * we are inverting the lru lock/dentry->d_lock here,
+	 * so use a trylock. If we fail to get the lock, just skip
+	 * it
+	 */
+	if (!spin_trylock(&dentry->d_lock))
+		return 2;
+
+	/*
+	 * Dentries with a non-zero d_count are actively referenced and
+	 * still in use, so just remove them from the LRU. Dentries with
+	 * DCACHE_REFERENCED set get another pass through the LRU.
+	 */
+	if (dentry->d_count) {
+		list_del_init(&dentry->d_lru);
+		spin_unlock(&dentry->d_lock);
+		return 0;
+	}
+
+	if (dentry->d_flags & DCACHE_REFERENCED) {
+		dentry->d_flags &= ~DCACHE_REFERENCED;
+		spin_unlock(&dentry->d_lock);
+
+		/*
+		 * XXX: this list move should be done under d_lock. Need to
+		 * determine if it is safe just to do it under the lru lock.
+		 */
+		return 1;
+	}
+
+	list_move_tail(&dentry->d_lru, freeable);
+	this_cpu_dec(nr_dentry_unused);
+	spin_unlock(&dentry->d_lock);
+
+	return 0;
+}
+
 /**
  * prune_dcache_sb - shrink the dcache
  * @sb: superblock
@@ -770,44 +790,13 @@ static void shrink_dentry_list(struct list_head *list)
  */
 long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
 {
-	struct dentry *dentry;
-	LIST_HEAD(referenced);
-	LIST_HEAD(tmp);
+	LIST_HEAD(dispose);
 	long freed = 0;
 
-relock:
-	spin_lock(&sb->s_dentry_lru_lock);
-	while (!list_empty(&sb->s_dentry_lru)) {
-		dentry = list_entry(sb->s_dentry_lru.prev,
-				struct dentry, d_lru);
-		BUG_ON(dentry->d_sb != sb);
-
-		if (!spin_trylock(&dentry->d_lock)) {
-			spin_unlock(&sb->s_dentry_lru_lock);
-			cpu_relax();
-			goto relock;
-		}
+	freed = list_lru_walk(&sb->s_dentry_lru, dentry_lru_isolate,
+			      &dispose, nr_to_scan);
 
-		if (dentry->d_flags & DCACHE_REFERENCED) {
-			dentry->d_flags &= ~DCACHE_REFERENCED;
-			list_move(&dentry->d_lru, &referenced);
-			spin_unlock(&dentry->d_lock);
-		} else {
-			list_move_tail(&dentry->d_lru, &tmp);
-			this_cpu_dec(nr_dentry_unused);
-			sb->s_nr_dentry_unused--;
-			spin_unlock(&dentry->d_lock);
-			freed++;
-			if (!--nr_to_scan)
-				break;
-		}
-		cond_resched_lock(&sb->s_dentry_lru_lock);
-	}
-	if (!list_empty(&referenced))
-		list_splice(&referenced, &sb->s_dentry_lru);
-	spin_unlock(&sb->s_dentry_lru_lock);
-
-	shrink_dentry_list(&tmp);
+	shrink_dentry_list(&dispose);
 	return freed;
 }
 
@@ -820,24 +809,10 @@ relock:
  */
 void shrink_dcache_sb(struct super_block *sb)
 {
-	LIST_HEAD(tmp);
-
-	spin_lock(&sb->s_dentry_lru_lock);
-	while (!list_empty(&sb->s_dentry_lru)) {
-		list_splice_init(&sb->s_dentry_lru, &tmp);
+	long freed;
 
-		/*
-		 * account for removal here so we don't need to handle it later
-		 * even though the dentry is no longer on the lru list.
-		 */
-		this_cpu_sub(nr_dentry_unused, sb->s_nr_dentry_unused);
-		sb->s_nr_dentry_unused = 0;
-
-		spin_unlock(&sb->s_dentry_lru_lock);
-		shrink_dentry_list(&tmp);
-		spin_lock(&sb->s_dentry_lru_lock);
-	}
-	spin_unlock(&sb->s_dentry_lru_lock);
+	freed = list_lru_dispose_all(&sb->s_dentry_lru, shrink_dentry_list);
+	this_cpu_sub(nr_dentry_unused, freed);
 }
 EXPORT_SYMBOL(shrink_dcache_sb);
 
@@ -1094,11 +1069,10 @@ resume:
 		/*
 		 * move only zero ref count dentries to the dispose list.
 		 */
+		dentry_lru_del(dentry);
 		if (!dentry->d_count) {
-			dentry_lru_move_list(dentry, dispose);
+			list_add_tail(&dentry->d_lru, dispose);
 			found++;
-		} else {
-			dentry_lru_del(dentry);
 		}
 
 		/*
diff --git a/fs/super.c b/fs/super.c
index bc0e101..cc96fdf 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -70,10 +70,11 @@ static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
 		fs_objects = sb->s_op->nr_cached_objects(sb);
 
 	inodes = list_lru_count(&sb->s_inode_lru);
-	total_objects = sb->s_nr_dentry_unused + inodes + fs_objects + 1;
+	dentries = list_lru_count(&sb->s_dentry_lru);
+	total_objects = dentries + inodes + fs_objects + 1;
 
 	/* proportion the scan between the caches */
-	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
+	dentries = (sc->nr_to_scan * dentries) / total_objects;
 	inodes = (sc->nr_to_scan * inodes) / total_objects;
 
 	/*
@@ -105,7 +106,7 @@ static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc
 	if (sb->s_op && sb->s_op->nr_cached_objects)
 		total_objects = sb->s_op->nr_cached_objects(sb);
 
-	total_objects += sb->s_nr_dentry_unused;
+	total_objects += list_lru_count(&sb->s_dentry_lru);
 	total_objects += list_lru_count(&sb->s_inode_lru);
 
 	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
@@ -151,8 +152,7 @@ static struct super_block *alloc_super(struct file_system_type *type)
 		INIT_LIST_HEAD(&s->s_instances);
 		INIT_HLIST_BL_HEAD(&s->s_anon);
 		INIT_LIST_HEAD(&s->s_inodes);
-		INIT_LIST_HEAD(&s->s_dentry_lru);
-		spin_lock_init(&s->s_dentry_lru_lock);
+		list_lru_init(&s->s_dentry_lru);
 		list_lru_init(&s->s_inode_lru);
 		init_rwsem(&s->s_umount);
 		mutex_init(&s->s_lock);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 80beb62..fd458f9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1411,12 +1411,6 @@ struct super_block {
 #else
 	struct list_head	s_files;
 #endif
-	spinlock_t		s_dentry_lru_lock ____cacheline_aligned_in_smp;
-	struct list_head	s_dentry_lru;	/* unused dentry lru */
-	int			s_nr_dentry_unused; /* # of dentries on lru */
-
-	struct list_lru		s_inode_lru ____cacheline_aligned_in_smp;
-
 	struct block_device	*s_bdev;
 	struct backing_dev_info *s_bdi;
 	struct mtd_info		*s_mtd;
@@ -1461,6 +1455,14 @@ struct super_block {
 	int cleancache_poolid;
 
 	struct shrinker s_shrink;	/* per-sb shrinker handle */
+
+	/*
+	 * keep the lru lists last in the structure so they always sit on their
+	 * own individual cachelines.
+	 */
+	struct list_lru		s_dentry_lru ____cacheline_aligned_in_smp;
+	struct list_lru		s_inode_lru ____cacheline_aligned_in_smp;
+
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 74+ messages in thread
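
shrink_dcache_sb() above relies on list_lru_dispose_all(), which is
not shown in this patch. Below is a plausible sketch reconstructed
from the call site and from the open-coded loop the patch removes;
the real helper lives in the list_lru patch earlier in the series:

	/* editorial sketch, reconstructed from the call site */
	static long list_lru_dispose_all(struct list_lru *lru,
					 void (*dispose)(struct list_head *))
	{
		long disposed = 0;
		LIST_HEAD(tmp);

		spin_lock(&lru->lock);
		while (!list_empty(&lru->list)) {
			/* take everything off the LRU and account for it
			 * now, mirroring the old shrink_dcache_sb() loop */
			list_splice_init(&lru->list, &tmp);
			disposed += lru->nr_items;
			lru->nr_items = 0;
			spin_unlock(&lru->lock);

			dispose(&tmp);		/* drains @tmp */

			spin_lock(&lru->lock);
		}
		spin_unlock(&lru->lock);
		return disposed;
	}

Returning the number disposed is what lets the caller fix up the
percpu counter in one go with this_cpu_sub(nr_dentry_unused, freed).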


* Re: [PATCH 01/13] fs: Use a common define for inode slab caches
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:13     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:13 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 06:56:14PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> All inode slab cache initialisation calls need to use specific flags
> so that certain core functionality works correctly (e.g. reclaimable
> memory accounting). Some of these flags are used inconsistently
> across different filesystems, so inode cache slab behaviour can vary
> according to filesystem type.
> 
> Wrap all the SLAB_* flags relevant to inode caches up into a single
> SLAB_INODES flag and convert all the inode caches to use the new
> flag.

Why do we keep the SLAB_HWCACHE_ALIGN flag for some filesystems?


^ permalink raw reply	[flat|nested] 74+ messages in thread
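
The define under discussion wraps the kmem_cache_create() flags every
inode cache needs. The actual flag set is not quoted in this reply, so
the sketch below is an assumption - SLAB_RECLAIM_ACCOUNT is the flag
that drives the reclaimable memory accounting the commit message
mentions, and SLAB_HWCACHE_ALIGN is deliberately left out of the
define, which is what prompts the question above:

	/* hypothetical definition - the real one is in patch 01 */
	#define SLAB_INODES	(SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | \
				 SLAB_PANIC)

	/* a filesystem that also wants cacheline alignment would do
	 * (foo_* names are hypothetical): */
	foo_inode_cachep = kmem_cache_create("foo_inode_cache",
					sizeof(struct foo_inode_info), 0,
					SLAB_INODES | SLAB_HWCACHE_ALIGN,
					foo_init_once);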


* Re: [PATCH 02/13] dcache: convert dentry_stat.nr_unused to per-cpu counters
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:13     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:13 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 06:56:15PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Before we split up the dcache_lru_lock, the unused dentry counter
> needs to be made independent of the global dcache_lru_lock. Convert
> it to per-cpu counters to do this.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

Looks good (been there, done that..)

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 74+ messages in thread


* Re: [PATCH 04/13] mm: new shrinker API
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:15     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:15 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

>  /*
>   * A callback you can register to apply pressure to ageable caches.

It's much more than just a single callback these days.

> + * @scan_objects will be made from the current reclaim context.
>   */
>  struct shrinker {
>  	int (*shrink)(struct shrinker *, struct shrink_control *sc);
> +	long (*count_objects)(struct shrinker *, struct shrink_control *sc);
> +	long (*scan_objects)(struct shrinker *, struct shrink_control *sc);

Is shrink_object really such a good name for this method?
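
A minimal sketch of how a cache wires into the proposed two-method API;
the foo_* helpers are invented for illustration:

static long foo_count_objects(struct shrinker *shrink,
			      struct shrink_control *sc)
{
	/* cheap: report how much could be freed, do no reclaim work */
	return foo_nr_cached_objects();
}

static long foo_scan_objects(struct shrinker *shrink,
			     struct shrink_control *sc)
{
	/* do the work and report how many objects were actually freed */
	return foo_free_cached_objects(sc->nr_to_scan);
}

static struct shrinker foo_shrinker = {
	.count_objects	= foo_count_objects,
	.scan_objects	= foo_scan_objects,
	.seeks		= DEFAULT_SEEKS,
};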


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 05/13] mm: convert shrinkers to use new API
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:17     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 06:56:18PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Modify shrink_slab() to use the new .count_objects/.scan_objects API
> and implement the callouts for all the existing shrinkers.

The split between this and the previous patch isn't obvious to me.
Either we do the whole switchover in one patch, or add the new
API fully in a first patch and convert the shrinkers one by one.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 01/13] fs: Use a common define for inode slab caches
  2011-08-23  9:13     ` Christoph Hellwig
@ 2011-08-23  9:20       ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  9:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 05:13:07AM -0400, Christoph Hellwig wrote:
> On Tue, Aug 23, 2011 at 06:56:14PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > All inode slab cache initialisation calls need to use specific flags
> > so that certain core functionality works correctly (e.g. reclaimable
> > memory accounting). Some of these flags are used inconsistently
> > across different filesystems, so inode cache slab behaviour can vary
> > according to filesystem type.
> > 
> > Wrap all the SLAB_* flags relevant to inode caches up into a single
> > SLAB_INODES flag and convert all the inode caches to use the new
> > flag.
> 
> Why do we keep the SLAB_HWCACHE_ALIGN flag for some filesystems?

I didn't touch that one, mainly because I think that there are
different reasons for wanting cacheline alignment, e.g. a filesystem
aimed primarily at embedded systems with slow CPUs and little memory
doesn't want to waste memory on cacheline alignment....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 08/13] list: add a new LRU list type
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:20     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:20 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 06:56:21PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Several subsystems use the same construct for LRU lists - a list
> head, a spin lock and and item count. They also use exactly the same
> code for adding and removing items from the LRU. Create a generic
> type for these LRU lists.
> 
> This is the beginning of generic, node aware LRUs for shrinkers to
> work with.

Why list_lru vs the more natural sounding lru_list?

> diff --git a/lib/Makefile b/lib/Makefile
> index d5d175c..a08212f 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -12,7 +12,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
>  	 idr.o int_sqrt.o extable.o prio_tree.o \
>  	 sha1.o md5.o irq_regs.o reciprocal_div.o argv_split.o \
>  	 proportions.o prio_heap.o ratelimit.o show_mem.o \
> -	 is_single_threaded.o plist.o decompress.o find_next_bit.o
> +	 is_single_threaded.o plist.o decompress.o find_next_bit.o \
> +	 list_lru.o

Did we finally fix the issues with lib-y objects being discarded despite
modules relying on the exports?

> +int
> +list_lru_add(
> +	struct list_lru	*lru,
> +	struct list_head *item)
> +{

What about some kerneldoc comments for the helpers?

> +		ret = isolate(item, &lru->lock, cb_arg);
> +		switch (ret) {
> +		case 0:	/* item removed from list */
> +			lru->nr_items--;
> +			removed++;
> +			break;
> +		case 1: /* item referenced, give another pass */
> +			list_move_tail(item, &lru->list);
> +			break;
> +		case 2: /* item cannot be locked, skip */
> +			break;
> +		case 3: /* item not freeable, lock dropped */
> +			goto restart;

I think the isolate callback returns should have symbolic names, e.g.
an enum lru_isolate or similar.
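
One possible shape for that enum, mirroring the four cases in the switch
above (the names are suggestions only):

enum lru_isolate {
	LRU_REMOVED,	/* item removed from list */
	LRU_ROTATE,	/* item referenced, give another pass */
	LRU_SKIP,	/* item cannot be locked, skip */
	LRU_RETRY,	/* item not freeable, lock dropped, restart scan */
};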

> +int
> +list_lru_init(
> +	struct list_lru	*lru)
> +{
> +	spin_lock_init(&lru->lock);
> +	INIT_LIST_HEAD(&lru->list);
> +	lru->nr_items = 0;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(list_lru_init);

This one doesn't need a return value.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/13] mm: new shrinker API
  2011-08-23  9:15     ` Christoph Hellwig
@ 2011-08-23  9:23       ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  9:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 05:15:29AM -0400, Christoph Hellwig wrote:
> >  /*
> >   * A callback you can register to apply pressure to ageable caches.
> 
> It's much more than just a single callback these days.
> 
> > + * @scan_objects will be made from the current reclaim context.
> >   */
> >  struct shrinker {
> >  	int (*shrink)(struct shrinker *, struct shrink_control *sc);
> > +	long (*count_objects)(struct shrinker *, struct shrink_control *sc);
> > +	long (*scan_objects)(struct shrinker *, struct shrink_control *sc);
> 
> Is shrink_object really such a good name for this method?

Apart from the fact it is called "scan_objects", I'm open to more
appropriate names. I called it "scan_objects" because we are asking
to scan (rather than free) a specific number of objects on the LRU,
and it matches the "sc->nr_to_scan" control field.
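
That is, a scan implementation examines up to nr_to_scan objects but may
free fewer; a sketch with invented helpers:

static long foo_scan_objects(struct shrinker *shrink,
			     struct shrink_control *sc)
{
	long scanned, freed = 0;

	for (scanned = 0; scanned < sc->nr_to_scan; scanned++) {
		if (foo_lru_empty())
			break;
		if (foo_try_reclaim_one())	/* may skip busy objects */
			freed++;
	}
	return freed;	/* freed <= sc->nr_to_scan */
}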

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 08/13] list: add a new LRU list type
  2011-08-23  9:20     ` Christoph Hellwig
@ 2011-08-23  9:32       ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  9:32 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 05:20:56AM -0400, Christoph Hellwig wrote:
> On Tue, Aug 23, 2011 at 06:56:21PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Several subsystems use the same construct for LRU lists - a list
> > head, a spin lock and and item count. They also use exactly the same
> > code for adding and removing items from the LRU. Create a generic
> > type for these LRU lists.
> > 
> > This is the beginning of generic, node aware LRUs for shrinkers to
> > work with.
> 
> Why list_lru vs the more natural sounding lru_list?

because mmzone.h already claimed that namespace:

enum lru_list {
        LRU_INACTIVE_ANON = LRU_BASE,
        LRU_ACTIVE_ANON = LRU_BASE + LRU_ACTIVE,
        LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,
        LRU_ACTIVE_FILE = LRU_BASE + LRU_FILE + LRU_ACTIVE,
        LRU_UNEVICTABLE,
        NR_LRU_LISTS
};

and it is widely spewed through the mm code. I didn't really feel
like having to clean that mess up first....

> > diff --git a/lib/Makefile b/lib/Makefile
> > index d5d175c..a08212f 100644
> > --- a/lib/Makefile
> > +++ b/lib/Makefile
> > @@ -12,7 +12,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
> >  	 idr.o int_sqrt.o extable.o prio_tree.o \
> >  	 sha1.o md5.o irq_regs.o reciprocal_div.o argv_split.o \
> >  	 proportions.o prio_heap.o ratelimit.o show_mem.o \
> > -	 is_single_threaded.o plist.o decompress.o find_next_bit.o
> > +	 is_single_threaded.o plist.o decompress.o find_next_bit.o \
> > +	 list_lru.o
> 
> Di we finally fix the issues with lib-y objects beeing discarded despite
> modules relying on the exports?

Don't care. The list_lru code is used by the VFS, so it will always
be built in....

> > +int
> > +list_lru_add(
> > +	struct list_lru	*lru,
> > +	struct list_head *item)
> > +{
> 
> What about some kerneldoc comments for the helpers?

Yup, to be done.
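
For illustration, a kerneldoc header for list_lru_add might read like
this; the wording and the return convention are assumptions:

/**
 * list_lru_add - add an item to an LRU list
 * @lru:	the list_lru to add @item to
 * @item:	the item to add
 *
 * Returns 1 if @item was added to the LRU, 0 if it was already present.
 */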

> 
> > +		ret = isolate(item, &lru->lock, cb_arg);
> > +		switch (ret) {
> > +		case 0:	/* item removed from list */
> > +			lru->nr_items--;
> > +			removed++;
> > +			break;
> > +		case 1: /* item referenced, give another pass */
> > +			list_move_tail(item, &lru->list);
> > +			break;
> > +		case 2: /* item cannot be locked, skip */
> > +			break;
> > +		case 3: /* item not freeable, lock dropped */
> > +			goto restart;
> 
> I think the isolate callback returns shoud have symbolic names, i.e.
> and enum lru_isolate or similar.

Will do.

> 
> > +int
> > +list_lru_init(
> > +	struct list_lru	*lru)
> > +{
> > +	spin_lock_init(&lru->lock);
> > +	INIT_LIST_HEAD(&lru->list);
> > +	lru->nr_items = 0;
> > +
> > +	return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(list_lru_init);
> 
> This one doesn't need a return value.

No, not yet. I'll kill it.
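
With the return value gone, the init helper reduces to this trivial
sketch of the agreed change:

void
list_lru_init(
	struct list_lru	*lru)
{
	spin_lock_init(&lru->lock);
	INIT_LIST_HEAD(&lru->list);
	lru->nr_items = 0;
}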

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 12/13] dcache: remove dentries from LRU before putting on dispose list
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:35     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

> diff --git a/fs/dcache.c b/fs/dcache.c
> index b931415..79bf47c 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -269,10 +269,10 @@ static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list)
>  	spin_lock(&dentry->d_sb->s_dentry_lru_lock);
>  	if (list_empty(&dentry->d_lru)) {
>  		list_add_tail(&dentry->d_lru, list);
> -		dentry->d_sb->s_nr_dentry_unused++;
> -		this_cpu_inc(nr_dentry_unused);
>  	} else {
>  		list_move_tail(&dentry->d_lru, list);
> +		dentry->d_sb->s_nr_dentry_unused--;
> +		this_cpu_dec(nr_dentry_unused);
>  	}
>  	spin_unlock(&dentry->d_sb->s_dentry_lru_lock);

I suspect at this point it might be more obvious to simply remove
dentry_lru_move_list.  Just call dentry_lru_del to remove it from the
lru, and then we can add it to the local dispose list without the need
of any locking, similar to how it is done for inodes already.
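
Roughly the shape being suggested here, assuming dentry_lru_del fixes up
the LRU accounting (sketch only):

LIST_HEAD(dispose);

spin_lock(&dentry->d_lock);
dentry_lru_del(dentry);			/* off the LRU, counters corrected */
list_add(&dentry->d_lru, &dispose);	/* private list, no LRU lock needed */
spin_unlock(&dentry->d_lock);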

>  		if (dentry->d_count) {
> -			dentry_lru_del(dentry);
>  			spin_unlock(&dentry->d_lock);
>  			continue;
>  		}
> @@ -789,6 +794,8 @@ relock:
>  			spin_unlock(&dentry->d_lock);
>  		} else {
>  			list_move_tail(&dentry->d_lru, &tmp);
> +			this_cpu_dec(nr_dentry_unused);
> +			sb->s_nr_dentry_unused--;

It might be more obvious to use __dentry_lru_del + an opencoded list_add
here.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 05/13] mm: convert shrinkers to use new API
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:35     ` Steven Whitehouse
  -1 siblings, 0 replies; 74+ messages in thread
From: Steven Whitehouse @ 2011-08-23  9:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

Hi,

On Tue, 2011-08-23 at 18:56 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Modify shrink_slab() to use the new .count_objects/.scan_objects API
> and implement the callouts for all the existing shrinkers.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

GFS2 bits:
Acked-by: Steven Whitehouse <swhiteho@redhat.com>

Looks good to me,

Steve.
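
One detail worth noting in the super_cache_scan() hunk quoted below is
the proportional split of sc->nr_to_scan between the caches; a worked
example with invented numbers:

/*
 * nr_to_scan = 128, s_nr_dentry_unused = 600,
 * s_nr_inodes_unused = 300, fs_objects = 100
 *	=> total_objects = 600 + 300 + 100 + 1 = 1001
 *
 * dentries = 128 * 600 / 1001 = 76
 * inodes   = 128 * 300 / 1001 = 38
 * fs       = 128 * 100 / 1001 = 12	(integer division, remainder dropped)
 */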

> ---
>  Documentation/filesystems/vfs.txt    |   11 +++--
>  arch/x86/kvm/mmu.c                   |   16 ++++---
>  drivers/gpu/drm/i915/i915_dma.c      |    4 +-
>  drivers/gpu/drm/i915/i915_gem.c      |   49 ++++++++++++++---------
>  drivers/gpu/drm/ttm/ttm_page_alloc.c |   14 ++++--
>  drivers/staging/zcache/zcache-main.c |   45 ++++++++++++---------
>  fs/cifs/cifsacl.c                    |   57 +++++++++++++++++----------
>  fs/dcache.c                          |   15 ++++---
>  fs/gfs2/glock.c                      |   24 +++++++-----
>  fs/gfs2/main.c                       |    3 +-
>  fs/gfs2/quota.c                      |   19 +++++----
>  fs/gfs2/quota.h                      |    4 +-
>  fs/inode.c                           |    7 ++-
>  fs/internal.h                        |    3 +
>  fs/mbcache.c                         |   37 +++++++++++------
>  fs/nfs/dir.c                         |   17 ++++++--
>  fs/nfs/internal.h                    |    6 ++-
>  fs/nfs/super.c                       |    3 +-
>  fs/quota/dquot.c                     |   39 +++++++++----------
>  fs/super.c                           |   71 ++++++++++++++++++++--------------
>  fs/ubifs/shrinker.c                  |   19 +++++----
>  fs/ubifs/super.c                     |    3 +-
>  fs/ubifs/ubifs.h                     |    3 +-
>  fs/xfs/xfs_buf.c                     |   19 ++++++++-
>  fs/xfs/xfs_qm.c                      |   22 +++++++---
>  fs/xfs/xfs_super.c                   |    8 ++--
>  fs/xfs/xfs_sync.c                    |   17 +++++---
>  fs/xfs/xfs_sync.h                    |    4 +-
>  include/linux/fs.h                   |    8 +---
>  include/trace/events/vmscan.h        |   12 +++---
>  mm/vmscan.c                          |   46 +++++++++-------------
>  net/sunrpc/auth.c                    |   21 +++++++---
>  32 files changed, 369 insertions(+), 257 deletions(-)
> 
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index 52d8fb8..4ca3c2d 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -229,8 +229,8 @@ struct super_operations {
>  
>          ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
>          ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
> -	int (*nr_cached_objects)(struct super_block *);
> -	void (*free_cached_objects)(struct super_block *, int);
> +	long (*nr_cached_objects)(struct super_block *);
> +	long (*free_cached_objects)(struct super_block *, long);
>  };
>  
>  All methods are called without any locks being held, unless otherwise
> @@ -313,9 +313,10 @@ or bottom half).
>  	implement ->nr_cached_objects for it to be called correctly.
>  
>  	We can't do anything with any errors that the filesystem might
> -	encountered, hence the void return type. This will never be called if
> -	the VM is trying to reclaim under GFP_NOFS conditions, hence this
> -	method does not need to handle that situation itself.
> +	encountered, so the return value is the number of objects freed. This
> +	will never be called if the VM is trying to reclaim under GFP_NOFS
> +	conditions, hence this method does not need to handle that situation
> +	itself.
>  
>  	Implementations must include conditional reschedule calls inside any
>  	scanning loop that is done. This allows the VFS to determine
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 1c5b693..939e201 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3858,14 +3858,12 @@ static int kvm_mmu_remove_some_alloc_mmu_pages(struct kvm *kvm,
>  	return kvm_mmu_prepare_zap_page(kvm, page, invalid_list);
>  }
>  
> -static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
> +static long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct kvm *kvm;
>  	struct kvm *kvm_freed = NULL;
>  	int nr_to_scan = sc->nr_to_scan;
> -
> -	if (nr_to_scan == 0)
> -		goto out;
> +	long freed_pages = 0;
>  
>  	raw_spin_lock(&kvm_lock);
>  
> @@ -3877,7 +3875,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  		spin_lock(&kvm->mmu_lock);
>  		if (!kvm_freed && nr_to_scan > 0 &&
>  		    kvm->arch.n_used_mmu_pages > 0) {
> -			freed_pages = kvm_mmu_remove_some_alloc_mmu_pages(kvm,
> +			freed_pages += kvm_mmu_remove_some_alloc_mmu_pages(kvm,
>  							  &invalid_list);
>  			kvm_freed = kvm;
>  		}
> @@ -3891,13 +3889,17 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  		list_move_tail(&kvm_freed->vm_list, &vm_list);
>  
>  	raw_spin_unlock(&kvm_lock);
> +	return freed_pages;
> +}
>  
> -out:
> +static long mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
>  	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
>  }
>  
>  static struct shrinker mmu_shrinker = {
> -	.shrink = mmu_shrink,
> +	.scan_objects = mmu_shrink_scan,
> +	.count_objects = mmu_shrink_count,
>  	.seeks = DEFAULT_SEEKS * 10,
>  };
>  
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 8a3942c..734ea5e 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -2074,7 +2074,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	return 0;
>  
>  out_gem_unload:
> -	if (dev_priv->mm.inactive_shrinker.shrink)
> +	if (dev_priv->mm.inactive_shrinker.scan_objects)
>  		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
>  
>  	if (dev->pdev->msi_enabled)
> @@ -2108,7 +2108,7 @@ int i915_driver_unload(struct drm_device *dev)
>  	i915_mch_dev = NULL;
>  	spin_unlock(&mchdev_lock);
>  
> -	if (dev_priv->mm.inactive_shrinker.shrink)
> +	if (dev_priv->mm.inactive_shrinker.scan_objects)
>  		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
>  
>  	mutex_lock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a546a71..0647a33 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -56,7 +56,9 @@ static int i915_gem_phys_pwrite(struct drm_device *dev,
>  				struct drm_file *file);
>  static void i915_gem_free_object_tail(struct drm_i915_gem_object *obj);
>  
> -static int i915_gem_inactive_shrink(struct shrinker *shrinker,
> +static long i915_gem_inactive_scan(struct shrinker *shrinker,
> +				   struct shrink_control *sc);
> +static long i915_gem_inactive_count(struct shrinker *shrinker,
>  				    struct shrink_control *sc);
>  
>  /* some bookkeeping */
> @@ -3999,7 +4001,8 @@ i915_gem_load(struct drm_device *dev)
>  
>  	dev_priv->mm.interruptible = true;
>  
> -	dev_priv->mm.inactive_shrinker.shrink = i915_gem_inactive_shrink;
> +	dev_priv->mm.inactive_shrinker.scan_objects = i915_gem_inactive_scan;
> +	dev_priv->mm.inactive_shrinker.count_objects = i915_gem_inactive_count;
>  	dev_priv->mm.inactive_shrinker.seeks = DEFAULT_SEEKS;
>  	register_shrinker(&dev_priv->mm.inactive_shrinker);
>  }
> @@ -4221,8 +4224,8 @@ i915_gpu_is_active(struct drm_device *dev)
>  	return !lists_empty;
>  }
>  
> -static int
> -i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> +static long
> +i915_gem_inactive_scan(struct shrinker *shrinker, struct shrink_control *sc)
>  {
>  	struct drm_i915_private *dev_priv =
>  		container_of(shrinker,
> @@ -4231,22 +4234,10 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  	struct drm_device *dev = dev_priv->dev;
>  	struct drm_i915_gem_object *obj, *next;
>  	int nr_to_scan = sc->nr_to_scan;
> -	int cnt;
>  
>  	if (!mutex_trylock(&dev->struct_mutex))
>  		return 0;
>  
> -	/* "fast-path" to count number of available objects */
> -	if (nr_to_scan == 0) {
> -		cnt = 0;
> -		list_for_each_entry(obj,
> -				    &dev_priv->mm.inactive_list,
> -				    mm_list)
> -			cnt++;
> -		mutex_unlock(&dev->struct_mutex);
> -		return cnt / 100 * sysctl_vfs_cache_pressure;
> -	}
> -
>  rescan:
>  	/* first scan for clean buffers */
>  	i915_gem_retire_requests(dev);
> @@ -4262,15 +4253,12 @@ rescan:
>  	}
>  
>  	/* second pass, evict/count anything still on the inactive list */
> -	cnt = 0;
>  	list_for_each_entry_safe(obj, next,
>  				 &dev_priv->mm.inactive_list,
>  				 mm_list) {
>  		if (nr_to_scan &&
>  		    i915_gem_object_unbind(obj) == 0)
>  			nr_to_scan--;
> -		else
> -			cnt++;
>  	}
>  
>  	if (nr_to_scan && i915_gpu_is_active(dev)) {
> @@ -4284,5 +4272,26 @@ rescan:
>  			goto rescan;
>  	}
>  	mutex_unlock(&dev->struct_mutex);
> -	return cnt / 100 * sysctl_vfs_cache_pressure;
> +	return sc->nr_to_scan - nr_to_scan;
> +}
> +
> +static long
> +i915_gem_inactive_count(struct shrinker *shrinker, struct shrink_control *sc)
> +{
> +	struct drm_i915_private *dev_priv =
> +		container_of(shrinker,
> +			     struct drm_i915_private,
> +			     mm.inactive_shrinker);
> +	struct drm_device *dev = dev_priv->dev;
> +	struct drm_i915_gem_object *obj;
> +	long count = 0;
> +
> +	if (!mutex_trylock(&dev->struct_mutex))
> +		return 0;
> +
> +	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list)
> +		count++;
> +
> +	mutex_unlock(&dev->struct_mutex);
> +	return count;
>  }
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index 727e93d..3e71c68 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -395,14 +395,13 @@ static int ttm_pool_get_num_unused_pages(void)
>  /**
>   * Callback for mm to request pool to reduce number of page held.
>   */
> -static int ttm_pool_mm_shrink(struct shrinker *shrink,
> -			      struct shrink_control *sc)
> +static long ttm_pool_mm_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	static atomic_t start_pool = ATOMIC_INIT(0);
>  	unsigned i;
>  	unsigned pool_offset = atomic_add_return(1, &start_pool);
>  	struct ttm_page_pool *pool;
> -	int shrink_pages = sc->nr_to_scan;
> +	long shrink_pages = sc->nr_to_scan;
>  
>  	pool_offset = pool_offset % NUM_POOLS;
>  	/* select start pool in round robin fashion */
> @@ -413,13 +412,18 @@ static int ttm_pool_mm_shrink(struct shrinker *shrink,
>  		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
>  		shrink_pages = ttm_page_pool_free(pool, nr_free);
>  	}
> -	/* return estimated number of unused pages in pool */
> +	return sc->nr_to_scan;
> +}
> +
> +static long ttm_pool_mm_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
>  	return ttm_pool_get_num_unused_pages();
>  }
>  
>  static void ttm_pool_mm_shrink_init(struct ttm_pool_manager *manager)
>  {
> -	manager->mm_shrink.shrink = &ttm_pool_mm_shrink;
> +	manager->mm_shrink.scan_objects = ttm_pool_mm_scan;
> +	manager->mm_shrink.count_objects = ttm_pool_mm_count;
>  	manager->mm_shrink.seeks = 1;
>  	register_shrinker(&manager->mm_shrink);
>  }
> diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
> index 855a5bb..3ccb723 100644
> --- a/drivers/staging/zcache/zcache-main.c
> +++ b/drivers/staging/zcache/zcache-main.c
> @@ -493,9 +493,10 @@ static void zbud_evict_zbpg(struct zbud_page *zbpg)
>   * page in use by another cpu, but also to avoid potential deadlock due to
>   * lock inversion.
>   */
> -static void zbud_evict_pages(int nr)
> +static int zbud_evict_pages(int nr)
>  {
>  	struct zbud_page *zbpg;
> +	int freed = 0;
>  	int i;
>  
>  	/* first try freeing any pages on unused list */
> @@ -511,7 +512,7 @@ retry_unused_list:
>  		spin_unlock_bh(&zbpg_unused_list_spinlock);
>  		zcache_free_page(zbpg);
>  		zcache_evicted_raw_pages++;
> -		if (--nr <= 0)
> +		if (++freed >= nr)
>  			goto out;
>  		goto retry_unused_list;
>  	}
> @@ -535,7 +536,7 @@ retry_unbud_list_i:
>  			/* want budlists unlocked when doing zbpg eviction */
>  			zbud_evict_zbpg(zbpg);
>  			local_bh_enable();
> -			if (--nr <= 0)
> +			if (++freed >= nr)
>  				goto out;
>  			goto retry_unbud_list_i;
>  		}
> @@ -559,13 +560,13 @@ retry_bud_list:
>  		/* want budlists unlocked when doing zbpg eviction */
>  		zbud_evict_zbpg(zbpg);
>  		local_bh_enable();
> -		if (--nr <= 0)
> +		if (++freed >= nr)
>  			goto out;
>  		goto retry_bud_list;
>  	}
>  	spin_unlock_bh(&zbud_budlists_spinlock);
>  out:
> -	return;
> +	return freed;
>  }
>  
>  static void zbud_init(void)
> @@ -1496,30 +1497,34 @@ static bool zcache_freeze;
>  /*
>   * zcache shrinker interface (only useful for ephemeral pages, so zbud only)
>   */
> -static int shrink_zcache_memory(struct shrinker *shrink,
> -				struct shrink_control *sc)
> +static long shrink_zcache_scan(struct shrinker *shrink,
> +			       struct shrink_control *sc)
>  {
>  	int ret = -1;
>  	int nr = sc->nr_to_scan;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
> -	if (nr >= 0) {
> -		if (!(gfp_mask & __GFP_FS))
> -			/* does this case really need to be skipped? */
> -			goto out;
> -		if (spin_trylock(&zcache_direct_reclaim_lock)) {
> -			zbud_evict_pages(nr);
> -			spin_unlock(&zcache_direct_reclaim_lock);
> -		} else
> -			zcache_aborted_shrink++;
> -	}
> -	ret = (int)atomic_read(&zcache_zbud_curr_raw_pages);
> -out:
> +	if (!(gfp_mask & __GFP_FS))
> +		return -1;
> +
> +	if (spin_trylock(&zcache_direct_reclaim_lock)) {
> +		ret = zbud_evict_pages(nr);
> +		spin_unlock(&zcache_direct_reclaim_lock);
> +	} else
> +		zcache_aborted_shrink++;
> +
>  	return ret;
>  }
>  
> +static long shrink_zcache_count(struct shrinker *shrink,
> +				struct shrink_control *sc)
> +{
> +	return atomic_read(&zcache_zbud_curr_raw_pages);
> +}
> +
>  static struct shrinker zcache_shrinker = {
> -	.shrink = shrink_zcache_memory,
> +	.scan_objects = shrink_zcache_scan,
> +	.count_objects = shrink_zcache_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
> index d0f59fa..508a684 100644
> --- a/fs/cifs/cifsacl.c
> +++ b/fs/cifs/cifsacl.c
> @@ -44,58 +44,73 @@ static const struct cifs_sid sid_user = {1, 2 , {0, 0, 0, 0, 0, 5}, {} };
>  
>  const struct cred *root_cred;
>  
> -static void
> -shrink_idmap_tree(struct rb_root *root, int nr_to_scan, int *nr_rem,
> -			int *nr_del)
> +static long
> +shrink_idmap_tree(struct rb_root *root, int nr_to_scan)
>  {
>  	struct rb_node *node;
>  	struct rb_node *tmp;
>  	struct cifs_sid_id *psidid;
> +	long count = 0;
>  
>  	node = rb_first(root);
>  	while (node) {
>  		tmp = node;
>  		node = rb_next(tmp);
>  		psidid = rb_entry(tmp, struct cifs_sid_id, rbnode);
> -		if (nr_to_scan == 0 || *nr_del == nr_to_scan)
> -			++(*nr_rem);
> -		else {
> -			if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
> -						&& psidid->refcount == 0) {
> -				rb_erase(tmp, root);
> -				++(*nr_del);
> -			} else
> -				++(*nr_rem);
> +		if (nr_to_scan == 0) {
> +			count++;
> +			continue;
> +		}
> +		if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
> +					&& psidid->refcount == 0) {
> +			rb_erase(tmp, root);
> +			if (++count >= nr_to_scan)
> +				break;
>  		}
>  	}
> +	return count;
>  }
>  
>  /*
>   * Run idmap cache shrinker.
>   */
> -static int
> -cifs_idmap_shrinker(struct shrinker *shrink, struct shrink_control *sc)
> +static long
> +cifs_idmap_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
> -	int nr_to_scan = sc->nr_to_scan;
> -	int nr_del = 0;
> -	int nr_rem = 0;
>  	struct rb_root *root;
> +	long freed;
>  
>  	root = &uidtree;
>  	spin_lock(&siduidlock);
> -	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
> +	freed = shrink_idmap_tree(root, sc->nr_to_scan);
>  	spin_unlock(&siduidlock);
>  
>  	root = &gidtree;
>  	spin_lock(&sidgidlock);
> -	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
> +	freed += shrink_idmap_tree(root, sc->nr_to_scan);
>  	spin_unlock(&sidgidlock);
>  
> -	return nr_rem;
> +	return freed;
> +}
> +
> +/*
> + * This still abuses the nr_to_scan == 0 trick to get the common code just to
> + * count objects. There needs to be an external count of the objects in the
> + * caches to avoid this.
> + */
> +static long
> +cifs_idmap_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct shrink_control sc = {
> +		.nr_to_scan = 0,
> +	};
> +
> +	return cifs_idmap_shrinker_scan(shrink, &sc);
>  }
>  
>  static struct shrinker cifs_shrinker = {
> -	.shrink = cifs_idmap_shrinker,
> +	.scan_objects = cifs_idmap_shrinker_scan,
> +	.count_objects = cifs_idmap_shrinker_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5123d71..d19e453 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -759,11 +759,12 @@ static void shrink_dentry_list(struct list_head *list)
>   *
>   * If flags contains DCACHE_REFERENCED reference dentries will not be pruned.
>   */
> -static void __shrink_dcache_sb(struct super_block *sb, int count, int flags)
> +static long __shrink_dcache_sb(struct super_block *sb, long count, int flags)
>  {
>  	struct dentry *dentry;
>  	LIST_HEAD(referenced);
>  	LIST_HEAD(tmp);
> +	long freed = 0;
>  
>  relock:
>  	spin_lock(&sb->s_dentry_lru_lock);
> @@ -791,6 +792,7 @@ relock:
>  		} else {
>  			list_move_tail(&dentry->d_lru, &tmp);
>  			spin_unlock(&dentry->d_lock);
> +			freed++;
>  			if (!--count)
>  				break;
>  		}
> @@ -801,6 +803,7 @@ relock:
>  	spin_unlock(&sb->s_dentry_lru_lock);
>  
>  	shrink_dentry_list(&tmp);
> +	return freed;
>  }
>  
>  /**
> @@ -815,9 +818,9 @@ relock:
>   * This function may fail to free any resources if all the dentries are in
>   * use.
>   */
> -void prune_dcache_sb(struct super_block *sb, int nr_to_scan)
> +long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
>  {
> -	__shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
> +	return __shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
>  }
>  
>  /**
> @@ -1070,12 +1073,12 @@ EXPORT_SYMBOL(have_submounts);
>   * drop the lock and return early due to latency
>   * constraints.
>   */
> -static int select_parent(struct dentry * parent)
> +static long select_parent(struct dentry * parent)
>  {
>  	struct dentry *this_parent;
>  	struct list_head *next;
>  	unsigned seq;
> -	int found = 0;
> +	long found = 0;
>  	int locked = 0;
>  
>  	seq = read_seqbegin(&rename_lock);
> @@ -1163,7 +1166,7 @@ rename_retry:
>  void shrink_dcache_parent(struct dentry * parent)
>  {
>  	struct super_block *sb = parent->d_sb;
> -	int found;
> +	long found;
>  
>  	while ((found = select_parent(parent)) != 0)
>  		__shrink_dcache_sb(sb, found, 0);
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index 88e8a23..f9bc88d 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -1370,24 +1370,21 @@ void gfs2_glock_complete(struct gfs2_glock *gl, int ret)
>  }
>  
> 
> -static int gfs2_shrink_glock_memory(struct shrinker *shrink,
> -				    struct shrink_control *sc)
> +static long gfs2_shrink_glock_scan(struct shrinker *shrink,
> +				   struct shrink_control *sc)
>  {
>  	struct gfs2_glock *gl;
>  	int may_demote;
>  	int nr_skipped = 0;
> -	int nr = sc->nr_to_scan;
> +	int freed = 0;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  	LIST_HEAD(skipped);
>  
> -	if (nr == 0)
> -		goto out;
> -
>  	if (!(gfp_mask & __GFP_FS))
>  		return -1;
>  
>  	spin_lock(&lru_lock);
> -	while(nr && !list_empty(&lru_list)) {
> +	while (freed < sc->nr_to_scan && !list_empty(&lru_list)) {
>  		gl = list_entry(lru_list.next, struct gfs2_glock, gl_lru);
>  		list_del_init(&gl->gl_lru);
>  		clear_bit(GLF_LRU, &gl->gl_flags);
> @@ -1401,7 +1398,7 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
>  			may_demote = demote_ok(gl);
>  			if (may_demote) {
>  				handle_callback(gl, LM_ST_UNLOCKED, 0);
> -				nr--;
> +				freed++;
>  			}
>  			clear_bit(GLF_LOCK, &gl->gl_flags);
>  			smp_mb__after_clear_bit();
> @@ -1418,12 +1415,19 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
>  	list_splice(&skipped, &lru_list);
>  	atomic_add(nr_skipped, &lru_count);
>  	spin_unlock(&lru_lock);
> -out:
> +
> +	return freed;
> +}
> +
> +static long gfs2_shrink_glock_count(struct shrinker *shrink,
> +				    struct shrink_control *sc)
> +{
>  	return (atomic_read(&lru_count) / 100) * sysctl_vfs_cache_pressure;
>  }
>  
>  static struct shrinker glock_shrinker = {
> -	.shrink = gfs2_shrink_glock_memory,
> +	.scan_objects = gfs2_shrink_glock_scan,
> +	.count_objects = gfs2_shrink_glock_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 8ea7747..2c21986 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -29,7 +29,8 @@
>  #include "dir.h"
>  
>  static struct shrinker qd_shrinker = {
> -	.shrink = gfs2_shrink_qd_memory,
> +	.scan_objects = gfs2_shrink_qd_scan,
> +	.count_objects = gfs2_shrink_qd_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index 42e8d23..5a5f76c 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -78,20 +78,17 @@ static LIST_HEAD(qd_lru_list);
>  static atomic_t qd_lru_count = ATOMIC_INIT(0);
>  static DEFINE_SPINLOCK(qd_lru_lock);
>  
> -int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
> +long gfs2_shrink_qd_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct gfs2_quota_data *qd;
>  	struct gfs2_sbd *sdp;
> -	int nr_to_scan = sc->nr_to_scan;
> -
> -	if (nr_to_scan == 0)
> -		goto out;
> +	int freed = 0;
>  
>  	if (!(sc->gfp_mask & __GFP_FS))
>  		return -1;
>  
>  	spin_lock(&qd_lru_lock);
> -	while (nr_to_scan && !list_empty(&qd_lru_list)) {
> +	while (freed <= sc->nr_to_scan && !list_empty(&qd_lru_list)) {
>  		qd = list_entry(qd_lru_list.next,
>  				struct gfs2_quota_data, qd_reclaim);
>  		sdp = qd->qd_gl->gl_sbd;
> @@ -112,12 +109,16 @@ int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
>  		spin_unlock(&qd_lru_lock);
>  		kmem_cache_free(gfs2_quotad_cachep, qd);
>  		spin_lock(&qd_lru_lock);
> -		nr_to_scan--;
> +		freed++;
>  	}
>  	spin_unlock(&qd_lru_lock);
>  
> -out:
> -	return (atomic_read(&qd_lru_count) * sysctl_vfs_cache_pressure) / 100;
> +	return freed;
> +}
> +
> +long gfs2_shrink_qd_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	return (atomic_read(&qd_lru_count) / 100) * sysctl_vfs_cache_pressure;
>  }
>  
>  static u64 qd2offset(struct gfs2_quota_data *qd)
> diff --git a/fs/gfs2/quota.h b/fs/gfs2/quota.h
> index 90bf1c3..c40fe6d 100644
> --- a/fs/gfs2/quota.h
> +++ b/fs/gfs2/quota.h
> @@ -52,7 +52,9 @@ static inline int gfs2_quota_lock_check(struct gfs2_inode *ip)
>  	return ret;
>  }
>  
> -extern int gfs2_shrink_qd_memory(struct shrinker *shrink,
> +extern long gfs2_shrink_qd_scan(struct shrinker *shrink,
> +				struct shrink_control *sc);
> +extern long gfs2_shrink_qd_count(struct shrinker *shrink,
>  				 struct shrink_control *sc);
>  extern const struct quotactl_ops gfs2_quotactl_ops;
>  
> diff --git a/fs/inode.c b/fs/inode.c
> index 848808f..fee5d9a 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -613,10 +613,11 @@ static int can_unuse(struct inode *inode)
>   * LRU does not have strict ordering. Hence we don't want to reclaim inodes
>   * with this flag set because they are the inodes that are out of order.
>   */
> -void prune_icache_sb(struct super_block *sb, int nr_to_scan)
> +long prune_icache_sb(struct super_block *sb, long nr_to_scan)
>  {
>  	LIST_HEAD(freeable);
> -	int nr_scanned;
> +	long nr_scanned;
> +	long freed = 0;
>  	unsigned long reap = 0;
>  
>  	spin_lock(&sb->s_inode_lru_lock);
> @@ -686,6 +687,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
>  		list_move(&inode->i_lru, &freeable);
>  		sb->s_nr_inodes_unused--;
>  		this_cpu_dec(nr_unused);
> +		freed++;
>  	}
>  	if (current_is_kswapd())
>  		__count_vm_events(KSWAPD_INODESTEAL, reap);
> @@ -694,6 +696,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
>  	spin_unlock(&sb->s_inode_lru_lock);
>  
>  	dispose_list(&freeable);
> +	return freed;
>  }
>  
>  static void __wait_on_freeing_inode(struct inode *inode);
> diff --git a/fs/internal.h b/fs/internal.h
> index fe327c2..2662ffa 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -127,6 +127,8 @@ extern long do_handle_open(int mountdirfd,
>   * inode.c
>   */
>  extern spinlock_t inode_sb_list_lock;
> +extern long prune_icache_sb(struct super_block *sb, long nr_to_scan);
> +
>  
>  /*
>   * fs-writeback.c
> @@ -141,3 +143,4 @@ extern int invalidate_inodes(struct super_block *, bool);
>   * dcache.c
>   */
>  extern struct dentry *__d_alloc(struct super_block *, const struct qstr *);
> +extern long prune_dcache_sb(struct super_block *sb, long nr_to_scan);
> diff --git a/fs/mbcache.c b/fs/mbcache.c
> index 8c32ef3..aa3a19a 100644
> --- a/fs/mbcache.c
> +++ b/fs/mbcache.c
> @@ -90,11 +90,14 @@ static DEFINE_SPINLOCK(mb_cache_spinlock);
>   * What the mbcache registers as to get shrunk dynamically.
>   */
>  
> -static int mb_cache_shrink_fn(struct shrinker *shrink,
> -			      struct shrink_control *sc);
> +static long mb_cache_shrink_scan(struct shrinker *shrink,
> +				 struct shrink_control *sc);
> +static long mb_cache_shrink_count(struct shrinker *shrink,
> +				  struct shrink_control *sc);
>  
>  static struct shrinker mb_cache_shrinker = {
> -	.shrink = mb_cache_shrink_fn,
> +	.scan_objects = mb_cache_shrink_scan,
> +	.count_objects = mb_cache_shrink_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> @@ -161,13 +164,12 @@ forget:
>   *
>   * Returns the number of objects which are present in the cache.
>   */
> -static int
> -mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
> +static long
> +mb_cache_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	LIST_HEAD(free_list);
> -	struct mb_cache *cache;
>  	struct mb_cache_entry *entry, *tmp;
> -	int count = 0;
> +	int freed = 0;
>  	int nr_to_scan = sc->nr_to_scan;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
> @@ -180,18 +182,27 @@ mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
>  		list_move_tail(&ce->e_lru_list, &free_list);
>  		__mb_cache_entry_unhash(ce);
>  	}
> -	list_for_each_entry(cache, &mb_cache_list, c_cache_list) {
> -		mb_debug("cache %s (%d)", cache->c_name,
> -			  atomic_read(&cache->c_entry_count));
> -		count += atomic_read(&cache->c_entry_count);
> -	}
>  	spin_unlock(&mb_cache_spinlock);
>  	list_for_each_entry_safe(entry, tmp, &free_list, e_lru_list) {
>  		__mb_cache_entry_forget(entry, gfp_mask);
> +		freed++;
>  	}
> -	return (count / 100) * sysctl_vfs_cache_pressure;
> +	return freed;
>  }
>  
> +static long
> +mb_cache_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct mb_cache *cache;
> +	long count = 0;
> +
> +	spin_lock(&mb_cache_spinlock);
> +	list_for_each_entry(cache, &mb_cache_list, c_cache_list)
> +		count += atomic_read(&cache->c_entry_count);
> +
> +	spin_unlock(&mb_cache_spinlock);
> +	return (count / 100) * sysctl_vfs_cache_pressure;
> +}
>  
>  /*
>   * mb_cache_create()  create a new cache
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index b238d95..a5aefb2 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -2057,17 +2057,18 @@ static void nfs_access_free_list(struct list_head *head)
>  	}
>  }
>  
> -int nfs_access_cache_shrinker(struct shrinker *shrink,
> -			      struct shrink_control *sc)
> +long nfs_access_cache_scan(struct shrinker *shrink,
> +			   struct shrink_control *sc)
>  {
>  	LIST_HEAD(head);
>  	struct nfs_inode *nfsi, *next;
>  	struct nfs_access_entry *cache;
>  	int nr_to_scan = sc->nr_to_scan;
> +	int freed = 0;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
>  	if ((gfp_mask & GFP_KERNEL) != GFP_KERNEL)
> -		return (nr_to_scan == 0) ? 0 : -1;
> +		return -1;
>  
>  	spin_lock(&nfs_access_lru_lock);
>  	list_for_each_entry_safe(nfsi, next, &nfs_access_lru_list, access_cache_inode_lru) {
> @@ -2079,6 +2080,7 @@ int nfs_access_cache_shrinker(struct shrinker *shrink,
>  		spin_lock(&inode->i_lock);
>  		if (list_empty(&nfsi->access_cache_entry_lru))
>  			goto remove_lru_entry;
> +		freed++;
>  		cache = list_entry(nfsi->access_cache_entry_lru.next,
>  				struct nfs_access_entry, lru);
>  		list_move(&cache->lru, &head);
> @@ -2097,7 +2099,14 @@ remove_lru_entry:
>  	}
>  	spin_unlock(&nfs_access_lru_lock);
>  	nfs_access_free_list(&head);
> -	return (atomic_long_read(&nfs_access_nr_entries) / 100) * sysctl_vfs_cache_pressure;
> +	return freed;
> +}
> +
> +long nfs_access_cache_count(struct shrinker *shrink,
> +			    struct shrink_control *sc)
> +{
> +	return (atomic_long_read(&nfs_access_nr_entries) / 100) *
> +						sysctl_vfs_cache_pressure;
>  }
>  
>  static void __nfs_access_zap_cache(struct nfs_inode *nfsi, struct list_head *head)
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index ab12913..9c65e1f 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -244,8 +244,10 @@ extern int nfs_init_client(struct nfs_client *clp,
>  			   int noresvport);
>  
>  /* dir.c */
> -extern int nfs_access_cache_shrinker(struct shrinker *shrink,
> -					struct shrink_control *sc);
> +extern long nfs_access_cache_scan(struct shrinker *shrink,
> +				  struct shrink_control *sc);
> +extern long nfs_access_cache_count(struct shrinker *shrink,
> +				   struct shrink_control *sc);
>  
>  /* inode.c */
>  extern struct workqueue_struct *nfsiod_workqueue;
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index b961cea..e088c03 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -380,7 +380,8 @@ static const struct super_operations nfs4_sops = {
>  #endif
>  
>  static struct shrinker acl_shrinker = {
> -	.shrink		= nfs_access_cache_shrinker,
> +	.scan_objects	= nfs_access_cache_scan,
> +	.count_objects	= nfs_access_cache_count,
>  	.seeks		= DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 5b572c8..c8724d2 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -669,45 +669,42 @@ int dquot_quota_sync(struct super_block *sb, int type, int wait)
>  }
>  EXPORT_SYMBOL(dquot_quota_sync);
>  
> -/* Free unused dquots from cache */
> -static void prune_dqcache(int count)
> +/*
> + * This is called from kswapd when we think we need some
> + * more memory
> + */
> +static long shrink_dqcache_scan(struct shrinker *shrink,
> +				 struct shrink_control *sc)
>  {
>  	struct list_head *head;
>  	struct dquot *dquot;
> +	int freed = 0;
>  
> +	spin_lock(&dq_list_lock);
>  	head = free_dquots.prev;
> -	while (head != &free_dquots && count) {
> +	while (head != &free_dquots && freed < sc->nr_to_scan) {
>  		dquot = list_entry(head, struct dquot, dq_free);
>  		remove_dquot_hash(dquot);
>  		remove_free_dquot(dquot);
>  		remove_inuse(dquot);
>  		do_destroy_dquot(dquot);
> -		count--;
> +		freed++;
>  		head = free_dquots.prev;
>  	}
> +	spin_unlock(&dq_list_lock);
> +
> +	return freed;
>  }
>  
> -/*
> - * This is called from kswapd when we think we need some
> - * more memory
> - */
> -static int shrink_dqcache_memory(struct shrinker *shrink,
> +static long shrink_dqcache_count(struct shrinker *shrink,
>  				 struct shrink_control *sc)
>  {
> -	int nr = sc->nr_to_scan;
> -
> -	if (nr) {
> -		spin_lock(&dq_list_lock);
> -		prune_dqcache(nr);
> -		spin_unlock(&dq_list_lock);
> -	}
> -	return ((unsigned)
> -		percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
> -		/100) * sysctl_vfs_cache_pressure;
> +	return (percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
> +		/ 100) * sysctl_vfs_cache_pressure;
>  }
> -
>  static struct shrinker dqcache_shrinker = {
> -	.shrink = shrink_dqcache_memory,
> +	.scan_objects = shrink_dqcache_scan,
> +	.count_objects = shrink_dqcache_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/super.c b/fs/super.c
> index 6a72693..074abbe 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -45,11 +45,14 @@ DEFINE_SPINLOCK(sb_lock);
>   * shrinker path and that leads to deadlock on the shrinker_rwsem. Hence we
>   * take a passive reference to the superblock to avoid this from occurring.
>   */
> -static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
> +static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct super_block *sb;
> -	int	fs_objects = 0;
> -	int	total_objects;
> +	long	fs_objects = 0;
> +	long	total_objects;
> +	long	freed = 0;
> +	long	dentries;
> +	long	inodes;
>  
>  	sb = container_of(shrink, struct super_block, s_shrink);
>  
> @@ -57,7 +60,7 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
>  	 * Deadlock avoidance.  We may hold various FS locks, and we don't want
>  	 * to recurse into the FS that called us in clear_inode() and friends..
>  	 */
> -	if (sc->nr_to_scan && !(sc->gfp_mask & __GFP_FS))
> +	if (!(sc->gfp_mask & __GFP_FS))
>  		return -1;
>  
>  	if (!grab_super_passive(sb))
> @@ -69,33 +72,42 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
>  	total_objects = sb->s_nr_dentry_unused +
>  			sb->s_nr_inodes_unused + fs_objects + 1;
>  
> -	if (sc->nr_to_scan) {
> -		int	dentries;
> -		int	inodes;
> -
> -		/* proportion the scan between the caches */
> -		dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) /
> -							total_objects;
> -		inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) /
> -							total_objects;
> -		if (fs_objects)
> -			fs_objects = (sc->nr_to_scan * fs_objects) /
> -							total_objects;
> -		/*
> -		 * prune the dcache first as the icache is pinned by it, then
> -		 * prune the icache, followed by the filesystem specific caches
> -		 */
> -		prune_dcache_sb(sb, dentries);
> -		prune_icache_sb(sb, inodes);
> +	/* proportion the scan between the caches */
> +	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
> +	inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) / total_objects;
>  
> -		if (fs_objects && sb->s_op->free_cached_objects) {
> -			sb->s_op->free_cached_objects(sb, fs_objects);
> -			fs_objects = sb->s_op->nr_cached_objects(sb);
> -		}
> -		total_objects = sb->s_nr_dentry_unused +
> -				sb->s_nr_inodes_unused + fs_objects;
> +	/*
> +	 * prune the dcache first as the icache is pinned by it, then
> +	 * prune the icache, followed by the filesystem specific caches
> +	 */
> +	freed = prune_dcache_sb(sb, dentries);
> +	freed += prune_icache_sb(sb, inodes);
> +
> +	if (fs_objects) {
> +		fs_objects = (sc->nr_to_scan * fs_objects) / total_objects;
> +		freed += sb->s_op->free_cached_objects(sb, fs_objects);
>  	}
>  
> +	drop_super(sb);
> +	return freed;
> +}
> +
> +static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct super_block *sb;
> +	long	total_objects = 0;
> +
> +	sb = container_of(shrink, struct super_block, s_shrink);
> +
> +	if (!grab_super_passive(sb))
> +		return -1;
> +
> +	if (sb->s_op && sb->s_op->nr_cached_objects)
> +		total_objects = sb->s_op->nr_cached_objects(sb);
> +
> +	total_objects += sb->s_nr_dentry_unused;
> +	total_objects += sb->s_nr_inodes_unused;
> +
>  	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
>  	drop_super(sb);
>  	return total_objects;
> @@ -182,7 +194,8 @@ static struct super_block *alloc_super(struct file_system_type *type)
>  		s->cleancache_poolid = -1;
>  
>  		s->s_shrink.seeks = DEFAULT_SEEKS;
> -		s->s_shrink.shrink = prune_super;
> +		s->s_shrink.scan_objects = super_cache_scan;
> +		s->s_shrink.count_objects = super_cache_count;
>  		s->s_shrink.batch = 1024;
>  	}
>  out:
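To make the proportioning in super_cache_scan() above concrete: with sc->nr_to_scan = 128, 600 unused dentries, 300 unused inodes and 100 fs objects, total_objects comes to 1001 (the +1 avoids a divide-by-zero), so the scan splits as dentries = 128 * 600 / 1001 = 76, inodes = 128 * 300 / 1001 = 38 and fs_objects = 128 * 100 / 1001 = 12. Each cache is scanned in proportion to its share of the reclaimable objects, and the three targets together never exceed the requested nr_to_scan.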
> diff --git a/fs/ubifs/shrinker.c b/fs/ubifs/shrinker.c
> index 9e1d056..78ca7b7 100644
> --- a/fs/ubifs/shrinker.c
> +++ b/fs/ubifs/shrinker.c
> @@ -277,19 +277,12 @@ static int kick_a_thread(void)
>  	return 0;
>  }
>  
> -int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc)
> +long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	int nr = sc->nr_to_scan;
>  	int freed, contention = 0;
>  	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
>  
> -	if (nr == 0)
> -		/*
> -		 * Due to the way UBIFS updates the clean znode counter it may
> -		 * temporarily be negative.
> -		 */
> -		return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
> -
>  	if (!clean_zn_cnt) {
>  		/*
>  		 * No clean znodes, nothing to reap. All we can do in this case
> @@ -323,3 +316,13 @@ out:
>  	dbg_tnc("%d znodes were freed, requested %d", freed, nr);
>  	return freed;
>  }
> +
> +long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
> +	/*
> +	 * Due to the way UBIFS updates the clean znode counter it may
> +	 * temporarily be negative.
> +	 */
> +	return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
> +}
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index 91903f6..3d3f3e9 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -49,7 +49,8 @@ struct kmem_cache *ubifs_inode_slab;
>  
>  /* UBIFS TNC shrinker description */
>  static struct shrinker ubifs_shrinker_info = {
> -	.shrink = ubifs_shrinker,
> +	.scan_objects = ubifs_shrinker_scan,
> +	.count_objects = ubifs_shrinker_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
> index 27f2255..2b8f48c 100644
> --- a/fs/ubifs/ubifs.h
> +++ b/fs/ubifs/ubifs.h
> @@ -1625,7 +1625,8 @@ int ubifs_tnc_start_commit(struct ubifs_info *c, struct ubifs_zbranch *zroot);
>  int ubifs_tnc_end_commit(struct ubifs_info *c);
>  
>  /* shrinker.c */
> -int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc);
> +long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc);
> +long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc);
>  
>  /* commit.c */
>  int ubifs_bg_thread(void *info);
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 7a026cb..b2eea9e 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1456,8 +1456,8 @@ restart:
>  	spin_unlock(&btp->bt_lru_lock);
>  }
>  
> -int
> -xfs_buftarg_shrink(
> +static long
> +xfs_buftarg_shrink_scan(
>  	struct shrinker		*shrink,
>  	struct shrink_control	*sc)
>  {
> @@ -1465,6 +1465,7 @@ xfs_buftarg_shrink(
>  					struct xfs_buftarg, bt_shrinker);
>  	struct xfs_buf		*bp;
>  	int nr_to_scan = sc->nr_to_scan;
> +	int freed = 0;
>  	LIST_HEAD(dispose);
>  
>  	if (!nr_to_scan)
> @@ -1493,6 +1494,7 @@ xfs_buftarg_shrink(
>  		 */
>  		list_move(&bp->b_lru, &dispose);
>  		btp->bt_lru_nr--;
> +		freed++;
>  	}
>  	spin_unlock(&btp->bt_lru_lock);
>  
> @@ -1502,6 +1504,16 @@ xfs_buftarg_shrink(
>  		xfs_buf_rele(bp);
>  	}
>  
> +	return freed;
> +}
> +
> +static long
> +xfs_buftarg_shrink_count(
> +	struct shrinker		*shrink,
> +	struct shrink_control	*sc)
> +{
> +	struct xfs_buftarg	*btp = container_of(shrink,
> +					struct xfs_buftarg, bt_shrinker);
>  	return btp->bt_lru_nr;
>  }
>  
> @@ -1602,7 +1614,8 @@ xfs_alloc_buftarg(
>  		goto error;
>  	if (xfs_alloc_delwrite_queue(btp, fsname))
>  		goto error;
> -	btp->bt_shrinker.shrink = xfs_buftarg_shrink;
> +	btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
> +	btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
>  	btp->bt_shrinker.seeks = DEFAULT_SEEKS;
>  	register_shrinker(&btp->bt_shrinker);
>  	return btp;
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 9a0aa76..19863a8 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -60,10 +60,12 @@ STATIC void	xfs_qm_list_destroy(xfs_dqlist_t *);
>  
>  STATIC int	xfs_qm_init_quotainos(xfs_mount_t *);
>  STATIC int	xfs_qm_init_quotainfo(xfs_mount_t *);
> -STATIC int	xfs_qm_shake(struct shrinker *, struct shrink_control *);
> +STATIC long	xfs_qm_shake_scan(struct shrinker *, struct shrink_control *);
> +STATIC long	xfs_qm_shake_count(struct shrinker *, struct shrink_control *);
>  
>  static struct shrinker xfs_qm_shaker = {
> -	.shrink = xfs_qm_shake,
> +	.scan_objects = xfs_qm_shake_scan,
> +	.count_objects = xfs_qm_shake_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> @@ -1963,9 +1965,8 @@ xfs_qm_shake_freelist(
>  /*
>   * The kmem_shake interface is invoked when memory is running low.
>   */
> -/* ARGSUSED */
> -STATIC int
> -xfs_qm_shake(
> +STATIC long
> +xfs_qm_shake_scan(
>  	struct shrinker	*shrink,
>  	struct shrink_control *sc)
>  {
> @@ -1973,9 +1974,9 @@ xfs_qm_shake(
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
>  	if (!kmem_shake_allow(gfp_mask))
> -		return 0;
> +		return -1;
>  	if (!xfs_Gqm)
> -		return 0;
> +		return -1;
>  
>  	nfree = xfs_Gqm->qm_dqfrlist_cnt; /* free dquots */
>  	/* incore dquots in all f/s's */
> @@ -1992,6 +1993,13 @@ xfs_qm_shake(
>  	return xfs_qm_shake_freelist(MAX(nfree, n));
>  }
>  
> +STATIC long
> +xfs_qm_shake_count(
> +	struct shrinker	*shrink,
> +	struct shrink_control *sc)
> +{
> +	return xfs_Gqm ? xfs_Gqm->qm_dqfrlist_cnt : -1;
> +}
>  
>  /*------------------------------------------------------------------*/
>  
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index c94ec22..dff4b67 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1473,19 +1473,19 @@ xfs_fs_mount(
>  	return mount_bdev(fs_type, flags, dev_name, data, xfs_fs_fill_super);
>  }
>  
> -static int
> +static long
>  xfs_fs_nr_cached_objects(
>  	struct super_block	*sb)
>  {
>  	return xfs_reclaim_inodes_count(XFS_M(sb));
>  }
>  
> -static void
> +static long
>  xfs_fs_free_cached_objects(
>  	struct super_block	*sb,
> -	int			nr_to_scan)
> +	long			nr_to_scan)
>  {
> -	xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
> +	return xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
>  }
>  
>  static const struct super_operations xfs_super_operations = {
> diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
> index 4604f90..5b60a3a 100644
> --- a/fs/xfs/xfs_sync.c
> +++ b/fs/xfs/xfs_sync.c
> @@ -896,7 +896,7 @@ int
>  xfs_reclaim_inodes_ag(
>  	struct xfs_mount	*mp,
>  	int			flags,
> -	int			*nr_to_scan)
> +	long			*nr_to_scan)
>  {
>  	struct xfs_perag	*pag;
>  	int			error = 0;
> @@ -1017,7 +1017,7 @@ xfs_reclaim_inodes(
>  	xfs_mount_t	*mp,
>  	int		mode)
>  {
> -	int		nr_to_scan = INT_MAX;
> +	long		nr_to_scan = LONG_MAX;
>  
>  	return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
>  }
> @@ -1031,29 +1031,32 @@ xfs_reclaim_inodes(
>   * them to be cleaned, which we hope will not be very long due to the
>   * background walker having already kicked the IO off on those dirty inodes.
>   */
> -void
> +long
>  xfs_reclaim_inodes_nr(
>  	struct xfs_mount	*mp,
> -	int			nr_to_scan)
> +	long			nr_to_scan)
>  {
> +	long nr = nr_to_scan;
> +
>  	/* kick background reclaimer and push the AIL */
>  	xfs_syncd_queue_reclaim(mp);
>  	xfs_ail_push_all(mp->m_ail);
>  
> -	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan);
> +	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr);
> +	return nr_to_scan - nr;
>  }
>  
>  /*
>   * Return the number of reclaimable inodes in the filesystem for
>   * the shrinker to determine how much to reclaim.
>   */
> -int
> +long
>  xfs_reclaim_inodes_count(
>  	struct xfs_mount	*mp)
>  {
>  	struct xfs_perag	*pag;
>  	xfs_agnumber_t		ag = 0;
> -	int			reclaimable = 0;
> +	long			reclaimable = 0;
>  
>  	while ((pag = xfs_perag_get_tag(mp, ag, XFS_ICI_RECLAIM_TAG))) {
>  		ag = pag->pag_agno + 1;
> diff --git a/fs/xfs/xfs_sync.h b/fs/xfs/xfs_sync.h
> index 941202e..82e1b1c 100644
> --- a/fs/xfs/xfs_sync.h
> +++ b/fs/xfs/xfs_sync.h
> @@ -35,8 +35,8 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
>  void xfs_flush_inodes(struct xfs_inode *ip);
>  
>  int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
> -int xfs_reclaim_inodes_count(struct xfs_mount *mp);
> -void xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
> +long xfs_reclaim_inodes_count(struct xfs_mount *mp);
> +long xfs_reclaim_inodes_nr(struct xfs_mount *mp, long nr_to_scan);
>  
>  void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
>  void __xfs_inode_set_reclaim_tag(struct xfs_perag *pag, struct xfs_inode *ip);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 14be4d8..958c025 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1465,10 +1465,6 @@ struct super_block {
>  	struct shrinker s_shrink;	/* per-sb shrinker handle */
>  };
>  
> -/* superblock cache pruning functions */
> -extern void prune_icache_sb(struct super_block *sb, int nr_to_scan);
> -extern void prune_dcache_sb(struct super_block *sb, int nr_to_scan);
> -
>  extern struct timespec current_fs_time(struct super_block *sb);
>  
>  /*
> @@ -1662,8 +1658,8 @@ struct super_operations {
>  	ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
>  #endif
>  	int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
> -	int (*nr_cached_objects)(struct super_block *);
> -	void (*free_cached_objects)(struct super_block *, int);
> +	long (*nr_cached_objects)(struct super_block *);
> +	long (*free_cached_objects)(struct super_block *, long);
>  };
>  
>  /*
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 36851f7..80308ea 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -190,7 +190,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>  
>  	TP_STRUCT__entry(
>  		__field(struct shrinker *, shr)
> -		__field(void *, shrink)
> +		__field(void *, scan)
>  		__field(long, nr_objects_to_shrink)
>  		__field(gfp_t, gfp_flags)
>  		__field(unsigned long, pgs_scanned)
> @@ -202,7 +202,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>  
>  	TP_fast_assign(
>  		__entry->shr = shr;
> -		__entry->shrink = shr->shrink;
> +		__entry->scan = shr->scan_objects;
>  		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
>  		__entry->gfp_flags = sc->gfp_mask;
>  		__entry->pgs_scanned = pgs_scanned;
> @@ -213,7 +213,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>  	),
>  
>  	TP_printk("%pF %p: objects to shrink %ld gfp_flags %s pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
> -		__entry->shrink,
> +		__entry->scan,
>  		__entry->shr,
>  		__entry->nr_objects_to_shrink,
>  		show_gfp_flags(__entry->gfp_flags),
> @@ -232,7 +232,7 @@ TRACE_EVENT(mm_shrink_slab_end,
>  
>  	TP_STRUCT__entry(
>  		__field(struct shrinker *, shr)
> -		__field(void *, shrink)
> +		__field(void *, scan)
>  		__field(long, unused_scan)
>  		__field(long, new_scan)
>  		__field(int, retval)
> @@ -241,7 +241,7 @@ TRACE_EVENT(mm_shrink_slab_end,
>  
>  	TP_fast_assign(
>  		__entry->shr = shr;
> -		__entry->shrink = shr->shrink;
> +		__entry->scan = shr->scan_objects;
>  		__entry->unused_scan = unused_scan_cnt;
>  		__entry->new_scan = new_scan_cnt;
>  		__entry->retval = shrinker_retval;
> @@ -249,7 +249,7 @@ TRACE_EVENT(mm_shrink_slab_end,
>  	),
>  
>  	TP_printk("%pF %p: unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
> -		__entry->shrink,
> +		__entry->scan,
>  		__entry->shr,
>  		__entry->unused_scan,
>  		__entry->new_scan,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7ef6912..e32ce2d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -202,14 +202,6 @@ void unregister_shrinker(struct shrinker *shrinker)
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
>  
> -static inline int do_shrinker_shrink(struct shrinker *shrinker,
> -				     struct shrink_control *sc,
> -				     unsigned long nr_to_scan)
> -{
> -	sc->nr_to_scan = nr_to_scan;
> -	return (*shrinker->shrink)(shrinker, sc);
> -}
> -
>  #define SHRINK_BATCH 128
>  /*
>   * Call the shrink functions to age shrinkable caches
> @@ -230,27 +222,26 @@ static inline int do_shrinker_shrink(struct shrinker *shrinker,
>   *
>   * Returns the number of slab objects which we shrunk.
>   */
> -unsigned long shrink_slab(struct shrink_control *shrink,
> +unsigned long shrink_slab(struct shrink_control *sc,
>  			  unsigned long nr_pages_scanned,
>  			  unsigned long lru_pages)
>  {
>  	struct shrinker *shrinker;
> -	unsigned long ret = 0;
> +	unsigned long freed = 0;
>  
>  	if (nr_pages_scanned == 0)
>  		nr_pages_scanned = SWAP_CLUSTER_MAX;
>  
>  	if (!down_read_trylock(&shrinker_rwsem)) {
>  		/* Assume we'll be able to shrink next time */
> -		ret = 1;
> +		freed = 1;
>  		goto out;
>  	}
>  
>  	list_for_each_entry(shrinker, &shrinker_list, list) {
> -		unsigned long long delta;
> -		unsigned long total_scan;
> -		unsigned long max_pass;
> -		int shrink_ret = 0;
> +		long long delta;
> +		long total_scan;
> +		long max_pass;
>  		long nr;
>  		long new_nr;
>  		long batch_size = shrinker->batch ? shrinker->batch
> @@ -266,7 +257,9 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  		} while (cmpxchg(&shrinker->nr, nr, 0) != nr);
>  
>  		total_scan = nr;
> -		max_pass = do_shrinker_shrink(shrinker, shrink, 0);
> +		max_pass = shrinker->count_objects(shrinker, sc);
> +		WARN_ON_ONCE(max_pass < 0);
> +
>  		delta = (4 * nr_pages_scanned) / shrinker->seeks;
>  		delta *= max_pass;
>  		do_div(delta, lru_pages + 1);
> @@ -274,7 +267,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  		if (total_scan < 0) {
>  			printk(KERN_ERR "shrink_slab: %pF negative objects to "
>  			       "delete nr=%ld\n",
> -			       shrinker->shrink, total_scan);
> +			       shrinker->scan_objects, total_scan);
>  			total_scan = max_pass;
>  		}
>  
> @@ -301,20 +294,19 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  		if (total_scan > max_pass * 2)
>  			total_scan = max_pass * 2;
>  
> -		trace_mm_shrink_slab_start(shrinker, shrink, nr,
> +		trace_mm_shrink_slab_start(shrinker, sc, nr,
>  					nr_pages_scanned, lru_pages,
>  					max_pass, delta, total_scan);
>  
>  		while (total_scan >= batch_size) {
> -			int nr_before;
> +			long ret;
> +
> +			sc->nr_to_scan = batch_size;
> +			ret = shrinker->scan_objects(shrinker, sc);
>  
> -			nr_before = do_shrinker_shrink(shrinker, shrink, 0);
> -			shrink_ret = do_shrinker_shrink(shrinker, shrink,
> -							batch_size);
> -			if (shrink_ret == -1)
> +			if (ret == -1)
>  				break;
> -			if (shrink_ret < nr_before)
> -				ret += nr_before - shrink_ret;
> +			freed += ret;
>  			count_vm_events(SLABS_SCANNED, batch_size);
>  			total_scan -= batch_size;
>  
> @@ -333,12 +325,12 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  				break;
>  		} while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
>  
> -		trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
> +		trace_mm_shrink_slab_end(shrinker, freed, nr, new_nr);
>  	}
>  	up_read(&shrinker_rwsem);
>  out:
>  	cond_resched();
> -	return ret;
> +	return freed;
>  }
>  
>  static void set_reclaim_mode(int priority, struct scan_control *sc,
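As a worked example of the scan target calculation above: with nr_pages_scanned = 1000, shrinker->seeks = DEFAULT_SEEKS (i.e. 2), max_pass = 10000 objects and lru_pages = 100000, delta = (4 * 1000 / 2) * 10000 / 100001, i.e. roughly 200 objects. That delta is accumulated into shrinker->nr and then worked off in batch_size chunks through ->scan_objects(), with each call reporting back how many objects it actually managed to free.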
> diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
> index 727e506..f5955c3 100644
> --- a/net/sunrpc/auth.c
> +++ b/net/sunrpc/auth.c
> @@ -292,6 +292,7 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
>  	spinlock_t *cache_lock;
>  	struct rpc_cred *cred, *next;
>  	unsigned long expired = jiffies - RPC_AUTH_EXPIRY_MORATORIUM;
> +	int freed = 0;
>  
>  	list_for_each_entry_safe(cred, next, &cred_unused, cr_lru) {
>  
> @@ -303,10 +304,10 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
>  		 */
>  		if (time_in_range(cred->cr_expire, expired, jiffies) &&
>  		    test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
> -			return 0;
> +			break;
>  
> -		list_del_init(&cred->cr_lru);
>  		number_cred_unused--;
> +		list_del_init(&cred->cr_lru);
>  		if (atomic_read(&cred->cr_count) != 0)
>  			continue;
>  
> @@ -316,17 +317,18 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
>  			get_rpccred(cred);
>  			list_add_tail(&cred->cr_lru, free);
>  			rpcauth_unhash_cred_locked(cred);
> +			freed++;
>  		}
>  		spin_unlock(cache_lock);
>  	}
> -	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
> +	return freed;
>  }
>  
>  /*
>   * Run memory cache shrinker.
>   */
> -static int
> -rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
> +static long
> +rpcauth_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	LIST_HEAD(free);
>  	int res;
> @@ -344,6 +346,12 @@ rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
>  	return res;
>  }
>  
> +static long
> +rpcauth_cache_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
> +}
> +
>  /*
>   * Look up a process' credentials in the authentication cache
>   */
> @@ -658,7 +666,8 @@ rpcauth_uptodatecred(struct rpc_task *task)
>  }
>  
>  static struct shrinker rpc_cred_shrinker = {
> -	.shrink = rpcauth_cache_shrinker,
> +	.scan_objects = rpcauth_cache_scan,
> +	.count_objects = rpcauth_cache_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 05/13] mm: convert shrinkers to use new API
@ 2011-08-23  9:35     ` Steven Whitehouse
  0 siblings, 0 replies; 74+ messages in thread
From: Steven Whitehouse @ 2011-08-23  9:35 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

Hi,

On Tue, 2011-08-23 at 18:56 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Modify shrink_slab() to use the new .count_objects/.scan_objects API
> and implement the callouts for all the existing shrinkers.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

GFS2 bits:
Acked-by: Steven Whitehouse <swhiteho@redhat.com>

Looks good to me,

Steve.

> ---
>  Documentation/filesystems/vfs.txt    |   11 +++--
>  arch/x86/kvm/mmu.c                   |   16 ++++---
>  drivers/gpu/drm/i915/i915_dma.c      |    4 +-
>  drivers/gpu/drm/i915/i915_gem.c      |   49 ++++++++++++++---------
>  drivers/gpu/drm/ttm/ttm_page_alloc.c |   14 ++++--
>  drivers/staging/zcache/zcache-main.c |   45 ++++++++++++---------
>  fs/cifs/cifsacl.c                    |   57 +++++++++++++++++----------
>  fs/dcache.c                          |   15 ++++---
>  fs/gfs2/glock.c                      |   24 +++++++-----
>  fs/gfs2/main.c                       |    3 +-
>  fs/gfs2/quota.c                      |   19 +++++----
>  fs/gfs2/quota.h                      |    4 +-
>  fs/inode.c                           |    7 ++-
>  fs/internal.h                        |    3 +
>  fs/mbcache.c                         |   37 +++++++++++------
>  fs/nfs/dir.c                         |   17 ++++++--
>  fs/nfs/internal.h                    |    6 ++-
>  fs/nfs/super.c                       |    3 +-
>  fs/quota/dquot.c                     |   39 +++++++++----------
>  fs/super.c                           |   71 ++++++++++++++++++++--------------
>  fs/ubifs/shrinker.c                  |   19 +++++----
>  fs/ubifs/super.c                     |    3 +-
>  fs/ubifs/ubifs.h                     |    3 +-
>  fs/xfs/xfs_buf.c                     |   19 ++++++++-
>  fs/xfs/xfs_qm.c                      |   22 +++++++---
>  fs/xfs/xfs_super.c                   |    8 ++--
>  fs/xfs/xfs_sync.c                    |   17 +++++---
>  fs/xfs/xfs_sync.h                    |    4 +-
>  include/linux/fs.h                   |    8 +---
>  include/trace/events/vmscan.h        |   12 +++---
>  mm/vmscan.c                          |   46 +++++++++-------------
>  net/sunrpc/auth.c                    |   21 +++++++---
>  32 files changed, 369 insertions(+), 257 deletions(-)
> 
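The shape of the conversion is the same everywhere: the old ->shrink() callout, which overloaded nr_to_scan == 0 to mean "just count", is split in two, so the core of shrink_slab() reduces to something like this (a simplified sketch of the loop the patch adds below, not the verbatim code):

	max_pass = shrinker->count_objects(shrinker, sc);
	/* ... derive total_scan from max_pass, seeks and pages scanned ... */
	while (total_scan >= batch_size) {
		long ret;

		sc->nr_to_scan = batch_size;
		ret = shrinker->scan_objects(shrinker, sc);
		if (ret == -1)
			break;		/* shrinker could not make progress */
		freed += ret;
		total_scan -= batch_size;
	}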
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index 52d8fb8..4ca3c2d 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -229,8 +229,8 @@ struct super_operations {
>  
>          ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
>          ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
> -	int (*nr_cached_objects)(struct super_block *);
> -	void (*free_cached_objects)(struct super_block *, int);
> +	long (*nr_cached_objects)(struct super_block *);
> +	long (*free_cached_objects)(struct super_block *, long);
>  };
>  
>  All methods are called without any locks being held, unless otherwise
> @@ -313,9 +313,10 @@ or bottom half).
>  	implement ->nr_cached_objects for it to be called correctly.
>  
>  	We can't do anything with any errors that the filesystem might
> -	encountered, hence the void return type. This will never be called if
> -	the VM is trying to reclaim under GFP_NOFS conditions, hence this
> -	method does not need to handle that situation itself.
> +	encounter, so the return value is the number of objects freed. This
> +	will never be called if the VM is trying to reclaim under GFP_NOFS
> +	conditions, hence this method does not need to handle that situation
> +	itself.
>  
>  	Implementations must include conditional reschedule calls inside any
>  	scanning loop that is done. This allows the VFS to determine
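Under those rules a filesystem's callouts end up looking something like the following (a sketch modelled loosely on the XFS conversion later in this patch; the myfs_* names are placeholders):

	static atomic_long_t myfs_nr_reclaimable = ATOMIC_LONG_INIT(0);

	static long myfs_nr_cached_objects(struct super_block *sb)
	{
		/* cheap count of fs-private reclaimable objects */
		return atomic_long_read(&myfs_nr_reclaimable);
	}

	static long myfs_free_cached_objects(struct super_block *sb, long nr_to_scan)
	{
		long freed = 0;

		/*
		 * myfs_reclaim_one_object() stands in for the fs's own
		 * reclaim; assume it returns false when nothing is left.
		 */
		while (freed < nr_to_scan && myfs_reclaim_one_object(sb)) {
			atomic_long_dec(&myfs_nr_reclaimable);
			freed++;
			/* conditional reschedule, as required above */
			cond_resched();
		}
		return freed;
	}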
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 1c5b693..939e201 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3858,14 +3858,12 @@ static int kvm_mmu_remove_some_alloc_mmu_pages(struct kvm *kvm,
>  	return kvm_mmu_prepare_zap_page(kvm, page, invalid_list);
>  }
>  
> -static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
> +static long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct kvm *kvm;
>  	struct kvm *kvm_freed = NULL;
>  	int nr_to_scan = sc->nr_to_scan;
> -
> -	if (nr_to_scan == 0)
> -		goto out;
> +	long freed_pages = 0;
>  
>  	raw_spin_lock(&kvm_lock);
>  
> @@ -3877,7 +3875,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  		spin_lock(&kvm->mmu_lock);
>  		if (!kvm_freed && nr_to_scan > 0 &&
>  		    kvm->arch.n_used_mmu_pages > 0) {
> -			freed_pages = kvm_mmu_remove_some_alloc_mmu_pages(kvm,
> +			freed_pages += kvm_mmu_remove_some_alloc_mmu_pages(kvm,
>  							  &invalid_list);
>  			kvm_freed = kvm;
>  		}
> @@ -3891,13 +3889,17 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  		list_move_tail(&kvm_freed->vm_list, &vm_list);
>  
>  	raw_spin_unlock(&kvm_lock);
> +	return freed_pages;
> +}
>  
> -out:
> +static long mmu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
>  	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
>  }
>  
>  static struct shrinker mmu_shrinker = {
> -	.shrink = mmu_shrink,
> +	.scan_objects = mmu_shrink_scan,
> +	.count_objects = mmu_shrink_count,
>  	.seeks = DEFAULT_SEEKS * 10,
>  };
>  
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index 8a3942c..734ea5e 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -2074,7 +2074,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
>  	return 0;
>  
>  out_gem_unload:
> -	if (dev_priv->mm.inactive_shrinker.shrink)
> +	if (dev_priv->mm.inactive_shrinker.scan_objects)
>  		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
>  
>  	if (dev->pdev->msi_enabled)
> @@ -2108,7 +2108,7 @@ int i915_driver_unload(struct drm_device *dev)
>  	i915_mch_dev = NULL;
>  	spin_unlock(&mchdev_lock);
>  
> -	if (dev_priv->mm.inactive_shrinker.shrink)
> +	if (dev_priv->mm.inactive_shrinker.scan_objects)
>  		unregister_shrinker(&dev_priv->mm.inactive_shrinker);
>  
>  	mutex_lock(&dev->struct_mutex);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a546a71..0647a33 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -56,7 +56,9 @@ static int i915_gem_phys_pwrite(struct drm_device *dev,
>  				struct drm_file *file);
>  static void i915_gem_free_object_tail(struct drm_i915_gem_object *obj);
>  
> -static int i915_gem_inactive_shrink(struct shrinker *shrinker,
> +static long i915_gem_inactive_scan(struct shrinker *shrinker,
> +				   struct shrink_control *sc);
> +static long i915_gem_inactive_count(struct shrinker *shrinker,
>  				    struct shrink_control *sc);
>  
>  /* some bookkeeping */
> @@ -3999,7 +4001,8 @@ i915_gem_load(struct drm_device *dev)
>  
>  	dev_priv->mm.interruptible = true;
>  
> -	dev_priv->mm.inactive_shrinker.shrink = i915_gem_inactive_shrink;
> +	dev_priv->mm.inactive_shrinker.scan_objects = i915_gem_inactive_scan;
> +	dev_priv->mm.inactive_shrinker.count_objects = i915_gem_inactive_count;
>  	dev_priv->mm.inactive_shrinker.seeks = DEFAULT_SEEKS;
>  	register_shrinker(&dev_priv->mm.inactive_shrinker);
>  }
> @@ -4221,8 +4224,8 @@ i915_gpu_is_active(struct drm_device *dev)
>  	return !lists_empty;
>  }
>  
> -static int
> -i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
> +static long
> +i915_gem_inactive_scan(struct shrinker *shrinker, struct shrink_control *sc)
>  {
>  	struct drm_i915_private *dev_priv =
>  		container_of(shrinker,
> @@ -4231,22 +4234,10 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
>  	struct drm_device *dev = dev_priv->dev;
>  	struct drm_i915_gem_object *obj, *next;
>  	int nr_to_scan = sc->nr_to_scan;
> -	int cnt;
>  
>  	if (!mutex_trylock(&dev->struct_mutex))
>  		return 0;
>  
> -	/* "fast-path" to count number of available objects */
> -	if (nr_to_scan == 0) {
> -		cnt = 0;
> -		list_for_each_entry(obj,
> -				    &dev_priv->mm.inactive_list,
> -				    mm_list)
> -			cnt++;
> -		mutex_unlock(&dev->struct_mutex);
> -		return cnt / 100 * sysctl_vfs_cache_pressure;
> -	}
> -
>  rescan:
>  	/* first scan for clean buffers */
>  	i915_gem_retire_requests(dev);
> @@ -4262,15 +4253,12 @@ rescan:
>  	}
>  
>  	/* second pass, evict/count anything still on the inactive list */
> -	cnt = 0;
>  	list_for_each_entry_safe(obj, next,
>  				 &dev_priv->mm.inactive_list,
>  				 mm_list) {
>  		if (nr_to_scan &&
>  		    i915_gem_object_unbind(obj) == 0)
>  			nr_to_scan--;
> -		else
> -			cnt++;
>  	}
>  
>  	if (nr_to_scan && i915_gpu_is_active(dev)) {
> @@ -4284,5 +4272,26 @@ rescan:
>  			goto rescan;
>  	}
>  	mutex_unlock(&dev->struct_mutex);
> -	return cnt / 100 * sysctl_vfs_cache_pressure;
> +	return sc->nr_to_scan - nr_to_scan;
> +}
> +
> +static long
> +i915_gem_inactive_count(struct shrinker *shrinker, struct shrink_control *sc)
> +{
> +	struct drm_i915_private *dev_priv =
> +		container_of(shrinker,
> +			     struct drm_i915_private,
> +			     mm.inactive_shrinker);
> +	struct drm_device *dev = dev_priv->dev;
> +	struct drm_i915_gem_object *obj;
> +	long count = 0;
> +
> +	if (!mutex_trylock(&dev->struct_mutex))
> +		return 0;
> +
> +	list_for_each_entry(obj, &dev_priv->mm.inactive_list, mm_list)
> +		count++;
> +
> +	mutex_unlock(&dev->struct_mutex);
> +	return count;
>  }
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index 727e93d..3e71c68 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -395,14 +395,13 @@ static int ttm_pool_get_num_unused_pages(void)
>  /**
>   * Callback for mm to request pool to reduce number of page held.
>   */
> -static int ttm_pool_mm_shrink(struct shrinker *shrink,
> -			      struct shrink_control *sc)
> +static long ttm_pool_mm_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	static atomic_t start_pool = ATOMIC_INIT(0);
>  	unsigned i;
>  	unsigned pool_offset = atomic_add_return(1, &start_pool);
>  	struct ttm_page_pool *pool;
> -	int shrink_pages = sc->nr_to_scan;
> +	long shrink_pages = sc->nr_to_scan;
>  
>  	pool_offset = pool_offset % NUM_POOLS;
>  	/* select start pool in round robin fashion */
> @@ -413,13 +412,18 @@ static int ttm_pool_mm_shrink(struct shrinker *shrink,
>  		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
>  		shrink_pages = ttm_page_pool_free(pool, nr_free);
>  	}
> -	/* return estimated number of unused pages in pool */
> +	return sc->nr_to_scan;
> +}
> +
> +static long ttm_pool_mm_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
>  	return ttm_pool_get_num_unused_pages();
>  }
>  
>  static void ttm_pool_mm_shrink_init(struct ttm_pool_manager *manager)
>  {
> -	manager->mm_shrink.shrink = &ttm_pool_mm_shrink;
> +	manager->mm_shrink.scan_objects = ttm_pool_mm_scan;
> +	manager->mm_shrink.count_objects = ttm_pool_mm_count;
>  	manager->mm_shrink.seeks = 1;
>  	register_shrinker(&manager->mm_shrink);
>  }
> diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
> index 855a5bb..3ccb723 100644
> --- a/drivers/staging/zcache/zcache-main.c
> +++ b/drivers/staging/zcache/zcache-main.c
> @@ -493,9 +493,10 @@ static void zbud_evict_zbpg(struct zbud_page *zbpg)
>   * page in use by another cpu, but also to avoid potential deadlock due to
>   * lock inversion.
>   */
> -static void zbud_evict_pages(int nr)
> +static int zbud_evict_pages(int nr)
>  {
>  	struct zbud_page *zbpg;
> +	int freed = 0;
>  	int i;
>  
>  	/* first try freeing any pages on unused list */
> @@ -511,7 +512,7 @@ retry_unused_list:
>  		spin_unlock_bh(&zbpg_unused_list_spinlock);
>  		zcache_free_page(zbpg);
>  		zcache_evicted_raw_pages++;
> -		if (--nr <= 0)
> +		if (++freed >= nr)
>  			goto out;
>  		goto retry_unused_list;
>  	}
> @@ -535,7 +536,7 @@ retry_unbud_list_i:
>  			/* want budlists unlocked when doing zbpg eviction */
>  			zbud_evict_zbpg(zbpg);
>  			local_bh_enable();
> -			if (--nr <= 0)
> +			if (++freed >= nr)
>  				goto out;
>  			goto retry_unbud_list_i;
>  		}
> @@ -559,13 +560,13 @@ retry_bud_list:
>  		/* want budlists unlocked when doing zbpg eviction */
>  		zbud_evict_zbpg(zbpg);
>  		local_bh_enable();
> -		if (--nr <= 0)
> +		if (++freed >= nr)
>  			goto out;
>  		goto retry_bud_list;
>  	}
>  	spin_unlock_bh(&zbud_budlists_spinlock);
>  out:
> -	return;
> +	return freed;
>  }
>  
>  static void zbud_init(void)
> @@ -1496,30 +1497,34 @@ static bool zcache_freeze;
>  /*
>   * zcache shrinker interface (only useful for ephemeral pages, so zbud only)
>   */
> -static int shrink_zcache_memory(struct shrinker *shrink,
> -				struct shrink_control *sc)
> +static long shrink_zcache_scan(struct shrinker *shrink,
> +			       struct shrink_control *sc)
>  {
>  	int ret = -1;
>  	int nr = sc->nr_to_scan;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
> -	if (nr >= 0) {
> -		if (!(gfp_mask & __GFP_FS))
> -			/* does this case really need to be skipped? */
> -			goto out;
> -		if (spin_trylock(&zcache_direct_reclaim_lock)) {
> -			zbud_evict_pages(nr);
> -			spin_unlock(&zcache_direct_reclaim_lock);
> -		} else
> -			zcache_aborted_shrink++;
> -	}
> -	ret = (int)atomic_read(&zcache_zbud_curr_raw_pages);
> -out:
> +	if (!(gfp_mask & __GFP_FS))
> +		return -1;
> +
> +	if (spin_trylock(&zcache_direct_reclaim_lock)) {
> +		ret = zbud_evict_pages(nr);
> +		spin_unlock(&zcache_direct_reclaim_lock);
> +	} else
> +		zcache_aborted_shrink++;
> +
>  	return ret;
>  }
>  
> +static long shrink_zcache_count(struct shrinker *shrink,
> +				struct shrink_control *sc)
> +{
> +	return atomic_read(&zcache_zbud_curr_raw_pages);
> +}
> +
>  static struct shrinker zcache_shrinker = {
> -	.shrink = shrink_zcache_memory,
> +	.scan_objects = shrink_zcache_scan,
> +	.count_objects = shrink_zcache_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
> index d0f59fa..508a684 100644
> --- a/fs/cifs/cifsacl.c
> +++ b/fs/cifs/cifsacl.c
> @@ -44,58 +44,73 @@ static const struct cifs_sid sid_user = {1, 2 , {0, 0, 0, 0, 0, 5}, {} };
>  
>  const struct cred *root_cred;
>  
> -static void
> -shrink_idmap_tree(struct rb_root *root, int nr_to_scan, int *nr_rem,
> -			int *nr_del)
> +static long
> +shrink_idmap_tree(struct rb_root *root, int nr_to_scan)
>  {
>  	struct rb_node *node;
>  	struct rb_node *tmp;
>  	struct cifs_sid_id *psidid;
> +	long count = 0;
>  
>  	node = rb_first(root);
>  	while (node) {
>  		tmp = node;
>  		node = rb_next(tmp);
>  		psidid = rb_entry(tmp, struct cifs_sid_id, rbnode);
> -		if (nr_to_scan == 0 || *nr_del == nr_to_scan)
> -			++(*nr_rem);
> -		else {
> -			if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
> -						&& psidid->refcount == 0) {
> -				rb_erase(tmp, root);
> -				++(*nr_del);
> -			} else
> -				++(*nr_rem);
> +		if (nr_to_scan == 0) {
> +			count++;
> +			continue;
> +		}
> +		if (time_after(jiffies, psidid->time + SID_MAP_EXPIRE)
> +					&& psidid->refcount == 0) {
> +			rb_erase(tmp, root);
> +			if (++count >= nr_to_scan)
> +				break;
>  		}
>  	}
> +	return count;
>  }
>  
>  /*
>   * Run idmap cache shrinker.
>   */
> -static int
> -cifs_idmap_shrinker(struct shrinker *shrink, struct shrink_control *sc)
> +static long
> +cifs_idmap_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
> -	int nr_to_scan = sc->nr_to_scan;
> -	int nr_del = 0;
> -	int nr_rem = 0;
>  	struct rb_root *root;
> +	long freed;
>  
>  	root = &uidtree;
>  	spin_lock(&siduidlock);
> -	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
> +	freed = shrink_idmap_tree(root, sc->nr_to_scan);
>  	spin_unlock(&siduidlock);
>  
>  	root = &gidtree;
>  	spin_lock(&sidgidlock);
> -	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
> +	freed += shrink_idmap_tree(root, sc->nr_to_scan);
>  	spin_unlock(&sidgidlock);
>  
> -	return nr_rem;
> +	return freed;
> +}
> +
> +/*
> + * This still abuses the nr_to_scan == 0 trick to get the common code just to
> + * count objects. There needs to be an external count of the objects in the
> + * caches to avoid this.
> + */
> +static long
> +cifs_idmap_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct shrink_control dummy_sc = {
> +		.nr_to_scan = 0,
> +	};
> +
> +	return cifs_idmap_shrinker_scan(shrink, &dummy_sc);
>  }
>  
>  static struct shrinker cifs_shrinker = {
> -	.shrink = cifs_idmap_shrinker,
> +	.scan_objects = cifs_idmap_shrinker_scan,
> +	.count_objects = cifs_idmap_shrinker_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/dcache.c b/fs/dcache.c
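The external count that comment asks for is cheap to maintain: bump an atomic counter where entries are inserted into the uid/gid trees and drop it next to each rb_erase(), and the count callout no longer needs to go through the scan path at all. A sketch (the counter is hypothetical, not something this patch adds):

	static atomic_t cifs_idmap_nr_ids = ATOMIC_INIT(0);

	/*
	 * atomic_inc(&cifs_idmap_nr_ids) at each tree insertion,
	 * atomic_dec(&cifs_idmap_nr_ids) next to each rb_erase().
	 */
	static long
	cifs_idmap_shrinker_count(struct shrinker *shrink,
				  struct shrink_control *sc)
	{
		return atomic_read(&cifs_idmap_nr_ids);
	}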
> index 5123d71..d19e453 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -759,11 +759,12 @@ static void shrink_dentry_list(struct list_head *list)
>   *
>   * If flags contains DCACHE_REFERENCED reference dentries will not be pruned.
>   */
> -static void __shrink_dcache_sb(struct super_block *sb, int count, int flags)
> +static long __shrink_dcache_sb(struct super_block *sb, long count, int flags)
>  {
>  	struct dentry *dentry;
>  	LIST_HEAD(referenced);
>  	LIST_HEAD(tmp);
> +	long freed = 0;
>  
>  relock:
>  	spin_lock(&sb->s_dentry_lru_lock);
> @@ -791,6 +792,7 @@ relock:
>  		} else {
>  			list_move_tail(&dentry->d_lru, &tmp);
>  			spin_unlock(&dentry->d_lock);
> +			freed++;
>  			if (!--count)
>  				break;
>  		}
> @@ -801,6 +803,7 @@ relock:
>  	spin_unlock(&sb->s_dentry_lru_lock);
>  
>  	shrink_dentry_list(&tmp);
> +	return freed;
>  }
>  
>  /**
> @@ -815,9 +818,9 @@ relock:
>   * This function may fail to free any resources if all the dentries are in
>   * use.
>   */
> -void prune_dcache_sb(struct super_block *sb, int nr_to_scan)
> +long prune_dcache_sb(struct super_block *sb, long nr_to_scan)
>  {
> -	__shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
> +	return __shrink_dcache_sb(sb, nr_to_scan, DCACHE_REFERENCED);
>  }
>  
>  /**
> @@ -1070,12 +1073,12 @@ EXPORT_SYMBOL(have_submounts);
>   * drop the lock and return early due to latency
>   * constraints.
>   */
> -static int select_parent(struct dentry * parent)
> +static long select_parent(struct dentry * parent)
>  {
>  	struct dentry *this_parent;
>  	struct list_head *next;
>  	unsigned seq;
> -	int found = 0;
> +	long found = 0;
>  	int locked = 0;
>  
>  	seq = read_seqbegin(&rename_lock);
> @@ -1163,7 +1166,7 @@ rename_retry:
>  void shrink_dcache_parent(struct dentry * parent)
>  {
>  	struct super_block *sb = parent->d_sb;
> -	int found;
> +	long found;
>  
>  	while ((found = select_parent(parent)) != 0)
>  		__shrink_dcache_sb(sb, found, 0);
> diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> index 88e8a23..f9bc88d 100644
> --- a/fs/gfs2/glock.c
> +++ b/fs/gfs2/glock.c
> @@ -1370,24 +1370,21 @@ void gfs2_glock_complete(struct gfs2_glock *gl, int ret)
>  }
>  
> 
> -static int gfs2_shrink_glock_memory(struct shrinker *shrink,
> -				    struct shrink_control *sc)
> +static long gfs2_shrink_glock_scan(struct shrinker *shrink,
> +				   struct shrink_control *sc)
>  {
>  	struct gfs2_glock *gl;
>  	int may_demote;
>  	int nr_skipped = 0;
> -	int nr = sc->nr_to_scan;
> +	int freed = 0;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  	LIST_HEAD(skipped);
>  
> -	if (nr == 0)
> -		goto out;
> -
>  	if (!(gfp_mask & __GFP_FS))
>  		return -1;
>  
>  	spin_lock(&lru_lock);
> -	while(nr && !list_empty(&lru_list)) {
> +	while (freed < sc->nr_to_scan && !list_empty(&lru_list)) {
>  		gl = list_entry(lru_list.next, struct gfs2_glock, gl_lru);
>  		list_del_init(&gl->gl_lru);
>  		clear_bit(GLF_LRU, &gl->gl_flags);
> @@ -1401,7 +1398,7 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
>  			may_demote = demote_ok(gl);
>  			if (may_demote) {
>  				handle_callback(gl, LM_ST_UNLOCKED, 0);
> -				nr--;
> +				freed++;
>  			}
>  			clear_bit(GLF_LOCK, &gl->gl_flags);
>  			smp_mb__after_clear_bit();
> @@ -1418,12 +1415,19 @@ static int gfs2_shrink_glock_memory(struct shrinker *shrink,
>  	list_splice(&skipped, &lru_list);
>  	atomic_add(nr_skipped, &lru_count);
>  	spin_unlock(&lru_lock);
> -out:
> +
> +	return freed;
> +}
> +
> +static long gfs2_shrink_glock_count(struct shrinker *shrink,
> +				    struct shrink_control *sc)
> +{
>  	return (atomic_read(&lru_count) / 100) * sysctl_vfs_cache_pressure;
>  }
>  
>  static struct shrinker glock_shrinker = {
> -	.shrink = gfs2_shrink_glock_memory,
> +	.scan_objects = gfs2_shrink_glock_scan,
> +	.count_objects = gfs2_shrink_glock_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 8ea7747..2c21986 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -29,7 +29,8 @@
>  #include "dir.h"
>  
>  static struct shrinker qd_shrinker = {
> -	.shrink = gfs2_shrink_qd_memory,
> +	.scan_objects = gfs2_shrink_qd_scan,
> +	.count_objects = gfs2_shrink_qd_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
> index 42e8d23..5a5f76c 100644
> --- a/fs/gfs2/quota.c
> +++ b/fs/gfs2/quota.c
> @@ -78,20 +78,17 @@ static LIST_HEAD(qd_lru_list);
>  static atomic_t qd_lru_count = ATOMIC_INIT(0);
>  static DEFINE_SPINLOCK(qd_lru_lock);
>  
> -int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
> +long gfs2_shrink_qd_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct gfs2_quota_data *qd;
>  	struct gfs2_sbd *sdp;
> -	int nr_to_scan = sc->nr_to_scan;
> -
> -	if (nr_to_scan == 0)
> -		goto out;
> +	int freed = 0;
>  
>  	if (!(sc->gfp_mask & __GFP_FS))
>  		return -1;
>  
>  	spin_lock(&qd_lru_lock);
> -	while (nr_to_scan && !list_empty(&qd_lru_list)) {
> +	while (freed < sc->nr_to_scan && !list_empty(&qd_lru_list)) {
>  		qd = list_entry(qd_lru_list.next,
>  				struct gfs2_quota_data, qd_reclaim);
>  		sdp = qd->qd_gl->gl_sbd;
> @@ -112,12 +109,16 @@ int gfs2_shrink_qd_memory(struct shrinker *shrink, struct shrink_control *sc)
>  		spin_unlock(&qd_lru_lock);
>  		kmem_cache_free(gfs2_quotad_cachep, qd);
>  		spin_lock(&qd_lru_lock);
> -		nr_to_scan--;
> +		freed++;
>  	}
>  	spin_unlock(&qd_lru_lock);
>  
> -out:
> -	return (atomic_read(&qd_lru_count) * sysctl_vfs_cache_pressure) / 100;
> +	return freed;
> +}
> +
> +long gfs2_shrink_qd_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	return (atomic_read(&qd_lru_count) / 100) * sysctl_vfs_cache_pressure;
>  }
>  
>  static u64 qd2offset(struct gfs2_quota_data *qd)
> diff --git a/fs/gfs2/quota.h b/fs/gfs2/quota.h
> index 90bf1c3..c40fe6d 100644
> --- a/fs/gfs2/quota.h
> +++ b/fs/gfs2/quota.h
> @@ -52,7 +52,9 @@ static inline int gfs2_quota_lock_check(struct gfs2_inode *ip)
>  	return ret;
>  }
>  
> -extern int gfs2_shrink_qd_memory(struct shrinker *shrink,
> +extern long gfs2_shrink_qd_scan(struct shrinker *shrink,
> +				struct shrink_control *sc);
> +extern long gfs2_shrink_qd_count(struct shrinker *shrink,
>  				 struct shrink_control *sc);
>  extern const struct quotactl_ops gfs2_quotactl_ops;
>  
> diff --git a/fs/inode.c b/fs/inode.c
> index 848808f..fee5d9a 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -613,10 +613,11 @@ static int can_unuse(struct inode *inode)
>   * LRU does not have strict ordering. Hence we don't want to reclaim inodes
>   * with this flag set because they are the inodes that are out of order.
>   */
> -void prune_icache_sb(struct super_block *sb, int nr_to_scan)
> +long prune_icache_sb(struct super_block *sb, long nr_to_scan)
>  {
>  	LIST_HEAD(freeable);
> -	int nr_scanned;
> +	long nr_scanned;
> +	long freed = 0;
>  	unsigned long reap = 0;
>  
>  	spin_lock(&sb->s_inode_lru_lock);
> @@ -686,6 +687,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
>  		list_move(&inode->i_lru, &freeable);
>  		sb->s_nr_inodes_unused--;
>  		this_cpu_dec(nr_unused);
> +		freed++;
>  	}
>  	if (current_is_kswapd())
>  		__count_vm_events(KSWAPD_INODESTEAL, reap);
> @@ -694,6 +696,7 @@ void prune_icache_sb(struct super_block *sb, int nr_to_scan)
>  	spin_unlock(&sb->s_inode_lru_lock);
>  
>  	dispose_list(&freeable);
> +	return freed;
>  }
>  
>  static void __wait_on_freeing_inode(struct inode *inode);
> diff --git a/fs/internal.h b/fs/internal.h
> index fe327c2..2662ffa 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -127,6 +127,8 @@ extern long do_handle_open(int mountdirfd,
>   * inode.c
>   */
>  extern spinlock_t inode_sb_list_lock;
> +extern long prune_icache_sb(struct super_block *sb, long nr_to_scan);
> +
>  
>  /*
>   * fs-writeback.c
> @@ -141,3 +143,4 @@ extern int invalidate_inodes(struct super_block *, bool);
>   * dcache.c
>   */
>  extern struct dentry *__d_alloc(struct super_block *, const struct qstr *);
> +extern long prune_dcache_sb(struct super_block *sb, long nr_to_scan);
> diff --git a/fs/mbcache.c b/fs/mbcache.c
> index 8c32ef3..aa3a19a 100644
> --- a/fs/mbcache.c
> +++ b/fs/mbcache.c
> @@ -90,11 +90,14 @@ static DEFINE_SPINLOCK(mb_cache_spinlock);
>   * What the mbcache registers as to get shrunk dynamically.
>   */
>  
> -static int mb_cache_shrink_fn(struct shrinker *shrink,
> -			      struct shrink_control *sc);
> +static long mb_cache_shrink_scan(struct shrinker *shrink,
> +				 struct shrink_control *sc);
> +static long mb_cache_shrink_count(struct shrinker *shrink,
> +				  struct shrink_control *sc);
>  
>  static struct shrinker mb_cache_shrinker = {
> -	.shrink = mb_cache_shrink_fn,
> +	.scan_objects = mb_cache_shrink_scan,
> +	.count_objects = mb_cache_shrink_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> @@ -161,13 +164,12 @@ forget:
>   *
>   * Returns the number of objects which are present in the cache.
>   */
> -static int
> -mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
> +static long
> +mb_cache_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	LIST_HEAD(free_list);
> -	struct mb_cache *cache;
>  	struct mb_cache_entry *entry, *tmp;
> -	int count = 0;
> +	int freed = 0;
>  	int nr_to_scan = sc->nr_to_scan;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
> @@ -180,18 +182,27 @@ mb_cache_shrink_fn(struct shrinker *shrink, struct shrink_control *sc)
>  		list_move_tail(&ce->e_lru_list, &free_list);
>  		__mb_cache_entry_unhash(ce);
>  	}
> -	list_for_each_entry(cache, &mb_cache_list, c_cache_list) {
> -		mb_debug("cache %s (%d)", cache->c_name,
> -			  atomic_read(&cache->c_entry_count));
> -		count += atomic_read(&cache->c_entry_count);
> -	}
>  	spin_unlock(&mb_cache_spinlock);
>  	list_for_each_entry_safe(entry, tmp, &free_list, e_lru_list) {
>  		__mb_cache_entry_forget(entry, gfp_mask);
> +		freed++;
>  	}
> -	return (count / 100) * sysctl_vfs_cache_pressure;
> +	return freed;
>  }
>  
> +static long
> +mb_cache_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct mb_cache *cache;
> +	long count = 0;
> +
> +	spin_lock(&mb_cache_spinlock);
> +	list_for_each_entry(cache, &mb_cache_list, c_cache_list)
> +		count += atomic_read(&cache->c_entry_count);
> +
> +	spin_unlock(&mb_cache_spinlock);
> +	return (count / 100) * sysctl_vfs_cache_pressure;
> +}
>  
>  /*
>   * mb_cache_create()  create a new cache
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index b238d95..a5aefb2 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -2057,17 +2057,18 @@ static void nfs_access_free_list(struct list_head *head)
>  	}
>  }
>  
> -int nfs_access_cache_shrinker(struct shrinker *shrink,
> -			      struct shrink_control *sc)
> +long nfs_access_cache_scan(struct shrinker *shrink,
> +			   struct shrink_control *sc)
>  {
>  	LIST_HEAD(head);
>  	struct nfs_inode *nfsi, *next;
>  	struct nfs_access_entry *cache;
>  	int nr_to_scan = sc->nr_to_scan;
> +	int freed = 0;
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
>  	if ((gfp_mask & GFP_KERNEL) != GFP_KERNEL)
> -		return (nr_to_scan == 0) ? 0 : -1;
> +		return -1;
>  
>  	spin_lock(&nfs_access_lru_lock);
>  	list_for_each_entry_safe(nfsi, next, &nfs_access_lru_list, access_cache_inode_lru) {
> @@ -2079,6 +2080,7 @@ int nfs_access_cache_shrinker(struct shrinker *shrink,
>  		spin_lock(&inode->i_lock);
>  		if (list_empty(&nfsi->access_cache_entry_lru))
>  			goto remove_lru_entry;
> +		freed++;
>  		cache = list_entry(nfsi->access_cache_entry_lru.next,
>  				struct nfs_access_entry, lru);
>  		list_move(&cache->lru, &head);
> @@ -2097,7 +2099,14 @@ remove_lru_entry:
>  	}
>  	spin_unlock(&nfs_access_lru_lock);
>  	nfs_access_free_list(&head);
> -	return (atomic_long_read(&nfs_access_nr_entries) / 100) * sysctl_vfs_cache_pressure;
> +	return freed;
> +}
> +
> +long nfs_access_cache_count(struct shrinker *shrink,
> +			    struct shrink_control *sc)
> +{
> +	return (atomic_long_read(&nfs_access_nr_entries) / 100) *
> +						sysctl_vfs_cache_pressure;
>  }
>  
>  static void __nfs_access_zap_cache(struct nfs_inode *nfsi, struct list_head *head)
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index ab12913..9c65e1f 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -244,8 +244,10 @@ extern int nfs_init_client(struct nfs_client *clp,
>  			   int noresvport);
>  
>  /* dir.c */
> -extern int nfs_access_cache_shrinker(struct shrinker *shrink,
> -					struct shrink_control *sc);
> +extern long nfs_access_cache_scan(struct shrinker *shrink,
> +				  struct shrink_control *sc);
> +extern long nfs_access_cache_count(struct shrinker *shrink,
> +				   struct shrink_control *sc);
>  
>  /* inode.c */
>  extern struct workqueue_struct *nfsiod_workqueue;
> diff --git a/fs/nfs/super.c b/fs/nfs/super.c
> index b961cea..e088c03 100644
> --- a/fs/nfs/super.c
> +++ b/fs/nfs/super.c
> @@ -380,7 +380,8 @@ static const struct super_operations nfs4_sops = {
>  #endif
>  
>  static struct shrinker acl_shrinker = {
> -	.shrink		= nfs_access_cache_shrinker,
> +	.scan_objects	= nfs_access_cache_scan,
> +	.count_objects	= nfs_access_cache_count,
>  	.seeks		= DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 5b572c8..c8724d2 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -669,45 +669,42 @@ int dquot_quota_sync(struct super_block *sb, int type, int wait)
>  }
>  EXPORT_SYMBOL(dquot_quota_sync);
>  
> -/* Free unused dquots from cache */
> -static void prune_dqcache(int count)
> +/*
> + * This is called from kswapd when we think we need some
> + * more memory
> + */
> +static long shrink_dqcache_scan(struct shrinker *shrink,
> +				 struct shrink_control *sc)
>  {
>  	struct list_head *head;
>  	struct dquot *dquot;
> +	int freed = 0;
>  
> +	spin_lock(&dq_list_lock);
>  	head = free_dquots.prev;
> -	while (head != &free_dquots && count) {
> +	while (head != &free_dquots && freed < sc->nr_to_scan) {
>  		dquot = list_entry(head, struct dquot, dq_free);
>  		remove_dquot_hash(dquot);
>  		remove_free_dquot(dquot);
>  		remove_inuse(dquot);
>  		do_destroy_dquot(dquot);
> -		count--;
> +		freed++;
>  		head = free_dquots.prev;
>  	}
> +	spin_unlock(&dq_list_lock);
> +
> +	return freed;
>  }
>  
> -/*
> - * This is called from kswapd when we think we need some
> - * more memory
> - */
> -static int shrink_dqcache_memory(struct shrinker *shrink,
> +static long shrink_dqcache_count(struct shrinker *shrink,
>  				 struct shrink_control *sc)
>  {
> -	int nr = sc->nr_to_scan;
> -
> -	if (nr) {
> -		spin_lock(&dq_list_lock);
> -		prune_dqcache(nr);
> -		spin_unlock(&dq_list_lock);
> -	}
> -	return ((unsigned)
> -		percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
> -		/100) * sysctl_vfs_cache_pressure;
> +	return (percpu_counter_read_positive(&dqstats.counter[DQST_FREE_DQUOTS])
> +		/ 100) * sysctl_vfs_cache_pressure;
>  }
> -
>  static struct shrinker dqcache_shrinker = {
> -	.shrink = shrink_dqcache_memory,
> +	.scan_objects = shrink_dqcache_scan,
> +	.count_objects = shrink_dqcache_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/super.c b/fs/super.c
> index 6a72693..074abbe 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -45,11 +45,14 @@ DEFINE_SPINLOCK(sb_lock);
>   * shrinker path and that leads to deadlock on the shrinker_rwsem. Hence we
>   * take a passive reference to the superblock to avoid this from occurring.
>   */
> -static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
> +static long super_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct super_block *sb;
> -	int	fs_objects = 0;
> -	int	total_objects;
> +	long	fs_objects = 0;
> +	long	total_objects;
> +	long	freed = 0;
> +	long	dentries;
> +	long	inodes;
>  
>  	sb = container_of(shrink, struct super_block, s_shrink);
>  
> @@ -57,7 +60,7 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
>  	 * Deadlock avoidance.  We may hold various FS locks, and we don't want
>  	 * to recurse into the FS that called us in clear_inode() and friends..
>  	 */
> -	if (sc->nr_to_scan && !(sc->gfp_mask & __GFP_FS))
> +	if (!(sc->gfp_mask & __GFP_FS))
>  		return -1;
>  
>  	if (!grab_super_passive(sb))
> @@ -69,33 +72,42 @@ static int prune_super(struct shrinker *shrink, struct shrink_control *sc)
>  	total_objects = sb->s_nr_dentry_unused +
>  			sb->s_nr_inodes_unused + fs_objects + 1;
>  
> -	if (sc->nr_to_scan) {
> -		int	dentries;
> -		int	inodes;
> -
> -		/* proportion the scan between the caches */
> -		dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) /
> -							total_objects;
> -		inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) /
> -							total_objects;
> -		if (fs_objects)
> -			fs_objects = (sc->nr_to_scan * fs_objects) /
> -							total_objects;
> -		/*
> -		 * prune the dcache first as the icache is pinned by it, then
> -		 * prune the icache, followed by the filesystem specific caches
> -		 */
> -		prune_dcache_sb(sb, dentries);
> -		prune_icache_sb(sb, inodes);
> +	/* proportion the scan between the caches */
> +	dentries = (sc->nr_to_scan * sb->s_nr_dentry_unused) / total_objects;
> +	inodes = (sc->nr_to_scan * sb->s_nr_inodes_unused) / total_objects;
>  
> -		if (fs_objects && sb->s_op->free_cached_objects) {
> -			sb->s_op->free_cached_objects(sb, fs_objects);
> -			fs_objects = sb->s_op->nr_cached_objects(sb);
> -		}
> -		total_objects = sb->s_nr_dentry_unused +
> -				sb->s_nr_inodes_unused + fs_objects;
> +	/*
> +	 * prune the dcache first as the icache is pinned by it, then
> +	 * prune the icache, followed by the filesystem specific caches
> +	 */
> +	freed = prune_dcache_sb(sb, dentries);
> +	freed += prune_icache_sb(sb, inodes);
> +
> +	if (fs_objects) {
> +		fs_objects = (sc->nr_to_scan * fs_objects) / total_objects;
> +		freed += sb->s_op->free_cached_objects(sb, fs_objects);
>  	}
>  
> +	drop_super(sb);
> +	return freed;
> +}
> +
> +static long super_cache_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct super_block *sb;
> +	long	total_objects = 0;
> +
> +	sb = container_of(shrink, struct super_block, s_shrink);
> +
> +	if (!grab_super_passive(sb))
> +		return -1;
> +
> +	if (sb->s_op && sb->s_op->nr_cached_objects)
> +		total_objects = sb->s_op->nr_cached_objects(sb);
> +
> +	total_objects += sb->s_nr_dentry_unused;
> +	total_objects += sb->s_nr_inodes_unused;
> +
>  	total_objects = (total_objects / 100) * sysctl_vfs_cache_pressure;
>  	drop_super(sb);
>  	return total_objects;
> @@ -182,7 +194,8 @@ static struct super_block *alloc_super(struct file_system_type *type)
>  		s->cleancache_poolid = -1;
>  
>  		s->s_shrink.seeks = DEFAULT_SEEKS;
> -		s->s_shrink.shrink = prune_super;
> +		s->s_shrink.scan_objects = super_cache_scan;
> +		s->s_shrink.count_objects = super_cache_count;
>  		s->s_shrink.batch = 1024;
>  	}
>  out:
> diff --git a/fs/ubifs/shrinker.c b/fs/ubifs/shrinker.c
> index 9e1d056..78ca7b7 100644
> --- a/fs/ubifs/shrinker.c
> +++ b/fs/ubifs/shrinker.c
> @@ -277,19 +277,12 @@ static int kick_a_thread(void)
>  	return 0;
>  }
>  
> -int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc)
> +long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	int nr = sc->nr_to_scan;
>  	int freed, contention = 0;
>  	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
>  
> -	if (nr == 0)
> -		/*
> -		 * Due to the way UBIFS updates the clean znode counter it may
> -		 * temporarily be negative.
> -		 */
> -		return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
> -
>  	if (!clean_zn_cnt) {
>  		/*
>  		 * No clean znodes, nothing to reap. All we can do in this case
> @@ -323,3 +316,13 @@ out:
>  	dbg_tnc("%d znodes were freed, requested %d", freed, nr);
>  	return freed;
>  }
> +
> +long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	long clean_zn_cnt = atomic_long_read(&ubifs_clean_zn_cnt);
> +	/*
> +	 * Due to the way UBIFS updates the clean znode counter it may
> +	 * temporarily be negative.
> +	 */
> +	return clean_zn_cnt >= 0 ? clean_zn_cnt : 1;
> +}
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index 91903f6..3d3f3e9 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -49,7 +49,8 @@ struct kmem_cache *ubifs_inode_slab;
>  
>  /* UBIFS TNC shrinker description */
>  static struct shrinker ubifs_shrinker_info = {
> -	.shrink = ubifs_shrinker,
> +	.scan_objects = ubifs_shrinker_scan,
> +	.count_objects = ubifs_shrinker_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
> index 27f2255..2b8f48c 100644
> --- a/fs/ubifs/ubifs.h
> +++ b/fs/ubifs/ubifs.h
> @@ -1625,7 +1625,8 @@ int ubifs_tnc_start_commit(struct ubifs_info *c, struct ubifs_zbranch *zroot);
>  int ubifs_tnc_end_commit(struct ubifs_info *c);
>  
>  /* shrinker.c */
> -int ubifs_shrinker(struct shrinker *shrink, struct shrink_control *sc);
> +long ubifs_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc);
> +long ubifs_shrinker_count(struct shrinker *shrink, struct shrink_control *sc);
>  
>  /* commit.c */
>  int ubifs_bg_thread(void *info);
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 7a026cb..b2eea9e 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -1456,8 +1456,8 @@ restart:
>  	spin_unlock(&btp->bt_lru_lock);
>  }
>  
> -int
> -xfs_buftarg_shrink(
> +static long
> +xfs_buftarg_shrink_scan(
>  	struct shrinker		*shrink,
>  	struct shrink_control	*sc)
>  {
> @@ -1465,6 +1465,7 @@ xfs_buftarg_shrink(
>  					struct xfs_buftarg, bt_shrinker);
>  	struct xfs_buf		*bp;
>  	int nr_to_scan = sc->nr_to_scan;
> +	int freed = 0;
>  	LIST_HEAD(dispose);
>  
>  	if (!nr_to_scan)
> @@ -1493,6 +1494,7 @@ xfs_buftarg_shrink(
>  		 */
>  		list_move(&bp->b_lru, &dispose);
>  		btp->bt_lru_nr--;
> +		freed++;
>  	}
>  	spin_unlock(&btp->bt_lru_lock);
>  
> @@ -1502,6 +1504,16 @@ xfs_buftarg_shrink(
>  		xfs_buf_rele(bp);
>  	}
>  
> +	return freed;
> +}
> +
> +static long
> +xfs_buftarg_shrink_count(
> +	struct shrinker		*shrink,
> +	struct shrink_control	*sc)
> +{
> +	struct xfs_buftarg	*btp = container_of(shrink,
> +					struct xfs_buftarg, bt_shrinker);
>  	return btp->bt_lru_nr;
>  }
>  
> @@ -1602,7 +1614,8 @@ xfs_alloc_buftarg(
>  		goto error;
>  	if (xfs_alloc_delwrite_queue(btp, fsname))
>  		goto error;
> -	btp->bt_shrinker.shrink = xfs_buftarg_shrink;
> +	btp->bt_shrinker.scan_objects = xfs_buftarg_shrink_scan;
> +	btp->bt_shrinker.count_objects = xfs_buftarg_shrink_count;
>  	btp->bt_shrinker.seeks = DEFAULT_SEEKS;
>  	register_shrinker(&btp->bt_shrinker);
>  	return btp;
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 9a0aa76..19863a8 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -60,10 +60,12 @@ STATIC void	xfs_qm_list_destroy(xfs_dqlist_t *);
>  
>  STATIC int	xfs_qm_init_quotainos(xfs_mount_t *);
>  STATIC int	xfs_qm_init_quotainfo(xfs_mount_t *);
> -STATIC int	xfs_qm_shake(struct shrinker *, struct shrink_control *);
> +STATIC long	xfs_qm_shake_scan(struct shrinker *, struct shrink_control *);
> +STATIC long	xfs_qm_shake_count(struct shrinker *, struct shrink_control *);
>  
>  static struct shrinker xfs_qm_shaker = {
> -	.shrink = xfs_qm_shake,
> +	.scan_objects = xfs_qm_shake_scan,
> +	.count_objects = xfs_qm_shake_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
> @@ -1963,9 +1965,8 @@ xfs_qm_shake_freelist(
>  /*
>   * The kmem_shake interface is invoked when memory is running low.
>   */
> -/* ARGSUSED */
> -STATIC int
> -xfs_qm_shake(
> +STATIC long
> +xfs_qm_shake_scan(
>  	struct shrinker	*shrink,
>  	struct shrink_control *sc)
>  {
> @@ -1973,9 +1974,9 @@ xfs_qm_shake(
>  	gfp_t gfp_mask = sc->gfp_mask;
>  
>  	if (!kmem_shake_allow(gfp_mask))
> -		return 0;
> +		return -1;
>  	if (!xfs_Gqm)
> -		return 0;
> +		return -1;
>  
>  	nfree = xfs_Gqm->qm_dqfrlist_cnt; /* free dquots */
>  	/* incore dquots in all f/s's */
> @@ -1992,6 +1993,13 @@ xfs_qm_shake(
>  	return xfs_qm_shake_freelist(MAX(nfree, n));
>  }
>  
> +STATIC long
> +xfs_qm_shake_count(
> +	struct shrinker	*shrink,
> +	struct shrink_control *sc)
> +{
> +	return xfs_Gqm ? xfs_Gqm->qm_dqfrlist_cnt : -1;
> +}
>  
>  /*------------------------------------------------------------------*/
>  
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index c94ec22..dff4b67 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1473,19 +1473,19 @@ xfs_fs_mount(
>  	return mount_bdev(fs_type, flags, dev_name, data, xfs_fs_fill_super);
>  }
>  
> -static int
> +static long
>  xfs_fs_nr_cached_objects(
>  	struct super_block	*sb)
>  {
>  	return xfs_reclaim_inodes_count(XFS_M(sb));
>  }
>  
> -static void
> +static long
>  xfs_fs_free_cached_objects(
>  	struct super_block	*sb,
> -	int			nr_to_scan)
> +	long			nr_to_scan)
>  {
> -	xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
> +	return xfs_reclaim_inodes_nr(XFS_M(sb), nr_to_scan);
>  }
>  
>  static const struct super_operations xfs_super_operations = {
> diff --git a/fs/xfs/xfs_sync.c b/fs/xfs/xfs_sync.c
> index 4604f90..5b60a3a 100644
> --- a/fs/xfs/xfs_sync.c
> +++ b/fs/xfs/xfs_sync.c
> @@ -896,7 +896,7 @@ int
>  xfs_reclaim_inodes_ag(
>  	struct xfs_mount	*mp,
>  	int			flags,
> -	int			*nr_to_scan)
> +	long			*nr_to_scan)
>  {
>  	struct xfs_perag	*pag;
>  	int			error = 0;
> @@ -1017,7 +1017,7 @@ xfs_reclaim_inodes(
>  	xfs_mount_t	*mp,
>  	int		mode)
>  {
> -	int		nr_to_scan = INT_MAX;
> +	long		nr_to_scan = LONG_MAX;
>  
>  	return xfs_reclaim_inodes_ag(mp, mode, &nr_to_scan);
>  }
> @@ -1031,29 +1031,32 @@ xfs_reclaim_inodes(
>   * them to be cleaned, which we hope will not be very long due to the
>   * background walker having already kicked the IO off on those dirty inodes.
>   */
> -void
> +long
>  xfs_reclaim_inodes_nr(
>  	struct xfs_mount	*mp,
> -	int			nr_to_scan)
> +	long			nr_to_scan)
>  {
> +	long nr = nr_to_scan;
> +
>  	/* kick background reclaimer and push the AIL */
>  	xfs_syncd_queue_reclaim(mp);
>  	xfs_ail_push_all(mp->m_ail);
>  
> -	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr_to_scan);
> +	xfs_reclaim_inodes_ag(mp, SYNC_TRYLOCK | SYNC_WAIT, &nr);
> +	return nr_to_scan - nr;
>  }
>  
>  /*
>   * Return the number of reclaimable inodes in the filesystem for
>   * the shrinker to determine how much to reclaim.
>   */
> -int
> +long
>  xfs_reclaim_inodes_count(
>  	struct xfs_mount	*mp)
>  {
>  	struct xfs_perag	*pag;
>  	xfs_agnumber_t		ag = 0;
> -	int			reclaimable = 0;
> +	long			reclaimable = 0;
>  
>  	while ((pag = xfs_perag_get_tag(mp, ag, XFS_ICI_RECLAIM_TAG))) {
>  		ag = pag->pag_agno + 1;
> diff --git a/fs/xfs/xfs_sync.h b/fs/xfs/xfs_sync.h
> index 941202e..82e1b1c 100644
> --- a/fs/xfs/xfs_sync.h
> +++ b/fs/xfs/xfs_sync.h
> @@ -35,8 +35,8 @@ void xfs_quiesce_attr(struct xfs_mount *mp);
>  void xfs_flush_inodes(struct xfs_inode *ip);
>  
>  int xfs_reclaim_inodes(struct xfs_mount *mp, int mode);
> -int xfs_reclaim_inodes_count(struct xfs_mount *mp);
> -void xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
> +long xfs_reclaim_inodes_count(struct xfs_mount *mp);
> +long xfs_reclaim_inodes_nr(struct xfs_mount *mp, long nr_to_scan);
>  
>  void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
>  void __xfs_inode_set_reclaim_tag(struct xfs_perag *pag, struct xfs_inode *ip);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 14be4d8..958c025 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1465,10 +1465,6 @@ struct super_block {
>  	struct shrinker s_shrink;	/* per-sb shrinker handle */
>  };
>  
> -/* superblock cache pruning functions */
> -extern void prune_icache_sb(struct super_block *sb, int nr_to_scan);
> -extern void prune_dcache_sb(struct super_block *sb, int nr_to_scan);
> -
>  extern struct timespec current_fs_time(struct super_block *sb);
>  
>  /*
> @@ -1662,8 +1658,8 @@ struct super_operations {
>  	ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
>  #endif
>  	int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
> -	int (*nr_cached_objects)(struct super_block *);
> -	void (*free_cached_objects)(struct super_block *, int);
> +	long (*nr_cached_objects)(struct super_block *);
> +	long (*free_cached_objects)(struct super_block *, long);
>  };
>  
>  /*
> diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
> index 36851f7..80308ea 100644
> --- a/include/trace/events/vmscan.h
> +++ b/include/trace/events/vmscan.h
> @@ -190,7 +190,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>  
>  	TP_STRUCT__entry(
>  		__field(struct shrinker *, shr)
> -		__field(void *, shrink)
> +		__field(void *, scan)
>  		__field(long, nr_objects_to_shrink)
>  		__field(gfp_t, gfp_flags)
>  		__field(unsigned long, pgs_scanned)
> @@ -202,7 +202,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>  
>  	TP_fast_assign(
>  		__entry->shr = shr;
> -		__entry->shrink = shr->shrink;
> +		__entry->scan = shr->scan_objects;
>  		__entry->nr_objects_to_shrink = nr_objects_to_shrink;
>  		__entry->gfp_flags = sc->gfp_mask;
>  		__entry->pgs_scanned = pgs_scanned;
> @@ -213,7 +213,7 @@ TRACE_EVENT(mm_shrink_slab_start,
>  	),
>  
>  	TP_printk("%pF %p: objects to shrink %ld gfp_flags %s pgs_scanned %ld lru_pgs %ld cache items %ld delta %lld total_scan %ld",
> -		__entry->shrink,
> +		__entry->scan,
>  		__entry->shr,
>  		__entry->nr_objects_to_shrink,
>  		show_gfp_flags(__entry->gfp_flags),
> @@ -232,7 +232,7 @@ TRACE_EVENT(mm_shrink_slab_end,
>  
>  	TP_STRUCT__entry(
>  		__field(struct shrinker *, shr)
> -		__field(void *, shrink)
> +		__field(void *, scan)
>  		__field(long, unused_scan)
>  		__field(long, new_scan)
>  		__field(int, retval)
> @@ -241,7 +241,7 @@ TRACE_EVENT(mm_shrink_slab_end,
>  
>  	TP_fast_assign(
>  		__entry->shr = shr;
> -		__entry->shrink = shr->shrink;
> +		__entry->scan = shr->scan_objects;
>  		__entry->unused_scan = unused_scan_cnt;
>  		__entry->new_scan = new_scan_cnt;
>  		__entry->retval = shrinker_retval;
> @@ -249,7 +249,7 @@ TRACE_EVENT(mm_shrink_slab_end,
>  	),
>  
>  	TP_printk("%pF %p: unused scan count %ld new scan count %ld total_scan %ld last shrinker return val %d",
> -		__entry->shrink,
> +		__entry->scan,
>  		__entry->shr,
>  		__entry->unused_scan,
>  		__entry->new_scan,
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7ef6912..e32ce2d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -202,14 +202,6 @@ void unregister_shrinker(struct shrinker *shrinker)
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
>  
> -static inline int do_shrinker_shrink(struct shrinker *shrinker,
> -				     struct shrink_control *sc,
> -				     unsigned long nr_to_scan)
> -{
> -	sc->nr_to_scan = nr_to_scan;
> -	return (*shrinker->shrink)(shrinker, sc);
> -}
> -
>  #define SHRINK_BATCH 128
>  /*
>   * Call the shrink functions to age shrinkable caches
> @@ -230,27 +222,26 @@ static inline int do_shrinker_shrink(struct shrinker *shrinker,
>   *
>   * Returns the number of slab objects which we shrunk.
>   */
> -unsigned long shrink_slab(struct shrink_control *shrink,
> +unsigned long shrink_slab(struct shrink_control *sc,
>  			  unsigned long nr_pages_scanned,
>  			  unsigned long lru_pages)
>  {
>  	struct shrinker *shrinker;
> -	unsigned long ret = 0;
> +	unsigned long freed = 0;
>  
>  	if (nr_pages_scanned == 0)
>  		nr_pages_scanned = SWAP_CLUSTER_MAX;
>  
>  	if (!down_read_trylock(&shrinker_rwsem)) {
>  		/* Assume we'll be able to shrink next time */
> -		ret = 1;
> +		freed = 1;
>  		goto out;
>  	}
>  
>  	list_for_each_entry(shrinker, &shrinker_list, list) {
> -		unsigned long long delta;
> -		unsigned long total_scan;
> -		unsigned long max_pass;
> -		int shrink_ret = 0;
> +		long long delta;
> +		long total_scan;
> +		long max_pass;
>  		long nr;
>  		long new_nr;
>  		long batch_size = shrinker->batch ? shrinker->batch
> @@ -266,7 +257,9 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  		} while (cmpxchg(&shrinker->nr, nr, 0) != nr);
>  
>  		total_scan = nr;
> -		max_pass = do_shrinker_shrink(shrinker, shrink, 0);
> +		max_pass = shrinker->count_objects(shrinker, sc);
> +		WARN_ON_ONCE(max_pass < 0);
> +
>  		delta = (4 * nr_pages_scanned) / shrinker->seeks;
>  		delta *= max_pass;
>  		do_div(delta, lru_pages + 1);
> @@ -274,7 +267,7 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  		if (total_scan < 0) {
>  			printk(KERN_ERR "shrink_slab: %pF negative objects to "
>  			       "delete nr=%ld\n",
> -			       shrinker->shrink, total_scan);
> +			       shrinker->scan_objects, total_scan);
>  			total_scan = max_pass;
>  		}
>  
> @@ -301,20 +294,19 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  		if (total_scan > max_pass * 2)
>  			total_scan = max_pass * 2;
>  
> -		trace_mm_shrink_slab_start(shrinker, shrink, nr,
> +		trace_mm_shrink_slab_start(shrinker, sc, nr,
>  					nr_pages_scanned, lru_pages,
>  					max_pass, delta, total_scan);
>  
>  		while (total_scan >= batch_size) {
> -			int nr_before;
> +			long ret;
> +
> +			sc->nr_to_scan = batch_size;
> +			ret = shrinker->scan_objects(shrinker, sc);
>  
> -			nr_before = do_shrinker_shrink(shrinker, shrink, 0);
> -			shrink_ret = do_shrinker_shrink(shrinker, shrink,
> -							batch_size);
> -			if (shrink_ret == -1)
> +			if (ret == -1)
>  				break;
> -			if (shrink_ret < nr_before)
> -				ret += nr_before - shrink_ret;
> +			freed += ret;
>  			count_vm_events(SLABS_SCANNED, batch_size);
>  			total_scan -= batch_size;
>  
> @@ -333,12 +325,12 @@ unsigned long shrink_slab(struct shrink_control *shrink,
>  				break;
>  		} while (cmpxchg(&shrinker->nr, nr, new_nr) != nr);
>  
> -		trace_mm_shrink_slab_end(shrinker, shrink_ret, nr, new_nr);
> +		trace_mm_shrink_slab_end(shrinker, freed, nr, new_nr);
>  	}
>  	up_read(&shrinker_rwsem);
>  out:
>  	cond_resched();
> -	return ret;
> +	return freed;
>  }
>  
>  static void set_reclaim_mode(int priority, struct scan_control *sc,
> diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
> index 727e506..f5955c3 100644
> --- a/net/sunrpc/auth.c
> +++ b/net/sunrpc/auth.c
> @@ -292,6 +292,7 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
>  	spinlock_t *cache_lock;
>  	struct rpc_cred *cred, *next;
>  	unsigned long expired = jiffies - RPC_AUTH_EXPIRY_MORATORIUM;
> +	int freed = 0;
>  
>  	list_for_each_entry_safe(cred, next, &cred_unused, cr_lru) {
>  
> @@ -303,10 +304,10 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
>  		 */
>  		if (time_in_range(cred->cr_expire, expired, jiffies) &&
>  		    test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
> -			return 0;
> +			break;
>  
> -		list_del_init(&cred->cr_lru);
>  		number_cred_unused--;
> +		list_del_init(&cred->cr_lru);
>  		if (atomic_read(&cred->cr_count) != 0)
>  			continue;
>  
> @@ -316,17 +317,18 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
>  			get_rpccred(cred);
>  			list_add_tail(&cred->cr_lru, free);
>  			rpcauth_unhash_cred_locked(cred);
> +			freed++;
>  		}
>  		spin_unlock(cache_lock);
>  	}
> -	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
> +	return freed;
>  }
>  
>  /*
>   * Run memory cache shrinker.
>   */
> -static int
> -rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
> +static long
> +rpcauth_cache_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	LIST_HEAD(free);
>  	int res;
> @@ -344,6 +346,12 @@ rpcauth_cache_shrinker(struct shrinker *shrink, struct shrink_control *sc)
>  	return res;
>  }
>  
> +static long
> +rpcauth_cache_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	return (number_cred_unused / 100) * sysctl_vfs_cache_pressure;
> +}
> +
>  /*
>   * Look up a process' credentials in the authentication cache
>   */
> @@ -658,7 +666,8 @@ rpcauth_uptodatecred(struct rpc_task *task)
>  }
>  
>  static struct shrinker rpc_cred_shrinker = {
> -	.shrink = rpcauth_cache_shrinker,
> +	.scan_objects = rpcauth_cache_scan,
> +	.count_objects = rpcauth_cache_count,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  
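
As a worked example of the proportioning in super_cache_scan() above: with
sc->nr_to_scan = 128, 600 unused dentries, 300 unused inodes and 100
fs-specific objects, total_objects comes to 1001, giving dentries = 76,
inodes = 38 and fs_objects = 12. The split always sums to at most
nr_to_scan, and the +1 in total_objects guards against a divide by zero
when all three counts are zero.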



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/13] dcache: use a dispose list in select_parent
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-23  9:37     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-23  9:37 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

This does indeed look much saner.  And given how much we use
shrink_dcache_parent during normal operation it might even have a
real impact.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 12/13] dcache: remove dentries from LRU before putting on dispose list
  2011-08-23  9:35     ` Christoph Hellwig
@ 2011-08-23  9:57       ` Dave Chinner
  -1 siblings, 0 replies; 74+ messages in thread
From: Dave Chinner @ 2011-08-23  9:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 05:35:20AM -0400, Christoph Hellwig wrote:
> > diff --git a/fs/dcache.c b/fs/dcache.c
> > index b931415..79bf47c 100644
> > --- a/fs/dcache.c
> > +++ b/fs/dcache.c
> > @@ -269,10 +269,10 @@ static void dentry_lru_move_list(struct dentry *dentry, struct list_head *list)
> >  	spin_lock(&dentry->d_sb->s_dentry_lru_lock);
> >  	if (list_empty(&dentry->d_lru)) {
> >  		list_add_tail(&dentry->d_lru, list);
> > -		dentry->d_sb->s_nr_dentry_unused++;
> > -		this_cpu_inc(nr_dentry_unused);
> >  	} else {
> >  		list_move_tail(&dentry->d_lru, list);
> > +		dentry->d_sb->s_nr_dentry_unused--;
> > +		this_cpu_dec(nr_dentry_unused);
> >  	}
> >  	spin_unlock(&dentry->d_sb->s_dentry_lru_lock);
> 
> I suspect at this point it might be more obvious to simply remove
> dentry_lru_move_list.  Just call dentry_lru_del to remove it from the
> lru, and then we can add it to the local dispose list without the need
> of any locking, similar to how it is done for inodes already.

Yeah, that is what is done in the next patch when converting to the
generic LRU list code. I can probably pull this back into this patch
and just remove dentry_lru_move_list() right here.
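
A rough sketch of that combination, using the existing dentry_lru_del()
helper (illustrative only, not the posted patch):

	dentry_lru_del(dentry);		/* off the LRU, accounting fixed up */
	list_add_tail(&dentry->d_lru, &dispose);	/* private list, no locking */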

> 
> >  		if (dentry->d_count) {
> > -			dentry_lru_del(dentry);
> >  			spin_unlock(&dentry->d_lock);
> >  			continue;
> >  		}
> > @@ -789,6 +794,8 @@ relock:
> >  			spin_unlock(&dentry->d_lock);
> >  		} else {
> >  			list_move_tail(&dentry->d_lru, &tmp);
> > +			this_cpu_dec(nr_dentry_unused);
> > +			sb->s_nr_dentry_unused--;
> 
> It might be more obvious to use __dentry_lru_del + an opencoded list_add
> here.

This goes away completely in the next patch, so I don't think it
matters that much...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 08/13] list: add a new LRU list type
  2011-08-23  9:32       ` Dave Chinner
@ 2011-08-23  9:58         ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 74+ messages in thread
From: Konstantin Khlebnikov @ 2011-08-23  9:58 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm

Dave Chinner wrote:
> On Tue, Aug 23, 2011 at 05:20:56AM -0400, Christoph Hellwig wrote:
>> On Tue, Aug 23, 2011 at 06:56:21PM +1000, Dave Chinner wrote:
>>> From: Dave Chinner<dchinner@redhat.com>
>>>
>>> Several subsystems use the same construct for LRU lists - a list
>>> head, a spin lock and an item count. They also use exactly the same
>>> code for adding and removing items from the LRU. Create a generic
>>> type for these LRU lists.
>>>
>>> This is the beginning of generic, node aware LRUs for shrinkers to
>>> work with.
>>
>> Why list_lru vs the more natural sounding lru_list?
>
> because the mmzone.h claimed that namespace:
>
> enum lru_list {
>          LRU_INACTIVE_ANON = LRU_BASE,
>          LRU_ACTIVE_ANON = LRU_BASE + LRU_ACTIVE,
>          LRU_INACTIVE_FILE = LRU_BASE + LRU_FILE,
>          LRU_ACTIVE_FILE = LRU_BASE + LRU_FILE + LRU_ACTIVE,
>          LRU_UNEVICTABLE,
>          NR_LRU_LISTS
> };
>
> and it is widely spewed through the mm code. I didn't really feel
> like having to clean that mess up first....

not so widely:

$ git grep -wc 'enum lru_list'
include/linux/memcontrol.h:5
include/linux/mm_inline.h:7
include/linux/mmzone.h:4
include/linux/pagevec.h:1
include/linux/swap.h:2
mm/memcontrol.c:10
mm/page_alloc.c:1
mm/swap.c:6
mm/vmscan.c:6

maybe it is better to rename it to enum page_lru_list
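
For reference, the generic type under discussion is, in rough sketch form
(reconstructed from the description quoted above of a list head, a spin
lock and an item count; not the exact posted definition):

	struct list_lru {
		spinlock_t		lock;		/* protects list and nr_items */
		struct list_head	list;		/* the LRU itself */
		long			nr_items;	/* objects on the list */
	};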

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 01/13] fs: Use a common define for inode slab caches
  2011-08-23  9:20       ` Dave Chinner
@ 2011-08-24  6:16         ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-24  6:16 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 07:20:41PM +1000, Dave Chinner wrote:
> > Why do we keep the SLAB_HWCACHE_ALIGN flag for some filesystems?
> 
> I didn't touch that one, mainly because I think that there are
> different reasons for wanting cacheline alignment. e.g. a filesystem
> aimed primarily at embedded systms with slow CPUs and little memory
> doesn't want to waste memory on cacheline alignment....

A little grepping shows jffs2 is a counterexample, because it exactly
wants SLAB_HWCACHE_ALIGN to avoid issues with MTD DMA.

I'm fine with deferring this for now, but the state of using
SLAB_HWCACHE_ALIGN or not is just as much of a mess as the rest of the
inode slab flags was.  I'd go as far as calling the whole existence of
most slab flags an utter mess, but that is another fight.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 03/13] dentry: move to per-sb LRU locks
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-24  6:16     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-24  6:16 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

Looks reasonable,

Reviewed-by: Christoph Hellwig <hch@lst.de>


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 04/13] mm: new shrinker API
  2011-08-23  9:23       ` Dave Chinner
@ 2011-08-24  6:17         ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-24  6:17 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, Aug 23, 2011 at 07:23:30PM +1000, Dave Chinner wrote:
> > It's much more than just a single callback these days.
> > 
> > > + * @scan_objects will be made from the current reclaim context.
> > >   */
> > >  struct shrinker {
> > >  	int (*shrink)(struct shrinker *, struct shrink_control *sc);
> > > +	long (*count_objects)(struct shrinker *, struct shrink_control *sc);
> > > +	long (*scan_objects)(struct shrinker *, struct shrink_control *sc);
> > 
> > Is shrink_object really such a good name for this method?
> 
> Apart from the fact it is called "scan_objects", I'm open to more
> appropriate names. I called it "scan_objects" because of the fact we
> are asking to scan (rather than free) a specific number of objects on
> the LRU, and it matches with the "sc->nr_to_scan" control field.

Shrink_objects actually was my suggestion - while we are asked to scan
the objects, the scan really isn't the main purpose of it.
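
A minimal sketch of a shrinker using the split API as posted; the
example_* names and the counter are made up for illustration:

	static long example_count(struct shrinker *shrink,
				  struct shrink_control *sc)
	{
		/* cheap estimate of reclaimable objects; may be stale */
		return atomic_long_read(&example_nr_cached);
	}

	static long example_scan(struct shrinker *shrink,
				 struct shrink_control *sc)
	{
		/* walk the LRU and free up to sc->nr_to_scan objects */
		return example_prune_lru(sc->nr_to_scan);
	}

	static struct shrinker example_shrinker = {
		.count_objects	= example_count,
		.scan_objects	= example_scan,
		.seeks		= DEFAULT_SEEKS,
	};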


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 10/13] xfs: convert buftarg LRU to generic code
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-24  6:27     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-24  6:27 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

> -STATIC void
> +static inline void
>  xfs_buf_lru_add(
>  	struct xfs_buf	*bp)
>  {
> -	struct xfs_buftarg *btp = bp->b_target;
> -
> -	spin_lock(&btp->bt_lru_lock);
> -	if (list_empty(&bp->b_lru)) {
> +	if (list_lru_add(&bp->b_target->bt_lru, &bp->b_lru))
>  		atomic_inc(&bp->b_hold);
> -		list_add_tail(&bp->b_lru, &btp->bt_lru);
> -		btp->bt_lru_nr++;
> -	}
> -	spin_unlock(&btp->bt_lru_lock);
>  }

Is there any point in keeping this wrapper?

> +static inline void
>  xfs_buf_lru_del(
>  	struct xfs_buf	*bp)
>  {
>  	if (list_empty(&bp->b_lru))
>  		return;
>  
> +	list_lru_del(&bp->b_target->bt_lru, &bp->b_lru);
>  }

It seems like all callers of list_lru_del really want the unlocked
check.  Out of your current set only two of the inode.c callers
are missing it, but given that those set I_FREEING first they should
be safe to do it as well.  What do you think about pulling
the unlocked check into list_lru_del?
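
Pulling the unlocked check in might look like this (a sketch against the
list_lru type introduced earlier in the series):

	int
	list_lru_del(
		struct list_lru		*lru,
		struct list_head	*item)
	{
		/* unlocked fast path: not on the LRU, nothing to do */
		if (list_empty(item))
			return 0;

		spin_lock(&lru->lock);
		if (!list_empty(item)) {	/* recheck under the lock */
			list_del_init(item);
			lru->nr_items--;
			spin_unlock(&lru->lock);
			return 1;
		}
		spin_unlock(&lru->lock);
		return 0;
	}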


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 13/13] dcache: convert to use new lru list infrastructure
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-24  6:32     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-24  6:32 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

> +	struct list_head *freeable = arg;
> +	struct dentry	*dentry = container_of(item, struct dentry, d_lru);
> +
> +
> +	/*

double empty line.

> +	 * we are inverting the lru lock/dentry->d_lock here,
> +	 * so use a trylock. If we fail to get the lock, just skip
> +	 * it
> +	 */
> +	if (!spin_trylock(&dentry->d_lock))
> +		return 2;
> +
> +	/*
> +	 * Referenced dentries are still in use. If they have active
> +	 * counts, just remove them from the LRU. Otherwise give them
> +	 * another pass through the LRU.
> +	 */
> +	if (dentry->d_count) {
> +		list_del_init(&dentry->d_lru);
> +		spin_unlock(&dentry->d_lock);
> +		return 0;
> +	}
> +
> +	if (dentry->d_flags & DCACHE_REFERENCED) {

The comment above seems odd, given that it doesn't match the code.
I'd rather have something like:

	/*
	 * Used dentry, remove it from the LRU.
	 */

in its place, and a second one above the DCACHE_REFERENCED check:

	/*
	 * Referenced dentry, give it another pass through the LRU.
	 */

> +		dentry->d_flags &= ~DCACHE_REFERENCED;
> +		spin_unlock(&dentry->d_lock);
> +
> +		/*
> +		 * XXX: this list move should be done under d_lock. Need to
> +		 * determine if it is safe just to do it under the lru lock.
> +		 */
> +		return 1;
> +	}
> +
> +	list_move_tail(&dentry->d_lru, freeable);

Another odd comment.  It talks about doing a list_move in the branch
that doesn't do the list_move, and the list_move outside the branch
actually has the d_lock, thus disagreeing with the comment.

> +	this_cpu_dec(nr_dentry_unused);
> +	spin_unlock(&dentry->d_lock);

No need to decrement the per-cpu counter while still having the lock
held.

> @@ -1094,11 +1069,10 @@ resume:
>  		/*
>  		 * move only zero ref count dentries to the dispose list.
>  		 */
> +		dentry_lru_del(dentry);
>  		if (!dentry->d_count) {
> -			dentry_lru_move_list(dentry, dispose);
> +			list_add_tail(&dentry->d_lru, dispose);
>  			found++;
> -		} else {
> -			dentry_lru_del(dentry);

I'd rather move this hunk to the previous patch, as it fits into the
logical change done there.
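
With those comment and ordering suggestions applied, the isolate callback
would read roughly as follows (a sketch assembled from the quoted hunks,
keeping the 0/1/2 return convention shown above):

	if (!spin_trylock(&dentry->d_lock))
		return 2;

	/*
	 * Used dentry, remove it from the LRU.
	 */
	if (dentry->d_count) {
		list_del_init(&dentry->d_lru);
		spin_unlock(&dentry->d_lock);
		return 0;
	}

	/*
	 * Referenced dentry, give it another pass through the LRU.
	 */
	if (dentry->d_flags & DCACHE_REFERENCED) {
		dentry->d_flags &= ~DCACHE_REFERENCED;
		spin_unlock(&dentry->d_lock);
		return 1;
	}

	list_move_tail(&dentry->d_lru, freeable);
	spin_unlock(&dentry->d_lock);
	this_cpu_dec(nr_dentry_unused);	/* after unlock, as suggested */
	return 0;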


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 09/13] inode: convert inode lru list to generic lru list code.
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-24  6:38     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-08-24  6:38 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

This silently drops the can_unuse check.  I can't say I was a fan of
it, but this really needs a separate patch with a proper description.
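
For context, the check being dropped gated reclaim on roughly these
conditions (paraphrased from the fs/inode.c of that era, so treat it as
approximate rather than the exact code):

	static int can_unuse(struct inode *inode)
	{
		if (inode->i_state & ~I_REFERENCED)	/* freeing, dirty, ... */
			return 0;
		if (inode_has_buffers(inode))
			return 0;
		if (atomic_read(&inode->i_count))
			return 0;
		if (inode->i_data.nrpages)
			return 0;
		return 1;
	}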


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 02/13] dcache: convert dentry_stat.nr_unused to per-cpu counters
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-24 14:12     ` Christoph Lameter
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Lameter @ 2011-08-24 14:12 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov, Tejun Heo

On Tue, 23 Aug 2011, Dave Chinner wrote:

> Before we split up the dcache_lru_lock, the unused dentry counter
> needs to be made independent of the global dcache_lru_lock. Convert
> it to per-cpu counters to do this.

I hope there is nothing depending on the counter being accurate.
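
Reads of such a counter are indeed only approximate. The usual summation,
sketched here assuming a DEFINE_PER_CPU(long, nr_dentry_unused) counter,
races with concurrent updates and can even go transiently negative:

	static long get_nr_dentry_unused(void)
	{
		int cpu;
		long sum = 0;

		for_each_possible_cpu(cpu)
			sum += per_cpu(nr_dentry_unused, cpu);
		return sum < 0 ? 0 : sum;
	}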

Otherwise

Acked-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 08/13] list: add a new LRU list type
  2011-08-23  9:58         ` Konstantin Khlebnikov
@ 2011-08-24 14:24           ` Christoph Lameter
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Lameter @ 2011-08-24 14:24 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Dave Chinner, Christoph Hellwig, linux-kernel, linux-fsdevel, linux-mm

On Tue, 23 Aug 2011, Konstantin Khlebnikov wrote:

> maybe is better to rename it to enum page_lru_list

Better rename both and clearly indicate what type of lru list it is. An
LRU list is such a generic concept that it shows up in an excessive
number of contexts.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 05/13] mm: convert shrinkers to use new API
  2011-08-23  8:56   ` Dave Chinner
@ 2011-08-26 17:09     ` Wanlong Gao
  -1 siblings, 0 replies; 74+ messages in thread
From: Wanlong Gao @ 2011-08-26 17:09 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

On Tue, 2011-08-23 at 18:56 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Modify shrink_slab() to use the new .count_objects/.scan_objects API
> and implement the callouts for all the existing shrinkers.

> +static long
> +cifs_idmap_shrinker_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
> -	int nr_to_scan = sc->nr_to_scan;
> -	int nr_del = 0;
> -	int nr_rem = 0;
>  	struct rb_root *root;
> +	long freed;
>  
>  	root = &uidtree;
>  	spin_lock(&siduidlock);
> -	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
> +	freed = shrink_idmap_tree(root, sc->nr_to_scan);
>  	spin_unlock(&siduidlock);
>  
>  	root = &gidtree;
>  	spin_lock(&sidgidlock);
> -	shrink_idmap_tree(root, nr_to_scan, &nr_rem, &nr_del);
> +	freed += shrink_idmap_tree(root, sc->nr_to_scan);
>  	spin_unlock(&sidgidlock);
>  
> -	return nr_rem;
> +	return freed;
> +}
> +
> +/*
> + * This still abuses the nr_to_scan == 0 trick to get the common code just to
> + * count objects. There neds to be an external count of the objects in the

			  ^^^^^?
Hi Dave,
Great work, just a few small comments.
Thanks,
-Wanlong Gao



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 11/13] dcache: use a dispose list in select_parent
  2011-08-23  8:56   ` Dave Chinner
@ 2011-09-05  9:42     ` Christoph Hellwig
  -1 siblings, 0 replies; 74+ messages in thread
From: Christoph Hellwig @ 2011-09-05  9:42 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-kernel, linux-fsdevel, linux-mm, khlebnikov

> -/**
>   * shrink_dcache_sb - shrink dcache for a superblock
>   * @sb: superblock
>   *
> @@ -1073,7 +1054,7 @@ EXPORT_SYMBOL(have_submounts);
>   * drop the lock and return early due to latency
>   * constraints.
>   */
> -static long select_parent(struct dentry * parent)
> +static long select_parent(struct dentry *parent, struct list_head *dispose)

Btw, the function header comment above select_parent is entirely
incorrect after your changes.

Also I'd suggest folding select_parent into shrink_dcache_parent as
the split doesn't make a whole lot of sense any more.  Maybe factoring
it at a different level would make sense, though.
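
Either way, the top level reduces to a fill-and-dispose loop along these
lines (a sketch, assuming select_parent() keeps returning the number of
dentries moved onto @dispose):

	void shrink_dcache_parent(struct dentry *parent)
	{
		LIST_HEAD(dispose);

		while (select_parent(parent, &dispose))
			shrink_dentry_list(&dispose);
	}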


^ permalink raw reply	[flat|nested] 74+ messages in thread

Thread overview: 74+ messages
2011-08-23  8:56 [PATCH 00/12] RFC: shrinker APi rework and generic LRU lists Dave Chinner
2011-08-23  8:56 ` [PATCH 01/13] fs: Use a common define for inode slab caches Dave Chinner
2011-08-23  9:13   ` Christoph Hellwig
2011-08-23  9:20     ` Dave Chinner
2011-08-24  6:16       ` Christoph Hellwig
2011-08-23  8:56 ` [PATCH 02/13] dcache: convert dentry_stat.nr_unused to per-cpu counters Dave Chinner
2011-08-23  9:13   ` Christoph Hellwig
2011-08-24 14:12   ` Christoph Lameter
2011-08-23  8:56 ` [PATCH 03/13] dentry: move to per-sb LRU locks Dave Chinner
2011-08-24  6:16   ` Christoph Hellwig
2011-08-23  8:56 ` [PATCH 04/13] mm: new shrinker API Dave Chinner
2011-08-23  9:15   ` Christoph Hellwig
2011-08-23  9:23     ` Dave Chinner
2011-08-24  6:17       ` Christoph Hellwig
2011-08-23  8:56 ` [PATCH 05/13] mm: convert shrinkers to use new API Dave Chinner
2011-08-23  9:17   ` Christoph Hellwig
2011-08-23  9:35   ` Steven Whitehouse
2011-08-26 17:09   ` Wanlong Gao
2011-08-23  8:56 ` [PATCH 06/13] shrinker: remove old API now it is unused Dave Chinner
2011-08-23  8:56 ` [PATCH 07/13] Use atomic-long operations instead of looping around cmpxchg() Dave Chinner
2011-08-23  8:56 ` [PATCH 08/13] list: add a new LRU list type Dave Chinner
2011-08-23  9:20   ` Christoph Hellwig
2011-08-23  9:32     ` Dave Chinner
2011-08-23  9:58       ` Konstantin Khlebnikov
2011-08-24 14:24         ` Christoph Lameter
2011-08-23  8:56 ` [PATCH 09/13] inode: convert inode lru list to generic lru list code Dave Chinner
2011-08-24  6:38   ` Christoph Hellwig
2011-08-23  8:56 ` [PATCH 10/13] xfs: convert buftarg LRU to generic code Dave Chinner
2011-08-24  6:27   ` Christoph Hellwig
2011-08-23  8:56 ` [PATCH 11/13] dcache: use a dispose list in select_parent Dave Chinner
2011-08-23  9:37   ` Christoph Hellwig
2011-09-05  9:42   ` Christoph Hellwig
2011-08-23  8:56 ` [PATCH 12/13] dcache: remove dentries from LRU before putting on dispose list Dave Chinner
2011-08-23  9:35   ` Christoph Hellwig
2011-08-23  9:57     ` Dave Chinner
2011-08-23  8:56 ` [PATCH 13/13] dcache: convert to use new lru list infrastructure Dave Chinner
2011-08-24  6:32   ` Christoph Hellwig
