All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/6] memcg/kmem: switch to white list policy
@ 2015-11-10 18:34 ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Hi,

Currently, all kmem allocations (namely every kmem_cache_alloc, kmalloc,
alloc_kmem_pages call) are accounted to memory cgroup automatically.
Callers have to explicitly opt out if they don't want/need accounting
for some reason. Such a design decision leads to several problems:

 - kmalloc users are highly sensitive to failures, many of them
   implicitly rely on the fact that kmalloc never fails, while memcg
   makes failures quite plausible.

 - A lot of objects are shared among different containers by design.
   Accounting such objects to one of containers is just unfair.
   Moreover, it might lead to pinning a dead memcg along with its kmem
   caches, which aren't tiny, which might result in noticeable increase
   in memory consumption for no apparent reason in the long run.

 - There are tons of short-lived objects. Accounting them to memcg will
   only result in slight noise and won't change the overall picture, but
   we still have to pay accounting overhead.

For more info, see

 - http://lkml.kernel.org/r/20151105144002.GB15111%40dhcp22.suse.cz
 - http://lkml.kernel.org/r/20151106090555.GK29259@esperanza

Therefore this patch switches to the white list policy. Now kmalloc
users have to explicitly opt in by passing __GFP_ACCOUNT flag.

Currently, the list of accounted objects is quite limited and only
includes those allocations that (1) are known to be easily triggered
from userspace and (2) can fail gracefully (for the full list see patch
no. 6) and it still misses many object types. However, accounting only
those objects should be a satisfactory approximation of the behavior we
used to have for most sane workloads.

Changes in v2:
 - add and use SLAB_ACCOUNT flag (Tejun)

v1: http://marc.info/?l=linux-mm&m=144692684713032&w=2

Thanks,

Vladimir Davydov (6):
  Revert "kernfs: do not account ino_ida allocations to memcg"
  Revert "gfp: add __GFP_NOACCOUNT"
  memcg: only account kmem allocations marked as __GFP_ACCOUNT
  slab: add SLAB_ACCOUNT flag
  vmalloc: allow to account vmalloc to memcg
  Account certain kmem allocations to memcg

 arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
 drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
 fs/9p/v9fs.c                                  |  2 +-
 fs/adfs/super.c                               |  2 +-
 fs/affs/super.c                               |  2 +-
 fs/afs/super.c                                |  2 +-
 fs/befs/linuxvfs.c                            |  2 +-
 fs/bfs/inode.c                                |  2 +-
 fs/block_dev.c                                |  2 +-
 fs/btrfs/inode.c                              |  3 ++-
 fs/ceph/super.c                               |  4 ++--
 fs/cifs/cifsfs.c                              |  2 +-
 fs/coda/inode.c                               |  6 +++---
 fs/dcache.c                                   |  5 +++--
 fs/ecryptfs/main.c                            |  6 ++++--
 fs/efs/super.c                                |  6 +++---
 fs/exofs/super.c                              |  4 ++--
 fs/ext2/super.c                               |  2 +-
 fs/ext4/super.c                               |  2 +-
 fs/f2fs/super.c                               |  5 +++--
 fs/fat/inode.c                                |  2 +-
 fs/file.c                                     |  7 ++++---
 fs/fuse/inode.c                               |  4 ++--
 fs/gfs2/main.c                                |  3 ++-
 fs/hfs/super.c                                |  4 ++--
 fs/hfsplus/super.c                            |  2 +-
 fs/hostfs/hostfs_kern.c                       |  2 +-
 fs/hpfs/super.c                               |  2 +-
 fs/hugetlbfs/inode.c                          |  2 +-
 fs/inode.c                                    |  2 +-
 fs/isofs/inode.c                              |  2 +-
 fs/jffs2/super.c                              |  2 +-
 fs/jfs/super.c                                |  2 +-
 fs/kernfs/dir.c                               |  9 +--------
 fs/logfs/inode.c                              |  3 ++-
 fs/minix/inode.c                              |  2 +-
 fs/ncpfs/inode.c                              |  2 +-
 fs/nfs/inode.c                                |  2 +-
 fs/nilfs2/super.c                             |  3 ++-
 fs/ntfs/super.c                               |  4 ++--
 fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
 fs/ocfs2/super.c                              |  2 +-
 fs/openpromfs/inode.c                         |  2 +-
 fs/proc/inode.c                               |  3 ++-
 fs/qnx4/inode.c                               |  2 +-
 fs/qnx6/inode.c                               |  2 +-
 fs/reiserfs/super.c                           |  3 ++-
 fs/romfs/super.c                              |  4 ++--
 fs/squashfs/super.c                           |  3 ++-
 fs/sysv/inode.c                               |  2 +-
 fs/ubifs/super.c                              |  4 ++--
 fs/udf/super.c                                |  3 ++-
 fs/ufs/super.c                                |  2 +-
 fs/xfs/kmem.h                                 |  1 +
 fs/xfs/xfs_super.c                            |  4 ++--
 include/linux/gfp.h                           |  6 ++++--
 include/linux/memcontrol.h                    | 15 +++++++--------
 include/linux/slab.h                          |  5 +++++
 include/linux/thread_info.h                   |  5 +++--
 ipc/mqueue.c                                  |  2 +-
 kernel/cred.c                                 |  4 ++--
 kernel/delayacct.c                            |  2 +-
 kernel/fork.c                                 | 22 +++++++++++++---------
 kernel/pid.c                                  |  2 +-
 mm/kmemleak.c                                 |  3 +--
 mm/memcontrol.c                               |  8 +++++++-
 mm/nommu.c                                    |  2 +-
 mm/page_alloc.c                               |  3 ++-
 mm/rmap.c                                     |  6 ++++--
 mm/shmem.c                                    |  2 +-
 mm/slab.h                                     |  5 +++--
 mm/slab_common.c                              |  3 ++-
 mm/slub.c                                     |  2 ++
 mm/vmalloc.c                                  |  6 +++---
 net/socket.c                                  |  2 +-
 net/sunrpc/rpc_pipe.c                         |  2 +-
 76 files changed, 151 insertions(+), 120 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 0/6] memcg/kmem: switch to white list policy
@ 2015-11-10 18:34 ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Hi,

Currently, all kmem allocations (namely every kmem_cache_alloc, kmalloc,
alloc_kmem_pages call) are accounted to memory cgroup automatically.
Callers have to explicitly opt out if they don't want/need accounting
for some reason. Such a design decision leads to several problems:

 - kmalloc users are highly sensitive to failures, many of them
   implicitly rely on the fact that kmalloc never fails, while memcg
   makes failures quite plausible.

 - A lot of objects are shared among different containers by design.
   Accounting such objects to one of containers is just unfair.
   Moreover, it might lead to pinning a dead memcg along with its kmem
   caches, which aren't tiny, which might result in noticeable increase
   in memory consumption for no apparent reason in the long run.

 - There are tons of short-lived objects. Accounting them to memcg will
   only result in slight noise and won't change the overall picture, but
   we still have to pay accounting overhead.

For more info, see

 - http://lkml.kernel.org/r/20151105144002.GB15111%40dhcp22.suse.cz
 - http://lkml.kernel.org/r/20151106090555.GK29259@esperanza

Therefore this patch switches to the white list policy. Now kmalloc
users have to explicitly opt in by passing __GFP_ACCOUNT flag.

Currently, the list of accounted objects is quite limited and only
includes those allocations that (1) are known to be easily triggered
from userspace and (2) can fail gracefully (for the full list see patch
no. 6) and it still misses many object types. However, accounting only
those objects should be a satisfactory approximation of the behavior we
used to have for most sane workloads.

Changes in v2:
 - add and use SLAB_ACCOUNT flag (Tejun)

v1: http://marc.info/?l=linux-mm&m=144692684713032&w=2

Thanks,

Vladimir Davydov (6):
  Revert "kernfs: do not account ino_ida allocations to memcg"
  Revert "gfp: add __GFP_NOACCOUNT"
  memcg: only account kmem allocations marked as __GFP_ACCOUNT
  slab: add SLAB_ACCOUNT flag
  vmalloc: allow to account vmalloc to memcg
  Account certain kmem allocations to memcg

 arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
 drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
 fs/9p/v9fs.c                                  |  2 +-
 fs/adfs/super.c                               |  2 +-
 fs/affs/super.c                               |  2 +-
 fs/afs/super.c                                |  2 +-
 fs/befs/linuxvfs.c                            |  2 +-
 fs/bfs/inode.c                                |  2 +-
 fs/block_dev.c                                |  2 +-
 fs/btrfs/inode.c                              |  3 ++-
 fs/ceph/super.c                               |  4 ++--
 fs/cifs/cifsfs.c                              |  2 +-
 fs/coda/inode.c                               |  6 +++---
 fs/dcache.c                                   |  5 +++--
 fs/ecryptfs/main.c                            |  6 ++++--
 fs/efs/super.c                                |  6 +++---
 fs/exofs/super.c                              |  4 ++--
 fs/ext2/super.c                               |  2 +-
 fs/ext4/super.c                               |  2 +-
 fs/f2fs/super.c                               |  5 +++--
 fs/fat/inode.c                                |  2 +-
 fs/file.c                                     |  7 ++++---
 fs/fuse/inode.c                               |  4 ++--
 fs/gfs2/main.c                                |  3 ++-
 fs/hfs/super.c                                |  4 ++--
 fs/hfsplus/super.c                            |  2 +-
 fs/hostfs/hostfs_kern.c                       |  2 +-
 fs/hpfs/super.c                               |  2 +-
 fs/hugetlbfs/inode.c                          |  2 +-
 fs/inode.c                                    |  2 +-
 fs/isofs/inode.c                              |  2 +-
 fs/jffs2/super.c                              |  2 +-
 fs/jfs/super.c                                |  2 +-
 fs/kernfs/dir.c                               |  9 +--------
 fs/logfs/inode.c                              |  3 ++-
 fs/minix/inode.c                              |  2 +-
 fs/ncpfs/inode.c                              |  2 +-
 fs/nfs/inode.c                                |  2 +-
 fs/nilfs2/super.c                             |  3 ++-
 fs/ntfs/super.c                               |  4 ++--
 fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
 fs/ocfs2/super.c                              |  2 +-
 fs/openpromfs/inode.c                         |  2 +-
 fs/proc/inode.c                               |  3 ++-
 fs/qnx4/inode.c                               |  2 +-
 fs/qnx6/inode.c                               |  2 +-
 fs/reiserfs/super.c                           |  3 ++-
 fs/romfs/super.c                              |  4 ++--
 fs/squashfs/super.c                           |  3 ++-
 fs/sysv/inode.c                               |  2 +-
 fs/ubifs/super.c                              |  4 ++--
 fs/udf/super.c                                |  3 ++-
 fs/ufs/super.c                                |  2 +-
 fs/xfs/kmem.h                                 |  1 +
 fs/xfs/xfs_super.c                            |  4 ++--
 include/linux/gfp.h                           |  6 ++++--
 include/linux/memcontrol.h                    | 15 +++++++--------
 include/linux/slab.h                          |  5 +++++
 include/linux/thread_info.h                   |  5 +++--
 ipc/mqueue.c                                  |  2 +-
 kernel/cred.c                                 |  4 ++--
 kernel/delayacct.c                            |  2 +-
 kernel/fork.c                                 | 22 +++++++++++++---------
 kernel/pid.c                                  |  2 +-
 mm/kmemleak.c                                 |  3 +--
 mm/memcontrol.c                               |  8 +++++++-
 mm/nommu.c                                    |  2 +-
 mm/page_alloc.c                               |  3 ++-
 mm/rmap.c                                     |  6 ++++--
 mm/shmem.c                                    |  2 +-
 mm/slab.h                                     |  5 +++--
 mm/slab_common.c                              |  3 ++-
 mm/slub.c                                     |  2 ++
 mm/vmalloc.c                                  |  6 +++---
 net/socket.c                                  |  2 +-
 net/sunrpc/rpc_pipe.c                         |  2 +-
 76 files changed, 151 insertions(+), 120 deletions(-)

-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 0/6] memcg/kmem: switch to white list policy
@ 2015-11-10 18:34 ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hi,

Currently, all kmem allocations (namely every kmem_cache_alloc, kmalloc,
alloc_kmem_pages call) are accounted to memory cgroup automatically.
Callers have to explicitly opt out if they don't want/need accounting
for some reason. Such a design decision leads to several problems:

 - kmalloc users are highly sensitive to failures, many of them
   implicitly rely on the fact that kmalloc never fails, while memcg
   makes failures quite plausible.

 - A lot of objects are shared among different containers by design.
   Accounting such objects to one of containers is just unfair.
   Moreover, it might lead to pinning a dead memcg along with its kmem
   caches, which aren't tiny, which might result in noticeable increase
   in memory consumption for no apparent reason in the long run.

 - There are tons of short-lived objects. Accounting them to memcg will
   only result in slight noise and won't change the overall picture, but
   we still have to pay accounting overhead.

For more info, see

 - http://lkml.kernel.org/r/20151105144002.GB15111%40dhcp22.suse.cz
 - http://lkml.kernel.org/r/20151106090555.GK29259@esperanza

Therefore this patch switches to the white list policy. Now kmalloc
users have to explicitly opt in by passing __GFP_ACCOUNT flag.

Currently, the list of accounted objects is quite limited and only
includes those allocations that (1) are known to be easily triggered
from userspace and (2) can fail gracefully (for the full list see patch
no. 6) and it still misses many object types. However, accounting only
those objects should be a satisfactory approximation of the behavior we
used to have for most sane workloads.

Changes in v2:
 - add and use SLAB_ACCOUNT flag (Tejun)

v1: http://marc.info/?l=linux-mm&m=144692684713032&w=2

Thanks,

Vladimir Davydov (6):
  Revert "kernfs: do not account ino_ida allocations to memcg"
  Revert "gfp: add __GFP_NOACCOUNT"
  memcg: only account kmem allocations marked as __GFP_ACCOUNT
  slab: add SLAB_ACCOUNT flag
  vmalloc: allow to account vmalloc to memcg
  Account certain kmem allocations to memcg

 arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
 drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
 fs/9p/v9fs.c                                  |  2 +-
 fs/adfs/super.c                               |  2 +-
 fs/affs/super.c                               |  2 +-
 fs/afs/super.c                                |  2 +-
 fs/befs/linuxvfs.c                            |  2 +-
 fs/bfs/inode.c                                |  2 +-
 fs/block_dev.c                                |  2 +-
 fs/btrfs/inode.c                              |  3 ++-
 fs/ceph/super.c                               |  4 ++--
 fs/cifs/cifsfs.c                              |  2 +-
 fs/coda/inode.c                               |  6 +++---
 fs/dcache.c                                   |  5 +++--
 fs/ecryptfs/main.c                            |  6 ++++--
 fs/efs/super.c                                |  6 +++---
 fs/exofs/super.c                              |  4 ++--
 fs/ext2/super.c                               |  2 +-
 fs/ext4/super.c                               |  2 +-
 fs/f2fs/super.c                               |  5 +++--
 fs/fat/inode.c                                |  2 +-
 fs/file.c                                     |  7 ++++---
 fs/fuse/inode.c                               |  4 ++--
 fs/gfs2/main.c                                |  3 ++-
 fs/hfs/super.c                                |  4 ++--
 fs/hfsplus/super.c                            |  2 +-
 fs/hostfs/hostfs_kern.c                       |  2 +-
 fs/hpfs/super.c                               |  2 +-
 fs/hugetlbfs/inode.c                          |  2 +-
 fs/inode.c                                    |  2 +-
 fs/isofs/inode.c                              |  2 +-
 fs/jffs2/super.c                              |  2 +-
 fs/jfs/super.c                                |  2 +-
 fs/kernfs/dir.c                               |  9 +--------
 fs/logfs/inode.c                              |  3 ++-
 fs/minix/inode.c                              |  2 +-
 fs/ncpfs/inode.c                              |  2 +-
 fs/nfs/inode.c                                |  2 +-
 fs/nilfs2/super.c                             |  3 ++-
 fs/ntfs/super.c                               |  4 ++--
 fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
 fs/ocfs2/super.c                              |  2 +-
 fs/openpromfs/inode.c                         |  2 +-
 fs/proc/inode.c                               |  3 ++-
 fs/qnx4/inode.c                               |  2 +-
 fs/qnx6/inode.c                               |  2 +-
 fs/reiserfs/super.c                           |  3 ++-
 fs/romfs/super.c                              |  4 ++--
 fs/squashfs/super.c                           |  3 ++-
 fs/sysv/inode.c                               |  2 +-
 fs/ubifs/super.c                              |  4 ++--
 fs/udf/super.c                                |  3 ++-
 fs/ufs/super.c                                |  2 +-
 fs/xfs/kmem.h                                 |  1 +
 fs/xfs/xfs_super.c                            |  4 ++--
 include/linux/gfp.h                           |  6 ++++--
 include/linux/memcontrol.h                    | 15 +++++++--------
 include/linux/slab.h                          |  5 +++++
 include/linux/thread_info.h                   |  5 +++--
 ipc/mqueue.c                                  |  2 +-
 kernel/cred.c                                 |  4 ++--
 kernel/delayacct.c                            |  2 +-
 kernel/fork.c                                 | 22 +++++++++++++---------
 kernel/pid.c                                  |  2 +-
 mm/kmemleak.c                                 |  3 +--
 mm/memcontrol.c                               |  8 +++++++-
 mm/nommu.c                                    |  2 +-
 mm/page_alloc.c                               |  3 ++-
 mm/rmap.c                                     |  6 ++++--
 mm/shmem.c                                    |  2 +-
 mm/slab.h                                     |  5 +++--
 mm/slab_common.c                              |  3 ++-
 mm/slub.c                                     |  2 ++
 mm/vmalloc.c                                  |  6 +++---
 net/socket.c                                  |  2 +-
 net/sunrpc/rpc_pipe.c                         |  2 +-
 76 files changed, 151 insertions(+), 120 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 56+ messages in thread

* [PATCH v2 1/6] Revert "kernfs: do not account ino_ida allocations to memcg"
  2015-11-10 18:34 ` Vladimir Davydov
@ 2015-11-10 18:34   ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This reverts commit 499611ed451508a42d1d7d1faff10177827755d5.

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So it was decided to switch to the white-list policy. This patch reverts
bits introducing the black-list policy. The white-list policy will be
introduced later in the series.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 fs/kernfs/dir.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 91e004518237..0239a0a76ed5 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -541,14 +541,7 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 	if (!kn)
 		goto err_out1;
 
-	/*
-	 * If the ino of the sysfs entry created for a kmem cache gets
-	 * allocated from an ida layer, which is accounted to the memcg that
-	 * owns the cache, the memcg will get pinned forever. So do not account
-	 * ino ida allocations.
-	 */
-	ret = ida_simple_get(&root->ino_ida, 1, 0,
-			     GFP_KERNEL | __GFP_NOACCOUNT);
+	ret = ida_simple_get(&root->ino_ida, 1, 0, GFP_KERNEL);
 	if (ret < 0)
 		goto err_out2;
 	kn->ino = ret;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 1/6] Revert "kernfs: do not account ino_ida allocations to memcg"
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This reverts commit 499611ed451508a42d1d7d1faff10177827755d5.

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So it was decided to switch to the white-list policy. This patch reverts
bits introducing the black-list policy. The white-list policy will be
introduced later in the series.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 fs/kernfs/dir.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 91e004518237..0239a0a76ed5 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -541,14 +541,7 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
 	if (!kn)
 		goto err_out1;
 
-	/*
-	 * If the ino of the sysfs entry created for a kmem cache gets
-	 * allocated from an ida layer, which is accounted to the memcg that
-	 * owns the cache, the memcg will get pinned forever. So do not account
-	 * ino ida allocations.
-	 */
-	ret = ida_simple_get(&root->ino_ida, 1, 0,
-			     GFP_KERNEL | __GFP_NOACCOUNT);
+	ret = ida_simple_get(&root->ino_ida, 1, 0, GFP_KERNEL);
 	if (ret < 0)
 		goto err_out2;
 	kn->ino = ret;
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 2/6] Revert "gfp: add __GFP_NOACCOUNT"
  2015-11-10 18:34 ` Vladimir Davydov
@ 2015-11-10 18:34   ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This reverts commit 8f4fc071b1926d0b20336e2b3f8ab85c94c734c5.

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So it was decided to switch to the white-list policy. This patch reverts
bits introducing the black-list policy. The white-list policy will be
introduced later in the series.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Conflicts:
	include/linux/memcontrol.h
---
 include/linux/gfp.h        | 2 --
 include/linux/memcontrol.h | 2 --
 mm/kmemleak.c              | 3 +--
 3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f92cbd2f4450..2b917ce34efc 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -30,7 +30,6 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL		0x20000u
 #define ___GFP_THISNODE		0x40000u
 #define ___GFP_RECLAIMABLE	0x80000u
-#define ___GFP_NOACCOUNT	0x100000u
 #define ___GFP_NOTRACK		0x200000u
 #define ___GFP_NO_KSWAPD	0x400000u
 #define ___GFP_OTHER_NODE	0x800000u
@@ -91,7 +90,6 @@ struct vm_area_struct;
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
 #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
-#define __GFP_NOACCOUNT	((__force gfp_t)___GFP_NOACCOUNT) /* Don't account to kmemcg */
 #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
 
 #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index cd0e2413c358..2103f36b3bd3 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -773,8 +773,6 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
 {
 	if (!memcg_kmem_enabled())
 		return true;
-	if (gfp & __GFP_NOACCOUNT)
-		return true;
 	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
 		return true;
 	return false;
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 19423a45d7d7..25c0ad36fe38 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -122,8 +122,7 @@
 #define BYTES_PER_POINTER	sizeof(void *)
 
 /* GFP bitmask for kmemleak internal allocations */
-#define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC | \
-					   __GFP_NOACCOUNT)) | \
+#define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
 				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
 				 __GFP_NOWARN)
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 2/6] Revert "gfp: add __GFP_NOACCOUNT"
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This reverts commit 8f4fc071b1926d0b20336e2b3f8ab85c94c734c5.

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So it was decided to switch to the white-list policy. This patch reverts
bits introducing the black-list policy. The white-list policy will be
introduced later in the series.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Conflicts:
	include/linux/memcontrol.h
---
 include/linux/gfp.h        | 2 --
 include/linux/memcontrol.h | 2 --
 mm/kmemleak.c              | 3 +--
 3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index f92cbd2f4450..2b917ce34efc 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -30,7 +30,6 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL		0x20000u
 #define ___GFP_THISNODE		0x40000u
 #define ___GFP_RECLAIMABLE	0x80000u
-#define ___GFP_NOACCOUNT	0x100000u
 #define ___GFP_NOTRACK		0x200000u
 #define ___GFP_NO_KSWAPD	0x400000u
 #define ___GFP_OTHER_NODE	0x800000u
@@ -91,7 +90,6 @@ struct vm_area_struct;
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
 #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
-#define __GFP_NOACCOUNT	((__force gfp_t)___GFP_NOACCOUNT) /* Don't account to kmemcg */
 #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
 
 #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index cd0e2413c358..2103f36b3bd3 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -773,8 +773,6 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
 {
 	if (!memcg_kmem_enabled())
 		return true;
-	if (gfp & __GFP_NOACCOUNT)
-		return true;
 	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
 		return true;
 	return false;
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 19423a45d7d7..25c0ad36fe38 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -122,8 +122,7 @@
 #define BYTES_PER_POINTER	sizeof(void *)
 
 /* GFP bitmask for kmemleak internal allocations */
-#define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC | \
-					   __GFP_NOACCOUNT)) | \
+#define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
 				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
 				 __GFP_NOWARN)
 
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
  2015-11-10 18:34 ` Vladimir Davydov
@ 2015-11-10 18:34   ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So this patch switches kmem accounting to the white-policy: now only
those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
memcg. Currently, no kmem allocations are marked like this. The
following patches will mark several kmem allocations that are known to
be easily triggered from userspace and therefore should be accounted to
memcg.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/gfp.h        | 4 ++++
 include/linux/memcontrol.h | 2 ++
 mm/page_alloc.c            | 3 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 2b917ce34efc..61305a492356 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -30,6 +30,7 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL		0x20000u
 #define ___GFP_THISNODE		0x40000u
 #define ___GFP_RECLAIMABLE	0x80000u
+#define ___GFP_ACCOUNT		0x100000u
 #define ___GFP_NOTRACK		0x200000u
 #define ___GFP_NO_KSWAPD	0x400000u
 #define ___GFP_OTHER_NODE	0x800000u
@@ -90,6 +91,8 @@ struct vm_area_struct;
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
 #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
+#define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)	/* Account to memcg (only relevant
+							 * to kmem allocations) */
 #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
 
 #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
@@ -112,6 +115,7 @@ struct vm_area_struct;
 #define GFP_NOIO	(__GFP_WAIT)
 #define GFP_NOFS	(__GFP_WAIT | __GFP_IO)
 #define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)
+#define GFP_KERNEL_ACCOUNT	(GFP_KERNEL | __GFP_ACCOUNT)
 #define GFP_TEMPORARY	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
 			 __GFP_RECLAIMABLE)
 #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 2103f36b3bd3..c9d9a8e7b45f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -773,6 +773,8 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
 {
 	if (!memcg_kmem_enabled())
 		return true;
+	if (!(gfp & __GFP_ACCOUNT))
+		return true;
 	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
 		return true;
 	return false;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 446bb36ee59d..8e22f5b27de0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3420,7 +3420,8 @@ EXPORT_SYMBOL(__free_page_frag);
 
 /*
  * alloc_kmem_pages charges newly allocated pages to the kmem resource counter
- * of the current memory cgroup.
+ * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
+ * equivalent to alloc_pages.
  *
  * It should be used when the caller would like to use kmalloc, but since the
  * allocation is large, it has to fall back to the page allocator.
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
fragile and difficult to maintain, because there seem to be many more
allocations that should not be accounted than those that should be.
Besides, false accounting an allocation might result in much worse
consequences than not accounting at all, namely increased memory
consumption due to pinned dead kmem caches.

So this patch switches kmem accounting to the white-policy: now only
those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
memcg. Currently, no kmem allocations are marked like this. The
following patches will mark several kmem allocations that are known to
be easily triggered from userspace and therefore should be accounted to
memcg.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/gfp.h        | 4 ++++
 include/linux/memcontrol.h | 2 ++
 mm/page_alloc.c            | 3 ++-
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 2b917ce34efc..61305a492356 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -30,6 +30,7 @@ struct vm_area_struct;
 #define ___GFP_HARDWALL		0x20000u
 #define ___GFP_THISNODE		0x40000u
 #define ___GFP_RECLAIMABLE	0x80000u
+#define ___GFP_ACCOUNT		0x100000u
 #define ___GFP_NOTRACK		0x200000u
 #define ___GFP_NO_KSWAPD	0x400000u
 #define ___GFP_OTHER_NODE	0x800000u
@@ -90,6 +91,8 @@ struct vm_area_struct;
 #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
 #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
+#define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)	/* Account to memcg (only relevant
+							 * to kmem allocations) */
 #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
 
 #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
@@ -112,6 +115,7 @@ struct vm_area_struct;
 #define GFP_NOIO	(__GFP_WAIT)
 #define GFP_NOFS	(__GFP_WAIT | __GFP_IO)
 #define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)
+#define GFP_KERNEL_ACCOUNT	(GFP_KERNEL | __GFP_ACCOUNT)
 #define GFP_TEMPORARY	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
 			 __GFP_RECLAIMABLE)
 #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 2103f36b3bd3..c9d9a8e7b45f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -773,6 +773,8 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
 {
 	if (!memcg_kmem_enabled())
 		return true;
+	if (!(gfp & __GFP_ACCOUNT))
+		return true;
 	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
 		return true;
 	return false;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 446bb36ee59d..8e22f5b27de0 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3420,7 +3420,8 @@ EXPORT_SYMBOL(__free_page_frag);
 
 /*
  * alloc_kmem_pages charges newly allocated pages to the kmem resource counter
- * of the current memory cgroup.
+ * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
+ * equivalent to alloc_pages.
  *
  * It should be used when the caller would like to use kmalloc, but since the
  * allocation is large, it has to fall back to the page allocator.
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-10 18:34 ` Vladimir Davydov
@ 2015-11-10 18:34   ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Currently, if we want to account all objects of a particular kmem cache,
we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
kmem_cache_create will force accounting for every allocation from this
cache even if __GFP_ACCOUNT is not passed.

This patch does not make any of the existing caches use this flag - it
will be done later in the series.

Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
merged slabs even if kmem accounting is not used (only compiled in).

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/memcontrol.h | 15 +++++++--------
 include/linux/slab.h       |  5 +++++
 mm/memcontrol.c            |  8 +++++++-
 mm/slab.h                  |  5 +++--
 mm/slab_common.c           |  3 ++-
 mm/slub.c                  |  2 ++
 6 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c9d9a8e7b45f..5c97265c1c6e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -766,15 +766,13 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
 	return memcg ? memcg->kmemcg_id : -1;
 }
 
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep);
+struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
 void __memcg_kmem_put_cache(struct kmem_cache *cachep);
 
-static inline bool __memcg_kmem_bypass(gfp_t gfp)
+static inline bool __memcg_kmem_bypass(void)
 {
 	if (!memcg_kmem_enabled())
 		return true;
-	if (!(gfp & __GFP_ACCOUNT))
-		return true;
 	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
 		return true;
 	return false;
@@ -791,7 +789,9 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
 static __always_inline int memcg_kmem_charge(struct page *page,
 					     gfp_t gfp, int order)
 {
-	if (__memcg_kmem_bypass(gfp))
+	if (__memcg_kmem_bypass())
+		return 0;
+	if (!(gfp & __GFP_ACCOUNT))
 		return 0;
 	return __memcg_kmem_charge(page, gfp, order);
 }
@@ -810,16 +810,15 @@ static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
 /**
  * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
  * @cachep: the original global kmem cache
- * @gfp: allocation flags.
  *
  * All memory allocated from a per-memcg cache is charged to the owner memcg.
  */
 static __always_inline struct kmem_cache *
 memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 {
-	if (__memcg_kmem_bypass(gfp))
+	if (__memcg_kmem_bypass())
 		return cachep;
-	return __memcg_kmem_get_cache(cachep);
+	return __memcg_kmem_get_cache(cachep, gfp);
 }
 
 static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 7c82e3b307a3..20168c6ffe89 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -86,6 +86,11 @@
 #else
 # define SLAB_FAILSLAB		0x00000000UL
 #endif
+#ifdef CONFIG_MEMCG_KMEM
+# define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
+#else
+# define SLAB_ACCOUNT		0x00000000UL
+#endif
 
 /* The following flags affect the page allocator grouping pages by mobility */
 #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bc502e590366..06e4f538e38e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2332,7 +2332,7 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
  * Can't be called in interrupt context or from kernel threads.
  * This function needs to be called with rcu_read_lock() held.
  */
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
+struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 {
 	struct mem_cgroup *memcg;
 	struct kmem_cache *memcg_cachep;
@@ -2340,6 +2340,12 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
 
 	VM_BUG_ON(!is_root_cache(cachep));
 
+	if (cachep->flags & SLAB_ACCOUNT)
+		gfp |= __GFP_ACCOUNT;
+
+	if (!(gfp & __GFP_ACCOUNT))
+		return cachep;
+
 	if (current->memcg_kmem_skip_account)
 		return cachep;
 
diff --git a/mm/slab.h b/mm/slab.h
index 27492eb678f7..2778de8673bd 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -128,10 +128,11 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
 
 #if defined(CONFIG_SLAB)
 #define SLAB_CACHE_FLAGS (SLAB_MEM_SPREAD | SLAB_NOLEAKTRACE | \
-			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | SLAB_NOTRACK)
+			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | \
+			  SLAB_NOTRACK | SLAB_ACCOUNT)
 #elif defined(CONFIG_SLUB)
 #define SLAB_CACHE_FLAGS (SLAB_NOLEAKTRACE | SLAB_RECLAIM_ACCOUNT | \
-			  SLAB_TEMPORARY | SLAB_NOTRACK)
+			  SLAB_TEMPORARY | SLAB_NOTRACK | SLAB_ACCOUNT)
 #else
 #define SLAB_CACHE_FLAGS (0)
 #endif
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d88e97c10a2e..698b2c97b22b 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -37,7 +37,8 @@ struct kmem_cache *kmem_cache;
 		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
 		SLAB_FAILSLAB)
 
-#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | SLAB_NOTRACK)
+#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
+			 SLAB_NOTRACK | SLAB_ACCOUNT)
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index 75a5fa92ac2a..b037cea9cfeb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5247,6 +5247,8 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = 'F';
 	if (!(s->flags & SLAB_NOTRACK))
 		*p++ = 't';
+	if (s->flags & SLAB_ACCOUNT)
+		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
 	p += sprintf(p, "%07d", s->size);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Currently, if we want to account all objects of a particular kmem cache,
we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
kmem_cache_create will force accounting for every allocation from this
cache even if __GFP_ACCOUNT is not passed.

This patch does not make any of the existing caches use this flag - it
will be done later in the series.

Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
merged slabs even if kmem accounting is not used (only compiled in).

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/memcontrol.h | 15 +++++++--------
 include/linux/slab.h       |  5 +++++
 mm/memcontrol.c            |  8 +++++++-
 mm/slab.h                  |  5 +++--
 mm/slab_common.c           |  3 ++-
 mm/slub.c                  |  2 ++
 6 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index c9d9a8e7b45f..5c97265c1c6e 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -766,15 +766,13 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
 	return memcg ? memcg->kmemcg_id : -1;
 }
 
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep);
+struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
 void __memcg_kmem_put_cache(struct kmem_cache *cachep);
 
-static inline bool __memcg_kmem_bypass(gfp_t gfp)
+static inline bool __memcg_kmem_bypass(void)
 {
 	if (!memcg_kmem_enabled())
 		return true;
-	if (!(gfp & __GFP_ACCOUNT))
-		return true;
 	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
 		return true;
 	return false;
@@ -791,7 +789,9 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
 static __always_inline int memcg_kmem_charge(struct page *page,
 					     gfp_t gfp, int order)
 {
-	if (__memcg_kmem_bypass(gfp))
+	if (__memcg_kmem_bypass())
+		return 0;
+	if (!(gfp & __GFP_ACCOUNT))
 		return 0;
 	return __memcg_kmem_charge(page, gfp, order);
 }
@@ -810,16 +810,15 @@ static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
 /**
  * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
  * @cachep: the original global kmem cache
- * @gfp: allocation flags.
  *
  * All memory allocated from a per-memcg cache is charged to the owner memcg.
  */
 static __always_inline struct kmem_cache *
 memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 {
-	if (__memcg_kmem_bypass(gfp))
+	if (__memcg_kmem_bypass())
 		return cachep;
-	return __memcg_kmem_get_cache(cachep);
+	return __memcg_kmem_get_cache(cachep, gfp);
 }
 
 static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 7c82e3b307a3..20168c6ffe89 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -86,6 +86,11 @@
 #else
 # define SLAB_FAILSLAB		0x00000000UL
 #endif
+#ifdef CONFIG_MEMCG_KMEM
+# define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
+#else
+# define SLAB_ACCOUNT		0x00000000UL
+#endif
 
 /* The following flags affect the page allocator grouping pages by mobility */
 #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index bc502e590366..06e4f538e38e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2332,7 +2332,7 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
  * Can't be called in interrupt context or from kernel threads.
  * This function needs to be called with rcu_read_lock() held.
  */
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
+struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 {
 	struct mem_cgroup *memcg;
 	struct kmem_cache *memcg_cachep;
@@ -2340,6 +2340,12 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
 
 	VM_BUG_ON(!is_root_cache(cachep));
 
+	if (cachep->flags & SLAB_ACCOUNT)
+		gfp |= __GFP_ACCOUNT;
+
+	if (!(gfp & __GFP_ACCOUNT))
+		return cachep;
+
 	if (current->memcg_kmem_skip_account)
 		return cachep;
 
diff --git a/mm/slab.h b/mm/slab.h
index 27492eb678f7..2778de8673bd 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -128,10 +128,11 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
 
 #if defined(CONFIG_SLAB)
 #define SLAB_CACHE_FLAGS (SLAB_MEM_SPREAD | SLAB_NOLEAKTRACE | \
-			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | SLAB_NOTRACK)
+			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | \
+			  SLAB_NOTRACK | SLAB_ACCOUNT)
 #elif defined(CONFIG_SLUB)
 #define SLAB_CACHE_FLAGS (SLAB_NOLEAKTRACE | SLAB_RECLAIM_ACCOUNT | \
-			  SLAB_TEMPORARY | SLAB_NOTRACK)
+			  SLAB_TEMPORARY | SLAB_NOTRACK | SLAB_ACCOUNT)
 #else
 #define SLAB_CACHE_FLAGS (0)
 #endif
diff --git a/mm/slab_common.c b/mm/slab_common.c
index d88e97c10a2e..698b2c97b22b 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -37,7 +37,8 @@ struct kmem_cache *kmem_cache;
 		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
 		SLAB_FAILSLAB)
 
-#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | SLAB_NOTRACK)
+#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
+			 SLAB_NOTRACK | SLAB_ACCOUNT)
 
 /*
  * Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index 75a5fa92ac2a..b037cea9cfeb 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5247,6 +5247,8 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = 'F';
 	if (!(s->flags & SLAB_NOTRACK))
 		*p++ = 't';
+	if (s->flags & SLAB_ACCOUNT)
+		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
 	p += sprintf(p, "%07d", s->size);
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg
  2015-11-10 18:34 ` Vladimir Davydov
  (?)
@ 2015-11-10 18:34   ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This patch makes vmalloc family functions allocate vmalloc area pages
with alloc_kmem_pages so that if __GFP_ACCOUNT is set they will be
accounted to memcg. This is needed, at least, to account alloc_fdmem
allocations.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 mm/vmalloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9db9ef5e8481..259cfb32b7cf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1476,7 +1476,7 @@ static void __vunmap(const void *addr, int deallocate_pages)
 			struct page *page = area->pages[i];
 
 			BUG_ON(!page);
-			__free_page(page);
+			__free_kmem_pages(page, 0);
 		}
 
 		if (area->flags & VM_VPAGES)
@@ -1607,9 +1607,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		struct page *page;
 
 		if (node == NUMA_NO_NODE)
-			page = alloc_page(alloc_mask);
+			page = alloc_kmem_pages(alloc_mask, order);
 		else
-			page = alloc_pages_node(node, alloc_mask, order);
+			page = alloc_kmem_pages_node(node, alloc_mask, order);
 
 		if (unlikely(!page)) {
 			/* Successfully allocated i pages, free them in __vunmap() */
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This patch makes vmalloc family functions allocate vmalloc area pages
with alloc_kmem_pages so that if __GFP_ACCOUNT is set they will be
accounted to memcg. This is needed, at least, to account alloc_fdmem
allocations.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 mm/vmalloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9db9ef5e8481..259cfb32b7cf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1476,7 +1476,7 @@ static void __vunmap(const void *addr, int deallocate_pages)
 			struct page *page = area->pages[i];
 
 			BUG_ON(!page);
-			__free_page(page);
+			__free_kmem_pages(page, 0);
 		}
 
 		if (area->flags & VM_VPAGES)
@@ -1607,9 +1607,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		struct page *page;
 
 		if (node == NUMA_NO_NODE)
-			page = alloc_page(alloc_mask);
+			page = alloc_kmem_pages(alloc_mask, order);
 		else
-			page = alloc_pages_node(node, alloc_mask, order);
+			page = alloc_kmem_pages_node(node, alloc_mask, order);
 
 		if (unlikely(!page)) {
 			/* Successfully allocated i pages, free them in __vunmap() */
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

This patch makes vmalloc family functions allocate vmalloc area pages
with alloc_kmem_pages so that if __GFP_ACCOUNT is set they will be
accounted to memcg. This is needed, at least, to account alloc_fdmem
allocations.

Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>
---
 mm/vmalloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9db9ef5e8481..259cfb32b7cf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1476,7 +1476,7 @@ static void __vunmap(const void *addr, int deallocate_pages)
 			struct page *page = area->pages[i];
 
 			BUG_ON(!page);
-			__free_page(page);
+			__free_kmem_pages(page, 0);
 		}
 
 		if (area->flags & VM_VPAGES)
@@ -1607,9 +1607,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		struct page *page;
 
 		if (node == NUMA_NO_NODE)
-			page = alloc_page(alloc_mask);
+			page = alloc_kmem_pages(alloc_mask, order);
 		else
-			page = alloc_pages_node(node, alloc_mask, order);
+			page = alloc_kmem_pages_node(node, alloc_mask, order);
 
 		if (unlikely(!page)) {
 			/* Successfully allocated i pages, free them in __vunmap() */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 6/6] Account certain kmem allocations to memcg
  2015-11-10 18:34 ` Vladimir Davydov
@ 2015-11-10 18:34   ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This patch marks those kmem allocations that are known to be easily
triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
accounted to memcg. For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is by far not complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
 drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
 fs/9p/v9fs.c                                  |  2 +-
 fs/adfs/super.c                               |  2 +-
 fs/affs/super.c                               |  2 +-
 fs/afs/super.c                                |  2 +-
 fs/befs/linuxvfs.c                            |  2 +-
 fs/bfs/inode.c                                |  2 +-
 fs/block_dev.c                                |  2 +-
 fs/btrfs/inode.c                              |  3 ++-
 fs/ceph/super.c                               |  4 ++--
 fs/cifs/cifsfs.c                              |  2 +-
 fs/coda/inode.c                               |  6 +++---
 fs/dcache.c                                   |  5 +++--
 fs/ecryptfs/main.c                            |  6 ++++--
 fs/efs/super.c                                |  6 +++---
 fs/exofs/super.c                              |  4 ++--
 fs/ext2/super.c                               |  2 +-
 fs/ext4/super.c                               |  2 +-
 fs/f2fs/super.c                               |  5 +++--
 fs/fat/inode.c                                |  2 +-
 fs/file.c                                     |  7 ++++---
 fs/fuse/inode.c                               |  4 ++--
 fs/gfs2/main.c                                |  3 ++-
 fs/hfs/super.c                                |  4 ++--
 fs/hfsplus/super.c                            |  2 +-
 fs/hostfs/hostfs_kern.c                       |  2 +-
 fs/hpfs/super.c                               |  2 +-
 fs/hugetlbfs/inode.c                          |  2 +-
 fs/inode.c                                    |  2 +-
 fs/isofs/inode.c                              |  2 +-
 fs/jffs2/super.c                              |  2 +-
 fs/jfs/super.c                                |  2 +-
 fs/logfs/inode.c                              |  3 ++-
 fs/minix/inode.c                              |  2 +-
 fs/ncpfs/inode.c                              |  2 +-
 fs/nfs/inode.c                                |  2 +-
 fs/nilfs2/super.c                             |  3 ++-
 fs/ntfs/super.c                               |  4 ++--
 fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
 fs/ocfs2/super.c                              |  2 +-
 fs/openpromfs/inode.c                         |  2 +-
 fs/proc/inode.c                               |  3 ++-
 fs/qnx4/inode.c                               |  2 +-
 fs/qnx6/inode.c                               |  2 +-
 fs/reiserfs/super.c                           |  3 ++-
 fs/romfs/super.c                              |  4 ++--
 fs/squashfs/super.c                           |  3 ++-
 fs/sysv/inode.c                               |  2 +-
 fs/ubifs/super.c                              |  4 ++--
 fs/udf/super.c                                |  3 ++-
 fs/ufs/super.c                                |  2 +-
 fs/xfs/kmem.h                                 |  1 +
 fs/xfs/xfs_super.c                            |  4 ++--
 include/linux/thread_info.h                   |  5 +++--
 ipc/mqueue.c                                  |  2 +-
 kernel/cred.c                                 |  4 ++--
 kernel/delayacct.c                            |  2 +-
 kernel/fork.c                                 | 22 +++++++++++++---------
 kernel/pid.c                                  |  2 +-
 mm/nommu.c                                    |  2 +-
 mm/rmap.c                                     |  6 ++++--
 mm/shmem.c                                    |  2 +-
 net/socket.c                                  |  2 +-
 net/sunrpc/rpc_pipe.c                         |  2 +-
 65 files changed, 114 insertions(+), 92 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 11634fa7ab3c..ad4840f86be1 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -767,7 +767,7 @@ static int __init spufs_init(void)
 	ret = -ENOMEM;
 	spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
 			sizeof(struct spufs_inode_info), 0,
-			SLAB_HWCACHE_ALIGN, spufs_init_once);
+			SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
 
 	if (!spufs_inode_cache)
 		goto out;
diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
index 013136860664..60828d692db4 100644
--- a/drivers/staging/lustre/lustre/llite/super25.c
+++ b/drivers/staging/lustre/lustre/llite/super25.c
@@ -106,7 +106,8 @@ static int __init init_lustre_lite(void)
 	rc = -ENOMEM;
 	ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
 					    sizeof(struct ll_inode_info),
-					    0, SLAB_HWCACHE_ALIGN, NULL);
+					    0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
+					    NULL);
 	if (ll_inode_cachep == NULL)
 		goto out_cache;
 
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 6caca025019d..072e7599583a 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -575,7 +575,7 @@ static int v9fs_init_inode_cache(void)
 	v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
 					  sizeof(struct v9fs_inode),
 					  0, (SLAB_RECLAIM_ACCOUNT|
-					      SLAB_MEM_SPREAD),
+					      SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					  v9fs_inode_init_once);
 	if (!v9fs_inode_cache)
 		return -ENOMEM;
diff --git a/fs/adfs/super.c b/fs/adfs/super.c
index 4d4a0df8344f..c9fdfb112933 100644
--- a/fs/adfs/super.c
+++ b/fs/adfs/super.c
@@ -271,7 +271,7 @@ static int __init init_inodecache(void)
 	adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
 					     sizeof(struct adfs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (adfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/affs/super.c b/fs/affs/super.c
index 5b50c4ca43a7..84a84fcb5f5a 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -132,7 +132,7 @@ static int __init init_inodecache(void)
 	affs_inode_cachep = kmem_cache_create("affs_inode_cache",
 					     sizeof(struct affs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (affs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 1fb4a5129f7d..81afefe7d8a6 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -91,7 +91,7 @@ int __init afs_fs_init(void)
 	afs_inode_cachep = kmem_cache_create("afs_inode_cache",
 					     sizeof(struct afs_vnode),
 					     0,
-					     SLAB_HWCACHE_ALIGN,
+					     SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
 					     afs_i_init_once);
 	if (!afs_inode_cachep) {
 		printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 46aedacfa6a8..2a23edf5703e 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -434,7 +434,7 @@ befs_init_inodecache(void)
 	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
 					      sizeof (struct befs_inode_info),
 					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					      init_once);
 	if (befs_inode_cachep == NULL) {
 		pr_err("%s: Couldn't initialize inode slabcache\n", __func__);
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index fdcb4d69f430..1e5c896f6b79 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -270,7 +270,7 @@ static int __init init_inodecache(void)
 	bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
 					     sizeof(struct bfs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (bfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 0a793c7930eb..29ce98bfe04f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -567,7 +567,7 @@ void __init bdev_cache_init(void)
 
 	bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
 			0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-				SLAB_MEM_SPREAD|SLAB_PANIC),
+				SLAB_MEM_SPREAD|SLAB_ACCOUNT|SLAB_PANIC),
 			init_once);
 	err = register_filesystem(&bd_type);
 	if (err)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4439fbb4ff45..c24d4cd9c14f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9157,7 +9157,8 @@ int btrfs_init_cachep(void)
 {
 	btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
 			sizeof(struct btrfs_inode), 0,
-			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once);
+			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT,
+			init_once);
 	if (!btrfs_inode_cachep)
 		goto fail;
 
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index f446afada328..ca4d5e8457f1 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -639,8 +639,8 @@ static int __init init_caches(void)
 	ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
 				      sizeof(struct ceph_inode_info),
 				      __alignof__(struct ceph_inode_info),
-				      (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
-				      ceph_inode_init_once);
+				      SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+				      SLAB_ACCOUNT, ceph_inode_init_once);
 	if (ceph_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index e739950ca084..7f2e2639d1d1 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1040,7 +1040,7 @@ cifs_init_inodecache(void)
 	cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
 					      sizeof(struct cifsInodeInfo),
 					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					      cifs_init_once);
 	if (cifs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index cac1390b87a3..57e81cbba0fa 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -74,9 +74,9 @@ static void init_once(void *foo)
 int __init coda_init_inodecache(void)
 {
 	coda_inode_cachep = kmem_cache_create("coda_inode_cache",
-				sizeof(struct coda_inode_info),
-				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-				init_once);
+				sizeof(struct coda_inode_info), 0,
+				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+				SLAB_ACCOUNT, init_once);
 	if (coda_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/dcache.c b/fs/dcache.c
index 5c33aeb0f68f..7ac590912106 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1571,7 +1571,8 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	dentry->d_iname[DNAME_INLINE_LEN-1] = 0;
 	if (name->len > DNAME_INLINE_LEN-1) {
 		size_t size = offsetof(struct external_name, name[1]);
-		struct external_name *p = kmalloc(size + name->len, GFP_KERNEL);
+		struct external_name *p = kmalloc(size + name->len,
+						  GFP_KERNEL_ACCOUNT);
 		if (!p) {
 			kmem_cache_free(dentry_cache, dentry); 
 			return NULL;
@@ -3415,7 +3416,7 @@ static void __init dcache_init(void)
 	 * of the dcache. 
 	 */
 	dentry_cache = KMEM_CACHE(dentry,
-		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
+		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
 
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index 4f4d0474bee9..e25b6b06bacf 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -663,6 +663,7 @@ static struct ecryptfs_cache_info {
 	struct kmem_cache **cache;
 	const char *name;
 	size_t size;
+	unsigned long flags;
 	void (*ctor)(void *obj);
 } ecryptfs_cache_infos[] = {
 	{
@@ -684,6 +685,7 @@ static struct ecryptfs_cache_info {
 		.cache = &ecryptfs_inode_info_cache,
 		.name = "ecryptfs_inode_cache",
 		.size = sizeof(struct ecryptfs_inode_info),
+		.flags = SLAB_ACCOUNT,
 		.ctor = inode_info_init_once,
 	},
 	{
@@ -755,8 +757,8 @@ static int ecryptfs_init_kmem_caches(void)
 		struct ecryptfs_cache_info *info;
 
 		info = &ecryptfs_cache_infos[i];
-		*(info->cache) = kmem_cache_create(info->name, info->size,
-				0, SLAB_HWCACHE_ALIGN, info->ctor);
+		*(info->cache) = kmem_cache_create(info->name, info->size, 0,
+				SLAB_HWCACHE_ALIGN | info->flags, info->ctor);
 		if (!*(info->cache)) {
 			ecryptfs_free_kmem_caches();
 			ecryptfs_printk(KERN_WARNING, "%s: "
diff --git a/fs/efs/super.c b/fs/efs/super.c
index c8411a30f7da..cb68dac4f9d3 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -94,9 +94,9 @@ static void init_once(void *foo)
 static int __init init_inodecache(void)
 {
 	efs_inode_cachep = kmem_cache_create("efs_inode_cache",
-				sizeof(struct efs_inode_info),
-				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-				init_once);
+				sizeof(struct efs_inode_info), 0,
+				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+				SLAB_ACCOUNT, init_once);
 	if (efs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index b795c567b5e1..6658a50530a0 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -194,8 +194,8 @@ static int init_inodecache(void)
 {
 	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
 				sizeof(struct exofs_i_info), 0,
-				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
-				exofs_init_once);
+				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
+				SLAB_ACCOUNT, exofs_init_once);
 	if (exofs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 900e19cf9ef6..973092a32b98 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -200,7 +200,7 @@ static int __init init_inodecache(void)
 	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
 					     sizeof(struct ext2_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ext2_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 04d0f1b33409..c4a5c415b881 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -966,7 +966,7 @@ static int __init init_inodecache(void)
 	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
 					     sizeof(struct ext4_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ext4_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 3a65e0132352..862916c7e3f8 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1424,8 +1424,9 @@ MODULE_ALIAS_FS("f2fs");
 
 static int __init init_inodecache(void)
 {
-	f2fs_inode_cachep = f2fs_kmem_cache_create("f2fs_inode_cache",
-			sizeof(struct f2fs_inode_info));
+	f2fs_inode_cachep = kmem_cache_create("f2fs_inode_cache",
+			sizeof(struct f2fs_inode_info), 0,
+			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT, NULL);
 	if (!f2fs_inode_cachep)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 509411dd3698..6aece96df19f 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -677,7 +677,7 @@ static int __init fat_init_inodecache(void)
 	fat_inode_cachep = kmem_cache_create("fat_inode_cache",
 					     sizeof(struct msdos_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (fat_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/file.c b/fs/file.c
index 39f8f15921da..7d76c929d557 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -37,11 +37,12 @@ static void *alloc_fdmem(size_t size)
 	 * vmalloc() if the allocation size will be considered "large" by the VM.
 	 */
 	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
-		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
+		void *data = kmalloc(size, GFP_KERNEL_ACCOUNT |
+				     __GFP_NOWARN | __GFP_NORETRY);
 		if (data != NULL)
 			return data;
 	}
-	return vmalloc(size);
+	return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
 static void __free_fdtable(struct fdtable *fdt)
@@ -126,7 +127,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
 	if (unlikely(nr > sysctl_nr_open))
 		nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
 
-	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL);
+	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
 	if (!fdt)
 		goto out;
 	fdt->max_fds = nr;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 2913db2a5b99..4d69d5c0bedc 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1255,8 +1255,8 @@ static int __init fuse_fs_init(void)
 	int err;
 
 	fuse_inode_cachep = kmem_cache_create("fuse_inode",
-					      sizeof(struct fuse_inode),
-					      0, SLAB_HWCACHE_ALIGN,
+					      sizeof(struct fuse_inode), 0,
+					      SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
 					      fuse_inode_init_once);
 	err = -ENOMEM;
 	if (!fuse_inode_cachep)
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 241a399bf83d..6ee38e210602 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -112,7 +112,8 @@ static int __init init_gfs2_fs(void)
 	gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
 					      sizeof(struct gfs2_inode),
 					      0,  SLAB_RECLAIM_ACCOUNT|
-					          SLAB_MEM_SPREAD,
+					          SLAB_MEM_SPREAD|
+						  SLAB_ACCOUNT,
 					      gfs2_init_inode_once);
 	if (!gfs2_inode_cachep)
 		goto fail;
diff --git a/fs/hfs/super.c b/fs/hfs/super.c
index 4574fdd3d421..1ca95c232bb5 100644
--- a/fs/hfs/super.c
+++ b/fs/hfs/super.c
@@ -483,8 +483,8 @@ static int __init init_hfs_fs(void)
 	int err;
 
 	hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
-		sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN,
-		hfs_init_once);
+		sizeof(struct hfs_inode_info), 0,
+		SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, hfs_init_once);
 	if (!hfs_inode_cachep)
 		return -ENOMEM;
 	err = register_filesystem(&hfs_fs_type);
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index 7302d96ae8bf..5d54490a136d 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -663,7 +663,7 @@ static int __init init_hfsplus_fs(void)
 	int err;
 
 	hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
-		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN,
+		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
 		hfsplus_init_once);
 	if (!hfsplus_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index 2ac99db3750e..a4cf6b11a142 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -223,7 +223,7 @@ static struct inode *hostfs_alloc_inode(struct super_block *sb)
 {
 	struct hostfs_inode_info *hi;
 
-	hi = kmalloc(sizeof(*hi), GFP_KERNEL);
+	hi = kmalloc(sizeof(*hi), GFP_KERNEL_ACCOUNT);
 	if (hi == NULL)
 		return NULL;
 	hi->fd = -1;
diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
index a561591896bd..458cf463047b 100644
--- a/fs/hpfs/super.c
+++ b/fs/hpfs/super.c
@@ -261,7 +261,7 @@ static int init_inodecache(void)
 	hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
 					     sizeof(struct hpfs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (hpfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 316adb968b65..496add05f380 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1322,7 +1322,7 @@ static int __init init_hugetlbfs_fs(void)
 	error = -ENOMEM;
 	hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
 					sizeof(struct hugetlbfs_inode_info),
-					0, 0, init_once);
+					0, SLAB_ACCOUNT, init_once);
 	if (hugetlbfs_inode_cachep == NULL)
 		goto out2;
 
diff --git a/fs/inode.c b/fs/inode.c
index 78a17b8859e1..08c66502f1f4 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1882,7 +1882,7 @@ void __init inode_init(void)
 					 sizeof(struct inode),
 					 0,
 					 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
-					 SLAB_MEM_SPREAD),
+					 SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					 init_once);
 
 	/* Hash may have been set up in inode_init_early */
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index d67a16f2a45d..9bc2431d2df8 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -94,7 +94,7 @@ static int __init init_inodecache(void)
 	isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
 					sizeof(struct iso_inode_info),
 					0, (SLAB_RECLAIM_ACCOUNT|
-					SLAB_MEM_SPREAD),
+					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					init_once);
 	if (isofs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index d86c5e3176a1..bb080c272149 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -387,7 +387,7 @@ static int __init init_jffs2_fs(void)
 	jffs2_inode_cachep = kmem_cache_create("jffs2_i",
 					     sizeof(struct jffs2_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     jffs2_i_init_once);
 	if (!jffs2_inode_cachep) {
 		pr_err("error: Failed to initialise inode cache\n");
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 4cd9798f4948..6efadc61c15b 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -901,7 +901,7 @@ static int __init init_jfs_fs(void)
 
 	jfs_inode_cachep =
 	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
-			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
 			    init_once);
 	if (jfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c
index af49e2d6941a..5d65db2e03f4 100644
--- a/fs/logfs/inode.c
+++ b/fs/logfs/inode.c
@@ -408,7 +408,8 @@ const struct super_operations logfs_super_operations = {
 int logfs_init_inode_cache(void)
 {
 	logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
-			sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT,
+			sizeof(struct logfs_inode), 0,
+			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
 			logfs_init_once);
 	if (!logfs_inode_cache)
 		return -ENOMEM;
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 086cd0a61e80..5942c3e10fa5 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -91,7 +91,7 @@ static int __init init_inodecache(void)
 	minix_inode_cachep = kmem_cache_create("minix_inode_cache",
 					     sizeof(struct minix_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (minix_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
index 9605a2f63549..d80446e1a333 100644
--- a/fs/ncpfs/inode.c
+++ b/fs/ncpfs/inode.c
@@ -82,7 +82,7 @@ static int init_inodecache(void)
 	ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
 					     sizeof(struct ncp_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ncp_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 326d9e10d833..412f888fad13 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1904,7 +1904,7 @@ static int __init nfs_init_inodecache(void)
 	nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
 					     sizeof(struct nfs_inode),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (nfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index f47585bfeb01..dcf8e2ff3072 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -1419,7 +1419,8 @@ static int __init nilfs_init_cachep(void)
 {
 	nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
 			sizeof(struct nilfs_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once);
+			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
+			nilfs_inode_init_once);
 	if (!nilfs_inode_cachep)
 		goto fail;
 
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index d1a853585b53..2f77f8dfb861 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -3139,8 +3139,8 @@ static int __init init_ntfs_fs(void)
 
 	ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
 			sizeof(big_ntfs_inode), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-			ntfs_big_inode_init_once);
+			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+			SLAB_ACCOUNT, ntfs_big_inode_init_once);
 	if (!ntfs_big_inode_cache) {
 		pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name);
 		goto big_inode_err_out;
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index b5cf27dcb18a..03768bb3aab1 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -638,7 +638,7 @@ static int __init init_dlmfs_fs(void)
 	dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
 				sizeof(struct dlmfs_inode_private),
 				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-					SLAB_MEM_SPREAD),
+					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 				dlmfs_init_once);
 	if (!dlmfs_inode_cache) {
 		status = -ENOMEM;
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 2de4c8a9340c..8ab0fcbc0b86 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1771,7 +1771,7 @@ static int ocfs2_initialize_mem_caches(void)
 				       sizeof(struct ocfs2_inode_info),
 				       0,
 				       (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 				       ocfs2_inode_init_once);
 	ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
 					sizeof(struct ocfs2_dquot),
diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index 15e4500cda3e..b61b883c8ff8 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -443,7 +443,7 @@ static int __init init_openprom_fs(void)
 					    sizeof(struct op_inode_info),
 					    0,
 					    (SLAB_RECLAIM_ACCOUNT |
-					     SLAB_MEM_SPREAD),
+					     SLAB_MEM_SPREAD | SLAB_ACCOUNT),
 					    op_inode_init_once);
 	if (!op_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index bd95b9fdebb0..561557122dea 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -95,7 +95,8 @@ void __init proc_init_inodecache(void)
 	proc_inode_cachep = kmem_cache_create("proc_inode_cache",
 					     sizeof(struct proc_inode),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_PANIC),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT|
+						SLAB_PANIC),
 					     init_once);
 }
 
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index c4bcb778886e..f761acdd5a7a 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -364,7 +364,7 @@ static int init_inodecache(void)
 	qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
 					     sizeof(struct qnx4_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (qnx4_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 32d2e1a9774c..4f04f00a7e5e 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -624,7 +624,7 @@ static int init_inodecache(void)
 	qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
 					     sizeof(struct qnx6_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (!qnx6_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 4a62fe8cc3bf..05db7473bcb5 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -626,7 +626,8 @@ static int __init init_inodecache(void)
 						  sizeof(struct
 							 reiserfs_inode_info),
 						  0, (SLAB_RECLAIM_ACCOUNT|
-							SLAB_MEM_SPREAD),
+						      SLAB_MEM_SPREAD|
+						      SLAB_ACCOUNT),
 						  init_once);
 	if (reiserfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index 268733cda397..e1113399a6b4 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -618,8 +618,8 @@ static int __init init_romfs_fs(void)
 	romfs_inode_cachep =
 		kmem_cache_create("romfs_i",
 				  sizeof(struct romfs_inode_info), 0,
-				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
-				  romfs_i_init_once);
+				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
+				  SLAB_ACCOUNT, romfs_i_init_once);
 
 	if (!romfs_inode_cachep) {
 		pr_err("Failed to initialise inode cache\n");
diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
index 5056babe00df..ea59b475663c 100644
--- a/fs/squashfs/super.c
+++ b/fs/squashfs/super.c
@@ -420,7 +420,8 @@ static int __init init_inodecache(void)
 {
 	squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
 		sizeof(struct squashfs_inode_info), 0,
-		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once);
+		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
+		init_once);
 
 	return squashfs_inode_cachep ? 0 : -ENOMEM;
 }
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 590ad9206e3f..087ed6a1c1df 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -353,7 +353,7 @@ int __init sysv_init_icache(void)
 {
 	sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
 			sizeof(struct sysv_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
 			init_once);
 	if (!sysv_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 9547a27868ad..9d064789c63a 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2241,8 +2241,8 @@ static int __init ubifs_init(void)
 
 	ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
 				sizeof(struct ubifs_inode), 0,
-				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT,
-				&inode_slab_ctor);
+				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT |
+				SLAB_ACCOUNT, &inode_slab_ctor);
 	if (!ubifs_inode_slab)
 		return -ENOMEM;
 
diff --git a/fs/udf/super.c b/fs/udf/super.c
index 81155b9b445b..9c64a3ca9837 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -179,7 +179,8 @@ static int __init init_inodecache(void)
 	udf_inode_cachep = kmem_cache_create("udf_inode_cache",
 					     sizeof(struct udf_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT |
-						 SLAB_MEM_SPREAD),
+						 SLAB_MEM_SPREAD |
+						 SLAB_ACCOUNT),
 					     init_once);
 	if (!udf_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index f6390eec02ca..442fd52ebffe 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1427,7 +1427,7 @@ static int __init init_inodecache(void)
 	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
 					     sizeof(struct ufs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ufs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
index cc6b768fc068..d1c66e465ca5 100644
--- a/fs/xfs/kmem.h
+++ b/fs/xfs/kmem.h
@@ -84,6 +84,7 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
 #define KM_ZONE_HWALIGN	SLAB_HWCACHE_ALIGN
 #define KM_ZONE_RECLAIM	SLAB_RECLAIM_ACCOUNT
 #define KM_ZONE_SPREAD	SLAB_MEM_SPREAD
+#define KM_ZONE_ACCOUNT	SLAB_ACCOUNT
 
 #define kmem_zone	kmem_cache
 #define kmem_zone_t	struct kmem_cache
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 904f637cfa5f..70d5b3072631 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1703,8 +1703,8 @@ xfs_init_zones(void)
 
 	xfs_inode_zone =
 		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
-			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD,
-			xfs_fs_inode_init_once);
+			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
+			KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
 	if (!xfs_inode_zone)
 		goto out_destroy_efi_zone;
 
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index ff307b548ed3..b4c2a485b28a 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -56,9 +56,10 @@ extern long do_no_restart_syscall(struct restart_block *parm);
 #ifdef __KERNEL__
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
-# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
+# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK | \
+				 __GFP_ZERO)
 #else
-# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK)
+# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
 #endif
 
 /*
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 161a1807e6ef..f4617cf07069 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1438,7 +1438,7 @@ static int __init init_mqueue_fs(void)
 
 	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
 				sizeof(struct mqueue_inode_info), 0,
-				SLAB_HWCACHE_ALIGN, init_once);
+				SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, init_once);
 	if (mqueue_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/kernel/cred.c b/kernel/cred.c
index 71179a09c1d6..0c0cd8a62285 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -569,8 +569,8 @@ EXPORT_SYMBOL(revert_creds);
 void __init cred_init(void)
 {
 	/* allocate a slab in which we can store credentials */
-	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred),
-				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred), 0,
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
 }
 
 /**
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index ef90b04d783f..435c14a45118 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -34,7 +34,7 @@ __setup("nodelayacct", delayacct_setup_disable);
 
 void delayacct_init(void)
 {
-	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC);
+	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC|SLAB_ACCOUNT);
 	delayacct_tsk_init(&init_task);
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index f97f2c449f5c..ff39b78e6e23 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -300,9 +300,9 @@ void __init fork_init(void)
 #define ARCH_MIN_TASKALIGN	L1_CACHE_BYTES
 #endif
 	/* create a slab on which task_structs can be allocated */
-	task_struct_cachep =
-		kmem_cache_create("task_struct", arch_task_struct_size,
-			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
+	task_struct_cachep = kmem_cache_create("task_struct",
+			arch_task_struct_size, ARCH_MIN_TASKALIGN,
+			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT, NULL);
 #endif
 
 	/* do the arch specific task caches init */
@@ -1851,16 +1851,19 @@ void __init proc_caches_init(void)
 	sighand_cachep = kmem_cache_create("sighand_cache",
 			sizeof(struct sighand_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU|
-			SLAB_NOTRACK, sighand_ctor);
+			SLAB_NOTRACK|SLAB_ACCOUNT, sighand_ctor);
 	signal_cachep = kmem_cache_create("signal_cache",
 			sizeof(struct signal_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
 	files_cachep = kmem_cache_create("files_cache",
 			sizeof(struct files_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
 	fs_cachep = kmem_cache_create("fs_cache",
 			sizeof(struct fs_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
 	/*
 	 * FIXME! The "sizeof(struct mm_struct)" currently includes the
 	 * whole struct cpumask for the OFFSTACK case. We could change
@@ -1870,8 +1873,9 @@ void __init proc_caches_init(void)
 	 */
 	mm_cachep = kmem_cache_create("mm_struct",
 			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
-	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
+	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
 	mmap_init();
 	nsproxy_cache_init();
 }
diff --git a/kernel/pid.c b/kernel/pid.c
index ca368793808e..f09b026f5b56 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -604,5 +604,5 @@ void __init pidmap_init(void)
 	atomic_dec(&init_pid_ns.pidmap[0].nr_free);
 
 	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
-			SLAB_HWCACHE_ALIGN | SLAB_PANIC);
+			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
 }
diff --git a/mm/nommu.c b/mm/nommu.c
index 92be862c859b..fbf6f0f1d6c9 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -560,7 +560,7 @@ void __init mmap_init(void)
 
 	ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
 	VM_BUG_ON(ret);
-	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC);
+	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC|SLAB_ACCOUNT);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index b577fbb98d4b..3c3f1d21f075 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -428,8 +428,10 @@ static void anon_vma_ctor(void *data)
 void __init anon_vma_init(void)
 {
 	anon_vma_cachep = kmem_cache_create("anon_vma", sizeof(struct anon_vma),
-			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC, anon_vma_ctor);
-	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain, SLAB_PANIC);
+			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC|SLAB_ACCOUNT,
+			anon_vma_ctor);
+	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain,
+			SLAB_PANIC|SLAB_ACCOUNT);
 }
 
 /*
diff --git a/mm/shmem.c b/mm/shmem.c
index 3b8b73928398..882933a7de99 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3107,7 +3107,7 @@ static int shmem_init_inodecache(void)
 {
 	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
 				sizeof(struct shmem_inode_info),
-				0, SLAB_PANIC, shmem_init_inode);
+				0, SLAB_PANIC|SLAB_ACCOUNT, shmem_init_inode);
 	return 0;
 }
 
diff --git a/net/socket.c b/net/socket.c
index 9963a0b53a64..2d70af8d943f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -293,7 +293,7 @@ static int init_inodecache(void)
 					      0,
 					      (SLAB_HWCACHE_ALIGN |
 					       SLAB_RECLAIM_ACCOUNT |
-					       SLAB_MEM_SPREAD),
+					       SLAB_MEM_SPREAD | SLAB_ACCOUNT),
 					      init_once);
 	if (sock_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index d81186d34558..14f45bf0410c 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1500,7 +1500,7 @@ int register_rpc_pipefs(void)
 	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
 				sizeof(struct rpc_inode),
 				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 				init_once);
 	if (!rpc_inode_cachep)
 		return -ENOMEM;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 56+ messages in thread

* [PATCH v2 6/6] Account certain kmem allocations to memcg
@ 2015-11-10 18:34   ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:34 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

This patch marks those kmem allocations that are known to be easily
triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
accounted to memcg. For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable->full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is by far not complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
 drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
 fs/9p/v9fs.c                                  |  2 +-
 fs/adfs/super.c                               |  2 +-
 fs/affs/super.c                               |  2 +-
 fs/afs/super.c                                |  2 +-
 fs/befs/linuxvfs.c                            |  2 +-
 fs/bfs/inode.c                                |  2 +-
 fs/block_dev.c                                |  2 +-
 fs/btrfs/inode.c                              |  3 ++-
 fs/ceph/super.c                               |  4 ++--
 fs/cifs/cifsfs.c                              |  2 +-
 fs/coda/inode.c                               |  6 +++---
 fs/dcache.c                                   |  5 +++--
 fs/ecryptfs/main.c                            |  6 ++++--
 fs/efs/super.c                                |  6 +++---
 fs/exofs/super.c                              |  4 ++--
 fs/ext2/super.c                               |  2 +-
 fs/ext4/super.c                               |  2 +-
 fs/f2fs/super.c                               |  5 +++--
 fs/fat/inode.c                                |  2 +-
 fs/file.c                                     |  7 ++++---
 fs/fuse/inode.c                               |  4 ++--
 fs/gfs2/main.c                                |  3 ++-
 fs/hfs/super.c                                |  4 ++--
 fs/hfsplus/super.c                            |  2 +-
 fs/hostfs/hostfs_kern.c                       |  2 +-
 fs/hpfs/super.c                               |  2 +-
 fs/hugetlbfs/inode.c                          |  2 +-
 fs/inode.c                                    |  2 +-
 fs/isofs/inode.c                              |  2 +-
 fs/jffs2/super.c                              |  2 +-
 fs/jfs/super.c                                |  2 +-
 fs/logfs/inode.c                              |  3 ++-
 fs/minix/inode.c                              |  2 +-
 fs/ncpfs/inode.c                              |  2 +-
 fs/nfs/inode.c                                |  2 +-
 fs/nilfs2/super.c                             |  3 ++-
 fs/ntfs/super.c                               |  4 ++--
 fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
 fs/ocfs2/super.c                              |  2 +-
 fs/openpromfs/inode.c                         |  2 +-
 fs/proc/inode.c                               |  3 ++-
 fs/qnx4/inode.c                               |  2 +-
 fs/qnx6/inode.c                               |  2 +-
 fs/reiserfs/super.c                           |  3 ++-
 fs/romfs/super.c                              |  4 ++--
 fs/squashfs/super.c                           |  3 ++-
 fs/sysv/inode.c                               |  2 +-
 fs/ubifs/super.c                              |  4 ++--
 fs/udf/super.c                                |  3 ++-
 fs/ufs/super.c                                |  2 +-
 fs/xfs/kmem.h                                 |  1 +
 fs/xfs/xfs_super.c                            |  4 ++--
 include/linux/thread_info.h                   |  5 +++--
 ipc/mqueue.c                                  |  2 +-
 kernel/cred.c                                 |  4 ++--
 kernel/delayacct.c                            |  2 +-
 kernel/fork.c                                 | 22 +++++++++++++---------
 kernel/pid.c                                  |  2 +-
 mm/nommu.c                                    |  2 +-
 mm/rmap.c                                     |  6 ++++--
 mm/shmem.c                                    |  2 +-
 net/socket.c                                  |  2 +-
 net/sunrpc/rpc_pipe.c                         |  2 +-
 65 files changed, 114 insertions(+), 92 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 11634fa7ab3c..ad4840f86be1 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -767,7 +767,7 @@ static int __init spufs_init(void)
 	ret = -ENOMEM;
 	spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
 			sizeof(struct spufs_inode_info), 0,
-			SLAB_HWCACHE_ALIGN, spufs_init_once);
+			SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
 
 	if (!spufs_inode_cache)
 		goto out;
diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
index 013136860664..60828d692db4 100644
--- a/drivers/staging/lustre/lustre/llite/super25.c
+++ b/drivers/staging/lustre/lustre/llite/super25.c
@@ -106,7 +106,8 @@ static int __init init_lustre_lite(void)
 	rc = -ENOMEM;
 	ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
 					    sizeof(struct ll_inode_info),
-					    0, SLAB_HWCACHE_ALIGN, NULL);
+					    0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
+					    NULL);
 	if (ll_inode_cachep == NULL)
 		goto out_cache;
 
diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
index 6caca025019d..072e7599583a 100644
--- a/fs/9p/v9fs.c
+++ b/fs/9p/v9fs.c
@@ -575,7 +575,7 @@ static int v9fs_init_inode_cache(void)
 	v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
 					  sizeof(struct v9fs_inode),
 					  0, (SLAB_RECLAIM_ACCOUNT|
-					      SLAB_MEM_SPREAD),
+					      SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					  v9fs_inode_init_once);
 	if (!v9fs_inode_cache)
 		return -ENOMEM;
diff --git a/fs/adfs/super.c b/fs/adfs/super.c
index 4d4a0df8344f..c9fdfb112933 100644
--- a/fs/adfs/super.c
+++ b/fs/adfs/super.c
@@ -271,7 +271,7 @@ static int __init init_inodecache(void)
 	adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
 					     sizeof(struct adfs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (adfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/affs/super.c b/fs/affs/super.c
index 5b50c4ca43a7..84a84fcb5f5a 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -132,7 +132,7 @@ static int __init init_inodecache(void)
 	affs_inode_cachep = kmem_cache_create("affs_inode_cache",
 					     sizeof(struct affs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (affs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/afs/super.c b/fs/afs/super.c
index 1fb4a5129f7d..81afefe7d8a6 100644
--- a/fs/afs/super.c
+++ b/fs/afs/super.c
@@ -91,7 +91,7 @@ int __init afs_fs_init(void)
 	afs_inode_cachep = kmem_cache_create("afs_inode_cache",
 					     sizeof(struct afs_vnode),
 					     0,
-					     SLAB_HWCACHE_ALIGN,
+					     SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
 					     afs_i_init_once);
 	if (!afs_inode_cachep) {
 		printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 46aedacfa6a8..2a23edf5703e 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -434,7 +434,7 @@ befs_init_inodecache(void)
 	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
 					      sizeof (struct befs_inode_info),
 					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					      init_once);
 	if (befs_inode_cachep == NULL) {
 		pr_err("%s: Couldn't initialize inode slabcache\n", __func__);
diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
index fdcb4d69f430..1e5c896f6b79 100644
--- a/fs/bfs/inode.c
+++ b/fs/bfs/inode.c
@@ -270,7 +270,7 @@ static int __init init_inodecache(void)
 	bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
 					     sizeof(struct bfs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (bfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 0a793c7930eb..29ce98bfe04f 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -567,7 +567,7 @@ void __init bdev_cache_init(void)
 
 	bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
 			0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-				SLAB_MEM_SPREAD|SLAB_PANIC),
+				SLAB_MEM_SPREAD|SLAB_ACCOUNT|SLAB_PANIC),
 			init_once);
 	err = register_filesystem(&bd_type);
 	if (err)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4439fbb4ff45..c24d4cd9c14f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9157,7 +9157,8 @@ int btrfs_init_cachep(void)
 {
 	btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
 			sizeof(struct btrfs_inode), 0,
-			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once);
+			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT,
+			init_once);
 	if (!btrfs_inode_cachep)
 		goto fail;
 
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index f446afada328..ca4d5e8457f1 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -639,8 +639,8 @@ static int __init init_caches(void)
 	ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
 				      sizeof(struct ceph_inode_info),
 				      __alignof__(struct ceph_inode_info),
-				      (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
-				      ceph_inode_init_once);
+				      SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+				      SLAB_ACCOUNT, ceph_inode_init_once);
 	if (ceph_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index e739950ca084..7f2e2639d1d1 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1040,7 +1040,7 @@ cifs_init_inodecache(void)
 	cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
 					      sizeof(struct cifsInodeInfo),
 					      0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					      cifs_init_once);
 	if (cifs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index cac1390b87a3..57e81cbba0fa 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -74,9 +74,9 @@ static void init_once(void *foo)
 int __init coda_init_inodecache(void)
 {
 	coda_inode_cachep = kmem_cache_create("coda_inode_cache",
-				sizeof(struct coda_inode_info),
-				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-				init_once);
+				sizeof(struct coda_inode_info), 0,
+				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+				SLAB_ACCOUNT, init_once);
 	if (coda_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/dcache.c b/fs/dcache.c
index 5c33aeb0f68f..7ac590912106 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1571,7 +1571,8 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	dentry->d_iname[DNAME_INLINE_LEN-1] = 0;
 	if (name->len > DNAME_INLINE_LEN-1) {
 		size_t size = offsetof(struct external_name, name[1]);
-		struct external_name *p = kmalloc(size + name->len, GFP_KERNEL);
+		struct external_name *p = kmalloc(size + name->len,
+						  GFP_KERNEL_ACCOUNT);
 		if (!p) {
 			kmem_cache_free(dentry_cache, dentry); 
 			return NULL;
@@ -3415,7 +3416,7 @@ static void __init dcache_init(void)
 	 * of the dcache. 
 	 */
 	dentry_cache = KMEM_CACHE(dentry,
-		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
+		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
 
 	/* Hash may have been set up in dcache_init_early */
 	if (!hashdist)
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
index 4f4d0474bee9..e25b6b06bacf 100644
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -663,6 +663,7 @@ static struct ecryptfs_cache_info {
 	struct kmem_cache **cache;
 	const char *name;
 	size_t size;
+	unsigned long flags;
 	void (*ctor)(void *obj);
 } ecryptfs_cache_infos[] = {
 	{
@@ -684,6 +685,7 @@ static struct ecryptfs_cache_info {
 		.cache = &ecryptfs_inode_info_cache,
 		.name = "ecryptfs_inode_cache",
 		.size = sizeof(struct ecryptfs_inode_info),
+		.flags = SLAB_ACCOUNT,
 		.ctor = inode_info_init_once,
 	},
 	{
@@ -755,8 +757,8 @@ static int ecryptfs_init_kmem_caches(void)
 		struct ecryptfs_cache_info *info;
 
 		info = &ecryptfs_cache_infos[i];
-		*(info->cache) = kmem_cache_create(info->name, info->size,
-				0, SLAB_HWCACHE_ALIGN, info->ctor);
+		*(info->cache) = kmem_cache_create(info->name, info->size, 0,
+				SLAB_HWCACHE_ALIGN | info->flags, info->ctor);
 		if (!*(info->cache)) {
 			ecryptfs_free_kmem_caches();
 			ecryptfs_printk(KERN_WARNING, "%s: "
diff --git a/fs/efs/super.c b/fs/efs/super.c
index c8411a30f7da..cb68dac4f9d3 100644
--- a/fs/efs/super.c
+++ b/fs/efs/super.c
@@ -94,9 +94,9 @@ static void init_once(void *foo)
 static int __init init_inodecache(void)
 {
 	efs_inode_cachep = kmem_cache_create("efs_inode_cache",
-				sizeof(struct efs_inode_info),
-				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-				init_once);
+				sizeof(struct efs_inode_info), 0,
+				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+				SLAB_ACCOUNT, init_once);
 	if (efs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index b795c567b5e1..6658a50530a0 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -194,8 +194,8 @@ static int init_inodecache(void)
 {
 	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
 				sizeof(struct exofs_i_info), 0,
-				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
-				exofs_init_once);
+				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
+				SLAB_ACCOUNT, exofs_init_once);
 	if (exofs_inode_cachep == NULL)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 900e19cf9ef6..973092a32b98 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -200,7 +200,7 @@ static int __init init_inodecache(void)
 	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
 					     sizeof(struct ext2_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ext2_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 04d0f1b33409..c4a5c415b881 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -966,7 +966,7 @@ static int __init init_inodecache(void)
 	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
 					     sizeof(struct ext4_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ext4_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 3a65e0132352..862916c7e3f8 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1424,8 +1424,9 @@ MODULE_ALIAS_FS("f2fs");
 
 static int __init init_inodecache(void)
 {
-	f2fs_inode_cachep = f2fs_kmem_cache_create("f2fs_inode_cache",
-			sizeof(struct f2fs_inode_info));
+	f2fs_inode_cachep = kmem_cache_create("f2fs_inode_cache",
+			sizeof(struct f2fs_inode_info), 0,
+			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT, NULL);
 	if (!f2fs_inode_cachep)
 		return -ENOMEM;
 	return 0;
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 509411dd3698..6aece96df19f 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -677,7 +677,7 @@ static int __init fat_init_inodecache(void)
 	fat_inode_cachep = kmem_cache_create("fat_inode_cache",
 					     sizeof(struct msdos_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (fat_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/file.c b/fs/file.c
index 39f8f15921da..7d76c929d557 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -37,11 +37,12 @@ static void *alloc_fdmem(size_t size)
 	 * vmalloc() if the allocation size will be considered "large" by the VM.
 	 */
 	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
-		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
+		void *data = kmalloc(size, GFP_KERNEL_ACCOUNT |
+				     __GFP_NOWARN | __GFP_NORETRY);
 		if (data != NULL)
 			return data;
 	}
-	return vmalloc(size);
+	return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
 }
 
 static void __free_fdtable(struct fdtable *fdt)
@@ -126,7 +127,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
 	if (unlikely(nr > sysctl_nr_open))
 		nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
 
-	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL);
+	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
 	if (!fdt)
 		goto out;
 	fdt->max_fds = nr;
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 2913db2a5b99..4d69d5c0bedc 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1255,8 +1255,8 @@ static int __init fuse_fs_init(void)
 	int err;
 
 	fuse_inode_cachep = kmem_cache_create("fuse_inode",
-					      sizeof(struct fuse_inode),
-					      0, SLAB_HWCACHE_ALIGN,
+					      sizeof(struct fuse_inode), 0,
+					      SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
 					      fuse_inode_init_once);
 	err = -ENOMEM;
 	if (!fuse_inode_cachep)
diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
index 241a399bf83d..6ee38e210602 100644
--- a/fs/gfs2/main.c
+++ b/fs/gfs2/main.c
@@ -112,7 +112,8 @@ static int __init init_gfs2_fs(void)
 	gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
 					      sizeof(struct gfs2_inode),
 					      0,  SLAB_RECLAIM_ACCOUNT|
-					          SLAB_MEM_SPREAD,
+					          SLAB_MEM_SPREAD|
+						  SLAB_ACCOUNT,
 					      gfs2_init_inode_once);
 	if (!gfs2_inode_cachep)
 		goto fail;
diff --git a/fs/hfs/super.c b/fs/hfs/super.c
index 4574fdd3d421..1ca95c232bb5 100644
--- a/fs/hfs/super.c
+++ b/fs/hfs/super.c
@@ -483,8 +483,8 @@ static int __init init_hfs_fs(void)
 	int err;
 
 	hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
-		sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN,
-		hfs_init_once);
+		sizeof(struct hfs_inode_info), 0,
+		SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, hfs_init_once);
 	if (!hfs_inode_cachep)
 		return -ENOMEM;
 	err = register_filesystem(&hfs_fs_type);
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index 7302d96ae8bf..5d54490a136d 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -663,7 +663,7 @@ static int __init init_hfsplus_fs(void)
 	int err;
 
 	hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
-		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN,
+		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
 		hfsplus_init_once);
 	if (!hfsplus_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index 2ac99db3750e..a4cf6b11a142 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -223,7 +223,7 @@ static struct inode *hostfs_alloc_inode(struct super_block *sb)
 {
 	struct hostfs_inode_info *hi;
 
-	hi = kmalloc(sizeof(*hi), GFP_KERNEL);
+	hi = kmalloc(sizeof(*hi), GFP_KERNEL_ACCOUNT);
 	if (hi == NULL)
 		return NULL;
 	hi->fd = -1;
diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
index a561591896bd..458cf463047b 100644
--- a/fs/hpfs/super.c
+++ b/fs/hpfs/super.c
@@ -261,7 +261,7 @@ static int init_inodecache(void)
 	hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
 					     sizeof(struct hpfs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (hpfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 316adb968b65..496add05f380 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1322,7 +1322,7 @@ static int __init init_hugetlbfs_fs(void)
 	error = -ENOMEM;
 	hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
 					sizeof(struct hugetlbfs_inode_info),
-					0, 0, init_once);
+					0, SLAB_ACCOUNT, init_once);
 	if (hugetlbfs_inode_cachep == NULL)
 		goto out2;
 
diff --git a/fs/inode.c b/fs/inode.c
index 78a17b8859e1..08c66502f1f4 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1882,7 +1882,7 @@ void __init inode_init(void)
 					 sizeof(struct inode),
 					 0,
 					 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
-					 SLAB_MEM_SPREAD),
+					 SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					 init_once);
 
 	/* Hash may have been set up in inode_init_early */
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index d67a16f2a45d..9bc2431d2df8 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -94,7 +94,7 @@ static int __init init_inodecache(void)
 	isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
 					sizeof(struct iso_inode_info),
 					0, (SLAB_RECLAIM_ACCOUNT|
-					SLAB_MEM_SPREAD),
+					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					init_once);
 	if (isofs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
index d86c5e3176a1..bb080c272149 100644
--- a/fs/jffs2/super.c
+++ b/fs/jffs2/super.c
@@ -387,7 +387,7 @@ static int __init init_jffs2_fs(void)
 	jffs2_inode_cachep = kmem_cache_create("jffs2_i",
 					     sizeof(struct jffs2_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     jffs2_i_init_once);
 	if (!jffs2_inode_cachep) {
 		pr_err("error: Failed to initialise inode cache\n");
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 4cd9798f4948..6efadc61c15b 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -901,7 +901,7 @@ static int __init init_jfs_fs(void)
 
 	jfs_inode_cachep =
 	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
-			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
 			    init_once);
 	if (jfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c
index af49e2d6941a..5d65db2e03f4 100644
--- a/fs/logfs/inode.c
+++ b/fs/logfs/inode.c
@@ -408,7 +408,8 @@ const struct super_operations logfs_super_operations = {
 int logfs_init_inode_cache(void)
 {
 	logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
-			sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT,
+			sizeof(struct logfs_inode), 0,
+			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
 			logfs_init_once);
 	if (!logfs_inode_cache)
 		return -ENOMEM;
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index 086cd0a61e80..5942c3e10fa5 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -91,7 +91,7 @@ static int __init init_inodecache(void)
 	minix_inode_cachep = kmem_cache_create("minix_inode_cache",
 					     sizeof(struct minix_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (minix_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
index 9605a2f63549..d80446e1a333 100644
--- a/fs/ncpfs/inode.c
+++ b/fs/ncpfs/inode.c
@@ -82,7 +82,7 @@ static int init_inodecache(void)
 	ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
 					     sizeof(struct ncp_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ncp_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 326d9e10d833..412f888fad13 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -1904,7 +1904,7 @@ static int __init nfs_init_inodecache(void)
 	nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
 					     sizeof(struct nfs_inode),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (nfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
index f47585bfeb01..dcf8e2ff3072 100644
--- a/fs/nilfs2/super.c
+++ b/fs/nilfs2/super.c
@@ -1419,7 +1419,8 @@ static int __init nilfs_init_cachep(void)
 {
 	nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
 			sizeof(struct nilfs_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once);
+			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
+			nilfs_inode_init_once);
 	if (!nilfs_inode_cachep)
 		goto fail;
 
diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
index d1a853585b53..2f77f8dfb861 100644
--- a/fs/ntfs/super.c
+++ b/fs/ntfs/super.c
@@ -3139,8 +3139,8 @@ static int __init init_ntfs_fs(void)
 
 	ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
 			sizeof(big_ntfs_inode), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
-			ntfs_big_inode_init_once);
+			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
+			SLAB_ACCOUNT, ntfs_big_inode_init_once);
 	if (!ntfs_big_inode_cache) {
 		pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name);
 		goto big_inode_err_out;
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index b5cf27dcb18a..03768bb3aab1 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -638,7 +638,7 @@ static int __init init_dlmfs_fs(void)
 	dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
 				sizeof(struct dlmfs_inode_private),
 				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-					SLAB_MEM_SPREAD),
+					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 				dlmfs_init_once);
 	if (!dlmfs_inode_cache) {
 		status = -ENOMEM;
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 2de4c8a9340c..8ab0fcbc0b86 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1771,7 +1771,7 @@ static int ocfs2_initialize_mem_caches(void)
 				       sizeof(struct ocfs2_inode_info),
 				       0,
 				       (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 				       ocfs2_inode_init_once);
 	ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
 					sizeof(struct ocfs2_dquot),
diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index 15e4500cda3e..b61b883c8ff8 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -443,7 +443,7 @@ static int __init init_openprom_fs(void)
 					    sizeof(struct op_inode_info),
 					    0,
 					    (SLAB_RECLAIM_ACCOUNT |
-					     SLAB_MEM_SPREAD),
+					     SLAB_MEM_SPREAD | SLAB_ACCOUNT),
 					    op_inode_init_once);
 	if (!op_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index bd95b9fdebb0..561557122dea 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -95,7 +95,8 @@ void __init proc_init_inodecache(void)
 	proc_inode_cachep = kmem_cache_create("proc_inode_cache",
 					     sizeof(struct proc_inode),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD|SLAB_PANIC),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT|
+						SLAB_PANIC),
 					     init_once);
 }
 
diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
index c4bcb778886e..f761acdd5a7a 100644
--- a/fs/qnx4/inode.c
+++ b/fs/qnx4/inode.c
@@ -364,7 +364,7 @@ static int init_inodecache(void)
 	qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
 					     sizeof(struct qnx4_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (qnx4_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
index 32d2e1a9774c..4f04f00a7e5e 100644
--- a/fs/qnx6/inode.c
+++ b/fs/qnx6/inode.c
@@ -624,7 +624,7 @@ static int init_inodecache(void)
 	qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
 					     sizeof(struct qnx6_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (!qnx6_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 4a62fe8cc3bf..05db7473bcb5 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -626,7 +626,8 @@ static int __init init_inodecache(void)
 						  sizeof(struct
 							 reiserfs_inode_info),
 						  0, (SLAB_RECLAIM_ACCOUNT|
-							SLAB_MEM_SPREAD),
+						      SLAB_MEM_SPREAD|
+						      SLAB_ACCOUNT),
 						  init_once);
 	if (reiserfs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index 268733cda397..e1113399a6b4 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -618,8 +618,8 @@ static int __init init_romfs_fs(void)
 	romfs_inode_cachep =
 		kmem_cache_create("romfs_i",
 				  sizeof(struct romfs_inode_info), 0,
-				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
-				  romfs_i_init_once);
+				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
+				  SLAB_ACCOUNT, romfs_i_init_once);
 
 	if (!romfs_inode_cachep) {
 		pr_err("Failed to initialise inode cache\n");
diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
index 5056babe00df..ea59b475663c 100644
--- a/fs/squashfs/super.c
+++ b/fs/squashfs/super.c
@@ -420,7 +420,8 @@ static int __init init_inodecache(void)
 {
 	squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
 		sizeof(struct squashfs_inode_info), 0,
-		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once);
+		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
+		init_once);
 
 	return squashfs_inode_cachep ? 0 : -ENOMEM;
 }
diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
index 590ad9206e3f..087ed6a1c1df 100644
--- a/fs/sysv/inode.c
+++ b/fs/sysv/inode.c
@@ -353,7 +353,7 @@ int __init sysv_init_icache(void)
 {
 	sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
 			sizeof(struct sysv_inode_info), 0,
-			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
+			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
 			init_once);
 	if (!sysv_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 9547a27868ad..9d064789c63a 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -2241,8 +2241,8 @@ static int __init ubifs_init(void)
 
 	ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
 				sizeof(struct ubifs_inode), 0,
-				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT,
-				&inode_slab_ctor);
+				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT |
+				SLAB_ACCOUNT, &inode_slab_ctor);
 	if (!ubifs_inode_slab)
 		return -ENOMEM;
 
diff --git a/fs/udf/super.c b/fs/udf/super.c
index 81155b9b445b..9c64a3ca9837 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -179,7 +179,8 @@ static int __init init_inodecache(void)
 	udf_inode_cachep = kmem_cache_create("udf_inode_cache",
 					     sizeof(struct udf_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT |
-						 SLAB_MEM_SPREAD),
+						 SLAB_MEM_SPREAD |
+						 SLAB_ACCOUNT),
 					     init_once);
 	if (!udf_inode_cachep)
 		return -ENOMEM;
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index f6390eec02ca..442fd52ebffe 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -1427,7 +1427,7 @@ static int __init init_inodecache(void)
 	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
 					     sizeof(struct ufs_inode_info),
 					     0, (SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 					     init_once);
 	if (ufs_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
index cc6b768fc068..d1c66e465ca5 100644
--- a/fs/xfs/kmem.h
+++ b/fs/xfs/kmem.h
@@ -84,6 +84,7 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
 #define KM_ZONE_HWALIGN	SLAB_HWCACHE_ALIGN
 #define KM_ZONE_RECLAIM	SLAB_RECLAIM_ACCOUNT
 #define KM_ZONE_SPREAD	SLAB_MEM_SPREAD
+#define KM_ZONE_ACCOUNT	SLAB_ACCOUNT
 
 #define kmem_zone	kmem_cache
 #define kmem_zone_t	struct kmem_cache
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 904f637cfa5f..70d5b3072631 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1703,8 +1703,8 @@ xfs_init_zones(void)
 
 	xfs_inode_zone =
 		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
-			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD,
-			xfs_fs_inode_init_once);
+			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
+			KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
 	if (!xfs_inode_zone)
 		goto out_destroy_efi_zone;
 
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index ff307b548ed3..b4c2a485b28a 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -56,9 +56,10 @@ extern long do_no_restart_syscall(struct restart_block *parm);
 #ifdef __KERNEL__
 
 #ifdef CONFIG_DEBUG_STACK_USAGE
-# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
+# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK | \
+				 __GFP_ZERO)
 #else
-# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK)
+# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
 #endif
 
 /*
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 161a1807e6ef..f4617cf07069 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -1438,7 +1438,7 @@ static int __init init_mqueue_fs(void)
 
 	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
 				sizeof(struct mqueue_inode_info), 0,
-				SLAB_HWCACHE_ALIGN, init_once);
+				SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, init_once);
 	if (mqueue_inode_cachep == NULL)
 		return -ENOMEM;
 
diff --git a/kernel/cred.c b/kernel/cred.c
index 71179a09c1d6..0c0cd8a62285 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -569,8 +569,8 @@ EXPORT_SYMBOL(revert_creds);
 void __init cred_init(void)
 {
 	/* allocate a slab in which we can store credentials */
-	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred),
-				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred), 0,
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
 }
 
 /**
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index ef90b04d783f..435c14a45118 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -34,7 +34,7 @@ __setup("nodelayacct", delayacct_setup_disable);
 
 void delayacct_init(void)
 {
-	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC);
+	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC|SLAB_ACCOUNT);
 	delayacct_tsk_init(&init_task);
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index f97f2c449f5c..ff39b78e6e23 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -300,9 +300,9 @@ void __init fork_init(void)
 #define ARCH_MIN_TASKALIGN	L1_CACHE_BYTES
 #endif
 	/* create a slab on which task_structs can be allocated */
-	task_struct_cachep =
-		kmem_cache_create("task_struct", arch_task_struct_size,
-			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
+	task_struct_cachep = kmem_cache_create("task_struct",
+			arch_task_struct_size, ARCH_MIN_TASKALIGN,
+			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT, NULL);
 #endif
 
 	/* do the arch specific task caches init */
@@ -1851,16 +1851,19 @@ void __init proc_caches_init(void)
 	sighand_cachep = kmem_cache_create("sighand_cache",
 			sizeof(struct sighand_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU|
-			SLAB_NOTRACK, sighand_ctor);
+			SLAB_NOTRACK|SLAB_ACCOUNT, sighand_ctor);
 	signal_cachep = kmem_cache_create("signal_cache",
 			sizeof(struct signal_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
 	files_cachep = kmem_cache_create("files_cache",
 			sizeof(struct files_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
 	fs_cachep = kmem_cache_create("fs_cache",
 			sizeof(struct fs_struct), 0,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
 	/*
 	 * FIXME! The "sizeof(struct mm_struct)" currently includes the
 	 * whole struct cpumask for the OFFSTACK case. We could change
@@ -1870,8 +1873,9 @@ void __init proc_caches_init(void)
 	 */
 	mm_cachep = kmem_cache_create("mm_struct",
 			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
-			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
-	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC);
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
+			NULL);
+	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
 	mmap_init();
 	nsproxy_cache_init();
 }
diff --git a/kernel/pid.c b/kernel/pid.c
index ca368793808e..f09b026f5b56 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -604,5 +604,5 @@ void __init pidmap_init(void)
 	atomic_dec(&init_pid_ns.pidmap[0].nr_free);
 
 	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
-			SLAB_HWCACHE_ALIGN | SLAB_PANIC);
+			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
 }
diff --git a/mm/nommu.c b/mm/nommu.c
index 92be862c859b..fbf6f0f1d6c9 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -560,7 +560,7 @@ void __init mmap_init(void)
 
 	ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
 	VM_BUG_ON(ret);
-	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC);
+	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC|SLAB_ACCOUNT);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index b577fbb98d4b..3c3f1d21f075 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -428,8 +428,10 @@ static void anon_vma_ctor(void *data)
 void __init anon_vma_init(void)
 {
 	anon_vma_cachep = kmem_cache_create("anon_vma", sizeof(struct anon_vma),
-			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC, anon_vma_ctor);
-	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain, SLAB_PANIC);
+			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC|SLAB_ACCOUNT,
+			anon_vma_ctor);
+	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain,
+			SLAB_PANIC|SLAB_ACCOUNT);
 }
 
 /*
diff --git a/mm/shmem.c b/mm/shmem.c
index 3b8b73928398..882933a7de99 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3107,7 +3107,7 @@ static int shmem_init_inodecache(void)
 {
 	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
 				sizeof(struct shmem_inode_info),
-				0, SLAB_PANIC, shmem_init_inode);
+				0, SLAB_PANIC|SLAB_ACCOUNT, shmem_init_inode);
 	return 0;
 }
 
diff --git a/net/socket.c b/net/socket.c
index 9963a0b53a64..2d70af8d943f 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -293,7 +293,7 @@ static int init_inodecache(void)
 					      0,
 					      (SLAB_HWCACHE_ALIGN |
 					       SLAB_RECLAIM_ACCOUNT |
-					       SLAB_MEM_SPREAD),
+					       SLAB_MEM_SPREAD | SLAB_ACCOUNT),
 					      init_once);
 	if (sock_inode_cachep == NULL)
 		return -ENOMEM;
diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
index d81186d34558..14f45bf0410c 100644
--- a/net/sunrpc/rpc_pipe.c
+++ b/net/sunrpc/rpc_pipe.c
@@ -1500,7 +1500,7 @@ int register_rpc_pipefs(void)
 	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
 				sizeof(struct rpc_inode),
 				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
-						SLAB_MEM_SPREAD),
+						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
 				init_once);
 	if (!rpc_inode_cachep)
 		return -ENOMEM;
-- 
2.1.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-10 18:38     ` Tejun Heo
  -1 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-10 18:38 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.
> 
> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).

Am I correct in thinking that we should eventually be able to removed
__GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
would need to be handled by kmemcg?

Thanks a lot for doing this!

-- 
tejun

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-10 18:38     ` Tejun Heo
  0 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-10 18:38 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.
> 
> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).

Am I correct in thinking that we should eventually be able to removed
__GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
would need to be handled by kmemcg?

Thanks a lot for doing this!

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-10 18:38     ` Tejun Heo
  0 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-10 18:38 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.
> 
> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).

Am I correct in thinking that we should eventually be able to removed
__GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
would need to be handled by kmemcg?

Thanks a lot for doing this!

-- 
tejun

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-10 18:38     ` Tejun Heo
@ 2015-11-10 18:54       ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 01:38:08PM -0500, Tejun Heo wrote:
> On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> > Currently, if we want to account all objects of a particular kmem cache,
> > we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> > inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> > kmem_cache_create will force accounting for every allocation from this
> > cache even if __GFP_ACCOUNT is not passed.
> > 
> > This patch does not make any of the existing caches use this flag - it
> > will be done later in the series.
> > 
> > Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> > SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> > merged slabs even if kmem accounting is not used (only compiled in).
> 
> Am I correct in thinking that we should eventually be able to removed
> __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> would need to be handled by kmemcg?

Don't think so, because sometimes we want to account kmalloc.

Thanks,
Vladimir

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-10 18:54       ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-10 18:54 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 01:38:08PM -0500, Tejun Heo wrote:
> On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> > Currently, if we want to account all objects of a particular kmem cache,
> > we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> > inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> > kmem_cache_create will force accounting for every allocation from this
> > cache even if __GFP_ACCOUNT is not passed.
> > 
> > This patch does not make any of the existing caches use this flag - it
> > will be done later in the series.
> > 
> > Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> > SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> > merged slabs even if kmem accounting is not used (only compiled in).
> 
> Am I correct in thinking that we should eventually be able to removed
> __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> would need to be handled by kmemcg?

Don't think so, because sometimes we want to account kmalloc.

Thanks,
Vladimir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-10 18:54       ` Vladimir Davydov
  (?)
@ 2015-11-11 15:54         ` Tejun Heo
  -1 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-11 15:54 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Hello,

On Tue, Nov 10, 2015 at 09:54:01PM +0300, Vladimir Davydov wrote:
> > Am I correct in thinking that we should eventually be able to removed
> > __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> > would need to be handled by kmemcg?
> 
> Don't think so, because sometimes we want to account kmalloc.

I'm kinda skeptical about that because if those allocations are
occassional by nature, we don't care and if there can be a huge number
of them, splitting them into a separate cache makes sense.  I think it
makes sense to pin down exactly which caches are memcg managed.  That
has the potential to simplify the involved code path and shave off a
small bit of hot path overhead.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-11 15:54         ` Tejun Heo
  0 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-11 15:54 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Hello,

On Tue, Nov 10, 2015 at 09:54:01PM +0300, Vladimir Davydov wrote:
> > Am I correct in thinking that we should eventually be able to removed
> > __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> > would need to be handled by kmemcg?
> 
> Don't think so, because sometimes we want to account kmalloc.

I'm kinda skeptical about that because if those allocations are
occassional by nature, we don't care and if there can be a huge number
of them, splitting them into a separate cache makes sense.  I think it
makes sense to pin down exactly which caches are memcg managed.  That
has the potential to simplify the involved code path and shave off a
small bit of hot path overhead.

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-11 15:54         ` Tejun Heo
  0 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-11 15:54 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Hello,

On Tue, Nov 10, 2015 at 09:54:01PM +0300, Vladimir Davydov wrote:
> > Am I correct in thinking that we should eventually be able to removed
> > __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> > would need to be handled by kmemcg?
> 
> Don't think so, because sometimes we want to account kmalloc.

I'm kinda skeptical about that because if those allocations are
occassional by nature, we don't care and if there can be a huge number
of them, splitting them into a separate cache makes sense.  I think it
makes sense to pin down exactly which caches are memcg managed.  That
has the potential to simplify the involved code path and shave off a
small bit of hot path overhead.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-11 15:54         ` Tejun Heo
  (?)
@ 2015-11-11 16:07           ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-11 16:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Wed, Nov 11, 2015 at 10:54:50AM -0500, Tejun Heo wrote:
> Hello,
> 
> On Tue, Nov 10, 2015 at 09:54:01PM +0300, Vladimir Davydov wrote:
> > > Am I correct in thinking that we should eventually be able to removed
> > > __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> > > would need to be handled by kmemcg?
> > 
> > Don't think so, because sometimes we want to account kmalloc.
> 
> I'm kinda skeptical about that because if those allocations are
> occassional by nature, we don't care and if there can be a huge number
> of them, splitting them into a separate cache makes sense.  I think it
> makes sense to pin down exactly which caches are memcg managed.  That
> has the potential to simplify the involved code path and shave off a
> small bit of hot path overhead.

What about external_name allocation in __d_alloc? Is it occasional?
Depends on the workload I guess. Can we create a separate cache for it?
No, because its size is variable. There are other things like that, e.g.
pipe_buffer array.

Thanks,
Vladimir

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-11 16:07           ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-11 16:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Wed, Nov 11, 2015 at 10:54:50AM -0500, Tejun Heo wrote:
> Hello,
> 
> On Tue, Nov 10, 2015 at 09:54:01PM +0300, Vladimir Davydov wrote:
> > > Am I correct in thinking that we should eventually be able to removed
> > > __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> > > would need to be handled by kmemcg?
> > 
> > Don't think so, because sometimes we want to account kmalloc.
> 
> I'm kinda skeptical about that because if those allocations are
> occassional by nature, we don't care and if there can be a huge number
> of them, splitting them into a separate cache makes sense.  I think it
> makes sense to pin down exactly which caches are memcg managed.  That
> has the potential to simplify the involved code path and shave off a
> small bit of hot path overhead.

What about external_name allocation in __d_alloc? Is it occasional?
Depends on the workload I guess. Can we create a separate cache for it?
No, because its size is variable. There are other things like that, e.g.
pipe_buffer array.

Thanks,
Vladimir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-11 16:07           ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-11 16:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Wed, Nov 11, 2015 at 10:54:50AM -0500, Tejun Heo wrote:
> Hello,
> 
> On Tue, Nov 10, 2015 at 09:54:01PM +0300, Vladimir Davydov wrote:
> > > Am I correct in thinking that we should eventually be able to removed
> > > __GFP_ACCOUNT and that only caches explicitly marked with SLAB_ACCOUNT
> > > would need to be handled by kmemcg?
> > 
> > Don't think so, because sometimes we want to account kmalloc.
> 
> I'm kinda skeptical about that because if those allocations are
> occassional by nature, we don't care and if there can be a huge number
> of them, splitting them into a separate cache makes sense.  I think it
> makes sense to pin down exactly which caches are memcg managed.  That
> has the potential to simplify the involved code path and shave off a
> small bit of hot path overhead.

What about external_name allocation in __d_alloc? Is it occasional?
Depends on the workload I guess. Can we create a separate cache for it?
No, because its size is variable. There are other things like that, e.g.
pipe_buffer array.

Thanks,
Vladimir

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-11 16:07           ` Vladimir Davydov
@ 2015-11-11 16:19             ` Tejun Heo
  -1 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-11 16:19 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Hello, Vladimir.

On Wed, Nov 11, 2015 at 07:07:19PM +0300, Vladimir Davydov wrote:
> What about external_name allocation in __d_alloc? Is it occasional?
> Depends on the workload I guess. Can we create a separate cache for it?
> No, because its size is variable. There are other things like that, e.g.
> pipe_buffer array.

You're right.  Ah, it was so close. :(

-- 
tejun

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-11 16:19             ` Tejun Heo
  0 siblings, 0 replies; 56+ messages in thread
From: Tejun Heo @ 2015-11-11 16:19 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

Hello, Vladimir.

On Wed, Nov 11, 2015 at 07:07:19PM +0300, Vladimir Davydov wrote:
> What about external_name allocation in __d_alloc? Is it occasional?
> Depends on the workload I guess. Can we create a separate cache for it?
> No, because its size is variable. There are other things like that, e.g.
> pipe_buffer array.

You're right.  Ah, it was so close. :(

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-12 16:04     ` Michal Hocko
  -1 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:04 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue 10-11-15 21:34:04, Vladimir Davydov wrote:
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So this patch switches kmem accounting to the white-policy: now only
> those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
> memcg. Currently, no kmem allocations are marked like this. The
> following patches will mark several kmem allocations that are known to
> be easily triggered from userspace and therefore should be accounted to
> memcg.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

As mentioned previously I would simply squash 1-3 into a single patch.
Anyway
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/gfp.h        | 4 ++++
>  include/linux/memcontrol.h | 2 ++
>  mm/page_alloc.c            | 3 ++-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 2b917ce34efc..61305a492356 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -30,6 +30,7 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL		0x20000u
>  #define ___GFP_THISNODE		0x40000u
>  #define ___GFP_RECLAIMABLE	0x80000u
> +#define ___GFP_ACCOUNT		0x100000u
>  #define ___GFP_NOTRACK		0x200000u
>  #define ___GFP_NO_KSWAPD	0x400000u
>  #define ___GFP_OTHER_NODE	0x800000u
> @@ -90,6 +91,8 @@ struct vm_area_struct;
>  #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
>  #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
> +#define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)	/* Account to memcg (only relevant
> +							 * to kmem allocations) */
>  #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
>  
>  #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
> @@ -112,6 +115,7 @@ struct vm_area_struct;
>  #define GFP_NOIO	(__GFP_WAIT)
>  #define GFP_NOFS	(__GFP_WAIT | __GFP_IO)
>  #define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)
> +#define GFP_KERNEL_ACCOUNT	(GFP_KERNEL | __GFP_ACCOUNT)
>  #define GFP_TEMPORARY	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
>  			 __GFP_RECLAIMABLE)
>  #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 2103f36b3bd3..c9d9a8e7b45f 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -773,6 +773,8 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
>  {
>  	if (!memcg_kmem_enabled())
>  		return true;
> +	if (!(gfp & __GFP_ACCOUNT))
> +		return true;
>  	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
>  		return true;
>  	return false;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 446bb36ee59d..8e22f5b27de0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3420,7 +3420,8 @@ EXPORT_SYMBOL(__free_page_frag);
>  
>  /*
>   * alloc_kmem_pages charges newly allocated pages to the kmem resource counter
> - * of the current memory cgroup.
> + * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
> + * equivalent to alloc_pages.
>   *
>   * It should be used when the caller would like to use kmalloc, but since the
>   * allocation is large, it has to fall back to the page allocator.
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
@ 2015-11-12 16:04     ` Michal Hocko
  0 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:04 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue 10-11-15 21:34:04, Vladimir Davydov wrote:
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So this patch switches kmem accounting to the white-policy: now only
> those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
> memcg. Currently, no kmem allocations are marked like this. The
> following patches will mark several kmem allocations that are known to
> be easily triggered from userspace and therefore should be accounted to
> memcg.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

As mentioned previously I would simply squash 1-3 into a single patch.
Anyway
Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/gfp.h        | 4 ++++
>  include/linux/memcontrol.h | 2 ++
>  mm/page_alloc.c            | 3 ++-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 2b917ce34efc..61305a492356 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -30,6 +30,7 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL		0x20000u
>  #define ___GFP_THISNODE		0x40000u
>  #define ___GFP_RECLAIMABLE	0x80000u
> +#define ___GFP_ACCOUNT		0x100000u
>  #define ___GFP_NOTRACK		0x200000u
>  #define ___GFP_NO_KSWAPD	0x400000u
>  #define ___GFP_OTHER_NODE	0x800000u
> @@ -90,6 +91,8 @@ struct vm_area_struct;
>  #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
>  #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
> +#define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)	/* Account to memcg (only relevant
> +							 * to kmem allocations) */
>  #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
>  
>  #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
> @@ -112,6 +115,7 @@ struct vm_area_struct;
>  #define GFP_NOIO	(__GFP_WAIT)
>  #define GFP_NOFS	(__GFP_WAIT | __GFP_IO)
>  #define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)
> +#define GFP_KERNEL_ACCOUNT	(GFP_KERNEL | __GFP_ACCOUNT)
>  #define GFP_TEMPORARY	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
>  			 __GFP_RECLAIMABLE)
>  #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 2103f36b3bd3..c9d9a8e7b45f 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -773,6 +773,8 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
>  {
>  	if (!memcg_kmem_enabled())
>  		return true;
> +	if (!(gfp & __GFP_ACCOUNT))
> +		return true;
>  	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
>  		return true;
>  	return false;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 446bb36ee59d..8e22f5b27de0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3420,7 +3420,8 @@ EXPORT_SYMBOL(__free_page_frag);
>  
>  /*
>   * alloc_kmem_pages charges newly allocated pages to the kmem resource counter
> - * of the current memory cgroup.
> + * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
> + * equivalent to alloc_pages.
>   *
>   * It should be used when the caller would like to use kmalloc, but since the
>   * allocation is large, it has to fall back to the page allocator.
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
@ 2015-11-12 16:04     ` Michal Hocko
  0 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:04 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue 10-11-15 21:34:04, Vladimir Davydov wrote:
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So this patch switches kmem accounting to the white-policy: now only
> those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
> memcg. Currently, no kmem allocations are marked like this. The
> following patches will mark several kmem allocations that are known to
> be easily triggered from userspace and therefore should be accounted to
> memcg.
> 
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

As mentioned previously I would simply squash 1-3 into a single patch.
Anyway
Acked-by: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>

> ---
>  include/linux/gfp.h        | 4 ++++
>  include/linux/memcontrol.h | 2 ++
>  mm/page_alloc.c            | 3 ++-
>  3 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 2b917ce34efc..61305a492356 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -30,6 +30,7 @@ struct vm_area_struct;
>  #define ___GFP_HARDWALL		0x20000u
>  #define ___GFP_THISNODE		0x40000u
>  #define ___GFP_RECLAIMABLE	0x80000u
> +#define ___GFP_ACCOUNT		0x100000u
>  #define ___GFP_NOTRACK		0x200000u
>  #define ___GFP_NO_KSWAPD	0x400000u
>  #define ___GFP_OTHER_NODE	0x800000u
> @@ -90,6 +91,8 @@ struct vm_area_struct;
>  #define __GFP_HARDWALL   ((__force gfp_t)___GFP_HARDWALL) /* Enforce hardwall cpuset memory allocs */
>  #define __GFP_THISNODE	((__force gfp_t)___GFP_THISNODE)/* No fallback, no policies */
>  #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) /* Page is reclaimable */
> +#define __GFP_ACCOUNT	((__force gfp_t)___GFP_ACCOUNT)	/* Account to memcg (only relevant
> +							 * to kmem allocations) */
>  #define __GFP_NOTRACK	((__force gfp_t)___GFP_NOTRACK)  /* Don't track with kmemcheck */
>  
>  #define __GFP_NO_KSWAPD	((__force gfp_t)___GFP_NO_KSWAPD)
> @@ -112,6 +115,7 @@ struct vm_area_struct;
>  #define GFP_NOIO	(__GFP_WAIT)
>  #define GFP_NOFS	(__GFP_WAIT | __GFP_IO)
>  #define GFP_KERNEL	(__GFP_WAIT | __GFP_IO | __GFP_FS)
> +#define GFP_KERNEL_ACCOUNT	(GFP_KERNEL | __GFP_ACCOUNT)
>  #define GFP_TEMPORARY	(__GFP_WAIT | __GFP_IO | __GFP_FS | \
>  			 __GFP_RECLAIMABLE)
>  #define GFP_USER	(__GFP_WAIT | __GFP_IO | __GFP_FS | __GFP_HARDWALL)
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 2103f36b3bd3..c9d9a8e7b45f 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -773,6 +773,8 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
>  {
>  	if (!memcg_kmem_enabled())
>  		return true;
> +	if (!(gfp & __GFP_ACCOUNT))
> +		return true;
>  	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
>  		return true;
>  	return false;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 446bb36ee59d..8e22f5b27de0 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3420,7 +3420,8 @@ EXPORT_SYMBOL(__free_page_frag);
>  
>  /*
>   * alloc_kmem_pages charges newly allocated pages to the kmem resource counter
> - * of the current memory cgroup.
> + * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
> + * equivalent to alloc_pages.
>   *
>   * It should be used when the caller would like to use kmalloc, but since the
>   * allocation is large, it has to fall back to the page allocator.
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-12 16:17     ` Michal Hocko
  -1 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:17 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue 10-11-15 21:34:05, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.

Yes this is much better and less error prone for dedicated caches.

> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).

I would expect some reasoning why this is the case. Why cannot caches of
the same memcg be merged? I remember you have mentioned something in the
previous discussion with Tejun but it should be in the changelog as well
IMO.

> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

I am not sufficiently qualified to judge the slab implementation
specifics but for the overal approach

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/memcontrol.h | 15 +++++++--------
>  include/linux/slab.h       |  5 +++++
>  mm/memcontrol.c            |  8 +++++++-
>  mm/slab.h                  |  5 +++--
>  mm/slab_common.c           |  3 ++-
>  mm/slub.c                  |  2 ++
>  6 files changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index c9d9a8e7b45f..5c97265c1c6e 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -766,15 +766,13 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
>  	return memcg ? memcg->kmemcg_id : -1;
>  }
>  
> -struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep);
> +struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
>  void __memcg_kmem_put_cache(struct kmem_cache *cachep);
>  
> -static inline bool __memcg_kmem_bypass(gfp_t gfp)
> +static inline bool __memcg_kmem_bypass(void)
>  {
>  	if (!memcg_kmem_enabled())
>  		return true;
> -	if (!(gfp & __GFP_ACCOUNT))
> -		return true;
>  	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
>  		return true;
>  	return false;
> @@ -791,7 +789,9 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
>  static __always_inline int memcg_kmem_charge(struct page *page,
>  					     gfp_t gfp, int order)
>  {
> -	if (__memcg_kmem_bypass(gfp))
> +	if (__memcg_kmem_bypass())
> +		return 0;
> +	if (!(gfp & __GFP_ACCOUNT))
>  		return 0;
>  	return __memcg_kmem_charge(page, gfp, order);
>  }
> @@ -810,16 +810,15 @@ static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
>  /**
>   * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
>   * @cachep: the original global kmem cache
> - * @gfp: allocation flags.
>   *
>   * All memory allocated from a per-memcg cache is charged to the owner memcg.
>   */
>  static __always_inline struct kmem_cache *
>  memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
> -	if (__memcg_kmem_bypass(gfp))
> +	if (__memcg_kmem_bypass())
>  		return cachep;
> -	return __memcg_kmem_get_cache(cachep);
> +	return __memcg_kmem_get_cache(cachep, gfp);
>  }
>  
>  static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 7c82e3b307a3..20168c6ffe89 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -86,6 +86,11 @@
>  #else
>  # define SLAB_FAILSLAB		0x00000000UL
>  #endif
> +#ifdef CONFIG_MEMCG_KMEM
> +# define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
> +#else
> +# define SLAB_ACCOUNT		0x00000000UL
> +#endif
>  
>  /* The following flags affect the page allocator grouping pages by mobility */
>  #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc502e590366..06e4f538e38e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2332,7 +2332,7 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>   * Can't be called in interrupt context or from kernel threads.
>   * This function needs to be called with rcu_read_lock() held.
>   */
> -struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
> +struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
>  	struct mem_cgroup *memcg;
>  	struct kmem_cache *memcg_cachep;
> @@ -2340,6 +2340,12 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
>  
>  	VM_BUG_ON(!is_root_cache(cachep));
>  
> +	if (cachep->flags & SLAB_ACCOUNT)
> +		gfp |= __GFP_ACCOUNT;
> +
> +	if (!(gfp & __GFP_ACCOUNT))
> +		return cachep;
> +
>  	if (current->memcg_kmem_skip_account)
>  		return cachep;
>  
> diff --git a/mm/slab.h b/mm/slab.h
> index 27492eb678f7..2778de8673bd 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -128,10 +128,11 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>  
>  #if defined(CONFIG_SLAB)
>  #define SLAB_CACHE_FLAGS (SLAB_MEM_SPREAD | SLAB_NOLEAKTRACE | \
> -			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | SLAB_NOTRACK)
> +			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | \
> +			  SLAB_NOTRACK | SLAB_ACCOUNT)
>  #elif defined(CONFIG_SLUB)
>  #define SLAB_CACHE_FLAGS (SLAB_NOLEAKTRACE | SLAB_RECLAIM_ACCOUNT | \
> -			  SLAB_TEMPORARY | SLAB_NOTRACK)
> +			  SLAB_TEMPORARY | SLAB_NOTRACK | SLAB_ACCOUNT)
>  #else
>  #define SLAB_CACHE_FLAGS (0)
>  #endif
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index d88e97c10a2e..698b2c97b22b 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -37,7 +37,8 @@ struct kmem_cache *kmem_cache;
>  		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
>  		SLAB_FAILSLAB)
>  
> -#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | SLAB_NOTRACK)
> +#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
> +			 SLAB_NOTRACK | SLAB_ACCOUNT)
>  
>  /*
>   * Merge control. If this is set then no merging of slab caches will occur.
> diff --git a/mm/slub.c b/mm/slub.c
> index 75a5fa92ac2a..b037cea9cfeb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5247,6 +5247,8 @@ static char *create_unique_id(struct kmem_cache *s)
>  		*p++ = 'F';
>  	if (!(s->flags & SLAB_NOTRACK))
>  		*p++ = 't';
> +	if (s->flags & SLAB_ACCOUNT)
> +		*p++ = 'A';
>  	if (p != name + 1)
>  		*p++ = '-';
>  	p += sprintf(p, "%07d", s->size);
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-12 16:17     ` Michal Hocko
  0 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:17 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue 10-11-15 21:34:05, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.

Yes this is much better and less error prone for dedicated caches.

> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).

I would expect some reasoning why this is the case. Why cannot caches of
the same memcg be merged? I remember you have mentioned something in the
previous discussion with Tejun but it should be in the changelog as well
IMO.

> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

I am not sufficiently qualified to judge the slab implementation
specifics but for the overal approach

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/memcontrol.h | 15 +++++++--------
>  include/linux/slab.h       |  5 +++++
>  mm/memcontrol.c            |  8 +++++++-
>  mm/slab.h                  |  5 +++--
>  mm/slab_common.c           |  3 ++-
>  mm/slub.c                  |  2 ++
>  6 files changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index c9d9a8e7b45f..5c97265c1c6e 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -766,15 +766,13 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
>  	return memcg ? memcg->kmemcg_id : -1;
>  }
>  
> -struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep);
> +struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
>  void __memcg_kmem_put_cache(struct kmem_cache *cachep);
>  
> -static inline bool __memcg_kmem_bypass(gfp_t gfp)
> +static inline bool __memcg_kmem_bypass(void)
>  {
>  	if (!memcg_kmem_enabled())
>  		return true;
> -	if (!(gfp & __GFP_ACCOUNT))
> -		return true;
>  	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
>  		return true;
>  	return false;
> @@ -791,7 +789,9 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
>  static __always_inline int memcg_kmem_charge(struct page *page,
>  					     gfp_t gfp, int order)
>  {
> -	if (__memcg_kmem_bypass(gfp))
> +	if (__memcg_kmem_bypass())
> +		return 0;
> +	if (!(gfp & __GFP_ACCOUNT))
>  		return 0;
>  	return __memcg_kmem_charge(page, gfp, order);
>  }
> @@ -810,16 +810,15 @@ static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
>  /**
>   * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
>   * @cachep: the original global kmem cache
> - * @gfp: allocation flags.
>   *
>   * All memory allocated from a per-memcg cache is charged to the owner memcg.
>   */
>  static __always_inline struct kmem_cache *
>  memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
> -	if (__memcg_kmem_bypass(gfp))
> +	if (__memcg_kmem_bypass())
>  		return cachep;
> -	return __memcg_kmem_get_cache(cachep);
> +	return __memcg_kmem_get_cache(cachep, gfp);
>  }
>  
>  static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 7c82e3b307a3..20168c6ffe89 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -86,6 +86,11 @@
>  #else
>  # define SLAB_FAILSLAB		0x00000000UL
>  #endif
> +#ifdef CONFIG_MEMCG_KMEM
> +# define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
> +#else
> +# define SLAB_ACCOUNT		0x00000000UL
> +#endif
>  
>  /* The following flags affect the page allocator grouping pages by mobility */
>  #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc502e590366..06e4f538e38e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2332,7 +2332,7 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>   * Can't be called in interrupt context or from kernel threads.
>   * This function needs to be called with rcu_read_lock() held.
>   */
> -struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
> +struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
>  	struct mem_cgroup *memcg;
>  	struct kmem_cache *memcg_cachep;
> @@ -2340,6 +2340,12 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
>  
>  	VM_BUG_ON(!is_root_cache(cachep));
>  
> +	if (cachep->flags & SLAB_ACCOUNT)
> +		gfp |= __GFP_ACCOUNT;
> +
> +	if (!(gfp & __GFP_ACCOUNT))
> +		return cachep;
> +
>  	if (current->memcg_kmem_skip_account)
>  		return cachep;
>  
> diff --git a/mm/slab.h b/mm/slab.h
> index 27492eb678f7..2778de8673bd 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -128,10 +128,11 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>  
>  #if defined(CONFIG_SLAB)
>  #define SLAB_CACHE_FLAGS (SLAB_MEM_SPREAD | SLAB_NOLEAKTRACE | \
> -			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | SLAB_NOTRACK)
> +			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | \
> +			  SLAB_NOTRACK | SLAB_ACCOUNT)
>  #elif defined(CONFIG_SLUB)
>  #define SLAB_CACHE_FLAGS (SLAB_NOLEAKTRACE | SLAB_RECLAIM_ACCOUNT | \
> -			  SLAB_TEMPORARY | SLAB_NOTRACK)
> +			  SLAB_TEMPORARY | SLAB_NOTRACK | SLAB_ACCOUNT)
>  #else
>  #define SLAB_CACHE_FLAGS (0)
>  #endif
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index d88e97c10a2e..698b2c97b22b 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -37,7 +37,8 @@ struct kmem_cache *kmem_cache;
>  		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
>  		SLAB_FAILSLAB)
>  
> -#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | SLAB_NOTRACK)
> +#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
> +			 SLAB_NOTRACK | SLAB_ACCOUNT)
>  
>  /*
>   * Merge control. If this is set then no merging of slab caches will occur.
> diff --git a/mm/slub.c b/mm/slub.c
> index 75a5fa92ac2a..b037cea9cfeb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5247,6 +5247,8 @@ static char *create_unique_id(struct kmem_cache *s)
>  		*p++ = 'F';
>  	if (!(s->flags & SLAB_NOTRACK))
>  		*p++ = 't';
> +	if (s->flags & SLAB_ACCOUNT)
> +		*p++ = 'A';
>  	if (p != name + 1)
>  		*p++ = '-';
>  	p += sprintf(p, "%07d", s->size);
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-12 16:17     ` Michal Hocko
  0 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:17 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue 10-11-15 21:34:05, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.

Yes this is much better and less error prone for dedicated caches.

> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).

I would expect some reasoning why this is the case. Why cannot caches of
the same memcg be merged? I remember you have mentioned something in the
previous discussion with Tejun but it should be in the changelog as well
IMO.

> Suggested-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

I am not sufficiently qualified to judge the slab implementation
specifics but for the overal approach

Acked-by: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>

> ---
>  include/linux/memcontrol.h | 15 +++++++--------
>  include/linux/slab.h       |  5 +++++
>  mm/memcontrol.c            |  8 +++++++-
>  mm/slab.h                  |  5 +++--
>  mm/slab_common.c           |  3 ++-
>  mm/slub.c                  |  2 ++
>  6 files changed, 26 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index c9d9a8e7b45f..5c97265c1c6e 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -766,15 +766,13 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
>  	return memcg ? memcg->kmemcg_id : -1;
>  }
>  
> -struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep);
> +struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
>  void __memcg_kmem_put_cache(struct kmem_cache *cachep);
>  
> -static inline bool __memcg_kmem_bypass(gfp_t gfp)
> +static inline bool __memcg_kmem_bypass(void)
>  {
>  	if (!memcg_kmem_enabled())
>  		return true;
> -	if (!(gfp & __GFP_ACCOUNT))
> -		return true;
>  	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
>  		return true;
>  	return false;
> @@ -791,7 +789,9 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
>  static __always_inline int memcg_kmem_charge(struct page *page,
>  					     gfp_t gfp, int order)
>  {
> -	if (__memcg_kmem_bypass(gfp))
> +	if (__memcg_kmem_bypass())
> +		return 0;
> +	if (!(gfp & __GFP_ACCOUNT))
>  		return 0;
>  	return __memcg_kmem_charge(page, gfp, order);
>  }
> @@ -810,16 +810,15 @@ static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
>  /**
>   * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
>   * @cachep: the original global kmem cache
> - * @gfp: allocation flags.
>   *
>   * All memory allocated from a per-memcg cache is charged to the owner memcg.
>   */
>  static __always_inline struct kmem_cache *
>  memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
> -	if (__memcg_kmem_bypass(gfp))
> +	if (__memcg_kmem_bypass())
>  		return cachep;
> -	return __memcg_kmem_get_cache(cachep);
> +	return __memcg_kmem_get_cache(cachep, gfp);
>  }
>  
>  static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 7c82e3b307a3..20168c6ffe89 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -86,6 +86,11 @@
>  #else
>  # define SLAB_FAILSLAB		0x00000000UL
>  #endif
> +#ifdef CONFIG_MEMCG_KMEM
> +# define SLAB_ACCOUNT		0x04000000UL	/* Account to memcg */
> +#else
> +# define SLAB_ACCOUNT		0x00000000UL
> +#endif
>  
>  /* The following flags affect the page allocator grouping pages by mobility */
>  #define SLAB_RECLAIM_ACCOUNT	0x00020000UL		/* Objects are reclaimable */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index bc502e590366..06e4f538e38e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2332,7 +2332,7 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>   * Can't be called in interrupt context or from kernel threads.
>   * This function needs to be called with rcu_read_lock() held.
>   */
> -struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
> +struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
>  {
>  	struct mem_cgroup *memcg;
>  	struct kmem_cache *memcg_cachep;
> @@ -2340,6 +2340,12 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep)
>  
>  	VM_BUG_ON(!is_root_cache(cachep));
>  
> +	if (cachep->flags & SLAB_ACCOUNT)
> +		gfp |= __GFP_ACCOUNT;
> +
> +	if (!(gfp & __GFP_ACCOUNT))
> +		return cachep;
> +
>  	if (current->memcg_kmem_skip_account)
>  		return cachep;
>  
> diff --git a/mm/slab.h b/mm/slab.h
> index 27492eb678f7..2778de8673bd 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -128,10 +128,11 @@ static inline unsigned long kmem_cache_flags(unsigned long object_size,
>  
>  #if defined(CONFIG_SLAB)
>  #define SLAB_CACHE_FLAGS (SLAB_MEM_SPREAD | SLAB_NOLEAKTRACE | \
> -			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | SLAB_NOTRACK)
> +			  SLAB_RECLAIM_ACCOUNT | SLAB_TEMPORARY | \
> +			  SLAB_NOTRACK | SLAB_ACCOUNT)
>  #elif defined(CONFIG_SLUB)
>  #define SLAB_CACHE_FLAGS (SLAB_NOLEAKTRACE | SLAB_RECLAIM_ACCOUNT | \
> -			  SLAB_TEMPORARY | SLAB_NOTRACK)
> +			  SLAB_TEMPORARY | SLAB_NOTRACK | SLAB_ACCOUNT)
>  #else
>  #define SLAB_CACHE_FLAGS (0)
>  #endif
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index d88e97c10a2e..698b2c97b22b 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -37,7 +37,8 @@ struct kmem_cache *kmem_cache;
>  		SLAB_TRACE | SLAB_DESTROY_BY_RCU | SLAB_NOLEAKTRACE | \
>  		SLAB_FAILSLAB)
>  
> -#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | SLAB_NOTRACK)
> +#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
> +			 SLAB_NOTRACK | SLAB_ACCOUNT)
>  
>  /*
>   * Merge control. If this is set then no merging of slab caches will occur.
> diff --git a/mm/slub.c b/mm/slub.c
> index 75a5fa92ac2a..b037cea9cfeb 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -5247,6 +5247,8 @@ static char *create_unique_id(struct kmem_cache *s)
>  		*p++ = 'F';
>  	if (!(s->flags & SLAB_NOTRACK))
>  		*p++ = 't';
> +	if (s->flags & SLAB_ACCOUNT)
> +		*p++ = 'A';
>  	if (p != name + 1)
>  		*p++ = '-';
>  	p += sprintf(p, "%07d", s->size);
> -- 
> 2.1.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 6/6] Account certain kmem allocations to memcg
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-12 16:50     ` Michal Hocko
  -1 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:50 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue 10-11-15 21:34:07, Vladimir Davydov wrote:
> This patch marks those kmem allocations that are known to be easily
> triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
> accounted to memcg. For the list, see below:
> 
>  - threadinfo
>  - task_struct
>  - task_delay_info
>  - pid
>  - cred
>  - mm_struct
>  - vm_area_struct and vm_region (nommu)
>  - anon_vma and anon_vma_chain
>  - signal_struct
>  - sighand_struct
>  - fs_struct
>  - files_struct
>  - fdtable and fdtable->full_fds_bits
>  - dentry and external_name
>  - inode for all filesystems. This is the most tedious part, because
>    most filesystems overwrite the alloc_inode method.

It would be imho nicer to split this into few patches based on the
memory category (task management, address space, icache) with a
justification.

> The list is by far not complete, so feel free to add more objects.
> Nevertheless, it should be close to "account everything" approach and
> keep most workloads within bounds. Malevolent users will be able to
> breach the limit, but this was possible even with the former "account
> everything" approach (simply because it did not account everything in
> fact).
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

>From a quick look it seems reasonable.

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
>  drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
>  fs/9p/v9fs.c                                  |  2 +-
>  fs/adfs/super.c                               |  2 +-
>  fs/affs/super.c                               |  2 +-
>  fs/afs/super.c                                |  2 +-
>  fs/befs/linuxvfs.c                            |  2 +-
>  fs/bfs/inode.c                                |  2 +-
>  fs/block_dev.c                                |  2 +-
>  fs/btrfs/inode.c                              |  3 ++-
>  fs/ceph/super.c                               |  4 ++--
>  fs/cifs/cifsfs.c                              |  2 +-
>  fs/coda/inode.c                               |  6 +++---
>  fs/dcache.c                                   |  5 +++--
>  fs/ecryptfs/main.c                            |  6 ++++--
>  fs/efs/super.c                                |  6 +++---
>  fs/exofs/super.c                              |  4 ++--
>  fs/ext2/super.c                               |  2 +-
>  fs/ext4/super.c                               |  2 +-
>  fs/f2fs/super.c                               |  5 +++--
>  fs/fat/inode.c                                |  2 +-
>  fs/file.c                                     |  7 ++++---
>  fs/fuse/inode.c                               |  4 ++--
>  fs/gfs2/main.c                                |  3 ++-
>  fs/hfs/super.c                                |  4 ++--
>  fs/hfsplus/super.c                            |  2 +-
>  fs/hostfs/hostfs_kern.c                       |  2 +-
>  fs/hpfs/super.c                               |  2 +-
>  fs/hugetlbfs/inode.c                          |  2 +-
>  fs/inode.c                                    |  2 +-
>  fs/isofs/inode.c                              |  2 +-
>  fs/jffs2/super.c                              |  2 +-
>  fs/jfs/super.c                                |  2 +-
>  fs/logfs/inode.c                              |  3 ++-
>  fs/minix/inode.c                              |  2 +-
>  fs/ncpfs/inode.c                              |  2 +-
>  fs/nfs/inode.c                                |  2 +-
>  fs/nilfs2/super.c                             |  3 ++-
>  fs/ntfs/super.c                               |  4 ++--
>  fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
>  fs/ocfs2/super.c                              |  2 +-
>  fs/openpromfs/inode.c                         |  2 +-
>  fs/proc/inode.c                               |  3 ++-
>  fs/qnx4/inode.c                               |  2 +-
>  fs/qnx6/inode.c                               |  2 +-
>  fs/reiserfs/super.c                           |  3 ++-
>  fs/romfs/super.c                              |  4 ++--
>  fs/squashfs/super.c                           |  3 ++-
>  fs/sysv/inode.c                               |  2 +-
>  fs/ubifs/super.c                              |  4 ++--
>  fs/udf/super.c                                |  3 ++-
>  fs/ufs/super.c                                |  2 +-
>  fs/xfs/kmem.h                                 |  1 +
>  fs/xfs/xfs_super.c                            |  4 ++--
>  include/linux/thread_info.h                   |  5 +++--
>  ipc/mqueue.c                                  |  2 +-
>  kernel/cred.c                                 |  4 ++--
>  kernel/delayacct.c                            |  2 +-
>  kernel/fork.c                                 | 22 +++++++++++++---------
>  kernel/pid.c                                  |  2 +-
>  mm/nommu.c                                    |  2 +-
>  mm/rmap.c                                     |  6 ++++--
>  mm/shmem.c                                    |  2 +-
>  net/socket.c                                  |  2 +-
>  net/sunrpc/rpc_pipe.c                         |  2 +-
>  65 files changed, 114 insertions(+), 92 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> index 11634fa7ab3c..ad4840f86be1 100644
> --- a/arch/powerpc/platforms/cell/spufs/inode.c
> +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> @@ -767,7 +767,7 @@ static int __init spufs_init(void)
>  	ret = -ENOMEM;
>  	spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
>  			sizeof(struct spufs_inode_info), 0,
> -			SLAB_HWCACHE_ALIGN, spufs_init_once);
> +			SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
>  
>  	if (!spufs_inode_cache)
>  		goto out;
> diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
> index 013136860664..60828d692db4 100644
> --- a/drivers/staging/lustre/lustre/llite/super25.c
> +++ b/drivers/staging/lustre/lustre/llite/super25.c
> @@ -106,7 +106,8 @@ static int __init init_lustre_lite(void)
>  	rc = -ENOMEM;
>  	ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
>  					    sizeof(struct ll_inode_info),
> -					    0, SLAB_HWCACHE_ALIGN, NULL);
> +					    0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
> +					    NULL);
>  	if (ll_inode_cachep == NULL)
>  		goto out_cache;
>  
> diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
> index 6caca025019d..072e7599583a 100644
> --- a/fs/9p/v9fs.c
> +++ b/fs/9p/v9fs.c
> @@ -575,7 +575,7 @@ static int v9fs_init_inode_cache(void)
>  	v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
>  					  sizeof(struct v9fs_inode),
>  					  0, (SLAB_RECLAIM_ACCOUNT|
> -					      SLAB_MEM_SPREAD),
> +					      SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					  v9fs_inode_init_once);
>  	if (!v9fs_inode_cache)
>  		return -ENOMEM;
> diff --git a/fs/adfs/super.c b/fs/adfs/super.c
> index 4d4a0df8344f..c9fdfb112933 100644
> --- a/fs/adfs/super.c
> +++ b/fs/adfs/super.c
> @@ -271,7 +271,7 @@ static int __init init_inodecache(void)
>  	adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
>  					     sizeof(struct adfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (adfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/affs/super.c b/fs/affs/super.c
> index 5b50c4ca43a7..84a84fcb5f5a 100644
> --- a/fs/affs/super.c
> +++ b/fs/affs/super.c
> @@ -132,7 +132,7 @@ static int __init init_inodecache(void)
>  	affs_inode_cachep = kmem_cache_create("affs_inode_cache",
>  					     sizeof(struct affs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (affs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/afs/super.c b/fs/afs/super.c
> index 1fb4a5129f7d..81afefe7d8a6 100644
> --- a/fs/afs/super.c
> +++ b/fs/afs/super.c
> @@ -91,7 +91,7 @@ int __init afs_fs_init(void)
>  	afs_inode_cachep = kmem_cache_create("afs_inode_cache",
>  					     sizeof(struct afs_vnode),
>  					     0,
> -					     SLAB_HWCACHE_ALIGN,
> +					     SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  					     afs_i_init_once);
>  	if (!afs_inode_cachep) {
>  		printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");
> diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
> index 46aedacfa6a8..2a23edf5703e 100644
> --- a/fs/befs/linuxvfs.c
> +++ b/fs/befs/linuxvfs.c
> @@ -434,7 +434,7 @@ befs_init_inodecache(void)
>  	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
>  					      sizeof (struct befs_inode_info),
>  					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					      init_once);
>  	if (befs_inode_cachep == NULL) {
>  		pr_err("%s: Couldn't initialize inode slabcache\n", __func__);
> diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
> index fdcb4d69f430..1e5c896f6b79 100644
> --- a/fs/bfs/inode.c
> +++ b/fs/bfs/inode.c
> @@ -270,7 +270,7 @@ static int __init init_inodecache(void)
>  	bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
>  					     sizeof(struct bfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (bfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 0a793c7930eb..29ce98bfe04f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -567,7 +567,7 @@ void __init bdev_cache_init(void)
>  
>  	bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
>  			0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -				SLAB_MEM_SPREAD|SLAB_PANIC),
> +				SLAB_MEM_SPREAD|SLAB_ACCOUNT|SLAB_PANIC),
>  			init_once);
>  	err = register_filesystem(&bd_type);
>  	if (err)
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 4439fbb4ff45..c24d4cd9c14f 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9157,7 +9157,8 @@ int btrfs_init_cachep(void)
>  {
>  	btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
>  			sizeof(struct btrfs_inode), 0,
> -			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once);
> +			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT,
> +			init_once);
>  	if (!btrfs_inode_cachep)
>  		goto fail;
>  
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index f446afada328..ca4d5e8457f1 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -639,8 +639,8 @@ static int __init init_caches(void)
>  	ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
>  				      sizeof(struct ceph_inode_info),
>  				      __alignof__(struct ceph_inode_info),
> -				      (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
> -				      ceph_inode_init_once);
> +				      SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				      SLAB_ACCOUNT, ceph_inode_init_once);
>  	if (ceph_inode_cachep == NULL)
>  		return -ENOMEM;
>  
> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
> index e739950ca084..7f2e2639d1d1 100644
> --- a/fs/cifs/cifsfs.c
> +++ b/fs/cifs/cifsfs.c
> @@ -1040,7 +1040,7 @@ cifs_init_inodecache(void)
>  	cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
>  					      sizeof(struct cifsInodeInfo),
>  					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					      cifs_init_once);
>  	if (cifs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/coda/inode.c b/fs/coda/inode.c
> index cac1390b87a3..57e81cbba0fa 100644
> --- a/fs/coda/inode.c
> +++ b/fs/coda/inode.c
> @@ -74,9 +74,9 @@ static void init_once(void *foo)
>  int __init coda_init_inodecache(void)
>  {
>  	coda_inode_cachep = kmem_cache_create("coda_inode_cache",
> -				sizeof(struct coda_inode_info),
> -				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -				init_once);
> +				sizeof(struct coda_inode_info), 0,
> +				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				SLAB_ACCOUNT, init_once);
>  	if (coda_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5c33aeb0f68f..7ac590912106 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1571,7 +1571,8 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  	dentry->d_iname[DNAME_INLINE_LEN-1] = 0;
>  	if (name->len > DNAME_INLINE_LEN-1) {
>  		size_t size = offsetof(struct external_name, name[1]);
> -		struct external_name *p = kmalloc(size + name->len, GFP_KERNEL);
> +		struct external_name *p = kmalloc(size + name->len,
> +						  GFP_KERNEL_ACCOUNT);
>  		if (!p) {
>  			kmem_cache_free(dentry_cache, dentry); 
>  			return NULL;
> @@ -3415,7 +3416,7 @@ static void __init dcache_init(void)
>  	 * of the dcache. 
>  	 */
>  	dentry_cache = KMEM_CACHE(dentry,
> -		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
> +		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
>  
>  	/* Hash may have been set up in dcache_init_early */
>  	if (!hashdist)
> diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
> index 4f4d0474bee9..e25b6b06bacf 100644
> --- a/fs/ecryptfs/main.c
> +++ b/fs/ecryptfs/main.c
> @@ -663,6 +663,7 @@ static struct ecryptfs_cache_info {
>  	struct kmem_cache **cache;
>  	const char *name;
>  	size_t size;
> +	unsigned long flags;
>  	void (*ctor)(void *obj);
>  } ecryptfs_cache_infos[] = {
>  	{
> @@ -684,6 +685,7 @@ static struct ecryptfs_cache_info {
>  		.cache = &ecryptfs_inode_info_cache,
>  		.name = "ecryptfs_inode_cache",
>  		.size = sizeof(struct ecryptfs_inode_info),
> +		.flags = SLAB_ACCOUNT,
>  		.ctor = inode_info_init_once,
>  	},
>  	{
> @@ -755,8 +757,8 @@ static int ecryptfs_init_kmem_caches(void)
>  		struct ecryptfs_cache_info *info;
>  
>  		info = &ecryptfs_cache_infos[i];
> -		*(info->cache) = kmem_cache_create(info->name, info->size,
> -				0, SLAB_HWCACHE_ALIGN, info->ctor);
> +		*(info->cache) = kmem_cache_create(info->name, info->size, 0,
> +				SLAB_HWCACHE_ALIGN | info->flags, info->ctor);
>  		if (!*(info->cache)) {
>  			ecryptfs_free_kmem_caches();
>  			ecryptfs_printk(KERN_WARNING, "%s: "
> diff --git a/fs/efs/super.c b/fs/efs/super.c
> index c8411a30f7da..cb68dac4f9d3 100644
> --- a/fs/efs/super.c
> +++ b/fs/efs/super.c
> @@ -94,9 +94,9 @@ static void init_once(void *foo)
>  static int __init init_inodecache(void)
>  {
>  	efs_inode_cachep = kmem_cache_create("efs_inode_cache",
> -				sizeof(struct efs_inode_info),
> -				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -				init_once);
> +				sizeof(struct efs_inode_info), 0,
> +				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				SLAB_ACCOUNT, init_once);
>  	if (efs_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/exofs/super.c b/fs/exofs/super.c
> index b795c567b5e1..6658a50530a0 100644
> --- a/fs/exofs/super.c
> +++ b/fs/exofs/super.c
> @@ -194,8 +194,8 @@ static int init_inodecache(void)
>  {
>  	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
>  				sizeof(struct exofs_i_info), 0,
> -				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
> -				exofs_init_once);
> +				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
> +				SLAB_ACCOUNT, exofs_init_once);
>  	if (exofs_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index 900e19cf9ef6..973092a32b98 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -200,7 +200,7 @@ static int __init init_inodecache(void)
>  	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
>  					     sizeof(struct ext2_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ext2_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 04d0f1b33409..c4a5c415b881 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -966,7 +966,7 @@ static int __init init_inodecache(void)
>  	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
>  					     sizeof(struct ext4_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ext4_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 3a65e0132352..862916c7e3f8 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1424,8 +1424,9 @@ MODULE_ALIAS_FS("f2fs");
>  
>  static int __init init_inodecache(void)
>  {
> -	f2fs_inode_cachep = f2fs_kmem_cache_create("f2fs_inode_cache",
> -			sizeof(struct f2fs_inode_info));
> +	f2fs_inode_cachep = kmem_cache_create("f2fs_inode_cache",
> +			sizeof(struct f2fs_inode_info), 0,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT, NULL);
>  	if (!f2fs_inode_cachep)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/fat/inode.c b/fs/fat/inode.c
> index 509411dd3698..6aece96df19f 100644
> --- a/fs/fat/inode.c
> +++ b/fs/fat/inode.c
> @@ -677,7 +677,7 @@ static int __init fat_init_inodecache(void)
>  	fat_inode_cachep = kmem_cache_create("fat_inode_cache",
>  					     sizeof(struct msdos_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (fat_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/file.c b/fs/file.c
> index 39f8f15921da..7d76c929d557 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -37,11 +37,12 @@ static void *alloc_fdmem(size_t size)
>  	 * vmalloc() if the allocation size will be considered "large" by the VM.
>  	 */
>  	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
> +		void *data = kmalloc(size, GFP_KERNEL_ACCOUNT |
> +				     __GFP_NOWARN | __GFP_NORETRY);
>  		if (data != NULL)
>  			return data;
>  	}
> -	return vmalloc(size);
> +	return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
>  }
>  
>  static void __free_fdtable(struct fdtable *fdt)
> @@ -126,7 +127,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
>  	if (unlikely(nr > sysctl_nr_open))
>  		nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
>  
> -	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL);
> +	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
>  	if (!fdt)
>  		goto out;
>  	fdt->max_fds = nr;
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 2913db2a5b99..4d69d5c0bedc 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1255,8 +1255,8 @@ static int __init fuse_fs_init(void)
>  	int err;
>  
>  	fuse_inode_cachep = kmem_cache_create("fuse_inode",
> -					      sizeof(struct fuse_inode),
> -					      0, SLAB_HWCACHE_ALIGN,
> +					      sizeof(struct fuse_inode), 0,
> +					      SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  					      fuse_inode_init_once);
>  	err = -ENOMEM;
>  	if (!fuse_inode_cachep)
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 241a399bf83d..6ee38e210602 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -112,7 +112,8 @@ static int __init init_gfs2_fs(void)
>  	gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
>  					      sizeof(struct gfs2_inode),
>  					      0,  SLAB_RECLAIM_ACCOUNT|
> -					          SLAB_MEM_SPREAD,
> +					          SLAB_MEM_SPREAD|
> +						  SLAB_ACCOUNT,
>  					      gfs2_init_inode_once);
>  	if (!gfs2_inode_cachep)
>  		goto fail;
> diff --git a/fs/hfs/super.c b/fs/hfs/super.c
> index 4574fdd3d421..1ca95c232bb5 100644
> --- a/fs/hfs/super.c
> +++ b/fs/hfs/super.c
> @@ -483,8 +483,8 @@ static int __init init_hfs_fs(void)
>  	int err;
>  
>  	hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
> -		sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN,
> -		hfs_init_once);
> +		sizeof(struct hfs_inode_info), 0,
> +		SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, hfs_init_once);
>  	if (!hfs_inode_cachep)
>  		return -ENOMEM;
>  	err = register_filesystem(&hfs_fs_type);
> diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
> index 7302d96ae8bf..5d54490a136d 100644
> --- a/fs/hfsplus/super.c
> +++ b/fs/hfsplus/super.c
> @@ -663,7 +663,7 @@ static int __init init_hfsplus_fs(void)
>  	int err;
>  
>  	hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
> -		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN,
> +		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  		hfsplus_init_once);
>  	if (!hfsplus_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
> index 2ac99db3750e..a4cf6b11a142 100644
> --- a/fs/hostfs/hostfs_kern.c
> +++ b/fs/hostfs/hostfs_kern.c
> @@ -223,7 +223,7 @@ static struct inode *hostfs_alloc_inode(struct super_block *sb)
>  {
>  	struct hostfs_inode_info *hi;
>  
> -	hi = kmalloc(sizeof(*hi), GFP_KERNEL);
> +	hi = kmalloc(sizeof(*hi), GFP_KERNEL_ACCOUNT);
>  	if (hi == NULL)
>  		return NULL;
>  	hi->fd = -1;
> diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
> index a561591896bd..458cf463047b 100644
> --- a/fs/hpfs/super.c
> +++ b/fs/hpfs/super.c
> @@ -261,7 +261,7 @@ static int init_inodecache(void)
>  	hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
>  					     sizeof(struct hpfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (hpfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 316adb968b65..496add05f380 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1322,7 +1322,7 @@ static int __init init_hugetlbfs_fs(void)
>  	error = -ENOMEM;
>  	hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
>  					sizeof(struct hugetlbfs_inode_info),
> -					0, 0, init_once);
> +					0, SLAB_ACCOUNT, init_once);
>  	if (hugetlbfs_inode_cachep == NULL)
>  		goto out2;
>  
> diff --git a/fs/inode.c b/fs/inode.c
> index 78a17b8859e1..08c66502f1f4 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1882,7 +1882,7 @@ void __init inode_init(void)
>  					 sizeof(struct inode),
>  					 0,
>  					 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
> -					 SLAB_MEM_SPREAD),
> +					 SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					 init_once);
>  
>  	/* Hash may have been set up in inode_init_early */
> diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
> index d67a16f2a45d..9bc2431d2df8 100644
> --- a/fs/isofs/inode.c
> +++ b/fs/isofs/inode.c
> @@ -94,7 +94,7 @@ static int __init init_inodecache(void)
>  	isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
>  					sizeof(struct iso_inode_info),
>  					0, (SLAB_RECLAIM_ACCOUNT|
> -					SLAB_MEM_SPREAD),
> +					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					init_once);
>  	if (isofs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
> index d86c5e3176a1..bb080c272149 100644
> --- a/fs/jffs2/super.c
> +++ b/fs/jffs2/super.c
> @@ -387,7 +387,7 @@ static int __init init_jffs2_fs(void)
>  	jffs2_inode_cachep = kmem_cache_create("jffs2_i",
>  					     sizeof(struct jffs2_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     jffs2_i_init_once);
>  	if (!jffs2_inode_cachep) {
>  		pr_err("error: Failed to initialise inode cache\n");
> diff --git a/fs/jfs/super.c b/fs/jfs/super.c
> index 4cd9798f4948..6efadc61c15b 100644
> --- a/fs/jfs/super.c
> +++ b/fs/jfs/super.c
> @@ -901,7 +901,7 @@ static int __init init_jfs_fs(void)
>  
>  	jfs_inode_cachep =
>  	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
> -			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> +			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
>  			    init_once);
>  	if (jfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c
> index af49e2d6941a..5d65db2e03f4 100644
> --- a/fs/logfs/inode.c
> +++ b/fs/logfs/inode.c
> @@ -408,7 +408,8 @@ const struct super_operations logfs_super_operations = {
>  int logfs_init_inode_cache(void)
>  {
>  	logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
> -			sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT,
> +			sizeof(struct logfs_inode), 0,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
>  			logfs_init_once);
>  	if (!logfs_inode_cache)
>  		return -ENOMEM;
> diff --git a/fs/minix/inode.c b/fs/minix/inode.c
> index 086cd0a61e80..5942c3e10fa5 100644
> --- a/fs/minix/inode.c
> +++ b/fs/minix/inode.c
> @@ -91,7 +91,7 @@ static int __init init_inodecache(void)
>  	minix_inode_cachep = kmem_cache_create("minix_inode_cache",
>  					     sizeof(struct minix_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (minix_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
> index 9605a2f63549..d80446e1a333 100644
> --- a/fs/ncpfs/inode.c
> +++ b/fs/ncpfs/inode.c
> @@ -82,7 +82,7 @@ static int init_inodecache(void)
>  	ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
>  					     sizeof(struct ncp_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ncp_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 326d9e10d833..412f888fad13 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1904,7 +1904,7 @@ static int __init nfs_init_inodecache(void)
>  	nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
>  					     sizeof(struct nfs_inode),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (nfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
> index f47585bfeb01..dcf8e2ff3072 100644
> --- a/fs/nilfs2/super.c
> +++ b/fs/nilfs2/super.c
> @@ -1419,7 +1419,8 @@ static int __init nilfs_init_cachep(void)
>  {
>  	nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
>  			sizeof(struct nilfs_inode_info), 0,
> -			SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once);
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
> +			nilfs_inode_init_once);
>  	if (!nilfs_inode_cachep)
>  		goto fail;
>  
> diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
> index d1a853585b53..2f77f8dfb861 100644
> --- a/fs/ntfs/super.c
> +++ b/fs/ntfs/super.c
> @@ -3139,8 +3139,8 @@ static int __init init_ntfs_fs(void)
>  
>  	ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
>  			sizeof(big_ntfs_inode), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -			ntfs_big_inode_init_once);
> +			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +			SLAB_ACCOUNT, ntfs_big_inode_init_once);
>  	if (!ntfs_big_inode_cache) {
>  		pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name);
>  		goto big_inode_err_out;
> diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
> index b5cf27dcb18a..03768bb3aab1 100644
> --- a/fs/ocfs2/dlmfs/dlmfs.c
> +++ b/fs/ocfs2/dlmfs/dlmfs.c
> @@ -638,7 +638,7 @@ static int __init init_dlmfs_fs(void)
>  	dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
>  				sizeof(struct dlmfs_inode_private),
>  				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -					SLAB_MEM_SPREAD),
> +					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				dlmfs_init_once);
>  	if (!dlmfs_inode_cache) {
>  		status = -ENOMEM;
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 2de4c8a9340c..8ab0fcbc0b86 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1771,7 +1771,7 @@ static int ocfs2_initialize_mem_caches(void)
>  				       sizeof(struct ocfs2_inode_info),
>  				       0,
>  				       (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				       ocfs2_inode_init_once);
>  	ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
>  					sizeof(struct ocfs2_dquot),
> diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
> index 15e4500cda3e..b61b883c8ff8 100644
> --- a/fs/openpromfs/inode.c
> +++ b/fs/openpromfs/inode.c
> @@ -443,7 +443,7 @@ static int __init init_openprom_fs(void)
>  					    sizeof(struct op_inode_info),
>  					    0,
>  					    (SLAB_RECLAIM_ACCOUNT |
> -					     SLAB_MEM_SPREAD),
> +					     SLAB_MEM_SPREAD | SLAB_ACCOUNT),
>  					    op_inode_init_once);
>  	if (!op_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
> index bd95b9fdebb0..561557122dea 100644
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -95,7 +95,8 @@ void __init proc_init_inodecache(void)
>  	proc_inode_cachep = kmem_cache_create("proc_inode_cache",
>  					     sizeof(struct proc_inode),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD|SLAB_PANIC),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT|
> +						SLAB_PANIC),
>  					     init_once);
>  }
>  
> diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
> index c4bcb778886e..f761acdd5a7a 100644
> --- a/fs/qnx4/inode.c
> +++ b/fs/qnx4/inode.c
> @@ -364,7 +364,7 @@ static int init_inodecache(void)
>  	qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
>  					     sizeof(struct qnx4_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (qnx4_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
> index 32d2e1a9774c..4f04f00a7e5e 100644
> --- a/fs/qnx6/inode.c
> +++ b/fs/qnx6/inode.c
> @@ -624,7 +624,7 @@ static int init_inodecache(void)
>  	qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
>  					     sizeof(struct qnx6_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (!qnx6_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
> index 4a62fe8cc3bf..05db7473bcb5 100644
> --- a/fs/reiserfs/super.c
> +++ b/fs/reiserfs/super.c
> @@ -626,7 +626,8 @@ static int __init init_inodecache(void)
>  						  sizeof(struct
>  							 reiserfs_inode_info),
>  						  0, (SLAB_RECLAIM_ACCOUNT|
> -							SLAB_MEM_SPREAD),
> +						      SLAB_MEM_SPREAD|
> +						      SLAB_ACCOUNT),
>  						  init_once);
>  	if (reiserfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/romfs/super.c b/fs/romfs/super.c
> index 268733cda397..e1113399a6b4 100644
> --- a/fs/romfs/super.c
> +++ b/fs/romfs/super.c
> @@ -618,8 +618,8 @@ static int __init init_romfs_fs(void)
>  	romfs_inode_cachep =
>  		kmem_cache_create("romfs_i",
>  				  sizeof(struct romfs_inode_info), 0,
> -				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
> -				  romfs_i_init_once);
> +				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
> +				  SLAB_ACCOUNT, romfs_i_init_once);
>  
>  	if (!romfs_inode_cachep) {
>  		pr_err("Failed to initialise inode cache\n");
> diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
> index 5056babe00df..ea59b475663c 100644
> --- a/fs/squashfs/super.c
> +++ b/fs/squashfs/super.c
> @@ -420,7 +420,8 @@ static int __init init_inodecache(void)
>  {
>  	squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
>  		sizeof(struct squashfs_inode_info), 0,
> -		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once);
> +		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
> +		init_once);
>  
>  	return squashfs_inode_cachep ? 0 : -ENOMEM;
>  }
> diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
> index 590ad9206e3f..087ed6a1c1df 100644
> --- a/fs/sysv/inode.c
> +++ b/fs/sysv/inode.c
> @@ -353,7 +353,7 @@ int __init sysv_init_icache(void)
>  {
>  	sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
>  			sizeof(struct sysv_inode_info), 0,
> -			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
>  			init_once);
>  	if (!sysv_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index 9547a27868ad..9d064789c63a 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2241,8 +2241,8 @@ static int __init ubifs_init(void)
>  
>  	ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
>  				sizeof(struct ubifs_inode), 0,
> -				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT,
> -				&inode_slab_ctor);
> +				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT |
> +				SLAB_ACCOUNT, &inode_slab_ctor);
>  	if (!ubifs_inode_slab)
>  		return -ENOMEM;
>  
> diff --git a/fs/udf/super.c b/fs/udf/super.c
> index 81155b9b445b..9c64a3ca9837 100644
> --- a/fs/udf/super.c
> +++ b/fs/udf/super.c
> @@ -179,7 +179,8 @@ static int __init init_inodecache(void)
>  	udf_inode_cachep = kmem_cache_create("udf_inode_cache",
>  					     sizeof(struct udf_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT |
> -						 SLAB_MEM_SPREAD),
> +						 SLAB_MEM_SPREAD |
> +						 SLAB_ACCOUNT),
>  					     init_once);
>  	if (!udf_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/ufs/super.c b/fs/ufs/super.c
> index f6390eec02ca..442fd52ebffe 100644
> --- a/fs/ufs/super.c
> +++ b/fs/ufs/super.c
> @@ -1427,7 +1427,7 @@ static int __init init_inodecache(void)
>  	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
>  					     sizeof(struct ufs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ufs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> index cc6b768fc068..d1c66e465ca5 100644
> --- a/fs/xfs/kmem.h
> +++ b/fs/xfs/kmem.h
> @@ -84,6 +84,7 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
>  #define KM_ZONE_HWALIGN	SLAB_HWCACHE_ALIGN
>  #define KM_ZONE_RECLAIM	SLAB_RECLAIM_ACCOUNT
>  #define KM_ZONE_SPREAD	SLAB_MEM_SPREAD
> +#define KM_ZONE_ACCOUNT	SLAB_ACCOUNT
>  
>  #define kmem_zone	kmem_cache
>  #define kmem_zone_t	struct kmem_cache
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 904f637cfa5f..70d5b3072631 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1703,8 +1703,8 @@ xfs_init_zones(void)
>  
>  	xfs_inode_zone =
>  		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
> -			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD,
> -			xfs_fs_inode_init_once);
> +			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
> +			KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
>  	if (!xfs_inode_zone)
>  		goto out_destroy_efi_zone;
>  
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index ff307b548ed3..b4c2a485b28a 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -56,9 +56,10 @@ extern long do_no_restart_syscall(struct restart_block *parm);
>  #ifdef __KERNEL__
>  
>  #ifdef CONFIG_DEBUG_STACK_USAGE
> -# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
> +# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK | \
> +				 __GFP_ZERO)
>  #else
> -# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK)
> +# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
>  #endif
>  
>  /*
> diff --git a/ipc/mqueue.c b/ipc/mqueue.c
> index 161a1807e6ef..f4617cf07069 100644
> --- a/ipc/mqueue.c
> +++ b/ipc/mqueue.c
> @@ -1438,7 +1438,7 @@ static int __init init_mqueue_fs(void)
>  
>  	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
>  				sizeof(struct mqueue_inode_info), 0,
> -				SLAB_HWCACHE_ALIGN, init_once);
> +				SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, init_once);
>  	if (mqueue_inode_cachep == NULL)
>  		return -ENOMEM;
>  
> diff --git a/kernel/cred.c b/kernel/cred.c
> index 71179a09c1d6..0c0cd8a62285 100644
> --- a/kernel/cred.c
> +++ b/kernel/cred.c
> @@ -569,8 +569,8 @@ EXPORT_SYMBOL(revert_creds);
>  void __init cred_init(void)
>  {
>  	/* allocate a slab in which we can store credentials */
> -	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred),
> -				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> +	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred), 0,
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
>  }
>  
>  /**
> diff --git a/kernel/delayacct.c b/kernel/delayacct.c
> index ef90b04d783f..435c14a45118 100644
> --- a/kernel/delayacct.c
> +++ b/kernel/delayacct.c
> @@ -34,7 +34,7 @@ __setup("nodelayacct", delayacct_setup_disable);
>  
>  void delayacct_init(void)
>  {
> -	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC);
> +	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC|SLAB_ACCOUNT);
>  	delayacct_tsk_init(&init_task);
>  }
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index f97f2c449f5c..ff39b78e6e23 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -300,9 +300,9 @@ void __init fork_init(void)
>  #define ARCH_MIN_TASKALIGN	L1_CACHE_BYTES
>  #endif
>  	/* create a slab on which task_structs can be allocated */
> -	task_struct_cachep =
> -		kmem_cache_create("task_struct", arch_task_struct_size,
> -			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
> +	task_struct_cachep = kmem_cache_create("task_struct",
> +			arch_task_struct_size, ARCH_MIN_TASKALIGN,
> +			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT, NULL);
>  #endif
>  
>  	/* do the arch specific task caches init */
> @@ -1851,16 +1851,19 @@ void __init proc_caches_init(void)
>  	sighand_cachep = kmem_cache_create("sighand_cache",
>  			sizeof(struct sighand_struct), 0,
>  			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU|
> -			SLAB_NOTRACK, sighand_ctor);
> +			SLAB_NOTRACK|SLAB_ACCOUNT, sighand_ctor);
>  	signal_cachep = kmem_cache_create("signal_cache",
>  			sizeof(struct signal_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	files_cachep = kmem_cache_create("files_cache",
>  			sizeof(struct files_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	fs_cachep = kmem_cache_create("fs_cache",
>  			sizeof(struct fs_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	/*
>  	 * FIXME! The "sizeof(struct mm_struct)" currently includes the
>  	 * whole struct cpumask for the OFFSTACK case. We could change
> @@ -1870,8 +1873,9 @@ void __init proc_caches_init(void)
>  	 */
>  	mm_cachep = kmem_cache_create("mm_struct",
>  			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> -	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
> +	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
>  	mmap_init();
>  	nsproxy_cache_init();
>  }
> diff --git a/kernel/pid.c b/kernel/pid.c
> index ca368793808e..f09b026f5b56 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -604,5 +604,5 @@ void __init pidmap_init(void)
>  	atomic_dec(&init_pid_ns.pidmap[0].nr_free);
>  
>  	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
> -			SLAB_HWCACHE_ALIGN | SLAB_PANIC);
> +			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
>  }
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 92be862c859b..fbf6f0f1d6c9 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -560,7 +560,7 @@ void __init mmap_init(void)
>  
>  	ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
>  	VM_BUG_ON(ret);
> -	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC);
> +	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC|SLAB_ACCOUNT);
>  }
>  
>  /*
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b577fbb98d4b..3c3f1d21f075 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -428,8 +428,10 @@ static void anon_vma_ctor(void *data)
>  void __init anon_vma_init(void)
>  {
>  	anon_vma_cachep = kmem_cache_create("anon_vma", sizeof(struct anon_vma),
> -			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC, anon_vma_ctor);
> -	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain, SLAB_PANIC);
> +			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC|SLAB_ACCOUNT,
> +			anon_vma_ctor);
> +	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain,
> +			SLAB_PANIC|SLAB_ACCOUNT);
>  }
>  
>  /*
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 3b8b73928398..882933a7de99 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3107,7 +3107,7 @@ static int shmem_init_inodecache(void)
>  {
>  	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
>  				sizeof(struct shmem_inode_info),
> -				0, SLAB_PANIC, shmem_init_inode);
> +				0, SLAB_PANIC|SLAB_ACCOUNT, shmem_init_inode);
>  	return 0;
>  }
>  
> diff --git a/net/socket.c b/net/socket.c
> index 9963a0b53a64..2d70af8d943f 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -293,7 +293,7 @@ static int init_inodecache(void)
>  					      0,
>  					      (SLAB_HWCACHE_ALIGN |
>  					       SLAB_RECLAIM_ACCOUNT |
> -					       SLAB_MEM_SPREAD),
> +					       SLAB_MEM_SPREAD | SLAB_ACCOUNT),
>  					      init_once);
>  	if (sock_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index d81186d34558..14f45bf0410c 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -1500,7 +1500,7 @@ int register_rpc_pipefs(void)
>  	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
>  				sizeof(struct rpc_inode),
>  				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				init_once);
>  	if (!rpc_inode_cachep)
>  		return -ENOMEM;
> -- 
> 2.1.4
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 6/6] Account certain kmem allocations to memcg
@ 2015-11-12 16:50     ` Michal Hocko
  0 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:50 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue 10-11-15 21:34:07, Vladimir Davydov wrote:
> This patch marks those kmem allocations that are known to be easily
> triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
> accounted to memcg. For the list, see below:
> 
>  - threadinfo
>  - task_struct
>  - task_delay_info
>  - pid
>  - cred
>  - mm_struct
>  - vm_area_struct and vm_region (nommu)
>  - anon_vma and anon_vma_chain
>  - signal_struct
>  - sighand_struct
>  - fs_struct
>  - files_struct
>  - fdtable and fdtable->full_fds_bits
>  - dentry and external_name
>  - inode for all filesystems. This is the most tedious part, because
>    most filesystems overwrite the alloc_inode method.

It would be imho nicer to split this into few patches based on the
memory category (task management, address space, icache) with a
justification.

> The list is by far not complete, so feel free to add more objects.
> Nevertheless, it should be close to "account everything" approach and
> keep most workloads within bounds. Malevolent users will be able to
> breach the limit, but this was possible even with the former "account
> everything" approach (simply because it did not account everything in
> fact).
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

>From a quick look it seems reasonable.

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
>  drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
>  fs/9p/v9fs.c                                  |  2 +-
>  fs/adfs/super.c                               |  2 +-
>  fs/affs/super.c                               |  2 +-
>  fs/afs/super.c                                |  2 +-
>  fs/befs/linuxvfs.c                            |  2 +-
>  fs/bfs/inode.c                                |  2 +-
>  fs/block_dev.c                                |  2 +-
>  fs/btrfs/inode.c                              |  3 ++-
>  fs/ceph/super.c                               |  4 ++--
>  fs/cifs/cifsfs.c                              |  2 +-
>  fs/coda/inode.c                               |  6 +++---
>  fs/dcache.c                                   |  5 +++--
>  fs/ecryptfs/main.c                            |  6 ++++--
>  fs/efs/super.c                                |  6 +++---
>  fs/exofs/super.c                              |  4 ++--
>  fs/ext2/super.c                               |  2 +-
>  fs/ext4/super.c                               |  2 +-
>  fs/f2fs/super.c                               |  5 +++--
>  fs/fat/inode.c                                |  2 +-
>  fs/file.c                                     |  7 ++++---
>  fs/fuse/inode.c                               |  4 ++--
>  fs/gfs2/main.c                                |  3 ++-
>  fs/hfs/super.c                                |  4 ++--
>  fs/hfsplus/super.c                            |  2 +-
>  fs/hostfs/hostfs_kern.c                       |  2 +-
>  fs/hpfs/super.c                               |  2 +-
>  fs/hugetlbfs/inode.c                          |  2 +-
>  fs/inode.c                                    |  2 +-
>  fs/isofs/inode.c                              |  2 +-
>  fs/jffs2/super.c                              |  2 +-
>  fs/jfs/super.c                                |  2 +-
>  fs/logfs/inode.c                              |  3 ++-
>  fs/minix/inode.c                              |  2 +-
>  fs/ncpfs/inode.c                              |  2 +-
>  fs/nfs/inode.c                                |  2 +-
>  fs/nilfs2/super.c                             |  3 ++-
>  fs/ntfs/super.c                               |  4 ++--
>  fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
>  fs/ocfs2/super.c                              |  2 +-
>  fs/openpromfs/inode.c                         |  2 +-
>  fs/proc/inode.c                               |  3 ++-
>  fs/qnx4/inode.c                               |  2 +-
>  fs/qnx6/inode.c                               |  2 +-
>  fs/reiserfs/super.c                           |  3 ++-
>  fs/romfs/super.c                              |  4 ++--
>  fs/squashfs/super.c                           |  3 ++-
>  fs/sysv/inode.c                               |  2 +-
>  fs/ubifs/super.c                              |  4 ++--
>  fs/udf/super.c                                |  3 ++-
>  fs/ufs/super.c                                |  2 +-
>  fs/xfs/kmem.h                                 |  1 +
>  fs/xfs/xfs_super.c                            |  4 ++--
>  include/linux/thread_info.h                   |  5 +++--
>  ipc/mqueue.c                                  |  2 +-
>  kernel/cred.c                                 |  4 ++--
>  kernel/delayacct.c                            |  2 +-
>  kernel/fork.c                                 | 22 +++++++++++++---------
>  kernel/pid.c                                  |  2 +-
>  mm/nommu.c                                    |  2 +-
>  mm/rmap.c                                     |  6 ++++--
>  mm/shmem.c                                    |  2 +-
>  net/socket.c                                  |  2 +-
>  net/sunrpc/rpc_pipe.c                         |  2 +-
>  65 files changed, 114 insertions(+), 92 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> index 11634fa7ab3c..ad4840f86be1 100644
> --- a/arch/powerpc/platforms/cell/spufs/inode.c
> +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> @@ -767,7 +767,7 @@ static int __init spufs_init(void)
>  	ret = -ENOMEM;
>  	spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
>  			sizeof(struct spufs_inode_info), 0,
> -			SLAB_HWCACHE_ALIGN, spufs_init_once);
> +			SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
>  
>  	if (!spufs_inode_cache)
>  		goto out;
> diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
> index 013136860664..60828d692db4 100644
> --- a/drivers/staging/lustre/lustre/llite/super25.c
> +++ b/drivers/staging/lustre/lustre/llite/super25.c
> @@ -106,7 +106,8 @@ static int __init init_lustre_lite(void)
>  	rc = -ENOMEM;
>  	ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
>  					    sizeof(struct ll_inode_info),
> -					    0, SLAB_HWCACHE_ALIGN, NULL);
> +					    0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
> +					    NULL);
>  	if (ll_inode_cachep == NULL)
>  		goto out_cache;
>  
> diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
> index 6caca025019d..072e7599583a 100644
> --- a/fs/9p/v9fs.c
> +++ b/fs/9p/v9fs.c
> @@ -575,7 +575,7 @@ static int v9fs_init_inode_cache(void)
>  	v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
>  					  sizeof(struct v9fs_inode),
>  					  0, (SLAB_RECLAIM_ACCOUNT|
> -					      SLAB_MEM_SPREAD),
> +					      SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					  v9fs_inode_init_once);
>  	if (!v9fs_inode_cache)
>  		return -ENOMEM;
> diff --git a/fs/adfs/super.c b/fs/adfs/super.c
> index 4d4a0df8344f..c9fdfb112933 100644
> --- a/fs/adfs/super.c
> +++ b/fs/adfs/super.c
> @@ -271,7 +271,7 @@ static int __init init_inodecache(void)
>  	adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
>  					     sizeof(struct adfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (adfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/affs/super.c b/fs/affs/super.c
> index 5b50c4ca43a7..84a84fcb5f5a 100644
> --- a/fs/affs/super.c
> +++ b/fs/affs/super.c
> @@ -132,7 +132,7 @@ static int __init init_inodecache(void)
>  	affs_inode_cachep = kmem_cache_create("affs_inode_cache",
>  					     sizeof(struct affs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (affs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/afs/super.c b/fs/afs/super.c
> index 1fb4a5129f7d..81afefe7d8a6 100644
> --- a/fs/afs/super.c
> +++ b/fs/afs/super.c
> @@ -91,7 +91,7 @@ int __init afs_fs_init(void)
>  	afs_inode_cachep = kmem_cache_create("afs_inode_cache",
>  					     sizeof(struct afs_vnode),
>  					     0,
> -					     SLAB_HWCACHE_ALIGN,
> +					     SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  					     afs_i_init_once);
>  	if (!afs_inode_cachep) {
>  		printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");
> diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
> index 46aedacfa6a8..2a23edf5703e 100644
> --- a/fs/befs/linuxvfs.c
> +++ b/fs/befs/linuxvfs.c
> @@ -434,7 +434,7 @@ befs_init_inodecache(void)
>  	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
>  					      sizeof (struct befs_inode_info),
>  					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					      init_once);
>  	if (befs_inode_cachep == NULL) {
>  		pr_err("%s: Couldn't initialize inode slabcache\n", __func__);
> diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
> index fdcb4d69f430..1e5c896f6b79 100644
> --- a/fs/bfs/inode.c
> +++ b/fs/bfs/inode.c
> @@ -270,7 +270,7 @@ static int __init init_inodecache(void)
>  	bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
>  					     sizeof(struct bfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (bfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 0a793c7930eb..29ce98bfe04f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -567,7 +567,7 @@ void __init bdev_cache_init(void)
>  
>  	bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
>  			0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -				SLAB_MEM_SPREAD|SLAB_PANIC),
> +				SLAB_MEM_SPREAD|SLAB_ACCOUNT|SLAB_PANIC),
>  			init_once);
>  	err = register_filesystem(&bd_type);
>  	if (err)
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 4439fbb4ff45..c24d4cd9c14f 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9157,7 +9157,8 @@ int btrfs_init_cachep(void)
>  {
>  	btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
>  			sizeof(struct btrfs_inode), 0,
> -			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once);
> +			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT,
> +			init_once);
>  	if (!btrfs_inode_cachep)
>  		goto fail;
>  
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index f446afada328..ca4d5e8457f1 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -639,8 +639,8 @@ static int __init init_caches(void)
>  	ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
>  				      sizeof(struct ceph_inode_info),
>  				      __alignof__(struct ceph_inode_info),
> -				      (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
> -				      ceph_inode_init_once);
> +				      SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				      SLAB_ACCOUNT, ceph_inode_init_once);
>  	if (ceph_inode_cachep == NULL)
>  		return -ENOMEM;
>  
> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
> index e739950ca084..7f2e2639d1d1 100644
> --- a/fs/cifs/cifsfs.c
> +++ b/fs/cifs/cifsfs.c
> @@ -1040,7 +1040,7 @@ cifs_init_inodecache(void)
>  	cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
>  					      sizeof(struct cifsInodeInfo),
>  					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					      cifs_init_once);
>  	if (cifs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/coda/inode.c b/fs/coda/inode.c
> index cac1390b87a3..57e81cbba0fa 100644
> --- a/fs/coda/inode.c
> +++ b/fs/coda/inode.c
> @@ -74,9 +74,9 @@ static void init_once(void *foo)
>  int __init coda_init_inodecache(void)
>  {
>  	coda_inode_cachep = kmem_cache_create("coda_inode_cache",
> -				sizeof(struct coda_inode_info),
> -				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -				init_once);
> +				sizeof(struct coda_inode_info), 0,
> +				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				SLAB_ACCOUNT, init_once);
>  	if (coda_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5c33aeb0f68f..7ac590912106 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1571,7 +1571,8 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  	dentry->d_iname[DNAME_INLINE_LEN-1] = 0;
>  	if (name->len > DNAME_INLINE_LEN-1) {
>  		size_t size = offsetof(struct external_name, name[1]);
> -		struct external_name *p = kmalloc(size + name->len, GFP_KERNEL);
> +		struct external_name *p = kmalloc(size + name->len,
> +						  GFP_KERNEL_ACCOUNT);
>  		if (!p) {
>  			kmem_cache_free(dentry_cache, dentry); 
>  			return NULL;
> @@ -3415,7 +3416,7 @@ static void __init dcache_init(void)
>  	 * of the dcache. 
>  	 */
>  	dentry_cache = KMEM_CACHE(dentry,
> -		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
> +		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
>  
>  	/* Hash may have been set up in dcache_init_early */
>  	if (!hashdist)
> diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
> index 4f4d0474bee9..e25b6b06bacf 100644
> --- a/fs/ecryptfs/main.c
> +++ b/fs/ecryptfs/main.c
> @@ -663,6 +663,7 @@ static struct ecryptfs_cache_info {
>  	struct kmem_cache **cache;
>  	const char *name;
>  	size_t size;
> +	unsigned long flags;
>  	void (*ctor)(void *obj);
>  } ecryptfs_cache_infos[] = {
>  	{
> @@ -684,6 +685,7 @@ static struct ecryptfs_cache_info {
>  		.cache = &ecryptfs_inode_info_cache,
>  		.name = "ecryptfs_inode_cache",
>  		.size = sizeof(struct ecryptfs_inode_info),
> +		.flags = SLAB_ACCOUNT,
>  		.ctor = inode_info_init_once,
>  	},
>  	{
> @@ -755,8 +757,8 @@ static int ecryptfs_init_kmem_caches(void)
>  		struct ecryptfs_cache_info *info;
>  
>  		info = &ecryptfs_cache_infos[i];
> -		*(info->cache) = kmem_cache_create(info->name, info->size,
> -				0, SLAB_HWCACHE_ALIGN, info->ctor);
> +		*(info->cache) = kmem_cache_create(info->name, info->size, 0,
> +				SLAB_HWCACHE_ALIGN | info->flags, info->ctor);
>  		if (!*(info->cache)) {
>  			ecryptfs_free_kmem_caches();
>  			ecryptfs_printk(KERN_WARNING, "%s: "
> diff --git a/fs/efs/super.c b/fs/efs/super.c
> index c8411a30f7da..cb68dac4f9d3 100644
> --- a/fs/efs/super.c
> +++ b/fs/efs/super.c
> @@ -94,9 +94,9 @@ static void init_once(void *foo)
>  static int __init init_inodecache(void)
>  {
>  	efs_inode_cachep = kmem_cache_create("efs_inode_cache",
> -				sizeof(struct efs_inode_info),
> -				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -				init_once);
> +				sizeof(struct efs_inode_info), 0,
> +				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				SLAB_ACCOUNT, init_once);
>  	if (efs_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/exofs/super.c b/fs/exofs/super.c
> index b795c567b5e1..6658a50530a0 100644
> --- a/fs/exofs/super.c
> +++ b/fs/exofs/super.c
> @@ -194,8 +194,8 @@ static int init_inodecache(void)
>  {
>  	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
>  				sizeof(struct exofs_i_info), 0,
> -				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
> -				exofs_init_once);
> +				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
> +				SLAB_ACCOUNT, exofs_init_once);
>  	if (exofs_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index 900e19cf9ef6..973092a32b98 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -200,7 +200,7 @@ static int __init init_inodecache(void)
>  	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
>  					     sizeof(struct ext2_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ext2_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 04d0f1b33409..c4a5c415b881 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -966,7 +966,7 @@ static int __init init_inodecache(void)
>  	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
>  					     sizeof(struct ext4_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ext4_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 3a65e0132352..862916c7e3f8 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1424,8 +1424,9 @@ MODULE_ALIAS_FS("f2fs");
>  
>  static int __init init_inodecache(void)
>  {
> -	f2fs_inode_cachep = f2fs_kmem_cache_create("f2fs_inode_cache",
> -			sizeof(struct f2fs_inode_info));
> +	f2fs_inode_cachep = kmem_cache_create("f2fs_inode_cache",
> +			sizeof(struct f2fs_inode_info), 0,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT, NULL);
>  	if (!f2fs_inode_cachep)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/fat/inode.c b/fs/fat/inode.c
> index 509411dd3698..6aece96df19f 100644
> --- a/fs/fat/inode.c
> +++ b/fs/fat/inode.c
> @@ -677,7 +677,7 @@ static int __init fat_init_inodecache(void)
>  	fat_inode_cachep = kmem_cache_create("fat_inode_cache",
>  					     sizeof(struct msdos_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (fat_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/file.c b/fs/file.c
> index 39f8f15921da..7d76c929d557 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -37,11 +37,12 @@ static void *alloc_fdmem(size_t size)
>  	 * vmalloc() if the allocation size will be considered "large" by the VM.
>  	 */
>  	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
> +		void *data = kmalloc(size, GFP_KERNEL_ACCOUNT |
> +				     __GFP_NOWARN | __GFP_NORETRY);
>  		if (data != NULL)
>  			return data;
>  	}
> -	return vmalloc(size);
> +	return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
>  }
>  
>  static void __free_fdtable(struct fdtable *fdt)
> @@ -126,7 +127,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
>  	if (unlikely(nr > sysctl_nr_open))
>  		nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
>  
> -	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL);
> +	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
>  	if (!fdt)
>  		goto out;
>  	fdt->max_fds = nr;
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 2913db2a5b99..4d69d5c0bedc 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1255,8 +1255,8 @@ static int __init fuse_fs_init(void)
>  	int err;
>  
>  	fuse_inode_cachep = kmem_cache_create("fuse_inode",
> -					      sizeof(struct fuse_inode),
> -					      0, SLAB_HWCACHE_ALIGN,
> +					      sizeof(struct fuse_inode), 0,
> +					      SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  					      fuse_inode_init_once);
>  	err = -ENOMEM;
>  	if (!fuse_inode_cachep)
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 241a399bf83d..6ee38e210602 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -112,7 +112,8 @@ static int __init init_gfs2_fs(void)
>  	gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
>  					      sizeof(struct gfs2_inode),
>  					      0,  SLAB_RECLAIM_ACCOUNT|
> -					          SLAB_MEM_SPREAD,
> +					          SLAB_MEM_SPREAD|
> +						  SLAB_ACCOUNT,
>  					      gfs2_init_inode_once);
>  	if (!gfs2_inode_cachep)
>  		goto fail;
> diff --git a/fs/hfs/super.c b/fs/hfs/super.c
> index 4574fdd3d421..1ca95c232bb5 100644
> --- a/fs/hfs/super.c
> +++ b/fs/hfs/super.c
> @@ -483,8 +483,8 @@ static int __init init_hfs_fs(void)
>  	int err;
>  
>  	hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
> -		sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN,
> -		hfs_init_once);
> +		sizeof(struct hfs_inode_info), 0,
> +		SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, hfs_init_once);
>  	if (!hfs_inode_cachep)
>  		return -ENOMEM;
>  	err = register_filesystem(&hfs_fs_type);
> diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
> index 7302d96ae8bf..5d54490a136d 100644
> --- a/fs/hfsplus/super.c
> +++ b/fs/hfsplus/super.c
> @@ -663,7 +663,7 @@ static int __init init_hfsplus_fs(void)
>  	int err;
>  
>  	hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
> -		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN,
> +		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  		hfsplus_init_once);
>  	if (!hfsplus_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
> index 2ac99db3750e..a4cf6b11a142 100644
> --- a/fs/hostfs/hostfs_kern.c
> +++ b/fs/hostfs/hostfs_kern.c
> @@ -223,7 +223,7 @@ static struct inode *hostfs_alloc_inode(struct super_block *sb)
>  {
>  	struct hostfs_inode_info *hi;
>  
> -	hi = kmalloc(sizeof(*hi), GFP_KERNEL);
> +	hi = kmalloc(sizeof(*hi), GFP_KERNEL_ACCOUNT);
>  	if (hi == NULL)
>  		return NULL;
>  	hi->fd = -1;
> diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
> index a561591896bd..458cf463047b 100644
> --- a/fs/hpfs/super.c
> +++ b/fs/hpfs/super.c
> @@ -261,7 +261,7 @@ static int init_inodecache(void)
>  	hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
>  					     sizeof(struct hpfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (hpfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 316adb968b65..496add05f380 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1322,7 +1322,7 @@ static int __init init_hugetlbfs_fs(void)
>  	error = -ENOMEM;
>  	hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
>  					sizeof(struct hugetlbfs_inode_info),
> -					0, 0, init_once);
> +					0, SLAB_ACCOUNT, init_once);
>  	if (hugetlbfs_inode_cachep == NULL)
>  		goto out2;
>  
> diff --git a/fs/inode.c b/fs/inode.c
> index 78a17b8859e1..08c66502f1f4 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1882,7 +1882,7 @@ void __init inode_init(void)
>  					 sizeof(struct inode),
>  					 0,
>  					 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
> -					 SLAB_MEM_SPREAD),
> +					 SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					 init_once);
>  
>  	/* Hash may have been set up in inode_init_early */
> diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
> index d67a16f2a45d..9bc2431d2df8 100644
> --- a/fs/isofs/inode.c
> +++ b/fs/isofs/inode.c
> @@ -94,7 +94,7 @@ static int __init init_inodecache(void)
>  	isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
>  					sizeof(struct iso_inode_info),
>  					0, (SLAB_RECLAIM_ACCOUNT|
> -					SLAB_MEM_SPREAD),
> +					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					init_once);
>  	if (isofs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
> index d86c5e3176a1..bb080c272149 100644
> --- a/fs/jffs2/super.c
> +++ b/fs/jffs2/super.c
> @@ -387,7 +387,7 @@ static int __init init_jffs2_fs(void)
>  	jffs2_inode_cachep = kmem_cache_create("jffs2_i",
>  					     sizeof(struct jffs2_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     jffs2_i_init_once);
>  	if (!jffs2_inode_cachep) {
>  		pr_err("error: Failed to initialise inode cache\n");
> diff --git a/fs/jfs/super.c b/fs/jfs/super.c
> index 4cd9798f4948..6efadc61c15b 100644
> --- a/fs/jfs/super.c
> +++ b/fs/jfs/super.c
> @@ -901,7 +901,7 @@ static int __init init_jfs_fs(void)
>  
>  	jfs_inode_cachep =
>  	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
> -			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> +			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
>  			    init_once);
>  	if (jfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c
> index af49e2d6941a..5d65db2e03f4 100644
> --- a/fs/logfs/inode.c
> +++ b/fs/logfs/inode.c
> @@ -408,7 +408,8 @@ const struct super_operations logfs_super_operations = {
>  int logfs_init_inode_cache(void)
>  {
>  	logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
> -			sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT,
> +			sizeof(struct logfs_inode), 0,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
>  			logfs_init_once);
>  	if (!logfs_inode_cache)
>  		return -ENOMEM;
> diff --git a/fs/minix/inode.c b/fs/minix/inode.c
> index 086cd0a61e80..5942c3e10fa5 100644
> --- a/fs/minix/inode.c
> +++ b/fs/minix/inode.c
> @@ -91,7 +91,7 @@ static int __init init_inodecache(void)
>  	minix_inode_cachep = kmem_cache_create("minix_inode_cache",
>  					     sizeof(struct minix_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (minix_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
> index 9605a2f63549..d80446e1a333 100644
> --- a/fs/ncpfs/inode.c
> +++ b/fs/ncpfs/inode.c
> @@ -82,7 +82,7 @@ static int init_inodecache(void)
>  	ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
>  					     sizeof(struct ncp_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ncp_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 326d9e10d833..412f888fad13 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1904,7 +1904,7 @@ static int __init nfs_init_inodecache(void)
>  	nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
>  					     sizeof(struct nfs_inode),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (nfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
> index f47585bfeb01..dcf8e2ff3072 100644
> --- a/fs/nilfs2/super.c
> +++ b/fs/nilfs2/super.c
> @@ -1419,7 +1419,8 @@ static int __init nilfs_init_cachep(void)
>  {
>  	nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
>  			sizeof(struct nilfs_inode_info), 0,
> -			SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once);
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
> +			nilfs_inode_init_once);
>  	if (!nilfs_inode_cachep)
>  		goto fail;
>  
> diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
> index d1a853585b53..2f77f8dfb861 100644
> --- a/fs/ntfs/super.c
> +++ b/fs/ntfs/super.c
> @@ -3139,8 +3139,8 @@ static int __init init_ntfs_fs(void)
>  
>  	ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
>  			sizeof(big_ntfs_inode), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -			ntfs_big_inode_init_once);
> +			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +			SLAB_ACCOUNT, ntfs_big_inode_init_once);
>  	if (!ntfs_big_inode_cache) {
>  		pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name);
>  		goto big_inode_err_out;
> diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
> index b5cf27dcb18a..03768bb3aab1 100644
> --- a/fs/ocfs2/dlmfs/dlmfs.c
> +++ b/fs/ocfs2/dlmfs/dlmfs.c
> @@ -638,7 +638,7 @@ static int __init init_dlmfs_fs(void)
>  	dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
>  				sizeof(struct dlmfs_inode_private),
>  				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -					SLAB_MEM_SPREAD),
> +					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				dlmfs_init_once);
>  	if (!dlmfs_inode_cache) {
>  		status = -ENOMEM;
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 2de4c8a9340c..8ab0fcbc0b86 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1771,7 +1771,7 @@ static int ocfs2_initialize_mem_caches(void)
>  				       sizeof(struct ocfs2_inode_info),
>  				       0,
>  				       (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				       ocfs2_inode_init_once);
>  	ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
>  					sizeof(struct ocfs2_dquot),
> diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
> index 15e4500cda3e..b61b883c8ff8 100644
> --- a/fs/openpromfs/inode.c
> +++ b/fs/openpromfs/inode.c
> @@ -443,7 +443,7 @@ static int __init init_openprom_fs(void)
>  					    sizeof(struct op_inode_info),
>  					    0,
>  					    (SLAB_RECLAIM_ACCOUNT |
> -					     SLAB_MEM_SPREAD),
> +					     SLAB_MEM_SPREAD | SLAB_ACCOUNT),
>  					    op_inode_init_once);
>  	if (!op_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
> index bd95b9fdebb0..561557122dea 100644
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -95,7 +95,8 @@ void __init proc_init_inodecache(void)
>  	proc_inode_cachep = kmem_cache_create("proc_inode_cache",
>  					     sizeof(struct proc_inode),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD|SLAB_PANIC),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT|
> +						SLAB_PANIC),
>  					     init_once);
>  }
>  
> diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
> index c4bcb778886e..f761acdd5a7a 100644
> --- a/fs/qnx4/inode.c
> +++ b/fs/qnx4/inode.c
> @@ -364,7 +364,7 @@ static int init_inodecache(void)
>  	qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
>  					     sizeof(struct qnx4_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (qnx4_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
> index 32d2e1a9774c..4f04f00a7e5e 100644
> --- a/fs/qnx6/inode.c
> +++ b/fs/qnx6/inode.c
> @@ -624,7 +624,7 @@ static int init_inodecache(void)
>  	qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
>  					     sizeof(struct qnx6_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (!qnx6_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
> index 4a62fe8cc3bf..05db7473bcb5 100644
> --- a/fs/reiserfs/super.c
> +++ b/fs/reiserfs/super.c
> @@ -626,7 +626,8 @@ static int __init init_inodecache(void)
>  						  sizeof(struct
>  							 reiserfs_inode_info),
>  						  0, (SLAB_RECLAIM_ACCOUNT|
> -							SLAB_MEM_SPREAD),
> +						      SLAB_MEM_SPREAD|
> +						      SLAB_ACCOUNT),
>  						  init_once);
>  	if (reiserfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/romfs/super.c b/fs/romfs/super.c
> index 268733cda397..e1113399a6b4 100644
> --- a/fs/romfs/super.c
> +++ b/fs/romfs/super.c
> @@ -618,8 +618,8 @@ static int __init init_romfs_fs(void)
>  	romfs_inode_cachep =
>  		kmem_cache_create("romfs_i",
>  				  sizeof(struct romfs_inode_info), 0,
> -				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
> -				  romfs_i_init_once);
> +				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
> +				  SLAB_ACCOUNT, romfs_i_init_once);
>  
>  	if (!romfs_inode_cachep) {
>  		pr_err("Failed to initialise inode cache\n");
> diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
> index 5056babe00df..ea59b475663c 100644
> --- a/fs/squashfs/super.c
> +++ b/fs/squashfs/super.c
> @@ -420,7 +420,8 @@ static int __init init_inodecache(void)
>  {
>  	squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
>  		sizeof(struct squashfs_inode_info), 0,
> -		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once);
> +		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
> +		init_once);
>  
>  	return squashfs_inode_cachep ? 0 : -ENOMEM;
>  }
> diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
> index 590ad9206e3f..087ed6a1c1df 100644
> --- a/fs/sysv/inode.c
> +++ b/fs/sysv/inode.c
> @@ -353,7 +353,7 @@ int __init sysv_init_icache(void)
>  {
>  	sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
>  			sizeof(struct sysv_inode_info), 0,
> -			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
>  			init_once);
>  	if (!sysv_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index 9547a27868ad..9d064789c63a 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2241,8 +2241,8 @@ static int __init ubifs_init(void)
>  
>  	ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
>  				sizeof(struct ubifs_inode), 0,
> -				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT,
> -				&inode_slab_ctor);
> +				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT |
> +				SLAB_ACCOUNT, &inode_slab_ctor);
>  	if (!ubifs_inode_slab)
>  		return -ENOMEM;
>  
> diff --git a/fs/udf/super.c b/fs/udf/super.c
> index 81155b9b445b..9c64a3ca9837 100644
> --- a/fs/udf/super.c
> +++ b/fs/udf/super.c
> @@ -179,7 +179,8 @@ static int __init init_inodecache(void)
>  	udf_inode_cachep = kmem_cache_create("udf_inode_cache",
>  					     sizeof(struct udf_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT |
> -						 SLAB_MEM_SPREAD),
> +						 SLAB_MEM_SPREAD |
> +						 SLAB_ACCOUNT),
>  					     init_once);
>  	if (!udf_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/ufs/super.c b/fs/ufs/super.c
> index f6390eec02ca..442fd52ebffe 100644
> --- a/fs/ufs/super.c
> +++ b/fs/ufs/super.c
> @@ -1427,7 +1427,7 @@ static int __init init_inodecache(void)
>  	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
>  					     sizeof(struct ufs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ufs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> index cc6b768fc068..d1c66e465ca5 100644
> --- a/fs/xfs/kmem.h
> +++ b/fs/xfs/kmem.h
> @@ -84,6 +84,7 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
>  #define KM_ZONE_HWALIGN	SLAB_HWCACHE_ALIGN
>  #define KM_ZONE_RECLAIM	SLAB_RECLAIM_ACCOUNT
>  #define KM_ZONE_SPREAD	SLAB_MEM_SPREAD
> +#define KM_ZONE_ACCOUNT	SLAB_ACCOUNT
>  
>  #define kmem_zone	kmem_cache
>  #define kmem_zone_t	struct kmem_cache
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 904f637cfa5f..70d5b3072631 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1703,8 +1703,8 @@ xfs_init_zones(void)
>  
>  	xfs_inode_zone =
>  		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
> -			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD,
> -			xfs_fs_inode_init_once);
> +			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
> +			KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
>  	if (!xfs_inode_zone)
>  		goto out_destroy_efi_zone;
>  
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index ff307b548ed3..b4c2a485b28a 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -56,9 +56,10 @@ extern long do_no_restart_syscall(struct restart_block *parm);
>  #ifdef __KERNEL__
>  
>  #ifdef CONFIG_DEBUG_STACK_USAGE
> -# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
> +# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK | \
> +				 __GFP_ZERO)
>  #else
> -# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK)
> +# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
>  #endif
>  
>  /*
> diff --git a/ipc/mqueue.c b/ipc/mqueue.c
> index 161a1807e6ef..f4617cf07069 100644
> --- a/ipc/mqueue.c
> +++ b/ipc/mqueue.c
> @@ -1438,7 +1438,7 @@ static int __init init_mqueue_fs(void)
>  
>  	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
>  				sizeof(struct mqueue_inode_info), 0,
> -				SLAB_HWCACHE_ALIGN, init_once);
> +				SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, init_once);
>  	if (mqueue_inode_cachep == NULL)
>  		return -ENOMEM;
>  
> diff --git a/kernel/cred.c b/kernel/cred.c
> index 71179a09c1d6..0c0cd8a62285 100644
> --- a/kernel/cred.c
> +++ b/kernel/cred.c
> @@ -569,8 +569,8 @@ EXPORT_SYMBOL(revert_creds);
>  void __init cred_init(void)
>  {
>  	/* allocate a slab in which we can store credentials */
> -	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred),
> -				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> +	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred), 0,
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
>  }
>  
>  /**
> diff --git a/kernel/delayacct.c b/kernel/delayacct.c
> index ef90b04d783f..435c14a45118 100644
> --- a/kernel/delayacct.c
> +++ b/kernel/delayacct.c
> @@ -34,7 +34,7 @@ __setup("nodelayacct", delayacct_setup_disable);
>  
>  void delayacct_init(void)
>  {
> -	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC);
> +	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC|SLAB_ACCOUNT);
>  	delayacct_tsk_init(&init_task);
>  }
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index f97f2c449f5c..ff39b78e6e23 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -300,9 +300,9 @@ void __init fork_init(void)
>  #define ARCH_MIN_TASKALIGN	L1_CACHE_BYTES
>  #endif
>  	/* create a slab on which task_structs can be allocated */
> -	task_struct_cachep =
> -		kmem_cache_create("task_struct", arch_task_struct_size,
> -			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
> +	task_struct_cachep = kmem_cache_create("task_struct",
> +			arch_task_struct_size, ARCH_MIN_TASKALIGN,
> +			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT, NULL);
>  #endif
>  
>  	/* do the arch specific task caches init */
> @@ -1851,16 +1851,19 @@ void __init proc_caches_init(void)
>  	sighand_cachep = kmem_cache_create("sighand_cache",
>  			sizeof(struct sighand_struct), 0,
>  			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU|
> -			SLAB_NOTRACK, sighand_ctor);
> +			SLAB_NOTRACK|SLAB_ACCOUNT, sighand_ctor);
>  	signal_cachep = kmem_cache_create("signal_cache",
>  			sizeof(struct signal_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	files_cachep = kmem_cache_create("files_cache",
>  			sizeof(struct files_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	fs_cachep = kmem_cache_create("fs_cache",
>  			sizeof(struct fs_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	/*
>  	 * FIXME! The "sizeof(struct mm_struct)" currently includes the
>  	 * whole struct cpumask for the OFFSTACK case. We could change
> @@ -1870,8 +1873,9 @@ void __init proc_caches_init(void)
>  	 */
>  	mm_cachep = kmem_cache_create("mm_struct",
>  			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> -	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
> +	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
>  	mmap_init();
>  	nsproxy_cache_init();
>  }
> diff --git a/kernel/pid.c b/kernel/pid.c
> index ca368793808e..f09b026f5b56 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -604,5 +604,5 @@ void __init pidmap_init(void)
>  	atomic_dec(&init_pid_ns.pidmap[0].nr_free);
>  
>  	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
> -			SLAB_HWCACHE_ALIGN | SLAB_PANIC);
> +			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
>  }
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 92be862c859b..fbf6f0f1d6c9 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -560,7 +560,7 @@ void __init mmap_init(void)
>  
>  	ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
>  	VM_BUG_ON(ret);
> -	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC);
> +	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC|SLAB_ACCOUNT);
>  }
>  
>  /*
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b577fbb98d4b..3c3f1d21f075 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -428,8 +428,10 @@ static void anon_vma_ctor(void *data)
>  void __init anon_vma_init(void)
>  {
>  	anon_vma_cachep = kmem_cache_create("anon_vma", sizeof(struct anon_vma),
> -			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC, anon_vma_ctor);
> -	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain, SLAB_PANIC);
> +			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC|SLAB_ACCOUNT,
> +			anon_vma_ctor);
> +	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain,
> +			SLAB_PANIC|SLAB_ACCOUNT);
>  }
>  
>  /*
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 3b8b73928398..882933a7de99 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3107,7 +3107,7 @@ static int shmem_init_inodecache(void)
>  {
>  	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
>  				sizeof(struct shmem_inode_info),
> -				0, SLAB_PANIC, shmem_init_inode);
> +				0, SLAB_PANIC|SLAB_ACCOUNT, shmem_init_inode);
>  	return 0;
>  }
>  
> diff --git a/net/socket.c b/net/socket.c
> index 9963a0b53a64..2d70af8d943f 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -293,7 +293,7 @@ static int init_inodecache(void)
>  					      0,
>  					      (SLAB_HWCACHE_ALIGN |
>  					       SLAB_RECLAIM_ACCOUNT |
> -					       SLAB_MEM_SPREAD),
> +					       SLAB_MEM_SPREAD | SLAB_ACCOUNT),
>  					      init_once);
>  	if (sock_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index d81186d34558..14f45bf0410c 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -1500,7 +1500,7 @@ int register_rpc_pipefs(void)
>  	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
>  				sizeof(struct rpc_inode),
>  				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				init_once);
>  	if (!rpc_inode_cachep)
>  		return -ENOMEM;
> -- 
> 2.1.4
> 

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 6/6] Account certain kmem allocations to memcg
@ 2015-11-12 16:50     ` Michal Hocko
  0 siblings, 0 replies; 56+ messages in thread
From: Michal Hocko @ 2015-11-12 16:50 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue 10-11-15 21:34:07, Vladimir Davydov wrote:
> This patch marks those kmem allocations that are known to be easily
> triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
> accounted to memcg. For the list, see below:
> 
>  - threadinfo
>  - task_struct
>  - task_delay_info
>  - pid
>  - cred
>  - mm_struct
>  - vm_area_struct and vm_region (nommu)
>  - anon_vma and anon_vma_chain
>  - signal_struct
>  - sighand_struct
>  - fs_struct
>  - files_struct
>  - fdtable and fdtable->full_fds_bits
>  - dentry and external_name
>  - inode for all filesystems. This is the most tedious part, because
>    most filesystems overwrite the alloc_inode method.

It would be imho nicer to split this into few patches based on the
memory category (task management, address space, icache) with a
justification.

> The list is by far not complete, so feel free to add more objects.
> Nevertheless, it should be close to "account everything" approach and
> keep most workloads within bounds. Malevolent users will be able to
> breach the limit, but this was possible even with the former "account
> everything" approach (simply because it did not account everything in
> fact).
> 
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

From a quick look it seems reasonable.

Acked-by: Michal Hocko <mhocko-IBi9RG/b67k@public.gmane.org>

> ---
>  arch/powerpc/platforms/cell/spufs/inode.c     |  2 +-
>  drivers/staging/lustre/lustre/llite/super25.c |  3 ++-
>  fs/9p/v9fs.c                                  |  2 +-
>  fs/adfs/super.c                               |  2 +-
>  fs/affs/super.c                               |  2 +-
>  fs/afs/super.c                                |  2 +-
>  fs/befs/linuxvfs.c                            |  2 +-
>  fs/bfs/inode.c                                |  2 +-
>  fs/block_dev.c                                |  2 +-
>  fs/btrfs/inode.c                              |  3 ++-
>  fs/ceph/super.c                               |  4 ++--
>  fs/cifs/cifsfs.c                              |  2 +-
>  fs/coda/inode.c                               |  6 +++---
>  fs/dcache.c                                   |  5 +++--
>  fs/ecryptfs/main.c                            |  6 ++++--
>  fs/efs/super.c                                |  6 +++---
>  fs/exofs/super.c                              |  4 ++--
>  fs/ext2/super.c                               |  2 +-
>  fs/ext4/super.c                               |  2 +-
>  fs/f2fs/super.c                               |  5 +++--
>  fs/fat/inode.c                                |  2 +-
>  fs/file.c                                     |  7 ++++---
>  fs/fuse/inode.c                               |  4 ++--
>  fs/gfs2/main.c                                |  3 ++-
>  fs/hfs/super.c                                |  4 ++--
>  fs/hfsplus/super.c                            |  2 +-
>  fs/hostfs/hostfs_kern.c                       |  2 +-
>  fs/hpfs/super.c                               |  2 +-
>  fs/hugetlbfs/inode.c                          |  2 +-
>  fs/inode.c                                    |  2 +-
>  fs/isofs/inode.c                              |  2 +-
>  fs/jffs2/super.c                              |  2 +-
>  fs/jfs/super.c                                |  2 +-
>  fs/logfs/inode.c                              |  3 ++-
>  fs/minix/inode.c                              |  2 +-
>  fs/ncpfs/inode.c                              |  2 +-
>  fs/nfs/inode.c                                |  2 +-
>  fs/nilfs2/super.c                             |  3 ++-
>  fs/ntfs/super.c                               |  4 ++--
>  fs/ocfs2/dlmfs/dlmfs.c                        |  2 +-
>  fs/ocfs2/super.c                              |  2 +-
>  fs/openpromfs/inode.c                         |  2 +-
>  fs/proc/inode.c                               |  3 ++-
>  fs/qnx4/inode.c                               |  2 +-
>  fs/qnx6/inode.c                               |  2 +-
>  fs/reiserfs/super.c                           |  3 ++-
>  fs/romfs/super.c                              |  4 ++--
>  fs/squashfs/super.c                           |  3 ++-
>  fs/sysv/inode.c                               |  2 +-
>  fs/ubifs/super.c                              |  4 ++--
>  fs/udf/super.c                                |  3 ++-
>  fs/ufs/super.c                                |  2 +-
>  fs/xfs/kmem.h                                 |  1 +
>  fs/xfs/xfs_super.c                            |  4 ++--
>  include/linux/thread_info.h                   |  5 +++--
>  ipc/mqueue.c                                  |  2 +-
>  kernel/cred.c                                 |  4 ++--
>  kernel/delayacct.c                            |  2 +-
>  kernel/fork.c                                 | 22 +++++++++++++---------
>  kernel/pid.c                                  |  2 +-
>  mm/nommu.c                                    |  2 +-
>  mm/rmap.c                                     |  6 ++++--
>  mm/shmem.c                                    |  2 +-
>  net/socket.c                                  |  2 +-
>  net/sunrpc/rpc_pipe.c                         |  2 +-
>  65 files changed, 114 insertions(+), 92 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
> index 11634fa7ab3c..ad4840f86be1 100644
> --- a/arch/powerpc/platforms/cell/spufs/inode.c
> +++ b/arch/powerpc/platforms/cell/spufs/inode.c
> @@ -767,7 +767,7 @@ static int __init spufs_init(void)
>  	ret = -ENOMEM;
>  	spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
>  			sizeof(struct spufs_inode_info), 0,
> -			SLAB_HWCACHE_ALIGN, spufs_init_once);
> +			SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
>  
>  	if (!spufs_inode_cache)
>  		goto out;
> diff --git a/drivers/staging/lustre/lustre/llite/super25.c b/drivers/staging/lustre/lustre/llite/super25.c
> index 013136860664..60828d692db4 100644
> --- a/drivers/staging/lustre/lustre/llite/super25.c
> +++ b/drivers/staging/lustre/lustre/llite/super25.c
> @@ -106,7 +106,8 @@ static int __init init_lustre_lite(void)
>  	rc = -ENOMEM;
>  	ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
>  					    sizeof(struct ll_inode_info),
> -					    0, SLAB_HWCACHE_ALIGN, NULL);
> +					    0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
> +					    NULL);
>  	if (ll_inode_cachep == NULL)
>  		goto out_cache;
>  
> diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c
> index 6caca025019d..072e7599583a 100644
> --- a/fs/9p/v9fs.c
> +++ b/fs/9p/v9fs.c
> @@ -575,7 +575,7 @@ static int v9fs_init_inode_cache(void)
>  	v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
>  					  sizeof(struct v9fs_inode),
>  					  0, (SLAB_RECLAIM_ACCOUNT|
> -					      SLAB_MEM_SPREAD),
> +					      SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					  v9fs_inode_init_once);
>  	if (!v9fs_inode_cache)
>  		return -ENOMEM;
> diff --git a/fs/adfs/super.c b/fs/adfs/super.c
> index 4d4a0df8344f..c9fdfb112933 100644
> --- a/fs/adfs/super.c
> +++ b/fs/adfs/super.c
> @@ -271,7 +271,7 @@ static int __init init_inodecache(void)
>  	adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
>  					     sizeof(struct adfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (adfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/affs/super.c b/fs/affs/super.c
> index 5b50c4ca43a7..84a84fcb5f5a 100644
> --- a/fs/affs/super.c
> +++ b/fs/affs/super.c
> @@ -132,7 +132,7 @@ static int __init init_inodecache(void)
>  	affs_inode_cachep = kmem_cache_create("affs_inode_cache",
>  					     sizeof(struct affs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (affs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/afs/super.c b/fs/afs/super.c
> index 1fb4a5129f7d..81afefe7d8a6 100644
> --- a/fs/afs/super.c
> +++ b/fs/afs/super.c
> @@ -91,7 +91,7 @@ int __init afs_fs_init(void)
>  	afs_inode_cachep = kmem_cache_create("afs_inode_cache",
>  					     sizeof(struct afs_vnode),
>  					     0,
> -					     SLAB_HWCACHE_ALIGN,
> +					     SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  					     afs_i_init_once);
>  	if (!afs_inode_cachep) {
>  		printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");
> diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
> index 46aedacfa6a8..2a23edf5703e 100644
> --- a/fs/befs/linuxvfs.c
> +++ b/fs/befs/linuxvfs.c
> @@ -434,7 +434,7 @@ befs_init_inodecache(void)
>  	befs_inode_cachep = kmem_cache_create("befs_inode_cache",
>  					      sizeof (struct befs_inode_info),
>  					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					      init_once);
>  	if (befs_inode_cachep == NULL) {
>  		pr_err("%s: Couldn't initialize inode slabcache\n", __func__);
> diff --git a/fs/bfs/inode.c b/fs/bfs/inode.c
> index fdcb4d69f430..1e5c896f6b79 100644
> --- a/fs/bfs/inode.c
> +++ b/fs/bfs/inode.c
> @@ -270,7 +270,7 @@ static int __init init_inodecache(void)
>  	bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
>  					     sizeof(struct bfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (bfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 0a793c7930eb..29ce98bfe04f 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -567,7 +567,7 @@ void __init bdev_cache_init(void)
>  
>  	bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
>  			0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -				SLAB_MEM_SPREAD|SLAB_PANIC),
> +				SLAB_MEM_SPREAD|SLAB_ACCOUNT|SLAB_PANIC),
>  			init_once);
>  	err = register_filesystem(&bd_type);
>  	if (err)
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 4439fbb4ff45..c24d4cd9c14f 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9157,7 +9157,8 @@ int btrfs_init_cachep(void)
>  {
>  	btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
>  			sizeof(struct btrfs_inode), 0,
> -			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once);
> +			SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT,
> +			init_once);
>  	if (!btrfs_inode_cachep)
>  		goto fail;
>  
> diff --git a/fs/ceph/super.c b/fs/ceph/super.c
> index f446afada328..ca4d5e8457f1 100644
> --- a/fs/ceph/super.c
> +++ b/fs/ceph/super.c
> @@ -639,8 +639,8 @@ static int __init init_caches(void)
>  	ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
>  				      sizeof(struct ceph_inode_info),
>  				      __alignof__(struct ceph_inode_info),
> -				      (SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD),
> -				      ceph_inode_init_once);
> +				      SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				      SLAB_ACCOUNT, ceph_inode_init_once);
>  	if (ceph_inode_cachep == NULL)
>  		return -ENOMEM;
>  
> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
> index e739950ca084..7f2e2639d1d1 100644
> --- a/fs/cifs/cifsfs.c
> +++ b/fs/cifs/cifsfs.c
> @@ -1040,7 +1040,7 @@ cifs_init_inodecache(void)
>  	cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
>  					      sizeof(struct cifsInodeInfo),
>  					      0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					      cifs_init_once);
>  	if (cifs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/coda/inode.c b/fs/coda/inode.c
> index cac1390b87a3..57e81cbba0fa 100644
> --- a/fs/coda/inode.c
> +++ b/fs/coda/inode.c
> @@ -74,9 +74,9 @@ static void init_once(void *foo)
>  int __init coda_init_inodecache(void)
>  {
>  	coda_inode_cachep = kmem_cache_create("coda_inode_cache",
> -				sizeof(struct coda_inode_info),
> -				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -				init_once);
> +				sizeof(struct coda_inode_info), 0,
> +				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				SLAB_ACCOUNT, init_once);
>  	if (coda_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/dcache.c b/fs/dcache.c
> index 5c33aeb0f68f..7ac590912106 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -1571,7 +1571,8 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
>  	dentry->d_iname[DNAME_INLINE_LEN-1] = 0;
>  	if (name->len > DNAME_INLINE_LEN-1) {
>  		size_t size = offsetof(struct external_name, name[1]);
> -		struct external_name *p = kmalloc(size + name->len, GFP_KERNEL);
> +		struct external_name *p = kmalloc(size + name->len,
> +						  GFP_KERNEL_ACCOUNT);
>  		if (!p) {
>  			kmem_cache_free(dentry_cache, dentry); 
>  			return NULL;
> @@ -3415,7 +3416,7 @@ static void __init dcache_init(void)
>  	 * of the dcache. 
>  	 */
>  	dentry_cache = KMEM_CACHE(dentry,
> -		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
> +		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
>  
>  	/* Hash may have been set up in dcache_init_early */
>  	if (!hashdist)
> diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c
> index 4f4d0474bee9..e25b6b06bacf 100644
> --- a/fs/ecryptfs/main.c
> +++ b/fs/ecryptfs/main.c
> @@ -663,6 +663,7 @@ static struct ecryptfs_cache_info {
>  	struct kmem_cache **cache;
>  	const char *name;
>  	size_t size;
> +	unsigned long flags;
>  	void (*ctor)(void *obj);
>  } ecryptfs_cache_infos[] = {
>  	{
> @@ -684,6 +685,7 @@ static struct ecryptfs_cache_info {
>  		.cache = &ecryptfs_inode_info_cache,
>  		.name = "ecryptfs_inode_cache",
>  		.size = sizeof(struct ecryptfs_inode_info),
> +		.flags = SLAB_ACCOUNT,
>  		.ctor = inode_info_init_once,
>  	},
>  	{
> @@ -755,8 +757,8 @@ static int ecryptfs_init_kmem_caches(void)
>  		struct ecryptfs_cache_info *info;
>  
>  		info = &ecryptfs_cache_infos[i];
> -		*(info->cache) = kmem_cache_create(info->name, info->size,
> -				0, SLAB_HWCACHE_ALIGN, info->ctor);
> +		*(info->cache) = kmem_cache_create(info->name, info->size, 0,
> +				SLAB_HWCACHE_ALIGN | info->flags, info->ctor);
>  		if (!*(info->cache)) {
>  			ecryptfs_free_kmem_caches();
>  			ecryptfs_printk(KERN_WARNING, "%s: "
> diff --git a/fs/efs/super.c b/fs/efs/super.c
> index c8411a30f7da..cb68dac4f9d3 100644
> --- a/fs/efs/super.c
> +++ b/fs/efs/super.c
> @@ -94,9 +94,9 @@ static void init_once(void *foo)
>  static int __init init_inodecache(void)
>  {
>  	efs_inode_cachep = kmem_cache_create("efs_inode_cache",
> -				sizeof(struct efs_inode_info),
> -				0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -				init_once);
> +				sizeof(struct efs_inode_info), 0,
> +				SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +				SLAB_ACCOUNT, init_once);
>  	if (efs_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/exofs/super.c b/fs/exofs/super.c
> index b795c567b5e1..6658a50530a0 100644
> --- a/fs/exofs/super.c
> +++ b/fs/exofs/super.c
> @@ -194,8 +194,8 @@ static int init_inodecache(void)
>  {
>  	exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
>  				sizeof(struct exofs_i_info), 0,
> -				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
> -				exofs_init_once);
> +				SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
> +				SLAB_ACCOUNT, exofs_init_once);
>  	if (exofs_inode_cachep == NULL)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/ext2/super.c b/fs/ext2/super.c
> index 900e19cf9ef6..973092a32b98 100644
> --- a/fs/ext2/super.c
> +++ b/fs/ext2/super.c
> @@ -200,7 +200,7 @@ static int __init init_inodecache(void)
>  	ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
>  					     sizeof(struct ext2_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ext2_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 04d0f1b33409..c4a5c415b881 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -966,7 +966,7 @@ static int __init init_inodecache(void)
>  	ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
>  					     sizeof(struct ext4_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ext4_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 3a65e0132352..862916c7e3f8 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1424,8 +1424,9 @@ MODULE_ALIAS_FS("f2fs");
>  
>  static int __init init_inodecache(void)
>  {
> -	f2fs_inode_cachep = f2fs_kmem_cache_create("f2fs_inode_cache",
> -			sizeof(struct f2fs_inode_info));
> +	f2fs_inode_cachep = kmem_cache_create("f2fs_inode_cache",
> +			sizeof(struct f2fs_inode_info), 0,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT, NULL);
>  	if (!f2fs_inode_cachep)
>  		return -ENOMEM;
>  	return 0;
> diff --git a/fs/fat/inode.c b/fs/fat/inode.c
> index 509411dd3698..6aece96df19f 100644
> --- a/fs/fat/inode.c
> +++ b/fs/fat/inode.c
> @@ -677,7 +677,7 @@ static int __init fat_init_inodecache(void)
>  	fat_inode_cachep = kmem_cache_create("fat_inode_cache",
>  					     sizeof(struct msdos_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (fat_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/file.c b/fs/file.c
> index 39f8f15921da..7d76c929d557 100644
> --- a/fs/file.c
> +++ b/fs/file.c
> @@ -37,11 +37,12 @@ static void *alloc_fdmem(size_t size)
>  	 * vmalloc() if the allocation size will be considered "large" by the VM.
>  	 */
>  	if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -		void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY);
> +		void *data = kmalloc(size, GFP_KERNEL_ACCOUNT |
> +				     __GFP_NOWARN | __GFP_NORETRY);
>  		if (data != NULL)
>  			return data;
>  	}
> -	return vmalloc(size);
> +	return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
>  }
>  
>  static void __free_fdtable(struct fdtable *fdt)
> @@ -126,7 +127,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
>  	if (unlikely(nr > sysctl_nr_open))
>  		nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
>  
> -	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL);
> +	fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
>  	if (!fdt)
>  		goto out;
>  	fdt->max_fds = nr;
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 2913db2a5b99..4d69d5c0bedc 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1255,8 +1255,8 @@ static int __init fuse_fs_init(void)
>  	int err;
>  
>  	fuse_inode_cachep = kmem_cache_create("fuse_inode",
> -					      sizeof(struct fuse_inode),
> -					      0, SLAB_HWCACHE_ALIGN,
> +					      sizeof(struct fuse_inode), 0,
> +					      SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  					      fuse_inode_init_once);
>  	err = -ENOMEM;
>  	if (!fuse_inode_cachep)
> diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> index 241a399bf83d..6ee38e210602 100644
> --- a/fs/gfs2/main.c
> +++ b/fs/gfs2/main.c
> @@ -112,7 +112,8 @@ static int __init init_gfs2_fs(void)
>  	gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
>  					      sizeof(struct gfs2_inode),
>  					      0,  SLAB_RECLAIM_ACCOUNT|
> -					          SLAB_MEM_SPREAD,
> +					          SLAB_MEM_SPREAD|
> +						  SLAB_ACCOUNT,
>  					      gfs2_init_inode_once);
>  	if (!gfs2_inode_cachep)
>  		goto fail;
> diff --git a/fs/hfs/super.c b/fs/hfs/super.c
> index 4574fdd3d421..1ca95c232bb5 100644
> --- a/fs/hfs/super.c
> +++ b/fs/hfs/super.c
> @@ -483,8 +483,8 @@ static int __init init_hfs_fs(void)
>  	int err;
>  
>  	hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
> -		sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN,
> -		hfs_init_once);
> +		sizeof(struct hfs_inode_info), 0,
> +		SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, hfs_init_once);
>  	if (!hfs_inode_cachep)
>  		return -ENOMEM;
>  	err = register_filesystem(&hfs_fs_type);
> diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
> index 7302d96ae8bf..5d54490a136d 100644
> --- a/fs/hfsplus/super.c
> +++ b/fs/hfsplus/super.c
> @@ -663,7 +663,7 @@ static int __init init_hfsplus_fs(void)
>  	int err;
>  
>  	hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
> -		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN,
> +		HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
>  		hfsplus_init_once);
>  	if (!hfsplus_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
> index 2ac99db3750e..a4cf6b11a142 100644
> --- a/fs/hostfs/hostfs_kern.c
> +++ b/fs/hostfs/hostfs_kern.c
> @@ -223,7 +223,7 @@ static struct inode *hostfs_alloc_inode(struct super_block *sb)
>  {
>  	struct hostfs_inode_info *hi;
>  
> -	hi = kmalloc(sizeof(*hi), GFP_KERNEL);
> +	hi = kmalloc(sizeof(*hi), GFP_KERNEL_ACCOUNT);
>  	if (hi == NULL)
>  		return NULL;
>  	hi->fd = -1;
> diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
> index a561591896bd..458cf463047b 100644
> --- a/fs/hpfs/super.c
> +++ b/fs/hpfs/super.c
> @@ -261,7 +261,7 @@ static int init_inodecache(void)
>  	hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
>  					     sizeof(struct hpfs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (hpfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index 316adb968b65..496add05f380 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -1322,7 +1322,7 @@ static int __init init_hugetlbfs_fs(void)
>  	error = -ENOMEM;
>  	hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
>  					sizeof(struct hugetlbfs_inode_info),
> -					0, 0, init_once);
> +					0, SLAB_ACCOUNT, init_once);
>  	if (hugetlbfs_inode_cachep == NULL)
>  		goto out2;
>  
> diff --git a/fs/inode.c b/fs/inode.c
> index 78a17b8859e1..08c66502f1f4 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1882,7 +1882,7 @@ void __init inode_init(void)
>  					 sizeof(struct inode),
>  					 0,
>  					 (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
> -					 SLAB_MEM_SPREAD),
> +					 SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					 init_once);
>  
>  	/* Hash may have been set up in inode_init_early */
> diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
> index d67a16f2a45d..9bc2431d2df8 100644
> --- a/fs/isofs/inode.c
> +++ b/fs/isofs/inode.c
> @@ -94,7 +94,7 @@ static int __init init_inodecache(void)
>  	isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
>  					sizeof(struct iso_inode_info),
>  					0, (SLAB_RECLAIM_ACCOUNT|
> -					SLAB_MEM_SPREAD),
> +					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					init_once);
>  	if (isofs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/jffs2/super.c b/fs/jffs2/super.c
> index d86c5e3176a1..bb080c272149 100644
> --- a/fs/jffs2/super.c
> +++ b/fs/jffs2/super.c
> @@ -387,7 +387,7 @@ static int __init init_jffs2_fs(void)
>  	jffs2_inode_cachep = kmem_cache_create("jffs2_i",
>  					     sizeof(struct jffs2_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     jffs2_i_init_once);
>  	if (!jffs2_inode_cachep) {
>  		pr_err("error: Failed to initialise inode cache\n");
> diff --git a/fs/jfs/super.c b/fs/jfs/super.c
> index 4cd9798f4948..6efadc61c15b 100644
> --- a/fs/jfs/super.c
> +++ b/fs/jfs/super.c
> @@ -901,7 +901,7 @@ static int __init init_jfs_fs(void)
>  
>  	jfs_inode_cachep =
>  	    kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
> -			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> +			    SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
>  			    init_once);
>  	if (jfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/logfs/inode.c b/fs/logfs/inode.c
> index af49e2d6941a..5d65db2e03f4 100644
> --- a/fs/logfs/inode.c
> +++ b/fs/logfs/inode.c
> @@ -408,7 +408,8 @@ const struct super_operations logfs_super_operations = {
>  int logfs_init_inode_cache(void)
>  {
>  	logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
> -			sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT,
> +			sizeof(struct logfs_inode), 0,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
>  			logfs_init_once);
>  	if (!logfs_inode_cache)
>  		return -ENOMEM;
> diff --git a/fs/minix/inode.c b/fs/minix/inode.c
> index 086cd0a61e80..5942c3e10fa5 100644
> --- a/fs/minix/inode.c
> +++ b/fs/minix/inode.c
> @@ -91,7 +91,7 @@ static int __init init_inodecache(void)
>  	minix_inode_cachep = kmem_cache_create("minix_inode_cache",
>  					     sizeof(struct minix_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (minix_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/ncpfs/inode.c b/fs/ncpfs/inode.c
> index 9605a2f63549..d80446e1a333 100644
> --- a/fs/ncpfs/inode.c
> +++ b/fs/ncpfs/inode.c
> @@ -82,7 +82,7 @@ static int init_inodecache(void)
>  	ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
>  					     sizeof(struct ncp_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ncp_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 326d9e10d833..412f888fad13 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -1904,7 +1904,7 @@ static int __init nfs_init_inodecache(void)
>  	nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
>  					     sizeof(struct nfs_inode),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (nfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/nilfs2/super.c b/fs/nilfs2/super.c
> index f47585bfeb01..dcf8e2ff3072 100644
> --- a/fs/nilfs2/super.c
> +++ b/fs/nilfs2/super.c
> @@ -1419,7 +1419,8 @@ static int __init nilfs_init_cachep(void)
>  {
>  	nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
>  			sizeof(struct nilfs_inode_info), 0,
> -			SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once);
> +			SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
> +			nilfs_inode_init_once);
>  	if (!nilfs_inode_cachep)
>  		goto fail;
>  
> diff --git a/fs/ntfs/super.c b/fs/ntfs/super.c
> index d1a853585b53..2f77f8dfb861 100644
> --- a/fs/ntfs/super.c
> +++ b/fs/ntfs/super.c
> @@ -3139,8 +3139,8 @@ static int __init init_ntfs_fs(void)
>  
>  	ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
>  			sizeof(big_ntfs_inode), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> -			ntfs_big_inode_init_once);
> +			SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
> +			SLAB_ACCOUNT, ntfs_big_inode_init_once);
>  	if (!ntfs_big_inode_cache) {
>  		pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name);
>  		goto big_inode_err_out;
> diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
> index b5cf27dcb18a..03768bb3aab1 100644
> --- a/fs/ocfs2/dlmfs/dlmfs.c
> +++ b/fs/ocfs2/dlmfs/dlmfs.c
> @@ -638,7 +638,7 @@ static int __init init_dlmfs_fs(void)
>  	dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
>  				sizeof(struct dlmfs_inode_private),
>  				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -					SLAB_MEM_SPREAD),
> +					SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				dlmfs_init_once);
>  	if (!dlmfs_inode_cache) {
>  		status = -ENOMEM;
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 2de4c8a9340c..8ab0fcbc0b86 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1771,7 +1771,7 @@ static int ocfs2_initialize_mem_caches(void)
>  				       sizeof(struct ocfs2_inode_info),
>  				       0,
>  				       (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				       ocfs2_inode_init_once);
>  	ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
>  					sizeof(struct ocfs2_dquot),
> diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
> index 15e4500cda3e..b61b883c8ff8 100644
> --- a/fs/openpromfs/inode.c
> +++ b/fs/openpromfs/inode.c
> @@ -443,7 +443,7 @@ static int __init init_openprom_fs(void)
>  					    sizeof(struct op_inode_info),
>  					    0,
>  					    (SLAB_RECLAIM_ACCOUNT |
> -					     SLAB_MEM_SPREAD),
> +					     SLAB_MEM_SPREAD | SLAB_ACCOUNT),
>  					    op_inode_init_once);
>  	if (!op_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/proc/inode.c b/fs/proc/inode.c
> index bd95b9fdebb0..561557122dea 100644
> --- a/fs/proc/inode.c
> +++ b/fs/proc/inode.c
> @@ -95,7 +95,8 @@ void __init proc_init_inodecache(void)
>  	proc_inode_cachep = kmem_cache_create("proc_inode_cache",
>  					     sizeof(struct proc_inode),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD|SLAB_PANIC),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT|
> +						SLAB_PANIC),
>  					     init_once);
>  }
>  
> diff --git a/fs/qnx4/inode.c b/fs/qnx4/inode.c
> index c4bcb778886e..f761acdd5a7a 100644
> --- a/fs/qnx4/inode.c
> +++ b/fs/qnx4/inode.c
> @@ -364,7 +364,7 @@ static int init_inodecache(void)
>  	qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
>  					     sizeof(struct qnx4_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (qnx4_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/qnx6/inode.c b/fs/qnx6/inode.c
> index 32d2e1a9774c..4f04f00a7e5e 100644
> --- a/fs/qnx6/inode.c
> +++ b/fs/qnx6/inode.c
> @@ -624,7 +624,7 @@ static int init_inodecache(void)
>  	qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
>  					     sizeof(struct qnx6_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (!qnx6_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
> index 4a62fe8cc3bf..05db7473bcb5 100644
> --- a/fs/reiserfs/super.c
> +++ b/fs/reiserfs/super.c
> @@ -626,7 +626,8 @@ static int __init init_inodecache(void)
>  						  sizeof(struct
>  							 reiserfs_inode_info),
>  						  0, (SLAB_RECLAIM_ACCOUNT|
> -							SLAB_MEM_SPREAD),
> +						      SLAB_MEM_SPREAD|
> +						      SLAB_ACCOUNT),
>  						  init_once);
>  	if (reiserfs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/romfs/super.c b/fs/romfs/super.c
> index 268733cda397..e1113399a6b4 100644
> --- a/fs/romfs/super.c
> +++ b/fs/romfs/super.c
> @@ -618,8 +618,8 @@ static int __init init_romfs_fs(void)
>  	romfs_inode_cachep =
>  		kmem_cache_create("romfs_i",
>  				  sizeof(struct romfs_inode_info), 0,
> -				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD,
> -				  romfs_i_init_once);
> +				  SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
> +				  SLAB_ACCOUNT, romfs_i_init_once);
>  
>  	if (!romfs_inode_cachep) {
>  		pr_err("Failed to initialise inode cache\n");
> diff --git a/fs/squashfs/super.c b/fs/squashfs/super.c
> index 5056babe00df..ea59b475663c 100644
> --- a/fs/squashfs/super.c
> +++ b/fs/squashfs/super.c
> @@ -420,7 +420,8 @@ static int __init init_inodecache(void)
>  {
>  	squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
>  		sizeof(struct squashfs_inode_info), 0,
> -		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once);
> +		SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
> +		init_once);
>  
>  	return squashfs_inode_cachep ? 0 : -ENOMEM;
>  }
> diff --git a/fs/sysv/inode.c b/fs/sysv/inode.c
> index 590ad9206e3f..087ed6a1c1df 100644
> --- a/fs/sysv/inode.c
> +++ b/fs/sysv/inode.c
> @@ -353,7 +353,7 @@ int __init sysv_init_icache(void)
>  {
>  	sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
>  			sizeof(struct sysv_inode_info), 0,
> -			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD,
> +			SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
>  			init_once);
>  	if (!sysv_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
> index 9547a27868ad..9d064789c63a 100644
> --- a/fs/ubifs/super.c
> +++ b/fs/ubifs/super.c
> @@ -2241,8 +2241,8 @@ static int __init ubifs_init(void)
>  
>  	ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
>  				sizeof(struct ubifs_inode), 0,
> -				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT,
> -				&inode_slab_ctor);
> +				SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT |
> +				SLAB_ACCOUNT, &inode_slab_ctor);
>  	if (!ubifs_inode_slab)
>  		return -ENOMEM;
>  
> diff --git a/fs/udf/super.c b/fs/udf/super.c
> index 81155b9b445b..9c64a3ca9837 100644
> --- a/fs/udf/super.c
> +++ b/fs/udf/super.c
> @@ -179,7 +179,8 @@ static int __init init_inodecache(void)
>  	udf_inode_cachep = kmem_cache_create("udf_inode_cache",
>  					     sizeof(struct udf_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT |
> -						 SLAB_MEM_SPREAD),
> +						 SLAB_MEM_SPREAD |
> +						 SLAB_ACCOUNT),
>  					     init_once);
>  	if (!udf_inode_cachep)
>  		return -ENOMEM;
> diff --git a/fs/ufs/super.c b/fs/ufs/super.c
> index f6390eec02ca..442fd52ebffe 100644
> --- a/fs/ufs/super.c
> +++ b/fs/ufs/super.c
> @@ -1427,7 +1427,7 @@ static int __init init_inodecache(void)
>  	ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
>  					     sizeof(struct ufs_inode_info),
>  					     0, (SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  					     init_once);
>  	if (ufs_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/fs/xfs/kmem.h b/fs/xfs/kmem.h
> index cc6b768fc068..d1c66e465ca5 100644
> --- a/fs/xfs/kmem.h
> +++ b/fs/xfs/kmem.h
> @@ -84,6 +84,7 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
>  #define KM_ZONE_HWALIGN	SLAB_HWCACHE_ALIGN
>  #define KM_ZONE_RECLAIM	SLAB_RECLAIM_ACCOUNT
>  #define KM_ZONE_SPREAD	SLAB_MEM_SPREAD
> +#define KM_ZONE_ACCOUNT	SLAB_ACCOUNT
>  
>  #define kmem_zone	kmem_cache
>  #define kmem_zone_t	struct kmem_cache
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 904f637cfa5f..70d5b3072631 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1703,8 +1703,8 @@ xfs_init_zones(void)
>  
>  	xfs_inode_zone =
>  		kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
> -			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD,
> -			xfs_fs_inode_init_once);
> +			KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
> +			KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
>  	if (!xfs_inode_zone)
>  		goto out_destroy_efi_zone;
>  
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index ff307b548ed3..b4c2a485b28a 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -56,9 +56,10 @@ extern long do_no_restart_syscall(struct restart_block *parm);
>  #ifdef __KERNEL__
>  
>  #ifdef CONFIG_DEBUG_STACK_USAGE
> -# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO)
> +# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK | \
> +				 __GFP_ZERO)
>  #else
> -# define THREADINFO_GFP		(GFP_KERNEL | __GFP_NOTRACK)
> +# define THREADINFO_GFP		(GFP_KERNEL_ACCOUNT | __GFP_NOTRACK)
>  #endif
>  
>  /*
> diff --git a/ipc/mqueue.c b/ipc/mqueue.c
> index 161a1807e6ef..f4617cf07069 100644
> --- a/ipc/mqueue.c
> +++ b/ipc/mqueue.c
> @@ -1438,7 +1438,7 @@ static int __init init_mqueue_fs(void)
>  
>  	mqueue_inode_cachep = kmem_cache_create("mqueue_inode_cache",
>  				sizeof(struct mqueue_inode_info), 0,
> -				SLAB_HWCACHE_ALIGN, init_once);
> +				SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, init_once);
>  	if (mqueue_inode_cachep == NULL)
>  		return -ENOMEM;
>  
> diff --git a/kernel/cred.c b/kernel/cred.c
> index 71179a09c1d6..0c0cd8a62285 100644
> --- a/kernel/cred.c
> +++ b/kernel/cred.c
> @@ -569,8 +569,8 @@ EXPORT_SYMBOL(revert_creds);
>  void __init cred_init(void)
>  {
>  	/* allocate a slab in which we can store credentials */
> -	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred),
> -				     0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
> +	cred_jar = kmem_cache_create("cred_jar", sizeof(struct cred), 0,
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT, NULL);
>  }
>  
>  /**
> diff --git a/kernel/delayacct.c b/kernel/delayacct.c
> index ef90b04d783f..435c14a45118 100644
> --- a/kernel/delayacct.c
> +++ b/kernel/delayacct.c
> @@ -34,7 +34,7 @@ __setup("nodelayacct", delayacct_setup_disable);
>  
>  void delayacct_init(void)
>  {
> -	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC);
> +	delayacct_cache = KMEM_CACHE(task_delay_info, SLAB_PANIC|SLAB_ACCOUNT);
>  	delayacct_tsk_init(&init_task);
>  }
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index f97f2c449f5c..ff39b78e6e23 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -300,9 +300,9 @@ void __init fork_init(void)
>  #define ARCH_MIN_TASKALIGN	L1_CACHE_BYTES
>  #endif
>  	/* create a slab on which task_structs can be allocated */
> -	task_struct_cachep =
> -		kmem_cache_create("task_struct", arch_task_struct_size,
> -			ARCH_MIN_TASKALIGN, SLAB_PANIC | SLAB_NOTRACK, NULL);
> +	task_struct_cachep = kmem_cache_create("task_struct",
> +			arch_task_struct_size, ARCH_MIN_TASKALIGN,
> +			SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT, NULL);
>  #endif
>  
>  	/* do the arch specific task caches init */
> @@ -1851,16 +1851,19 @@ void __init proc_caches_init(void)
>  	sighand_cachep = kmem_cache_create("sighand_cache",
>  			sizeof(struct sighand_struct), 0,
>  			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_DESTROY_BY_RCU|
> -			SLAB_NOTRACK, sighand_ctor);
> +			SLAB_NOTRACK|SLAB_ACCOUNT, sighand_ctor);
>  	signal_cachep = kmem_cache_create("signal_cache",
>  			sizeof(struct signal_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	files_cachep = kmem_cache_create("files_cache",
>  			sizeof(struct files_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	fs_cachep = kmem_cache_create("fs_cache",
>  			sizeof(struct fs_struct), 0,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
>  	/*
>  	 * FIXME! The "sizeof(struct mm_struct)" currently includes the
>  	 * whole struct cpumask for the OFFSTACK case. We could change
> @@ -1870,8 +1873,9 @@ void __init proc_caches_init(void)
>  	 */
>  	mm_cachep = kmem_cache_create("mm_struct",
>  			sizeof(struct mm_struct), ARCH_MIN_MMSTRUCT_ALIGN,
> -			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK, NULL);
> -	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC);
> +			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_NOTRACK|SLAB_ACCOUNT,
> +			NULL);
> +	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
>  	mmap_init();
>  	nsproxy_cache_init();
>  }
> diff --git a/kernel/pid.c b/kernel/pid.c
> index ca368793808e..f09b026f5b56 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -604,5 +604,5 @@ void __init pidmap_init(void)
>  	atomic_dec(&init_pid_ns.pidmap[0].nr_free);
>  
>  	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
> -			SLAB_HWCACHE_ALIGN | SLAB_PANIC);
> +			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
>  }
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 92be862c859b..fbf6f0f1d6c9 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -560,7 +560,7 @@ void __init mmap_init(void)
>  
>  	ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
>  	VM_BUG_ON(ret);
> -	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC);
> +	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC|SLAB_ACCOUNT);
>  }
>  
>  /*
> diff --git a/mm/rmap.c b/mm/rmap.c
> index b577fbb98d4b..3c3f1d21f075 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -428,8 +428,10 @@ static void anon_vma_ctor(void *data)
>  void __init anon_vma_init(void)
>  {
>  	anon_vma_cachep = kmem_cache_create("anon_vma", sizeof(struct anon_vma),
> -			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC, anon_vma_ctor);
> -	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain, SLAB_PANIC);
> +			0, SLAB_DESTROY_BY_RCU|SLAB_PANIC|SLAB_ACCOUNT,
> +			anon_vma_ctor);
> +	anon_vma_chain_cachep = KMEM_CACHE(anon_vma_chain,
> +			SLAB_PANIC|SLAB_ACCOUNT);
>  }
>  
>  /*
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 3b8b73928398..882933a7de99 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3107,7 +3107,7 @@ static int shmem_init_inodecache(void)
>  {
>  	shmem_inode_cachep = kmem_cache_create("shmem_inode_cache",
>  				sizeof(struct shmem_inode_info),
> -				0, SLAB_PANIC, shmem_init_inode);
> +				0, SLAB_PANIC|SLAB_ACCOUNT, shmem_init_inode);
>  	return 0;
>  }
>  
> diff --git a/net/socket.c b/net/socket.c
> index 9963a0b53a64..2d70af8d943f 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -293,7 +293,7 @@ static int init_inodecache(void)
>  					      0,
>  					      (SLAB_HWCACHE_ALIGN |
>  					       SLAB_RECLAIM_ACCOUNT |
> -					       SLAB_MEM_SPREAD),
> +					       SLAB_MEM_SPREAD | SLAB_ACCOUNT),
>  					      init_once);
>  	if (sock_inode_cachep == NULL)
>  		return -ENOMEM;
> diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c
> index d81186d34558..14f45bf0410c 100644
> --- a/net/sunrpc/rpc_pipe.c
> +++ b/net/sunrpc/rpc_pipe.c
> @@ -1500,7 +1500,7 @@ int register_rpc_pipefs(void)
>  	rpc_inode_cachep = kmem_cache_create("rpc_inode_cache",
>  				sizeof(struct rpc_inode),
>  				0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
> -						SLAB_MEM_SPREAD),
> +						SLAB_MEM_SPREAD|SLAB_ACCOUNT),
>  				init_once);
>  	if (!rpc_inode_cachep)
>  		return -ENOMEM;
> -- 
> 2.1.4
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-12 16:17     ` Michal Hocko
@ 2015-11-14 11:29       ` Vladimir Davydov
  -1 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-14 11:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Thu, Nov 12, 2015 at 05:17:41PM +0100, Michal Hocko wrote:
> On Tue 10-11-15 21:34:05, Vladimir Davydov wrote:
> > Currently, if we want to account all objects of a particular kmem cache,
> > we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> > inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> > kmem_cache_create will force accounting for every allocation from this
> > cache even if __GFP_ACCOUNT is not passed.
> 
> Yes this is much better and less error prone for dedicated caches.
> 
> > This patch does not make any of the existing caches use this flag - it
> > will be done later in the series.
> > 
> > Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> > SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> > merged slabs even if kmem accounting is not used (only compiled in).
> 
> I would expect some reasoning why this is the case. Why cannot caches of
> the same memcg be merged? I remember you have mentioned something in the
> previous discussion with Tejun but it should be in the changelog as well
> IMO.

Here goes an extended version of the last paragraph, hope it makes
everything clear:

"""
Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
SLAB_ACCOUNT, because merged caches share the same kmem_cache struct and
hence cannot have different sets of SLAB_* flags. Thus using this flag
will probably reduce the number of merged slabs even if kmem accounting
is not used (only compiled in).
"""

Andrew, could you please update the commit message?

> 
> > Suggested-by: Tejun Heo <tj@kernel.org>
> > Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
> 
> I am not sufficiently qualified to judge the slab implementation
> specifics but for the overal approach
> 
> Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-14 11:29       ` Vladimir Davydov
  0 siblings, 0 replies; 56+ messages in thread
From: Vladimir Davydov @ 2015-11-14 11:29 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Thu, Nov 12, 2015 at 05:17:41PM +0100, Michal Hocko wrote:
> On Tue 10-11-15 21:34:05, Vladimir Davydov wrote:
> > Currently, if we want to account all objects of a particular kmem cache,
> > we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> > inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> > kmem_cache_create will force accounting for every allocation from this
> > cache even if __GFP_ACCOUNT is not passed.
> 
> Yes this is much better and less error prone for dedicated caches.
> 
> > This patch does not make any of the existing caches use this flag - it
> > will be done later in the series.
> > 
> > Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> > SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> > merged slabs even if kmem accounting is not used (only compiled in).
> 
> I would expect some reasoning why this is the case. Why cannot caches of
> the same memcg be merged? I remember you have mentioned something in the
> previous discussion with Tejun but it should be in the changelog as well
> IMO.

Here goes an extended version of the last paragraph, hope it makes
everything clear:

"""
Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
SLAB_ACCOUNT, because merged caches share the same kmem_cache struct and
hence cannot have different sets of SLAB_* flags. Thus using this flag
will probably reduce the number of merged slabs even if kmem accounting
is not used (only compiled in).
"""

Andrew, could you please update the commit message?

> 
> > Suggested-by: Tejun Heo <tj@kernel.org>
> > Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
> 
> I am not sufficiently qualified to judge the slab implementation
> specifics but for the overal approach
> 
> Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 1/6] Revert "kernfs: do not account ino_ida allocations to memcg"
  2015-11-10 18:34   ` Vladimir Davydov
@ 2015-11-19 18:56     ` Johannes Weiner
  -1 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 18:56 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:02PM +0300, Vladimir Davydov wrote:
> This reverts commit 499611ed451508a42d1d7d1faff10177827755d5.
> 
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So it was decided to switch to the white-list policy. This patch reverts
> bits introducing the black-list policy. The white-list policy will be
> introduced later in the series.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 1/6] Revert "kernfs: do not account ino_ida allocations to memcg"
@ 2015-11-19 18:56     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 18:56 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:02PM +0300, Vladimir Davydov wrote:
> This reverts commit 499611ed451508a42d1d7d1faff10177827755d5.
> 
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So it was decided to switch to the white-list policy. This patch reverts
> bits introducing the black-list policy. The white-list policy will be
> introduced later in the series.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 2/6] Revert "gfp: add __GFP_NOACCOUNT"
  2015-11-10 18:34   ` Vladimir Davydov
@ 2015-11-19 18:59     ` Johannes Weiner
  -1 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 18:59 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:03PM +0300, Vladimir Davydov wrote:
> This reverts commit 8f4fc071b1926d0b20336e2b3f8ab85c94c734c5.
> 
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So it was decided to switch to the white-list policy. This patch reverts
> bits introducing the black-list policy. The white-list policy will be
> introduced later in the series.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 2/6] Revert "gfp: add __GFP_NOACCOUNT"
@ 2015-11-19 18:59     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 18:59 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:03PM +0300, Vladimir Davydov wrote:
> This reverts commit 8f4fc071b1926d0b20336e2b3f8ab85c94c734c5.
> 
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So it was decided to switch to the white-list policy. This patch reverts
> bits introducing the black-list policy. The white-list policy will be
> introduced later in the series.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-19 19:00     ` Johannes Weiner
  -1 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:00 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:04PM +0300, Vladimir Davydov wrote:
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So this patch switches kmem accounting to the white-policy: now only
> those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
> memcg. Currently, no kmem allocations are marked like this. The
> following patches will mark several kmem allocations that are known to
> be easily triggered from userspace and therefore should be accounted to
> memcg.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
@ 2015-11-19 19:00     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:00 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:04PM +0300, Vladimir Davydov wrote:
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So this patch switches kmem accounting to the white-policy: now only
> those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
> memcg. Currently, no kmem allocations are marked like this. The
> following patches will mark several kmem allocations that are known to
> be easily triggered from userspace and therefore should be accounted to
> memcg.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT
@ 2015-11-19 19:00     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:00 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 10, 2015 at 09:34:04PM +0300, Vladimir Davydov wrote:
> Black-list kmem accounting policy (aka __GFP_NOACCOUNT) turned out to be
> fragile and difficult to maintain, because there seem to be many more
> allocations that should not be accounted than those that should be.
> Besides, false accounting an allocation might result in much worse
> consequences than not accounting at all, namely increased memory
> consumption due to pinned dead kmem caches.
> 
> So this patch switches kmem accounting to the white-policy: now only
> those kmem allocations that are marked as __GFP_ACCOUNT are accounted to
> memcg. Currently, no kmem allocations are marked like this. The
> following patches will mark several kmem allocations that are known to
> be easily triggered from userspace and therefore should be accounted to
> memcg.
> 
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-19 19:01     ` Johannes Weiner
  -1 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:01 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.
> 
> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-19 19:01     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:01 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.
> 
> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).
> 
> Suggested-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag
@ 2015-11-19 19:01     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:01 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 10, 2015 at 09:34:05PM +0300, Vladimir Davydov wrote:
> Currently, if we want to account all objects of a particular kmem cache,
> we have to pass __GFP_ACCOUNT to each kmem_cache_alloc call, which is
> inconvenient. This patch introduces SLAB_ACCOUNT flag which if passed to
> kmem_cache_create will force accounting for every allocation from this
> cache even if __GFP_ACCOUNT is not passed.
> 
> This patch does not make any of the existing caches use this flag - it
> will be done later in the series.
> 
> Note, a cache with SLAB_ACCOUNT cannot be merged with a cache w/o
> SLAB_ACCOUNT, i.e. using this flag will probably reduce the number of
> merged slabs even if kmem accounting is not used (only compiled in).
> 
> Suggested-by: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-19 19:04     ` Johannes Weiner
  -1 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:04 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:06PM +0300, Vladimir Davydov wrote:
> This patch makes vmalloc family functions allocate vmalloc area pages
> with alloc_kmem_pages so that if __GFP_ACCOUNT is set they will be
> accounted to memcg. This is needed, at least, to account alloc_fdmem
> allocations.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg
@ 2015-11-19 19:04     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:04 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:06PM +0300, Vladimir Davydov wrote:
> This patch makes vmalloc family functions allocate vmalloc area pages
> with alloc_kmem_pages so that if __GFP_ACCOUNT is set they will be
> accounted to memcg. This is needed, at least, to account alloc_fdmem
> allocations.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg
@ 2015-11-19 19:04     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:04 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 10, 2015 at 09:34:06PM +0300, Vladimir Davydov wrote:
> This patch makes vmalloc family functions allocate vmalloc area pages
> with alloc_kmem_pages so that if __GFP_ACCOUNT is set they will be
> accounted to memcg. This is needed, at least, to account alloc_fdmem
> allocations.
> 
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 6/6] Account certain kmem allocations to memcg
  2015-11-10 18:34   ` Vladimir Davydov
  (?)
@ 2015-11-19 19:12     ` Johannes Weiner
  -1 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:12 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:07PM +0300, Vladimir Davydov wrote:
> This patch marks those kmem allocations that are known to be easily
> triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
> accounted to memcg. For the list, see below:
> 
>  - threadinfo
>  - task_struct
>  - task_delay_info
>  - pid
>  - cred
>  - mm_struct
>  - vm_area_struct and vm_region (nommu)
>  - anon_vma and anon_vma_chain
>  - signal_struct
>  - sighand_struct
>  - fs_struct
>  - files_struct
>  - fdtable and fdtable->full_fds_bits
>  - dentry and external_name
>  - inode for all filesystems. This is the most tedious part, because
>    most filesystems overwrite the alloc_inode method.
> 
> The list is by far not complete, so feel free to add more objects.
> Nevertheless, it should be close to "account everything" approach and
> keep most workloads within bounds. Malevolent users will be able to
> breach the limit, but this was possible even with the former "account
> everything" approach (simply because it did not account everything in
> fact).
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Thanks for doing that work, Vladimir. It looks reasonable to me.

We can update the list as we go along and testing reveals more things
that need to be considered. As far as malicious users go, I agree that
we can not make this bullet proof, and so we shouldn't aim for that.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 6/6] Account certain kmem allocations to memcg
@ 2015-11-19 19:12     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:12 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm, cgroups, linux-kernel

On Tue, Nov 10, 2015 at 09:34:07PM +0300, Vladimir Davydov wrote:
> This patch marks those kmem allocations that are known to be easily
> triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
> accounted to memcg. For the list, see below:
> 
>  - threadinfo
>  - task_struct
>  - task_delay_info
>  - pid
>  - cred
>  - mm_struct
>  - vm_area_struct and vm_region (nommu)
>  - anon_vma and anon_vma_chain
>  - signal_struct
>  - sighand_struct
>  - fs_struct
>  - files_struct
>  - fdtable and fdtable->full_fds_bits
>  - dentry and external_name
>  - inode for all filesystems. This is the most tedious part, because
>    most filesystems overwrite the alloc_inode method.
> 
> The list is by far not complete, so feel free to add more objects.
> Nevertheless, it should be close to "account everything" approach and
> keep most workloads within bounds. Malevolent users will be able to
> breach the limit, but this was possible even with the former "account
> everything" approach (simply because it did not account everything in
> fact).
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>

Thanks for doing that work, Vladimir. It looks reasonable to me.

We can update the list as we go along and testing reveals more things
that need to be considered. As far as malicious users go, I agree that
we can not make this bullet proof, and so we shouldn't aim for that.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: [PATCH v2 6/6] Account certain kmem allocations to memcg
@ 2015-11-19 19:12     ` Johannes Weiner
  0 siblings, 0 replies; 56+ messages in thread
From: Johannes Weiner @ 2015-11-19 19:12 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: Andrew Morton, Michal Hocko, Tejun Heo, Greg Thelen,
	Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, Nov 10, 2015 at 09:34:07PM +0300, Vladimir Davydov wrote:
> This patch marks those kmem allocations that are known to be easily
> triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them
> accounted to memcg. For the list, see below:
> 
>  - threadinfo
>  - task_struct
>  - task_delay_info
>  - pid
>  - cred
>  - mm_struct
>  - vm_area_struct and vm_region (nommu)
>  - anon_vma and anon_vma_chain
>  - signal_struct
>  - sighand_struct
>  - fs_struct
>  - files_struct
>  - fdtable and fdtable->full_fds_bits
>  - dentry and external_name
>  - inode for all filesystems. This is the most tedious part, because
>    most filesystems overwrite the alloc_inode method.
> 
> The list is by far not complete, so feel free to add more objects.
> Nevertheless, it should be close to "account everything" approach and
> keep most workloads within bounds. Malevolent users will be able to
> breach the limit, but this was possible even with the former "account
> everything" approach (simply because it did not account everything in
> fact).
> 
> Signed-off-by: Vladimir Davydov <vdavydov-5HdwGun5lf+gSpxsJD1C4w@public.gmane.org>

Thanks for doing that work, Vladimir. It looks reasonable to me.

We can update the list as we go along and testing reveals more things
that need to be considered. As far as malicious users go, I agree that
we can not make this bullet proof, and so we shouldn't aim for that.

Acked-by: Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2015-11-19 19:12 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-10 18:34 [PATCH v2 0/6] memcg/kmem: switch to white list policy Vladimir Davydov
2015-11-10 18:34 ` Vladimir Davydov
2015-11-10 18:34 ` Vladimir Davydov
2015-11-10 18:34 ` [PATCH v2 1/6] Revert "kernfs: do not account ino_ida allocations to memcg" Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-19 18:56   ` Johannes Weiner
2015-11-19 18:56     ` Johannes Weiner
2015-11-10 18:34 ` [PATCH v2 2/6] Revert "gfp: add __GFP_NOACCOUNT" Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-19 18:59   ` Johannes Weiner
2015-11-19 18:59     ` Johannes Weiner
2015-11-10 18:34 ` [PATCH v2 3/6] memcg: only account kmem allocations marked as __GFP_ACCOUNT Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-12 16:04   ` Michal Hocko
2015-11-12 16:04     ` Michal Hocko
2015-11-12 16:04     ` Michal Hocko
2015-11-19 19:00   ` Johannes Weiner
2015-11-19 19:00     ` Johannes Weiner
2015-11-19 19:00     ` Johannes Weiner
2015-11-10 18:34 ` [PATCH v2 4/6] slab: add SLAB_ACCOUNT flag Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-10 18:38   ` Tejun Heo
2015-11-10 18:38     ` Tejun Heo
2015-11-10 18:38     ` Tejun Heo
2015-11-10 18:54     ` Vladimir Davydov
2015-11-10 18:54       ` Vladimir Davydov
2015-11-11 15:54       ` Tejun Heo
2015-11-11 15:54         ` Tejun Heo
2015-11-11 15:54         ` Tejun Heo
2015-11-11 16:07         ` Vladimir Davydov
2015-11-11 16:07           ` Vladimir Davydov
2015-11-11 16:07           ` Vladimir Davydov
2015-11-11 16:19           ` Tejun Heo
2015-11-11 16:19             ` Tejun Heo
2015-11-12 16:17   ` Michal Hocko
2015-11-12 16:17     ` Michal Hocko
2015-11-12 16:17     ` Michal Hocko
2015-11-14 11:29     ` Vladimir Davydov
2015-11-14 11:29       ` Vladimir Davydov
2015-11-19 19:01   ` Johannes Weiner
2015-11-19 19:01     ` Johannes Weiner
2015-11-19 19:01     ` Johannes Weiner
2015-11-10 18:34 ` [PATCH v2 5/6] vmalloc: allow to account vmalloc to memcg Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-19 19:04   ` Johannes Weiner
2015-11-19 19:04     ` Johannes Weiner
2015-11-19 19:04     ` Johannes Weiner
2015-11-10 18:34 ` [PATCH v2 6/6] Account certain kmem allocations " Vladimir Davydov
2015-11-10 18:34   ` Vladimir Davydov
2015-11-12 16:50   ` Michal Hocko
2015-11-12 16:50     ` Michal Hocko
2015-11-12 16:50     ` Michal Hocko
2015-11-19 19:12   ` Johannes Weiner
2015-11-19 19:12     ` Johannes Weiner
2015-11-19 19:12     ` Johannes Weiner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.