linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] More stuff to charge to kmemcg
@ 2016-05-23 10:20 Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 1/8] mm: remove pointless struct in struct page definition Vladimir Davydov
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

Hi,

This patch implements per kmemcg accounting of page tables (x86-only),
pipe buffers, and unix socket buffers.

Basically, this is v2 of my earlier attempt [1], addressing comments by
Andrew, namely: lack of comments to non-standard _mapcount usage, extra
overhead even when kmemcg is unused, wrong handling of stolen pipe
buffer pages.

Patches 1-3 are just cleanups that are not supposed to introduce any
functional changes. Patches 4 and 5 move charge/uncharge to generic page
allocator paths for the sake of accounting pipe and unix socket buffers.
Patches 5-7 make x86 page tables, pipe buffers, and unix socket buffers
accountable.

[1] http://lkml.kernel.org/r/%3Ccover.1443262808.git.vdavydov@parallels.com%3E

Thanks,

Vladimir Davydov (8):
  mm: remove pointless struct in struct page definition
  mm: clean up non-standard page->_mapcount users
  mm: memcontrol: cleanup kmem charge functions
  mm: charge/uncharge kmemcg from generic page allocator paths
  mm: memcontrol: teach uncharge_list to deal with kmem pages
  arch: x86: charge page tables to kmemcg
  pipe: account to kmemcg
  af_unix: charge buffers to kmemcg

 arch/x86/include/asm/pgalloc.h |  12 ++++-
 arch/x86/mm/pgtable.c          |  11 ++--
 fs/pipe.c                      |  27 ++++++++--
 include/linux/gfp.h            |  10 +---
 include/linux/memcontrol.h     | 103 +++---------------------------------
 include/linux/mm_types.h       |  73 ++++++++++++-------------
 include/linux/page-flags.h     |  78 +++++++++++++--------------
 kernel/fork.c                  |   6 +--
 mm/memcontrol.c                | 117 ++++++++++++++++++++++++++++-------------
 mm/page_alloc.c                |  63 +++++-----------------
 mm/slab.h                      |  16 ++++--
 mm/slab_common.c               |   2 +-
 mm/slub.c                      |   6 +--
 mm/vmalloc.c                   |   6 +--
 net/unix/af_unix.c             |   1 +
 scripts/tags.sh                |   3 ++
 16 files changed, 241 insertions(+), 293 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/8] mm: remove pointless struct in struct page definition
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 2/8] mm: clean up non-standard page->_mapcount users Vladimir Davydov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

... to reduce indentation level thus leaving more space for comments.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/mm_types.h | 68 +++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 36 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index d553855503e6..3cc5977a9cab 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -60,51 +60,47 @@ struct page {
 	};
 
 	/* Second double word */
-	struct {
-		union {
-			pgoff_t index;		/* Our offset within mapping. */
-			void *freelist;		/* sl[aou]b first free object */
-			/* page_deferred_list().prev	-- second tail page */
-		};
+	union {
+		pgoff_t index;		/* Our offset within mapping. */
+		void *freelist;		/* sl[aou]b first free object */
+		/* page_deferred_list().prev	-- second tail page */
+	};
 
-		union {
+	union {
 #if defined(CONFIG_HAVE_CMPXCHG_DOUBLE) && \
 	defined(CONFIG_HAVE_ALIGNED_STRUCT_PAGE)
-			/* Used for cmpxchg_double in slub */
-			unsigned long counters;
+		/* Used for cmpxchg_double in slub */
+		unsigned long counters;
 #else
-			/*
-			 * Keep _refcount separate from slub cmpxchg_double
-			 * data.  As the rest of the double word is protected by
-			 * slab_lock but _refcount is not.
-			 */
-			unsigned counters;
+		/*
+		 * Keep _refcount separate from slub cmpxchg_double data.
+		 * As the rest of the double word is protected by slab_lock
+		 * but _refcount is not.
+		 */
+		unsigned counters;
 #endif
+		struct {
 
-			struct {
-
-				union {
-					/*
-					 * Count of ptes mapped in mms, to show
-					 * when page is mapped & limit reverse
-					 * map searches.
-					 */
-					atomic_t _mapcount;
-
-					struct { /* SLUB */
-						unsigned inuse:16;
-						unsigned objects:15;
-						unsigned frozen:1;
-					};
-					int units;	/* SLOB */
-				};
+			union {
 				/*
-				 * Usage count, *USE WRAPPER FUNCTION*
-				 * when manual accounting. See page_ref.h
+				 * Count of ptes mapped in mms, to show when
+				 * page is mapped & limit reverse map searches.
 				 */
-				atomic_t _refcount;
+				atomic_t _mapcount;
+
+				unsigned int active;		/* SLAB */
+				struct {			/* SLUB */
+					unsigned inuse:16;
+					unsigned objects:15;
+					unsigned frozen:1;
+				};
+				int units;			/* SLOB */
 			};
-			unsigned int active;	/* SLAB */
+			/*
+			 * Usage count, *USE WRAPPER FUNCTION* when manual
+			 * accounting. See page_ref.h
+			 */
+			atomic_t _refcount;
 		};
 	};
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/8] mm: clean up non-standard page->_mapcount users
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 1/8] mm: remove pointless struct in struct page definition Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 3/8] mm: memcontrol: cleanup kmem charge functions Vladimir Davydov
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

 - Add a proper comment to page->_mapcount.
 - Introduce a macro for generating helper functions.
 - Place all special page->_mapcount values next to each other so that
   readers can see all possible values and so we don't get duplicates.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/mm_types.h   |  5 ++++
 include/linux/page-flags.h | 73 ++++++++++++++++++++--------------------------
 scripts/tags.sh            |  3 ++
 3 files changed, 40 insertions(+), 41 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3cc5977a9cab..16bdef7943e3 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -85,6 +85,11 @@ struct page {
 				/*
 				 * Count of ptes mapped in mms, to show when
 				 * page is mapped & limit reverse map searches.
+				 *
+				 * Extra information about page type may be
+				 * stored here for pages that are never mapped,
+				 * in which case the value MUST BE <= -2.
+				 * See page-flags.h for more details.
 				 */
 				atomic_t _mapcount;
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e5a32445f930..9940ade6a25e 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -593,54 +593,45 @@ TESTPAGEFLAG_FALSE(DoubleMap)
 #endif
 
 /*
- * PageBuddy() indicate that the page is free and in the buddy system
- * (see mm/page_alloc.c).
- *
- * PAGE_BUDDY_MAPCOUNT_VALUE must be <= -2 but better not too close to
- * -2 so that an underflow of the page_mapcount() won't be mistaken
- * for a genuine PAGE_BUDDY_MAPCOUNT_VALUE. -128 can be created very
- * efficiently by most CPU architectures.
+ * For pages that are never mapped to userspace, page->mapcount may be
+ * used for storing extra information about page type. Any value used
+ * for this purpose must be <= -2, but it's better start not too close
+ * to -2 so that an underflow of the page_mapcount() won't be mistaken
+ * for a special page.
  */
-#define PAGE_BUDDY_MAPCOUNT_VALUE (-128)
-
-static inline int PageBuddy(struct page *page)
-{
-	return atomic_read(&page->_mapcount) == PAGE_BUDDY_MAPCOUNT_VALUE;
+#define PAGE_MAPCOUNT_OPS(uname, lname)					\
+static __always_inline int Page##uname(struct page *page)		\
+{									\
+	return atomic_read(&page->_mapcount) ==				\
+				PAGE_##lname##_MAPCOUNT_VALUE;		\
+}									\
+static __always_inline void __SetPage##uname(struct page *page)		\
+{									\
+	VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);	\
+	atomic_set(&page->_mapcount, PAGE_##lname##_MAPCOUNT_VALUE);	\
+}									\
+static __always_inline void __ClearPage##uname(struct page *page)	\
+{									\
+	VM_BUG_ON_PAGE(!Page##uname(page), page);			\
+	atomic_set(&page->_mapcount, -1);				\
 }
 
-static inline void __SetPageBuddy(struct page *page)
-{
-	VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
-	atomic_set(&page->_mapcount, PAGE_BUDDY_MAPCOUNT_VALUE);
-}
+/*
+ * PageBuddy() indicate that the page is free and in the buddy system
+ * (see mm/page_alloc.c).
+ */
+#define PAGE_BUDDY_MAPCOUNT_VALUE		(-128)
+PAGE_MAPCOUNT_OPS(Buddy, BUDDY)
 
-static inline void __ClearPageBuddy(struct page *page)
-{
-	VM_BUG_ON_PAGE(!PageBuddy(page), page);
-	atomic_set(&page->_mapcount, -1);
-}
+/*
+ * PageBalloon() is set on pages that are on the balloon page list
+ * (see mm/balloon_compaction.c).
+ */
+#define PAGE_BALLOON_MAPCOUNT_VALUE		(-256)
+PAGE_MAPCOUNT_OPS(Balloon, BALLOON)
 
 extern bool is_free_buddy_page(struct page *page);
 
-#define PAGE_BALLOON_MAPCOUNT_VALUE (-256)
-
-static inline int PageBalloon(struct page *page)
-{
-	return atomic_read(&page->_mapcount) == PAGE_BALLOON_MAPCOUNT_VALUE;
-}
-
-static inline void __SetPageBalloon(struct page *page)
-{
-	VM_BUG_ON_PAGE(atomic_read(&page->_mapcount) != -1, page);
-	atomic_set(&page->_mapcount, PAGE_BALLOON_MAPCOUNT_VALUE);
-}
-
-static inline void __ClearPageBalloon(struct page *page)
-{
-	VM_BUG_ON_PAGE(!PageBalloon(page), page);
-	atomic_set(&page->_mapcount, -1);
-}
-
 /*
  * If network-based swap is enabled, sl*b must keep track of whether pages
  * were allocated from pfmemalloc reserves.
diff --git a/scripts/tags.sh b/scripts/tags.sh
index f72f48f638ae..ed7eef24ef89 100755
--- a/scripts/tags.sh
+++ b/scripts/tags.sh
@@ -185,6 +185,9 @@ regex_c=(
 	'/\<CLEARPAGEFLAG_NOOP(\([[:alnum:]_]*\).*/ClearPage\1/'
 	'/\<__CLEARPAGEFLAG_NOOP(\([[:alnum:]_]*\).*/__ClearPage\1/'
 	'/\<TESTCLEARFLAG_FALSE(\([[:alnum:]_]*\).*/TestClearPage\1/'
+	'/^PAGE_MAPCOUNT_OPS(\([[:alnum:]_]*\).*/Page\1/'
+	'/^PAGE_MAPCOUNT_OPS(\([[:alnum:]_]*\).*/__SetPage\1/'
+	'/^PAGE_MAPCOUNT_OPS(\([[:alnum:]_]*\).*/__ClearPage\1/'
 	'/^TASK_PFA_TEST([^,]*, *\([[:alnum:]_]*\))/task_\1/'
 	'/^TASK_PFA_SET([^,]*, *\([[:alnum:]_]*\))/task_set_\1/'
 	'/^TASK_PFA_CLEAR([^,]*, *\([[:alnum:]_]*\))/task_clear_\1/'
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/8] mm: memcontrol: cleanup kmem charge functions
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 1/8] mm: remove pointless struct in struct page definition Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 2/8] mm: clean up non-standard page->_mapcount users Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 4/8] mm: charge/uncharge kmemcg from generic page allocator paths Vladimir Davydov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

 - Handle memcg_kmem_enabled check out to the caller. This reduces the
   number of function definitions making the code easier to follow. At
   the same time it doesn't result in code bloat, because all of these
   functions are used only in one or two places.
 - Move __GFP_ACCOUNT check to the caller as well so that one wouldn't
   have to dive deep into memcg implementation to see which allocations
   are charged and which are not.
 - Refresh comments.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/memcontrol.h | 103 +++------------------------------------------
 mm/memcontrol.c            |  75 ++++++++++++++++++++++++---------
 mm/page_alloc.c            |   9 ++--
 mm/slab.h                  |  16 +++++--
 4 files changed, 80 insertions(+), 123 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a805474df4ab..2d03975c7dc0 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -754,6 +754,13 @@ static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
 }
 #endif
 
+struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep);
+void memcg_kmem_put_cache(struct kmem_cache *cachep);
+int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
+			    struct mem_cgroup *memcg);
+int memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
+void memcg_kmem_uncharge(struct page *page, int order);
+
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 extern struct static_key_false memcg_kmem_enabled_key;
 
@@ -775,22 +782,6 @@ static inline bool memcg_kmem_enabled(void)
 }
 
 /*
- * In general, we'll do everything in our power to not incur in any overhead
- * for non-memcg users for the kmem functions. Not even a function call, if we
- * can avoid it.
- *
- * Therefore, we'll inline all those functions so that in the best case, we'll
- * see that kmemcg is off for everybody and proceed quickly.  If it is on,
- * we'll still do most of the flag checking inline. We check a lot of
- * conditions, but because they are pretty simple, they are expected to be
- * fast.
- */
-int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
-			      struct mem_cgroup *memcg);
-int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order);
-void __memcg_kmem_uncharge(struct page *page, int order);
-
-/*
  * helper for accessing a memcg's index. It will be used as an index in the
  * child cache array in kmem_cache, and also to derive its name. This function
  * will return -1 when this is not a kmem-limited memcg.
@@ -800,67 +791,6 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
 	return memcg ? memcg->kmemcg_id : -1;
 }
 
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
-void __memcg_kmem_put_cache(struct kmem_cache *cachep);
-
-static inline bool __memcg_kmem_bypass(void)
-{
-	if (!memcg_kmem_enabled())
-		return true;
-	if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
-		return true;
-	return false;
-}
-
-/**
- * memcg_kmem_charge: charge a kmem page
- * @page: page to charge
- * @gfp: reclaim mode
- * @order: allocation order
- *
- * Returns 0 on success, an error code on failure.
- */
-static __always_inline int memcg_kmem_charge(struct page *page,
-					     gfp_t gfp, int order)
-{
-	if (__memcg_kmem_bypass())
-		return 0;
-	if (!(gfp & __GFP_ACCOUNT))
-		return 0;
-	return __memcg_kmem_charge(page, gfp, order);
-}
-
-/**
- * memcg_kmem_uncharge: uncharge a kmem page
- * @page: page to uncharge
- * @order: allocation order
- */
-static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
-{
-	if (memcg_kmem_enabled())
-		__memcg_kmem_uncharge(page, order);
-}
-
-/**
- * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
- * @cachep: the original global kmem cache
- *
- * All memory allocated from a per-memcg cache is charged to the owner memcg.
- */
-static __always_inline struct kmem_cache *
-memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
-{
-	if (__memcg_kmem_bypass())
-		return cachep;
-	return __memcg_kmem_get_cache(cachep, gfp);
-}
-
-static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
-{
-	if (memcg_kmem_enabled())
-		__memcg_kmem_put_cache(cachep);
-}
-
 /**
  * memcg_kmem_update_page_stat - update kmem page state statistics
  * @page: the page
@@ -883,15 +813,6 @@ static inline bool memcg_kmem_enabled(void)
 	return false;
 }
 
-static inline int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
-{
-	return 0;
-}
-
-static inline void memcg_kmem_uncharge(struct page *page, int order)
-{
-}
-
 static inline int memcg_cache_id(struct mem_cgroup *memcg)
 {
 	return -1;
@@ -905,16 +826,6 @@ static inline void memcg_put_cache_ids(void)
 {
 }
 
-static inline struct kmem_cache *
-memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
-{
-	return cachep;
-}
-
-static inline void memcg_kmem_put_cache(struct kmem_cache *cachep)
-{
-}
-
 static inline void memcg_kmem_update_page_stat(struct page *page,
 				enum mem_cgroup_stat_index idx, int val)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b3f16ab4b431..482b4a0c97e4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2268,20 +2268,30 @@ static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
 	current->memcg_kmem_skip_account = 0;
 }
 
-/*
+static inline bool memcg_kmem_bypass(void)
+{
+	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
+		return true;
+	return false;
+}
+
+/**
+ * memcg_kmem_get_cache: select the correct per-memcg cache for allocation
+ * @cachep: the original global kmem cache
+ *
  * Return the kmem_cache we're supposed to use for a slab allocation.
  * We try to use the current memcg's version of the cache.
  *
- * If the cache does not exist yet, if we are the first user of it,
- * we either create it immediately, if possible, or create it asynchronously
- * in a workqueue.
- * In the latter case, we will let the current allocation go through with
- * the original cache.
+ * If the cache does not exist yet, if we are the first user of it, we
+ * create it asynchronously in a workqueue and let the current allocation
+ * go through with the original cache.
  *
- * Can't be called in interrupt context or from kernel threads.
- * This function needs to be called with rcu_read_lock() held.
+ * This function takes a reference to the cache it returns to assure it
+ * won't get destroyed while we are working with it. Once the caller is
+ * done with it, memcg_kmem_put_cache() must be called to release the
+ * reference.
  */
-struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
+struct kmem_cache *memcg_kmem_get_cache(struct kmem_cache *cachep)
 {
 	struct mem_cgroup *memcg;
 	struct kmem_cache *memcg_cachep;
@@ -2289,10 +2299,7 @@ struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
 
 	VM_BUG_ON(!is_root_cache(cachep));
 
-	if (cachep->flags & SLAB_ACCOUNT)
-		gfp |= __GFP_ACCOUNT;
-
-	if (!(gfp & __GFP_ACCOUNT))
+	if (memcg_kmem_bypass())
 		return cachep;
 
 	if (current->memcg_kmem_skip_account)
@@ -2325,14 +2332,27 @@ out:
 	return cachep;
 }
 
-void __memcg_kmem_put_cache(struct kmem_cache *cachep)
+/**
+ * memcg_kmem_put_cache: drop reference taken by memcg_kmem_get_cache
+ * @cachep: the cache returned by memcg_kmem_get_cache
+ */
+void memcg_kmem_put_cache(struct kmem_cache *cachep)
 {
 	if (!is_root_cache(cachep))
 		css_put(&cachep->memcg_params.memcg->css);
 }
 
-int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
-			      struct mem_cgroup *memcg)
+/**
+ * memcg_kmem_charge: charge a kmem page
+ * @page: page to charge
+ * @gfp: reclaim mode
+ * @order: allocation order
+ * @memcg: memory cgroup to charge
+ *
+ * Returns 0 on success, an error code on failure.
+ */
+int memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
+			    struct mem_cgroup *memcg)
 {
 	unsigned int nr_pages = 1 << order;
 	struct page_counter *counter;
@@ -2353,19 +2373,34 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
 	return 0;
 }
 
-int __memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
+/**
+ * memcg_kmem_charge: charge a kmem page to the current memory cgroup
+ * @page: page to charge
+ * @gfp: reclaim mode
+ * @order: allocation order
+ *
+ * Returns 0 on success, an error code on failure.
+ */
+int memcg_kmem_charge(struct page *page, gfp_t gfp, int order)
 {
 	struct mem_cgroup *memcg;
 	int ret = 0;
 
+	if (memcg_kmem_bypass())
+		return 0;
+
 	memcg = get_mem_cgroup_from_mm(current->mm);
 	if (!mem_cgroup_is_root(memcg))
-		ret = __memcg_kmem_charge_memcg(page, gfp, order, memcg);
+		ret = memcg_kmem_charge_memcg(page, gfp, order, memcg);
 	css_put(&memcg->css);
 	return ret;
 }
-
-void __memcg_kmem_uncharge(struct page *page, int order)
+/**
+ * memcg_kmem_uncharge: uncharge a kmem page
+ * @page: page to uncharge
+ * @order: allocation order
+ */
+void memcg_kmem_uncharge(struct page *page, int order)
 {
 	struct mem_cgroup *memcg = page->mem_cgroup;
 	unsigned int nr_pages = 1 << order;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 520c3576249b..f948f4e02e89 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4001,7 +4001,8 @@ struct page *alloc_kmem_pages(gfp_t gfp_mask, unsigned int order)
 	struct page *page;
 
 	page = alloc_pages(gfp_mask, order);
-	if (page && memcg_kmem_charge(page, gfp_mask, order) != 0) {
+	if (memcg_kmem_enabled() && (gfp_mask & __GFP_ACCOUNT) &&
+	    page && memcg_kmem_charge(page, gfp_mask, order) != 0) {
 		__free_pages(page, order);
 		page = NULL;
 	}
@@ -4013,7 +4014,8 @@ struct page *alloc_kmem_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
 	struct page *page;
 
 	page = alloc_pages_node(nid, gfp_mask, order);
-	if (page && memcg_kmem_charge(page, gfp_mask, order) != 0) {
+	if (memcg_kmem_enabled() && (gfp_mask & __GFP_ACCOUNT) &&
+	    page && memcg_kmem_charge(page, gfp_mask, order) != 0) {
 		__free_pages(page, order);
 		page = NULL;
 	}
@@ -4026,7 +4028,8 @@ struct page *alloc_kmem_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
  */
 void __free_kmem_pages(struct page *page, unsigned int order)
 {
-	memcg_kmem_uncharge(page, order);
+	if (memcg_kmem_enabled())
+		memcg_kmem_uncharge(page, order);
 	__free_pages(page, order);
 }
 
diff --git a/mm/slab.h b/mm/slab.h
index dedb1a920fb8..a98859400bc0 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -253,8 +253,7 @@ static __always_inline int memcg_charge_slab(struct page *page,
 	if (is_root_cache(s))
 		return 0;
 
-	ret = __memcg_kmem_charge_memcg(page, gfp, order,
-					s->memcg_params.memcg);
+	ret = memcg_kmem_charge_memcg(page, gfp, order, s->memcg_params.memcg);
 	if (ret)
 		return ret;
 
@@ -268,6 +267,9 @@ static __always_inline int memcg_charge_slab(struct page *page,
 static __always_inline void memcg_uncharge_slab(struct page *page, int order,
 						struct kmem_cache *s)
 {
+	if (!memcg_kmem_enabled())
+		return;
+
 	memcg_kmem_update_page_stat(page,
 			(s->flags & SLAB_RECLAIM_ACCOUNT) ?
 			MEMCG_SLAB_RECLAIMABLE : MEMCG_SLAB_UNRECLAIMABLE,
@@ -390,7 +392,11 @@ static inline struct kmem_cache *slab_pre_alloc_hook(struct kmem_cache *s,
 	if (should_failslab(s, flags))
 		return NULL;
 
-	return memcg_kmem_get_cache(s, flags);
+	if (memcg_kmem_enabled() &&
+	    ((flags & __GFP_ACCOUNT) || (s->flags & SLAB_ACCOUNT)))
+		return memcg_kmem_get_cache(s);
+
+	return s;
 }
 
 static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
@@ -407,7 +413,9 @@ static inline void slab_post_alloc_hook(struct kmem_cache *s, gfp_t flags,
 					 s->flags, flags);
 		kasan_slab_alloc(s, object, flags);
 	}
-	memcg_kmem_put_cache(s);
+
+	if (memcg_kmem_enabled())
+		memcg_kmem_put_cache(s);
 }
 
 #ifndef CONFIG_SLOB
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/8] mm: charge/uncharge kmemcg from generic page allocator paths
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
                   ` (2 preceding siblings ...)
  2016-05-23 10:20 ` [PATCH 3/8] mm: memcontrol: cleanup kmem charge functions Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 5/8] mm: memcontrol: teach uncharge_list to deal with kmem pages Vladimir Davydov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

Currently, to charge a non-slab allocation to kmemcg one has to use
alloc_kmem_pages helper with __GFP_ACCOUNT flag. A page allocated with
this helper should finally be freed using free_kmem_pages, otherwise it
won't be uncharged.

This API suits its current users fine, but it turns out to be impossible
to use along with page reference counting, i.e. when an allocation is
supposed to be freed with put_page, as it is the case with pipe or unix
socket buffers.

To overcome this limitation, this patch moves charging/uncharging to
generic page allocator paths, i.e. to __alloc_pages_nodemask and
free_pages_prepare, and zaps alloc/free_kmem_pages helpers. This way,
one can use any of the available page allocation functions to get the
allocated page charged to kmemcg - it's enough to pass __GFP_ACCOUNT,
just like in case of kmalloc and friends. A charged page will be
automatically uncharged on free.

To make it possible, we need to mark pages charged to kmemcg somehow. To
avoid introducing a new page flag, we make use of page->_mapcount for
marking such pages. Since pages charged to kmemcg are not supposed to be
mapped to userspace, it should work just fine. There are other (ab)users
of page->_mapcount - buddy and balloon pages - but we don't conflict
with them.

In case kmemcg is compiled out or not used at runtime, this patch
introduces no overhead to generic page allocator paths. If kmemcg is
used, it will be plus one gfp flags check on alloc and plus one
page->_mapcount check on free, which shouldn't hurt performance, because
the data accessed are hot.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 include/linux/gfp.h        | 10 +------
 include/linux/page-flags.h |  7 +++++
 kernel/fork.c              |  6 ++---
 mm/page_alloc.c            | 66 +++++++++-------------------------------------
 mm/slab_common.c           |  2 +-
 mm/slub.c                  |  6 ++---
 mm/vmalloc.c               |  6 ++---
 7 files changed, 31 insertions(+), 72 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 570383a41853..c29e9d347bc6 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -78,8 +78,7 @@ struct vm_area_struct;
  * __GFP_THISNODE forces the allocation to be satisified from the requested
  *   node with no fallbacks or placement policy enforcements.
  *
- * __GFP_ACCOUNT causes the allocation to be accounted to kmemcg (only relevant
- *   to kmem allocations).
+ * __GFP_ACCOUNT causes the allocation to be accounted to kmemcg.
  */
 #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE)
 #define __GFP_WRITE	((__force gfp_t)___GFP_WRITE)
@@ -486,10 +485,6 @@ extern struct page *alloc_pages_vma(gfp_t gfp_mask, int order,
 #define alloc_page_vma_node(gfp_mask, vma, addr, node)		\
 	alloc_pages_vma(gfp_mask, 0, vma, addr, node, false)
 
-extern struct page *alloc_kmem_pages(gfp_t gfp_mask, unsigned int order);
-extern struct page *alloc_kmem_pages_node(int nid, gfp_t gfp_mask,
-					  unsigned int order);
-
 extern unsigned long __get_free_pages(gfp_t gfp_mask, unsigned int order);
 extern unsigned long get_zeroed_page(gfp_t gfp_mask);
 
@@ -513,9 +508,6 @@ extern void *__alloc_page_frag(struct page_frag_cache *nc,
 			       unsigned int fragsz, gfp_t gfp_mask);
 extern void __free_page_frag(void *addr);
 
-extern void __free_kmem_pages(struct page *page, unsigned int order);
-extern void free_kmem_pages(unsigned long addr, unsigned int order);
-
 #define __free_page(page) __free_pages((page), 0)
 #define free_page(addr) free_pages((addr), 0)
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 9940ade6a25e..b51e75a47e82 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -630,6 +630,13 @@ PAGE_MAPCOUNT_OPS(Buddy, BUDDY)
 #define PAGE_BALLOON_MAPCOUNT_VALUE		(-256)
 PAGE_MAPCOUNT_OPS(Balloon, BALLOON)
 
+/*
+ * If kmemcg is enabled, the buddy allocator will set PageKmemcg() on
+ * pages allocated with __GFP_ACCOUNT. It gets cleared on page free.
+ */
+#define PAGE_KMEMCG_MAPCOUNT_VALUE		(-512)
+PAGE_MAPCOUNT_OPS(Kmemcg, KMEMCG)
+
 extern bool is_free_buddy_page(struct page *page);
 
 /*
diff --git a/kernel/fork.c b/kernel/fork.c
index 66cc2e0e137e..3f3c30f80786 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -162,8 +162,8 @@ void __weak arch_release_thread_info(struct thread_info *ti)
 static struct thread_info *alloc_thread_info_node(struct task_struct *tsk,
 						  int node)
 {
-	struct page *page = alloc_kmem_pages_node(node, THREADINFO_GFP,
-						  THREAD_SIZE_ORDER);
+	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
+					     THREAD_SIZE_ORDER);
 
 	if (page)
 		memcg_kmem_update_page_stat(page, MEMCG_KERNEL_STACK,
@@ -178,7 +178,7 @@ static inline void free_thread_info(struct thread_info *ti)
 
 	memcg_kmem_update_page_stat(page, MEMCG_KERNEL_STACK,
 				    -(1 << THREAD_SIZE_ORDER));
-	__free_kmem_pages(page, THREAD_SIZE_ORDER);
+	__free_pages(page, THREAD_SIZE_ORDER);
 }
 # else
 static struct kmem_cache *thread_info_cache;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f948f4e02e89..c9ee98f1d3cc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -63,6 +63,7 @@
 #include <linux/sched/rt.h>
 #include <linux/page_owner.h>
 #include <linux/kthread.h>
+#include <linux/memcontrol.h>
 
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
@@ -1017,6 +1018,10 @@ static __always_inline bool free_pages_prepare(struct page *page,
 	}
 	if (PageAnonHead(page))
 		page->mapping = NULL;
+	if (memcg_kmem_enabled() && PageKmemcg(page)) {
+		memcg_kmem_uncharge(page, order);
+		__ClearPageKmemcg(page);
+	}
 	if (check_free)
 		bad += free_pages_check(page);
 	if (bad)
@@ -3833,6 +3838,14 @@ no_zone:
 	}
 
 out:
+	if (memcg_kmem_enabled() && (gfp_mask & __GFP_ACCOUNT) && page) {
+		if (unlikely(memcg_kmem_charge(page, gfp_mask, order))) {
+			__free_pages(page, order);
+			page = NULL;
+		} else
+			__SetPageKmemcg(page);
+	}
+
 	if (kmemcheck_enabled && page)
 		kmemcheck_pagealloc_alloc(page, order, gfp_mask);
 
@@ -3988,59 +4001,6 @@ void __free_page_frag(void *addr)
 }
 EXPORT_SYMBOL(__free_page_frag);
 
-/*
- * alloc_kmem_pages charges newly allocated pages to the kmem resource counter
- * of the current memory cgroup if __GFP_ACCOUNT is set, other than that it is
- * equivalent to alloc_pages.
- *
- * It should be used when the caller would like to use kmalloc, but since the
- * allocation is large, it has to fall back to the page allocator.
- */
-struct page *alloc_kmem_pages(gfp_t gfp_mask, unsigned int order)
-{
-	struct page *page;
-
-	page = alloc_pages(gfp_mask, order);
-	if (memcg_kmem_enabled() && (gfp_mask & __GFP_ACCOUNT) &&
-	    page && memcg_kmem_charge(page, gfp_mask, order) != 0) {
-		__free_pages(page, order);
-		page = NULL;
-	}
-	return page;
-}
-
-struct page *alloc_kmem_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
-{
-	struct page *page;
-
-	page = alloc_pages_node(nid, gfp_mask, order);
-	if (memcg_kmem_enabled() && (gfp_mask & __GFP_ACCOUNT) &&
-	    page && memcg_kmem_charge(page, gfp_mask, order) != 0) {
-		__free_pages(page, order);
-		page = NULL;
-	}
-	return page;
-}
-
-/*
- * __free_kmem_pages and free_kmem_pages will free pages allocated with
- * alloc_kmem_pages.
- */
-void __free_kmem_pages(struct page *page, unsigned int order)
-{
-	if (memcg_kmem_enabled())
-		memcg_kmem_uncharge(page, order);
-	__free_pages(page, order);
-}
-
-void free_kmem_pages(unsigned long addr, unsigned int order)
-{
-	if (addr != 0) {
-		VM_BUG_ON(!virt_addr_valid((void *)addr));
-		__free_kmem_pages(virt_to_page((void *)addr), order);
-	}
-}
-
 static void *make_alloc_exact(unsigned long addr, unsigned int order,
 		size_t size)
 {
diff --git a/mm/slab_common.c b/mm/slab_common.c
index a65dad7fdcd1..3bf8f63a6abe 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1012,7 +1012,7 @@ void *kmalloc_order(size_t size, gfp_t flags, unsigned int order)
 	struct page *page;
 
 	flags |= __GFP_COMP;
-	page = alloc_kmem_pages(flags, order);
+	page = alloc_pages(flags, order);
 	ret = page ? page_address(page) : NULL;
 	kmemleak_alloc(ret, size, 1, flags);
 	kasan_kmalloc_large(ret, size, flags);
diff --git a/mm/slub.c b/mm/slub.c
index 538c8584ba8d..ac7b5a55e4c9 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2867,7 +2867,7 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
 		if (unlikely(!PageSlab(page))) {
 			BUG_ON(!PageCompound(page));
 			kfree_hook(object);
-			__free_kmem_pages(page, compound_order(page));
+			__free_pages(page, compound_order(page));
 			p[size] = NULL; /* mark object processed */
 			return size;
 		}
@@ -3575,7 +3575,7 @@ static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
 	void *ptr = NULL;
 
 	flags |= __GFP_COMP | __GFP_NOTRACK;
-	page = alloc_kmem_pages_node(node, flags, get_order(size));
+	page = alloc_pages_node(node, flags, get_order(size));
 	if (page)
 		ptr = page_address(page);
 
@@ -3656,7 +3656,7 @@ void kfree(const void *x)
 	if (unlikely(!PageSlab(page))) {
 		BUG_ON(!PageCompound(page));
 		kfree_hook(x);
-		__free_kmem_pages(page, compound_order(page));
+		__free_pages(page, compound_order(page));
 		return;
 	}
 	slab_free(page->slab_cache, page, object, NULL, 1, _RET_IP_);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index cf7ad1a53be0..a806236ecd62 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1501,7 +1501,7 @@ static void __vunmap(const void *addr, int deallocate_pages)
 			struct page *page = area->pages[i];
 
 			BUG_ON(!page);
-			__free_kmem_pages(page, 0);
+			__free_pages(page, 0);
 		}
 
 		kvfree(area->pages);
@@ -1628,9 +1628,9 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 		struct page *page;
 
 		if (node == NUMA_NO_NODE)
-			page = alloc_kmem_pages(alloc_mask, order);
+			page = alloc_pages(alloc_mask, order);
 		else
-			page = alloc_kmem_pages_node(node, alloc_mask, order);
+			page = alloc_pages_node(node, alloc_mask, order);
 
 		if (unlikely(!page)) {
 			/* Successfully allocated i pages, free them in __vunmap() */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 5/8] mm: memcontrol: teach uncharge_list to deal with kmem pages
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
                   ` (3 preceding siblings ...)
  2016-05-23 10:20 ` [PATCH 4/8] mm: charge/uncharge kmemcg from generic page allocator paths Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 6/8] arch: x86: charge page tables to kmemcg Vladimir Davydov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

Page table pages are batched-freed in release_pages on most
architectures. If we want to charge them to kmemcg (this is what is done
later in this series), we need to teach mem_cgroup_uncharge_list to
handle kmem pages.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
---
 mm/memcontrol.c | 42 ++++++++++++++++++++++++------------------
 1 file changed, 24 insertions(+), 18 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 482b4a0c97e4..89a421ee4713 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5432,15 +5432,18 @@ void mem_cgroup_cancel_charge(struct page *page, struct mem_cgroup *memcg,
 
 static void uncharge_batch(struct mem_cgroup *memcg, unsigned long pgpgout,
 			   unsigned long nr_anon, unsigned long nr_file,
-			   unsigned long nr_huge, struct page *dummy_page)
+			   unsigned long nr_huge, unsigned long nr_kmem,
+			   struct page *dummy_page)
 {
-	unsigned long nr_pages = nr_anon + nr_file;
+	unsigned long nr_pages = nr_anon + nr_file + nr_kmem;
 	unsigned long flags;
 
 	if (!mem_cgroup_is_root(memcg)) {
 		page_counter_uncharge(&memcg->memory, nr_pages);
 		if (do_memsw_account())
 			page_counter_uncharge(&memcg->memsw, nr_pages);
+		if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) && nr_kmem)
+			page_counter_uncharge(&memcg->kmem, nr_kmem);
 		memcg_oom_recover(memcg);
 	}
 
@@ -5463,6 +5466,7 @@ static void uncharge_list(struct list_head *page_list)
 	unsigned long nr_anon = 0;
 	unsigned long nr_file = 0;
 	unsigned long nr_huge = 0;
+	unsigned long nr_kmem = 0;
 	unsigned long pgpgout = 0;
 	struct list_head *next;
 	struct page *page;
@@ -5473,8 +5477,6 @@ static void uncharge_list(struct list_head *page_list)
 	 */
 	next = page_list->next;
 	do {
-		unsigned int nr_pages = 1;
-
 		page = list_entry(next, struct page, lru);
 		next = page->lru.next;
 
@@ -5493,31 +5495,35 @@ static void uncharge_list(struct list_head *page_list)
 		if (memcg != page->mem_cgroup) {
 			if (memcg) {
 				uncharge_batch(memcg, pgpgout, nr_anon, nr_file,
-					       nr_huge, page);
-				pgpgout = nr_anon = nr_file = nr_huge = 0;
+					       nr_huge, nr_kmem, page);
+				pgpgout = nr_anon = nr_file =
+					nr_huge = nr_kmem = 0;
 			}
 			memcg = page->mem_cgroup;
 		}
 
-		if (PageTransHuge(page)) {
-			nr_pages <<= compound_order(page);
-			VM_BUG_ON_PAGE(!PageTransHuge(page), page);
-			nr_huge += nr_pages;
-		}
+		if (!PageKmemcg(page)) {
+			unsigned int nr_pages = 1;
 
-		if (PageAnon(page))
-			nr_anon += nr_pages;
-		else
-			nr_file += nr_pages;
+			if (PageTransHuge(page)) {
+				nr_pages <<= compound_order(page);
+				VM_BUG_ON_PAGE(!PageTransHuge(page), page);
+				nr_huge += nr_pages;
+			}
+			if (PageAnon(page))
+				nr_anon += nr_pages;
+			else
+				nr_file += nr_pages;
+			pgpgout++;
+		} else
+			nr_kmem += 1 << compound_order(page);
 
 		page->mem_cgroup = NULL;
-
-		pgpgout++;
 	} while (next != page_list);
 
 	if (memcg)
 		uncharge_batch(memcg, pgpgout, nr_anon, nr_file,
-			       nr_huge, page);
+			       nr_huge, nr_kmem, page);
 }
 
 /**
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 6/8] arch: x86: charge page tables to kmemcg
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
                   ` (4 preceding siblings ...)
  2016-05-23 10:20 ` [PATCH 5/8] mm: memcontrol: teach uncharge_list to deal with kmem pages Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 7/8] pipe: account " Vladimir Davydov
  2016-05-23 10:20 ` [PATCH 8/8] af_unix: charge buffers " Vladimir Davydov
  7 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, Johannes Weiner,
	Michal Hocko, linux-mm, linux-kernel

Page tables can bite a relatively big chunk off system memory and their
allocations are easy to trigger from userspace, so they should be
accounted to kmemcg.

This patch marks page table allocations as __GFP_ACCOUNT for x86. Note
we must not charge allocations of kernel page tables, because they can
be shared among processes from different cgroups so accounting them to a
particular one can pin other cgroups for indefinitely long. So we clear
__GFP_ACCOUNT flag if a page table is allocated for the kernel.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
 arch/x86/include/asm/pgalloc.h | 12 ++++++++++--
 arch/x86/mm/pgtable.c          | 11 ++++++++---
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index bf7f8b55b0f9..2f531633cb16 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -81,7 +81,11 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
 	struct page *page;
-	page = alloc_pages(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO, 0);
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_REPEAT | __GFP_ZERO;
+
+	if (mm == &init_mm)
+		gfp &= ~__GFP_ACCOUNT;
+	page = alloc_pages(gfp, 0);
 	if (!page)
 		return NULL;
 	if (!pgtable_pmd_page_ctor(page)) {
@@ -125,7 +129,11 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	return (pud_t *)get_zeroed_page(GFP_KERNEL|__GFP_REPEAT);
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_REPEAT;
+
+	if (mm == &init_mm)
+		gfp &= ~__GFP_ACCOUNT;
+	return (pud_t *)get_zeroed_page(gfp);
 }
 
 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 4eb287e25043..421ac6b74d11 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -6,7 +6,8 @@
 #include <asm/fixmap.h>
 #include <asm/mtrr.h>
 
-#define PGALLOC_GFP GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO
+#define PGALLOC_GFP (GFP_KERNEL_ACCOUNT | __GFP_NOTRACK | __GFP_REPEAT | \
+		     __GFP_ZERO)
 
 #ifdef CONFIG_HIGHPTE
 #define PGALLOC_USER_GFP __GFP_HIGHMEM
@@ -18,7 +19,7 @@ gfp_t __userpte_alloc_gfp = PGALLOC_GFP | PGALLOC_USER_GFP;
 
 pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
 {
-	return (pte_t *)__get_free_page(PGALLOC_GFP);
+	return (pte_t *)__get_free_page(PGALLOC_GFP & ~__GFP_ACCOUNT);
 }
 
 pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long address)
@@ -207,9 +208,13 @@ static int preallocate_pmds(struct mm_struct *mm, pmd_t *pmds[])
 {
 	int i;
 	bool failed = false;
+	gfp_t gfp = PGALLOC_GFP;
+
+	if (mm == &init_mm)
+		gfp &= ~__GFP_ACCOUNT;
 
 	for(i = 0; i < PREALLOCATED_PMDS; i++) {
-		pmd_t *pmd = (pmd_t *)__get_free_page(PGALLOC_GFP);
+		pmd_t *pmd = (pmd_t *)__get_free_page(gfp);
 		if (!pmd)
 			failed = true;
 		if (pmd && !pgtable_pmd_page_ctor(virt_to_page(pmd))) {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 7/8] pipe: account to kmemcg
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
                   ` (5 preceding siblings ...)
  2016-05-23 10:20 ` [PATCH 6/8] arch: x86: charge page tables to kmemcg Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 10:39   ` kbuild test robot
                     ` (2 more replies)
  2016-05-23 10:20 ` [PATCH 8/8] af_unix: charge buffers " Vladimir Davydov
  7 siblings, 3 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

Pipes can consume a significant amount of system memory, hence they
should be accounted to kmemcg.

This patch marks pipe_inode_info and anonymous pipe buffer page
allocations as __GFP_ACCOUNT so that they would be charged to kmemcg.
Note, since a pipe buffer page can be "stolen" and get reused for other
purposes, including mapping to userspace, we clear PageKmemcg thus
resetting page->_mapcount and uncharge it in anon_pipe_buf_steal, which
is introduced by this patch.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
---
 fs/pipe.c | 27 ++++++++++++++++++++++-----
 1 file changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/pipe.c b/fs/pipe.c
index 0d3f5165cb0b..6345f3543788 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -21,6 +21,7 @@
 #include <linux/audit.h>
 #include <linux/syscalls.h>
 #include <linux/fcntl.h>
+#include <linux/memcontrol.h>
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -137,6 +138,20 @@ static void anon_pipe_buf_release(struct pipe_inode_info *pipe,
 		put_page(page);
 }
 
+static int anon_pipe_buf_steal(struct pipe_inode_info *pipe,
+			       struct pipe_buffer *buf)
+{
+	struct page *page = buf->page;
+
+	if (page_count(page) == 1) {
+		memcg_kmem_uncharge(page, 0);
+		__ClearPageKmemcg(page);
+		__SetPageLocked(page);
+		return 0;
+	}
+	return 1;
+}
+
 /**
  * generic_pipe_buf_steal - attempt to take ownership of a &pipe_buffer
  * @pipe:	the pipe that the buffer belongs to
@@ -219,7 +234,7 @@ static const struct pipe_buf_operations anon_pipe_buf_ops = {
 	.can_merge = 1,
 	.confirm = generic_pipe_buf_confirm,
 	.release = anon_pipe_buf_release,
-	.steal = generic_pipe_buf_steal,
+	.steal = anon_pipe_buf_steal,
 	.get = generic_pipe_buf_get,
 };
 
@@ -227,7 +242,7 @@ static const struct pipe_buf_operations packet_pipe_buf_ops = {
 	.can_merge = 0,
 	.confirm = generic_pipe_buf_confirm,
 	.release = anon_pipe_buf_release,
-	.steal = generic_pipe_buf_steal,
+	.steal = anon_pipe_buf_steal,
 	.get = generic_pipe_buf_get,
 };
 
@@ -405,7 +420,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 			int copied;
 
 			if (!page) {
-				page = alloc_page(GFP_HIGHUSER);
+				page = alloc_page(GFP_HIGHUSER | __GFP_ACCOUNT);
 				if (unlikely(!page)) {
 					ret = ret ? : -ENOMEM;
 					break;
@@ -611,7 +626,7 @@ struct pipe_inode_info *alloc_pipe_info(void)
 {
 	struct pipe_inode_info *pipe;
 
-	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL);
+	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL_ACCOUNT);
 	if (pipe) {
 		unsigned long pipe_bufs = PIPE_DEF_BUFFERS;
 		struct user_struct *user = get_current_user();
@@ -619,7 +634,9 @@ struct pipe_inode_info *alloc_pipe_info(void)
 		if (!too_many_pipe_buffers_hard(user)) {
 			if (too_many_pipe_buffers_soft(user))
 				pipe_bufs = 1;
-			pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * pipe_bufs, GFP_KERNEL);
+			pipe->bufs = kcalloc(pipe_bufs,
+					     sizeof(struct pipe_buffer),
+					     GFP_KERNEL_ACCOUNT);
 		}
 
 		if (pipe->bufs) {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 8/8] af_unix: charge buffers to kmemcg
  2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
                   ` (6 preceding siblings ...)
  2016-05-23 10:20 ` [PATCH 7/8] pipe: account " Vladimir Davydov
@ 2016-05-23 10:20 ` Vladimir Davydov
  2016-05-23 22:13   ` David Miller
  2016-05-24  7:41   ` Vladimir Davydov
  7 siblings, 2 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:20 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David S. Miller, Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

Unix sockets can consume a significant amount of system memory, hence
they should be accounted to kmemcg.

Since unix socket buffers are always allocated from process context,
all we need to do to charge them to kmemcg is set __GFP_ACCOUNT in
sock->sk_allocation mask.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: "David S. Miller" <davem@davemloft.net>
---
 net/unix/af_unix.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 80aa6a3e6817..022bdd3ab7d9 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -769,6 +769,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
 	lockdep_set_class(&sk->sk_receive_queue.lock,
 				&af_unix_sk_receive_queue_lock_key);
 
+	sk->sk_allocation	= GFP_KERNEL_ACCOUNT;
 	sk->sk_write_space	= unix_write_space;
 	sk->sk_max_ack_backlog	= net->unx.sysctl_max_dgram_qlen;
 	sk->sk_destruct		= unix_sock_destructor;
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/8] pipe: account to kmemcg
  2016-05-23 10:20 ` [PATCH 7/8] pipe: account " Vladimir Davydov
@ 2016-05-23 10:39   ` kbuild test robot
  2016-05-23 10:57     ` Vladimir Davydov
  2016-05-23 11:50   ` kbuild test robot
  2016-05-23 11:52   ` kbuild test robot
  2 siblings, 1 reply; 16+ messages in thread
From: kbuild test robot @ 2016-05-23 10:39 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: kbuild-all, Andrew Morton, Alexander Viro, Johannes Weiner,
	Michal Hocko, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1074 bytes --]

Hi,

[auto build test ERROR on next-20160520]
[cannot apply to tip/x86/core net-next/master v4.6-rc7 v4.6-rc6 v4.6-rc5 v4.6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vladimir-Davydov/mm-remove-pointless-struct-in-struct-page-definition/20160523-182939
config: i386-tinyconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

Note: the linux-review/Vladimir-Davydov/mm-remove-pointless-struct-in-struct-page-definition/20160523-182939 HEAD 08a247942f52ec5444ba2a0d3c951358640eccf5 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   fs/built-in.o: In function `anon_pipe_buf_steal':
>> pipe.c:(.text+0x5f8d): undefined reference to `memcg_kmem_uncharge'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 6359 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/8] pipe: account to kmemcg
  2016-05-23 10:39   ` kbuild test robot
@ 2016-05-23 10:57     ` Vladimir Davydov
  2016-05-23 11:57       ` Vladimir Davydov
  0 siblings, 1 reply; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 10:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

On Mon, May 23, 2016 at 06:39:40PM +0800, kbuild test robot wrote:
...
>    fs/built-in.o: In function `anon_pipe_buf_steal':
> >> pipe.c:(.text+0x5f8d): undefined reference to `memcg_kmem_uncharge'

From: Vladimir Davydov <vdavydov@virtuozzo.com>
Subject: [PATCH] pipe-account-to-kmemcg-fix


diff --git a/fs/pipe.c b/fs/pipe.c
index 6345f3543788..b3ad0b33f04e 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -144,8 +144,10 @@ static int anon_pipe_buf_steal(struct pipe_inode_info *pipe,
 	struct page *page = buf->page;
 
 	if (page_count(page) == 1) {
-		memcg_kmem_uncharge(page, 0);
-		__ClearPageKmemcg(page);
+		if (memcg_kmem_enabled()) {
+			memcg_kmem_uncharge(page, 0);
+			__ClearPageKmemcg(page);
+		}
 		__SetPageLocked(page);
 		return 0;
 	}

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/8] pipe: account to kmemcg
  2016-05-23 10:20 ` [PATCH 7/8] pipe: account " Vladimir Davydov
  2016-05-23 10:39   ` kbuild test robot
@ 2016-05-23 11:50   ` kbuild test robot
  2016-05-23 11:52   ` kbuild test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2016-05-23 11:50 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: kbuild-all, Andrew Morton, Alexander Viro, Johannes Weiner,
	Michal Hocko, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1243 bytes --]

Hi,

[auto build test ERROR on next-20160520]
[cannot apply to tip/x86/core net-next/master v4.6-rc7 v4.6-rc6 v4.6-rc5 v4.6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vladimir-Davydov/mm-remove-pointless-struct-in-struct-page-definition/20160523-182939
config: blackfin-TCM-BF537_defconfig (attached as .config)
compiler: bfin-uclinux-gcc (GCC) 4.6.3
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=blackfin 

Note: the linux-review/Vladimir-Davydov/mm-remove-pointless-struct-in-struct-page-definition/20160523-182939 HEAD 08a247942f52ec5444ba2a0d3c951358640eccf5 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   fs/built-in.o: In function `anon_pipe_buf_steal':
>> fs/pipe.c:(.text+0x7304): undefined reference to `memcg_kmem_uncharge'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 10477 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/8] pipe: account to kmemcg
  2016-05-23 10:20 ` [PATCH 7/8] pipe: account " Vladimir Davydov
  2016-05-23 10:39   ` kbuild test robot
  2016-05-23 11:50   ` kbuild test robot
@ 2016-05-23 11:52   ` kbuild test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2016-05-23 11:52 UTC (permalink / raw)
  To: Vladimir Davydov
  Cc: kbuild-all, Andrew Morton, Alexander Viro, Johannes Weiner,
	Michal Hocko, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1476 bytes --]

Hi,

[auto build test ERROR on next-20160520]
[cannot apply to tip/x86/core net-next/master v4.6-rc7 v4.6-rc6 v4.6-rc5 v4.6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vladimir-Davydov/mm-remove-pointless-struct-in-struct-page-definition/20160523-182939
config: um-x86_64_defconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
        # save the attached .config to linux build tree
        make ARCH=um SUBARCH=x86_64

Note: the linux-review/Vladimir-Davydov/mm-remove-pointless-struct-in-struct-page-definition/20160523-182939 HEAD 08a247942f52ec5444ba2a0d3c951358640eccf5 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   fs/built-in.o: In function `anon_pipe_buf_steal':
>> fs/pipe.c:147: undefined reference to `memcg_kmem_uncharge'
   collect2: error: ld returned 1 exit status

vim +147 fs/pipe.c

   141	static int anon_pipe_buf_steal(struct pipe_inode_info *pipe,
   142				       struct pipe_buffer *buf)
   143	{
   144		struct page *page = buf->page;
   145	
   146		if (page_count(page) == 1) {
 > 147			memcg_kmem_uncharge(page, 0);
   148			__ClearPageKmemcg(page);
   149			__SetPageLocked(page);
   150			return 0;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 7224 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 7/8] pipe: account to kmemcg
  2016-05-23 10:57     ` Vladimir Davydov
@ 2016-05-23 11:57       ` Vladimir Davydov
  0 siblings, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-23 11:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Johannes Weiner, Michal Hocko, linux-mm, linux-kernel

On Mon, May 23, 2016 at 01:57:18PM +0300, Vladimir Davydov wrote:
> Subject: [PATCH] pipe-account-to-kmemcg-fix

Darn it, I missed kcalloc in pipe_set_size. Below goes a full version of
this particular patch with both fixes incorporated. Sorry for the noise.
--
From: Vladimir Davydov <vdavydov@virtuozzo.com>
Subject: [PATCH] pipe: account to kmemcg

Pipes can consume a significant amount of system memory, hence they
should be accounted to kmemcg.

This patch marks pipe_inode_info and anonymous pipe buffer page
allocations as __GFP_ACCOUNT so that they would be charged to kmemcg.
Note, since a pipe buffer page can be "stolen" and get reused for other
purposes, including mapping to userspace, we clear PageKmemcg thus
resetting page->_mapcount and uncharge it in anon_pipe_buf_steal, which
is introduced by this patch.

Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>

diff --git a/fs/pipe.c b/fs/pipe.c
index 0d3f5165cb0b..4b32928f5426 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -21,6 +21,7 @@
 #include <linux/audit.h>
 #include <linux/syscalls.h>
 #include <linux/fcntl.h>
+#include <linux/memcontrol.h>
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -137,6 +138,22 @@ static void anon_pipe_buf_release(struct pipe_inode_info *pipe,
 		put_page(page);
 }
 
+static int anon_pipe_buf_steal(struct pipe_inode_info *pipe,
+			       struct pipe_buffer *buf)
+{
+	struct page *page = buf->page;
+
+	if (page_count(page) == 1) {
+		if (memcg_kmem_enabled()) {
+			memcg_kmem_uncharge(page, 0);
+			__ClearPageKmemcg(page);
+		}
+		__SetPageLocked(page);
+		return 0;
+	}
+	return 1;
+}
+
 /**
  * generic_pipe_buf_steal - attempt to take ownership of a &pipe_buffer
  * @pipe:	the pipe that the buffer belongs to
@@ -219,7 +236,7 @@ static const struct pipe_buf_operations anon_pipe_buf_ops = {
 	.can_merge = 1,
 	.confirm = generic_pipe_buf_confirm,
 	.release = anon_pipe_buf_release,
-	.steal = generic_pipe_buf_steal,
+	.steal = anon_pipe_buf_steal,
 	.get = generic_pipe_buf_get,
 };
 
@@ -227,7 +244,7 @@ static const struct pipe_buf_operations packet_pipe_buf_ops = {
 	.can_merge = 0,
 	.confirm = generic_pipe_buf_confirm,
 	.release = anon_pipe_buf_release,
-	.steal = generic_pipe_buf_steal,
+	.steal = anon_pipe_buf_steal,
 	.get = generic_pipe_buf_get,
 };
 
@@ -405,7 +422,7 @@ pipe_write(struct kiocb *iocb, struct iov_iter *from)
 			int copied;
 
 			if (!page) {
-				page = alloc_page(GFP_HIGHUSER);
+				page = alloc_page(GFP_HIGHUSER | __GFP_ACCOUNT);
 				if (unlikely(!page)) {
 					ret = ret ? : -ENOMEM;
 					break;
@@ -611,7 +628,7 @@ struct pipe_inode_info *alloc_pipe_info(void)
 {
 	struct pipe_inode_info *pipe;
 
-	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL);
+	pipe = kzalloc(sizeof(struct pipe_inode_info), GFP_KERNEL_ACCOUNT);
 	if (pipe) {
 		unsigned long pipe_bufs = PIPE_DEF_BUFFERS;
 		struct user_struct *user = get_current_user();
@@ -619,7 +636,9 @@ struct pipe_inode_info *alloc_pipe_info(void)
 		if (!too_many_pipe_buffers_hard(user)) {
 			if (too_many_pipe_buffers_soft(user))
 				pipe_bufs = 1;
-			pipe->bufs = kzalloc(sizeof(struct pipe_buffer) * pipe_bufs, GFP_KERNEL);
+			pipe->bufs = kcalloc(pipe_bufs,
+					     sizeof(struct pipe_buffer),
+					     GFP_KERNEL_ACCOUNT);
 		}
 
 		if (pipe->bufs) {
@@ -1010,7 +1029,8 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long nr_pages)
 	if (nr_pages < pipe->nrbufs)
 		return -EBUSY;
 
-	bufs = kcalloc(nr_pages, sizeof(*bufs), GFP_KERNEL | __GFP_NOWARN);
+	bufs = kcalloc(nr_pages, sizeof(*bufs),
+		       GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
 	if (unlikely(!bufs))
 		return -ENOMEM;
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 8/8] af_unix: charge buffers to kmemcg
  2016-05-23 10:20 ` [PATCH 8/8] af_unix: charge buffers " Vladimir Davydov
@ 2016-05-23 22:13   ` David Miller
  2016-05-24  7:41   ` Vladimir Davydov
  1 sibling, 0 replies; 16+ messages in thread
From: David Miller @ 2016-05-23 22:13 UTC (permalink / raw)
  To: vdavydov; +Cc: akpm, hannes, mhocko, linux-mm, linux-kernel


Networking changes should be CC:'d netdev@vger.kernel.org

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 8/8] af_unix: charge buffers to kmemcg
  2016-05-23 10:20 ` [PATCH 8/8] af_unix: charge buffers " Vladimir Davydov
  2016-05-23 22:13   ` David Miller
@ 2016-05-24  7:41   ` Vladimir Davydov
  1 sibling, 0 replies; 16+ messages in thread
From: Vladimir Davydov @ 2016-05-24  7:41 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David S. Miller, Johannes Weiner, Michal Hocko, linux-mm,
	linux-kernel, netdev

[adding netdev to Cc]

On Mon, May 23, 2016 at 01:20:29PM +0300, Vladimir Davydov wrote:
> Unix sockets can consume a significant amount of system memory, hence
> they should be accounted to kmemcg.
> 
> Since unix socket buffers are always allocated from process context,
> all we need to do to charge them to kmemcg is set __GFP_ACCOUNT in
> sock->sk_allocation mask.
> 
> Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> ---
>  net/unix/af_unix.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 80aa6a3e6817..022bdd3ab7d9 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -769,6 +769,7 @@ static struct sock *unix_create1(struct net *net, struct socket *sock, int kern)
>  	lockdep_set_class(&sk->sk_receive_queue.lock,
>  				&af_unix_sk_receive_queue_lock_key);
>  
> +	sk->sk_allocation	= GFP_KERNEL_ACCOUNT;
>  	sk->sk_write_space	= unix_write_space;
>  	sk->sk_max_ack_backlog	= net->unx.sysctl_max_dgram_qlen;
>  	sk->sk_destruct		= unix_sock_destructor;

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-05-24  7:56 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-23 10:20 [PATCH 0/8] More stuff to charge to kmemcg Vladimir Davydov
2016-05-23 10:20 ` [PATCH 1/8] mm: remove pointless struct in struct page definition Vladimir Davydov
2016-05-23 10:20 ` [PATCH 2/8] mm: clean up non-standard page->_mapcount users Vladimir Davydov
2016-05-23 10:20 ` [PATCH 3/8] mm: memcontrol: cleanup kmem charge functions Vladimir Davydov
2016-05-23 10:20 ` [PATCH 4/8] mm: charge/uncharge kmemcg from generic page allocator paths Vladimir Davydov
2016-05-23 10:20 ` [PATCH 5/8] mm: memcontrol: teach uncharge_list to deal with kmem pages Vladimir Davydov
2016-05-23 10:20 ` [PATCH 6/8] arch: x86: charge page tables to kmemcg Vladimir Davydov
2016-05-23 10:20 ` [PATCH 7/8] pipe: account " Vladimir Davydov
2016-05-23 10:39   ` kbuild test robot
2016-05-23 10:57     ` Vladimir Davydov
2016-05-23 11:57       ` Vladimir Davydov
2016-05-23 11:50   ` kbuild test robot
2016-05-23 11:52   ` kbuild test robot
2016-05-23 10:20 ` [PATCH 8/8] af_unix: charge buffers " Vladimir Davydov
2016-05-23 22:13   ` David Miller
2016-05-24  7:41   ` Vladimir Davydov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).