linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg
@ 2022-01-28 19:56 Waiman Long
  2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
  2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
  0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2022-01-28 19:56 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton
  Cc: linux-kernel, cgroups, linux-mm, Rafael Aquini, Waiman Long

While debugging the constant increase in percpu memory consumption on
a system that spawned large number of containers, it was found that a
lot of offlined mem_cgroup structures remained in place without being
freed. Further investigation indicated that those mem_cgroup structures
were pinned by some pages.

In order to find out what those pages are, the existing page_owner
debugging tool is extended to show memory cgroup information and whether
those memcgs are offlined or not. With the enhanced page_owner tool,
the following is a typical page that pinned the mem_cgroup structure
in my test case:

Page allocated via order 0, mask 0x1100cca(GFP_HIGHUSER_MOVABLE), pid 62760, ts 119274296592 ns, free_ts 118989764823 ns
PFN 1273412 type Movable Block 2487 type Movable Flags 0x17ffffc00c001c(uptodate|dirty|lru|reclaim|swapbacked|node=0|zone=2|lastcpupid=0x1fffff)
 prep_new_page+0x8e/0xb0
 get_page_from_freelist+0xc4d/0xe50
 __alloc_pages+0x172/0x320
 alloc_pages_vma+0x84/0x230
 shmem_alloc_page+0x3f/0x90
 shmem_alloc_and_acct_page+0x76/0x1c0
 shmem_getpage_gfp+0x48d/0x890
 shmem_write_begin+0x36/0xc0
 generic_perform_write+0xed/0x1d0
 __generic_file_write_iter+0xdc/0x1b0
 generic_file_write_iter+0x5d/0xb0
 new_sync_write+0x11f/0x1b0
 vfs_write+0x1ba/0x2a0
 ksys_write+0x59/0xd0
 do_syscall_64+0x37/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
Charged to offlined memcg libpod-conmon-e59cc83faf807bacc61223fec6a80c1540ebe8f83c802870c6af4708d58f77ea

So the page was not freed because it was part of a shmem segment. That
is useful information that can help users to diagnose similar problems.

Waiman Long (2):
  mm/page_owner: Introduce SNPRINTF() macro that includes length error
    check
  mm/page_owner: Dump memcg information

 mm/page_owner.c | 76 ++++++++++++++++++++++++++++++++-----------------
 1 file changed, 50 insertions(+), 26 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check
  2022-01-28 19:56 [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg Waiman Long
@ 2022-01-28 19:56 ` Waiman Long
  2022-01-28 21:18   ` Ira Weiny
  2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
  1 sibling, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-28 19:56 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton
  Cc: linux-kernel, cgroups, linux-mm, Rafael Aquini, Waiman Long

In print_page_owner(), there is a repetitive check after each snprintf()
to make sure that the final length is less than the buffer size. Simplify
the code and make it easier to read by embedding the buffer length
check and increment inside the macro.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 mm/page_owner.c | 50 +++++++++++++++++++++++--------------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index 99e360df9465..c52ce9d6bc3b 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -325,12 +325,20 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
 	seq_putc(m, '\n');
 }
 
+#define SNPRINTF(_buf, _size, _len, _err, _fmt, ...)			\
+	do {								\
+		_len += snprintf(_buf + _len, _size - _len, _fmt,	\
+				 ##__VA_ARGS__);			\
+		if (_len >= _size)					\
+			goto _err;					\
+	} while (0)
+
 static ssize_t
 print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 		struct page *page, struct page_owner *page_owner,
 		depot_stack_handle_t handle)
 {
-	int ret, pageblock_mt, page_mt;
+	int ret = 0, pageblock_mt, page_mt;
 	char *kbuf;
 
 	count = min_t(size_t, count, PAGE_SIZE);
@@ -338,44 +346,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	if (!kbuf)
 		return -ENOMEM;
 
-	ret = snprintf(kbuf, count,
-			"Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
-			page_owner->order, page_owner->gfp_mask,
-			&page_owner->gfp_mask, page_owner->pid,
-			page_owner->ts_nsec, page_owner->free_ts_nsec);
-
-	if (ret >= count)
-		goto err;
+	SNPRINTF(kbuf, count, ret, err,
+		"Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
+		page_owner->order, page_owner->gfp_mask,
+		&page_owner->gfp_mask, page_owner->pid,
+		page_owner->ts_nsec, page_owner->free_ts_nsec);
 
 	/* Print information relevant to grouping pages by mobility */
 	pageblock_mt = get_pageblock_migratetype(page);
 	page_mt  = gfp_migratetype(page_owner->gfp_mask);
-	ret += snprintf(kbuf + ret, count - ret,
-			"PFN %lu type %s Block %lu type %s Flags %pGp\n",
-			pfn,
-			migratetype_names[page_mt],
-			pfn >> pageblock_order,
-			migratetype_names[pageblock_mt],
-			&page->flags);
-
-	if (ret >= count)
-		goto err;
+	SNPRINTF(kbuf, count, ret, err,
+		"PFN %lu type %s Block %lu type %s Flags %pGp\n",
+		pfn, migratetype_names[page_mt],
+		pfn >> pageblock_order,
+		migratetype_names[pageblock_mt],
+		&page->flags);
 
 	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
 	if (ret >= count)
 		goto err;
 
-	if (page_owner->last_migrate_reason != -1) {
-		ret += snprintf(kbuf + ret, count - ret,
+	if (page_owner->last_migrate_reason != -1)
+		SNPRINTF(kbuf, count, ret, err,
 			"Page has been migrated, last migrate reason: %s\n",
 			migrate_reason_names[page_owner->last_migrate_reason]);
-		if (ret >= count)
-			goto err;
-	}
 
-	ret += snprintf(kbuf + ret, count - ret, "\n");
-	if (ret >= count)
-		goto err;
+	SNPRINTF(kbuf, count, ret, err, "\n");
 
 	if (copy_to_user(buf, kbuf, ret))
 		ret = -EFAULT;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/2] mm/page_owner: Dump memcg information
  2022-01-28 19:56 [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg Waiman Long
  2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
@ 2022-01-28 19:56 ` Waiman Long
  2022-01-28 21:22   ` Ira Weiny
  1 sibling, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-28 19:56 UTC (permalink / raw)
  To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton
  Cc: linux-kernel, cgroups, linux-mm, Rafael Aquini, Waiman Long

It was found that a number of offlined memcgs were not freed because
they were pinned by some charged pages that were present. Even "echo
1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
offlined but not freed memcgs tend to increase in number over time with
the side effect that percpu memory consumption as shown in /proc/meminfo
also increases over time.

In order to find out more information about those pages that pin
offlined memcgs, the page_owner feature is extended to dump memory
cgroup information especially whether the cgroup is offlined or not.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 mm/page_owner.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/mm/page_owner.c b/mm/page_owner.c
index c52ce9d6bc3b..e5d8c642296b 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -10,6 +10,7 @@
 #include <linux/migrate.h>
 #include <linux/stackdepot.h>
 #include <linux/seq_file.h>
+#include <linux/memcontrol.h>
 #include <linux/sched/clock.h>
 
 #include "internal.h"
@@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 		depot_stack_handle_t handle)
 {
 	int ret = 0, pageblock_mt, page_mt;
+	unsigned long __maybe_unused memcg_data;
 	char *kbuf;
 
 	count = min_t(size_t, count, PAGE_SIZE);
@@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 			"Page has been migrated, last migrate reason: %s\n",
 			migrate_reason_names[page_owner->last_migrate_reason]);
 
+#ifdef CONFIG_MEMCG
+	/*
+	 * Look for memcg information and print it out
+	 */
+	memcg_data = READ_ONCE(page->memcg_data);
+	if (memcg_data) {
+		struct mem_cgroup *memcg = page_memcg_check(page);
+		bool onlined;
+		char name[80];
+
+		if (memcg_data & MEMCG_DATA_OBJCGS)
+			SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
+
+		if (!memcg)
+			goto copy_out;
+
+		onlined = (memcg->css.flags & CSS_ONLINE);
+		cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
+		SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
+			PageMemcgKmem(page) ? "(via objcg) " : "",
+			onlined ? "" : "offlined ", name);
+	}
+
+copy_out:
+#endif
+
 	SNPRINTF(kbuf, count, ret, err, "\n");
 
 	if (copy_to_user(buf, kbuf, ret))
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check
  2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
@ 2022-01-28 21:18   ` Ira Weiny
  2022-01-28 21:22     ` Waiman Long
  0 siblings, 1 reply; 10+ messages in thread
From: Ira Weiny @ 2022-01-28 21:18 UTC (permalink / raw)
  To: Waiman Long
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini

On Fri, Jan 28, 2022 at 02:56:41PM -0500, Waiman Long wrote:
> In print_page_owner(), there is a repetitive check after each snprintf()
> to make sure that the final length is less than the buffer size. Simplify
> the code and make it easier to read by embedding the buffer length
> check and increment inside the macro.

This does not follow the kernel coding style of not putting control flow
statements in macros.[1]

> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  mm/page_owner.c | 50 +++++++++++++++++++++++--------------------------
>  1 file changed, 23 insertions(+), 27 deletions(-)

And in the end you only saved 4 lines of code to read...  Not worth it IMO.

Ira

[1] https://www.kernel.org/doc/html/v4.17/process/coding-style.html

> 
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 99e360df9465..c52ce9d6bc3b 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -325,12 +325,20 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>  	seq_putc(m, '\n');
>  }
>  
> +#define SNPRINTF(_buf, _size, _len, _err, _fmt, ...)			\
> +	do {								\
> +		_len += snprintf(_buf + _len, _size - _len, _fmt,	\
> +				 ##__VA_ARGS__);			\
> +		if (_len >= _size)					\
> +			goto _err;					\
> +	} while (0)
> +
>  static ssize_t
>  print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>  		struct page *page, struct page_owner *page_owner,
>  		depot_stack_handle_t handle)
>  {
> -	int ret, pageblock_mt, page_mt;
> +	int ret = 0, pageblock_mt, page_mt;
>  	char *kbuf;
>  
>  	count = min_t(size_t, count, PAGE_SIZE);
> @@ -338,44 +346,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>  	if (!kbuf)
>  		return -ENOMEM;
>  
> -	ret = snprintf(kbuf, count,
> -			"Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
> -			page_owner->order, page_owner->gfp_mask,
> -			&page_owner->gfp_mask, page_owner->pid,
> -			page_owner->ts_nsec, page_owner->free_ts_nsec);
> -
> -	if (ret >= count)
> -		goto err;
> +	SNPRINTF(kbuf, count, ret, err,
> +		"Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
> +		page_owner->order, page_owner->gfp_mask,
> +		&page_owner->gfp_mask, page_owner->pid,
> +		page_owner->ts_nsec, page_owner->free_ts_nsec);
>  
>  	/* Print information relevant to grouping pages by mobility */
>  	pageblock_mt = get_pageblock_migratetype(page);
>  	page_mt  = gfp_migratetype(page_owner->gfp_mask);
> -	ret += snprintf(kbuf + ret, count - ret,
> -			"PFN %lu type %s Block %lu type %s Flags %pGp\n",
> -			pfn,
> -			migratetype_names[page_mt],
> -			pfn >> pageblock_order,
> -			migratetype_names[pageblock_mt],
> -			&page->flags);
> -
> -	if (ret >= count)
> -		goto err;
> +	SNPRINTF(kbuf, count, ret, err,
> +		"PFN %lu type %s Block %lu type %s Flags %pGp\n",
> +		pfn, migratetype_names[page_mt],
> +		pfn >> pageblock_order,
> +		migratetype_names[pageblock_mt],
> +		&page->flags);
>  
>  	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
>  	if (ret >= count)
>  		goto err;
>  
> -	if (page_owner->last_migrate_reason != -1) {
> -		ret += snprintf(kbuf + ret, count - ret,
> +	if (page_owner->last_migrate_reason != -1)
> +		SNPRINTF(kbuf, count, ret, err,
>  			"Page has been migrated, last migrate reason: %s\n",
>  			migrate_reason_names[page_owner->last_migrate_reason]);
> -		if (ret >= count)
> -			goto err;
> -	}
>  
> -	ret += snprintf(kbuf + ret, count - ret, "\n");
> -	if (ret >= count)
> -		goto err;
> +	SNPRINTF(kbuf, count, ret, err, "\n");
>  
>  	if (copy_to_user(buf, kbuf, ret))
>  		ret = -EFAULT;
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
  2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
@ 2022-01-28 21:22   ` Ira Weiny
  2022-01-28 21:31     ` Waiman Long
  0 siblings, 1 reply; 10+ messages in thread
From: Ira Weiny @ 2022-01-28 21:22 UTC (permalink / raw)
  To: Waiman Long
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini

On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
> It was found that a number of offlined memcgs were not freed because
> they were pinned by some charged pages that were present. Even "echo
> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> offlined but not freed memcgs tend to increase in number over time with
> the side effect that percpu memory consumption as shown in /proc/meminfo
> also increases over time.
> 
> In order to find out more information about those pages that pin
> offlined memcgs, the page_owner feature is extended to dump memory
> cgroup information especially whether the cgroup is offlined or not.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  mm/page_owner.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index c52ce9d6bc3b..e5d8c642296b 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -10,6 +10,7 @@
>  #include <linux/migrate.h>
>  #include <linux/stackdepot.h>
>  #include <linux/seq_file.h>
> +#include <linux/memcontrol.h>
>  #include <linux/sched/clock.h>
>  
>  #include "internal.h"
> @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>  		depot_stack_handle_t handle)
>  {
>  	int ret = 0, pageblock_mt, page_mt;
> +	unsigned long __maybe_unused memcg_data;
>  	char *kbuf;
>  
>  	count = min_t(size_t, count, PAGE_SIZE);
> @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>  			"Page has been migrated, last migrate reason: %s\n",
>  			migrate_reason_names[page_owner->last_migrate_reason]);
>  
> +#ifdef CONFIG_MEMCG
> +	/*
> +	 * Look for memcg information and print it out
> +	 */
> +	memcg_data = READ_ONCE(page->memcg_data);
> +	if (memcg_data) {
> +		struct mem_cgroup *memcg = page_memcg_check(page);
> +		bool onlined;
> +		char name[80];
> +
> +		if (memcg_data & MEMCG_DATA_OBJCGS)
> +			SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
> +
> +		if (!memcg)
> +			goto copy_out;
> +
> +		onlined = (memcg->css.flags & CSS_ONLINE);
> +		cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
> +		SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
                                                        ^^^
						Extra specifier?

Did this compile without warnings?

Ira

> +			PageMemcgKmem(page) ? "(via objcg) " : "",
> +			onlined ? "" : "offlined ", name);
> +	}
> +
> +copy_out:
> +#endif
> +
>  	SNPRINTF(kbuf, count, ret, err, "\n");
>  
>  	if (copy_to_user(buf, kbuf, ret))
> -- 
> 2.27.0
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check
  2022-01-28 21:18   ` Ira Weiny
@ 2022-01-28 21:22     ` Waiman Long
  0 siblings, 0 replies; 10+ messages in thread
From: Waiman Long @ 2022-01-28 21:22 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini


On 1/28/22 16:18, Ira Weiny wrote:
> On Fri, Jan 28, 2022 at 02:56:41PM -0500, Waiman Long wrote:
>> In print_page_owner(), there is a repetitive check after each snprintf()
>> to make sure that the final length is less than the buffer size. Simplify
>> the code and make it easier to read by embedding the buffer length
>> check and increment inside the macro.
> This does not follow the kernel coding style of not putting control flow
> statements in macros.[1]

OK, fair enough. I can remove the macro and repost the patch.

Cheers,
Longman

>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   mm/page_owner.c | 50 +++++++++++++++++++++++--------------------------
>>   1 file changed, 23 insertions(+), 27 deletions(-)
> And in the end you only saved 4 lines of code to read...  Not worth it IMO.
>
> Ira
>
> [1] https://www.kernel.org/doc/html/v4.17/process/coding-style.html
>
>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>> index 99e360df9465..c52ce9d6bc3b 100644
>> --- a/mm/page_owner.c
>> +++ b/mm/page_owner.c
>> @@ -325,12 +325,20 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>>   	seq_putc(m, '\n');
>>   }
>>   
>> +#define SNPRINTF(_buf, _size, _len, _err, _fmt, ...)			\
>> +	do {								\
>> +		_len += snprintf(_buf + _len, _size - _len, _fmt,	\
>> +				 ##__VA_ARGS__);			\
>> +		if (_len >= _size)					\
>> +			goto _err;					\
>> +	} while (0)
>> +
>>   static ssize_t
>>   print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>   		struct page *page, struct page_owner *page_owner,
>>   		depot_stack_handle_t handle)
>>   {
>> -	int ret, pageblock_mt, page_mt;
>> +	int ret = 0, pageblock_mt, page_mt;
>>   	char *kbuf;
>>   
>>   	count = min_t(size_t, count, PAGE_SIZE);
>> @@ -338,44 +346,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>   	if (!kbuf)
>>   		return -ENOMEM;
>>   
>> -	ret = snprintf(kbuf, count,
>> -			"Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
>> -			page_owner->order, page_owner->gfp_mask,
>> -			&page_owner->gfp_mask, page_owner->pid,
>> -			page_owner->ts_nsec, page_owner->free_ts_nsec);
>> -
>> -	if (ret >= count)
>> -		goto err;
>> +	SNPRINTF(kbuf, count, ret, err,
>> +		"Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
>> +		page_owner->order, page_owner->gfp_mask,
>> +		&page_owner->gfp_mask, page_owner->pid,
>> +		page_owner->ts_nsec, page_owner->free_ts_nsec);
>>   
>>   	/* Print information relevant to grouping pages by mobility */
>>   	pageblock_mt = get_pageblock_migratetype(page);
>>   	page_mt  = gfp_migratetype(page_owner->gfp_mask);
>> -	ret += snprintf(kbuf + ret, count - ret,
>> -			"PFN %lu type %s Block %lu type %s Flags %pGp\n",
>> -			pfn,
>> -			migratetype_names[page_mt],
>> -			pfn >> pageblock_order,
>> -			migratetype_names[pageblock_mt],
>> -			&page->flags);
>> -
>> -	if (ret >= count)
>> -		goto err;
>> +	SNPRINTF(kbuf, count, ret, err,
>> +		"PFN %lu type %s Block %lu type %s Flags %pGp\n",
>> +		pfn, migratetype_names[page_mt],
>> +		pfn >> pageblock_order,
>> +		migratetype_names[pageblock_mt],
>> +		&page->flags);
>>   
>>   	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
>>   	if (ret >= count)
>>   		goto err;
>>   
>> -	if (page_owner->last_migrate_reason != -1) {
>> -		ret += snprintf(kbuf + ret, count - ret,
>> +	if (page_owner->last_migrate_reason != -1)
>> +		SNPRINTF(kbuf, count, ret, err,
>>   			"Page has been migrated, last migrate reason: %s\n",
>>   			migrate_reason_names[page_owner->last_migrate_reason]);
>> -		if (ret >= count)
>> -			goto err;
>> -	}
>>   
>> -	ret += snprintf(kbuf + ret, count - ret, "\n");
>> -	if (ret >= count)
>> -		goto err;
>> +	SNPRINTF(kbuf, count, ret, err, "\n");
>>   
>>   	if (copy_to_user(buf, kbuf, ret))
>>   		ret = -EFAULT;
>> -- 
>> 2.27.0
>>
>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
  2022-01-28 21:22   ` Ira Weiny
@ 2022-01-28 21:31     ` Waiman Long
  2022-01-28 21:48       ` Ira Weiny
  0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-28 21:31 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini

On 1/28/22 16:22, Ira Weiny wrote:
> On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
>> It was found that a number of offlined memcgs were not freed because
>> they were pinned by some charged pages that were present. Even "echo
>> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
>> offlined but not freed memcgs tend to increase in number over time with
>> the side effect that percpu memory consumption as shown in /proc/meminfo
>> also increases over time.
>>
>> In order to find out more information about those pages that pin
>> offlined memcgs, the page_owner feature is extended to dump memory
>> cgroup information especially whether the cgroup is offlined or not.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   mm/page_owner.c | 28 ++++++++++++++++++++++++++++
>>   1 file changed, 28 insertions(+)
>>
>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>> index c52ce9d6bc3b..e5d8c642296b 100644
>> --- a/mm/page_owner.c
>> +++ b/mm/page_owner.c
>> @@ -10,6 +10,7 @@
>>   #include <linux/migrate.h>
>>   #include <linux/stackdepot.h>
>>   #include <linux/seq_file.h>
>> +#include <linux/memcontrol.h>
>>   #include <linux/sched/clock.h>
>>   
>>   #include "internal.h"
>> @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>   		depot_stack_handle_t handle)
>>   {
>>   	int ret = 0, pageblock_mt, page_mt;
>> +	unsigned long __maybe_unused memcg_data;
>>   	char *kbuf;
>>   
>>   	count = min_t(size_t, count, PAGE_SIZE);
>> @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>   			"Page has been migrated, last migrate reason: %s\n",
>>   			migrate_reason_names[page_owner->last_migrate_reason]);
>>   
>> +#ifdef CONFIG_MEMCG
>> +	/*
>> +	 * Look for memcg information and print it out
>> +	 */
>> +	memcg_data = READ_ONCE(page->memcg_data);
>> +	if (memcg_data) {
>> +		struct mem_cgroup *memcg = page_memcg_check(page);
>> +		bool onlined;
>> +		char name[80];
>> +
>> +		if (memcg_data & MEMCG_DATA_OBJCGS)
>> +			SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
>> +
>> +		if (!memcg)
>> +			goto copy_out;
>> +
>> +		onlined = (memcg->css.flags & CSS_ONLINE);
>> +		cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
>> +		SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
>                                                          ^^^
> 						Extra specifier?
>
> Did this compile without warnings?

Yes, there was no warning.

Cheers,
Longmna

> Ira
>
>> +			PageMemcgKmem(page) ? "(via objcg) " : "",
>> +			onlined ? "" : "offlined ", name);
>> +	}
>> +
>> +copy_out:
>> +#endif
>> +
>>   	SNPRINTF(kbuf, count, ret, err, "\n");
>>   
>>   	if (copy_to_user(buf, kbuf, ret))
>> -- 
>> 2.27.0
>>
>>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
  2022-01-28 21:31     ` Waiman Long
@ 2022-01-28 21:48       ` Ira Weiny
  2022-01-29  3:35         ` Waiman Long
  0 siblings, 1 reply; 10+ messages in thread
From: Ira Weiny @ 2022-01-28 21:48 UTC (permalink / raw)
  To: Waiman Long
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini

On Fri, Jan 28, 2022 at 04:31:07PM -0500, Waiman Long wrote:
> On 1/28/22 16:22, Ira Weiny wrote:
> > On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
> > > It was found that a number of offlined memcgs were not freed because
> > > they were pinned by some charged pages that were present. Even "echo
> > > 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> > > offlined but not freed memcgs tend to increase in number over time with
> > > the side effect that percpu memory consumption as shown in /proc/meminfo
> > > also increases over time.
> > > 
> > > In order to find out more information about those pages that pin
> > > offlined memcgs, the page_owner feature is extended to dump memory
> > > cgroup information especially whether the cgroup is offlined or not.
> > > 
> > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > ---
> > >   mm/page_owner.c | 28 ++++++++++++++++++++++++++++
> > >   1 file changed, 28 insertions(+)
> > > 
> > > diff --git a/mm/page_owner.c b/mm/page_owner.c
> > > index c52ce9d6bc3b..e5d8c642296b 100644
> > > --- a/mm/page_owner.c
> > > +++ b/mm/page_owner.c
> > > @@ -10,6 +10,7 @@
> > >   #include <linux/migrate.h>
> > >   #include <linux/stackdepot.h>
> > >   #include <linux/seq_file.h>
> > > +#include <linux/memcontrol.h>
> > >   #include <linux/sched/clock.h>
> > >   #include "internal.h"
> > > @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > >   		depot_stack_handle_t handle)
> > >   {
> > >   	int ret = 0, pageblock_mt, page_mt;
> > > +	unsigned long __maybe_unused memcg_data;
> > >   	char *kbuf;
> > >   	count = min_t(size_t, count, PAGE_SIZE);
> > > @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > >   			"Page has been migrated, last migrate reason: %s\n",
> > >   			migrate_reason_names[page_owner->last_migrate_reason]);
> > > +#ifdef CONFIG_MEMCG
> > > +	/*
> > > +	 * Look for memcg information and print it out
> > > +	 */
> > > +	memcg_data = READ_ONCE(page->memcg_data);
> > > +	if (memcg_data) {
> > > +		struct mem_cgroup *memcg = page_memcg_check(page);
> > > +		bool onlined;
> > > +		char name[80];
> > > +
> > > +		if (memcg_data & MEMCG_DATA_OBJCGS)
> > > +			SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
> > > +
> > > +		if (!memcg)
> > > +			goto copy_out;
> > > +
> > > +		onlined = (memcg->css.flags & CSS_ONLINE);
> > > +		cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
> > > +		SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
> >                                                          ^^^
> > 						Extra specifier?
> > 
> > Did this compile without warnings?
> 
> Yes, there was no warning.

But isn't that an extra specifier?

Ira

> 
> Cheers,
> Longmna
> 
> > Ira
> > 
> > > +			PageMemcgKmem(page) ? "(via objcg) " : "",
> > > +			onlined ? "" : "offlined ", name);
> > > +	}
> > > +
> > > +copy_out:
> > > +#endif
> > > +
> > >   	SNPRINTF(kbuf, count, ret, err, "\n");
> > >   	if (copy_to_user(buf, kbuf, ret))
> > > -- 
> > > 2.27.0
> > > 
> > > 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
  2022-01-28 21:48       ` Ira Weiny
@ 2022-01-29  3:35         ` Waiman Long
  2022-01-29  4:05           ` Ira Weiny
  0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-29  3:35 UTC (permalink / raw)
  To: Ira Weiny
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini

On 1/28/22 16:48, Ira Weiny wrote:
> On Fri, Jan 28, 2022 at 04:31:07PM -0500, Waiman Long wrote:
>> On 1/28/22 16:22, Ira Weiny wrote:
>>> On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
>>>> It was found that a number of offlined memcgs were not freed because
>>>> they were pinned by some charged pages that were present. Even "echo
>>>> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
>>>> offlined but not freed memcgs tend to increase in number over time with
>>>> the side effect that percpu memory consumption as shown in /proc/meminfo
>>>> also increases over time.
>>>>
>>>> In order to find out more information about those pages that pin
>>>> offlined memcgs, the page_owner feature is extended to dump memory
>>>> cgroup information especially whether the cgroup is offlined or not.
>>>>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> ---
>>>>    mm/page_owner.c | 28 ++++++++++++++++++++++++++++
>>>>    1 file changed, 28 insertions(+)
>>>>
>>>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>>>> index c52ce9d6bc3b..e5d8c642296b 100644
>>>> --- a/mm/page_owner.c
>>>> +++ b/mm/page_owner.c
>>>> @@ -10,6 +10,7 @@
>>>>    #include <linux/migrate.h>
>>>>    #include <linux/stackdepot.h>
>>>>    #include <linux/seq_file.h>
>>>> +#include <linux/memcontrol.h>
>>>>    #include <linux/sched/clock.h>
>>>>    #include "internal.h"
>>>> @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>>>    		depot_stack_handle_t handle)
>>>>    {
>>>>    	int ret = 0, pageblock_mt, page_mt;
>>>> +	unsigned long __maybe_unused memcg_data;
>>>>    	char *kbuf;
>>>>    	count = min_t(size_t, count, PAGE_SIZE);
>>>> @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>>>    			"Page has been migrated, last migrate reason: %s\n",
>>>>    			migrate_reason_names[page_owner->last_migrate_reason]);
>>>> +#ifdef CONFIG_MEMCG
>>>> +	/*
>>>> +	 * Look for memcg information and print it out
>>>> +	 */
>>>> +	memcg_data = READ_ONCE(page->memcg_data);
>>>> +	if (memcg_data) {
>>>> +		struct mem_cgroup *memcg = page_memcg_check(page);
>>>> +		bool onlined;
>>>> +		char name[80];
>>>> +
>>>> +		if (memcg_data & MEMCG_DATA_OBJCGS)
>>>> +			SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
>>>> +
>>>> +		if (!memcg)
>>>> +			goto copy_out;
>>>> +
>>>> +		onlined = (memcg->css.flags & CSS_ONLINE);
>>>> +		cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
>>>> +		SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
>>>                                                           ^^^
>>> 						Extra specifier?
>>>
>>> Did this compile without warnings?
>> Yes, there was no warning.
> But isn't that an extra specifier?

There are 3 arguments to the format string that match the 3 "%s" in it:

1) PageMemcgKmem(page) ? "(via objcg) " : ""
2) onlined ? "" : "offlined
3) name

Cheers,
Longman


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
  2022-01-29  3:35         ` Waiman Long
@ 2022-01-29  4:05           ` Ira Weiny
  0 siblings, 0 replies; 10+ messages in thread
From: Ira Weiny @ 2022-01-29  4:05 UTC (permalink / raw)
  To: Waiman Long
  Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
	linux-kernel, cgroups, linux-mm, Rafael Aquini

On Fri, Jan 28, 2022 at 10:35:22PM -0500, Waiman Long wrote:
> On 1/28/22 16:48, Ira Weiny wrote:
> > On Fri, Jan 28, 2022 at 04:31:07PM -0500, Waiman Long wrote:
> > > On 1/28/22 16:22, Ira Weiny wrote:
> > > > On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
> > > > > It was found that a number of offlined memcgs were not freed because
> > > > > they were pinned by some charged pages that were present. Even "echo
> > > > > 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> > > > > offlined but not freed memcgs tend to increase in number over time with
> > > > > the side effect that percpu memory consumption as shown in /proc/meminfo
> > > > > also increases over time.
> > > > > 
> > > > > In order to find out more information about those pages that pin
> > > > > offlined memcgs, the page_owner feature is extended to dump memory
> > > > > cgroup information especially whether the cgroup is offlined or not.
> > > > > 
> > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > ---
> > > > >    mm/page_owner.c | 28 ++++++++++++++++++++++++++++
> > > > >    1 file changed, 28 insertions(+)
> > > > > 
> > > > > diff --git a/mm/page_owner.c b/mm/page_owner.c
> > > > > index c52ce9d6bc3b..e5d8c642296b 100644
> > > > > --- a/mm/page_owner.c
> > > > > +++ b/mm/page_owner.c
> > > > > @@ -10,6 +10,7 @@
> > > > >    #include <linux/migrate.h>
> > > > >    #include <linux/stackdepot.h>
> > > > >    #include <linux/seq_file.h>
> > > > > +#include <linux/memcontrol.h>
> > > > >    #include <linux/sched/clock.h>
> > > > >    #include "internal.h"
> > > > > @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > > > >    		depot_stack_handle_t handle)
> > > > >    {
> > > > >    	int ret = 0, pageblock_mt, page_mt;
> > > > > +	unsigned long __maybe_unused memcg_data;
> > > > >    	char *kbuf;
> > > > >    	count = min_t(size_t, count, PAGE_SIZE);
> > > > > @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > > > >    			"Page has been migrated, last migrate reason: %s\n",
> > > > >    			migrate_reason_names[page_owner->last_migrate_reason]);
> > > > > +#ifdef CONFIG_MEMCG
> > > > > +	/*
> > > > > +	 * Look for memcg information and print it out
> > > > > +	 */
> > > > > +	memcg_data = READ_ONCE(page->memcg_data);
> > > > > +	if (memcg_data) {
> > > > > +		struct mem_cgroup *memcg = page_memcg_check(page);
> > > > > +		bool onlined;
> > > > > +		char name[80];
> > > > > +
> > > > > +		if (memcg_data & MEMCG_DATA_OBJCGS)
> > > > > +			SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
> > > > > +
> > > > > +		if (!memcg)
> > > > > +			goto copy_out;
> > > > > +
> > > > > +		onlined = (memcg->css.flags & CSS_ONLINE);
> > > > > +		cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
> > > > > +		SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
> > > >                                                           ^^^
> > > > 						Extra specifier?
> > > > 
> > > > Did this compile without warnings?
> > > Yes, there was no warning.
> > But isn't that an extra specifier?
> 
> There are 3 arguments to the format string that match the 3 "%s" in it:
> 
> 1) PageMemcgKmem(page) ? "(via objcg) " : ""
> 2) onlined ? "" : "offlined
> 3) name

My apologies.  My parsing of the ? statements was off.  FWIW putting

	', name'

on the next line would make it more clear...  But I see now...

Sorry,
Ira

> 
> Cheers,
> Longman
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-01-29  4:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28 19:56 [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg Waiman Long
2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
2022-01-28 21:18   ` Ira Weiny
2022-01-28 21:22     ` Waiman Long
2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
2022-01-28 21:22   ` Ira Weiny
2022-01-28 21:31     ` Waiman Long
2022-01-28 21:48       ` Ira Weiny
2022-01-29  3:35         ` Waiman Long
2022-01-29  4:05           ` Ira Weiny

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).