* [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg
@ 2022-01-28 19:56 Waiman Long
2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
0 siblings, 2 replies; 10+ messages in thread
From: Waiman Long @ 2022-01-28 19:56 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton
Cc: linux-kernel, cgroups, linux-mm, Rafael Aquini, Waiman Long
While debugging the constant increase in percpu memory consumption on
a system that spawned large number of containers, it was found that a
lot of offlined mem_cgroup structures remained in place without being
freed. Further investigation indicated that those mem_cgroup structures
were pinned by some pages.
In order to find out what those pages are, the existing page_owner
debugging tool is extended to show memory cgroup information and whether
those memcgs are offlined or not. With the enhanced page_owner tool,
the following is a typical page that pinned the mem_cgroup structure
in my test case:
Page allocated via order 0, mask 0x1100cca(GFP_HIGHUSER_MOVABLE), pid 62760, ts 119274296592 ns, free_ts 118989764823 ns
PFN 1273412 type Movable Block 2487 type Movable Flags 0x17ffffc00c001c(uptodate|dirty|lru|reclaim|swapbacked|node=0|zone=2|lastcpupid=0x1fffff)
prep_new_page+0x8e/0xb0
get_page_from_freelist+0xc4d/0xe50
__alloc_pages+0x172/0x320
alloc_pages_vma+0x84/0x230
shmem_alloc_page+0x3f/0x90
shmem_alloc_and_acct_page+0x76/0x1c0
shmem_getpage_gfp+0x48d/0x890
shmem_write_begin+0x36/0xc0
generic_perform_write+0xed/0x1d0
__generic_file_write_iter+0xdc/0x1b0
generic_file_write_iter+0x5d/0xb0
new_sync_write+0x11f/0x1b0
vfs_write+0x1ba/0x2a0
ksys_write+0x59/0xd0
do_syscall_64+0x37/0x80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Charged to offlined memcg libpod-conmon-e59cc83faf807bacc61223fec6a80c1540ebe8f83c802870c6af4708d58f77ea
So the page was not freed because it was part of a shmem segment. That
is useful information that can help users to diagnose similar problems.
Waiman Long (2):
mm/page_owner: Introduce SNPRINTF() macro that includes length error
check
mm/page_owner: Dump memcg information
mm/page_owner.c | 76 ++++++++++++++++++++++++++++++++-----------------
1 file changed, 50 insertions(+), 26 deletions(-)
--
2.27.0
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check
2022-01-28 19:56 [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg Waiman Long
@ 2022-01-28 19:56 ` Waiman Long
2022-01-28 21:18 ` Ira Weiny
2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
1 sibling, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-28 19:56 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton
Cc: linux-kernel, cgroups, linux-mm, Rafael Aquini, Waiman Long
In print_page_owner(), there is a repetitive check after each snprintf()
to make sure that the final length is less than the buffer size. Simplify
the code and make it easier to read by embedding the buffer length
check and increment inside the macro.
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/page_owner.c | 50 +++++++++++++++++++++++--------------------------
1 file changed, 23 insertions(+), 27 deletions(-)
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 99e360df9465..c52ce9d6bc3b 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -325,12 +325,20 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
seq_putc(m, '\n');
}
+#define SNPRINTF(_buf, _size, _len, _err, _fmt, ...) \
+ do { \
+ _len += snprintf(_buf + _len, _size - _len, _fmt, \
+ ##__VA_ARGS__); \
+ if (_len >= _size) \
+ goto _err; \
+ } while (0)
+
static ssize_t
print_page_owner(char __user *buf, size_t count, unsigned long pfn,
struct page *page, struct page_owner *page_owner,
depot_stack_handle_t handle)
{
- int ret, pageblock_mt, page_mt;
+ int ret = 0, pageblock_mt, page_mt;
char *kbuf;
count = min_t(size_t, count, PAGE_SIZE);
@@ -338,44 +346,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
if (!kbuf)
return -ENOMEM;
- ret = snprintf(kbuf, count,
- "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
- page_owner->order, page_owner->gfp_mask,
- &page_owner->gfp_mask, page_owner->pid,
- page_owner->ts_nsec, page_owner->free_ts_nsec);
-
- if (ret >= count)
- goto err;
+ SNPRINTF(kbuf, count, ret, err,
+ "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
+ page_owner->order, page_owner->gfp_mask,
+ &page_owner->gfp_mask, page_owner->pid,
+ page_owner->ts_nsec, page_owner->free_ts_nsec);
/* Print information relevant to grouping pages by mobility */
pageblock_mt = get_pageblock_migratetype(page);
page_mt = gfp_migratetype(page_owner->gfp_mask);
- ret += snprintf(kbuf + ret, count - ret,
- "PFN %lu type %s Block %lu type %s Flags %pGp\n",
- pfn,
- migratetype_names[page_mt],
- pfn >> pageblock_order,
- migratetype_names[pageblock_mt],
- &page->flags);
-
- if (ret >= count)
- goto err;
+ SNPRINTF(kbuf, count, ret, err,
+ "PFN %lu type %s Block %lu type %s Flags %pGp\n",
+ pfn, migratetype_names[page_mt],
+ pfn >> pageblock_order,
+ migratetype_names[pageblock_mt],
+ &page->flags);
ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
if (ret >= count)
goto err;
- if (page_owner->last_migrate_reason != -1) {
- ret += snprintf(kbuf + ret, count - ret,
+ if (page_owner->last_migrate_reason != -1)
+ SNPRINTF(kbuf, count, ret, err,
"Page has been migrated, last migrate reason: %s\n",
migrate_reason_names[page_owner->last_migrate_reason]);
- if (ret >= count)
- goto err;
- }
- ret += snprintf(kbuf + ret, count - ret, "\n");
- if (ret >= count)
- goto err;
+ SNPRINTF(kbuf, count, ret, err, "\n");
if (copy_to_user(buf, kbuf, ret))
ret = -EFAULT;
--
2.27.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/2] mm/page_owner: Dump memcg information
2022-01-28 19:56 [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg Waiman Long
2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
@ 2022-01-28 19:56 ` Waiman Long
2022-01-28 21:22 ` Ira Weiny
1 sibling, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-28 19:56 UTC (permalink / raw)
To: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton
Cc: linux-kernel, cgroups, linux-mm, Rafael Aquini, Waiman Long
It was found that a number of offlined memcgs were not freed because
they were pinned by some charged pages that were present. Even "echo
1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
offlined but not freed memcgs tend to increase in number over time with
the side effect that percpu memory consumption as shown in /proc/meminfo
also increases over time.
In order to find out more information about those pages that pin
offlined memcgs, the page_owner feature is extended to dump memory
cgroup information especially whether the cgroup is offlined or not.
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/page_owner.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/mm/page_owner.c b/mm/page_owner.c
index c52ce9d6bc3b..e5d8c642296b 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -10,6 +10,7 @@
#include <linux/migrate.h>
#include <linux/stackdepot.h>
#include <linux/seq_file.h>
+#include <linux/memcontrol.h>
#include <linux/sched/clock.h>
#include "internal.h"
@@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
depot_stack_handle_t handle)
{
int ret = 0, pageblock_mt, page_mt;
+ unsigned long __maybe_unused memcg_data;
char *kbuf;
count = min_t(size_t, count, PAGE_SIZE);
@@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
"Page has been migrated, last migrate reason: %s\n",
migrate_reason_names[page_owner->last_migrate_reason]);
+#ifdef CONFIG_MEMCG
+ /*
+ * Look for memcg information and print it out
+ */
+ memcg_data = READ_ONCE(page->memcg_data);
+ if (memcg_data) {
+ struct mem_cgroup *memcg = page_memcg_check(page);
+ bool onlined;
+ char name[80];
+
+ if (memcg_data & MEMCG_DATA_OBJCGS)
+ SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
+
+ if (!memcg)
+ goto copy_out;
+
+ onlined = (memcg->css.flags & CSS_ONLINE);
+ cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
+ SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
+ PageMemcgKmem(page) ? "(via objcg) " : "",
+ onlined ? "" : "offlined ", name);
+ }
+
+copy_out:
+#endif
+
SNPRINTF(kbuf, count, ret, err, "\n");
if (copy_to_user(buf, kbuf, ret))
--
2.27.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check
2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
@ 2022-01-28 21:18 ` Ira Weiny
2022-01-28 21:22 ` Waiman Long
0 siblings, 1 reply; 10+ messages in thread
From: Ira Weiny @ 2022-01-28 21:18 UTC (permalink / raw)
To: Waiman Long
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On Fri, Jan 28, 2022 at 02:56:41PM -0500, Waiman Long wrote:
> In print_page_owner(), there is a repetitive check after each snprintf()
> to make sure that the final length is less than the buffer size. Simplify
> the code and make it easier to read by embedding the buffer length
> check and increment inside the macro.
This does not follow the kernel coding style of not putting control flow
statements in macros.[1]
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> mm/page_owner.c | 50 +++++++++++++++++++++++--------------------------
> 1 file changed, 23 insertions(+), 27 deletions(-)
And in the end you only saved 4 lines of code to read... Not worth it IMO.
Ira
[1] https://www.kernel.org/doc/html/v4.17/process/coding-style.html
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index 99e360df9465..c52ce9d6bc3b 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -325,12 +325,20 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
> seq_putc(m, '\n');
> }
>
> +#define SNPRINTF(_buf, _size, _len, _err, _fmt, ...) \
> + do { \
> + _len += snprintf(_buf + _len, _size - _len, _fmt, \
> + ##__VA_ARGS__); \
> + if (_len >= _size) \
> + goto _err; \
> + } while (0)
> +
> static ssize_t
> print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> struct page *page, struct page_owner *page_owner,
> depot_stack_handle_t handle)
> {
> - int ret, pageblock_mt, page_mt;
> + int ret = 0, pageblock_mt, page_mt;
> char *kbuf;
>
> count = min_t(size_t, count, PAGE_SIZE);
> @@ -338,44 +346,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> if (!kbuf)
> return -ENOMEM;
>
> - ret = snprintf(kbuf, count,
> - "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
> - page_owner->order, page_owner->gfp_mask,
> - &page_owner->gfp_mask, page_owner->pid,
> - page_owner->ts_nsec, page_owner->free_ts_nsec);
> -
> - if (ret >= count)
> - goto err;
> + SNPRINTF(kbuf, count, ret, err,
> + "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
> + page_owner->order, page_owner->gfp_mask,
> + &page_owner->gfp_mask, page_owner->pid,
> + page_owner->ts_nsec, page_owner->free_ts_nsec);
>
> /* Print information relevant to grouping pages by mobility */
> pageblock_mt = get_pageblock_migratetype(page);
> page_mt = gfp_migratetype(page_owner->gfp_mask);
> - ret += snprintf(kbuf + ret, count - ret,
> - "PFN %lu type %s Block %lu type %s Flags %pGp\n",
> - pfn,
> - migratetype_names[page_mt],
> - pfn >> pageblock_order,
> - migratetype_names[pageblock_mt],
> - &page->flags);
> -
> - if (ret >= count)
> - goto err;
> + SNPRINTF(kbuf, count, ret, err,
> + "PFN %lu type %s Block %lu type %s Flags %pGp\n",
> + pfn, migratetype_names[page_mt],
> + pfn >> pageblock_order,
> + migratetype_names[pageblock_mt],
> + &page->flags);
>
> ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
> if (ret >= count)
> goto err;
>
> - if (page_owner->last_migrate_reason != -1) {
> - ret += snprintf(kbuf + ret, count - ret,
> + if (page_owner->last_migrate_reason != -1)
> + SNPRINTF(kbuf, count, ret, err,
> "Page has been migrated, last migrate reason: %s\n",
> migrate_reason_names[page_owner->last_migrate_reason]);
> - if (ret >= count)
> - goto err;
> - }
>
> - ret += snprintf(kbuf + ret, count - ret, "\n");
> - if (ret >= count)
> - goto err;
> + SNPRINTF(kbuf, count, ret, err, "\n");
>
> if (copy_to_user(buf, kbuf, ret))
> ret = -EFAULT;
> --
> 2.27.0
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
@ 2022-01-28 21:22 ` Ira Weiny
2022-01-28 21:31 ` Waiman Long
0 siblings, 1 reply; 10+ messages in thread
From: Ira Weiny @ 2022-01-28 21:22 UTC (permalink / raw)
To: Waiman Long
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
> It was found that a number of offlined memcgs were not freed because
> they were pinned by some charged pages that were present. Even "echo
> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> offlined but not freed memcgs tend to increase in number over time with
> the side effect that percpu memory consumption as shown in /proc/meminfo
> also increases over time.
>
> In order to find out more information about those pages that pin
> offlined memcgs, the page_owner feature is extended to dump memory
> cgroup information especially whether the cgroup is offlined or not.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> mm/page_owner.c | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index c52ce9d6bc3b..e5d8c642296b 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -10,6 +10,7 @@
> #include <linux/migrate.h>
> #include <linux/stackdepot.h>
> #include <linux/seq_file.h>
> +#include <linux/memcontrol.h>
> #include <linux/sched/clock.h>
>
> #include "internal.h"
> @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> depot_stack_handle_t handle)
> {
> int ret = 0, pageblock_mt, page_mt;
> + unsigned long __maybe_unused memcg_data;
> char *kbuf;
>
> count = min_t(size_t, count, PAGE_SIZE);
> @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> "Page has been migrated, last migrate reason: %s\n",
> migrate_reason_names[page_owner->last_migrate_reason]);
>
> +#ifdef CONFIG_MEMCG
> + /*
> + * Look for memcg information and print it out
> + */
> + memcg_data = READ_ONCE(page->memcg_data);
> + if (memcg_data) {
> + struct mem_cgroup *memcg = page_memcg_check(page);
> + bool onlined;
> + char name[80];
> +
> + if (memcg_data & MEMCG_DATA_OBJCGS)
> + SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
> +
> + if (!memcg)
> + goto copy_out;
> +
> + onlined = (memcg->css.flags & CSS_ONLINE);
> + cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
> + SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
^^^
Extra specifier?
Did this compile without warnings?
Ira
> + PageMemcgKmem(page) ? "(via objcg) " : "",
> + onlined ? "" : "offlined ", name);
> + }
> +
> +copy_out:
> +#endif
> +
> SNPRINTF(kbuf, count, ret, err, "\n");
>
> if (copy_to_user(buf, kbuf, ret))
> --
> 2.27.0
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check
2022-01-28 21:18 ` Ira Weiny
@ 2022-01-28 21:22 ` Waiman Long
0 siblings, 0 replies; 10+ messages in thread
From: Waiman Long @ 2022-01-28 21:22 UTC (permalink / raw)
To: Ira Weiny
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On 1/28/22 16:18, Ira Weiny wrote:
> On Fri, Jan 28, 2022 at 02:56:41PM -0500, Waiman Long wrote:
>> In print_page_owner(), there is a repetitive check after each snprintf()
>> to make sure that the final length is less than the buffer size. Simplify
>> the code and make it easier to read by embedding the buffer length
>> check and increment inside the macro.
> This does not follow the kernel coding style of not putting control flow
> statements in macros.[1]
OK, fair enough. I can remove the macro and repost the patch.
Cheers,
Longman
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> mm/page_owner.c | 50 +++++++++++++++++++++++--------------------------
>> 1 file changed, 23 insertions(+), 27 deletions(-)
> And in the end you only saved 4 lines of code to read... Not worth it IMO.
>
> Ira
>
> [1] https://www.kernel.org/doc/html/v4.17/process/coding-style.html
>
>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>> index 99e360df9465..c52ce9d6bc3b 100644
>> --- a/mm/page_owner.c
>> +++ b/mm/page_owner.c
>> @@ -325,12 +325,20 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>> seq_putc(m, '\n');
>> }
>>
>> +#define SNPRINTF(_buf, _size, _len, _err, _fmt, ...) \
>> + do { \
>> + _len += snprintf(_buf + _len, _size - _len, _fmt, \
>> + ##__VA_ARGS__); \
>> + if (_len >= _size) \
>> + goto _err; \
>> + } while (0)
>> +
>> static ssize_t
>> print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>> struct page *page, struct page_owner *page_owner,
>> depot_stack_handle_t handle)
>> {
>> - int ret, pageblock_mt, page_mt;
>> + int ret = 0, pageblock_mt, page_mt;
>> char *kbuf;
>>
>> count = min_t(size_t, count, PAGE_SIZE);
>> @@ -338,44 +346,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>> if (!kbuf)
>> return -ENOMEM;
>>
>> - ret = snprintf(kbuf, count,
>> - "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
>> - page_owner->order, page_owner->gfp_mask,
>> - &page_owner->gfp_mask, page_owner->pid,
>> - page_owner->ts_nsec, page_owner->free_ts_nsec);
>> -
>> - if (ret >= count)
>> - goto err;
>> + SNPRINTF(kbuf, count, ret, err,
>> + "Page allocated via order %u, mask %#x(%pGg), pid %d, ts %llu ns, free_ts %llu ns\n",
>> + page_owner->order, page_owner->gfp_mask,
>> + &page_owner->gfp_mask, page_owner->pid,
>> + page_owner->ts_nsec, page_owner->free_ts_nsec);
>>
>> /* Print information relevant to grouping pages by mobility */
>> pageblock_mt = get_pageblock_migratetype(page);
>> page_mt = gfp_migratetype(page_owner->gfp_mask);
>> - ret += snprintf(kbuf + ret, count - ret,
>> - "PFN %lu type %s Block %lu type %s Flags %pGp\n",
>> - pfn,
>> - migratetype_names[page_mt],
>> - pfn >> pageblock_order,
>> - migratetype_names[pageblock_mt],
>> - &page->flags);
>> -
>> - if (ret >= count)
>> - goto err;
>> + SNPRINTF(kbuf, count, ret, err,
>> + "PFN %lu type %s Block %lu type %s Flags %pGp\n",
>> + pfn, migratetype_names[page_mt],
>> + pfn >> pageblock_order,
>> + migratetype_names[pageblock_mt],
>> + &page->flags);
>>
>> ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
>> if (ret >= count)
>> goto err;
>>
>> - if (page_owner->last_migrate_reason != -1) {
>> - ret += snprintf(kbuf + ret, count - ret,
>> + if (page_owner->last_migrate_reason != -1)
>> + SNPRINTF(kbuf, count, ret, err,
>> "Page has been migrated, last migrate reason: %s\n",
>> migrate_reason_names[page_owner->last_migrate_reason]);
>> - if (ret >= count)
>> - goto err;
>> - }
>>
>> - ret += snprintf(kbuf + ret, count - ret, "\n");
>> - if (ret >= count)
>> - goto err;
>> + SNPRINTF(kbuf, count, ret, err, "\n");
>>
>> if (copy_to_user(buf, kbuf, ret))
>> ret = -EFAULT;
>> --
>> 2.27.0
>>
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
2022-01-28 21:22 ` Ira Weiny
@ 2022-01-28 21:31 ` Waiman Long
2022-01-28 21:48 ` Ira Weiny
0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-28 21:31 UTC (permalink / raw)
To: Ira Weiny
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On 1/28/22 16:22, Ira Weiny wrote:
> On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
>> It was found that a number of offlined memcgs were not freed because
>> they were pinned by some charged pages that were present. Even "echo
>> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
>> offlined but not freed memcgs tend to increase in number over time with
>> the side effect that percpu memory consumption as shown in /proc/meminfo
>> also increases over time.
>>
>> In order to find out more information about those pages that pin
>> offlined memcgs, the page_owner feature is extended to dump memory
>> cgroup information especially whether the cgroup is offlined or not.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> mm/page_owner.c | 28 ++++++++++++++++++++++++++++
>> 1 file changed, 28 insertions(+)
>>
>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>> index c52ce9d6bc3b..e5d8c642296b 100644
>> --- a/mm/page_owner.c
>> +++ b/mm/page_owner.c
>> @@ -10,6 +10,7 @@
>> #include <linux/migrate.h>
>> #include <linux/stackdepot.h>
>> #include <linux/seq_file.h>
>> +#include <linux/memcontrol.h>
>> #include <linux/sched/clock.h>
>>
>> #include "internal.h"
>> @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>> depot_stack_handle_t handle)
>> {
>> int ret = 0, pageblock_mt, page_mt;
>> + unsigned long __maybe_unused memcg_data;
>> char *kbuf;
>>
>> count = min_t(size_t, count, PAGE_SIZE);
>> @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>> "Page has been migrated, last migrate reason: %s\n",
>> migrate_reason_names[page_owner->last_migrate_reason]);
>>
>> +#ifdef CONFIG_MEMCG
>> + /*
>> + * Look for memcg information and print it out
>> + */
>> + memcg_data = READ_ONCE(page->memcg_data);
>> + if (memcg_data) {
>> + struct mem_cgroup *memcg = page_memcg_check(page);
>> + bool onlined;
>> + char name[80];
>> +
>> + if (memcg_data & MEMCG_DATA_OBJCGS)
>> + SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
>> +
>> + if (!memcg)
>> + goto copy_out;
>> +
>> + onlined = (memcg->css.flags & CSS_ONLINE);
>> + cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
>> + SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
> ^^^
> Extra specifier?
>
> Did this compile without warnings?
Yes, there was no warning.
Cheers,
Longmna
> Ira
>
>> + PageMemcgKmem(page) ? "(via objcg) " : "",
>> + onlined ? "" : "offlined ", name);
>> + }
>> +
>> +copy_out:
>> +#endif
>> +
>> SNPRINTF(kbuf, count, ret, err, "\n");
>>
>> if (copy_to_user(buf, kbuf, ret))
>> --
>> 2.27.0
>>
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
2022-01-28 21:31 ` Waiman Long
@ 2022-01-28 21:48 ` Ira Weiny
2022-01-29 3:35 ` Waiman Long
0 siblings, 1 reply; 10+ messages in thread
From: Ira Weiny @ 2022-01-28 21:48 UTC (permalink / raw)
To: Waiman Long
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On Fri, Jan 28, 2022 at 04:31:07PM -0500, Waiman Long wrote:
> On 1/28/22 16:22, Ira Weiny wrote:
> > On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
> > > It was found that a number of offlined memcgs were not freed because
> > > they were pinned by some charged pages that were present. Even "echo
> > > 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> > > offlined but not freed memcgs tend to increase in number over time with
> > > the side effect that percpu memory consumption as shown in /proc/meminfo
> > > also increases over time.
> > >
> > > In order to find out more information about those pages that pin
> > > offlined memcgs, the page_owner feature is extended to dump memory
> > > cgroup information especially whether the cgroup is offlined or not.
> > >
> > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > ---
> > > mm/page_owner.c | 28 ++++++++++++++++++++++++++++
> > > 1 file changed, 28 insertions(+)
> > >
> > > diff --git a/mm/page_owner.c b/mm/page_owner.c
> > > index c52ce9d6bc3b..e5d8c642296b 100644
> > > --- a/mm/page_owner.c
> > > +++ b/mm/page_owner.c
> > > @@ -10,6 +10,7 @@
> > > #include <linux/migrate.h>
> > > #include <linux/stackdepot.h>
> > > #include <linux/seq_file.h>
> > > +#include <linux/memcontrol.h>
> > > #include <linux/sched/clock.h>
> > > #include "internal.h"
> > > @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > > depot_stack_handle_t handle)
> > > {
> > > int ret = 0, pageblock_mt, page_mt;
> > > + unsigned long __maybe_unused memcg_data;
> > > char *kbuf;
> > > count = min_t(size_t, count, PAGE_SIZE);
> > > @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > > "Page has been migrated, last migrate reason: %s\n",
> > > migrate_reason_names[page_owner->last_migrate_reason]);
> > > +#ifdef CONFIG_MEMCG
> > > + /*
> > > + * Look for memcg information and print it out
> > > + */
> > > + memcg_data = READ_ONCE(page->memcg_data);
> > > + if (memcg_data) {
> > > + struct mem_cgroup *memcg = page_memcg_check(page);
> > > + bool onlined;
> > > + char name[80];
> > > +
> > > + if (memcg_data & MEMCG_DATA_OBJCGS)
> > > + SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
> > > +
> > > + if (!memcg)
> > > + goto copy_out;
> > > +
> > > + onlined = (memcg->css.flags & CSS_ONLINE);
> > > + cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
> > > + SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
> > ^^^
> > Extra specifier?
> >
> > Did this compile without warnings?
>
> Yes, there was no warning.
But isn't that an extra specifier?
Ira
>
> Cheers,
> Longmna
>
> > Ira
> >
> > > + PageMemcgKmem(page) ? "(via objcg) " : "",
> > > + onlined ? "" : "offlined ", name);
> > > + }
> > > +
> > > +copy_out:
> > > +#endif
> > > +
> > > SNPRINTF(kbuf, count, ret, err, "\n");
> > > if (copy_to_user(buf, kbuf, ret))
> > > --
> > > 2.27.0
> > >
> > >
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
2022-01-28 21:48 ` Ira Weiny
@ 2022-01-29 3:35 ` Waiman Long
2022-01-29 4:05 ` Ira Weiny
0 siblings, 1 reply; 10+ messages in thread
From: Waiman Long @ 2022-01-29 3:35 UTC (permalink / raw)
To: Ira Weiny
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On 1/28/22 16:48, Ira Weiny wrote:
> On Fri, Jan 28, 2022 at 04:31:07PM -0500, Waiman Long wrote:
>> On 1/28/22 16:22, Ira Weiny wrote:
>>> On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
>>>> It was found that a number of offlined memcgs were not freed because
>>>> they were pinned by some charged pages that were present. Even "echo
>>>> 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
>>>> offlined but not freed memcgs tend to increase in number over time with
>>>> the side effect that percpu memory consumption as shown in /proc/meminfo
>>>> also increases over time.
>>>>
>>>> In order to find out more information about those pages that pin
>>>> offlined memcgs, the page_owner feature is extended to dump memory
>>>> cgroup information especially whether the cgroup is offlined or not.
>>>>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> ---
>>>> mm/page_owner.c | 28 ++++++++++++++++++++++++++++
>>>> 1 file changed, 28 insertions(+)
>>>>
>>>> diff --git a/mm/page_owner.c b/mm/page_owner.c
>>>> index c52ce9d6bc3b..e5d8c642296b 100644
>>>> --- a/mm/page_owner.c
>>>> +++ b/mm/page_owner.c
>>>> @@ -10,6 +10,7 @@
>>>> #include <linux/migrate.h>
>>>> #include <linux/stackdepot.h>
>>>> #include <linux/seq_file.h>
>>>> +#include <linux/memcontrol.h>
>>>> #include <linux/sched/clock.h>
>>>> #include "internal.h"
>>>> @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>>> depot_stack_handle_t handle)
>>>> {
>>>> int ret = 0, pageblock_mt, page_mt;
>>>> + unsigned long __maybe_unused memcg_data;
>>>> char *kbuf;
>>>> count = min_t(size_t, count, PAGE_SIZE);
>>>> @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>>>> "Page has been migrated, last migrate reason: %s\n",
>>>> migrate_reason_names[page_owner->last_migrate_reason]);
>>>> +#ifdef CONFIG_MEMCG
>>>> + /*
>>>> + * Look for memcg information and print it out
>>>> + */
>>>> + memcg_data = READ_ONCE(page->memcg_data);
>>>> + if (memcg_data) {
>>>> + struct mem_cgroup *memcg = page_memcg_check(page);
>>>> + bool onlined;
>>>> + char name[80];
>>>> +
>>>> + if (memcg_data & MEMCG_DATA_OBJCGS)
>>>> + SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
>>>> +
>>>> + if (!memcg)
>>>> + goto copy_out;
>>>> +
>>>> + onlined = (memcg->css.flags & CSS_ONLINE);
>>>> + cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
>>>> + SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
>>> ^^^
>>> Extra specifier?
>>>
>>> Did this compile without warnings?
>> Yes, there was no warning.
> But isn't that an extra specifier?
There are 3 arguments to the format string that match the 3 "%s" in it:
1) PageMemcgKmem(page) ? "(via objcg) " : ""
2) onlined ? "" : "offlined
3) name
Cheers,
Longman
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/2] mm/page_owner: Dump memcg information
2022-01-29 3:35 ` Waiman Long
@ 2022-01-29 4:05 ` Ira Weiny
0 siblings, 0 replies; 10+ messages in thread
From: Ira Weiny @ 2022-01-29 4:05 UTC (permalink / raw)
To: Waiman Long
Cc: Johannes Weiner, Michal Hocko, Vladimir Davydov, Andrew Morton,
linux-kernel, cgroups, linux-mm, Rafael Aquini
On Fri, Jan 28, 2022 at 10:35:22PM -0500, Waiman Long wrote:
> On 1/28/22 16:48, Ira Weiny wrote:
> > On Fri, Jan 28, 2022 at 04:31:07PM -0500, Waiman Long wrote:
> > > On 1/28/22 16:22, Ira Weiny wrote:
> > > > On Fri, Jan 28, 2022 at 02:56:42PM -0500, Waiman Long wrote:
> > > > > It was found that a number of offlined memcgs were not freed because
> > > > > they were pinned by some charged pages that were present. Even "echo
> > > > > 1 > /proc/sys/vm/drop_caches" wasn't able to free those pages. These
> > > > > offlined but not freed memcgs tend to increase in number over time with
> > > > > the side effect that percpu memory consumption as shown in /proc/meminfo
> > > > > also increases over time.
> > > > >
> > > > > In order to find out more information about those pages that pin
> > > > > offlined memcgs, the page_owner feature is extended to dump memory
> > > > > cgroup information especially whether the cgroup is offlined or not.
> > > > >
> > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > ---
> > > > > mm/page_owner.c | 28 ++++++++++++++++++++++++++++
> > > > > 1 file changed, 28 insertions(+)
> > > > >
> > > > > diff --git a/mm/page_owner.c b/mm/page_owner.c
> > > > > index c52ce9d6bc3b..e5d8c642296b 100644
> > > > > --- a/mm/page_owner.c
> > > > > +++ b/mm/page_owner.c
> > > > > @@ -10,6 +10,7 @@
> > > > > #include <linux/migrate.h>
> > > > > #include <linux/stackdepot.h>
> > > > > #include <linux/seq_file.h>
> > > > > +#include <linux/memcontrol.h>
> > > > > #include <linux/sched/clock.h>
> > > > > #include "internal.h"
> > > > > @@ -339,6 +340,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > > > > depot_stack_handle_t handle)
> > > > > {
> > > > > int ret = 0, pageblock_mt, page_mt;
> > > > > + unsigned long __maybe_unused memcg_data;
> > > > > char *kbuf;
> > > > > count = min_t(size_t, count, PAGE_SIZE);
> > > > > @@ -371,6 +373,32 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> > > > > "Page has been migrated, last migrate reason: %s\n",
> > > > > migrate_reason_names[page_owner->last_migrate_reason]);
> > > > > +#ifdef CONFIG_MEMCG
> > > > > + /*
> > > > > + * Look for memcg information and print it out
> > > > > + */
> > > > > + memcg_data = READ_ONCE(page->memcg_data);
> > > > > + if (memcg_data) {
> > > > > + struct mem_cgroup *memcg = page_memcg_check(page);
> > > > > + bool onlined;
> > > > > + char name[80];
> > > > > +
> > > > > + if (memcg_data & MEMCG_DATA_OBJCGS)
> > > > > + SNPRINTF(kbuf, count, ret, err, "Slab cache page\n");
> > > > > +
> > > > > + if (!memcg)
> > > > > + goto copy_out;
> > > > > +
> > > > > + onlined = (memcg->css.flags & CSS_ONLINE);
> > > > > + cgroup_name(memcg->css.cgroup, name, sizeof(name) - 1);
> > > > > + SNPRINTF(kbuf, count, ret, err, "Charged %sto %smemcg %s\n",
> > > > ^^^
> > > > Extra specifier?
> > > >
> > > > Did this compile without warnings?
> > > Yes, there was no warning.
> > But isn't that an extra specifier?
>
> There are 3 arguments to the format string that match the 3 "%s" in it:
>
> 1) PageMemcgKmem(page) ? "(via objcg) " : ""
> 2) onlined ? "" : "offlined
> 3) name
My apologies. My parsing of the ? statements was off. FWIW putting
', name'
on the next line would make it more clear... But I see now...
Sorry,
Ira
>
> Cheers,
> Longman
>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2022-01-29 4:05 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-28 19:56 [PATCH 0/2] mm/page_owner: Extend page_owner to show memcg Waiman Long
2022-01-28 19:56 ` [PATCH 1/2] mm/page_owner: Introduce SNPRINTF() macro that includes length error check Waiman Long
2022-01-28 21:18 ` Ira Weiny
2022-01-28 21:22 ` Waiman Long
2022-01-28 19:56 ` [PATCH 2/2] mm/page_owner: Dump memcg information Waiman Long
2022-01-28 21:22 ` Ira Weiny
2022-01-28 21:31 ` Waiman Long
2022-01-28 21:48 ` Ira Weiny
2022-01-29 3:35 ` Waiman Long
2022-01-29 4:05 ` Ira Weiny
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).