* [PATCH 0/3 v11] oom: capture unreclaimable slab info in oom message
@ 2017-10-10 17:25 ` Yang Shi
  0 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-10 17:25 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel


Recently we ran into an OOM issue where the kernel panicked because there was no killable process.
The dmesg shows that huge unreclaimable slabs used almost 100% of memory, but kdump failed to capture a vmcore for some reason.

So it would be better to capture unreclaimable slab info in the OOM message when the kernel panics, to aid troubleshooting and cover this corner case.
Since the kernel has already panicked, capturing more information is worthwhile and does not affect the normal OOM killer.

With this patchset, tools/vm/slabinfo gains a new option, "-U", to show unreclaimable slabs only.

In addition, the OOM killer message will print all non-zero (num_objs * size != 0) unreclaimable slabs.

For details, please see the commit log for each commit.

Changelog v10 -> v11:
* Fixed the compile failure reported by the 0-DAY test. Andrew, please replace all of them.
* Adopted the suggestion from Michal to remove memset()
* Added Acked-by from Michal

Changelog v9 -> v10:
* Adopted the suggestion from Michal to dump unreclaimable slab stats only when !is_memcg_oom
* Adopted the suggestion from Michal to print a warning when the unreclaimable slab dump can’t acquire the mutex

Changelog v8 -> v9:
* Adopted Tetsuo’s suggestion to protect the global slab list traversal with mutex_trylock() to avoid sleeping. If the mutex cannot be acquired, unreclaimable slabs will not be dumped.
* Adopted the suggestion from Christoph to drop CONFIG_SLABINFO since it is pointless to keep it.
* Rebased to 4.13-rc3

Changelog v7 -> v8:
* Adopted Michal’s suggestion to dump unreclaimable slab info when the unreclaimable slab amount > total user memory, not only in the OOM panic path.

Changelog v6 -> v7:
* Added the unreclaim_slabs_oom_ratio proc knob; unreclaimable slab info will be dumped when the ratio of unreclaimable slabs to all user memory exceeds the knob's value

Changelog v5 -> v6:
* Fixed a checkpatch.pl warning for patch #2

Changelog v4 -> v5:
* Addressed the comments from David
* Build-tested with SLABINFO=n

Changelog v3 -> v4:
* Addressed the comments from David
* Added David’s Acked-by in patch 1

Changelog v2 -> v3:
* Show used size and total size of each kmem cache per David’s comment

Changelog v1 -> v2:
* Removed the original patch 1 (“mm: slab: output reclaimable flag in /proc/slabinfo”) since Christoph pointed out that it might break compatibility and that /proc/slabinfo is legacy
* Added Christoph’s Acked-by
* Removed acquiring slab_mutex per Tetsuo’s comment


Yang Shi (3):
      tools: slabinfo: add "-U" option to show unreclaimable slabs only
      mm: slabinfo: dump CONFIG_SLABINFO
      mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory

 init/Kconfig        |  6 ------
 mm/memcontrol.c     |  2 +-
 mm/oom_kill.c       | 27 +++++++++++++++++++++++++--
 mm/slab.c           |  2 --
 mm/slab.h           |  8 ++++++++
 mm/slab_common.c    | 41 +++++++++++++++++++++++++++++++++++++----
 mm/slub.c           |  4 ++--
 tools/vm/slabinfo.c | 11 ++++++++++-
 8 files changed, 83 insertions(+), 18 deletions(-)

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 1/3] tools: slabinfo: add "-U" option to show unreclaimable slabs only
  2017-10-10 17:25 ` Yang Shi
@ 2017-10-10 17:25   ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-10 17:25 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel

Add "-U" option to show unreclaimable slabs only.

Used together, "-U" and "-S" show which unreclaimable slabs use the most
memory, which helps debug issues caused by huge unreclaimable slabs.

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
---
 tools/vm/slabinfo.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/vm/slabinfo.c b/tools/vm/slabinfo.c
index b9d34b3..de8fa11 100644
--- a/tools/vm/slabinfo.c
+++ b/tools/vm/slabinfo.c
@@ -83,6 +83,7 @@ struct aliasinfo {
 int sort_loss;
 int extended_totals;
 int show_bytes;
+int unreclaim_only;
 
 /* Debug options */
 int sanity;
@@ -132,6 +133,7 @@ static void usage(void)
 		"-L|--Loss              Sort by loss\n"
 		"-X|--Xtotals           Show extended summary information\n"
 		"-B|--Bytes             Show size in bytes\n"
+		"-U|--Unreclaim		Show unreclaimable slabs only\n"
 		"\nValid debug options (FZPUT may be combined)\n"
 		"a / A          Switch on all debug options (=FZUP)\n"
 		"-              Switch off all debug options\n"
@@ -568,6 +570,9 @@ static void slabcache(struct slabinfo *s)
 	if (strcmp(s->name, "*") == 0)
 		return;
 
+	if (unreclaim_only && s->reclaim_account)
+		return;
+
 	if (actual_slabs == 1) {
 		report(s);
 		return;
@@ -1346,6 +1351,7 @@ struct option opts[] = {
 	{ "Loss", no_argument, NULL, 'L'},
 	{ "Xtotals", no_argument, NULL, 'X'},
 	{ "Bytes", no_argument, NULL, 'B'},
+	{ "Unreclaim", no_argument, NULL, 'U'},
 	{ NULL, 0, NULL, 0 }
 };
 
@@ -1357,7 +1363,7 @@ int main(int argc, char *argv[])
 
 	page_size = getpagesize();
 
-	while ((c = getopt_long(argc, argv, "aAd::Defhil1noprstvzTSN:LXB",
+	while ((c = getopt_long(argc, argv, "aAd::Defhil1noprstvzTSN:LXBU",
 						opts, NULL)) != -1)
 		switch (c) {
 		case '1':
@@ -1438,6 +1444,9 @@ int main(int argc, char *argv[])
 		case 'B':
 			show_bytes = 1;
 			break;
+		case 'U':
+			unreclaim_only = 1;
+			break;
 		default:
 			fatal("%s: Invalid option '%c'\n", argv[0], optopt);
 
-- 
1.8.3.1
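
The option handling and filter in this patch can be sketched as a small stand-alone fragment. This is only an illustration under stated assumptions: struct cache_entry, parse_args() and should_report() are hypothetical names that do not exist in tools/vm/slabinfo.c; only getopt_long() and the unreclaim_only / reclaim_account check mirror the diff above.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the tool's per-cache record. */
struct cache_entry {
	const char *name;
	int reclaim_account;	/* non-zero if the cache is reclaimable */
};

static int unreclaim_only;	/* set by -U / --Unreclaim */

static const struct option long_opts[] = {
	{ "Unreclaim", no_argument, NULL, 'U' },
	{ NULL, 0, NULL, 0 }
};

/* Parse argv; returns 0 on success, -1 on an unknown option. */
static int parse_args(int argc, char *argv[])
{
	int c;

	optind = 1;		/* start scanning at the first argument */
	unreclaim_only = 0;
	while ((c = getopt_long(argc, argv, "U", long_opts, NULL)) != -1) {
		switch (c) {
		case 'U':
			unreclaim_only = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}

/* Mirrors the early return added to slabcache(): skip reclaimable caches. */
static int should_report(const struct cache_entry *s)
{
	assert(s != NULL);
	return !(unreclaim_only && s->reclaim_account);
}
```

With -U set, any entry whose reclaim_account is non-zero is skipped, so combining it with a size sort leaves only unreclaimable caches in the ranking.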

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH 2/3] mm: slabinfo: dump CONFIG_SLABINFO
  2017-10-10 17:25 ` Yang Shi
@ 2017-10-10 17:25   ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-10 17:25 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel

According to the discussion with Christoph [1], it seems pointless
to keep CONFIG_SLABINFO around.

This patch just removes the CONFIG_SLABINFO config option; /proc/slabinfo
is still available.

[1] https://marc.info/?l=linux-kernel&m=150695909709711&w=2

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
---
 init/Kconfig     | 6 ------
 mm/memcontrol.c  | 2 +-
 mm/slab.c        | 2 --
 mm/slab_common.c | 7 +++----
 mm/slub.c        | 4 ++--
 5 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 78cb246..5d3c80a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1657,12 +1657,6 @@ config HAVE_GENERIC_DMA_COHERENT
 	bool
 	default n
 
-config SLABINFO
-	bool
-	depends on PROC_FS
-	depends on SLAB || SLUB_DEBUG
-	default y
-
 config RT_MUTEXES
 	bool
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d5f3a62..c3e7f9e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4049,7 +4049,7 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 		.write = mem_cgroup_reset,
 		.read_u64 = mem_cgroup_read_u64,
 	},
-#ifdef CONFIG_SLABINFO
+#if defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG)
 	{
 		.name = "kmem.slabinfo",
 		.seq_start = memcg_slab_start,
diff --git a/mm/slab.c b/mm/slab.c
index 04dec48..5743a51 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4096,7 +4096,6 @@ static void cache_reap(struct work_struct *w)
 	schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
 }
 
-#ifdef CONFIG_SLABINFO
 void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo)
 {
 	unsigned long active_objs, num_objs, active_slabs;
@@ -4404,7 +4403,6 @@ static int __init slab_proc_init(void)
 	return 0;
 }
 module_init(slab_proc_init);
-#endif
 
 #ifdef CONFIG_HARDENED_USERCOPY
 /*
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 8016459..68b2f0d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1183,8 +1183,7 @@ void cache_random_seq_destroy(struct kmem_cache *cachep)
 }
 #endif /* CONFIG_SLAB_FREELIST_RANDOM */
 
-#ifdef CONFIG_SLABINFO
-
+#if defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG)
 #ifdef CONFIG_SLAB
 #define SLABINFO_RIGHTS (S_IWUSR | S_IRUSR)
 #else
@@ -1280,7 +1279,7 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
-#if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
+#if defined(CONFIG_MEMCG)
 void *memcg_slab_start(struct seq_file *m, loff_t *pos)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
@@ -1354,7 +1353,7 @@ static int __init slab_proc_init(void)
 	return 0;
 }
 module_init(slab_proc_init);
-#endif /* CONFIG_SLABINFO */
+#endif /* CONFIG_SLAB || CONFIG_SLUB_DEBUG */
 
 static __always_inline void *__do_krealloc(const void *p, size_t new_size,
 					   gfp_t flags)
diff --git a/mm/slub.c b/mm/slub.c
index 163352c..8e4ac4a 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5851,7 +5851,7 @@ static int __init slab_sysfs_init(void)
 /*
  * The /proc/slabinfo ABI
  */
-#ifdef CONFIG_SLABINFO
+#ifdef CONFIG_SLUB_DEBUG
 void get_slabinfo(struct kmem_cache *s, struct slabinfo *sinfo)
 {
 	unsigned long nr_slabs = 0;
@@ -5883,4 +5883,4 @@ ssize_t slabinfo_write(struct file *file, const char __user *buffer,
 {
 	return -EIO;
 }
-#endif /* CONFIG_SLABINFO */
+#endif /* CONFIG_SLUB_DEBUG */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-10 17:25 ` Yang Shi
@ 2017-10-10 17:25   ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-10 17:25 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel

The kernel may panic when an OOM happens without any killable process.
Sometimes this is caused by huge unreclaimable slabs used by the kernel.

Although kdump could help debug such a problem, kdump is not available
on all architectures and it might malfunction sometimes. And since the
kernel has already panicked, it is worth capturing such information in
dmesg to aid troubleshooting.

Print out unreclaimable slab info (used size and total size) for caches
whose actual memory usage is not zero (num_objs * size != 0) when the
amount of unreclaimable slabs is greater than total user memory (LRU
pages).

The output looks like:

Unreclaimable slab info:
Name                      Used          Total
rpc_buffers               31KB         31KB
rpc_tasks                  7KB          7KB
ebitmap_node            1964KB       1964KB
avtab_node              5024KB       5024KB
xfs_buf                 1402KB       1402KB
xfs_ili                  134KB        134KB
xfs_efi_item             115KB        115KB
xfs_efd_item             115KB        115KB
xfs_buf_item             134KB        134KB
xfs_log_item_desc        342KB        342KB
xfs_trans               1412KB       1412KB
xfs_ifork                212KB        212KB

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
---
 mm/oom_kill.c    | 27 +++++++++++++++++++++++++--
 mm/slab.h        |  8 ++++++++
 mm/slab_common.c | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index dee0f75..3023919 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -44,6 +44,7 @@
 
 #include <asm/tlb.h>
 #include "internal.h"
+#include "slab.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/oom.h>
@@ -161,6 +162,25 @@ static bool oom_unkillable_task(struct task_struct *p,
 	return false;
 }
 
+/*
+ * Print out unreclaimable slab info when unreclaimable slabs amount is greater
+ * than all user memory (LRU pages)
+ */
+static bool is_dump_unreclaim_slabs(void)
+{
+	unsigned long nr_lru;
+
+	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
+		 global_node_page_state(NR_INACTIVE_ANON) +
+		 global_node_page_state(NR_ACTIVE_FILE) +
+		 global_node_page_state(NR_INACTIVE_FILE) +
+		 global_node_page_state(NR_ISOLATED_ANON) +
+		 global_node_page_state(NR_ISOLATED_FILE) +
+		 global_node_page_state(NR_UNEVICTABLE);
+
+	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
+}
+
 /**
  * oom_badness - heuristic function to determine which candidate task to kill
  * @p: task struct of which task we should calculate
@@ -420,10 +440,13 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 
 	cpuset_print_current_mems_allowed();
 	dump_stack();
-	if (oc->memcg)
+	if (is_memcg_oom(oc))
 		mem_cgroup_print_oom_info(oc->memcg, p);
-	else
+	else {
 		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
+		if (is_dump_unreclaim_slabs())
+			dump_unreclaimable_slab();
+	}
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc->memcg, oc->nodemask);
 }
diff --git a/mm/slab.h b/mm/slab.h
index 0733628..a1537cf 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -505,6 +505,14 @@ static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 void memcg_slab_stop(struct seq_file *m, void *p);
 int memcg_slab_show(struct seq_file *m, void *p);
 
+#if defined(CONFIG_SLAB) || defined(CONFIG_SLUB_DEBUG)
+void dump_unreclaimable_slab(void);
+#else
+static inline void dump_unreclaimable_slab(void)
+{
+}
+#endif
+
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr);
 
 #ifdef CONFIG_SLAB_FREELIST_RANDOM
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 68b2f0d..4413cee 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1279,6 +1279,40 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
+void dump_unreclaimable_slab(void)
+{
+	struct kmem_cache *s, *s2;
+	struct slabinfo sinfo;
+
+	/*
+	 * Acquiring slab_mutex here is risky since we don't want to
+	 * sleep in the oom path. But without the mutex held, the list
+	 * traversal risks a crash.
+	 * Use mutex_trylock to protect the list traversal, and dump
+	 * nothing if the mutex cannot be acquired.
+	 */
+	if (!mutex_trylock(&slab_mutex)) {
+		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
+		return;
+	}
+
+	pr_info("Unreclaimable slab info:\n");
+	pr_info("Name                      Used          Total\n");
+
+	list_for_each_entry_safe(s, s2, &slab_caches, list) {
+		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
+			continue;
+
+		get_slabinfo(s, &sinfo);
+
+		if (sinfo.num_objs > 0)
+			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
+				(sinfo.active_objs * s->size) / 1024,
+				(sinfo.num_objs * s->size) / 1024);
+	}
+	mutex_unlock(&slab_mutex);
+}
+
 #if defined(CONFIG_MEMCG)
 void *memcg_slab_start(struct seq_file *m, loff_t *pos)
 {
-- 
1.8.3.1
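
The "try the lock, dump nothing if it is busy" pattern used by dump_unreclaimable_slab() can be sketched in user space. This is only an approximation under stated assumptions: a C11 atomic_flag stands in for slab_mutex and mutex_trylock(), struct cache_stat and dump_unreclaimable() are hypothetical, and output goes to a caller-supplied buffer instead of pr_info(). The row format string is the one from the diff above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-cache statistics; the kernel fills these via get_slabinfo(). */
struct cache_stat {
	const char *name;
	int reclaimable;		/* stands in for SLAB_RECLAIM_ACCOUNT */
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned long obj_size;		/* bytes per object */
};

/* Stand-in for slab_mutex: set while someone is walking the cache list. */
static atomic_flag cache_list_busy = ATOMIC_FLAG_INIT;

/*
 * Write one line per non-empty unreclaimable cache into buf.
 * Returns the number of caches printed, or -1 if the lock was busy,
 * in which case nothing is dumped (the caller must not block).
 */
static int dump_unreclaimable(const struct cache_stat *caches, size_t n,
			      char *buf, size_t len)
{
	size_t off = 0;
	int printed = 0;

	assert(caches != NULL && buf != NULL && len > 0);
	buf[0] = '\0';

	if (atomic_flag_test_and_set(&cache_list_busy))
		return -1;	/* try-lock failed: give up instead of waiting */

	for (size_t i = 0; i < n; i++) {
		const struct cache_stat *s = &caches[i];
		int w;

		if (s->reclaimable || s->num_objs == 0)
			continue;

		w = snprintf(buf + off, len - off, "%-17s %10luKB %10luKB\n",
			     s->name,
			     s->active_objs * s->obj_size / 1024,
			     s->num_objs * s->obj_size / 1024);
		if (w < 0 || (size_t)w >= len - off)
			break;	/* buffer full: stop early */
		off += (size_t)w;
		printed++;
	}

	atomic_flag_clear(&cache_list_busy);
	return printed;
}
```

Returning early when the lock is held trades completeness for safety: the caller never waits, at the cost of sometimes printing no statistics at all.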

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-10 17:25   ` Yang Shi
@ 2017-10-17  0:15     ` David Rientjes
  -1 siblings, 0 replies; 60+ messages in thread
From: David Rientjes @ 2017-10-17  0:15 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, iamjoonsoo.kim, akpm, mhocko, linux-mm, linux-kernel

On Wed, 11 Oct 2017, Yang Shi wrote:

> @@ -161,6 +162,25 @@ static bool oom_unkillable_task(struct task_struct *p,
>  	return false;
>  }
>  
> +/*
> + * Print out unreclaimble slabs info when unreclaimable slabs amount is greater
> + * than all user memory (LRU pages)
> + */
> +static bool is_dump_unreclaim_slabs(void)
> +{
> +	unsigned long nr_lru;
> +
> +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
> +		 global_node_page_state(NR_INACTIVE_ANON) +
> +		 global_node_page_state(NR_ACTIVE_FILE) +
> +		 global_node_page_state(NR_INACTIVE_FILE) +
> +		 global_node_page_state(NR_ISOLATED_ANON) +
> +		 global_node_page_state(NR_ISOLATED_FILE) +
> +		 global_node_page_state(NR_UNEVICTABLE);
> +
> +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
> +}

I think this is an excessive requirement to meet to dump potentially very 
helpful information to the kernel log.  On my 256GB system, this would 
probably require >128GB of unreclaimable slab to trigger.  If a single 
slab cache leaker were to blame for this excessive usage, it would suffice 
to only print a single line showing the slab cache with the greatest 
memory footprint.

It also prevents us from diagnosing issues where reclaimable slab isn't 
actually reclaimed as expected, so the scope is too narrow.

Previous iterations of this patchset were actually better because they 
presented useful data that wasn't restricted to excessive requirements for 
a very narrow scope.

Please simply dump statistics for all slab caches where the memory 
footprint is greater than 5% of system memory.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 2/3] mm: slabinfo: dump CONFIG_SLABINFO
  2017-10-10 17:25   ` Yang Shi
@ 2017-10-17  0:17     ` David Rientjes
  -1 siblings, 0 replies; 60+ messages in thread
From: David Rientjes @ 2017-10-17  0:17 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, iamjoonsoo.kim, akpm, mhocko, linux-mm, linux-kernel

On Wed, 11 Oct 2017, Yang Shi wrote:

> According to the discussion with Christoph [1], it sounds it is pointless
> to keep CONFIG_SLABINFO around.
> 
> This patch just remove CONFIG_SLABINFO config option, but /proc/slabinfo
> is still available.
> 
> [1] https://marc.info/?l=linux-kernel&m=150695909709711&w=2
> 
> Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>

Acked-by: David Rientjes <rientjes@google.com>

Cool!

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17  0:15     ` David Rientjes
@ 2017-10-17  7:44       ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-17  7:44 UTC (permalink / raw)
  To: David Rientjes
  Cc: Yang Shi, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Mon 16-10-17 17:15:31, David Rientjes wrote:
> Please simply dump statistics for all slab caches where the memory 
> footprint is greater than 5% of system memory.

Unconditionally? User controllable?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17  7:44       ` Michal Hocko
@ 2017-10-17 20:59         ` David Rientjes
  -1 siblings, 0 replies; 60+ messages in thread
From: David Rientjes @ 2017-10-17 20:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Yang Shi, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Tue, 17 Oct 2017, Michal Hocko wrote:

> On Mon 16-10-17 17:15:31, David Rientjes wrote:
> > Please simply dump statistics for all slab caches where the memory 
> > footprint is greater than 5% of system memory.
> 
> Unconditionally? User controlable?

Unconditionally, it's a single line of output per slab cache and there 
can't be that many of them if each is using >5% of memory.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17 20:59         ` David Rientjes
@ 2017-10-17 21:40           ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-17 21:40 UTC (permalink / raw)
  To: David Rientjes, Michal Hocko
  Cc: cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/17/17 1:59 PM, David Rientjes wrote:
> On Tue, 17 Oct 2017, Michal Hocko wrote:
> 
>> On Mon 16-10-17 17:15:31, David Rientjes wrote:
>>> Please simply dump statistics for all slab caches where the memory
>>> footprint is greater than 5% of system memory.
>>
>> Unconditionally? User controlable?
> 
> Unconditionally, it's a single line of output per slab cache and there
> can't be that many of them if each is using >5% of memory.

So, you mean just dump an individual slab cache if its size is > 5% of 
system memory, instead of dumping all slab caches?

Thanks,
Yang

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17 21:40           ` Yang Shi
@ 2017-10-17 21:50             ` David Rientjes
  -1 siblings, 0 replies; 60+ messages in thread
From: David Rientjes @ 2017-10-17 21:50 UTC (permalink / raw)
  To: Yang Shi
  Cc: Michal Hocko, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Wed, 18 Oct 2017, Yang Shi wrote:

> > > > Please simply dump statistics for all slab caches where the memory
> > > > footprint is greater than 5% of system memory.
> > > 
> > > Unconditionally? User controlable?
> > 
> > Unconditionally, it's a single line of output per slab cache and there
> > can't be that many of them if each is using >5% of memory.
> 
> So,you mean just dump the single slab cache if its size > 5% of system memory
> instead of all slab caches?
> 

Yes, this should catch occurrences of "huge unreclaimable slabs", right?

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17 21:50             ` David Rientjes
@ 2017-10-17 22:20               ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-17 22:20 UTC (permalink / raw)
  To: David Rientjes
  Cc: Michal Hocko, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/17/17 2:50 PM, David Rientjes wrote:
> On Wed, 18 Oct 2017, Yang Shi wrote:
> 
>>>>> Please simply dump statistics for all slab caches where the memory
>>>>> footprint is greater than 5% of system memory.
>>>>
>>>> Unconditionally? User controlable?
>>>
>>> Unconditionally, it's a single line of output per slab cache and there
>>> can't be that many of them if each is using >5% of memory.
>>
>> So,you mean just dump the single slab cache if its size > 5% of system memory
>> instead of all slab caches?
>>
> 
> Yes, this should catch occurrences of "huge unreclaimable slabs", right?

Yes, it sounds so. A single "huge" unreclaimable slab might not result in 
excessive slab use as a whole, but this would help to filter out 
"small" unreclaimable slabs.

Yang

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17 22:20               ` Yang Shi
@ 2017-10-17 22:39                 ` David Rientjes
  -1 siblings, 0 replies; 60+ messages in thread
From: David Rientjes @ 2017-10-17 22:39 UTC (permalink / raw)
  To: Yang Shi
  Cc: Michal Hocko, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Wed, 18 Oct 2017, Yang Shi wrote:

> > Yes, this should catch occurrences of "huge unreclaimable slabs", right?
> 
> Yes, it sounds so. Although single "huge" unreclaimable slab might not result
> in excessive slabs use in a whole, but this would help to filter out "small"
> unreclaimable slab.
> 

Keep in mind this is regardless of SLAB_RECLAIM_ACCOUNT: your patch has 
value beyond only unreclaimable slab, it can also be used to show 
instances where the oom killer was invoked without properly reclaiming 
slab.  If the total footprint of a slab cache exceeds 5%, I think a line 
should be emitted unconditionally to the kernel log.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17 22:39                 ` David Rientjes
@ 2017-10-18 19:09                   ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-18 19:09 UTC (permalink / raw)
  To: David Rientjes
  Cc: Michal Hocko, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/17/17 3:39 PM, David Rientjes wrote:
> On Wed, 18 Oct 2017, Yang Shi wrote:
> 
>>> Yes, this should catch occurrences of "huge unreclaimable slabs", right?
>>
>> Yes, it sounds so. Although single "huge" unreclaimable slab might not result
>> in excessive slabs use in a whole, but this would help to filter out "small"
>> unreclaimable slab.
>>
> 
> Keep in mind this is regardless of SLAB_RECLAIM_ACCOUNT: your patch has
> value beyond only unreclaimable slab, it can also be used to show
> instances where the oom killer was invoked without properly reclaiming
> slab.  If the total footprint of a slab cache exceeds 5%, I think a line
> should be emitted unconditionally to the kernel log.

OK, sounds good. I will post an incremental patch for comments.

Thanks,
Yang

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-17 22:39                 ` David Rientjes
@ 2017-10-19  7:28                   ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-19  7:28 UTC (permalink / raw)
  To: David Rientjes
  Cc: Yang Shi, cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Tue 17-10-17 15:39:08, David Rientjes wrote:
> On Wed, 18 Oct 2017, Yang Shi wrote:
> 
> > > Yes, this should catch occurrences of "huge unreclaimable slabs", right?
> > 
> > Yes, it sounds so. Although single "huge" unreclaimable slab might not result
> > in excessive slabs use in a whole, but this would help to filter out "small"
> > unreclaimable slab.
> > 
> 
> Keep in mind this is regardless of SLAB_RECLAIM_ACCOUNT: your patch has 
> value beyond only unreclaimable slab, it can also be used to show 
> instances where the oom killer was invoked without properly reclaiming 
> slab.  If the total footprint of a slab cache exceeds 5%, I think a line 
> should be emitted unconditionally to the kernel log.

Agreed. I am not sure 5% is the best fit, but we can tune that later.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-19  7:28                   ` Michal Hocko
@ 2017-10-19 23:12                     ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-19 23:12 UTC (permalink / raw)
  To: Michal Hocko, David Rientjes
  Cc: cl, penberg, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/19/17 12:28 AM, Michal Hocko wrote:
> On Tue 17-10-17 15:39:08, David Rientjes wrote:
>> On Wed, 18 Oct 2017, Yang Shi wrote:
>>
>>>> Yes, this should catch occurrences of "huge unreclaimable slabs", right?
>>>
>>> Yes, it sounds so. Although single "huge" unreclaimable slab might not result
>>> in excessive slabs use in a whole, but this would help to filter out "small"
>>> unreclaimable slab.
>>>
>>
>> Keep in mind this is regardless of SLAB_RECLAIM_ACCOUNT: your patch has
>> value beyond only unreclaimable slab, it can also be used to show
>> instances where the oom killer was invoked without properly reclaiming
>> slab.  If the total footprint of a slab cache exceeds 5%, I think a line
>> should be emitted unconditionally to the kernel log.
> 
> agreed. I am not sure 5% is the greatest fit but we can tune that later.

5% might be too low. For example, on a machine with 200G of memory, if 
there is 80G of page cache, radix_tree_node might consume 10G. IMHO, 10% 
might be better.

Yang

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-09 18:53             ` Yang Shi
@ 2017-10-09 21:00               ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-09 21:00 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/9/17 11:53 AM, Yang Shi wrote:
> 
> 
> On 10/8/17 11:36 PM, Michal Hocko wrote:
>> On Mon 09-10-17 08:33:16, Michal Hocko wrote:
>>> On Sat 07-10-17 00:37:55, Yang Shi wrote:
>>>>
>>>>
>>>> On 10/6/17 2:37 AM, Michal Hocko wrote:
>>>>> On Thu 05-10-17 05:29:10, Yang Shi wrote:
>>> [...]
>>>>>> +    list_for_each_entry_safe(s, s2, &slab_caches, list) {
>>>>>> +        if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>>>>>> +            continue;
>>>>>> +
>>>>>> +        memset(&sinfo, 0, sizeof(sinfo));
>>>>>
>>>>> why do you zero out the structure. All the fields you are printing are
>>>>> filled out in get_slabinfo.
>>>>
>>>> No special reason, just wipe out the potential stale data on the stack.
>>>
>>> Do not add code that has no meaning. The OOM killer is a slow path but
>>> that doesn't mean we should throw spare cycles out of the window.
>>
>> With this fixed and the compile fix [1] folded, feel free to add my
>> Acked-by: Michal Hocko <mhocko@suse.com>
>>
>> [1] 
>> http://lkml.kernel.org/r/1507492085-42264-1-git-send-email-yang.s@alibaba-inc.com 
>>
> 
> Did some more thorough test and took the code a little deeper, it sounds 
> !CONFIG_SLOB is not enough. Some data structure and functions depends on 
> CONFIG_SLUB_DEBUG, i.e. kmem_cache_node->total_objects and 
> node_nr_objs(), which are essential of get_slabinfo().
> 
> So, I'm supposed it makes more sense to protect the related slab stats 
> code and the unreclaimable slabinfo dump with CONFIG_SLAB || 
> CONFIG_SLUB_DEBUG.

This is needed to fix the compile error when CONFIG_SLUB && !CONFIG_SLUB_DEBUG.

Yang

> 
> Thanks,
> Yang
> 
>>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-09  6:36           ` Michal Hocko
@ 2017-10-09 18:53             ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-09 18:53 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/8/17 11:36 PM, Michal Hocko wrote:
> On Mon 09-10-17 08:33:16, Michal Hocko wrote:
>> On Sat 07-10-17 00:37:55, Yang Shi wrote:
>>>
>>>
>>> On 10/6/17 2:37 AM, Michal Hocko wrote:
>>>> On Thu 05-10-17 05:29:10, Yang Shi wrote:
>> [...]
>>>>> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
>>>>> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>>>>> +			continue;
>>>>> +
>>>>> +		memset(&sinfo, 0, sizeof(sinfo));
>>>>
>>>> why do you zero out the structure. All the fields you are printing are
>>>> filled out in get_slabinfo.
>>>
>>> No special reason, just wipe out the potential stale data on the stack.
>>
>> Do not add code that has no meaning. The OOM killer is a slow path but
>> that doesn't mean we should throw spare cycles out of the window.
> 
> With this fixed and the compile fix [1] folded, feel free to add my
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> [1] http://lkml.kernel.org/r/1507492085-42264-1-git-send-email-yang.s@alibaba-inc.com

I did some more thorough testing and dug a little deeper into the code; it 
seems !CONFIG_SLOB is not enough. Some data structures and functions depend 
on CONFIG_SLUB_DEBUG, namely kmem_cache_node->total_objects and 
node_nr_objs(), which are essential to get_slabinfo().

So, I suppose it makes more sense to protect the related slab stats 
code and the unreclaimable slabinfo dump with CONFIG_SLAB || 
CONFIG_SLUB_DEBUG.

Thanks,
Yang

> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-09  6:36           ` Michal Hocko
@ 2017-10-09 16:44             ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-09 16:44 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/8/17 11:36 PM, Michal Hocko wrote:
> On Mon 09-10-17 08:33:16, Michal Hocko wrote:
>> On Sat 07-10-17 00:37:55, Yang Shi wrote:
>>>
>>>
>>> On 10/6/17 2:37 AM, Michal Hocko wrote:
>>>> On Thu 05-10-17 05:29:10, Yang Shi wrote:
>> [...]
>>>>> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
>>>>> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>>>>> +			continue;
>>>>> +
>>>>> +		memset(&sinfo, 0, sizeof(sinfo));
>>>>
>>>> why do you zero out the structure. All the fields you are printing are
>>>> filled out in get_slabinfo.
>>>
>>> No special reason, just wipe out the potential stale data on the stack.
>>
>> Do not add code that has no meaning. The OOM killer is a slow path but
>> that doesn't mean we should throw spare cycles out of the window.
> 
> With this fixed and the compile fix [1] folded, feel free to add my
> Acked-by: Michal Hocko <mhocko@suse.com>

Sure, thanks. I think I'd better send out a new version so that
Andrew can replace them easily.

Yang

> 
> [1] http://lkml.kernel.org/r/1507492085-42264-1-git-send-email-yang.s@alibaba-inc.com
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-09  6:33         ` Michal Hocko
@ 2017-10-09  6:36           ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-09  6:36 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Mon 09-10-17 08:33:16, Michal Hocko wrote:
> On Sat 07-10-17 00:37:55, Yang Shi wrote:
> > 
> > 
> > On 10/6/17 2:37 AM, Michal Hocko wrote:
> > > On Thu 05-10-17 05:29:10, Yang Shi wrote:
> [...]
> > > > +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
> > > > +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
> > > > +			continue;
> > > > +
> > > > +		memset(&sinfo, 0, sizeof(sinfo));
> > > 
> > > why do you zero out the structure. All the fields you are printing are
> > > filled out in get_slabinfo.
> > 
> > No special reason, just wipe out the potential stale data on the stack.
> 
> Do not add code that has no meaning. The OOM killer is a slow path but
> that doesn't mean we should throw spare cycles out of the window.

With this fixed and the compile fix [1] folded, feel free to add my
Acked-by: Michal Hocko <mhocko@suse.com>

[1] http://lkml.kernel.org/r/1507492085-42264-1-git-send-email-yang.s@alibaba-inc.com

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-06 16:37       ` Yang Shi
@ 2017-10-09  6:33         ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-09  6:33 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Sat 07-10-17 00:37:55, Yang Shi wrote:
> 
> 
> On 10/6/17 2:37 AM, Michal Hocko wrote:
> > On Thu 05-10-17 05:29:10, Yang Shi wrote:
[...]
> > > +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
> > > +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
> > > +			continue;
> > > +
> > > +		memset(&sinfo, 0, sizeof(sinfo));
> > 
> > why do you zero out the structure. All the fields you are printing are
> > filled out in get_slabinfo.
> 
> No special reason, just wipe out the potential stale data on the stack.

Do not add code that has no meaning. The OOM killer is a slow path but
that doesn't mean we should throw spare cycles out of the window.
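
A small userspace sketch of the point being made: if the helper writes
every field the caller later reads, zeroing the structure first does not
change the result. struct demo_slabinfo and the demo_* functions are
hypothetical and only mimic the two fields printed by the patch; they are
not the kernel's struct slabinfo or get_slabinfo().

```c
#include <assert.h>
#include <string.h>

struct demo_slabinfo {
	unsigned long active_objs;
	unsigned long num_objs;
};

/* Writes every field that the caller reads afterwards. */
static void demo_get_slabinfo(unsigned long active, unsigned long total,
			      struct demo_slabinfo *sinfo)
{
	sinfo->active_objs = active;
	sinfo->num_objs = total;
}

/* Total size in KB; with_memset toggles the redundant zeroing. */
static unsigned long demo_total_kb(int with_memset, unsigned long total,
				   unsigned long obj_size)
{
	struct demo_slabinfo sinfo;

	if (with_memset)
		memset(&sinfo, 0, sizeof(sinfo));	/* no effect on the result */

	demo_get_slabinfo(total, total, &sinfo);
	assert(sinfo.active_objs <= sinfo.num_objs);
	return (sinfo.num_objs * obj_size) / 1024;
}
```

demo_total_kb(0, 2048, 64) and demo_total_kb(1, 2048, 64) both return 128,
so the memset() only costs cycles.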
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 21:29   ` Yang Shi
@ 2017-10-07 13:05     ` kbuild test robot
  -1 siblings, 0 replies; 60+ messages in thread
From: kbuild test robot @ 2017-10-07 13:05 UTC (permalink / raw)
  To: Yang Shi
  Cc: kbuild-all, cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko,
	Yang Shi, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2279 bytes --]

Hi Yang,

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.14-rc3 next-20170929]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/oom-capture-unreclaimable-slab-info-in-oom-message/20171007-173639
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: h8300-h8300h-sim_defconfig (attached as .config)
compiler: h8300-linux-gcc (GCC) 6.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=h8300 

All errors (new ones prefixed by >>):

   mm/slab_common.o: In function `dump_unreclaimable_slab':
>> mm/slab_common.c:1298: undefined reference to `get_slabinfo'

vim +1298 mm/slab_common.c

  1272	
  1273	void dump_unreclaimable_slab(void)
  1274	{
  1275		struct kmem_cache *s, *s2;
  1276		struct slabinfo sinfo;
  1277	
  1278		/*
  1279		 * Here acquiring slab_mutex is risky since we don't prefer to get
  1280		 * sleep in oom path. But, without mutex hold, it may introduce a
  1281		 * risk of crash.
  1282		 * Use mutex_trylock to protect the list traverse, dump nothing
  1283		 * without acquiring the mutex.
  1284		 */
  1285		if (!mutex_trylock(&slab_mutex)) {
  1286			pr_warn("excessive unreclaimable slab but cannot dump stats\n");
  1287			return;
  1288		}
  1289	
  1290		pr_info("Unreclaimable slab info:\n");
  1291		pr_info("Name                      Used          Total\n");
  1292	
  1293		list_for_each_entry_safe(s, s2, &slab_caches, list) {
  1294			if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
  1295				continue;
  1296	
  1297			memset(&sinfo, 0, sizeof(sinfo));
> 1298			get_slabinfo(s, &sinfo);
  1299	
  1300			if (sinfo.num_objs > 0)
  1301				pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
  1302					(sinfo.active_objs * s->size) / 1024,
  1303					(sinfo.num_objs * s->size) / 1024);
  1304		}
  1305		mutex_unlock(&slab_mutex);
  1306	}
  1307	
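
The locking choice in the listing above (try the lock, and dump nothing if
it is busy) can be sketched in userspace as follows. This is only an
analogy: a C11 atomic_flag stands in for slab_mutex, and demo_lock and the
demo_* functions are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for slab_mutex. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Never blocks: returns true only if the lock was free. */
static bool demo_trylock(void)
{
	return !atomic_flag_test_and_set(&demo_lock);
}

static void demo_unlock(void)
{
	atomic_flag_clear(&demo_lock);
}

/* Returns true if the stats were dumped, false if the dump was skipped. */
static bool demo_dump_stats(void)
{
	if (!demo_trylock()) {
		fprintf(stderr, "cannot dump stats: lock is busy\n");
		return false;
	}

	puts("stats dumped while holding the lock");
	demo_unlock();
	return true;
}
```

If another path already holds the lock, demo_dump_stats() gives up and
reports that immediately instead of sleeping, which is the trade-off the
comment in the patch describes for the OOM path.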

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 4722 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 21:29   ` Yang Shi
@ 2017-10-07 10:10     ` kbuild test robot
  -1 siblings, 0 replies; 60+ messages in thread
From: kbuild test robot @ 2017-10-07 10:10 UTC (permalink / raw)
  To: Yang Shi
  Cc: kbuild-all, cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko,
	Yang Shi, linux-mm, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 895 bytes --]

Hi Yang,

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.14-rc3 next-20170929]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/oom-capture-unreclaimable-slab-info-in-oom-message/20171007-173639
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: i386-tinyconfig (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   mm/slab_common.o: In function `dump_unreclaimable_slab':
>> slab_common.c:(.text+0x464): undefined reference to `get_slabinfo'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6696 bytes --]

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-06  9:37     ` Michal Hocko
@ 2017-10-06 16:37       ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-06 16:37 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/6/17 2:37 AM, Michal Hocko wrote:
> On Thu 05-10-17 05:29:10, Yang Shi wrote:
>> The kernel may panic when an OOM happens with no killable process;
>> sometimes this is caused by huge unreclaimable slabs used by the kernel.
>>
>> Although kdump could help debug such a problem, kdump is not available
>> on all architectures and it may malfunction at times. And since the
>> kernel has already panicked, it is worth capturing such information in
>> dmesg to aid troubleshooting.
>>
>> Print out unreclaimable slab info (used size and total size) for slabs
>> whose actual memory usage is not zero (num_objs * size != 0) when the
>> amount of unreclaimable slabs is greater than total user memory (LRU
>> pages).
>>
>> The output looks like:
>>
>> Unreclaimable slab info:
>> Name                      Used          Total
>> rpc_buffers               31KB         31KB
>> rpc_tasks                  7KB          7KB
>> ebitmap_node            1964KB       1964KB
>> avtab_node              5024KB       5024KB
>> xfs_buf                 1402KB       1402KB
>> xfs_ili                  134KB        134KB
>> xfs_efi_item             115KB        115KB
>> xfs_efd_item             115KB        115KB
>> xfs_buf_item             134KB        134KB
>> xfs_log_item_desc        342KB        342KB
>> xfs_trans               1412KB       1412KB
>> xfs_ifork                212KB        212KB
> 
> OK this looks better. The naming is not the greatest but I will not
> nitpick on this. I have one question though
> 
>>
>> Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
> [...]
>> +void dump_unreclaimable_slab(void)
>> +{
>> +	struct kmem_cache *s, *s2;
>> +	struct slabinfo sinfo;
>> +
>> +	/*
>> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
>> +	 * sleep in oom path. But, without mutex hold, it may introduce a
>> +	 * risk of crash.
>> +	 * Use mutex_trylock to protect the list traverse, dump nothing
>> +	 * without acquiring the mutex.
>> +	 */
>> +	if (!mutex_trylock(&slab_mutex)) {
>> +		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
>> +		return;
>> +	}
>> +
>> +	pr_info("Unreclaimable slab info:\n");
>> +	pr_info("Name                      Used          Total\n");
>> +
>> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
>> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>> +			continue;
>> +
>> +		memset(&sinfo, 0, sizeof(sinfo));
> 
> why do you zero out the structure. All the fields you are printing are
> filled out in get_slabinfo.

No special reason; it just wipes out potentially stale data on the stack.

Yang

> 
>> +		get_slabinfo(s, &sinfo);
>> +
>> +		if (sinfo.num_objs > 0)
>> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
>> +				(sinfo.active_objs * s->size) / 1024,
>> +				(sinfo.num_objs * s->size) / 1024);
>> +	}
>> +	mutex_unlock(&slab_mutex);
>> +}
>> +
>>   #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>>   void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>>   {
>> -- 
>> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 21:29   ` Yang Shi
@ 2017-10-06  9:37     ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-06  9:37 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Thu 05-10-17 05:29:10, Yang Shi wrote:
> The kernel may panic when an OOM happens with no killable process;
> sometimes this is caused by huge unreclaimable slabs used by the kernel.
> 
> Although kdump could help debug such a problem, kdump is not available
> on all architectures and it may malfunction at times. And since the
> kernel has already panicked, it is worth capturing such information in
> dmesg to aid troubleshooting.
> 
> Print out unreclaimable slab info (used size and total size) for slabs
> whose actual memory usage is not zero (num_objs * size != 0) when the
> amount of unreclaimable slabs is greater than total user memory (LRU
> pages).
> 
> The output looks like:
> 
> Unreclaimable slab info:
> Name                      Used          Total
> rpc_buffers               31KB         31KB
> rpc_tasks                  7KB          7KB
> ebitmap_node            1964KB       1964KB
> avtab_node              5024KB       5024KB
> xfs_buf                 1402KB       1402KB
> xfs_ili                  134KB        134KB
> xfs_efi_item             115KB        115KB
> xfs_efd_item             115KB        115KB
> xfs_buf_item             134KB        134KB
> xfs_log_item_desc        342KB        342KB
> xfs_trans               1412KB       1412KB
> xfs_ifork                212KB        212KB

OK, this looks better. The naming is not the greatest but I will not
nitpick on this. I have one question, though.

> 
> Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
[...]
> +void dump_unreclaimable_slab(void)
> +{
> +	struct kmem_cache *s, *s2;
> +	struct slabinfo sinfo;
> +
> +	/*
> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
> +	 * sleep in oom path. But, without mutex hold, it may introduce a
> +	 * risk of crash.
> +	 * Use mutex_trylock to protect the list traverse, dump nothing
> +	 * without acquiring the mutex.
> +	 */
> +	if (!mutex_trylock(&slab_mutex)) {
> +		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
> +		return;
> +	}
> +
> +	pr_info("Unreclaimable slab info:\n");
> +	pr_info("Name                      Used          Total\n");
> +
> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
> +			continue;
> +
> +		memset(&sinfo, 0, sizeof(sinfo));

Why do you zero out the structure? All the fields you are printing are
filled in by get_slabinfo().

> +		get_slabinfo(s, &sinfo);
> +
> +		if (sinfo.num_objs > 0)
> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
> +				(sinfo.active_objs * s->size) / 1024,
> +				(sinfo.num_objs * s->size) / 1024);
> +	}
> +	mutex_unlock(&slab_mutex);
> +}
> +
>  #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>  void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>  {
> -- 
> 1.8.3.1
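
For reference, the per-cache row printed by the quoted pr_info() call can
be reproduced in userspace with the same format string. This is a sketch
only: demo_format_row() and its parameters are hypothetical, and
snprintf() stands in for pr_info().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format one "Name Used Total" row the way the quoted pr_info() does:
 * object counts are multiplied by the object size and converted to KB
 * with integer division. Returns the number of characters written.
 */
static int demo_format_row(char *buf, size_t len, const char *name,
			   unsigned long active_objs, unsigned long num_objs,
			   unsigned long obj_size)
{
	assert(active_objs <= num_objs);
	return snprintf(buf, len, "%-17s %10luKB %10luKB", name,
			(active_objs * obj_size) / 1024,
			(num_objs * obj_size) / 1024);
}
```

With 1024 active objects out of 2048 and an object size of 512 bytes, the
row shows 512KB used and 1024KB total. Because the division truncates,
partially filled kilobytes are rounded down.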

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
@ 2017-10-06  9:37     ` Michal Hocko
  0 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-06  9:37 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Thu 05-10-17 05:29:10, Yang Shi wrote:
> Kernel may panic when oom happens without killable process sometimes it
> is caused by huge unreclaimable slabs used by kernel.
> 
> Although kdump could help debug such problem, however, kdump is not
> available on all architectures and it might be malfunction sometime.
> And, since kernel already panic it is worthy capturing such information
> in dmesg to aid touble shooting.
> 
> Print out unreclaimable slab info (used size and total size) which
> actual memory usage is not zero (num_objs * size != 0) when
> unreclaimable slabs amount is greater than total user memory (LRU
> pages).
> 
> The output looks like:
> 
> Unreclaimable slab info:
> Name                      Used          Total
> rpc_buffers               31KB         31KB
> rpc_tasks                  7KB          7KB
> ebitmap_node            1964KB       1964KB
> avtab_node              5024KB       5024KB
> xfs_buf                 1402KB       1402KB
> xfs_ili                  134KB        134KB
> xfs_efi_item             115KB        115KB
> xfs_efd_item             115KB        115KB
> xfs_buf_item             134KB        134KB
> xfs_log_item_desc        342KB        342KB
> xfs_trans               1412KB       1412KB
> xfs_ifork                212KB        212KB

OK this looks better. The naming is not the greatest but I will not
nitpick on this. I have one question though

> 
> Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
[...]
> +void dump_unreclaimable_slab(void)
> +{
> +	struct kmem_cache *s, *s2;
> +	struct slabinfo sinfo;
> +
> +	/*
> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
> +	 * sleep in oom path. But, without mutex hold, it may introduce a
> +	 * risk of crash.
> +	 * Use mutex_trylock to protect the list traverse, dump nothing
> +	 * without acquiring the mutex.
> +	 */
> +	if (!mutex_trylock(&slab_mutex)) {
> +		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
> +		return;
> +	}
> +
> +	pr_info("Unreclaimable slab info:\n");
> +	pr_info("Name                      Used          Total\n");
> +
> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
> +			continue;
> +
> +		memset(&sinfo, 0, sizeof(sinfo));

Why do you zero out the structure? All the fields you are printing are
filled out in get_slabinfo.

> +		get_slabinfo(s, &sinfo);
> +
> +		if (sinfo.num_objs > 0)
> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
> +				(sinfo.active_objs * s->size) / 1024,
> +				(sinfo.num_objs * s->size) / 1024);
> +	}
> +	mutex_unlock(&slab_mutex);
> +}
> +
>  #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>  void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>  {
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 18:08       ` Yang Shi
@ 2017-10-05  7:57         ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-05  7:57 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Thu 05-10-17 02:08:48, Yang Shi wrote:
> 
> 
> On 10/4/17 7:27 AM, Michal Hocko wrote:
> > On Wed 04-10-17 02:06:17, Yang Shi wrote:
> > > +static bool is_dump_unreclaim_slabs(void)
> > > +{
> > > +	unsigned long nr_lru;
> > > +
> > > +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
> > > +		 global_node_page_state(NR_INACTIVE_ANON) +
> > > +		 global_node_page_state(NR_ACTIVE_FILE) +
> > > +		 global_node_page_state(NR_INACTIVE_FILE) +
> > > +		 global_node_page_state(NR_ISOLATED_ANON) +
> > > +		 global_node_page_state(NR_ISOLATED_FILE) +
> > > +		 global_node_page_state(NR_UNEVICTABLE);
> > > +
> > > +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
> > > +}
> > 
> > I am sorry I haven't pointed this out earlier (I was following only
> > halfway) but this should really be memcg aware. You are checking only global
> > counters. I do not think it is an absolute must to provide per-memcg
> > data but you should at least check !is_memcg_oom(oc).
> 
> BTW, I saw there is already such a check in dump_header, which looks like
> the code below:
> 
>         if (oc->memcg)
>                 mem_cgroup_print_oom_info(oc->memcg, p);
>         else
>                 show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
> 
> I suppose it'd be better to replace "oc->memcg" with "is_memcg_oom(oc)" since
> they do the same check and the "is_memcg_oom" interface sounds preferable.

Yes, is_memcg_oom is better.

> Then I'm going to move the unreclaimable slabs dump to the "else" block.

makes sense.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 21:29 [PATCH 0/3 v10] oom: capture unreclaimable slab info in oom message Yang Shi
@ 2017-10-04 21:29   ` Yang Shi
  0 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-04 21:29 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel

The kernel may panic when an oom happens without a killable process;
sometimes this is caused by huge unreclaimable slabs used by the kernel.

Although kdump could help debug such a problem, kdump is not available
on all architectures and it might malfunction sometimes. And, since the
kernel has already panicked, it is worth capturing such information in
dmesg to aid troubleshooting.

Print out unreclaimable slab info (used size and total size) for slabs
whose actual memory usage is not zero (num_objs * size != 0) when the
amount of unreclaimable slabs is greater than total user memory (LRU
pages).

The output looks like:

Unreclaimable slab info:
Name                      Used          Total
rpc_buffers               31KB         31KB
rpc_tasks                  7KB          7KB
ebitmap_node            1964KB       1964KB
avtab_node              5024KB       5024KB
xfs_buf                 1402KB       1402KB
xfs_ili                  134KB        134KB
xfs_efi_item             115KB        115KB
xfs_efd_item             115KB        115KB
xfs_buf_item             134KB        134KB
xfs_log_item_desc        342KB        342KB
xfs_trans               1412KB       1412KB
xfs_ifork                212KB        212KB

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
---
 mm/oom_kill.c    | 27 +++++++++++++++++++++++++--
 mm/slab.h        |  2 ++
 mm/slab_common.c | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index dee0f75..3023919 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -44,6 +44,7 @@
 
 #include <asm/tlb.h>
 #include "internal.h"
+#include "slab.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/oom.h>
@@ -161,6 +162,25 @@ static bool oom_unkillable_task(struct task_struct *p,
 	return false;
 }
 
+/*
+ * Print out unreclaimable slabs info when unreclaimable slabs amount is greater
+ * than all user memory (LRU pages)
+ */
+static bool is_dump_unreclaim_slabs(void)
+{
+	unsigned long nr_lru;
+
+	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
+		 global_node_page_state(NR_INACTIVE_ANON) +
+		 global_node_page_state(NR_ACTIVE_FILE) +
+		 global_node_page_state(NR_INACTIVE_FILE) +
+		 global_node_page_state(NR_ISOLATED_ANON) +
+		 global_node_page_state(NR_ISOLATED_FILE) +
+		 global_node_page_state(NR_UNEVICTABLE);
+
+	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
+}
+
 /**
  * oom_badness - heuristic function to determine which candidate task to kill
  * @p: task struct of which task we should calculate
@@ -420,10 +440,13 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 
 	cpuset_print_current_mems_allowed();
 	dump_stack();
-	if (oc->memcg)
+	if (is_memcg_oom(oc))
 		mem_cgroup_print_oom_info(oc->memcg, p);
-	else
+	else {
 		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
+		if (is_dump_unreclaim_slabs())
+			dump_unreclaimable_slab();
+	}
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc->memcg, oc->nodemask);
 }
diff --git a/mm/slab.h b/mm/slab.h
index 0733628..6fc4d5d 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -505,6 +505,8 @@ static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 void memcg_slab_stop(struct seq_file *m, void *p);
 int memcg_slab_show(struct seq_file *m, void *p);
 
+void dump_unreclaimable_slab(void);
+
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr);
 
 #ifdef CONFIG_SLAB_FREELIST_RANDOM
diff --git a/mm/slab_common.c b/mm/slab_common.c
index c1629cb..5c8fac5 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1278,6 +1278,41 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
+void dump_unreclaimable_slab(void)
+{
+	struct kmem_cache *s, *s2;
+	struct slabinfo sinfo;
+
+	/*
+	 * Acquiring slab_mutex here is risky since we do not want to sleep
+	 * in the oom path. But without holding the mutex, traversing the
+	 * list risks a crash.
+	 * Use mutex_trylock to protect the list traversal, and dump nothing
+	 * if the mutex cannot be acquired.
+	 */
+	if (!mutex_trylock(&slab_mutex)) {
+		pr_warn("excessive unreclaimable slab but cannot dump stats\n");
+		return;
+	}
+
+	pr_info("Unreclaimable slab info:\n");
+	pr_info("Name                      Used          Total\n");
+
+	list_for_each_entry_safe(s, s2, &slab_caches, list) {
+		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
+			continue;
+
+		memset(&sinfo, 0, sizeof(sinfo));
+		get_slabinfo(s, &sinfo);
+
+		if (sinfo.num_objs > 0)
+			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
+				(sinfo.active_objs * s->size) / 1024,
+				(sinfo.num_objs * s->size) / 1024);
+	}
+	mutex_unlock(&slab_mutex);
+}
+
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 void *memcg_slab_start(struct seq_file *m, loff_t *pos)
 {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 14:27     ` Michal Hocko
@ 2017-10-04 18:08       ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-04 18:08 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/4/17 7:27 AM, Michal Hocko wrote:
> On Wed 04-10-17 02:06:17, Yang Shi wrote:
>> +static bool is_dump_unreclaim_slabs(void)
>> +{
>> +	unsigned long nr_lru;
>> +
>> +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
>> +		 global_node_page_state(NR_INACTIVE_ANON) +
>> +		 global_node_page_state(NR_ACTIVE_FILE) +
>> +		 global_node_page_state(NR_INACTIVE_FILE) +
>> +		 global_node_page_state(NR_ISOLATED_ANON) +
>> +		 global_node_page_state(NR_ISOLATED_FILE) +
>> +		 global_node_page_state(NR_UNEVICTABLE);
>> +
>> +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
>> +}
> 
> I am sorry I haven't pointed this out earlier (I was following only
> halfway) but this should really be memcg aware. You are checking only global
> counters. I do not think it is an absolute must to provide per-memcg
> data but you should at least check !is_memcg_oom(oc).

BTW, I saw there is already such a check in dump_header, which looks like
the code below:

         if (oc->memcg)
                 mem_cgroup_print_oom_info(oc->memcg, p);
         else
                 show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);

I suppose it'd be better to replace "oc->memcg" with "is_memcg_oom(oc)"
since they do the same check and the "is_memcg_oom" interface sounds preferable.

Then I'm going to move the unreclaimable slabs dump to the "else" block.

Yang

> 
> [...]
>> +void dump_unreclaimable_slab(void)
>> +{
>> +	struct kmem_cache *s, *s2;
>> +	struct slabinfo sinfo;
>> +
>> +	pr_info("Unreclaimable slab info:\n");
>> +	pr_info("Name                      Used          Total\n");
>> +
>> +	/*
>> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
>> +	 * sleep in oom path. But, without mutex hold, it may introduce a
>> +	 * risk of crash.
>> +	 * Use mutex_trylock to protect the list traverse, dump nothing
>> +	 * without acquiring the mutex.
>> +	 */
>> +	if (!mutex_trylock(&slab_mutex))
>> +		return;
> 
> I would move the trylock up so that we do not get an empty and confusing
> "Unreclaimable slab info:" header, and add a note that we are not dumping
> anything due to lock contention:
> 	pr_warn("excessive unreclaimable slab memory but cannot dump stats to give you more details\n");
> 
> Other than that this looks sensible to me.
> 
>> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
>> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>> +			continue;
>> +
>> +		memset(&sinfo, 0, sizeof(sinfo));
>> +		get_slabinfo(s, &sinfo);
>> +
>> +		if (sinfo.num_objs > 0)
>> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
>> +				(sinfo.active_objs * s->size) / 1024,
>> +				(sinfo.num_objs * s->size) / 1024);
>> +	}
>> +	mutex_unlock(&slab_mutex);
>> +}
>> +
>>   #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>>   void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>>   {
>> -- 
>> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-04 14:27     ` Michal Hocko
@ 2017-10-04 17:37       ` Yang Shi
  -1 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-04 17:37 UTC (permalink / raw)
  To: Michal Hocko
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel



On 10/4/17 7:27 AM, Michal Hocko wrote:
> On Wed 04-10-17 02:06:17, Yang Shi wrote:
>> +static bool is_dump_unreclaim_slabs(void)
>> +{
>> +	unsigned long nr_lru;
>> +
>> +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
>> +		 global_node_page_state(NR_INACTIVE_ANON) +
>> +		 global_node_page_state(NR_ACTIVE_FILE) +
>> +		 global_node_page_state(NR_INACTIVE_FILE) +
>> +		 global_node_page_state(NR_ISOLATED_ANON) +
>> +		 global_node_page_state(NR_ISOLATED_FILE) +
>> +		 global_node_page_state(NR_UNEVICTABLE);
>> +
>> +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
>> +}
> 
> I am sorry I haven't pointed this out earlier (I was following only
> halfway) but this should really be memcg aware. You are checking only global
> counters. I do not think it is an absolute must to provide per-memcg
> data but you should at least check !is_memcg_oom(oc).

OK, sure.

> 
> [...]
>> +void dump_unreclaimable_slab(void)
>> +{
>> +	struct kmem_cache *s, *s2;
>> +	struct slabinfo sinfo;
>> +
>> +	pr_info("Unreclaimable slab info:\n");
>> +	pr_info("Name                      Used          Total\n");
>> +
>> +	/*
>> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
>> +	 * sleep in oom path. But, without mutex hold, it may introduce a
>> +	 * risk of crash.
>> +	 * Use mutex_trylock to protect the list traverse, dump nothing
>> +	 * without acquiring the mutex.
>> +	 */
>> +	if (!mutex_trylock(&slab_mutex))
>> +		return;
> 
> I would move the trylock up so that we do not get an empty and confusing
> "Unreclaimable slab info:" header, and add a note that we are not dumping
> anything due to lock contention:
> 	pr_warn("excessive unreclaimable slab memory but cannot dump stats to give you more details\n");

Thanks for pointing this out. Will fix in the new version.

Yang

> 
> Other than that this looks sensible to me.
> 
>> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
>> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
>> +			continue;
>> +
>> +		memset(&sinfo, 0, sizeof(sinfo));
>> +		get_slabinfo(s, &sinfo);
>> +
>> +		if (sinfo.num_objs > 0)
>> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
>> +				(sinfo.active_objs * s->size) / 1024,
>> +				(sinfo.num_objs * s->size) / 1024);
>> +	}
>> +	mutex_unlock(&slab_mutex);
>> +}
>> +
>>   #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>>   void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>>   {
>> -- 
>> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-03 18:06   ` Yang Shi
@ 2017-10-04 14:27     ` Michal Hocko
  -1 siblings, 0 replies; 60+ messages in thread
From: Michal Hocko @ 2017-10-04 14:27 UTC (permalink / raw)
  To: Yang Shi
  Cc: cl, penberg, rientjes, iamjoonsoo.kim, akpm, linux-mm, linux-kernel

On Wed 04-10-17 02:06:17, Yang Shi wrote:
> +static bool is_dump_unreclaim_slabs(void)
> +{
> +	unsigned long nr_lru;
> +
> +	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
> +		 global_node_page_state(NR_INACTIVE_ANON) +
> +		 global_node_page_state(NR_ACTIVE_FILE) +
> +		 global_node_page_state(NR_INACTIVE_FILE) +
> +		 global_node_page_state(NR_ISOLATED_ANON) +
> +		 global_node_page_state(NR_ISOLATED_FILE) +
> +		 global_node_page_state(NR_UNEVICTABLE);
> +
> +	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
> +}

I am sorry I haven't pointed this out earlier (I was following only
halfway) but this should really be memcg aware. You are checking only global
counters. I do not think it is an absolute must to provide per-memcg
data but you should at least check !is_memcg_oom(oc).

[...]
> +void dump_unreclaimable_slab(void)
> +{
> +	struct kmem_cache *s, *s2;
> +	struct slabinfo sinfo;
> +
> +	pr_info("Unreclaimable slab info:\n");
> +	pr_info("Name                      Used          Total\n");
> +
> +	/*
> +	 * Here acquiring slab_mutex is risky since we don't prefer to get
> +	 * sleep in oom path. But, without mutex hold, it may introduce a
> +	 * risk of crash.
> +	 * Use mutex_trylock to protect the list traverse, dump nothing
> +	 * without acquiring the mutex.
> +	 */
> +	if (!mutex_trylock(&slab_mutex))
> +		return;

I would move the trylock up so that we do not get an empty and confusing
"Unreclaimable slab info:" header, and add a note that we are not dumping
anything due to lock contention:
	pr_warn("excessive unreclaimable slab memory but cannot dump stats to give you more details\n");

Other than that this looks sensible to me.

> +	list_for_each_entry_safe(s, s2, &slab_caches, list) {
> +		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
> +			continue;
> +
> +		memset(&sinfo, 0, sizeof(sinfo));
> +		get_slabinfo(s, &sinfo);
> +
> +		if (sinfo.num_objs > 0)
> +			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
> +				(sinfo.active_objs * s->size) / 1024,
> +				(sinfo.num_objs * s->size) / 1024);
> +	}
> +	mutex_unlock(&slab_mutex);
> +}
> +
>  #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
>  void *memcg_slab_start(struct seq_file *m, loff_t *pos)
>  {
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 60+ messages in thread

* [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory
  2017-10-03 18:06 [PATCH 0/3 v8] oom: capture unreclaimable slab info in oom message Yang Shi
@ 2017-10-03 18:06   ` Yang Shi
  0 siblings, 0 replies; 60+ messages in thread
From: Yang Shi @ 2017-10-03 18:06 UTC (permalink / raw)
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel

The kernel may panic when an oom happens and there is no killable
process. Sometimes this is caused by huge unreclaimable slabs used by
the kernel.

Although kdump could help debug such a problem, kdump is not available
on all architectures and it might malfunction sometimes. And, since the
kernel has already panicked, it is worth capturing such information in
dmesg to aid troubleshooting.

Print out unreclaimable slab info (used size and total size) for slabs
whose actual memory usage is not zero (num_objs * size != 0) when the
amount of unreclaimable slabs is greater than total user memory (LRU
pages).

The output looks like:

Unreclaimable slab info:
Name                      Used          Total
rpc_buffers               31KB         31KB
rpc_tasks                  7KB          7KB
ebitmap_node            1964KB       1964KB
avtab_node              5024KB       5024KB
xfs_buf                 1402KB       1402KB
xfs_ili                  134KB        134KB
xfs_efi_item             115KB        115KB
xfs_efd_item             115KB        115KB
xfs_buf_item             134KB        134KB
xfs_log_item_desc        342KB        342KB
xfs_trans               1412KB       1412KB
xfs_ifork                212KB        212KB

Signed-off-by: Yang Shi <yang.s@alibaba-inc.com>
---
 mm/oom_kill.c    | 22 ++++++++++++++++++++++
 mm/slab.h        |  2 ++
 mm/slab_common.c | 32 ++++++++++++++++++++++++++++++++
 3 files changed, 56 insertions(+)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 99736e0..6d89397 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -43,6 +43,7 @@
 
 #include <asm/tlb.h>
 #include "internal.h"
+#include "slab.h"
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/oom.h>
@@ -160,6 +161,25 @@ static bool oom_unkillable_task(struct task_struct *p,
 	return false;
 }
 
+/*
+ * Print out unreclaimable slab info when unreclaimable slabs amount is greater
+ * than all user memory (LRU pages)
+ */
+static bool is_dump_unreclaim_slabs(void)
+{
+	unsigned long nr_lru;
+
+	nr_lru = global_node_page_state(NR_ACTIVE_ANON) +
+		 global_node_page_state(NR_INACTIVE_ANON) +
+		 global_node_page_state(NR_ACTIVE_FILE) +
+		 global_node_page_state(NR_INACTIVE_FILE) +
+		 global_node_page_state(NR_ISOLATED_ANON) +
+		 global_node_page_state(NR_ISOLATED_FILE) +
+		 global_node_page_state(NR_UNEVICTABLE);
+
+	return (global_node_page_state(NR_SLAB_UNRECLAIMABLE) > nr_lru);
+}
+
 /**
  * oom_badness - heuristic function to determine which candidate task to kill
  * @p: task struct of which task we should calculate
@@ -423,6 +443,8 @@ static void dump_header(struct oom_control *oc, struct task_struct *p)
 		mem_cgroup_print_oom_info(oc->memcg, p);
 	else
 		show_mem(SHOW_MEM_FILTER_NODES, oc->nodemask);
+	if (is_dump_unreclaim_slabs())
+		dump_unreclaimable_slab();
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc->memcg, oc->nodemask);
 }
diff --git a/mm/slab.h b/mm/slab.h
index 0733628..6fc4d5d 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -505,6 +505,8 @@ static inline struct kmem_cache_node *get_node(struct kmem_cache *s, int node)
 void memcg_slab_stop(struct seq_file *m, void *p);
 int memcg_slab_show(struct seq_file *m, void *p);
 
+void dump_unreclaimable_slab(void);
+
 void ___cache_free(struct kmem_cache *cache, void *x, unsigned long addr);
 
 #ifdef CONFIG_SLAB_FREELIST_RANDOM
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 5520a22..be24324 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1270,6 +1270,38 @@ static int slab_show(struct seq_file *m, void *p)
 	return 0;
 }
 
+void dump_unreclaimable_slab(void)
+{
+	struct kmem_cache *s, *s2;
+	struct slabinfo sinfo;
+
+	pr_info("Unreclaimable slab info:\n");
+	pr_info("Name                      Used          Total\n");
+
+	/*
+	 * Acquiring slab_mutex here is risky since we do not want to
+	 * sleep in the oom path. But traversing the list without holding
+	 * the mutex risks a crash.
+	 * Use mutex_trylock to protect the list traversal, and dump nothing
+	 * if the mutex cannot be acquired.
+	 */
+	if (!mutex_trylock(&slab_mutex))
+		return;
+	list_for_each_entry_safe(s, s2, &slab_caches, list) {
+		if (!is_root_cache(s) || (s->flags & SLAB_RECLAIM_ACCOUNT))
+			continue;
+
+		memset(&sinfo, 0, sizeof(sinfo));
+		get_slabinfo(s, &sinfo);
+
+		if (sinfo.num_objs > 0)
+			pr_info("%-17s %10luKB %10luKB\n", cache_name(s),
+				(sinfo.active_objs * s->size) / 1024,
+				(sinfo.num_objs * s->size) / 1024);
+	}
+	mutex_unlock(&slab_mutex);
+}
+
 #if defined(CONFIG_MEMCG) && !defined(CONFIG_SLOB)
 void *memcg_slab_start(struct seq_file *m, loff_t *pos)
 {
-- 
1.8.3.1
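
For readers who want to check the row format and the KB arithmetic used
in dump_unreclaimable_slab() above, here is a small userspace sketch.
struct model_cache and model_format_row() are hypothetical; the
reclaimable field merely stands in for the SLAB_RECLAIM_ACCOUNT test,
and the root-cache check is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified view of one slab cache; not struct kmem_cache. */
struct model_cache {
	const char *name;
	unsigned long active_objs;	/* objects in use */
	unsigned long num_objs;		/* objects allocated in total */
	unsigned long size;		/* object size in bytes */
	bool reclaimable;		/* stands in for SLAB_RECLAIM_ACCOUNT */
};

/* Format one row into buf; returns false if the cache should be skipped. */
static bool model_format_row(const struct model_cache *c, char *buf, size_t len)
{
	/* Skip reclaimable caches and caches with no objects at all. */
	if (c->reclaimable || c->num_objs == 0)
		return false;

	/* Same format string as the patch: name, used KB, total KB. */
	snprintf(buf, len, "%-17s %10luKB %10luKB", c->name,
		 (c->active_objs * c->size) / 1024,
		 (c->num_objs * c->size) / 1024);
	return true;
}
```

Both sizes are rounded down to whole kilobytes by the integer division,
so a small cache with objects can still show up as 0KB.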

end of thread, other threads: [~2017-10-19 23:13 UTC]

Thread overview: 60+ messages
2017-10-10 17:25 [PATCH 0/3 v11] oom: capture unreclaimable slab info in oom message Yang Shi
2017-10-10 17:25 ` Yang Shi
2017-10-10 17:25 ` [PATCH 1/3] tools: slabinfo: add "-U" option to show unreclaimable slabs only Yang Shi
2017-10-10 17:25   ` Yang Shi
2017-10-10 17:25 ` [PATCH 2/3] mm: slabinfo: dump CONFIG_SLABINFO Yang Shi
2017-10-10 17:25   ` Yang Shi
2017-10-17  0:17   ` David Rientjes
2017-10-17  0:17     ` David Rientjes
2017-10-10 17:25 ` [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory Yang Shi
2017-10-10 17:25   ` Yang Shi
2017-10-17  0:15   ` David Rientjes
2017-10-17  0:15     ` David Rientjes
2017-10-17  7:44     ` Michal Hocko
2017-10-17  7:44       ` Michal Hocko
2017-10-17 20:59       ` David Rientjes
2017-10-17 20:59         ` David Rientjes
2017-10-17 21:40         ` Yang Shi
2017-10-17 21:40           ` Yang Shi
2017-10-17 21:50           ` David Rientjes
2017-10-17 21:50             ` David Rientjes
2017-10-17 22:20             ` Yang Shi
2017-10-17 22:20               ` Yang Shi
2017-10-17 22:39               ` David Rientjes
2017-10-17 22:39                 ` David Rientjes
2017-10-18 19:09                 ` Yang Shi
2017-10-18 19:09                   ` Yang Shi
2017-10-19  7:28                 ` Michal Hocko
2017-10-19  7:28                   ` Michal Hocko
2017-10-19 23:12                   ` Yang Shi
2017-10-19 23:12                     ` Yang Shi
  -- strict thread matches above, loose matches on Subject: below --
2017-10-04 21:29 [PATCH 0/3 v10] oom: capture unreclaimable slab info in oom message Yang Shi
2017-10-04 21:29 ` [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory Yang Shi
2017-10-04 21:29   ` Yang Shi
2017-10-06  9:37   ` Michal Hocko
2017-10-06  9:37     ` Michal Hocko
2017-10-06 16:37     ` Yang Shi
2017-10-06 16:37       ` Yang Shi
2017-10-09  6:33       ` Michal Hocko
2017-10-09  6:33         ` Michal Hocko
2017-10-09  6:36         ` Michal Hocko
2017-10-09  6:36           ` Michal Hocko
2017-10-09 16:44           ` Yang Shi
2017-10-09 16:44             ` Yang Shi
2017-10-09 18:53           ` Yang Shi
2017-10-09 18:53             ` Yang Shi
2017-10-09 21:00             ` Yang Shi
2017-10-09 21:00               ` Yang Shi
2017-10-07 10:10   ` kbuild test robot
2017-10-07 10:10     ` kbuild test robot
2017-10-07 13:05   ` kbuild test robot
2017-10-07 13:05     ` kbuild test robot
2017-10-03 18:06 [PATCH 0/3 v8] oom: capture unreclaimable slab info in oom message Yang Shi
2017-10-03 18:06 ` [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory Yang Shi
2017-10-03 18:06   ` Yang Shi
2017-10-04 14:27   ` Michal Hocko
2017-10-04 14:27     ` Michal Hocko
2017-10-04 17:37     ` Yang Shi
2017-10-04 17:37       ` Yang Shi
2017-10-04 18:08     ` Yang Shi
2017-10-04 18:08       ` Yang Shi
2017-10-05  7:57       ` Michal Hocko
2017-10-05  7:57         ` Michal Hocko
