* [PATCH 0/3 v10] oom: capture unreclaimable slab info in oom message
@ 2017-10-04 21:29 ` Yang Shi
  0 siblings, 0 replies; 36+ messages in thread
From: Yang Shi @ 2017-10-04 21:29 UTC
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel


Recently we ran into an OOM issue: the kernel panicked because there was no killable process.
The dmesg shows that unreclaimable slabs consumed almost 100% of memory, but kdump failed to capture a vmcore for some reason.

So it would help to capture unreclaimable slab info in the OOM message when the kernel panics, to aid troubleshooting and cover this corner case.
Since the kernel has already panicked, capturing more information is worthwhile and doesn't affect the normal OOM killer.

With the patchset, tools/vm/slabinfo has a new option, "-U", to show only unreclaimable slabs.

In addition, the OOM killer message will list every unreclaimable slab with a non-zero size (num_objs * size != 0).
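
The filter described above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `struct cache_stat`, `CACHE_RECLAIMABLE`, `should_report()` and `report_unreclaimable()` are hypothetical names, not the kernel's data structures, and the output layout is made up for the example. The patches themselves carry the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one slab cache's statistics (not a kernel type). */
struct cache_stat {
	const char *name;
	unsigned int flags;        /* CACHE_RECLAIMABLE if the cache is reclaimable */
	unsigned long num_objs;    /* total objects allocated in the cache */
	unsigned long active_objs; /* objects currently in use */
	unsigned long size;        /* size of one object, in bytes */
};

#define CACHE_RECLAIMABLE 0x1u /* assumed flag bit, for illustration only */

/* Report a cache only if it is unreclaimable and num_objs * size != 0. */
static bool should_report(const struct cache_stat *c)
{
	assert(c != NULL);

	if (c->flags & CACHE_RECLAIMABLE)
		return false;
	return c->num_objs != 0 && c->size != 0;
}

/* Print used and total size (in KB) of each reported cache; return the count. */
static size_t report_unreclaimable(const struct cache_stat *caches, size_t n)
{
	size_t reported = 0;

	for (size_t i = 0; i < n; i++) {
		const struct cache_stat *c = &caches[i];

		if (!should_report(c))
			continue;
		printf("%-17s %10luKB %10luKB\n", c->name,
		       c->active_objs * c->size / 1024,
		       c->num_objs * c->size / 1024);
		reported++;
	}
	return reported;
}
```

Printing both the used and the total size per cache mirrors the v2 -> v3 changelog entry; empty caches are skipped so the OOM report stays short.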

For details, please see the commit log for each commit.

Changelog v9 -> v10:
* Adopted Michal’s suggestion to dump unreclaimable slab stats only when !is_memcg_oom
* Adopted Michal’s suggestion to print a warning when the unreclaimable slab dump can’t acquire the mutex
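
The control flow these two changes describe (skip the dump for a memcg OOM, and warn instead of blocking when the lock is busy) can be sketched in userspace C. This is a hedged illustration, not the patch: an `atomic_flag` stands in for the kernel mutex and `mutex_trylock()`, and `try_dump()`, `slab_list_trylock()`, `slab_list_unlock()` and `count_dump()` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Outcome of one dump attempt. */
enum dump_result { DUMP_DONE, DUMP_SKIPPED_MEMCG, DUMP_SKIPPED_BUSY };

/*
 * Userspace stand-in for the slab list lock. An atomic_flag gives a
 * non-blocking "try lock"; the kernel patch uses mutex_trylock() instead.
 */
static atomic_flag slab_list_lock = ATOMIC_FLAG_INIT;

static bool slab_list_trylock(void)
{
	/* test_and_set returns the previous value: false means we got the lock. */
	return !atomic_flag_test_and_set(&slab_list_lock);
}

static void slab_list_unlock(void)
{
	atomic_flag_clear(&slab_list_lock);
}

/* Example callback: only counts how often a dump actually ran. */
static int dump_calls;

static void count_dump(void)
{
	dump_calls++;
}

/*
 * Attempt the dump without ever waiting for the lock. dump_fn is a
 * hypothetical callback that would walk the cache list and print stats.
 */
static enum dump_result try_dump(bool is_memcg_oom, void (*dump_fn)(void))
{
	assert(dump_fn != NULL);

	if (is_memcg_oom)
		return DUMP_SKIPPED_MEMCG;

	if (!slab_list_trylock()) {
		fprintf(stderr, "warning: slab list busy, skipping unreclaimable slab dump\n");
		return DUMP_SKIPPED_BUSY;
	}

	dump_fn();
	slab_list_unlock();
	return DUMP_DONE;
}
```

Calling `try_dump(false, count_dump)` while another holder owns the lock returns `DUMP_SKIPPED_BUSY` right away. Losing the report in that case is the price for never waiting on the lock in the OOM path.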

Changelog v8 -> v9:
* Adopted Tetsuo’s suggestion to protect the global slab list traversal with mutex_trylock() so that the dump never sleeps. If the mutex cannot be acquired, unreclaimable slabs are not dumped.
* Adopted Christoph’s suggestion to drop CONFIG_SLABINFO since it is pointless to keep it.
* Rebased to 4.13-rc3

Changelog v7 -> v8:
* Adopted Michal’s suggestion to dump unreclaimable slab info whenever the amount of unreclaimable slabs exceeds total user memory, not only in the OOM panic path.
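
The trigger condition introduced in v8 amounts to a single comparison, sketched below in userspace C. Which counters make up "user memory" is an assumption of this sketch (anonymous, file-backed and unevictable pages); `struct mem_snapshot` and `should_dump_unreclaimable()` are hypothetical names, and patch 3/3 defines the accounting actually used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of memory counters, all in units of pages. */
struct mem_snapshot {
	unsigned long slab_unreclaimable; /* pages held by unreclaimable slabs */
	unsigned long anon;               /* anonymous user pages */
	unsigned long file;               /* file-backed (page cache) pages */
	unsigned long unevictable;        /* pages that cannot be evicted */
};

/* Dump only when unreclaimable slabs outweigh all user memory combined. */
static bool should_dump_unreclaimable(const struct mem_snapshot *m)
{
	assert(m != NULL);

	unsigned long user_pages = m->anon + m->file + m->unevictable;

	return m->slab_unreclaimable > user_pages;
}
```

With this rule an ordinary OOM dominated by user memory produces no slab report, while a runaway unreclaimable slab does.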

Changelog v6 -> v7:
* Added the unreclaim_slabs_oom_ratio proc knob: unreclaimable slab info is dumped when the ratio of unreclaimable slabs to all user memory exceeds the knob’s value

Changelog v5 -> v6:
* Fixed a checkpatch.pl warning for patch #2

Changelog v4 -> v5:
* Addressed the comments from David
* Build-tested with SLABINFO = n

Changelog v3 -> v4:
* Addressed the comments from David
* Added David’s Acked-by in patch 1

Changelog v2 -> v3:
* Show used size and total size of each kmem cache per David’s comment

Changelog v1 -> v2:
* Removed the original patch 1 (“mm: slab: output reclaimable flag in /proc/slabinfo”) since Christoph suggested it might break compatibility and /proc/slabinfo is legacy
* Added Christoph’s Acked-by
* Removed acquiring slab_mutex per Tetsuo’s comment


Yang Shi (3):
      tools: slabinfo: add "-U" option to show unreclaimable slabs only
      mm: slabinfo: dump CONFIG_SLABINFO
      mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory

 init/Kconfig        |  6 ------
 mm/memcontrol.c     |  2 --
 mm/oom_kill.c       | 27 +++++++++++++++++++++++++--
 mm/slab.c           |  2 --
 mm/slab.h           |  2 ++
 mm/slab_common.c    | 38 +++++++++++++++++++++++++++++++++++---
 mm/slub.c           |  2 --
 tools/vm/slabinfo.c | 11 ++++++++++-
 8 files changed, 72 insertions(+), 18 deletions(-)

* [PATCH 0/3 v11] oom: capture unreclaimable slab info in oom message
@ 2017-10-10 17:25 Yang Shi
  2017-10-10 17:25   ` Yang Shi
  0 siblings, 1 reply; 36+ messages in thread
From: Yang Shi @ 2017-10-10 17:25 UTC
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel


Recently we ran into an OOM issue: the kernel panicked because there was no killable process.
The dmesg shows that unreclaimable slabs consumed almost 100% of memory, but kdump failed to capture a vmcore for some reason.

So it would help to capture unreclaimable slab info in the OOM message when the kernel panics, to aid troubleshooting and cover this corner case.
Since the kernel has already panicked, capturing more information is worthwhile and doesn't affect the normal OOM killer.

With the patchset, tools/vm/slabinfo has a new option, "-U", to show only unreclaimable slabs.

In addition, the OOM killer message will list every unreclaimable slab with a non-zero size (num_objs * size != 0).

For details, please see the commit log for each commit.

Changelog v10 -> v11:
* Fixed the compile failure reported by the 0-DAY test. Andrew, please replace all of the patches.
* Adopted Michal’s suggestion to remove memset()
* Added Acked-by from Michal

Changelog v9 -> v10:
* Adopted Michal’s suggestion to dump unreclaimable slab stats only when !is_memcg_oom
* Adopted Michal’s suggestion to print a warning when the unreclaimable slab dump can’t acquire the mutex

Changelog v8 -> v9:
* Adopted Tetsuo’s suggestion to protect the global slab list traversal with mutex_trylock() so that the dump never sleeps. If the mutex cannot be acquired, unreclaimable slabs are not dumped.
* Adopted Christoph’s suggestion to drop CONFIG_SLABINFO since it is pointless to keep it.
* Rebased to 4.13-rc3

Changelog v7 -> v8:
* Adopted Michal’s suggestion to dump unreclaimable slab info whenever the amount of unreclaimable slabs exceeds total user memory, not only in the OOM panic path.

Changelog v6 -> v7:
* Added the unreclaim_slabs_oom_ratio proc knob: unreclaimable slab info is dumped when the ratio of unreclaimable slabs to all user memory exceeds the knob’s value

Changelog v5 -> v6:
* Fixed a checkpatch.pl warning for patch #2

Changelog v4 -> v5:
* Addressed the comments from David
* Build-tested with SLABINFO = n

Changelog v3 -> v4:
* Addressed the comments from David
* Added David’s Acked-by in patch 1

Changelog v2 -> v3:
* Show used size and total size of each kmem cache per David’s comment

Changelog v1 -> v2:
* Removed the original patch 1 (“mm: slab: output reclaimable flag in /proc/slabinfo”) since Christoph suggested it might break compatibility and /proc/slabinfo is legacy
* Added Christoph’s Acked-by
* Removed acquiring slab_mutex per Tetsuo’s comment


Yang Shi (3):
      tools: slabinfo: add "-U" option to show unreclaimable slabs only
      mm: slabinfo: dump CONFIG_SLABINFO
      mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory

 init/Kconfig        |  6 ------
 mm/memcontrol.c     |  2 +-
 mm/oom_kill.c       | 27 +++++++++++++++++++++++++--
 mm/slab.c           |  2 --
 mm/slab.h           |  8 ++++++++
 mm/slab_common.c    | 41 +++++++++++++++++++++++++++++++++++++----
 mm/slub.c           |  4 ++--
 tools/vm/slabinfo.c | 11 ++++++++++-
 8 files changed, 83 insertions(+), 18 deletions(-)

* [PATCH 0/3 v8] oom: capture unreclaimable slab info in oom message
@ 2017-10-03 18:06 Yang Shi
  2017-10-03 18:06   ` Yang Shi
  0 siblings, 1 reply; 36+ messages in thread
From: Yang Shi @ 2017-10-03 18:06 UTC
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel


Recently we ran into an OOM issue: the kernel panicked because there was no killable process.
The dmesg shows that unreclaimable slabs consumed almost 100% of memory, but kdump failed to capture a vmcore for some reason.

So it would help to capture unreclaimable slab info in the OOM message when the kernel panics, to aid troubleshooting and cover this corner case.
Since the kernel has already panicked, capturing more information is worthwhile and doesn't affect the normal OOM killer.

With the patchset, tools/vm/slabinfo has a new option, "-U", to show only unreclaimable slabs.

In addition, the OOM killer message will list every unreclaimable slab with a non-zero size (num_objs * size != 0).

For details, please see the commit log for each commit.

Changelog v8 -> v9:
* Adopted Tetsuo’s suggestion to protect the global slab list traversal with mutex_trylock() so that the dump never sleeps. If the mutex cannot be acquired, unreclaimable slabs are not dumped.
* Adopted Christoph’s suggestion to drop CONFIG_SLABINFO since it is pointless to keep it.
* Rebased to 4.13-rc3

Changelog v7 -> v8:
* Adopted Michal’s suggestion to dump unreclaimable slab info whenever the amount of unreclaimable slabs exceeds total user memory, not only in the OOM panic path.

Changelog v6 -> v7:
* Added the unreclaim_slabs_oom_ratio proc knob: unreclaimable slab info is dumped when the ratio of unreclaimable slabs to all user memory exceeds the knob’s value

Changelog v5 -> v6:
* Fixed a checkpatch.pl warning for patch #2

Changelog v4 -> v5:
* Addressed the comments from David
* Build-tested with SLABINFO = n

Changelog v3 -> v4:
* Addressed the comments from David
* Added David’s Acked-by in patch 1

Changelog v2 -> v3:
* Show used size and total size of each kmem cache per David’s comment

Changelog v1 -> v2:
* Removed the original patch 1 (“mm: slab: output reclaimable flag in /proc/slabinfo”) since Christoph suggested it might break compatibility and /proc/slabinfo is legacy
* Added Christoph’s Acked-by
* Removed acquiring slab_mutex per Tetsuo’s comment


Yang Shi (3):
      tools: slabinfo: add "-U" option to show unreclaimable slabs only
      mm: slabinfo: dump CONFIG_SLABINFO
      mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory

 init/Kconfig        |  6 ------
 mm/memcontrol.c     |  2 --
 mm/oom_kill.c       | 22 ++++++++++++++++++++++
 mm/slab.c           |  2 --
 mm/slab.h           |  2 ++
 mm/slab_common.c    | 35 ++++++++++++++++++++++++++++++++---
 mm/slub.c           |  2 --
 tools/vm/slabinfo.c | 11 ++++++++++-
 8 files changed, 66 insertions(+), 16 deletions(-)

* [PATCH 0/3 v7] oom: capture unreclaimable slab info in oom message when kernel panic
@ 2017-09-27  0:53 Yang Shi
  2017-09-27  0:53   ` Yang Shi
  0 siblings, 1 reply; 36+ messages in thread
From: Yang Shi @ 2017-09-27  0:53 UTC
  To: cl, penberg, rientjes, iamjoonsoo.kim, akpm, mhocko
  Cc: Yang Shi, linux-mm, linux-kernel


Recently we ran into an OOM issue: the kernel panicked because there was no killable process.
The dmesg shows that unreclaimable slabs consumed almost 100% of memory, but kdump failed to capture a vmcore for some reason.

So it would help to capture unreclaimable slab info in the OOM message when the kernel panics, to aid troubleshooting and cover this corner case.
Since the kernel has already panicked, capturing more information is worthwhile and doesn't affect the normal OOM killer.

With the patchset, tools/vm/slabinfo has a new option, "-U", to show only unreclaimable slabs.

In addition, the OOM killer message will list every unreclaimable slab with a non-zero size (num_objs * size != 0).

For details, please see the commit log for each commit.

Changelog v6 -> v7:
* Added the unreclaim_slabs_oom_ratio proc knob: unreclaimable slab info is dumped when the ratio of unreclaimable slabs to all user memory exceeds the knob’s value
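
A ratio check of this kind can be sketched in userspace C as below. The unit of the knob is not stated in this cover letter, so the sketch assumes a percentage; `exceeds_ratio()` and its parameters are hypothetical names, and Documentation/sysctl/vm.txt in patch 3/3 is the authoritative description.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Return true when slab_pages / user_pages > ratio_percent / 100.
 * Cross-multiplying avoids a division, so user_pages == 0 is handled
 * without a special case. The percentage unit is an assumption.
 */
static bool exceeds_ratio(unsigned long slab_pages, unsigned long user_pages,
			  unsigned int ratio_percent)
{
	/* Guard against overflow in the two products below. */
	assert(slab_pages <= ULLONG_MAX / 100);
	assert(ratio_percent == 0 || user_pages <= ULLONG_MAX / ratio_percent);

	return (unsigned long long)slab_pages * 100 >
	       (unsigned long long)user_pages * ratio_percent;
}
```

For example, `exceeds_ratio(600, 1000, 50)` is true because 60% is above the assumed 50% threshold, while `exceeds_ratio(500, 1000, 50)` is false because the comparison is strict.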

Changelog v5 -> v6:
* Fixed a checkpatch.pl warning for patch #2

Changelog v4 -> v5:
* Addressed the comments from David
* Build-tested with SLABINFO = n

Changelog v3 -> v4:
* Addressed the comments from David
* Added David’s Acked-by in patch 1

Changelog v2 -> v3:
* Show used size and total size of each kmem cache per David’s comment

Changelog v1 -> v2:
* Removed the original patch 1 (“mm: slab: output reclaimable flag in /proc/slabinfo”) since Christoph suggested it might break compatibility and /proc/slabinfo is legacy
* Added Christoph’s Acked-by
* Removed acquiring slab_mutex per Tetsuo’s comment


Yang Shi (3):
      tools: slabinfo: add "-U" option to show unreclaimable slabs only
      mm: oom: show unreclaimable slab info when kernel panic
      doc: add description for unreclaim_slabs_oom_ratio

 Documentation/sysctl/vm.txt | 12 ++++++++++++
 include/linux/oom.h         |  1 +
 include/uapi/linux/sysctl.h |  1 +
 kernel/sysctl.c             |  9 +++++++++
 kernel/sysctl_binary.c      |  1 +
 mm/oom_kill.c               | 31 +++++++++++++++++++++++++++++++
 mm/slab.h                   |  8 ++++++++
 mm/slab_common.c            | 29 +++++++++++++++++++++++++++++
 tools/vm/slabinfo.c         | 11 ++++++++++-
 9 files changed, 102 insertions(+), 1 deletion(-)



Thread overview: 36+ messages
2017-10-04 21:29 [PATCH 0/3 v10] oom: capture unreclaimable slab info in oom message Yang Shi
2017-10-04 21:29 ` Yang Shi
2017-10-04 21:29 ` [PATCH 1/3] tools: slabinfo: add "-U" option to show unreclaimable slabs only Yang Shi
2017-10-04 21:29   ` Yang Shi
2017-10-04 21:29 ` [PATCH 2/3] mm: slabinfo: dump CONFIG_SLABINFO Yang Shi
2017-10-04 21:29   ` Yang Shi
2017-10-07 11:45   ` kbuild test robot
2017-10-07 11:45     ` kbuild test robot
2017-10-07 12:06   ` kbuild test robot
2017-10-07 12:06     ` kbuild test robot
2017-10-04 21:29 ` [PATCH 3/3] mm: oom: show unreclaimable slab info when unreclaimable slabs > user memory Yang Shi
2017-10-04 21:29   ` Yang Shi
2017-10-06  9:37   ` Michal Hocko
2017-10-06  9:37     ` Michal Hocko
2017-10-06 16:37     ` Yang Shi
2017-10-06 16:37       ` Yang Shi
2017-10-09  6:33       ` Michal Hocko
2017-10-09  6:33         ` Michal Hocko
2017-10-09  6:36         ` Michal Hocko
2017-10-09  6:36           ` Michal Hocko
2017-10-09 16:44           ` Yang Shi
2017-10-09 16:44             ` Yang Shi
2017-10-09 18:53           ` Yang Shi
2017-10-09 18:53             ` Yang Shi
2017-10-09 21:00             ` Yang Shi
2017-10-09 21:00               ` Yang Shi
2017-10-07 10:10   ` kbuild test robot
2017-10-07 10:10     ` kbuild test robot
2017-10-07 13:05   ` kbuild test robot
2017-10-07 13:05     ` kbuild test robot
  -- strict thread matches above, loose matches on Subject: below --
2017-10-10 17:25 [PATCH 0/3 v11] oom: capture unreclaimable slab info in oom message Yang Shi
2017-10-10 17:25 ` [PATCH 1/3] tools: slabinfo: add "-U" option to show unreclaimable slabs only Yang Shi
2017-10-10 17:25   ` Yang Shi
2017-10-03 18:06 [PATCH 0/3 v8] oom: capture unreclaimable slab info in oom message Yang Shi
2017-10-03 18:06 ` [PATCH 1/3] tools: slabinfo: add "-U" option to show unreclaimable slabs only Yang Shi
2017-10-03 18:06   ` Yang Shi
2017-09-27  0:53 [PATCH 0/3 v7] oom: capture unreclaimable slab info in oom message when kernel panic Yang Shi
2017-09-27  0:53 ` [PATCH 1/3] tools: slabinfo: add "-U" option to show unreclaimable slabs only Yang Shi
2017-09-27  0:53   ` Yang Shi
