linux-kernel.vger.kernel.org archive mirror
* [RFC 0/3] THP Shrinker
@ 2022-08-25 21:30 alexlzhu
From: alexlzhu @ 2022-08-25 21:30 UTC
  To: linux-mm
  Cc: willy, hannes, akpm, riel, kernel-team, linux-kernel, Alexander Zhu

From: Alexander Zhu <alexlzhu@fb.com>

Transparent Hugepages (THPs) use a larger page size of 2MB, compared to
normal pages that are 4KB. A larger page size results in fewer TLB misses
and thus more efficient use of the CPU. Using a larger page size also
results in more memory waste, which can hurt performance in some use
cases. THPs are currently enabled in the Linux kernel by applications in
limited virtual address ranges via the madvise system call. The THP
shrinker tries to find a balance between increased use of THPs and
increased use of memory. It reduces memory usage by splitting the
underutilized THPs that are identified by the thp_utilization scanner and
freeing their zero-filled subpages.

In our experiments we have noticed that the least utilized THPs are almost
entirely unutilized.

Sample Output: 

Utilized[0-50]: 1331 680884
Utilized[51-101]: 9 3983
Utilized[102-152]: 3 1187
Utilized[153-203]: 0 0
Utilized[204-255]: 2 539
Utilized[256-306]: 5 1135
Utilized[307-357]: 1 192
Utilized[358-408]: 0 0
Utilized[409-459]: 1 57
Utilized[460-512]: 400 13
Last Scan Time: 223.98
Last Scan Duration: 70.65

Above is a sample obtained from one of our test machines with THP always
enabled. Each bucket line shows the number of THPs in that bucket followed
by the total number of zero-filled 4KB subpages they contain. The 1331
THPs in this thp_utilization sample that have 0-50 utilized subpages
contain 680884 zero pages. This comes out to 680884 / (512 * 1331) =
99.91% zero pages in the least utilized bucket, which represents
680884 * 4KB, or about 2.6GB, of memory waste.
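
The arithmetic above can be checked with a short user-space sketch. The
function names are hypothetical and only the numbers come from the sample
output. The waste is printed in both binary and decimal units because the
exact figure depends on which convention is used for "GB".

```python
# Values taken from the Utilized[0-50] line of the sample output.
THP_COUNT = 1331
ZERO_PAGES = 680884

SUBPAGES_PER_THP = 512      # 2MB THP / 4KB subpage
SUBPAGE_BYTES = 4 * 1024    # 4KB


def zero_fraction(thp_count, zero_pages):
    """Fraction of all subpages in the bucket that are zero-filled."""
    return zero_pages / (thp_count * SUBPAGES_PER_THP)


def wasted_bytes(zero_pages):
    """Memory held by zero-filled subpages, in bytes."""
    return zero_pages * SUBPAGE_BYTES


frac = zero_fraction(THP_COUNT, ZERO_PAGES)
waste = wasted_bytes(ZERO_PAGES)

print(f"zero pages in bucket: {frac:.2%}")
print(f"waste: {waste / 2**30:.2f} GiB ({waste / 10**9:.2f} GB decimal)")
```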

Also note that the vast majority of THPs are in either the least utilized
[0-50] or the most utilized [460-512] bucket. The least utilized THPs are
responsible for almost all of the memory waste when THP is always
enabled. Thus, by clearing out THPs in the lowest utilization bucket, we
recover most of the wasted memory while keeping most of the improvement in
CPU efficiency. We have seen similar results on our production hosts.
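
To make the distribution concrete, the following sketch reproduces the
bucket ranges printed in the sample output and computes how much of the
waste sits in the lowest bucket. The boundary formula is an assumption
chosen because it matches the printed ranges; it is not taken from the
patch, and all names here are hypothetical.

```python
HPAGE_PMD_NR = 512   # 4KB subpages per 2MB THP
BUCKET_NR = 10       # number of buckets in the sample output

# (number of THPs, zero-filled subpages) per bucket, from the sample output.
SAMPLE = [
    (1331, 680884), (9, 3983), (3, 1187), (0, 0), (2, 539),
    (5, 1135), (1, 192), (0, 0), (1, 57), (400, 13),
]


def bucket_bounds(i):
    """Inclusive [start, end] range of utilized subpages for bucket i."""
    start = i * HPAGE_PMD_NR // BUCKET_NR
    if i == BUCKET_NR - 1:
        end = HPAGE_PMD_NR  # last bucket also covers fully utilized THPs
    else:
        end = (i + 1) * HPAGE_PMD_NR // BUCKET_NR - 1
    return start, end


def bucket_of(utilized):
    """Index of the bucket whose range contains the utilized count."""
    for i in range(BUCKET_NR):
        start, end = bucket_bounds(i)
        if start <= utilized <= end:
            return i
    raise ValueError(f"utilized subpage count out of range: {utilized}")


labels = ["Utilized[%d-%d]" % bucket_bounds(i) for i in range(BUCKET_NR)]

total_thps = sum(thps for thps, _ in SAMPLE)
total_zero = sum(zero for _, zero in SAMPLE)
extreme_thps = SAMPLE[0][0] + SAMPLE[-1][0]
lowest_zero_share = SAMPLE[0][1] / total_zero

print(labels)
print(f"THPs in lowest or highest bucket: {extreme_thps} of {total_thps}")
print(f"share of zero pages in lowest bucket: {lowest_zero_share:.1%}")
```

In this sample, the two extreme buckets hold 1731 of the 1752 THPs, and
the lowest bucket alone accounts for roughly 99% of all zero pages.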

This patchset introduces the THP shrinker we have developed to identify
and split the least utilized THPs. It includes the thp_utilization
changes that group anonymous THPs into buckets, the split_huge_page()
changes that identify and zap zero-filled 4KB pages within THPs, and the
shrinker changes. Note that the split_huge_page() changes are based on
previous work done by Yu Zhao.
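
As a rough user-space model of what "identifying zero 4KB pages within a
THP" means, the sketch below scans a 2MB buffer in 4KB steps and reports
which subpages contain only zero bytes. This is an illustration only: the
kernel works on struct page and page tables rather than byte buffers, and
none of the names below come from the patch.

```python
PAGE_SIZE = 4096     # 4KB subpage
HPAGE_PMD_NR = 512   # subpages per 2MB THP
THP_SIZE = PAGE_SIZE * HPAGE_PMD_NR


def zero_subpages(thp):
    """Return the indices of the 4KB subpages that are entirely zero."""
    if len(thp) != THP_SIZE:
        raise ValueError("expected a 2MB buffer")
    zero_page = bytes(PAGE_SIZE)
    view = memoryview(thp)  # slice without copying each subpage
    return [
        i for i in range(HPAGE_PMD_NR)
        if view[i * PAGE_SIZE:(i + 1) * PAGE_SIZE] == zero_page
    ]


# Model a THP in which only subpages 0 and 7 were ever written.
buf = bytearray(THP_SIZE)
buf[0] = 1
buf[7 * PAGE_SIZE + 100] = 42

zeros = zero_subpages(buf)
utilized = HPAGE_PMD_NR - len(zeros)
print(f"{utilized} utilized subpages, {len(zeros)} zero subpages")
```

A THP like this one would fall into the lowest utilization bucket, and
splitting it would leave 510 zero-filled subpages that could be freed.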

In the future, we intend to allow additional tuning of the shrinker
per workload, depending on CPU/IO/memory pressure and the amount of
anonymous memory. The long-term goal is to eventually always enable THP
for all applications and deprecate madvise entirely.

Alexander Zhu (3):
  mm: add thp_utilization metrics to debugfs
  mm: changes to split_huge_page() to free zero filled tail pages
  mm: THP low utilization shrinker

 Documentation/admin-guide/mm/transhuge.rst    |   9 +
 include/linux/huge_mm.h                       |   9 +
 include/linux/list_lru.h                      |  24 ++
 include/linux/mm_types.h                      |   5 +
 include/linux/rmap.h                          |   2 +-
 include/linux/vm_event_item.h                 |   2 +
 mm/huge_memory.c                              | 333 +++++++++++++++++-
 mm/list_lru.c                                 |  49 +++
 mm/migrate.c                                  |  60 +++-
 mm/migrate_device.c                           |   4 +-
 mm/page_alloc.c                               |   6 +
 mm/vmstat.c                                   |   2 +
 .../selftests/vm/split_huge_page_test.c       |  58 ++-
 tools/testing/selftests/vm/vm_util.c          |  23 ++
 tools/testing/selftests/vm/vm_util.h          |   1 +
 15 files changed, 569 insertions(+), 18 deletions(-)

-- 
2.30.2


Thread overview: 15+ messages
2022-08-25 21:30 [RFC 0/3] THP Shrinker alexlzhu
2022-08-25 21:30 ` [RFC 1/3] mm: add thp_utilization metrics to debugfs alexlzhu
2022-08-27  0:11   ` Zi Yan
2022-08-29 20:19     ` Alex Zhu (Kernel)
2022-08-25 21:30 ` [RFC 2/3] mm: changes to split_huge_page() to free zero filled tail pages alexlzhu
2022-08-26 10:18   ` David Hildenbrand
2022-08-26 18:34     ` Alex Zhu (Kernel)
2022-08-26 21:18     ` Rik van Riel
2022-08-29 10:02       ` David Hildenbrand
2022-08-29 13:17         ` Rik van Riel
2022-08-30 12:33           ` David Hildenbrand
2022-08-30 21:54             ` Alex Zhu (Kernel)
2022-08-25 21:30 ` [RFC 3/3] mm: THP low utilization shrinker alexlzhu
2022-08-27  0:25   ` Zi Yan
2022-08-29 20:49     ` Alex Zhu (Kernel)
