Re: [PATCH v2] mm: Reduce memory bloat with THP

* Re: [PATCH v2] mm: Reduce memory bloat with THP
       [not found] <1516318444-30868-1-git-send-email-nitingupta910@gmail.com>
@ 2018-01-19 12:49 ` Michal Hocko
  2018-01-19 20:59   ` Nitin Gupta
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2018-01-19 12:49 UTC
  To: Nitin Gupta
  Cc: steven.sistare, Nitin Gupta, Andrew Morton, Ingo Molnar,
	Mel Gorman, Nadav Amit, Minchan Kim, Kirill A. Shutemov,
	Peter Zijlstra, Vegard Nossum, Levin, Alexander (Sasha Levin),
	Mike Rapoport, Hillf Danton, Shaohua Li, Anshuman Khandual,
	Andrea Arcangeli, David Rientjes, Rik van Riel, Jan Kara,
	Dave Jiang, Jérôme Glisse, Matthew Wilcox,
	Ross Zwisler, Hugh Dickins, Tobin C Harding, linux-kernel,
	linux-mm

On Thu 18-01-18 15:33:16, Nitin Gupta wrote:
> From: Nitin Gupta <nitin.m.gupta@oracle.com>
> 
> Currently, if the THP enabled policy is "always", or the mode
> is "madvise" and a region is marked as MADV_HUGEPAGE, a hugepage
> is allocated on a page fault if the pud or pmd is empty.  This
> yields the best VA translation performance, but increases memory
> consumption if some small page ranges within the huge page are
> never accessed.

Yes, this is true but hardly unexpected for MADV_HUGEPAGE or THP always
users.

> An alternate behavior for such page faults is to install a
> hugepage only when a region is actually found to be (almost)
> fully mapped and active.  This is a compromise between
> translation performance and memory consumption.  Currently there
> is no way for an application to choose this compromise for the
> page fault conditions above.

Is that really true? We have
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none.
This is not reflected during the page fault, of course, but you can control
the behavior there as well, either by the global setting or by a per-process
prctl.
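
For illustration, a minimal userspace sketch of the two knobs mentioned
above. It assumes that the per-process prctl meant here is
PR_SET_THP_DISABLE (the mail does not name it); the helper names
read_max_ptes_none() and thp_disable_for_process() are hypothetical.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Fallbacks for older headers; values as defined in <linux/prctl.h>. */
#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif

/*
 * Hypothetical helper: read the global khugepaged tunable.
 * Returns the current max_ptes_none value, or -1 if the file cannot be
 * read (for example, a kernel built without THP support).
 */
static long read_max_ptes_none(void)
{
	const char *path =
		"/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none";
	FILE *f = fopen(path, "r");
	long val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}

/*
 * Hypothetical helper: opt the calling process out of THP.
 * Assumption: this is the per-process control referred to in the mail.
 * Returns 0 on success, -1 on error (errno set by prctl()).
 */
static int thp_disable_for_process(void)
{
	return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);
}
```

A lower max_ptes_none makes khugepaged collapse a range only when fewer
of its PTEs are still unmapped, which is the "mostly mapped" condition
the patch description asks for, although it does not change what happens
at page fault time.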

> With this change, whenever an application issues MADV_DONTNEED on a
> memory region, the region is marked as "space-efficient". For such
> regions, a hugepage is not immediately allocated on first write.

Kirill didn't like it in the previous version and I do not like this
either. You are adding a very subtle side effect which might be completely
unexpected. Consider a userspace memory allocator which uses MADV_DONTNEED
to free up unused memory. Now you have basically put it out of THP
usage.

If the memory is used really sparsely then we have MADV_NOHUGEPAGE.
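
To make the distinction concrete, a minimal userspace sketch of the two
madvise() calls discussed above. The helper names (region_alloc,
region_release, region_opt_out_of_thp) and REGION_SIZE are hypothetical
and only stand in for what an allocator might do; this is not taken from
any particular allocator.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and MADV_* on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

#define REGION_SIZE (4UL << 20)	/* 4 MiB, an arbitrary size for the example */

/* Map an anonymous, private region; returns NULL on failure. */
static void *region_alloc(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * What an allocator typically does on "free": drop the backing pages but
 * keep the mapping. The caller only wants memory back and is not saying
 * anything about huge pages.
 */
static int region_release(void *p, size_t len)
{
	return madvise(p, len, MADV_DONTNEED);
}

/*
 * The explicit opt-out for sparsely used regions. This can fail with
 * EINVAL on kernels built without THP support.
 */
static int region_opt_out_of_thp(void *p, size_t len)
{
	return madvise(p, len, MADV_NOHUGEPAGE);
}
```

With the proposed change, region_release() would also have the effect of
region_opt_out_of_thp() at fault time, even though the caller asked only
for the former.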
-- 
Michal Hocko
SUSE Labs
