linux-mm.kvack.org archive mirror
From: Andrea Arcangeli <aarcange@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@techsingularity.net>,
	Anthony Yznaga <anthony.yznaga@oracle.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	aneesh.kumar@linux.ibm.com, akpm@linux-foundation.org,
	jglisse@redhat.com, khandual@linux.vnet.ibm.com,
	kirill.shutemov@linux.intel.com, mhocko@kernel.org,
	minchan@kernel.org, peterz@infradead.org, rientjes@google.com,
	vbabka@suse.cz, willy@infradead.org, ying.huang@intel.com,
	nitingupta910@gmail.com
Subject: Re: [RFC PATCH] mm: thp: implement THP reservations for anonymous memory
Date: Tue, 20 Nov 2018 12:04:07 -0500
Message-ID: <20181120170407.GM29258@redhat.com>
In-Reply-To: <20181120091122.3dxlgff3vivwilrg@kshutemo-mobl1>

On Tue, Nov 20, 2018 at 12:11:22PM +0300, Kirill A. Shutemov wrote:
> On Sat, Nov 10, 2018 at 11:44:12AM -0500, Andrea Arcangeli wrote:
> > I would prefer to add intelligence to detect when COWs after fork
> > should be done at 2m or 4k granularity (in the latter case by
> > splitting the pmd before the actual COW while leaving the transhuge
> > pmd intact in the other mm), because that would save CPU (and it'd
> > automatically optimize redis). The snapshot process especially would
> > run faster as it will read with THP performance.
> 
> I would argue we should switch to 4k COW everywhere. But it requires some

We could do that if MADV_HUGEPAGE is not set, for example. So there
would still be a way to force 2M COWs if something benefits from
them.

For example, with binaries executed from tmpfs one could want 2M COWs
on MAP_PRIVATE to keep the whole executable in 2MB TLB entries despite
the memory loss (but then there are those libs, apparently not
released, that load the binaries into THP anon for the same reason,
with an even higher risk of wasting memory: unlike tmpfs, nothing can
be shared if you run multiple copies of a large go binary or
something).

Certainly it would help whenever fork() is used for snapshotting
purposes, but then fork() used for snapshotting purposes doesn't look
like the best possible mechanism for atomic snapshots.
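
For readers unfamiliar with the pattern, fork()-for-snapshot can be
sketched as below. The function name value_seen_by_snapshot() is
hypothetical, and the sketch only demonstrates the COW semantics, not
the 2M versus 4k granularity under discussion: the child keeps seeing
the memory as it was at fork() time while the parent keeps writing.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that holds a copy-on-write snapshot of buf[0], then let
 * the parent overwrite buf[0]. Returns the value the child observed
 * (0..254), or -1 on error. 255 is reserved for a child-side failure.
 */
static int value_seen_by_snapshot(volatile unsigned char *buf,
				  unsigned char new_value)
{
	int go[2];
	int status;
	ssize_t n;
	pid_t pid;

	assert(buf != NULL);
	if (pipe(go) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(go[0]);
		close(go[1]);
		return -1;
	}
	if (pid == 0) {
		char c;

		close(go[1]);
		/* Wait until the parent has modified its own copy. */
		if (read(go[0], &c, 1) != 1)
			_exit(255);
		_exit(buf[0]);	/* report the snapshot value */
	}

	close(go[0]);
	buf[0] = new_value;	/* this write faults and triggers the COW */
	n = write(go[1], "x", 1);
	close(go[1]);

	if (waitpid(pid, &status, 0) < 0 || n != 1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With buf[0] set to 7 before the call and new_value set to 42, the
child should still report 7 while the parent ends up with 42. Every
such parent write after fork() is where the COW granularity matters.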

It would be interesting to know which other common workloads would
benefit, i.e. workloads that, unlike fork()-for-snapshot, are already
as optimal as they can get.

> work on khugepaged side to be able to recover THP back after multiple 4k
> COW in the range. Currently khugepaged is not able to collapse PTE entries
> backed by compound page back to PMD.

Yes, this also answers Anthony's question about what shall happen
to the 4k COWs on the doublemap afterwards.

The thing is, by the time khugepaged comes around, the child will
hopefully already have quit. So it would be ideal if khugepaged, after
taking the mmap_sem for writing, could detect that the anon page isn't
shared anymore and is fully private to the process. If it's not shared
anymore and the mapcount is 1, khugepaged doesn't need to do the 2M COW
of the doublemap THP at all: it just needs to flush the 4k fragment
back to the THP, drop the doublemap and convert the readonly pte
entries to a writable pmd_trans_huge (if VM_WRITE is still set).
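
The decision described above can be modelled in a few lines. This is
a userspace sketch only: struct thp_state, enum collapse_action and
choose_collapse() are hypothetical names, not kernel types or
functions, and the read-only outcome for a vma without VM_WRITE is an
assumption about what the non-writable case would look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a doublemapped THP (not a kernel type). */
struct thp_state {
	int mapcount;	/* how many mappings still reference the huge page */
	bool vm_write;	/* whether VM_WRITE is still set on the vma */
};

enum collapse_action {
	COLLAPSE_NEEDS_2M_COPY,	 /* still shared: a full 2M COW would be needed */
	COLLAPSE_MERGE_READONLY, /* private: merge fragment, pmd stays read-only */
	COLLAPSE_MERGE_WRITABLE, /* private: merge fragment, writable huge pmd */
};

/*
 * Model of the check khugepaged would make while holding mmap_sem for
 * writing: a page with mapcount 1 is private, so no 2M copy is needed.
 */
static enum collapse_action choose_collapse(const struct thp_state *s)
{
	assert(s != NULL && s->mapcount >= 1);

	if (s->mapcount > 1)
		return COLLAPSE_NEEDS_2M_COPY;

	return s->vm_write ? COLLAPSE_MERGE_WRITABLE
			   : COLLAPSE_MERGE_READONLY;
}
```

The point of the model is that the cheap path depends only on state
khugepaged can read once the child has exited, so no data beyond the
4k fragment would have to be copied.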

> I have this on my todo list for long time, but...

We're also slowly making progress on uffd-wp, which should hopefully
offer a far more efficient way to do the snapshot than fork(); then
the whole fork thing won't be an issue because there will be no fork.
