From: Andrea Arcangeli <aarcange@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Pavel Emelyanov <xemul@virtuozzo.com>,
	linux-mm <linux-mm@kvack.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH] mm: introduce MADV_CLR_HUGEPAGE
Date: Tue, 30 May 2017 17:43:26 +0200
Message-ID: <20170530154326.GB8412@redhat.com>
In-Reply-To: <20170530143941.GK7969@dhcp22.suse.cz>

On Tue, May 30, 2017 at 04:39:41PM +0200, Michal Hocko wrote:
> The sysctl for the mapcount can be increased, right? I also assume that
> those vmas will get merged after the post copy is done.

Assuming you enlarge the sysctl to the worst possible case, with a 64bit
address space you can have billions of VMAs if you're migrating 4T of
RAM and you're unlucky and the address space gets fragmented. The
unswappable kernel memory overhead would be relatively large
(i.e. a dozen gigabytes of RAM in the vm_area_struct slab), and each
find_vma operation would need to walk ~40 steps across that large vma
rbtree. There's a reason the sysctl exists. Not to mention that all
those unnecessary vma mangling operations would have to run under the
mmap_sem held for writing.
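
To make the scale concrete, here is a rough back-of-the-envelope sketch
in C. It is not kernel code. It assumes 4 KiB pages, reads "4T" as
4 TiB, and takes the pathological case of one VMA per page; the helper
names are made up for illustration. It uses the standard red-black tree
bounds: a walk visits at least ceil(log2(n+1)) and at most
2*floor(log2(n+1)) nodes for n entries.

```c
#include <stdint.h>

/* Worst case assumed here: every page ends up in its own VMA. */
uint64_t worst_case_vmas(uint64_t ram_bytes, uint64_t page_size)
{
    return ram_bytes / page_size;
}

/* floor(log2(x)) for x >= 1, computed with integer shifts only. */
static unsigned floor_log2(uint64_t x)
{
    unsigned r = 0;
    while (x >>= 1)
        r++;
    return r;
}

/* Fewest nodes on the longest root-to-leaf path of any binary tree
 * holding n nodes: ceil(log2(n + 1)). */
unsigned rb_min_height(uint64_t n)
{
    uint64_t m = n + 1;
    unsigned f = floor_log2(m);
    return (m & (m - 1)) ? f + 1 : f;   /* round up unless m is a power of two */
}

/* Red-black tree guarantee: height <= 2 * black height, and the black
 * height is an integer no larger than log2(n + 1). */
unsigned rb_max_height(uint64_t n)
{
    return 2 * floor_log2(n + 1);
}
```

Under those assumptions, worst_case_vmas(1ULL << 42, 4096) gives 2^30
(about a billion) VMAs, and a lookup walks between 31 and 60 nodes, so
the ~40 steps quoted above sits inside that range.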

Avoiding a ton of vmas, by enabling vma-less pte mangling within a
single large vma while taking mmap_sem only for reading during all the
pte mangling, is one of the primary design motivations for
userfaultfd.

> I understand that part but it sounds awfully one purpose thing to me.
> Are we going to add other MADVISE_RESET_$FOO to clear other flags just
> because we can race in this specific use case?

Those already exist; see for example MADV_NORMAL, which clears
VM_RAND_READ and VM_SEQ_READ after MADV_SEQUENTIAL or MADV_RANDOM
has set them.

Or MADV_DOFORK after MADV_DONTFORK, MADV_DODUMP after MADV_DONTDUMP, etc.
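
As a userspace illustration of those existing set/reset pairs, the
following minimal sketch maps an anonymous region and follows each
advice with the one that undoes it. The function name and the 1 MiB
length are arbitrary choices for the example; only mmap(2), madvise(2)
and munmap(2) are used, and it assumes a Linux system whose headers
define all of these MADV_* values.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Returns 0 if every madvise() call succeeded, -1 otherwise. */
int madvise_set_then_reset(void)
{
    const size_t len = 1 << 20;   /* 1 MiB, arbitrary for the example */
    /* { advice that sets a per-vma flag, advice that clears it again } */
    static const int pairs[][2] = {
        { MADV_SEQUENTIAL, MADV_NORMAL },
        { MADV_RANDOM,     MADV_NORMAL },
        { MADV_DONTFORK,   MADV_DOFORK },
        { MADV_DONTDUMP,   MADV_DODUMP },
    };
    int rc = 0;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++) {
        if (madvise(p, len, pairs[i][0]) != 0 ||
            madvise(p, len, pairs[i][1]) != 0) {
            rc = -1;
            break;
        }
    }

    munmap(p, len);
    return rc;
}
```

After each pair the region is back to its default behavior for that
flag, which is the kind of reset that has no equivalent for the two
hugepage flags.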

> But we already have MADV_HUGEPAGE, MADV_NOHUGEPAGE and prctl to
> enable/disable thp. Doesn't that sound little bit too much for a single
> feature to you?

MADV_NOHUGEPAGE doesn't mean clearing the flag set with
MADV_HUGEPAGE. MADV_NOHUGEPAGE disables THP on the region if the
global sysfs "enabled" tunable is set to "always". MADV_HUGEPAGE enables
THP if the global "enabled" sysfs tunable is set to "madvise". Both
MADV_NOHUGEPAGE and MADV_HUGEPAGE are needed to leverage the three-way
setting of "never" "madvise" "always" of the global tunable.

The "madvise" global setting exists for when you want to save RAM and
don't care much about performance, while still allowing apps like QEMU,
where no memory is lost by enabling THP, to use THP.

There's no way to clear either of those two flags and bring back the
default behavior of the global sysfs tune, so it's not redundant at
the very least.
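
The interaction can be written down as a small model. This is a
simplified sketch of the semantics described above, not the kernel's
actual code; every enum and function name in it is hypothetical and
exists only for illustration.

```c
#include <stdbool.h>

/* Global sysfs "enabled" setting (model only). */
enum thp_global { THP_NEVER, THP_MADVISE, THP_ALWAYS };

/* Per-vma state: no flag set, or one of the two madvise flags. */
enum thp_vma { VMA_DEFAULT, VMA_HUGEPAGE, VMA_NOHUGEPAGE };

/* The two hugepage advice values that exist today (model only). */
enum thp_advice { ADVICE_HUGEPAGE, ADVICE_NOHUGEPAGE };

/* Would THP be used on a region, given the global and per-vma state? */
bool thp_allowed(enum thp_global g, enum thp_vma v)
{
    switch (g) {
    case THP_ALWAYS:
        return v != VMA_NOHUGEPAGE;   /* opt-out via MADV_NOHUGEPAGE */
    case THP_MADVISE:
        return v == VMA_HUGEPAGE;     /* opt-in via MADV_HUGEPAGE */
    default:
        return false;                 /* "never" */
    }
}

/* Either advice replaces the other flag; nothing leads back to
 * VMA_DEFAULT, whatever the current state is. */
enum thp_vma apply_advice(enum thp_vma cur, enum thp_advice a)
{
    (void)cur;
    return a == ADVICE_HUGEPAGE ? VMA_HUGEPAGE : VMA_NOHUGEPAGE;
}
```

In this model, once apply_advice() has run on a region, no sequence of
the two existing advice values returns it to VMA_DEFAULT, so the region
can no longer simply follow the global setting.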

