From: Yang Shi <shy828301@gmail.com>
To: "Zach O'Keefe" <zokeefe@google.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Michal Hocko <mhocko@suse.com>, Peter Xu <peterx@redhat.com>,
	 Alex Shi <alex.shi@linux.alibaba.com>,
	David Hildenbrand <david@redhat.com>,
	 David Rientjes <rientjes@google.com>,
	Song Liu <songliubraving@fb.com>,  Linux MM <linux-mm@kvack.org>,
	Rongwei Wang <rongwei.wang@linux.alibaba.com>,
	 Andrea Arcangeli <aarcange@redhat.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	 Hugh Dickins <hughd@google.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	 Minchan Kim <minchan@kernel.org>, SeongJae Park <sj@kernel.org>,
	 Pasha Tatashin <pasha.tatashin@soleen.com>
Subject: Re: [RFC] mm: MADV_COLLAPSE semantics
Date: Tue, 31 May 2022 16:52:23 -0700
Message-ID: <CAHbLzkqMn81jzKRCjuxsb-if+q5_EQs3epV7GvHGARmbY=xMvQ@mail.gmail.com>
In-Reply-To: <CAAa6QmTCuHWuQ=dcdPX8hS3mKMucwjsjEoBCeFoDSwXCca6hpA@mail.gmail.com>

On Tue, May 31, 2022 at 2:37 PM Zach O'Keefe <zokeefe@google.com> wrote:
>
> Thanks everyone for your time and for the great discussion!
>
> For the purposes of arriving at a decision, I've tried to outline the
> major points + my 2c below as:

Thanks for summing up the discussion.

>
> 1. Breaking userland. AFAIK, if permitting MADV_COLLAPSE in "never"
> would break real, existing use cases, then Linux's policy would
> require that we don't do that. Is there a way we can reasonably
> determine this? An affirmative answer here makes this decision easy.

I don't have an affirmative answer. It depends on the users'
expectations. Some users may expect that no THP is ever allocated in
"never" mode, even when userspace explicitly requests one. AFAICT some
sysadmins may expect this because they manage machines that may run
untrusted software. So allowing MADV_COLLAPSE in "never" doesn't break
any workload, but it may break some expectations.

>
> 2. Current uses of "never", a.k.a. dev/debug. If (1) is false, then
> we've asserted that *currently* "never" is only used for
> development/debugging. During development of MADV_COLLAPSE, I found it
> necessary to disable khugepaged via a new debugfs tunable to prevent
> khugepaged collapsing memory before MADV_COLLAPSE could act. If
> MADV_COLLAPSE wasn't tied to "never", it's one less debugfs tunable
> we'd need. OTOH, I can still see the benefit, during debugging, of a
> master "no THPs" switch. If we think we'll ever want that master
> switch, then let's just keep "never" as said switch.
>
> 3. Future uses of "never". Do we want to permit a policy where
> userspace *entirely* takes over THP allocation, and khugepaged and
> at-fault allocation are disabled in the kernel? If yes, then we might
> as well permit "never" to allow that now. Personally, though, I can't
> imagine wanting to disable faulting-in THPs in places where we know
> data will be hot; but respecting "never" does back us into a corner if
> we ever go that route.
>
> 4. Flexibility / separation of concerns: All else being equal,
> decoupling user MADV_COLLAPSE from kernel THP sysfs controls is more
> flexible and consistent with the rest of MADV_COLLAPSE semantics.
>
> If that's roughly accurate, and absent any other critical points,
> if we can determine (1), then I'd prefer "never" to be tied to kernel
> decisions, not userspace. Any strong objections?

I do not have strong objections, and I think Michal's point and yours
do make sense for some use cases. A simple way forward is to allow
MADV_COLLAPSE in "never" mode, and then see whether anyone complains.
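
For anyone who wants to check how a given kernel behaves here, the
following is a minimal userspace sketch, not code from this RFC. It
reads the selected mode from /sys/kernel/mm/transparent_hugepage/enabled
and issues the madvise() call. The helper names (thp_selected_mode,
thp_read_mode, try_collapse) are hypothetical, and the fallback value 25
for MADV_COLLAPSE is an assumption taken from the asm-generic uapi
header; a kernel without the feature is expected to return EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* assumption: asm-generic uapi value */
#endif

/*
 * Copy the bracketed token of a sysfs line such as
 * "always madvise [never]" into out. Returns 0 on success, -1 if no
 * bracketed token is found or it does not fit in out.
 */
int thp_selected_mode(const char *line, char *out, size_t out_len)
{
	assert(line != NULL && out != NULL && out_len > 0);

	const char *lb = strchr(line, '[');
	const char *rb = lb ? strchr(lb, ']') : NULL;
	if (!rb)
		return -1;

	size_t len = (size_t)(rb - lb - 1);
	if (len >= out_len)
		return -1;

	memcpy(out, lb + 1, len);
	out[len] = '\0';
	return 0;
}

/* Read the system-wide THP mode from sysfs. Returns 0 on success. */
int thp_read_mode(char *out, size_t out_len)
{
	char line[128];
	FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
	if (!f)
		return -1;

	char *ok = fgets(line, sizeof(line), f);
	fclose(f);
	return ok ? thp_selected_mode(line, out, out_len) : -1;
}

/*
 * Ask the kernel to collapse [addr, addr + len). Returns 0 on success,
 * otherwise the errno reported by madvise().
 */
int try_collapse(void *addr, size_t len)
{
	return madvise(addr, len, MADV_COLLAPSE) == 0 ? 0 : errno;
}
```

Calling thp_read_mode() and then try_collapse() on a hugepage-aligned
anonymous mapping, and logging both results, would show whether a system
set to "never" accepts or rejects the request.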

>
> Thanks again for your time,
> Zach

