linux-api.vger.kernel.org archive mirror
From: Suren Baghdasaryan <surenb@google.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>, Michal Hocko <mhocko@suse.com>,
	David Rientjes <rientjes@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Roman Gushchin <guro@fb.com>, Rik van Riel <riel@surriel.com>,
	Christian Brauner <christian@brauner.io>,
	Oleg Nesterov <oleg@redhat.com>,
	Tim Murray <timmurray@google.com>,
	linux-api@vger.kernel.org, linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-team <kernel-team@android.com>
Subject: Re: [PATCH 1/2] mm/madvise: allow process_madvise operations on entire memory range
Date: Mon, 30 Nov 2020 11:01:15 -0800
Message-ID: <CAJuCfpFuWqMEXJij_qHhyGpuFXLuJ7-DcHgcc9760NhBHhuLHw@mail.gmail.com>
In-Reply-To: <20201125234322.GG1484898@google.com>

On Wed, Nov 25, 2020 at 3:43 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Wed, Nov 25, 2020 at 03:23:40PM -0800, Suren Baghdasaryan wrote:
> > On Wed, Nov 25, 2020 at 3:13 PM Minchan Kim <minchan@kernel.org> wrote:
> > >
> > > On Mon, Nov 23, 2020 at 09:39:42PM -0800, Suren Baghdasaryan wrote:
> > > > process_madvise requires a vector of address ranges to be provided for
> > > > its operations. When an advice should be applied to the entire process,
> > > > the caller process has to obtain the list of VMAs of the target process
> > > > by reading the /proc/pid/maps or some other way. The cost of this
> > > > operation grows linearly with increasing number of VMAs in the target
> > > > process. Even constructing the input vector can be non-trivial when the
> > > > target process has several thousand VMAs and the syscall is issued
> > > > during a period of high memory pressure, when new allocations for such
> > > > a vector would only worsen the situation.
> > > > In the case when advice is being applied to the entire memory space of
> > > > the target process, this creates an extra overhead.
> > > > Add PMADV_FLAG_RANGE flag for process_madvise enabling the caller to
> > > > advise a memory range of the target process. For now, to keep it simple,
> > > > only the entire process memory range is supported, vec and vlen inputs
> > > > in this mode are ignored and can be NULL and 0.
> > > > Instead of returning the number of bytes that advice was successfully
> > > > applied to, the syscall in this mode returns 0 on success. This is due
> > > > to the fact that the number of bytes would not be useful for the caller
> > > > that does not know the amount of memory the call is supposed to affect.
> > > > Besides, the ssize_t return type can be too small to hold the number of
> > > > bytes affected when the operation is applied to a large memory range.
> > >
> > > Can we just use one element in iovec to indicate the entire address
> > > space rather than using up the reserved flags?
> > >
> > >         struct iovec vec = {
> > >                 .iov_base = NULL,
> > >                 .iov_len = ~(size_t)0,
> > >         };
> > >
> > > Furthermore, it could be applied to other syscalls that support
> > > iovec if we agree on it.
> > >
> >
> > The flag also changes the return value semantics. If we follow your
> > suggestion we should also agree that in this mode the return value
> > will be 0 on success and negative otherwise instead of the number of
> > bytes madvise was applied to.
>
> Well, the return value will depend on each API. If the operation is
> destructive, it should return the exact size affected by the API;
> otherwise, returning 0 or an error would be okay.

I'm fine with dropping the flag; I just thought that with the flag it
would be more explicit that this is a special mode operating on ranges.
Dropping it also makes the patch simpler.
Andrew, Michal, Christian, what do you think about such an API? Should I
change the API this way / keep the flag / change it in some other way?
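
For illustration only, the single-element iovec convention discussed
above could be sketched in C roughly as follows. This is not part of
the patch and not an existing kernel or libc interface: the helper
names whole_range_iovec(), is_whole_range() and call_succeeded() are
hypothetical, and the "0 on success" return convention for the
whole-range mode is the one proposed in this thread, not something
process_madvise is known to implement.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: build the single-element iovec suggested in the
 * thread as a sentinel meaning "the entire address space of the target".
 */
static struct iovec whole_range_iovec(void)
{
	struct iovec vec = {
		.iov_base = NULL,
		.iov_len = ~(size_t)0,
	};
	return vec;
}

/*
 * Hypothetical check: true only for a one-element vector whose single
 * entry is exactly the sentinel built above.
 */
static bool is_whole_range(const struct iovec *vec, size_t vlen)
{
	return vlen == 1 && vec != NULL && vec[0].iov_base == NULL &&
	       vec[0].iov_len == ~(size_t)0;
}

/*
 * Assumed return convention from the discussion: in whole-range mode
 * only 0 means success; otherwise any non-negative byte count does.
 */
static bool call_succeeded(ssize_t ret, bool whole_range)
{
	return whole_range ? ret == 0 : ret >= 0;
}
```

With this shape a caller would pass the sentinel vector with vlen == 1
instead of setting a flag, and would have to interpret the return value
differently depending on which mode it requested, which is the
semantic change the flag was meant to make explicit.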
