linux-kernel.vger.kernel.org archive mirror
From: Minchan Kim <minchan@kernel.org>
To: Suren Baghdasaryan <surenb@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>, Michal Hocko <mhocko@suse.com>,
	David Rientjes <rientjes@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Roman Gushchin <guro@fb.com>, Rik van Riel <riel@surriel.com>,
	Christian Brauner <christian@brauner.io>,
	Oleg Nesterov <oleg@redhat.com>,
	Tim Murray <timmurray@google.com>,
	linux-api@vger.kernel.org, linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-team <kernel-team@android.com>
Subject: Re: [PATCH 1/2] mm/madvise: allow process_madvise operations on entire memory range
Date: Wed, 25 Nov 2020 15:43:22 -0800
Message-ID: <20201125234322.GG1484898@google.com>
In-Reply-To: <CAJuCfpGCc49g5+T+V3SxZ6eVteLac6xVRx+1z6G2a8P4-Cr7bA@mail.gmail.com>

On Wed, Nov 25, 2020 at 03:23:40PM -0800, Suren Baghdasaryan wrote:
> On Wed, Nov 25, 2020 at 3:13 PM Minchan Kim <minchan@kernel.org> wrote:
> >
> > On Mon, Nov 23, 2020 at 09:39:42PM -0800, Suren Baghdasaryan wrote:
> > > process_madvise requires a vector of address ranges to be provided for
> > > its operations. When advice should be applied to the entire process,
> > > the caller has to obtain the list of VMAs of the target process by
> > > reading /proc/pid/maps or in some other way. The cost of this
> > > operation grows linearly with the number of VMAs in the target
> > > process. Even constructing the input vector can be non-trivial when
> > > the target process has several thousand VMAs and the syscall is
> > > issued during a period of high memory pressure, when new allocations
> > > for such a vector would only worsen the situation.
> > > In the case when advice is being applied to the entire memory space of
> > > the target process, this creates extra overhead.
> > > Add PMADV_FLAG_RANGE flag for process_madvise enabling the caller to
> > > advise a memory range of the target process. For now, to keep it simple,
> > > only the entire process memory range is supported, vec and vlen inputs
> > > in this mode are ignored and can be NULL and 0.
> > > Instead of returning the number of bytes that advice was successfully
> > > applied to, the syscall in this mode returns 0 on success. This is due
> > > to the fact that the number of bytes would not be useful for the caller
> > > that does not know the amount of memory the call is supposed to affect.
> > > Besides, the ssize_t return type can be too small to hold the number of
> > > bytes affected when the operation is applied to a large memory range.
> >
> > Can we just use one element in the iovec to indicate the entire address
> > range rather than using up the reserved flags?
> >
> >         struct iovec iov = {
> >                 .iov_base = NULL,
> >                 .iov_len = ~(size_t)0,
> >         };
> >
> > Furthermore, it could be applied to other syscalls that support
> > iovec, if we agree on it.
> >
> 
> The flag also changes the return value semantics. If we follow your
> suggestion we should also agree that in this mode the return value
> will be 0 on success and negative otherwise instead of the number of
> bytes madvise was applied to.

Well, the return value will depend on each API. If the operation is
destructive, it should return the size actually affected by the API;
otherwise, returning 0 or an error would be okay.
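
To make the single-element idea concrete, here is a rough sketch of how
the sentinel could be recognized. The helper name iov_is_entire_range()
is hypothetical and is not an existing kernel or libc function; it only
assumes the convention proposed above (iov_base == NULL and iov_len equal
to the maximum size_t value) and says nothing about what any given
syscall would return in that mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/*
 * Hypothetical helper (an assumption for illustration only): returns true
 * when the vector consists of exactly one element carrying the proposed
 * "entire address range" sentinel, i.e. a NULL base and a length of
 * ~(size_t)0, which is the same value as SIZE_MAX.
 */
static bool iov_is_entire_range(const struct iovec *vec, size_t vlen)
{
	/* vlen is checked first, so vec is never dereferenced unless vlen == 1. */
	return vlen == 1 &&
	       vec[0].iov_base == NULL &&
	       vec[0].iov_len == SIZE_MAX;
}
```

A caller would pass a one-element vector built as in the quoted snippet.
Any other vector, including an empty one, is treated as an ordinary list
of ranges.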

