linux-mm.kvack.org archive mirror
From: Christian Brauner <christian.brauner@ubuntu.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	linux-api@vger.kernel.org, oleksandr@redhat.com,
	Suren Baghdasaryan <surenb@google.com>,
	Tim Murray <timmurray@google.com>,
	Daniel Colascione <dancol@google.com>,
	Sandeep Patil <sspatil@google.com>,
	Sonny Rao <sonnyrao@google.com>,
	Brian Geffon <bgeffon@google.com>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	John Dias <joaodias@google.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Jann Horn <jannh@google.com>,
	alexander.h.duyck@linux.intel.com, sj38.park@gmail.com,
	Christian Brauner <christian@brauner.io>,
	Kirill Tkhai <ktkhai@virtuozzo.com>
Subject: Re: [PATCH v7 5/7] mm: support both pid and pidfd for process_madvise
Date: Sat, 9 May 2020 14:48:17 +0200
Message-ID: <20200509124817.xmrvsrq3mla6b76k@wittgenstein>
In-Reply-To: <20200508160415.65ff359a9e312c613336587b@linux-foundation.org>

On Fri, May 08, 2020 at 04:04:15PM -0700, Andrew Morton wrote:
> On Fri, 8 May 2020 11:36:53 -0700 Minchan Kim <minchan@kernel.org> wrote:
> 
> > 
> > ...
> >
> > Per Vlastimil's request, I changed "which and advise" with "idtype and
> > advice" in function prototype of description.
> > Could you replace the part in the description? Code is never changed.
> > 
> 
> Done, but...
> 
> >
> > ...
> >
> > There is a demand[1] to support pid as well pidfd for process_madvise to
> > reduce unnecessary syscall to get pidfd if the user has control of the
> > target process(ie, they could guarantee the process is not gone or pid is
> > not reused).
> > 
> > This patch aims for supporting both options like waitid(2).  So, the
> > syscall is currently,
> > 
> >         int process_madvise(idtype_t idtype, id_t id, void *addr,
> >                 size_t length, int advice, unsigned long flags);
> > 
> > @which is actually idtype_t for userspace libray and currently, it
> > supports P_PID and P_PIDFD.
> 
> What does "@which is actually idtype_t for userspace libray" mean?  Can
> you clarify and expand?

If I may clarify, the only case where we've supported both pidfd and pid
in the same system call is waitid(). That was done to avoid adding a
dedicated system call for waiting, and because waitid() already had this
(imho insane) argument type switching. The idtype_t thing comes from
waitid(), is located in sys/wait.h, and is defined as

"The type idtype_t is defined as an enumeration type whose possible
values include at least the following:

P_ALL
P_PID
P_PGID
"

int waitid(idtype_t idtype, id_t id, siginfo_t *infop, int options);
If idtype is P_PID, waitid() shall wait for the child with a process ID equal to (pid_t)id.
If idtype is P_PGID, waitid() shall wait for any child with a process group ID equal to (pid_t)id.
If idtype is P_ALL, waitid() shall wait for any children and id is ignored.
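
To make the argument type switching concrete, here is a minimal sketch of
how the meaning of "id" changes with "idtype" in waitid(). The helper name
fork_and_reap() is made up for illustration, and the sketch assumes the
caller has no other children when P_ALL is used.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits with `code`, then reap it
 * with waitid() using the given idtype (P_PID or P_ALL).
 * Returns the child's exit status, or -1 on error.
 */
static int fork_and_reap(idtype_t idtype, int code)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: exit immediately */

	/* P_PID: id is a process ID. P_ALL: id is ignored, so pass 0. */
	id_t id = (idtype == P_PID) ? (id_t)pid : 0;
	siginfo_t info = {0};

	if (waitid(idtype, id, &info, WEXITED) != 0)
		return -1;
	if (info.si_pid != pid || info.si_code != CLD_EXITED)
		return -1;
	return info.si_status;
}
```

The same "id" argument is a process ID in one call and dead weight in the
other, which is the switching behaviour described above.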

I'm personally not a fan of this idtype_t thing and think this should
just have been

        int pidfd_madvise(int pidfd, void *addr,
                size_t length, int advice, unsigned long flags);

and call it a day.

Also, if I may ask, why is the flag argument "unsigned long"?
That's pretty unorthodox. The expectation is that flag arguments are
not word-size dependent and should usually use "unsigned int". All new
system calls follow this pattern too.

The current syscall layout means that on 64-bit systems you have 64
flag bits and on 32-bit systems you have 32 flag bits, I think. That has
just recently led to some problems with the clone() syscall (fixed in [1],
which I'm sending Monday), which has the same weird word-size-dependent
flag argument layout. If a system does sign-extension and a userspace
API or glibc uses e.g. an int for the flag argument in the system call
wrapper, which is fairly common, the value can get sign-extended and
you end up with garbage in the upper 32 bits of your system call argument.
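
A minimal sketch of that failure mode, assuming a 64-bit argument register
modelled as uint64_t so it behaves the same on any build. The helper names
are made up for illustration and are not taken from the kernel or glibc.

```c
#include <stdint.h>

/* A wrapper that takes the flags as a signed 32-bit int: widening to a
 * 64-bit register sign-extends, so bit 31 is copied into bits 32..63. */
static uint64_t widen_signed_flags(int32_t flags)
{
	return (uint64_t)(int64_t)flags;
}

/* A wrapper that takes the flags as unsigned int: widening zero-extends,
 * so the upper 32 bits stay clear. */
static uint64_t widen_unsigned_flags(uint32_t flags)
{
	return (uint64_t)flags;
}

/* What an "unsigned long flags" syscall would have to check on 64-bit. */
static int upper_bits_are_clean(uint64_t reg)
{
	return (reg >> 32) == 0;
}
```

With a flag in bit 31, the signed variant yields 0xffffffff80000000 while
the unsigned variant yields 0x0000000080000000. A syscall that takes
"unsigned int flags" never sees the upper half at all.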

> 
> Also, does this userspace library exist?  If so, where is it?

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/commit/?h=fixes&id=3f2c788a13143620c5471ac96ac4f033fc9ac3f3

Christian


Thread overview: 45+ messages
2020-03-02 19:36 [PATCH v7 0/7] introduce memory hinting API for external process Minchan Kim
2020-03-02 19:36 ` [PATCH v7 1/7] mm: pass task and mm to do_madvise Minchan Kim
2020-03-05 15:48   ` Vlastimil Babka
2020-05-08 18:21     ` Minchan Kim
2020-03-02 19:36 ` [PATCH v7 2/7] mm: introduce external memory hinting API Minchan Kim
2020-03-03 10:33   ` kbuild test robot
2020-03-03 14:57     ` Minchan Kim
2020-03-05 18:15   ` Vlastimil Babka
2020-03-10 22:20     ` Minchan Kim
2020-03-11  0:36       ` Minchan Kim
2020-03-12 12:40       ` Vlastimil Babka
2020-03-12 20:23         ` Minchan Kim
2020-05-08 18:33           ` Minchan Kim
2020-03-02 19:36 ` [PATCH v7 3/7] mm: check fatal signal pending of target process Minchan Kim
2020-03-06 10:22   ` Vlastimil Babka
2020-03-10 22:24     ` Minchan Kim
2020-03-02 19:36 ` [PATCH v7 4/7] pid: move pidfd_get_pid function to pid.c Minchan Kim
2020-03-06 10:57   ` Vlastimil Babka
2020-03-06 11:14   ` Christian Brauner
2020-03-02 19:36 ` [PATCH v7 5/7] mm: support both pid and pidfd for process_madvise Minchan Kim
2020-03-06 11:14   ` Vlastimil Babka
2020-03-11  0:42     ` Minchan Kim
2020-05-08 18:36       ` Minchan Kim
2020-05-08 23:04         ` Andrew Morton
2020-05-09 12:48           ` Christian Brauner [this message]
2020-05-09 23:14             ` Minchan Kim
2020-05-12 19:55               ` Suren Baghdasaryan
2020-03-02 19:36 ` [PATCH v7 6/7] mm/madvise: employ mmget_still_valid for write lock Minchan Kim
2020-03-06 12:52   ` Vlastimil Babka
2020-03-06 13:03     ` Oleksandr Natalenko
2020-03-06 16:03       ` Vlastimil Babka
2020-03-09 12:30         ` Oleksandr Natalenko
2020-03-10 22:28           ` Minchan Kim
2020-03-02 19:36 ` [PATCH v7 7/7] mm/madvise: allow KSM hints for remote API Minchan Kim
2020-03-06 13:13   ` Vlastimil Babka
2020-03-06 13:41     ` Oleksandr Natalenko
2020-03-06 16:08       ` Vlastimil Babka
2020-03-09 13:11         ` Oleksandr Natalenko
2020-03-09 15:08           ` Michal Hocko
2020-03-09 15:19             ` Oleksandr Natalenko
2020-03-09 15:42               ` Vlastimil Babka
2020-03-09 16:03                 ` Michal Hocko
2020-06-11  2:21   ` Jann Horn
2020-03-02 21:16 ` [PATCH v7 0/7] introduce memory hinting API for external process Andrew Morton
2020-03-02 21:42   ` Minchan Kim
