Linux-mm Archive on lore.kernel.org
From: Kirill Tkhai <ktkhai@virtuozzo.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Daniel Colascione <dancol@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Linux API <linux-api@vger.kernel.org>,
	oleksandr@redhat.com, Suren Baghdasaryan <surenb@google.com>,
	Tim Murray <timmurray@google.com>,
	Sandeep Patil <sspatil@google.com>,
	Sonny Rao <sonnyrao@google.com>,
	Brian Geffon <bgeffon@google.com>, Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Shakeel Butt <shakeelb@google.com>,
	John Dias <joaodias@google.com>
Subject: Re: [PATCH 2/4] mm: introduce external memory hinting API
Date: Wed, 15 Jan 2020 12:38:43 +0300
Message-ID: <9d849087-3359-c4ab-fbec-859e8186c509@virtuozzo.com>
In-Reply-To: <20200114191239.GB178589@google.com>

On 14.01.2020 22:12, Minchan Kim wrote:
> On Tue, Jan 14, 2020 at 11:39:28AM +0300, Kirill Tkhai wrote:
>> On 13.01.2020 22:18, Daniel Colascione wrote:
>>> On Mon, Jan 13, 2020, 12:47 AM Kirill Tkhai <ktkhai@virtuozzo.com> wrote:
>>>>> +SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
>>>>> +             size_t, len_in, int, behavior, unsigned long, flags)
>>>>
>>>> I don't like the interface. The fact we have pidfd does not mean,
>>>> we have to use it for new syscalls always. A user may want to set
>>>> madvise for specific pid from console and pass pid as argument.
>>>> pidfd would be an overkill in this case.
>>>> We usually call "kill -9 pid" from console. Why shouldn't process_madvise()
>>>> allow this?
>>>
>>> All new APIs should use pidfds: they're better than numeric PIDs
>>
>> Yes
>>
>>> in every way.
>>
>> No
>>
>>> If a program wants to allow users to specify processes by
>>> numeric PID, it can parse that numeric PID, open the corresponding
>>> pidfd, and then use that pidfd with whatever system call it wants.
>>> It's not necessary to support numeric PIDs at the system call level to
>>> allow a console program to identify a process by numeric PID.
>>
>> No. It is overkill. Ordinary pid interfaces also should be available.
>> There are a lot of cases, when they are more comfortable. Say, a calling
>> of process_madvise() from tracer, when a tracee is stopped. In this moment
>> the tracer knows everything about tracee state, and pidfd brackets
>> pidfd_open() and close() around actual action look just stupid, and this
>> is cpu time wasting.
>>
>> Another example is a parent task, which manages parameters of its children.
>> It knows everything about them, whether they are alive or not. Pidfd interface
>> will just utilize additional cpu time here.
>>
>> So, no. Both interfaces should be available.
> 
> Sounds like that you want to support both options for every upcoming API
> which deals with pid. I'm not sure how it's critical for process_madvise
> API this case. In general, we sacrifice some performance for the nicer one
> and later, once it's reported as hurdle for some workload, we could fix it
> via introducing new flag. What I don't like at this moment is to make
> syscall complicated with potential scenarios without real workload.

Yes, I suggest allowing both options for every new process API. This may be
performance-critical for some workloads. Say, CRIU may make a lot of
inter-process calls during container restore, and the additional system calls
will slow down online migration. And there should be many other examples.

At least you should name the first argument in a more generic way from the
start: not "int pidfd", but something like "idtype_t id" instead. This allows
extending it in the future.
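To illustrate what a typed identifier could look like from the caller's side, here is a minimal user-space sketch, modelled on how waitid(2) selects between P_PID and P_PIDFD. The enum target_type and the helper target_to_pidfd() are hypothetical names made up for this example and are not part of the patch. The fallback syscall number 434 for pidfd_open is an assumption that holds for the generic syscall table but not for every architecture.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the syscall() declaration */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434      /* assumption: generic syscall number */
#endif

/* Hypothetical selector, in the spirit of waitid(2)'s idtype_t. */
enum target_type {
	TARGET_PID,             /* id is a numeric pid */
	TARGET_PIDFD            /* id is an already open pidfd */
};

/*
 * Resolve (type, id) to a pidfd.
 *
 * On success returns the pidfd and sets *opened to 1 if this helper
 * opened it, in which case the caller must close() it afterwards.
 * On failure returns -1 with errno set.
 */
int target_to_pidfd(enum target_type type, int id, int *opened)
{
	assert(opened != NULL);
	*opened = 0;

	if (id < 0) {
		errno = EINVAL;
		return -1;
	}

	switch (type) {
	case TARGET_PIDFD:
		/* Nothing to do: the caller already holds a pidfd. */
		return id;
	case TARGET_PID: {
		/* Extra syscall that a pid-based interface would avoid. */
		int fd = (int)syscall(SYS_pidfd_open, (pid_t)id, 0u);

		if (fd >= 0)
			*opened = 1;
		return fd;
	}
	default:
		errno = EINVAL;
		return -1;
	}
}
```

With only a pidfd-based syscall, a caller that starts from a numeric pid pays for the open and the matching close() around every operation. A type selector in the syscall itself would let the kernel do the lookup directly for the pid case.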

Kirill


Thread overview: 26+ messages
2020-01-10 21:34 [PATCH 0/4] introduce memory hinting API for external process Minchan Kim
2020-01-10 21:34 ` [PATCH 1/4] mm: factor out madvise's core functionality Minchan Kim
2020-01-11  7:37   ` SeongJae Park
2020-01-13 18:11     ` Minchan Kim
2020-01-13 18:22       ` SeongJae Park
2020-01-10 21:34 ` [PATCH 2/4] mm: introduce external memory hinting API Minchan Kim
2020-01-11  7:34   ` SeongJae Park
2020-01-13 18:02     ` Minchan Kim
2020-01-13  8:47   ` Kirill Tkhai
2020-01-13 10:42     ` Christian Brauner
2020-01-13 18:44       ` Minchan Kim
2020-01-13 19:10         ` Christian Brauner
2020-01-13 19:27           ` Daniel Colascione
2020-01-13 20:42             ` Christian Brauner
2020-01-13 21:04               ` Daniel Colascione
2020-01-14 19:20                 ` Christian Brauner
2020-01-14 18:59           ` Minchan Kim
2020-01-14 19:22             ` Christian Brauner
2020-01-13 18:39     ` Minchan Kim
2020-01-13 19:18     ` Daniel Colascione
2020-01-14  8:39       ` Kirill Tkhai
2020-01-14 19:12         ` Minchan Kim
2020-01-15  9:38           ` Kirill Tkhai [this message]
2020-01-10 21:34 ` [PATCH 3/4] mm/madvise: employ mmget_still_valid for write lock Minchan Kim
2020-01-10 21:34 ` [PATCH 4/4] mm/madvise: allow KSM hints for remote API Minchan Kim
2020-01-11  7:42   ` SeongJae Park