Subject: Re: [PATCH 2/4] mm: introduce external memory hinting API
From: Kirill Tkhai
To: Minchan Kim
Cc: Daniel Colascione, Andrew Morton, LKML, linux-mm, Linux API, oleksandr@redhat.com, Suren Baghdasaryan, Tim Murray, Sandeep Patil, Sonny Rao, Brian Geffon, Michal Hocko, Johannes Weiner, Shakeel Butt, John Dias
Date: Wed, 15 Jan 2020 12:38:43 +0300
Message-ID: <9d849087-3359-c4ab-fbec-859e8186c509@virtuozzo.com>
In-Reply-To: <20200114191239.GB178589@google.com>
References: <20200110213433.94739-1-minchan@kernel.org> <20200110213433.94739-3-minchan@kernel.org> <56ea0927-ad2e-3fbd-3366-3813330f6cec@virtuozzo.com> <3eec2097-75a3-1e1d-06d9-44ee5eaf1312@virtuozzo.com> <20200114191239.GB178589@google.com>

On 14.01.2020 22:12, Minchan Kim wrote:
> On Tue, Jan 14, 2020 at 11:39:28AM +0300, Kirill Tkhai wrote:
>> On 13.01.2020 22:18, Daniel Colascione wrote:
>>> On Mon, Jan 13, 2020, 12:47 AM Kirill Tkhai wrote:
>>>>> +SYSCALL_DEFINE5(process_madvise, int, pidfd, unsigned long, start,
>>>>> +                size_t, len_in, int, behavior, unsigned long, flags)
>>>>
>>>> I don't like the interface. The fact we have pidfd does not mean,
>>>> we have to use it for new syscalls always. A user may want to set
>>>> madvise for specific pid from console and pass pid as argument.
>>>> pidfd would be an overkill in this case.
>>>> We usually call "kill -9 pid" from console. Why shouldn't process_madvise()
>>>> allow this?
>>>
>>> All new APIs should use pidfds: they're better than numeric PIDs
>>
>> Yes
>>
>>> in every way.
>>
>> No
>>
>>> If a program wants to allow users to specify processes by
>>> numeric PID, it can parse that numeric PID, open the corresponding
>>> pidfd, and then use that pidfd with whatever system call it wants.
>>> It's not necessary to support numeric PIDs at the system call level to
>>> allow a console program to identify a process by numeric PID.
>>
>> No. It is overkill. Ordinary pid interfaces also should be available.
>> There are a lot of cases, when they are more comfortable. Say, a calling
>> of process_madvise() from tracer, when a tracee is stopped. In this moment
>> the tracer knows everything about tracee state, and pidfd brackets
>> pidfd_open() and close() around actual action look just stupid, and this
>> is cpu time wasting.
>>
>> Another example is a parent task, which manages parameters of its children.
>> It knows everything about them, whether they are alive or not. Pidfd interface
>> will just utilize additional cpu time here.
>>
>> So, no. Both interfaces should be available.
>
> Sounds like that you want to support both options for every upcoming API
> which deals with pid. I'm not sure how it's critical for process_madvise
> API this case. In general, we sacrifice some performance for the nicer one
> and later, once it's reported as hurdle for some workload, we could fix it
> via introducing new flag. What I don't like at this moment is to make
> syscall complicated with potential scenarios without real workload.

Yes, I suggest allowing both options for every new process API. This may be
performance-critical for some workloads. Say, CRIU may issue a lot of
inter-process calls during container restore, and the additional system calls
will slow down online migration. And there should be many other examples.

At the very least, the first argument should be named in a more generic way
from the start: not "int pidfd", but something like "idtype_t id" instead.
This allows extending it in the future.

Kirill
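
The "idtype_t id" idea above can be sketched in C. This is only an
illustration under stated assumptions: the names hint_idtype, HINT_P_PID,
HINT_P_PIDFD, hint_target and resolve_target() are hypothetical and do not
appear in the patch series. The (type, id) shape is modelled on waitid(2),
which takes an idtype_t selector and gained P_PIDFD in Linux 5.4. Unknown
selector values fail with -EINVAL, so further id types could be added later
without changing the signature.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical selector for the first argument, modelled on waitid(2). */
enum hint_idtype {
	HINT_P_PID,	/* id is a numeric PID */
	HINT_P_PIDFD,	/* id is a pidfd file descriptor */
};

/* Hypothetical resolved target; a real syscall would look up the task. */
struct hint_target {
	enum hint_idtype type;
	int id;
};

/*
 * Validate an (idtype, id) pair and record it in *out.
 * Returns 0 on success or -EINVAL for a bad id or an unknown idtype.
 */
int resolve_target(enum hint_idtype type, int id, struct hint_target *out)
{
	if (out == NULL)
		return -EINVAL;

	switch (type) {
	case HINT_P_PID:
		/* Assumption of this sketch: only positive PIDs are accepted. */
		if (id <= 0)
			return -EINVAL;
		break;
	case HINT_P_PIDFD:
		/* File descriptors start at 0. */
		if (id < 0)
			return -EINVAL;
		break;
	default:
		/* Room for future id types; reject anything not known yet. */
		return -EINVAL;
	}

	out->type = type;
	out->id = id;
	return 0;
}
```

With this shape, a tracer or parent that already holds a numeric PID would
pass HINT_P_PID directly, while callers that hold a pidfd would pass
HINT_P_PIDFD, and neither needs an extra pidfd_open()/close() pair.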