Date: Mon, 16 Nov 2020 15:24:36 -0800
From: Minchan Kim
To: Andrew Morton
Cc: Suren Baghdasaryan, Michal Hocko, David Rientjes, Matthew Wilcox,
    Johannes Weiner, Roman Gushchin, Rik van Riel, Christian Brauner,
    Oleg Nesterov, Tim Murray, linux-api@vger.kernel.org, linux-mm, LKML,
    kernel-team
Subject: Re: [PATCH 1/1] RFC: add pidfd_send_signal flag to reclaim mm while killing a process
Message-ID: <20201116232436.GA3943731@google.com>
References: <20201113173448.1863419-1-surenb@google.com>
    <20201113155539.64e0af5b60ad3145b018ab0d@linux-foundation.org>
    <20201113170032.7aa56ea273c900f97e6ccbdc@linux-foundation.org>
    <20201113171810.bebf66608b145cced85bf54c@linux-foundation.org>
    <20201113181632.6d98489465430a987c96568d@linux-foundation.org>
In-Reply-To: <20201113181632.6d98489465430a987c96568d@linux-foundation.org>

On Fri, Nov 13, 2020 at 06:16:32PM -0800, Andrew Morton wrote:
> On Fri, 13 Nov 2020 17:57:02 -0800 Suren Baghdasaryan wrote:
>
> > On Fri, Nov 13, 2020 at 5:18 PM Andrew Morton wrote:
> > >
> > > On Fri, 13 Nov 2020 17:09:37 -0800 Suren Baghdasaryan wrote:
> > >
> > > > > > > Seems to me that the ability to reap another process's memory is a
> > > > > > > generally useful one, and that it should not be tied to delivering a
> > > > > > > signal in this fashion.
> > > > > > >
> > > > > > > And we do have the new process_madvise(MADV_PAGEOUT). It may need a
> > > > > > > few changes and tweaks, but can't that be used to solve this problem?
> > > > > >
> > > > > > Thank you for the feedback, Andrew. process_madvise(MADV_DONTNEED) was
> > > > > > one of the options recently discussed in
> > > > > > https://lore.kernel.org/linux-api/CAJuCfpGz1kPM3G1gZH+09Z7aoWKg05QSAMMisJ7H5MdmRrRhNQ@mail.gmail.com
> > > > > > . The thread describes some of the issues with that approach but if we
> > > > > > limit it to processes with pending SIGKILL only then I think that
> > > > > > would be doable.
> > > > >
> > > > > Why would it be necessary to read /proc/pid/maps? I'd have thought
> > > > > that a starting effort would be
> > > > >
> > > > >     madvise((void *)0, (void *)-1, MADV_PAGEOUT)
> > > > >
> > > > > (after translation into process_madvise() speak). Which is equivalent
> > > > > to the proposed process_madvise(MADV_DONTNEED_MM)?
> > > >
> > > > Yep, this is very similar to option #3 in
> > > > https://lore.kernel.org/linux-api/CAJuCfpGz1kPM3G1gZH+09Z7aoWKg05QSAMMisJ7H5MdmRrRhNQ@mail.gmail.com
> > > > and I actually have a tested prototype for that.
> > >
> > > Why is the `vector=NULL' needed? Can't `vector' point at a single iovec
> > > which spans the whole address range?
> >
> > That would be the option #4 from the same discussion and the issues
> > noted there are "process_madvise return value can't handle such a
> > large number of bytes and there is MAX_RW_COUNT limit on max number of
> > bytes one process_madvise call can handle". In my prototype I have a
> > special handling for such "bulk operation" to work around the
> > MAX_RW_COUNT limitation.
>
> Ah, OK, return value. Maybe process_madvise() shouldn't have done that
> and should have simply returned 0 on success, like madvise().
>
> I guess a special "nuke whole address space" command is OK. But, again
> in the search for generality, the ability to nuke very large amounts of
> address space (but not the entire address space) would be better.
>
> The process_madvise() return value issue could be addressed by adding a
> process_madvise() mode which return 0 on success.
>
> And I guess the MAX_RW_COUNT issue is solvable by adding an
> import_iovec() arg to say "don't check that". Along those lines.
>
> It's all sounding a bit painful (but not *too* painful). But to
> reiterate, I do think that adding the ability for a process to shoot
> down a large amount of another process's memory is a lot more generally
> useful than tying it to SIGKILL, agree?

I agree with the direction, but I think this is a general problem for
every API that supports iovec, and I'm not sure process_madvise is
special enough to change it. IOW, it wouldn't be a problem to support a
special *entire address space* mode, but I'm not sure about supporting
*large amount of address space* at the cost of breaking the existing
iovec scheme.
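
To make the MAX_RW_COUNT point concrete, here is a rough caller-side
sketch of what staying within the existing iovec scheme looks like for a
large range. It is not the kernel change discussed above and not anyone's
prototype. split_range() and the SKETCH_* names are hypothetical, the
4 KiB page size is an assumption, and SKETCH_MAX_RW_COUNT only mirrors the
kernel's (INT_MAX & PAGE_MASK) definition. Since the limit is described in
the thread as a per-call byte cap, each resulting iovec is meant to be the
sole vector of a separate process_madvise() call.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Assumption: 4 KiB pages; mirrors the kernel's MAX_RW_COUNT (INT_MAX & PAGE_MASK). */
#define SKETCH_PAGE_SIZE    4096UL
#define SKETCH_MAX_RW_COUNT ((size_t)INT_MAX & ~(SKETCH_PAGE_SIZE - 1))

/*
 * Hypothetical helper: cover [start, start + len) with iovecs of at most
 * chunk_max bytes each. Returns the number of iovecs written, or -1 if
 * chunk_max is zero or more than max_iov entries would be needed.
 */
static int split_range(uintptr_t start, size_t len, size_t chunk_max,
                       struct iovec *iov, size_t max_iov)
{
    size_t n = 0;

    assert(iov != NULL);
    if (chunk_max == 0)
        return -1;

    while (len > 0) {
        /* The last chunk may be shorter than chunk_max. */
        size_t chunk = len < chunk_max ? len : chunk_max;

        if (n == max_iov)
            return -1;
        iov[n].iov_base = (void *)start;
        iov[n].iov_len = chunk;
        n++;
        start += chunk;
        len -= chunk;
    }
    return (int)n;
}
```

A caller would pass SKETCH_MAX_RW_COUNT as chunk_max and issue one call
per element. The number of calls grows with the size of the range, which
is the cost that a special "entire address space" mode would avoid.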