From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6D52C63699 for ; Sun, 15 Nov 2020 08:27:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A0739223FB for ; Sun, 15 Nov 2020 08:27:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="OzCqWMqj" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726755AbgKOI0p (ORCPT ); Sun, 15 Nov 2020 03:26:45 -0500 Received: from mail.kernel.org ([198.145.29.99]:51566 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726230AbgKOI0j (ORCPT ); Sun, 15 Nov 2020 03:26:39 -0500 Received: from kernel.org (unknown [77.125.7.142]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 2B40D20825; Sun, 15 Nov 2020 08:26:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1605428798; bh=N8nX4EItu7ygJosleHRb6Se+RCkeKeuBhze7L0pxdoc=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=OzCqWMqjydzAb9tsTc50ml1otm0W+jgR45ewsq/G5RaHSABMkhoMFFhu9KCzP5EgM H4hHM4CWx5jTJGIkXhhIkkUUbJjjG0LCaHvvN1Sygkt5FEaoKMH2vyKuxCD0ZXI6ak TE0ypAnInCIfqSa2/EaG3mDDyN0He79bGRMwfBCI= Date: Sun, 15 Nov 2020 10:26:25 +0200 From: Mike Rapoport To: David Hildenbrand Cc: Andrew Morton , Alexander Viro , Andy Lutomirski , Arnd Bergmann , Borislav Petkov , Catalin Marinas , Christopher Lameter , Dan Williams , Dave Hansen , Elena Reshetova , "H. Peter Anvin" , Ingo Molnar , James Bottomley , "Kirill A. Shutemov" , Matthew Wilcox , Mark Rutland , Mike Rapoport , Michael Kerrisk , Palmer Dabbelt , Paul Walmsley , Peter Zijlstra , Rick Edgecombe , Shuah Khan , Thomas Gleixner , Tycho Andersen , Will Deacon , linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org, x86@kernel.org Subject: Re: [PATCH v8 2/9] mmap: make mlock_future_check() global Message-ID: <20201115082625.GT4758@kernel.org> References: <20201112190827.GP4758@kernel.org> <7A16CA44-782D-4ABA-8D93-76BDD0A90F94@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <7A16CA44-782D-4ABA-8D93-76BDD0A90F94@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Nov 12, 2020 at 09:15:18PM +0100, David Hildenbrand wrote: > > > Am 12.11.2020 um 20:08 schrieb Mike Rapoport : > > > > On Thu, Nov 12, 2020 at 05:22:00PM +0100, David Hildenbrand wrote: > >>> On 10.11.20 19:06, Mike Rapoport wrote: > >>> On Tue, Nov 10, 2020 at 06:17:26PM +0100, David Hildenbrand wrote: > >>>> On 10.11.20 16:14, Mike Rapoport wrote: > >>>>> From: Mike Rapoport > >>>>> > >>>>> It will be used by the upcoming secret memory implementation. > >>>>> > >>>>> Signed-off-by: Mike Rapoport > >>>>> --- > >>>>> mm/internal.h | 3 +++ > >>>>> mm/mmap.c | 5 ++--- > >>>>> 2 files changed, 5 insertions(+), 3 deletions(-) > >>>>> > >>>>> diff --git a/mm/internal.h b/mm/internal.h > >>>>> index c43ccdddb0f6..ae146a260b14 100644 > >>>>> --- a/mm/internal.h > >>>>> +++ b/mm/internal.h > >>>>> @@ -348,6 +348,9 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma) > >>>>> extern void mlock_vma_page(struct page *page); > >>>>> extern unsigned int munlock_vma_page(struct page *page); > >>>>> +extern int mlock_future_check(struct mm_struct *mm, unsigned long flags, > >>>>> + unsigned long len); > >>>>> + > >>>>> /* > >>>>> * Clear the page's PageMlocked(). This can be useful in a situation where > >>>>> * we want to unconditionally remove a page from the pagecache -- e.g., > >>>>> diff --git a/mm/mmap.c b/mm/mmap.c > >>>>> index 61f72b09d990..c481f088bd50 100644 > >>>>> --- a/mm/mmap.c > >>>>> +++ b/mm/mmap.c > >>>>> @@ -1348,9 +1348,8 @@ static inline unsigned long round_hint_to_min(unsigned long hint) > >>>>> return hint; > >>>>> } > >>>>> -static inline int mlock_future_check(struct mm_struct *mm, > >>>>> - unsigned long flags, > >>>>> - unsigned long len) > >>>>> +int mlock_future_check(struct mm_struct *mm, unsigned long flags, > >>>>> + unsigned long len) > >>>>> { > >>>>> unsigned long locked, lock_limit; > >>>>> > >>>> > >>>> So, an interesting question is if you actually want to charge secretmem > >>>> pages against mlock now, or if you want a dedicated secretmem cgroup > >>>> controller instead? > >>> > >>> Well, with the current implementation there are three limits an > >>> administrator can use to control secretmem limits: mlock, memcg and > >>> kernel parameter. > >>> > >>> The kernel parameter puts a global upper limit for secretmem usage, > >>> memcg accounts all secretmem allocations, including the unused memory in > >>> large pages caching and mlock allows per task limit for secretmem > >>> mappings, well, like mlock does. > >>> > >>> I didn't consider a dedicated cgroup, as it seems we already have enough > >>> existing knobs and a new one would be unnecessary. > >> > >> To me it feels like the mlock() limit is a wrong fit for secretmem. But > >> maybe there are other cases of using the mlock() limit without actually > >> doing mlock() that I am not aware of (most probably :) )? > > > > Secretmem does not explicitly calls to mlock() but it does what mlock() > > does and a bit more. Citing mlock(2): > > > > mlock(), mlock2(), and mlockall() lock part or all of the calling > > process's virtual address space into RAM, preventing that memory from > > being paged to the swap area. > > > > So, based on that secretmem pages are not swappable, I think that > > RLIMIT_MEMLOCK is appropriate here. > > > > The page explicitly lists mlock() system calls. Well, it's mlock() man page, isn't it? ;-) My thinking was that since secretmem does what mlock() does wrt swapability, it should at least obey the same limit, i.e. RLIMIT_MEMLOCK. > E.g., we also don‘t > account for gigantic pages - which might be allocated from CMA and are > not swappable. Do you mean gigantic pages in hugetlbfs? It seems to me that hugetlbfs accounting is a completely different story. > >> I mean, my concern is not earth shattering, this can be reworked later. As I > >> said, it just feels wrong. > >> > >> -- > >> Thanks, > >> > >> David / dhildenb > >> > > > > -- > > Sincerely yours, > > Mike. > > > -- Sincerely yours, Mike.