From: David Hildenbrand <david@redhat.com>
To: Mike Rapoport <rppt@kernel.org>
Cc: David Hildenbrand <david@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christopher Lameter <cl@linux.com>,
Dan Williams <dan.j.williams@intel.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
Elena Reshetova <elena.reshetova@intel.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
James Bottomley <jejb@linux.ibm.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Matthew Wilcox <willy@infradead.org>,
Mark Rutland <mark.rutland@arm.com>,
Mike Rapoport <rppt@linux.ibm.com>,
Michael Kerrisk <mtk.manpages@gmail.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Peter Zijlstra <peterz@infradead.org>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Shuah Khan <shuah@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Tycho Andersen <tycho@tycho.ws>, Will Deacon <will@kernel.org>,
linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org,
x86@kernel.org
Subject: Re: [PATCH v8 2/9] mmap: make mlock_future_check() global
Date: Thu, 12 Nov 2020 21:15:18 +0100
Message-ID: <7A16CA44-782D-4ABA-8D93-76BDD0A90F94@redhat.com>
In-Reply-To: <20201112190827.GP4758@kernel.org>
> On 12.11.2020 at 20:08, Mike Rapoport <rppt@kernel.org> wrote:
>
> On Thu, Nov 12, 2020 at 05:22:00PM +0100, David Hildenbrand wrote:
>>> On 10.11.20 19:06, Mike Rapoport wrote:
>>> On Tue, Nov 10, 2020 at 06:17:26PM +0100, David Hildenbrand wrote:
>>>> On 10.11.20 16:14, Mike Rapoport wrote:
>>>>> From: Mike Rapoport <rppt@linux.ibm.com>
>>>>>
>>>>> It will be used by the upcoming secret memory implementation.
>>>>>
>>>>> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
>>>>> ---
>>>>> mm/internal.h | 3 +++
>>>>> mm/mmap.c | 5 ++---
>>>>> 2 files changed, 5 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/mm/internal.h b/mm/internal.h
>>>>> index c43ccdddb0f6..ae146a260b14 100644
>>>>> --- a/mm/internal.h
>>>>> +++ b/mm/internal.h
>>>>> @@ -348,6 +348,9 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
>>>>> extern void mlock_vma_page(struct page *page);
>>>>> extern unsigned int munlock_vma_page(struct page *page);
>>>>> +extern int mlock_future_check(struct mm_struct *mm, unsigned long flags,
>>>>> + unsigned long len);
>>>>> +
>>>>> /*
>>>>> * Clear the page's PageMlocked(). This can be useful in a situation where
>>>>> * we want to unconditionally remove a page from the pagecache -- e.g.,
>>>>> diff --git a/mm/mmap.c b/mm/mmap.c
>>>>> index 61f72b09d990..c481f088bd50 100644
>>>>> --- a/mm/mmap.c
>>>>> +++ b/mm/mmap.c
>>>>> @@ -1348,9 +1348,8 @@ static inline unsigned long round_hint_to_min(unsigned long hint)
>>>>> return hint;
>>>>> }
>>>>> -static inline int mlock_future_check(struct mm_struct *mm,
>>>>> - unsigned long flags,
>>>>> - unsigned long len)
>>>>> +int mlock_future_check(struct mm_struct *mm, unsigned long flags,
>>>>> + unsigned long len)
>>>>> {
>>>>> unsigned long locked, lock_limit;
>>>>>
>>>>
>>>> So, an interesting question is if you actually want to charge secretmem
>>>> pages against mlock now, or if you want a dedicated secretmem cgroup
>>>> controller instead?
>>>
>>> Well, with the current implementation there are three limits an
>>> administrator can use to control secretmem limits: mlock, memcg and
>>> kernel parameter.
>>>
>>> The kernel parameter puts a global upper limit on secretmem usage,
>>> memcg accounts all secretmem allocations, including the unused memory
>>> in the large-page cache, and mlock allows a per-task limit for
>>> secretmem mappings, well, like mlock does.
>>>
>>> I didn't consider a dedicated cgroup, as it seems we already have enough
>>> existing knobs and a new one would be unnecessary.
>>
>> To me it feels like the mlock() limit is a wrong fit for secretmem. But
>> maybe there are other cases of using the mlock() limit without actually
>> doing mlock() that I am not aware of (most probably :) )?
>
> Secretmem does not explicitly call mlock(), but it does what mlock()
> does and a bit more. Citing mlock(2):
>
> mlock(), mlock2(), and mlockall() lock part or all of the calling
> process's virtual address space into RAM, preventing that memory from
> being paged to the swap area.
>
> So, given that secretmem pages are not swappable, I think that
> RLIMIT_MEMLOCK is appropriate here.
>
The man page explicitly lists the mlock() system calls. E.g., we also
don't account gigantic pages against that limit, although they might be
allocated from CMA and are not swappable.
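To make the limit under discussion concrete, here is a minimal user-space
sketch of the kind of comparison an RLIMIT_MEMLOCK check performs. It is a
model under stated assumptions, not the kernel's mlock_future_check():
model_mlock_check(), current_memlock_limit() and all parameter names are
hypothetical, the "privileged" flag merely stands in for a capability such
as CAP_IPC_LOCK, and the -EAGAIN return value is an assumption about what
such a check would report.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/resource.h>

/*
 * Hypothetical model: would locking "len" more bytes push a task that
 * already has "locked_pages" pages locked past "limit_bytes"?
 * Returns 0 if the request fits (or the caller is privileged),
 * -EAGAIN otherwise. Assumes "len" is a multiple of "page_size".
 */
static int model_mlock_check(unsigned long locked_pages, unsigned long len,
			     unsigned long limit_bytes,
			     unsigned long page_size, bool privileged)
{
	unsigned long wanted_pages = locked_pages + len / page_size;
	unsigned long limit_pages = limit_bytes / page_size;

	if (wanted_pages > limit_pages && !privileged)
		return -EAGAIN;
	return 0;
}

/*
 * Read the soft RLIMIT_MEMLOCK of the calling process in bytes.
 * Returns 0 if the limit cannot be queried.
 */
static unsigned long current_memlock_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return 0;
	return (unsigned long)rl.rlim_cur;
}
```

With a 64 KiB limit and 4 KiB pages, a task that already has 16 pages
locked would be refused one more page in this model unless it is
privileged. Memory that is unswappable but never passes through such a
check is simply invisible to this limit.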
>> I mean, my concern is not earth shattering, this can be reworked later. As I
>> said, it just feels wrong.
>>
>> --
>> Thanks,
>>
>> David / dhildenb
>>
>
> --
> Sincerely yours,
> Mike.
>