Linux-api Archive on lore.kernel.org
From: James Bottomley <jejb@linux.ibm.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>,
	David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Borislav Petkov <bp@alien8.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christopher Lameter <cl@linux.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Elena Reshetova <elena.reshetova@intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Matthew Wilcox <willy@infradead.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Roman Gushchin <guro@fb.com>, Shakeel Butt <shakeelb@google.com>,
	Shuah Khan <shuah@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Tycho Andersen <tycho@tycho.ws>, Will Deacon <will@kernel.org>,
	linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org,
	x86@kernel.org, Hagen Paul Pfeifer <hagen@jauu.net>,
	Palmer Dabbelt <palmerdabbelt@google.com>
Subject: Re: [PATCH v16 07/11] secretmem: use PMD-size pages to amortize direct map fragmentation
Date: Mon, 01 Feb 2021 08:56:19 -0800
Message-ID: <6de6b9f9c2d28eecc494e7db6ffbedc262317e11.camel@linux.ibm.com> (raw)
In-Reply-To: <YBPF8ETGBHUzxaZR@dhcp22.suse.cz>

On Fri, 2021-01-29 at 09:23 +0100, Michal Hocko wrote:
> On Thu 28-01-21 13:05:02, James Bottomley wrote:
> > Obviously the API choice could be revisited
> > but do you have anything to add over the previous discussion, or is
> > this just to get your access control?
> 
> Well, access control is certainly one thing which I still believe is
> missing. But if there is a general agreement that the direct map
> manipulation is not that critical then this will become much less of
> a problem of course.

The secret memory is a scarce resource but it's not a facility that
should only be available to some users.

> It all boils down to whether secret memory is a scarce resource. With
> the existing implementation it really is. It is effectively
> repeating the same design errors as hugetlb did. And look now, we have a
> subtle and convoluted reservation code to track mmap requests and we
> have a cgroup controller to, guess what, have at least some control
> over distribution of the preallocated pool. See where I am coming
> from?

I'm fairly sure rlimit is the correct way to control this.  The
subtlety in both rlimit and memcg tracking comes from deciding to
account under an existing category rather than having our own new one. 
People don't like new stuff in accounting because it requires
modifications to everything in userspace.  Accounting under an
existing limit keeps userspace the same but leads to endless arguments
about which limit it should be under.  It took us several patch set
iterations to get to a fragile consensus on this which you're now
disrupting for reasons you're not making clear.
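
As a rough illustration of what accounting under an existing limit means
for userspace, the sketch below checks a request against RLIMIT_MEMLOCK,
the limit mlock() already uses.  The helper name fits_memlock_limit is
hypothetical, and the sketch ignores pages the process has already
locked, which a real check would have to count.

```c
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: returns 1 if a request of `bytes` would fit under
 * the soft RLIMIT_MEMLOCK limit, 0 if it would not, and -1 on error.
 * Simplification: pages the process has already locked are not counted.
 */
int fits_memlock_limit(size_t bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	/* An unlimited soft limit admits any request. */
	if (rl.rlim_cur == RLIM_INFINITY)
		return 1;
	return (rlim_t)bytes <= rl.rlim_cur ? 1 : 0;
}
```

Because the limit already exists, anything that sets or reports it today
(ulimit -l, for instance) keeps working without modification.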

> If the secret memory is more in line with mlock without any imposed
> limit (other than available memory) in the end then, sure, using the
> same access control as mlock sounds reasonable. Btw. if this is
> really just a more restrictive mlock then is there any reason to not
> hook this into the existing mlock infrastructure (e.g.
> MCL_EXCLUSIVE)? Implications would be that direct map would be
> handled on instantiation/tear down paths, migration would deal with
> the same (if possible). Other than that it would be mlock like.

In the very first patch set we proposed an mmap flag to do this.  Under
detailed probing it emerged that this suffers from several design
problems: the KVM people want the VMM to be able to remove the secret
memory range from the process; there may be situations where sharing is
useful; and some people want to be able to seal the operations.  All of
this ended up convincing everyone that a file descriptor based approach
was better than an mmap one.
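
For readers who have not followed the series, a minimal sketch of what
the file descriptor based approach looks like from userspace follows.
It assumes the memfd_secret system call takes a single flags argument
and that the resulting descriptor is sized with ftruncate() and mapped
MAP_SHARED.  The fallback syscall number 447 and the helper name
secret_area_create are assumptions for illustration, and the call is
expected to fail on kernels without the feature.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_memfd_secret
#define __NR_memfd_secret 447	/* assumption: x86-64 / generic table number */
#endif

/*
 * Hypothetical helper: create a secret memory area of `len` bytes.
 * On success returns the mapping and stores the descriptor in *fd_out.
 * On failure returns NULL and stores -1, for example on kernels that
 * lack memfd_secret or have secretmem disabled.
 */
void *secret_area_create(size_t len, int *fd_out)
{
	void *p;
	int fd = (int)syscall(__NR_memfd_secret, 0);

	*fd_out = -1;
	if (fd < 0)
		return NULL;
	/* Size the file first; the descriptor is the handle to the area. */
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return NULL;
	}
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return NULL;
	}
	*fd_out = fd;
	return p;
}
```

The point of the descriptor is that the area has a handle separate from
any one mapping, which is what an mmap flag alone cannot offer.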

James



