From: Ingo Molnar <mingo@kernel.org>
To: Dave Hansen <dave@sr71.net>
Cc: Mel Gorman <mgorman@techsingularity.net>,
linux-kernel@vger.kernel.org, x86@kernel.org,
linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
linux-mm@kvack.org, torvalds@linux-foundation.org,
akpm@linux-foundation.org, dave.hansen@linux.intel.com,
arnd@arndb.de, hughd@google.com, viro@zeniv.linux.org.uk,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 6/9] x86, pkeys: add pkey set/get syscalls
Date: Sat, 9 Jul 2016 10:37:15 +0200
Message-ID: <20160709083715.GA29939@gmail.com>
In-Reply-To: <577FD587.6050101@sr71.net>
* Dave Hansen <dave@sr71.net> wrote:
> On 07/08/2016 12:18 AM, Ingo Molnar wrote:
>
> > So the question is, what is user-space going to do? Do any glibc patches
> > exist? How are the user-space library side APIs going to look like?
>
> My goal at the moment is to get folks enabled to the point that they can start
> modifying apps to use pkeys without having to patch their kernels.
> I don't have confidence that we can design good high-level userspace interfaces
> without seeing some real apps try to use the low-level ones and seeing how they
> struggle.
>
> I had some glibc code to do the pkey alloc/free operations, but those aren't
> necessary if we're doing it in the kernel. Other than getting the syscall
> wrappers in place, I don't have any immediate plans to do anything in glibc.
>
> Was there something you were expecting to see?
Yeah, so (as you probably guessed!) I'm starting to have second thoughts about the
complexity of the alloc/free/set/get interface I suggested, and Mel's review
certainly strengthened that feeling.
I have two worries:
1)
A technical worry I have is that the 'pkey allocation interface' does not seem to
take the per-thread property of pkeys into account, although that property
would be useful for apps. That limitation seems unjustified.
The reason for this is that we are storing the key allocation bitmap in
struct mm_struct, in mm->context.pkey_allocation_map - while we should be
storing it in task_struct or thread_info.
We could solve this by moving the allocation bitmap to the task struct, but:
2)
My main worry is that it appears at this stage that we are still pretty far away
from completely shadowing the hardware pkey state in the kernel - and without that
we cannot really force user-space to use the 'proper' APIs. They can just use the
raw instructions, condition them on a CPUID and be done with it: everything can be
organized in user-space.
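To make the "raw instructions conditioned on a CPUID" point concrete, here is a minimal sketch of what such a user-space-only path might look like. The helper names (cpu_has_ospke, pkru_with_rights, thread_set_pkey_rights) are made up for illustration and are not from the patch set; the CPUID bit (leaf 7, subleaf 0, ECX bit 4, OSPKE), the PKRU layout (two bits per key: access-disable, write-disable) and the RDPKRU/WRPKRU encodings follow the Intel SDM. It assumes GCC or Clang, and on non-x86 builds the helpers simply report "not supported".

```c
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang wrapper around the CPUID instruction */
#define PKEYS_ARCH_X86 1
#else
#define PKEYS_ARCH_X86 0
#endif

#define NR_PKEYS 16
#define PKRU_AD  0x1u           /* access-disable bit of a key's 2-bit field */
#define PKRU_WD  0x2u           /* write-disable bit of a key's 2-bit field */

/* Hypothetical helper: 1 if the OS has enabled pkeys (CPUID.7.0:ECX.OSPKE). */
static inline int cpu_has_ospke(void)
{
#if PKEYS_ARCH_X86
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return 0;
    return (ecx >> 4) & 1;
#else
    return 0;
#endif
}

/* Pure helper: return 'pkru' with the 2-bit field of 'pkey' set to 'rights'. */
static inline uint32_t pkru_with_rights(uint32_t pkru, unsigned int pkey,
                                        uint32_t rights)
{
    unsigned int shift;

    assert(pkey < NR_PKEYS);
    shift = 2 * pkey;
    pkru &= ~(0x3u << shift);
    pkru |= (rights & 0x3u) << shift;
    return pkru;
}

/*
 * Hypothetical helper: update the calling thread's PKRU for one key without
 * any system call. Returns 0 on success, -1 if pkeys are not usable here.
 */
static inline int thread_set_pkey_rights(unsigned int pkey, uint32_t rights)
{
#if PKEYS_ARCH_X86
    uint32_t eax, edx;

    if (!cpu_has_ospke())
        return -1;
    /* RDPKRU: ECX must be 0, result in EAX, EDX is cleared */
    __asm__ volatile(".byte 0x0f,0x01,0xee" : "=a"(eax), "=d"(edx) : "c"(0));
    (void)edx;
    eax = pkru_with_rights(eax, pkey, rights);
    /* WRPKRU: ECX and EDX must be 0, new value in EAX */
    __asm__ volatile(".byte 0x0f,0x01,0xef" : : "a"(eax), "c"(0), "d"(0) : "memory");
    return 0;
#else
    (void)pkey;
    (void)rights;
    return -1;
#endif
}
```

Nothing in this path goes through the kernel once the CPUID check has passed, which is why kernel-side bookkeeping cannot be enforced without shadowing the hardware state.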
Furthermore, implementing such shadowing in a high-performance fashion would be
pretty complex - at minimum we'd have to register a per-thread read-write
user-space data area where the kernel could store pkeys management data so that
vsyscalls can access it ... No such facility exists today.
And without vsyscall optimizations user-space might legitimately use its own
implementation for performance reasons and we'd end up with twice the complexity
and a largely unused piece of kernel infrastructure ...
So how about the following minimalistic approach instead, to get the ball rolling
without making ABI decisions we might regret:
- There are 16 pkey indices on x86 currently. We already use index 15 for the
  true PROT_EXEC implementation. Let's set aside another pkey index for the
  kernel's potential future use (index 14), and clear it explicitly in the
  FPU context on every context switch if CONFIG_X86_DEBUG_FPU is enabled to make
  sure it remains unallocated.
- Expose just the new pkey_mprotect() system call to install a pkey index into
  the page tables - but we let user-space organize its key allocations.
- Give user-space an idea about limits:
  "ALL THESE WORLDS ARE YOURS - EXCEPT EUROPA. ATTEMPT NO LANDING THERE"
  Ooops, wrong one. Let's try this instead:
  Expose the current maximum user-space usable pkeys index in some
  programmatically accessible fashion. Maybe pkey_mprotect() could reject a
  permanently allocated kernel pkey index via a distinctive error code?
  I.e. this pattern:
      ret = pkey_mprotect(NULL, PAGE_SIZE, real_prot, pkey);
  ... would validate the pkey and we'd return -EOPNOTSUPP for a pkey that is not
  available? This would allow maximum future flexibility as it would not define
  kernel allocated pkeys as a 'range'.
- ... and otherwise leave the remaining 14 pkey indices for user-space to manage.
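As a rough illustration of "user-space organizes its key allocations", the sketch below shows one way a library might keep its own bitmap. All names here (struct pkey_pool, pkey_probe_fn, example_probe and the pkey_pool_*() helpers) are hypothetical. The probe callback stands in for the validation call suggested above, whose -EOPNOTSUPP behaviour is only a proposal at this point; example_probe() merely hard-codes indices 14 and 15 as kernel-owned. Keeping index 0 out of the pool is this sketch's own choice, made because 0 is the default key of every mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_PKEYS 16

/* Returns 0 if 'pkey' is usable by user-space, -EOPNOTSUPP otherwise. */
typedef int (*pkey_probe_fn)(int pkey);

struct pkey_pool {
    uint16_t reserved;  /* bit k set => index k may never be handed out */
    uint16_t in_use;    /* bit k set => index k is reserved or allocated */
};

/* Stand-in for the proposed probe: pretend 14 and 15 belong to the kernel. */
static inline int example_probe(int pkey)
{
    return pkey >= 14 ? -EOPNOTSUPP : 0;
}

/* Build the pool by probing every index once. */
static inline void pkey_pool_init(struct pkey_pool *pool, pkey_probe_fn probe)
{
    int k;

    assert(pool && probe);
    pool->reserved = 1u;                    /* index 0: default key */
    for (k = 1; k < NR_PKEYS; k++)
        if (probe(k) == -EOPNOTSUPP)
            pool->reserved |= (uint16_t)(1u << k);
    pool->in_use = pool->reserved;
}

/* Hand out the lowest free index, or -1 if none is left. */
static inline int pkey_pool_alloc(struct pkey_pool *pool)
{
    int k;

    for (k = 0; k < NR_PKEYS; k++) {
        if (!(pool->in_use & (1u << k))) {
            pool->in_use |= (uint16_t)(1u << k);
            return k;
        }
    }
    return -1;
}

/* Return an index to the pool; -1 for reserved, unallocated or bogus keys. */
static inline int pkey_pool_free(struct pkey_pool *pool, int pkey)
{
    if (pkey < 0 || pkey >= NR_PKEYS)
        return -1;
    if (pool->reserved & (1u << pkey))
        return -1;
    if (!(pool->in_use & (1u << pkey)))
        return -1;
    pool->in_use &= (uint16_t)~(1u << pkey);
    return 0;
}
```

Because the pool is an ordinary struct, an application could keep one per process or one per thread; the sketch deliberately leaves that policy to the caller.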
If in the future user-space pkeys usage grows to such a level that kernel
arbitration becomes desirable then we can still implement the get/set/alloc/free
system calls as well: the first use of those system calls would switch on the
kernel's pkey management facilities and from that point on user-space is supposed
to use the published system calls only. Applications using pkey instructions
directly would still work just fine: they'd never use the new system calls.
I.e. we can actually keep a bigger ABI flexibility by introducing the simplest
possible ABI at this stage. Maybe user-space usage of this hardware feature will
never grow beyond that simple ABI - in which case we've saved quite a bit of
ongoing maintenance complexity...
And yes, I realize that we've come full circle since the very first version of
this patch set, but I think the extra hoops were still worth it, because the
true-PROT_EXEC feature came out of it which is very useful IMHO. But my more
complex pkey management syscall ideas don't seem to be all that useful anymore.
So what do you think about this direction? This would simplify the patch set quite
a bit and would touch very little MM code beyond the pkey_mprotect() bits.
Thanks,
Ingo