From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>,
Linux API <linux-api@vger.kernel.org>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
Linux FS Devel <linux-fsdevel@vger.kernel.org>,
linux-integrity <linux-integrity@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
LSM List <linux-security-module@vger.kernel.org>,
Oleg Nesterov <oleg@redhat.com>, X86 ML <x86@kernel.org>
Subject: Re: [PATCH v1 0/4] [RFC] Implement Trampoline File Descriptor
Date: Tue, 28 Jul 2020 14:01:12 -0500
Message-ID: <8b28f4a5-2d9e-0686-40e5-2ea9e37c5933@linux.microsoft.com>
In-Reply-To: <CALCETrVy5OMuUx04-wWk9FJbSxkrT2vMfN_kANinudrDwC4Cig@mail.gmail.com>
I am working on a response to this. I will send it soon.
Thanks.
Madhavan
On 7/28/20 12:31 PM, Andy Lutomirski wrote:
>> On Jul 28, 2020, at 6:11 AM, madvenka@linux.microsoft.com wrote:
>>
>> From: "Madhavan T. Venkataraman" <madvenka@linux.microsoft.com>
>>
>> The kernel creates the trampoline mapping without any permissions. When
>> the trampoline is executed by user code, a page fault happens and the
>> kernel gets control. The kernel recognizes that this is a trampoline
>> invocation. It sets up the user registers based on the specified
>> register context, and/or pushes values on the user stack based on the
>> specified stack context, and sets the user PC to the requested target
>> PC. When the kernel returns, execution continues at the target PC.
>> So, the kernel does the work of the trampoline on behalf of the
>> application.
> This is quite clever, but now I’m wondering just how much kernel help
> is really needed. In your series, the trampoline is a non-executable
> page. I can think of at least two alternative approaches, and I'd
> like to know the pros and cons.
>
> 1. Entirely userspace: a return trampoline would be something like:
>
> 1:
> pushq %rax
> pushq %rbx
> pushq %rcx
> ...
> pushq %r15
> movq %rsp, %rdi # pointer to saved regs
> leaq 1b(%rip), %rsi # pointer to the trampoline itself
> callq trampoline_handler # see below
>
> You would fill a page with a bunch of these, possibly compacted to get
> more per page, and then you would remap as many copies as needed. The
> 'callq trampoline_handler' part would need to be a bit clever to make
> it continue to work despite this remapping. This will be *much*
> faster than trampfd. How much of your use case would it cover? For
> the inverse, it's not too hard to write a bit of asm to set all
> registers and jump somewhere.
>
> 2. Use existing kernel functionality. Raise a signal, modify the
> state, and return from the signal. This is very flexible and may not
> be all that much slower than trampfd.
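Approach 2 can be prototyped with existing interfaces. The following is
only a rough sketch, assuming Linux on x86_64 with glibc: a PROT_NONE page
stands in for the trampoline, and a SIGSEGV handler plays the role the
kernel plays in the series, loading one register and the PC from a saved
context. All names (tramp_setup, tramp_call, add_one, ...) are made up for
illustration and are not part of trampfd.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

#if !defined(__linux__) || !defined(__x86_64__)
#error "this sketch assumes Linux on x86_64"
#endif

/* Hypothetical state; a real design would keep this per trampoline. */
static void *tramp_page;            /* PROT_NONE page acting as the trampoline */
static long (*tramp_target)(long);  /* target PC */
static long tramp_arg;              /* "register context": value for %rdi */

/* Runs when the trampoline page is executed and the fetch faults. */
static void tramp_fault(int sig, siginfo_t *si, void *ctx)
{
    ucontext_t *uc = ctx;

    (void)sig;
    if (si->si_addr != tramp_page) {
        /* Unrelated fault: restore the default action and re-fault. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    /* Modify the saved state; sigreturn then resumes at the target. */
    uc->uc_mcontext.gregs[REG_RDI] = tramp_arg;
    uc->uc_mcontext.gregs[REG_RIP] = (greg_t)(uintptr_t)tramp_target;
}

/* Map the trampoline page without permissions and install the handler. */
static int tramp_setup(long (*target)(long), long arg)
{
    struct sigaction sa = {0};

    tramp_page = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (tramp_page == MAP_FAILED)
        return -1;
    tramp_target = target;
    tramp_arg = arg;
    sa.sa_sigaction = tramp_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}

/* Call the trampoline as if it were an ordinary function. */
static long tramp_call(void)
{
    return ((long (*)(void))(uintptr_t)tramp_page)();
}

/* Example target: receives the value the handler placed in %rdi. */
static long add_one(long x)
{
    return x + 1;
}
```

After tramp_setup(add_one, 41), tramp_call() faults on the trampoline
page, the handler redirects execution to add_one with %rdi = 41, and the
call returns 42 to the original caller. Every invocation pays for a page
fault, a signal delivery and a sigreturn, so this also gives a way to
measure how the signal route compares with trampfd.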
>
> 3. Use a syscall. Instead of having the kernel handle page faults,
> have the trampoline code push the syscall nr register, load a special
> new syscall nr into the syscall nr register, and do a syscall. On
> x86_64, this would be:
>
> pushq %rax
> movq $__NR_magic_trampoline, %rax
> syscall
>
> with some adjustment if the stack slot you're clobbering is important.
>
>
> Also, will using trampfd cause issues with various unwinders? I can
> easily imagine unwinders expecting code to be readable, although this
> is slowly going away for other reasons.
>
> All this being said, I think that the kernel should absolutely add a
> sensible interface for JITs to use to materialize their code. This
> would integrate sanely with LSMs and wouldn't require hacks like using
> files, etc. A cleverly designed JIT interface could function without
> serialization IPIs, and even lame architectures like x86 could
> potentially avoid shootdown IPIs if the interface copied code instead
> of playing virtual memory games. At its very simplest, this could be:
>
> void *jit_create_code(const void *source, size_t len);
>
> and the result would be a new anonymous mapping that contains exactly
> the code requested. There could also be:
>
> int jittfd_create(...);
>
> that does something similar but creates a memfd. A nicer
> implementation for short JIT sequences would allow appending more code
> to an existing JIT region. On x86, an appendable JIT region would
> start filled with 0xCC, and I bet there's a way to materialize new
> code into a previously 0xCC-filled virtual page without any
> synchronization. One approach would be to start with:
>
> <some code>
> 0xcc
> 0xcc
> ...
> 0xcc
>
> and to create a whole new page like:
>
> <some code>
> <some more code>
> 0xcc
> ...
> 0xcc
>
> so that the only difference is that some code changed to some more
> code. Then replace the PTE to swap from the old page to the new page,
> and arrange to avoid freeing the old page until we're sure it's gone
> from all TLBs. This may not work if <some more code> spans a page
> boundary. The #BP fixup would zap the TLB and retry. Even just
> directly copying code over some 0xcc bytes almost works, but there's a
> nasty corner case involving instructions that cross I$ fetch
> boundaries. I'm not sure to what extent I$ snooping helps.
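The semantics of the proposed jit_create_code() can be approximated in
userspace with mmap() and mprotect(). This is a sketch under that
assumption only, not the proposed kernel interface: it still goes through
a writable mapping and a permission change, which is what a kernel
implementation that copies the code itself could avoid.
jit_create_code_sketch is a hypothetical name, and the 0xCC padding is
x86-specific.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for jit_create_code(): returns a new
 * anonymous read+execute mapping that starts with a copy of `source`,
 * or NULL on failure. The rest of the mapping is padded with 0xCC.
 */
static void *jit_create_code_sketch(const void *source, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t map_len;
    unsigned char *p;

    if (source == NULL || len == 0)
        return NULL;
    map_len = (len + page - 1) / page * page;   /* round up to whole pages */

    p = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    memset(p, 0xCC, map_len);   /* int3 filler on x86 */
    memcpy(p, source, len);

    /* Drop write permission before the code can ever be executed. */
    if (mprotect(p, map_len, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, map_len);
        return NULL;
    }
    /* No-op on x86; needed on architectures with incoherent I-caches. */
    __builtin___clear_cache((char *)p, (char *)p + map_len);
    return p;
}
```

On x86-64, passing the six bytes b8 2a 00 00 00 c3 (movl $42, %eax; ret)
and calling the returned pointer as an int (*)(void) should return 42.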
>
> --Andy