From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>, Jann Horn <jannh@google.com>,
"Linus Torvalds" <torvalds@linux-foundation.org>,
Rich Felker <dalias@libc.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
Jethro Beekman <jethro@fortanix.com>,
Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>,
Florian Weimer <fweimer@redhat.com>,
Linux API <linux-api@vger.kernel.org>, X86 ML <x86@kernel.org>,
linux-arch <linux-arch@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>, <nhorman@redhat.com>,
<npmccallum@redhat.com>, "Ayoun, Serge" <serge.ayoun@intel.com>,
<shay.katz-zamir@intel.com>, <linux-sgx@vger.kernel.org>,
Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
"Carlos O'Donell" <carlos@redhat.com>,
<adhemerval.zanella@linaro.org>
Subject: Re: RFC: userspace exception fixups
Date: Wed, 7 Nov 2018 11:01:15 -0800
Message-ID: <20181107190114.GA26603@linux.intel.com>
In-Reply-To: <20181107153452.GB22972@linux.intel.com>
On Wed, Nov 07, 2018 at 07:34:52AM -0800, Sean Christopherson wrote:
> On Tue, Nov 06, 2018 at 05:17:14PM -0800, Andy Lutomirski wrote:
> > On Tue, Nov 6, 2018 at 4:02 PM Sean Christopherson
> > <sean.j.christopherson@intel.com> wrote:
> > >
> > > On Tue, Nov 06, 2018 at 03:39:48PM -0800, Andy Lutomirski wrote:
> > > > On Tue, Nov 6, 2018 at 3:35 PM Sean Christopherson
> > > > <sean.j.christopherson@intel.com> wrote:
> > > > >
> > > > > Sorry if I'm beating a dead horse, but what if we only did fixup on ENCLU
> > > > > with a specific (ignored) prefix pattern? I.e. effectively make the magic
> > > > > fixup opt-in, falling back to signals. Jamming RIP to skip ENCLU isn't
> > > > > that far off the architecture, e.g. EENTER stuffs RCX with the next RIP so
> > > > > that the enclave can EEXIT to immediately after the EENTER location.
> > > > >
> > > >
> > > > How does that even work, though? On an AEX, RIP points to the ERESUME
> > > > instruction, not the EENTER instruction, so if we skip it we just end
> > > > up in lala land.
> > >
> > > Userspace would obviously need to be aware of the fixup behavior, but
> > > it actually works out fairly nicely to have a separate path for ERESUME
> > > > fixup since a fault on EENTER is generally fatal, whereas a fault on
> > > ERESUME might be recoverable.
> > >
> >
> > Hmm.
> >
> > >
> > > > do_eenter:
> > > >     mov     tcs, %rbx
> > > >     lea     async_exit, %rcx
> > > >     mov     $EENTER, %rax
> > > >     ENCLU
> >
> > Or SOME_SILLY_PREFIX ENCLU?
>
> Yeah, forgot to include that.
>
> > >
> > > /*
> > > * EEXIT or EENTER faulted. In the latter case, %RAX already holds some
> > > * fault indicator, e.g. -EFAULT.
> > > */
> > > eexit_or_eenter_fault:
> > > ret
> >
> > But userspace wants to know whether it was a fault or not. So I think
> > we either need two landing pads or we need to hijack a flag bit (are
> > there any known-zeroed flag bits after EEXIT?) to say whether it was a
> > fault. And, if it was a fault, we should give the vector, the
> > sanitized error code, and possibly CR2.
>
> As Jethro mentioned, RAX will always be 4 on a successful EEXIT, so we
> can use RAX to indicate a fault. That's what I was trying to imply with
> EFAULT. Here's the reg stuffing I use for the POC:
>
> regs->ax = EFAULT;
> regs->di = trapnr;
> regs->si = error_code;
> regs->dx = address;
>
>
> Well-known RAX values also mean the kernel fault handlers only need to
> look for SOME_SILLY_PREFIX ENCLU if RAX==2 || RAX==3, i.e. the fault
> occurred on EENTER or in an enclave (RAX is set to ERESUME's leaf as
> part of the asynchronous enclave exit flow).

POC kernel code, 64-bit only.

Limiting this to 64-bit isn't necessary, but it makes the code prettier
and allows using REX as the magic prefix. I like the idea of using REX
because it seems least likely to be repurposed for yet another new
feature. I have no idea if 64-bit only will fly with the SDK folks.

Going off comments in similar code related to UMIP, we'd need to figure
out how to handle protection keys.
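
To make the "REX with all bits set" choice concrete, here is a minimal
userspace sketch of how a REX prefix byte decomposes. The helper names
(is_rex_prefix, decode_rex, struct rex_bits) are made up for
illustration and are not part of the POC; the only facts relied on are
that REX occupies 0x40..0x4F in 64-bit mode and that the low nibble is
the W, R, X, B bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper type: the four flag bits carried by a REX prefix. */
struct rex_bits {
	bool w;	/* bit 3: 64-bit operand size */
	bool r;	/* bit 2: extends ModRM.reg */
	bool x;	/* bit 1: extends SIB.index */
	bool b;	/* bit 0: extends ModRM.rm / SIB.base */
};

/* In 64-bit mode a REX prefix is any byte with high nibble 0100b. */
static bool is_rex_prefix(uint8_t byte)
{
	return (byte & 0xF0) == 0x40;
}

/* Split the low nibble of a REX byte into its W/R/X/B flags. */
static struct rex_bits decode_rex(uint8_t byte)
{
	struct rex_bits bits = {
		.w = (byte & 0x08) != 0,
		.r = (byte & 0x04) != 0,
		.x = (byte & 0x02) != 0,
		.b = (byte & 0x01) != 0,
	};

	return bits;
}
```

With this, 0x4F decodes to W=R=X=B=1, which is the value the POC uses
as SGX_DO_ENCLU_FIXUP, while 0x0F (the first ENCLU opcode byte) is not
a REX prefix at all.
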

/* REX with all bits set, ignored by ENCLU. */
#define SGX_DO_ENCLU_FIXUP		0x4F

#define SGX_ENCLU_OPCODE0		0x0F
#define SGX_ENCLU_OPCODE1		0x01
#define SGX_ENCLU_OPCODE2		0xD7

/* ENCLU is a three-byte opcode, plus one byte for the magic prefix. */
#define SGX_ENCLU_FIXUP_INSN_LEN	4

static int sgx_detect_enclu(struct pt_regs *regs)
{
	unsigned char buf[SGX_ENCLU_FIXUP_INSN_LEN];

	/* Look for EENTER or ERESUME in RAX, 64-bit mode only. */
	if (!regs || (regs->ax != 2 && regs->ax != 3) || !user_64bit_mode(regs))
		return 0;

	if (copy_from_user(buf, (void __user *)(regs->ip), sizeof(buf)))
		return 0;

	if (buf[0] == SGX_DO_ENCLU_FIXUP &&
	    buf[1] == SGX_ENCLU_OPCODE0 &&
	    buf[2] == SGX_ENCLU_OPCODE1 &&
	    buf[3] == SGX_ENCLU_OPCODE2)
		return SGX_ENCLU_FIXUP_INSN_LEN;

	return 0;
}

bool sgx_fixup_enclu_fault(struct pt_regs *regs, int trapnr,
			   unsigned long error_code, unsigned long address)
{
	int insn_len;

	insn_len = sgx_detect_enclu(regs);
	if (!insn_len)
		return false;

	regs->ip += insn_len;
	regs->ax = EFAULT;
	regs->di = trapnr;
	regs->si = error_code;
	regs->dx = address;
	return true;
}
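
For the userspace side, a rough sketch of how the code following the
prefixed ENCLU might interpret the registers the fixup leaves behind.
This is only an illustration under the conventions described above
(RAX == 4 after a successful EEXIT, RAX == EFAULT with RDI/RSI/RDX
holding trapnr/error_code/address after a fixup). The struct and
function names are hypothetical, and a real runtime would capture the
registers in assembly immediately after ENCLU.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* ENCLU leaf number left in RAX by a successful EEXIT. */
#define ENCLU_LEAF_EEXIT	4

/* Hypothetical snapshot of the registers taken right after ENCLU. */
struct enclu_regs {
	uint64_t rax, rdi, rsi, rdx;
};

enum enclu_outcome {
	ENCLU_EXITED,	/* enclave left normally via EEXIT */
	ENCLU_FAULTED,	/* kernel skipped ENCLU and stuffed fault info */
	ENCLU_UNKNOWN,	/* RAX matches neither convention */
};

/* Hypothetical decoded result; fault fields are zero unless faulted. */
struct enclu_result {
	enum enclu_outcome outcome;
	uint64_t trapnr;	/* exception vector, from RDI */
	uint64_t error_code;	/* exception error code, from RSI */
	uint64_t address;	/* faulting address, from RDX */
};

static struct enclu_result classify_enclu(const struct enclu_regs *regs)
{
	struct enclu_result res = { .outcome = ENCLU_UNKNOWN };

	if (regs->rax == ENCLU_LEAF_EEXIT) {
		res.outcome = ENCLU_EXITED;
	} else if (regs->rax == EFAULT) {
		/* Mirror the POC's stuffing: di/si/dx carry the fault details. */
		res.outcome = ENCLU_FAULTED;
		res.trapnr = regs->rdi;
		res.error_code = regs->rsi;
		res.address = regs->rdx;
	}

	return res;
}
```

Because EFAULT and the EEXIT leaf number are distinct values, a single
landing pad after ENCLU can tell the two cases apart from RAX alone,
which is the point of using well-known RAX values.
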