All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: "Jan Beulich" <jbeulich@suse.com>,
	"xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	"Roger Pau Monné" <roger.pau@citrix.com>, "Wei Liu" <wl@xen.org>
Subject: Re: [Xen-devel] [PATCH 7/7] x86emul: support SYSRET
Date: Wed, 25 Mar 2020 10:00:32 +0000	[thread overview]
Message-ID: <9af3c1bb-5b8f-4ff5-c9ce-2f34af652814@citrix.com> (raw)
In-Reply-To: <78b62646-6fd4-e5b3-bc09-783bb017eaaa@suse.com>

On 24/03/2020 16:29, Jan Beulich wrote:
> This is to augment SYSCALL, which has been supported for quite some
> time.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

I've compared this to the in-progress version I have in my XSA-204
follow-on series.  I'm afraid the behaviour has far more vendor specific
quirks than this.

>
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -5975,6 +5975,60 @@ x86_emulate(
>              goto done;
>          break;
>  
> +    case X86EMUL_OPC(0x0f, 0x07): /* sysret */
> +        vcpu_must_have(syscall);
> +        /* Inject #UD if syscall/sysret are disabled. */
> +        fail_if(!ops->read_msr);
> +        if ( (rc = ops->read_msr(MSR_EFER, &msr_val, ctxt)) != X86EMUL_OKAY )
> +            goto done;
> +        generate_exception_if((msr_val & EFER_SCE) == 0, EXC_UD);

(as with the SYSCALL side), no need for the vcpu_must_have(syscall) as
well as this check.

> +        generate_exception_if(!amd_like(ctxt) && !mode_64bit(), EXC_UD);
> +        generate_exception_if(!mode_ring0(), EXC_GP, 0);
> +        generate_exception_if(!in_protmode(ctxt, ops), EXC_GP, 0);
> +

The Intel SYSRET vulnerability checks regs->rcx for canonicity here, and
raises #GP here.

I see you've got it below, but this is where the Intel pseudocode puts
it, before MSR_STAR gets read, and logically it should be grouped with
the other excpetions.

> +        if ( (rc = ops->read_msr(MSR_STAR, &msr_val, ctxt)) != X86EMUL_OKAY )
> +            goto done;
> +        sreg.sel = ((msr_val >> 48) + 8) | 3; /* SELECTOR_RPL_MASK */

This would be the logical behaviour...

AMD CPUs |3 into %cs.sel, but don't make an equivalent adjustment for
%ss.sel, and simply take MSR_START.SYSRET_CS + 8.

If you aren't careful with MSR_STAR, SYSRET will return to userspace
with mismatching RPL/DPL and userspace can really find itself with an
%ss with an RPL of 0.  (Of course, when you take an interrupt and
attempt to IRET back to this context, things fall apart).

I discovered this entirely by accident in XTF, but it is confirmed by
careful reading of the AMD SYSRET pseudocode.

> +        cs.sel = op_bytes == 8 ? sreg.sel + 8 : sreg.sel - 8;
> +
> +        cs.base = sreg.base = 0; /* flat segment */
> +        cs.limit = sreg.limit = ~0u; /* 4GB limit */
> +        cs.attr = 0xcfb; /* G+DB+P+DPL3+S+Code */
> +        sreg.attr = 0xcf3; /* G+DB+P+DPL3+S+Data */

Again, that would be the logical behaviour...

AMD CPU's don't update anything but %ss.sel, and even comment the fact
in pseudocode now.

This was discovered by Andy Luto, where he found that taking an
interrupt (unconditionally sets %ss to NUL), and opportunistic sysret
back to 32bit userspace lets userspace see a sane %ss value, but with
the attrs still empty, and the stack unusable.

> +
> +#ifdef __x86_64__
> +        if ( mode_64bit() )
> +        {
> +            if ( op_bytes == 8 )
> +            {
> +                cs.attr = 0xafb; /* L+DB+P+DPL3+S+Code */
> +                generate_exception_if(!is_canonical_address(_regs.rcx) &&
> +                                      !amd_like(ctxt), EXC_GP, 0);

Wherever this ends up living, I think it needs calling out with a
comment /* CVE-xxx, Intel privilege escalation hole */, as it is a very
subtle piece of vendor specific behaviour.

Do we have a Centaur/other CPU to try with?  I'd err on the side of
going with == Intel rather than !AMD to avoid introducing known
vulnerabilities into models which stand half a chance of not being affected.

> +                _regs.rip = _regs.rcx;
> +            }
> +            else
> +                _regs.rip = _regs.ecx;
> +
> +            _regs.eflags = _regs.r11 & ~(X86_EFLAGS_RF | X86_EFLAGS_VM);
> +        }
> +        else
> +#endif
> +        {
> +            _regs.r(ip) = _regs.ecx;
> +            _regs.eflags |= X86_EFLAGS_IF;
> +        }
> +
> +        fail_if(!ops->write_segment);
> +        if ( (rc = ops->write_segment(x86_seg_cs, &cs, ctxt)) != X86EMUL_OKAY ||
> +             (!amd_like(ctxt) &&
> +              (rc = ops->write_segment(x86_seg_ss, &sreg,
> +                                       ctxt)) != X86EMUL_OKAY) )

Oh - here is the AMD behaviour with %ss, but its not quite correct.

AFAICT, the correct behaviour is to read the old %ss on AMD-like, set
flat attributes on Intel, and write back normally, because %ss.sel does
get updated.

~Andrew

> +            goto done;
> +
> +        singlestep = _regs.eflags & X86_EFLAGS_TF;
> +        break;
> +
>      case X86EMUL_OPC(0x0f, 0x08): /* invd */
>      case X86EMUL_OPC(0x0f, 0x09): /* wbinvd / wbnoinvd */
>          generate_exception_if(!mode_ring0(), EXC_GP, 0);
>



  reply	other threads:[~2020-03-25 10:01 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-24 16:18 [Xen-devel] [PATCH 0/7] x86emul: (mainly) vendor specific behavior adjustments Jan Beulich
2020-03-24 16:26 ` [Xen-devel] [PATCH 1/7] x86emul: add wrappers to check for AMD-like behavior Jan Beulich
2020-03-25 13:26   ` Andrew Cooper
2020-03-24 16:26 ` [Xen-devel] [PATCH 2/7] x86emul: vendor specific near RET behavior in 64-bit mode Jan Beulich
2020-03-25 13:36   ` Andrew Cooper
2020-03-24 16:27 ` [Xen-devel] [PATCH 3/7] x86emul: vendor specific direct branch " Jan Beulich
2020-03-25 14:10   ` Andrew Cooper
2020-03-24 16:27 ` [Xen-devel] [PATCH 4/7] x86emul: vendor specific near indirect " Jan Beulich
2020-03-25 14:11   ` Andrew Cooper
2020-03-24 16:28 ` [Xen-devel] [PATCH 5/7] x86emul: vendor specific SYSENTER/SYSEXIT behavior in long mode Jan Beulich
2020-03-25 14:15   ` Andrew Cooper
2020-03-24 16:28 ` [Xen-devel] [PATCH 6/7] x86emul: vendor specific SYSCALL behavior Jan Beulich
2020-03-25  9:44   ` Andrew Cooper
2020-03-24 16:29 ` [Xen-devel] [PATCH 7/7] x86emul: support SYSRET Jan Beulich
2020-03-25 10:00   ` Andrew Cooper [this message]
2020-03-25 10:19     ` Jan Beulich
2020-03-25 10:47       ` Andrew Cooper
2020-03-25 11:55     ` Jan Beulich
2020-03-25 12:25       ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9af3c1bb-5b8f-4ff5-c9ce-2f34af652814@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=roger.pau@citrix.com \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.