All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Wu, Feng" <feng.wu@intel.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: "andrew.cooper3@citrix.com" <andrew.cooper3@citrix.com>,
	"Dong, Eddie" <eddie.dong@intel.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"JBeulich@suse.com" <JBeulich@suse.com>,
	"ian.campbell@citrix.com" <ian.campbell@citrix.com>
Subject: Re: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen
Date: Thu, 8 May 2014 06:25:24 +0000	[thread overview]
Message-ID: <E959C4978C3B6342920538CF579893F001F71CD1@SHSMSX104.ccr.corp.intel.com> (raw)
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D125E68398@SHSMSX101.ccr.corp.intel.com>



> -----Original Message-----
> From: Tian, Kevin
> Sent: Thursday, May 08, 2014 9:21 AM
> To: Wu, Feng; xen-devel@lists.xen.org
> Cc: JBeulich@suse.com; andrew.cooper3@citrix.com; Dong, Eddie; Nakajima,
> Jun; ian.campbell@citrix.com
> Subject: RE: [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention
> (SMAP) for Xen
> 
> > From: Wu, Feng
> > Sent: Wednesday, May 07, 2014 4:20 PM
> >
> > Supervisor Mode Access Prevention (SMAP) is a new security
> > feature disclosed by Intel, please refer to the following
> > document:
> >
> > http://software.intel.com/sites/default/files/319433-014.pdf
> >
> > If CR4.SMAP = 1, supervisor-mode data accesses are not allowed
> > to linear addresses that are accessible in user mode. If CPL < 3,
> > SMAP protections are disabled if EFLAGS.AC = 1. If CPL = 3, SMAP
> > applies to all supervisor-mode data accesses (these are implicit
> > supervisor accesses) regardless of the value of EFLAGS.AC.
> >
> > This patch enables SMAP in Xen to prevent Xen hypervisor from
> > accessing pv guest data, whose translation paging-structure
> > entries' U/S flags are all set.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> could you double-check whether S3 resume and CPU hotplug are covered
> well?
> 
Kevin, thank you very much for point this out!

I checked the following code path:
wakeup_start --> wakeup_32 --> wakeup_64 --> __ret_point --> restore_rest_processor_state()

I thought there was nothing special to do with SMAP here, but after careful review, I think we should
set X86_EFLAGS_AC for MSR_SYSCALL_MASK in restore_rest_processor_state(), which can
make AC bit cleared when entering kernel mode via SYSCALL after S3 resume. However, I think this
change should be included in patch "Clear AC bit in RFLAGS to protect Xen itself by SMAP",
in which, we did the same thing for MSR_SYSCALL_MASK in subarch_percpu_traps_init().

I also check the following code path, in which, seems we don't need to do anything special for SMAP.
cpu_up() --> __cpu_up() / notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu, NULL)

Thanks,
Feng

> Thanks
> Kevin
> 
> > ---
> >  docs/misc/xen-command-line.markdown |  7 +++++
> >  xen/arch/x86/setup.c                | 27 +++++++++++++++++
> >  xen/arch/x86/traps.c                | 58
> > +++++++++++++++++++++++++++----------
> >  xen/include/asm-x86/cpufeature.h    |  1 +
> >  xen/include/asm-x86/domain.h        |  6 ++--
> >  5 files changed, 82 insertions(+), 17 deletions(-)
> >
> > diff --git a/docs/misc/xen-command-line.markdown
> > b/docs/misc/xen-command-line.markdown
> > index 7dc938b..a7ac53d 100644
> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -912,6 +912,13 @@ Set the serial transmit buffer size.
> >
> >  Flag to enable Supervisor Mode Execution Protection
> >
> > +### smap
> > +> `= <boolean>`
> > +
> > +> Default: `true`
> > +
> > +Flag to enable Supervisor Mode Access Prevention
> > +
> >  ### snb\_igd\_quirk
> >  > `= <boolean>`
> >
> > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> > index 2e30701..29b22a1 100644
> > --- a/xen/arch/x86/setup.c
> > +++ b/xen/arch/x86/setup.c
> > @@ -61,6 +61,10 @@ integer_param("maxcpus", max_cpus);
> >  static bool_t __initdata disable_smep;
> >  invbool_param("smep", disable_smep);
> >
> > +/* smap: Enable/disable Supervisor Mode Access Prevention (default on). */
> > +static bool_t __initdata disable_smap;
> > +invbool_param("smap", disable_smap);
> > +
> >  /* **** Linux config option: propagated to domain0. */
> >  /* "acpi=off":    Sisables both ACPI table parsing and interpreter. */
> >  /* "acpi=force":  Override the disable blacklist.                   */
> > @@ -1280,6 +1284,11 @@ void __init noreturn __start_xen(unsigned long
> > mbi_p)
> >      if ( cpu_has_smep )
> >          set_in_cr4(X86_CR4_SMEP);
> >
> > +    if ( disable_smap )
> > +        setup_clear_cpu_cap(X86_FEATURE_SMAP);
> > +    if ( cpu_has_smap )
> > +        set_in_cr4(X86_CR4_SMAP);
> > +
> >      if ( cpu_has_fsgsbase )
> >          set_in_cr4(X86_CR4_FSGSBASE);
> >
> > @@ -1384,6 +1393,21 @@ void __init noreturn __start_xen(unsigned long
> > mbi_p)
> >          printk(XENLOG_WARNING
> >                 "Multiple initrd candidates, picking module #%u\n",
> >                 initrdidx);
> > +    /*
> > +     * Temporarily clear SMAP bit in CR4 to allow user-page accesses in
> > +     * construct_dom0(). This looks a little hackish, but using stac()/clac()
> > +     * to do this is not appropriate here: Since construct_dom0() calls
> > +     * copy_from_user(), so we cannot wrap construct_dom0() as a whole
> > in
> > +     * stac()/clac() - the AC bit remaining cleared after copy_from_user()
> > +     * would induce problems in construct_dom0().
> > +     *
> > +     * On the other hand, if we used stac()/clac() in construct_dom0(), we
> > +     * would need to carefully handle some corner cases.
> > +     *
> > +     * So we clear SMAP before construct_dom0() and enable it back
> > afterwards.
> > +    */
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() & ~X86_CR4_SMAP);
> >
> >      /*
> >       * We're going to setup domain0 using the module(s) that we stashed
> > safely
> > @@ -1395,6 +1419,9 @@ void __init noreturn __start_xen(unsigned long
> > mbi_p)
> >                          bootstrap_map, cmdline) != 0)
> >          panic("Could not set up DOM0 guest OS");
> >
> > +    if ( cpu_has_smap )
> > +        write_cr4(read_cr4() | X86_CR4_SMAP);
> > +
> >      /* Scrub RAM that is still free and so may go to an unprivileged
> domain.
> > */
> >      scrub_heap_pages();
> >
> > diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
> > index 19c96d5..ee203cb 100644
> > --- a/xen/arch/x86/traps.c
> > +++ b/xen/arch/x86/traps.c
> > @@ -1183,11 +1183,12 @@ static int handle_gdt_ldt_mapping_fault(
> >  enum pf_type {
> >      real_fault,
> >      smep_fault,
> > +    smap_fault,
> >      spurious_fault
> >  };
> >
> >  static enum pf_type __page_fault_type(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long mfn, cr3 = read_cr3();
> >      l4_pgentry_t l4e, *l4t;
> > @@ -1195,6 +1196,7 @@ static enum pf_type __page_fault_type(
> >      l2_pgentry_t l2e, *l2t;
> >      l1_pgentry_t l1e, *l1t;
> >      unsigned int required_flags, disallowed_flags, page_user;
> > +    unsigned int error_code = regs->error_code;
> >
> >      /*
> >       * We do not take spurious page faults in IRQ handlers as we do not
> > @@ -1263,19 +1265,37 @@ static enum pf_type __page_fault_type(
> >      page_user &= l1e_get_flags(l1e);
> >
> >  leaf:
> > -    /*
> > -     * Supervisor Mode Execution Protection (SMEP):
> > -     * Disallow supervisor execution from user-accessible mappings
> > -     */
> > -    if ( (read_cr4() & X86_CR4_SMEP) && page_user &&
> > -         ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> > PFEC_insn_fetch) )
> > -        return smep_fault;
> > +    if ( page_user )
> > +    {
> > +        unsigned long cr4 = read_cr4();
> > +        /*
> > +         * Supervisor Mode Execution Prevention (SMEP):
> > +         * Disallow supervisor execution from user-accessible mappings
> > +         */
> > +        if ( (cr4 & X86_CR4_SMEP) &&
> > +             ((error_code & (PFEC_insn_fetch|PFEC_user_mode)) ==
> > PFEC_insn_fetch) )
> > +            return smep_fault;
> > +
> > +        /*
> > +         * Supervisor Mode Access Prevention (SMAP):
> > +         * Disallow supervisor access user-accessible mappings
> > +         * A fault is considered as an SMAP violation if the following
> > +         * conditions are true:
> > +         *   - X86_CR4_SMAP is set in CR4
> > +         *   - A user page is being accessed
> > +         *   - CPL=3 or X86_EFLAGS_AC is clear
> > +         *   - Page fault in kernel mode
> > +         */
> > +        if ( (cr4 & X86_CR4_SMAP) && !(error_code & PFEC_user_mode)
> > &&
> > +             (((regs->cs & 3) == 3) || !(regs->eflags & X86_EFLAGS_AC)) )
> > +            return smap_fault;
> > +    }
> >
> >      return spurious_fault;
> >  }
> >
> >  static enum pf_type spurious_page_fault(
> > -    unsigned long addr, unsigned int error_code)
> > +    unsigned long addr, const struct cpu_user_regs *regs)
> >  {
> >      unsigned long flags;
> >      enum pf_type pf_type;
> > @@ -1285,7 +1305,7 @@ static enum pf_type spurious_page_fault(
> >       * page tables from becoming invalid under our feet during the walk.
> >       */
> >      local_irq_save(flags);
> > -    pf_type = __page_fault_type(addr, error_code);
> > +    pf_type = __page_fault_type(addr, regs);
> >      local_irq_restore(flags);
> >
> >      return pf_type;
> > @@ -1380,8 +1400,14 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(!guest_mode(regs)) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        BUG_ON(pf_type == smep_fault);
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault) )
> > +        {
> > +            console_start_sync();
> > +            printk("Xen SM%cP violation\n", (pf_type == smep_fault) ?
> 'E' :
> > 'A');
> > +            fatal_trap(TRAP_page_fault, regs);
> > +        }
> > +
> >          if ( pf_type != real_fault )
> >              return;
> >
> > @@ -1407,10 +1433,12 @@ void do_page_fault(struct cpu_user_regs *regs)
> >
> >      if ( unlikely(current->domain->arch.suppress_spurious_page_faults) )
> >      {
> > -        pf_type = spurious_page_fault(addr, error_code);
> > -        if ( pf_type == smep_fault )
> > +        pf_type = spurious_page_fault(addr, regs);
> > +        if ( (pf_type == smep_fault) || (pf_type == smap_fault))
> >          {
> > -            gdprintk(XENLOG_ERR, "Fatal SMEP fault\n");
> > +            printk(XENLOG_G_ERR "%pv fatal SM%cP violation\n",
> > +                   current, (pf_type == smep_fault) ? 'E' : 'A');
> > +
> >              domain_crash(current->domain);
> >          }
> >          if ( pf_type != real_fault )
> > diff --git a/xen/include/asm-x86/cpufeature.h
> > b/xen/include/asm-x86/cpufeature.h
> > index 20881c0..8014241 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -190,6 +190,7 @@
> >  #define cpu_has_fsgsbase	boot_cpu_has(X86_FEATURE_FSGSBASE)
> >
> >  #define cpu_has_smep            boot_cpu_has(X86_FEATURE_SMEP)
> > +#define cpu_has_smap            boot_cpu_has(X86_FEATURE_SMAP)
> >  #define cpu_has_fpu_sel
> > (!boot_cpu_has(X86_FEATURE_NO_FPU_SEL))
> >
> >  #define cpu_has_ffxsr           ((boot_cpu_data.x86_vendor ==
> > X86_VENDOR_AMD) \
> > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> > index c5c266f..abf55fb 100644
> > --- a/xen/include/asm-x86/domain.h
> > +++ b/xen/include/asm-x86/domain.h
> > @@ -467,12 +467,14 @@ unsigned long pv_guest_cr4_fixup(const struct
> vcpu
> > *, unsigned long guest_cr4);
> >      (((v)->arch.pv_vcpu.ctrlreg[4]                          \
> >        | (mmu_cr4_features                                   \
> >           & (X86_CR4_PGE | X86_CR4_PSE | X86_CR4_SMEP |      \
> > -            X86_CR4_OSXSAVE | X86_CR4_FSGSBASE))            \
> > +            X86_CR4_SMAP | X86_CR4_OSXSAVE |                \
> > +            X86_CR4_FSGSBASE))                              \
> >        | ((v)->domain->arch.vtsc ? X86_CR4_TSD : 0))         \
> >       & ~X86_CR4_DE)
> >  #define real_cr4_to_pv_guest_cr4(c)                         \
> >      ((c) & ~(X86_CR4_PGE | X86_CR4_PSE | X86_CR4_TSD |      \
> > -             X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE))
> > +             X86_CR4_OSXSAVE | X86_CR4_SMEP |               \
> > +             X86_CR4_FSGSBASE | X86_CR4_SMAP))
> >
> >  void domain_cpuid(struct domain *d,
> >                    unsigned int  input,
> > --
> > 1.8.3.1

  reply	other threads:[~2014-05-08  6:25 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-05-07  8:19 [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Feng Wu
2014-05-07  8:19 ` [PATCH v6 01/10] x86: define macros CPUINFO_features and CPUINFO_FEATURE_OFFSET Feng Wu
2014-05-07  8:19 ` [PATCH v6 02/10] x86: move common_interrupt to entry.S Feng Wu
2014-05-07  8:19 ` [PATCH v6 03/10] x86: merge stuff from asm-x86/x86_64/asm_defns.h to asm-x86/asm_defns.h Feng Wu
2014-05-07  8:19 ` [PATCH v6 04/10] x86: Add support for STAC/CLAC instructions Feng Wu
2014-05-07  9:36   ` Andrew Cooper
2014-05-07  8:19 ` [PATCH v6 05/10] Clear AC bit in RFLAGS to protect Xen itself by SMAP Feng Wu
2014-05-07  9:44   ` Andrew Cooper
2014-05-07 11:40     ` Jan Beulich
2014-05-07 11:53       ` Andrew Cooper
2014-05-08  1:41         ` Wu, Feng
2014-05-08  1:57           ` Andrew Cooper
2014-05-08  2:02             ` Wu, Feng
2014-05-08  6:40               ` Jan Beulich
2014-05-08  6:49                 ` Wu, Feng
2014-05-08  6:54                   ` Jan Beulich
2014-05-08  6:58                     ` Wu, Feng
2014-05-08  7:08                       ` Jan Beulich
2014-05-08  7:13                         ` Wu, Feng
2014-05-08  9:48               ` Andrew Cooper
2014-05-07  8:19 ` [PATCH v6 06/10] x86: Temporary disable SMAP to legally access user pages in kernel mode Feng Wu
2014-05-07  9:49   ` Andrew Cooper
2014-05-08  1:14   ` Tian, Kevin
2014-05-07  8:19 ` [PATCH v6 07/10] VMX: Disable SMAP feature when guest is in non-paging mode Feng Wu
2014-05-08  1:16   ` Tian, Kevin
2014-05-07  8:19 ` [PATCH v6 08/10] x86: Enable Supervisor Mode Access Prevention (SMAP) for Xen Feng Wu
2014-05-07 10:26   ` Andrew Cooper
2014-05-07 11:44     ` Jan Beulich
2014-05-07 11:47       ` Andrew Cooper
2014-05-08  2:32     ` Wu, Feng
2014-05-08  1:20   ` Tian, Kevin
2014-05-08  6:25     ` Wu, Feng [this message]
2014-05-08  7:06       ` Jan Beulich
2014-05-07  8:19 ` [PATCH v6 09/10] x86/hvm: Add SMAP support to HVM guest Feng Wu
2014-05-07 10:46   ` Andrew Cooper
2014-05-07 11:47     ` Jan Beulich
2014-05-08  1:22   ` Tian, Kevin
2014-05-07  8:19 ` [PATCH v6 10/10] x86/tools: Expose SMAP to HVM guests Feng Wu
2014-05-07  8:35 ` [PATCH v6 00/10] x86: Enable Supervisor Mode Access Prevention (SMAP) Jan Beulich
2014-05-07  9:00   ` Wu, Feng
2014-05-07  9:33     ` Jan Beulich
2014-05-07  8:57 ` Ian Campbell
2014-05-07  8:59   ` Wu, Feng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E959C4978C3B6342920538CF579893F001F71CD1@SHSMSX104.ccr.corp.intel.com \
    --to=feng.wu@intel.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=eddie.dong@intel.com \
    --cc=ian.campbell@citrix.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.