All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>, Wei Liu <wl@xen.org>,
	Ian Jackson <iwj@xenproject.org>
Subject: Re: [PATCH v3 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads
Date: Fri, 12 Mar 2021 10:08:52 +0100	[thread overview]
Message-ID: <YEsvpK8WJQNqSQGe@Air-de-Roger> (raw)
In-Reply-To: <90f87aa8-09da-1453-bd82-c722465c2881@suse.com>

On Fri, Mar 12, 2021 at 08:54:46AM +0100, Jan Beulich wrote:
> Prior to 4.15 Linux, when running in PV mode, did not install a #GP
> handler early enough to cover for example the rdmsrl_safe() of
> MSR_K8_TSEG_ADDR in bsp_init_amd() (not to speak of the unguarded read
> of MSR_K7_HWCR later in the same function). The respective change
> (42b3a4cb5609 "x86/xen: Support early interrupts in xen pv guests") was
> backported to 4.14, but no further - presumably since it wasn't really
> easy because of other dependencies.
> 
> Therefore, to prevent our change in the handling of guest MSR accesses
> to render PV Linux 4.13 and older unusable on at least AMD systems, make
> the raising of #GP on this paths conditional upon the guest having
> installed a handler, provided of course the MSR can be read in the first
> place (we would have raised #GP in that case even before). Producing
> zero for reads isn't necessarily correct and may trip code trying to
> detect presence of MSRs early, but since such detection logic won't work
> without a #GP handler anyway, this ought to be a fair workaround.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Release-Acked-by: Ian Jackson <iwj@xenproject.org>

I think the approach is fine:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Some comments below.

> ---
> (projected v4: re-base over Roger's change)
> v3: Use temporary variable for probing. Document the behavior (in a
>     public header, for the lack of a better place).
> v2: Probe MSR read. Exclude hypervisor range. Avoid issuing two log
>     messages (in debug builds). Don't alter WRMSR behavior.
> ---
> While I didn't myself observe or find similar WRMSR side issues, I'm
> nevertheless not convinced we can get away without also making the WRMSR
> path somewhat more permissive again, e.g. tolerating attempts to set
> bits which are already set. But of course this would require keeping in
> sync for which MSRs we "fake" reads, as then a kernel attempt to set a
> bit may also appear as an attempt to clear others (because of the zero
> value that we gave it for the read). Roger validly points out that
> making behavior dependent upon MSR values has its own downsides, so
> simply depending on MSR readability is another option (with, in turn,
> its own undesirable effects, e.g. for write-only MSRs).
> 
> --- a/xen/arch/x86/pv/emul-priv-op.c
> +++ b/xen/arch/x86/pv/emul-priv-op.c
> @@ -874,7 +874,8 @@ static int read_msr(unsigned int reg, ui
>      struct vcpu *curr = current;
>      const struct domain *currd = curr->domain;
>      const struct cpuid_policy *cp = currd->arch.cpuid;
> -    bool vpmu_msr = false;
> +    bool vpmu_msr = false, warn = false;
> +    uint64_t tmp;
>      int ret;
>  
>      if ( (ret = guest_rdmsr(curr, reg, val)) != X86EMUL_UNHANDLEABLE )
> @@ -882,7 +883,7 @@ static int read_msr(unsigned int reg, ui
>          if ( ret == X86EMUL_EXCEPTION )
>              x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt);

You might want to move the injection of the exception to the done
label?

So that we can avoid the call to x86_emul_reset_event.

>  
> -        return ret;
> +        goto done;
>      }
>  
>      switch ( reg )
> @@ -986,7 +987,7 @@ static int read_msr(unsigned int reg, ui
>          }
>          /* fall through */
>      default:
> -        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg);
> +        warn = true;
>          break;
>  
>      normal:
> @@ -995,7 +996,19 @@ static int read_msr(unsigned int reg, ui
>          return X86EMUL_OKAY;
>      }
>  
> -    return X86EMUL_UNHANDLEABLE;
> + done:
> +    if ( ret != X86EMUL_OKAY && !curr->arch.pv.trap_ctxt[X86_EXC_GP].address &&
> +         (reg >> 16) != 0x4000 && !rdmsr_safe(reg, tmp) )
> +    {
> +        gprintk(XENLOG_WARNING, "faking RDMSR 0x%08x\n", reg);
> +        *val = 0;
> +        x86_emul_reset_event(ctxt);
> +        ret = X86EMUL_OKAY;
> +    }
> +    else if ( warn )
> +        gdprintk(XENLOG_WARNING, "RDMSR 0x%08x unimplemented\n", reg);

I think you could add:

if ( rc == X86EMUL_EXCEPTION )
    x86_emul_hw_exception(TRAP_gp_fault, 0, ctxt);

> +
> +    return ret;
>  }
>  
>  static int write_msr(unsigned int reg, uint64_t val,
> --- a/xen/include/public/arch-x86/xen.h
> +++ b/xen/include/public/arch-x86/xen.h
> @@ -143,6 +143,12 @@ typedef unsigned long xen_ulong_t;
>   *  Level == 1: Kernel may enter
>   *  Level == 2: Kernel may enter
>   *  Level == 3: Everyone may enter
> + *
> + * Note: For compatibility with kernels not setting up exception handlers
> + *       early enough, Xen will avoid trying to inject #GP (and hence crash
> + *       the domain) when an RDMSR would require this, but no handler was
> + *       set yet. The precise conditions are implementation specific, and

You can drop the 'yet' here I think? As even if a handler has been set
and then removed we would still prevent injecting a #GP AFAICT. Not a
native speaker anyway, so I might be wrong on that one.

> + *       new code shouldn't rely on such behavior anyway.

I would use a stronger mustn't here instead of shouldn't.

Thanks, Roger.


  reply	other threads:[~2021-03-12  9:09 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-12  7:53 [PATCH v3 0/2][4.15] x86: guest MSR access handling tweaks Jan Beulich
2021-03-12  7:54 ` [PATCH v3 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads Jan Beulich
2021-03-12  9:08   ` Roger Pau Monné [this message]
2021-03-12  9:32     ` Jan Beulich
2021-03-12 11:19   ` Andrew Cooper
2021-03-12 11:01     ` [PATCH v4 0/2][4.15] x86: guest MSR access handling tweaks Jan Beulich
2021-03-12 11:02       ` [PATCH v4 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads Jan Beulich
2021-03-12 11:03       ` [PATCH v4 2/2][4.15] x86/AMD: expose HWCR.TscFreqSel to guests Jan Beulich
2021-03-12 13:59       ` [PATCH v4 0/2][4.15] x86: guest MSR access handling tweaks [and 1 more messages] Ian Jackson
2021-03-12 13:34     ` [PATCH v3 1/2][4.15] x86/PV: conditionally avoid raising #GP for early guest MSR reads Jan Beulich
2021-03-12  7:55 ` [PATCH v3 2/2][4.15] x86/AMD: expose HWCR.TscFreqSel to guests Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YEsvpK8WJQNqSQGe@Air-de-Roger \
    --to=roger.pau@citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=iwj@xenproject.org \
    --cc=jbeulich@suse.com \
    --cc=wl@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.