From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Kevin Tian <kevin.tian@intel.com>,
	Xen-devel <xen-devel@lists.xen.org>,
	Wei Liu <wei.liu2@citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [PATCH 1/6] x86/vmx: Fix handing of MSR_DEBUGCTL on VMExit
Date: Tue, 29 May 2018 19:08:27 +0100
Message-ID: <d0cc97da-609e-6ef9-2479-d72eb70e6e5d@citrix.com>
In-Reply-To: <5B0D2C6502000078001C685A@prv1-mh.provo.novell.com>

On 29/05/18 11:33, Jan Beulich wrote:
>>>> On 28.05.18 at 16:27, <andrew.cooper3@citrix.com> wrote:
>> Currently, whenever the guest writes a nonzero value to MSR_DEBUGCTL, Xen
>> updates a host MSR load list entry with the current hardware value of
>> MSR_DEBUGCTL.  This is wrong.
> "This is wrong" goes too far for my taste: It is not very efficient to do it that
> way, but it's still correct. Unless, of course, the zeroing of the register
> happens after the processing of the MSR load list (which I doubt it does).

It is functionally broken.  Restoration of Xen's debugging setting must
happen from the first vmexit, not the first vmexit after the guest plays
with MSR_DEBUGCTL.

With the current behaviour, Xen loses its MSR_DEBUGCTL setting on any
pcpu where an HVM guest has been scheduled, and then feeds the current
value (0) into the host load list, even when it was attempting to set a
non-zero value.
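
To illustrate the sequence described above, here is a toy model in plain C,
not Xen code.  It assumes only what this thread states: vmexit zeroes
MSR_DEBUGCTL, and the host MSR load list is processed afterwards.  The
struct, the function names and the DEBUGCTL_LBR constant are hypothetical
stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEBUGCTL_LBR 1u /* stand-in for IA32_DEBUGCTLMSR_LBR */

/* Toy per-pcpu state: the "hardware" register plus one load list slot. */
struct pcpu {
    uint64_t hw_debugctl;      /* modelled MSR_DEBUGCTL value */
    bool     load_entry_valid; /* does a host load list entry exist? */
    uint64_t load_entry_val;   /* value restored on vmexit */
};

/* Modelled vmexit: the register is zeroed, then the load list is applied. */
static void model_vmexit(struct pcpu *c)
{
    c->hw_debugctl = 0;
    if (c->load_entry_valid)
        c->hw_debugctl = c->load_entry_val;
}

/* Old behaviour: the entry is created lazily from the current hw value. */
static void model_guest_write_old(struct pcpu *c)
{
    c->load_entry_val = c->hw_debugctl;
    c->load_entry_valid = true;
}

/* Old flow: Xen wants LBR, but no entry exists at the first vmexit. */
static uint64_t run_old_flow(void)
{
    struct pcpu c = { .hw_debugctl = DEBUGCTL_LBR };

    model_vmexit(&c);          /* Xen's setting is lost here */
    model_guest_write_old(&c); /* the entry snapshots the lost value (0) */
    model_vmexit(&c);          /* and 0 is what gets "restored" */
    return c.hw_debugctl;
}

/* Intended flow: the entry holds Xen's value before the first vmexit. */
static uint64_t run_fixed_flow(void)
{
    struct pcpu c = {
        .hw_debugctl      = DEBUGCTL_LBR,
        .load_entry_valid = true,
        .load_entry_val   = DEBUGCTL_LBR,
    };

    model_vmexit(&c);
    return c.hw_debugctl;
}
```

In this model run_old_flow() ends with the register at 0, while
run_fixed_flow() keeps the LBR bit set across the vmexit.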

>
>> Initially, I tried to have a common xen_msr_debugctl variable, but
>> rip-relative addresses don't resolve correctly in alternative blocks.
>> LBR-only has been fine for ages, and I don't see that changing any time 
>> soon.
> The chosen solution is certainly fine, but the issue could have been
> avoided by doing the load from memory ahead of the alternative block
> (accepting that it also happens when the value isn't actually needed).
>
> Another option would be to invert the sense of the feature flag,
> patching NOPs over the register setup plus WRMSR.

I considered both, but until it is necessary, there is little point.

>
>> @@ -1764,17 +1765,6 @@ void do_device_not_available(struct cpu_user_regs *regs)
>>      return;
>>  }
>>  
>> -static void ler_enable(void)
>> -{
>> -    u64 debugctl;
>> -
>> -    if ( !this_cpu(ler_msr) )
>> -        return;
>> -
>> -    rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
>> -    wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl | IA32_DEBUGCTLMSR_LBR);
>> -}
>> -
>>  void do_debug(struct cpu_user_regs *regs)
>>  {
>>      unsigned long dr6;
>> @@ -1870,13 +1860,13 @@ void do_debug(struct cpu_user_regs *regs)
>>      v->arch.debugreg[6] |= (dr6 & ~X86_DR6_DEFAULT);
>>      v->arch.debugreg[6] &= (dr6 | ~X86_DR6_DEFAULT);
>>  
>> -    ler_enable();
>>      pv_inject_hw_exception(TRAP_debug, X86_EVENT_NO_EC);
>> -    return;
>>  
>>   out:
>> -    ler_enable();
>> -    return;
>> +
>> +    /* #DB automatically disabled LBR.  Reinstate it if debugging Xen. */
>> +    if ( cpu_has_xen_lbr )
>> +        wrmsrl(MSR_IA32_DEBUGCTLMSR, IA32_DEBUGCTLMSR_LBR);
> While I can see that we don't currently need anything more than this one
> bit, it still doesn't feel overly well to not do a read-modify-write cycle here.

We should never be using a RMW cycle.  All that risks doing is
accumulating unexpected debugging controls.

If/when it becomes a variable, the correct code here is:

if ( xen_debugctl_val & IA32_DEBUGCTLMSR_LBR )
    wrmsrl(MSR_IA32_DEBUGCTLMSR, xen_debugctl_val);

(except that since writing this patch, I've found that BTF is also
cleared on AMD hardware, so that probably wants to be taken into account).
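
A small sketch of the difference, with an ordinary variable standing in
for the MSR.  fake_rdmsr(), fake_wrmsr() and both reinstate_*() helpers are
hypothetical names for illustration only; the bit positions are the
architectural LBR (bit 0) and BTF (bit 1) bits of DEBUGCTL.

```c
#include <stdint.h>

#define DEBUGCTL_LBR (1u << 0)
#define DEBUGCTL_BTF (1u << 1)

/* Stand-in for the hardware register; real code would use rdmsrl/wrmsrl. */
static uint64_t fake_debugctl;

static uint64_t fake_rdmsr(void)
{
    return fake_debugctl;
}

static void fake_wrmsr(uint64_t val)
{
    fake_debugctl = val;
}

/* Read-modify-write: whatever else is set in the register survives. */
static void reinstate_rmw(void)
{
    fake_wrmsr(fake_rdmsr() | DEBUGCTL_LBR);
}

/* Direct write: the register ends up holding exactly the intended value. */
static void reinstate_direct(uint64_t xen_debugctl_val)
{
    if (xen_debugctl_val & DEBUGCTL_LBR)
        fake_wrmsr(xen_debugctl_val);
}
```

Starting from a register that has only BTF set, reinstate_rmw() leaves
LBR|BTF behind, whereas reinstate_direct(DEBUGCTL_LBR) leaves LBR alone.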

> In any event, rather than moving the write further towards the end of
> the function, could I ask you to move it further up, so that in the (unlikely)
> event of do_debug() itself triggering an exception we'd get a proper
> indication of the last branch before that?

Ok.  It can move to immediately after resetting %dr6.

>
>> @@ -1920,38 +1910,46 @@ void load_TR(void)
>>          : "=m" (old_gdt) : "rm" (TSS_ENTRY << 3), "m" (tss_gdt) : "memory" );
>>  }
>>  
>> -void percpu_traps_init(void)
>> +static uint32_t calc_ler_msr(void)
> Here and elsewhere "unsigned int" would be more appropriate to use.
> We don't require MSR indexes to be exactly 32 bits wide, but only at
> least as wide.

MSR indices are architecturally 32 bits wide.

>
>> +void percpu_traps_init(void)
>> +{
>> +    subarch_percpu_traps_init();
>> +
>> +    if ( !opt_ler )
>> +        return;
>> +
>> +    if ( !ler_msr && (ler_msr = calc_ler_msr()) )
>> +        setup_force_cpu_cap(X86_FEATURE_XEN_LBR);
> This does not hold up with the promise the description makes: If running
> on an unrecognized model, calc_ler_msr() is going to be called more than
> once. If it really was called just once, it could also become __init. With
> the inverted sense of the feature flag (as suggested above) you could
> check whether the flag bit is set or ler_msr is non-zero.

Hmm - I suppose it doesn't quite match the description, but does it
matter (if I tweak the description)?  It is debugging functionality, and
I don't see any 64-bit models missing from the list.
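
For reference, the repeated call is easy to see in a standalone model of
the quoted condition.  This is a sketch rather than the real function: the
opt_ler check and subarch_percpu_traps_init() are omitted, and
model_recognised, calc_calls, cap_forced and boot_cpus() are hypothetical
names added purely for counting.

```c
#include <stdint.h>

static uint32_t ler_msr;          /* 0 means "not yet determined" */
static unsigned int calc_calls;   /* how often calc_ler_msr() ran */
static int model_recognised;      /* hypothetical: is the CPU model known? */
static int cap_forced;            /* stands in for setup_force_cpu_cap() */

/* Returns a non-zero MSR index for known models, 0 otherwise. */
static uint32_t calc_ler_msr(void)
{
    calc_calls++;
    return model_recognised ? 0x1dd /* arbitrary non-zero placeholder */ : 0;
}

/* Same condition as in the patch, minus the opt_ler and subarch parts. */
static void percpu_traps_init_model(void)
{
    if (!ler_msr && (ler_msr = calc_ler_msr()))
        cap_forced = 1;
}

/* Bring up n CPUs and report how many times the calculation was made. */
static unsigned int boot_cpus(unsigned int n, int recognised)
{
    ler_msr = 0;
    calc_calls = 0;
    cap_forced = 0;
    model_recognised = recognised;

    for (unsigned int i = 0; i < n; i++)
        percpu_traps_init_model();

    return calc_calls;
}
```

With a recognised model the calculation runs once regardless of CPU count.
With an unrecognised one, ler_msr stays 0 and it runs once per CPU.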

~Andrew
