All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: Remaining MSR wrinkles
       [not found] ` <24649.6523.991714.489131@mariner.uk.xensource.com>
@ 2021-03-11  7:55   ` Jan Beulich
  2021-03-11 16:54     ` Roger Pau Monné
  2021-03-11 18:22     ` Boris Ostrovsky
  0 siblings, 2 replies; 3+ messages in thread
From: Jan Beulich @ 2021-03-11  7:55 UTC (permalink / raw)
  To: Ian Jackson, Roger Pau Monné, Boris Ostrovsky
  Cc: committers, Andrew Cooper, xen-devel

On 10.03.2021 20:09, Ian Jackson wrote:
> (I bounced Roger's original mail to xen-devel.  I hope it made it...)
> 
> Roger Pau Monné writes ("Remaining MSR wrinkles"):
>> 1. MSR behavior for PV guests without a #GP handler set: PV Linux versions <
>>    4.14 will use rdmsr_safe (and likely wrmsr_safe?) without having a #GP
>>    handler setup, which results in a crash. This bug was hidden in previous
>>    Xen releases by allowing unlimited read access to the MSR space.

I've not observed problematic wrmsr_safe() so far.

>>    Jan has posted several proposals to address this:
>>
>>    https://lore.kernel.org/xen-devel/7e69db81-cee7-3c7b-be64-4f5ff50fbe9c@suse.com/
>>    https://lore.kernel.org/xen-devel/d794bbee-a5e5-6632-3d1f-acd8164b7171@suse.com/
>>
>>    Which all rely on the fact that for PV guests Xen knows whether there's a
>>    #GP handler setup and can hence prevent injection of a #GP fault if no
>>    handler is present.
>>
>>    Andrew opinion is that we should instead try to figure out which MSRs the
>>    buggy Linux versions try to access and special case them. Andrew also raised
>>    the point that continue running with a 'fake' (ie: 0) MSR value might be
>>    worse than crashing.
>>
>>    Part of the discussion has also happened here:
>>
>>    https://lore.kernel.org/xen-devel/4da62f0b-8a08-dd84-2040-fd55d74fd85a@citrix.com/
>>
>>    Look for the last quote.
>>
>>    Another option is to document that PV Linux < 4.14 will require msr_relaxed=1
>>    in order to run. That as Jan pointed out will also imply PV Linux to run with
>>    a faked (0) MSR value instead of crashing.
> 
>> For 1. I do agree with Jan than this workaround is likely the best option we
>> have, sort of resorting to request enabling msr_relaxed for all Linux PV guests
>> < 4.14. Whether we want to limit this workaround to the read side only I'm not
>> fully convinced. There's something nice about having symmetry in the read and
>> write paths, but if all the calls we have identified are rdmsr only I prefer to
>> leave the write path unaltered and request users to use msr_relaxed if write
>> issues are found later.

Especially if Andrew's ambiguous objection was against the write side
only, I think I'd prefer to go with this latter variant. Considering
that dealing with the read side alone is sufficient to address the
observed issue, I'm even inclined to prefer this irrespective of that
constraint.

> Thanks.  I find your explanation and arguments convincing.  I have
> read what Andy says in that link and I find that less convincing.  In
> particular "I don't think we should legitimise buggy PV behaviour" is
> not entirely consistent with our previous approach to
> bug-compatibility and support for old guests.
> 
> Accordingly, (with committer tie-breaker hat on) I would prefer to
> apply the patches from Jan.  I don't have an opinion about the read vs
> write question, and will probably be happy with whatever you and Jan
> can agree on.
> 
> I don't think I Release-Acked those patches yet so, for those two,
> 
> Release-Acked-by: Ian Jackson <iwj@xenproject.org>

You didn't, indeed, but "those two" is slightly confusing, the two
links Roger did provide are just different versions of the same
patch. Hence I'd like to double check that it is exactly this one
patch of mine (of which I need to send another version, at least
to include Roger's requested documentation of the behavior, and
possibly also the write side equivalent - still waiting for Andrew
to come back and clarify the scope of his objection).

>> 2. RAPL_POWER_LIMIT: handle the MSR explicitly to make Solaris happy.
>>    Alternatively we can list in the release notes that Solaris guests require
>>    msr_relaxed=1. Andrew is working on a patch for this.
> 
> I would prefer to handle the MSR explicitly, for the same
> compatibility reason as above.

The question is here - we aren't sure yet that this is the only
one, are we? Andrew suspects if this one MSR gets accessed this
way, then likely others will be, too. Boris, can you tell for sure
either way?

Also, Boris - any chance you could give your Tested-by for Roger's
patch? It's otherwise ready to go in, but I'd prefer to commit it
knowing that you've tested this hopefully final version. I'm sorry
for the recurring requests to test this workaround.

>> 3. MSR_K8_HWCR: Linux will complain about a missing bit (TscFreqSel). Jan
>>    posted a patch to handle the MSR and unconditionally set the bit:
>>
>>    https://lore.kernel.org/xen-devel/c91b190a-aaa1-d3b8-10eb-d8da7ad1f834@suse.com/
>>
>>    There are concerns from Andrew about missing bits in ACPI tables and Pstate
>>    MSRs if this bit is reported as set.
> 
>> For 3. I think I at least need more details about the relation with TscFreqSel
>> and ACPI or other MSRs, and I haven't been able to find it on the PPRs I
>> looked.
> 
> I don't understsnd the implications here.  Jan says that Linux has
> been warning about this, but is that the only symptom of the current
> state of affairs ?

Yes. It claiming a firmware bug when running under Xen, but not when
running natively is likely to cause support calls.

> Jan writes
> 
>  The main risk with making the read not fault here is that guests
>  might imply [sic, "infer" intended] they can also write this MSR then.
> 
> How would that matter, in practice ?  The guest would try to write
> the KSR and then ... ?

... would crash. I should say though that with increasing awareness of
potentially running virtualized, kernels have become more careful in
this regard. (I was about to say "with implications like this one, but
since you say "imply" was wrong, "implications" would likely be as well.
What would the correct noun here be then? My dictionary has "inference",
but this feels a little odd, too much like misspelled "interference".)

Jan


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Remaining MSR wrinkles
  2021-03-11  7:55   ` Remaining MSR wrinkles Jan Beulich
@ 2021-03-11 16:54     ` Roger Pau Monné
  2021-03-11 18:22     ` Boris Ostrovsky
  1 sibling, 0 replies; 3+ messages in thread
From: Roger Pau Monné @ 2021-03-11 16:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Ian Jackson, Boris Ostrovsky, committers, Andrew Cooper, xen-devel

On Thu, Mar 11, 2021 at 08:55:29AM +0100, Jan Beulich wrote:
> On 10.03.2021 20:09, Ian Jackson wrote:
> > (I bounced Roger's original mail to xen-devel.  I hope it made it...)
> > 
> > Roger Pau Monné writes ("Remaining MSR wrinkles"):
> >> 1. MSR behavior for PV guests without a #GP handler set: PV Linux versions <
> >>    4.14 will use rdmsr_safe (and likely wrmsr_safe?) without having a #GP
> >>    handler setup, which results in a crash. This bug was hidden in previous
> >>    Xen releases by allowing unlimited read access to the MSR space.
> 
> I've not observed problematic wrmsr_safe() so far.
> 
> >>    Jan has posted several proposals to address this:
> >>
> >>    https://lore.kernel.org/xen-devel/7e69db81-cee7-3c7b-be64-4f5ff50fbe9c@suse.com/
> >>    https://lore.kernel.org/xen-devel/d794bbee-a5e5-6632-3d1f-acd8164b7171@suse.com/
> >>
> >>    Which all rely on the fact that for PV guests Xen knows whether there's a
> >>    #GP handler setup and can hence prevent injection of a #GP fault if no
> >>    handler is present.
> >>
> >>    Andrew opinion is that we should instead try to figure out which MSRs the
> >>    buggy Linux versions try to access and special case them. Andrew also raised
> >>    the point that continue running with a 'fake' (ie: 0) MSR value might be
> >>    worse than crashing.
> >>
> >>    Part of the discussion has also happened here:
> >>
> >>    https://lore.kernel.org/xen-devel/4da62f0b-8a08-dd84-2040-fd55d74fd85a@citrix.com/
> >>
> >>    Look for the last quote.
> >>
> >>    Another option is to document that PV Linux < 4.14 will require msr_relaxed=1
> >>    in order to run. That as Jan pointed out will also imply PV Linux to run with
> >>    a faked (0) MSR value instead of crashing.
> > 
> >> For 1. I do agree with Jan than this workaround is likely the best option we
> >> have, sort of resorting to request enabling msr_relaxed for all Linux PV guests
> >> < 4.14. Whether we want to limit this workaround to the read side only I'm not
> >> fully convinced. There's something nice about having symmetry in the read and
> >> write paths, but if all the calls we have identified are rdmsr only I prefer to
> >> leave the write path unaltered and request users to use msr_relaxed if write
> >> issues are found later.
> 
> Especially if Andrew's ambiguous objection was against the write side
> only, I think I'd prefer to go with this latter variant. Considering
> that dealing with the read side alone is sufficient to address the
> observed issue, I'm even inclined to prefer this irrespective of that
> constraint.
> 
> > Thanks.  I find your explanation and arguments convincing.  I have
> > read what Andy says in that link and I find that less convincing.  In
> > particular "I don't think we should legitimise buggy PV behaviour" is
> > not entirely consistent with our previous approach to
> > bug-compatibility and support for old guests.
> > 
> > Accordingly, (with committer tie-breaker hat on) I would prefer to
> > apply the patches from Jan.  I don't have an opinion about the read vs
> > write question, and will probably be happy with whatever you and Jan
> > can agree on.
> > 
> > I don't think I Release-Acked those patches yet so, for those two,
> > 
> > Release-Acked-by: Ian Jackson <iwj@xenproject.org>
> 
> You didn't, indeed, but "those two" is slightly confusing, the two
> links Roger did provide are just different versions of the same
> patch. Hence I'd like to double check that it is exactly this one
> patch of mine (of which I need to send another version, at least
> to include Roger's requested documentation of the behavior, and
> possibly also the write side equivalent - still waiting for Andrew
> to come back and clarify the scope of his objection).

I would leave the write side out. Now we have the fallback msr_relaxed
option which should be enough to cover for any write side issue that
might arise later. Also you have not identified any problematic
wrmsr_safe so far.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Remaining MSR wrinkles
  2021-03-11  7:55   ` Remaining MSR wrinkles Jan Beulich
  2021-03-11 16:54     ` Roger Pau Monné
@ 2021-03-11 18:22     ` Boris Ostrovsky
  1 sibling, 0 replies; 3+ messages in thread
From: Boris Ostrovsky @ 2021-03-11 18:22 UTC (permalink / raw)
  To: Jan Beulich, Ian Jackson, Roger Pau Monné
  Cc: committers, Andrew Cooper, xen-devel


On 3/11/21 2:55 AM, Jan Beulich wrote:
> On 10.03.2021 20:09, Ian Jackson wrote:
>> (I bounced Roger's original mail to xen-devel.  I hope it made it...)
>>
>> Roger Pau Monné writes ("Remaining MSR wrinkles"):
>>
>>> 2. RAPL_POWER_LIMIT: handle the MSR explicitly to make Solaris happy.
>>>    Alternatively we can list in the release notes that Solaris guests require
>>>    msr_relaxed=1. Andrew is working on a patch for this.


MSR_RAPL_POWER_UNIT (in Linux parlance), i.e. 0x606.


>> I would prefer to handle the MSR explicitly, for the same
>> compatibility reason as above.
> The question is here - we aren't sure yet that this is the only
> one, are we? Andrew suspects if this one MSR gets accessed this
> way, then likely others will be, too. Boris, can you tell for sure
> either way?


The only one that Solaris reads on boot in 0x606. However, a few more may be accessed unguarded if kstat runs, as I mentioned in https://lore.kernel.org/xen-devel/4fc3532b-f53f-2a15-ce64-f857816b0566@oracle.com/ (I have not been able to empirically verify that)


>
> Also, Boris - any chance you could give your Tested-by for Roger's
> patch? It's otherwise ready to go in, but I'd prefer to commit it
> knowing that you've tested this hopefully final version. I'm sorry
> for the recurring requests to test this workaround.

For v5:


Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-03-11 18:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <YEj39oqZA0y/af0c@Air-de-Roger>
     [not found] ` <24649.6523.991714.489131@mariner.uk.xensource.com>
2021-03-11  7:55   ` Remaining MSR wrinkles Jan Beulich
2021-03-11 16:54     ` Roger Pau Monné
2021-03-11 18:22     ` Boris Ostrovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.