All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>,
	Paul Durrant <paul.durrant@citrix.com>, Tim Deegan <tim@xen.org>,
	Xen-devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments
Date: Mon, 3 Apr 2017 18:37:11 +0100	[thread overview]
Message-ID: <c3bd56cd-774f-574d-bbf0-3418a7508f9f@citrix.com> (raw)
In-Reply-To: <58E28F9A020000780014C34A@prv-mh.provo.novell.com>

On 03/04/17 17:08, Jan Beulich wrote:
>>>> On 03.04.17 at 17:42, <andrew.cooper3@citrix.com> wrote:
>> On 03/04/17 16:07, Jan Beulich wrote:
>>>>>> On 03.04.17 at 16:27, <andrew.cooper3@citrix.com> wrote:
>>>> On 03/04/17 10:13, Jan Beulich wrote:
>>>>>>>> On 31.03.17 at 21:50, <andrew.cooper3@citrix.com> wrote:
>>>>>> --- a/xen/arch/x86/hvm/hvm.c
>>>>>> +++ b/xen/arch/x86/hvm/hvm.c
>>>>>> @@ -2374,13 +2374,27 @@ int hvm_set_cr4(unsigned long value, bool_t 
>>>> may_defer)
>>>>>>      return X86EMUL_OKAY;
>>>>>>  }
>>>>>>  
>>>>>> +enum hvm_segmentation_mode hvm_seg_mode(
>>>>>> +    const struct vcpu *v, enum x86_segment seg,
>>>>>> +    const struct segment_register *cs)
>>>>> The inputs here are at least somewhat counterintuitive (for example,
>>>>> from an abstract pov it is unexpected that the result depends on seg
>>>>> and cs). At the very least I think the naming should make clear that
>>>>> cs is not just some code segment, but the CS v has currently in use
>>>>> (e.g. active_cs). Going further the question is whether having this
>>>>> helper is really useful (and not perhaps inviting mis-use), and hence
>>>>> whether the inputs passed here wouldn't better be passed directly
>>>>> to hvm_virtual_to_linear_addr(), the more that the "seg" argument
>>>>> is required to match up between the two calls.
>>>> I purposefully wanted to avoid people opencoding the logic and getting
>>>> it wrong (looks like even I got it wrong).
>>>>
>>>> I'm not convinced that passing the parameters individually is better.
>>>>
>>>>>> +{
>>>>>> +    if ( !(v->arch.hvm_vcpu.guest_cr[0] & X86_CR0_PE) )
>>>>>> +        return hvm_seg_mode_real;
>>>>> What about VM86 mode?
>>>> Very good point.
>>>>
>>>>>> +    if ( hvm_long_mode_active(v) &&
>>>>>> +         (is_x86_system_segment(seg) || cs->attr.fields.l) )
>>>>>> +        return hvm_seg_mode_long;
>>>>>> +
>>>>>> +    return hvm_seg_mode_prot;
>>>>> Did you verify this actually matching real hardware behavior? There
>>>>> being obvious anomalies when compat ring-0 code executes
>>>>> LGDT/LIDT/SGDT/SIDT (in long mode these ought to have 10-byte
>>>>> operands, yet 32-bit/compat code would expect 6-byte ones, so
>>>>> one of the two views is necessarily wrong, and whichever it is, it
>>>>> introduces an inconsistency), I wouldn't take it for given that _all_
>>>>> descriptor table accessing insns behave like they would from a
>>>>> 64-bit code segment (I nevertheless assume they do, but as I
>>>>> can't see it written down anywhere, we shouldn't assume so,
>>>>> considering how many other oddities there are in x86).
>>>>>
>>>>> This question is also being supported by the SDM using the same
>>>>> standard "Same exceptions as in protected mode" in the
>>>>> respective insns' "Compatibility Mode Exceptions" sections, yet
>>>>> the behavior above implies that #GP(0) might also result for
>>>>> compat mode descriptor table accesses if the descriptor address
>>>>> ends up being non-canonical. Interestingly enough the PM
>>>>> doesn't separate exception specifications for 32-bit protected,
>>>>> compat, and 64-bit modes.
>>>> You are mistaken.
>>>>
>>>> {L,S}{I,G}DT are {read,write}_segment_register() operations, using a
>>>> plain memory operand.
>>>>
>>>> When we come to the segmentation check, it will be by default
>>>> %ds-relative, with size as calculated by op_bytes in the instruction
>>>> emulator.
>>> I think I didn't make myself clear then: I'm not at all talking about how
>>> the memory access of these insns get carried out, I solely talk about
>>> the size of their operands:
>> I still fail to see what the size of the operands have to do with the
>> choice of segmentation mode.
>>
>>> In long mode to load IDTR or GDTR you'd expect a 64-bit base and a 16-bit 
>> limit.
>>
>> Why?  I'd expect nothing of the sort, because 32bit compat segments are
>> purposefully designed to be no functional difference from regular 32bit
>> protected mode segments.  That includes not changing the behaviour of
>> instructions like this.
> Well, one can of course take the position that ring-0 compat code
> is simply a useless thing.

Compatibility mode segments exist for the purpose of making user code
continue to work.  I don't find it surprising that compatbility
supervisor segments have some rough corners.

>
>>> Hence if _all_ system segment
>>> register related insns worked consistently in long mode, the four
>>> named insns would have to have 10-byte operands.
>> This isn't a valid expectation to draw.
>>
>>>  I'm pretty sure
>>> they don't though, so there is _one_ anomaly already.
>> Indeed they don't.  In a compatibility mode segment, they have take a
>> 6-byte operand, identically to 32bit mode.
>>
>>> With that I don't think we can rule out there being other anomalies, with this
>>> not being talked about explicitly anywhere in the doc.
>> I don't think any of this is relevant to the correctness of this patch.
> I don't question the correctness; all I question is how far it is built
> upon assumptions vs actually observed (short of documented)
> behavior.

AMD Vol2 14.5 "Initialising Long Mode" is fairly clear on the subject. 
The layout and behaviour of the 4 system structures depend on

>
>> Without this fix, implicit accesses to system segments in a
>> compatibility mode segment will truncate the resulting linear address as
>> part of performing the segmentation calculations, which is obviously not
>> how real hardware behaves.
> I understand this. But things are a little more complicated. Just
> extend your line of thinking regarding IDTR/GDTR to LDTR and
> TR: Above you suggest that the former two get loaded in a fully
> 32-bit mode compatible way. LTR and LLDT (as well as LAR and
> LSL), however, access a descriptor table. 32-bit code would
> expect an 8-byte descriptor to be read. Is compat mode code
> then not allowed to make the same assumption?

Hmm - the wording of LTR/LLDT in both manuals states 64bit mode, not
long mode, so there is a decent chance that the compat behaviour is
identical.  Let me experiment.

~Andrew

>  I think you've
> made pretty much explicit that you'd expect a 16-byte descriptor
> to be loaded in this case. So ring-0 compat code operates
> identically to 32-bit mode in some respects, and identically to
> 64-bit code in others. Fundamentally I'd expect consistency here,
> but along the lines of the usefulness remark higher up I simply
> think that no-one had spent any thought on this when designing
> x86-64; otherwise it would have been easy to simply disallow
> ring-0 to enter compat mode, thus avoiding any inconsistencies.
>
> Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-04-03 17:37 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-31 19:50 [PATCH for 4.9 0/6] x86/emul: Fixes Andrew Cooper
2017-03-31 19:50 ` [PATCH for 4.9 1/6] x86/hvm: Correct some address space terminology Andrew Cooper
2017-04-03  8:24   ` Paul Durrant
2017-04-03  8:24   ` Jan Beulich
2017-04-03 10:19     ` Andrew Cooper
2017-04-03 10:29       ` Jan Beulich
2017-03-31 19:50 ` [PATCH for 4.9 2/6] x86/hvm: Correct long mode predicate Andrew Cooper
2017-04-03  8:26   ` Paul Durrant
2017-04-03  8:30   ` Jan Beulich
2017-04-03  8:50   ` George Dunlap
2017-04-05  7:08   ` Tian, Kevin
2017-03-31 19:50 ` [PATCH for 4.9 3/6] x86/hvm: Fix segmentation logic for system segments Andrew Cooper
2017-04-03  8:31   ` Paul Durrant
2017-04-03  9:13   ` Jan Beulich
2017-04-03 14:27     ` Andrew Cooper
2017-04-03 15:07       ` Jan Beulich
2017-04-03 15:42         ` Andrew Cooper
2017-04-03 16:08           ` Jan Beulich
2017-04-03 17:37             ` Andrew Cooper [this message]
2017-04-04 10:18               ` Andrew Cooper
2017-04-04 10:32                 ` Jan Beulich
2017-03-31 19:50 ` [PATCH for 4.9 4/6] x86/svm: Introduce svm_emul_swint_injection() Andrew Cooper
2017-04-03  9:30   ` Jan Beulich
2017-04-03 14:04   ` Boris Ostrovsky
2017-03-31 19:50 ` [PATCH for 4.9 5/6] x86/emul: Drop swint_emulate infrastructure Andrew Cooper
2017-04-03  8:36   ` Paul Durrant
2017-04-03  9:38   ` Jan Beulich
2017-03-31 19:50 ` [PATCH for 4.9 6/6] x86/emul: Require callers to provide LMA in the emulation context Andrew Cooper
2017-04-03  8:40   ` Paul Durrant
2017-04-03 10:10   ` Jan Beulich
2017-04-05 16:24     ` Andrew Cooper
2017-04-06  6:58       ` Jan Beulich
2017-04-06  9:47         ` Andrew Cooper
2017-04-06 14:14           ` Jan Beulich
2017-04-06 16:32             ` Andrew Cooper
2017-04-07  8:35               ` Jan Beulich
2017-04-05 16:07   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c3bd56cd-774f-574d-bbf0-3418a7508f9f@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=julien.grall@arm.com \
    --cc=paul.durrant@citrix.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.