From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Jan Beulich" Subject: Re: [PATCH 7/8] xen/x86: Rework CR4 handling for PV guests Date: Fri, 26 Jun 2015 07:22:21 +0100 Message-ID: <558D0BBD0200007800089FAC@mail.emea.novell.com> References: <1435163500-10589-1-git-send-email-andrew.cooper3@citrix.com> <1435163500-10589-8-git-send-email-andrew.cooper3@citrix.com> <558C2F3E0200007800089BA9@mail.emea.novell.com> <558C2AB2.1070001@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <558C2AB2.1070001@citrix.com> Content-Disposition: inline List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andrew Cooper Cc: Xen-devel List-Id: xen-devel@lists.xenproject.org >>> On 25.06.15 at 18:22, wrote: > On 25/06/15 15:41, Jan Beulich wrote: >>>>> On 24.06.15 at 18:31, wrote: >>> The major change here is that v->arch.pv_vcpu.ctrlreg[4] now contains only CR4 >>> bits which Xen wishes to shadow, rather than containing a mix of host and >>> guest bits. This is likely to elicit a warning when migrating a PV guest >>> from >>> an older version of Xen, although no function problems as a result. >> For one, I'm having a hard time guessing what the part of the sentence >> after the comma is meant to tell me. > > Before this patch, v->arch.pv_vcpu.ctrlreg[4] contained an unqualified > combination of host and guest controlled bits. > > After this patch, it strictly contains the bits Xen chooses to shadow. With this you're re-stating what the first sentence said after its comma. The question was on the second one though. >> And then, from a support >> perspective, such warnings aren't helpful, as they incur questions. > > It was useful for debugging, but can see your point. I could reword it > to "ignoring $FOO's attempt to update $BAR's CR4". I would hesitate > from removing it completely however. Yes, the warning in general should stay of course (there is one after all right now). I realize I slightly mis-read the sentence though - looks like the implication is that the possible warning is from the mixture of bits previously in .ctrlreg[4]? Could we flag the write originating from a migration, issuing the warning only for guest initiated updates? >> Plus >> finally, storing neither the current guest view nor the current hypervisor >> view in that field seems prone to cause headaches in the future. Otoh >> this gets it in line with CR0 handling as it looks. > > My logic was that the set of shadowed bits will never decrease on newer > versions of Xen, but Xen's choice of CR4 bits may change. As a result, > neither "guests view" nor "hypervisors view" is appropriate in an > upgrade case. Okay, namely together with getting it in line with CR0 handling this makes sense. >>> A second change is that visible set of CR4 bits is different. In >>> particular, >>> VMXE and MCE are not leaked into a guest, while SMAP and SMEP are exposed. >> Not leaking VMXE into the guest is fine. Not exposing MCE even to >> Dom0 is questionable. > > I would prefer not to special case the hardware domain if possible, and > I suppose MCE is not too much of an issue to leak. We already are, > after all. > >> And exposing SMEP and SMAP without >> exposing their CPUID flags seems bogus. > > I was hoping not to get bogged down in CPUID given my upcoming work to > fix feature levelling. > > After recent consideration, I am still not sure if we want to support > SMAP in 32bit PV guests or not. The trapping of stac/clac would be a > high overhead, although the guest could modify EFLAGS.AC itself using popf. If it technically works (which it looks like it does) I see no reason leaving the decision to guests. >>> In addition, add PGE to the set of bits we don't care about a guest >>> attempting to modify. >> I agree from a functionality pov this is okay, but do we really want >> to allow the guest to turn this off from a performance pov? > > This change doesn't shadow PGE. It just prevents the warning from > firing if the guest attempts to clear PGE. I which case I think the wording you chose is slightly misleading. >>> @@ -699,6 +719,8 @@ static int __init init_pv_cr4_masks(void) >>> common_mask &= ~X86_CR4_DE; >>> if ( cpu_has_xsave ) >>> common_mask &= ~X86_CR4_OSXSAVE; >>> + if ( cpu_has_pge ) >>> + common_mask &= ~X86_CR4_PGE; >> Following the earlier comment, shouldn't this bit then also be added >> to PV_CR4_SHADOW? > > I don't think so. A guest has to defer to Xen to perform TLB flushes. > I just wished to avoid the warning. Maybe add a comment to that effect then, avoiding future readers to feel tempted to correct this apparent inconsistency, just like it happened to me? >> With FSGSBASE not in PV_CR4_SHADOW, how would (as the comment >> in init_pv_cr4_masks() says) the guest manage to observe the flag in >> its desired state after a migration? > > The guest does not get any control of FSGSBASE. Xen unilaterally has it > enabled, and uses them on vcpu context switch so it is not safe to ever > disable. Again - is the comment the misleading? I took it to say that the guest can control its _view_ of the flag without controlling the hardware setting. I.e. when it wrote it as zero, it would read back zero, despite the flag being set in hardware at all times. Jan