From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tim Deegan Subject: Re: [PATCH for-4.5 RFC v2] x86/HVM: Unconditionally crash guests on repeated vmentry failures Date: Thu, 27 Nov 2014 12:44:24 +0100 Message-ID: <20141127114424.GA13236@deinos.phlegethon.org> References: <1416934449-20299-1-git-send-email-andrew.cooper3@citrix.com> <5474C998020000780004AD1D@mail.emea.novell.com> <5474C658.4030901@citrix.com> <5476058A02000078000C2808@mail.emea.novell.com> <54761124.60203@citrix.com> <5476F1F5020000780004B0BC@mail.emea.novell.com> <20141127102650.GA13234@deinos.phlegethon.org> <54771636020000780004B1DC@mail.emea.novell.com> <20141127112936.GB13234@deinos.phlegethon.org> <54771B33020000780004B1FF@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <54771B33020000780004B1FF@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: Andrew Cooper , keir@xen.org, xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org At 11:38 +0000 on 27 Nov (1417084691), Jan Beulich wrote: > >>> On 27.11.14 at 12:29, wrote: > > At 11:16 +0000 on 27 Nov (1417083414), Jan Beulich wrote: > >> >>> On 27.11.14 at 11:26, wrote: > >> > At 08:42 +0000 on 27 Nov (1417074133), Jan Beulich wrote: > >> >> >>> On 26.11.14 at 18:43, wrote: > >> >> > My v1 patch only fixes the VMX side of things. SVM doesn't explicitly > >> >> > identify a failed vmentry and lets it fall into the default case of the > >> >> > vmexit handler. As such, with v1, the infinite loop still affects AMD > >> >> > hardware. > >> >> > >> >> Right; I should have said "something along the lines of v1". An SVM > >> >> part is needed, but that should probably extend beyond what you > >> >> proposed in v2: There are a number of "goto exit_and_crash" > >> >> statements ahead of where you place your addition. I think they > >> >> all need to be treated similarly. > >> >> > >> >> I therefore think we should revert the VMX part of 28b4baacd5 > >> >> and make SVM behavior consistent with what results for VMX: > >> >> Crash the guest unconditionally on VMEXIT_INVALID, without > >> >> looking for recurring VM entry failures. See below/attached. > >> >> > >> >> Jan > >> >> > >> >> x86/HVM: prevent infinite VM entry retries > >> >> > >> >> This reverts the VMX side of commit 28b4baac ("x86/HVM: don't crash > >> >> guest upon problems occurring in user mode") and gets SVM in line with > >> >> the resulting VMX behavior. This is because Andrew validly says > >> >> > >> >> "A failed vmentry is overwhelmingly likely to be caused by corrupt > >> >> VMC[SB] state. As a result, injecting a fault and retrying the the > >> >> vmentry is likely to fail in the same way." > >> >> > >> >> Reported-by: Andrew Cooper > >> >> Signed-off-by: Jan Beulich > >> > > >> > Looking at the SVM side, AFAICT you're trying to filter out > >> > VMEXIT_INVALID exits that couldn't be handled by nested SVM, but I > >> > think it's fine just to always crash on nested-SVM failures: we know > >> > the guest wasn't in user mode because it successfully executed VMRUN. > >> > And looking at it, the other users of that label are for unexpected > >> > debugging exits, which can't be caused by the guest userspace either. > >> > > >> > So how about this for the SVM side, reverting to crashing for > >> > everything except new, unsupported exit types? > >> > >> Generally a good idea, but there are two paths to exit_and_crash > >> (for VMEXIT_EXCEPTION_DB and VMEXIT_EXCEPTION_BP) which I > >> think would better crash only conditionally. > > > > Those are catching Xen bugs -- we don't (or at least shouldn't) enable > > those exit types when the debugger is not attached. I think that, > > like with the p2m ENOMEM case, turning them into #UD is not really > > helpful. But if you prefer, those could be made into 'goto default' > > to preserve the current behaviour, which would also retain the > > debugging output. > > > >> And finally, if doing it that way we might better remove the > >> exit_and_crash label altogether and call domain_crash() directly > >> in the places we mean it to be called. > > > > Indeed. How's this, then? > > Looks good - if you don't mind I'll replace the SVM part of the earlier > patch with this, add your S-o-b alongside mine, and send again for > final review. Sure, that sounds great. Tim.