From mboxrd@z Thu Jan  1 00:00:00 1970
From: Tim Deegan <tim@xen.org>
Subject: Re: [PATCH for-4.5 RFC v2] x86/HVM: Unconditionally
 crash guests on repeated vmentry failures
Date: Thu, 27 Nov 2014 12:44:24 +0100
Message-ID: <20141127114424.GA13236@deinos.phlegethon.org>
References: <1416934449-20299-1-git-send-email-andrew.cooper3@citrix.com>
	<5474C998020000780004AD1D@mail.emea.novell.com>
	<5474C658.4030901@citrix.com>
	<5476058A02000078000C2808@mail.emea.novell.com>
	<54761124.60203@citrix.com>
	<5476F1F5020000780004B0BC@mail.emea.novell.com>
	<20141127102650.GA13234@deinos.phlegethon.org>
	<54771636020000780004B1DC@mail.emea.novell.com>
	<20141127112936.GB13234@deinos.phlegethon.org>
	<54771B33020000780004B1FF@mail.emea.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Content-Disposition: inline
In-Reply-To: <54771B33020000780004B1FF@mail.emea.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>, keir@xen.org, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

At 11:38 +0000 on 27 Nov (1417084691), Jan Beulich wrote:
> >>> On 27.11.14 at 12:29, <tim@xen.org> wrote:
> > At 11:16 +0000 on 27 Nov (1417083414), Jan Beulich wrote:
> >> >>> On 27.11.14 at 11:26, <tim@xen.org> wrote:
> >> > At 08:42 +0000 on 27 Nov (1417074133), Jan Beulich wrote:
> >> >> >>> On 26.11.14 at 18:43, <andrew.cooper3@citrix.com> wrote:
> >> >> > My v1 patch only fixes the VMX side of things.  SVM doesn't explicitly
> >> >> > identify a failed vmentry and lets it fall into the default case of the
> >> >> > vmexit handler.  As such, with v1, the infinite loop still affects AMD
> >> >> > hardware.
> >> >> 
> >> >> Right; I should have said "something along the lines of v1". An SVM
> >> >> part is needed, but that should probably extend beyond what you
> >> >> proposed in v2: There are a number of "goto exit_and_crash"
> >> >> statements ahead of where you place your addition. I think they
> >> >> all need to be treated similarly.
> >> >> 
> >> >> I therefore think we should revert the VMX part of 28b4baacd5
> >> >> and make SVM behavior consistent with what results for VMX:
> >> >> Crash the guest unconditionally on VMEXIT_INVALID, without
> >> >> looking for recurring VM entry failures. See below/attached.
> >> >> 
> >> >> Jan
> >> >> 
> >> >> x86/HVM: prevent infinite VM entry retries
> >> >> 
> >> >> This reverts the VMX side of commit 28b4baac ("x86/HVM: don't crash
> >> >> guest upon problems occurring in user mode") and gets SVM in line with
> >> >> the resulting VMX behavior. This is because Andrew validly says
> >> >> 
> >> >> "A failed vmentry is overwhelmingly likely to be caused by corrupt
> >> >>  VMC[SB] state.  As a result, injecting a fault and retrying the the
> >> >>  vmentry is likely to fail in the same way."
> >> >> 
> >> >> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >> > 
> >> > Looking at the SVM side, AFAICT you're trying to filter out
> >> > VMEXIT_INVALID exits that couldn't be handled by nested SVM, but I
> >> > think it's fine just to always crash on nested-SVM failures: we know
> >> > the guest wasn't in user mode because it successfully executed VMRUN.
> >> > And looking at it, the other users of that label are for unexpected
> >> > debugging exits, which can't be caused by the guest userspace either.
> >> > 
> >> > So how about this for the SVM side, reverting to crashing for
> >> > everything except new, unsupported exit types?
> >> 
> >> Generally a good idea, but there are two paths to exit_and_crash
> >> (for VMEXIT_EXCEPTION_DB and VMEXIT_EXCEPTION_BP) which I
> >> think would better crash only conditionally.
> > 
> > Those are catching Xen bugs -- we don't (or at least shouldn't) enable
> > those exit types when the debugger is not attached.  I think that,
> > like with the p2m ENOMEM case, turning them into #UD is not really
> > helpful.  But if you prefer, those could be made into 'goto default'
> > to preserve the current behaviour, which would also retain the
> > debugging output.
> > 
> >> And finally, if doing it that way we might better remove the
> >> exit_and_crash label altogether and call domain_crash() directly
> >> in the places we mean it to be called.
> > 
> > Indeed.  How's this, then?
> 
> Looks good - if you don't mind I'll replace the SVM part of the earlier
> patch with this, add your S-o-b alongside mine, and send again for
> final review.

Sure, that sounds great.

Tim.