From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kay, Allen M"
Subject: RE: Xen Advisory 5 (CVE-2011-3131) IOMMU fault livelock
Date: Tue, 20 Sep 2011 17:07:37 -0700
Message-ID: <987664A83D2D224EAE907B061CE93D5301EDED333E@orsmsx505.amr.corp.intel.com>
References: <20037.10841.995717.397090@mariner.uk.xensource.com> <4E454C880200007800051000@nat28.tlf.novell.com> <20110812140901.GC11708@ocelot.phlegethon.org> <4E4559440200007800051062@nat28.tlf.novell.com> <20110815092608.GD11708@ocelot.phlegethon.org> <4E4A32650200007800051651@nat28.tlf.novell.com> <20110816150621.GM11708@ocelot.phlegethon.org> <4E4AAFF402000078000518CF@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <4E4AAFF402000078000518CF@nat28.tlf.novell.com>
Content-Language: en-US
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Jan Beulich, Tim Deegan
Cc: "xen-devel@lists.xensource.com", "Dugger, Donald D", Xen.org security team
List-Id: xen-devel@lists.xenproject.org

Catching up on an old thread ...

If I understand correctly, the proposal is to check for VT-d faults in the do_softirq() handler. If so, we probably don't even need to enable the VT-d MSI interrupt at all if iommu_debug is not set, basically handling VT-d faults with a polling method.

This sounds fine to me as long as we still turn on the VT-d MSI interrupt for the iommu_debug case.

Allen

-----Original Message-----
From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of Jan Beulich
Sent: Tuesday, August 16, 2011 8:59 AM
To: Tim Deegan
Cc: xen-devel@lists.xensource.com; Xen.org security team
Subject: Re: [Xen-devel] Xen Advisory 5 (CVE-2011-3131) IOMMU fault livelock

>>> On 16.08.11 at 17:06, Tim Deegan wrote:
> At 08:03 +0100 on 16 Aug (1313481813), Jan Beulich wrote:
>> >>> On 15.08.11 at 11:26, Tim Deegan wrote:
>> > At 15:48 +0100 on 12 Aug (1313164084), Jan Beulich wrote:
>> >> >>> On 12.08.11 at 16:09, Tim Deegan wrote:
>> >> > At 14:53 +0100 on 12 Aug (1313160824), Jan Beulich wrote:
>> >> >> > This issue is resolved in changeset 23762:537ed3b74b3f of
>> >> >> > xen-unstable.hg, and 23112:84e3706df07a of xen-4.1-testing.hg.
>> >> >>
>> >> >> Do you really think this helps much? Direct control of the device means
>> >> >> it could also (perhaps on a second vCPU) constantly re-enable the bus
>> >> >> mastering bit.
>> >> >
>> >> > That path goes through qemu/pciback, so at least lets Xen schedule the
>> >> > dom0 tools.
>> >>
>> >> Are you sure? If (as said) the guest uses a second vCPU for doing the
>> >> config space accesses, I can't see how this would save the pCPU the
>> >> fault storm is occurring on.
>> >
>> > Hmmm. Yes, I see what you mean.
>>
>> Actually, a second vCPU may not even be needed: Since the "fault"
>> really is an external interrupt, if that one gets handled on a pCPU other
>> than the one the guest's vCPU is running on, it could execute such a
>> loop even in that case.
>>
>> As to yesterday's softirq-based handling thoughts - perhaps the clearing
>> of the bus master bit on the device should still be done in the actual IRQ
>> handler, while the processing of the fault records could be moved out to
>> a softirq.
>
> Hmmm. I like the idea of using a softirq but in fact by the time we've
> figured out which BDF to silence we've pretty much done handling the
> fault.

Ugly, but yes, indeed.

> Reading the VTd docs it looks like we can just ack the IOMMU fault
> interrupt and it won't send any more until we clear the log, so we can
> leave the whole business to a softirq. Delaying that might cause the
> log to overflow, but that's not necessarily the end of the world.
> Looks like we can do the same on AMD by disabling interrupt generation
> in the main handler and reenabling it in the softirq.
>
> Is there any situation where we really care terribly about the IOfault
> logs overflowing?

As long as older entries don't get overwritten, I don't think that's
going to be problematic. All the more so since we basically shut off the
offending device(s).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
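
As a rough illustration of the split discussed in this thread (only ack the fault interrupt in the IRQ handler, and leave the fault log to a softirq), here is a minimal sketch in C. It is a toy user-space model, not Xen code: `struct iommu_model`, `FAULT_LOG_SIZE`, and the `model_*` functions are all hypothetical names made up for this example. The rule that no further interrupt is raised until the log has been cleared follows the reading of the VT-d docs quoted above and is an assumption of the model, as is the rule that a full log drops new records instead of overwriting older ones.

```c
#include <stdbool.h>

#define FAULT_LOG_SIZE 8          /* hypothetical log depth */

struct fault_record {
    unsigned int bdf;             /* bus/device/function that faulted */
    bool valid;
};

struct iommu_model {              /* hypothetical stand-in for one IOMMU */
    struct fault_record log[FAULT_LOG_SIZE];
    bool irq_pending;             /* fault interrupt asserted, not yet acked */
    bool overflow;                /* a record was dropped because the log was full */
    bool softirq_raised;          /* deferred processing has been requested */
    unsigned int handled;         /* total records processed so far */
};

/* Hardware side: record a fault. Assumption: an interrupt is raised only
 * when the log goes from empty to non-empty, and a full log drops the new
 * record rather than overwriting an older one. */
bool model_report_fault(struct iommu_model *m, unsigned int bdf)
{
    bool was_empty = true;
    int free_slot = -1;

    for (int i = 0; i < FAULT_LOG_SIZE; i++) {
        if (m->log[i].valid)
            was_empty = false;
        else if (free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0) {
        m->overflow = true;
        return false;
    }
    m->log[free_slot].bdf = bdf;
    m->log[free_slot].valid = true;
    if (was_empty)
        m->irq_pending = true;
    return true;
}

/* Hard IRQ context: ack the interrupt and defer everything else. */
void model_irq_handler(struct iommu_model *m)
{
    m->irq_pending = false;
    m->softirq_raised = true;
}

/* Softirq context: drain the log, then clear it so that interrupts can
 * fire again. Returns the number of records processed in this pass. */
unsigned int model_softirq(struct iommu_model *m)
{
    unsigned int n = 0;

    if (!m->softirq_raised)
        return 0;
    for (int i = 0; i < FAULT_LOG_SIZE; i++) {
        if (!m->log[i].valid)
            continue;
        /* A real handler would silence the device at log[i].bdf here. */
        m->log[i].valid = false;
        n++;
    }
    m->handled += n;
    m->overflow = false;
    m->softirq_raised = false;
    return n;
}
```

In this model a fault storm that arrives between `model_irq_handler()` and `model_softirq()` raises no further interrupts; it can only fill the log and set `overflow`, which matches the point made above that an overflowing log is tolerable as long as older entries are not overwritten.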