From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jan Beulich"
Subject: RE: expose MWAIT to dom0
Date: Tue, 16 Aug 2011 09:45:28 +0100
Message-ID: <4E4A4A4802000078000516D8@nat28.tlf.novell.com>
References: <625BA99ED14B2D499DC4E29D8138F15062D2E80C3A@shsmsx502.ccr.corp.intel.com>
 <4E48EEB50200007800051398@nat28.tlf.novell.com>
 <625BA99ED14B2D499DC4E29D8138F15062D2E80DE8@shsmsx502.ccr.corp.intel.com>
 <4E49111E0200007800051441@nat28.tlf.novell.com>
 <625BA99ED14B2D499DC4E29D8138F15062DA38993E@shsmsx502.ccr.corp.intel.com>
 <4E4A2D5A0200007800051629@nat28.tlf.novell.com>
 <625BA99ED14B2D499DC4E29D8138F15062DA3899EC@shsmsx502.ccr.corp.intel.com>
 <4E4A42B702000078000516A3@nat28.tlf.novell.com>
 <625BA99ED14B2D499DC4E29D8138F15062DA389AF6@shsmsx502.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
In-Reply-To: <625BA99ED14B2D499DC4E29D8138F15062DA389AF6@shsmsx502.ccr.corp.intel.com>
Content-Disposition: inline
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Kevin Tian
Cc: Yang Z Zhang, "xen-devel@lists.xensource.com", Keir Fraser, Gang Wei
List-Id: xen-devel@lists.xenproject.org

>>> On 16.08.11 at 10:29, "Tian, Kevin" wrote:
>> From: Jan Beulich [mailto:JBeulich@novell.com]
>> Sent: Tuesday, August 16, 2011 4:13 PM
>>
>> >>> On 16.08.11 at 08:53, "Tian, Kevin" wrote:
>> >> From: Jan Beulich [mailto:JBeulich@novell.com]
>> >> Sent: Tuesday, August 16, 2011 2:42 PM
>> >>
>> >> >>> On 16.08.11 at 08:03, "Tian, Kevin" wrote:
>> >> >> From: Jan Beulich [mailto:JBeulich@novell.com]
>> >> >> Sent: Monday, August 15, 2011 6:29 PM
>> >> >>
>> >> >> that while improving the situation on CPUs that support the
>> >> >> break-on-interrupt extension to mwait, it would result in C2/C3
>> >> >> not being usable at all on CPUs that don't (but support mwait in
>> >> >> its simpler form and have ACPI tables specifying FFH as address
>> >> >> space id). Is that only a theoretical concern (i.e. is there an
>> >> >> implicit guarantee that for other than C1 FFH wouldn't be
>> >> >> specified without that extension being available)? I think it's
>> >> >> a practical one, or otherwise there wouldn't be a point in
>> >> >> removing the ACPI_PDC_C_C2C3_FFH bit prior to _PDC evaluation.
>> >> >
>> >> > Yes, this is a practical one, though I don't know any box doing
>> >> > that. In all the boxes I've been using so far, all the extensions
>> >> > are available. But since the BIOS vendor may also impact the
>> >> > availability of CPUID bits, I think we should do it right by
>> >> > strictly conforming to the SDM. I.e. we need to check CPUID leaves
>> >> > and then verify all Cx states propagated from dom0, instead of
>> >> > blindly following its info. Will work on a patch for that.
>> >>
>> >> You're getting it sort of the wrong way round: What I don't want to
>> >> do (but seemingly being necessary) is mimic the decision logic the
>> >> hypervisor uses (i.e. require the break-on-interrupt extension for
>> >> C2/C3 entering through MWAIT) in Dom0 when deciding about the bits
>> >> to pass to
>> >
>> > break-on-interrupt is not a hard requirement to use MWAIT. Even when
>> > that extension is not available, MWAIT can still be used to enter
>> > C2/C3, just with interrupts enabled.
>>
>> And that's why this implementation detail should be confined to the
>> hypervisor - Dom0 should not care about this if at all possible.
>>
>> >> _PDC. That ought to be an implementation detail (subject to change)
>> >> in the hypervisor alone. The hypervisor itself, otoh, already
>> >> properly checks CPUID leaf 5 (and that's what might cause it to not
>> >> use mwait despite the bit in CPUID leaf 1 being set, which should be
>> >> all Dom0 ought to look at for deciding whether to clear
>> >> ACPI_PDC_C_C2C3_FFH).
>> >>
>> >
>> > I made a mistake: currently CPUID leaf 5 is already checked in
>> > check_cx in the hypervisor, so it should be sane. However I still
>> > fail to catch your real concern here. :/
>>
>> If Dom0 finds (real) CPUID leaf 1 report MWAIT to be available, it
>> will (with the logic outlined above) call _PDC without clearing
>> ACPI_PDC_C_C2C3_FFH. If now the break-on-interrupt extension
>> is not present, but the address space ID for C2 or C3 is set to FFH,
>> then Xen (in acpi_processor_ffh_cstate_probe()) will reject the
>> Cx entry (and hence refrain from using the respective C-state),
>> whereas if Dom0 cleared ACPI_PDC_C_C2C3_FFH in that case,
>> firmware would (normally) have converted the address space ID to
>> SYSTEM_IO, and hence Xen would have decided to use C2/C3 with
>> the SYSIO entry method.
>>
>> So this is only acceptable if there are *no* production CPUs of any
>> vendor that would support MWAIT without the break-on-interrupt
>> extension.
>>
>
> yes, that's also the way that native Linux code currently works:
> - notify BIOS of ACPI_PDC_C_C2C3_FFH if the cpu has mwait
> - reject the Cx entry if the break-on-interrupt extension is not
>   present, later in the acpi processor driver when parsing Cx entries.
>
> From the BIOS ACPI p.o.v, OSPM can notify BIOS about FFH style if the
> following conditions are true:
> a) cpu supports mwait
> b) OSPM itself supports mwait
>
> a) is architectural, but b) is implementation specific with regard to
> what can be called "support". Obviously both Xen and Linux here are
> inconsistent between the place notifying BIOS and the point parsing
> the ACPI Cx entry. So your conclusion is correct that C2/C3 would be
> rejected on a CPU which doesn't support MWAIT with the
> break-on-interrupt extension. But it should be fine in the real world,
> and we may consider whether to do something when a real case is
> encountered in the future.
>
> On the other hand, you can think of it as a decision by Xen that it
> doesn't want to use the legacy I/O method for C2/C3 when such a
> situation exists.
> :-)

Yeah, but customers could validly view this as a regression (because on
such a system Xen would currently use C2/C3).

Jan
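
For reference, the scenario can be written down as a small C model.
This is only a sketch of the reasoning in this thread, not Xen or Linux
code: every name in it is hypothetical, and it assumes the "normal"
firmware behaviour described above (FFH entries for C2/C3 when
ACPI_PDC_C_C2C3_FFH is left set, SYSTEM_IO entries when it is cleared).

```c
#include <assert.h>
#include <stdbool.h>

/* How the hypervisor ends up entering C2/C3 (hypothetical enum). */
enum cx_entry { CX_UNUSED, CX_VIA_MWAIT, CX_VIA_SYSIO };

/* The two CPU properties the discussion is about (hypothetical struct). */
struct cpu_caps {
    bool mwait;         /* CPUID leaf 1 reports MONITOR/MWAIT */
    bool break_on_intr; /* CPUID leaf 5 reports the break-on-interrupt extension */
};

/* Dom0 policy A: look at CPUID leaf 1 only when deciding about _PDC. */
bool dom0_keeps_ffh_leaf1_only(struct cpu_caps c)
{
    return c.mwait;
}

/* Dom0 policy B: mimic the hypervisor's stricter check. */
bool dom0_keeps_ffh_mimic_xen(struct cpu_caps c)
{
    return c.mwait && c.break_on_intr;
}

/*
 * Outcome in the hypervisor. Assumption: firmware hands out FFH entries
 * when the bit was kept and SYSTEM_IO entries when it was cleared.
 * An FFH entry is rejected unless break-on-interrupt is available.
 */
enum cx_entry xen_c2c3_entry(struct cpu_caps c, bool ffh_kept)
{
    if (!ffh_kept)
        return CX_VIA_SYSIO;
    return (c.mwait && c.break_on_intr) ? CX_VIA_MWAIT : CX_UNUSED;
}

/* The problematic case: MWAIT present, break-on-interrupt absent. */
void model_selfcheck(void)
{
    struct cpu_caps simple = { true, false };

    /* Policy A loses C2/C3 altogether on such a CPU ... */
    assert(xen_c2c3_entry(simple, dom0_keeps_ffh_leaf1_only(simple)) == CX_UNUSED);
    /* ... while policy B falls back to the SYSIO entry method. */
    assert(xen_c2c3_entry(simple, dom0_keeps_ffh_mimic_xen(simple)) == CX_VIA_SYSIO);
}
```

On a CPU with both properties the two policies give the same result
(entry through MWAIT), so the difference only shows on CPUs that have
MWAIT without the break-on-interrupt extension.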