* RIP MTRR - status update for upcoming v4.2 @ 2015-06-11 20:36 Luis R. Rodriguez 2015-06-11 23:23 ` Toshi Kani 0 siblings, 1 reply; 20+ messages in thread From: Luis R. Rodriguez @ 2015-06-11 20:36 UTC (permalink / raw) To: Tomi Valkeinen, Borislav Petkov, Bjorn Helgaas, Jej B, Ville Syrjälä, Ville Syrjälä, Andrew Morton Cc: linux-kernel, linux-media, linux-fbdev, linux-pci, Andy Lutomirski, Toshi Kani, X86 ML, Juergen Gross, Dave Airlie, Luis R. Rodriguez, xen-devel, Julia Lawall The series to bury direct MTRR use is almost all in and on its way to v4.2. As the pending series continue slowly to be merged I wanted to take the time to reiterate the justification for these changes in hopes it may help those still reviewing some of these patches which are pending and to help document all these changes. There are also some series which depend on symbols now exported through some other subsystem trees so coordination is needed there. In those cases we have the option there to sit and wait for the exported symbols to trickle in through v4.2 and later on v4.3 finalize the changes, or to let some of the depending changes to in through other subsystem trees. I don't consider the coordination required difficult to handle so would prefer to see the changes in for v4.2 to be able to put a nail on the MTRR coffin sooner rather than later and to also help get more testing out of this sooner rather than later. PAT is known to have errata for some CPUs so hearing reports of issues with PAT would be very valuable. I'll let maintainers decide on how that trickles through. To help with all this towards the end I provide a status of all the pending patches to get this work completed. Justification ========= We want to bury direct use of MTRR code because: a) MTRR is x86 specific, this means all existing MTRR code is #idef'd out. PAT support for x86 was implemented using architecture agnostic APIs, this enables other architectures to provide support for a similar write-combining feature, and removes the nasty #idef eyesores. MTRR should be seen as a first step temporary architectural evolution to what PAT eventually became on x86. b) We have a long term goal to change the default behavior of ioremap_nocache() and pci_mmap_page_range() to use PAT strong UC, right now we cannot do this, but after all drivers are converted (all these series I've been posting) we expect to be able to make the change. Making a change to strong UC on these two calls can only happen after a period of time of having Linux bake with all these changes merged and in place. How many kernels we will want Linux baked with all these transformations to arch_phys before making a change to ioremap_nocache() and pci_mmap_page_range() is up to x86 folks. There are other gains possible with this but I welcome others to chime in here with what gains we can expect from this. c) MTRR acts on physical addresses and requires power-of-two alignment, on both the base used and size, this limits the flexibility for MTRR use. For a good example of its limitations refer to the patches which change the atyfb driver from using MTRR to PAT. d) MTRR is known to be unreliable, it can at times not work even on modern systems. e) There is a limit to how many MTRRs you can use on a system. If using a large number of devices with MTRR support you will quickly run out of MTRRs. This is why originally Andy Lutomirski ended up adding the arch_phys_wc_add() API, in order to take advantage of PAT which is *not* bound to the same limitations as MTRRs are. f) PAT has been available for quite a long time, since Pentium III (circa 1999) and newer, but having PAT enabled does not restrict use of MTRR and because of this some systems may end up then combining MTRR and PAT. I do not believe this wasn't an original highly expected wide use situation, it technically should work to combine both but there might be issues with interactions between both, exactly what issues can exist or have existed remains quite unclear as MTRR in and of itself has been known to be unreliable anyway. If possible its best to just be binary about this and only use MTRR if and only if necessary because of the other issues known with MTRR. g) Linux has support for Xen PV domains using PAT, this was introduced by Juergen via v3.19 via commit 47591df50512 ("xen: Support Xen pv-domains using PAT"). Since MTRR is old we don't want to add MTRR support into Xen on Linux, instead since Linux now supports PV domains with PAT we can take full advantage of write combining for PV domains on Xen provided all Linux drivers are converted to use PAT properly.a framebuffer folks's ACK Review of the changes ================ Most of the series has consisted of driver transformations using Coccinelle SmPL patches to transform existing code which access MTRR APIs directly to the architecture agnostic write-combining calls. Other patches extend bus subsystems to make available new write-combining architecture agnostic APIs. Other patches have consisted of extending architecture agnostic APIs to help work around old MTRR hacks -- this was perhaps the hardest task and took quite a bit of time and review as it required review of implications of all combinatorial possibilities with MTRR and PAT, which also got documented as part of the series. In the end it was also determined some drivers required substantial work to be able to work properly with PAT, the atyfb driver is an example driver which had the homework required done. Due to complexities and since the driver / hardware was ancient we decided to reach a compromise and require users of those drivers to boot with a kernel parameter to disable PAT, fortunately this was only required for two old device drivers: * ipath: the ipath device driver powers the old HTX bus cards that only work in AMD systems, while the newer IB/qib device driver powers all PCI-e cards. The ipath device driver is obsolete, hardware hard to find * ivtv: the hardware is really rare these days, and perhaps only some lost souls in some third world country are expected to be using this feature of the device driver. Pending RIP MTRR patches ==================== There are a few pending series so I wanted to provide a status update on those series. mtrr: bury MTRR - unexport mtrr_add() and mtrr_del() This is the nail on the MTRR coffin, it will prevent future direct access to MTRR code. This will not be posted until all of the below patches are in and merged. A possible next step here might be to consider separating PAT code from MTRR code and making PAT a first class citizen, enabling distributions to disable MTRR code in the future. I thought this was possible but for some reason I recently thought that there was one possible issue to make this happen. I suppose we won't know unless we try, unless of course someone already knows, Toshi? IB/ipath: use arch_phys_wc_add() and require PAT disabled IB/ipath: add counting for MTRR ivtv: use arch_phys_wc_add() and require PAT disabled This v7 series just posted addresses all drivers which cannot work with PAT, fortunately we end up with only 2! This series has all subsystem and driver maintainers ACKs, it was just posted with a few fixes with the intent to be merged through Boris' x86 tree as it depends on the newly exported pat_enabled() symbol. fusion: remove dead MTRR code This should go through Bottomley's tree, driver maintainer provided an ACKed, this is just pending integration into the SCSI subsystem tree. As a last resort my hope is that this can go through Andrew Morton's tree. video: fbdev: vesafb: use arch_phys_wc_add() video: fbdev: vesafb: add missing mtrr_del() for added MTRR video: fbdev: vesafb: only support MTRR_TYPE_WRCOMB A v4 series was posted, pending review / Integration through Tomis' tree video: fbdev: atyfb: use arch_phys_wc_add() and ioremap_wc() video: fbdev: atyfb: replace MTRR UC hole with strong UC video: fbdev: atyfb: clarify ioremap() base and length used video: fbdev: atyfb: move framebuffer length fudging to helper This provides a work around replacement for the direct MTRR use hack using the new ioremap_uc() API. This is pending review / ACK by the driver maintainer Ville Syrjälä, and the subsystem maintainer, Tomi. Since it relies on ioremap_uc(), which is merged through Boris' tree, this should go through Boris' tree *iff* we want it in for v4.2, otherwise this will have to wait for v4.3. video: fbdev: vt8623fb: use arch_phys_wc_add() and pci_iomap_wc() video: fbdev: s3fb: use arch_phys_wc_add() and pci_iomap_wc() video: fbdev: arkfb: use arch_phys_wc_add() and pci_iomap_wc() lib: devres: add pcim_iomap_wc() variants pci: add pci_iomap_wc() variants This has Tomi's ACK already for the driver specific changes, Bjorn asked for a Documentation guidance update for EXPORT_SYMBOL_GPL(), this is now merged via linux-next. This is pending consent from Tomi if it can go in through Bjorn's tree. It is also obviously pending a final ACK from Bjorn. The devres change would go in without any users, I leave this for Bjorn to decide but do note my concern of someone in the future adding a non EXPORT_SYMBOL_GPL() for this implementation. video: fbdev: gxt4500: use pci_ioremap_wc_bar() for framebuffer video: fbdev: kyrofb: use arch_phys_wc_add() and pci_ioremap_wc_bar() video: fbdev: i740fb: use arch_phys_wc_add() and pci_ioremap_wc_bar() pci: add pci_ioremap_wc_bar() This requires Tomi's Ack / review, and then obviously Bjorn's own ACK / integration. Consent from Tomi of whether or not this can go through Bjorn's tree is also needed. Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: RIP MTRR - status update for upcoming v4.2 2015-06-11 20:36 RIP MTRR - status update for upcoming v4.2 Luis R. Rodriguez @ 2015-06-11 23:23 ` Toshi Kani 2015-06-12 0:52 ` Luis R. Rodriguez 2015-06-12 7:59 ` [Xen-devel] " Jan Beulich 0 siblings, 2 replies; 20+ messages in thread From: Toshi Kani @ 2015-06-11 23:23 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Tomi Valkeinen, Borislav Petkov, Bjorn Helgaas, Jej B, Ville Syrjälä, Ville Syrjälä, Andrew Morton, linux-kernel, linux-media, linux-fbdev, linux-pci, Andy Lutomirski, X86 ML, Juergen Gross, Dave Airlie, Luis R. Rodriguez, xen-devel, Julia Lawall On Thu, 2015-06-11 at 13:36 -0700, Luis R. Rodriguez wrote: : > Pending RIP MTRR patches > ==================== > > There are a few pending series so I wanted to provide a status update > on those series. > > mtrr: bury MTRR - unexport mtrr_add() and mtrr_del() > > This is the nail on the MTRR coffin, it will prevent future direct > access to MTRR code. This will not be posted until all of the below > patches are in and merged. A possible next step here might be to > consider separating PAT code from MTRR code and making PAT a first > class citizen, enabling distributions to disable MTRR code in the > future. I thought this was possible but for some reason I recently > thought that there was one possible issue to make this happen. I > suppose we won't know unless we try, unless of course someone already > knows, Toshi? There are two usages on MTRRs: 1) MTRR entries set by firmware 2) MTRR entries set by OS drivers We can obsolete 2), but we have no control over 1). As UEFI firmwares also set this up, this usage will continue to stay. So, we should not get rid of the MTRR code that looks up the MTRR entries, while we have no need to modify them. Such MTRR entries provide safe guard to /dev/mem, which allows privileged user to access a range that may require UC mapping while the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in such a case. UEFI memory table has memory attribute, which describes cache types supported in physical memory ranges. However, this information gets lost when it it is converted to e820 table. Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: RIP MTRR - status update for upcoming v4.2 2015-06-11 23:23 ` Toshi Kani @ 2015-06-12 0:52 ` Luis R. Rodriguez 2015-06-12 16:42 ` Toshi Kani 2015-06-12 7:59 ` [Xen-devel] " Jan Beulich 1 sibling, 1 reply; 20+ messages in thread From: Luis R. Rodriguez @ 2015-06-12 0:52 UTC (permalink / raw) To: Toshi Kani Cc: Luis R. Rodriguez, Tomi Valkeinen, Borislav Petkov, Bjorn Helgaas, Jej B, Ville Syrjälä, Ville Syrjälä, Andrew Morton, linux-kernel, linux-media, linux-fbdev, linux-pci, Andy Lutomirski, X86 ML, Juergen Gross, Dave Airlie, xen-devel, Julia Lawall On Thu, Jun 11, 2015 at 05:23:16PM -0600, Toshi Kani wrote: > On Thu, 2015-06-11 at 13:36 -0700, Luis R. Rodriguez wrote: > : > > Pending RIP MTRR patches > > ==================== > > > > There are a few pending series so I wanted to provide a status update > > on those series. > > > > mtrr: bury MTRR - unexport mtrr_add() and mtrr_del() > > > > This is the nail on the MTRR coffin, it will prevent future direct > > access to MTRR code. This will not be posted until all of the below > > patches are in and merged. A possible next step here might be to > > consider separating PAT code from MTRR code and making PAT a first > > class citizen, enabling distributions to disable MTRR code in the > > future. I thought this was possible but for some reason I recently > > thought that there was one possible issue to make this happen. I > > suppose we won't know unless we try, unless of course someone already > > knows, Toshi? > > There are two usages on MTRRs: > 1) MTRR entries set by firmware > 2) MTRR entries set by OS drivers > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > also set this up, this usage will continue to stay. So, we should not > get rid of the MTRR code that looks up the MTRR entries, while we have > no need to modify them. > > Such MTRR entries provide safe guard to /dev/mem, which allows > privileged user to access a range that may require UC mapping while > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in > such a case. > > UEFI memory table has memory attribute, which describes cache types > supported in physical memory ranges. However, this information gets > lost when it it is converted to e820 table. Is there no way to modify CPU capability bits upon boot and kick UEFI to re-evaluate ? In such UEFI cases what happens for instance when Xen is used which does not support MTRR? Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: RIP MTRR - status update for upcoming v4.2 2015-06-12 0:52 ` Luis R. Rodriguez @ 2015-06-12 16:42 ` Toshi Kani 0 siblings, 0 replies; 20+ messages in thread From: Toshi Kani @ 2015-06-12 16:42 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Luis R. Rodriguez, Tomi Valkeinen, Borislav Petkov, Bjorn Helgaas, Jej B, Ville Syrjälä, Ville Syrjälä, Andrew Morton, linux-kernel, linux-media, linux-fbdev, linux-pci, Andy Lutomirski, X86 ML, Juergen Gross, Dave Airlie, xen-devel, Julia Lawall On Fri, 2015-06-12 at 02:52 +0200, Luis R. Rodriguez wrote: > On Thu, Jun 11, 2015 at 05:23:16PM -0600, Toshi Kani wrote: > > On Thu, 2015-06-11 at 13:36 -0700, Luis R. Rodriguez wrote: > > : > > > Pending RIP MTRR patches > > > ==================== > > > > > > There are a few pending series so I wanted to provide a status update > > > on those series. > > > > > > mtrr: bury MTRR - unexport mtrr_add() and mtrr_del() > > > > > > This is the nail on the MTRR coffin, it will prevent future direct > > > access to MTRR code. This will not be posted until all of the below > > > patches are in and merged. A possible next step here might be to > > > consider separating PAT code from MTRR code and making PAT a first > > > class citizen, enabling distributions to disable MTRR code in the > > > future. I thought this was possible but for some reason I recently > > > thought that there was one possible issue to make this happen. I > > > suppose we won't know unless we try, unless of course someone already > > > knows, Toshi? > > > > There are two usages on MTRRs: > > 1) MTRR entries set by firmware > > 2) MTRR entries set by OS drivers > > > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > > also set this up, this usage will continue to stay. So, we should not > > get rid of the MTRR code that looks up the MTRR entries, while we have > > no need to modify them. > > > > Such MTRR entries provide safe guard to /dev/mem, which allows > > privileged user to access a range that may require UC mapping while > > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in > > such a case. > > > > UEFI memory table has memory attribute, which describes cache types > > supported in physical memory ranges. However, this information gets > > lost when it it is converted to e820 table. > > Is there no way to modify CPU capability bits upon boot and kick UEFI > to re-evaluate ? In such UEFI cases what happens for instance when > Xen is used which does not support MTRR? EFI GetMemoryMap() is a boot service, and won't be available after ExitBootServices() is called. But we should be able to keep the attribute information copied into some table if necessary. Xen provides virtual firmware on their guests, right? If this firmware does not set up MTRRs today, then I do not think it needs to set up for UEFI, either. Assuming the guest physical address is virtualized, it does not have to carry the same platform attribute & restriction. Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-11 23:23 ` Toshi Kani 2015-06-12 0:52 ` Luis R. Rodriguez @ 2015-06-12 7:59 ` Jan Beulich 2015-06-12 16:58 ` Toshi Kani 2015-06-12 23:15 ` Andy Lutomirski 1 sibling, 2 replies; 20+ messages in thread From: Jan Beulich @ 2015-06-12 7:59 UTC (permalink / raw) To: Toshi Kani Cc: Andy Lutomirski, Luis R. Rodriguez, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, ville.syrjala, Julia Lawall, xen-devel, Dave Airlie, syrjala, Juergen Gross, Luis Rodriguez, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > There are two usages on MTRRs: > 1) MTRR entries set by firmware > 2) MTRR entries set by OS drivers > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > also set this up, this usage will continue to stay. So, we should not > get rid of the MTRR code that looks up the MTRR entries, while we have > no need to modify them. > > Such MTRR entries provide safe guard to /dev/mem, which allows > privileged user to access a range that may require UC mapping while > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in > such a case. But it wouldn't be impossible to simply read the MTRRs upon boot, store the information, disable MTRRs, and correctly use PAT to achieve the same effect (i.e. the "blindly maps" part of course would need fixing). > UEFI memory table has memory attribute, which describes cache types > supported in physical memory ranges. However, this information gets > lost when it it is converted to e820 table. I'm afraid you rather don't want to trust that information, as firmware vendors frequently screw it up. Jan ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-12 7:59 ` [Xen-devel] " Jan Beulich @ 2015-06-12 16:58 ` Toshi Kani 2015-08-06 19:53 ` Luis R. Rodriguez 2015-06-12 23:15 ` Andy Lutomirski 1 sibling, 1 reply; 20+ messages in thread From: Toshi Kani @ 2015-06-12 16:58 UTC (permalink / raw) To: Jan Beulich Cc: Andy Lutomirski, Luis R. Rodriguez, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, ville.syrjala, Julia Lawall, xen-devel, Dave Airlie, syrjala, Juergen Gross, Luis Rodriguez, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: > >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > > There are two usages on MTRRs: > > 1) MTRR entries set by firmware > > 2) MTRR entries set by OS drivers > > > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > > also set this up, this usage will continue to stay. So, we should not > > get rid of the MTRR code that looks up the MTRR entries, while we have > > no need to modify them. > > > > Such MTRR entries provide safe guard to /dev/mem, which allows > > privileged user to access a range that may require UC mapping while > > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in > > such a case. > > But it wouldn't be impossible to simply read the MTRRs upon boot, > store the information, disable MTRRs, and correctly use PAT to > achieve the same effect (i.e. the "blindly maps" part of course > would need fixing). It could be done, but I do not see much benefit of doing it. One of the reasons platform vendors set MTRRs is so that a system won't hit a machine check when an OS bug leads an access with a wrong cache type. A machine check is hard to analyze and can be seen as a hardware issue by customers. Emulating MTRRs with PAT won't protect from such a bug. > > UEFI memory table has memory attribute, which describes cache types > > supported in physical memory ranges. However, this information gets > > lost when it it is converted to e820 table. > > I'm afraid you rather don't want to trust that information, as > firmware vendors frequently screw it up. Could be, but we need to use firmware info when necessary... Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-12 16:58 ` Toshi Kani @ 2015-08-06 19:53 ` Luis R. Rodriguez 2015-08-06 19:55 ` Luis R. Rodriguez 2015-08-06 22:58 ` Toshi Kani 0 siblings, 2 replies; 20+ messages in thread From: Luis R. Rodriguez @ 2015-08-06 19:53 UTC (permalink / raw) To: Toshi Kani Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> wrote: > On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: >> >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: >> > There are two usages on MTRRs: >> > 1) MTRR entries set by firmware >> > 2) MTRR entries set by OS drivers >> > >> > We can obsolete 2), but we have no control over 1). As UEFI firmwares >> > also set this up, this usage will continue to stay. So, we should not >> > get rid of the MTRR code that looks up the MTRR entries, while we have >> > no need to modify them. >> > >> > Such MTRR entries provide safe guard to /dev/mem, which allows >> > privileged user to access a range that may require UC mapping while >> > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in >> > such a case. >> >> But it wouldn't be impossible to simply read the MTRRs upon boot, >> store the information, disable MTRRs, and correctly use PAT to >> achieve the same effect (i.e. the "blindly maps" part of course >> would need fixing). > > It could be done, but I do not see much benefit of doing it. One of the > reasons platform vendors set MTRRs is so that a system won't hit a > machine check when an OS bug leads an access with a wrong cache type. > > A machine check is hard to analyze and can be seen as a hardware issue by > customers. Emulating MTRRs with PAT won't protect from such a bug. That's seems like a fair and valid concern. This could only happen if the OS would have code that would use MTRR, in the case of Linux we'll soon be able to vet that this cannot happen. For those type of OSes... could it be possible to negotiate or hint to the platform through an attribute somehow that the OS has such capability to not use MTRR? Then, only if this bit is set, the platform could then avoid such MTRR settings, and if we have issues you can throw rocks at us. Could this also be used to prevent SMIs with MTRRs? Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-06 19:53 ` Luis R. Rodriguez @ 2015-08-06 19:55 ` Luis R. Rodriguez 2015-08-06 22:58 ` Toshi Kani 1 sibling, 0 replies; 20+ messages in thread From: Luis R. Rodriguez @ 2015-08-06 19:55 UTC (permalink / raw) To: Toshi Kani Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Thu, Aug 6, 2015 at 12:53 PM, Luis R. Rodriguez <mcgrof@do-not-panic.com> wrote: > For those type of OSes... > could it be possible to negotiate or hint to the platform through an > attribute somehow that the OS has such capability to not use MTRR? And if that's not possible how about a new platform setting that would need to be set at the platform level to enable disabling this junk? Then only folks who know what they are doing would enable it, and if the customer set it, the issue would not be on the platform. Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-06 19:53 ` Luis R. Rodriguez 2015-08-06 19:55 ` Luis R. Rodriguez @ 2015-08-06 22:58 ` Toshi Kani 2015-08-07 20:25 ` Luis R. Rodriguez 1 sibling, 1 reply; 20+ messages in thread From: Toshi Kani @ 2015-08-06 22:58 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> wrote: > > On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: > > > > > > On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > > > > There are two usages on MTRRs: > > > > 1) MTRR entries set by firmware > > > > 2) MTRR entries set by OS drivers > > > > > > > > We can obsolete 2), but we have no control over 1). As UEFI > > > > firmwares > > > > also set this up, this usage will continue to stay. So, we should > > > > not > > > > get rid of the MTRR code that looks up the MTRR entries, while we > > > > have > > > > no need to modify them. > > > > > > > > Such MTRR entries provide safe guard to /dev/mem, which allows > > > > privileged user to access a range that may require UC mapping while > > > > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to > > > > UC in > > > > such a case. > > > > > > But it wouldn't be impossible to simply read the MTRRs upon boot, > > > store the information, disable MTRRs, and correctly use PAT to > > > achieve the same effect (i.e. the "blindly maps" part of course > > > would need fixing). > > > > It could be done, but I do not see much benefit of doing it. One of the > > reasons platform vendors set MTRRs is so that a system won't hit a > > machine check when an OS bug leads an access with a wrong cache type. > > > > A machine check is hard to analyze and can be seen as a hardware issue > > by customers. Emulating MTRRs with PAT won't protect from such a bug. > > That's seems like a fair and valid concern. This could only happen if > the OS would have code that would use MTRR, in the case of Linux we'll > soon be able to vet that this cannot happen. No, there is no OS support necessary to use MTRR. After firmware sets it up, CPUs continue to use it without any OS support. I think the Linux change you are referring is to obsolete legacy interfaces that modify the MTRR setup. I agree that Linux should not modify MTRR. > For those type of OSes... > could it be possible to negotiate or hint to the platform through an > attribute somehow that the OS has such capability to not use MTRR? The OS can disable MTRR. However, this can also cause a problem in firmware, which may rely on MTRR. > Then, only if this bit is set, the platform could then avoid such MTRR > settings, and if we have issues you can throw rocks at us. > And if that's not possible how about a new platform setting that would > need to be set at the platform level to enable disabling this junk? > Then only folks who know what they are doing would enable it, and if > the customer set it, the issue would not be on the platform. > Could this also be used to prevent SMIs with MTRRs? ACPI _OSI could be used for firmware to implement some OS-specific features, but it may be too late for firmware to make major changes and is generally useless unless OS requirements are described in a spec backed by logo certification. SMIs are also used for platform management, such as fan speed control. Is there any issue for Linux to use MTRR set by firmware? Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-06 22:58 ` Toshi Kani @ 2015-08-07 20:25 ` Luis R. Rodriguez 2015-08-07 21:56 ` Toshi Kani 0 siblings, 1 reply; 20+ messages in thread From: Luis R. Rodriguez @ 2015-08-07 20:25 UTC (permalink / raw) To: Toshi Kani Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> wrote: > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: >> On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> wrote: >> > On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: >> > > > > > On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: >> > > > There are two usages on MTRRs: >> > > > 1) MTRR entries set by firmware >> > > > 2) MTRR entries set by OS drivers >> > > > >> > > > We can obsolete 2), but we have no control over 1). As UEFI >> > > > firmwares >> > > > also set this up, this usage will continue to stay. So, we should >> > > > not >> > > > get rid of the MTRR code that looks up the MTRR entries, while we >> > > > have >> > > > no need to modify them. >> > > > >> > > > Such MTRR entries provide safe guard to /dev/mem, which allows >> > > > privileged user to access a range that may require UC mapping while >> > > > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to >> > > > UC in >> > > > such a case. >> > > >> > > But it wouldn't be impossible to simply read the MTRRs upon boot, >> > > store the information, disable MTRRs, and correctly use PAT to >> > > achieve the same effect (i.e. the "blindly maps" part of course >> > > would need fixing). >> > >> > It could be done, but I do not see much benefit of doing it. One of the >> > reasons platform vendors set MTRRs is so that a system won't hit a >> > machine check when an OS bug leads an access with a wrong cache type. >> > >> > A machine check is hard to analyze and can be seen as a hardware issue >> > by customers. Emulating MTRRs with PAT won't protect from such a bug. >> >> That's seems like a fair and valid concern. This could only happen if >> the OS would have code that would use MTRR, in the case of Linux we'll >> soon be able to vet that this cannot happen. > > No, there is no OS support necessary to use MTRR. After firmware sets it > up, CPUs continue to use it without any OS support. I think the Linux > change you are referring is to obsolete legacy interfaces that modify the > MTRR setup. I agree that Linux should not modify MTRR. Its a bit more than that though. Since you agree that the OS can live without MTRR code I was hoping to then see if we can fold out PAT Linux code from under the MTRR dependency on Linux and make PAT a first class citizen, maybe at least for x86-64. Right now you can only get PAT support on Linux if you have MTRR code, but I'd like to see if instead we can rip MTRR code out completely under its own Kconfig and let it start rotting away. Code-wise the only issue I saw was that PAT code also relies on mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found no other obvious issues. Platform firmware and SMIs seems to be the only other possible issue. More on this below. >> For those type of OSes... >> could it be possible to negotiate or hint to the platform through an >> attribute somehow that the OS has such capability to not use MTRR? > > The OS can disable MTRR. However, this can also cause a problem in > firmware, which may rely on MTRR. Can you describe what type of issues we could expect ? I tend to care more about this for 64-bit systems so if 32-bit platforms would be more of the ones which could cause an issue would restricting disabling MTRR only for 64-bit help? >> Then, only if this bit is set, the platform could then avoid such MTRR >> settings, and if we have issues you can throw rocks at us. > >> And if that's not possible how about a new platform setting that would >> need to be set at the platform level to enable disabling this junk? >> Then only folks who know what they are doing would enable it, and if >> the customer set it, the issue would not be on the platform. > >> Could this also be used to prevent SMIs with MTRRs? > > ACPI _OSI could be used for firmware to implement some OS-specific features, > but it may be too late for firmware to make major changes and is generally > useless unless OS requirements are described in a spec backed by logo > certification. I see.. So there are no guarantees that platform firmware would not expect OS MTRR support. > SMIs are also used for platform management, such as fan > speed control. And its conceivable that some devices, or the platform itself, may trigger SMIs to have the platform firmware poke with MTRRs? > Is there any issue for Linux to use MTRR set by firmware? Even though we don't have the Kconfig option right now to disable MTRR cod explicitly I'll note that there are a few other cases that could flip Linux to note use MTRR: a) Some BIOSes could let MTRR get disabled b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which disables MTRR on Linux If these environments can exist it'd be good to understand possible issues that could creep up as a result of the OS not having MTRR enabled. If this is a reasonable thing for x86-64 I was hoping we could just let users opt-in to a similar build configuration through the OS by letting PAT not depend on MTRR. Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-07 20:25 ` Luis R. Rodriguez @ 2015-08-07 21:56 ` Toshi Kani 2015-08-07 22:23 ` Luis R. Rodriguez 0 siblings, 1 reply; 20+ messages in thread From: Toshi Kani @ 2015-08-07 21:56 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: > On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> wrote: > > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: > > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> wrote: > > > > On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: > > > > > > > > On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > > > > > > There are two usages on MTRRs: > > > > > > 1) MTRR entries set by firmware > > > > > > 2) MTRR entries set by OS drivers > > > > > > > > > > > > We can obsolete 2), but we have no control over 1). As UEFI > > > > > > firmwares also set this up, this usage will continue to stay. > > > > > > So, we should not get rid of the MTRR code that looks up the > > > > > > MTRR entries, while we have no need to modify them. > > > > > > > > > > > > Such MTRR entries provide safe guard to /dev/mem, which allows > > > > > > privileged user to access a range that may require UC mapping > > > > > > while the /dev/mem driver blindly maps it with WB. MTRRs > > > > > > converts WB to UC in such a case. > > > > > > > > > > But it wouldn't be impossible to simply read the MTRRs upon boot, > > > > > store the information, disable MTRRs, and correctly use PAT to > > > > > achieve the same effect (i.e. the "blindly maps" part of course > > > > > would need fixing). > > > > > > > > It could be done, but I do not see much benefit of doing it. One of > > > > the reasons platform vendors set MTRRs is so that a system won't hit > > > > a machine check when an OS bug leads an access with a wrong cache > > > > type. > > > > > > > > A machine check is hard to analyze and can be seen as a hardware > > > > issue by customers. Emulating MTRRs with PAT won't protect from > > > > such a bug. > > > > > > That's seems like a fair and valid concern. This could only happen if > > > the OS would have code that would use MTRR, in the case of Linux we'll > > > soon be able to vet that this cannot happen. > > > > No, there is no OS support necessary to use MTRR. After firmware sets > > it up, CPUs continue to use it without any OS support. I think the > > Linux change you are referring is to obsolete legacy interfaces that > > modify the MTRR setup. I agree that Linux should not modify MTRR. > > Its a bit more than that though. Since you agree that the OS can live > without MTRR code I was hoping to then see if we can fold out PAT > Linux code from under the MTRR dependency on Linux and make PAT a > first class citizen, maybe at least for x86-64. Right now you can only > get PAT support on Linux if you have MTRR code, but I'd like to see if > instead we can rip MTRR code out completely under its own Kconfig and > let it start rotting away. > > Code-wise the only issue I saw was that PAT code also relies on > mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found > no other obvious issues. We can rip of the MTTR code that modifies the MTRR setup, but not mtrr_type_lookup(). This function provides necessary checks per documented in commit 7f0431e3dc89 as follows. 1) reserve_memtype() tracks an effective memory type in case a request type is WB (ex. /dev/mem blindly uses WB). Missing to track with its effective type causes a subsequent request to map the same range with the effective type to fail. 2) pud_set_huge() and pmd_set_huge() check if a requested range has any overlap with MTRRs. Missing to detect an overlap may cause a performance penalty or undefined behavior. mtrr_type_lookup() is still admittedly awkward, but I do not think we have an immediate issue in PAT code calling it. I do not think it makes PAT code a second class citizen. > Platform firmware and SMIs seems to be the only other possible issue. > More on this below. > > > > For those type of OSes... > > > could it be possible to negotiate or hint to the platform through an > > > attribute somehow that the OS has such capability to not use MTRR? > > > > The OS can disable MTRR. However, this can also cause a problem in > > firmware, which may rely on MTRR. > > Can you describe what type of issues we could expect ? I tend to care > more about this for 64-bit systems so if 32-bit platforms would be > more of the ones which could cause an issue would restricting > disabling MTRR only for 64-bit help? The SMI handler runs in real-mode and relies on MTRR being effective to provide right cache types. It does not matter if it is 64-bit or not. > > > Then, only if this bit is set, the platform could then avoid such MTRR > > > settings, and if we have issues you can throw rocks at us. > > > > > And if that's not possible how about a new platform setting that would > > > need to be set at the platform level to enable disabling this junk? > > > Then only folks who know what they are doing would enable it, and if > > > the customer set it, the issue would not be on the platform. > > > > > Could this also be used to prevent SMIs with MTRRs? > > > > ACPI _OSI could be used for firmware to implement some OS-specific > > features, but it may be too late for firmware to make major changes and > > is generally useless unless OS requirements are described in a spec > > backed by logo certification. > > I see.. So there are no guarantees that platform firmware would not > expect OS MTRR support. > > > SMIs are also used for platform management, such as fan > > speed control. > > And its conceivable that some devices, or the platform itself, may > trigger SMIs to have the platform firmware poke with MTRRs? SMIs are outside of OS control. SMI handler relies on MTRR being set. SMI must be quick, so the handler should not be required to initialize MTRR or page tables. > > Is there any issue for Linux to use MTRR set by firmware? > > Even though we don't have the Kconfig option right now to disable MTRR > cod explicitly I'll note that there are a few other cases that could > flip Linux to note use MTRR: > > a) Some BIOSes could let MTRR get disabled > b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which > disables MTRR on Linux > > If these environments can exist it'd be good to understand possible > issues that could creep up as a result of the OS not having MTRR > enabled. If this is a reasonable thing for x86-64 I was hoping we > could just let users opt-in to a similar build configuration through > the OS by letting PAT not depend on MTRR. Case a) and b) do not cause any issue. They simply lead mtrr_type_lookup() to return immediately with MTRR_TYPE_INVALID (i.e. MTRR disable), and the callers handle this value properly. These cases are only problematic when the OS tries to modify MTRR. Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-07 21:56 ` Toshi Kani @ 2015-08-07 22:23 ` Luis R. Rodriguez 2015-08-07 23:08 ` Toshi Kani 0 siblings, 1 reply; 20+ messages in thread From: Luis R. Rodriguez @ 2015-08-07 22:23 UTC (permalink / raw) To: Toshi Kani Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, Aug 7, 2015 at 2:56 PM, Toshi Kani <toshi.kani@hp.com> wrote: > On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: >> On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> wrote: >> > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: >> > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> wrote: >> > > > On Fri, 2015-06-12 at 08:59 +0100, Jan Beulich wrote: >> > > > > > > > On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: >> > > > > > There are two usages on MTRRs: >> > > > > > 1) MTRR entries set by firmware >> > > > > > 2) MTRR entries set by OS drivers >> > > > > > >> > > > > > We can obsolete 2), but we have no control over 1). As UEFI >> > > > > > firmwares also set this up, this usage will continue to stay. >> > > > > > So, we should not get rid of the MTRR code that looks up the >> > > > > > MTRR entries, while we have no need to modify them. >> > > > > > >> > > > > > Such MTRR entries provide safe guard to /dev/mem, which allows >> > > > > > privileged user to access a range that may require UC mapping >> > > > > > while the /dev/mem driver blindly maps it with WB. MTRRs >> > > > > > converts WB to UC in such a case. >> > > > > >> > > > > But it wouldn't be impossible to simply read the MTRRs upon boot, >> > > > > store the information, disable MTRRs, and correctly use PAT to >> > > > > achieve the same effect (i.e. the "blindly maps" part of course >> > > > > would need fixing). >> > > > >> > > > It could be done, but I do not see much benefit of doing it. One of >> > > > the reasons platform vendors set MTRRs is so that a system won't hit >> > > > a machine check when an OS bug leads an access with a wrong cache >> > > > type. >> > > > >> > > > A machine check is hard to analyze and can be seen as a hardware >> > > > issue by customers. Emulating MTRRs with PAT won't protect from >> > > > such a bug. >> > > >> > > That's seems like a fair and valid concern. This could only happen if >> > > the OS would have code that would use MTRR, in the case of Linux we'll >> > > soon be able to vet that this cannot happen. >> > >> > No, there is no OS support necessary to use MTRR. After firmware sets >> > it up, CPUs continue to use it without any OS support. I think the >> > Linux change you are referring is to obsolete legacy interfaces that >> > modify the MTRR setup. I agree that Linux should not modify MTRR. >> >> Its a bit more than that though. Since you agree that the OS can live >> without MTRR code I was hoping to then see if we can fold out PAT >> Linux code from under the MTRR dependency on Linux and make PAT a >> first class citizen, maybe at least for x86-64. Right now you can only >> get PAT support on Linux if you have MTRR code, but I'd like to see if >> instead we can rip MTRR code out completely under its own Kconfig and >> let it start rotting away. >> >> Code-wise the only issue I saw was that PAT code also relies on >> mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found >> no other obvious issues. > > We can rip of the MTTR code that modifies the MTRR setup, but not > mtrr_type_lookup(). This function provides necessary checks per documented > in commit 7f0431e3dc89 as follows. > > 1) reserve_memtype() tracks an effective memory type in case > a request type is WB (ex. /dev/mem blindly uses WB). Missing > to track with its effective type causes a subsequent request > to map the same range with the effective type to fail. > > 2) pud_set_huge() and pmd_set_huge() check if a requested range > has any overlap with MTRRs. Missing to detect an overlap may > cause a performance penalty or undefined behavior. > > mtrr_type_lookup() is still admittedly awkward, but I do not think we have > an immediate issue in PAT code calling it. I do not think it makes PAT code > a second class citizen. OK since we know that if MTRR set up code ends up disabled and would return MTRR_TYPE_INVALID what if we just static inline this for the no-MTRR Kconfig build option immediately, and only then have the full blown implementation for the case where MTRR Kconfig option is enabled? >> Platform firmware and SMIs seems to be the only other possible issue. >> More on this below. >> >> > > For those type of OSes... >> > > could it be possible to negotiate or hint to the platform through an >> > > attribute somehow that the OS has such capability to not use MTRR? >> > >> > The OS can disable MTRR. However, this can also cause a problem in >> > firmware, which may rely on MTRR. >> >> Can you describe what type of issues we could expect ? I tend to care >> more about this for 64-bit systems so if 32-bit platforms would be >> more of the ones which could cause an issue would restricting >> disabling MTRR only for 64-bit help? > > The SMI handler runs in real-mode and relies on MTRR being effective to > provide right cache types. It does not matter if it is 64-bit or not. I see... since I have no visibility to what goes under the hood, can you provide one example use case where an SMI handler would require getting a cache type through MTRR ? I realize this can vary, vendor by vendor, but any example would do just to satisfy my curiosity. >> > > Then, only if this bit is set, the platform could then avoid such MTRR >> > > settings, and if we have issues you can throw rocks at us. >> > >> > > And if that's not possible how about a new platform setting that would >> > > need to be set at the platform level to enable disabling this junk? >> > > Then only folks who know what they are doing would enable it, and if >> > > the customer set it, the issue would not be on the platform. >> > >> > > Could this also be used to prevent SMIs with MTRRs? >> > >> > ACPI _OSI could be used for firmware to implement some OS-specific >> > features, but it may be too late for firmware to make major changes and >> > is generally useless unless OS requirements are described in a spec >> > backed by logo certification. >> >> I see.. So there are no guarantees that platform firmware would not >> expect OS MTRR support. >> >> > SMIs are also used for platform management, such as fan >> > speed control. >> >> And its conceivable that some devices, or the platform itself, may >> trigger SMIs to have the platform firmware poke with MTRRs? > > SMIs are outside of OS control. SMI handler relies on MTRR being set. SMI > must be quick, so the handler should not be required to initialize MTRR or > page tables. Right makes sense. >> > Is there any issue for Linux to use MTRR set by firmware? >> >> Even though we don't have the Kconfig option right now to disable MTRR >> cod explicitly I'll note that there are a few other cases that could >> flip Linux to note use MTRR: >> >> a) Some BIOSes could let MTRR get disabled >> b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which >> disables MTRR on Linux >> >> If these environments can exist it'd be good to understand possible >> issues that could creep up as a result of the OS not having MTRR >> enabled. If this is a reasonable thing for x86-64 I was hoping we >> could just let users opt-in to a similar build configuration through >> the OS by letting PAT not depend on MTRR. > > Case a) and b) do not cause any issue. They simply lead mtrr_type_lookup() > to return immediately with MTRR_TYPE_INVALID (i.e. MTRR disable), and the > callers handle this value properly. These cases are only problematic when > the OS tries to modify MTRR. OK if the OS returns MTRR_TYPE_INVALID, for folks who do not want MTRR code on their kernel, we should be OK? Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-07 22:23 ` Luis R. Rodriguez @ 2015-08-07 23:08 ` Toshi Kani 2015-08-07 23:19 ` Toshi Kani 2015-08-07 23:26 ` Luis R. Rodriguez 0 siblings, 2 replies; 20+ messages in thread From: Toshi Kani @ 2015-08-07 23:08 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, 2015-08-07 at 15:23 -0700, Luis R. Rodriguez wrote: > On Fri, Aug 7, 2015 at 2:56 PM, Toshi Kani <toshi.kani@hp.com> wrote: > > On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: > > > On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> wrote: > > > > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: > > > > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> > > > > > wrote: : > > > > > > > > No, there is no OS support necessary to use MTRR. After firmware > > > > sets it up, CPUs continue to use it without any OS support. I think > > > > the Linux change you are referring is to obsolete legacy interfaces > > > > that modify the MTRR setup. I agree that Linux should not modify > > > > MTRR. > > > > > > Its a bit more than that though. Since you agree that the OS can live > > > without MTRR code I was hoping to then see if we can fold out PAT > > > Linux code from under the MTRR dependency on Linux and make PAT a > > > first class citizen, maybe at least for x86-64. Right now you can only > > > get PAT support on Linux if you have MTRR code, but I'd like to see if > > > instead we can rip MTRR code out completely under its own Kconfig and > > > let it start rotting away. > > > > > > Code-wise the only issue I saw was that PAT code also relies on > > > mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found > > > no other obvious issues. > > > > We can rip of the MTTR code that modifies the MTRR setup, but not > > mtrr_type_lookup(). This function provides necessary checks per > > documented > > in commit 7f0431e3dc89 as follows. > > > > 1) reserve_memtype() tracks an effective memory type in case > > a request type is WB (ex. /dev/mem blindly uses WB). Missing > > to track with its effective type causes a subsequent request > > to map the same range with the effective type to fail. > > > > 2) pud_set_huge() and pmd_set_huge() check if a requested range > > has any overlap with MTRRs. Missing to detect an overlap may > > cause a performance penalty or undefined behavior. > > > > mtrr_type_lookup() is still admittedly awkward, but I do not think we > > have an immediate issue in PAT code calling it. I do not think it makes > > PAT code a second class citizen. > > OK since we know that if MTRR set up code ends up disabled and would > return MTRR_TYPE_INVALID what if we just static inline this for the > no-MTRR Kconfig build option immediately, and only then have the full > blown implementation for the case where MTRR Kconfig option is > enabled? Yes, the MTRR code could be disabled by Kconfig with such inline stubs as long as the kernel is built specifically for a particular platform with MTRR disabled, such as Xen guest kernel. However, since MTRR is a CPU feature enabled on most of the systems, I am not sure if it makes sense to be configurable with Kconfig, though. > > > Platform firmware and SMIs seems to be the only other possible issue. > > > More on this below. > > > > > > > > For those type of OSes... > > > > > could it be possible to negotiate or hint to the platform through > > > > > an attribute somehow that the OS has such capability to not use > > > > > MTRR? > > > > > > > > The OS can disable MTRR. However, this can also cause a problem in > > > > firmware, which may rely on MTRR. > > > > > > Can you describe what type of issues we could expect ? I tend to care > > > more about this for 64-bit systems so if 32-bit platforms would be > > > more of the ones which could cause an issue would restricting > > > disabling MTRR only for 64-bit help? > > > > The SMI handler runs in real-mode and relies on MTRR being effective to > > provide right cache types. It does not matter if it is 64-bit or not. > > I see... since I have no visibility to what goes under the hood, can > you provide one example use case where an SMI handler would require > getting a cache type through MTRR ? I realize this can vary, vendor by > vendor, but any example would do just to satisfy my curiosity. For fan control, it would need UC access to its registers. > > > > > Then, only if this bit is set, the platform could then avoid such > > > > > MTRR settings, and if we have issues you can throw rocks at us. > > > > > > > > > And if that's not possible how about a new platform setting that > > > > > would need to be set at the platform level to enable disabling > > > > > this junk? > > > > > Then only folks who know what they are doing would enable it, and > > > > > if the customer set it, the issue would not be on the platform. > > > > > > > > > Could this also be used to prevent SMIs with MTRRs? > > > > > > > > ACPI _OSI could be used for firmware to implement some OS-specific > > > > features, but it may be too late for firmware to make major changes > > > > and > > > > is generally useless unless OS requirements are described in a spec > > > > backed by logo certification. > > > > > > I see.. So there are no guarantees that platform firmware would not > > > expect OS MTRR support. > > > > > > > SMIs are also used for platform management, such as fan > > > > speed control. > > > > > > And its conceivable that some devices, or the platform itself, may > > > trigger SMIs to have the platform firmware poke with MTRRs? > > > > SMIs are outside of OS control. SMI handler relies on MTRR being set. > > SMI must be quick, so the handler should not be required to initialize > > MTRR or page tables. > > Right makes sense. > > > > > Is there any issue for Linux to use MTRR set by firmware? > > > > > > Even though we don't have the Kconfig option right now to disable MTRR > > > cod explicitly I'll note that there are a few other cases that could > > > flip Linux to note use MTRR: > > > > > > a) Some BIOSes could let MTRR get disabled > > > b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which > > > disables MTRR on Linux > > > > > > If these environments can exist it'd be good to understand possible > > > issues that could creep up as a result of the OS not having MTRR > > > enabled. If this is a reasonable thing for x86-64 I was hoping we > > > could just let users opt-in to a similar build configuration through > > > the OS by letting PAT not depend on MTRR. > > > > Case a) and b) do not cause any issue. They simply lead > > mtrr_type_lookup() to return immediately with MTRR_TYPE_INVALID (i.e. > > MTRR disable), and the callers handle this value properly. These cases > > are only problematic when the OS tries to modify MTRR. > > OK if the OS returns MTRR_TYPE_INVALID, for folks who do not want MTRR > code on their kernel, we should be OK? Technically OK. Not sure if we want such a Kconfig option, though. Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-07 23:08 ` Toshi Kani @ 2015-08-07 23:19 ` Toshi Kani 2015-08-07 23:26 ` Luis R. Rodriguez 1 sibling, 0 replies; 20+ messages in thread From: Toshi Kani @ 2015-08-07 23:19 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, 2015-08-07 at 17:08 -0600, Toshi Kani wrote: > On Fri, 2015-08-07 at 15:23 -0700, Luis R. Rodriguez wrote: > > On Fri, Aug 7, 2015 at 2:56 PM, Toshi Kani <toshi.kani@hp.com> wrote: > > > On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: > > > > On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> > > > > wrote: > > > > > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: > > > > > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> > > > > > > wrote: > : > > > > > > > > > > No, there is no OS support necessary to use MTRR. After firmware > > > > > sets it up, CPUs continue to use it without any OS support. I > > > > > think the Linux change you are referring is to obsolete legacy > > > > > interfaces that modify the MTRR setup. I agree that Linux should > > > > > not modify MTRR. > > > > > > > > Its a bit more than that though. Since you agree that the OS can > > > > live without MTRR code I was hoping to then see if we can fold out > > > > PAT Linux code from under the MTRR dependency on Linux and make PAT > > > > a first class citizen, maybe at least for x86-64. Right now you can > > > > only get PAT support on Linux if you have MTRR code, but I'd like to > > > > see if instead we can rip MTRR code out completely under its own > > > > Kconfig and let it start rotting away. > > > > > > > > Code-wise the only issue I saw was that PAT code also relies on > > > > mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I > > > > found no other obvious issues. > > > > > > We can rip of the MTTR code that modifies the MTRR setup, but not > > > mtrr_type_lookup(). This function provides necessary checks per > > > documented in commit 7f0431e3dc89 as follows. > > > > > > 1) reserve_memtype() tracks an effective memory type in case > > > a request type is WB (ex. /dev/mem blindly uses WB). Missing > > > to track with its effective type causes a subsequent request > > > to map the same range with the effective type to fail. > > > > > > 2) pud_set_huge() and pmd_set_huge() check if a requested range > > > has any overlap with MTRRs. Missing to detect an overlap may > > > cause a performance penalty or undefined behavior. > > > > > > mtrr_type_lookup() is still admittedly awkward, but I do not think we > > > have an immediate issue in PAT code calling it. I do not think it > > > makes > > > PAT code a second class citizen. > > > > OK since we know that if MTRR set up code ends up disabled and would > > return MTRR_TYPE_INVALID what if we just static inline this for the > > no-MTRR Kconfig build option immediately, and only then have the full > > blown implementation for the case where MTRR Kconfig option is > > enabled? > > Yes, the MTRR code could be disabled by Kconfig with such inline stubs as > long as the kernel is built specifically for a particular platform with > MTRR disabled, such as Xen guest kernel. Noticed that we do have CONFIG_MTRR and mtrr_type_lookup() inline stub returns MTRR_INVALID. -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-07 23:08 ` Toshi Kani 2015-08-07 23:19 ` Toshi Kani @ 2015-08-07 23:26 ` Luis R. Rodriguez 2015-08-07 23:48 ` Toshi Kani 1 sibling, 1 reply; 20+ messages in thread From: Luis R. Rodriguez @ 2015-08-07 23:26 UTC (permalink / raw) To: Toshi Kani Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, Aug 7, 2015 at 4:08 PM, Toshi Kani <toshi.kani@hp.com> wrote: > On Fri, 2015-08-07 at 15:23 -0700, Luis R. Rodriguez wrote: >> On Fri, Aug 7, 2015 at 2:56 PM, Toshi Kani <toshi.kani@hp.com> wrote: >> > On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: >> > > On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> wrote: >> > > > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: >> > > > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com> >> > > > > wrote: > : >> > > > >> > > > No, there is no OS support necessary to use MTRR. After firmware >> > > > sets it up, CPUs continue to use it without any OS support. I think >> > > > the Linux change you are referring is to obsolete legacy interfaces >> > > > that modify the MTRR setup. I agree that Linux should not modify >> > > > MTRR. >> > > >> > > Its a bit more than that though. Since you agree that the OS can live >> > > without MTRR code I was hoping to then see if we can fold out PAT >> > > Linux code from under the MTRR dependency on Linux and make PAT a >> > > first class citizen, maybe at least for x86-64. Right now you can only >> > > get PAT support on Linux if you have MTRR code, but I'd like to see if >> > > instead we can rip MTRR code out completely under its own Kconfig and >> > > let it start rotting away. >> > > >> > > Code-wise the only issue I saw was that PAT code also relies on >> > > mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I found >> > > no other obvious issues. >> > >> > We can rip of the MTTR code that modifies the MTRR setup, but not >> > mtrr_type_lookup(). This function provides necessary checks per >> > documented >> > in commit 7f0431e3dc89 as follows. >> > >> > 1) reserve_memtype() tracks an effective memory type in case >> > a request type is WB (ex. /dev/mem blindly uses WB). Missing >> > to track with its effective type causes a subsequent request >> > to map the same range with the effective type to fail. >> > >> > 2) pud_set_huge() and pmd_set_huge() check if a requested range >> > has any overlap with MTRRs. Missing to detect an overlap may >> > cause a performance penalty or undefined behavior. >> > >> > mtrr_type_lookup() is still admittedly awkward, but I do not think we >> > have an immediate issue in PAT code calling it. I do not think it makes >> > PAT code a second class citizen. >> >> OK since we know that if MTRR set up code ends up disabled and would >> return MTRR_TYPE_INVALID what if we just static inline this for the >> no-MTRR Kconfig build option immediately, and only then have the full >> blown implementation for the case where MTRR Kconfig option is >> enabled? > > Yes, the MTRR code could be disabled by Kconfig with such inline stubs OK thanks. > as > long as the kernel is built specifically for a particular platform with MTRR > disabled, such as Xen guest kernel. Sure. > However, since MTRR is a CPU feature enabled on most of the systems, I am > not sure if it makes sense to be configurable with Kconfig, though. To me this is about making PAT a first class citizen in code though and validating through Kconfig the option then to opt-out of MTRR from OS code. Perhaps we can recommend to enable it but having the options to split out PAT from MTRR is what I was aiming for. >> > > Platform firmware and SMIs seems to be the only other possible issue. >> > > More on this below. >> > > >> > > > > For those type of OSes... >> > > > > could it be possible to negotiate or hint to the platform through >> > > > > an attribute somehow that the OS has such capability to not use >> > > > > MTRR? >> > > > >> > > > The OS can disable MTRR. However, this can also cause a problem in >> > > > firmware, which may rely on MTRR. >> > > >> > > Can you describe what type of issues we could expect ? I tend to care >> > > more about this for 64-bit systems so if 32-bit platforms would be >> > > more of the ones which could cause an issue would restricting >> > > disabling MTRR only for 64-bit help? >> > >> > The SMI handler runs in real-mode and relies on MTRR being effective to >> > provide right cache types. It does not matter if it is 64-bit or not. >> >> I see... since I have no visibility to what goes under the hood, can >> you provide one example use case where an SMI handler would require >> getting a cache type through MTRR ? I realize this can vary, vendor by >> vendor, but any example would do just to satisfy my curiosity. > > For fan control, it would need UC access to its registers. OK thanks! To follow up with the example, since the platform firmware would have set up the MTRRs anyway, the SMI should still work, even if the OS didn't do anything, right? >> > > > > Then, only if this bit is set, the platform could then avoid such >> > > > > MTRR settings, and if we have issues you can throw rocks at us. >> > > > >> > > > > And if that's not possible how about a new platform setting that >> > > > > would need to be set at the platform level to enable disabling >> > > > > this junk? >> > > > > Then only folks who know what they are doing would enable it, and >> > > > > if the customer set it, the issue would not be on the platform. >> > > > >> > > > > Could this also be used to prevent SMIs with MTRRs? >> > > > >> > > > ACPI _OSI could be used for firmware to implement some OS-specific >> > > > features, but it may be too late for firmware to make major changes >> > > > and >> > > > is generally useless unless OS requirements are described in a spec >> > > > backed by logo certification. >> > > >> > > I see.. So there are no guarantees that platform firmware would not >> > > expect OS MTRR support. >> > > >> > > > SMIs are also used for platform management, such as fan >> > > > speed control. >> > > >> > > And its conceivable that some devices, or the platform itself, may >> > > trigger SMIs to have the platform firmware poke with MTRRs? >> > >> > SMIs are outside of OS control. SMI handler relies on MTRR being set. >> > SMI must be quick, so the handler should not be required to initialize >> > MTRR or page tables. >> >> Right makes sense. >> >> > > > Is there any issue for Linux to use MTRR set by firmware? >> > > >> > > Even though we don't have the Kconfig option right now to disable MTRR >> > > cod explicitly I'll note that there are a few other cases that could >> > > flip Linux to note use MTRR: >> > > >> > > a) Some BIOSes could let MTRR get disabled >> > > b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which >> > > disables MTRR on Linux >> > > >> > > If these environments can exist it'd be good to understand possible >> > > issues that could creep up as a result of the OS not having MTRR >> > > enabled. If this is a reasonable thing for x86-64 I was hoping we >> > > could just let users opt-in to a similar build configuration through >> > > the OS by letting PAT not depend on MTRR. >> > >> > Case a) and b) do not cause any issue. They simply lead >> > mtrr_type_lookup() to return immediately with MTRR_TYPE_INVALID (i.e. >> > MTRR disable), and the callers handle this value properly. These cases >> > are only problematic when the OS tries to modify MTRR. >> >> OK if the OS returns MTRR_TYPE_INVALID, for folks who do not want MTRR >> code on their kernel, we should be OK? > > Technically OK. Not sure if we want such a Kconfig option, though. Its more of me wanting to get PAT out from under MTRR. Does that make sense? Luis ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-08-07 23:26 ` Luis R. Rodriguez @ 2015-08-07 23:48 ` Toshi Kani 0 siblings, 0 replies; 20+ messages in thread From: Toshi Kani @ 2015-08-07 23:48 UTC (permalink / raw) To: Luis R. Rodriguez Cc: Jan Beulich, Andy Lutomirski, Bjorn Helgaas, Jej B, X86 ML, Andrew Morton, Ville Syrjälä, Julia Lawall, xen-devel, Dave Airlie, Ville Syrjälä, Juergen Gross, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci On Fri, 2015-08-07 at 16:26 -0700, Luis R. Rodriguez wrote: > On Fri, Aug 7, 2015 at 4:08 PM, Toshi Kani <toshi.kani@hp.com> wrote: > > On Fri, 2015-08-07 at 15:23 -0700, Luis R. Rodriguez wrote: > > > On Fri, Aug 7, 2015 at 2:56 PM, Toshi Kani <toshi.kani@hp.com> wrote: > > > > On Fri, 2015-08-07 at 13:25 -0700, Luis R. Rodriguez wrote: > > > > > On Thu, Aug 6, 2015 at 3:58 PM, Toshi Kani <toshi.kani@hp.com> > > > > > wrote: > > > > > > On Thu, 2015-08-06 at 12:53 -0700, Luis R. Rodriguez wrote: > > > > > > > On Fri, Jun 12, 2015 at 9:58 AM, Toshi Kani <toshi.kani@hp.com > > > > > > > > > > > > > > > wrote: : > > > > > > > > > > Its a bit more than that though. Since you agree that the OS can > > > > > live without MTRR code I was hoping to then see if we can fold out > > > > > PAT Linux code from under the MTRR dependency on Linux and make > > > > > PAT a first class citizen, maybe at least for x86-64. Right now > > > > > you can only get PAT support on Linux if you have MTRR code, but > > > > > I'd like to see if instead we can rip MTRR code out completely > > > > > under its own Kconfig and let it start rotting away. > > > > > > > > > > Code-wise the only issue I saw was that PAT code also relies on > > > > > mtrr_type_lookup(), see pat_x_mtrr_type(), but other than this I > > > > > found no other obvious issues. > > > > > > > > We can rip of the MTTR code that modifies the MTRR setup, but not > > > > mtrr_type_lookup(). This function provides necessary checks per > > > > documented > > > > in commit 7f0431e3dc89 as follows. > > > > > > > > 1) reserve_memtype() tracks an effective memory type in case > > > > a request type is WB (ex. /dev/mem blindly uses WB). Missing > > > > to track with its effective type causes a subsequent request > > > > to map the same range with the effective type to fail. > > > > > > > > 2) pud_set_huge() and pmd_set_huge() check if a requested range > > > > has any overlap with MTRRs. Missing to detect an overlap may > > > > cause a performance penalty or undefined behavior. > > > > > > > > mtrr_type_lookup() is still admittedly awkward, but I do not think > > > > we > > > > have an immediate issue in PAT code calling it. I do not think it > > > > makes > > > > PAT code a second class citizen. > > > > > > OK since we know that if MTRR set up code ends up disabled and would > > > return MTRR_TYPE_INVALID what if we just static inline this for the > > > no-MTRR Kconfig build option immediately, and only then have the full > > > blown implementation for the case where MTRR Kconfig option is > > > enabled? > > > > Yes, the MTRR code could be disabled by Kconfig with such inline stubs > > OK thanks. > > > as > > long as the kernel is built specifically for a particular platform with > > MTRR disabled, such as Xen guest kernel. > > Sure. > > > However, since MTRR is a CPU feature enabled on most of the systems, I > > am not sure if it makes sense to be configurable with Kconfig, though. > > To me this is about making PAT a first class citizen in code though > and validating through Kconfig the option then to opt-out of MTRR from > OS code. Perhaps we can recommend to enable it but having the options > to split out PAT from MTRR is what I was aiming for. Since we have CONFIG_MTRR already, we do not need to argue over this option. :-) It makes sense since when MTRR code was introduced, there were CPUs without this capability... > > > > > Platform firmware and SMIs seems to be the only other possible > > > > > issue. More on this below. > > > > > > > > > > > > For those type of OSes... > > > > > > > could it be possible to negotiate or hint to the platform > > > > > > > through an attribute somehow that the OS has such capability > > > > > > > to not use MTRR? > > > > > > > > > > > > The OS can disable MTRR. However, this can also cause a problem > > > > > > in firmware, which may rely on MTRR. > > > > > > > > > > Can you describe what type of issues we could expect ? I tend to > > > > > care more about this for 64-bit systems so if 32-bit platforms > > > > > would be more of the ones which could cause an issue would > > > > > restricting disabling MTRR only for 64-bit help? > > > > > > > > The SMI handler runs in real-mode and relies on MTRR being effective > > > > to provide right cache types. It does not matter if it is 64-bit or > > > > not. > > > > > > I see... since I have no visibility to what goes under the hood, can > > > you provide one example use case where an SMI handler would require > > > getting a cache type through MTRR ? I realize this can vary, vendor by > > > vendor, but any example would do just to satisfy my curiosity. > > > > For fan control, it would need UC access to its registers. > > OK thanks! To follow up with the example, since the platform firmware > would have set up the MTRRs anyway, the SMI should still work, even if > the OS didn't do anything, right? Yes, MTRR works without the OS code. However, mtrr_type_lookup() is necessary to make sure that OS mapping requests are aligned with with the MTRR setup. > > > > > > Is there any issue for Linux to use MTRR set by firmware? > > > > > > > > > > Even though we don't have the Kconfig option right now to disable > > > > > MTRR cod explicitly I'll note that there are a few other cases > > > > > that could flip Linux to note use MTRR: > > > > > > > > > > a) Some BIOSes could let MTRR get disabled > > > > > b) As of Xen 4.4, the hypervisor disables X86_FEATURE_MTRR which > > > > > disables MTRR on Linux > > > > > > > > > > If these environments can exist it'd be good to understand > > > > > possible issues that could creep up as a result of the OS not > > > > > having MTRR enabled. If this is a reasonable thing for x86-64 I > > > > > was hoping we could just let users opt-in to a similar build > > > > > configuration through the OS by letting PAT not depend on MTRR. > > > > > > > > Case a) and b) do not cause any issue. They simply lead > > > > mtrr_type_lookup() to return immediately with MTRR_TYPE_INVALID > > > > (i.e. MTRR disable), and the callers handle this value properly. > > > > These cases are only problematic when the OS tries to modify MTRR. > > > > > > OK if the OS returns MTRR_TYPE_INVALID, for folks who do not want MTRR > > > code on their kernel, we should be OK? > > > > Technically OK. Not sure if we want such a Kconfig option, though. > > Its more of me wanting to get PAT out from under MTRR. Does that make > sense? It makes sense if you need to make the kernel size a bit smaller, and you build kernels specific to Xen guests. Leaving the MTRR code enabled on Xen guests does not cause you any issue, though. Thanks, -Toshi ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-12 7:59 ` [Xen-devel] " Jan Beulich 2015-06-12 16:58 ` Toshi Kani @ 2015-06-12 23:15 ` Andy Lutomirski 2015-06-12 23:29 ` James Bottomley ` (2 more replies) 1 sibling, 3 replies; 20+ messages in thread From: Andy Lutomirski @ 2015-06-12 23:15 UTC (permalink / raw) To: Jan Beulich Cc: Juergen Gross, Tomi Valkeinen, Ville Syrjälä, Dave Airlie, xen-devel, linux-fbdev, X86 ML, Andrew Morton, Jej B, Bjorn Helgaas, Luis R. Rodriguez, linux-media, Luis Rodriguez, linux-pci, Toshi Kani, Borislav Petkov, Julia Lawall, linux-kernel, ville.syrjala On Jun 12, 2015 12:59 AM, "Jan Beulich" <JBeulich@suse.com> wrote: > > >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > > There are two usages on MTRRs: > > 1) MTRR entries set by firmware > > 2) MTRR entries set by OS drivers > > > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > > also set this up, this usage will continue to stay. So, we should not > > get rid of the MTRR code that looks up the MTRR entries, while we have > > no need to modify them. > > > > Such MTRR entries provide safe guard to /dev/mem, which allows > > privileged user to access a range that may require UC mapping while > > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in > > such a case. > > But it wouldn't be impossible to simply read the MTRRs upon boot, > store the information, disable MTRRs, and correctly use PAT to > achieve the same effect (i.e. the "blindly maps" part of course > would need fixing). This may crash and burn badly when we call a UEFI function or an SMI happens. I think we should just leave the MTRRs alone. --Andy ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-12 23:15 ` Andy Lutomirski @ 2015-06-12 23:29 ` James Bottomley 2015-06-13 6:37 ` Ingo Molnar 2015-06-15 6:20 ` Jan Beulich 2 siblings, 0 replies; 20+ messages in thread From: James Bottomley @ 2015-06-12 23:29 UTC (permalink / raw) To: Andy Lutomirski Cc: Jan Beulich, Juergen Gross, Tomi Valkeinen, Ville Syrjälä, Dave Airlie, xen-devel, linux-fbdev, X86 ML, Andrew Morton, Bjorn Helgaas, Luis R. Rodriguez, linux-media, Luis Rodriguez, linux-pci, Toshi Kani, Borislav Petkov, Julia Lawall, linux-kernel, ville.syrjala On Fri, 2015-06-12 at 16:15 -0700, Andy Lutomirski wrote: > On Jun 12, 2015 12:59 AM, "Jan Beulich" <JBeulich@suse.com> wrote: > > > > >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > > > There are two usages on MTRRs: > > > 1) MTRR entries set by firmware > > > 2) MTRR entries set by OS drivers > > > > > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > > > also set this up, this usage will continue to stay. So, we should not > > > get rid of the MTRR code that looks up the MTRR entries, while we have > > > no need to modify them. > > > > > > Such MTRR entries provide safe guard to /dev/mem, which allows > > > privileged user to access a range that may require UC mapping while > > > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in > > > such a case. > > > > But it wouldn't be impossible to simply read the MTRRs upon boot, > > store the information, disable MTRRs, and correctly use PAT to > > achieve the same effect (i.e. the "blindly maps" part of course > > would need fixing). > > This may crash and burn badly when we call a UEFI function or an SMI > happens. I think we should just leave the MTRRs alone. Wholeheartedly agree: PAT only works when the given memory map is in operation but MTRRs function everywhere. Anything that goes into real mode or installs its own memory map won't see the Linux page attributes. James ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-12 23:15 ` Andy Lutomirski 2015-06-12 23:29 ` James Bottomley @ 2015-06-13 6:37 ` Ingo Molnar 2015-06-15 6:20 ` Jan Beulich 2 siblings, 0 replies; 20+ messages in thread From: Ingo Molnar @ 2015-06-13 6:37 UTC (permalink / raw) To: Andy Lutomirski Cc: Jan Beulich, Juergen Gross, Tomi Valkeinen, Ville Syrjälä, Dave Airlie, xen-devel, linux-fbdev, X86 ML, Andrew Morton, Jej B, Bjorn Helgaas, Luis R. Rodriguez, linux-media, Luis Rodriguez, linux-pci, Toshi Kani, Borislav Petkov, Julia Lawall, linux-kernel, ville.syrjala * Andy Lutomirski <luto@amacapital.net> wrote: > On Jun 12, 2015 12:59 AM, "Jan Beulich" <JBeulich@suse.com> wrote: > > > > >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: > > > There are two usages on MTRRs: > > > 1) MTRR entries set by firmware > > > 2) MTRR entries set by OS drivers > > > > > > We can obsolete 2), but we have no control over 1). As UEFI firmwares > > > also set this up, this usage will continue to stay. So, we should not > > > get rid of the MTRR code that looks up the MTRR entries, while we have > > > no need to modify them. > > > > > > Such MTRR entries provide safe guard to /dev/mem, which allows privileged > > > user to access a range that may require UC mapping while the /dev/mem driver > > > blindly maps it with WB. MTRRs converts WB to UC in such a case. > > > > But it wouldn't be impossible to simply read the MTRRs upon boot, store the > > information, disable MTRRs, and correctly use PAT to achieve the same effect > > (i.e. the "blindly maps" part of course would need fixing). > > This may crash and burn badly when we call a UEFI function or an SMI happens. I > think we should just leave the MTRRs alone. Not to mention suspend/resume, reboot and other goodies where the firmware might pop up expecting intact MTRRs. Btw., doesn't a lack of MTRRs imply UC? So is 'crash and burn' possible in most cases? Isn't it just 'executes slower than before'? Thanks, Ingo ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2 2015-06-12 23:15 ` Andy Lutomirski 2015-06-12 23:29 ` James Bottomley 2015-06-13 6:37 ` Ingo Molnar @ 2015-06-15 6:20 ` Jan Beulich 2 siblings, 0 replies; 20+ messages in thread From: Jan Beulich @ 2015-06-15 6:20 UTC (permalink / raw) To: Andy Lutomirski Cc: Luis R. Rodriguez, Bjorn Helgaas, Jej B, Toshi Kani, X86 ML, Andrew Morton, ville.syrjala, Julia Lawall, xen-devel, Dave Airlie, syrjala, Juergen Gross, Luis Rodriguez, Borislav Petkov, Tomi Valkeinen, linux-fbdev, linux-kernel, linux-media, linux-pci >>> On 13.06.15 at 01:15, <luto@amacapital.net> wrote: > On Jun 12, 2015 12:59 AM, "Jan Beulich" <JBeulich@suse.com> wrote: >> >> >>> On 12.06.15 at 01:23, <toshi.kani@hp.com> wrote: >> > There are two usages on MTRRs: >> > 1) MTRR entries set by firmware >> > 2) MTRR entries set by OS drivers >> > >> > We can obsolete 2), but we have no control over 1). As UEFI firmwares >> > also set this up, this usage will continue to stay. So, we should not >> > get rid of the MTRR code that looks up the MTRR entries, while we have >> > no need to modify them. >> > >> > Such MTRR entries provide safe guard to /dev/mem, which allows >> > privileged user to access a range that may require UC mapping while >> > the /dev/mem driver blindly maps it with WB. MTRRs converts WB to UC in >> > such a case. >> >> But it wouldn't be impossible to simply read the MTRRs upon boot, >> store the information, disable MTRRs, and correctly use PAT to >> achieve the same effect (i.e. the "blindly maps" part of course >> would need fixing). > > This may crash and burn badly when we call a UEFI function or an SMI > happens. I think we should just leave the MTRRs alone. I buy the SMI part, but UEFI runtime calls are being done on page tables we construct and control, so attributes could be kept correct without relying on MTRRs. Jan ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2015-08-07 23:50 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-06-11 20:36 RIP MTRR - status update for upcoming v4.2 Luis R. Rodriguez 2015-06-11 23:23 ` Toshi Kani 2015-06-12 0:52 ` Luis R. Rodriguez 2015-06-12 16:42 ` Toshi Kani 2015-06-12 7:59 ` [Xen-devel] " Jan Beulich 2015-06-12 16:58 ` Toshi Kani 2015-08-06 19:53 ` Luis R. Rodriguez 2015-08-06 19:55 ` Luis R. Rodriguez 2015-08-06 22:58 ` Toshi Kani 2015-08-07 20:25 ` Luis R. Rodriguez 2015-08-07 21:56 ` Toshi Kani 2015-08-07 22:23 ` Luis R. Rodriguez 2015-08-07 23:08 ` Toshi Kani 2015-08-07 23:19 ` Toshi Kani 2015-08-07 23:26 ` Luis R. Rodriguez 2015-08-07 23:48 ` Toshi Kani 2015-06-12 23:15 ` Andy Lutomirski 2015-06-12 23:29 ` James Bottomley 2015-06-13 6:37 ` Ingo Molnar 2015-06-15 6:20 ` Jan Beulich
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).