* Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
       [not found] <55E477DE.2060106@gmail.com>
@ 2015-08-31 16:05 ` Stuart Hayes
  2015-09-03  2:45   ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Stuart Hayes @ 2015-08-31 16:05 UTC (permalink / raw)
  To: tglx, mingo, H. Peter Anvin; +Cc: linux-kernel, x86, prarit

Increase the range of chunk sizes tried in mtrr_cleanup() so it is able
to map large memory configs into MTRRs.

Currently, mtrr_cleanup() will fail with large memory configurations,
because it limits chunk_size to 2GB, which means that each MTRR can only
cover 2GB of memory.  With a memory size of, say, 256GB, and ten variable
MTRRs (such as some recent Intel CPUs have), it is not possible to set up
the MTRRs to cover all of memory.

Signed-off-by: Stuart Hayes <stuart.w.hayes@gmail.com>
---
--- linux-4.2-rc7/arch/x86/kernel/cpu/mtrr/cleanup.c.orig	2015-08-16 18:34:13.000000000 -0500
+++ linux-4.2-rc7/arch/x86/kernel/cpu/mtrr/cleanup.c	2015-08-27 12:29:51.908579247 -0500
@@ -517,10 +517,11 @@ struct mtrr_cleanup_result {
 
 /*
  * gran_size: 64K, 128K, 256K, 512K, 1M, 2M, ..., 2G
- * chunk size: gran_size, ..., 2G
- * so we need (1+16)*8
+ * chunk size: gran_size, ..., 2G, ..., 1<<address_bits
+ *   (for 32 address bits, we need 136)
+ *   (for 40 address bits, we need 264)
  */
-#define NUM_RESULT	136
+#define NUM_RESULT	264
 #define PSHIFT		(PAGE_SHIFT - 10)
 
 static struct mtrr_cleanup_result __initdata result[NUM_RESULT];
@@ -751,7 +752,7 @@ int __init mtrr_cleanup(unsigned address
 	memset(result, 0, sizeof(result));
 	for (gran_size = (1ULL<<16); gran_size < (1ULL<<32); gran_size <<= 1) {
 
-		for (chunk_size = gran_size; chunk_size < (1ULL<<32);
+		for (chunk_size = gran_size; chunk_size < (1ULL<<address_bits);
 		     chunk_size <<= 1) {
 
 			if (i >= NUM_RESULT)

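The counts quoted in the comment above can be checked by replaying the two loops outside the kernel. What follows is a minimal sketch that assumes nothing beyond the loop bounds shown in the patch. count_results() and max_coverage() are hypothetical helpers for illustration, not kernel functions, and max_coverage() only encodes the commit message's simplification that each variable MTRR covers at most one chunk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count the (gran_size, chunk_size) pairs visited by the nested loops in
 * the patched mtrr_cleanup(): gran_size runs over 64K..2G and chunk_size
 * runs from gran_size up to, but not including, 1 << address_bits.
 * Hypothetical helper, not part of the kernel.
 */
static unsigned int count_results(unsigned int address_bits)
{
	uint64_t gran_size, chunk_size;
	unsigned int n = 0;

	/* Keep the shifts below well defined. */
	assert(address_bits >= 32 && address_bits < 64);

	for (gran_size = 1ULL << 16; gran_size < (1ULL << 32); gran_size <<= 1)
		for (chunk_size = gran_size;
		     chunk_size < (1ULL << address_bits);
		     chunk_size <<= 1)
			n++;

	return n;
}

/*
 * Upper bound on covered memory under the commit message's assumption
 * that one variable MTRR covers at most one chunk.
 * Hypothetical helper, not part of the kernel.
 */
static uint64_t max_coverage(unsigned int num_var_ranges,
			     uint64_t max_chunk_size)
{
	return (uint64_t)num_var_ranges * max_chunk_size;
}
```

Under these assumptions count_results(32) gives 136 and count_results(40) gives 264, matching the comment, and max_coverage(10, 1ULL << 31) gives 20GB, well short of 256GB. With more than 40 address bits the loops would produce more than 264 candidates; the `i >= NUM_RESULT` check visible in the hunk presumably bounds the array in that case.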



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-08-31 16:05 ` Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup Stuart Hayes
@ 2015-09-03  2:45   ` Luis R. Rodriguez
  2015-09-03 12:17     ` Prarit Bhargava
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03  2:45 UTC (permalink / raw)
  To: Stuart Hayes
  Cc: tglx, mingo, H. Peter Anvin, linux-kernel, x86, prarit, mcgrof,
	Toshi Kani

On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> Increase the range of chunk sizes tried in mtrr_cleanup() so it is able
> to map large memory configs into MTRRs.
> 
> Currently, mtrr_cleanup() will fail with large memory configurations,
> because it limits chunk_size to 2GB, which means that each MTRR can only
> cover 2GB of memory.  With a memory size of, say, 256GB, and ten variable
> MTRRs (such as some recent Intel CPUs have), it is not possible to set up
> the MTRRs to cover all of memory.

Linux drivers no longer use MTRR so why is the cleanup needed, ie, what would
happen if the cleanup is just skipped in your case ?

  Luis


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03  2:45   ` Luis R. Rodriguez
@ 2015-09-03 12:17     ` Prarit Bhargava
  2015-09-03 17:59       ` Luis R. Rodriguez
  2015-09-14 14:46       ` Stuart Hayes
  0 siblings, 2 replies; 23+ messages in thread
From: Prarit Bhargava @ 2015-09-03 12:17 UTC (permalink / raw)
  To: Luis R. Rodriguez, Stuart Hayes
  Cc: tglx, mingo, H. Peter Anvin, linux-kernel, x86, mcgrof, Toshi Kani



On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
>> Increase the range of chunk sizes tried in mtrr_cleanup() so it is able
>> to map large memory configs into MTRRs.
>>
>> Currently, mtrr_cleanup() will fail with large memory configurations,
>> because it limits chunk_size to 2GB, which means that each MTRR can only
>> cover 2GB of memory.  With a memory size of, say, 256GB, and ten variable
>> MTRRs (such as some recent Intel CPUs have), it is not possible to set up
>> the MTRRs to cover all of memory.
> 
> Linux drivers no longer use MTRR so why is the cleanup needed, ie, what would
> happen if the cleanup is just skipped in your case ?

The infiniband & video drivers still use MTRR (or at least it was my
understanding that they do).  In any case, Stuart -- could you try booting with
'disable_mtrr_cleanup' as a kernel parameter?

P.


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 12:17     ` Prarit Bhargava
@ 2015-09-03 17:59       ` Luis R. Rodriguez
  2015-09-03 18:10         ` Prarit Bhargava
  2015-09-14 14:46       ` Stuart Hayes
  1 sibling, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03 17:59 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Stuart Hayes, tglx, mingo, H. Peter Anvin, linux-kernel, x86,
	mcgrof, Toshi Kani

On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
> 
> 
> On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> > On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> >> Increase the range of chunk sizes tried in mtrr_cleanup() so it is able
> >> to map large memory configs into MTRRs.
> >>
> >> Currently, mtrr_cleanup() will fail with large memory configurations,
> >> because it limits chunk_size to 2GB, which means that each MTRR can only
> >> cover 2GB of memory.  With a memory size of, say, 256GB, and ten variable
> >> MTRRs (such as some recent Intel CPUs have), it is not possible to set up
> >> the MTRRs to cover all of memory.
> > 
> > Linux drivers no longer use MTRR so why is the cleanup needed, ie, what would
> > happen if the cleanup is just skipped in your case ?
> 
> The infiniband & video drivers still use MTRR (or at least it was my
> understanding that they do). 

There were a few stragglers left on v4.2, I have transformed them in the latest
development changes and those transformations are now part of linux-next. If
this is specific to a driver you may want to first ensure you backport the
required patch that transforms the driver to use proper PAT interfaces, v4.2
should have most updates but there were still a few left. Just make sure your
driver doesn't call mtrr_add() directly and if it doesn't then you should be
OK.

> In any case, Stuart -- could you try booting with
> 'disable_mtrr_cleanup' as a kernel parameter?

Indeed, please I'd like to hear back. Be sure to have the respective driver
transformation in place, what driver are you using exactly? In the event that
you argue this is still needed I'd like to know exactly *why*, the commit log
does not mention any of that at all.

  Luis


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 17:59       ` Luis R. Rodriguez
@ 2015-09-03 18:10         ` Prarit Bhargava
  2015-09-03 18:40           ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Prarit Bhargava @ 2015-09-03 18:10 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Stuart Hayes, tglx, mingo, H. Peter Anvin, linux-kernel, x86,
	mcgrof, Toshi Kani



On 09/03/2015 01:59 PM, Luis R. Rodriguez wrote:
> On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
>>
>>
>> On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
>>> On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
>>>> Increase the range of chunk sizes tried in mtrr_cleanup() so it is able
>>>> to map large memory configs into MTRRs.
>>>>
>>>> Currently, mtrr_cleanup() will fail with large memory configurations,
>>>> because it limits chunk_size to 2GB, which means that each MTRR can only
>>>> cover 2GB of memory.  With a memory size of, say, 256GB, and ten variable
>>>> MTRRs (such as some recent Intel CPUs have), it is not possible to set up
>>>> the MTRRs to cover all of memory.
>>>
>>> Linux drivers no longer use MTRR so why is the cleanup needed, ie, what would
>>> happen if the cleanup is just skipped in your case ?
>>
>> The infiniband & video drivers still use MTRR (or at least it was my
>> understanding that they do). 
> 
> There were a few stragglers left on v4.2, I have transformed them in the latest
> development changes and those tranformations are now part of linux-next. If
> this is specific to a driver you may want to first ensure you backport the
> required patch that transforms the driver to use proper PAT interfaces, v4.2
> should have most updates but there were still a few left. Just make sure your
> driver doesn't call mtrr_add() directly and if it doesn't then you should be
> OK.
> 
>> In any case, Stuart -- could you try booting with
>> 'disable_mtrr_cleanup' as a kernel parameter?
> 
> Indeed, please I'd like to hear back. Be sure to have the respective driver
> transformation in place, what driver are you using exactly? In the event that
> you argue this is still needed I'd like to know exaclty *why*, the comit log
> does not mention any of that at all.
> 

Well ... we are trying to also fix this in older kernels too, *cough* RHEL
*cough*, so that's where the patch comes from.  If upstream is going to
deprecate/remove mtrr support so be it.  We can do a stable fix instead to fix
older stable kernels.

P.


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 18:10         ` Prarit Bhargava
@ 2015-09-03 18:40           ` Luis R. Rodriguez
  2015-09-03 19:22             ` Toshi Kani
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03 18:40 UTC (permalink / raw)
  To: Prarit Bhargava
  Cc: Stuart Hayes, tglx, mingo, H. Peter Anvin, linux-kernel, x86,
	mcgrof, Toshi Kani

On Thu, Sep 03, 2015 at 02:10:14PM -0400, Prarit Bhargava wrote:
> 
> 
> On 09/03/2015 01:59 PM, Luis R. Rodriguez wrote:
> > On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
> >>
> >>
> >> On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> >>> On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> >>>> Increase the range of chunk sizes tried in mtrr_cleanup() so it is able
> >>>> to map large memory configs into MTRRs.
> >>>>
> >>>> Currently, mtrr_cleanup() will fail with large memory configurations,
> >>>> because it limits chunk_size to 2GB, which means that each MTRR can only
> >>>> cover 2GB of memory.  With a memory size of, say, 256GB, and ten variable
> >>>> MTRRs (such as some recent Intel CPUs have), it is not possible to set up
> >>>> the MTRRs to cover all of memory.
> >>>
> >>> Linux drivers no longer use MTRR so why is the cleanup needed, ie, what would
> >>> happen if the cleanup is just skipped in your case ?
> >>
> >> The infiniband & video drivers still use MTRR (or at least it was my
> >> understanding that they do). 
> > 
> > There were a few stragglers left on v4.2, I have transformed them in the latest
> > development changes and those tranformations are now part of linux-next. If
> > this is specific to a driver you may want to first ensure you backport the
> > required patch that transforms the driver to use proper PAT interfaces, v4.2
> > should have most updates but there were still a few left. Just make sure your
> > driver doesn't call mtrr_add() directly and if it doesn't then you should be
> > OK.
> > 
> >> In any case, Stuart -- could you try booting with
> >> 'disable_mtrr_cleanup' as a kernel parameter?
> > 
> > Indeed, please I'd like to hear back. Be sure to have the respective driver
> > transformation in place, what driver are you using exactly? In the event that
> > you argue this is still needed I'd like to know exaclty *why*, the comit log
> > does not mention any of that at all.
> > 
> 
> Well ... we are trying to also fix this in older kernels too, *cough* RHEL
> *cough*, so that's where the patch comes from.  If upstream is going to
> deprecate/remove mtrr support so be it. 

Check linux-next, and Documentation/x86/mtrr.txt

https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Documentation/x86/mtrr.txt

The platform use of MTRR is the only thing you should be concerned over but as
noted returning MTRR_TYPE_INVALID should suffice if the OS does not make
any driver use / modifications.

Another next step I was considering was to move the PAT initialization
code out of the MTRR code so that we can let distros disable MTRR but enable
PAT. Right now this is not possible. If there is really a need for MTRR
cleanup, now's the time to discuss it; I'd like to know precisely
why it'd be needed if the OS does not make any direct use of it.

> We can do a stable fix instead to fix older stable kernels.

Patches to stable depend on them being merged first on Linus' tree so
we'd need to queue this up for linux-next if you want that, but again,
best to first check *why* you need this other than some "some error comes
up" type of thing.

  Luis


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 18:40           ` Luis R. Rodriguez
@ 2015-09-03 19:22             ` Toshi Kani
  2015-09-03 19:51               ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Toshi Kani @ 2015-09-03 19:22 UTC (permalink / raw)
  To: Luis R. Rodriguez, Prarit Bhargava
  Cc: Stuart Hayes, tglx, mingo, H. Peter Anvin, linux-kernel, x86,
	mcgrof, Toshi Kani

On Thu, 2015-09-03 at 20:40 +0200, Luis R. Rodriguez wrote:
> On Thu, Sep 03, 2015 at 02:10:14PM -0400, Prarit Bhargava wrote:
> > 
> > 
> > On 09/03/2015 01:59 PM, Luis R. Rodriguez wrote:
> > > On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
> > > > 
> > > > 
> > > > On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> > > > > On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> > > > > > Increase the range of chunk sizes tried in mtrr_cleanup() so it is 
> > > > > > able to map large memory configs into MTRRs.
> > > > > > 
> > > > > > Currently, mtrr_cleanup() will fail with large memory 
> > > > > > configurations, because it limits chunk_size to 2GB, which means 
> > > > > > that each MTRR can only cover 2GB of memory.  With a memory size 
> > > > > > of, say, 256GB, and ten variable MTRRs (such as some recent Intel 
> > > > > > CPUs have), it is not possible to set up the MTRRs to cover all of
> > > > > > memory.
> > > > > 
> > > > > Linux drivers no longer use MTRR so why is the cleanup needed, ie, 
> > > > > what would happen if the cleanup is just skipped in your case ?
> > > > 
> > > > The infiniband & video drivers still use MTRR (or at least it was my
> > > > understanding that they do). 
> > > 
> > > There were a few stragglers left on v4.2, I have transformed them in the 
> > > latest development changes and those tranformations are now part of 
> > > linux-next. If this is specific to a driver you may want to first ensure 
> > > you backport the required patch that transforms the driver to use proper 
> > > PAT interfaces, v4.2 should have most updates but there were still a few 
> > > left. Just make sure your driver doesn't call mtrr_add() directly and if 
> > > it doesn't then you should be OK.
> > > 
> > > > In any case, Stuart -- could you try booting with
> > > > 'disable_mtrr_cleanup' as a kernel parameter?
> > > 
> > > Indeed, please I'd like to hear back. Be sure to have the respective 
> > > driver transformation in place, what driver are you using exactly? In 
> > > the event that you argue this is still needed I'd like to know exaclty 
> > > *why*, the comit log does not mention any of that at all.
> > > 
> > 
> > Well ... we are trying to also fix this in older kernels too, *cough* RHEL
> > *cough*, so that's where the patch comes from.  If upstream is going to
> > deprecate/remove mtrr support so be it. 
> 
> Check linux-next, and Documentation/x86/mtrr.txt
> 
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Docume
> ntation/x86/mtrr.txt
> 
> The platform use of MTRR is the only thing you should be concerned over but 
> as noted returning MTRR_TYPE_INVALID should suffice if the OS does not make
> any driver use / modifications.

The following sentence in the "mtrr.txt" is not correct.  (Sorry I should have
caught it earlier.)  mtrr_type_lookup() returns MTRR_TYPE_INVALID when MTRRs
are disabled, i.e. MTRRs are set by neither firmware nor OS.  Most
firmware enables them, though.

"If MTRRs are only set up by the platform firmware code though and the OS does
not make any specific MTRR mapping requests mtrr_type_lookup() should always
return MTRR_TYPE_INVALID."

Thanks,
-Toshi
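
The distinction drawn above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the kernel's mtrr_type_lookup(): all toy_* names are hypothetical, fixed-range MTRRs are ignored, and overlap handling is reduced to "uncachable wins, otherwise the last matching range". The point it shows is that the invalid sentinel is returned only when MTRRs are disabled, no matter who programmed the ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 MTRR memory type encodings used by this sketch. */
#define TOY_TYPE_UNCACHABLE 0x00
#define TOY_TYPE_WRBACK     0x06
/* Sentinel standing in for the kernel's MTRR_TYPE_INVALID. */
#define TOY_TYPE_INVALID    0xff

struct toy_var_range {
	bool     valid;	/* range is in use */
	uint64_t base;	/* first byte covered */
	uint64_t size;	/* bytes covered */
	uint8_t  type;	/* memory type for the range */
};

struct toy_mtrr_state {
	bool     enabled;	/* MTRRs enabled, by firmware or by the OS */
	uint8_t  def_type;	/* type for addresses no range covers */
	struct toy_var_range var[8];
};

/* Hypothetical, simplified lookup for a single address. */
static uint8_t toy_type_lookup(const struct toy_mtrr_state *s, uint64_t addr)
{
	uint8_t found = TOY_TYPE_INVALID;
	size_t i;

	assert(s != NULL);

	/* Only disabled MTRRs yield the invalid sentinel. */
	if (!s->enabled)
		return TOY_TYPE_INVALID;

	for (i = 0; i < sizeof(s->var) / sizeof(s->var[0]); i++) {
		const struct toy_var_range *r = &s->var[i];

		if (!r->valid || addr < r->base || addr - r->base >= r->size)
			continue;
		if (r->type == TOY_TYPE_UNCACHABLE)
			return TOY_TYPE_UNCACHABLE;	/* UC takes precedence */
		found = r->type;	/* simplification: last match wins */
	}

	/* Uncovered addresses fall back to the default type. */
	return found != TOY_TYPE_INVALID ? found : s->def_type;
}
```

In this model a state with enabled set and ranges filled in "by firmware" returns a real cache type even though no OS code ever requested a mapping, which is why the quoted mtrr.txt sentence does not hold.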


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 19:22             ` Toshi Kani
@ 2015-09-03 19:51               ` Luis R. Rodriguez
  2015-09-03 21:31                 ` Toshi Kani
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03 19:51 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani

On Thu, Sep 03, 2015 at 01:22:42PM -0600, Toshi Kani wrote:
> On Thu, 2015-09-03 at 20:40 +0200, Luis R. Rodriguez wrote:
> > On Thu, Sep 03, 2015 at 02:10:14PM -0400, Prarit Bhargava wrote:
> > > 
> > > 
> > > On 09/03/2015 01:59 PM, Luis R. Rodriguez wrote:
> > > > On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
> > > > > 
> > > > > 
> > > > > On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> > > > > > On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> > > > > > > Increase the range of chunk sizes tried in mtrr_cleanup() so it is 
> > > > > > > able to map large memory configs into MTRRs.
> > > > > > > 
> > > > > > > Currently, mtrr_cleanup() will fail with large memory 
> > > > > > > configurations, because it limits chunk_size to 2GB, which means 
> > > > > > > that each MTRR can only cover 2GB of memory.  With a memory size 
> > > > > > > of, say, 256GB, and ten variable MTRRs (such as some recent Intel 
> > > > > > > CPUs have), it is not possible to set up the MTRRs to cover all of
> > > > > > > memory.
> > > > > > 
> > > > > > Linux drivers no longer use MTRR so why is the cleanup needed, ie, 
> > > > > > what would happen if the cleanup is just skipped in your case ?
> > > > > 
> > > > > The infiniband & video drivers still use MTRR (or at least it was my
> > > > > understanding that they do). 
> > > > 
> > > > There were a few stragglers left on v4.2, I have transformed them in the 
> > > > latest development changes and those tranformations are now part of 
> > > > linux-next. If this is specific to a driver you may want to first ensure 
> > > > you backport the required patch that transforms the driver to use proper 
> > > > PAT interfaces, v4.2 should have most updates but there were still a few 
> > > > left. Just make sure your driver doesn't call mtrr_add() directly and if 
> > > > it doesn't then you should be OK.
> > > > 
> > > > > In any case, Stuart -- could you try booting with
> > > > > 'disable_mtrr_cleanup' as a kernel parameter?
> > > > 
> > > > Indeed, please I'd like to hear back. Be sure to have the respective 
> > > > driver transformation in place, what driver are you using exactly? In 
> > > > the event that you argue this is still needed I'd like to know exaclty 
> > > > *why*, the comit log does not mention any of that at all.
> > > > 
> > > 
> > > Well ... we are trying to also fix this in older kernels too, *cough* RHEL
> > > *cough*, so that's where the patch comes from.  If upstream is going to
> > > deprecate/remove mtrr support so be it. 
> > 
> > Check linux-next, and Documentation/x86/mtrr.txt
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Docume
> > ntation/x86/mtrr.txt
> > 
> > The platform use of MTRR is the only thing you should be concerned over but 
> > as noted returning MTRR_TYPE_INVALID should suffice if the OS does not make
> > any driver use / modifications.
> 
> The following sentence in the "mtrr.txt" is not correct.  (Sorry I should have
> caught it earlier.)  mtrr_type_lookup() returns MTRR_TYPE_INVALID when MTRRs
> are disabled, i.e. MTRRs are not set by neither firmware nor OS.  Most of the
> firmwares enable them, though.
>
> "If MTRRs are only set up by the platform firmware code though and the OS does
> not make any specific MTRR mapping requests mtrr_type_lookup() should always
> return MTRR_TYPE_INVALID."

So this should be clarified to say -- that because platform firmware *may* make
use of MTRRs, even if all OS drivers are no longer making use of MTRRs
directly, we still need mtrr_type_lookup() to return the right type ?

 Luis


* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 19:51               ` Luis R. Rodriguez
@ 2015-09-03 21:31                 ` Toshi Kani
  2015-09-03 22:07                   ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Toshi Kani @ 2015-09-03 21:31 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani

On Thu, 2015-09-03 at 21:51 +0200, Luis R. Rodriguez wrote:
> On Thu, Sep 03, 2015 at 01:22:42PM -0600, Toshi Kani wrote:
> > On Thu, 2015-09-03 at 20:40 +0200, Luis R. Rodriguez wrote:
> > > On Thu, Sep 03, 2015 at 02:10:14PM -0400, Prarit Bhargava wrote:
> > > > 
> > > > 
> > > > On 09/03/2015 01:59 PM, Luis R. Rodriguez wrote:
> > > > > On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
> > > > > > 
> > > > > > 
> > > > > > On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> > > > > > > On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> > > > > > > > Increase the range of chunk sizes tried in mtrr_cleanup() so it is 
> > > > > > > > able to map large memory configs into MTRRs.
> > > > > > > > 
> > > > > > > > Currently, mtrr_cleanup() will fail with large memory 
> > > > > > > > configurations, because it limits chunk_size to 2GB, which means 
> > > > > > > > that each MTRR can only cover 2GB of memory.  With a memory size 
> > > > > > > > of, say, 256GB, and ten variable MTRRs (such as some recent Intel 
> > > > > > > > CPUs have), it is not possible to set up the MTRRs to cover all of
> > > > > > > > memory.
> > > > > > > 
> > > > > > > Linux drivers no longer use MTRR so why is the cleanup needed, ie, 
> > > > > > > what would happen if the cleanup is just skipped in your case ?
> > > > > > 
> > > > > > The infiniband & video drivers still use MTRR (or at least it was my
> > > > > > understanding that they do). 
> > > > > 
> > > > > There were a few stragglers left on v4.2, I have transformed them in the 
> > > > > latest development changes and those tranformations are now part of 
> > > > > linux-next. If this is specific to a driver you may want to first ensure 
> > > > > you backport the required patch that transforms the driver to use proper 
> > > > > PAT interfaces, v4.2 should have most updates but there were still a few 
> > > > > left. Just make sure your driver doesn't call mtrr_add() directly and if 
> > > > > it doesn't then you should be OK.
> > > > > 
> > > > > > In any case, Stuart -- could you try booting with
> > > > > > 'disable_mtrr_cleanup' as a kernel parameter?
> > > > > 
> > > > > Indeed, please I'd like to hear back. Be sure to have the respective 
> > > > > driver transformation in place, what driver are you using exactly? In 
> > > > > the event that you argue this is still needed I'd like to know exaclty 
> > > > > *why*, the comit log does not mention any of that at all.
> > > > > 
> > > > 
> > > > Well ... we are trying to also fix this in older kernels too, *cough* RHEL
> > > > *cough*, so that's where the patch comes from.  If upstream is going to
> > > > deprecate/remove mtrr support so be it. 
> > > 
> > > Check linux-next, and Documentation/x86/mtrr.txt
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Docume
> > > ntation/x86/mtrr.txt
> > > 
> > > The platform use of MTRR is the only thing you should be concerned over but 
> > > as noted returning MTRR_TYPE_INVALID should suffice if the OS does not make
> > > any driver use / modifications.
> > 
> > The following sentence in the "mtrr.txt" is not correct.  (Sorry I should have
> > caught it earlier.)  mtrr_type_lookup() returns MTRR_TYPE_INVALID when MTRRs
> > are disabled, i.e. MTRRs are not set by neither firmware nor OS.  Most of the
> > firmwares enable them, though.
> > 
> > "If MTRRs are only set up by the platform firmware code though and the OS does
> > not make any specific MTRR mapping requests mtrr_type_lookup() should always
> > return MTRR_TYPE_INVALID."
> 
> So this should be clarified to say -- that because platform firmware *may* make
> use of MTRRs, even if all OS drivers are no longer making use of MTRRs
> directly, we still need mtrr_type_lookup() to return the right type ?

That is correct, and I think the doc already states that.  I'd suggest
the following minor changes to the doc:

Thanks,
-Toshi

===
diff --git a/Documentation/x86/mtrr.txt b/Documentation/x86/mtrr.txt
index dc3e703..db1d8f0 100644
--- a/Documentation/x86/mtrr.txt
+++ b/Documentation/x86/mtrr.txt
@@ -13,15 +13,12 @@ non-PAT systems while a no-op but equally effective on PAT enabled systems.

 Even if Linux does not use MTRRs directly, some x86 platform firmware may still
 set up MTRRs early before booting the OS. They do this as some platform
-firmware may still have implemented access to MTRRs which would be controlled
-and handled by the platform firmware directly. An example of platform use of
+firmware may rely on cache types set by MTRRs. An example of platform use of
 MTRRs is through the use of SMI handlers, one case could be for fan control,
 the platform code would need uncachable access to some of its fan control
 registers. Such platform access does not need any Operating System MTRR code in
 place other than mtrr_type_lookup() to ensure any OS specific mapping requests
-are aligned with platform MTRR setup. If MTRRs are only set up by the platform
-firmware code though and the OS does not make any specific MTRR mapping
-requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID.
+are aligned with platform MTRR setup.

 For details refer to Documentation/x86/pat.txt.



* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 21:31                 ` Toshi Kani
@ 2015-09-03 22:07                   ` Luis R. Rodriguez
  2015-09-03 22:25                     ` Toshi Kani
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03 22:07 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani

On Thu, Sep 03, 2015 at 03:31:42PM -0600, Toshi Kani wrote:
> On Thu, 2015-09-03 at 21:51 +0200, Luis R. Rodriguez wrote:
> > On Thu, Sep 03, 2015 at 01:22:42PM -0600, Toshi Kani wrote:
> > > On Thu, 2015-09-03 at 20:40 +0200, Luis R. Rodriguez wrote:
> > > > On Thu, Sep 03, 2015 at 02:10:14PM -0400, Prarit Bhargava wrote:
> > > > > 
> > > > > 
> > > > > On 09/03/2015 01:59 PM, Luis R. Rodriguez wrote:
> > > > > > On Thu, Sep 03, 2015 at 08:17:02AM -0400, Prarit Bhargava wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 09/02/2015 10:45 PM, Luis R. Rodriguez wrote:
> > > > > > > > On Mon, Aug 31, 2015 at 11:05:33AM -0500, Stuart Hayes wrote:
> > > > > > > > > Increase the range of chunk sizes tried in mtrr_cleanup() so it is 
> > > > > > > > > able to map large memory configs into MTRRs.
> > > > > > > > > 
> > > > > > > > > Currently, mtrr_cleanup() will fail with large memory 
> > > > > > > > > configurations, because it limits chunk_size to 2GB, which means 
> > > > > > > > > that each MTRR can only cover 2GB of memory.  With a memory size 
> > > > > > > > > of, say, 256GB, and ten variable MTRRs (such as some recent Intel 
> > > > > > > > > CPUs have), it is not possible to set up the MTRRs to cover all of
> > > > > > > > > memory.
> > > > > > > > 
> > > > > > > > Linux drivers no longer use MTRR so why is the cleanup needed, ie, 
> > > > > > > > what would happen if the cleanup is just skipped in your case ?
> > > > > > > 
> > > > > > > The infiniband & video drivers still use MTRR (or at least it was my
> > > > > > > understanding that they do). 
> > > > > > 
> > > > > > There were a few stragglers left on v4.2, I have transformed them in the 
> > > > > > latest development changes and those tranformations are now part of 
> > > > > > linux-next. If this is specific to a driver you may want to first ensure 
> > > > > > you backport the required patch that transforms the driver to use proper 
> > > > > > PAT interfaces, v4.2 should have most updates but there were still a few 
> > > > > > left. Just make sure your driver doesn't call mtrr_add() directly and if 
> > > > > > it doesn't then you should be OK.
> > > > > > 
> > > > > > > In any case, Stuart -- could you try booting with
> > > > > > > 'disable_mtrr_cleanup' as a kernel parameter?
> > > > > > 
> > > > > > Indeed, please I'd like to hear back. Be sure to have the respective 
> > > > > > driver transformation in place, what driver are you using exactly? In 
> > > > > > the event that you argue this is still needed I'd like to know exaclty 
> > > > > > *why*, the comit log does not mention any of that at all.
> > > > > > 
> > > > > 
> > > > > Well ... we are trying to also fix this in older kernels too, *cough* RHEL
> > > > > *cough*, so that's where the patch comes from.  If upstream is going to
> > > > > deprecate/remove mtrr support so be it. 
> > > > 
> > > > Check linux-next, and Documentation/x86/mtrr.txt
> > > > 
> > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git/tree/Docume
> > > > ntation/x86/mtrr.txt
> > > > 
> > > > The platform use of MTRR is the only thing you should be concerned over but 
> > > > as noted returning MTRR_TYPE_INVALID should suffice if the OS does not make
> > > > any driver use / modifications.
> > > 
> > > The following sentence in the "mtrr.txt" is not correct.  (Sorry I should have
> > > caught it earlier.)  mtrr_type_lookup() returns MTRR_TYPE_INVALID when MTRRs
> > > are disabled, i.e. MTRRs are not set by neither firmware nor OS.  Most of the
> > > firmwares enable them, though.
> > > 
> > > "If MTRRs are only set up by the platform firmware code though and the OS does
> > > not make any specific MTRR mapping requests mtrr_type_lookup() should always
> > > return MTRR_TYPE_INVALID."
> > 
> > So this should be clarified to say -- that because platform firmware *may* make
> > use of MTRRs, even if all OS drivers are no longer making use of MTRRs
> > directly, we still need mtrr_type_lookup() to return the right type ?
> 
> That is correct, and I think the doc already states that.  I'd suggest
> the following minor changes to the doc:
> 
> Thanks,
> -Toshi
> 
> ===
> diff --git a/Documentation/x86/mtrr.txt b/Documentation/x86/mtrr.txt
> index dc3e703..db1d8f0 100644
> --- a/Documentation/x86/mtrr.txt
> +++ b/Documentation/x86/mtrr.txt
> @@ -13,15 +13,12 @@ non-PAT systems while a no-op but equally effective on PAT enabled systems.
> 
>  Even if Linux does not use MTRRs directly, some x86 platform firmware may still
>  set up MTRRs early before booting the OS. They do this as some platform
> -firmware may still have implemented access to MTRRs which would be controlled
> -and handled by the platform firmware directly. An example of platform use of
> +firmware may rely on cache types set by MTRRs. An example of platform use of
>  MTRRs is through the use of SMI handlers, one case could be for fan control,
>  the platform code would need uncachable access to some of its fan control
>  registers. Such platform access does not need any Operating System MTRR code in
>  place other than mtrr_type_lookup() to ensure any OS specific mapping requests
> -are aligned with platform MTRR setup. If MTRRs are only set up by the platform
> -firmware code though and the OS does not make any specific MTRR mapping
> -requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID.
> +are aligned with platform MTRR setup.

These are still at odds, for instance, I was under the impression we can
just have the OS return MTRR_TYPE_INVALID if the OS / drivers never used
or set up MTRR, but the platform did, above (not the patch) you seem to be
saying that even if the OS didn't modify MTRRs the OS still needs to return
the appropriately set up MTRR type by firmware. This is different. Can you
clarify?

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 22:07                   ` Luis R. Rodriguez
@ 2015-09-03 22:25                     ` Toshi Kani
  2015-09-03 22:45                       ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Toshi Kani @ 2015-09-03 22:25 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani

On Fri, 2015-09-04 at 00:07 +0200, Luis R. Rodriguez wrote:
 :
> These are still at odds, for instance, I was under the impression we can
> just have the OS return MTRR_TYPE_INVALID if the OS / drivers never used
> or set up MTRR, but the platform did, above (not the patch) you seem to be
> saying that even if the OS didn't modify MTRRs the OS still needs to return
> the appropriately set up MTRR type by firmware. This is different. Can you
> clarify?

mtrr_type_lookup() returns valid MTRR cache type for a given address range
when MTRRs are enabled.  It does not matter if MTRRs are set by the firmware
or the OS.  When MTRRs are enabled, the kernel needs to check through
mtrr_type_lookup() that large page mapping requests are aligned with MTRRs.

On Xen, or on a platform with firmware that does not enable MTRRs,
mtrr_type_lookup() returns MTRR_TYPE_INVALID (as long as the kernel does not
enable them).
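
A minimal sketch of the behavior described above, as a toy model and not the
kernel's implementation. All toy_* names are hypothetical, TOY_TYPE_INVALID
stands in for MTRR_TYPE_INVALID, and overlap rules between variable ranges are
ignored (the first matching range wins):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* UC and WB use the x86 MTRR type encodings; TOY_TYPE_INVALID is a
 * stand-in for the kernel's MTRR_TYPE_INVALID. */
enum { TOY_TYPE_UC = 0, TOY_TYPE_WB = 6, TOY_TYPE_INVALID = 0xff };

struct toy_range {
	uint64_t base;
	uint64_t size;
	uint8_t type;
};

struct toy_mtrr_state {
	bool enabled;		/* who enabled them (firmware or OS) is irrelevant */
	uint8_t def_type;	/* type for addresses not covered by any range */
	const struct toy_range *var;
	size_t nr_var;
};

/* Return the cache type for addr, or TOY_TYPE_INVALID if MTRRs are off. */
uint8_t toy_type_lookup(const struct toy_mtrr_state *s, uint64_t addr)
{
	assert(s != NULL);

	if (!s->enabled)
		return TOY_TYPE_INVALID;

	for (size_t i = 0; i < s->nr_var; i++) {
		const struct toy_range *r = &s->var[i];

		if (addr >= r->base && addr - r->base < r->size)
			return r->type;
	}
	return s->def_type;
}
```

The only input that changes the "invalid" answer is the enabled flag; the
model has no notion of who programmed the ranges.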

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 22:25                     ` Toshi Kani
@ 2015-09-03 22:45                       ` Luis R. Rodriguez
  2015-09-03 23:21                         ` Toshi Kani
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03 22:45 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani, Jan Beulich,
	Juergen Gross

On Thu, Sep 03, 2015 at 04:25:31PM -0600, Toshi Kani wrote:
> On Fri, 2015-09-04 at 00:07 +0200, Luis R. Rodriguez wrote:
>  :
> > These are still at odds, for instance, I was under the impression we can
> > just have the OS return MTRR_TYPE_INVALID if the OS / drivers never used
> > or set up MTRR, but the platform did, above (not the patch) you seem to be
> > saying that even if the OS didn't modify MTRRs the OS still needs to return
> > the appropriately set up MTRR type by firmware. This is different. Can you
> > clarify?
> 
> mtrr_type_lookup() returns valid MTRR cache type for a given address range
> when MTRRs are enabled.  It does not matter if MTRRs are set by the firmware
> or the OS.  When MTRRs are enabled, the kernel needs to check through
> mtrr_type_lookup() that large page mapping requests are aligned with MTRRs.

One further change I was considering was seeing if we can separate PAT
set up from MTRR's setup, but that was under the assumption we could live
with a kernel that would have mtrr_type_lookup() return MTRR_TYPE_INVALID
if kernel MTRR code is completely disabled but PAT enabled. We can't enable PAT
today without MTRR because PAT is initialized from the MTRR init sequence, and
that depends on MTRR. If we separated these, though, and if a distro disabled
kernel MTRR but enabled PAT, and if the platform firmware did set up MTRRs, what
would the possible issues be?

> On Xen,

When Xen is used a platform firmware may still set up MTRR, even if the
hypervisor doesn't set up MTRR right ? So same issue and question here.

> or on a platform with firmware that does not enable MTRRs,
> mtrr_type_lookup() returns MTRR_TYPE_INVALID (as long as the kernel does not
> enable them).

Sure this makes sense.

Thanks in advance,

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 22:45                       ` Luis R. Rodriguez
@ 2015-09-03 23:21                         ` Toshi Kani
  2015-09-03 23:54                           ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Toshi Kani @ 2015-09-03 23:21 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani, Jan Beulich,
	Juergen Gross

On Fri, 2015-09-04 at 00:45 +0200, Luis R. Rodriguez wrote:
> On Thu, Sep 03, 2015 at 04:25:31PM -0600, Toshi Kani wrote:
> > On Fri, 2015-09-04 at 00:07 +0200, Luis R. Rodriguez wrote:
> >  :
> > > These are still at odds, for instance, I was under the impression we can
> > > just have the OS return MTRR_TYPE_INVALID if the OS / drivers never used
> > > or set up MTRR, but the platform did, above (not the patch) you seem to 
> > > be saying that even if the OS didn't modify MTRRs the OS still needs to
> > > return the appropriately set up MTRR type by firmware. This is 
> > > different. Can you clarify?
> > 
> > mtrr_type_lookup() returns valid MTRR cache type for a given address range
> > when MTRRs are enabled.  It does not matter if MTRRs are set by the 
> > firmware or the OS.  When MTRRs are enabled, the kernel needs to check 
> > through mtrr_type_lookup() that large page mapping requests are aligned 
> > with MTRRs.
> 
> One further change I was considering was seeing if we can separate PAT
> set up from MTRR's setup, but that was under the assumption we could live
> with a kernel that would have mtrr_type_lookup() return MTRR_TYPE_INVALID
> if kernel MTRR code is completely disabled but PAT enabled. We can't enable 
> PAT today without MTRR because PAT is initialized from the MTRR init
> sequence, and that depends on MTRR. If we separated these, though, and if a
> distro disabled kernel MTRR but enabled PAT, and if the platform firmware did
> set up MTRRs, what would the possible issues be?

PAT's dependency on MTRR could be removed, but I would not recommend disabling
the MTRR option since most firmware enables MTRRs.  When the kernel has
the MTRR option disabled, but the firmware enables MTRRs, the kernel is unable
to verify if a large page mapping is aligned with MTRRs.  This can lead to
undefined behavior when a mapping that is not aligned with MTRRs is created and
accessed.
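
To illustrate the alignment check, here is a rough sketch under simplifying
assumptions. It is not the kernel's code; toy_range and
toy_mapping_is_uniform() are hypothetical names. A mapping is treated as
aligned when no variable range covers only part of it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_range {
	uint64_t base;
	uint64_t size;
	uint8_t type;
};

/*
 * Return true when [start, start + size) is either fully inside or fully
 * outside every variable range, so one cache type applies to all of it.
 * Overlaps between the ranges themselves are not modeled.
 */
bool toy_mapping_is_uniform(const struct toy_range *var, size_t nr,
			    uint64_t start, uint64_t size)
{
	uint64_t end = start + size;	/* exclusive */

	assert(var != NULL || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		uint64_t rb = var[i].base;
		uint64_t re = rb + var[i].size;	/* exclusive */
		bool overlaps = start < re && rb < end;
		bool contained = rb <= start && end <= re;

		if (overlaps && !contained)
			return false;
	}
	return true;
}
```

With the MTRR option compiled out, the kernel has no range table to run such a
check against, which is the concern here.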

> > On Xen,
> 
> When Xen is used a platform firmware may still set up MTRR, even if the
> hypervisor doesn't set up MTRR right ? So same issue and question here.

Right, I meant to say Xen guests.  In case of the Xen hypervisor,
mtrr_type_lookup() returns a valid type as it runs on a platform.

Thanks,
-Toshi

> > or on a platform with firmware that does not enable MTRRs,
> > mtrr_type_lookup() returns MTRR_TYPE_INVALID (as long as the kernel does 
> > not enable them).
> 
> Sure this makes sense.
> 
> Thanks in advance,
> 
>   Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 23:21                         ` Toshi Kani
@ 2015-09-03 23:54                           ` Luis R. Rodriguez
  2015-09-04  0:48                             ` Toshi Kani
  2015-09-04  6:51                             ` Jan Beulich
  0 siblings, 2 replies; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-03 23:54 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani, Jan Beulich,
	Juergen Gross

On Thu, Sep 03, 2015 at 05:21:14PM -0600, Toshi Kani wrote:
> On Fri, 2015-09-04 at 00:45 +0200, Luis R. Rodriguez wrote:
> > On Thu, Sep 03, 2015 at 04:25:31PM -0600, Toshi Kani wrote:
> > > On Fri, 2015-09-04 at 00:07 +0200, Luis R. Rodriguez wrote:
> > >  :
> > > > These are still at odds, for instance, I was under the impression we can
> > > > just have the OS return MTRR_TYPE_INVALID if the OS / drivers never used
> > > > or set up MTRR, but the platform did, above (not the patch) you seem to 
> > > > be saying that even if the OS didn't modify MTRRs the OS still needs to
> > > > return the appropriately set up MTRR type by firmware. This is 
> > > > different. Can you clarify?
> > > 
> > > mtrr_type_lookup() returns valid MTRR cache type for a given address range
> > > when MTRRs are enabled.  It does not matter if MTRRs are set by the 
> > > firmware or the OS.  When MTRRs are enabled, the kernel needs to check 
> > > through mtrr_type_lookup() that large page mapping requests are aligned 
> > > with MTRRs.
> > 
> > One further change I was considering was seeing if we can separate PAT
> > set up from MTRR's setup, but that was under the assumption we could live
> > with a kernel that would have mtrr_type_lookup() return MTRR_TYPE_INVALID
> > if kernel MTRR code is completely disabled but PAT enabled. We can't enable 
> > > PAT today without MTRR because PAT is initialized from the MTRR init
> > > sequence, and that depends on MTRR. If we separated these, though, and if a
> > > distro disabled kernel MTRR but enabled PAT, and if the platform firmware did
> > > set up MTRRs, what would the possible issues be?
> 
> PAT's dependency on MTRR could be removed, but I would not recommend disabling
> the MTRR option since most firmware enables MTRRs.

OK, we can add such a warning, default to enabling MTRR, and strongly warn
against disabling it.

> When the kernel has
> the MTRR option disabled, but the firmware enables MTRRs, the kernel is unable
> to verify if a large page mapping is aligned with MTRRs.  This can lead to
> undefined behavior when a mapping that is not aligned with MTRRs is created and
> accessed.

Crikey!

> > > On Xen,
> > 
> > When Xen is used a platform firmware may still set up MTRR, even if the
> > hypervisor doesn't set up MTRR right ? So same issue and question here.
> 
> Right, I meant to say Xen guests.

Ah, but it's more complicated than that.

> In case of the Xen hypervisor,
> mtrr_type_lookup() returns a valid type as it runs on a platform.

I am not sure if this happens today. I know MTRR is simply disabled
explicitly by the Xen hypervisor on the CPU; it disables it so that guests
reading the MTRR capabilities see it as disabled.

Then since the Xen Linux guests cannot speak MTRR through the hypervisor (for
instance Xen guests cannot ask Xen hypervisor to mtrr_type_lookup() for it)
if PCI passthrough is used it could mean a guest might set up / use incorrect
info as well.

If I understand this correctly then I think we're in a pickle with Xen unless
we add hypervisor support and hypercall support for mtrr_type_lookup().

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 23:54                           ` Luis R. Rodriguez
@ 2015-09-04  0:48                             ` Toshi Kani
  2015-09-04  1:40                               ` Luis R. Rodriguez
  2015-09-04  6:51                             ` Jan Beulich
  1 sibling, 1 reply; 23+ messages in thread
From: Toshi Kani @ 2015-09-04  0:48 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani, Jan Beulich,
	Juergen Gross

On Fri, 2015-09-04 at 01:54 +0200, Luis R. Rodriguez wrote:
> On Thu, Sep 03, 2015 at 05:21:14PM -0600, Toshi Kani wrote:
> > On Fri, 2015-09-04 at 00:45 +0200, Luis R. Rodriguez wrote:
> > > On Thu, Sep 03, 2015 at 04:25:31PM -0600, Toshi Kani wrote:
 :
> > > > On Xen,
> > > 
> > > When Xen is used a platform firmware may still set up MTRR, even if the
> > > hypervisor doesn't set up MTRR right ? So same issue and question here.
> > 
> > Right, I meant to say Xen guests.
> 
> Ah, but it's more complicated than that.
> 
> > In case of the Xen hypervisor,
> > mtrr_type_lookup() returns a valid type as it runs on a platform.
> 
> I am not sure if this happens today. I know MTRR is simply disabled
> explicitly by the Xen hypervisor on the CPU; it disables it so that guests
> reading the MTRR capabilities see it as disabled.

Oh, I would not let the hypervisor disable MTRRs...

> Then since the Xen Linux guests cannot speak MTRR through the hypervisor
> (for instance Xen guests cannot ask Xen hypervisor to mtrr_type_lookup() for
> it) if PCI passthrough is used it could mean a guest might set up / use
> incorrect info as well.
> 
> If I understand this correctly then I think we're in a pickle with Xen unless
> we add hypervisor support and hypercall support for mtrr_type_lookup().

I was under the assumption that MTRRs are emulated and disabled on guests.  Isn't
guest physical address virtualized?  I know other proprietary VMMs on IA64,
but know nothing about Xen...  So, please disregard my comments about Xen. :-)

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-04  0:48                             ` Toshi Kani
@ 2015-09-04  1:40                               ` Luis R. Rodriguez
  2015-09-04 14:56                                 ` Toshi Kani
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-09-04  1:40 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani, Jan Beulich,
	Juergen Gross, Roger Pau Monné,
	xen-devel

On Thu, Sep 03, 2015 at 06:48:46PM -0600, Toshi Kani wrote:
> On Fri, 2015-09-04 at 01:54 +0200, Luis R. Rodriguez wrote:
> > On Thu, Sep 03, 2015 at 05:21:14PM -0600, Toshi Kani wrote:
> > > On Fri, 2015-09-04 at 00:45 +0200, Luis R. Rodriguez wrote:
> > > > On Thu, Sep 03, 2015 at 04:25:31PM -0600, Toshi Kani wrote:
>  :
> > > > > On Xen,
> > > > 
> > > > When Xen is used a platform firmware may still set up MTRR, even if the
> > > > hypervisor doesn't set up MTRR right ? So same issue and question here.
> > > 
> > > Right, I meant to say Xen guests.
> > 
> > Ah, but it's more complicated than that.
> > 
> > > In case of the Xen hypervisor,
> > > mtrr_type_lookup() returns a valid type as it runs on a platform.
> > 
> > I am not sure if this happens today. I know MTRR is simply disabled
> > explicitly by the Xen hypervisor on the CPU; it disables it so that guests
> > reading the MTRR capabilities see it as disabled.
> 
> Oh, I would not let the hypervisor disable MTRRs...

Commit 586ab6a055376ec3f3e1e8 ("x86/pvh: disable MTRR feature on cpuid for Dom0")
by Roger Pau Monné disables MTRR for PVH dom0, so that cpuid reports MTRR as
disabled to guests. Later, Linux commit 47591df50512 ("xen: Support Xen
pv-domains using PAT"), added by Juergen in v3.19, means Linux guests can end
up booting without MTRR but with PAT enabled.

> > Then since the Xen Linux guests cannot speak MTRR through the hypervisor
> > (for instance Xen guests cannot ask Xen hypervisor to mtrr_type_lookup() for
> > it) if PCI passthrough is used it could mean a guest might set up / use
> > incorrect info as well.
> > 
> > If I understand this correctly then I think we're in a pickle with Xen unless
> > we add hypervisor support and hypercall support for mtrr_type_lookup().
> 
> I was under the assumption that MTRRs are emulated and disabled on guests.

Some "special" flavor Linux guests (with non-upstream code) have guest
MTRR hypercall support, for vanilla Xen and Linux they just never get MTRR
support. After Juergen's Linux changes though Xen guests can now get
shiny PAT support. Since MTRR hypercall support is not upstream and MTRR is
ancient I decided instead of adding MTRR hypercall support upstream to go with
converting all drivers to PAT interfaces, with the assumption there would be no
issues.

> Isn't guest physical address virtualized?

It is; there is a Xen iotlb and related machinery, but that should ensure dom0
gets proper access to devices, and if you use PCI passthrough you want
the best experience as well.

> I know other proprietary VMMs on IA64,
> but know nothing about Xen...  So, please disregard my comments about Xen. :-)

No worries, no one knows all the answers, we work together to remove
cobwebs off of these odd corners no one cares about :)

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 23:54                           ` Luis R. Rodriguez
  2015-09-04  0:48                             ` Toshi Kani
@ 2015-09-04  6:51                             ` Jan Beulich
  1 sibling, 0 replies; 23+ messages in thread
From: Jan Beulich @ 2015-09-04  6:51 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: mcgrof, Stuart Hayes, Toshi Kani, Toshi Kani, x86, tglx, mingo,
	Prarit Bhargava, Juergen Gross, linux-kernel, H. Peter Anvin

>>> On 04.09.15 at 01:54, <mcgrof@suse.com> wrote:
> If I understand this correctly then I think we're in a pickle with Xen unless
> we add hypervisor support and hypercall support for mtrr_type_lookup().

Are you perhaps unaware of XENPF_read_memtype (which our kernel
uses)?

Jan


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-04  1:40                               ` Luis R. Rodriguez
@ 2015-09-04 14:56                                 ` Toshi Kani
  0 siblings, 0 replies; 23+ messages in thread
From: Toshi Kani @ 2015-09-04 14:56 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Prarit Bhargava, Stuart Hayes, tglx, mingo, H. Peter Anvin,
	linux-kernel, x86, mcgrof, Toshi Kani, Jan Beulich,
	Juergen Gross, Roger Pau Monné,
	xen-devel

On Fri, 2015-09-04 at 03:40 +0200, Luis R. Rodriguez wrote:
> On Thu, Sep 03, 2015 at 06:48:46PM -0600, Toshi Kani wrote:
> > On Fri, 2015-09-04 at 01:54 +0200, Luis R. Rodriguez wrote:
> > > On Thu, Sep 03, 2015 at 05:21:14PM -0600, Toshi Kani wrote:
> > > > On Fri, 2015-09-04 at 00:45 +0200, Luis R. Rodriguez wrote:
> > > > > On Thu, Sep 03, 2015 at 04:25:31PM -0600, Toshi Kani wrote:
> >  :
> > > > > > On Xen,
> > > > > 
> > > > > When Xen is used a platform firmware may still set up MTRR, even if 
> > > > > the hypervisor doesn't set up MTRR right ? So same issue and 
> > > > > question here.
> > > > 
> > > > Right, I meant to say Xen guests.
> > > 
> > > Ah, but it's more complicated than that.
> > > 
> > > > In case of the Xen hypervisor,
> > > > mtrr_type_lookup() returns a valid type as it runs on a platform.
> > > 
> > > I am not sure if this happens today. I know MTRR is simply disabled
> > > explicitly by the Xen hypervisor on the CPU; it disables it so that guests
> > > reading the MTRR capabilities see it as disabled.
> > 
> > Oh, I would not let the hypervisor disable MTRRs...
> 
> Commit 586ab6a055376ec3f3e1e8 ("x86/pvh: disable MTRR feature on cpuid for
> Dom0") by Roger Pau Monné disables MTRR for PVH dom0, so that cpuid reports
> MTRR as disabled to guests.

Oh, I see.  It just clears the capability bit so that the kernel thinks MTRRs
are disabled.  That makes sense.
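
As a sketch of what clearing the capability bit means: on x86, CPUID leaf 1
reports MTRR support in EDX bit 12 and PAT support in EDX bit 16. The helper
names below are hypothetical, and this only models the bit manipulation, not
the actual Xen change:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CPUID_1_EDX_MTRR	(1u << 12)	/* CPUID.01H:EDX bit 12 */
#define TOY_CPUID_1_EDX_PAT	(1u << 16)	/* CPUID.01H:EDX bit 16 */

/* What a guest kernel concludes from the EDX value it is shown. */
bool toy_guest_sees_mtrr(uint32_t edx)
{
	return (edx & TOY_CPUID_1_EDX_MTRR) != 0;
}

bool toy_guest_sees_pat(uint32_t edx)
{
	return (edx & TOY_CPUID_1_EDX_PAT) != 0;
}

/* What a hypervisor could do before handing EDX to the guest. */
uint32_t toy_hide_mtrr(uint32_t edx)
{
	return edx & ~TOY_CPUID_1_EDX_MTRR;
}
```

Hiding one bit leaves the other untouched, which matches a guest booting
without MTRR but with PAT.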

> Later, Linux commit 47591df50512 ("xen: Support Xen pv-domains
> using PAT"), added by Juergen in v3.19, means Linux guests can end up booting
> without MTRR but with PAT enabled.

Nice!

> > > Then since the Xen Linux guests cannot speak MTRR through the hypervisor
> > > (for instance Xen guests cannot ask Xen hypervisor to mtrr_type_lookup() 
> > > for it) if PCI passthrough is used it could mean a guest might set up / 
> > > use incorrect info as well.
> > > 
> > > If I understand this correctly then I think we're in a pickle with Xen
> > > unless we add hypervisor support and hypercall support for
> > > mtrr_type_lookup().
> > 
> > I was under the assumption that MTRRs are emulated and disabled on guests.
> 
> Some "special" flavor Linux guests (with non-upstream code) have guest
> MTRR hypercall support, for vanilla Xen and Linux they just never get MTRR
> support. After Juergen's Linux changes though Xen guests can now get
> shiny PAT support. Since MTRR hypercall support is not upstream and MTRR is
> ancient I decided instead of adding MTRR hypercall support upstream to go 
> with converting all drivers to PAT interfaces, with the assumption there 
> would be no issues.
> 
> > Isn't guest physical address virtualized?
> 
> It is; there is a Xen iotlb and related machinery, but that should ensure dom0
> gets proper access to devices, and if you use PCI passthrough you want
> the best experience as well.
> 
> > I know other proprietary VMMs on IA64, but know nothing about Xen...  So, 
> > please disregard my comments about Xen. :-)
> 
> No worries, no one knows all the answers, we work together to remove
> cobwebs off of these odd corners no one cares about :)

Thanks for all the info!  That helps.
-Toshi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-03 12:17     ` Prarit Bhargava
  2015-09-03 17:59       ` Luis R. Rodriguez
@ 2015-09-14 14:46       ` Stuart Hayes
  2015-11-05 19:14         ` Yinghai Lu
  1 sibling, 1 reply; 23+ messages in thread
From: Stuart Hayes @ 2015-09-14 14:46 UTC (permalink / raw)
  To: Prarit Bhargava, Luis R. Rodriguez
  Cc: tglx, mingo, H. Peter Anvin, linux-kernel, x86, mcgrof, Toshi Kani


>>
>> Linux drivers no longer use MTRR so why is the cleanup needed, ie, what would
>> happen if the cleanup is just skipped in your case ?
> 
> The infiniband & video drivers still use MTRR (or at least it was my
> understanding that they do).  In any case, Stuart -- could you try booting with
> 'disable_mtrr_cleanup' as a kernel parameter?
> 
> P.
> 

Sorry for the delayed response.

Booting with 'disable_mtrr_cleanup' works, but the system I am working with
isn't actually failing--it just gets ugly error messages.  And the BIOS on the
system I am working with had set up the MTRRs correctly.

Stuart

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-09-14 14:46       ` Stuart Hayes
@ 2015-11-05 19:14         ` Yinghai Lu
  2015-11-05 19:43           ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Yinghai Lu @ 2015-11-05 19:14 UTC (permalink / raw)
  To: Stuart Hayes
  Cc: Prarit Bhargava, Luis R. Rodriguez, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Linux Kernel Mailing List,
	the arch/x86 maintainers, mcgrof, Toshi Kani

On Mon, Sep 14, 2015 at 7:46 AM, Stuart Hayes <stuart.w.hayes@gmail.com> wrote:
>
> Booting with 'disable_mtrr_cleanup' works, but the system I am working with
> isn't actually failing--it just gets ugly error messages.  And the BIOS on the
> system I am working with had set up the MTRRs correctly.

Please post boot log and /proc/mtrr for:
1. without your patch
2. without your patch and with disable_mtrr_cleanup in boot command line.
3. with your patch.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-11-05 19:14         ` Yinghai Lu
@ 2015-11-05 19:43           ` Luis R. Rodriguez
  2016-03-16 20:20             ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2015-11-05 19:43 UTC (permalink / raw)
  To: Yinghai Lu, Luis R. Rodriguez
  Cc: Stuart Hayes, Prarit Bhargava, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Linux Kernel Mailing List,
	the arch/x86 maintainers, Toshi Kani

On Thu, Nov 5, 2015 at 11:14 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> On Mon, Sep 14, 2015 at 7:46 AM, Stuart Hayes <stuart.w.hayes@gmail.com> wrote:
>>
>> Booting with 'disable_mtrr_cleanup' works, but the system I am working with
>> isn't actually failing--it just gets ugly error messages.  And the BIOS on the
>> system I am working with had set up the MTRRs correctly.
>
> Please post boot log and /proc/mtrr for:
> 1. without your patch
> 2. without your patch and with disable_mtrr_cleanup in boot command line.
> 3. with your patch.

Stuart,

to provide some context -- I reached out to Yinghai as he wrote the
original mtrr cleanup code. The commit logs seem to read that a crash
was possible on systems with > 4 GiB RAM with some types of BIOSes...
The cleanup code seems to trigger when variable MTRRs do not exist
that are UC, or when all variable MTRRs that exist are just UC + WB
(Yinghai correct me if I'm wrong). The commit log in question
(95ffa2438d0e9 "x86: mtrr cleanup for converting continuous to
discrete layout, v8") was not very clear about the cause of the crash
-- but suppose the issue here was the BIOS on some systems might want
to create some UC variable MTRRs early on and there were no UC MTRRs
available, and I can only guess the cleanup exists as a hack for those
BIOSes. Even if that was the case -- it's still not clear *why* the
crash would happen but I suppose a driver mishap can happen without UC
guarantees for some devices the BIOS may want to enable UC MTRR on.

To be able to determine what we do upstream we need to understand the
above first. We also need to understand if the cleanup might also be
implicated by userspace drivers using /proc/mtrr, or if a proprietary
driver exists that does use mtrr_add() directly even though PAT has
been available for ages and all drivers are now properly converted.

With clear answers to the above we'll be able to determine what the
right course of action should be for this patch. For instance I'm
inclined to strive to disable the complex cleanup code if we don't
need it anymore, but if we do need it your patch makes sense. If the
patch makes sense, though, are we going to have to keep updating
the segment size *every time* as systems grow? That seems rather
silly. And if PAT is prevalent, why are vendors still adding MTRRs? The
cleanup seems complex and a major hack for a fix for some BIOSes, I'd
much rather identify the exact issue and only have a fix to address
that case.
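
To make the scaling concern concrete, here is a small sketch that counts the
(gran_size, chunk_size) pairs visited by the loops shown in the patch at the
top of this thread. toy_count_results() is a hypothetical helper, not kernel
code; it mirrors only the loop bounds:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count loop iterations for gran_size in 64K..2G and chunk_size in
 * gran_size..(1 << address_bits), both stepping by powers of two.
 */
unsigned int toy_count_results(unsigned int address_bits)
{
	unsigned int n = 0;

	assert(address_bits <= 63);	/* keep the shift well defined */

	for (uint64_t gran = 1ULL << 16; gran < (1ULL << 32); gran <<= 1)
		for (uint64_t chunk = gran; chunk < (1ULL << address_bits);
		     chunk <<= 1)
			n++;

	return n;
}
```

For 32 and 40 address bits this gives 136 and 264, the two NUM_RESULT values
in the patch. The count keeps growing with address_bits, so a fixed-size
result array would need another bump whenever the supported width increases.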

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2015-11-05 19:43           ` Luis R. Rodriguez
@ 2016-03-16 20:20             ` Luis R. Rodriguez
  2016-03-29 17:07               ` Luis R. Rodriguez
  0 siblings, 1 reply; 23+ messages in thread
From: Luis R. Rodriguez @ 2016-03-16 20:20 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Yinghai Lu, Stuart Hayes, Prarit Bhargava, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Linux Kernel Mailing List,
	the arch/x86 maintainers, Toshi Kani

On Thu, Nov 05, 2015 at 11:43:59AM -0800, Luis R. Rodriguez wrote:
> On Thu, Nov 5, 2015 at 11:14 AM, Yinghai Lu <yinghai@kernel.org> wrote:
> > On Mon, Sep 14, 2015 at 7:46 AM, Stuart Hayes <stuart.w.hayes@gmail.com> wrote:
> >>
> >> Booting with 'disable_mtrr_cleanup' works, but the system I am working with
> >> isn't actually failing--it just gets ugly error messages.  And the BIOS on the
> >> system I am working with had set up the MTRRs correctly.
> >
> > Please post boot log and /proc/mtrr for:
> > 1. without your patch
> > 2. without your patch and with disable_mtrr_cleanup in boot command line.
> > 3. with your patch.
> 
> Stuart,
> 
> to provide some context -- I reached out to Yinghai as he wrote the
> original mtrr cleanup code. The commit logs seem to read that a crash
> was possible on systems with > 4 GiB RAM with some types of BIOSes...
> The cleanup code seems to trigger when variable MTRRs do not exist
> that are UC, or when all variable MTRRs that exist are just UC + WB
> (Yinghai correct me if I'm wrong). The commit log in question
> (95ffa2438d0e9 "x86: mtrr cleanup for converting continuous to
> discrete layout, v8") was not very clear about the cause of the crash
> -- but suppose the issue here was the BIOS on some systems might want
> to create some UC variable MTRRs early on and there were no UC MTRRs
> available, and I can only guess the cleanup exists as a hack for those
> BIOSes. Even if that was the case -- it's still not clear *why* the
> crash would happen but I suppose a driver mishap can happen without UC
> guarantees for some devices the BIOS may want to enable UC MTRR on.
> 
> To be able to determine what we do upstream we need to understand the
> above first. We also need to understand if the cleanup might also be
> implicated by userspace drivers using /proc/mtrr, or if a proprietary
> driver exists that does use mtrr_add() directly even though PAT has
> been available for ages and all drivers are now properly converted.
> 
> With clear answers to the above we'll be able to determine what the
> right course of action should be for this patch. For instance I'm
> inclined to strive to disable the complex cleanup code if we don't
> need it anymore, but if we do need it your patch makes sense. If the
> patch makes sense then though are we going to have to keep updating
> the segment size *every time* as systems grow? That seems rather
> silly. And if PAT is prevalent why are vendors adding MTRRs still? The
> cleanup seems complex and a major hack for a fix for some BIOSes, I'd
> much rather identify the exact issue and only have a fix to address
> that case.

I never heard back... so let's take this up on the other thread I just
raised.

  Luis

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup
  2016-03-16 20:20             ` Luis R. Rodriguez
@ 2016-03-29 17:07               ` Luis R. Rodriguez
  0 siblings, 0 replies; 23+ messages in thread
From: Luis R. Rodriguez @ 2016-03-29 17:07 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Yinghai Lu, Stuart Hayes, Prarit Bhargava, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Linux Kernel Mailing List,
	the arch/x86 maintainers, xen-devel

On Wed, Mar 16, 2016 at 1:20 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote:
> On Thu, Nov 05, 2015 at 11:43:59AM -0800, Luis R. Rodriguez wrote:
>> On Thu, Nov 5, 2015 at 11:14 AM, Yinghai Lu <yinghai@kernel.org> wrote:
>> > On Mon, Sep 14, 2015 at 7:46 AM, Stuart Hayes <stuart.w.hayes@gmail.com> wrote:
>> >>
>> >> Booting with 'disable_mtrr_cleanup' works, but the system I am working with
>> >> isn't actually failing--it just gets ugly error messages.  And the BIOS on the
>> >> system I am working with had set up the MTRRs correctly.
>> >
>> > Please post boot log and /proc/mtrr for:
>> > 1. without your patch
>> > 2. without your patch and with disable_mtrr_cleanup in boot command line.
>> > 3. with your patch.
>>
>> Stuart,
>>
>> to provide some context -- I reached out to Yinghai as he wrote the
>> original mtrr cleanup code. The commit logs seem to read that a crash
>> was possible on systems with > 4 GiB RAM with some types of BIOSes...
>> The cleanup code seems to trigger when variable MTRRs do not exist
>> that are UC, or when all variable MTRRs that exist are just UC + WB
>> (Yinghai correct me if I'm wrong). The commit log in question
>> (95ffa2438d0e9 "x86: mtrr cleanup for converting continuous to
>> discrete layout, v8") was not very clear about the cause of the crash
>> -- but suppose the issue here was the BIOS on some systems might want
>> to create some UC variable MTRRs early on and there were no UC MTRRs
>> available, and I can only guess the cleanup exists as a hack for those
>> BIOSes. Even if that was the case -- it's still not clear *why* the
>> crash would happen but I suppose a driver mishap can happen without UC
>> guarantees for some devices the BIOS may want to enable UC MTRR on.
>>
>> To be able to determine what we do upstream we need to understand the
>> above first. We also need to understand if the cleanup might also be
>> implicated by userspace drivers using /proc/mtrr, or if a proprietary
>> driver exists that does use mtrr_add() directly even though PAT has
>> been available for ages and all drivers are now properly converted.
>>
>> With clear answers to the above we'll be able to determine what the
>> right course of action should be for this patch. For instance I'm
>> inclined to strive to disable the complex cleanup code if we don't
>> need it anymore, but if we do need it your patch makes sense. If the
>> patch makes sense then though are we going to have to keep updating
>> the segment size *every time* as systems grow? That seems rather
>> silly. And if PAT is prevalent why are vendors adding MTRRs still? The
>> cleanup seems complex and a major hack for a fix for some BIOSes, I'd
>> much rather identify the exact issue and only have a fix to address
>> that case.
>
> I never heard back... so let's take this up on the other thread I just
> raised.

Although we never got an answer here, the picture now seems clear: Toshi
has confirmed that the cleanup is no longer needed, given that "cleanup
was needed to allocate more free slots to MTRRs. We do not need to care
about the number of free slots as long as the kernel does not insert any
new entry to MTRRs" [0]. So instead of enhancing the cleanup code for
larger systems, we should just remove it completely.

[0] http://lkml.kernel.org/r/1458336958.6393.544.camel@hpe.com
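
For context on the "keep updating *every time* as systems grow" concern
quoted above, here is a rough user-space sketch (not kernel code;
count_results() is a made-up helper name) that counts how many
(gran_size, chunk_size) pairs the patched loop in mtrr_cleanup() would
try for a given address_bits. It assumes the loop bounds shown in
Stuart's patch and address_bits below 64.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: mirrors the nested loop
 * bounds from the proposed patch and counts the iterations, i.e. the
 * number of result[] slots the search would need.
 * Assumes address_bits < 64 so the shift below is well defined.
 */
unsigned int count_results(unsigned int address_bits)
{
	unsigned int n = 0;
	uint64_t gran_size, chunk_size;

	/* gran_size: 64K, 128K, ..., 2G */
	for (gran_size = 1ULL << 16; gran_size < (1ULL << 32); gran_size <<= 1) {
		/* chunk_size: gran_size, ..., up to (but not including) 1 << address_bits */
		for (chunk_size = gran_size;
		     chunk_size < (1ULL << address_bits);
		     chunk_size <<= 1)
			n++;
	}

	return n;
}
```

With these bounds, count_results(32) gives 136 and count_results(40)
gives 264, matching the old and new NUM_RESULT in the patch. Wider
physical address sizes would need yet another bump, which is part of
why removing the cleanup looks preferable to extending it.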

 Luis

end of thread, other threads:[~2016-03-29 17:08 UTC | newest]

Thread overview: 23+ messages
     [not found] <55E477DE.2060106@gmail.com>
2015-08-31 16:05 ` Fwd: [PATCH] x86: Use larger chunks in mtrr_cleanup Stuart Hayes
2015-09-03  2:45   ` Luis R. Rodriguez
2015-09-03 12:17     ` Prarit Bhargava
2015-09-03 17:59       ` Luis R. Rodriguez
2015-09-03 18:10         ` Prarit Bhargava
2015-09-03 18:40           ` Luis R. Rodriguez
2015-09-03 19:22             ` Toshi Kani
2015-09-03 19:51               ` Luis R. Rodriguez
2015-09-03 21:31                 ` Toshi Kani
2015-09-03 22:07                   ` Luis R. Rodriguez
2015-09-03 22:25                     ` Toshi Kani
2015-09-03 22:45                       ` Luis R. Rodriguez
2015-09-03 23:21                         ` Toshi Kani
2015-09-03 23:54                           ` Luis R. Rodriguez
2015-09-04  0:48                             ` Toshi Kani
2015-09-04  1:40                               ` Luis R. Rodriguez
2015-09-04 14:56                                 ` Toshi Kani
2015-09-04  6:51                             ` Jan Beulich
2015-09-14 14:46       ` Stuart Hayes
2015-11-05 19:14         ` Yinghai Lu
2015-11-05 19:43           ` Luis R. Rodriguez
2016-03-16 20:20             ` Luis R. Rodriguez
2016-03-29 17:07               ` Luis R. Rodriguez
