linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping
@ 2015-05-15 18:23 Toshi Kani
  2015-05-15 18:23 ` [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Toshi Kani
                   ` (5 more replies)
  0 siblings, 6 replies; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle, mcgrof

This patchset enhances MTRR checks for the kernel huge I/O mapping.

The following functional changes are made in patch 6/6.
 - Allow pud_set_huge() and pmd_set_huge() to create a huge page mapping
   when the range is covered by a single MTRR entry of any memory type.
 - Log a pr_warn_once() message when a specified PMD map range spans more
   than a single MTRR entry.  Drivers should make a mapping request aligned
   to a single MTRR entry when the range is covered by MTRRs.

Patch 1/6 simplifies the condition of HAVE_ARCH_HUGE_VMAP in Kconfig.
Patch 2/6 - 5/6 are bug fix and clean up to mtrr_type_lookup().

The patchset is based on the tip tree.
---
v5:
 - Separate Kconfig change and reordered/squashed the patchset. (Borislav
   Petkov)
 - Update logs, comments and functional structures. (Borislav Petkov)
 - Move MTRR_STATE_MTRR_XXX definitions to kernel asm/mtrr.h.  (Borislav
   Petkov)
 - Change mtrr_type_lookup() not to set 'uniform' in case of MTRR_TYPE_INVALID.
   (Borislav Petkov)
 - Remove a patch accepted in the tip free from the series.

v4:
 - Update the change logs of patchset. (Ingo Molnar)
 - Add patch 3/7 to make the wrong address fix as a separate patch.
   (Ingo Molnar)
 - Add patch 5/7 to define MTRR_TYPE_INVALID. (Ingo Molnar)
 - Update patch 6/7 to document MTRR fixed ranges. (Ingo Molnar)

v3:
 - Add patch 3/5 to fix a bug in MTRR state checks.
 - Update patch 4/5 to create separate functions for the fixed and
   variable entries. (Ingo Molnar)

v2:
 - Update change logs and comments per review comments.
   (Ingo Molnar)
 - Add patch 3/4 to clean up mtrr_type_lookup(). (Ingo Molnar)

---
Toshi Kani (6):
 1/6 mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP
 2/6 mtrr, x86: Fix MTRR lookup to handle inclusive entry
 3/6 mtrr, x86: Fix MTRR state checks in mtrr_type_lookup()
 4/6 mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup()
 5/6 mtrr, x86: Clean up mtrr_type_lookup()
 6/6 mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

---
 arch/x86/Kconfig                   |   2 +-
 arch/x86/include/asm/mtrr.h        |  10 +-
 arch/x86/include/uapi/asm/mtrr.h   |   8 +-
 arch/x86/kernel/cpu/mtrr/cleanup.c |   3 +-
 arch/x86/kernel/cpu/mtrr/generic.c | 200 ++++++++++++++++++++++++-------------
 arch/x86/mm/pat.c                  |   4 +-
 arch/x86/mm/pgtable.c              |  59 ++++++++---
 7 files changed, 194 insertions(+), 92 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP
  2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
@ 2015-05-15 18:23 ` Toshi Kani
  2015-05-17  8:30   ` Borislav Petkov
       [not found]   ` <1432628901-18044-2-git-send-email-bp@alien8.de>
  2015-05-15 18:23 ` [PATCH v5 2/6] mtrr, x86: Fix MTRR lookup to handle inclusive entry Toshi Kani
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle,
	mcgrof, Toshi Kani

Simplify the conditions to select HAVE_ARCH_HUGE_VMAP
in arch/x86/Kconfig since X86_PAE depends on X86_32.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 8fec044..73a4d03 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -100,7 +100,7 @@ config X86
 	select IRQ_FORCED_THREADING
 	select HAVE_BPF_JIT if X86_64
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
-	select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE)
+	select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE
 	select ARCH_HAS_SG_CHAIN
 	select CLKEVT_I8253
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 2/6] mtrr, x86: Fix MTRR lookup to handle inclusive entry
  2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
  2015-05-15 18:23 ` [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Toshi Kani
@ 2015-05-15 18:23 ` Toshi Kani
       [not found]   ` <1432628901-18044-3-git-send-email-bp@alien8.de>
  2015-05-15 18:23 ` [PATCH v5 3/6] mtrr, x86: Fix MTRR state checks in mtrr_type_lookup() Toshi Kani
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle,
	mcgrof, Toshi Kani

When an MTRR entry is inclusive to a requested range, i.e.
the start and end of the request are not within the MTRR
entry range but the range contains the MTRR entry entirely,
__mtrr_type_lookup() ignores such a case because both
start_state and end_state are set to zero.

This bug can cause the following issues:
1) reserve_memtype() tracks an effective memory type in case
   a request type is WB (ex. /dev/mem blindly uses WB). Missing
   to track with its effective type causes a subsequent request
   to map the same range with the effective type to fail.
2) pud_set_huge() and pmd_set_huge() check if a requested range
   has any overlap with MTRRs. Missing to detect an overlap may
   cause a performance penalty or undefined behavior.

This patch fixes the bug by adding a new flag, 'inclusive',
to detect the inclusive case.  This case is then handled in
the same way as end_state:1 since the first region is the same.
With this fix, __mtrr_type_lookup() handles the inclusive case
properly.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/kernel/cpu/mtrr/generic.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 5b23967..e202d26 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 	prev_match = 0xFF;
 	for (i = 0; i < num_var_ranges; ++i) {
-		unsigned short start_state, end_state;
+		unsigned short start_state, end_state, inclusive;
 
 		if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11)))
 			continue;
@@ -166,19 +166,27 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 		start_state = ((start & mask) == (base & mask));
 		end_state = ((end & mask) == (base & mask));
+		inclusive = ((start < base) && (end > base));
 
-		if (start_state != end_state) {
+		if ((start_state != end_state) || inclusive) {
 			/*
 			 * We have start:end spanning across an MTRR.
-			 * We split the region into
-			 * either
-			 * (start:mtrr_end) (mtrr_end:end)
-			 * or
-			 * (start:mtrr_start) (mtrr_start:end)
+			 * We split the region into either
+			 *
+			 * - start_state:1
+			 * (start:mtrr_end)(mtrr_end:end)
+			 * - end_state:1
+			 * (start:mtrr_start)(mtrr_start:end)
+			 * - inclusive:1
+			 * (start:mtrr_start)(mtrr_start:mtrr_end)(mtrr_end:end)
+			 *
 			 * depending on kind of overlap.
-			 * Return the type for first region and a pointer to
-			 * the start of second region so that caller will
-			 * lookup again on the second region.
+			 *
+			 * Return the type of the first region and a pointer
+			 * to the start of next region so that caller will be
+			 * advised to lookup again after having adjusted start
+			 * and end.
+			 *
 			 * Note: This way we handle multiple overlaps as well.
 			 */
 			if (start_state)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 3/6] mtrr, x86: Fix MTRR state checks in mtrr_type_lookup()
  2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
  2015-05-15 18:23 ` [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Toshi Kani
  2015-05-15 18:23 ` [PATCH v5 2/6] mtrr, x86: Fix MTRR lookup to handle inclusive entry Toshi Kani
@ 2015-05-15 18:23 ` Toshi Kani
       [not found]   ` <1432628901-18044-4-git-send-email-bp@alien8.de>
  2015-05-15 18:23 ` [PATCH v5 4/6] mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup() Toshi Kani
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle,
	mcgrof, Toshi Kani

'mtrr_state.enabled' contains the FE (fixed MTRRs enabled)
and E (MTRRs enabled) flags in MSR_MTRRdefType.  Intel SDM,
section 11.11.2.1, defines these flags as follows:
 - All MTRRs are disabled when the E flag is clear.
   The FE flag has no affect when the E flag is clear.
 - The default type is enabled when the E flag is set.
 - MTRR variable ranges are enabled when the E flag is set.
 - MTRR fixed ranges are enabled when both E and FE flags
   are set.

MTRR state checks in __mtrr_type_lookup() do not match with
SDM.  Hence, this patch makes the following changes:
 - The current code detects MTRRs disabled when both E and
   FE flags are clear in mtrr_state.enabled.  Fix to detect
   MTRRs disabled when the E flag is clear.
 - The current code does not check if the FE bit is set in
   mtrr_state.enabled when looking into the fixed entries.
   Fix to check the FE flag.
 - The current code returns the default type when the E flag
   is clear in mtrr_state.enabled.  However, the default type
   is also disabled when the E flag is clear.  Fix to remove
   the code as this case is handled as MTRR disabled with
   the 1st change.

In addition, this patch defines the E and FE flags in
mtrr_state.enabled as follows.
 - FE flag: MTRR_STATE_MTRR_FIXED_ENABLED
 - E  flag: MTRR_STATE_MTRR_ENABLED

print_mtrr_state() and x86_get_mtrr_mem_range() are also updated
accordingly.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/mtrr.h        |    4 ++++
 arch/x86/kernel/cpu/mtrr/cleanup.c |    3 ++-
 arch/x86/kernel/cpu/mtrr/generic.c |   15 ++++++++-------
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index f768f62..ef92794 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -127,4 +127,8 @@ struct mtrr_gentry32 {
 				 _IOW(MTRR_IOCTL_BASE,  9, struct mtrr_sentry32)
 #endif /* CONFIG_COMPAT */
 
+/* Bit fields for enabled in struct mtrr_state_type */
+#define MTRR_STATE_MTRR_FIXED_ENABLED	0x01
+#define MTRR_STATE_MTRR_ENABLED		0x02
+
 #endif /* _ASM_X86_MTRR_H */
diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c
index 5f90b85..70d7c93 100644
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -98,7 +98,8 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
 			continue;
 		base = range_state[i].base_pfn;
 		if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed &&
-		    (mtrr_state.enabled & 1)) {
+		    (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
+		    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
 			/* Var MTRR contains UC entry below 1M? Skip it: */
 			printk(BIOS_BUG_MSG, i);
 			if (base + size <= (1<<(20-PAGE_SHIFT)))
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e202d26..b0599db 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -119,14 +119,16 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	if (!mtrr_state_set)
 		return 0xFF;
 
-	if (!mtrr_state.enabled)
+	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
 		return 0xFF;
 
 	/* Make end inclusive end, instead of exclusive */
 	end--;
 
 	/* Look in fixed ranges. Just return the type as per start */
-	if (mtrr_state.have_fixed && (start < 0x100000)) {
+	if ((start < 0x100000) &&
+	    (mtrr_state.have_fixed) &&
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
 		int idx;
 
 		if (start < 0x80000) {
@@ -149,9 +151,6 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	 * Look of multiple ranges matching this address and pick type
 	 * as per MTRR precedence
 	 */
-	if (!(mtrr_state.enabled & 2))
-		return mtrr_state.def_type;
-
 	prev_match = 0xFF;
 	for (i = 0; i < num_var_ranges; ++i) {
 		unsigned short start_state, end_state, inclusive;
@@ -355,7 +354,9 @@ static void __init print_mtrr_state(void)
 		 mtrr_attrib_to_str(mtrr_state.def_type));
 	if (mtrr_state.have_fixed) {
 		pr_debug("MTRR fixed ranges %sabled:\n",
-			 mtrr_state.enabled & 1 ? "en" : "dis");
+			((mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
+			 (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) ?
+			 "en" : "dis");
 		print_fixed(0x00000, 0x10000, mtrr_state.fixed_ranges + 0);
 		for (i = 0; i < 2; ++i)
 			print_fixed(0x80000 + i * 0x20000, 0x04000,
@@ -368,7 +369,7 @@ static void __init print_mtrr_state(void)
 		print_fixed_last();
 	}
 	pr_debug("MTRR variable ranges %sabled:\n",
-		 mtrr_state.enabled & 2 ? "en" : "dis");
+		 mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? "en" : "dis");
 	high_width = (__ffs64(size_or_mask) - (32 - PAGE_SHIFT) + 3) / 4;
 
 	for (i = 0; i < num_var_ranges; ++i) {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 4/6] mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup()
  2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
                   ` (2 preceding siblings ...)
  2015-05-15 18:23 ` [PATCH v5 3/6] mtrr, x86: Fix MTRR state checks in mtrr_type_lookup() Toshi Kani
@ 2015-05-15 18:23 ` Toshi Kani
       [not found]   ` <1432628901-18044-5-git-send-email-bp@alien8.de>
  2015-05-15 18:23 ` [PATCH v5 5/6] mtrr, x86: Clean up mtrr_type_lookup() Toshi Kani
  2015-05-15 18:23 ` [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Toshi Kani
  5 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle,
	mcgrof, Toshi Kani

mtrr_type_lookup() returns 0xFF when it cannot return a valid
MTRR memory type since MTRRs are disabled.  This patch defines
MTRR_TYPE_INVALID to clarify the meaning of this value, and
documents its usage.

Document the return values of Kernel Virtual Address mapping
functions, pud_set_huge(), pmd_set_huge, pud_clear_huge() and
pmd_clear_huge().

There is no functional change in this patch.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/mtrr.h        |    2 +-
 arch/x86/include/uapi/asm/mtrr.h   |    8 ++++++-
 arch/x86/kernel/cpu/mtrr/generic.c |   14 ++++++------
 arch/x86/mm/pgtable.c              |   42 +++++++++++++++++++++++++++---------
 4 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index ef92794..bb03a54 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -55,7 +55,7 @@ static inline u8 mtrr_type_lookup(u64 addr, u64 end)
 	/*
 	 * Return no-MTRRs:
 	 */
-	return 0xff;
+	return MTRR_TYPE_INVALID;
 }
 #define mtrr_save_fixed_ranges(arg) do {} while (0)
 #define mtrr_save_state() do {} while (0)
diff --git a/arch/x86/include/uapi/asm/mtrr.h b/arch/x86/include/uapi/asm/mtrr.h
index d0acb65..7528dcf 100644
--- a/arch/x86/include/uapi/asm/mtrr.h
+++ b/arch/x86/include/uapi/asm/mtrr.h
@@ -103,7 +103,7 @@ struct mtrr_state_type {
 #define MTRRIOC_GET_PAGE_ENTRY   _IOWR(MTRR_IOCTL_BASE, 8, struct mtrr_gentry)
 #define MTRRIOC_KILL_PAGE_ENTRY  _IOW(MTRR_IOCTL_BASE,  9, struct mtrr_sentry)
 
-/*  These are the region types  */
+/* MTRR memory types, which are defined in SDM */
 #define MTRR_TYPE_UNCACHABLE 0
 #define MTRR_TYPE_WRCOMB     1
 /*#define MTRR_TYPE_         2*/
@@ -113,5 +113,11 @@ struct mtrr_state_type {
 #define MTRR_TYPE_WRBACK     6
 #define MTRR_NUM_TYPES       7
 
+/*
+ * Invalid MTRR memory type.  mtrr_type_lookup() returns this value when
+ * MTRRs are disabled.  Note, this value is allocated from the reserved
+ * values (0x7-0xff) of the MTRR memory types.
+ */
+#define MTRR_TYPE_INVALID    0xff
 
 #endif /* _UAPI_ASM_X86_MTRR_H */
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index b0599db..7b1491c 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -104,7 +104,7 @@ static int check_type_overlap(u8 *prev, u8 *curr)
 
 /*
  * Error/Semi-error returns:
- * 0xFF - when MTRR is not enabled
+ * MTRR_TYPE_INVALID - when MTRR is not enabled
  * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
  *		corresponds only to [start:*partial_end].
  *		Caller has to lookup again for [*partial_end:end].
@@ -117,10 +117,10 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 	*repeat = 0;
 	if (!mtrr_state_set)
-		return 0xFF;
+		return MTRR_TYPE_INVALID;
 
 	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
-		return 0xFF;
+		return MTRR_TYPE_INVALID;
 
 	/* Make end inclusive end, instead of exclusive */
 	end--;
@@ -151,7 +151,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	 * Look of multiple ranges matching this address and pick type
 	 * as per MTRR precedence
 	 */
-	prev_match = 0xFF;
+	prev_match = MTRR_TYPE_INVALID;
 	for (i = 0; i < num_var_ranges; ++i) {
 		unsigned short start_state, end_state, inclusive;
 
@@ -206,7 +206,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			continue;
 
 		curr_match = mtrr_state.var_ranges[i].base_lo & 0xff;
-		if (prev_match == 0xFF) {
+		if (prev_match == MTRR_TYPE_INVALID) {
 			prev_match = curr_match;
 			continue;
 		}
@@ -220,7 +220,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			return MTRR_TYPE_WRBACK;
 	}
 
-	if (prev_match != 0xFF)
+	if (prev_match != MTRR_TYPE_INVALID)
 		return prev_match;
 
 	return mtrr_state.def_type;
@@ -229,7 +229,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 /*
  * Returns the effective MTRR type for the region
  * Error return:
- * 0xFF - when MTRR is not enabled
+ * MTRR_TYPE_INVALID - when MTRR is not enabled
  */
 u8 mtrr_type_lookup(u64 start, u64 end)
 {
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 0b97d2c..c30f981 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -563,16 +563,22 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
 }
 
 #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
+/**
+ * pud_set_huge - setup kernel PUD mapping
+ *
+ * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
+ * this function does not set up a huge page when the range is covered
+ * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
+ * disabled.
+ *
+ * Returns 1 on success and 0 on failure.
+ */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
 	u8 mtrr;
 
-	/*
-	 * Do not use a huge page when the range is covered by non-WB type
-	 * of MTRRs.
-	 */
 	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -584,16 +590,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 	return 1;
 }
 
+/**
+ * pmd_set_huge - setup kernel PMD mapping
+ *
+ * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
+ * this function does not set up a huge page when the range is covered
+ * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
+ * disabled.
+ *
+ * Returns 1 on success and 0 on failure.
+ */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
 	u8 mtrr;
 
-	/*
-	 * Do not use a huge page when the range is covered by non-WB type
-	 * of MTRRs.
-	 */
 	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -605,6 +617,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 	return 1;
 }
 
+/**
+ * pud_clear_huge - clear kernel PUD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PUD map is found).
+ */
 int pud_clear_huge(pud_t *pud)
 {
 	if (pud_large(*pud)) {
@@ -615,6 +632,11 @@ int pud_clear_huge(pud_t *pud)
 	return 0;
 }
 
+/**
+ * pmd_clear_huge - clear kernel PMD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PMD map is found).
+ */
 int pmd_clear_huge(pmd_t *pmd)
 {
 	if (pmd_large(*pmd)) {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 5/6] mtrr, x86: Clean up mtrr_type_lookup()
  2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
                   ` (3 preceding siblings ...)
  2015-05-15 18:23 ` [PATCH v5 4/6] mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup() Toshi Kani
@ 2015-05-15 18:23 ` Toshi Kani
       [not found]   ` <1432628901-18044-6-git-send-email-bp@alien8.de>
  2015-05-15 18:23 ` [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Toshi Kani
  5 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle,
	mcgrof, Toshi Kani

MTRRs contain fixed and variable entries.  mtrr_type_lookup()
may repeatedly call __mtrr_type_lookup() to handle a request
that overlaps with variable entries.  However,
__mtrr_type_lookup() also handles the fixed entries, which
do not have to be repeated.  Therefore, this patch creates
separate functions, mtrr_type_lookup_fixed() and
mtrr_type_lookup_variable(), to handle the fixed and variable
ranges respectively.

The patch also updates the function headers to clarify the
return values and output argument.  It updates comments to
clarify that the repeating is necessary to handle overlaps
with the default type, since overlaps with multiple entries
alone can be handled without such repeating.

There is no functional change in this patch.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/kernel/cpu/mtrr/generic.c |  136 +++++++++++++++++++++++-------------
 1 file changed, 85 insertions(+), 51 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 7b1491c..c7d5245 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -102,55 +102,67 @@ static int check_type_overlap(u8 *prev, u8 *curr)
 	return 0;
 }
 
-/*
- * Error/Semi-error returns:
- * MTRR_TYPE_INVALID - when MTRR is not enabled
- * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
- *		corresponds only to [start:*partial_end].
- *		Caller has to lookup again for [*partial_end:end].
+/**
+ * mtrr_type_lookup_fixed - look up memory type in MTRR fixed entries
+ *
+ * Return the MTRR fixed memory type of 'start'.
+ *
+ * MTRR fixed entries are divided into the following ways:
+ *  0x00000 - 0x7FFFF : This range is divided into eight 64KB sub-ranges
+ *  0x80000 - 0xBFFFF : This range is divided into sixteen 16KB sub-ranges
+ *  0xC0000 - 0xFFFFF : This range is divided into sixty-four 4KB sub-ranges
+ *
+ * Return Values:
+ * MTRR_TYPE_(type)  - Matched memory type
+ * MTRR_TYPE_INVALID - Unmatched
  */
-static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
+static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
+{
+	int idx;
+
+	if (start >= 0x100000)
+		return MTRR_TYPE_INVALID;
+
+	if (start < 0x80000) {		/* 0x0 - 0x7FFFF */
+		idx = 0;
+		idx += (start >> 16);
+		return mtrr_state.fixed_ranges[idx];
+
+	} else if (start < 0xC0000) {	/* 0x80000 - 0xBFFFF */
+		idx = 1 * 8;
+		idx += ((start - 0x80000) >> 14);
+		return mtrr_state.fixed_ranges[idx];
+	}
+
+	/* 0xC0000 - 0xFFFFF */
+	idx = 3 * 8;
+	idx += ((start - 0xC0000) >> 12);
+	return mtrr_state.fixed_ranges[idx];
+}
+
+/**
+ * mtrr_type_lookup_variable - look up memory type in MTRR variable entries
+ *
+ * Return Value:
+ * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
+ *
+ * Output Argument:
+ * repeat - Set to 1 when [start:end] spanned across MTRR range and type
+ *	    returned corresponds only to [start:*partial_end].  Caller has
+ *	    to lookup again for [*partial_end:end].
+ */
+static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
+				    int *repeat)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
-	if (!mtrr_state_set)
-		return MTRR_TYPE_INVALID;
-
-	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
-		return MTRR_TYPE_INVALID;
 
 	/* Make end inclusive end, instead of exclusive */
 	end--;
 
-	/* Look in fixed ranges. Just return the type as per start */
-	if ((start < 0x100000) &&
-	    (mtrr_state.have_fixed) &&
-	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
-		int idx;
-
-		if (start < 0x80000) {
-			idx = 0;
-			idx += (start >> 16);
-			return mtrr_state.fixed_ranges[idx];
-		} else if (start < 0xC0000) {
-			idx = 1 * 8;
-			idx += ((start - 0x80000) >> 14);
-			return mtrr_state.fixed_ranges[idx];
-		} else {
-			idx = 3 * 8;
-			idx += ((start - 0xC0000) >> 12);
-			return mtrr_state.fixed_ranges[idx];
-		}
-	}
-
-	/*
-	 * Look in variable ranges
-	 * Look of multiple ranges matching this address and pick type
-	 * as per MTRR precedence
-	 */
 	prev_match = MTRR_TYPE_INVALID;
 	for (i = 0; i < num_var_ranges; ++i) {
 		unsigned short start_state, end_state, inclusive;
@@ -186,7 +198,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			 * advised to lookup again after having adjusted start
 			 * and end.
 			 *
-			 * Note: This way we handle multiple overlaps as well.
+			 * Note: This way we handle overlaps with multiple
+			 * entries and the default type properly.
 			 */
 			if (start_state)
 				*partial_end = base + get_mtrr_size(mask);
@@ -215,21 +228,18 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			return curr_match;
 	}
 
-	if (mtrr_tom2) {
-		if (start >= (1ULL<<32) && (end < mtrr_tom2))
-			return MTRR_TYPE_WRBACK;
-	}
-
 	if (prev_match != MTRR_TYPE_INVALID)
 		return prev_match;
 
 	return mtrr_state.def_type;
 }
 
-/*
- * Returns the effective MTRR type for the region
- * Error return:
- * MTRR_TYPE_INVALID - when MTRR is not enabled
+/**
+ * mtrr_type_lookup - look up memory type in MTRR
+ *
+ * Return Values:
+ * MTRR_TYPE_(type)  - The effective MTRR type for the region
+ * MTRR_TYPE_INVALID - MTRR is disabled
  */
 u8 mtrr_type_lookup(u64 start, u64 end)
 {
@@ -237,22 +247,46 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	int repeat;
 	u64 partial_end;
 
-	type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+	if (!mtrr_state_set)
+		return MTRR_TYPE_INVALID;
+
+	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
+		return MTRR_TYPE_INVALID;
+
+	/*
+	 * Look up the fixed ranges first, which take priority over
+	 * the variable ranges.
+	 */
+	if ((start < 0x100000) &&
+	    (mtrr_state.have_fixed) &&
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
+		return mtrr_type_lookup_fixed(start, end);
+
+	/*
+	 * Look up the variable ranges.  Look of multiple ranges matching
+	 * this address and pick type as per MTRR precedence.
+	 */
+	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
 
 	/*
 	 * Common path is with repeat = 0.
 	 * However, we can have cases where [start:end] spans across some
-	 * MTRR range. Do repeated lookups for that case here.
+	 * MTRR ranges and/or the default type.  Do repeated lookups for
+	 * that case here.
 	 */
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
-		type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+		type = mtrr_type_lookup_variable(start, end, &partial_end,
+						 &repeat);
 
 		if (check_type_overlap(&prev_type, &type))
 			return type;
 	}
 
+	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
+		return MTRR_TYPE_WRBACK;
+
 	return type;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
                   ` (4 preceding siblings ...)
  2015-05-15 18:23 ` [PATCH v5 5/6] mtrr, x86: Clean up mtrr_type_lookup() Toshi Kani
@ 2015-05-15 18:23 ` Toshi Kani
  2015-05-18 13:33   ` Borislav Petkov
       [not found]   ` <1432628901-18044-8-git-send-email-bp@alien8.de>
  5 siblings, 2 replies; 40+ messages in thread
From: Toshi Kani @ 2015-05-15 18:23 UTC (permalink / raw)
  To: bp, akpm, hpa, tglx, mingo
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle,
	mcgrof, Toshi Kani

This patch adds an additional argument, 'uniform', to
mtrr_type_lookup(), which returns 1 when a given range is
covered uniformly by MTRRs, i.e. the range is fully covered
by a single MTRR entry or the default type.

pud_set_huge() and pmd_set_huge() are changed to check the
new 'uniform' flag to see if it is safe to create a huge page
mapping to the range.  This allows them to create a huge page
mapping to a range covered by a single MTRR entry of any
memory type.  It also detects a non-optimal request properly.
They continue to check with the WB type since the WB type has
no effect even if a request spans multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request
so that driver writers will be aware of such a case.  Drivers
should make a mapping request aligned to a single MTRR entry
when the range is covered by MTRRs.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/mtrr.h        |    4 ++--
 arch/x86/kernel/cpu/mtrr/generic.c |   37 ++++++++++++++++++++++++++----------
 arch/x86/mm/pat.c                  |    4 ++--
 arch/x86/mm/pgtable.c              |   33 ++++++++++++++++++++------------
 4 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a54..a31759e 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
  * arch_phys_wc_add and arch_phys_wc_del.
  */
 # ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
 extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
 extern int phys_wc_to_mtrr_index(int handle);
 #  else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
 {
 	/*
 	 * Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index c7d5245..7d347ac 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -146,19 +146,22 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
  * Return Value:
  * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
  *
- * Output Argument:
+ * Output Arguments:
  * repeat - Set to 1 when [start:end] spanned across MTRR range and type
  *	    returned corresponds only to [start:*partial_end].  Caller has
  *	    to lookup again for [*partial_end:end].
+ * uniform - Set to 1 when MTRR covers the region uniformly, i.e. the region
+ *	     is fully covered by a single MTRR entry or the default type.
  */
 static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
-				    int *repeat)
+				    int *repeat, u8 *uniform)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
+	*uniform = 1;
 
 	/* Make end inclusive end, instead of exclusive */
 	end--;
@@ -213,6 +216,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 
 			end = *partial_end - 1; /* end is inclusive */
 			*repeat = 1;
+			*uniform = 0;
 		}
 
 		if ((start & mask) != (base & mask))
@@ -224,6 +228,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 			continue;
 		}
 
+		*uniform = 0;
 		if (check_type_overlap(&prev_match, &curr_match))
 			return curr_match;
 	}
@@ -240,10 +245,14 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
  * Return Values:
  * MTRR_TYPE_(type)  - The effective MTRR type for the region
  * MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when MTRR covers the region uniformly, i.e. the region
+ *	     is fully covered by a single MTRR entry or the default type.
  */
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
 {
-	u8 type, prev_type;
+	u8 type, prev_type, is_uniform = 1, dummy;
 	int repeat;
 	u64 partial_end;
 
@@ -259,14 +268,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	 */
 	if ((start < 0x100000) &&
 	    (mtrr_state.have_fixed) &&
-	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
-		return mtrr_type_lookup_fixed(start, end);
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+		is_uniform = 0;
+		type = mtrr_type_lookup_fixed(start, end);
+		goto out;
+	}
 
 	/*
 	 * Look up the variable ranges.  Look of multiple ranges matching
 	 * this address and pick type as per MTRR precedence.
 	 */
-	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+	type = mtrr_type_lookup_variable(start, end, &partial_end,
+					 &repeat, &is_uniform);
 
 	/*
 	 * Common path is with repeat = 0.
@@ -277,16 +290,20 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
+		is_uniform = 0;
+
 		type = mtrr_type_lookup_variable(start, end, &partial_end,
-						 &repeat);
+						 &repeat, &dummy);
 
 		if (check_type_overlap(&prev_type, &type))
-			return type;
+			goto out;
 	}
 
 	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
-		return MTRR_TYPE_WRBACK;
+		type = MTRR_TYPE_WRBACK;
 
+out:
+	*uniform = is_uniform;
 	return type;
 }
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af677..372ad42 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
 	 * request is for WB.
 	 */
 	if (req_type == _PAGE_CACHE_MODE_WB) {
-		u8 mtrr_type;
+		u8 mtrr_type, uniform;
 
-		mtrr_type = mtrr_type_lookup(start, end);
+		mtrr_type = mtrr_type_lookup(start, end, &uniform);
 		if (mtrr_type != MTRR_TYPE_WRBACK)
 			return _PAGE_CACHE_MODE_UC_MINUS;
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f981..3fa0eb9 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -567,18 +567,21 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
  * pud_set_huge - setup kernel PUD mapping
  *
  * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * this function only sets up a huge page in the following conditions.
+ *  - MTRR is disabled.
+ *  - The range is mapped uniformly by MTRR, i.e. the range is fully covered
+ *    by a single MTRR entry or the default type.
+ *  - The MTRR memory type is WB.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -594,19 +597,25 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
  * pmd_set_huge - setup kernel PMD mapping
  *
  * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * this function only sets up a huge page in the following conditions.
+ *  - MTRR is disabled.
+ *  - The range is mapped uniformly by MTRR, i.e. the range is fully covered
+ *    by a single MTRR entry or the default type.
+ *  - The MTRR memory type is WB.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK)) {
+		pr_warn_once("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n",
+				addr, addr + PMD_SIZE);
 		return 0;
+	}
 
 	prot = pgprot_4k_2_large(prot);
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP
  2015-05-15 18:23 ` [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Toshi Kani
@ 2015-05-17  8:30   ` Borislav Petkov
       [not found]   ` <1432628901-18044-2-git-send-email-bp@alien8.de>
  1 sibling, 0 replies; 40+ messages in thread
From: Borislav Petkov @ 2015-05-17  8:30 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Fri, May 15, 2015 at 12:23:52PM -0600, Toshi Kani wrote:
> Simplify the conditions to select HAVE_ARCH_HUGE_VMAP
> in arch/x86/Kconfig since X86_PAE depends on X86_32.
> 
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> ---
>  arch/x86/Kconfig |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 8fec044..73a4d03 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -100,7 +100,7 @@ config X86
>  	select IRQ_FORCED_THREADING
>  	select HAVE_BPF_JIT if X86_64
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> -	select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE)
> +	select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE
>  	select ARCH_HAS_SG_CHAIN
>  	select CLKEVT_I8253
>  	select ARCH_HAVE_NMI_SAFE_CMPXCHG

Applied, thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-15 18:23 ` [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Toshi Kani
@ 2015-05-18 13:33   ` Borislav Petkov
  2015-05-18 17:22     ` Toshi Kani
       [not found]   ` <1432628901-18044-8-git-send-email-bp@alien8.de>
  1 sibling, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-05-18 13:33 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Fri, May 15, 2015 at 12:23:57PM -0600, Toshi Kani wrote:
> This patch adds an additional argument, 'uniform', to
> mtrr_type_lookup(), which returns 1 when a given range is
> covered uniformly by MTRRs, i.e. the range is fully covered
> by a single MTRR entry or the default type.
> 
> pud_set_huge() and pmd_set_huge() are changed to check the
> new 'uniform' flag to see if it is safe to create a huge page
> mapping to the range.  This allows them to create a huge page
> mapping to a range covered by a single MTRR entry of any
> memory type.  It also detects a non-optimal request properly.
> They continue to check with the WB type since the WB type has
> no effect even if a request spans multiple MTRR entries.
> 
> pmd_set_huge() logs a warning message to a non-optimal request
> so that driver writers will be aware of such a case.  Drivers
> should make a mapping request aligned to a single MTRR entry
> when the range is covered by MTRRs.
> 
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> ---
>  arch/x86/include/asm/mtrr.h        |    4 ++--
>  arch/x86/kernel/cpu/mtrr/generic.c |   37 ++++++++++++++++++++++++++----------
>  arch/x86/mm/pat.c                  |    4 ++--
>  arch/x86/mm/pgtable.c              |   33 ++++++++++++++++++++------------
>  4 files changed, 52 insertions(+), 26 deletions(-)

...

>  int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
>  {
> -	u8 mtrr;
> +	u8 mtrr, uniform;
>  
> -	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
> -	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
> +	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
> +	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> +	    (mtrr != MTRR_TYPE_WRBACK)) {
> +		pr_warn_once("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n",
> +				addr, addr + PMD_SIZE);
>  		return 0;
> +	}

All applied, I reformatted the comments in this last one a bit and made
the warning message hopefully a bit more descriptive:

---
From: Toshi Kani <toshi.kani@hp.com>
Date: Fri, 15 May 2015 12:23:57 -0600
Subject: [PATCH] x86/mm: Enhance MTRR checks in kernel mapping helpers

This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets
set to 1 when a given range is covered uniformly by MTRRs, i.e. the
range is fully covered by a single MTRR entry or the default type.

Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to
see if it is safe to create a huge page mapping in the range.

This allows them to create a huge page mapping in a range covered by
a single MTRR entry of any memory type. It also detects a non-optimal
request properly. They continue to check with the WB type since it does
not effectively change the uniform mapping even if a request spans
multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request so that
driver writers will be aware of such a case. Drivers should make a
mapping request aligned to a single MTRR entry when the range is covered
by MTRRs.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: dave.hansen@intel.com
Cc: Elliott@hp.com
Cc: pebolle@tiscali.nl
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-mm <linux-mm@kvack.org>
Cc: x86-ml <x86@kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>
Link: http://lkml.kernel.org/r/1431714237-880-7-git-send-email-toshi.kani@hp.com
[ Realign comments, improve warning message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/mtrr.h        |  4 ++--
 arch/x86/kernel/cpu/mtrr/generic.c | 40 +++++++++++++++++++++++++++----------
 arch/x86/mm/pat.c                  |  4 ++--
 arch/x86/mm/pgtable.c              | 41 +++++++++++++++++++++++++-------------
 4 files changed, 61 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a547c1ab..a31759e1edd9 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
  * arch_phys_wc_add and arch_phys_wc_del.
  */
 # ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
 extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
 extern int phys_wc_to_mtrr_index(int handle);
 #  else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
 {
 	/*
 	 * Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e51100c49eea..f782d9b62cb3 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
  * Return Value:
  * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
  *
- * Output Argument:
+ * Output Arguments:
  * repeat - Set to 1 when [start:end] spanned across MTRR range and type
  *	    returned corresponds only to [start:*partial_end].  Caller has
  *	    to lookup again for [*partial_end:end].
+ *
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ *	     region is fully covered by a single MTRR entry or the default
+ *	     type.
  */
 static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
-				    int *repeat)
+				    int *repeat, u8 *uniform)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
+	*uniform = 1;
 
 	/* Make end inclusive instead of exclusive */
 	end--;
@@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 
 			end = *partial_end - 1; /* end is inclusive */
 			*repeat = 1;
+			*uniform = 0;
 		}
 
 		if ((start & mask) != (base & mask))
@@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 			continue;
 		}
 
+		*uniform = 0;
 		if (check_type_overlap(&prev_match, &curr_match))
 			return curr_match;
 	}
@@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
  * Return Values:
  * MTRR_TYPE_(type)  - The effective MTRR type for the region
  * MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ *	     region is fully covered by a single MTRR entry or the default
+ *	     type.
  */
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
 {
-	u8 type, prev_type;
+	u8 type, prev_type, is_uniform = 1, dummy;
 	int repeat;
 	u64 partial_end;
 
@@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	 */
 	if ((start < 0x100000) &&
 	    (mtrr_state.have_fixed) &&
-	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
-		return mtrr_type_lookup_fixed(start, end);
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+		is_uniform = 0;
+		type = mtrr_type_lookup_fixed(start, end);
+		goto out;
+	}
 
 	/*
 	 * Look up the variable ranges.  Look of multiple ranges matching
 	 * this address and pick type as per MTRR precedence.
 	 */
-	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+	type = mtrr_type_lookup_variable(start, end, &partial_end,
+					 &repeat, &is_uniform);
 
 	/*
 	 * Common path is with repeat = 0.
@@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
-		type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+		is_uniform = 0;
+		type = mtrr_type_lookup_variable(start, end, &partial_end,
+						 &repeat, &dummy);
 
 		if (check_type_overlap(&prev_type, &type))
-			return type;
+			goto out;
 	}
 
 	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
-		return MTRR_TYPE_WRBACK;
+		type = MTRR_TYPE_WRBACK;
 
+out:
+	*uniform = is_uniform;
 	return type;
 }
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af6771a95a..372ad422c2c3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
 	 * request is for WB.
 	 */
 	if (req_type == _PAGE_CACHE_MODE_WB) {
-		u8 mtrr_type;
+		u8 mtrr_type, uniform;
 
-		mtrr_type = mtrr_type_lookup(start, end);
+		mtrr_type = mtrr_type_lookup(start, end, &uniform);
 		if (mtrr_type != MTRR_TYPE_WRBACK)
 			return _PAGE_CACHE_MODE_UC_MINUS;
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f9819786b..f1894daa79ee 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
 /**
  * pud_set_huge - setup kernel PUD mapping
  *
- * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
+ * this function sets up a huge page only if all of the following
+ * conditions are met:
+ *
+ *  - MTRRs are disabled.
+ *  - The range is mapped uniformly by an MTRR, i.e. the range is
+ *    fully covered by a single MTRR entry or the default type.
+ *  - The MTRR memory type is WB.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -593,20 +598,28 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 /**
  * pmd_set_huge - setup kernel PMD mapping
  *
- * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
+ * this function sets up a huge page only if all of the following
+ * conditions are met:
+ *
+ *  - MTRR is disabled.
+ *  - The range is mapped uniformly by an MTRR, i.e. the range is
+ *    fully covered by a single MTRR entry or the default type.
+ *  - The MTRR memory type is WB.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK)) {
+		pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
+			     __func__, addr, addr + PMD_SIZE);
 		return 0;
+	}
 
 	prot = pgprot_4k_2_large(prot);
 
-- 
2.3.5

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 13:33   ` Borislav Petkov
@ 2015-05-18 17:22     ` Toshi Kani
  2015-05-18 19:01       ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-18 17:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, 2015-05-18 at 15:33 +0200, Borislav Petkov wrote:
> On Fri, May 15, 2015 at 12:23:57PM -0600, Toshi Kani wrote:
> > This patch adds an additional argument, 'uniform', to
> > mtrr_type_lookup(), which returns 1 when a given range is
> > covered uniformly by MTRRs, i.e. the range is fully covered
> > by a single MTRR entry or the default type.
> > 
> > pud_set_huge() and pmd_set_huge() are changed to check the
> > new 'uniform' flag to see if it is safe to create a huge page
> > mapping to the range.  This allows them to create a huge page
> > mapping to a range covered by a single MTRR entry of any
> > memory type.  It also detects a non-optimal request properly.
> > They continue to check with the WB type since the WB type has
> > no effect even if a request spans multiple MTRR entries.
> > 
> > pmd_set_huge() logs a warning message to a non-optimal request
> > so that driver writers will be aware of such a case.  Drivers
> > should make a mapping request aligned to a single MTRR entry
> > when the range is covered by MTRRs.
> > 
> > Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> > ---
> >  arch/x86/include/asm/mtrr.h        |    4 ++--
> >  arch/x86/kernel/cpu/mtrr/generic.c |   37 ++++++++++++++++++++++++++----------
> >  arch/x86/mm/pat.c                  |    4 ++--
> >  arch/x86/mm/pgtable.c              |   33 ++++++++++++++++++++------------
> >  4 files changed, 52 insertions(+), 26 deletions(-)
 :
> 
> All applied, 

Great!

> I reformatted the comments in this last one a bit and made
> the warning message hopefully a bit more descriptive:

I have a few comments below.

> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index c30f9819786b..f1894daa79ee 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
>  /**
>   * pud_set_huge - setup kernel PUD mapping
>   *
> - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> - * this function does not set up a huge page when the range is covered
> - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> - * disabled.
> + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> + * this function sets up a huge page only if all of the following
> + * conditions are met:

It should be "if any of the following condition is met".  Or, does NOT
setup if all of ...

> + *
> + *  - MTRRs are disabled.
> + *  - The range is mapped uniformly by an MTRR, i.e. the range is
> + *    fully covered by a single MTRR entry or the default type.
> + *  - The MTRR memory type is WB.
>   *
>   * Returns 1 on success and 0 on failure.
>   */
>  int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
>  {
> -	u8 mtrr;
> +	u8 mtrr, uniform;
>  
> -	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
> -	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
> +	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
> +	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> +	    (mtrr != MTRR_TYPE_WRBACK))
>  		return 0;
>  
>  	prot = pgprot_4k_2_large(prot);
> @@ -593,20 +598,28 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
>  /**
>   * pmd_set_huge - setup kernel PMD mapping
>   *
> - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> - * this function does not set up a huge page when the range is covered
> - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> - * disabled.
> + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> + * this function sets up a huge page only if all of the following
> + * conditions are met:

Ditto.

> + *
> + *  - MTRR is disabled.
> + *  - The range is mapped uniformly by an MTRR, i.e. the range is
> + *    fully covered by a single MTRR entry or the default type.
> + *  - The MTRR memory type is WB.
>   *
>   * Returns 1 on success and 0 on failure.
>   */
>  int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
>  {
> -	u8 mtrr;
> +	u8 mtrr, uniform;
>  
> -	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
> -	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
> +	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
> +	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> +	    (mtrr != MTRR_TYPE_WRBACK)) {
> +		pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
> +			     __func__, addr, addr + PMD_SIZE);

This new message looks good.

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 17:22     ` Toshi Kani
@ 2015-05-18 19:01       ` Borislav Petkov
  2015-05-18 19:31         ` Toshi Kani
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-05-18 19:01 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, May 18, 2015 at 11:22:39AM -0600, Toshi Kani wrote:
> > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> > index c30f9819786b..f1894daa79ee 100644
> > --- a/arch/x86/mm/pgtable.c
> > +++ b/arch/x86/mm/pgtable.c
> > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> >  /**
> >   * pud_set_huge - setup kernel PUD mapping
> >   *
> > - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> > - * this function does not set up a huge page when the range is covered
> > - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> > - * disabled.
> > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> > + * this function sets up a huge page only if all of the following
> > + * conditions are met:
> 
> It should be "if any of the following condition is met".  Or, does NOT
> setup if all of ...
> 
> > + *
> > + *  - MTRRs are disabled.
> > + *  - The range is mapped uniformly by an MTRR, i.e. the range is
> > + *    fully covered by a single MTRR entry or the default type.
> > + *  - The MTRR memory type is WB.

Hmm, ok, so this is kinda like "any" but they also depend on each other.
So it is

If
	- MTRRs are disabled

	or

	- MTRRs are enabled and the range is completely covered by a single MTRR

	or

	 - MTRRs are enabled and the range is not completely covered by a
	 single MTRR but the memory type of the range is WB, even if covered by
	 multiple MTRRs.

Right?

So tell me this: why do we need to repeat that over those KVA helpers?
It's not like the callers can do anything about it, can they?

So maybe that comment - expanded into more detail - should be over
mtrr_type_lookup() only. That'll be better, methinks.

Hmm.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 19:01       ` Borislav Petkov
@ 2015-05-18 19:31         ` Toshi Kani
  2015-05-18 20:01           ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-18 19:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, 2015-05-18 at 21:01 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 11:22:39AM -0600, Toshi Kani wrote:
> > > diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> > > index c30f9819786b..f1894daa79ee 100644
> > > --- a/arch/x86/mm/pgtable.c
> > > +++ b/arch/x86/mm/pgtable.c
> > > @@ -566,19 +566,24 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > >  /**
> > >   * pud_set_huge - setup kernel PUD mapping
> > >   *
> > > - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> > > - * this function does not set up a huge page when the range is covered
> > > - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> > > - * disabled.
> > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore,
> > > + * this function sets up a huge page only if all of the following
> > > + * conditions are met:
> > 
> > It should be "if any of the following condition is met".  Or, does NOT
> > setup if all of ...
> > 
> > > + *
> > > + *  - MTRRs are disabled.
> > > + *  - The range is mapped uniformly by an MTRR, i.e. the range is
> > > + *    fully covered by a single MTRR entry or the default type.
> > > + *  - The MTRR memory type is WB.
> 
> Hmm, ok, so this is kinda like "any" but they also depend on each other.
> So it is
> 
> If
> 	- MTRRs are disabled
> 
> 	or
> 
> 	- MTRRs are enabled and the range is completely covered by a single MTRR
> 
> 	or
> 
> 	 - MTRRs are enabled and the range is not completely covered by a
> 	 single MTRR but the memory type of the range is WB, even if covered by
> 	 multiple MTRRs.
> 
> Right?

Well, #2 and #3 are independent. That is, uniform can be set regardless
of a type value, and WB can be returned regardless of a uniform value.  

#1 is a new condition added per your comment that uniform no longer
covers the MTRR disabled case.  Yes, #2 and #3 depend on #1 being false.

> So tell me this: why do we need to repeat that over those KVA helpers?
> It's not like the callers can do anything about it, can they?
>
> So maybe that comment - expanded into more detail - should be over
> mtrr_type_lookup() only. That'll be better, methinks.

The caller is responsible for verifying the conditions that are safe to
create huge page.  So, I think the comments are needed here to state
such conditions.

Thanks,
-Toshi

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 19:31         ` Toshi Kani
@ 2015-05-18 20:01           ` Borislav Petkov
  2015-05-18 20:21             ` Toshi Kani
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-05-18 20:01 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, May 18, 2015 at 01:31:59PM -0600, Toshi Kani wrote:
> Well, #2 and #3 are independent. That is, uniform can be set regardless

Not #2 and #3 above - the original #2 and #3 ones. I've written them out
detailed to show what I mean.

> The caller is responsible for verifying the conditions that are safe to
> create huge page.

How is the caller ever going to be able to do anything about it?

Regardless, I'd prefer to not duplicate comments and rather put a short
sentence pointing the reader to the comments over mtrr_type_lookup()
where this all is being explained in detail.

I'll fix it up.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 20:01           ` Borislav Petkov
@ 2015-05-18 20:21             ` Toshi Kani
  2015-05-18 20:51               ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-18 20:21 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, 2015-05-18 at 22:01 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 01:31:59PM -0600, Toshi Kani wrote:
> > Well, #2 and #3 are independent. That is, uniform can be set regardless
> 
> Not #2 and #3 above - the original #2 and #3 ones. I've written them out
> detailed to show what I mean.

The original #2 and #3 are set independently as well. They do not depend
on each other condition being a specific value.

> > The caller is responsible for verifying the conditions that are safe to
> > create huge page.
> 
> How is the caller ever going to be able to do anything about it?

The caller is the one who makes the condition checks necessary to create
a huge page mapping.  mtrr_type_look() only returns how the given range
is related with MTRRs.

> Regardless, I'd prefer to not duplicate comments and rather put a short
> sentence pointing the reader to the comments over mtrr_type_lookup()
> where this all is being explained in detail.
> 
> I'll fix it up.

I appreciate your help.

Thanks,
-Toshi


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 20:21             ` Toshi Kani
@ 2015-05-18 20:51               ` Borislav Petkov
  2015-05-18 21:53                 ` Toshi Kani
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-05-18 20:51 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> The caller is the one who makes the condition checks necessary to create
> a huge page mapping.

How? It would go and change MTRRs configuration and ranges and their
memory types so that a huge mapping succeeds?

Or go and try a different range?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 20:51               ` Borislav Petkov
@ 2015-05-18 21:53                 ` Toshi Kani
  2015-05-19 11:44                   ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-18 21:53 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, 2015-05-18 at 22:51 +0200, Borislav Petkov wrote:
> On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> > The caller is the one who makes the condition checks necessary to create
> > a huge page mapping.
> 
> How? It would go and change MTRRs configuration and ranges and their
> memory types so that a huge mapping succeeds?
> 
> Or go and try a different range?

Try with a smaller page size.

The callers, pud_set_huge() and pmd_set_huge(), check if the given range
is safe with MTRRs for creating a huge page mapping.  If not, they fail
the request, which leads their callers, ioremap_pud_range() and
ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
-> 4KB.  4KB may not have overlap with MTRRs (hence no checking is
necessary), which will succeed as before.

Thanks,
-Toshi




--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-18 21:53                 ` Toshi Kani
@ 2015-05-19 11:44                   ` Borislav Petkov
  2015-05-19 13:23                     ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-05-19 11:44 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Mon, May 18, 2015 at 03:53:14PM -0600, Toshi Kani wrote:
> On Mon, 2015-05-18 at 22:51 +0200, Borislav Petkov wrote:
> > On Mon, May 18, 2015 at 02:21:08PM -0600, Toshi Kani wrote:
> > > The caller is the one who makes the condition checks necessary to create
> > > a huge page mapping.
> > 
> > How? It would go and change MTRRs configuration and ranges and their
> > memory types so that a huge mapping succeeds?
> > 
> > Or go and try a different range?
> 
> Try with a smaller page size.
> 
> The callers, pud_set_huge() and pmd_set_huge(), check if the given range
> is safe with MTRRs for creating a huge page mapping.  If not, they fail
> the request, which leads their callers, ioremap_pud_range() and
> ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
> -> 4KB.  4KB may not have overlap with MTRRs (hence no checking is
> necessary), which will succeed as before.

Ok, now *this* should be in the form of a comment over the KVA helpers,
not the MTRR aspect. Callers of those functions would have to know that
- they shouldn't care about MTRR setup.

The MTRR aspect with the 3 conditions should be only over
mtrr_type_lookup().

I'll integrate it into the patch.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-19 11:44                   ` Borislav Petkov
@ 2015-05-19 13:23                     ` Borislav Petkov
  2015-05-19 13:47                       ` Toshi Kani
  2015-05-20 11:55                       ` Ingo Molnar
  0 siblings, 2 replies; 40+ messages in thread
From: Borislav Petkov @ 2015-05-19 13:23 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Tue, May 19, 2015 at 01:44:37PM +0200, Borislav Petkov wrote:
> > Try with a smaller page size.
> > 
> > The callers, pud_set_huge() and pmd_set_huge(), check if the given range
> > is safe with MTRRs for creating a huge page mapping.  If not, they fail
> > the request, which leads their callers, ioremap_pud_range() and
> > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
> > -> 4KB.  4KB may not have overlap with MTRRs (hence no checking is
> > necessary), which will succeed as before.

Scratch that, I think I have it now. And I even have a good feeling
about it :-)

---
From: Toshi Kani <toshi.kani@hp.com>
Date: Fri, 15 May 2015 12:23:57 -0600
Subject: [PATCH] x86/mm: Enhance MTRR checks in kernel mapping helpers

This patch adds the argument 'uniform' to mtrr_type_lookup(), which gets
set to 1 when a given range is covered uniformly by MTRRs, i.e. the
range is fully covered by a single MTRR entry or the default type.

Change pud_set_huge() and pmd_set_huge() to honor the 'uniform' flag to
see if it is safe to create a huge page mapping in the range.

This allows them to create a huge page mapping in a range covered by
a single MTRR entry of any memory type. It also detects a non-optimal
request properly. They continue to check with the WB type since it does
not effectively change the uniform mapping even if a request spans
multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request so that
driver writers will be aware of such a case. Drivers should make a
mapping request aligned to a single MTRR entry when the range is covered
by MTRRs.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Cc: dave.hansen@intel.com
Cc: Elliott@hp.com
Cc: pebolle@tiscali.nl
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-mm <linux-mm@kvack.org>
Cc: x86-ml <x86@kernel.org>
Cc: lkml <linux-kernel@vger.kernel.org>
Link: http://lkml.kernel.org/r/1431714237-880-7-git-send-email-toshi.kani@hp.com
[ Realign, flesh out comments, improve warning message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 arch/x86/include/asm/mtrr.h        |  4 ++--
 arch/x86/kernel/cpu/mtrr/generic.c | 40 ++++++++++++++++++++++++++++----------
 arch/x86/mm/pat.c                  |  4 ++--
 arch/x86/mm/pgtable.c              | 38 +++++++++++++++++++++++-------------
 4 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a547c1ab..a31759e1edd9 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
  * arch_phys_wc_add and arch_phys_wc_del.
  */
 # ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
 extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
 extern int phys_wc_to_mtrr_index(int handle);
 #  else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
 {
 	/*
 	 * Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e51100c49eea..f782d9b62cb3 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
  * Return Value:
  * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
  *
- * Output Argument:
+ * Output Arguments:
  * repeat - Set to 1 when [start:end] spanned across MTRR range and type
  *	    returned corresponds only to [start:*partial_end].  Caller has
  *	    to lookup again for [*partial_end:end].
+ *
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ *	     region is fully covered by a single MTRR entry or the default
+ *	     type.
  */
 static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
-				    int *repeat)
+				    int *repeat, u8 *uniform)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
+	*uniform = 1;
 
 	/* Make end inclusive instead of exclusive */
 	end--;
@@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 
 			end = *partial_end - 1; /* end is inclusive */
 			*repeat = 1;
+			*uniform = 0;
 		}
 
 		if ((start & mask) != (base & mask))
@@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 			continue;
 		}
 
+		*uniform = 0;
 		if (check_type_overlap(&prev_match, &curr_match))
 			return curr_match;
 	}
@@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
  * Return Values:
  * MTRR_TYPE_(type)  - The effective MTRR type for the region
  * MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ *	     region is fully covered by a single MTRR entry or the default
+ *	     type.
  */
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
 {
-	u8 type, prev_type;
+	u8 type, prev_type, is_uniform = 1, dummy;
 	int repeat;
 	u64 partial_end;
 
@@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	 */
 	if ((start < 0x100000) &&
 	    (mtrr_state.have_fixed) &&
-	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
-		return mtrr_type_lookup_fixed(start, end);
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+		is_uniform = 0;
+		type = mtrr_type_lookup_fixed(start, end);
+		goto out;
+	}
 
 	/*
 	 * Look up the variable ranges.  Look of multiple ranges matching
 	 * this address and pick type as per MTRR precedence.
 	 */
-	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+	type = mtrr_type_lookup_variable(start, end, &partial_end,
+					 &repeat, &is_uniform);
 
 	/*
 	 * Common path is with repeat = 0.
@@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
-		type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+		is_uniform = 0;
+		type = mtrr_type_lookup_variable(start, end, &partial_end,
+						 &repeat, &dummy);
 
 		if (check_type_overlap(&prev_type, &type))
-			return type;
+			goto out;
 	}
 
 	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
-		return MTRR_TYPE_WRBACK;
+		type = MTRR_TYPE_WRBACK;
 
+out:
+	*uniform = is_uniform;
 	return type;
 }
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af6771a95a..372ad422c2c3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
 	 * request is for WB.
 	 */
 	if (req_type == _PAGE_CACHE_MODE_WB) {
-		u8 mtrr_type;
+		u8 mtrr_type, uniform;
 
-		mtrr_type = mtrr_type_lookup(start, end);
+		mtrr_type = mtrr_type_lookup(start, end, &uniform);
 		if (mtrr_type != MTRR_TYPE_WRBACK)
 			return _PAGE_CACHE_MODE_UC_MINUS;
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f9819786b..df2f8a587438 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
 /**
  * pud_set_huge - setup kernel PUD mapping
  *
- * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
+ * function sets up a huge page only if any of the following conditions are met:
+ *
+ * - MTRRs are disabled, or
+ *
+ * - MTRRs are enabled and the range is completely covered by a single MTRR, or
+ *
+ * - MTRRs are enabled and the range is not completely covered by a single MTRR
+ *   but the memory type of the range is WB, even if covered by multiple MTRRs.
+ *
+ * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
+ * page mapping attempt fails.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -593,20 +602,21 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 /**
  * pmd_set_huge - setup kernel PMD mapping
  *
- * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * See text over pud_set_huge() above.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK)) {
+		pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
+			     __func__, addr, addr + PMD_SIZE);
 		return 0;
+	}
 
 	prot = pgprot_4k_2_large(prot);
 
-- 
2.3.5

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-19 13:23                     ` Borislav Petkov
@ 2015-05-19 13:47                       ` Toshi Kani
  2015-05-20 11:55                       ` Ingo Molnar
  1 sibling, 0 replies; 40+ messages in thread
From: Toshi Kani @ 2015-05-19 13:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel, dave.hansen,
	Elliott, pebolle, mcgrof

On Tue, 2015-05-19 at 15:23 +0200, Borislav Petkov wrote:
> On Tue, May 19, 2015 at 01:44:37PM +0200, Borislav Petkov wrote:
> > > Try with a smaller page size.
> > > 
> > > The callers, pud_set_huge() and pmd_set_huge(), check if the given range
> > > is safe with MTRRs for creating a huge page mapping.  If not, they fail
> > > the request, which leads their callers, ioremap_pud_range() and
> > > ioremap_pmd_range(), to retry with a smaller page size, i.e. 1GB -> 2MB
> > > -> 4KB.  4KB may not have overlap with MTRRs (hence no checking is
> > > necessary), which will succeed as before.
> 
> Scratch that, I think I have it now. And I even have a good feeling
> about it :-)

Looks good. Thanks for the update!
-Toshi


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-19 13:23                     ` Borislav Petkov
  2015-05-19 13:47                       ` Toshi Kani
@ 2015-05-20 11:55                       ` Ingo Molnar
  2015-05-20 14:34                         ` Toshi Kani
  1 sibling, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2015-05-20 11:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Toshi Kani, akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle, mcgrof


* Borislav Petkov <bp@alien8.de> wrote:

> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
>  /**
>   * pud_set_huge - setup kernel PUD mapping
>   *
> - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> - * this function does not set up a huge page when the range is covered
> - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> - * disabled.
> + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> + * function sets up a huge page only if any of the following conditions are met:
> + *
> + * - MTRRs are disabled, or
> + *
> + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> + *
> + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> + *   but the memory type of the range is WB, even if covered by multiple MTRRs.
> + *
> + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> + * page mapping attempt fails.

This comment should explain why it's ok in the WB case.

Also, the phrase 'the memory type of the range' is ambiguous: it might 
mean the partial MTRR's, or the memory type specified via PAT by the 
huge-pmd entry.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-20 11:55                       ` Ingo Molnar
@ 2015-05-20 14:34                         ` Toshi Kani
  2015-05-20 15:01                           ` Ingo Molnar
  0 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-20 14:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Borislav Petkov, akpm, hpa, tglx, mingo, linux-mm, x86,
	linux-kernel, dave.hansen, Elliott, pebolle, mcgrof

On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote:
> * Borislav Petkov <bp@alien8.de> wrote:
> 
> > --- a/arch/x86/mm/pgtable.c
> > +++ b/arch/x86/mm/pgtable.c
> > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> >  /**
> >   * pud_set_huge - setup kernel PUD mapping
> >   *
> > - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> > - * this function does not set up a huge page when the range is covered
> > - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> > - * disabled.
> > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > + * function sets up a huge page only if any of the following conditions are met:
> > + *
> > + * - MTRRs are disabled, or
> > + *
> > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > + *
> > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > + *   but the memory type of the range is WB, even if covered by multiple MTRRs.
> > + *
> > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > + * page mapping attempt fails.
> 
> This comment should explain why it's ok in the WB case.
> 
> Also, the phrase 'the memory type of the range' is ambiguous: it might 
> mean the partial MTRR's, or the memory type specified via PAT by the 
> huge-pmd entry.

Agreed.  How about this sentence?

 - MTRRs are enabled and the corresponding MTRR memory type is WB, which
has no effect to the requested PAT memory type.

Thanks,
-Toshi


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-20 14:34                         ` Toshi Kani
@ 2015-05-20 15:01                           ` Ingo Molnar
  2015-05-20 15:02                             ` Toshi Kani
  0 siblings, 1 reply; 40+ messages in thread
From: Ingo Molnar @ 2015-05-20 15:01 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Borislav Petkov, akpm, hpa, tglx, mingo, linux-mm, x86,
	linux-kernel, dave.hansen, Elliott, pebolle, mcgrof


* Toshi Kani <toshi.kani@hp.com> wrote:

> On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote:
> > * Borislav Petkov <bp@alien8.de> wrote:
> > 
> > > --- a/arch/x86/mm/pgtable.c
> > > +++ b/arch/x86/mm/pgtable.c
> > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > >  /**
> > >   * pud_set_huge - setup kernel PUD mapping
> > >   *
> > > - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> > > - * this function does not set up a huge page when the range is covered
> > > - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> > > - * disabled.
> > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > > + * function sets up a huge page only if any of the following conditions are met:
> > > + *
> > > + * - MTRRs are disabled, or
> > > + *
> > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > > + *
> > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > > + *   but the memory type of the range is WB, even if covered by multiple MTRRs.
> > > + *
> > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > > + * page mapping attempt fails.
> > 
> > This comment should explain why it's ok in the WB case.
> > 
> > Also, the phrase 'the memory type of the range' is ambiguous: it might 
> > mean the partial MTRR's, or the memory type specified via PAT by the 
> > huge-pmd entry.
> 
> Agreed.  How about this sentence?
> 
>  - MTRRs are enabled and the corresponding MTRR memory type is WB, which
> has no effect to the requested PAT memory type.

s/effect to/effect on

sounds good otherwise!

Btw., if WB MTRR entries can never have an effect on Linux PAT 
specified attributes, why do we allow them to be created? I don't 
think we ever call into real mode for this to matter?

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-20 15:01                           ` Ingo Molnar
@ 2015-05-20 15:02                             ` Toshi Kani
  2015-05-20 16:04                               ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Toshi Kani @ 2015-05-20 15:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Borislav Petkov, akpm, hpa, tglx, mingo, linux-mm, x86,
	linux-kernel, dave.hansen, Elliott, pebolle, mcgrof

On Wed, 2015-05-20 at 17:01 +0200, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > On Wed, 2015-05-20 at 13:55 +0200, Ingo Molnar wrote:
> > > * Borislav Petkov <bp@alien8.de> wrote:
> > > 
> > > > --- a/arch/x86/mm/pgtable.c
> > > > +++ b/arch/x86/mm/pgtable.c
> > > > @@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
> > > >  /**
> > > >   * pud_set_huge - setup kernel PUD mapping
> > > >   *
> > > > - * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
> > > > - * this function does not set up a huge page when the range is covered
> > > > - * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
> > > > - * disabled.
> > > > + * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
> > > > + * function sets up a huge page only if any of the following conditions are met:
> > > > + *
> > > > + * - MTRRs are disabled, or
> > > > + *
> > > > + * - MTRRs are enabled and the range is completely covered by a single MTRR, or
> > > > + *
> > > > + * - MTRRs are enabled and the range is not completely covered by a single MTRR
> > > > + *   but the memory type of the range is WB, even if covered by multiple MTRRs.
> > > > + *
> > > > + * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
> > > > + * page mapping attempt fails.
> > > 
> > > This comment should explain why it's ok in the WB case.
> > > 
> > > Also, the phrase 'the memory type of the range' is ambiguous: it might 
> > > mean the partial MTRR's, or the memory type specified via PAT by the 
> > > huge-pmd entry.
> > 
> > Agreed.  How about this sentence?
> > 
> >  - MTRRs are enabled and the corresponding MTRR memory type is WB, which
> > has no effect to the requested PAT memory type.
> 
> s/effect to/effect on
> 
> sounds good otherwise!

Great!

Boris, can you update the patch, or do you want me to send you a patch
for this update?

> Btw., if WB MTRR entries can never have an effect on Linux PAT 
> specified attributes, why do we allow them to be created? I don't 
> think we ever call into real mode for this to matter?

MTRRs have the default memory type, which is used when the given range
is not covered by any MTRR entries.  There are two types of BIOS setup:

1) Default UC
 - BIOS sets the default type to UC, and covers all WB accessible ranges
with MTRR entries of WB.

2) Default WB
 - BIOS sets the default type to WB, and covers non-WB accessible range
with MTRR entries of other memory types, such as UC.

In both cases, WB type can be returned.  In case of 1), the requested
range may overlap with multiple MTRR entries of WB type, which is still
safe.

Thanks,
-Toshi


Thanks,
-Toshi



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-20 16:04                               ` Borislav Petkov
@ 2015-05-20 15:46                                 ` Toshi Kani
  0 siblings, 0 replies; 40+ messages in thread
From: Toshi Kani @ 2015-05-20 15:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle, mcgrof

On Wed, 2015-05-20 at 18:04 +0200, Borislav Petkov wrote:
> On Wed, May 20, 2015 at 09:02:23AM -0600, Toshi Kani wrote:
> > Boris, can you update the patch,
> 
> Done.

Thanks!
-Toshi


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-05-20 15:02                             ` Toshi Kani
@ 2015-05-20 16:04                               ` Borislav Petkov
  2015-05-20 15:46                                 ` Toshi Kani
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-05-20 16:04 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Ingo Molnar, akpm, hpa, tglx, mingo, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle, mcgrof

On Wed, May 20, 2015 at 09:02:23AM -0600, Toshi Kani wrote:
> Boris, can you update the patch,

Done.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [tip:x86/mm] x86/mm/kconfig: Simplify conditions for HAVE_ARCH_HUGE_VMAP
       [not found]   ` <1432628901-18044-2-git-send-email-bp@alien8.de>
@ 2015-05-27 14:17     ` tip-bot for Toshi Kani
  0 siblings, 0 replies; 40+ messages in thread
From: tip-bot for Toshi Kani @ 2015-05-27 14:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, torvalds, hpa, toshi.kani, bp, akpm, tglx, linux-mm,
	linux-kernel, bp, mingo, peterz, mcgrof, brgerst, luto

Commit-ID:  10455f64aff0d715dcdfb09b02393df168fe267e
Gitweb:     http://git.kernel.org/tip/10455f64aff0d715dcdfb09b02393df168fe267e
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Tue, 26 May 2015 10:28:04 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 14:40:55 +0200

x86/mm/kconfig: Simplify conditions for HAVE_ARCH_HUGE_VMAP

Simplify the conditions selecting HAVE_ARCH_HUGE_VMAP since
X86_PAE depends on X86_32 already.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: pebolle@tiscali.nl
Link: http://lkml.kernel.org/r/1431714237-880-2-git-send-email-toshi.kani@hp.com
Link: http://lkml.kernel.org/r/1432628901-18044-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 226d569..4eb0b0f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -100,7 +100,7 @@ config X86
 	select IRQ_FORCED_THREADING
 	select HAVE_BPF_JIT if X86_64
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
-	select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE)
+	select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE
 	select ARCH_HAS_SG_CHAIN
 	select CLKEVT_I8253
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [tip:x86/mm] x86/mm/mtrr: Fix MTRR lookup to handle an inclusive entry
       [not found]   ` <1432628901-18044-3-git-send-email-bp@alien8.de>
@ 2015-05-27 14:18     ` tip-bot for Toshi Kani
  0 siblings, 0 replies; 40+ messages in thread
From: tip-bot for Toshi Kani @ 2015-05-27 14:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, tglx, dvlasenk, peterz, bp, luto, torvalds, toshi.kani,
	akpm, mcgrof, hpa, brgerst, linux-mm, bp, linux-kernel

Commit-ID:  7f0431e3dc8953f41e9433581c1fdd7ee45860b0
Gitweb:     http://git.kernel.org/tip/7f0431e3dc8953f41e9433581c1fdd7ee45860b0
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Tue, 26 May 2015 10:28:05 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 14:40:56 +0200

x86/mm/mtrr: Fix MTRR lookup to handle an inclusive entry

When an MTRR entry is inclusive to a requested range, i.e. the
start and end of the request are not within the MTRR entry range
but the range contains the MTRR entry entirely:

  range_start ... [mtrr_start ... mtrr_end] ... range_end

__mtrr_type_lookup() ignores such a case because both
start_state and end_state are set to zero.

This bug can cause the following issues:

1) reserve_memtype() tracks an effective memory type in case
   a request type is WB (ex. /dev/mem blindly uses WB). Missing
   to track with its effective type causes a subsequent request
   to map the same range with the effective type to fail.

2) pud_set_huge() and pmd_set_huge() check if a requested range
   has any overlap with MTRRs. Missing to detect an overlap may
   cause a performance penalty or undefined behavior.

This patch fixes the bug by adding a new flag, 'inclusive',
to detect the inclusive case.  This case is then handled in
the same way as end_state:1 since the first region is the same.
With this fix, __mtrr_type_lookup() handles the inclusive case
properly.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: pebolle@tiscali.nl
Link: http://lkml.kernel.org/r/1431714237-880-3-git-send-email-toshi.kani@hp.com
Link: http://lkml.kernel.org/r/1432628901-18044-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/mtrr/generic.c | 28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 5b23967..e202d26 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 	prev_match = 0xFF;
 	for (i = 0; i < num_var_ranges; ++i) {
-		unsigned short start_state, end_state;
+		unsigned short start_state, end_state, inclusive;
 
 		if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11)))
 			continue;
@@ -166,19 +166,27 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 		start_state = ((start & mask) == (base & mask));
 		end_state = ((end & mask) == (base & mask));
+		inclusive = ((start < base) && (end > base));
 
-		if (start_state != end_state) {
+		if ((start_state != end_state) || inclusive) {
 			/*
 			 * We have start:end spanning across an MTRR.
-			 * We split the region into
-			 * either
-			 * (start:mtrr_end) (mtrr_end:end)
-			 * or
-			 * (start:mtrr_start) (mtrr_start:end)
+			 * We split the region into either
+			 *
+			 * - start_state:1
+			 * (start:mtrr_end)(mtrr_end:end)
+			 * - end_state:1
+			 * (start:mtrr_start)(mtrr_start:end)
+			 * - inclusive:1
+			 * (start:mtrr_start)(mtrr_start:mtrr_end)(mtrr_end:end)
+			 *
 			 * depending on kind of overlap.
-			 * Return the type for first region and a pointer to
-			 * the start of second region so that caller will
-			 * lookup again on the second region.
+			 *
+			 * Return the type of the first region and a pointer
+			 * to the start of next region so that caller will be
+			 * advised to lookup again after having adjusted start
+			 * and end.
+			 *
 			 * Note: This way we handle multiple overlaps as well.
 			 */
 			if (start_state)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [tip:x86/mm] x86/mm/mtrr: Fix MTRR state checks in mtrr_type_lookup()
       [not found]   ` <1432628901-18044-4-git-send-email-bp@alien8.de>
@ 2015-05-27 14:18     ` tip-bot for Toshi Kani
  0 siblings, 0 replies; 40+ messages in thread
From: tip-bot for Toshi Kani @ 2015-05-27 14:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, dvlasenk, linux-kernel, toshi.kani, peterz, bp, mcgrof,
	mingo, brgerst, torvalds, luto, akpm, tglx, linux-mm, bp

Commit-ID:  9b3aca620883fc06636737c82a4d024b22182281
Gitweb:     http://git.kernel.org/tip/9b3aca620883fc06636737c82a4d024b22182281
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Tue, 26 May 2015 10:28:06 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 14:40:56 +0200

x86/mm/mtrr: Fix MTRR state checks in mtrr_type_lookup()

'mtrr_state.enabled' contains the FE (fixed MTRRs enabled)
and E (MTRRs enabled) flags in MSR_MTRRdefType.  Intel SDM,
section 11.11.2.1, defines these flags as follows:

 - All MTRRs are disabled when the E flag is clear.
   The FE flag has no affect when the E flag is clear.
 - The default type is enabled when the E flag is set.
 - MTRR variable ranges are enabled when the E flag is set.
 - MTRR fixed ranges are enabled when both E and FE flags
   are set.

MTRR state checks in __mtrr_type_lookup() do not match with SDM.

Hence, this patch makes the following changes:
 - The current code detects MTRRs disabled when both E and
   FE flags are clear in mtrr_state.enabled.  Fix to detect
   MTRRs disabled when the E flag is clear.
 - The current code does not check if the FE bit is set in
   mtrr_state.enabled when looking at the fixed entries.
   Fix to check the FE flag.
 - The current code returns the default type when the E flag
   is clear in mtrr_state.enabled. However, the default type
   is UC when the E flag is clear.  Remove the code as this
   case is handled as MTRR disabled with the 1st change.

In addition, this patch defines the E and FE flags in
mtrr_state.enabled as follows.
 - FE flag: MTRR_STATE_MTRR_FIXED_ENABLED
 - E  flag: MTRR_STATE_MTRR_ENABLED

print_mtrr_state() and x86_get_mtrr_mem_range() are also updated
accordingly.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: pebolle@tiscali.nl
Link: http://lkml.kernel.org/r/1431714237-880-4-git-send-email-toshi.kani@hp.com
Link: http://lkml.kernel.org/r/1432628901-18044-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/mtrr.h        |  4 ++++
 arch/x86/kernel/cpu/mtrr/cleanup.c |  3 ++-
 arch/x86/kernel/cpu/mtrr/generic.c | 15 ++++++++-------
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index f768f62..ef92794 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -127,4 +127,8 @@ struct mtrr_gentry32 {
 				 _IOW(MTRR_IOCTL_BASE,  9, struct mtrr_sentry32)
 #endif /* CONFIG_COMPAT */
 
+/* Bit fields for enabled in struct mtrr_state_type */
+#define MTRR_STATE_MTRR_FIXED_ENABLED	0x01
+#define MTRR_STATE_MTRR_ENABLED		0x02
+
 #endif /* _ASM_X86_MTRR_H */
diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c
index 5f90b85..70d7c93 100644
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -98,7 +98,8 @@ x86_get_mtrr_mem_range(struct range *range, int nr_range,
 			continue;
 		base = range_state[i].base_pfn;
 		if (base < (1<<(20-PAGE_SHIFT)) && mtrr_state.have_fixed &&
-		    (mtrr_state.enabled & 1)) {
+		    (mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
+		    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
 			/* Var MTRR contains UC entry below 1M? Skip it: */
 			printk(BIOS_BUG_MSG, i);
 			if (base + size <= (1<<(20-PAGE_SHIFT)))
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e202d26..b0599db 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -119,14 +119,16 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	if (!mtrr_state_set)
 		return 0xFF;
 
-	if (!mtrr_state.enabled)
+	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
 		return 0xFF;
 
 	/* Make end inclusive end, instead of exclusive */
 	end--;
 
 	/* Look in fixed ranges. Just return the type as per start */
-	if (mtrr_state.have_fixed && (start < 0x100000)) {
+	if ((start < 0x100000) &&
+	    (mtrr_state.have_fixed) &&
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
 		int idx;
 
 		if (start < 0x80000) {
@@ -149,9 +151,6 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	 * Look of multiple ranges matching this address and pick type
 	 * as per MTRR precedence
 	 */
-	if (!(mtrr_state.enabled & 2))
-		return mtrr_state.def_type;
-
 	prev_match = 0xFF;
 	for (i = 0; i < num_var_ranges; ++i) {
 		unsigned short start_state, end_state, inclusive;
@@ -355,7 +354,9 @@ static void __init print_mtrr_state(void)
 		 mtrr_attrib_to_str(mtrr_state.def_type));
 	if (mtrr_state.have_fixed) {
 		pr_debug("MTRR fixed ranges %sabled:\n",
-			 mtrr_state.enabled & 1 ? "en" : "dis");
+			((mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED) &&
+			 (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) ?
+			 "en" : "dis");
 		print_fixed(0x00000, 0x10000, mtrr_state.fixed_ranges + 0);
 		for (i = 0; i < 2; ++i)
 			print_fixed(0x80000 + i * 0x20000, 0x04000,
@@ -368,7 +369,7 @@ static void __init print_mtrr_state(void)
 		print_fixed_last();
 	}
 	pr_debug("MTRR variable ranges %sabled:\n",
-		 mtrr_state.enabled & 2 ? "en" : "dis");
+		 mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED ? "en" : "dis");
 	high_width = (__ffs64(size_or_mask) - (32 - PAGE_SHIFT) + 3) / 4;
 
 	for (i = 0; i < num_var_ranges; ++i) {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [tip:x86/mm] x86/mm/mtrr: Use symbolic define as a retval for disabled MTRRs
       [not found]   ` <1432628901-18044-5-git-send-email-bp@alien8.de>
@ 2015-05-27 14:18     ` tip-bot for Toshi Kani
  0 siblings, 0 replies; 40+ messages in thread
From: tip-bot for Toshi Kani @ 2015-05-27 14:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bp, tglx, linux-mm, mcgrof, hpa, akpm, torvalds, linux-kernel,
	brgerst, mingo, toshi.kani, dvlasenk, peterz, bp, luto

Commit-ID:  3d3ca416d9b0784cfcf244eeeba1bcaf421bc64d
Gitweb:     http://git.kernel.org/tip/3d3ca416d9b0784cfcf244eeeba1bcaf421bc64d
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Tue, 26 May 2015 10:28:07 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 14:40:57 +0200

x86/mm/mtrr: Use symbolic define as a retval for disabled MTRRs

mtrr_type_lookup() returns verbatim 0xFF when MTRRs are
disabled. This patch defines MTRR_TYPE_INVALID to clarify the
meaning of this value, and documents its usage.

Document the return values of the kernel virtual address mapping
helpers pud_set_huge(), pmd_set_huge, pud_clear_huge() and
pmd_clear_huge().

There is no functional change in this patch.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: pebolle@tiscali.nl
Link: http://lkml.kernel.org/r/1431714237-880-5-git-send-email-toshi.kani@hp.com
Link: http://lkml.kernel.org/r/1432628901-18044-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/mtrr.h        |  2 +-
 arch/x86/include/uapi/asm/mtrr.h   |  8 +++++++-
 arch/x86/kernel/cpu/mtrr/generic.c | 14 ++++++-------
 arch/x86/mm/pgtable.c              | 42 +++++++++++++++++++++++++++++---------
 4 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index ef92794..bb03a54 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -55,7 +55,7 @@ static inline u8 mtrr_type_lookup(u64 addr, u64 end)
 	/*
 	 * Return no-MTRRs:
 	 */
-	return 0xff;
+	return MTRR_TYPE_INVALID;
 }
 #define mtrr_save_fixed_ranges(arg) do {} while (0)
 #define mtrr_save_state() do {} while (0)
diff --git a/arch/x86/include/uapi/asm/mtrr.h b/arch/x86/include/uapi/asm/mtrr.h
index d0acb65..7528dcf 100644
--- a/arch/x86/include/uapi/asm/mtrr.h
+++ b/arch/x86/include/uapi/asm/mtrr.h
@@ -103,7 +103,7 @@ struct mtrr_state_type {
 #define MTRRIOC_GET_PAGE_ENTRY   _IOWR(MTRR_IOCTL_BASE, 8, struct mtrr_gentry)
 #define MTRRIOC_KILL_PAGE_ENTRY  _IOW(MTRR_IOCTL_BASE,  9, struct mtrr_sentry)
 
-/*  These are the region types  */
+/* MTRR memory types, which are defined in SDM */
 #define MTRR_TYPE_UNCACHABLE 0
 #define MTRR_TYPE_WRCOMB     1
 /*#define MTRR_TYPE_         2*/
@@ -113,5 +113,11 @@ struct mtrr_state_type {
 #define MTRR_TYPE_WRBACK     6
 #define MTRR_NUM_TYPES       7
 
+/*
+ * Invalid MTRR memory type.  mtrr_type_lookup() returns this value when
+ * MTRRs are disabled.  Note, this value is allocated from the reserved
+ * values (0x7-0xff) of the MTRR memory types.
+ */
+#define MTRR_TYPE_INVALID    0xff
 
 #endif /* _UAPI_ASM_X86_MTRR_H */
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index b0599db..7b1491c 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -104,7 +104,7 @@ static int check_type_overlap(u8 *prev, u8 *curr)
 
 /*
  * Error/Semi-error returns:
- * 0xFF - when MTRR is not enabled
+ * MTRR_TYPE_INVALID - when MTRR is not enabled
  * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
  *		corresponds only to [start:*partial_end].
  *		Caller has to lookup again for [*partial_end:end].
@@ -117,10 +117,10 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 	*repeat = 0;
 	if (!mtrr_state_set)
-		return 0xFF;
+		return MTRR_TYPE_INVALID;
 
 	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
-		return 0xFF;
+		return MTRR_TYPE_INVALID;
 
 	/* Make end inclusive end, instead of exclusive */
 	end--;
@@ -151,7 +151,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	 * Look of multiple ranges matching this address and pick type
 	 * as per MTRR precedence
 	 */
-	prev_match = 0xFF;
+	prev_match = MTRR_TYPE_INVALID;
 	for (i = 0; i < num_var_ranges; ++i) {
 		unsigned short start_state, end_state, inclusive;
 
@@ -206,7 +206,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			continue;
 
 		curr_match = mtrr_state.var_ranges[i].base_lo & 0xff;
-		if (prev_match == 0xFF) {
+		if (prev_match == MTRR_TYPE_INVALID) {
 			prev_match = curr_match;
 			continue;
 		}
@@ -220,7 +220,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			return MTRR_TYPE_WRBACK;
 	}
 
-	if (prev_match != 0xFF)
+	if (prev_match != MTRR_TYPE_INVALID)
 		return prev_match;
 
 	return mtrr_state.def_type;
@@ -229,7 +229,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 /*
  * Returns the effective MTRR type for the region
  * Error return:
- * 0xFF - when MTRR is not enabled
+ * MTRR_TYPE_INVALID - when MTRR is not enabled
  */
 u8 mtrr_type_lookup(u64 start, u64 end)
 {
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 0b97d2c..c30f981 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -563,16 +563,22 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
 }
 
 #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
+/**
+ * pud_set_huge - setup kernel PUD mapping
+ *
+ * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
+ * this function does not set up a huge page when the range is covered
+ * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
+ * disabled.
+ *
+ * Returns 1 on success and 0 on failure.
+ */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
 	u8 mtrr;
 
-	/*
-	 * Do not use a huge page when the range is covered by non-WB type
-	 * of MTRRs.
-	 */
 	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -584,16 +590,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 	return 1;
 }
 
+/**
+ * pmd_set_huge - setup kernel PMD mapping
+ *
+ * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
+ * this function does not set up a huge page when the range is covered
+ * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
+ * disabled.
+ *
+ * Returns 1 on success and 0 on failure.
+ */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
 	u8 mtrr;
 
-	/*
-	 * Do not use a huge page when the range is covered by non-WB type
-	 * of MTRRs.
-	 */
 	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -605,6 +617,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 	return 1;
 }
 
+/**
+ * pud_clear_huge - clear kernel PUD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PUD map is found).
+ */
 int pud_clear_huge(pud_t *pud)
 {
 	if (pud_large(*pud)) {
@@ -615,6 +632,11 @@ int pud_clear_huge(pud_t *pud)
 	return 0;
 }
 
+/**
+ * pmd_clear_huge - clear kernel PMD mapping when it is set
+ *
+ * Returns 1 on success and 0 on failure (no PMD map is found).
+ */
 int pmd_clear_huge(pmd_t *pmd)
 {
 	if (pmd_large(*pmd)) {

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
       [not found]   ` <1432628901-18044-6-git-send-email-bp@alien8.de>
@ 2015-05-27 14:19     ` tip-bot for Toshi Kani
  2015-07-31 13:18       ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: tip-bot for Toshi Kani @ 2015-05-27 14:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, dvlasenk, bp, bp, mingo, luto, linux-mm, linux-kernel,
	torvalds, mcgrof, toshi.kani, brgerst, peterz, akpm, tglx

Commit-ID:  0cc705f56e400764a171055f727d28a48260bb4b
Gitweb:     http://git.kernel.org/tip/0cc705f56e400764a171055f727d28a48260bb4b
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Tue, 26 May 2015 10:28:08 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 14:40:57 +0200

x86/mm/mtrr: Clean up mtrr_type_lookup()

MTRRs contain fixed and variable entries. mtrr_type_lookup() may
repeatedly call __mtrr_type_lookup() to handle a request that
overlaps with variable entries.

However, __mtrr_type_lookup() also handles the fixed entries,
which do not have to be repeated. Therefore, this patch creates
separate functions, mtrr_type_lookup_fixed() and
mtrr_type_lookup_variable(), to handle the fixed and variable
ranges respectively.

The patch also updates the function headers to clarify the
return values and output argument. It updates comments to
clarify that the repeating is necessary to handle overlaps with
the default type, since overlaps with multiple entries alone can
be handled without such repeating.

There is no functional change in this patch.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: pebolle@tiscali.nl
Link: http://lkml.kernel.org/r/1431714237-880-6-git-send-email-toshi.kani@hp.com
Link: http://lkml.kernel.org/r/1432628901-18044-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/mtrr/generic.c | 138 +++++++++++++++++++++++--------------
 1 file changed, 86 insertions(+), 52 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 7b1491c..e51100c 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -102,55 +102,68 @@ static int check_type_overlap(u8 *prev, u8 *curr)
 	return 0;
 }
 
-/*
- * Error/Semi-error returns:
- * MTRR_TYPE_INVALID - when MTRR is not enabled
- * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
- *		corresponds only to [start:*partial_end].
- *		Caller has to lookup again for [*partial_end:end].
+/**
+ * mtrr_type_lookup_fixed - look up memory type in MTRR fixed entries
+ *
+ * Return the MTRR fixed memory type of 'start'.
+ *
+ * MTRR fixed entries are divided into the following ways:
+ *  0x00000 - 0x7FFFF : This range is divided into eight 64KB sub-ranges
+ *  0x80000 - 0xBFFFF : This range is divided into sixteen 16KB sub-ranges
+ *  0xC0000 - 0xFFFFF : This range is divided into sixty-four 4KB sub-ranges
+ *
+ * Return Values:
+ * MTRR_TYPE_(type)  - Matched memory type
+ * MTRR_TYPE_INVALID - Unmatched
+ */
+static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
+{
+	int idx;
+
+	if (start >= 0x100000)
+		return MTRR_TYPE_INVALID;
+
+	/* 0x0 - 0x7FFFF */
+	if (start < 0x80000) {
+		idx = 0;
+		idx += (start >> 16);
+		return mtrr_state.fixed_ranges[idx];
+	/* 0x80000 - 0xBFFFF */
+	} else if (start < 0xC0000) {
+		idx = 1 * 8;
+		idx += ((start - 0x80000) >> 14);
+		return mtrr_state.fixed_ranges[idx];
+	}
+
+	/* 0xC0000 - 0xFFFFF */
+	idx = 3 * 8;
+	idx += ((start - 0xC0000) >> 12);
+	return mtrr_state.fixed_ranges[idx];
+}
+
+/**
+ * mtrr_type_lookup_variable - look up memory type in MTRR variable entries
+ *
+ * Return Value:
+ * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
+ *
+ * Output Argument:
+ * repeat - Set to 1 when [start:end] spanned across MTRR range and type
+ *	    returned corresponds only to [start:*partial_end].  Caller has
+ *	    to lookup again for [*partial_end:end].
  */
-static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
+static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
+				    int *repeat)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
-	if (!mtrr_state_set)
-		return MTRR_TYPE_INVALID;
-
-	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
-		return MTRR_TYPE_INVALID;
 
-	/* Make end inclusive end, instead of exclusive */
+	/* Make end inclusive instead of exclusive */
 	end--;
 
-	/* Look in fixed ranges. Just return the type as per start */
-	if ((start < 0x100000) &&
-	    (mtrr_state.have_fixed) &&
-	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
-		int idx;
-
-		if (start < 0x80000) {
-			idx = 0;
-			idx += (start >> 16);
-			return mtrr_state.fixed_ranges[idx];
-		} else if (start < 0xC0000) {
-			idx = 1 * 8;
-			idx += ((start - 0x80000) >> 14);
-			return mtrr_state.fixed_ranges[idx];
-		} else {
-			idx = 3 * 8;
-			idx += ((start - 0xC0000) >> 12);
-			return mtrr_state.fixed_ranges[idx];
-		}
-	}
-
-	/*
-	 * Look in variable ranges
-	 * Look of multiple ranges matching this address and pick type
-	 * as per MTRR precedence
-	 */
 	prev_match = MTRR_TYPE_INVALID;
 	for (i = 0; i < num_var_ranges; ++i) {
 		unsigned short start_state, end_state, inclusive;
@@ -186,7 +199,8 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			 * advised to lookup again after having adjusted start
 			 * and end.
 			 *
-			 * Note: This way we handle multiple overlaps as well.
+			 * Note: This way we handle overlaps with multiple
+			 * entries and the default type properly.
 			 */
 			if (start_state)
 				*partial_end = base + get_mtrr_size(mask);
@@ -215,21 +229,18 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			return curr_match;
 	}
 
-	if (mtrr_tom2) {
-		if (start >= (1ULL<<32) && (end < mtrr_tom2))
-			return MTRR_TYPE_WRBACK;
-	}
-
 	if (prev_match != MTRR_TYPE_INVALID)
 		return prev_match;
 
 	return mtrr_state.def_type;
 }
 
-/*
- * Returns the effective MTRR type for the region
- * Error return:
- * MTRR_TYPE_INVALID - when MTRR is not enabled
+/**
+ * mtrr_type_lookup - look up memory type in MTRR
+ *
+ * Return Values:
+ * MTRR_TYPE_(type)  - The effective MTRR type for the region
+ * MTRR_TYPE_INVALID - MTRR is disabled
  */
 u8 mtrr_type_lookup(u64 start, u64 end)
 {
@@ -237,22 +248,45 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	int repeat;
 	u64 partial_end;
 
-	type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+	if (!mtrr_state_set)
+		return MTRR_TYPE_INVALID;
+
+	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
+		return MTRR_TYPE_INVALID;
+
+	/*
+	 * Look up the fixed ranges first, which take priority over
+	 * the variable ranges.
+	 */
+	if ((start < 0x100000) &&
+	    (mtrr_state.have_fixed) &&
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
+		return mtrr_type_lookup_fixed(start, end);
+
+	/*
+	 * Look up the variable ranges.  Look of multiple ranges matching
+	 * this address and pick type as per MTRR precedence.
+	 */
+	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
 
 	/*
 	 * Common path is with repeat = 0.
 	 * However, we can have cases where [start:end] spans across some
-	 * MTRR range. Do repeated lookups for that case here.
+	 * MTRR ranges and/or the default type.  Do repeated lookups for
+	 * that case here.
 	 */
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
-		type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+		type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
 
 		if (check_type_overlap(&prev_type, &type))
 			return type;
 	}
 
+	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
+		return MTRR_TYPE_WRBACK;
+
 	return type;
 }
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [tip:x86/mm] x86/mm/mtrr: Enhance MTRR checks in kernel mapping helpers
       [not found]   ` <1432628901-18044-8-git-send-email-bp@alien8.de>
@ 2015-05-27 14:19     ` tip-bot for Toshi Kani
  0 siblings, 0 replies; 40+ messages in thread
From: tip-bot for Toshi Kani @ 2015-05-27 14:19 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: luto, toshi.kani, torvalds, dvlasenk, brgerst, peterz, bp, tglx,
	linux-mm, mcgrof, akpm, linux-kernel, hpa, bp, mingo

Commit-ID:  b73522e0c1be58d3c69b124985b8ccf94e3677f7
Gitweb:     http://git.kernel.org/tip/b73522e0c1be58d3c69b124985b8ccf94e3677f7
Author:     Toshi Kani <toshi.kani@hp.com>
AuthorDate: Tue, 26 May 2015 10:28:10 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 27 May 2015 14:40:58 +0200

x86/mm/mtrr: Enhance MTRR checks in kernel mapping helpers

This patch adds the argument 'uniform' to mtrr_type_lookup(),
which gets set to 1 when a given range is covered uniformly by
MTRRs, i.e. the range is fully covered by a single MTRR entry or
the default type.

Change pud_set_huge() and pmd_set_huge() to honor the 'uniform'
flag to see if it is safe to create a huge page mapping in the
range.

This allows them to create a huge page mapping in a range
covered by a single MTRR entry of any memory type. It also
detects a non-optimal request properly. They continue to check
with the WB type since it does not effectively change the
uniform mapping even if a request spans multiple MTRR entries.

pmd_set_huge() logs a warning message to a non-optimal request
so that driver writers will be aware of such a case. Drivers
should make a mapping request aligned to a single MTRR entry
when the range is covered by MTRRs.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
[ Realign, flesh out comments, improve warning message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Elliott@hp.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave.hansen@intel.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: pebolle@tiscali.nl
Link: http://lkml.kernel.org/r/1431714237-880-7-git-send-email-toshi.kani@hp.com
Link: http://lkml.kernel.org/r/1432628901-18044-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/mtrr.h        |  4 ++--
 arch/x86/kernel/cpu/mtrr/generic.c | 40 ++++++++++++++++++++++++++++----------
 arch/x86/mm/pat.c                  |  4 ++--
 arch/x86/mm/pgtable.c              | 38 +++++++++++++++++++++++-------------
 4 files changed, 58 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index bb03a54..a31759e 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
  * arch_phys_wc_add and arch_phys_wc_del.
  */
 # ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
 extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,7 +50,7 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
 extern int phys_wc_to_mtrr_index(int handle);
 #  else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
 {
 	/*
 	 * Return no-MTRRs:
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index e51100c..f782d9b 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -147,19 +147,24 @@ static u8 mtrr_type_lookup_fixed(u64 start, u64 end)
  * Return Value:
  * MTRR_TYPE_(type) - Matched memory type or default memory type (unmatched)
  *
- * Output Argument:
+ * Output Arguments:
  * repeat - Set to 1 when [start:end] spanned across MTRR range and type
  *	    returned corresponds only to [start:*partial_end].  Caller has
  *	    to lookup again for [*partial_end:end].
+ *
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ *	     region is fully covered by a single MTRR entry or the default
+ *	     type.
  */
 static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
-				    int *repeat)
+				    int *repeat, u8 *uniform)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
+	*uniform = 1;
 
 	/* Make end inclusive instead of exclusive */
 	end--;
@@ -214,6 +219,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 
 			end = *partial_end - 1; /* end is inclusive */
 			*repeat = 1;
+			*uniform = 0;
 		}
 
 		if ((start & mask) != (base & mask))
@@ -225,6 +231,7 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 			continue;
 		}
 
+		*uniform = 0;
 		if (check_type_overlap(&prev_match, &curr_match))
 			return curr_match;
 	}
@@ -241,10 +248,15 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
  * Return Values:
  * MTRR_TYPE_(type)  - The effective MTRR type for the region
  * MTRR_TYPE_INVALID - MTRR is disabled
+ *
+ * Output Argument:
+ * uniform - Set to 1 when an MTRR covers the region uniformly, i.e. the
+ *	     region is fully covered by a single MTRR entry or the default
+ *	     type.
  */
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
 {
-	u8 type, prev_type;
+	u8 type, prev_type, is_uniform = 1, dummy;
 	int repeat;
 	u64 partial_end;
 
@@ -260,14 +272,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	 */
 	if ((start < 0x100000) &&
 	    (mtrr_state.have_fixed) &&
-	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
-		return mtrr_type_lookup_fixed(start, end);
+	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED)) {
+		is_uniform = 0;
+		type = mtrr_type_lookup_fixed(start, end);
+		goto out;
+	}
 
 	/*
 	 * Look up the variable ranges.  Look of multiple ranges matching
 	 * this address and pick type as per MTRR precedence.
 	 */
-	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+	type = mtrr_type_lookup_variable(start, end, &partial_end,
+					 &repeat, &is_uniform);
 
 	/*
 	 * Common path is with repeat = 0.
@@ -278,15 +294,19 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
-		type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
+		is_uniform = 0;
+		type = mtrr_type_lookup_variable(start, end, &partial_end,
+						 &repeat, &dummy);
 
 		if (check_type_overlap(&prev_type, &type))
-			return type;
+			goto out;
 	}
 
 	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
-		return MTRR_TYPE_WRBACK;
+		type = MTRR_TYPE_WRBACK;
 
+out:
+	*uniform = is_uniform;
 	return type;
 }
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af677..372ad42 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
 	 * request is for WB.
 	 */
 	if (req_type == _PAGE_CACHE_MODE_WB) {
-		u8 mtrr_type;
+		u8 mtrr_type, uniform;
 
-		mtrr_type = mtrr_type_lookup(start, end);
+		mtrr_type = mtrr_type_lookup(start, end, &uniform);
 		if (mtrr_type != MTRR_TYPE_WRBACK)
 			return _PAGE_CACHE_MODE_UC_MINUS;
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c30f981..fb0a9dd 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -566,19 +566,28 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
 /**
  * pud_set_huge - setup kernel PUD mapping
  *
- * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * MTRRs can override PAT memory types with 4KiB granularity. Therefore, this
+ * function sets up a huge page only if any of the following conditions are met:
+ *
+ * - MTRRs are disabled, or
+ *
+ * - MTRRs are enabled and the range is completely covered by a single MTRR, or
+ *
+ * - MTRRs are enabled and the corresponding MTRR memory type is WB, which
+ *   has no effect on the requested PAT memory type.
+ *
+ * Callers should try to decrease page size (1GB -> 2MB -> 4K) if the bigger
+ * page mapping attempt fails.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -593,20 +602,21 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 /**
  * pmd_set_huge - setup kernel PMD mapping
  *
- * MTRR can override PAT memory types with 4KiB granularity.  Therefore,
- * this function does not set up a huge page when the range is covered
- * by a non-WB type of MTRR.  MTRR_TYPE_INVALID indicates that MTRR are
- * disabled.
+ * See text over pud_set_huge() above.
  *
  * Returns 1 on success and 0 on failure.
  */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != MTRR_TYPE_INVALID))
+	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+	if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
+	    (mtrr != MTRR_TYPE_WRBACK)) {
+		pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a huge-page mapping due to MTRR override.\n",
+			     __func__, addr, addr + PMD_SIZE);
 		return 0;
+	}
 
 	prot = pgprot_4k_2_large(prot);
 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-05-27 14:19     ` [tip:x86/mm] x86/mm/mtrr: " tip-bot for Toshi Kani
@ 2015-07-31 13:18       ` Peter Zijlstra
  2015-07-31 14:44         ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2015-07-31 13:18 UTC (permalink / raw)
  To: mingo, hpa, bp, dvlasenk, bp, akpm, brgerst, tglx, linux-mm,
	luto, mcgrof, toshi.kani, torvalds, linux-kernel
  Cc: linux-tip-commits

On Wed, May 27, 2015 at 07:19:05AM -0700, tip-bot for Toshi Kani wrote:
> +/**
> + * mtrr_type_lookup - look up memory type in MTRR
> + *
> + * Return Values:
> + * MTRR_TYPE_(type)  - The effective MTRR type for the region
> + * MTRR_TYPE_INVALID - MTRR is disabled
>   */
>  u8 mtrr_type_lookup(u64 start, u64 end)
>  {

>  	int repeat;
>  	u64 partial_end;
>  
> +	if (!mtrr_state_set)
> +		return MTRR_TYPE_INVALID;
> +
> +	if (!(mtrr_state.enabled & MTRR_STATE_MTRR_ENABLED))
> +		return MTRR_TYPE_INVALID;
> +
> +	/*
> +	 * Look up the fixed ranges first, which take priority over
> +	 * the variable ranges.
> +	 */
> +	if ((start < 0x100000) &&
> +	    (mtrr_state.have_fixed) &&
> +	    (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED))
> +		return mtrr_type_lookup_fixed(start, end);
> +
> +	/*
> +	 * Look up the variable ranges.  Look of multiple ranges matching
> +	 * this address and pick type as per MTRR precedence.
> +	 */
> +	type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
>  
>  	/*
>  	 * Common path is with repeat = 0.
>  	 * However, we can have cases where [start:end] spans across some
> +	 * MTRR ranges and/or the default type.  Do repeated lookups for
> +	 * that case here.
>  	 */
>  	while (repeat) {
>  		prev_type = type;
>  		start = partial_end;
> +		type = mtrr_type_lookup_variable(start, end, &partial_end, &repeat);
>  
>  		if (check_type_overlap(&prev_type, &type))
>  			return type;
>  	}
>  
> +	if (mtrr_tom2 && (start >= (1ULL<<32)) && (end < mtrr_tom2))
> +		return MTRR_TYPE_WRBACK;
> +
>  	return type;
>  }

So I got staring at this MTRR horror show because I _really_ _Really_
want to kill stop_machine_from_inactive_cpu().

But I wondered about these lookup functions, should they not have an
assertion that preemption is disabled?

Using these functions with preemption enabled is racy against MTRR
updates. And if that race is ok, at the very least explain that it is
indeed racy and why this is not a problem.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-07-31 13:18       ` Peter Zijlstra
@ 2015-07-31 14:44         ` Borislav Petkov
  2015-07-31 15:08           ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-07-31 14:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, hpa, dvlasenk, bp, akpm, brgerst, tglx, linux-mm, luto,
	mcgrof, toshi.kani, torvalds, linux-kernel, linux-tip-commits

On Fri, Jul 31, 2015 at 03:18:02PM +0200, Peter Zijlstra wrote:
> Using these functions with preemption enabled is racy against MTRR
> updates. And if that race is ok, at the very least explain that it is
> indeed racy and why this is not a problem.

Right, so Luis has been working on burying direct MTRR access so
after that work is done, we'll be using only PAT for changing memory
attributes. Look at arch_phys_wc_add() and all those fbdev users of
mtrr_add() which get converted to that thing...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-07-31 14:44         ` Borislav Petkov
@ 2015-07-31 15:08           ` Peter Zijlstra
  2015-07-31 15:27             ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2015-07-31 15:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: mingo, hpa, dvlasenk, bp, akpm, brgerst, tglx, linux-mm, luto,
	mcgrof, toshi.kani, torvalds, linux-kernel, linux-tip-commits

On Fri, Jul 31, 2015 at 04:44:52PM +0200, Borislav Petkov wrote:
> On Fri, Jul 31, 2015 at 03:18:02PM +0200, Peter Zijlstra wrote:
> > Using these functions with preemption enabled is racy against MTRR
> > updates. And if that race is ok, at the very least explain that it is
> > indeed racy and why this is not a problem.
> 
> Right, so Luis has been working on burying direct MTRR access so
> after that work is done, we'll be using only PAT for changing memory
> attributes. Look at arch_phys_wc_add() and all those fbdev users of
> mtrr_add() which get converted to that thing...

Drivers don't do those lookups afaict.

But its things like set_memory_XX(), and afaict that's all buggy against
MTRR modifications.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-07-31 15:08           ` Peter Zijlstra
@ 2015-07-31 15:27             ` Borislav Petkov
  2015-08-01 14:28               ` Luis R. Rodriguez
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-07-31 15:27 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, hpa, dvlasenk, bp, akpm, brgerst, tglx, linux-mm, luto,
	mcgrof, toshi.kani, torvalds, linux-kernel, linux-tip-commits

On Fri, Jul 31, 2015 at 05:08:06PM +0200, Peter Zijlstra wrote:
> But its things like set_memory_XX(), and afaict that's all buggy against
> MTRR modifications.

I think the idea is to not do any MTRR modifications at some point:

>From Documentation/x86/pat.txt:

"... Ideally mtrr_add() usage will be phased out in favor of
arch_phys_wc_add() which will be a no-op on PAT enabled systems. The
region over which a arch_phys_wc_add() is made, should already have been
ioremapped with WC attributes or PAT entries, this can be done by using
ioremap_wc() / set_memory_wc()."

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-07-31 15:27             ` Borislav Petkov
@ 2015-08-01 14:28               ` Luis R. Rodriguez
  2015-08-01 16:33                 ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Luis R. Rodriguez @ 2015-08-01 14:28 UTC (permalink / raw)
  To: Borislav Petkov, Toshi Kani
  Cc: Peter Zijlstra, mingo, hpa, dvlasenk, bp, akpm, brgerst, tglx,
	linux-mm, luto, torvalds, linux-kernel, linux-tip-commits

On Fri, Jul 31, 2015 at 05:27:13PM +0200, Borislav Petkov wrote:
> On Fri, Jul 31, 2015 at 05:08:06PM +0200, Peter Zijlstra wrote:
> > But its things like set_memory_XX(), and afaict that's all buggy against
> > MTRR modifications.
> 
> I think the idea is to not do any MTRR modifications at some point:
> 
> From Documentation/x86/pat.txt:
> 
> "... Ideally mtrr_add() usage will be phased out in favor of
> arch_phys_wc_add() which will be a no-op on PAT enabled systems. The
> region over which a arch_phys_wc_add() is made, should already have been
> ioremapped with WC attributes or PAT entries, this can be done by using
> ioremap_wc() / set_memory_wc()."

I need to update this documentation to remove set_memory_wc() there as we've
learned with the MTRR --> PAT conversion that set_memory_wc() cannot be used on
IO memory, it can only be used for RAM. I am not sure if I would call it being
broken that you cannot use set_memory_*() for IO memory that may have been by
design.

  Luis

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-08-01 14:28               ` Luis R. Rodriguez
@ 2015-08-01 16:33                 ` Borislav Petkov
  2015-08-01 16:39                   ` Linus Torvalds
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-08-01 16:33 UTC (permalink / raw)
  To: Luis R. Rodriguez
  Cc: Toshi Kani, Peter Zijlstra, mingo, hpa, dvlasenk, bp, akpm,
	brgerst, tglx, linux-mm, luto, torvalds, linux-kernel,
	linux-tip-commits

On Sat, Aug 01, 2015 at 04:28:20PM +0200, Luis R. Rodriguez wrote:
> I need to update this documentation to remove set_memory_wc() there as we've
> learned with the MTRR --> PAT conversion that set_memory_wc() cannot be used on
> IO memory, it can only be used for RAM. I am not sure if I would call it being
> broken that you cannot use set_memory_*() for IO memory that may have been by
> design.

Well, it doesn't really make sense to write-combine IO memory, does it?
My simplistic impression is that an IO range behind which there's a
device, cannot stomach any caching of IO as all commands/data accesses
need to happen as they get issued...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-08-01 16:33                 ` Borislav Petkov
@ 2015-08-01 16:39                   ` Linus Torvalds
  2015-08-01 16:49                     ` Borislav Petkov
  0 siblings, 1 reply; 40+ messages in thread
From: Linus Torvalds @ 2015-08-01 16:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luis R. Rodriguez, Toshi Kani, Peter Zijlstra, Ingo Molnar,
	Peter Anvin, Denys Vlasenko, Borislav Petkov, Andrew Morton,
	Brian Gerst, Thomas Gleixner, linux-mm, Andy Lutomirski,
	Linux Kernel Mailing List, linux-tip-commits

On Sat, Aug 1, 2015 at 9:33 AM, Borislav Petkov <bp@alien8.de> wrote:
>
> Well, it doesn't really make sense to write-combine IO memory, does it?

Quite the reverse.

It makes no sense to write-combine normal memory (RAM), because caches
work and sane memory is always cache-coherent. So marking regular
memory write-combining is a sign of crap hardware (which admittedly
exists all too much, but hopefully goes away).

In contrast, marking MMIO memory write-combining is not a sign of crap
hardware - it's just a sign of things like frame buffers on the card
etc. Which very much wants write combining. So WC for MMIO at least
makes sense.

Yes, yes, I realize that "crap hardware" may actually be the more
common case, but still..

             Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-08-01 16:39                   ` Linus Torvalds
@ 2015-08-01 16:49                     ` Borislav Petkov
  2015-08-01 17:03                       ` Linus Torvalds
  0 siblings, 1 reply; 40+ messages in thread
From: Borislav Petkov @ 2015-08-01 16:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Luis R. Rodriguez, Toshi Kani, Peter Zijlstra, Ingo Molnar,
	Peter Anvin, Denys Vlasenko, Borislav Petkov, Andrew Morton,
	Brian Gerst, Thomas Gleixner, linux-mm, Andy Lutomirski,
	Linux Kernel Mailing List, linux-tip-commits

On Sat, Aug 01, 2015 at 09:39:07AM -0700, Linus Torvalds wrote:
> Quite the reverse.
> 
> It makes no sense to write-combine normal memory (RAM), because caches
> work and sane memory is always cache-coherent. So marking regular
> memory write-combining is a sign of crap hardware (which admittedly
> exists all too much, but hopefully goes away).
> 
> In contrast, marking MMIO memory write-combining is not a sign of crap
> hardware - it's just a sign of things like frame buffers on the card
> etc. Which very much wants write combining. So WC for MMIO at least
> makes sense.
> 
> Yes, yes, I realize that "crap hardware" may actually be the more
> common case, but still..

Hmm, ok.

My simplistic mental picture while thinking of this is the IO range
where you send the commands to the device and you don't really want to
delay those but they should reach the device as they get issued.

OTOH, your example with frame buffers really wants to WC because sending
down each write separately is plain dumb.

Ok, I see, so it can make sense to have WC IO memory, depending on the
range and what you're going to use it for, I guess...

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [tip:x86/mm] x86/mm/mtrr: Clean up mtrr_type_lookup()
  2015-08-01 16:49                     ` Borislav Petkov
@ 2015-08-01 17:03                       ` Linus Torvalds
  0 siblings, 0 replies; 40+ messages in thread
From: Linus Torvalds @ 2015-08-01 17:03 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Luis R. Rodriguez, Toshi Kani, Peter Zijlstra, Ingo Molnar,
	Peter Anvin, Denys Vlasenko, Borislav Petkov, Andrew Morton,
	Brian Gerst, Thomas Gleixner, linux-mm, Andy Lutomirski,
	Linux Kernel Mailing List, linux-tip-commits

On Sat, Aug 1, 2015 at 9:49 AM, Borislav Petkov <bp@alien8.de> wrote:
>
> My simplistic mental picture while thinking of this is the IO range
> where you send the commands to the device and you don't really want to
> delay those but they should reach the device as they get issued.

Well, even for command streams, people often do go for a
write-combining approach, simply because it is *so* much more
efficient on the bus to buffer and burst things. The interface is set
up to not really "combine" things in the over-writing sense, but just
in the "combine continuous writes into bigger buffers on the CPU, and
then write it out as efficiently as possible" sense.

Of course, the device (and the driver) has to be designed properly for
that, and it makes sense only with certain kinds of models, but it can
actually be much more efficient to make the device interface be
something like "write 32-byte command packets to a circular
write-combining buffer" than it is to do things other ways. Back in
the days, that was one of the most efficient ways to try to fill up
the PCI bandwidth.

There are other approaches too, of course, with the modern variation
tending to be "the device does all real accesses by reading over DMA,
and the only time you use IO accesses is for setup and as a 'start
your DMA transfers now' kind of interface". But write-combining MMIO
used to be a very common model for high-performace IO not that long
ago, because DMA didn't actually use to be all that efficient at all
(nasty behavior with caches and snooping etc back before the memory
controller was on-die and DMA accesses snooped caches directly). So
the "DMA is efficient even for smaller things" thing is relatively
recent.

                     Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2015-08-01 17:03 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-15 18:23 [PATCH v5 0/6] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
2015-05-15 18:23 ` [PATCH v5 1/6] mm, x86: Simplify conditions of HAVE_ARCH_HUGE_VMAP Toshi Kani
2015-05-17  8:30   ` Borislav Petkov
     [not found]   ` <1432628901-18044-2-git-send-email-bp@alien8.de>
2015-05-27 14:17     ` [tip:x86/mm] x86/mm/kconfig: Simplify conditions for HAVE_ARCH_HUGE_VMAP tip-bot for Toshi Kani
2015-05-15 18:23 ` [PATCH v5 2/6] mtrr, x86: Fix MTRR lookup to handle inclusive entry Toshi Kani
     [not found]   ` <1432628901-18044-3-git-send-email-bp@alien8.de>
2015-05-27 14:18     ` [tip:x86/mm] x86/mm/mtrr: Fix MTRR lookup to handle an " tip-bot for Toshi Kani
2015-05-15 18:23 ` [PATCH v5 3/6] mtrr, x86: Fix MTRR state checks in mtrr_type_lookup() Toshi Kani
     [not found]   ` <1432628901-18044-4-git-send-email-bp@alien8.de>
2015-05-27 14:18     ` [tip:x86/mm] x86/mm/mtrr: " tip-bot for Toshi Kani
2015-05-15 18:23 ` [PATCH v5 4/6] mtrr, x86: Define MTRR_TYPE_INVALID for mtrr_type_lookup() Toshi Kani
     [not found]   ` <1432628901-18044-5-git-send-email-bp@alien8.de>
2015-05-27 14:18     ` [tip:x86/mm] x86/mm/mtrr: Use symbolic define as a retval for disabled MTRRs tip-bot for Toshi Kani
2015-05-15 18:23 ` [PATCH v5 5/6] mtrr, x86: Clean up mtrr_type_lookup() Toshi Kani
     [not found]   ` <1432628901-18044-6-git-send-email-bp@alien8.de>
2015-05-27 14:19     ` [tip:x86/mm] x86/mm/mtrr: " tip-bot for Toshi Kani
2015-07-31 13:18       ` Peter Zijlstra
2015-07-31 14:44         ` Borislav Petkov
2015-07-31 15:08           ` Peter Zijlstra
2015-07-31 15:27             ` Borislav Petkov
2015-08-01 14:28               ` Luis R. Rodriguez
2015-08-01 16:33                 ` Borislav Petkov
2015-08-01 16:39                   ` Linus Torvalds
2015-08-01 16:49                     ` Borislav Petkov
2015-08-01 17:03                       ` Linus Torvalds
2015-05-15 18:23 ` [PATCH v5 6/6] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Toshi Kani
2015-05-18 13:33   ` Borislav Petkov
2015-05-18 17:22     ` Toshi Kani
2015-05-18 19:01       ` Borislav Petkov
2015-05-18 19:31         ` Toshi Kani
2015-05-18 20:01           ` Borislav Petkov
2015-05-18 20:21             ` Toshi Kani
2015-05-18 20:51               ` Borislav Petkov
2015-05-18 21:53                 ` Toshi Kani
2015-05-19 11:44                   ` Borislav Petkov
2015-05-19 13:23                     ` Borislav Petkov
2015-05-19 13:47                       ` Toshi Kani
2015-05-20 11:55                       ` Ingo Molnar
2015-05-20 14:34                         ` Toshi Kani
2015-05-20 15:01                           ` Ingo Molnar
2015-05-20 15:02                             ` Toshi Kani
2015-05-20 16:04                               ` Borislav Petkov
2015-05-20 15:46                                 ` Toshi Kani
     [not found]   ` <1432628901-18044-8-git-send-email-bp@alien8.de>
2015-05-27 14:19     ` [tip:x86/mm] x86/mm/mtrr: Enhance MTRR checks in kernel mapping helpers tip-bot for Toshi Kani

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).