* [PATCH 0/3] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping
@ 2015-03-10 20:23 ` Toshi Kani
  0 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-10 20:23 UTC (permalink / raw)
  To: akpm, hpa, tglx, mingo, arnd
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle

This patchset enhances MTRR checks for the kernel huge I/O mapping,
which was enabled by the patchset below:
  https://lkml.org/lkml/2015/3/3/589

The following functional changes are made in patch 3/3.
 - Allow pud_set_huge() and pmd_set_huge() to create a huge page
   mapping to a range covered by a single MTRR entry of any memory
   type.
 - Log a pr_warn() message when a requested PMD map range spans more
   than a single MTRR entry.  Drivers should make a mapping request
   aligned to a single MTRR entry when the range is covered by MTRRs.
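
For the second case, a driver-side sketch (the device, BAR number and
the use of ioremap_wc() here are only illustrative):

	void __iomem *fb;

	/*
	 * A frame buffer covered by a single write-combining MTRR entry.
	 * Mapping the whole, MTRR-aligned resource in one request lets
	 * the kernel use huge I/O mappings; a request that straddles the
	 * MTRR boundary falls back to 4KB pages and now triggers the
	 * pr_warn() in pmd_set_huge().
	 */
	fb = ioremap_wc(pci_resource_start(pdev, 1),
			pci_resource_len(pdev, 1));
	if (!fb)
		return -ENOMEM;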

Patch 1/3 addresses other review comments on the mapping funcs for
better code readability.  Patch 2/3 fixes a bug in mtrr_type_lookup().

The patchset is based on the -mm tree.
---
Toshi Kani (3):
 1/3 mm, x86: Document return values of mapping funcs
 2/3 mtrr, x86: Fix MTRR lookup to handle inclusive entry
 3/3 mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping

---
 arch/x86/Kconfig                   |  2 +-
 arch/x86/include/asm/mtrr.h        |  5 ++--
 arch/x86/kernel/cpu/mtrr/generic.c | 52 +++++++++++++++++++++++++------------
 arch/x86/mm/pat.c                  |  4 +--
 arch/x86/mm/pgtable.c              | 53 ++++++++++++++++++++++++++++----------
 5 files changed, 81 insertions(+), 35 deletions(-)


* [PATCH 1/3] mm, x86: Document return values of mapping funcs
  2015-03-10 20:23 ` Toshi Kani
@ 2015-03-10 20:23   ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-10 20:23 UTC (permalink / raw)
  To: akpm, hpa, tglx, mingo, arnd
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle, Toshi Kani

Documented the return values of KVA mapping functions,
pud_set_huge(), pmd_set_huge, pud_clear_huge() and
pmd_clear_huge().

Simplified the conditions to select HAVE_ARCH_HUGE_VMAP
in Kconfig since X86_PAE depends on X86_32.

There is no functinal change in this patch.
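
For reference, the generic ioremap path consumes these return values
roughly as follows (paraphrased from lib/ioremap.c; not part of this
patch):

	if (ioremap_pud_enabled() &&
	    ((next - addr) == PUD_SIZE) &&
	    IS_ALIGNED(phys_addr + addr, PUD_SIZE)) {
		if (pud_set_huge(pud, phys_addr + addr, prot))
			continue;	/* 1: huge mapping was set up */
	}
	/* 0: fall through and map the range with smaller pages */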

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/Kconfig      |    2 +-
 arch/x86/mm/pgtable.c |   36 ++++++++++++++++++++++++++++--------
 2 files changed, 29 insertions(+), 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 110f6ae..ba5e78e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -99,7 +99,7 @@ config X86
 	select IRQ_FORCED_THREADING
 	select HAVE_BPF_JIT if X86_64
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
-	select HAVE_ARCH_HUGE_VMAP if X86_64 || (X86_32 && X86_PAE)
+	select HAVE_ARCH_HUGE_VMAP if X86_64 || X86_PAE
 	select ARCH_HAS_SG_CHAIN
 	select CLKEVT_I8253
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 0b97d2c..a0f7eeb 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -563,14 +563,19 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
 }
 
 #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
+/**
+ * pud_set_huge - setup kernel PUD mapping
+ *
+ * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
+ * it does not set up a huge page when the range is covered by non-WB type
+ * of MTRRs.  0xFF indicates that MTRRs are disabled.
+ *
+ * Return 1 on success, and 0 on no-operation.
+ */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
 	u8 mtrr;
 
-	/*
-	 * Do not use a huge page when the range is covered by non-WB type
-	 * of MTRRs.
-	 */
 	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
 	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
 		return 0;
@@ -584,14 +589,19 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 	return 1;
 }
 
+/**
+ * pmd_set_huge - setup kernel PMD mapping
+ *
+ * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
+ * it does not set up a huge page when the range is covered by non-WB type
+ * of MTRRs.  0xFF indicates that MTRRs are disabled.
+ *
+ * Return 1 on success, and 0 on no-operation.
+ */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
 	u8 mtrr;
 
-	/*
-	 * Do not use a huge page when the range is covered by non-WB type
-	 * of MTRRs.
-	 */
 	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
 	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
 		return 0;
@@ -605,6 +615,11 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 	return 1;
 }
 
+/**
+ * pud_clear_huge - clear kernel PUD mapping when it is set
+ *
+ * Return 1 on success, and 0 on no-operation.
+ */
 int pud_clear_huge(pud_t *pud)
 {
 	if (pud_large(*pud)) {
@@ -615,6 +630,11 @@ int pud_clear_huge(pud_t *pud)
 	return 0;
 }
 
+/**
+ * pmd_clear_huge - clear kernel PMD mapping when it is set
+ *
+ * Return 1 on success, and 0 on no-operation.
+ */
 int pmd_clear_huge(pmd_t *pmd)
 {
 	if (pmd_large(*pmd)) {

* [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry
  2015-03-10 20:23 ` Toshi Kani
@ 2015-03-10 20:23   ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-10 20:23 UTC (permalink / raw)
  To: akpm, hpa, tglx, mingo, arnd
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle, Toshi Kani

When an MTRR entry is inclusive to a requested range, i.e.
the start and end of the request are not within the MTRR
entry range but the range contains the MTRR entry entirely,
__mtrr_type_lookup() ignores such case because both
start_state and end_state are set to zero.
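
For illustration (addresses made up), consider a single variable MTRR
covering [0x80000000, 0xc0000000) and a request for
[0x70000000, 0xd0000000), which contains the entry entirely:

	u64 base  = 0x80000000ULL;		/* entry covers [2GB, 3GB)  */
	u64 mask  = ~(0x40000000ULL - 1);	/* mask of the 1GB entry    */
	u64 start = 0x70000000ULL;		/* request starts below it  */
	u64 end   = 0xd0000000ULL;		/* ... and ends above it    */

	unsigned short start_state = ((start & mask) == (base & mask)); /* 0 */
	unsigned short end_state   = ((end & mask) == (base & mask));   /* 0 */
	unsigned short inclusive   = ((start < base) && (end > base));  /* 1 */

Both start_state and end_state evaluate to zero, so the lookup skips
the entry even though it changes the memory type of part of the
requested range.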

This patch fixes the issue by adding a new flag, inclusive,
to detect the case.  This case is then handled in the same
way as (!start_state && end_state).

Also updated the comment in __mtrr_type_lookup() to clarify
that the repeat handling is necessary to handle overlaps
with the default type, since overlaps with multiple entries
alone can be handled without such repeat.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/kernel/cpu/mtrr/generic.c |   20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 7d74f7b..cdb955f 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -154,7 +154,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 	prev_match = 0xFF;
 	for (i = 0; i < num_var_ranges; ++i) {
-		unsigned short start_state, end_state;
+		unsigned short start_state, end_state, inclusive;
 
 		if (!(mtrr_state.var_ranges[i].mask_lo & (1 << 11)))
 			continue;
@@ -166,20 +166,22 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 		start_state = ((start & mask) == (base & mask));
 		end_state = ((end & mask) == (base & mask));
+		inclusive = ((start < base) && (end > base));
 
-		if (start_state != end_state) {
+		if ((start_state != end_state) || inclusive) {
 			/*
 			 * We have start:end spanning across an MTRR.
-			 * We split the region into
-			 * either
-			 * (start:mtrr_end) (mtrr_end:end)
-			 * or
-			 * (start:mtrr_start) (mtrr_start:end)
+			 * We split the region into either
+			 * - start_state:1
+			 *     (start:mtrr_end) (mtrr_end:end)
+			 * - end_state:1 or inclusive:1
+			 *     (start:mtrr_start) (mtrr_start:end)
 			 * depending on kind of overlap.
 			 * Return the type for first region and a pointer to
 			 * the start of second region so that caller will
 			 * lookup again on the second region.
-			 * Note: This way we handle multiple overlaps as well.
+			 * Note: This way we handle overlaps with multiple
+			 * entries and the default type properly.
 			 */
 			if (start_state)
 				*partial_end = base + get_mtrr_size(mask);
@@ -195,7 +197,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			*repeat = 1;
 		}
 
-		if ((start & mask) != (base & mask))
+		if (!start_state)
 			continue;
 
 		curr_match = mtrr_state.var_ranges[i].base_lo & 0xff;

* [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-03-10 20:23 ` Toshi Kani
@ 2015-03-10 20:23   ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-10 20:23 UTC (permalink / raw)
  To: akpm, hpa, tglx, mingo, arnd
  Cc: linux-mm, x86, linux-kernel, dave.hansen, Elliott, pebolle, Toshi Kani

This patch adds an additional argument, *uniform, to
mtrr_type_lookup(), which returns 1 when a given range is
either fully covered by a single MTRR entry or not covered
at all.

pud_set_huge() and pmd_set_huge() are changed to check the
new uniform flag to see if it is safe to create a huge page
mapping to the range.  This allows them to create a huge page
mapping to a range covered by a single MTRR entry of any
memory type.  It also detects an unoptimal request properly.
They continue to check with the WB type since the WB type has
no effect even if a request spans to multiple MTRR entries.
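
The resulting check is sketched below ('uniform' is the new output
argument; the actual hunks are in the diff):

	u8 mtrr, uniform;

	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
	if (!uniform && (mtrr != MTRR_TYPE_WRBACK))
		return 0;	/* spans multiple MTRR entries: use 4KB pages */

	/* covered by at most one MTRR entry, or WB everywhere: map huge */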

pmd_set_huge() logs a warning message to an unoptimal request
so that driver writers will be aware of such case.  Drivers
should make a mapping request aligned to a single MTRR entry
when the range is covered by MTRRs.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/include/asm/mtrr.h        |    5 +++--
 arch/x86/kernel/cpu/mtrr/generic.c |   32 +++++++++++++++++++++++++-------
 arch/x86/mm/pat.c                  |    4 ++--
 arch/x86/mm/pgtable.c              |   25 +++++++++++++++----------
 4 files changed, 45 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/mtrr.h b/arch/x86/include/asm/mtrr.h
index f768f62..5b4d467 100644
--- a/arch/x86/include/asm/mtrr.h
+++ b/arch/x86/include/asm/mtrr.h
@@ -31,7 +31,7 @@
  * arch_phys_wc_add and arch_phys_wc_del.
  */
 # ifdef CONFIG_MTRR
-extern u8 mtrr_type_lookup(u64 addr, u64 end);
+extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform);
 extern void mtrr_save_fixed_ranges(void *);
 extern void mtrr_save_state(void);
 extern int mtrr_add(unsigned long base, unsigned long size,
@@ -50,11 +50,12 @@ extern int mtrr_trim_uncached_memory(unsigned long end_pfn);
 extern int amd_special_default_mtrr(void);
 extern int phys_wc_to_mtrr_index(int handle);
 #  else
-static inline u8 mtrr_type_lookup(u64 addr, u64 end)
+static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
 {
 	/*
 	 * Return no-MTRRs:
 	 */
+	*uniform = 1;
 	return 0xff;
 }
 #define mtrr_save_fixed_ranges(arg) do {} while (0)
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index cdb955f..aef238c 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr)
  * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
  *		corresponds only to [start:*partial_end].
  *		Caller has to lookup again for [*partial_end:end].
+ * *uniform == 1 The requested range is either fully covered by a single MTRR
+ *		 entry or not covered at all.
  */
-static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
+static u8 __mtrr_type_lookup(u64 start, u64 end,
+			     u64 *partial_end, int *repeat, u8 *uniform)
 {
 	int i;
 	u64 base, mask;
 	u8 prev_match, curr_match;
 
 	*repeat = 0;
+	*uniform = 1;
+
 	if (!mtrr_state_set)
 		return 0xFF;
 
@@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 	/* Look in fixed ranges. Just return the type as per start */
 	if (mtrr_state.have_fixed && (start < 0x100000)) {
 		int idx;
+		*uniform = 0;
 
 		if (start < 0x80000) {
 			idx = 0;
@@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 
 			end = *partial_end - 1; /* end is inclusive */
 			*repeat = 1;
+			*uniform = 0;
 		}
 
 		if (!start_state)
@@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 			continue;
 		}
 
+		*uniform = 0;
 		if (check_type_overlap(&prev_match, &curr_match))
 			return curr_match;
 	}
@@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
 }
 
 /*
- * Returns the effective MTRR type for the region
+ * Returns the effective MTRR type for the region.  *uniform is set to 1
+ * when a given range is either fully covered by a single MTRR entry or
+ * not covered at all.
+ *
  * Error return:
  * 0xFF - when MTRR is not enabled
  */
-u8 mtrr_type_lookup(u64 start, u64 end)
+u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
 {
-	u8 type, prev_type;
+	u8 type, prev_type, is_uniform, dummy;
 	int repeat;
 	u64 partial_end;
 
-	type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+	type = __mtrr_type_lookup(start, end,
+				  &partial_end, &repeat, &is_uniform);
 
 	/*
 	 * Common path is with repeat = 0.
@@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
 	while (repeat) {
 		prev_type = type;
 		start = partial_end;
-		type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
+		is_uniform = 0;
 
-		if (check_type_overlap(&prev_type, &type))
+		type = __mtrr_type_lookup(start, end,
+					  &partial_end, &repeat, &dummy);
+
+		if (check_type_overlap(&prev_type, &type)) {
+			*uniform = 0;
 			return type;
+		}
 	}
 
+	*uniform = is_uniform;
 	return type;
 }
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af677..372ad42 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -267,9 +267,9 @@ static unsigned long pat_x_mtrr_type(u64 start, u64 end,
 	 * request is for WB.
 	 */
 	if (req_type == _PAGE_CACHE_MODE_WB) {
-		u8 mtrr_type;
+		u8 mtrr_type, uniform;
 
-		mtrr_type = mtrr_type_lookup(start, end);
+		mtrr_type = mtrr_type_lookup(start, end, &uniform);
 		if (mtrr_type != MTRR_TYPE_WRBACK)
 			return _PAGE_CACHE_MODE_UC_MINUS;
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index a0f7eeb..25843a9 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -567,17 +567,18 @@ void native_set_fixmap(enum fixed_addresses idx, phys_addr_t phys,
  * pud_set_huge - setup kernel PUD mapping
  *
  * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
- * it does not set up a huge page when the range is covered by non-WB type
- * of MTRRs.  0xFF indicates that MTRRs are disabled.
+ * it only sets up a huge page when the range is mapped uniformly (i.e.
+ * either fully covered by a single MTRR entry or not covered at all) or
+ * the MTRR type is WB.
  *
  * Return 1 on success, and 0 on no-operation.
  */
 int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+	mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
+	if ((!uniform) && (mtrr != MTRR_TYPE_WRBACK))
 		return 0;
 
 	prot = pgprot_4k_2_large(prot);
@@ -593,18 +594,22 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
  * pmd_set_huge - setup kernel PMD mapping
  *
  * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
- * it does not set up a huge page when the range is covered by non-WB type
- * of MTRRs.  0xFF indicates that MTRRs are disabled.
+ * it only sets up a huge page when the range is mapped uniformly (i.e.
+ * either fully covered by a single MTRR entry or not covered at all) or
+ * the MTRR type is WB.
  *
  * Return 1 on success, and 0 on no-operation.
  */
 int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot)
 {
-	u8 mtrr;
+	u8 mtrr, uniform;
 
-	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE);
-	if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
+	mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
+	if ((!uniform) && (mtrr != MTRR_TYPE_WRBACK)) {
+		pr_warn("pmd_set_huge: requesting [mem %#010llx-%#010llx], which spans more than a single MTRR entry\n",
+				addr, addr + PMD_SIZE);
 		return 0;
+	}
 
 	prot = pgprot_4k_2_large(prot);
 

* Re: [PATCH 1/3] mm, x86: Document return values of mapping funcs
  2015-03-10 20:23   ` Toshi Kani
@ 2015-03-11  6:30     ` Ingo Molnar
  -1 siblings, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-11  6:30 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle


* Toshi Kani <toshi.kani@hp.com> wrote:

> Documented the return values of KVA mapping functions,
> pud_set_huge(), pmd_set_huge, pud_clear_huge() and
> pmd_clear_huge().
> 
> Simplified the conditions to select HAVE_ARCH_HUGE_VMAP
> in Kconfig since X86_PAE depends on X86_32.

Changelogs are not a diary, they are a story, generally written in the 
present tense. So it should be something like:

  Document the return values of KVA mapping functions,
  pud_set_huge(), pmd_set_huge, pud_clear_huge() and
  pmd_clear_huge().

  Simplify the conditions to select HAVE_ARCH_HUGE_VMAP
  in the Kconfig, since X86_PAE depends on X86_32.

(also note the slight fixes I made to the text.)

> There is no functinal change in this patch.

Typo.

> +/**
> + * pud_set_huge - setup kernel PUD mapping
> + *
> + * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,

s/with a/with

> + * it does not set up a huge page when the range is covered by non-WB type
> + * of MTRRs.  0xFF indicates that MTRRs are disabled.
> + *
> + * Return 1 on success, and 0 on no-operation.

What is a 'no-operation'?

I suspect you want:

    * Returns 1 on success, and 0 when no PUD was set.


> +/**
> + * pmd_set_huge - setup kernel PMD mapping
> + *
> + * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
> + * it does not set up a huge page when the range is covered by non-WB type
> + * of MTRRs.  0xFF indicates that MTRRs are disabled.
> + *
> + * Return 1 on success, and 0 on no-operation.

Ditto (and the rest of the patch).

Thanks,

	Ingo

* Re: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry
  2015-03-10 20:23   ` Toshi Kani
@ 2015-03-11  6:32     ` Ingo Molnar
  -1 siblings, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-11  6:32 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle


* Toshi Kani <toshi.kani@hp.com> wrote:

> When an MTRR entry is inclusive to a requested range, i.e.
> the start and end of the request are not within the MTRR
> entry range but the range contains the MTRR entry entirely,
> __mtrr_type_lookup() ignores such case because both
> start_state and end_state are set to zero.

'ignores such a case' or 'ignores such cases'.

> This patch fixes the issue by adding a new flag, inclusive,

s/inclusive/'inclusive'

Thanks,

	Ingo

* Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-03-10 20:23   ` Toshi Kani
@ 2015-03-11  7:02     ` Ingo Molnar
  -1 siblings, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-11  7:02 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle


* Toshi Kani <toshi.kani@hp.com> wrote:

> This patch adds an additional argument, *uniform, to

s/*uniform/'uniform'

> mtrr_type_lookup(), which returns 1 when a given range is
> either fully covered by a single MTRR entry or not covered
> at all.

s/or not covered/or is not covered

> pud_set_huge() and pmd_set_huge() are changed to check the
> new uniform flag to see if it is safe to create a huge page

s/uniform/'uniform'

> mapping to the range.  This allows them to create a huge page
> mapping to a range covered by a single MTRR entry of any
> memory type.  It also detects an unoptimal request properly.

s/unoptimal/non-optimal

or nonoptimal

Also, some description in the changelog about what a 'non-optimal' 
request is would be most useful.

> They continue to check with the WB type since the WB type has
> no effect even if a request spans to multiple MTRR entries.

s/spans to/spans

> -static inline u8 mtrr_type_lookup(u64 addr, u64 end)
> +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
>  {
>  	/*
>  	 * Return no-MTRRs:
>  	 */
> +	*uniform = 1;
>  	return 0xff;
>  }
>  #define mtrr_save_fixed_ranges(arg) do {} while (0)
> diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
> index cdb955f..aef238c 100644
> --- a/arch/x86/kernel/cpu/mtrr/generic.c
> +++ b/arch/x86/kernel/cpu/mtrr/generic.c
> @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr)
>   * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
>   *		corresponds only to [start:*partial_end].
>   *		Caller has to lookup again for [*partial_end:end].
> + * *uniform == 1 The requested range is either fully covered by a single MTRR
> + *		 entry or not covered at all.
>   */

So I think a better approach would be to count the number of separate 
MTRR caching types a range is covered by, instead of this hard to 
qualify 'uniform' flag.

I.e. a 'nr_mtrr_types' count.

If for example a range partially intersects with an MTRR, then that 
count would be 2: the MTRR, and the outside (default cache policy) 
type.

( Note that this approach is not only easy to understand and easy 
  to review, but could also be refined in the future, to count the 
  number of _incompatible_ caching types present within a range. )
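
Something like this, as a rough sketch of the idea (hypothetical
interface, not the posted patch):

	unsigned int nr_mtrr_types;
	u8 type;

	/*
	 * Report how many distinct MTRR caching types cover [start, end),
	 * counting the default type for any uncovered part of the range.
	 */
	type = mtrr_type_lookup(start, end, &nr_mtrr_types);

	if (nr_mtrr_types <= 1) {
		/* a single effective type over the whole range */
	}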


> -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
> +static u8 __mtrr_type_lookup(u64 start, u64 end,
> +			     u64 *partial_end, int *repeat, u8 *uniform)
>  {
>  	int i;
>  	u64 base, mask;
>  	u8 prev_match, curr_match;
>  
>  	*repeat = 0;
> +	*uniform = 1;
> +
>  	if (!mtrr_state_set)
>  		return 0xFF;
>  
> @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
>  	/* Look in fixed ranges. Just return the type as per start */
>  	if (mtrr_state.have_fixed && (start < 0x100000)) {
>  		int idx;
> +		*uniform = 0;

So this function scares me, because the code is clearly crap:

        if (mtrr_state.have_fixed && (start < 0x100000)) {
	...
                } else if (start < 0x1000000) {
	...

How can that 'else if' branch ever not be true?

Did it perhaps want to be the other way around:

        if (mtrr_state.have_fixed && (start < 0x1000000)) {
	...
                } else if (start < 0x100000) {
	...

or did it simply mess up the condition?
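
For reference, the whole fixed-range block currently reads roughly:

        if (mtrr_state.have_fixed && (start < 0x100000)) {
                int idx;

                if (start < 0x80000) {
                        idx = 0;
                        idx += (start >> 16);
                        return mtrr_state.fixed_ranges[idx];
                } else if (start < 0xC0000) {
                        idx = 1 * 8;
                        idx += ((start - 0x80000) >> 14);
                        return mtrr_state.fixed_ranges[idx];
                } else if (start < 0x1000000) {
                        idx = 3 * 8;
                        idx += ((start - 0xC0000) >> 12);
                        return mtrr_state.fixed_ranges[idx];
                }
        }

i.e. once 'start < 0x100000' holds, the final 'start < 0x1000000' test
can never be false.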

>  
>  		if (start < 0x80000) {
>  			idx = 0;
> @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
>  
>  			end = *partial_end - 1; /* end is inclusive */
>  			*repeat = 1;
> +			*uniform = 0;
>  		}
>  
>  		if (!start_state)
> @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
>  			continue;
>  		}
>  
> +		*uniform = 0;
>  		if (check_type_overlap(&prev_match, &curr_match))
>  			return curr_match;
>  	}


> @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
>  }
>  
>  /*
> - * Returns the effective MTRR type for the region
> + * Returns the effective MTRR type for the region.  *uniform is set to 1
> + * when a given range is either fully covered by a single MTRR entry or
> + * not covered at all.
> + *
>   * Error return:
>   * 0xFF - when MTRR is not enabled
>   */
> -u8 mtrr_type_lookup(u64 start, u64 end)
> +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
>  {
> -	u8 type, prev_type;
> +	u8 type, prev_type, is_uniform, dummy;
>  	int repeat;
>  	u64 partial_end;
>  
> -	type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
> +	type = __mtrr_type_lookup(start, end,
> +				  &partial_end, &repeat, &is_uniform);
>  
>  	/*
>  	 * Common path is with repeat = 0.
> @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
>  	while (repeat) {
>  		prev_type = type;
>  		start = partial_end;
> -		type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
> +		is_uniform = 0;
>  
> -		if (check_type_overlap(&prev_type, &type))
> +		type = __mtrr_type_lookup(start, end,
> +					  &partial_end, &repeat, &dummy);
> +
> +		if (check_type_overlap(&prev_type, &type)) {
> +			*uniform = 0;
>  			return type;
> +		}
>  	}
>  
> +	*uniform = is_uniform;
>  	return type;

So the MTRR code is from hell, it would be nice to first clean up the 
whole code and the MTRR data structures before extending it with more 
complexity ...

Thanks,

	Ingo

* Re: [PATCH 1/3] mm, x86: Document return values of mapping funcs
  2015-03-11  6:30     ` Ingo Molnar
@ 2015-03-11 15:25       ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-11 15:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, Robert (Server Storage),
	pebolle

On Wed, 2015-03-11 at 06:30 +0000, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > Documented the return values of KVA mapping functions,
> > pud_set_huge(), pmd_set_huge, pud_clear_huge() and
> > pmd_clear_huge().
> > 
> > Simplified the conditions to select HAVE_ARCH_HUGE_VMAP
> > in Kconfig since X86_PAE depends on X86_32.
> 
> Changelogs are not a diary, they are a story, generally written in the 
> present tense. 

Oh, I see. Thanks for the tip!

> So it should be something like:
> 
>   Document the return values of KVA mapping functions,
>   pud_set_huge(), pmd_set_huge, pud_clear_huge() and
>   pmd_clear_huge().
> 
>   Simplify the conditions to select HAVE_ARCH_HUGE_VMAP
>   in the Kconfig, since X86_PAE depends on X86_32.
> 
> (also note the slight fixes I made to the text.)

Updated with the descriptions above.

> > There is no functinal change in this patch.
> 
> Typo.

Fixed.

> > +/**
> > + * pud_set_huge - setup kernel PUD mapping
> > + *
> > + * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
> 
> s/with a/with

Fixed.

> > + * it does not set up a huge page when the range is covered by non-WB type
> > + * of MTRRs.  0xFF indicates that MTRRs are disabled.
> > + *
> > + * Return 1 on success, and 0 on no-operation.
> 
> What is a 'no-operation'?
> 
> I suspect you want:
> 
>     * Returns 1 on success, and 0 when no PUD was set.

Yes, that's what it meant to say.

> > +/**
> > + * pmd_set_huge - setup kernel PMD mapping
> > + *
> > + * MTRRs can override PAT memory types with a 4KB granularity.  Therefore,
> > + * it does not set up a huge page when the range is covered by non-WB type
> > + * of MTRRs.  0xFF indicates that MTRRs are disabled.
> > + *
> > + * Return 1 on success, and 0 on no-operation.
> 
> Ditto (and the rest of the patch).

Updated all functions. I changed pud_clear_huge()'s description to:  

 * Return 1 on success, and 0 when no PUD map was found.

Thanks!
-Toshi


* Re: [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry
  2015-03-11  6:32     ` Ingo Molnar
@ 2015-03-11 15:27       ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-11 15:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle

On Wed, 2015-03-11 at 07:32 +0100, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > When an MTRR entry is inclusive to a requested range, i.e.
> > the start and end of the request are not within the MTRR
> > entry range but the range contains the MTRR entry entirely,
> > __mtrr_type_lookup() ignores such case because both
> > start_state and end_state are set to zero.
> 
> 'ignores such a case' or 'ignores such cases'.

Changed to 'ignores such a case'.

> > This patch fixes the issue by adding a new flag, inclusive,
> 
> s/inclusive/'inclusive'

Updated.
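
For context, the fix described above amounts to also treating an MTRR entry
that lies entirely inside the requested range as an overlap.  A minimal
sketch of the check in the variable-range loop (variable names and placement
assumed for illustration, not the exact hunk):

	start_state = ((start & mask) == (base & mask));
	end_state   = ((end & mask) == (base & mask));
	inclusive   = ((start < base) && (end > base));

	/*
	 * Without 'inclusive', an entry fully contained in [start, end]
	 * has start_state == end_state == 0 and is silently skipped.
	 */
	if ((start_state != end_state) || inclusive) {
		/* handle the partial/inclusive overlap */
		...
	}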

Thanks!
-Toshi



* Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-03-11  7:02     ` Ingo Molnar
@ 2015-03-11 16:52       ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-11 16:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle

On Wed, 2015-03-11 at 08:02 +0100, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > This patch adds an additional argument, *uniform, to
> 
> s/*uniform/'uniform'

Done.

> > mtrr_type_lookup(), which returns 1 when a given range is
> > either fully covered by a single MTRR entry or not covered
> > at all.
> 
> s/or not covered/or is not covered

Done.

> > pud_set_huge() and pmd_set_huge() are changed to check the
> > new uniform flag to see if it is safe to create a huge page
> 
> s/uniform/'uniform'

Done.

> > mapping to the range.  This allows them to create a huge page
> > mapping to a range covered by a single MTRR entry of any
> > memory type.  It also detects an unoptimal request properly.
> 
> s/unoptimal/non-optimal

Done.

> or nonoptimal
> 
> Also, some description in the changelog about what a 'non-optimal' 
> request is would be most useful.
> 
> > They continue to check with the WB type since the WB type has
> > no effect even if a request spans to multiple MTRR entries.
> 
> s/spans to/spans

Done.

> > -static inline u8 mtrr_type_lookup(u64 addr, u64 end)
> > +static inline u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform)
> >  {
> >  	/*
> >  	 * Return no-MTRRs:
> >  	 */
> > +	*uniform = 1;
> >  	return 0xff;
> >  }
> >  #define mtrr_save_fixed_ranges(arg) do {} while (0)
> > diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
> > index cdb955f..aef238c 100644
> > --- a/arch/x86/kernel/cpu/mtrr/generic.c
> > +++ b/arch/x86/kernel/cpu/mtrr/generic.c
> > @@ -108,14 +108,19 @@ static int check_type_overlap(u8 *prev, u8 *curr)
> >   * *repeat == 1 implies [start:end] spanned across MTRR range and type returned
> >   *		corresponds only to [start:*partial_end].
> >   *		Caller has to lookup again for [*partial_end:end].
> > + * *uniform == 1 The requested range is either fully covered by a single MTRR
> > + *		 entry or not covered at all.
> >   */
> 
> So I think a better approach would be to count the number of separate 
> MTRR caching types a range is covered by, instead of this hard to 
> qualify 'uniform' flag.
> 
> I.e. a 'nr_mtrr_types' count.
> 
> If for example a range partially intersects with an MTRR, then that 
> count would be 2: the MTRR, and the outside (default cache policy) 
> type.
> 
> ( Note that this approach is not only easy to understand and easy 
>   to review, but could also be refined in the future, to count the 
>   number of _incompatible_ caching types present within a range. )

I agree that using a count is more flexible.  However, it has the
issues described below.

 - MTRRs have both fixed and variable ranges.  The first 1MB is covered
by 11 fixed-range registers of different sizes (512KB, 128KB, and 32KB
per register).  __mtrr_type_lookup() checks the memory type of the
range at 'start', but does not check whether a requested range spans
multiple memory types.  This first 1MB can simply be handled as
'uniform = 0', since processors do not create a huge page mapping in
this range anyway.  However, setting a correct value in 'nr_mtrr_types'
would require a major overhaul of this code.

 - mtrr_type_lookup() returns without walking through all MTRR entries
when check_type_overlap() returns 1, i.e. when the overlap made the
resulting memory type UC.  In this case, the code cannot set a correct
value in 'nr_mtrr_types'.

Since MTRRs are legacy, especially the fixed ranges, there is not much
benefit in enhancing the functionality of mtrr_type_lookup() unless
there is an issue on current platforms.  For this patch, we only need
to know whether the count is 1 or greater than 1.  So, I think using
'uniform' makes sense for simplicity.
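
To make the intended use concrete, here is a rough caller-side sketch of
pud_set_huge() checking the 'uniform' output (simplified and hand-written
for illustration; the real patch may differ in constants, warnings and
error handling):

	int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
	{
		u8 mtrr, uniform;

		mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);

		/*
		 * 0xFF means MTRRs are disabled, so PAT alone decides the
		 * type.  Otherwise, refuse the huge page when the range is
		 * not uniform and the effective type is not WB (WB is safe
		 * even when the range spans multiple MTRR entries).
		 */
		if ((mtrr != 0xFF) && !uniform && (mtrr != MTRR_TYPE_WRBACK))
			return 0;	/* fall back to smaller page mappings */

		prot = pgprot_4k_2_large(prot);
		set_pte((pte_t *)pud,
			pfn_pte((u64)addr >> PAGE_SHIFT,
				__pgprot(pgprot_val(prot) | _PAGE_PSE)));
		return 1;
	}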

> > -static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
> > +static u8 __mtrr_type_lookup(u64 start, u64 end,
> > +			     u64 *partial_end, int *repeat, u8 *uniform)
> >  {
> >  	int i;
> >  	u64 base, mask;
> >  	u8 prev_match, curr_match;
> >  
> >  	*repeat = 0;
> > +	*uniform = 1;
> > +
> >  	if (!mtrr_state_set)
> >  		return 0xFF;
> >  
> > @@ -128,6 +133,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
> >  	/* Look in fixed ranges. Just return the type as per start */
> >  	if (mtrr_state.have_fixed && (start < 0x100000)) {
> >  		int idx;
> > +		*uniform = 0;
> 
> So this function scares me, because the code is clearly crap:
> 
>         if (mtrr_state.have_fixed && (start < 0x100000)) {
> 	...
>                 } else if (start < 0x1000000) {
> 	...
> 
> How can that 'else if' branch ever not be true?

This 'else if' is always true.  So, it can be simply 'else' without any
condition.

> Did it perhaps want to be the other way around:
> 
>         if (mtrr_state.have_fixed && (start < 0x1000000)) {
> 	...
>                 } else if (start < 0x100000) {
> 	...
> 
> or did it simply mess up the condition?

I think it was just paranoid to test the same condition twice...

> >  
> >  		if (start < 0x80000) {
> >  			idx = 0;
> > @@ -195,6 +201,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
> >  
> >  			end = *partial_end - 1; /* end is inclusive */
> >  			*repeat = 1;
> > +			*uniform = 0;
> >  		}
> >  
> >  		if (!start_state)
> > @@ -206,6 +213,7 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
> >  			continue;
> >  		}
> >  
> > +		*uniform = 0;
> >  		if (check_type_overlap(&prev_match, &curr_match))
> >  			return curr_match;
> >  	}
> 
> 
> > @@ -222,17 +230,21 @@ static u8 __mtrr_type_lookup(u64 start, u64 end, u64 *partial_end, int *repeat)
> >  }
> >  
> >  /*
> > - * Returns the effective MTRR type for the region
> > + * Returns the effective MTRR type for the region.  *uniform is set to 1
> > + * when a given range is either fully covered by a single MTRR entry or
> > + * not covered at all.
> > + *
> >   * Error return:
> >   * 0xFF - when MTRR is not enabled
> >   */
> > -u8 mtrr_type_lookup(u64 start, u64 end)
> > +u8 mtrr_type_lookup(u64 start, u64 end, u8 *uniform)
> >  {
> > -	u8 type, prev_type;
> > +	u8 type, prev_type, is_uniform, dummy;
> >  	int repeat;
> >  	u64 partial_end;
> >  
> > -	type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
> > +	type = __mtrr_type_lookup(start, end,
> > +				  &partial_end, &repeat, &is_uniform);
> >  
> >  	/*
> >  	 * Common path is with repeat = 0.
> > @@ -242,12 +254,18 @@ u8 mtrr_type_lookup(u64 start, u64 end)
> >  	while (repeat) {
> >  		prev_type = type;
> >  		start = partial_end;
> > -		type = __mtrr_type_lookup(start, end, &partial_end, &repeat);
> > +		is_uniform = 0;
> >  
> > -		if (check_type_overlap(&prev_type, &type))
> > +		type = __mtrr_type_lookup(start, end,
> > +					  &partial_end, &repeat, &dummy);
> > +
> > +		if (check_type_overlap(&prev_type, &type)) {
> > +			*uniform = 0;
> >  			return type;
> > +		}
> >  	}
> >  
> > +	*uniform = is_uniform;
> >  	return type;
> 
> So the MTRR code is from hell, it would be nice to first clean up the 
> whole code and the MTRR data structures before extending it with more 
> complexity ...

Good idea.  I will clean up the code (no functional change) before
making this change.

Thanks,
-Toshi



* Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-03-11 16:52       ` Toshi Kani
@ 2015-03-12 11:03         ` Ingo Molnar
  -1 siblings, 0 replies; 24+ messages in thread
From: Ingo Molnar @ 2015-03-12 11:03 UTC (permalink / raw)
  To: Toshi Kani
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, pebolle


* Toshi Kani <toshi.kani@hp.com> wrote:

> > Did it perhaps want to be the other way around:
> > 
> >         if (mtrr_state.have_fixed && (start < 0x1000000)) {
> > 	...
> >                 } else if (start < 0x100000) {
> > 	...
> > 
> > or did it simply mess up the condition?
> 
> I think it was just paranoid to test the same condition twice...

Read the code again, it's _not_ the same condition ...

Thanks,

	Ingo


* Re: [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping
  2015-03-12 11:03         ` Ingo Molnar
@ 2015-03-12 13:58           ` Toshi Kani
  -1 siblings, 0 replies; 24+ messages in thread
From: Toshi Kani @ 2015-03-12 13:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, hpa, tglx, mingo, arnd, linux-mm, x86, linux-kernel,
	dave.hansen, Elliott, Robert (Server Storage),
	pebolle

On Thu, 2015-03-12 at 11:03 +0000, Ingo Molnar wrote:
> * Toshi Kani <toshi.kani@hp.com> wrote:
> 
> > > Did it perhaps want to be the other way around:
> > > 
> > >         if (mtrr_state.have_fixed && (start < 0x1000000)) {
> > > 	...
> > >                 } else if (start < 0x100000) {
> > > 	...
> > > 
> > > or did it simply mess up the condition?
> > 
> > I think it was just paranoid to test the same condition twice...
> 
> Read the code again, it's _not_ the same condition ...

Oh, I see...  It must be a typo.  The fixed range is 0x0 to 0xFFFFF, so
it only makes sense to check with (start < 0x100000).
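
In other words, the fixed-range branch was presumably meant to look like
this (reconstructed for illustration; index arithmetic abbreviated from
the existing code):

	/* Fixed-range MTRRs only cover physical 0x00000 - 0xFFFFF */
	if (mtrr_state.have_fixed && (start < 0x100000)) {
		int idx;

		if (start < 0x80000) {		/* 64KB-granularity regs */
			idx = 0 + (start >> 16);
		} else if (start < 0xC0000) {	/* 16KB-granularity regs */
			idx = 8 + ((start - 0x80000) >> 14);
		} else {			/* 4KB-granularity regs */
			/* was 'else if (start < 0x1000000)': the typo */
			idx = 24 + ((start - 0xC0000) >> 12);
		}
		return mtrr_state.fixed_ranges[idx];
	}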

Thanks,
-Toshi



end of thread, other threads:[~2015-03-12 13:59 UTC | newest]

Thread overview: 12 messages
2015-03-10 20:23 [PATCH 0/3] mtrr, mm, x86: Enhance MTRR checks for huge I/O mapping Toshi Kani
2015-03-10 20:23 ` [PATCH 1/3] mm, x86: Document return values of mapping funcs Toshi Kani
2015-03-11  6:30   ` Ingo Molnar
2015-03-11 15:25     ` Toshi Kani
2015-03-10 20:23 ` [PATCH 2/3] mtrr, x86: Fix MTRR lookup to handle inclusive entry Toshi Kani
2015-03-11  6:32   ` Ingo Molnar
2015-03-11 15:27     ` Toshi Kani
2015-03-10 20:23 ` [PATCH 3/3] mtrr, mm, x86: Enhance MTRR checks for KVA huge page mapping Toshi Kani
2015-03-11  7:02   ` Ingo Molnar
2015-03-11 16:52     ` Toshi Kani
2015-03-12 11:03       ` Ingo Molnar
2015-03-12 13:58         ` Toshi Kani
