linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/5] Support Write-Through mapping on x86
@ 2014-09-17 19:48 Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 1/5] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Toshi Kani @ 2014-09-17 19:48 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk

This patchset adds support for Write-Through (WT) mapping on x86.
The study below shows that using WT mapping may be useful for
non-volatile memory.

  http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf

This patchset applies on top of Juergen's patchset below,
which provides the basis of PAT management.

  https://lkml.org/lkml/2014/9/12/205

All new/modified interfaces have been tested.

v3:
 - Dropped the set_memory_wt() patch. (Andy Lutomirski)
 - Refactored the !pat_enabled handling. (H. Peter Anvin,
   Andy Lutomirski)
 - Added the picture of PTE encoding. (Konrad Rzeszutek Wilk)

v2:
 - Changed WT to use slot 7 of the PAT MSR. (H. Peter Anvin,
   Andy Lutomirski)
 - Changed to have conservative checks to exclude all Pentium 2, 3,
   M, and 4 families. (Ingo Molnar, Henrique de Moraes Holschuh,
   Andy Lutomirski)
 - Updated documentation to cover WT interfaces and usages.
   (Andy Lutomirski, Yigal Korman)

---
Toshi Kani (5):
  1/5 x86, mm, pat: Set WT to PA7 slot of PAT MSR
  2/5 x86, mm, pat: Change reserve_memtype() to handle WT
  3/5 x86, mm, asm-gen: Add ioremap_wt() for WT
  4/5 x86, mm, pat: Add pgprot_writethrough() for WT
  5/5 x86, mm, pat: Refactor !pat_enabled handling

---
 Documentation/x86/pat.txt            |   4 +-
 arch/x86/include/asm/cacheflush.h    |   4 +
 arch/x86/include/asm/io.h            |   2 +
 arch/x86/include/asm/pgtable_types.h |   3 +
 arch/x86/mm/init.c                   |   6 +-
 arch/x86/mm/iomap_32.c               |  18 ++--
 arch/x86/mm/ioremap.c                |  26 +++++-
 arch/x86/mm/pageattr.c               |   3 -
 arch/x86/mm/pat.c                    | 160 ++++++++++++++++++++++-------------
 include/asm-generic/io.h             |   4 +
 include/asm-generic/iomap.h          |   4 +
 include/asm-generic/pgtable.h        |   4 +
 12 files changed, 156 insertions(+), 82 deletions(-)


* [PATCH v3 1/5] x86, mm, pat: Set WT to PA7 slot of PAT MSR
  2014-09-17 19:48 [PATCH v3 0/5] Support Write-Through mapping on x86 Toshi Kani
@ 2014-09-17 19:48 ` Toshi Kani
  2014-09-23  5:46   ` Juergen Gross
  2014-09-17 19:48 ` [PATCH v3 2/5] x86, mm, pat: Change reserve_memtype() to handle WT Toshi Kani
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 8+ messages in thread
From: Toshi Kani @ 2014-09-17 19:48 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Toshi Kani

This patch sets WT to the PA7 slot in the PAT MSR when the processor
is not affected by the PAT errata.  The PA7 slot is chosen to further
minimize the risk of using the PAT bit as the PA3 slot is UC and is
not currently used.

The following Intel processors are affected by the PAT errata.

   errata               cpuid
   ----------------------------------------------------
   Pentium 2, A52       family 0x6, model 0x5
   Pentium 3, E27       family 0x6, model 0x7, 0x8
   Pentium 3 Xeon, G26  family 0x6, model 0x7, 0x8, 0xa
   Pentium M, Y26       family 0x6, model 0x9
   Pentium M 90nm, X9   family 0x6, model 0xd
   Pentium 4, N46       family 0xf, model 0x0

Instead of making sharp boundary checks, this patch makes conservative
checks to exclude all Pentium 2, 3, M and 4 family processors.  For
such processors, _PAGE_CACHE_MODE_WT is redirected to UC- per the
default setup in __cachemode2pte_tbl[].
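
For reference, a minimal user-space sketch (not part of the patch) that
mirrors the PAT() macro in pat.c and computes the MSR value programmed by
the full-PAT case; the PAT_* encodings are the architectural values from
the Intel SDM, and the expected constant is worked out by hand:

  #include <stdint.h>
  #include <stdio.h>

  /* Architectural PAT memory-type encodings (Intel SDM) */
  enum { PAT_UC = 0, PAT_WC = 1, PAT_WT = 4, PAT_WP = 5,
         PAT_WB = 6, PAT_UC_MINUS = 7 };

  /* Slot x of the PAT MSR occupies byte x */
  #define PAT(x, y)	((uint64_t)PAT_ ## y << ((x) * 8))

  int main(void)
  {
  	uint64_t pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) |
  		       PAT(3, UC) | PAT(4, WB) | PAT(5, WC) |
  		       PAT(6, UC_MINUS) | PAT(7, WT);

  	/* Prints 0x0407010600070106: WT (0x04) sits in the PA7 byte */
  	printf("PAT MSR = 0x%016llx\n", (unsigned long long)pat);
  	return 0;
  }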

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/mm/pat.c |   64 +++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 49 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index ff31851..db687c3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -133,6 +133,7 @@ void pat_init(void)
 {
 	u64 pat;
 	bool boot_cpu = !boot_pat_state;
+	struct cpuinfo_x86 *c = &boot_cpu_data;
 
 	if (!pat_enabled)
 		return;
@@ -153,21 +154,54 @@ void pat_init(void)
 		}
 	}
 
-	/* Set PWT to Write-Combining. All other bits stay the same */
-	/*
-	 * PTE encoding used in Linux:
-	 *      PAT
-	 *      |PCD
-	 *      ||PWT
-	 *      |||
-	 *      000 WB		_PAGE_CACHE_WB
-	 *      001 WC		_PAGE_CACHE_WC
-	 *      010 UC-		_PAGE_CACHE_UC_MINUS
-	 *      011 UC		_PAGE_CACHE_UC
-	 * PAT bit unused
-	 */
-	pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
-	      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
+	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
+	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
+	     ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
+		/*
+		 * PAT support with the lower four entries. Intel Pentium 2,
+		 * 3, M, and 4 are affected by PAT errata, which makes the
+		 * upper four entries unusable.  To be safe, we do not use the
+		 * upper four entries for any of the affected processor families.
+		 *
+		 *  PTE encoding used in Linux:
+		 *      PAT
+		 *      |PCD
+		 *      ||PWT  PAT
+		 *      |||    slot
+		 *      000    0    WB : _PAGE_CACHE_MODE_WB
+		 *      001    1    WC : _PAGE_CACHE_MODE_WC
+		 *      010    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
+		 *      011    3    UC : _PAGE_CACHE_MODE_UC
+		 * PAT bit unused
+		 *
+		 * NOTE: When WT or WP is used, it is redirected to UC- per
+		 * the default setup in __cachemode2pte_tbl[].
+		 */
+		pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
+		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
+	} else {
+		/*
+		 * PAT full support. WT is set to slot 7, which minimizes
+		 * the risk of using the PAT bit as slot 3 is UC and is
+		 * currently unused. Slot 4 should remain as reserved.
+		 *
+		 *  PTE encoding used in Linux:
+		 *      PAT
+		 *      |PCD
+		 *      ||PWT  PAT
+		 *      |||    slot
+		 *      000    0    WB : _PAGE_CACHE_MODE_WB
+		 *      001    1    WC : _PAGE_CACHE_MODE_WC
+		 *      010    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
+		 *      011    3    UC : _PAGE_CACHE_MODE_UC
+		 *      100    4    <reserved>
+		 *      101    5    <reserved>
+		 *      110    6    <reserved>
+		 *      111    7    WT : _PAGE_CACHE_MODE_WT
+		 */
+		pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
+		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
+	}
 
 	/* Boot CPU check */
 	if (!boot_pat_state)


* [PATCH v3 2/5] x86, mm, pat: Change reserve_memtype() to handle WT
  2014-09-17 19:48 [PATCH v3 0/5] Support Write-Through mapping on x86 Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 1/5] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
@ 2014-09-17 19:48 ` Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 3/5] x86, mm, asm-gen: Add ioremap_wt() for WT Toshi Kani
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Toshi Kani @ 2014-09-17 19:48 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Toshi Kani

This patch changes reserve_memtype() to handle the WT cache mode.
When PAT is not enabled, it continues to set *new_type to UC- for
any non-WB request.

When a target range is RAM, reserve_ram_pages_type() fails for WT
for now, since the page flags it uses can track only three memory
types: WB, WC and UC.
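
For illustration, a hypothetical caller (not part of this patchset) that
requests WT and observes the fallback reported through new_type; the
function name and the pr_info() message are made up:

  static int example_reserve_wt(u64 start, u64 end)
  {
  	enum page_cache_mode new_type;
  	int ret;

  	ret = reserve_memtype(start, end, _PAGE_CACHE_MODE_WT, &new_type);
  	if (ret)
  		return ret;	/* RAM ranges: -EINVAL, new_type set to UC- */

  	pr_info("reserved [0x%llx-0x%llx) as memtype %d\n",
  		(unsigned long long)start, (unsigned long long)end, new_type);
  	free_memtype(start, end);
  	return 0;
  }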

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/cacheflush.h |    4 ++++
 arch/x86/mm/pat.c                 |   16 +++++++++++++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 157644b..c912680 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -53,6 +53,10 @@ static inline void set_page_memtype(struct page *pg,
 	case _PAGE_CACHE_MODE_WB:
 		memtype_flags = _PGMT_WB;
 		break;
+	case _PAGE_CACHE_MODE_WT:
+	case _PAGE_CACHE_MODE_WP:
+		pr_err("set_page_memtype: unsupported cachemode %d\n", memtype);
+		BUG();
 	default:
 		memtype_flags = _PGMT_DEFAULT;
 		break;
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index db687c3..a214f5a 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -289,6 +289,8 @@ static int pat_pagerange_is_ram(resource_size_t start, resource_size_t end)
 
 /*
  * For RAM pages, we use page flags to mark the pages with appropriate type.
+ * The page flags are currently limited to three types, WB, WC and UC. Hence,
+ * any request to WT or WP will fail with -EINVAL.
  * Here we do two pass:
  * - Find the memtype of all the pages in the range, look for any conflicts
  * - In case of no conflicts, set the new memtype for pages in the range
@@ -300,6 +302,13 @@ static int reserve_ram_pages_type(u64 start, u64 end,
 	struct page *page;
 	u64 pfn;
 
+	if ((req_type == _PAGE_CACHE_MODE_WT) ||
+	    (req_type == _PAGE_CACHE_MODE_WP)) {
+		if (new_type)
+			*new_type = _PAGE_CACHE_MODE_UC_MINUS;
+		return -EINVAL;
+	}
+
 	if (req_type == _PAGE_CACHE_MODE_UC) {
 		/* We do not support strong UC */
 		WARN_ON_ONCE(1);
@@ -349,6 +358,7 @@ static int free_ram_pages_type(u64 start, u64 end)
  * - _PAGE_CACHE_MODE_WC
  * - _PAGE_CACHE_MODE_UC_MINUS
  * - _PAGE_CACHE_MODE_UC
+ * - _PAGE_CACHE_MODE_WT
  *
  * If new_type is NULL, function will return an error if it cannot reserve the
  * region with req_type. If new_type is non-NULL, function will return
@@ -368,10 +378,10 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 	if (!pat_enabled) {
 		/* This is identical to page table setting without PAT */
 		if (new_type) {
-			if (req_type == _PAGE_CACHE_MODE_WC)
-				*new_type = _PAGE_CACHE_MODE_UC_MINUS;
+			if (req_type == _PAGE_CACHE_MODE_WB)
+				*new_type = _PAGE_CACHE_MODE_WB;
 			else
-				*new_type = req_type;
+				*new_type = _PAGE_CACHE_MODE_UC_MINUS;
 		}
 		return 0;
 	}


* [PATCH v3 3/5] x86, mm, asm-gen: Add ioremap_wt() for WT
  2014-09-17 19:48 [PATCH v3 0/5] Support Write-Through mapping on x86 Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 1/5] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 2/5] x86, mm, pat: Change reserve_memtype() to handle WT Toshi Kani
@ 2014-09-17 19:48 ` Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 4/5] x86, mm, pat: Add pgprot_writethrough() " Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 5/5] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
  4 siblings, 0 replies; 8+ messages in thread
From: Toshi Kani @ 2014-09-17 19:48 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Toshi Kani

This patch adds ioremap_wt() for creating WT mappings on x86.
It follows the same model as ioremap_wc() for multi-architecture
support.  ARCH_HAS_IOREMAP_WT is defined in the x86 version of
io.h to indicate that ioremap_wt() is implemented on x86.

Also update the PAT documentation file to cover ioremap_wt().
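
A minimal usage sketch, assuming a hypothetical driver and a made-up
physical range (EXAMPLE_PHYS_ADDR and EXAMPLE_SIZE are placeholders,
not values from this patchset):

  #include <linux/io.h>
  #include <linux/sizes.h>

  #define EXAMPLE_PHYS_ADDR	0x100000000ULL	/* hypothetical device range */
  #define EXAMPLE_SIZE		SZ_4K

  static int example_map_wt(void)
  {
  	void __iomem *base;

  	base = ioremap_wt(EXAMPLE_PHYS_ADDR, EXAMPLE_SIZE);
  	if (!base)
  		return -ENOMEM;

  	/* Writes reach memory while the cache stays coherent (WT) */
  	writel(0x12345678, base);

  	iounmap(base);
  	return 0;
  }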

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 Documentation/x86/pat.txt   |    4 +++-
 arch/x86/include/asm/io.h   |    2 ++
 arch/x86/mm/ioremap.c       |   24 ++++++++++++++++++++++++
 include/asm-generic/io.h    |    4 ++++
 include/asm-generic/iomap.h |    4 ++++
 5 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
index cf08c9f..be7b8c2 100644
--- a/Documentation/x86/pat.txt
+++ b/Documentation/x86/pat.txt
@@ -12,7 +12,7 @@ virtual addresses.
 
 PAT allows for different types of memory attributes. The most commonly used
 ones that will be supported at this time are Write-back, Uncached,
-Write-combined and Uncached Minus.
+Write-combined, Write-through and Uncached Minus.
 
 
 PAT APIs
@@ -38,6 +38,8 @@ ioremap_nocache        |    --    |    UC-     |       UC-        |
                        |          |            |                  |
 ioremap_wc             |    --    |    --      |       WC         |
                        |          |            |                  |
+ioremap_wt             |    --    |    --      |       WT         |
+                       |          |            |                  |
 set_memory_uc          |    UC-   |    --      |       --         |
  set_memory_wb         |          |            |                  |
                        |          |            |                  |
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 71b9e65..c813c86 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -35,6 +35,7 @@
   */
 
 #define ARCH_HAS_IOREMAP_WC
+#define ARCH_HAS_IOREMAP_WT
 
 #include <linux/string.h>
 #include <linux/compiler.h>
@@ -316,6 +317,7 @@ extern void unxlate_dev_mem_ptr(unsigned long phys, void *addr);
 extern int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 				enum page_cache_mode pcm);
 extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
+extern void __iomem *ioremap_wt(resource_size_t offset, unsigned long size);
 
 extern bool is_early_ioremap_ptep(pte_t *ptep);
 
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 885fe44..952f4b4 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -155,6 +155,10 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		prot = __pgprot(pgprot_val(prot) |
 				cachemode2protval(_PAGE_CACHE_MODE_WC));
 		break;
+	case _PAGE_CACHE_MODE_WT:
+		prot = __pgprot(pgprot_val(prot) |
+				cachemode2protval(_PAGE_CACHE_MODE_WT));
+		break;
 	case _PAGE_CACHE_MODE_WB:
 		break;
 	}
@@ -249,6 +253,26 @@ void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 }
 EXPORT_SYMBOL(ioremap_wc);
 
+/**
+ * ioremap_wt	-	map memory into CPU space write through
+ * @phys_addr:	bus address of the memory
+ * @size:	size of the resource to map
+ *
+ * This version of ioremap ensures that the memory is marked write through.
+ * Write through writes data into memory while keeping the cache up-to-date.
+ *
+ * Must be freed with iounmap.
+ */
+void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
+{
+	if (pat_enabled)
+		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
+					__builtin_return_address(0));
+	else
+		return ioremap_nocache(phys_addr, size);
+}
+EXPORT_SYMBOL(ioremap_wt);
+
 void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
 {
 	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 975e1cc..405d418 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -322,6 +322,10 @@ static inline void __iomem *ioremap(phys_addr_t offset, unsigned long size)
 #define ioremap_wc ioremap_nocache
 #endif
 
+#ifndef ioremap_wt
+#define ioremap_wt ioremap_nocache
+#endif
+
 static inline void iounmap(void __iomem *addr)
 {
 }
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 1b41011..d8f8622 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -66,6 +66,10 @@ extern void ioport_unmap(void __iomem *);
 #define ioremap_wc ioremap_nocache
 #endif
 
+#ifndef ARCH_HAS_IOREMAP_WT
+#define ioremap_wt ioremap_nocache
+#endif
+
 #ifdef CONFIG_PCI
 /* Destroy a virtual mapping cookie for a PCI BAR (memory or IO) */
 struct pci_dev;


* [PATCH v3 4/5] x86, mm, pat: Add pgprot_writethrough() for WT
  2014-09-17 19:48 [PATCH v3 0/5] Support Write-Through mapping on x86 Toshi Kani
                   ` (2 preceding siblings ...)
  2014-09-17 19:48 ` [PATCH v3 3/5] x86, mm, asm-gen: Add ioremap_wt() for WT Toshi Kani
@ 2014-09-17 19:48 ` Toshi Kani
  2014-09-17 19:48 ` [PATCH v3 5/5] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
  4 siblings, 0 replies; 8+ messages in thread
From: Toshi Kani @ 2014-09-17 19:48 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Toshi Kani

This patch adds pgprot_writethrough() for setting WT in a given
pgprot_t.
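
A usage sketch under assumptions: a hypothetical driver mmap handler in
which the backing physical address (example_dev_phys) is made up and
would normally come from the device:

  static const phys_addr_t example_dev_phys = 0x100000000ULL; /* hypothetical */

  static int example_mmap(struct file *file, struct vm_area_struct *vma)
  {
  	unsigned long size = vma->vm_end - vma->vm_start;
  	unsigned long pfn = example_dev_phys >> PAGE_SHIFT;

  	/* Mark the user mapping Write-Through before remapping the pages */
  	vma->vm_page_prot = pgprot_writethrough(vma->vm_page_prot);

  	return remap_pfn_range(vma, vma->vm_start, pfn, size,
  			       vma->vm_page_prot);
  }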

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/include/asm/pgtable_types.h |    3 +++
 arch/x86/mm/pat.c                    |   10 ++++++++++
 include/asm-generic/pgtable.h        |    4 ++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index bd2f50f..cc7c65d 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -394,6 +394,9 @@ extern int nx_enabled;
 #define pgprot_writecombine	pgprot_writecombine
 extern pgprot_t pgprot_writecombine(pgprot_t prot);
 
+#define pgprot_writethrough	pgprot_writethrough
+extern pgprot_t pgprot_writethrough(pgprot_t prot);
+
 /* Indicate that x86 has its own track and untrack pfn vma functions */
 #define __HAVE_PFNMAP_TRACKING
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index a214f5a..a0264d3 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -896,6 +896,16 @@ pgprot_t pgprot_writecombine(pgprot_t prot)
 }
 EXPORT_SYMBOL_GPL(pgprot_writecombine);
 
+pgprot_t pgprot_writethrough(pgprot_t prot)
+{
+	if (pat_enabled)
+		return __pgprot(pgprot_val(prot) |
+				cachemode2protval(_PAGE_CACHE_MODE_WT));
+	else
+		return pgprot_noncached(prot);
+}
+EXPORT_SYMBOL_GPL(pgprot_writethrough);
+
 #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_PAT)
 
 static struct memtype *memtype_get_idx(loff_t pos)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 53b2acc..1af0ed9 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -249,6 +249,10 @@ static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
 #define pgprot_writecombine pgprot_noncached
 #endif
 
+#ifndef pgprot_writethrough
+#define pgprot_writethrough pgprot_noncached
+#endif
+
 /*
  * When walking page tables, get the address of the next boundary,
  * or the end address of the range if that comes earlier.  Although no


* [PATCH v3 5/5] x86, mm, pat: Refactor !pat_enabled handling
  2014-09-17 19:48 [PATCH v3 0/5] Support Write-Through mapping on x86 Toshi Kani
                   ` (3 preceding siblings ...)
  2014-09-17 19:48 ` [PATCH v3 4/5] x86, mm, pat: Add pgprot_writethrough() " Toshi Kani
@ 2014-09-17 19:48 ` Toshi Kani
  2014-09-23  6:25   ` Juergen Gross
  4 siblings, 1 reply; 8+ messages in thread
From: Toshi Kani @ 2014-09-17 19:48 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Toshi Kani

This patch refactors the !pat_enabled handling code and integrates
this case into the PAT abstraction code. The PAT table is emulated
using the two cache attribute bits, PWT (Write Through) and PCD
(Cache Disable). The emulated PAT table is also the same as the
BIOS default setup when the system has PAT but the "nopat" boot
option is specified.

As a result of this change, cache aliasing is checked for all cases
including !pat_enabled.
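
For reference, the emulated table below works out to the architectural
power-on default value of the PAT MSR (byte values per the Intel SDM;
the arithmetic is shown here for illustration and is not part of the
patch):

  slot:  7    6    5    4    3    2    1    0
  type:  UC   UC-  WT   WB   UC   UC-  WT   WB
  byte: 0x00 0x07 0x04 0x06 0x00 0x07 0x04 0x06  ->  0x0007040600070406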

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/mm/init.c     |    6 ++-
 arch/x86/mm/iomap_32.c |   18 +++-------
 arch/x86/mm/ioremap.c  |   10 +----
 arch/x86/mm/pageattr.c |    3 --
 arch/x86/mm/pat.c      |   90 +++++++++++++++++++++---------------------------
 5 files changed, 50 insertions(+), 77 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 82b41d5..2e147c8 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -37,7 +37,7 @@
  */
 uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {
 	[_PAGE_CACHE_MODE_WB]		= 0,
-	[_PAGE_CACHE_MODE_WC]		= _PAGE_PWT,
+	[_PAGE_CACHE_MODE_WC]		= _PAGE_PCD,
 	[_PAGE_CACHE_MODE_UC_MINUS]	= _PAGE_PCD,
 	[_PAGE_CACHE_MODE_UC]		= _PAGE_PCD | _PAGE_PWT,
 	[_PAGE_CACHE_MODE_WT]		= _PAGE_PCD,
@@ -46,11 +46,11 @@ uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {
 EXPORT_SYMBOL_GPL(__cachemode2pte_tbl);
 uint8_t __pte2cachemode_tbl[8] = {
 	[__pte2cm_idx(0)] = _PAGE_CACHE_MODE_WB,
-	[__pte2cm_idx(_PAGE_PWT)] = _PAGE_CACHE_MODE_WC,
+	[__pte2cm_idx(_PAGE_PWT)] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(_PAGE_PCD)] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD)] = _PAGE_CACHE_MODE_UC,
 	[__pte2cm_idx(_PAGE_PAT)] = _PAGE_CACHE_MODE_WB,
-	[__pte2cm_idx(_PAGE_PWT | _PAGE_PAT)] = _PAGE_CACHE_MODE_WC,
+	[__pte2cm_idx(_PAGE_PWT | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(_PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index ee58a0b..96aa8bf 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -70,29 +70,23 @@ void *kmap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)
 	return (void *)vaddr;
 }
 
-/*
- * Map 'pfn' using protections 'prot'
- */
-#define __PAGE_KERNEL_WC	(__PAGE_KERNEL | \
-				 cachemode2protval(_PAGE_CACHE_MODE_WC))
-
 void __iomem *
 iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)
 {
 	/*
-	 * For non-PAT systems, promote PAGE_KERNEL_WC to PAGE_KERNEL_UC_MINUS.
-	 * PAGE_KERNEL_WC maps to PWT, which translates to uncached if the
-	 * MTRR is UC or WC.  UC_MINUS gets the real intention, of the
-	 * user, which is "WC if the MTRR is WC, UC if you can't do that."
+	 * For non-PAT systems, translate a non-WB request to UC- just in
+	 * case the caller set the PWT bit in prot directly without using
+	 * pgprot_writecombine(). UC- translates to uncached if the MTRR
+	 * is UC or WC. UC- gets the real intention of the user, which is
+	 * "WC if the MTRR is WC, UC if you can't do that."
 	 */
-	if (!pat_enabled && pgprot_val(prot) == __PAGE_KERNEL_WC)
+	if (!pat_enabled && pgprot2cachemode(prot) != _PAGE_CACHE_MODE_WB)
 		prot = __pgprot(__PAGE_KERNEL |
 				cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS));
 
 	return (void __force __iomem *) kmap_atomic_prot_pfn(pfn, prot);
 }
 EXPORT_SYMBOL_GPL(iomap_atomic_prot_pfn);
-#undef __PAGE_KERNEL_WC
 
 void
 iounmap_atomic(void __iomem *kvaddr)
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 952f4b4..ff45c19 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -245,11 +245,8 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
 					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
 }
 EXPORT_SYMBOL(ioremap_wc);
 
@@ -265,11 +262,8 @@ EXPORT_SYMBOL(ioremap_wc);
  */
 void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
 					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
 }
 EXPORT_SYMBOL(ioremap_wt);
 
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6917b39..34f870d 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1553,9 +1553,6 @@ int set_memory_wc(unsigned long addr, int numpages)
 {
 	int ret;
 
-	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
-
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_MODE_WC, NULL);
 	if (ret)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index a0264d3..e0e836e 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -135,28 +135,48 @@ void pat_init(void)
 	bool boot_cpu = !boot_pat_state;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 
-	if (!pat_enabled)
-		return;
-
 	if (!cpu_has_pat) {
 		if (!boot_pat_state) {
 			pat_disable("PAT not supported by CPU.");
-			return;
-		} else {
+		} else if (pat_enabled) {
 			/*
 			 * If this happens we are on a secondary CPU, but
 			 * switched to PAT on the boot CPU. We have no way to
 			 * undo PAT.
 			 */
-			printk(KERN_ERR "PAT enabled, "
+			pr_err("PAT enabled, "
 			       "but not supported by secondary CPU\n");
 			BUG();
 		}
 	}
 
-	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
-	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
-	     ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
+	if (!pat_enabled) {
+		/*
+		 * No PAT. Emulate the PAT table using the two cache bits,
+		 * PWT (Write Through) and PCD (Cache Disable). This is also
+		 * the same as the BIOS default setup when the system has
+		 * PAT but the "nopat" boot option is specified.
+		 *
+		 *  PTE encoding used in Linux:
+		 *       PCD
+		 *       |PWT  PAT
+		 *       ||    slot
+		 *       00    0    WB : _PAGE_CACHE_MODE_WB
+		 *       01    1    WT : _PAGE_CACHE_MODE_WT
+		 *       10    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
+		 *       11    3    UC : _PAGE_CACHE_MODE_UC
+		 *
+		 * NOTE: When WC or WP is used, it is redirected to UC- per
+		 * the default setup in __cachemode2pte_tbl[].
+		 */
+		pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
+		      PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
+		if (!boot_pat_state)
+			boot_pat_state = pat;
+
+	} else if ((c->x86_vendor == X86_VENDOR_INTEL) &&
+		   (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
+		    ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
 		/*
 		 * PAT support with the lower four entries. Intel Pentium 2,
 		 * 3, M, and 4 are affected by PAT errata, which makes the
@@ -203,11 +223,13 @@ void pat_init(void)
 		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
 	}
 
-	/* Boot CPU check */
-	if (!boot_pat_state)
-		rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
+	if (pat_enabled) {
+		/* Boot CPU check */
+		if (!boot_pat_state)
+			rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
 
-	wrmsrl(MSR_IA32_CR_PAT, pat);
+		wrmsrl(MSR_IA32_CR_PAT, pat);
+	}
 
 	if (boot_cpu)
 		pat_init_cache_modes();
@@ -375,17 +397,6 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 
 	BUG_ON(start >= end); /* end is exclusive */
 
-	if (!pat_enabled) {
-		/* This is identical to page table setting without PAT */
-		if (new_type) {
-			if (req_type == _PAGE_CACHE_MODE_WB)
-				*new_type = _PAGE_CACHE_MODE_WB;
-			else
-				*new_type = _PAGE_CACHE_MODE_UC_MINUS;
-		}
-		return 0;
-	}
-
 	/* Low ISA region is always mapped WB in page table. No need to track */
 	if (x86_platform.is_untracked_pat_range(start, end)) {
 		if (new_type)
@@ -450,9 +461,6 @@ int free_memtype(u64 start, u64 end)
 	int is_range_ram;
 	struct memtype *entry;
 
-	if (!pat_enabled)
-		return 0;
-
 	/* Low ISA region is always mapped WB. No need to track */
 	if (x86_platform.is_untracked_pat_range(start, end))
 		return 0;
@@ -591,16 +599,13 @@ static inline int range_is_allowed(unsigned long pfn, unsigned long size)
 	return 1;
 }
 #else
-/* This check is needed to avoid cache aliasing when PAT is enabled */
+/* This check is needed to avoid cache aliasing */
 static inline int range_is_allowed(unsigned long pfn, unsigned long size)
 {
 	u64 from = ((u64)pfn) << PAGE_SHIFT;
 	u64 to = from + size;
 	u64 cursor = from;
 
-	if (!pat_enabled)
-		return 1;
-
 	while (cursor < to) {
 		if (!devmem_is_allowed(pfn)) {
 			printk(KERN_INFO "Program %s tried to access /dev/mem between [mem %#010Lx-%#010Lx]\n",
@@ -704,9 +709,6 @@ static int reserve_pfn_range(u64 paddr, unsigned long size, pgprot_t *vma_prot,
 	 * the type requested matches the type of first page in the range.
 	 */
 	if (is_ram) {
-		if (!pat_enabled)
-			return 0;
-
 		pcm = lookup_memtype(paddr);
 		if (want_pcm != pcm) {
 			printk(KERN_WARNING "%s:%d map pfn RAM range req %s for [mem %#010Lx-%#010Lx], got %s\n",
@@ -819,9 +821,6 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_t *prot,
 		return ret;
 	}
 
-	if (!pat_enabled)
-		return 0;
-
 	/*
 	 * For anything smaller than the vma size we set prot based on the
 	 * lookup.
@@ -847,9 +846,6 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 {
 	enum page_cache_mode pcm;
 
-	if (!pat_enabled)
-		return 0;
-
 	/* Set prot based on lookup */
 	pcm = lookup_memtype((resource_size_t)pfn << PAGE_SHIFT);
 	*prot = __pgprot((pgprot_val(vma->vm_page_prot) & (~_PAGE_CACHE_MASK)) |
@@ -888,21 +884,15 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 
 pgprot_t pgprot_writecombine(pgprot_t prot)
 {
-	if (pat_enabled)
-		return __pgprot(pgprot_val(prot) |
+	return __pgprot(pgprot_val(prot) |
 				cachemode2protval(_PAGE_CACHE_MODE_WC));
-	else
-		return pgprot_noncached(prot);
 }
 EXPORT_SYMBOL_GPL(pgprot_writecombine);
 
 pgprot_t pgprot_writethrough(pgprot_t prot)
 {
-	if (pat_enabled)
-		return __pgprot(pgprot_val(prot) |
+	return __pgprot(pgprot_val(prot) |
 				cachemode2protval(_PAGE_CACHE_MODE_WT));
-	else
-		return pgprot_noncached(prot);
 }
 EXPORT_SYMBOL_GPL(pgprot_writethrough);
 
@@ -981,10 +971,8 @@ static const struct file_operations memtype_fops = {
 
 static int __init pat_memtype_list_init(void)
 {
-	if (pat_enabled) {
-		debugfs_create_file("pat_memtype_list", S_IRUSR,
+	debugfs_create_file("pat_memtype_list", S_IRUSR,
 				    arch_debugfs_dir, NULL, &memtype_fops);
-	}
 	return 0;
 }
 


* Re: [PATCH v3 1/5] x86, mm, pat: Set WT to PA7 slot of PAT MSR
  2014-09-17 19:48 ` [PATCH v3 1/5] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
@ 2014-09-23  5:46   ` Juergen Gross
  0 siblings, 0 replies; 8+ messages in thread
From: Juergen Gross @ 2014-09-23  5:46 UTC (permalink / raw)
  To: Toshi Kani, hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, stefan.bader, luto, hmh, yigal, konrad.wilk

On 09/17/2014 09:48 PM, Toshi Kani wrote:
> This patch sets WT to the PA7 slot in the PAT MSR when the processor
> is not affected by the PAT errata.  The PA7 slot is chosen to further
> minimize the risk of using the PAT bit as the PA3 slot is UC and is
> not currently used.
>
> The following Intel processors are affected by the PAT errata.
>
>     errata               cpuid
>     ----------------------------------------------------
>     Pentium 2, A52       family 0x6, model 0x5
>     Pentium 3, E27       family 0x6, model 0x7, 0x8
>     Pentium 3 Xenon, G26 family 0x6, model 0x7, 0x8, 0xa
>     Pentium M, Y26       family 0x6, model 0x9
>     Pentium M 90nm, X9   family 0x6, model 0xd
>     Pentium 4, N46       family 0xf, model 0x0
>
> Instead of making sharp boundary checks, this patch makes conservative
> checks to exclude all Pentium 2, 3, M and 4 family processors.  For
> such processors, _PAGE_CACHE_MODE_WT is redirected to UC- per the
> default setup in __cachemode2pte_tbl[].
>
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>

Reviewed-by: Juergen Gross <jgross@suse.com>




* Re: [PATCH v3 5/5] x86, mm, pat: Refactor !pat_enabled handling
  2014-09-17 19:48 ` [PATCH v3 5/5] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
@ 2014-09-23  6:25   ` Juergen Gross
  0 siblings, 0 replies; 8+ messages in thread
From: Juergen Gross @ 2014-09-23  6:25 UTC (permalink / raw)
  To: Toshi Kani, hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, stefan.bader, luto, hmh, yigal, konrad.wilk

On 09/17/2014 09:48 PM, Toshi Kani wrote:
> This patch refactors the !pat_enabled handling code and integrates
> this case into the PAT abstraction code. The PAT table is emulated
> by corresponding to the two cache attribute bits, PWT (Write Through)
> and PCD (Cache Disable). The emulated PAT table is also the same as
> the BIOS default setup in case the system has PAT but "nopat" boot
> option is specified.
>
> As a result of this change, cache aliasing is checked for all cases
> including !pat_enabled.
>
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>

Reviewed-by: Juergen Gross <jgross@suse.com>



