linux-mm.kvack.org archive mirror
* [PATCH v10 0/12] Support Write-Through mapping on x86
@ 2015-05-27 15:18 Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 1/12] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
                   ` (11 more replies)
  0 siblings, 12 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch

This patchset adds support for Write-Through (WT) mappings on x86.
The study below shows that WT mappings may be useful for non-volatile
memory.

http://www.hpl.hp.com/techreports/2012/HPL-2012-236.pdf

The patchset consists of the following changes.
 - Patch 1/12 to 6/12 add ioremap_wt()
 - Patch 7/12 adds pgprot_writethrough()
 - Patch 8/12 to 9/12 add set_memory_wt()
 - Patch 10/12 to 11/12 refactor the !pat_enabled paths
 - Patch 12/12 changes the pmem driver to call ioremap_wt()

All new/modified interfaces have been tested.
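
For illustration, a driver uses the new WT interface the same way as the
other ioremap variants (a minimal sketch; phys_addr, size and REG_OFFSET
are hypothetical, and ioremap_wt() falls back to UC- when PAT is disabled):

	void __iomem *regs;

	regs = ioremap_wt(phys_addr, size);
	if (!regs)
		return -ENOMEM;

	writel(0x1, regs + REG_OFFSET);	/* example write-through store */

	iounmap(regs);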

---
v10:
- Removed ioremap_writethrough(). (Thomas Gleixner)
- Clarified and cleaned up multiple comments and functions.
  (Thomas Gleixner) 
- Changed ioremap_change_attr() to accept the WT type.

v9:
- Changed to export the set_xxx_wt() interfaces with GPL.
  (Ingo Molnar)
- Changed is_new_memtype_allowed() to handle WT cases.
- Changed arch-specific io.h to define ioremap_wt().
- Changed the pmem driver to use ioremap_wt().
- Rebased to 4.1-rc3 and resolved minor conflicts.

v8:
- Rebased to 4.0-rc1 and resolved conflicts with 9d34cfdf4 in
  patch 5/7.

v7:
- Rebased to 3.19-rc3 as Juergen's patchset for the PAT management
  has been accepted.

v6:
- Dropped the patch moving [set|get]_page_memtype() to pat.c
  since the tip branch already has this change.
- Fixed an issue when CONFIG_X86_PAT is not defined.

v5:
- Clarified comment of why using slot 7. (Andy Lutomirski,
  Thomas Gleixner)
- Moved [set|get]_page_memtype() to pat.c. (Thomas Gleixner)
- Removed BUG() from set_page_memtype(). (Thomas Gleixner)

v4:
- Added set_memory_wt() by adding WT support of regular memory.

v3:
- Dropped the set_memory_wt() patch. (Andy Lutomirski)
- Refactored the !pat_enabled handling. (H. Peter Anvin,
  Andy Lutomirski)
- Added the picture of PTE encoding. (Konrad Rzeszutek Wilk)

v2:
- Changed WT to use slot 7 of the PAT MSR. (H. Peter Anvin,
  Andy Lutomirski)
- Changed to have conservative checks to exclude all Pentium 2, 3,
  M, and 4 families. (Ingo Molnar, Henrique de Moraes Holschuh,
  Andy Lutomirski)
- Updated documentation to cover WT interfaces and usages.
  (Andy Lutomirski, Yigal Korman)

---
Toshi Kani (12):
 1/12 x86, mm, pat: Set WT to PA7 slot of PAT MSR
 2/12 x86, mm, pat: Change reserve_memtype() for WT
 3/12 x86, asm: Change is_new_memtype_allowed() for WT
 4/12 x86, mm, asm-gen: Add ioremap_wt() for WT
 5/12 arch/*/asm/io.h: Add ioremap_wt() to all architectures
 6/12 video/fbdev, asm/io.h: Remove ioremap_writethrough()
 7/12 x86, mm, pat: Add pgprot_writethrough() for WT
 8/12 x86, mm, asm: Add WT support to set_page_memtype()
 9/12 x86, mm: Add set_memory_wt() for WT
10/12 x86, mm, pat: Cleanup init flags in pat_init()
11/12 x86, mm, pat: Refactor !pat_enable handling
12/12 drivers/block/pmem: Map NVDIMM with ioremap_wt()

---
 Documentation/x86/pat.txt            |  13 +-
 arch/arc/include/asm/io.h            |   1 +
 arch/arm/include/asm/io.h            |   1 +
 arch/arm64/include/asm/io.h          |   1 +
 arch/avr32/include/asm/io.h          |   1 +
 arch/frv/include/asm/io.h            |   4 +-
 arch/m32r/include/asm/io.h           |   1 +
 arch/m68k/include/asm/io_mm.h        |   4 +-
 arch/m68k/include/asm/io_no.h        |   4 +-
 arch/metag/include/asm/io.h          |   3 +
 arch/microblaze/include/asm/io.h     |   2 +-
 arch/mn10300/include/asm/io.h        |   1 +
 arch/nios2/include/asm/io.h          |   1 +
 arch/s390/include/asm/io.h           |   1 +
 arch/sparc/include/asm/io_32.h       |   1 +
 arch/sparc/include/asm/io_64.h       |   1 +
 arch/tile/include/asm/io.h           |   2 +-
 arch/x86/include/asm/cacheflush.h    |   6 +-
 arch/x86/include/asm/io.h            |   2 +
 arch/x86/include/asm/pgtable.h       |   8 +-
 arch/x86/include/asm/pgtable_types.h |   3 +
 arch/x86/mm/init.c                   |   6 +-
 arch/x86/mm/iomap_32.c               |  12 +-
 arch/x86/mm/ioremap.c                |  29 ++++-
 arch/x86/mm/pageattr.c               |  65 +++++++---
 arch/x86/mm/pat.c                    | 232 +++++++++++++++++++++++------------
 arch/xtensa/include/asm/io.h         |   1 +
 drivers/block/pmem.c                 |   4 +-
 drivers/video/fbdev/amifb.c          |   4 +-
 drivers/video/fbdev/atafb.c          |   3 +-
 drivers/video/fbdev/hpfb.c           |   4 +-
 include/asm-generic/io.h             |   9 ++
 include/asm-generic/iomap.h          |   4 +
 include/asm-generic/pgtable.h        |   4 +
 34 files changed, 311 insertions(+), 127 deletions(-)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v10 1/12] x86, mm, pat: Set WT to PA7 slot of PAT MSR
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 2/12] x86, mm, pat: Change reserve_memtype() for WT Toshi Kani
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch sets WT to the PA7 slot in the PAT MSR when the processor
is not affected by the PAT errata.  The PA7 slot is chosen to improve
robustness in the presence of errata that might cause the high PAT bit
to be ignored.  This way a buggy PA7 slot access will hit the PA3 slot,
which is UC, so at worst we lose performance without causing a correctness
issue.

The following Intel processors are affected by the PAT errata.

   errata               cpuid
   ----------------------------------------------------
   Pentium 2, A52       family 0x6, model 0x5
   Pentium 3, E27       family 0x6, model 0x7, 0x8
   Pentium 3 Xeon, G26  family 0x6, model 0x7, 0x8, 0xa
   Pentium M, Y26       family 0x6, model 0x9
   Pentium M 90nm, X9   family 0x6, model 0xd
   Pentium 4, N46       family 0xf, model 0x0

Instead of making sharp boundary checks, this patch makes conservative
checks to exclude all Pentium 2, 3, M and 4 family processors.  For
such processors, _PAGE_CACHE_MODE_WT is redirected to UC- per the
default setup in __cachemode2pte_tbl[].
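
As a worked example (assuming the architectural PAT memory-type encodings
UC=0x00, WC=0x01, WT=0x04, WB=0x06 and UC-=0x07), the two configurations
below program the PAT MSR as follows:

   slot:          7    6    5    4    3    2    1    0
   full PAT:      WT   UC-  WC   WB   UC   UC-  WC   WB   -> 0x0407010600070106
   errata case:   UC   UC-  WC   WB   UC   UC-  WC   WB   -> 0x0007010600070106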

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/mm/pat.c |   71 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 56 insertions(+), 15 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 35af677..1baa60d 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -197,6 +197,7 @@ void pat_init(void)
 {
 	u64 pat;
 	bool boot_cpu = !boot_pat_state;
+	struct cpuinfo_x86 *c = &boot_cpu_data;
 
 	if (!pat_enabled)
 		return;
@@ -217,21 +218,61 @@ void pat_init(void)
 		}
 	}
 
-	/* Set PWT to Write-Combining. All other bits stay the same */
-	/*
-	 * PTE encoding used in Linux:
-	 *      PAT
-	 *      |PCD
-	 *      ||PWT
-	 *      |||
-	 *      000 WB		_PAGE_CACHE_WB
-	 *      001 WC		_PAGE_CACHE_WC
-	 *      010 UC-		_PAGE_CACHE_UC_MINUS
-	 *      011 UC		_PAGE_CACHE_UC
-	 * PAT bit unused
-	 */
-	pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
-	      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
+	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
+	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
+	     ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
+		/*
+		 * PAT support with the lower four entries. Intel Pentium 2,
+		 * 3, M, and 4 are affected by PAT errata, which make the
+		 * upper four entries unusable.  To be safe, we do not use
+		 * the upper four entries on any of the affected families.
+		 *
+		 *  PTE encoding used in Linux:
+		 *      PAT
+		 *      |PCD
+		 *      ||PWT  PAT
+		 *      |||    slot
+		 *      000    0    WB : _PAGE_CACHE_MODE_WB
+		 *      001    1    WC : _PAGE_CACHE_MODE_WC
+		 *      010    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
+		 *      011    3    UC : _PAGE_CACHE_MODE_UC
+		 * PAT bit unused
+		 *
+		 * NOTE: When WT or WP is used, it is redirected to UC- per
+		 * the default setup in __cachemode2pte_tbl[].
+		 */
+		pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
+		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, UC);
+	} else {
+		/*
+		 * PAT full support.  We put WT in slot 7 to improve
+		 * robustness in the presence of errata that might cause
+		 * the high PAT bit to be ignored.  This way a buggy slot 7
+		 * access will hit slot 3, and slot 3 is UC, so at worst
+		 * we lose performance without causing a correctness issue.
+		 * Pentium 4 erratum N46 is an example of such an erratum,
+		 * although we try not to use PAT at all on affected CPUs.
+		 *
+		 *  PTE encoding used in Linux:
+		 *      PAT
+		 *      |PCD
+		 *      ||PWT  PAT
+		 *      |||    slot
+		 *      000    0    WB : _PAGE_CACHE_MODE_WB
+		 *      001    1    WC : _PAGE_CACHE_MODE_WC
+		 *      010    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
+		 *      011    3    UC : _PAGE_CACHE_MODE_UC
+		 *      100    4    WB : Reserved
+		 *      101    5    WC : Reserved
+		 *      110    6    UC-: Reserved
+		 *      111    7    WT : _PAGE_CACHE_MODE_WT
+		 *
+		 * The reserved slots are unused, but mapped to their
+		 * corresponding types in the presence of PAT errata.
+		 */
+		pat = PAT(0, WB) | PAT(1, WC) | PAT(2, UC_MINUS) | PAT(3, UC) |
+		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
+	}
 
 	/* Boot CPU check */
 	if (!boot_pat_state) {

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 2/12] x86, mm, pat: Change reserve_memtype() for WT
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 1/12] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 3/12] x86, asm: Change is_new_memtype_allowed() " Toshi Kani
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch changes reserve_memtype() to support the WT cache mode
with PAT.  When PAT is not enabled, WB and UC- are the only types
supported.
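
In the !pat_enabled case, reserve_memtype() now reports the effective
type as follows (per the hunk below):

   req_type                  returned new_type
   --------------------------------------------
   WB                        WB
   WC, WT, WP, UC, UC-       UC-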

When a target range is in RAM, reserve_ram_pages_type() verifies
the requested type.  reserve_ram_pages_type() is changed to fail
WT and WP requests with -EINVAL since set_page_memtype() can only
track three types: WB, WC and UC-.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
Patch 8/12 enhances set_page_memtype() to support WT.
---
 arch/x86/mm/pat.c |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 1baa60d..d932b43 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -365,6 +365,8 @@ static int pat_pagerange_is_ram(resource_size_t start, resource_size_t end)
 
 /*
  * For RAM pages, we use page flags to mark the pages with appropriate type.
+ * The page flags are limited to three types, WB, WC and UC-.
+ * WT and WP requests fail with -EINVAL, and UC gets redirected to UC-.
  * Here we do two pass:
  * - Find the memtype of all the pages in the range, look for any conflicts
  * - In case of no conflicts, set the new memtype for pages in the range
@@ -376,6 +378,13 @@ static int reserve_ram_pages_type(u64 start, u64 end,
 	struct page *page;
 	u64 pfn;
 
+	if ((req_type == _PAGE_CACHE_MODE_WT) ||
+	    (req_type == _PAGE_CACHE_MODE_WP)) {
+		if (new_type)
+			*new_type = _PAGE_CACHE_MODE_UC_MINUS;
+		return -EINVAL;
+	}
+
 	if (req_type == _PAGE_CACHE_MODE_UC) {
 		/* We do not support strong UC */
 		WARN_ON_ONCE(1);
@@ -425,6 +434,7 @@ static int free_ram_pages_type(u64 start, u64 end)
  * - _PAGE_CACHE_MODE_WC
  * - _PAGE_CACHE_MODE_UC_MINUS
  * - _PAGE_CACHE_MODE_UC
+ * - _PAGE_CACHE_MODE_WT
  *
  * If new_type is NULL, function will return an error if it cannot reserve the
  * region with req_type. If new_type is non-NULL, function will return
@@ -442,12 +452,12 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 	BUG_ON(start >= end); /* end is exclusive */
 
 	if (!pat_enabled) {
-		/* This is identical to page table setting without PAT */
+		/* WB and UC- are the only types supported without PAT */
 		if (new_type) {
-			if (req_type == _PAGE_CACHE_MODE_WC)
-				*new_type = _PAGE_CACHE_MODE_UC_MINUS;
+			if (req_type == _PAGE_CACHE_MODE_WB)
+				*new_type = _PAGE_CACHE_MODE_WB;
 			else
-				*new_type = req_type;
+				*new_type = _PAGE_CACHE_MODE_UC_MINUS;
 		}
 		return 0;
 	}

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 3/12] x86, asm: Change is_new_memtype_allowed() for WT
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 1/12] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 2/12] x86, mm, pat: Change reserve_memtype() for WT Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 4/12] x86, mm, asm-gen: Add ioremap_wt() " Toshi Kani
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

__ioremap_caller() calls reserve_memtype() to obtain new_pcm (the
existing map type, if any), and then calls is_new_memtype_allowed()
to verify whether converting to new_pcm is allowed when pcm (the
requested type) differs from new_pcm.

When WT is requested, the caller expects that writes are ordered and
uncached.  Silently converting such a request to WB or WC would break
that expectation, so this patch changes is_new_memtype_allowed() to
disallow the following cases.

 - If the request is WT, mapping type cannot be WB
 - If the request is WT, mapping type cannot be WC

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable.h |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index fe57e7a..2562e30 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -398,11 +398,17 @@ static inline int is_new_memtype_allowed(u64 paddr, unsigned long size,
 	 * requested memtype:
 	 * - request is uncached, return cannot be write-back
 	 * - request is write-combine, return cannot be write-back
+	 * - request is write-through, return cannot be write-back
+	 * - request is write-through, return cannot be write-combine
 	 */
 	if ((pcm == _PAGE_CACHE_MODE_UC_MINUS &&
 	     new_pcm == _PAGE_CACHE_MODE_WB) ||
 	    (pcm == _PAGE_CACHE_MODE_WC &&
-	     new_pcm == _PAGE_CACHE_MODE_WB)) {
+	     new_pcm == _PAGE_CACHE_MODE_WB) ||
+	    (pcm == _PAGE_CACHE_MODE_WT &&
+	     new_pcm == _PAGE_CACHE_MODE_WB) ||
+	    (pcm == _PAGE_CACHE_MODE_WT &&
+	     new_pcm == _PAGE_CACHE_MODE_WC)) {
 		return 0;
 	}
 

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 4/12] x86, mm, asm-gen: Add ioremap_wt() for WT
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (2 preceding siblings ...)
  2015-05-27 15:18 ` [PATCH v10 3/12] x86, asm: Change is_new_memtype_allowed() " Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 5/12] arch/*/asm/io.h: Add ioremap_wt() to all architectures Toshi Kani
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch adds ioremap_wt() for creating WT mappings on x86.
It follows the same model as ioremap_wc() for multi-architecture
support.  ARCH_HAS_IOREMAP_WT is defined in the x86 version of
io.h to indicate that ioremap_wt() is implemented on x86.

Also update the PAT documentation file to cover ioremap_wt().

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 Documentation/x86/pat.txt   |    4 +++-
 arch/x86/include/asm/io.h   |    2 ++
 arch/x86/mm/ioremap.c       |   24 ++++++++++++++++++++++++
 include/asm-generic/io.h    |    9 +++++++++
 include/asm-generic/iomap.h |    4 ++++
 5 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
index cf08c9f..be7b8c2 100644
--- a/Documentation/x86/pat.txt
+++ b/Documentation/x86/pat.txt
@@ -12,7 +12,7 @@ virtual addresses.
 
 PAT allows for different types of memory attributes. The most commonly used
 ones that will be supported at this time are Write-back, Uncached,
-Write-combined and Uncached Minus.
+Write-combined, Write-through and Uncached Minus.
 
 
 PAT APIs
@@ -38,6 +38,8 @@ ioremap_nocache        |    --    |    UC-     |       UC-        |
                        |          |            |                  |
 ioremap_wc             |    --    |    --      |       WC         |
                        |          |            |                  |
+ioremap_wt             |    --    |    --      |       WT         |
+                       |          |            |                  |
 set_memory_uc          |    UC-   |    --      |       --         |
  set_memory_wb         |          |            |                  |
                        |          |            |                  |
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 34a5b93..81942ef 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -35,6 +35,7 @@
   */
 
 #define ARCH_HAS_IOREMAP_WC
+#define ARCH_HAS_IOREMAP_WT
 
 #include <linux/string.h>
 #include <linux/compiler.h>
@@ -320,6 +321,7 @@ extern void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr);
 extern int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 				enum page_cache_mode pcm);
 extern void __iomem *ioremap_wc(resource_size_t offset, unsigned long size);
+extern void __iomem *ioremap_wt(resource_size_t offset, unsigned long size);
 
 extern bool is_early_ioremap_ptep(pte_t *ptep);
 
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 70e7444..ae8c284 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -172,6 +172,10 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 		prot = __pgprot(pgprot_val(prot) |
 				cachemode2protval(_PAGE_CACHE_MODE_WC));
 		break;
+	case _PAGE_CACHE_MODE_WT:
+		prot = __pgprot(pgprot_val(prot) |
+				cachemode2protval(_PAGE_CACHE_MODE_WT));
+		break;
 	case _PAGE_CACHE_MODE_WB:
 		break;
 	}
@@ -266,6 +270,26 @@ void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 }
 EXPORT_SYMBOL(ioremap_wc);
 
+/**
+ * ioremap_wt	-	map memory into CPU space write through
+ * @phys_addr:	bus address of the memory
+ * @size:	size of the resource to map
+ *
+ * This version of ioremap ensures that the memory is marked write through.
+ * Write through stores data into memory while keeping the cache up-to-date.
+ *
+ * Must be freed with iounmap.
+ */
+void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
+{
+	if (pat_enabled)
+		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
+					__builtin_return_address(0));
+	else
+		return ioremap_nocache(phys_addr, size);
+}
+EXPORT_SYMBOL(ioremap_wt);
+
 void __iomem *ioremap_cache(resource_size_t phys_addr, unsigned long size)
 {
 	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WB,
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 9db0423..bae62dc 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -777,8 +777,17 @@ static inline void __iomem *ioremap_wc(phys_addr_t offset, size_t size)
 }
 #endif
 
+#ifndef ioremap_wt
+#define ioremap_wt ioremap_wt
+static inline void __iomem *ioremap_wt(phys_addr_t offset, size_t size)
+{
+	return ioremap_nocache(offset, size);
+}
+#endif
+
 #ifndef iounmap
 #define iounmap iounmap
+
 static inline void iounmap(void __iomem *addr)
 {
 }
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 1b41011..d8f8622 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -66,6 +66,10 @@ extern void ioport_unmap(void __iomem *);
 #define ioremap_wc ioremap_nocache
 #endif
 
+#ifndef ARCH_HAS_IOREMAP_WT
+#define ioremap_wt ioremap_nocache
+#endif
+
 #ifdef CONFIG_PCI
 /* Destroy a virtual mapping cookie for a PCI BAR (memory or IO) */
 struct pci_dev;

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 5/12] arch/*/asm/io.h: Add ioremap_wt() to all architectures
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (3 preceding siblings ...)
  2015-05-27 15:18 ` [PATCH v10 4/12] x86, mm, asm-gen: Add ioremap_wt() " Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 6/12] video/fbdev, asm/io.h: Remove ioremap_writethrough() Toshi Kani
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch adds ioremap_wt() to the arch-specific asm/io.h headers
that define ioremap_wc() locally.  These headers do not include
<asm-generic/iomap.h>.  Some of them include <asm-generic/io.h>, but
ioremap_wt() is still defined locally for consistency since they
define all of the ioremap_xxx variants locally.

On architectures without WT support, ioremap_wt() is defined
identically to ioremap_nocache().

frv and m68k already have ioremap_writethrough().  This patch
implements ioremap_wt() identically to ioremap_writethrough() and
defines ARCH_HAS_IOREMAP_WT on both architectures.

This patch allows generic drivers to use ioremap_wt().

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/arc/include/asm/io.h        |    1 +
 arch/arm/include/asm/io.h        |    1 +
 arch/arm64/include/asm/io.h      |    1 +
 arch/avr32/include/asm/io.h      |    1 +
 arch/frv/include/asm/io.h        |    7 +++++++
 arch/m32r/include/asm/io.h       |    1 +
 arch/m68k/include/asm/io_mm.h    |    7 +++++++
 arch/m68k/include/asm/io_no.h    |    6 ++++++
 arch/metag/include/asm/io.h      |    3 +++
 arch/microblaze/include/asm/io.h |    1 +
 arch/mn10300/include/asm/io.h    |    1 +
 arch/nios2/include/asm/io.h      |    1 +
 arch/s390/include/asm/io.h       |    1 +
 arch/sparc/include/asm/io_32.h   |    1 +
 arch/sparc/include/asm/io_64.h   |    1 +
 arch/tile/include/asm/io.h       |    1 +
 arch/xtensa/include/asm/io.h     |    1 +
 17 files changed, 36 insertions(+)

diff --git a/arch/arc/include/asm/io.h b/arch/arc/include/asm/io.h
index cabd518..7cc4ced 100644
--- a/arch/arc/include/asm/io.h
+++ b/arch/arc/include/asm/io.h
@@ -20,6 +20,7 @@ extern void iounmap(const void __iomem *addr);
 
 #define ioremap_nocache(phy, sz)	ioremap(phy, sz)
 #define ioremap_wc(phy, sz)		ioremap(phy, sz)
+#define ioremap_wt(phy, sz)		ioremap(phy, sz)
 
 /* Change struct page to physical address */
 #define page_to_phys(page)		(page_to_pfn(page) << PAGE_SHIFT)
diff --git a/arch/arm/include/asm/io.h b/arch/arm/include/asm/io.h
index db58deb..1b7677d 100644
--- a/arch/arm/include/asm/io.h
+++ b/arch/arm/include/asm/io.h
@@ -336,6 +336,7 @@ extern void _memset_io(volatile void __iomem *, int, size_t);
 #define ioremap_nocache(cookie,size)	__arm_ioremap((cookie), (size), MT_DEVICE)
 #define ioremap_cache(cookie,size)	__arm_ioremap((cookie), (size), MT_DEVICE_CACHED)
 #define ioremap_wc(cookie,size)		__arm_ioremap((cookie), (size), MT_DEVICE_WC)
+#define ioremap_wt(cookie,size)		__arm_ioremap((cookie), (size), MT_DEVICE)
 #define iounmap				__arm_iounmap
 
 /*
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 540f7c0..7116d39 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -170,6 +170,7 @@ extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
 #define ioremap(addr, size)		__ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define ioremap_nocache(addr, size)	__ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define ioremap_wc(addr, size)		__ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
+#define ioremap_wt(addr, size)		__ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
 #define iounmap				__iounmap
 
 /*
diff --git a/arch/avr32/include/asm/io.h b/arch/avr32/include/asm/io.h
index 4f5ec2b..e998ff5 100644
--- a/arch/avr32/include/asm/io.h
+++ b/arch/avr32/include/asm/io.h
@@ -296,6 +296,7 @@ extern void __iounmap(void __iomem *addr);
 	__iounmap(addr)
 
 #define ioremap_wc ioremap_nocache
+#define ioremap_wt ioremap_nocache
 
 #define cached(addr) P1SEGADDR(addr)
 #define uncached(addr) P2SEGADDR(addr)
diff --git a/arch/frv/include/asm/io.h b/arch/frv/include/asm/io.h
index 0b78bc8..1fe98fe 100644
--- a/arch/frv/include/asm/io.h
+++ b/arch/frv/include/asm/io.h
@@ -17,6 +17,8 @@
 
 #ifdef __KERNEL__
 
+#define ARCH_HAS_IOREMAP_WT
+
 #include <linux/types.h>
 #include <asm/virtconvert.h>
 #include <asm/string.h>
@@ -270,6 +272,11 @@ static inline void __iomem *ioremap_writethrough(unsigned long physaddr, unsigne
 	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
 }
 
+static inline void __iomem *ioremap_wt(unsigned long physaddr, unsigned long size)
+{
+	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
+}
+
 static inline void __iomem *ioremap_fullcache(unsigned long physaddr, unsigned long size)
 {
 	return __ioremap(physaddr, size, IOMAP_FULL_CACHING);
diff --git a/arch/m32r/include/asm/io.h b/arch/m32r/include/asm/io.h
index 9cc00db..0c3f25e 100644
--- a/arch/m32r/include/asm/io.h
+++ b/arch/m32r/include/asm/io.h
@@ -68,6 +68,7 @@ static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
 extern void iounmap(volatile void __iomem *addr);
 #define ioremap_nocache(off,size) ioremap(off,size)
 #define ioremap_wc ioremap_nocache
+#define ioremap_wt ioremap_nocache
 
 /*
  * IO bus memory addresses are also 1:1 with the physical address
diff --git a/arch/m68k/include/asm/io_mm.h b/arch/m68k/include/asm/io_mm.h
index 8955b40..7c12138 100644
--- a/arch/m68k/include/asm/io_mm.h
+++ b/arch/m68k/include/asm/io_mm.h
@@ -20,6 +20,8 @@
 
 #ifdef __KERNEL__
 
+#define ARCH_HAS_IOREMAP_WT
+
 #include <linux/compiler.h>
 #include <asm/raw_io.h>
 #include <asm/virtconvert.h>
@@ -470,6 +472,11 @@ static inline void __iomem *ioremap_writethrough(unsigned long physaddr,
 {
 	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
 }
+static inline void __iomem *ioremap_wt(unsigned long physaddr,
+					 unsigned long size)
+{
+	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
+}
 static inline void __iomem *ioremap_fullcache(unsigned long physaddr,
 				      unsigned long size)
 {
diff --git a/arch/m68k/include/asm/io_no.h b/arch/m68k/include/asm/io_no.h
index a93c8cd..5fff9a2 100644
--- a/arch/m68k/include/asm/io_no.h
+++ b/arch/m68k/include/asm/io_no.h
@@ -3,6 +3,8 @@
 
 #ifdef __KERNEL__
 
+#define ARCH_HAS_IOREMAP_WT
+
 #include <asm/virtconvert.h>
 #include <asm-generic/iomap.h>
 
@@ -157,6 +159,10 @@ static inline void *ioremap_writethrough(unsigned long physaddr, unsigned long s
 {
 	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
 }
+static inline void *ioremap_wt(unsigned long physaddr, unsigned long size)
+{
+	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
+}
 static inline void *ioremap_fullcache(unsigned long physaddr, unsigned long size)
 {
 	return __ioremap(physaddr, size, IOMAP_FULL_CACHING);
diff --git a/arch/metag/include/asm/io.h b/arch/metag/include/asm/io.h
index d5779b0..9890f21 100644
--- a/arch/metag/include/asm/io.h
+++ b/arch/metag/include/asm/io.h
@@ -160,6 +160,9 @@ extern void __iounmap(void __iomem *addr);
 #define ioremap_wc(offset, size)                \
 	__ioremap((offset), (size), _PAGE_WR_COMBINE)
 
+#define ioremap_wt(offset, size)                \
+	__ioremap((offset), (size), 0)
+
 #define iounmap(addr)                           \
 	__iounmap(addr)
 
diff --git a/arch/microblaze/include/asm/io.h b/arch/microblaze/include/asm/io.h
index 940f5fc..ec3da11 100644
--- a/arch/microblaze/include/asm/io.h
+++ b/arch/microblaze/include/asm/io.h
@@ -43,6 +43,7 @@ extern void __iomem *ioremap(phys_addr_t address, unsigned long size);
 #define ioremap_nocache(addr, size)		ioremap((addr), (size))
 #define ioremap_fullcache(addr, size)		ioremap((addr), (size))
 #define ioremap_wc(addr, size)			ioremap((addr), (size))
+#define ioremap_wt(addr, size)			ioremap((addr), (size))
 
 #endif /* CONFIG_MMU */
 
diff --git a/arch/mn10300/include/asm/io.h b/arch/mn10300/include/asm/io.h
index cc4a2ba..07c5b4a 100644
--- a/arch/mn10300/include/asm/io.h
+++ b/arch/mn10300/include/asm/io.h
@@ -282,6 +282,7 @@ static inline void __iomem *ioremap_nocache(unsigned long offset, unsigned long
 }
 
 #define ioremap_wc ioremap_nocache
+#define ioremap_wt ioremap_nocache
 
 static inline void iounmap(void __iomem *addr)
 {
diff --git a/arch/nios2/include/asm/io.h b/arch/nios2/include/asm/io.h
index 6e24d7c..c5a62da 100644
--- a/arch/nios2/include/asm/io.h
+++ b/arch/nios2/include/asm/io.h
@@ -46,6 +46,7 @@ static inline void iounmap(void __iomem *addr)
 }
 
 #define ioremap_wc ioremap_nocache
+#define ioremap_wt ioremap_nocache
 
 /* Pages to physical address... */
 #define page_to_phys(page)	virt_to_phys(page_to_virt(page))
diff --git a/arch/s390/include/asm/io.h b/arch/s390/include/asm/io.h
index 30fd5c8..cb5fdf3 100644
--- a/arch/s390/include/asm/io.h
+++ b/arch/s390/include/asm/io.h
@@ -29,6 +29,7 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr);
 
 #define ioremap_nocache(addr, size)	ioremap(addr, size)
 #define ioremap_wc			ioremap_nocache
+#define ioremap_wt			ioremap_nocache
 
 static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
 {
diff --git a/arch/sparc/include/asm/io_32.h b/arch/sparc/include/asm/io_32.h
index 407ac14..57f26c3 100644
--- a/arch/sparc/include/asm/io_32.h
+++ b/arch/sparc/include/asm/io_32.h
@@ -129,6 +129,7 @@ static inline void sbus_memcpy_toio(volatile void __iomem *dst,
 void __iomem *ioremap(unsigned long offset, unsigned long size);
 #define ioremap_nocache(X,Y)	ioremap((X),(Y))
 #define ioremap_wc(X,Y)		ioremap((X),(Y))
+#define ioremap_wt(X,Y)		ioremap((X),(Y))
 void iounmap(volatile void __iomem *addr);
 
 /* Create a virtual mapping cookie for an IO port range */
diff --git a/arch/sparc/include/asm/io_64.h b/arch/sparc/include/asm/io_64.h
index 50d4840..c32fa3f 100644
--- a/arch/sparc/include/asm/io_64.h
+++ b/arch/sparc/include/asm/io_64.h
@@ -402,6 +402,7 @@ static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
 
 #define ioremap_nocache(X,Y)		ioremap((X),(Y))
 #define ioremap_wc(X,Y)			ioremap((X),(Y))
+#define ioremap_wt(X,Y)			ioremap((X),(Y))
 
 static inline void iounmap(volatile void __iomem *addr)
 {
diff --git a/arch/tile/include/asm/io.h b/arch/tile/include/asm/io.h
index 6ef4eca..9c3d950 100644
--- a/arch/tile/include/asm/io.h
+++ b/arch/tile/include/asm/io.h
@@ -54,6 +54,7 @@ extern void iounmap(volatile void __iomem *addr);
 
 #define ioremap_nocache(physaddr, size)		ioremap(physaddr, size)
 #define ioremap_wc(physaddr, size)		ioremap(physaddr, size)
+#define ioremap_wt(physaddr, size)		ioremap(physaddr, size)
 #define ioremap_writethrough(physaddr, size)	ioremap(physaddr, size)
 #define ioremap_fullcache(physaddr, size)	ioremap(physaddr, size)
 
diff --git a/arch/xtensa/include/asm/io.h b/arch/xtensa/include/asm/io.h
index fe1600a..c39bb6e 100644
--- a/arch/xtensa/include/asm/io.h
+++ b/arch/xtensa/include/asm/io.h
@@ -59,6 +59,7 @@ static inline void __iomem *ioremap_cache(unsigned long offset,
 }
 
 #define ioremap_wc ioremap_nocache
+#define ioremap_wt ioremap_nocache
 
 static inline void __iomem *ioremap(unsigned long offset, unsigned long size)
 {

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 6/12] video/fbdev, asm/io.h: Remove ioremap_writethrough()
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (4 preceding siblings ...)
  2015-05-27 15:18 ` [PATCH v10 5/12] arch/*/asm/io.h: Add ioremap_wt() to all architectures Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:18 ` [PATCH v10 7/12] x86, mm, pat: Add pgprot_writethrough() for WT Toshi Kani
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch converts the remaining callers of ioremap_writethrough()
to ioremap_wt() in three drivers under drivers/video/fbdev.  It then
removes ioremap_writethrough() from the architectures that define it
in their asm/io.h: frv, m68k, microblaze, and tile.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/frv/include/asm/io.h        |    5 -----
 arch/m68k/include/asm/io_mm.h    |    5 -----
 arch/m68k/include/asm/io_no.h    |    4 ----
 arch/microblaze/include/asm/io.h |    1 -
 arch/tile/include/asm/io.h       |    1 -
 drivers/video/fbdev/amifb.c      |    4 ++--
 drivers/video/fbdev/atafb.c      |    3 +--
 drivers/video/fbdev/hpfb.c       |    4 ++--
 8 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/arch/frv/include/asm/io.h b/arch/frv/include/asm/io.h
index 1fe98fe..a31b63e 100644
--- a/arch/frv/include/asm/io.h
+++ b/arch/frv/include/asm/io.h
@@ -267,11 +267,6 @@ static inline void __iomem *ioremap_nocache(unsigned long physaddr, unsigned lon
 	return __ioremap(physaddr, size, IOMAP_NOCACHE_SER);
 }
 
-static inline void __iomem *ioremap_writethrough(unsigned long physaddr, unsigned long size)
-{
-	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
-}
-
 static inline void __iomem *ioremap_wt(unsigned long physaddr, unsigned long size)
 {
 	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
diff --git a/arch/m68k/include/asm/io_mm.h b/arch/m68k/include/asm/io_mm.h
index 7c12138..618c85d3 100644
--- a/arch/m68k/include/asm/io_mm.h
+++ b/arch/m68k/include/asm/io_mm.h
@@ -467,11 +467,6 @@ static inline void __iomem *ioremap_nocache(unsigned long physaddr, unsigned lon
 {
 	return __ioremap(physaddr, size, IOMAP_NOCACHE_SER);
 }
-static inline void __iomem *ioremap_writethrough(unsigned long physaddr,
-					 unsigned long size)
-{
-	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
-}
 static inline void __iomem *ioremap_wt(unsigned long physaddr,
 					 unsigned long size)
 {
diff --git a/arch/m68k/include/asm/io_no.h b/arch/m68k/include/asm/io_no.h
index 5fff9a2..ad7bd40 100644
--- a/arch/m68k/include/asm/io_no.h
+++ b/arch/m68k/include/asm/io_no.h
@@ -155,10 +155,6 @@ static inline void *ioremap_nocache(unsigned long physaddr, unsigned long size)
 {
 	return __ioremap(physaddr, size, IOMAP_NOCACHE_SER);
 }
-static inline void *ioremap_writethrough(unsigned long physaddr, unsigned long size)
-{
-	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
-}
 static inline void *ioremap_wt(unsigned long physaddr, unsigned long size)
 {
 	return __ioremap(physaddr, size, IOMAP_WRITETHROUGH);
diff --git a/arch/microblaze/include/asm/io.h b/arch/microblaze/include/asm/io.h
index ec3da11..39b6315 100644
--- a/arch/microblaze/include/asm/io.h
+++ b/arch/microblaze/include/asm/io.h
@@ -39,7 +39,6 @@ extern resource_size_t isa_mem_base;
 extern void iounmap(void __iomem *addr);
 
 extern void __iomem *ioremap(phys_addr_t address, unsigned long size);
-#define ioremap_writethrough(addr, size)	ioremap((addr), (size))
 #define ioremap_nocache(addr, size)		ioremap((addr), (size))
 #define ioremap_fullcache(addr, size)		ioremap((addr), (size))
 #define ioremap_wc(addr, size)			ioremap((addr), (size))
diff --git a/arch/tile/include/asm/io.h b/arch/tile/include/asm/io.h
index 9c3d950..dc61de1 100644
--- a/arch/tile/include/asm/io.h
+++ b/arch/tile/include/asm/io.h
@@ -55,7 +55,6 @@ extern void iounmap(volatile void __iomem *addr);
 #define ioremap_nocache(physaddr, size)		ioremap(physaddr, size)
 #define ioremap_wc(physaddr, size)		ioremap(physaddr, size)
 #define ioremap_wt(physaddr, size)		ioremap(physaddr, size)
-#define ioremap_writethrough(physaddr, size)	ioremap(physaddr, size)
 #define ioremap_fullcache(physaddr, size)	ioremap(physaddr, size)
 
 #define mmiowb()
diff --git a/drivers/video/fbdev/amifb.c b/drivers/video/fbdev/amifb.c
index 35f7900..ee3a703 100644
--- a/drivers/video/fbdev/amifb.c
+++ b/drivers/video/fbdev/amifb.c
@@ -3705,8 +3705,8 @@ default_chipset:
 	 * access the videomem with writethrough cache
 	 */
 	info->fix.smem_start = (u_long)ZTWO_PADDR(videomemory);
-	videomemory = (u_long)ioremap_writethrough(info->fix.smem_start,
-						   info->fix.smem_len);
+	videomemory = (u_long)ioremap_wt(info->fix.smem_start,
+					 info->fix.smem_len);
 	if (!videomemory) {
 		dev_warn(&pdev->dev,
 			 "Unable to map videomem cached writethrough\n");
diff --git a/drivers/video/fbdev/atafb.c b/drivers/video/fbdev/atafb.c
index cb9ee25..d6ce613 100644
--- a/drivers/video/fbdev/atafb.c
+++ b/drivers/video/fbdev/atafb.c
@@ -3185,8 +3185,7 @@ int __init atafb_init(void)
 		/* Map the video memory (physical address given) to somewhere
 		 * in the kernel address space.
 		 */
-		external_screen_base = ioremap_writethrough(external_addr,
-						     external_len);
+		external_screen_base = ioremap_wt(external_addr, external_len);
 		if (external_vgaiobase)
 			external_vgaiobase =
 			  (unsigned long)ioremap(external_vgaiobase, 0x10000);
diff --git a/drivers/video/fbdev/hpfb.c b/drivers/video/fbdev/hpfb.c
index a1b7e5f..9476d19 100644
--- a/drivers/video/fbdev/hpfb.c
+++ b/drivers/video/fbdev/hpfb.c
@@ -241,8 +241,8 @@ static int hpfb_init_one(unsigned long phys_base, unsigned long virt_base)
 	fb_info.fix.line_length = fb_width;
 	fb_height = (in_8(fb_regs + HPFB_FBHMSB) << 8) | in_8(fb_regs + HPFB_FBHLSB);
 	fb_info.fix.smem_len = fb_width * fb_height;
-	fb_start = (unsigned long)ioremap_writethrough(fb_info.fix.smem_start,
-						       fb_info.fix.smem_len);
+	fb_start = (unsigned long)ioremap_wt(fb_info.fix.smem_start,
+					     fb_info.fix.smem_len);
 	hpfb_defined.xres = (in_8(fb_regs + HPFB_DWMSB) << 8) | in_8(fb_regs + HPFB_DWLSB);
 	hpfb_defined.yres = (in_8(fb_regs + HPFB_DHMSB) << 8) | in_8(fb_regs + HPFB_DHLSB);
 	hpfb_defined.xres_virtual = hpfb_defined.xres;

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 7/12] x86, mm, pat: Add pgprot_writethrough() for WT
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (5 preceding siblings ...)
  2015-05-27 15:18 ` [PATCH v10 6/12] video/fbdev, asm/io.h: Remove ioremap_writethrough() Toshi Kani
@ 2015-05-27 15:18 ` Toshi Kani
  2015-05-27 15:19 ` [PATCH v10 8/12] x86, mm, asm: Add WT support to set_page_memtype() Toshi Kani
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:18 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch adds pgprot_writethrough() for setting the WT attribute
in a given pgprot_t.
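
For illustration, a hypothetical driver mmap handler could apply it as
follows (a minimal sketch; the foo_* names are made up, and it simply
mirrors the common pgprot_writecombine() usage pattern):

	static int foo_mmap(struct file *file, struct vm_area_struct *vma)
	{
		unsigned long pfn = foo_phys_base >> PAGE_SHIFT;

		vma->vm_page_prot = pgprot_writethrough(vma->vm_page_prot);
		return remap_pfn_range(vma, vma->vm_start, pfn,
				       vma->vm_end - vma->vm_start,
				       vma->vm_page_prot);
	}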

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable_types.h |    3 +++
 arch/x86/mm/pat.c                    |   10 ++++++++++
 include/asm-generic/pgtable.h        |    4 ++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 78f0c8c..13f310b 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -367,6 +367,9 @@ extern int nx_enabled;
 #define pgprot_writecombine	pgprot_writecombine
 extern pgprot_t pgprot_writecombine(pgprot_t prot);
 
+#define pgprot_writethrough	pgprot_writethrough
+extern pgprot_t pgprot_writethrough(pgprot_t prot);
+
 /* Indicate that x86 has its own track and untrack pfn vma functions */
 #define __HAVE_PFNMAP_TRACKING
 
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index d932b43..aee5cdf 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -972,6 +972,16 @@ pgprot_t pgprot_writecombine(pgprot_t prot)
 }
 EXPORT_SYMBOL_GPL(pgprot_writecombine);
 
+pgprot_t pgprot_writethrough(pgprot_t prot)
+{
+	if (pat_enabled)
+		return __pgprot(pgprot_val(prot) |
+				cachemode2protval(_PAGE_CACHE_MODE_WT));
+	else
+		return pgprot_noncached(prot);
+}
+EXPORT_SYMBOL_GPL(pgprot_writethrough);
+
 #if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_PAT)
 
 static struct memtype *memtype_get_idx(loff_t pos)
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 39f1d6a..bd910ce 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -262,6 +262,10 @@ static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
 #define pgprot_writecombine pgprot_noncached
 #endif
 
+#ifndef pgprot_writethrough
+#define pgprot_writethrough pgprot_noncached
+#endif
+
 #ifndef pgprot_device
 #define pgprot_device pgprot_noncached
 #endif

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 8/12] x86, mm, asm: Add WT support to set_page_memtype()
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (6 preceding siblings ...)
  2015-05-27 15:18 ` [PATCH v10 7/12] x86, mm, pat: Add pgprot_writethrough() for WT Toshi Kani
@ 2015-05-27 15:19 ` Toshi Kani
  2015-05-27 15:19 ` [PATCH v10 9/12] x86, mm: Add set_memory_wt() for WT Toshi Kani
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:19 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

Since set_memory_wb() calls free_ram_pages_type(), which then calls
set_page_memtype() with -1, _PGMT_DEFAULT is effectively what tracks
the WB type, while _PGMT_WB is defined but unused.  Hence, this patch
renames _PGMT_DEFAULT to _PGMT_WB to clarify its usage and frees up
the encoding previously used by _PGMT_WB.  free_ram_pages_type() is
changed to call set_page_memtype() with _PAGE_CACHE_MODE_WB, and
get_page_memtype() returns _PAGE_CACHE_MODE_WB for _PGMT_WB.

This patch then assigns _PGMT_WT to the freed encoding, which enables
set_page_memtype() to track the WT type.
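
For reference, the resulting page-flag encoding is (taken from the
hunk below):

   PG_uncached  PG_arch_1   memtype
   ------------------------------------------------
        0           0       _PGMT_WB (default)
        0           1       _PGMT_WC
        1           0       _PGMT_UC_MINUS
        1           1       _PGMT_WT (was _PGMT_WB)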

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/mm/pat.c |   60 +++++++++++++++++++++++++++--------------------------
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index aee5cdf..92fc635 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -69,18 +69,22 @@ static u64 __read_mostly boot_pat_state;
 
 #ifdef CONFIG_X86_PAT
 /*
- * X86 PAT uses page flags WC and Uncached together to keep track of
- * memory type of pages that have backing page struct. X86 PAT supports 3
- * different memory types, _PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WC and
- * _PAGE_CACHE_MODE_UC_MINUS and fourth state where page's memory type has not
- * been changed from its default (value of -1 used to denote this).
- * Note we do not support _PAGE_CACHE_MODE_UC here.
+ * X86 PAT uses page flags arch_1 and uncached together to keep track of
+ * memory type of pages that have backing page struct.
+ *
+ * X86 PAT supports 4 different memory types:
+ *  - _PAGE_CACHE_MODE_WB
+ *  - _PAGE_CACHE_MODE_WC
+ *  - _PAGE_CACHE_MODE_UC_MINUS
+ *  - _PAGE_CACHE_MODE_WT
+ *
+ * _PAGE_CACHE_MODE_WB is the default type.
  */
 
-#define _PGMT_DEFAULT		0
+#define _PGMT_WB		0
 #define _PGMT_WC		(1UL << PG_arch_1)
 #define _PGMT_UC_MINUS		(1UL << PG_uncached)
-#define _PGMT_WB		(1UL << PG_uncached | 1UL << PG_arch_1)
+#define _PGMT_WT		(1UL << PG_uncached | 1UL << PG_arch_1)
 #define _PGMT_MASK		(1UL << PG_uncached | 1UL << PG_arch_1)
 #define _PGMT_CLEAR_MASK	(~_PGMT_MASK)
 
@@ -88,14 +92,14 @@ static inline enum page_cache_mode get_page_memtype(struct page *pg)
 {
 	unsigned long pg_flags = pg->flags & _PGMT_MASK;
 
-	if (pg_flags == _PGMT_DEFAULT)
-		return -1;
+	if (pg_flags == _PGMT_WB)
+		return _PAGE_CACHE_MODE_WB;
 	else if (pg_flags == _PGMT_WC)
 		return _PAGE_CACHE_MODE_WC;
 	else if (pg_flags == _PGMT_UC_MINUS)
 		return _PAGE_CACHE_MODE_UC_MINUS;
 	else
-		return _PAGE_CACHE_MODE_WB;
+		return _PAGE_CACHE_MODE_WT;
 }
 
 static inline void set_page_memtype(struct page *pg,
@@ -112,11 +116,12 @@ static inline void set_page_memtype(struct page *pg,
 	case _PAGE_CACHE_MODE_UC_MINUS:
 		memtype_flags = _PGMT_UC_MINUS;
 		break;
-	case _PAGE_CACHE_MODE_WB:
-		memtype_flags = _PGMT_WB;
+	case _PAGE_CACHE_MODE_WT:
+		memtype_flags = _PGMT_WT;
 		break;
+	case _PAGE_CACHE_MODE_WB:
 	default:
-		memtype_flags = _PGMT_DEFAULT;
+		memtype_flags = _PGMT_WB;
 		break;
 	}
 
@@ -365,8 +370,11 @@ static int pat_pagerange_is_ram(resource_size_t start, resource_size_t end)
 
 /*
  * For RAM pages, we use page flags to mark the pages with appropriate type.
- * The page flags are limited to three types, WB, WC and UC-.
- * WT and WP requests fail with -EINVAL, and UC gets redirected to UC-.
+ * The page flags are limited to four types, WB (default), WC, WT and UC-.
+ * WP request fails with -EINVAL, and UC gets redirected to UC-.  Setting
+ * a new memory type is only allowed to a page mapped with the default WB
+ * type.
+ *
  * Here we do two pass:
  * - Find the memtype of all the pages in the range, look for any conflicts
  * - In case of no conflicts, set the new memtype for pages in the range
@@ -378,8 +386,7 @@ static int reserve_ram_pages_type(u64 start, u64 end,
 	struct page *page;
 	u64 pfn;
 
-	if ((req_type == _PAGE_CACHE_MODE_WT) ||
-	    (req_type == _PAGE_CACHE_MODE_WP)) {
+	if (req_type == _PAGE_CACHE_MODE_WP) {
 		if (new_type)
 			*new_type = _PAGE_CACHE_MODE_UC_MINUS;
 		return -EINVAL;
@@ -396,7 +403,7 @@ static int reserve_ram_pages_type(u64 start, u64 end,
 
 		page = pfn_to_page(pfn);
 		type = get_page_memtype(page);
-		if (type != -1) {
+		if (type != _PAGE_CACHE_MODE_WB) {
 			pr_info("reserve_ram_pages_type failed [mem %#010Lx-%#010Lx], track 0x%x, req 0x%x\n",
 				start, end - 1, type, req_type);
 			if (new_type)
@@ -423,7 +430,7 @@ static int free_ram_pages_type(u64 start, u64 end)
 
 	for (pfn = (start >> PAGE_SHIFT); pfn < (end >> PAGE_SHIFT); ++pfn) {
 		page = pfn_to_page(pfn);
-		set_page_memtype(page, -1);
+		set_page_memtype(page, _PAGE_CACHE_MODE_WB);
 	}
 	return 0;
 }
@@ -568,7 +575,7 @@ int free_memtype(u64 start, u64 end)
  * Only to be called when PAT is enabled
  *
  * Returns _PAGE_CACHE_MODE_WB, _PAGE_CACHE_MODE_WC, _PAGE_CACHE_MODE_UC_MINUS
- * or _PAGE_CACHE_MODE_UC
+ * or _PAGE_CACHE_MODE_WT.
  */
 static enum page_cache_mode lookup_memtype(u64 paddr)
 {
@@ -580,16 +587,9 @@ static enum page_cache_mode lookup_memtype(u64 paddr)
 
 	if (pat_pagerange_is_ram(paddr, paddr + PAGE_SIZE)) {
 		struct page *page;
-		page = pfn_to_page(paddr >> PAGE_SHIFT);
-		rettype = get_page_memtype(page);
-		/*
-		 * -1 from get_page_memtype() implies RAM page is in its
-		 * default state and not reserved, and hence of type WB
-		 */
-		if (rettype == -1)
-			rettype = _PAGE_CACHE_MODE_WB;
 
-		return rettype;
+		page = pfn_to_page(paddr >> PAGE_SHIFT);
+		return get_page_memtype(page);
 	}
 
 	spin_lock(&memtype_lock);

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 9/12] x86, mm: Add set_memory_wt() for WT
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (7 preceding siblings ...)
  2015-05-27 15:19 ` [PATCH v10 8/12] x86, mm, asm: Add WT support to set_page_memtype() Toshi Kani
@ 2015-05-27 15:19 ` Toshi Kani
  2015-05-27 15:19 ` [PATCH v10 10/12] x86, mm, pat: Cleanup init flags in pat_init() Toshi Kani
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:19 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

Now that reserve_ram_pages_type() accepts the WT type, this patch
adds set_memory_wt(), set_memory_array_wt() and set_pages_array_wt()
for setting the WT type on regular memory.

ioremap_change_attr() is also extended to accept the WT type.
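
For illustration, a driver is expected to use the new interface paired
with set_memory_wb(), as with the existing set_memory_[uc|wc] interfaces
(a minimal sketch; addr and numpages are hypothetical):

	ret = set_memory_wt(addr, numpages);	/* UC- via set_memory_uc() without PAT */
	if (ret)
		return ret;

	/* ... access the range with write-through caching ... */

	set_memory_wb(addr, numpages);		/* switch back to WB after use */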

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 Documentation/x86/pat.txt         |    9 +++--
 arch/x86/include/asm/cacheflush.h |    6 +++
 arch/x86/mm/ioremap.c             |    3 ++
 arch/x86/mm/pageattr.c            |   65 ++++++++++++++++++++++++++++++-------
 4 files changed, 66 insertions(+), 17 deletions(-)

diff --git a/Documentation/x86/pat.txt b/Documentation/x86/pat.txt
index be7b8c2..bf4339c 100644
--- a/Documentation/x86/pat.txt
+++ b/Documentation/x86/pat.txt
@@ -46,6 +46,9 @@ set_memory_uc          |    UC-   |    --      |       --         |
 set_memory_wc          |    WC    |    --      |       --         |
  set_memory_wb         |          |            |                  |
                        |          |            |                  |
+set_memory_wt          |    WT    |    --      |       --         |
+ set_memory_wb         |          |            |                  |
+                       |          |            |                  |
 pci sysfs resource     |    --    |    --      |       UC-        |
                        |          |            |                  |
 pci sysfs resource_wc  |    --    |    --      |       WC         |
@@ -117,8 +120,8 @@ can be more restrictive, in case of any existing aliasing for that address.
 For example: If there is an existing uncached mapping, a new ioremap_wc can
 return uncached mapping in place of write-combine requested.
 
-set_memory_[uc|wc] and set_memory_wb should be used in pairs, where driver will
-first make a region uc or wc and switch it back to wb after use.
+set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs, where driver
+will first make a region uc, wc or wt and switch it back to wb after use.
 
 Over time writes to /proc/mtrr will be deprecated in favor of using PAT based
 interfaces. Users writing to /proc/mtrr are suggested to use above interfaces.
@@ -126,7 +129,7 @@ interfaces. Users writing to /proc/mtrr are suggested to use above interfaces.
 Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
 types.
 
-Drivers should use set_memory_[uc|wc] to set access type for RAM ranges.
+Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges.
 
 
 PAT debugging
diff --git a/arch/x86/include/asm/cacheflush.h b/arch/x86/include/asm/cacheflush.h
index 47c8e32..b6f7457 100644
--- a/arch/x86/include/asm/cacheflush.h
+++ b/arch/x86/include/asm/cacheflush.h
@@ -8,7 +8,7 @@
 /*
  * The set_memory_* API can be used to change various attributes of a virtual
  * address range. The attributes include:
- * Cachability   : UnCached, WriteCombining, WriteBack
+ * Cachability   : UnCached, WriteCombining, WriteThrough, WriteBack
  * Executability : eXeutable, NoteXecutable
  * Read/Write    : ReadOnly, ReadWrite
  * Presence      : NotPresent
@@ -35,9 +35,11 @@
 
 int _set_memory_uc(unsigned long addr, int numpages);
 int _set_memory_wc(unsigned long addr, int numpages);
+int _set_memory_wt(unsigned long addr, int numpages);
 int _set_memory_wb(unsigned long addr, int numpages);
 int set_memory_uc(unsigned long addr, int numpages);
 int set_memory_wc(unsigned long addr, int numpages);
+int set_memory_wt(unsigned long addr, int numpages);
 int set_memory_wb(unsigned long addr, int numpages);
 int set_memory_x(unsigned long addr, int numpages);
 int set_memory_nx(unsigned long addr, int numpages);
@@ -48,10 +50,12 @@ int set_memory_4k(unsigned long addr, int numpages);
 
 int set_memory_array_uc(unsigned long *addr, int addrinarray);
 int set_memory_array_wc(unsigned long *addr, int addrinarray);
+int set_memory_array_wt(unsigned long *addr, int addrinarray);
 int set_memory_array_wb(unsigned long *addr, int addrinarray);
 
 int set_pages_array_uc(struct page **pages, int addrinarray);
 int set_pages_array_wc(struct page **pages, int addrinarray);
+int set_pages_array_wt(struct page **pages, int addrinarray);
 int set_pages_array_wb(struct page **pages, int addrinarray);
 
 /*
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index ae8c284..7e702dc 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -42,6 +42,9 @@ int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 	case _PAGE_CACHE_MODE_WC:
 		err = _set_memory_wc(vaddr, nrpages);
 		break;
+	case _PAGE_CACHE_MODE_WT:
+		err = _set_memory_wt(vaddr, nrpages);
+		break;
 	case _PAGE_CACHE_MODE_WB:
 		err = _set_memory_wb(vaddr, nrpages);
 		break;
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 89af288..6427273 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1502,12 +1502,10 @@ EXPORT_SYMBOL(set_memory_uc);
 static int _set_memory_array(unsigned long *addr, int addrinarray,
 		enum page_cache_mode new_type)
 {
+	enum page_cache_mode set_type;
 	int i, j;
 	int ret;
 
-	/*
-	 * for now UC MINUS. see comments in ioremap_nocache()
-	 */
 	for (i = 0; i < addrinarray; i++) {
 		ret = reserve_memtype(__pa(addr[i]), __pa(addr[i]) + PAGE_SIZE,
 					new_type, NULL);
@@ -1515,9 +1513,12 @@ static int _set_memory_array(unsigned long *addr, int addrinarray,
 			goto out_free;
 	}
 
+	/* If WC, set to UC- first and then WC */
+	set_type = (new_type == _PAGE_CACHE_MODE_WC) ?
+				_PAGE_CACHE_MODE_UC_MINUS : new_type;
+
 	ret = change_page_attr_set(addr, addrinarray,
-				   cachemode2pgprot(_PAGE_CACHE_MODE_UC_MINUS),
-				   1);
+				   cachemode2pgprot(set_type), 1);
 
 	if (!ret && new_type == _PAGE_CACHE_MODE_WC)
 		ret = change_page_attr_set_clr(addr, addrinarray,
@@ -1549,6 +1550,12 @@ int set_memory_array_wc(unsigned long *addr, int addrinarray)
 }
 EXPORT_SYMBOL(set_memory_array_wc);
 
+int set_memory_array_wt(unsigned long *addr, int addrinarray)
+{
+	return _set_memory_array(addr, addrinarray, _PAGE_CACHE_MODE_WT);
+}
+EXPORT_SYMBOL_GPL(set_memory_array_wt);
+
 int _set_memory_wc(unsigned long addr, int numpages)
 {
 	int ret;
@@ -1577,21 +1584,42 @@ int set_memory_wc(unsigned long addr, int numpages)
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_MODE_WC, NULL);
 	if (ret)
-		goto out_err;
+		return ret;
 
 	ret = _set_memory_wc(addr, numpages);
 	if (ret)
-		goto out_free;
-
-	return 0;
+		free_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE);
 
-out_free:
-	free_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE);
-out_err:
 	return ret;
 }
 EXPORT_SYMBOL(set_memory_wc);
 
+int _set_memory_wt(unsigned long addr, int numpages)
+{
+	return change_page_attr_set(&addr, numpages,
+				    cachemode2pgprot(_PAGE_CACHE_MODE_WT), 0);
+}
+
+int set_memory_wt(unsigned long addr, int numpages)
+{
+	int ret;
+
+	if (!pat_enabled)
+		return set_memory_uc(addr, numpages);
+
+	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
+			      _PAGE_CACHE_MODE_WT, NULL);
+	if (ret)
+		return ret;
+
+	ret = _set_memory_wt(addr, numpages);
+	if (ret)
+		free_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(set_memory_wt);
+
 int _set_memory_wb(unsigned long addr, int numpages)
 {
 	/* WB cache mode is hard wired to all cache attribute bits being 0 */
@@ -1682,6 +1710,7 @@ static int _set_pages_array(struct page **pages, int addrinarray,
 {
 	unsigned long start;
 	unsigned long end;
+	enum page_cache_mode set_type;
 	int i;
 	int free_idx;
 	int ret;
@@ -1695,8 +1724,12 @@ static int _set_pages_array(struct page **pages, int addrinarray,
 			goto err_out;
 	}
 
+	/* If WC, set to UC- first and then WC */
+	set_type = (new_type == _PAGE_CACHE_MODE_WC) ?
+				_PAGE_CACHE_MODE_UC_MINUS : new_type;
+
 	ret = cpa_set_pages_array(pages, addrinarray,
-			cachemode2pgprot(_PAGE_CACHE_MODE_UC_MINUS));
+				  cachemode2pgprot(set_type));
 	if (!ret && new_type == _PAGE_CACHE_MODE_WC)
 		ret = change_page_attr_set_clr(NULL, addrinarray,
 					       cachemode2pgprot(
@@ -1730,6 +1763,12 @@ int set_pages_array_wc(struct page **pages, int addrinarray)
 }
 EXPORT_SYMBOL(set_pages_array_wc);
 
+int set_pages_array_wt(struct page **pages, int addrinarray)
+{
+	return _set_pages_array(pages, addrinarray, _PAGE_CACHE_MODE_WT);
+}
+EXPORT_SYMBOL_GPL(set_pages_array_wt);
+
 int set_pages_wb(struct page *page, int numpages)
 {
 	unsigned long addr = (unsigned long)page_address(page);

--
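[The pat.txt hunk above documents that set_memory_[uc|wc|wt] and set_memory_wb
are used in pairs.  A minimal sketch of that pairing for a WT range follows;
"my_buf" and "nr_pages" are placeholders, not code from this series.]

	/*
	 * Hedged sketch: switch a RAM range to write-through, use it,
	 * then restore write-back.  "my_buf" and "nr_pages" are made up.
	 */
	unsigned long addr = (unsigned long)my_buf;
	int err;

	err = set_memory_wt(addr, nr_pages);
	if (err)
		return err;

	/* ... device accesses the write-through range ... */

	set_memory_wb(addr, nr_pages);	/* switch back to WB after use */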

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 10/12] x86, mm, pat: Cleanup init flags in pat_init()
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (8 preceding siblings ...)
  2015-05-27 15:19 ` [PATCH v10 9/12] x86, mm: Add set_memory_wt() for WT Toshi Kani
@ 2015-05-27 15:19 ` Toshi Kani
  2015-05-29  8:59   ` Borislav Petkov
  2015-05-27 15:19 ` [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
  2015-05-27 15:19 ` [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt() Toshi Kani
  11 siblings, 1 reply; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:19 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

pat_init() uses two flags, 'boot_cpu' and 'boot_pat_state', for
tracking the boot CPU's initialization status.  'boot_pat_state'
is also overloaded to carry the boot PAT value.

This patch cleans this up by replacing them with a new single
flag, 'boot_cpu_done', to track the boot CPU's initialization
status.  'boot_pat_state' is only used to carry the boot PAT
value as a result.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 arch/x86/mm/pat.c |   42 ++++++++++++++++++++----------------------
 1 file changed, 20 insertions(+), 22 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 92fc635..7cfd995 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -201,26 +201,31 @@ void pat_init_cache_modes(void)
 void pat_init(void)
 {
 	u64 pat;
-	bool boot_cpu = !boot_pat_state;
 	struct cpuinfo_x86 *c = &boot_cpu_data;
+	static bool boot_cpu_done;
 
 	if (!pat_enabled)
 		return;
 
-	if (!cpu_has_pat) {
-		if (!boot_pat_state) {
+	if (!boot_cpu_done) {
+		if (!cpu_has_pat) {
 			pat_disable("PAT not supported by CPU.");
 			return;
-		} else {
-			/*
-			 * If this happens we are on a secondary CPU, but
-			 * switched to PAT on the boot CPU. We have no way to
-			 * undo PAT.
-			 */
-			printk(KERN_ERR "PAT enabled, "
-			       "but not supported by secondary CPU\n");
-			BUG();
 		}
+
+		rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
+		if (!boot_pat_state) {
+			pat_disable("PAT read returns always zero, disabled.");
+			return;
+		}
+	} else if (!cpu_has_pat) {
+		/*
+		 * If this happens we are on a secondary CPU, but
+		 * switched to PAT on the boot CPU. We have no way to
+		 * undo PAT.
+		 */
+		pr_err("PAT enabled, but not supported by secondary CPU\n");
+		BUG();
 	}
 
 	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
@@ -279,19 +284,12 @@ void pat_init(void)
 		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
 	}
 
-	/* Boot CPU check */
-	if (!boot_pat_state) {
-		rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
-		if (!boot_pat_state) {
-			pat_disable("PAT read returns always zero, disabled.");
-			return;
-		}
-	}
-
 	wrmsrl(MSR_IA32_CR_PAT, pat);
 
-	if (boot_cpu)
+	if (!boot_cpu_done) {
 		pat_init_cache_modes();
+		boot_cpu_done = true;
+	}
 }
 
 #undef PAT

--

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (9 preceding siblings ...)
  2015-05-27 15:19 ` [PATCH v10 10/12] x86, mm, pat: Cleanup init flags in pat_init() Toshi Kani
@ 2015-05-27 15:19 ` Toshi Kani
  2015-05-29  8:58   ` Borislav Petkov
  2015-05-27 15:19 ` [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt() Toshi Kani
  11 siblings, 1 reply; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:19 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

This patch refactors the !pat_enabled code paths and integrates
them into the PAT abstraction code.  The PAT table is emulated by
corresponding to the two cache attribute bits, PWT (Write Through)
and PCD (Cache Disable).  The emulated PAT table is the same as the
BIOS default setup when the system has PAT but the "nopat" boot
option is specified.  The emulated PAT table is also used when
MSR_IA32_CR_PAT returns 0 (9d34cfdf4).

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
 arch/x86/mm/init.c     |    6 ++--
 arch/x86/mm/iomap_32.c |   12 ++++---
 arch/x86/mm/ioremap.c  |   10 +-----
 arch/x86/mm/pageattr.c |    6 ----
 arch/x86/mm/pat.c      |   77 +++++++++++++++++++++++++++++-------------------
 5 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 1d55318..8533b46 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -40,7 +40,7 @@
  */
 uint16_t __cachemode2pte_tbl[_PAGE_CACHE_MODE_NUM] = {
 	[_PAGE_CACHE_MODE_WB      ]	= 0         | 0        ,
-	[_PAGE_CACHE_MODE_WC      ]	= _PAGE_PWT | 0        ,
+	[_PAGE_CACHE_MODE_WC      ]	= 0         | _PAGE_PCD,
 	[_PAGE_CACHE_MODE_UC_MINUS]	= 0         | _PAGE_PCD,
 	[_PAGE_CACHE_MODE_UC      ]	= _PAGE_PWT | _PAGE_PCD,
 	[_PAGE_CACHE_MODE_WT      ]	= 0         | _PAGE_PCD,
@@ -50,11 +50,11 @@ EXPORT_SYMBOL(__cachemode2pte_tbl);
 
 uint8_t __pte2cachemode_tbl[8] = {
 	[__pte2cm_idx( 0        | 0         | 0        )] = _PAGE_CACHE_MODE_WB,
-	[__pte2cm_idx(_PAGE_PWT | 0         | 0        )] = _PAGE_CACHE_MODE_WC,
+	[__pte2cm_idx(_PAGE_PWT | 0         | 0        )] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx( 0        | _PAGE_PCD | 0        )] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | 0        )] = _PAGE_CACHE_MODE_UC,
 	[__pte2cm_idx( 0        | 0         | _PAGE_PAT)] = _PAGE_CACHE_MODE_WB,
-	[__pte2cm_idx(_PAGE_PWT | 0         | _PAGE_PAT)] = _PAGE_CACHE_MODE_WC,
+	[__pte2cm_idx(_PAGE_PWT | 0         | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(0         | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC_MINUS,
 	[__pte2cm_idx(_PAGE_PWT | _PAGE_PCD | _PAGE_PAT)] = _PAGE_CACHE_MODE_UC,
 };
diff --git a/arch/x86/mm/iomap_32.c b/arch/x86/mm/iomap_32.c
index 9ca35fc..2c51a2b 100644
--- a/arch/x86/mm/iomap_32.c
+++ b/arch/x86/mm/iomap_32.c
@@ -77,13 +77,13 @@ void __iomem *
 iomap_atomic_prot_pfn(unsigned long pfn, pgprot_t prot)
 {
 	/*
-	 * For non-PAT systems, promote PAGE_KERNEL_WC to PAGE_KERNEL_UC_MINUS.
-	 * PAGE_KERNEL_WC maps to PWT, which translates to uncached if the
-	 * MTRR is UC or WC.  UC_MINUS gets the real intention, of the
-	 * user, which is "WC if the MTRR is WC, UC if you can't do that."
+	 * For non-PAT systems, translate non-WB request to UC- just in
+	 * case the caller set the PWT bit to prot directly without using
+	 * pgprot_writecombine(). UC- translates to uncached if the MTRR
+	 * is UC or WC. UC- gets the real intention, of the user, which is
+	 * "WC if the MTRR is WC, UC if you can't do that."
 	 */
-	if (!pat_enabled && pgprot_val(prot) ==
-	    (__PAGE_KERNEL | cachemode2protval(_PAGE_CACHE_MODE_WC)))
+	if (!pat_enabled && pgprot2cachemode(prot) != _PAGE_CACHE_MODE_WB)
 		prot = __pgprot(__PAGE_KERNEL |
 				cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS));
 
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 7e702dc..f966129 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -265,11 +265,8 @@ EXPORT_SYMBOL(ioremap_nocache);
  */
 void __iomem *ioremap_wc(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WC,
 					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
 }
 EXPORT_SYMBOL(ioremap_wc);
 
@@ -285,11 +282,8 @@ EXPORT_SYMBOL(ioremap_wc);
  */
 void __iomem *ioremap_wt(resource_size_t phys_addr, unsigned long size)
 {
-	if (pat_enabled)
-		return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
+	return __ioremap_caller(phys_addr, size, _PAGE_CACHE_MODE_WT,
 					__builtin_return_address(0));
-	else
-		return ioremap_nocache(phys_addr, size);
 }
 EXPORT_SYMBOL(ioremap_wt);
 
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 6427273..5a25e95 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1578,9 +1578,6 @@ int set_memory_wc(unsigned long addr, int numpages)
 {
 	int ret;
 
-	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
-
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 		_PAGE_CACHE_MODE_WC, NULL);
 	if (ret)
@@ -1604,9 +1601,6 @@ int set_memory_wt(unsigned long addr, int numpages)
 {
 	int ret;
 
-	if (!pat_enabled)
-		return set_memory_uc(addr, numpages);
-
 	ret = reserve_memtype(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,
 			      _PAGE_CACHE_MODE_WT, NULL);
 	if (ret)
diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 7cfd995..0533867 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -186,7 +186,11 @@ void pat_init_cache_modes(void)
 	char pat_msg[33];
 	u64 pat;
 
-	rdmsrl(MSR_IA32_CR_PAT, pat);
+	if (pat_enabled)
+		rdmsrl(MSR_IA32_CR_PAT, pat);
+	else
+		pat = boot_pat_state;
+
 	pat_msg[32] = 0;
 	for (i = 7; i >= 0; i--) {
 		cache = pat_get_cache_mode((pat >> (i * 8)) & 7,
@@ -204,21 +208,16 @@ void pat_init(void)
 	struct cpuinfo_x86 *c = &boot_cpu_data;
 	static bool boot_cpu_done;
 
-	if (!pat_enabled)
-		return;
-
 	if (!boot_cpu_done) {
-		if (!cpu_has_pat) {
+		if (!cpu_has_pat)
 			pat_disable("PAT not supported by CPU.");
-			return;
-		}
 
-		rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
-		if (!boot_pat_state) {
-			pat_disable("PAT read returns always zero, disabled.");
-			return;
+		if (pat_enabled) {
+			rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
+			if (!boot_pat_state)
+				pat_disable("PAT read returns always zero, disabled.");
 		}
-	} else if (!cpu_has_pat) {
+	} else if (!cpu_has_pat && pat_enabled) {
 		/*
 		 * If this happens we are on a secondary CPU, but
 		 * switched to PAT on the boot CPU. We have no way to
@@ -228,9 +227,35 @@ void pat_init(void)
 		BUG();
 	}
 
-	if ((c->x86_vendor == X86_VENDOR_INTEL) &&
-	    (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
-	     ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
+	if (!pat_enabled) {
+		/*
+		 * No PAT. Emulate the PAT table that corresponds to the two
+		 * cache bits, PWT (Write Through) and PCD (Cache Disable).
+		 * This setup is the same as the BIOS default setup when the
+		 * system has PAT but the "nopat" boot option is specified.
+		 * This emulated PAT table is also used when MSR_IA32_CR_PAT
+		 * returns 0.
+		 *
+		 *  PTE encoding used in Linux:
+		 *       PCD
+		 *       |PWT  PAT
+		 *       ||    slot
+		 *       00    0    WB : _PAGE_CACHE_MODE_WB
+		 *       01    1    WT : _PAGE_CACHE_MODE_WT
+		 *       10    2    UC-: _PAGE_CACHE_MODE_UC_MINUS
+		 *       11    3    UC : _PAGE_CACHE_MODE_UC
+		 *
+		 * NOTE: When WC or WP is used, it is redirected to UC- per
+		 * the default setup in __cachemode2pte_tbl[].
+		 */
+		pat = PAT(0, WB) | PAT(1, WT) | PAT(2, UC_MINUS) | PAT(3, UC) |
+		      PAT(4, WB) | PAT(5, WT) | PAT(6, UC_MINUS) | PAT(7, UC);
+		if (!boot_pat_state)
+			boot_pat_state = pat;
+
+	} else if ((c->x86_vendor == X86_VENDOR_INTEL) &&
+		   (((c->x86 == 0x6) && (c->x86_model <= 0xd)) ||
+		    ((c->x86 == 0xf) && (c->x86_model <= 0x6)))) {
 		/*
 		 * PAT support with the lower four entries. Intel Pentium 2,
 		 * 3, M, and 4 are affected by PAT errata, which makes the
@@ -284,7 +309,8 @@ void pat_init(void)
 		      PAT(4, WB) | PAT(5, WC) | PAT(6, UC_MINUS) | PAT(7, WT);
 	}
 
-	wrmsrl(MSR_IA32_CR_PAT, pat);
+	if (pat_enabled)
+		wrmsrl(MSR_IA32_CR_PAT, pat);
 
 	if (!boot_cpu_done) {
 		pat_init_cache_modes();
@@ -457,13 +483,8 @@ int reserve_memtype(u64 start, u64 end, enum page_cache_mode req_type,
 	BUG_ON(start >= end); /* end is exclusive */
 
 	if (!pat_enabled) {
-		/* WB and UC- are the only types supported without PAT */
-		if (new_type) {
-			if (req_type == _PAGE_CACHE_MODE_WB)
-				*new_type = _PAGE_CACHE_MODE_WB;
-			else
-				*new_type = _PAGE_CACHE_MODE_UC_MINUS;
-		}
+		if (new_type)
+			*new_type = req_type;
 		return 0;
 	}
 
@@ -962,21 +983,15 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 
 pgprot_t pgprot_writecombine(pgprot_t prot)
 {
-	if (pat_enabled)
-		return __pgprot(pgprot_val(prot) |
+	return __pgprot(pgprot_val(prot) |
 				cachemode2protval(_PAGE_CACHE_MODE_WC));
-	else
-		return pgprot_noncached(prot);
 }
 EXPORT_SYMBOL_GPL(pgprot_writecombine);
 
 pgprot_t pgprot_writethrough(pgprot_t prot)
 {
-	if (pat_enabled)
-		return __pgprot(pgprot_val(prot) |
+	return __pgprot(pgprot_val(prot) |
 				cachemode2protval(_PAGE_CACHE_MODE_WT));
-	else
-		return pgprot_noncached(prot);
 }
 EXPORT_SYMBOL_GPL(pgprot_writethrough);
 

--
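[To make the effect of the init.c table change above concrete, a short
illustrative walk through the two static tables.  This is not code from the
patch; it only restates the "WC is redirected to UC-" note in the pat.c
comment above.]

	/*
	 * Illustrative only: encode a WC request with the static
	 * __cachemode2pte_tbl[] above, then decode the resulting PTE bits.
	 */
	unsigned long bits = cachemode2protval(_PAGE_CACHE_MODE_WC);
					/* == _PAGE_PCD per the new table */
	enum page_cache_mode eff = pgprot2cachemode(__pgprot(bits));
					/* == _PAGE_CACHE_MODE_UC_MINUS   */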

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
                   ` (10 preceding siblings ...)
  2015-05-27 15:19 ` [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
@ 2015-05-27 15:19 ` Toshi Kani
  2015-05-29  9:11   ` Borislav Petkov
  11 siblings, 1 reply; 33+ messages in thread
From: Toshi Kani @ 2015-05-27 15:19 UTC (permalink / raw)
  To: hpa, tglx, mingo, akpm, arnd
  Cc: linux-mm, linux-kernel, x86, linux-nvdimm, jgross, stefan.bader,
	luto, hmh, yigal, konrad.wilk, Elliott, mcgrof, hch, Toshi Kani

The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
write back the contents of the CPU caches in case of a crash.

This patch changes to use ioremap_wt(), which provides uncached
writes but cached reads, for improving read performance.

Signed-off-by: Toshi Kani <toshi.kani@hp.com>
---
 drivers/block/pmem.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
index eabf4a8..095dfaa 100644
--- a/drivers/block/pmem.c
+++ b/drivers/block/pmem.c
@@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
 	}
 
 	/*
-	 * Map the memory as non-cachable, as we can't write back the contents
+	 * Map the memory as write-through, as we can't write back the contents
 	 * of the CPU caches in case of a crash.
 	 */
 	err = -ENOMEM;
-	pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
+	pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
 	if (!pmem->virt_addr)
 		goto out_release_region;
 

--
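[As a usage note for the ioremap_wt() call introduced above, a minimal sketch
of mapping and unmapping a physical range write-through; "phys" and "len" are
placeholders, not fields of the pmem driver.]

	/*
	 * Hedged sketch: map a range WT (reads may be cached, writes are
	 * pushed through to the device), touch it, and tear it down.
	 * This is not pmem driver code.
	 */
	void __iomem *va = ioremap_wt(phys, len);
	if (!va)
		return -ENOMEM;

	writel(0, va);		/* write is pushed through to the device */

	iounmap(va);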

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling
  2015-05-27 15:19 ` [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
@ 2015-05-29  8:58   ` Borislav Petkov
  2015-05-29 14:27     ` Toshi Kani
  0 siblings, 1 reply; 33+ messages in thread
From: Borislav Petkov @ 2015-05-29  8:58 UTC (permalink / raw)
  To: Toshi Kani
  Cc: hpa, tglx, mingo, akpm, arnd, linux-mm, linux-kernel, x86,
	linux-nvdimm, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Elliott, mcgrof, hch

On Wed, May 27, 2015 at 09:19:03AM -0600, Toshi Kani wrote:
> This patch refactors the !pat_enabled code paths and integrates

Please refrain from using such empty phrases like "This patch does this
and that" in your commit messages - it is implicitly obvious that it is
"this patch" when one reads it.

> them into the PAT abstraction code.  The PAT table is emulated by
> corresponding to the two cache attribute bits, PWT (Write Through)
> and PCD (Cache Disable).  The emulated PAT table is the same as the
> BIOS default setup when the system has PAT but the "nopat" boot
> option is specified.  The emulated PAT table is also used when
> MSR_IA32_CR_PAT returns 0 (9d34cfdf4).

9d34cfdf4 - what is that thing? A commit message? If so, we quote them
like this:

  9d34cfdf4796 ("x86: Don't rely on VMWare emulating PAT MSR correctly")

note the 12 chars length of the commit id.

> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> Reviewed-by: Juergen Gross <jgross@suse.com>
> ---
>  arch/x86/mm/init.c     |    6 ++--
>  arch/x86/mm/iomap_32.c |   12 ++++---
>  arch/x86/mm/ioremap.c  |   10 +-----
>  arch/x86/mm/pageattr.c |    6 ----
>  arch/x86/mm/pat.c      |   77 +++++++++++++++++++++++++++++-------------------
>  5 files changed, 57 insertions(+), 54 deletions(-)

So I started applying your pile and everything was ok-ish until I came
upon this trainwreck. You have a lot of changes in here, the commit
message is certainly lacking sufficient explanation as to why and this
patch is changing stuff which the previous one adds.

So a lot of unnecessary code movement.

Then you have stuff like this:

	+       } else if (!cpu_has_pat && pat_enabled) {

How can a CPU not have PAT but have it enabled?!?

So this is not how we do patchsets.

Please do the cleanups *first*. Do them in small, self-contained changes
explaining *why* you're doing them.

*Then* add the new functionality, i.e. the WT.

Oh, and when you do your next version, do the patches against tip/master
because there are a bunch of changes in the PAT code already.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 10/12] x86, mm, pat: Cleanup init flags in pat_init()
  2015-05-27 15:19 ` [PATCH v10 10/12] x86, mm, pat: Cleanup init flags in pat_init() Toshi Kani
@ 2015-05-29  8:59   ` Borislav Petkov
  0 siblings, 0 replies; 33+ messages in thread
From: Borislav Petkov @ 2015-05-29  8:59 UTC (permalink / raw)
  To: Toshi Kani
  Cc: hpa, tglx, mingo, akpm, arnd, linux-mm, linux-kernel, x86,
	linux-nvdimm, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Elliott, mcgrof, hch

On Wed, May 27, 2015 at 09:19:02AM -0600, Toshi Kani wrote:
> pat_init() uses two flags, 'boot_cpu' and 'boot_pat_state', for
> tracking the boot CPU's initialization status.  'boot_pat_state'
> is also overloaded to carry the boot PAT value.
> 
> This patch cleans this up by replacing them with a new single
> flag, 'boot_cpu_done', to track the boot CPU's initialization
> status.  'boot_pat_state' is only used to carry the boot PAT
> value as a result.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> ---
>  arch/x86/mm/pat.c |   42 ++++++++++++++++++++----------------------
>  1 file changed, 20 insertions(+), 22 deletions(-)

...

> +		rdmsrl(MSR_IA32_CR_PAT, boot_pat_state);
> +		if (!boot_pat_state) {
> +			pat_disable("PAT read returns always zero, disabled.");
> +			return;
> +		}
> +	} else if (!cpu_has_pat) {
> +		/*
> +		 * If this happens we are on a secondary CPU, but
> +		 * switched to PAT on the boot CPU. We have no way to
> +		 * undo PAT.
> +		 */
> +		pr_err("PAT enabled, but not supported by secondary CPU\n");
> +		BUG();

These could be replaced with a panic().

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-27 15:19 ` [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt() Toshi Kani
@ 2015-05-29  9:11   ` Borislav Petkov
  2015-05-29 14:43     ` Dan Williams
  0 siblings, 1 reply; 33+ messages in thread
From: Borislav Petkov @ 2015-05-29  9:11 UTC (permalink / raw)
  To: Toshi Kani, Dan Williams, Ross Zwisler
  Cc: hpa, tglx, mingo, akpm, arnd, linux-mm, linux-kernel, x86,
	linux-nvdimm, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Elliott, mcgrof, hch

On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
> write back the contents of the CPU caches in case of a crash.
> 
> This patch changes to use ioremap_wt(), which provides uncached
> writes but cached reads, for improving read performance.
> 
> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> ---
>  drivers/block/pmem.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
> index eabf4a8..095dfaa 100644
> --- a/drivers/block/pmem.c
> +++ b/drivers/block/pmem.c
> @@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
>  	}
>  
>  	/*
> -	 * Map the memory as non-cachable, as we can't write back the contents
> +	 * Map the memory as write-through, as we can't write back the contents
>  	 * of the CPU caches in case of a crash.
>  	 */
>  	err = -ENOMEM;
> -	pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
> +	pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
>  	if (!pmem->virt_addr)
>  		goto out_release_region;

Dan, Ross, what about this one?

ACK to pick it up as a temporary solution?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling
  2015-05-29  8:58   ` Borislav Petkov
@ 2015-05-29 14:27     ` Toshi Kani
  2015-05-29 15:13       ` Borislav Petkov
  0 siblings, 1 reply; 33+ messages in thread
From: Toshi Kani @ 2015-05-29 14:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: hpa, tglx, mingo, akpm, arnd, linux-mm, linux-kernel, x86,
	linux-nvdimm, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Elliott, mcgrof, hch

On Fri, 2015-05-29 at 10:58 +0200, Borislav Petkov wrote:
> On Wed, May 27, 2015 at 09:19:03AM -0600, Toshi Kani wrote:
> > This patch refactors the !pat_enabled code paths and integrates
> 
> Please refrain from using such empty phrases like "This patch does this
> and that" in your commit messages - it is implicitly obvious that it is
> "this patch" when one reads it.
> 
> > them into the PAT abstraction code.  The PAT table is emulated by
> > corresponding to the two cache attribute bits, PWT (Write Through)
> > and PCD (Cache Disable).  The emulated PAT table is the same as the
> > BIOS default setup when the system has PAT but the "nopat" boot
> > option is specified.  The emulated PAT table is also used when
> > MSR_IA32_CR_PAT returns 0 (9d34cfdf4).
> 
> 9d34cfdf4 - what is that thing? A commit message? If so, we quote them
> like this:
> 
>   9d34cfdf4796 ("x86: Don't rely on VMWare emulating PAT MSR correctly")
> 
> note the 12 chars length of the commit id.

Yes, it refers to the commit message above.

> > Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> > Reviewed-by: Juergen Gross <jgross@suse.com>
> > ---
> >  arch/x86/mm/init.c     |    6 ++--
> >  arch/x86/mm/iomap_32.c |   12 ++++---
> >  arch/x86/mm/ioremap.c  |   10 +-----
> >  arch/x86/mm/pageattr.c |    6 ----
> >  arch/x86/mm/pat.c      |   77 +++++++++++++++++++++++++++++-------------------
> >  5 files changed, 57 insertions(+), 54 deletions(-)
> 
> So I started applying your pile and everything was ok-ish until I came
> upon this trainwreck. You have a lot of changes in here, the commit
> message is certainly lacking sufficient explanation as to why and this
> patch is changing stuff which the previous one adds.

This !pat_enabled path cleanup was suggested during review and is
independent from the WT enablement.  So, I thought it'd be better to
place it as an additional change on top of the WT set, so that it'd be
easier to bisect when there is any issue found in the !pat_enabled path.

> So a lot of unnecessary code movement.
>
> Then you have stuff like this:
> 
> 	+       } else if (!cpu_has_pat && pat_enabled) {
> 
> How can a CPU not have PAT but have it enabled?!?

This simply preserves the original error check in the code.  This error
check makes sure that all CPUs have the PAT feature supported when PAT
is enabled.  This error can only happen when heterogeneous CPUs are
installed/emulated on the system/guest.  This check may be paranoid, but
this cleanup is not meant to modify such an error check.

> So this is not how we do patchsets.
> 
> Please do the cleanups *first*. Do them in small, self-contained changes
> explaining *why* you're doing them.
> 
> *Then* add the new functionality, i.e. the WT.

Can you consider the patch 10/12-11/12 as a separate patchset from the
WT series?  If that is OK, I will resubmit 10/12 (BUG->panic) and 11/12
(commit log update). 

> Oh, and when you do your next version, do the patches against tip/master
> because there are a bunch of changes in the PAT code already.

Thanks,
-Toshi

--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29  9:11   ` Borislav Petkov
@ 2015-05-29 14:43     ` Dan Williams
  2015-05-29 15:03       ` Toshi Kani
  0 siblings, 1 reply; 33+ messages in thread
From: Dan Williams @ 2015-05-29 14:43 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Toshi Kani, Ross Zwisler, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Andrew Morton, Arnd Bergmann, linux-mm,
	linux-kernel, X86 ML, linux-nvdimm, jgross, Stefan Bader,
	Andy Lutomirski, hmh, yigal, Konrad Rzeszutek Wilk, Elliott,
	Robert (Server Storage),
	mcgrof, Christoph Hellwig, Matthew Wilcox

On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
>> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
>> write back the contents of the CPU caches in case of a crash.
>>
>> This patch changes to use ioremap_wt(), which provides uncached
>> writes but cached reads, for improving read performance.
>>
>> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
>> ---
>>  drivers/block/pmem.c |    4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
>> index eabf4a8..095dfaa 100644
>> --- a/drivers/block/pmem.c
>> +++ b/drivers/block/pmem.c
>> @@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
>>       }
>>
>>       /*
>> -      * Map the memory as non-cachable, as we can't write back the contents
>> +      * Map the memory as write-through, as we can't write back the contents
>>        * of the CPU caches in case of a crash.
>>        */
>>       err = -ENOMEM;
>> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
>>       if (!pmem->virt_addr)
>>               goto out_release_region;
>
> Dan, Ross, what about this one?
>
> ACK to pick it up as a temporary solution?

I see that is_new_memtype_allowed() is updated to disallow some
combinations, but the manual seems to imply any mixing of memory types
is unsupported.  Which worries me even in the current code where we
have uncached mappings in the driver, and potentially cached DAX
mappings handed out to userspace.

A general quibble separate from this patch is that we don't have a way
of knowing if ioremap() will reject or change our requested memory
type.  Shouldn't the driver be explicitly requesting a known valid
type in advance?

Lastly we now have the PMEM API patches from Ross out for review where
he is assuming cached mappings with non-temporal writes:
https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
This gives us WC semantics on writes which I believe has the nice
property of reducing the number of write transactions to memory.
Also, the numbers in the paper seem to be assuming DAX operation, but
this ioremap_wt() is in the driver and typically behind a file system.
Are the numbers relevant to that usage mode?

--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 14:43     ` Dan Williams
@ 2015-05-29 15:03       ` Toshi Kani
  2015-05-29 18:19         ` Dan Williams
  0 siblings, 1 reply; 33+ messages in thread
From: Toshi Kani @ 2015-05-29 15:03 UTC (permalink / raw)
  To: Dan Williams
  Cc: Borislav Petkov, Ross Zwisler, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Andrew Morton, Arnd Bergmann, linux-mm,
	linux-kernel, X86 ML, linux-nvdimm, jgross, Stefan Bader,
	Andy Lutomirski, hmh, yigal, Konrad Rzeszutek Wilk, Elliott,
	Robert (Server Storage),
	mcgrof, Christoph Hellwig, Matthew Wilcox

On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
> >> write back the contents of the CPU caches in case of a crash.
> >>
> >> This patch changes to use ioremap_wt(), which provides uncached
> >> writes but cached reads, for improving read performance.
> >>
> >> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
> >> ---
> >>  drivers/block/pmem.c |    4 ++--
> >>  1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
> >> index eabf4a8..095dfaa 100644
> >> --- a/drivers/block/pmem.c
> >> +++ b/drivers/block/pmem.c
> >> @@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
> >>       }
> >>
> >>       /*
> >> -      * Map the memory as non-cachable, as we can't write back the contents
> >> +      * Map the memory as write-through, as we can't write back the contents
> >>        * of the CPU caches in case of a crash.
> >>        */
> >>       err = -ENOMEM;
> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
> >>       if (!pmem->virt_addr)
> >>               goto out_release_region;
> >
> > Dan, Ross, what about this one?
> >
> > ACK to pick it up as a temporary solution?
> 
> I see that is_new_memtype_allowed() is updated to disallow some
> combinations, but the manual seems to imply any mixing of memory types
> is unsupported.  Which worries me even in the current code where we
> have uncached mappings in the driver, and potentially cached DAX
> mappings handed out to userspace.

is_new_memtype_allowed() is not to allow some combinations of mixing of
memory types.  When it is allowed, the requested type of ioremap_xxx()
is changed to match with the existing map type, so that mixing of memory
types does not happen.
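
[To make that flow concrete, a simplified sketch of the reconciliation
described above.  This is illustrative only, not the actual reserve_memtype()
or is_new_memtype_allowed() code, and the helper name below is made up.]

	/*
	 * Illustrative only: when a new request overlaps an existing
	 * mapping, the request is either rewritten to the existing type
	 * or rejected -- the two types are never actually mixed.
	 */
	static int reconcile_memtype(enum page_cache_mode existing,
				     enum page_cache_mode *req)
	{
		if (*req == existing)
			return 0;			/* no conflict */
		if (downgrade_allowed(existing, *req))	/* hypothetical helper */
			*req = existing;		/* caller gets 'existing' */
		else
			return -EBUSY;			/* mapping request fails */
		return 0;
	}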

DAX uses vm_insert_mixed(), which does not even check the existing map
type to the physical address.

> A general quibble separate from this patch is that we don't have a way
> of knowing if ioremap() will reject or change our requested memory
> type.  Shouldn't the driver be explicitly requesting a known valid
> type in advance?

I agree we need a solution here.

> Lastly we now have the PMEM API patches from Ross out for review where
> he is assuming cached mappings with non-temporal writes:
> https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
> This gives us WC semantics on writes which I believe has the nice
> property of reducing the number of write transactions to memory.
> Also, the numbers in the paper seem to be assuming DAX operation, but
> this ioremap_wt() is in the driver and typically behind a file system.
> Are the numbers relevant to that usage mode?

I have not looked into the Ross's changes yet, but they do not seem to
replace the use of ioremap_nocache().  If his changes can use WB type
reliably, yes, we do not need a temporary solution of using ioremap_wt()
in this driver.

Thanks,
-Toshi

--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling
  2015-05-29 14:27     ` Toshi Kani
@ 2015-05-29 15:13       ` Borislav Petkov
  2015-05-29 15:17         ` Toshi Kani
  0 siblings, 1 reply; 33+ messages in thread
From: Borislav Petkov @ 2015-05-29 15:13 UTC (permalink / raw)
  To: Toshi Kani
  Cc: hpa, tglx, mingo, akpm, arnd, linux-mm, linux-kernel, x86,
	linux-nvdimm, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Elliott, mcgrof, hch

On Fri, May 29, 2015 at 08:27:08AM -0600, Toshi Kani wrote:
> This simply preserves the original error check in the code.  This error
> check makes sure that all CPUs have the PAT feature supported when PAT
> is enabled.  This error can only happen when heterogeneous CPUs are
> installed/emulated on the system/guest.  This check may be paranoid, but
> this cleanup is not meant to modify such an error check.

No, this is a ridiculous attempt to justify crazy code. Please do it
right. If the cleanup makes the code more insane than it is, then don't
do it in the first place.

> Can you consider the patch 10/12-11/12 as a separate patchset from the
> WT series?  If that is OK, I will resubmit 10/12 (BUG->panic) and 11/12
> (commit log update).

That's not enough. 11/12 is a convoluted mess which needs splitting and
more detailed explanations in the commit messages.

So no. Read what I said: do the cleanup *first*, *then* add the new
functionality.

The WT patches shouldn't change all too much from what you have now.
Also, 11/12 changes stuff which you add in 1/12. This churn is useless
and shouldn't be there at all.

So you should be able to do the cleanup first and have the WT stuff
on top just fine.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling
  2015-05-29 15:13       ` Borislav Petkov
@ 2015-05-29 15:17         ` Toshi Kani
  0 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-29 15:17 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: hpa, tglx, mingo, akpm, arnd, linux-mm, linux-kernel, x86,
	linux-nvdimm, jgross, stefan.bader, luto, hmh, yigal,
	konrad.wilk, Elliott, mcgrof, hch

On Fri, 2015-05-29 at 17:13 +0200, Borislav Petkov wrote:
> On Fri, May 29, 2015 at 08:27:08AM -0600, Toshi Kani wrote:
> > This simply preserves the original error check in the code.  This error
> > check makes sure that all CPUs have the PAT feature supported when PAT
> > is enabled.  This error can only happen when heterogeneous CPUs are
> > installed/emulated on the system/guest.  This check may be paranoid, but
> > this cleanup is not meant to modify such an error check.
> 
> No, this is a ridiculous attempt to justify crazy code. Please do it
> right. If the cleanup makes the code more insane than it is, then don't
> do it in the first place.

Well, the change is based on this review comment.  So, I am not sure
what would be the right thing to do.  I am not 100% certain that this
check can be removed, either.
https://lkml.org/lkml/2015/5/22/148

> > Can you consider the patch 10/12-11/12 as a separate patchset from the
> > WT series?  If that is OK, I will resubmit 10/12 (BUG->panic) and 11/12
> > (commit log update).
> 
> That's not enough. 11/12 is a convoluted mess which needs splitting and
> more detailed explanations in the commit messages.
> 
> So no. Read what I said: do the cleanup *first*, *then* add the new
> functionality.
> 
> The WT patches shouldn't change all too much from what you have now.
> Also, 11/12 changes stuff which you add in 1/12. This churn is useless
> and shouldn't be there at all.
> 
> So you should be able to do the cleanup first and have the WT stuff
> on top just fine.

OK, I will do the cleanup first and resubmit the patchset based on
tip/master.

Thanks,
-Toshi


--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 15:03       ` Toshi Kani
@ 2015-05-29 18:19         ` Dan Williams
  2015-05-29 18:32           ` Toshi Kani
  2015-05-29 18:34           ` Andy Lutomirski
  0 siblings, 2 replies; 33+ messages in thread
From: Dan Williams @ 2015-05-29 18:19 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Borislav Petkov, Ross Zwisler, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Andrew Morton, Arnd Bergmann, linux-mm,
	linux-kernel, X86 ML, linux-nvdimm, jgross, Stefan Bader,
	Andy Lutomirski, hmh, yigal, Konrad Rzeszutek Wilk, Elliott,
	Robert (Server Storage),
	mcgrof, Christoph Hellwig, Matthew Wilcox

On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
>> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
>> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
>> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
>> >> write back the contents of the CPU caches in case of a crash.
>> >>
>> >> This patch changes to use ioremap_wt(), which provides uncached
>> >> writes but cached reads, for improving read performance.
>> >>
>> >> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
>> >> ---
>> >>  drivers/block/pmem.c |    4 ++--
>> >>  1 file changed, 2 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
>> >> index eabf4a8..095dfaa 100644
>> >> --- a/drivers/block/pmem.c
>> >> +++ b/drivers/block/pmem.c
>> >> @@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
>> >>       }
>> >>
>> >>       /*
>> >> -      * Map the memory as non-cachable, as we can't write back the contents
>> >> +      * Map the memory as write-through, as we can't write back the contents
>> >>        * of the CPU caches in case of a crash.
>> >>        */
>> >>       err = -ENOMEM;
>> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
>> >>       if (!pmem->virt_addr)
>> >>               goto out_release_region;
>> >
>> > Dan, Ross, what about this one?
>> >
>> > ACK to pick it up as a temporary solution?
>>
>> I see that is_new_memtype_allowed() is updated to disallow some
>> combinations, but the manual seems to imply any mixing of memory types
>> is unsupported.  Which worries me even in the current code where we
>> have uncached mappings in the driver, and potentially cached DAX
>> mappings handed out to userspace.
>
> is_new_memtype_allowed() is not to allow some combinations of mixing of
> memory types.  When it is allowed, the requested type of ioremap_xxx()
> is changed to match with the existing map type, so that mixing of memory
> types does not happen.

Yes, but now if the caller was expecting one memory type and gets
another one that is something I think the driver would want to know.
At a minimum I don't think we want to get emails about pmem driver
performance problems when someone's platform is silently degrading WB
to UC for example.

> DAX uses vm_insert_mixed(), which does not even check the existing map
> type to the physical address.

Right, I think that's a problem...

>> A general quibble separate from this patch is that we don't have a way
>> of knowing if ioremap() will reject or change our requested memory
>> type.  Shouldn't the driver be explicitly requesting a known valid
>> type in advance?
>
> I agree we need a solution here.
>
>> Lastly we now have the PMEM API patches from Ross out for review where
>> he is assuming cached mappings with non-temporal writes:
>> https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
>> This gives us WC semantics on writes which I believe has the nice
>> property of reducing the number of write transactions to memory.
>> Also, the numbers in the paper seem to be assuming DAX operation, but
>> this ioremap_wt() is in the driver and typically behind a file system.
>> Are the numbers relevant to that usage mode?
>
> I have not looked into the Ross's changes yet, but they do not seem to
> replace the use of ioremap_nocache().  If his changes can use WB type
> reliably, yes, we do not need a temporary solution of using ioremap_wt()
> in this driver.

Hmm, yes you're right, it seems those patches did not change the
implementation to use ioremap_cache()... which happens to not be
implemented on all architectures.  I'll take a look.

--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 18:19         ` Dan Williams
@ 2015-05-29 18:32           ` Toshi Kani
  2015-05-29 19:34             ` Dan Williams
  2015-05-29 18:34           ` Andy Lutomirski
  1 sibling, 1 reply; 33+ messages in thread
From: Toshi Kani @ 2015-05-29 18:32 UTC (permalink / raw)
  To: Dan Williams
  Cc: Borislav Petkov, Ross Zwisler, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Andrew Morton, Arnd Bergmann, linux-mm,
	linux-kernel, X86 ML, linux-nvdimm, jgross, Stefan Bader,
	Andy Lutomirski, hmh, yigal, Konrad Rzeszutek Wilk, Elliott,
	Robert (Server Storage),
	mcgrof, Christoph Hellwig, Matthew Wilcox

On Fri, 2015-05-29 at 11:19 -0700, Dan Williams wrote:
> On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> > On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
> >> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
> >> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
> >> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
 :
> >> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
> >> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
> >> >>       if (!pmem->virt_addr)
> >> >>               goto out_release_region;
> >> >
> >> > Dan, Ross, what about this one?
> >> >
> >> > ACK to pick it up as a temporary solution?
> >>
> >> I see that is_new_memtype_allowed() is updated to disallow some
> >> combinations, but the manual seems to imply any mixing of memory types
> >> is unsupported.  Which worries me even in the current code where we
> >> have uncached mappings in the driver, and potentially cached DAX
> >> mappings handed out to userspace.
> >
> > is_new_memtype_allowed() is not to allow some combinations of mixing of
> > memory types.  When it is allowed, the requested type of ioremap_xxx()
> > is changed to match with the existing map type, so that mixing of memory
> > types does not happen.
> 
> Yes, but now if the caller was expecting one memory type and gets
> another one that is something I think the driver would want to know.
> At a minimum I don't think we want to get emails about pmem driver
> performance problems when someone's platform is silently degrading WB
> to UC for example.

The pmem driver creates an ioremap map to an NVDIMM range first.  So,
there will be no conflict at this point, unless there is a conflicting
driver claiming the same NVDIMM range.

DAX then uses the pmem driver (or other byte-addressable driver) to
mount a file system and creates a separate user-space mapping for
mmap().  So, a (silent) map-type conflict will happen at this point,
which may not be protected by the ioremap itself.

> > DAX uses vm_insert_mixed(), which does not even check the existing map
> > type to the physical address.
> 
> Right, I think that's a problem...
> 
> >> A general quibble separate from this patch is that we don't have a way
> >> of knowing if ioremap() will reject or change our requested memory
> >> type.  Shouldn't the driver be explicitly requesting a known valid
> >> type in advance?
> >
> > I agree we need a solution here.
> >
> >> Lastly we now have the PMEM API patches from Ross out for review where
> >> he is assuming cached mappings with non-temporal writes:
> >> https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
> >> This gives us WC semantics on writes which I believe has the nice
> >> property of reducing the number of write transactions to memory.
> >> Also, the numbers in the paper seem to be assuming DAX operation, but
> >> this ioremap_wt() is in the driver and typically behind a file system.
> >> Are the numbers relevant to that usage mode?
> >
> > I have not looked into the Ross's changes yet, but they do not seem to
> > replace the use of ioremap_nocache().  If his changes can use WB type
> > reliably, yes, we do not need a temporary solution of using ioremap_wt()
> > in this driver.
> 
> Hmm, yes you're right, it seems those patches did not change the
> implementation to use ioremap_cache()... which happens to not be
> implemented on all architectures.  I'll take a look.

Thanks,
-Toshi

--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 18:19         ` Dan Williams
  2015-05-29 18:32           ` Toshi Kani
@ 2015-05-29 18:34           ` Andy Lutomirski
  2015-05-29 19:32             ` Dan Williams
  2015-05-29 21:29             ` Elliott, Robert (Server Storage)
  1 sibling, 2 replies; 33+ messages in thread
From: Andy Lutomirski @ 2015-05-29 18:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: Toshi Kani, Borislav Petkov, Ross Zwisler, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	linux-mm, linux-kernel, X86 ML, linux-nvdimm, Juergen Gross,
	Stefan Bader, Henrique de Moraes Holschuh, Yigal Korman,
	Konrad Rzeszutek Wilk, Elliott, Robert (Server Storage),
	Luis Rodriguez, Christoph Hellwig, Matthew Wilcox

On Fri, May 29, 2015 at 11:19 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
>> On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
>>> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
>>> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
>>> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
>>> >> write back the contents of the CPU caches in case of a crash.
>>> >>
>>> >> This patch changes to use ioremap_wt(), which provides uncached
>>> >> writes but cached reads, for improving read performance.
>>> >>
>>> >> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
>>> >> ---
>>> >>  drivers/block/pmem.c |    4 ++--
>>> >>  1 file changed, 2 insertions(+), 2 deletions(-)
>>> >>
>>> >> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
>>> >> index eabf4a8..095dfaa 100644
>>> >> --- a/drivers/block/pmem.c
>>> >> +++ b/drivers/block/pmem.c
>>> >> @@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
>>> >>       }
>>> >>
>>> >>       /*
>>> >> -      * Map the memory as non-cachable, as we can't write back the contents
>>> >> +      * Map the memory as write-through, as we can't write back the contents
>>> >>        * of the CPU caches in case of a crash.
>>> >>        */
>>> >>       err = -ENOMEM;
>>> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>>> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
>>> >>       if (!pmem->virt_addr)
>>> >>               goto out_release_region;
>>> >
>>> > Dan, Ross, what about this one?
>>> >
>>> > ACK to pick it up as a temporary solution?
>>>
>>> I see that is_new_memtype_allowed() is updated to disallow some
>>> combinations, but the manual seems to imply any mixing of memory types
>>> is unsupported.  Which worries me even in the current code where we
>>> have uncached mappings in the driver, and potentially cached DAX
>>> mappings handed out to userspace.
>>
>> is_new_memtype_allowed() is not to allow some combinations of mixing of
>> memory types.  When it is allowed, the requested type of ioremap_xxx()
>> is changed to match with the existing map type, so that mixing of memory
>> types does not happen.
>
> Yes, but now if the caller was expecting one memory type and gets
> another one that is something I think the driver would want to know.
> At a minimum I don't think we want to get emails about pmem driver
> performance problems when someone's platform is silently degrading WB
> to UC for example.
>
>> DAX uses vm_insert_mixed(), which does not even check the existing map
>> type to the physical address.
>
> Right, I think that's a problem...
>
>>> A general quibble separate from this patch is that we don't have a way
>>> of knowing if ioremap() will reject or change our requested memory
>>> type.  Shouldn't the driver be explicitly requesting a known valid
>>> type in advance?
>>
>> I agree we need a solution here.
>>
>>> Lastly we now have the PMEM API patches from Ross out for review where
>>> he is assuming cached mappings with non-temporal writes:
>>> https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
>>> This gives us WC semantics on writes which I believe has the nice
>>> property of reducing the number of write transactions to memory.
>>> Also, the numbers in the paper seem to be assuming DAX operation, but
>>> this ioremap_wt() is in the driver and typically behind a file system.
>>> Are the numbers relevant to that usage mode?
>>
>> I have not looked into the Ross's changes yet, but they do not seem to
>> replace the use of ioremap_nocache().  If his changes can use WB type
>> reliably, yes, we do not need a temporary solution of using ioremap_wt()
>> in this driver.
>
> Hmm, yes you're right, it seems those patches did not change the
> implementation to use ioremap_cache()... which happens to not be
> implemented on all architectures.  I'll take a look.

Whoa, there!  Why would we use non-temporal stores to WB memory to
access persistent memory?  I can see two reasons not to:

1. As far as I understand it, non-temporal stores to WT should have
almost identical performance.

2. Is there any actual architectural guarantee that it's safe to have
a WB mapping that we use like that?  By my reading of the manual,
MOVNTDQA (as a write to pmem); SFENCE; PCOMMIT; SFENCE on uncached
memory should be guaranteed to do a durable write.  On the other hand,
it's considerably less clear to me that the same sequence to WB memory
is safe -- aren't we supposed to stick a CLWB or CLFLUSHOPT in there,
too, on WB memory?  In other words, is there any case in which
MOVNTDQA or similar acting on a WB mapping could result in a dirty
cache line?

--Andy
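
[For reference, a hedged user-space style sketch of the store sequence being
debated.  The PCOMMIT step is shown only as a comment since it has no standard
intrinsic here, and whether this sequence is sufficient on a WB mapping is
exactly the open question above.]

	#include <emmintrin.h>	/* _mm_stream_si128 (MOVNTDQ), _mm_set1_epi32 */
	#include <xmmintrin.h>	/* _mm_sfence */

	/* Assumes 'dst' points into a 16-byte aligned pmem mapping. */
	static void nt_store_example(void *dst)
	{
		__m128i v = _mm_set1_epi32(1);

		_mm_stream_si128((__m128i *)dst, v);	/* non-temporal store */
		_mm_sfence();				/* order the NT store */
		/* PCOMMIT; SFENCE would follow to make the store durable */
	}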

--

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 18:34           ` Andy Lutomirski
@ 2015-05-29 19:32             ` Dan Williams
  2015-05-29 21:29             ` Elliott, Robert (Server Storage)
  1 sibling, 0 replies; 33+ messages in thread
From: Dan Williams @ 2015-05-29 19:32 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Toshi Kani, Borislav Petkov, Ross Zwisler, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	linux-mm, linux-kernel, X86 ML, linux-nvdimm, Juergen Gross,
	Stefan Bader, Henrique de Moraes Holschuh, Yigal Korman,
	Konrad Rzeszutek Wilk, Elliott, Robert (Server Storage),
	Luis Rodriguez, Christoph Hellwig, Matthew Wilcox

On Fri, May 29, 2015 at 11:34 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, May 29, 2015 at 11:19 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
>>> On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
>>>> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
>>>> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
>>>> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
>>>> >> write back the contents of the CPU caches in case of a crash.
>>>> >>
>>>> >> This patch changes to use ioremap_wt(), which provides uncached
>>>> >> writes but cached reads, for improving read performance.
>>>> >>
>>>> >> Signed-off-by: Toshi Kani <toshi.kani@hp.com>
>>>> >> ---
>>>> >>  drivers/block/pmem.c |    4 ++--
>>>> >>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>> >>
>>>> >> diff --git a/drivers/block/pmem.c b/drivers/block/pmem.c
>>>> >> index eabf4a8..095dfaa 100644
>>>> >> --- a/drivers/block/pmem.c
>>>> >> +++ b/drivers/block/pmem.c
>>>> >> @@ -139,11 +139,11 @@ static struct pmem_device *pmem_alloc(struct device *dev, struct resource *res)
>>>> >>       }
>>>> >>
>>>> >>       /*
>>>> >> -      * Map the memory as non-cachable, as we can't write back the contents
>>>> >> +      * Map the memory as write-through, as we can't write back the contents
>>>> >>        * of the CPU caches in case of a crash.
>>>> >>        */
>>>> >>       err = -ENOMEM;
>>>> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>>>> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
>>>> >>       if (!pmem->virt_addr)
>>>> >>               goto out_release_region;
>>>> >
>>>> > Dan, Ross, what about this one?
>>>> >
>>>> > ACK to pick it up as a temporary solution?
>>>>
>>>> I see that is_new_memtype_allowed() is updated to disallow some
>>>> combinations, but the manual seems to imply any mixing of memory types
>>>> is unsupported.  Which worries me even in the current code where we
>>>> have uncached mappings in the driver, and potentially cached DAX
>>>> mappings handed out to userspace.
>>>
>>> is_new_memtype_allowed() is not to allow some combinations of mixing of
>>> memory types.  When it is allowed, the requested type of ioremap_xxx()
>>> is changed to match with the existing map type, so that mixing of memory
>>> types does not happen.
>>
>> Yes, but now if the caller was expecting one memory type and gets
>> another one that is something I think the driver would want to know.
>> At a minimum I don't think we want to get emails about pmem driver
>> performance problems when someone's platform is silently degrading WB
>> to UC for example.
>>
>>> DAX uses vm_insert_mixed(), which does not even check the existing map
>>> type to the physical address.
>>
>> Right, I think that's a problem...
>>
>>>> A general quibble separate from this patch is that we don't have a way
>>>> of knowing if ioremap() will reject or change our requested memory
>>>> type.  Shouldn't the driver be explicitly requesting a known valid
>>>> type in advance?
>>>
>>> I agree we need a solution here.
>>>
>>>> Lastly we now have the PMEM API patches from Ross out for review where
>>>> he is assuming cached mappings with non-temporal writes:
>>>> https://lists.01.org/pipermail/linux-nvdimm/2015-May/000929.html.
>>>> This gives us WC semantics on writes which I believe has the nice
>>>> property of reducing the number of write transactions to memory.
>>>> Also, the numbers in the paper seem to be assuming DAX operation, but
>>>> this ioremap_wt() is in the driver and typically behind a file system.
>>>> Are the numbers relevant to that usage mode?
>>>
>>> I have not looked into the Ross's changes yet, but they do not seem to
>>> replace the use of ioremap_nocache().  If his changes can use WB type
>>> reliably, yes, we do not need a temporary solution of using ioremap_wt()
>>> in this driver.
>>
>> Hmm, yes you're right, it seems those patches did not change the
>> implementation to use ioremap_cache()... which happens to not be
>> implemented on all architectures.  I'll take a look.
>
> Whoa, there!  Why would we use non-temporal stores to WB memory to
> access persistent memory?  I can see two reasons not to:
>
> 1. As far as I understand it, non-temporal stores to WT should have
> almost identical performance.
>
> 2. Is there any actual architectural guarantee that it's safe to have
> a WB mapping that we use like that?  By my reading of the manual,
> MOVNTDQA (as a write to pmem); SFENCE; PCOMMIT; SFENCE on uncached
> memory should be guaranteed to do a durable write.  On the other hand,
> it's considerably less clear to me that the same sequence to WB memory
> is safe -- aren't we supposed to stick a CLWB or CLFLUSHOPT in there,
> too, on WB memory?  In other words, is there any case in which
> MOVNTDQA or similar acting on a WB mapping could result in a dirty
> cache line?

Depends, see the note in 10.4.6.2, "Some older CPU implementations
(e.g., Pentium M) allowed addresses being written with a non-temporal
store instruction to be updated in-place if the memory type was not WC
and line was already in the cache."  The expectation is that
boot_cpu_has(X86_FEATURE_PCOMMIT) is false on such a CPU, so we'll
fall back to not using non-temporal stores.
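
A minimal sketch of that fallback, assuming a hypothetical
memcpy_to_pmem_nt() helper (this is not code from Ross's series):

#include <linux/string.h>
#include <asm/cpufeature.h>

/* hypothetical non-temporal copy helper -- the name is illustrative */
void memcpy_to_pmem_nt(void *dst, const void *src, size_t n);

static void pmem_copy(void *dst, const void *src, size_t n)
{
	if (boot_cpu_has(X86_FEATURE_PCOMMIT))
		memcpy_to_pmem_nt(dst, src, n);	/* non-temporal path */
	else
		memcpy(dst, src, n);	/* older CPUs: ordinary cached stores */
}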


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 18:32           ` Toshi Kani
@ 2015-05-29 19:34             ` Dan Williams
  2015-05-29 20:10               ` Toshi Kani
  0 siblings, 1 reply; 33+ messages in thread
From: Dan Williams @ 2015-05-29 19:34 UTC (permalink / raw)
  To: Toshi Kani
  Cc: Borislav Petkov, Ross Zwisler, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Andrew Morton, Arnd Bergmann, linux-mm,
	linux-kernel, X86 ML, linux-nvdimm, Juergen Gross, Stefan Bader,
	Andy Lutomirski, Henrique de Moraes Holschuh, Yigal Korman,
	Konrad Rzeszutek Wilk, Elliott, Robert (Server Storage),
	Luis Rodriguez, Christoph Hellwig, Matthew Wilcox

On Fri, May 29, 2015 at 11:32 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> On Fri, 2015-05-29 at 11:19 -0700, Dan Williams wrote:
>> On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
>> > On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
>> >> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
>> >> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
>> >> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
>  :
>> >> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
>> >> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
>> >> >>       if (!pmem->virt_addr)
>> >> >>               goto out_release_region;
>> >> >
>> >> > Dan, Ross, what about this one?
>> >> >
>> >> > ACK to pick it up as a temporary solution?
>> >>
>> >> I see that is_new_memtype_allowed() is updated to disallow some
>> >> combinations, but the manual seems to imply any mixing of memory types
>> >> is unsupported.  Which worries me even in the current code where we
>> >> have uncached mappings in the driver, and potentially cached DAX
>> >> mappings handed out to userspace.
>> >
>> > is_new_memtype_allowed() is not to allow some combinations of mixing of
>> > memory types.  When it is allowed, the requested type of ioremap_xxx()
>> > is changed to match with the existing map type, so that mixing of memory
>> > types does not happen.
>>
>> Yes, but now if the caller was expecting one memory type and gets
>> another one that is something I think the driver would want to know.
>> At a minimum I don't think we want to get emails about pmem driver
>> performance problems when someone's platform is silently degrading WB
>> to UC for example.
>
> The pmem driver creates an ioremap map to an NVDIMM range first.  So,
> there will be no conflict at this point, unless there is a conflicting
> driver claiming the same NVDIMM range.

Hmm, I thought it would be WB due to this comment in is_new_memtype_allowed()

        /*
         * PAT type is always WB for untracked ranges, so no need to check.
         */


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 19:34             ` Dan Williams
@ 2015-05-29 20:10               ` Toshi Kani
  0 siblings, 0 replies; 33+ messages in thread
From: Toshi Kani @ 2015-05-29 20:10 UTC (permalink / raw)
  To: Dan Williams
  Cc: Borislav Petkov, Ross Zwisler, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, Andrew Morton, Arnd Bergmann, linux-mm,
	linux-kernel, X86 ML, linux-nvdimm, Juergen Gross, Stefan Bader,
	Andy Lutomirski, Henrique de Moraes Holschuh, Yigal Korman,
	Konrad Rzeszutek Wilk, Elliott, Robert (Server Storage),
	Luis Rodriguez, Christoph Hellwig, Matthew Wilcox

On Fri, 2015-05-29 at 12:34 -0700, Dan Williams wrote:
> On Fri, May 29, 2015 at 11:32 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> > On Fri, 2015-05-29 at 11:19 -0700, Dan Williams wrote:
> >> On Fri, May 29, 2015 at 8:03 AM, Toshi Kani <toshi.kani@hp.com> wrote:
> >> > On Fri, 2015-05-29 at 07:43 -0700, Dan Williams wrote:
> >> >> On Fri, May 29, 2015 at 2:11 AM, Borislav Petkov <bp@alien8.de> wrote:
> >> >> > On Wed, May 27, 2015 at 09:19:04AM -0600, Toshi Kani wrote:
> >> >> >> The pmem driver maps NVDIMM with ioremap_nocache() as we cannot
> >  :
> >> >> >> -     pmem->virt_addr = ioremap_nocache(pmem->phys_addr, pmem->size);
> >> >> >> +     pmem->virt_addr = ioremap_wt(pmem->phys_addr, pmem->size);
> >> >> >>       if (!pmem->virt_addr)
> >> >> >>               goto out_release_region;
> >> >> >
> >> >> > Dan, Ross, what about this one?
> >> >> >
> >> >> > ACK to pick it up as a temporary solution?
> >> >>
> >> >> I see that is_new_memtype_allowed() is updated to disallow some
> >> >> combinations, but the manual seems to imply any mixing of memory types
> >> >> is unsupported.  Which worries me even in the current code where we
> >> >> have uncached mappings in the driver, and potentially cached DAX
> >> >> mappings handed out to userspace.
> >> >
> >> > is_new_memtype_allowed() is not to allow some combinations of mixing of
> >> > memory types.  When it is allowed, the requested type of ioremap_xxx()
> >> > is changed to match with the existing map type, so that mixing of memory
> >> > types does not happen.
> >>
> >> Yes, but now if the caller was expecting one memory type and gets
> >> another one that is something I think the driver would want to know.
> >> At a minimum I don't think we want to get emails about pmem driver
> >> performance problems when someone's platform is silently degrading WB
> >> to UC for example.
> >
> > The pmem driver creates an ioremap map to an NVDIMM range first.  So,
> > there will be no conflict at this point, unless there is a conflicting
> > driver claiming the same NVDIMM range.
> 
> Hmm, I thought it would be WB due to this comment in is_new_memtype_allowed()
> 
>         /*
>          * PAT type is always WB for untracked ranges, so no need to check.
>          */

This comment applies to the ISA range, for which ioremap() does not create
any mapping, i.e. it is untracked.  You can ignore this comment for NVDIMM.
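
For reference, the check in question sits at the top of
is_new_memtype_allowed() in arch/x86/include/asm/pgtable.h; the sketch
below is approximate and elides the per-combination checks that follow:

/* rough sketch, not a verbatim copy of the tree */
static inline int is_new_memtype_allowed(u64 paddr, unsigned long size,
					 enum page_cache_mode pcm,
					 enum page_cache_mode new_pcm)
{
	/*
	 * PAT type is always WB for untracked ranges (e.g. the ISA range),
	 * so no need to check.
	 */
	if (x86_platform.is_untracked_pat_range(paddr, paddr + size))
		return 1;

	/* ... per-combination checks for tracked ranges elided ... */
	return 0;
}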

Thanks,
-Toshi


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 18:34           ` Andy Lutomirski
  2015-05-29 19:32             ` Dan Williams
@ 2015-05-29 21:29             ` Elliott, Robert (Server Storage)
  2015-05-29 21:46               ` Andy Lutomirski
  1 sibling, 1 reply; 33+ messages in thread
From: Elliott, Robert (Server Storage) @ 2015-05-29 21:29 UTC (permalink / raw)
  To: Andy Lutomirski, Dan Williams
  Cc: Kani, Toshimitsu, Borislav Petkov, Ross Zwisler, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	linux-mm, linux-kernel, X86 ML, linux-nvdimm, Juergen Gross,
	Stefan Bader, Henrique de Moraes Holschuh, Yigal Korman,
	Konrad Rzeszutek Wilk, Luis Rodriguez, Christoph Hellwig,
	Matthew Wilcox

> -----Original Message-----
> From: Andy Lutomirski [mailto:luto@amacapital.net]
> Sent: Friday, May 29, 2015 1:35 PM
...
> Whoa, there!  Why would we use non-temporal stores to WB memory to
> access persistent memory?  I can see two reasons not to:

Data written to a block storage device (here, the NVDIMM) is unlikely
to be read or written again any time soon.  It's not like the code
and data that a program has in memory, where there might be a loop
accessing the location every CPU clock; it's storage I/O to
historically very slow (relative to the CPU clock speed) devices.  
The source buffer for that data might be frequently accessed, 
but not the NVDIMM storage itself.  

Non-temporal stores avoid wasting cache space on these "one-time" 
accesses.  The same applies for reads and non-temporal loads.
Keep the CPU data cache lines free for the application.

DAX and mmap() do change that; the application is now free to
store frequently accessed data structures directly in persistent 
memory.  But, that's not available if btt is used, and 
application loads and stores won't go through the memcpy()
calls inside pmem anyway.  The non-temporal instructions are
cache coherent, so data integrity won't get confused by them
if I/O going through pmem's block storage APIs happens
to overlap with the application's mmap() regions.
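
As a rough illustration of the kind of streaming copy described above
(the helper name and the 8-byte alignment assumption are illustrative,
not the pmem driver's code):

#include <linux/types.h>

/*
 * Stream data into the NVDIMM with non-temporal 64-bit stores so the
 * block I/O does not evict the application's working set from the
 * cache.  Assumes 8-byte-aligned buffers and a length in 8-byte words.
 */
static void pmem_copy_nontemporal(u64 *dst, const u64 *src, size_t words)
{
	size_t i;

	for (i = 0; i < words; i++)
		asm volatile("movnti %1, %0"
			     : "=m" (dst[i]) : "r" (src[i]));

	/* make the non-temporal stores globally visible */
	asm volatile("sfence" ::: "memory");
}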

---
Robert Elliott, HP Server Storage



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 21:29             ` Elliott, Robert (Server Storage)
@ 2015-05-29 21:46               ` Andy Lutomirski
  2015-05-29 22:24                 ` Elliott, Robert (Server Storage)
                                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Andy Lutomirski @ 2015-05-29 21:46 UTC (permalink / raw)
  To: Elliott, Robert (Server Storage)
  Cc: Dan Williams, Kani, Toshimitsu, Borislav Petkov, Ross Zwisler,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, linux-mm, linux-kernel, X86 ML, linux-nvdimm,
	Juergen Gross, Stefan Bader, Henrique de Moraes Holschuh,
	Yigal Korman, Konrad Rzeszutek Wilk, Luis Rodriguez,
	Christoph Hellwig, Matthew Wilcox

On Fri, May 29, 2015 at 2:29 PM, Elliott, Robert (Server Storage)
<Elliott@hp.com> wrote:
>> -----Original Message-----
>> From: Andy Lutomirski [mailto:luto@amacapital.net]
>> Sent: Friday, May 29, 2015 1:35 PM
> ...
>> Whoa, there!  Why would we use non-temporal stores to WB memory to
>> access persistent memory?  I can see two reasons not to:
>
> Data written to a block storage device (here, the NVDIMM) is unlikely
> to be read or written again any time soon.  It's not like the code
> and data that a program has in memory, where there might be a loop
> accessing the location every CPU clock; it's storage I/O to
> historically very slow (relative to the CPU clock speed) devices.
> The source buffer for that data might be frequently accessed,
> but not the NVDIMM storage itself.
>
> Non-temporal stores avoid wasting cache space on these "one-time"
> accesses.  The same applies for reads and non-temporal loads.
> Keep the CPU data cache lines free for the application.
>
> DAX and mmap() do change that; the application is now free to
> store frequently accessed data structures directly in persistent
> memory.  But, that's not available if btt is used, and
> application loads and stores won't go through the memcpy()
> calls inside pmem anyway.  The non-temporal instructions are
> cache coherent, so data integrity won't get confused by them
> if I/O going through pmem's block storage APIs happens
> to overlap with the application's mmap() regions.
>

You answered the wrong question. :)  I understand the point of the
non-temporal stores -- I don't understand the point of using
non-temporal stores to *WB memory*.  I think we should be okay with
having the kernel mapping use WT instead.

--Andy


^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 21:46               ` Andy Lutomirski
@ 2015-05-29 22:24                 ` Elliott, Robert (Server Storage)
  2015-05-29 22:32                 ` H. Peter Anvin
  2015-06-01  8:58                 ` Ingo Molnar
  2 siblings, 0 replies; 33+ messages in thread
From: Elliott, Robert (Server Storage) @ 2015-05-29 22:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Dan Williams, Kani, Toshimitsu, Borislav Petkov, Ross Zwisler,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, linux-mm, linux-kernel, X86 ML, linux-nvdimm,
	Juergen Gross, Stefan Bader, Henrique de Moraes Holschuh,
	Yigal Korman, Konrad Rzeszutek Wilk, Luis Rodriguez,
	Christoph Hellwig, Matthew Wilcox



---
Robert Elliott, HP Server Storage

> -----Original Message-----
> From: Andy Lutomirski [mailto:luto@amacapital.net]
> Sent: Friday, May 29, 2015 4:46 PM
> To: Elliott, Robert (Server Storage)
> Cc: Dan Williams; Kani, Toshimitsu; Borislav Petkov; Ross Zwisler;
> H. Peter Anvin; Thomas Gleixner; Ingo Molnar; Andrew Morton; Arnd
> Bergmann; linux-mm@kvack.org; linux-kernel@vger.kernel.org; X86 ML;
> linux-nvdimm@lists.01.org; Juergen Gross; Stefan Bader; Henrique de
> Moraes Holschuh; Yigal Korman; Konrad Rzeszutek Wilk; Luis
> Rodriguez; Christoph Hellwig; Matthew Wilcox
> Subject: Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with
> ioremap_wt()
> 
> On Fri, May 29, 2015 at 2:29 PM, Elliott, Robert (Server Storage)
> <Elliott@hp.com> wrote:
> >> -----Original Message-----
> >> From: Andy Lutomirski [mailto:luto@amacapital.net]
> >> Sent: Friday, May 29, 2015 1:35 PM
> > ...
> >> Whoa, there!  Why would we use non-temporal stores to WB memory
> to
> >> access persistent memory?  I can see two reasons not to:
> >
> > Data written to a block storage device (here, the NVDIMM) is
> unlikely
> > to be read or written again any time soon.  It's not like the code
> > and data that a program has in memory, where there might be a loop
> > accessing the location every CPU clock; it's storage I/O to
> > historically very slow (relative to the CPU clock speed) devices.
> > The source buffer for that data might be frequently accessed,
> > but not the NVDIMM storage itself.
> >
> > Non-temporal stores avoid wasting cache space on these "one-time"
> > accesses.  The same applies for reads and non-temporal loads.
> > Keep the CPU data cache lines free for the application.
> >
> > DAX and mmap() do change that; the application is now free to
> > store frequently accessed data structures directly in persistent
> > memory.  But, that's not available if btt is used, and
> > application loads and stores won't go through the memcpy()
> > calls inside pmem anyway.  The non-temporal instructions are
> > cache coherent, so data integrity won't get confused by them
> > if I/O going through pmem's block storage APIs happens
> > to overlap with the application's mmap() regions.
> >
> 
> You answered the wrong question. :)  I understand the point of the
> non-temporal stores -- I don't understand the point of using
> non-temporal stores to *WB memory*.  I think we should be okay with
> having the kernel mapping use WT instead.

The cache type that the application chooses for its mmap()
view has to be compatible with that already selected by the 
kernel, or we run into:

Intel SDM 11.12.4 Programming the PAT
...
"The PAT allows any memory type to be specified in the page tables,
and therefore it is possible to have a single physical page mapped
to two or more different linear addresses, each with different
memory types. Intel does not support this practice because it may
lead to undefined operations that can result in a system failure. 
In particular, a WC page must never be aliased to a cacheable page
because WC writes may not check the processor caches."

Right now, application memory is always WB, so WB is the
only safe choice from this perspective (the system must have
ADR for safety from other perspectives). That might not be 
the best choice for all applications, though; some applications
might not want CPU caching all the data they run through here 
and prefer WC.  On a non-ADR system, WT might be the only 
safe choice.

Should there be a way for the application to specify a cache
type in its mmap() call? The type already selected by the
kernel driver could (carefully) be changed on the fly if 
it's different.
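
A rough sketch of what that could look like in a driver mmap() path,
assuming a hypothetical per-mapping type selector and using the
pgprot_writethrough() helper added by patch 7/12:

#include <linux/mm.h>

/* hypothetical per-mapping cache-type selector, not an existing API */
enum pmem_map_type { PMEM_MAP_WB, PMEM_MAP_WC, PMEM_MAP_WT };

static int pmem_mmap_with_type(struct vm_area_struct *vma, unsigned long pfn,
			       enum pmem_map_type type)
{
	if (type == PMEM_MAP_WC)
		vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
	else if (type == PMEM_MAP_WT)
		vma->vm_page_prot = pgprot_writethrough(vma->vm_page_prot);
	/* PMEM_MAP_WB: leave the default write-back vm_page_prot alone */

	return remap_pfn_range(vma, vma->vm_start, pfn,
			       vma->vm_end - vma->vm_start,
			       vma->vm_page_prot);
}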

Non-temporal store performance is excellent under WB, WC, and WT;
if anything, I think WC edges ahead because it need not snoop
the cache. It's still poor under UC.




^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 21:46               ` Andy Lutomirski
  2015-05-29 22:24                 ` Elliott, Robert (Server Storage)
@ 2015-05-29 22:32                 ` H. Peter Anvin
  2015-06-01  8:58                 ` Ingo Molnar
  2 siblings, 0 replies; 33+ messages in thread
From: H. Peter Anvin @ 2015-05-29 22:32 UTC (permalink / raw)
  To: Andy Lutomirski, Elliott, Robert (Server Storage)
  Cc: Dan Williams, Kani, Toshimitsu, Borislav Petkov, Ross Zwisler,
	Thomas Gleixner, Ingo Molnar, Andrew Morton, Arnd Bergmann,
	linux-mm, linux-kernel, X86 ML, linux-nvdimm@lists.01.org,
	Juergen Gross, Stefan Bader, Henrique de Moraes Holschuh,
	Yigal Korman, Konrad Rzeszutek Wilk, Luis Rodriguez,
	Christoph Hellwig, Matthew Wilcox

Non-temporal stores to WB memory are fine in the sense that they don't pollute the cache.  This can be done by demoting to WC or by forcing cache allocation out of only a subset of the cache.

On May 29, 2015 2:46:19 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Fri, May 29, 2015 at 2:29 PM, Elliott, Robert (Server Storage)
><Elliott@hp.com> wrote:
>>> -----Original Message-----
>>> From: Andy Lutomirski [mailto:luto@amacapital.net]
>>> Sent: Friday, May 29, 2015 1:35 PM
>> ...
>>> Whoa, there!  Why would we use non-temporal stores to WB memory to
>>> access persistent memory?  I can see two reasons not to:
>>
>> Data written to a block storage device (here, the NVDIMM) is unlikely
>> to be read or written again any time soon.  It's not like the code
>> and data that a program has in memory, where there might be a loop
>> accessing the location every CPU clock; it's storage I/O to
>> historically very slow (relative to the CPU clock speed) devices.
>> The source buffer for that data might be frequently accessed,
>> but not the NVDIMM storage itself.
>>
>> Non-temporal stores avoid wasting cache space on these "one-time"
>> accesses.  The same applies for reads and non-temporal loads.
>> Keep the CPU data cache lines free for the application.
>>
>> DAX and mmap() do change that; the application is now free to
>> store frequently accessed data structures directly in persistent
>> memory.  But, that's not available if btt is used, and
>> application loads and stores won't go through the memcpy()
>> calls inside pmem anyway.  The non-temporal instructions are
>> cache coherent, so data integrity won't get confused by them
>> if I/O going through pmem's block storage APIs happens
>> to overlap with the application's mmap() regions.
>>
>
>You answered the wrong question. :)  I understand the point of the
>non-temporal stores -- I don't understand the point of using
>non-temporal stores to *WB memory*.  I think we should be okay with
>having the kernel mapping use WT instead.
>
>--Andy

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-05-29 21:46               ` Andy Lutomirski
  2015-05-29 22:24                 ` Elliott, Robert (Server Storage)
  2015-05-29 22:32                 ` H. Peter Anvin
@ 2015-06-01  8:58                 ` Ingo Molnar
  2015-06-01 17:10                   ` Andy Lutomirski
  2 siblings, 1 reply; 33+ messages in thread
From: Ingo Molnar @ 2015-06-01  8:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Elliott, Robert (Server Storage),
	Dan Williams, Kani, Toshimitsu, Borislav Petkov, Ross Zwisler,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, linux-mm, linux-kernel, X86 ML, linux-nvdimm,
	Juergen Gross, Stefan Bader, Henrique de Moraes Holschuh,
	Yigal Korman, Konrad Rzeszutek Wilk, Luis Rodriguez,
	Christoph Hellwig, Matthew Wilcox


* Andy Lutomirski <luto@amacapital.net> wrote:

> You answered the wrong question. :) I understand the point of the non-temporal 
> stores -- I don't understand the point of using non-temporal stores to *WB 
> memory*.  I think we should be okay with having the kernel mapping use WT 
> instead.

WT memory is write-through, but it is still fully cached for reads.

So non-temporal instructions influence how the CPU will allocate (or not allocate) 
WT cache lines.

Thanks,

	Ingo


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt()
  2015-06-01  8:58                 ` Ingo Molnar
@ 2015-06-01 17:10                   ` Andy Lutomirski
  0 siblings, 0 replies; 33+ messages in thread
From: Andy Lutomirski @ 2015-06-01 17:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Elliott, Robert (Server Storage),
	Dan Williams, Kani, Toshimitsu, Borislav Petkov, Ross Zwisler,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, Andrew Morton,
	Arnd Bergmann, linux-mm, linux-kernel, X86 ML, linux-nvdimm,
	Juergen Gross, Stefan Bader, Henrique de Moraes Holschuh,
	Yigal Korman, Konrad Rzeszutek Wilk, Luis Rodriguez,
	Christoph Hellwig, Matthew Wilcox

On Mon, Jun 1, 2015 at 1:58 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Andy Lutomirski <luto@amacapital.net> wrote:
>
>> You answered the wrong question. :) I understand the point of the non-temporal
>> stores -- I don't understand the point of using non-temporal stores to *WB
>> memory*.  I think we should be okay with having the kernel mapping use WT
>> instead.
>
> WB memory is write-through, but they are still fully cached for reads.
>
> So non-temporal instructions influence how the CPU will allocate (or not allocate)
> WT cache lines.
>

I'm doing a terrible job of saying what I mean.

Given that we're using non-temporal writes, the kernel code should
work correctly and with similar performance regardless of whether the
mapping is WB or WT.  It would still be correct, if slower, with WC or
UC, and, if we used explicit streaming reads, even that would matter
less.

I think this means that we are free to switch the kernel mapping
between WB and WT as needed to improve DAX behavior.  We could even
plausibly do it at runtime.
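
A minimal sketch of such a runtime switch, using set_memory_wb() and the
set_memory_wt() added by patch 9/12; when to flip, and the kaddr/nr_pages
bookkeeping, are assumptions here:

#include <asm/cacheflush.h>

/* kaddr/nr_pages bookkeeping is assumed to exist in the driver */
static int pmem_remap_wt(void *kaddr, int nr_pages)
{
	return set_memory_wt((unsigned long)kaddr, nr_pages);
}

static int pmem_remap_wb(void *kaddr, int nr_pages)
{
	return set_memory_wb((unsigned long)kaddr, nr_pages);
}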

--Andy

> Thanks,
>
>         Ingo



-- 
Andy Lutomirski
AMA Capital Management, LLC


^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2015-06-01 17:11 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-27 15:18 [PATCH v10 0/12] Support Write-Through mapping on x86 Toshi Kani
2015-05-27 15:18 ` [PATCH v10 1/12] x86, mm, pat: Set WT to PA7 slot of PAT MSR Toshi Kani
2015-05-27 15:18 ` [PATCH v10 2/12] x86, mm, pat: Change reserve_memtype() for WT Toshi Kani
2015-05-27 15:18 ` [PATCH v10 3/12] x86, asm: Change is_new_memtype_allowed() " Toshi Kani
2015-05-27 15:18 ` [PATCH v10 4/12] x86, mm, asm-gen: Add ioremap_wt() " Toshi Kani
2015-05-27 15:18 ` [PATCH v10 5/12] arch/*/asm/io.h: Add ioremap_wt() to all architectures Toshi Kani
2015-05-27 15:18 ` [PATCH v10 6/12] video/fbdev, asm/io.h: Remove ioremap_writethrough() Toshi Kani
2015-05-27 15:18 ` [PATCH v10 7/12] x86, mm, pat: Add pgprot_writethrough() for WT Toshi Kani
2015-05-27 15:19 ` [PATCH v10 8/12] x86, mm, asm: Add WT support to set_page_memtype() Toshi Kani
2015-05-27 15:19 ` [PATCH v10 9/12] x86, mm: Add set_memory_wt() for WT Toshi Kani
2015-05-27 15:19 ` [PATCH v10 10/12] x86, mm, pat: Cleanup init flags in pat_init() Toshi Kani
2015-05-29  8:59   ` Borislav Petkov
2015-05-27 15:19 ` [PATCH v10 11/12] x86, mm, pat: Refactor !pat_enabled handling Toshi Kani
2015-05-29  8:58   ` Borislav Petkov
2015-05-29 14:27     ` Toshi Kani
2015-05-29 15:13       ` Borislav Petkov
2015-05-29 15:17         ` Toshi Kani
2015-05-27 15:19 ` [PATCH v10 12/12] drivers/block/pmem: Map NVDIMM with ioremap_wt() Toshi Kani
2015-05-29  9:11   ` Borislav Petkov
2015-05-29 14:43     ` Dan Williams
2015-05-29 15:03       ` Toshi Kani
2015-05-29 18:19         ` Dan Williams
2015-05-29 18:32           ` Toshi Kani
2015-05-29 19:34             ` Dan Williams
2015-05-29 20:10               ` Toshi Kani
2015-05-29 18:34           ` Andy Lutomirski
2015-05-29 19:32             ` Dan Williams
2015-05-29 21:29             ` Elliott, Robert (Server Storage)
2015-05-29 21:46               ` Andy Lutomirski
2015-05-29 22:24                 ` Elliott, Robert (Server Storage)
2015-05-29 22:32                 ` H. Peter Anvin
2015-06-01  8:58                 ` Ingo Molnar
2015-06-01 17:10                   ` Andy Lutomirski
