* [PATCH 1/6] powerpc/mm: Wire up hpte_removebolted for powernv
@ 2017-05-23  4:05 ` Oliver O'Halloran
From: Oliver O'Halloran @ 2017-05-23  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Anton Blanchard, Oliver O'Halloran

From: Anton Blanchard <anton@samba.org>

Adds support for removing bolted mappings (i.e. the kernel linear
mapping) on powernv. This is needed to support memory hot-unplug
operations, which are required for the teardown of DAX/PMEM devices.

Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
v1 -> v2: Fixed the commit author
          Added VM_WARN_ON() if we attempt to remove an unbolted hpte
---
 arch/powerpc/mm/hash_native_64.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 65bb8f33b399..b534d041cfe8 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -407,6 +407,38 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
 	tlbie(vpn, psize, psize, ssize, 0);
 }
 
+/*
+ * Remove a bolted kernel entry. Memory hotplug uses this.
+ *
+ * No need to lock here because we should be the only user.
+ */
+static int native_hpte_removebolted(unsigned long ea, int psize, int ssize)
+{
+	unsigned long vpn;
+	unsigned long vsid;
+	long slot;
+	struct hash_pte *hptep;
+
+	vsid = get_kernel_vsid(ea, ssize);
+	vpn = hpt_vpn(ea, vsid, ssize);
+
+	slot = native_hpte_find(vpn, psize, ssize);
+	if (slot == -1)
+		return -ENOENT;
+
+	hptep = htab_address + slot;
+
+	VM_WARN_ON(!(be64_to_cpu(hptep->v) & HPTE_V_BOLTED));
+
+	/* Invalidate the hpte */
+	hptep->v = 0;
+
+	/* Invalidate the TLB */
+	tlbie(vpn, psize, psize, ssize, 0);
+	return 0;
+}
+
+
 static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
 				   int bpsize, int apsize, int ssize, int local)
 {
@@ -725,6 +757,7 @@ void __init hpte_init_native(void)
 	mmu_hash_ops.hpte_invalidate	= native_hpte_invalidate;
 	mmu_hash_ops.hpte_updatepp	= native_hpte_updatepp;
 	mmu_hash_ops.hpte_updateboltedpp = native_hpte_updateboltedpp;
+	mmu_hash_ops.hpte_removebolted = native_hpte_removebolted;
 	mmu_hash_ops.hpte_insert	= native_hpte_insert;
 	mmu_hash_ops.hpte_remove	= native_hpte_remove;
 	mmu_hash_ops.hpte_clear_all	= native_hpte_clear;
-- 
2.9.3
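
For context: the hot-unplug path reaches this new hook via
htab_remove_mapping(), which walks the linear mapping range and calls
the op once per page. A minimal sketch of that caller, paraphrased from
arch/powerpc/mm/hash_utils_64.c of this era (details may differ):

int htab_remove_mapping(unsigned long vstart, unsigned long vend,
			int psize, int ssize)
{
	unsigned long vaddr;
	unsigned int step = 1 << mmu_psize_defs[psize].shift;
	int rc, ret = 0;

	/* without this patch, powernv has no removebolted op at all */
	if (!mmu_hash_ops.hpte_removebolted)
		return -ENODEV;

	for (vaddr = vstart; vaddr < vend; vaddr += step) {
		/* -ENOENT from the op marks a hole; keep going */
		rc = mmu_hash_ops.hpte_removebolted(vaddr, psize, ssize);
		if (rc == -ENOENT) {
			ret = -ENOENT;
			continue;
		}
		if (rc < 0)
			return rc;
	}
	return ret;
}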


* [PATCH 2/6] powerpc/vmemmap: Reshuffle vmemmap_free()
  2017-05-23  4:05 ` Oliver O'Halloran
@ 2017-05-23  4:05   ` Oliver O'Halloran
From: Oliver O'Halloran @ 2017-05-23  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Oliver O'Halloran

Removes an indentation level and shuffles some code around to make the
following patch cleaner. No functional changes.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
v1 -> v2: Remove broken initialiser
---
 arch/powerpc/mm/init_64.c | 48 ++++++++++++++++++++++++-----------------------
 1 file changed, 25 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index ec84b31c6c86..8851e4f5dbab 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -234,13 +234,15 @@ static unsigned long vmemmap_list_free(unsigned long start)
 void __ref vmemmap_free(unsigned long start, unsigned long end)
 {
 	unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
+	unsigned long page_order = get_order(page_size);
 
 	start = _ALIGN_DOWN(start, page_size);
 
 	pr_debug("vmemmap_free %lx...%lx\n", start, end);
 
 	for (; start < end; start += page_size) {
-		unsigned long addr;
+		unsigned long nr_pages, addr;
+		struct page *page;
 
 		/*
 		 * the section has already be marked as invalid, so
@@ -251,29 +253,29 @@ void __ref vmemmap_free(unsigned long start, unsigned long end)
 			continue;
 
 		addr = vmemmap_list_free(start);
-		if (addr) {
-			struct page *page = pfn_to_page(addr >> PAGE_SHIFT);
-
-			if (PageReserved(page)) {
-				/* allocated from bootmem */
-				if (page_size < PAGE_SIZE) {
-					/*
-					 * this shouldn't happen, but if it is
-					 * the case, leave the memory there
-					 */
-					WARN_ON_ONCE(1);
-				} else {
-					unsigned int nr_pages =
-						1 << get_order(page_size);
-					while (nr_pages--)
-						free_reserved_page(page++);
-				}
-			} else
-				free_pages((unsigned long)(__va(addr)),
-							get_order(page_size));
-
-			vmemmap_remove_mapping(start, page_size);
+		if (!addr)
+			continue;
+
+		page = pfn_to_page(addr >> PAGE_SHIFT);
+		nr_pages = 1 << page_order;
+
+		if (PageReserved(page)) {
+			/* allocated from bootmem */
+			if (page_size < PAGE_SIZE) {
+				/*
+				 * this shouldn't happen, but if it is
+				 * the case, leave the memory there
+				 */
+				WARN_ON_ONCE(1);
+			} else {
+				while (nr_pages--)
+					free_reserved_page(page++);
+			}
+		} else {
+			free_pages((unsigned long)(__va(addr)), page_order);
 		}
+
+		vmemmap_remove_mapping(start, page_size);
 	}
 }
 #endif
-- 
2.9.3
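
For readability, the loop body once this patch is applied looks like
the following (reconstructed from the hunks above; context lines
outside the hunks are paraphrased):

	for (; start < end; start += page_size) {
		unsigned long nr_pages, addr;
		struct page *page;

		/*
		 * The section is already marked invalid: if
		 * vmemmap_populated() still sees it, other sections
		 * share this page, so skip it.
		 */
		if (vmemmap_populated(start, page_size))
			continue;

		addr = vmemmap_list_free(start);
		if (!addr)
			continue;

		page = pfn_to_page(addr >> PAGE_SHIFT);
		nr_pages = 1 << page_order;

		if (PageReserved(page)) {
			/* allocated from bootmem */
			if (page_size < PAGE_SIZE) {
				/* shouldn't happen; leave the memory there */
				WARN_ON_ONCE(1);
			} else {
				while (nr_pages--)
					free_reserved_page(page++);
			}
		} else {
			free_pages((unsigned long)(__va(addr)), page_order);
		}

		vmemmap_remove_mapping(start, page_size);
	}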


* [PATCH 3/6] powerpc/vmemmap: Add altmap support
  2017-05-23  4:05 ` Oliver O'Halloran
@ 2017-05-23  4:05   ` Oliver O'Halloran
From: Oliver O'Halloran @ 2017-05-23  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Oliver O'Halloran

Adds support to powerpc for the altmap feature of ZONE_DEVICE memory.
An altmap is a driver-provided region used as backing storage for the
struct pages of ZONE_DEVICE memory. In situations where a large amount
of ZONE_DEVICE memory is being added to the system, the altmap reduces
pressure on main system memory by allowing the mm metadata to be stored
on the device itself rather than in main memory.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/mm/init_64.c | 15 +++++++++++++--
 arch/powerpc/mm/mem.c     | 16 +++++++++++++---
 2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 8851e4f5dbab..225fbb8034e6 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -44,6 +44,7 @@
 #include <linux/slab.h>
 #include <linux/of_fdt.h>
 #include <linux/libfdt.h>
+#include <linux/memremap.h>
 
 #include <asm/pgalloc.h>
 #include <asm/page.h>
@@ -171,13 +172,17 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 	pr_debug("vmemmap_populate %lx..%lx, node %d\n", start, end, node);
 
 	for (; start < end; start += page_size) {
+		struct vmem_altmap *altmap;
 		void *p;
 		int rc;
 
 		if (vmemmap_populated(start, page_size))
 			continue;
 
-		p = vmemmap_alloc_block(page_size, node);
+		/* altmap lookups only work at section boundaries */
+		altmap = to_vmem_altmap(SECTION_ALIGN_DOWN(start));
+
+		p =  __vmemmap_alloc_block_buf(page_size, node, altmap);
 		if (!p)
 			return -ENOMEM;
 
@@ -242,6 +247,8 @@ void __ref vmemmap_free(unsigned long start, unsigned long end)
 
 	for (; start < end; start += page_size) {
 		unsigned long nr_pages, addr;
+		struct vmem_altmap *altmap;
+		struct page *section_base;
 		struct page *page;
 
 		/*
@@ -257,9 +264,13 @@ void __ref vmemmap_free(unsigned long start, unsigned long end)
 			continue;
 
 		page = pfn_to_page(addr >> PAGE_SHIFT);
+		section_base = pfn_to_page(vmemmap_section_start(start));
 		nr_pages = 1 << page_order;
 
-		if (PageReserved(page)) {
+		altmap = to_vmem_altmap((unsigned long) section_base);
+		if (altmap) {
+			vmem_altmap_free(altmap, nr_pages);
+		} else if (PageReserved(page)) {
 			/* allocated from bootmem */
 			if (page_size < PAGE_SIZE) {
 				/*
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 9ee536ec0739..2c0c16f11eee 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -36,6 +36,7 @@
 #include <linux/hugetlb.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <linux/memremap.h>
 
 #include <asm/pgalloc.h>
 #include <asm/prom.h>
@@ -159,11 +160,20 @@ int arch_remove_memory(u64 start, u64 size)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
-	struct zone *zone;
+	struct vmem_altmap *altmap;
+	struct page *page;
 	int ret;
 
-	zone = page_zone(pfn_to_page(start_pfn));
-	ret = __remove_pages(zone, start_pfn, nr_pages);
+	/*
+	 * If we have an altmap then we need to skip over any reserved PFNs
+	 * when querying the zone.
+	 */
+	page = pfn_to_page(start_pfn);
+	altmap = to_vmem_altmap((unsigned long) page);
+	if (altmap)
+		page += vmem_altmap_offset(altmap);
+
+	ret = __remove_pages(page_zone(page), start_pfn, nr_pages);
 	if (ret)
 		return ret;
 
-- 
2.9.3
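
For reference, the driver side that feeds an altmap into this path.
This is a minimal sketch against the 4.12-era devm_memremap_pages()
API; the helper name and the reserve value are illustrative, not from
this series:

/* illustrative only: hand the device range plus an altmap to the core */
static void *example_map_pmem(struct device *dev, struct resource *res,
			      struct percpu_ref *ref)
{
	struct vmem_altmap altmap = {
		.base_pfn = PHYS_PFN(res->start),
		/* pfns at the start of the range kept out of the page map */
		.reserve  = PHYS_PFN(SZ_8K),
	};

	/*
	 * With a non-NULL altmap, the patched vmemmap_populate() above
	 * carves the struct pages out of the device range instead of RAM.
	 */
	return devm_memremap_pages(dev, res, ref, &altmap);
}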


* [PATCH 4/6] powerpc/mm: Add devmap support for ppc64
  2017-05-23  4:05 ` Oliver O'Halloran
@ 2017-05-23  4:05   ` Oliver O'Halloran
From: Oliver O'Halloran @ 2017-05-23  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Oliver O'Halloran, Aneesh Kumar K . V

Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S. This
is used to differentiate device-backed memory from transparent huge
pages, since the two are handled in more or less the same manner by the
core mm code.

Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.

Aneesh, this has been fleshed out substantially since v1. Can you
re-review it? Also, no explicit GUP support is required in this patch,
since devmap support was added to generic GUP as part of making x86 use
the generic version.
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/64/radix.h    |  2 +-
 arch/powerpc/mm/hugetlbpage.c                 |  2 +-
 arch/powerpc/mm/pgtable-book3s64.c            |  4 +--
 arch/powerpc/mm/pgtable-hash64.c              |  4 ++-
 arch/powerpc/mm/pgtable-radix.c               |  3 ++-
 arch/powerpc/mm/pgtable_64.c                  |  2 +-
 8 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 9732837aaae8..eaaf613c5347 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
  */
 static inline int hash__pmd_trans_huge(pmd_t pmd)
 {
-	return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
+	return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
 		  (_PAGE_PTE | H_PAGE_THP_HUGE));
 }
 
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 85bc9875c3be..24634e92dd0b 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -79,6 +79,9 @@
 
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
+#define _PAGE_DEVMAP		_RPAGE_SW1
+#define __HAVE_ARCH_PTE_DEVMAP
+
 /*
  * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
  * Instead of fixing all of them, add an alternate define which
@@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
 	return pte;
 }
 
+static inline pte_t pte_mkdevmap(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
+}
+
+static inline int pte_devmap(pte_t pte)
+{
+	return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
+}
+
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
 	/* FIXME!! check whether this need to be a conditional */
@@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_mk_savedwrite(pmd)	pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
 #define pmd_clear_savedwrite(pmd)	pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
 
+#define pud_pfn(...) (0)
+#define pgd_pfn(...) (0)
+
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
 #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
@@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
 	return true;
 }
 
-
 #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
 static inline bool arch_needs_pgtable_deposit(void)
 {
@@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
 	return true;
 }
 
+static inline pmd_t pmd_mkdevmap(pmd_t pmd)
+{
+	return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
+}
+
+static inline int pmd_devmap(pmd_t pmd)
+{
+	return pte_devmap(pmd_pte(pmd));
+}
+
+static inline int pud_devmap(pud_t pud)
+{
+	return 0;
+}
+
+static inline int pgd_devmap(pgd_t pgd)
+{
+	return 0;
+}
+
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index ac16d1943022..ba43754e96d2 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -252,7 +252,7 @@ static inline int radix__pgd_bad(pgd_t pgd)
 
 static inline int radix__pmd_trans_huge(pmd_t pmd)
 {
-	return !!(pmd_val(pmd) & _PAGE_PTE);
+	return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
 }
 
 static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a4f33de4008e..d9958af5c98e 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -963,7 +963,7 @@ pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
 			if (pmd_none(pmd))
 				return NULL;
 
-			if (pmd_trans_huge(pmd)) {
+			if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) {
 				if (is_thp)
 					*is_thp = true;
 				ret_pte = (pte_t *) pmdp;
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 5fcb3dd74c13..31eed8fa8e99 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -32,7 +32,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 {
 	int changed;
 #ifdef CONFIG_DEBUG_VM
-	WARN_ON(!pmd_trans_huge(*pmdp));
+	WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
 	assert_spin_locked(&vma->vm_mm->page_table_lock);
 #endif
 	changed = !pmd_same(*(pmdp), entry);
@@ -59,7 +59,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(pte_present(pmd_pte(*pmdp)) && !pte_protnone(pmd_pte(*pmdp)));
 	assert_spin_locked(&mm->page_table_lock);
-	WARN_ON(!pmd_trans_huge(pmd));
+	WARN_ON(!(pmd_trans_huge(pmd) || pmd_devmap(pmd)));
 #endif
 	trace_hugepage_set_pmd(addr, pmd_val(pmd));
 	return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd));
diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index 8b85a14b08ea..7456cde4dbce 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -109,7 +109,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
 	unsigned long old;
 
 #ifdef CONFIG_DEBUG_VM
-	WARN_ON(!pmd_trans_huge(*pmdp));
+	WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
 	assert_spin_locked(&mm->page_table_lock);
 #endif
 
@@ -141,6 +141,7 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	VM_BUG_ON(pmd_trans_huge(*pmdp));
+	VM_BUG_ON(pmd_devmap(*pmdp));
 
 	pmd = *pmdp;
 	pmd_clear(pmdp);
@@ -221,6 +222,7 @@ void hash__pmdp_huge_split_prepare(struct vm_area_struct *vma,
 {
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	VM_BUG_ON(REGION_ID(address) != USER_REGION_ID);
+	VM_BUG_ON(pmd_devmap(*pmdp));
 
 	/*
 	 * We can't mark the pmd none here, because that will cause a race
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index c28165d8970b..69e28dda81f2 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -683,7 +683,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add
 	unsigned long old;
 
 #ifdef CONFIG_DEBUG_VM
-	WARN_ON(!radix__pmd_trans_huge(*pmdp));
+	WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
 	assert_spin_locked(&mm->page_table_lock);
 #endif
 
@@ -701,6 +701,7 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 	VM_BUG_ON(radix__pmd_trans_huge(*pmdp));
+	VM_BUG_ON(pmd_devmap(*pmdp));
 	/*
 	 * khugepaged calls this for normal pmd
 	 */
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index db93cf747a03..aefde9bd3110 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -323,7 +323,7 @@ struct page *pud_page(pud_t pud)
  */
 struct page *pmd_page(pmd_t pmd)
 {
-	if (pmd_trans_huge(pmd) || pmd_huge(pmd))
+	if (pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))
 		return pte_page(pmd_pte(pmd));
 	return virt_to_page(pmd_page_vaddr(pmd));
 }
-- 
2.9.3
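
The invariant behind the v1 -> v2 change, seen from the core-mm side:
pmd_trans_huge() and pmd_devmap() must never both be true, and generic
code tests them together wherever either kind of huge entry is legal.
Roughly, paraphrasing __pmd_trans_huge_lock() in mm/huge_memory.c of
this era:

spinlock_t *__pmd_trans_huge_lock(pmd_t *pmd, struct vm_area_struct *vma)
{
	spinlock_t *ptl = pmd_lock(vma->vm_mm, pmd);

	/* a THP *or* a devmap entry; the two predicates are disjoint */
	if (likely(pmd_trans_huge(*pmd) || pmd_devmap(*pmd)))
		return ptl;
	spin_unlock(ptl);
	return NULL;
}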


* [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE
  2017-05-23  4:05 ` Oliver O'Halloran
@ 2017-05-23  4:05   ` Oliver O'Halloran
From: Oliver O'Halloran @ 2017-05-23  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Oliver O'Halloran, x86

Currently ZONE_DEVICE depends on X86_64. This is fine for now, but it
will get unwieldy as new platforms gain ZONE_DEVICE support. Move it to
an arch-selected Kconfig option to save us some trouble in the future.

Cc: x86@kernel.org
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/x86/Kconfig | 1 +
 mm/Kconfig       | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cd18994a9555..acbb15234562 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -59,6 +59,7 @@ config X86
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_HAS_ZONE_DEVICE		if X86_64
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_MIGHT_HAVE_ACPI_PDC		if ACPI
 	select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/mm/Kconfig b/mm/Kconfig
index beb7a455915d..2d38a4abe957 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -683,12 +683,15 @@ config IDLE_PAGE_TRACKING
 
 	  See Documentation/vm/idle_page_tracking.txt for more details.
 
+config ARCH_HAS_ZONE_DEVICE
+	def_bool n
+
 config ZONE_DEVICE
 	bool "Device memory (pmem, etc...) hotplug support"
 	depends on MEMORY_HOTPLUG
 	depends on MEMORY_HOTREMOVE
 	depends on SPARSEMEM_VMEMMAP
-	depends on X86_64 #arch_add_memory() comprehends device memory
+	depends on ARCH_HAS_ZONE_DEVICE
 
 	help
 	  Device memory hotplug support allows for establishing pmem,
-- 
2.9.3
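
With this in place, opting in becomes a one-line select for any future
architecture (sketch; MYARCH and MYARCH_IS_64BIT are placeholders --
patch 6 below does exactly this for powerpc):

config MYARCH
	select ARCH_HAS_ZONE_DEVICE	if MYARCH_IS_64BIT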


* [PATCH 6/6] powerpc/mm: Enable ZONE_DEVICE on powerpc
  2017-05-23  4:05 ` Oliver O'Halloran
@ 2017-05-23  4:05   ` Oliver O'Halloran
From: Oliver O'Halloran @ 2017-05-23  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Oliver O'Halloran

Flip the switch. Running around and screaming "IT'S ALIVE" is optional,
but recommended.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index f7c8f9972f61..bf3365c34244 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -138,6 +138,7 @@ config PPC
 	select ARCH_HAS_SG_CHAIN
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_HAS_ZONE_DEVICE		if PPC64
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
-- 
2.9.3


* Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64
  2017-05-23  4:05   ` Oliver O'Halloran
@ 2017-05-23  4:23     ` Aneesh Kumar K.V
From: Aneesh Kumar K.V @ 2017-05-23  4:23 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: linux-mm

Oliver O'Halloran <oohall@gmail.com> writes:

> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
> is used to differentiate device backed memory from transparent huge
> pages since they are handled in more or less the same manner by the core
> mm code.
>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>
> Aneesh, this has been fleshed out substantially since v1. Can you
> re-review it? Also no explicit gup support is required in this patch
> since devmap support was added generic GUP as a part of making x86 use
> the generic version.
> ---
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 ++++++++++++++++++++++++++-
>  arch/powerpc/include/asm/book3s/64/radix.h    |  2 +-
>  arch/powerpc/mm/hugetlbpage.c                 |  2 +-
>  arch/powerpc/mm/pgtable-book3s64.c            |  4 +--
>  arch/powerpc/mm/pgtable-hash64.c              |  4 ++-
>  arch/powerpc/mm/pgtable-radix.c               |  3 ++-
>  arch/powerpc/mm/pgtable_64.c                  |  2 +-
>  8 files changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837aaae8..eaaf613c5347 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>   */
>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>  {
> -	return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
> +	return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
>  		  (_PAGE_PTE | H_PAGE_THP_HUGE));
>  }


_PAGE_DEVMAP is not really needed here. We set H_PAGE_THP_HUGE only
for THP huge pages w.r.t. hash. But putting it here also makes it clear
that devmap entries are not considered trans-huge.
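
Concretely, with PTE = _PAGE_PTE, THP = H_PAGE_THP_HUGE and DEV =
_PAGE_DEVMAP, the masked compare works out as follows (worked example,
not from the patch):

	THP pmd:         PTE|THP      & (PTE|THP|DEV) == PTE|THP  -> true
	devmap pmd:      PTE|THP|DEV  & (PTE|THP|DEV) != PTE|THP  -> false
	devmap, no THP:  PTE|DEV      & (PTE|THP|DEV) != PTE|THP  -> false

so a devmap entry fails the test whether or not H_PAGE_THP_HUGE happens
to be set on it.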

>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 85bc9875c3be..24634e92dd0b 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -79,6 +79,9 @@
>
>  #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
>  #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
> +#define _PAGE_DEVMAP		_RPAGE_SW1
> +#define __HAVE_ARCH_PTE_DEVMAP
> +
>  /*
>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>   * Instead of fixing all of them, add an alternate define which
> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>  	return pte;
>  }
>
> +static inline pte_t pte_mkdevmap(pte_t pte)
> +{
> +	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
> +}
> +
> +static inline int pte_devmap(pte_t pte)
> +{
> +	return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
> +}
> +
>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>  {
>  	/* FIXME!! check whether this need to be a conditional */
> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_mk_savedwrite(pmd)	pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>  #define pmd_clear_savedwrite(pmd)	pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>
> +#define pud_pfn(...) (0)
> +#define pgd_pfn(...) (0)
> +
>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
>  	return true;
>  }
>
> -
>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>  static inline bool arch_needs_pgtable_deposit(void)
>  {
> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>  	return true;
>  }
>
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +	return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
> +}


We avoided setting _PAGE_SPECIAL on pmd entries. This will set it, so
we may want to check whether that is OK. IIRC, we overloaded
_PAGE_SPECIAL at some point to indicate THP splitting, but it would be
good to double-check.

> +
> +static inline int pmd_devmap(pmd_t pmd)
> +{
> +	return pte_devmap(pmd_pte(pmd));
> +}
> +
> +static inline int pud_devmap(pud_t pud)
> +{
> +	return 0;
> +}
> +
> +static inline int pgd_devmap(pgd_t pgd)
> +{
> +	return 0;
> +}
> +
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #endif /* __ASSEMBLY__ */
>  #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index ac16d1943022..ba43754e96d2 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -252,7 +252,7 @@ static inline int radix__pgd_bad(pgd_t pgd)
>
>  static inline int radix__pmd_trans_huge(pmd_t pmd)
>  {
> -	return !!(pmd_val(pmd) & _PAGE_PTE);
> +	return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
>  }
>
>  static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index a4f33de4008e..d9958af5c98e 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -963,7 +963,7 @@ pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
>  			if (pmd_none(pmd))
>  				return NULL;
>
> -			if (pmd_trans_huge(pmd)) {
> +			if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) {
>  				if (is_thp)
>  					*is_thp = true;
>  				ret_pte = (pte_t *) pmdp;


Is that correct? Do we want pmd_devmap entries to have is_thp set?


> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
> index 5fcb3dd74c13..31eed8fa8e99 100644
> --- a/arch/powerpc/mm/pgtable-book3s64.c
> +++ b/arch/powerpc/mm/pgtable-book3s64.c
> @@ -32,7 +32,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
>  {
>  	int changed;
>  #ifdef CONFIG_DEBUG_VM
> -	WARN_ON(!pmd_trans_huge(*pmdp));
> +	WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>  	assert_spin_locked(&vma->vm_mm->page_table_lock);
>  #endif
>  	changed = !pmd_same(*(pmdp), entry);
> @@ -59,7 +59,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
>  #ifdef CONFIG_DEBUG_VM
>  	WARN_ON(pte_present(pmd_pte(*pmdp)) && !pte_protnone(pmd_pte(*pmdp)));
>  	assert_spin_locked(&mm->page_table_lock);
> -	WARN_ON(!pmd_trans_huge(pmd));
> +	WARN_ON(!(pmd_trans_huge(pmd) || pmd_devmap(pmd)));
>  #endif
>  	trace_hugepage_set_pmd(addr, pmd_val(pmd));
>  	return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd));
> diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
> index 8b85a14b08ea..7456cde4dbce 100644
> --- a/arch/powerpc/mm/pgtable-hash64.c
> +++ b/arch/powerpc/mm/pgtable-hash64.c
> @@ -109,7 +109,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
>  	unsigned long old;
>
>  #ifdef CONFIG_DEBUG_VM
> -	WARN_ON(!pmd_trans_huge(*pmdp));
> +	WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>  	assert_spin_locked(&mm->page_table_lock);
>  #endif
>
> @@ -141,6 +141,7 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON(pmd_trans_huge(*pmdp));
> +	VM_BUG_ON(pmd_devmap(*pmdp));
>
>  	pmd = *pmdp;
>  	pmd_clear(pmdp);
> @@ -221,6 +222,7 @@ void hash__pmdp_huge_split_prepare(struct vm_area_struct *vma,
>  {
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON(REGION_ID(address) != USER_REGION_ID);
> +	VM_BUG_ON(pmd_devmap(*pmdp));
>
>  	/*
>  	 * We can't mark the pmd none here, because that will cause a race
> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index c28165d8970b..69e28dda81f2 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -683,7 +683,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add
>  	unsigned long old;
>
>  #ifdef CONFIG_DEBUG_VM
> -	WARN_ON(!radix__pmd_trans_huge(*pmdp));
> +	WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>  	assert_spin_locked(&mm->page_table_lock);
>  #endif
>
> @@ -701,6 +701,7 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON(radix__pmd_trans_huge(*pmdp));
> +	VM_BUG_ON(pmd_devmap(*pmdp));
>  	/*
>  	 * khugepaged calls this for normal pmd
>  	 */
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index db93cf747a03..aefde9bd3110 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -323,7 +323,7 @@ struct page *pud_page(pud_t pud)
>   */
>  struct page *pmd_page(pmd_t pmd)
>  {
> -	if (pmd_trans_huge(pmd) || pmd_huge(pmd))
> +	if (pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))
>  		return pte_page(pmd_pte(pmd));
>  	return virt_to_page(pmd_page_vaddr(pmd));
>  }
> -- 
> 2.9.3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64
@ 2017-05-23  4:23     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 31+ messages in thread
From: Aneesh Kumar K.V @ 2017-05-23  4:23 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev; +Cc: linux-mm, Oliver O'Halloran

Oliver O'Halloran <oohall@gmail.com> writes:

> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
> is used to differentiate device backed memory from transparent huge
> pages since they are handled in more or less the same manner by the core
> mm code.
>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>
> Aneesh, this has been fleshed out substantially since v1. Can you
> re-review it? Also no explicit gup support is required in this patch
> since devmap support was added generic GUP as a part of making x86 use
> the generic version.
> ---
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 ++++++++++++++++++++++++++-
>  arch/powerpc/include/asm/book3s/64/radix.h    |  2 +-
>  arch/powerpc/mm/hugetlbpage.c                 |  2 +-
>  arch/powerpc/mm/pgtable-book3s64.c            |  4 +--
>  arch/powerpc/mm/pgtable-hash64.c              |  4 ++-
>  arch/powerpc/mm/pgtable-radix.c               |  3 ++-
>  arch/powerpc/mm/pgtable_64.c                  |  2 +-
>  8 files changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837aaae8..eaaf613c5347 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>   */
>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>  {
> -	return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
> +	return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
>  		  (_PAGE_PTE | H_PAGE_THP_HUGE));
>  }


_PAGE_DEVMAP is not really needed here. We will set H_PAGE_THP_HUGE only
for thp hugepage w.r.t hash. But putting it here also makes it clear
that devmap entries are not considered trans huge.

>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 85bc9875c3be..24634e92dd0b 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -79,6 +79,9 @@
>
>  #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
>  #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
> +#define _PAGE_DEVMAP		_RPAGE_SW1
> +#define __HAVE_ARCH_PTE_DEVMAP
> +
>  /*
>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>   * Instead of fixing all of them, add an alternate define which
> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>  	return pte;
>  }
>
> +static inline pte_t pte_mkdevmap(pte_t pte)
> +{
> +	return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
> +}
> +
> +static inline int pte_devmap(pte_t pte)
> +{
> +	return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
> +}
> +
>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>  {
>  	/* FIXME!! check whether this need to be a conditional */
> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_mk_savedwrite(pmd)	pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>  #define pmd_clear_savedwrite(pmd)	pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>
> +#define pud_pfn(...) (0)
> +#define pgd_pfn(...) (0)
> +
>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
>  	return true;
>  }
>
> -
>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>  static inline bool arch_needs_pgtable_deposit(void)
>  {
> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>  	return true;
>  }
>
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +	return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
> +}


We avoided setting _PAGE_SPECIAL on pmd entries. This will set that, we
may want to check if it is ok.  IIRC, we overloaded _PAGE_SPECIAL at
some point to indicate thp splitting. But good to double check. 

> +
> +static inline int pmd_devmap(pmd_t pmd)
> +{
> +	return pte_devmap(pmd_pte(pmd));
> +}
> +
> +static inline int pud_devmap(pud_t pud)
> +{
> +	return 0;
> +}
> +
> +static inline int pgd_devmap(pgd_t pgd)
> +{
> +	return 0;
> +}
> +
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #endif /* __ASSEMBLY__ */
>  #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index ac16d1943022..ba43754e96d2 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -252,7 +252,7 @@ static inline int radix__pgd_bad(pgd_t pgd)
>
>  static inline int radix__pmd_trans_huge(pmd_t pmd)
>  {
> -	return !!(pmd_val(pmd) & _PAGE_PTE);
> +	return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
>  }
>
>  static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index a4f33de4008e..d9958af5c98e 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -963,7 +963,7 @@ pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
>  			if (pmd_none(pmd))
>  				return NULL;
>
> -			if (pmd_trans_huge(pmd)) {
> +			if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) {
>  				if (is_thp)
>  					*is_thp = true;
>  				ret_pte = (pte_t *) pmdp;


Is that correct? Do we want pmd_devmap to have is_thp set?


> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
> index 5fcb3dd74c13..31eed8fa8e99 100644
> --- a/arch/powerpc/mm/pgtable-book3s64.c
> +++ b/arch/powerpc/mm/pgtable-book3s64.c
> @@ -32,7 +32,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
>  {
>  	int changed;
>  #ifdef CONFIG_DEBUG_VM
> -	WARN_ON(!pmd_trans_huge(*pmdp));
> +	WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>  	assert_spin_locked(&vma->vm_mm->page_table_lock);
>  #endif
>  	changed = !pmd_same(*(pmdp), entry);
> @@ -59,7 +59,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
>  #ifdef CONFIG_DEBUG_VM
>  	WARN_ON(pte_present(pmd_pte(*pmdp)) && !pte_protnone(pmd_pte(*pmdp)));
>  	assert_spin_locked(&mm->page_table_lock);
> -	WARN_ON(!pmd_trans_huge(pmd));
> +	WARN_ON(!(pmd_trans_huge(pmd) || pmd_devmap(pmd)));
>  #endif
>  	trace_hugepage_set_pmd(addr, pmd_val(pmd));
>  	return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd));
> diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
> index 8b85a14b08ea..7456cde4dbce 100644
> --- a/arch/powerpc/mm/pgtable-hash64.c
> +++ b/arch/powerpc/mm/pgtable-hash64.c
> @@ -109,7 +109,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
>  	unsigned long old;
>
>  #ifdef CONFIG_DEBUG_VM
> -	WARN_ON(!pmd_trans_huge(*pmdp));
> +	WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>  	assert_spin_locked(&mm->page_table_lock);
>  #endif
>
> @@ -141,6 +141,7 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON(pmd_trans_huge(*pmdp));
> +	VM_BUG_ON(pmd_devmap(*pmdp));
>
>  	pmd = *pmdp;
>  	pmd_clear(pmdp);
> @@ -221,6 +222,7 @@ void hash__pmdp_huge_split_prepare(struct vm_area_struct *vma,
>  {
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON(REGION_ID(address) != USER_REGION_ID);
> +	VM_BUG_ON(pmd_devmap(*pmdp));
>
>  	/*
>  	 * We can't mark the pmd none here, because that will cause a race
> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index c28165d8970b..69e28dda81f2 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -683,7 +683,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add
>  	unsigned long old;
>
>  #ifdef CONFIG_DEBUG_VM
> -	WARN_ON(!radix__pmd_trans_huge(*pmdp));
> +	WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>  	assert_spin_locked(&mm->page_table_lock);
>  #endif
>
> @@ -701,6 +701,7 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>  	VM_BUG_ON(radix__pmd_trans_huge(*pmdp));
> +	VM_BUG_ON(pmd_devmap(*pmdp));
>  	/*
>  	 * khugepaged calls this for normal pmd
>  	 */
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index db93cf747a03..aefde9bd3110 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -323,7 +323,7 @@ struct page *pud_page(pud_t pud)
>   */
>  struct page *pmd_page(pmd_t pmd)
>  {
> -	if (pmd_trans_huge(pmd) || pmd_huge(pmd))
> +	if (pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))
>  		return pte_page(pmd_pte(pmd));
>  	return virt_to_page(pmd_page_vaddr(pmd));
>  }
> -- 
> 2.9.3

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE
  2017-05-23  4:05   ` Oliver O'Halloran
@ 2017-05-23  6:42     ` Ingo Molnar
  -1 siblings, 0 replies; 31+ messages in thread
From: Ingo Molnar @ 2017-05-23  6:42 UTC (permalink / raw)
  To: Oliver O'Halloran; +Cc: linuxppc-dev, linux-mm, x86


* Oliver O'Halloran <oohall@gmail.com> wrote:

> Currently ZONE_DEVICE depends on X86_64. This is fine for now, but it
> will get unwieldy as new platforms get ZONE_DEVICE support. Move it
> to an arch-selected Kconfig option to save us some trouble in the
> future.
> 
> Cc: x86@kernel.org
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>

Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,

	Ingo
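
For reference, the per-arch opt-in this creates is a one-liner in the
arch Kconfig; patch 6/6 later in the series adds exactly this for
powerpc (quoted here for context):

config PPC
	select ARCH_HAS_ZONE_DEVICE		if PPC64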

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64
  2017-05-23  4:23     ` Aneesh Kumar K.V
@ 2017-05-23  6:42       ` Oliver O'Halloran
  -1 siblings, 0 replies; 31+ messages in thread
From: Oliver O'Halloran @ 2017-05-23  6:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, Linux MM

On Tue, May 23, 2017 at 2:23 PM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> Oliver O'Halloran <oohall@gmail.com> writes:
>
>> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
>> is used to differentiate device backed memory from transparent huge
>> pages since they are handled in more or less the same manner by the core
>> mm code.
>>
>> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>> ---
>> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
>> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
>> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>>
>> Aneesh, this has been fleshed out substantially since v1. Can you
>> re-review it? Also no explicit gup support is required in this patch
>> since devmap support was added generic GUP as a part of making x86 use
>> the generic version.
>> ---
>>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 ++++++++++++++++++++++++++-
>>  arch/powerpc/include/asm/book3s/64/radix.h    |  2 +-
>>  arch/powerpc/mm/hugetlbpage.c                 |  2 +-
>>  arch/powerpc/mm/pgtable-book3s64.c            |  4 +--
>>  arch/powerpc/mm/pgtable-hash64.c              |  4 ++-
>>  arch/powerpc/mm/pgtable-radix.c               |  3 ++-
>>  arch/powerpc/mm/pgtable_64.c                  |  2 +-
>>  8 files changed, 47 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> index 9732837aaae8..eaaf613c5347 100644
>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>>   */
>>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>>  {
>> -     return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
>> +     return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
>>                 (_PAGE_PTE | H_PAGE_THP_HUGE));
>>  }
>
> _PAGE_DEVMAP is not really needed here. We will set H_PAGE_THP_HUGE only
> for thp hugepage w.r.t hash. But putting it here also makes it clear
> that devmap entries are not considered trans huge.

Good point. I'll remove it.

>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 85bc9875c3be..24634e92dd0b 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -79,6 +79,9 @@
>>
>>  #define _PAGE_SOFT_DIRTY     _RPAGE_SW3 /* software: software dirty tracking */
>>  #define _PAGE_SPECIAL                _RPAGE_SW2 /* software: special page */
>> +#define _PAGE_DEVMAP         _RPAGE_SW1
>> +#define __HAVE_ARCH_PTE_DEVMAP
>> +
>>  /*
>>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>>   * Instead of fixing all of them, add an alternate define which
>> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>>       return pte;
>>  }
>>
>> +static inline pte_t pte_mkdevmap(pte_t pte)
>> +{
>> +     return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
>> +}
>> +
>> +static inline int pte_devmap(pte_t pte)
>> +{
>> +     return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
>> +}
>> +
>>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>>  {
>>       /* FIXME!! check whether this need to be a conditional */
>> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>>  #define pmd_mk_savedwrite(pmd)       pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>>  #define pmd_clear_savedwrite(pmd)    pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>>
>> +#define pud_pfn(...) (0)
>> +#define pgd_pfn(...) (0)
>> +
>>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
>>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
>> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
>>       return true;
>>  }
>>
>> -
>>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>>  static inline bool arch_needs_pgtable_deposit(void)
>>  {
>> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>>       return true;
>>  }
>>
>> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>> +{
>> +     return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
>> +}
>
>
> We avoided setting _PAGE_SPECIAL on pmd entries. This will set that, we
> may want to check if it is ok.  IIRC, we overloaded _PAGE_SPECIAL at some point to indicate thp splitting. But good to double check.

I took a cursory look in arch/powerpc/ and mm/ and didn't find any
uses of _PAGE_SPECIAL on pmds. There's no good reason to set the flag
here though, so I'll remove it.
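
i.e. something like this (sketch only, assuming nothing else in the
series ends up needing the special bit):

static inline pte_t pte_mkdevmap(pte_t pte)
{
	/* only the devmap bit; _PAGE_SPECIAL dropped */
	return __pte(pte_val(pte) | _PAGE_DEVMAP);
}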

>> +
>> +static inline int pmd_devmap(pmd_t pmd)
>> +{
>> +     return pte_devmap(pmd_pte(pmd));
>> +}
>> +
>> +static inline int pud_devmap(pud_t pud)
>> +{
>> +     return 0;
>> +}
>> +
>> +static inline int pgd_devmap(pgd_t pgd)
>> +{
>> +     return 0;
>> +}
>> +
>>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>>  #endif /* __ASSEMBLY__ */
>>  #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
>> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
>> index ac16d1943022..ba43754e96d2 100644
>> --- a/arch/powerpc/include/asm/book3s/64/radix.h
>> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
>> @@ -252,7 +252,7 @@ static inline int radix__pgd_bad(pgd_t pgd)
>>
>>  static inline int radix__pmd_trans_huge(pmd_t pmd)
>>  {
>> -     return !!(pmd_val(pmd) & _PAGE_PTE);
>> +     return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
>>  }
>>
>>  static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
>> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
>> index a4f33de4008e..d9958af5c98e 100644
>> --- a/arch/powerpc/mm/hugetlbpage.c
>> +++ b/arch/powerpc/mm/hugetlbpage.c
>> @@ -963,7 +963,7 @@ pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
>>                       if (pmd_none(pmd))
>>                               return NULL;
>>
>> -                     if (pmd_trans_huge(pmd)) {
>> +                     if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) {
>>                               if (is_thp)
>>                                       *is_thp = true;
>>                               ret_pte = (pte_t *) pmdp;
>
>
> Is that correct? Do we want pmd_devmap to have is_thp set?

I think so. is_thp is used to differentiate between explicit and
transparent hugepages in the hash fault handler. The management and
fault handling of pmd devmap pages and THP are the same (by design),
while explicit hugepages have their own requirements. Most users of
find_linux_pte_or_hugepte() don't look at is_thp either, so it should
be safe.
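
To illustrate (sketch only: the helper name is made up, and I'm
assuming the current __find_linux_pte_or_hugepte() signature):

static pte_t *demo_lookup(struct mm_struct *mm, unsigned long ea)
{
	bool is_thp = false;
	unsigned int shift = 0;
	pte_t *ptep;

	/* callers that don't care about is_thp can just pass NULL */
	ptep = __find_linux_pte_or_hugepte(mm->pgd, ea, &is_thp, &shift);
	if (ptep && is_thp) {
		/* pmd-level mapping: either THP or a devmap page */
	}
	return ptep;
}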

>> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
>> index 5fcb3dd74c13..31eed8fa8e99 100644
>> --- a/arch/powerpc/mm/pgtable-book3s64.c
>> +++ b/arch/powerpc/mm/pgtable-book3s64.c
>> @@ -32,7 +32,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
>>  {
>>       int changed;
>>  #ifdef CONFIG_DEBUG_VM
>> -     WARN_ON(!pmd_trans_huge(*pmdp));
>> +     WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>>       assert_spin_locked(&vma->vm_mm->page_table_lock);
>>  #endif
>>       changed = !pmd_same(*(pmdp), entry);
>> @@ -59,7 +59,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
>>  #ifdef CONFIG_DEBUG_VM
>>       WARN_ON(pte_present(pmd_pte(*pmdp)) && !pte_protnone(pmd_pte(*pmdp)));
>>       assert_spin_locked(&mm->page_table_lock);
>> -     WARN_ON(!pmd_trans_huge(pmd));
>> +     WARN_ON(!(pmd_trans_huge(pmd) || pmd_devmap(pmd)));
>>  #endif
>>       trace_hugepage_set_pmd(addr, pmd_val(pmd));
>>       return set_pte_at(mm, addr, pmdp_ptep(pmdp), pmd_pte(pmd));
>> diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
>> index 8b85a14b08ea..7456cde4dbce 100644
>> --- a/arch/powerpc/mm/pgtable-hash64.c
>> +++ b/arch/powerpc/mm/pgtable-hash64.c
>> @@ -109,7 +109,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
>>       unsigned long old;
>>
>>  #ifdef CONFIG_DEBUG_VM
>> -     WARN_ON(!pmd_trans_huge(*pmdp));
>> +     WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>>       assert_spin_locked(&mm->page_table_lock);
>>  #endif
>>
>> @@ -141,6 +141,7 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres
>>
>>       VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>>       VM_BUG_ON(pmd_trans_huge(*pmdp));
>> +     VM_BUG_ON(pmd_devmap(*pmdp));
>>
>>       pmd = *pmdp;
>>       pmd_clear(pmdp);
>> @@ -221,6 +222,7 @@ void hash__pmdp_huge_split_prepare(struct vm_area_struct *vma,
>>  {
>>       VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>>       VM_BUG_ON(REGION_ID(address) != USER_REGION_ID);
>> +     VM_BUG_ON(pmd_devmap(*pmdp));
>>
>>       /*
>>        * We can't mark the pmd none here, because that will cause a race
>> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
>> index c28165d8970b..69e28dda81f2 100644
>> --- a/arch/powerpc/mm/pgtable-radix.c
>> +++ b/arch/powerpc/mm/pgtable-radix.c
>> @@ -683,7 +683,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add
>>       unsigned long old;
>>
>>  #ifdef CONFIG_DEBUG_VM
>> -     WARN_ON(!radix__pmd_trans_huge(*pmdp));
>> +     WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
>>       assert_spin_locked(&mm->page_table_lock);
>>  #endif
>>
>> @@ -701,6 +701,7 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre
>>
>>       VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>>       VM_BUG_ON(radix__pmd_trans_huge(*pmdp));
>> +     VM_BUG_ON(pmd_devmap(*pmdp));
>>       /*
>>        * khugepaged calls this for normal pmd
>>        */
>> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
>> index db93cf747a03..aefde9bd3110 100644
>> --- a/arch/powerpc/mm/pgtable_64.c
>> +++ b/arch/powerpc/mm/pgtable_64.c
>> @@ -323,7 +323,7 @@ struct page *pud_page(pud_t pud)
>>   */
>>  struct page *pmd_page(pmd_t pmd)
>>  {
>> -     if (pmd_trans_huge(pmd) || pmd_huge(pmd))
>> +     if (pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))
>>               return pte_page(pmd_pte(pmd));
>>       return virt_to_page(pmd_page_vaddr(pmd));
>>  }
>> --
>> 2.9.3
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE
  2017-05-23  4:05   ` Oliver O'Halloran
@ 2017-05-23  9:20     ` Balbir Singh
  -1 siblings, 0 replies; 31+ messages in thread
From: Balbir Singh @ 2017-05-23  9:20 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-mm, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)

On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran <oohall@gmail.com> wrote:
> Currently ZONE_DEVICE depends on X86_64. This is fine for now, but it
> will get unwieldy as new platforms get ZONE_DEVICE support. Move it
> to an arch-selected Kconfig option to save us some trouble in the
> future.
>
> Cc: x86@kernel.org
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/x86/Kconfig | 1 +
>  mm/Kconfig       | 5 ++++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cd18994a9555..acbb15234562 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -59,6 +59,7 @@ config X86
>         select ARCH_HAS_STRICT_KERNEL_RWX
>         select ARCH_HAS_STRICT_MODULE_RWX
>         select ARCH_HAS_UBSAN_SANITIZE_ALL
> +       select ARCH_HAS_ZONE_DEVICE             if X86_64
>         select ARCH_HAVE_NMI_SAFE_CMPXCHG
>         select ARCH_MIGHT_HAVE_ACPI_PDC         if ACPI
>         select ARCH_MIGHT_HAVE_PC_PARPORT
> diff --git a/mm/Kconfig b/mm/Kconfig
> index beb7a455915d..2d38a4abe957 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -683,12 +683,15 @@ config IDLE_PAGE_TRACKING
>
>           See Documentation/vm/idle_page_tracking.txt for more details.
>
> +config ARCH_HAS_ZONE_DEVICE
> +       def_bool n
> +
>  config ZONE_DEVICE
>         bool "Device memory (pmem, etc...) hotplug support"
>         depends on MEMORY_HOTPLUG
>         depends on MEMORY_HOTREMOVE
>         depends on SPARSEMEM_VMEMMAP
> -       depends on X86_64 #arch_add_memory() comprehends device memory
> +       depends on ARCH_HAS_ZONE_DEVICE
>
>         help
>           Device memory hotplug support allows for establishing pmem,

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/6] powerpc/mm: Enable ZONE_DEVICE on powerpc
  2017-05-23  4:05   ` Oliver O'Halloran
@ 2017-05-23  9:21     ` Balbir Singh
  -1 siblings, 0 replies; 31+ messages in thread
From: Balbir Singh @ 2017-05-23  9:21 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), linux-mm

On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran <oohall@gmail.com> wrote:
> Flip the switch. Running around and screaming "IT'S ALIVE" is optional,
> but recommended.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index f7c8f9972f61..bf3365c34244 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -138,6 +138,7 @@ config PPC
>         select ARCH_HAS_SG_CHAIN
>         select ARCH_HAS_TICK_BROADCAST          if GENERIC_CLOCKEVENTS_BROADCAST
>         select ARCH_HAS_UBSAN_SANITIZE_ALL
> +       select ARCH_HAS_ZONE_DEVICE             if PPC64

Does this work for Book E as well?

Balbir Singh.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/6] powerpc/vmemmap: Add altmap support
  2017-05-23  4:05   ` Oliver O'Halloran
@ 2017-05-23  9:25     ` Balbir Singh
  -1 siblings, 0 replies; 31+ messages in thread
From: Balbir Singh @ 2017-05-23  9:25 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT), linux-mm

On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran <oohall@gmail.com> wrote:
> Adds support to powerpc for the altmap feature of ZONE_DEVICE memory. An
> altmap is a driver-provided region that is used to provide the backing
> storage for the struct pages of ZONE_DEVICE memory. In situations where
> large amounts of ZONE_DEVICE memory are being added to the system, the
> altmap reduces pressure on main system memory by allowing the mm
> metadata to be stored on the device itself rather than in main memory.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
>  arch/powerpc/mm/init_64.c | 15 +++++++++++++--
>  arch/powerpc/mm/mem.c     | 16 +++++++++++++---
>  2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 8851e4f5dbab..225fbb8034e6 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -44,6 +44,7 @@
>  #include <linux/slab.h>
>  #include <linux/of_fdt.h>
>  #include <linux/libfdt.h>
> +#include <linux/memremap.h>
>
>  #include <asm/pgalloc.h>
>  #include <asm/page.h>
> @@ -171,13 +172,17 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>         pr_debug("vmemmap_populate %lx..%lx, node %d\n", start, end, node);
>
>         for (; start < end; start += page_size) {
> +               struct vmem_altmap *altmap;
>                 void *p;
>                 int rc;
>
>                 if (vmemmap_populated(start, page_size))
>                         continue;
>
> -               p = vmemmap_alloc_block(page_size, node);
> +               /* altmap lookups only work at section boundaries */
> +               altmap = to_vmem_altmap(SECTION_ALIGN_DOWN(start));
> +
> +               p =  __vmemmap_alloc_block_buf(page_size, node, altmap);
>                 if (!p)
>                         return -ENOMEM;
>
> @@ -242,6 +247,8 @@ void __ref vmemmap_free(unsigned long start, unsigned long end)
>
>         for (; start < end; start += page_size) {
>                 unsigned long nr_pages, addr;
> +               struct vmem_altmap *altmap;
> +               struct page *section_base;
>                 struct page *page;
>
>                 /*
> @@ -257,9 +264,13 @@ void __ref vmemmap_free(unsigned long start, unsigned long end)
>                         continue;
>
>                 page = pfn_to_page(addr >> PAGE_SHIFT);
> +               section_base = pfn_to_page(vmemmap_section_start(start));
>                 nr_pages = 1 << page_order;
>
> -               if (PageReserved(page)) {
> +               altmap = to_vmem_altmap((unsigned long) section_base);
> +               if (altmap) {
> +                       vmem_altmap_free(altmap, nr_pages);
> +               } else if (PageReserved(page)) {
>                         /* allocated from bootmem */
>                         if (page_size < PAGE_SIZE) {
>                                 /*
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 9ee536ec0739..2c0c16f11eee 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -36,6 +36,7 @@
>  #include <linux/hugetlb.h>
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
> +#include <linux/memremap.h>
>
>  #include <asm/pgalloc.h>
>  #include <asm/prom.h>
> @@ -159,11 +160,20 @@ int arch_remove_memory(u64 start, u64 size)
>  {
>         unsigned long start_pfn = start >> PAGE_SHIFT;
>         unsigned long nr_pages = size >> PAGE_SHIFT;
> -       struct zone *zone;
> +       struct vmem_altmap *altmap;
> +       struct page *page;
>         int ret;
>
> -       zone = page_zone(pfn_to_page(start_pfn));
> -       ret = __remove_pages(zone, start_pfn, nr_pages);
> +       /*
> +        * If we have an altmap then we need to skip over any reserved PFNs
> +        * when querying the zone.
> +        */
> +       page = pfn_to_page(start_pfn);
> +       altmap = to_vmem_altmap((unsigned long) page);
> +       if (altmap)
> +               page += vmem_altmap_offset(altmap);
> +
> +       ret = __remove_pages(page_zone(page), start_pfn, nr_pages);
>         if (ret)
>                 return ret;

Reviewed-by: Balbir Singh <bsingharora@gmail.com>

Balbir
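
For context, the driver side that the commit message alludes to looks
roughly like this (sketch only: field values and the reserve variable
are illustrative, assuming the current devm_memremap_pages() /
struct vmem_altmap API):

struct vmem_altmap altmap = {
	.base_pfn = PHYS_PFN(res->start),	/* first pfn of the device */
	.reserve  = nd_pfn_reserve,	/* hypothetical: pfns set aside for metadata */
};

addr = devm_memremap_pages(dev, res, &pgmap_ref, &altmap);

With that in place, vmemmap_populate() above allocates the struct page
storage from the device via __vmemmap_alloc_block_buf() instead of
from main memory.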

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] powerpc/mm: Wire up hpte_removebolted for powernv
  2017-05-23  4:05 ` Oliver O'Halloran
@ 2017-05-23  9:27   ` Balbir Singh
  -1 siblings, 0 replies; 31+ messages in thread
From: Balbir Singh @ 2017-05-23  9:27 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-mm, Anton Blanchard

On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran <oohall@gmail.com> wrote:
> From: Anton Blanchard <anton@samba.org>
>
> Adds support for removing bolted (i.e kernel linear mapping) mappings on
> powernv. This is needed to support memory hot unplug operations which
> are required for the teardown of DAX/PMEM devices.
>
> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
> Signed-off-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v1 -> v2: Fixed the commit author
>           Added VM_WARN_ON() if we attempt to remove an unbolted hpte
> ---
>  arch/powerpc/mm/hash_native_64.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
> index 65bb8f33b399..b534d041cfe8 100644
> --- a/arch/powerpc/mm/hash_native_64.c
> +++ b/arch/powerpc/mm/hash_native_64.c
> @@ -407,6 +407,38 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
>         tlbie(vpn, psize, psize, ssize, 0);
>  }
>
> +/*
> + * Remove a bolted kernel entry. Memory hotplug uses this.
> + *
> + * No need to lock here because we should be the only user.
> + */
> +static int native_hpte_removebolted(unsigned long ea, int psize, int ssize)
> +{
> +       unsigned long vpn;
> +       unsigned long vsid;
> +       long slot;
> +       struct hash_pte *hptep;
> +
> +       vsid = get_kernel_vsid(ea, ssize);
> +       vpn = hpt_vpn(ea, vsid, ssize);
> +
> +       slot = native_hpte_find(vpn, psize, ssize);
> +       if (slot == -1)
> +               return -ENOENT;
> +
> +       hptep = htab_address + slot;
> +
> +       VM_WARN_ON(!(be64_to_cpu(hptep->v) & HPTE_V_BOLTED));
> +
> +       /* Invalidate the hpte */
> +       hptep->v = 0;
> +
> +       /* Invalidate the TLB */
> +       tlbie(vpn, psize, psize, ssize, 0);
> +       return 0;
> +}
> +

Reviewed-by: Balbir Singh <bsingharora@gmail.com>

Balbir Singh.
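
FWIW, the consumer side is roughly the following (sketch, mirroring
what htab_remove_mapping() does when tearing down the linear mapping
during hot unplug):

unsigned long vaddr;
unsigned int step = 1 << mmu_psize_defs[psize].shift;
int rc;

for (vaddr = vstart; vaddr < vend; vaddr += step) {
	rc = mmu_hash_ops.hpte_removebolted(vaddr, psize, ssize);
	if (rc == -ENOENT)
		continue;	/* tolerate entries that are already gone */
	if (rc < 0)
		return rc;
}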

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64
  2017-05-23  4:05   ` Oliver O'Halloran
@ 2017-05-23 10:40     ` Balbir Singh
  -1 siblings, 0 replies; 31+ messages in thread
From: Balbir Singh @ 2017-05-23 10:40 UTC (permalink / raw)
  To: Oliver O'Halloran
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-mm, Aneesh Kumar K . V

On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran <oohall@gmail.com> wrote:
> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
> is used to differentiate device backed memory from transparent huge
> pages since they are handled in more or less the same manner by the core
> mm code.
>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>
> Aneesh, this has been fleshed out substantially since v1. Can you
> re-review it? Also no explicit gup support is required in this patch
> since devmap support was added generic GUP as a part of making x86 use
> the generic version.
> ---
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 ++++++++++++++++++++++++++-
>  arch/powerpc/include/asm/book3s/64/radix.h    |  2 +-
>  arch/powerpc/mm/hugetlbpage.c                 |  2 +-
>  arch/powerpc/mm/pgtable-book3s64.c            |  4 +--
>  arch/powerpc/mm/pgtable-hash64.c              |  4 ++-
>  arch/powerpc/mm/pgtable-radix.c               |  3 ++-
>  arch/powerpc/mm/pgtable_64.c                  |  2 +-
>  8 files changed, 47 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837aaae8..eaaf613c5347 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>   */
>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>  {
> -       return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
> +       return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
>                   (_PAGE_PTE | H_PAGE_THP_HUGE));
>  }

As Aneesh suggested, I think we can skip adding _PAGE_DEVMAP to this check.

>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 85bc9875c3be..24634e92dd0b 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -79,6 +79,9 @@
>
>  #define _PAGE_SOFT_DIRTY       _RPAGE_SW3 /* software: software dirty tracking */
>  #define _PAGE_SPECIAL          _RPAGE_SW2 /* software: special page */
> +#define _PAGE_DEVMAP           _RPAGE_SW1
> +#define __HAVE_ARCH_PTE_DEVMAP
> +
>  /*
>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>   * Instead of fixing all of them, add an alternate define which
> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>         return pte;
>  }
>
> +static inline pte_t pte_mkdevmap(pte_t pte)
> +{
> +       return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
> +}
> +
> +static inline int pte_devmap(pte_t pte)
> +{
> +       return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
> +}
> +
>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>  {
>         /* FIXME!! check whether this need to be a conditional */
> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>  #define pmd_mk_savedwrite(pmd) pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>  #define pmd_clear_savedwrite(pmd)      pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>
> +#define pud_pfn(...) (0)
> +#define pgd_pfn(...) (0)
> +

I don't get these bits... why are they zero?

>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
>         return true;
>  }
>
> -
>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>  static inline bool arch_needs_pgtable_deposit(void)
>  {
> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>         return true;
>  }
>
> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
> +{
> +       return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
> +}
> +
> +static inline int pmd_devmap(pmd_t pmd)
> +{
> +       return pte_devmap(pmd_pte(pmd));
> +}

This should be defined only under #ifdef __HAVE_ARCH_PTE_DEVMAP.

The rest looks OK

Balbir Singh.
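
FWIW, the invariant v2 of this patch establishes can be summed up as
follows (illustrative only; the generic mm code relies on the two
predicates being mutually exclusive):

/* a pmd is either trans-huge or devmap, never both */
pmd = pmd_mkdevmap(pmd_mkhuge(pmd));
VM_BUG_ON(pmd_trans_huge(pmd) && pmd_devmap(pmd));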

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] powerpc/mm: Add devmap support for ppc64
  2017-05-23 10:40     ` Balbir Singh
@ 2017-05-24  2:17       ` Oliver O'Halloran
  -1 siblings, 0 replies; 31+ messages in thread
From: Oliver O'Halloran @ 2017-05-24  2:17 UTC (permalink / raw)
  To: Balbir Singh
  Cc: open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	linux-mm, Aneesh Kumar K . V

On Tue, May 23, 2017 at 8:40 PM, Balbir Singh <bsingharora@gmail.com> wrote:
> On Tue, May 23, 2017 at 2:05 PM, Oliver O'Halloran <oohall@gmail.com> wrote:
>> Add support for the devmap bit on PTEs and PMDs for PPC64 Book3S.  This
>> is used to differentiate device backed memory from transparent huge
>> pages since they are handled in more or less the same manner by the core
>> mm code.
>>
>> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>> ---
>> v1 -> v2: Properly differentiate THP and PMD Devmap entries. The
>> mm core assumes that pmd_trans_huge() and pmd_devmap() are mutually
>> exclusive and v1 had pmd_trans_huge() being true on a devmap pmd.
>>
>> Aneesh, this has been fleshed out substantially since v1. Can you
>> re-review it? Also no explicit gup support is required in this patch
>> since devmap support was added to generic GUP as part of making x86 use
>> the generic version.
>> ---
>>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  2 +-
>>  arch/powerpc/include/asm/book3s/64/pgtable.h  | 37 ++++++++++++++++++++++++++-
>>  arch/powerpc/include/asm/book3s/64/radix.h    |  2 +-
>>  arch/powerpc/mm/hugetlbpage.c                 |  2 +-
>>  arch/powerpc/mm/pgtable-book3s64.c            |  4 +--
>>  arch/powerpc/mm/pgtable-hash64.c              |  4 ++-
>>  arch/powerpc/mm/pgtable-radix.c               |  3 ++-
>>  arch/powerpc/mm/pgtable_64.c                  |  2 +-
>>  8 files changed, 47 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> index 9732837aaae8..eaaf613c5347 100644
>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>> @@ -180,7 +180,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>>   */
>>  static inline int hash__pmd_trans_huge(pmd_t pmd)
>>  {
>> -       return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
>> +       return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
>>                   (_PAGE_PTE | H_PAGE_THP_HUGE));
>>  }
>
> Like Aneesh suggested, I think we can probably skip this check here
>
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 85bc9875c3be..24634e92dd0b 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -79,6 +79,9 @@
>>
>>  #define _PAGE_SOFT_DIRTY       _RPAGE_SW3 /* software: software dirty tracking */
>>  #define _PAGE_SPECIAL          _RPAGE_SW2 /* software: special page */
>> +#define _PAGE_DEVMAP           _RPAGE_SW1
>> +#define __HAVE_ARCH_PTE_DEVMAP
>> +
>>  /*
>>   * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
>>   * Instead of fixing all of them, add an alternate define which
>> @@ -599,6 +602,16 @@ static inline pte_t pte_mkhuge(pte_t pte)
>>         return pte;
>>  }
>>
>> +static inline pte_t pte_mkdevmap(pte_t pte)
>> +{
>> +       return __pte(pte_val(pte) | _PAGE_SPECIAL|_PAGE_DEVMAP);
>> +}
>> +
>> +static inline int pte_devmap(pte_t pte)
>> +{
>> +       return !!(pte_raw(pte) & cpu_to_be64(_PAGE_DEVMAP));
>> +}
>> +
>>  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
>>  {
>>         /* FIXME!! check whether this need to be a conditional */
>> @@ -963,6 +976,9 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
>>  #define pmd_mk_savedwrite(pmd) pte_pmd(pte_mk_savedwrite(pmd_pte(pmd)))
>>  #define pmd_clear_savedwrite(pmd)      pte_pmd(pte_clear_savedwrite(pmd_pte(pmd)))
>>
>> +#define pud_pfn(...) (0)
>> +#define pgd_pfn(...) (0)
>> +
>
> I don't get these bits... why are they zero?

I think that was just hacking stuff until it worked. pud_pfn() needs
to exist for the kernel to build when __HAVE_ARCH_PTE_DEVMAP is set,
but we don't need it to do anything (yet) since pud_pfn() is only used
for handling devmap PUD faults. We don't currently support those, so
we will never hit that code path. pgd_pfn() can die though.
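
Just as a sketch (this is not what the patch does), a stub along these
lines would be enough to keep the build happy while making the
supposedly unreachable path loud:

	/*
	 * Hypothetical stub -- not from the series. Devmap PUD faults
	 * are never enabled here, so this should be unreachable; warn
	 * if it somehow runs.
	 */
	static inline unsigned long pud_pfn(pud_t pud)
	{
		WARN_ONCE(1, "pud_pfn(): devmap PUD faults are not supported\n");
		return 0;
	}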

>>  #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
>>  #define pmd_soft_dirty(pmd)    pte_soft_dirty(pmd_pte(pmd))
>>  #define pmd_mksoft_dirty(pmd)  pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)))
>> @@ -1137,7 +1153,6 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
>>         return true;
>>  }
>>
>> -
>>  #define arch_needs_pgtable_deposit arch_needs_pgtable_deposit
>>  static inline bool arch_needs_pgtable_deposit(void)
>>  {
>> @@ -1146,6 +1161,26 @@ static inline bool arch_needs_pgtable_deposit(void)
>>         return true;
>>  }
>>
>> +static inline pmd_t pmd_mkdevmap(pmd_t pmd)
>> +{
>> +       return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
>> +}
>> +
>> +static inline int pmd_devmap(pmd_t pmd)
>> +{
>> +       return pte_devmap(pmd_pte(pmd));
>> +}
>
> These should be defined only under #ifdef __HAVE_ARCH_PTE_DEVMAP

ok
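
To be concrete, the guarded form would look something like this (the
bodies are unchanged from the patch; only the #ifdef is new):

#ifdef __HAVE_ARCH_PTE_DEVMAP
static inline pmd_t pmd_mkdevmap(pmd_t pmd)
{
	return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
}

static inline int pmd_devmap(pmd_t pmd)
{
	return pte_devmap(pmd_pte(pmd));
}
#endif /* __HAVE_ARCH_PTE_DEVMAP */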

>
> The rest looks OK
>
> Balbir Singh.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] powerpc/mm: Wire up hpte_removebolted for powernv
  2017-05-23  4:05 ` Oliver O'Halloran
                   ` (6 preceding siblings ...)
  (?)
@ 2017-05-25  0:02 ` Rashmica Gupta
  -1 siblings, 0 replies; 31+ messages in thread
From: Rashmica Gupta @ 2017-05-25  0:02 UTC (permalink / raw)
  To: Oliver O'Halloran, linuxppc-dev

Looks good to me


On 23/05/17 14:05, Oliver O'Halloran wrote:
> From: Anton Blanchard <anton@samba.org>
>
> Adds support for removing bolted (i.e. kernel linear mapping) mappings on
> powernv. This is needed to support memory hot unplug operations which
> are required for the teardown of DAX/PMEM devices.
>
> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
^^ x2
> Signed-off-by: Anton Blanchard <anton@samba.org>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> v1 -> v2: Fixed the commit author
>            Added VM_WARN_ON() if we attempt to remove an unbolted hpte
> ---
>   arch/powerpc/mm/hash_native_64.c | 33 +++++++++++++++++++++++++++++++++
>   1 file changed, 33 insertions(+)
>
> diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
> index 65bb8f33b399..b534d041cfe8 100644
> --- a/arch/powerpc/mm/hash_native_64.c
> +++ b/arch/powerpc/mm/hash_native_64.c
> @@ -407,6 +407,38 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea,
>   	tlbie(vpn, psize, psize, ssize, 0);
>   }
>   
> +/*
> + * Remove a bolted kernel entry. Memory hotplug uses this.
> + *
> + * No need to lock here because we should be the only user.
> + */
> +static int native_hpte_removebolted(unsigned long ea, int psize, int ssize)
> +{
> +	unsigned long vpn;
> +	unsigned long vsid;
> +	long slot;
> +	struct hash_pte *hptep;
> +
> +	vsid = get_kernel_vsid(ea, ssize);
> +	vpn = hpt_vpn(ea, vsid, ssize);
> +
> +	slot = native_hpte_find(vpn, psize, ssize);
> +	if (slot == -1)
> +		return -ENOENT;
> +
> +	hptep = htab_address + slot;
> +
> +	VM_WARN_ON(!(be64_to_cpu(hptep->v) & HPTE_V_BOLTED));
> +
> +	/* Invalidate the hpte */
> +	hptep->v = 0;
> +
> +	/* Invalidate the TLB */
> +	tlbie(vpn, psize, psize, ssize, 0);
> +	return 0;
> +}
> +
> +
>   static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
>   				   int bpsize, int apsize, int ssize, int local)
>   {
> @@ -725,6 +757,7 @@ void __init hpte_init_native(void)
>   	mmu_hash_ops.hpte_invalidate	= native_hpte_invalidate;
>   	mmu_hash_ops.hpte_updatepp	= native_hpte_updatepp;
>   	mmu_hash_ops.hpte_updateboltedpp = native_hpte_updateboltedpp;
> +	mmu_hash_ops.hpte_removebolted = native_hpte_removebolted;
>   	mmu_hash_ops.hpte_insert	= native_hpte_insert;
>   	mmu_hash_ops.hpte_remove	= native_hpte_remove;
>   	mmu_hash_ops.hpte_clear_all	= native_hpte_clear;

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2017-05-25  0:02 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-23  4:05 [PATCH 1/6] powerpc/mm: Wire up hpte_removebolted for powernv Oliver O'Halloran
2017-05-23  4:05 ` Oliver O'Halloran
2017-05-23  4:05 ` [PATCH 2/6] powerpc/vmemmap: Reshuffle vmemmap_free() Oliver O'Halloran
2017-05-23  4:05   ` Oliver O'Halloran
2017-05-23  4:05 ` [PATCH 3/6] powerpc/vmemmap: Add altmap support Oliver O'Halloran
2017-05-23  4:05   ` Oliver O'Halloran
2017-05-23  9:25   ` Balbir Singh
2017-05-23  9:25     ` Balbir Singh
2017-05-23  4:05 ` [PATCH 4/6] powerpc/mm: Add devmap support for ppc64 Oliver O'Halloran
2017-05-23  4:05   ` Oliver O'Halloran
2017-05-23  4:23   ` Aneesh Kumar K.V
2017-05-23  4:23     ` Aneesh Kumar K.V
2017-05-23  6:42     ` Oliver O'Halloran
2017-05-23  6:42       ` Oliver O'Halloran
2017-05-23 10:40   ` Balbir Singh
2017-05-23 10:40     ` Balbir Singh
2017-05-24  2:17     ` Oliver O'Halloran
2017-05-24  2:17       ` Oliver O'Halloran
2017-05-23  4:05 ` [PATCH 5/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE Oliver O'Halloran
2017-05-23  4:05   ` Oliver O'Halloran
2017-05-23  6:42   ` Ingo Molnar
2017-05-23  6:42     ` Ingo Molnar
2017-05-23  9:20   ` Balbir Singh
2017-05-23  9:20     ` Balbir Singh
2017-05-23  4:05 ` [PATCH 6/6] powerpc/mm: Enable ZONE_DEVICE on powerpc Oliver O'Halloran
2017-05-23  4:05   ` Oliver O'Halloran
2017-05-23  9:21   ` Balbir Singh
2017-05-23  9:21     ` Balbir Singh
2017-05-23  9:27 ` [PATCH 1/6] powerpc/mm: Wire up hpte_removebolted for powernv Balbir Singh
2017-05-23  9:27   ` Balbir Singh
2017-05-25  0:02 ` Rashmica Gupta
