linux-kernel.vger.kernel.org archive mirror
* [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware
@ 2016-03-09 12:10 Anshuman Khandual
  2016-03-09 12:10 ` [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function Anshuman Khandual
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

Currently, neither of the ARCH_WANT_GENERAL_HUGETLB functions 'huge_pte_alloc'
and 'huge_pte_offset' takes huge page implementations at the PGD level into
account. With PGD awareness added to these functions, more architectures like
POWER, which implements huge pages at the PGD level (along with the PMD
level), can use the ARCH_WANT_GENERAL_HUGETLB option.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 mm/hugetlb.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 01f2b48..a478b7b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4251,6 +4251,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 	pte_t *pte = NULL;
 
 	pgd = pgd_offset(mm, addr);
+	if (sz == PGDIR_SIZE) {
+		pte = (pte_t *)pgd;
+		goto huge_pgd;
+	}
+
 	pud = pud_alloc(mm, pgd, addr);
 	if (pud) {
 		if (sz == PUD_SIZE) {
@@ -4263,6 +4268,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
 				pte = (pte_t *)pmd_alloc(mm, pud, addr);
 		}
 	}
+
+huge_pgd:
 	BUG_ON(pte && !pte_none(*pte) && !pte_huge(*pte));
 
 	return pte;
@@ -4276,6 +4283,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 
 	pgd = pgd_offset(mm, addr);
 	if (pgd_present(*pgd)) {
+		if (pgd_huge(*pgd))
+			return (pte_t *)pgd;
 		pud = pud_offset(pgd, addr);
 		if (pud_present(*pud)) {
 			if (pud_huge(*pud))
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-11  3:02   ` Anshuman Khandual
  2016-03-09 12:10 ` [RFC 3/9] mm/gup: Make follow_page_mask function PGD implementation aware Anshuman Khandual
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

This just adds the 'follow_huge_pgd' function, which will be used
later in this series to make the 'follow_page_mask' function aware
of PGD based huge page implementations.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            | 10 ++++++++++
 2 files changed, 13 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7d953c2..71832e1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -115,6 +115,8 @@ struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 				pmd_t *pmd, int flags);
 struct page *follow_huge_pud(struct mm_struct *mm, unsigned long address,
 				pud_t *pud, int flags);
+struct page *follow_huge_pgd(struct mm_struct *mm, unsigned long address,
+				pgd_t *pgd, int flags);
 int pmd_huge(pmd_t pmd);
 int pud_huge(pud_t pmd);
 unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
@@ -143,6 +145,7 @@ static inline void hugetlb_show_meminfo(void)
 }
 #define follow_huge_pmd(mm, addr, pmd, flags)	NULL
 #define follow_huge_pud(mm, addr, pud, flags)	NULL
+#define follow_huge_pgd(mm, addr, pgd, flags)	NULL
 #define prepare_hugepage_range(file, addr, len)	(-EINVAL)
 #define pmd_huge(x)	0
 #define pud_huge(x)	0
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a478b7b..844c18f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4353,6 +4353,16 @@ follow_huge_pud(struct mm_struct *mm, unsigned long address,
 	return pte_page(*(pte_t *)pud) + ((address & ~PUD_MASK) >> PAGE_SHIFT);
 }
 
+struct page * __weak
+follow_huge_pgd(struct mm_struct *mm, unsigned long address,
+		pgd_t *pgd, int flags)
+{
+	if (flags & FOLL_GET)
+		return NULL;
+
+	return pte_page(*(pte_t *)pgd) + ((address & ~PGDIR_MASK) >> PAGE_SHIFT);
+}
+
 #ifdef CONFIG_MEMORY_FAILURE
 
 /*
-- 
2.1.0


* [RFC 3/9] mm/gup: Make follow_page_mask function PGD implementation aware
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
  2016-03-09 12:10 ` [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-11  3:03   ` Anshuman Khandual
  2016-03-09 12:10 ` [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K Anshuman Khandual
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

Currently the function 'follow_page_mask' does not take PGD based
huge page implementations into account. This change adds that
awareness and makes the walk complete.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 mm/gup.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index 7bf19ff..53a2013 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -232,6 +232,12 @@ struct page *follow_page_mask(struct vm_area_struct *vma,
 	pgd = pgd_offset(mm, address);
 	if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd)))
 		return no_page_table(vma, flags);
+	if (pgd_huge(*pgd) && vma->vm_flags & VM_HUGETLB) {
+		page = follow_huge_pgd(mm, address, pgd, flags);
+		if (page)
+			return page;
+		return no_page_table(vma, flags);
+	}
 
 	pud = pud_offset(pgd, address);
 	if (pud_none(*pud))
-- 
2.1.0


* [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
  2016-03-09 12:10 ` [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function Anshuman Khandual
  2016-03-09 12:10 ` [RFC 3/9] mm/gup: Make follow_page_mask function PGD implementation aware Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-09 19:55   ` Aneesh Kumar K.V
  2016-03-09 12:10 ` [RFC 5/9] powerpc/mm: Split huge_pte_offset " Anshuman Khandual
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

Currently the 'huge_pte_alloc' function has two versions, one for the
BOOK3S and the other one for the BOOK3E platforms. This change splits
the BOOK3S version into two parts, one for the 4K page size based
implementation and the other one for the 64K page size based
implementation. This change is one of the prerequisites towards enabling
the GENERAL_HUGETLB implementation for BOOK3S 64K based huge pages.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 67 +++++++++++++++++++++++++++----------------
 1 file changed, 43 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 744e24b..a49c6ae 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -59,6 +59,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 	return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL);
 }
 
+#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
 static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 			   unsigned long address, unsigned pdshift, unsigned pshift)
 {
@@ -117,6 +118,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 	spin_unlock(&mm->page_table_lock);
 	return 0;
 }
+#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
 
 /*
  * These macros define how to determine which level of the page table holds
@@ -131,6 +133,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 #endif
 
 #ifdef CONFIG_PPC_BOOK3S_64
+#ifdef CONFIG_PPC_4K_PAGES
 /*
  * At this point we do the placement change only for BOOK3S 64. This would
  * possibly work on other subarchs.
@@ -146,32 +149,23 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 
 	addr &= ~(sz-1);
 	pg = pgd_offset(mm, addr);
-
-	if (pshift == PGDIR_SHIFT)
-		/* 16GB huge page */
-		return (pte_t *) pg;
-	else if (pshift > PUD_SHIFT)
-		/*
-		 * We need to use hugepd table
-		 */
+	if (pshift > PUD_SHIFT) {
 		hpdp = (hugepd_t *)pg;
-	else {
-		pdshift = PUD_SHIFT;
-		pu = pud_alloc(mm, pg, addr);
-		if (pshift == PUD_SHIFT)
-			return (pte_t *)pu;
-		else if (pshift > PMD_SHIFT)
-			hpdp = (hugepd_t *)pu;
-		else {
-			pdshift = PMD_SHIFT;
-			pm = pmd_alloc(mm, pu, addr);
-			if (pshift == PMD_SHIFT)
-				/* 16MB hugepage */
-				return (pte_t *)pm;
-			else
-				hpdp = (hugepd_t *)pm;
-		}
+		goto hugepd_search;
+	}
+
+	pdshift = PUD_SHIFT;
+	pu = pud_alloc(mm, pg, addr);
+	if (pshift > PMD_SHIFT) {
+		hpdp = (hugepd_t *)pu;
+		goto hugepd_search;
 	}
+
+	pdshift = PMD_SHIFT;
+	pm = pmd_alloc(mm, pu, addr);
+	hpdp = (hugepd_t *)pm;
+
+hugepd_search:
 	if (!hpdp)
 		return NULL;
 
@@ -184,6 +178,31 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
 }
 
 #else
+pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
+{
+	pgd_t *pg;
+	pud_t *pu;
+	pmd_t *pm;
+	unsigned pshift = __ffs(sz);
+
+	addr &= ~(sz-1);
+	pg = pgd_offset(mm, addr);
+
+	if (pshift == PGDIR_SHIFT)	/* 16GB Huge Page */
+		return (pte_t *)pg;
+
+	pu = pud_alloc(mm, pg, addr);	/* NA, skipped */
+	if (pshift == PUD_SHIFT)
+		return (pte_t *)pu;
+
+	pm = pmd_alloc(mm, pu, addr);	/* 16MB Huge Page */
+	if (pshift == PMD_SHIFT)
+		return (pte_t *)pm;
+
+	return NULL;
+}
+#endif
+#else
 
 pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
 {
-- 
2.1.0


* [RFC 5/9] powerpc/mm: Split huge_pte_offset function for BOOK3S 64K
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
                   ` (2 preceding siblings ...)
  2016-03-09 12:10 ` [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-09 22:57   ` Dave Hansen
  2016-03-09 12:10 ` [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB " Anshuman Khandual
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

Currently the 'huge_pte_offset' function has only one version for
all configurations and platforms. This change splits the function
into two versions, one for the 64K page size based BOOK3S implementation
and the other one for everything else. This change is also one of the
prerequisites towards enabling the GENERAL_HUGETLB implementation for
BOOK3S 64K based huge pages.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index a49c6ae..f834a74 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -53,11 +53,46 @@ static unsigned nr_gpages;
 
 #define hugepd_none(hpd)	((hpd).pd == 0)
 
+#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 {
 	/* Only called for hugetlbfs pages, hence can ignore THP */
 	return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL);
 }
+#else
+pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
+{
+	pgd_t pgd, *pgdp;
+	pud_t pud, *pudp;
+	pmd_t pmd, *pmdp;
+
+	pgdp = mm->pgd + pgd_index(addr);
+	pgd  = READ_ONCE(*pgdp);
+
+	if (pgd_none(pgd))
+		return NULL;
+
+	if (pgd_huge(pgd))
+		return (pte_t *)pgdp;
+
+	pudp = pud_offset(&pgd, addr);
+	pud  = READ_ONCE(*pudp);
+	if (pud_none(pud))
+		return NULL;
+
+	if (pud_huge(pud))
+		return (pte_t *)pudp;
+
+	pmdp = pmd_offset(&pud, addr);
+	pmd  = READ_ONCE(*pmdp);
+	if (pmd_none(pmd))
+		return NULL;
+
+	if (pmd_huge(pmd))
+		return (pte_t *)pmdp;
+	return NULL;
+}
+#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
 
 #if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
 static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
-- 
2.1.0


* [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
                   ` (3 preceding siblings ...)
  2016-03-09 12:10 ` [RFC 5/9] powerpc/mm: Split huge_pte_offset " Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-09 19:58   ` Aneesh Kumar K.V
  2016-03-21  9:55   ` Rui Teng
  2016-03-09 12:10 ` [RFC 7/9] powerpc/hugetlb: Change follow_huge_* routines " Anshuman Khandual
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

This enables ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K in Kconfig.
It also implements a new function 'pte_huge' which is required by
the generic VM function 'huge_pte_alloc'. The existing BOOK3S 64K
specific functions 'huge_pte_alloc' and 'huge_pte_offset', which
are no longer required, are removed with this change.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                          |  4 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 ++++
 arch/powerpc/mm/hugetlbpage.c                 | 60 ---------------------------
 3 files changed, 12 insertions(+), 60 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9faa18c..c6920bb 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -33,6 +33,10 @@ config HAVE_SETUP_PER_CPU_AREA
 config NEED_PER_CPU_EMBED_FIRST_CHUNK
 	def_bool PPC64
 
+config ARCH_WANT_GENERAL_HUGETLB
+	depends on PPC_64K_PAGES && PPC_BOOK3S_64
+	def_bool y
+
 config NR_IRQS
 	int "Number of virtual interrupt numbers"
 	range 32 32768
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 849bbec..5e9b9b9 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -143,6 +143,14 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
  * Defined in such a way that we can optimize away code block at build time
  * if CONFIG_HUGETLB_PAGE=n.
  */
+static inline int pte_huge(pte_t pte)
+{
+	/*
+	 * leaf pte for huge page
+	 */
+	return !!(pte_val(pte) & _PAGE_PTE);
+}
+
 static inline int pmd_huge(pmd_t pmd)
 {
 	/*
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index f834a74..f6e4712 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -59,42 +59,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 	/* Only called for hugetlbfs pages, hence can ignore THP */
 	return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL);
 }
-#else
-pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
-{
-	pgd_t pgd, *pgdp;
-	pud_t pud, *pudp;
-	pmd_t pmd, *pmdp;
-
-	pgdp = mm->pgd + pgd_index(addr);
-	pgd  = READ_ONCE(*pgdp);
-
-	if (pgd_none(pgd))
-		return NULL;
-
-	if (pgd_huge(pgd))
-		return (pte_t *)pgdp;
-
-	pudp = pud_offset(&pgd, addr);
-	pud  = READ_ONCE(*pudp);
-	if (pud_none(pud))
-		return NULL;
-
-	if (pud_huge(pud))
-		return (pte_t *)pudp;
 
-	pmdp = pmd_offset(&pud, addr);
-	pmd  = READ_ONCE(*pmdp);
-	if (pmd_none(pmd))
-		return NULL;
-
-	if (pmd_huge(pmd))
-		return (pte_t *)pmdp;
-	return NULL;
-}
-#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
-
-#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
 static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
 			   unsigned long address, unsigned pdshift, unsigned pshift)
 {
@@ -211,31 +176,6 @@ hugepd_search:
 
 	return hugepte_offset(*hpdp, addr, pdshift);
 }
-
-#else
-pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
-{
-	pgd_t *pg;
-	pud_t *pu;
-	pmd_t *pm;
-	unsigned pshift = __ffs(sz);
-
-	addr &= ~(sz-1);
-	pg = pgd_offset(mm, addr);
-
-	if (pshift == PGDIR_SHIFT)	/* 16GB Huge Page */
-		return (pte_t *)pg;
-
-	pu = pud_alloc(mm, pg, addr);	/* NA, skipped */
-	if (pshift == PUD_SHIFT)
-		return (pte_t *)pu;
-
-	pm = pmd_alloc(mm, pu, addr);	/* 16MB Huge Page */
-	if (pshift == PMD_SHIFT)
-		return (pte_t *)pm;
-
-	return NULL;
-}
 #endif
 #else
 
-- 
2.1.0


* [RFC 7/9] powerpc/hugetlb: Change follow_huge_* routines for BOOK3S 64K
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
                   ` (4 preceding siblings ...)
  2016-03-09 12:10 ` [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB " Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-09 12:10 ` [RFC 8/9] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

With this change, BOOK3S 64K platforms no longer use the
'follow_huge_addr' function; it always returns ERR_PTR(-EINVAL), hence
skipping the BUG_ON(flags & FOLL_GET) test in the 'follow_page_mask'
function. These platforms then fall back on the generic follow_huge_*
functions for everything else. While at it, also add the
'follow_huge_pgd' function which was missing earlier.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index f6e4712..89b748a 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -631,6 +631,10 @@ follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
 	unsigned long mask, flags;
 	struct page *page = ERR_PTR(-EINVAL);
 
+#if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_BOOK3S_64)
+	return ERR_PTR(-EINVAL);
+#endif
+
 	local_irq_save(flags);
 	ptep = find_linux_pte_or_hugepte(mm->pgd, address, &is_thp, &shift);
 	if (!ptep)
@@ -658,6 +662,7 @@ no_page:
 	return page;
 }
 
+#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
 struct page *
 follow_huge_pmd(struct mm_struct *mm, unsigned long address,
 		pmd_t *pmd, int write)
@@ -674,6 +679,15 @@ follow_huge_pud(struct mm_struct *mm, unsigned long address,
 	return NULL;
 }
 
+struct page *
+follow_huge_pgd(struct mm_struct *mm, unsigned long address,
+		pgd_t *pgd, int write)
+{
+	BUG();
+	return NULL;
+}
+#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
+
 static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
 				      unsigned long sz)
 {
-- 
2.1.0


* [RFC 8/9] powerpc/mm: Enable HugeTLB page migration
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
                   ` (5 preceding siblings ...)
  2016-03-09 12:10 ` [RFC 7/9] powerpc/hugetlb: Change follow_huge_* routines " Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-09 12:10 ` [RFC 9/9] selftest/powerpc: Add memory page migration tests Anshuman Khandual
  2016-03-11  3:01 ` [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
  8 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

This change enables HugeTLB page migration on PPC64_BOOK3S systems
for HugeTLB pages implemented at the PMD level. It enables the kernel
configuration option ARCH_ENABLE_HUGEPAGE_MIGRATION, which makes the
'hugepage_migration_supported' function report the feature as present
when it is checked during migration.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c6920bb..cefc368 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -86,6 +86,10 @@ config GENERIC_HWEIGHT
 config ARCH_HAS_DMA_SET_COHERENT_MASK
         bool
 
+config ARCH_ENABLE_HUGEPAGE_MIGRATION
+	def_bool y
+	depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION
+
 config PPC
 	bool
 	default y
-- 
2.1.0


* [RFC 9/9] selftest/powerpc: Add memory page migration tests
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
                   ` (6 preceding siblings ...)
  2016-03-09 12:10 ` [RFC 8/9] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
@ 2016-03-09 12:10 ` Anshuman Khandual
  2016-03-09 20:01   ` Aneesh Kumar K.V
  2016-03-11  3:01 ` [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
  8 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-09 12:10 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

This adds two tests for memory page migration. One is for normal page
migration, which works on both 4K and 64K base page size kernels, and
the other is for huge page migration, which works only with the 64K
base page size 16MB huge page implementation at the PMD level.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
 tools/testing/selftests/powerpc/mm/Makefile        |  14 +-
 .../selftests/powerpc/mm/hugepage-migration.c      |  30 +++
 tools/testing/selftests/powerpc/mm/migration.h     | 204 +++++++++++++++++++++
 .../testing/selftests/powerpc/mm/page-migration.c  |  33 ++++
 tools/testing/selftests/powerpc/mm/run_mmtests     | 104 +++++++++++
 5 files changed, 380 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/mm/hugepage-migration.c
 create mode 100644 tools/testing/selftests/powerpc/mm/migration.h
 create mode 100644 tools/testing/selftests/powerpc/mm/page-migration.c
 create mode 100755 tools/testing/selftests/powerpc/mm/run_mmtests

diff --git a/tools/testing/selftests/powerpc/mm/Makefile b/tools/testing/selftests/powerpc/mm/Makefile
index ee179e2..c482614 100644
--- a/tools/testing/selftests/powerpc/mm/Makefile
+++ b/tools/testing/selftests/powerpc/mm/Makefile
@@ -1,12 +1,16 @@
 noarg:
 	$(MAKE) -C ../
 
-TEST_PROGS := hugetlb_vs_thp_test subpage_prot
-TEST_FILES := tempfile
+TEST_PROGS := run_mmtests
+TEST_FILES := hugetlb_vs_thp_test
+TEST_FILES += subpage_prot
+TEST_FILES += tempfile
+TEST_FILES += hugepage-migration
+TEST_FILES += page-migration
 
-all: $(TEST_PROGS) $(TEST_FILES)
+all: $(TEST_FILES)
 
-$(TEST_PROGS): ../harness.c
+$(TEST_FILES): ../harness.c
 
 include ../../lib.mk
 
@@ -14,4 +18,4 @@ tempfile:
 	dd if=/dev/zero of=tempfile bs=64k count=1
 
 clean:
-	rm -f $(TEST_PROGS) tempfile
+	rm -f $(TEST_FILES)
diff --git a/tools/testing/selftests/powerpc/mm/hugepage-migration.c b/tools/testing/selftests/powerpc/mm/hugepage-migration.c
new file mode 100644
index 0000000..b60bc10
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/hugepage-migration.c
@@ -0,0 +1,30 @@
+/*
+ * Copyright (C) 2015, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include "migration.h"
+
+static int hugepage_migration(void)
+{
+	int ret = 0;
+
+	if ((unsigned long)getpagesize() == 0x1000)
+		printf("Running on base page size 4K\n");
+
+	if ((unsigned long)getpagesize() == 0x10000)
+		printf("Running on base page size 64K\n");
+
+	ret |= test_huge_migration(16 * MEM_MB);
+	ret |= test_huge_migration(256 * MEM_MB);
+	ret |= test_huge_migration(512 * MEM_MB);
+
+	return ret;
+}
+
+int main(void)
+{
+	return test_harness(hugepage_migration, "hugepage_migration");
+}
diff --git a/tools/testing/selftests/powerpc/mm/migration.h b/tools/testing/selftests/powerpc/mm/migration.h
new file mode 100644
index 0000000..fe35849
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/migration.h
@@ -0,0 +1,204 @@
+/*
+ * Copyright (C) 2015, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/mman.h>
+#include <fcntl.h>
+
+#include "utils.h"
+
+#define HPAGE_OFF	0
+#define HPAGE_ON	1
+
+#define PAGE_SHIFT_4K	12
+#define PAGE_SHIFT_64K	16
+#define PAGE_SIZE_4K	0x1000
+#define PAGE_SIZE_64K	0x10000
+#define PAGE_SIZE_HUGE	(16UL * 1024 * 1024)
+
+#define MEM_GB		(1024UL * 1024 * 1024)
+#define MEM_MB		(1024UL * 1024)
+#define MEM_KB		1024UL
+
+#define PMAP_FILE	"/proc/self/pagemap"
+#define PMAP_PFN	0x007FFFFFFFFFFFFFUL
+#define PMAP_SIZE	8
+
+#define SOFT_OFFLINE	"/sys/devices/system/memory/soft_offline_page"
+#define HARD_OFFLINE	"/sys/devices/system/memory/hard_offline_page"
+
+#define MMAP_LENGTH	(256 * MEM_MB)
+#define MMAP_ADDR	(void *)(0x0UL)
+#define MMAP_PROT	(PROT_READ | PROT_WRITE)
+#define MMAP_FLAGS	(MAP_PRIVATE | MAP_ANONYMOUS)
+#define MMAP_FLAGS_HUGE	(MAP_SHARED)
+
+#define FILE_NAME	"huge/hugepagefile"
+
+static void write_buffer(char *addr, unsigned long length)
+{
+	unsigned long i;
+
+	for (i = 0; i < length; i++)
+		*(addr + i) = (char)i;
+}
+
+static int read_buffer(char *addr, unsigned long length)
+{
+	unsigned long i;
+
+	for (i = 0; i < length; i++) {
+		if (*(addr + i) != (char)i) {
+			printf("Data miscompare at addr[%lu]\n", i);
+			return 1;
+		}
+	}
+	return 0;
+}
+
+static unsigned long get_npages(unsigned long length, unsigned long size)
+{
+	/* Divide directly in unsigned long; narrowing through unsigned
+	 * int would truncate lengths of 4GB and above. */
+	return length / size;
+}
+
+static void soft_offline_pages(int hugepage, void *addr,
+	unsigned long npages, unsigned long *skipped, unsigned long *failed)
+{
+	unsigned long psize, offset, pfn, paddr, fail, skip, i;
+	void *tmp;
+	int fd1, fd2;
+	char buf[20];
+
+	fd1 = open(PMAP_FILE, O_RDONLY);
+	if (fd1 == -1) {
+		perror("open() failed");
+		exit(-1);
+	}
+
+	fd2 = open(SOFT_OFFLINE, O_WRONLY);
+	if (fd2 == -1) {
+		perror("open() failed");
+		exit(-1);
+	}
+
+	fail = skip = 0;
+	psize = getpagesize();
+	for (i = 0; i < npages; i++) {
+		if (hugepage)
+			tmp = addr + i * PAGE_SIZE_HUGE;
+		else
+			tmp = addr + i * psize;
+
+		offset = ((unsigned long) tmp / psize) * PMAP_SIZE;
+
+		if (lseek(fd1, offset, SEEK_SET) == -1) {
+			perror("lseek() failed");
+			exit(-1);
+		}
+
+		if (read(fd1, &pfn, sizeof(pfn)) == -1) {
+			perror("read() failed");
+			exit(-1);
+		}
+
+		/* Skip if no valid PFN */
+		pfn = pfn & PMAP_PFN;
+		if (!pfn) {
+			skip++;
+			continue;
+		}
+
+		if (psize == PAGE_SIZE_4K)
+			paddr = pfn << PAGE_SHIFT_4K;
+
+		if (psize == PAGE_SIZE_64K)
+			paddr = pfn << PAGE_SHIFT_64K;
+
+		sprintf(buf, "0x%lx\n", paddr);
+
+		if (write(fd2, buf, strlen(buf)) == -1) {
+			perror("write() failed");
+			printf("[%lu] PFN: %lx BUF: %s\n", i, pfn, buf);
+			fail++;
+		}
+
+	}
+
+	if (failed)
+		*failed = fail;
+
+	if (skipped)
+		*skipped = skip;
+
+	close(fd1);
+	close(fd2);
+}
+
+int test_migration(unsigned long length)
+{
+	unsigned long skipped, failed;
+	void *addr;
+	int ret;
+
+	addr = mmap(MMAP_ADDR, length, MMAP_PROT, MMAP_FLAGS, -1, 0);
+	if (addr == MAP_FAILED) {
+		perror("mmap() failed");
+		exit(-1);
+	}
+
+	write_buffer(addr, length);
+	soft_offline_pages(HPAGE_OFF, addr, length/getpagesize(), &skipped, &failed);
+	ret = read_buffer(addr, length);
+
+	printf("%ld moved %ld skipped %ld failed\n", (length/getpagesize() - skipped - failed), skipped, failed);
+
+	munmap(addr, length);
+	return ret;
+}
+
+int test_huge_migration(unsigned long length)
+{
+	unsigned long skipped, failed, npages;
+	void *addr;
+	int fd, ret;
+
+	fd = open(FILE_NAME, O_CREAT | O_RDWR, 0755);
+	if (fd < 0) {
+		perror("open() failed");
+		exit(-1);
+	}
+
+	addr = mmap(MMAP_ADDR, length, MMAP_PROT, MMAP_FLAGS_HUGE, fd, 0);
+	if (addr == MAP_FAILED) {
+		perror("mmap() failed");
+		unlink(FILE_NAME);
+		exit(-1);
+	}
+
+	if (mlock(addr, length) == -1) {
+		perror("mlock() failed");
+		munmap(addr, length);
+		unlink(FILE_NAME);
+		exit(-1);
+	}
+
+	write_buffer(addr, length);
+	npages = get_npages(length, PAGE_SIZE_HUGE);
+	soft_offline_pages(HPAGE_ON, addr, npages, &skipped, &failed);
+	ret = read_buffer(addr, length);
+
+	printf("%ld moved %ld skipped %ld failed\n", (npages - skipped - failed), skipped, failed);
+
+	munmap(addr, length);
+	unlink(FILE_NAME);
+	return ret;
+}
diff --git a/tools/testing/selftests/powerpc/mm/page-migration.c b/tools/testing/selftests/powerpc/mm/page-migration.c
new file mode 100644
index 0000000..fc6e472
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/page-migration.c
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2015, Anshuman Khandual, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include "migration.h"
+
+static int page_migration(void)
+{
+	int ret = 0;
+
+	if ((unsigned long)getpagesize() == 0x1000)
+		printf("Running on base page size 4K\n");
+
+	if ((unsigned long)getpagesize() == 0x10000)
+		printf("Running on base page size 64K\n");
+
+	ret |= test_migration(4 * MEM_MB);
+	ret |= test_migration(64 * MEM_MB);
+	ret |= test_migration(256 * MEM_MB);
+	ret |= test_migration(512 * MEM_MB);
+	ret |= test_migration(1 * MEM_GB);
+	ret |= test_migration(2 * MEM_GB);
+
+	return ret;
+}
+
+int main(void)
+{
+	return test_harness(page_migration, "page_migration");
+}
diff --git a/tools/testing/selftests/powerpc/mm/run_mmtests b/tools/testing/selftests/powerpc/mm/run_mmtests
new file mode 100755
index 0000000..19805ba
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/run_mmtests
@@ -0,0 +1,104 @@
+#!/bin/bash
+
+# Mostly borrowed from tools/testing/selftests/vm/run_vmtests
+
+# Please run this as root
+# Try allocating 2GB of 16MB huge pages, below is the size in kB.
+# Please change this needed memory if the test program changes
+needmem=2097152
+mnt=./huge
+exitcode=0
+
+# Get huge pagesize and freepages from /proc/meminfo
+while read name size unit; do
+	if [ "$name" = "HugePages_Free:" ]; then
+		freepgs=$size
+	fi
+	if [ "$name" = "Hugepagesize:" ]; then
+		pgsize=$size
+	fi
+done < /proc/meminfo
+
+# Set required nr_hugepages
+if [ -n "$freepgs" ] && [ -n "$pgsize" ]; then
+	nr_hugepgs=`cat /proc/sys/vm/nr_hugepages`
+	needpgs=`expr $needmem / $pgsize`
+	tries=2
+	while [ $tries -gt 0 ] && [ $freepgs -lt $needpgs ]; do
+		lackpgs=$(( $needpgs - $freepgs ))
+		echo 3 > /proc/sys/vm/drop_caches
+		echo $(( $lackpgs + $nr_hugepgs )) > /proc/sys/vm/nr_hugepages
+		if [ $? -ne 0 ]; then
+			echo "Please run this test as root"
+		fi
+		while read name size unit; do
+			if [ "$name" = "HugePages_Free:" ]; then
+				freepgs=$size
+			fi
+		done < /proc/meminfo
+		tries=$((tries - 1))
+	done
+	if [ $freepgs -lt $needpgs ]; then
+		printf "Not enough huge pages available (%d < %d)\n" \
+		       $freepgs $needpgs
+	fi
+else
+	echo "No hugetlbfs support in kernel? Check dmesg."
+fi
+
+mkdir $mnt
+mount -t hugetlbfs none $mnt
+
+# Run the test programs
+echo "...................."
+echo "Test HugeTLB vs THP"
+echo "...................."
+./hugetlb_vs_thp_test
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+echo "........................."
+echo "Test subpage protection"
+echo "........................."
+./subpage_prot
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+echo "..........................."
+echo "Test normal page migration"
+echo "..........................."
+./page-migration
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+# Enable this after huge page migration is supported on POWER
+
+echo "........................."
+echo "Test huge page migration"
+echo "........................."
+./hugepage-migration
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+	exitcode=1
+else
+	echo "[PASS]"
+fi
+
+# Huge pages cleanup
+umount $mnt
+rm -rf $mnt
+echo $nr_hugepgs > /proc/sys/vm/nr_hugepages
+
+exit $exitcode
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K
  2016-03-09 12:10 ` [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K Anshuman Khandual
@ 2016-03-09 19:55   ` Aneesh Kumar K.V
  2016-03-10  5:33     ` Anshuman Khandual
  0 siblings, 1 reply; 22+ messages in thread
From: Aneesh Kumar K.V @ 2016-03-09 19:55 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, mpe

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> [ text/plain ]
> From: root <root@ltcalpine2-lp8.aus.stglabs.ibm.com>
>
> Currently the 'huge_pte_alloc' function has two versions, one for the
> BOOK3S and the other one for the BOOK3E platforms. This change splits
> the BOOK3S version into two parts, one for the 4K page size based
> implementation and the other one for the 64K page sized implementation.
> This change is one of the prerequisites towards enabling GENERAL_HUGETLB
> implementation for BOOK3S 64K based huge pages.

I really wish we would reduce #ifdefs in C code and start splitting hash
and non-hash code out wherever we can.

What we really want here is a book3s version, and in that book3s version
use the powerpc-specific huge_pte_alloc only if GENERAL_HUGETLB is not
defined. Don't limit it to the 64K Linux page size: we should select
between the powerpc-specific implementation and the generic code using
the GENERAL_HUGETLB define.


>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hugetlbpage.c | 67 +++++++++++++++++++++++++++----------------
>  1 file changed, 43 insertions(+), 24 deletions(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 744e24b..a49c6ae 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -59,6 +59,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>  	return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL);
>  }
>
> +#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
>  static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  			   unsigned long address, unsigned pdshift, unsigned pshift)
>  {
> @@ -117,6 +118,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  	spin_unlock(&mm->page_table_lock);
>  	return 0;
>  }
> +#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
>
>  /*
>   * These macros define how to determine which level of the page table holds
> @@ -131,6 +133,7 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  #endif
>
>  #ifdef CONFIG_PPC_BOOK3S_64
> +#ifdef CONFIG_PPC_4K_PAGES
>  /*
>   * At this point we do the placement change only for BOOK3S 64. This would
>   * possibly work on other subarchs.
> @@ -146,32 +149,23 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
>
>  	addr &= ~(sz-1);
>  	pg = pgd_offset(mm, addr);
> -
> -	if (pshift == PGDIR_SHIFT)
> -		/* 16GB huge page */
> -		return (pte_t *) pg;
> -	else if (pshift > PUD_SHIFT)
> -		/*
> -		 * We need to use hugepd table
> -		 */
> +	if (pshift > PUD_SHIFT) {
>  		hpdp = (hugepd_t *)pg;
> -	else {
> -		pdshift = PUD_SHIFT;
> -		pu = pud_alloc(mm, pg, addr);
> -		if (pshift == PUD_SHIFT)
> -			return (pte_t *)pu;
> -		else if (pshift > PMD_SHIFT)
> -			hpdp = (hugepd_t *)pu;
> -		else {
> -			pdshift = PMD_SHIFT;
> -			pm = pmd_alloc(mm, pu, addr);
> -			if (pshift == PMD_SHIFT)
> -				/* 16MB hugepage */
> -				return (pte_t *)pm;
> -			else
> -				hpdp = (hugepd_t *)pm;
> -		}
> +		goto hugepd_search;
> +	}
> +
> +	pdshift = PUD_SHIFT;
> +	pu = pud_alloc(mm, pg, addr);
> +	if (pshift > PMD_SHIFT) {
> +		hpdp = (hugepd_t *)pu;
> +		goto hugepd_search;
>  	}
> +
> +	pdshift = PMD_SHIFT;
> +	pm = pmd_alloc(mm, pu, addr);
> +	hpdp = (hugepd_t *)pm;
> +
> +hugepd_search:
>  	if (!hpdp)
>  		return NULL;
>
> @@ -184,6 +178,31 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz
>  }
>
>  #else
> +pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
> +{
> +	pgd_t *pg;
> +	pud_t *pu;
> +	pmd_t *pm;
> +	unsigned pshift = __ffs(sz);
> +
> +	addr &= ~(sz-1);
> +	pg = pgd_offset(mm, addr);
> +
> +	if (pshift == PGDIR_SHIFT)	/* 16GB Huge Page */
> +		return (pte_t *)pg;
> +
> +	pu = pud_alloc(mm, pg, addr);	/* NA, skipped */
> +	if (pshift == PUD_SHIFT)
> +		return (pte_t *)pu;
> +
> +	pm = pmd_alloc(mm, pu, addr);	/* 16MB Huge Page */
> +	if (pshift == PMD_SHIFT)
> +		return (pte_t *)pm;
> +
> +	return NULL;
> +}
> +#endif
> +#else
>
>  pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
>  {
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K
  2016-03-09 12:10 ` [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB " Anshuman Khandual
@ 2016-03-09 19:58   ` Aneesh Kumar K.V
  2016-03-10  5:12     ` Anshuman Khandual
  2016-03-21  9:55   ` Rui Teng
  1 sibling, 1 reply; 22+ messages in thread
From: Aneesh Kumar K.V @ 2016-03-09 19:58 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, mpe

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> [ text/plain ]
> This enables ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K in Kconfig.
> It also implements a new function 'pte_huge' which is required by
> function 'huge_pte_alloc' from generic VM. Existing BOOK3S 64K
> specific functions 'huge_pte_alloc' and 'huge_pte_offset' (which
> are no longer required) are removed with this change.
>

You want this to be the last patch, don't you? And you are mixing too
many things in this patch. Why not do this:

* book3s specific hash pte routines
* book3s add conditional based on GENERAL_HUGETLB
* Enable GENERAL_HUGETLB for 64k page size config

> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>  arch/powerpc/Kconfig                          |  4 ++
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 ++++
>  arch/powerpc/mm/hugetlbpage.c                 | 60 ---------------------------
>  3 files changed, 12 insertions(+), 60 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 9faa18c..c6920bb 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -33,6 +33,10 @@ config HAVE_SETUP_PER_CPU_AREA
>  config NEED_PER_CPU_EMBED_FIRST_CHUNK
>  	def_bool PPC64
>
> +config ARCH_WANT_GENERAL_HUGETLB
> +	depends on PPC_64K_PAGES && PPC_BOOK3S_64
> +	def_bool y
> +
>  config NR_IRQS
>  	int "Number of virtual interrupt numbers"
>  	range 32 32768
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 849bbec..5e9b9b9 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -143,6 +143,14 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
>   * Defined in such a way that we can optimize away code block at build time
>   * if CONFIG_HUGETLB_PAGE=n.
>   */
> +static inline int pte_huge(pte_t pte)
> +{
> +	/*
> +	 * leaf pte for huge page
> +	 */
> +	return !!(pte_val(pte) & _PAGE_PTE);
> +}
> +
>  static inline int pmd_huge(pmd_t pmd)
>  {
>  	/*
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index f834a74..f6e4712 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -59,42 +59,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>  	/* Only called for hugetlbfs pages, hence can ignore THP */
>  	return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL);
>  }
> -#else
> -pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
> -{
> -	pgd_t pgd, *pgdp;
> -	pud_t pud, *pudp;
> -	pmd_t pmd, *pmdp;
> -
> -	pgdp = mm->pgd + pgd_index(addr);
> -	pgd  = READ_ONCE(*pgdp);
> -
> -	if (pgd_none(pgd))
> -		return NULL;
> -
> -	if (pgd_huge(pgd))
> -		return (pte_t *)pgdp;
> -
> -	pudp = pud_offset(&pgd, addr);
> -	pud  = READ_ONCE(*pudp);
> -	if (pud_none(pud))
> -		return NULL;
> -
> -	if (pud_huge(pud))
> -		return (pte_t *)pudp;
>
> -	pmdp = pmd_offset(&pud, addr);
> -	pmd  = READ_ONCE(*pmdp);
> -	if (pmd_none(pmd))
> -		return NULL;
> -
> -	if (pmd_huge(pmd))
> -		return (pte_t *)pmdp;
> -	return NULL;
> -}
> -#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
> -
> -#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
>  static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>  			   unsigned long address, unsigned pdshift, unsigned pshift)
>  {
> @@ -211,31 +176,6 @@ hugepd_search:
>
>  	return hugepte_offset(*hpdp, addr, pdshift);
>  }
> -
> -#else
> -pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
> -{
> -	pgd_t *pg;
> -	pud_t *pu;
> -	pmd_t *pm;
> -	unsigned pshift = __ffs(sz);
> -
> -	addr &= ~(sz-1);
> -	pg = pgd_offset(mm, addr);
> -
> -	if (pshift == PGDIR_SHIFT)	/* 16GB Huge Page */
> -		return (pte_t *)pg;
> -
> -	pu = pud_alloc(mm, pg, addr);	/* NA, skipped */
> -	if (pshift == PUD_SHIFT)
> -		return (pte_t *)pu;
> -
> -	pm = pmd_alloc(mm, pu, addr);	/* 16MB Huge Page */
> -	if (pshift == PMD_SHIFT)
> -		return (pte_t *)pm;
> -
> -	return NULL;
> -}
>  #endif
>  #else
>
> -- 
> 2.1.0

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 9/9] selfttest/powerpc: Add memory page migration tests
  2016-03-09 12:10 ` [RFC 9/9] selfttest/powerpc: Add memory page migration tests Anshuman Khandual
@ 2016-03-09 20:01   ` Aneesh Kumar K.V
  2016-03-10  5:05     ` Anshuman Khandual
  0 siblings, 1 reply; 22+ messages in thread
From: Aneesh Kumar K.V @ 2016-03-09 20:01 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, mpe

Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:

> [ text/plain ]
> This adds two tests for memory page migration: one for normal page
> migration, which works on both 4K and 64K base page size kernels, and
> the other for huge page migration, which works only on the 64K base
> page size 16MB huge page implementation at the PMD level.
>

can you also add the test in this commit
e66f17ff717 ("mm/hugetlb: take page table lock in follow_huge_pmd()")

> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---

-aneesh

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 5/9] powerpc/mm: Split huge_pte_offset function for BOOK3S 64K
  2016-03-09 12:10 ` [RFC 5/9] powerpc/mm: Split huge_pte_offset " Anshuman Khandual
@ 2016-03-09 22:57   ` Dave Hansen
  2016-03-10  3:37     ` Anshuman Khandual
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2016-03-09 22:57 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

On 03/09/2016 04:10 AM, Anshuman Khandual wrote:
> Currently the 'huge_pte_offset' function has only one version for
> all the configurations and platforms. This change splits the function
> into two versions, one for 64K page size based BOOK3S implementation
> and the other one for everything else. This change is also one of the
> prerequisites towards enabling GENERAL_HUGETLB implementation for
> BOOK3S 64K based huge pages.

I think there's a bit of background missing here for random folks on
linux-mm to make sense of these patches.

What is BOOK3S and what does it mean for these patches?  Why is its 64K
page size implementation different than all the others?  Is there a 4K
page size BOOK3S?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 5/9] powerpc/mm: Split huge_pte_offset function for BOOK3S 64K
  2016-03-09 22:57   ` Dave Hansen
@ 2016-03-10  3:37     ` Anshuman Khandual
  0 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-10  3:37 UTC (permalink / raw)
  To: Dave Hansen, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, aneesh.kumar, kirill, n-horiguchi, mgorman, akpm

On 03/10/2016 04:27 AM, Dave Hansen wrote:
> On 03/09/2016 04:10 AM, Anshuman Khandual wrote:
>> > Currently the 'huge_pte_offset' function has only one version for
>> > all the configurations and platforms. This change splits the function
>> > into two versions, one for 64K page size based BOOK3S implementation
>> > and the other one for everything else. This change is also one of the
>> > prerequisites towards enabling GENERAL_HUGETLB implementation for
>> > BOOK3S 64K based huge pages.
> I think there's a bit of background missing here for random folks on
> linux-mm to make sense of these patches.
> 
> What is BOOK3S and what does it mean for these patches?  Why is its 64K

BOOK3S is the server class of the powerpc processor family, which can
support multiple base page sizes such as 64K and 4K.

> page size implementation different than all the others?  Is there a 4K
> page size BOOK3S?

It supports huge pages of 16MB as well as 16GB, and their implementations
differ between the 64K and 4K base page sizes.

Patches 1, 2 and 3 are generic VM changes and the rest are powerpc
specific changes. Should I have split them accordingly and sent them out
separately for generic and powerpc specific review?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 9/9] selfttest/powerpc: Add memory page migration tests
  2016-03-09 20:01   ` Aneesh Kumar K.V
@ 2016-03-10  5:05     ` Anshuman Khandual
  0 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-10  5:05 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, mgorman, akpm

On 03/10/2016 01:31 AM, Aneesh Kumar K.V wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
>> > [ text/plain ]
>> > This adds two tests for memory page migration: one for normal page
>> > migration, which works on both 4K and 64K base page size kernels, and
>> > the other for huge page migration, which works only on the 64K base
>> > page size 16MB huge page implementation at the PMD level.
>> >
> can you also add the test in this commit
> e66f17ff717 ("mm/hugetlb: take page table lock in follow_huge_pmd()")

Thought about it, but that's a bit tricky. All selftests have a finite
runtime, while the test case in that commit has two processes which run
forever trying to create the race condition. We could run it for *some
time* looking for races instead?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K
  2016-03-09 19:58   ` Aneesh Kumar K.V
@ 2016-03-10  5:12     ` Anshuman Khandual
  0 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-10  5:12 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, mgorman, akpm

On 03/10/2016 01:28 AM, Aneesh Kumar K.V wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
>> > [ text/plain ]
>> > This enables ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K in Kconfig.
>> > It also implements a new function 'pte_huge' which is required by
>> > function 'huge_pte_alloc' from generic VM. Existing BOOK3S 64K
>> > specific functions 'huge_pte_alloc' and 'huge_pte_offset' (which
>> > are no longer required) are removed with this change.
>> >
> You want this to be the last patch isn't it ? And you are mixing too

Yeah, it should be the last one.

> many things in this patch. Why not do this
> 
> * book3s specific hash pte routines
> * book3s add conditional based on GENERAL_HUGETLB
> * Enable GENERAL_HUGETLB for 64k page size config

which creates three separate patches?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K
  2016-03-09 19:55   ` Aneesh Kumar K.V
@ 2016-03-10  5:33     ` Anshuman Khandual
  0 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-10  5:33 UTC (permalink / raw)
  To: Aneesh Kumar K.V, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, mgorman, akpm

On 03/10/2016 01:25 AM, Aneesh Kumar K.V wrote:
> Anshuman Khandual <khandual@linux.vnet.ibm.com> writes:
> 
>> > [ text/plain ]
>> > From: root <root@ltcalpine2-lp8.aus.stglabs.ibm.com>
>> >
>> > Currently the 'huge_pte_alloc' function has two versions, one for the
>> > BOOK3S and the other one for the BOOK3E platforms. This change splits
>> > the BOOK3S version into two parts, one for the 4K page size based
>> > implementation and the other one for the 64K page sized implementation.
>> > This change is one of the prerequisites towards enabling GENERAL_HUGETLB
>> > implementation for BOOK3S 64K based huge pages.
> I really wish we reduce #ifdefs in C code and start splitting hash
> and nonhash code out where ever we can.

Okay, but here we are only dealing with the 64K and 4K configs inside
book3s. I guess that covers both hash and non-hash implementations. Not
sure if I got it correctly.

> 
> What we really want here is a book3s version and in book3s version use
> powerpc specific huge_pte_alloc only if GENERAL_HUGETLB was not defined.

got it.

> Don't limit it to 64k linux page size. We should select between powerpc
> specific implementation and generic code using GENERAL_HUGETLB define.

Got it. will try.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware
  2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
                   ` (7 preceding siblings ...)
  2016-03-09 12:10 ` [RFC 9/9] selfttest/powerpc: Add memory page migration tests Anshuman Khandual
@ 2016-03-11  3:01 ` Anshuman Khandual
  2016-03-14 20:29   ` Andrew Morton
  8 siblings, 1 reply; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-11  3:01 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, aneesh.kumar, kirill, n-horiguchi, mgorman, akpm

On 03/09/2016 05:40 PM, Anshuman Khandual wrote:
> Currently both the ARCH_WANT_GENERAL_HUGETLB functions 'huge_pte_alloc'
> and 'huge_pte_offset' don't take into account huge page implementation
> at the PGD level. With addition of PGD awareness into these functions,
> more architectures like POWER which also implements huge pages at PGD
> level (along with PMD level), can use ARCH_WANT_GENERAL_HUGETLB option.

Hugh/Mel/Naoya/Andrew,

	Thoughts/inputs/suggestions? Does this change look okay?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function
  2016-03-09 12:10 ` [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function Anshuman Khandual
@ 2016-03-11  3:02   ` Anshuman Khandual
  0 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-11  3:02 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, aneesh.kumar, kirill, n-horiguchi, mgorman, akpm

On 03/09/2016 05:40 PM, Anshuman Khandual wrote:
> This just adds the 'follow_huge_pgd' function, which will be used
> later in this series to make the 'follow_page_mask' function aware
> of the PGD based huge page implementation.

Hugh/Mel/Naoya/Andrew,

	Thoughts/inputs/suggestions? Does this change look okay?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 3/9] mm/gup: Make follow_page_mask function PGD implementation aware
  2016-03-09 12:10 ` [RFC 3/9] mm/gup: Make follow_page_mask function PGD implementation aware Anshuman Khandual
@ 2016-03-11  3:03   ` Anshuman Khandual
  0 siblings, 0 replies; 22+ messages in thread
From: Anshuman Khandual @ 2016-03-11  3:03 UTC (permalink / raw)
  To: linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, aneesh.kumar, kirill, n-horiguchi, mgorman, akpm

On 03/09/2016 05:40 PM, Anshuman Khandual wrote:
> Currently the function 'follow_page_mask' does not take into account
> the PGD based huge page implementation. This change adds that and
> makes it complete.

Hugh/Mel/Naoya/Andrew,

	Thoughts/inputs/suggestions? Does this change look okay?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware
  2016-03-11  3:01 ` [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
@ 2016-03-14 20:29   ` Andrew Morton
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2016-03-14 20:29 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-mm, linux-kernel, linuxppc-dev, hughd, aneesh.kumar,
	kirill, n-horiguchi, mgorman

On Fri, 11 Mar 2016 08:31:55 +0530 Anshuman Khandual <khandual@linux.vnet.ibm.com> wrote:

> On 03/09/2016 05:40 PM, Anshuman Khandual wrote:
> > Currently both the ARCH_WANT_GENERAL_HUGETLB functions 'huge_pte_alloc'
> > and 'huge_pte_offset' don't take into account huge page implementation
> > at the PGD level. With addition of PGD awareness into these functions,
> > more architectures like POWER which also implements huge pages at PGD
> > level (along with PMD level), can use ARCH_WANT_GENERAL_HUGETLB option.
> 
> Hugh/Mel/Naoya/Andrew,
> 
> 	Thoughts/inputs/suggestions? Does this change look okay?

Patches 1, 2 and 3 look OK to me.  Please include them in the powerpc
merge when the patchset is considered ready.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K
  2016-03-09 12:10 ` [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB " Anshuman Khandual
  2016-03-09 19:58   ` Aneesh Kumar K.V
@ 2016-03-21  9:55   ` Rui Teng
  1 sibling, 0 replies; 22+ messages in thread
From: Rui Teng @ 2016-03-21  9:55 UTC (permalink / raw)
  To: Anshuman Khandual, linux-mm, linux-kernel, linuxppc-dev
  Cc: hughd, kirill, n-horiguchi, akpm, mgorman, aneesh.kumar, mpe

On 3/9/16 8:10 PM, Anshuman Khandual wrote:
> This enables ARCH_WANT_GENERAL_HUGETLB for BOOK3S 64K in Kconfig.
> It also implements a new function 'pte_huge' which is required by
> function 'huge_pte_alloc' from generic VM. Existing BOOK3S 64K
> specific functions 'huge_pte_alloc' and 'huge_pte_offset' (which
> are no longer required) are removed with this change.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
>   arch/powerpc/Kconfig                          |  4 ++
>   arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 ++++
>   arch/powerpc/mm/hugetlbpage.c                 | 60 ---------------------------
>   3 files changed, 12 insertions(+), 60 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 9faa18c..c6920bb 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -33,6 +33,10 @@ config HAVE_SETUP_PER_CPU_AREA
>   config NEED_PER_CPU_EMBED_FIRST_CHUNK
>   	def_bool PPC64
>
> +config ARCH_WANT_GENERAL_HUGETLB
> +	depends on PPC_64K_PAGES && PPC_BOOK3S_64
> +	def_bool y
> +
In the source code, the PowerPC-specific huge_pte_alloc() function will
not be defined if the config logic is "!PPC_4K_PAGES &&
PPC_BOOK3S_64", but in the Kconfig file the generic huge_pte_alloc()
function will only be defined if the logic is "PPC_64K_PAGES &&
PPC_BOOK3S_64".

It works if PPC_4K_PAGES and PPC_64K_PAGES are always mutually
exclusive, but I also find PPC_16K_PAGES and PPC_256K_PAGES in the same
Kconfig file. What happens if we configure PPC_16K_PAGES instead of
PPC_4K_PAGES?

>   config NR_IRQS
>   	int "Number of virtual interrupt numbers"
>   	range 32 32768
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 849bbec..5e9b9b9 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -143,6 +143,14 @@ extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
>    * Defined in such a way that we can optimize away code block at build time
>    * if CONFIG_HUGETLB_PAGE=n.
>    */
> +static inline int pte_huge(pte_t pte)
> +{
> +	/*
> +	 * leaf pte for huge page
> +	 */
> +	return !!(pte_val(pte) & _PAGE_PTE);
> +}
> +
>   static inline int pmd_huge(pmd_t pmd)
>   {
>   	/*
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index f834a74..f6e4712 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -59,42 +59,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>   	/* Only called for hugetlbfs pages, hence can ignore THP */
>   	return __find_linux_pte_or_hugepte(mm->pgd, addr, NULL, NULL);
>   }
> -#else
> -pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
> -{
> -	pgd_t pgd, *pgdp;
> -	pud_t pud, *pudp;
> -	pmd_t pmd, *pmdp;
> -
> -	pgdp = mm->pgd + pgd_index(addr);
> -	pgd  = READ_ONCE(*pgdp);
> -
> -	if (pgd_none(pgd))
> -		return NULL;
> -
> -	if (pgd_huge(pgd))
> -		return (pte_t *)pgdp;
> -
> -	pudp = pud_offset(&pgd, addr);
> -	pud  = READ_ONCE(*pudp);
> -	if (pud_none(pud))
> -		return NULL;
> -
> -	if (pud_huge(pud))
> -		return (pte_t *)pudp;
>
> -	pmdp = pmd_offset(&pud, addr);
> -	pmd  = READ_ONCE(*pmdp);
> -	if (pmd_none(pmd))
> -		return NULL;
> -
> -	if (pmd_huge(pmd))
> -		return (pte_t *)pmdp;
> -	return NULL;
> -}
> -#endif /* !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64) */
> -
> -#if !defined(CONFIG_PPC_64K_PAGES) || !defined(CONFIG_PPC_BOOK3S_64)
>   static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
>   			   unsigned long address, unsigned pdshift, unsigned pshift)
>   {
> @@ -211,31 +176,6 @@ hugepd_search:
>
>   	return hugepte_offset(*hpdp, addr, pdshift);
>   }
> -
> -#else
> -pte_t *huge_pte_alloc(struct mm_struct *mm, unsigned long addr, unsigned long sz)
> -{
> -	pgd_t *pg;
> -	pud_t *pu;
> -	pmd_t *pm;
> -	unsigned pshift = __ffs(sz);
> -
> -	addr &= ~(sz-1);
> -	pg = pgd_offset(mm, addr);
> -
> -	if (pshift == PGDIR_SHIFT)	/* 16GB Huge Page */
> -		return (pte_t *)pg;
> -
> -	pu = pud_alloc(mm, pg, addr);	/* NA, skipped */
> -	if (pshift == PUD_SHIFT)
> -		return (pte_t *)pu;
> -
> -	pm = pmd_alloc(mm, pu, addr);	/* 16MB Huge Page */
> -	if (pshift == PMD_SHIFT)
> -		return (pte_t *)pm;
> -
> -	return NULL;
> -}
>   #endif
>   #else
>
Why does this code need to be added in patch 4/9 but then removed in 6/9?

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-03-21  9:55 UTC | newest]

Thread overview: 22+ messages
2016-03-09 12:10 [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
2016-03-09 12:10 ` [RFC 2/9] mm/hugetlb: Add follow_huge_pgd function Anshuman Khandual
2016-03-11  3:02   ` Anshuman Khandual
2016-03-09 12:10 ` [RFC 3/9] mm/gup: Make follow_page_mask function PGD implementation aware Anshuman Khandual
2016-03-11  3:03   ` Anshuman Khandual
2016-03-09 12:10 ` [RFC 4/9] powerpc/mm: Split huge_pte_alloc function for BOOK3S 64K Anshuman Khandual
2016-03-09 19:55   ` Aneesh Kumar K.V
2016-03-10  5:33     ` Anshuman Khandual
2016-03-09 12:10 ` [RFC 5/9] powerpc/mm: Split huge_pte_offset " Anshuman Khandual
2016-03-09 22:57   ` Dave Hansen
2016-03-10  3:37     ` Anshuman Khandual
2016-03-09 12:10 ` [RFC 6/9] powerpc/hugetlb: Enable ARCH_WANT_GENERAL_HUGETLB " Anshuman Khandual
2016-03-09 19:58   ` Aneesh Kumar K.V
2016-03-10  5:12     ` Anshuman Khandual
2016-03-21  9:55   ` Rui Teng
2016-03-09 12:10 ` [RFC 7/9] powerpc/hugetlb: Change follow_huge_* routines " Anshuman Khandual
2016-03-09 12:10 ` [RFC 8/9] powerpc/mm: Enable HugeTLB page migration Anshuman Khandual
2016-03-09 12:10 ` [RFC 9/9] selfttest/powerpc: Add memory page migration tests Anshuman Khandual
2016-03-09 20:01   ` Aneesh Kumar K.V
2016-03-10  5:05     ` Anshuman Khandual
2016-03-11  3:01 ` [RFC 1/9] mm/hugetlb: Make GENERAL_HUGETLB functions PGD implementation aware Anshuman Khandual
2016-03-14 20:29   ` Andrew Morton
