linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] arm64: fix the bugs found in the hugetlb test
@ 2016-11-03  2:27 Huang Shijie
  2016-11-03  2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
  2016-11-03  2:27 ` [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions Huang Shijie
  0 siblings, 2 replies; 7+ messages in thread
From: Huang Shijie @ 2016-11-03  2:27 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will.deacon, steve.capper, kaly.xin, nd, dwoods,
	linux-arm-kernel, linux-kernel, akpm, Huang Shijie

(1) Backgroud
   For the arm64, the hugetlb page size can be 32M (PMD + Contiguous bit).
   In the 4K page environment, the max page order is 10 (max_order - 1),
   so 32M page is the gigantic page.    

   The arm64 MMU supports a Contiguous bit which is a hint that the PTE
   is one of a set of contiguous entries which can be cached in a single
   TLB entry.  Please refer to the arm64v8 mannul :
       DDI0487A_f_armv8_arm.pdf (in page D4-1811)

(2) The bugs   
   After I tested the libhugetlbfs, I found several bugs in arm64 code.
   This patch set has all the bug fixes for the arm64.
   
(3) The test result in the Softiron and Juno-r1 boards:

   This detail test result shows below (both the "make func" & "make stress"):

    4KB granule:

        1.1) PTE + Contiguous bit : 4K x 16 = 64K (per huge page size)
             Test result          : PASS

        1.2) PMD                  : 2M x  1 = 2M  (per huge page size)
             Test result          : PASS

        1.3) PMD + Contiguous bit : 2M x 16 = 32M (per huge page size)
             Test result          : PASS

    64KB granule:

        3.1) PTE + Contiguous bit : 64K x 32 = 2M (per huge page size)
             Test result          : PASS

        3.2) PMD + Contiguous bit : 512M x 32 = 16G (per huge page size)
             Test result          : no hardware to support this test


Huang Shijie (2):
  arm64: hugetlb: remove the wrong pmd check in find_num_contig()
  arm64: hugetlb: fix the wrong address for several functions

 arch/arm64/mm/hugetlbpage.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
  2016-11-03  2:27 [PATCH 0/2] arm64: fix the bugs found in the hugetlb test Huang Shijie
@ 2016-11-03  2:27 ` Huang Shijie
  2016-11-04  0:16   ` Catalin Marinas
  2016-11-03  2:27 ` [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions Huang Shijie
  1 sibling, 1 reply; 7+ messages in thread
From: Huang Shijie @ 2016-11-03  2:27 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will.deacon, steve.capper, kaly.xin, nd, dwoods,
	linux-arm-kernel, linux-kernel, akpm, Huang Shijie

The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:

   1.) pmd entry is not present.

   2.) the page fault occurs:
       ... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()

   3.) set_huge_pte_at() will only set the first PMD entry, since the
       find_num_contig just return 1 in this case. So the PMD entries
       are all empty except the first one.

   4.) when kernel accesses the address mapped by the second PMD entry,
       a new page fault occurs:
       ... hugetlb_fault() --> huge_ptep_set_access_flags()

       The second PMD entry is still empty now.

   5.) When the kernel returns, the access will cause a page fault again.
       The kernel will run like the "4)" above.
       We will see a dead loop since here.

The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).

This patch removes wrong pmd check, and fixes this dead loop.

Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 arch/arm64/mm/hugetlbpage.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 2e49bd2..4811ef1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
 		return 1;
 	}
 	pmd = pmd_offset(pud, addr);
-	if (!pmd_present(*pmd)) {
-		VM_BUG_ON(!pmd_present(*pmd));
-		return 1;
-	}
 	if ((pte_t *)pmd == ptep) {
 		*pgsize = PMD_SIZE;
 		return CONT_PMDS;
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions
  2016-11-03  2:27 [PATCH 0/2] arm64: fix the bugs found in the hugetlb test Huang Shijie
  2016-11-03  2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
@ 2016-11-03  2:27 ` Huang Shijie
  1 sibling, 0 replies; 7+ messages in thread
From: Huang Shijie @ 2016-11-03  2:27 UTC (permalink / raw)
  To: catalin.marinas
  Cc: will.deacon, steve.capper, kaly.xin, nd, dwoods,
	linux-arm-kernel, linux-kernel, akpm, Huang Shijie

The libhugetlbfs meets several failures since the following functions
do not use the correct address:
   huge_ptep_get_and_clear()
   huge_ptep_set_access_flags()
   huge_ptep_set_wrprotect()
   huge_ptep_clear_flush()

This patch fixes the wrong address for them.

Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
 arch/arm64/mm/hugetlbpage.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 4811ef1..0e9401b 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -208,7 +208,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
 		ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
 		/* save the 1st pte to return */
 		pte = ptep_get_and_clear(mm, addr, cpte);
-		for (i = 1; i < ncontig; ++i) {
+		for (i = 1, addr += pgsize; i < ncontig; ++i, addr += pgsize) {
 			/*
 			 * If HW_AFDBM is enabled, then the HW could
 			 * turn on the dirty bit for any of the page
@@ -246,7 +246,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 		pfn = pte_pfn(*cpte);
 		ncontig = find_num_contig(vma->vm_mm, addr, cpte,
 					  *cpte, &pgsize);
-		for (i = 0; i < ncontig; ++i, ++cpte) {
+		for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize) {
 			changed = ptep_set_access_flags(vma, addr, cpte,
 							pfn_pte(pfn,
 								hugeprot),
@@ -269,7 +269,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
 
 		cpte = huge_pte_offset(mm, addr);
 		ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
-		for (i = 0; i < ncontig; ++i, ++cpte)
+		for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize)
 			ptep_set_wrprotect(mm, addr, cpte);
 	} else {
 		ptep_set_wrprotect(mm, addr, ptep);
@@ -287,7 +287,7 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
 		cpte = huge_pte_offset(vma->vm_mm, addr);
 		ncontig = find_num_contig(vma->vm_mm, addr, cpte,
 					  *cpte, &pgsize);
-		for (i = 0; i < ncontig; ++i, ++cpte)
+		for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize)
 			ptep_clear_flush(vma, addr, cpte);
 	} else {
 		ptep_clear_flush(vma, addr, ptep);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
  2016-11-03  2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
@ 2016-11-04  0:16   ` Catalin Marinas
  2016-11-04  2:52     ` Huang Shijie
  0 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2016-11-04  0:16 UTC (permalink / raw)
  To: Huang Shijie
  Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
	nd, linux-arm-kernel

On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 2e49bd2..4811ef1 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>  		return 1;
>  	}
>  	pmd = pmd_offset(pud, addr);
> -	if (!pmd_present(*pmd)) {
> -		VM_BUG_ON(!pmd_present(*pmd));
> -		return 1;
> -	}
>  	if ((pte_t *)pmd == ptep) {
>  		*pgsize = PMD_SIZE;
>  		return CONT_PMDS;

BTW, for the !pud_present() and !pgd_present() cases, shouldn't
find_num_contig() actually return 0? These are more likely real bugs, so
no point in setting the huge pte.

-- 
Catalin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
  2016-11-04  0:16   ` Catalin Marinas
@ 2016-11-04  2:52     ` Huang Shijie
  2016-11-04 15:48       ` Catalin Marinas
  0 siblings, 1 reply; 7+ messages in thread
From: Huang Shijie @ 2016-11-04  2:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
	nd, linux-arm-kernel

On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > index 2e49bd2..4811ef1 100644
> > --- a/arch/arm64/mm/hugetlbpage.c
> > +++ b/arch/arm64/mm/hugetlbpage.c
> > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> >  		return 1;
> >  	}
> >  	pmd = pmd_offset(pud, addr);
> > -	if (!pmd_present(*pmd)) {
> > -		VM_BUG_ON(!pmd_present(*pmd));
> > -		return 1;
> > -	}
> >  	if ((pte_t *)pmd == ptep) {
> >  		*pgsize = PMD_SIZE;
> >  		return CONT_PMDS;
> 
> BTW, for the !pud_present() and !pgd_present() cases, shouldn't
The kernel will not call the find_num_contig() if the PGD/PUD are empty.
Please see the code in the hugetlb_fault().

   ------------------------------------------------------
	ptep = huge_pte_offset(mm, address);
	if (ptep) {
	    ...............................
	} else {
		ptep = huge_pte_alloc(mm, address, huge_page_size(h));
		if (!ptep)
			return VM_FAULT_OOM;
	}
   ------------------------------------------------------


Thanks
Huang Shijie
> find_num_contig() actually return 0? These are more likely real bugs, so
> no point in setting the huge pte.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
  2016-11-04  2:52     ` Huang Shijie
@ 2016-11-04 15:48       ` Catalin Marinas
  2016-11-08  2:25         ` Huang Shijie
  0 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2016-11-04 15:48 UTC (permalink / raw)
  To: Huang Shijie
  Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
	nd, linux-arm-kernel

On Fri, Nov 04, 2016 at 10:52:17AM +0800, Huang Shijie wrote:
> On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> > On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > index 2e49bd2..4811ef1 100644
> > > --- a/arch/arm64/mm/hugetlbpage.c
> > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > >  		return 1;
> > >  	}
> > >  	pmd = pmd_offset(pud, addr);
> > > -	if (!pmd_present(*pmd)) {
> > > -		VM_BUG_ON(!pmd_present(*pmd));
> > > -		return 1;
> > > -	}
> > >  	if ((pte_t *)pmd == ptep) {
> > >  		*pgsize = PMD_SIZE;
> > >  		return CONT_PMDS;
> > 
> > BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> > find_num_contig() actually return 0? These are more likely real bugs, so
> > no point in setting the huge pte.
> 
> The kernel will not call the find_num_contig() if the PGD/PUD are empty.
> Please see the code in the hugetlb_fault().
> 
>    ------------------------------------------------------
> 	ptep = huge_pte_offset(mm, address);
> 	if (ptep) {
> 	    ...............................
> 	} else {
> 		ptep = huge_pte_alloc(mm, address, huge_page_size(h));
> 		if (!ptep)
> 			return VM_FAULT_OOM;
> 	}
>    ------------------------------------------------------

Exactly. So what is the reason for returning 1 if !pgd_present()? Would
removing the checks entirely or adding BUG() be a better option?

-- 
Catalin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
  2016-11-04 15:48       ` Catalin Marinas
@ 2016-11-08  2:25         ` Huang Shijie
  0 siblings, 0 replies; 7+ messages in thread
From: Huang Shijie @ 2016-11-08  2:25 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
	nd, linux-arm-kernel

On Fri, Nov 04, 2016 at 09:48:14AM -0600, Catalin Marinas wrote:
> On Fri, Nov 04, 2016 at 10:52:17AM +0800, Huang Shijie wrote:
> > On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> > > On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > > index 2e49bd2..4811ef1 100644
> > > > --- a/arch/arm64/mm/hugetlbpage.c
> > > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > > >  		return 1;
> > > >  	}
> > > >  	pmd = pmd_offset(pud, addr);
> > > > -	if (!pmd_present(*pmd)) {
> > > > -		VM_BUG_ON(!pmd_present(*pmd));
> > > > -		return 1;
> > > > -	}
> > > >  	if ((pte_t *)pmd == ptep) {
> > > >  		*pgsize = PMD_SIZE;
> > > >  		return CONT_PMDS;
> > > 
> > > BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> > > find_num_contig() actually return 0? These are more likely real bugs, so
> > > no point in setting the huge pte.
> > 
> > The kernel will not call the find_num_contig() if the PGD/PUD are empty.
> > Please see the code in the hugetlb_fault().
> > 
> >    ------------------------------------------------------
> > 	ptep = huge_pte_offset(mm, address);
> > 	if (ptep) {
> > 	    ...............................
> > 	} else {
> > 		ptep = huge_pte_alloc(mm, address, huge_page_size(h));
> > 		if (!ptep)
> > 			return VM_FAULT_OOM;
> > 	}
> >    ------------------------------------------------------
> 
> Exactly. So what is the reason for returning 1 if !pgd_present()? Would
I think the author was too cautious for returning 1 if !pgd_present().
:)
> removing the checks entirely or adding BUG() be a better option?
I will remove the checks in the next version.

Thanks
Huang Shijie

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-11-08  2:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-03  2:27 [PATCH 0/2] arm64: fix the bugs found in the hugetlb test Huang Shijie
2016-11-03  2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
2016-11-04  0:16   ` Catalin Marinas
2016-11-04  2:52     ` Huang Shijie
2016-11-04 15:48       ` Catalin Marinas
2016-11-08  2:25         ` Huang Shijie
2016-11-03  2:27 ` [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions Huang Shijie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).