* [PATCH 0/2] arm64: fix the bugs found in the hugetlb test
@ 2016-11-03 2:27 Huang Shijie
2016-11-03 2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
2016-11-03 2:27 ` [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions Huang Shijie
0 siblings, 2 replies; 7+ messages in thread
From: Huang Shijie @ 2016-11-03 2:27 UTC (permalink / raw)
To: catalin.marinas
Cc: will.deacon, steve.capper, kaly.xin, nd, dwoods,
linux-arm-kernel, linux-kernel, akpm, Huang Shijie
(1) Backgroud
For the arm64, the hugetlb page size can be 32M (PMD + Contiguous bit).
In the 4K page environment, the max page order is 10 (max_order - 1),
so 32M page is the gigantic page.
The arm64 MMU supports a Contiguous bit which is a hint that the PTE
is one of a set of contiguous entries which can be cached in a single
TLB entry. Please refer to the arm64v8 mannul :
DDI0487A_f_armv8_arm.pdf (in page D4-1811)
(2) The bugs
After I tested the libhugetlbfs, I found several bugs in arm64 code.
This patch set has all the bug fixes for the arm64.
(3) The test result in the Softiron and Juno-r1 boards:
This detail test result shows below (both the "make func" & "make stress"):
4KB granule:
1.1) PTE + Contiguous bit : 4K x 16 = 64K (per huge page size)
Test result : PASS
1.2) PMD : 2M x 1 = 2M (per huge page size)
Test result : PASS
1.3) PMD + Contiguous bit : 2M x 16 = 32M (per huge page size)
Test result : PASS
64KB granule:
3.1) PTE + Contiguous bit : 64K x 32 = 2M (per huge page size)
Test result : PASS
3.2) PMD + Contiguous bit : 512M x 32 = 16G (per huge page size)
Test result : no hardware to support this test
Huang Shijie (2):
arm64: hugetlb: remove the wrong pmd check in find_num_contig()
arm64: hugetlb: fix the wrong address for several functions
arch/arm64/mm/hugetlbpage.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
--
2.5.5
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
2016-11-03 2:27 [PATCH 0/2] arm64: fix the bugs found in the hugetlb test Huang Shijie
@ 2016-11-03 2:27 ` Huang Shijie
2016-11-04 0:16 ` Catalin Marinas
2016-11-03 2:27 ` [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions Huang Shijie
1 sibling, 1 reply; 7+ messages in thread
From: Huang Shijie @ 2016-11-03 2:27 UTC (permalink / raw)
To: catalin.marinas
Cc: will.deacon, steve.capper, kaly.xin, nd, dwoods,
linux-arm-kernel, linux-kernel, akpm, Huang Shijie
The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:
1.) pmd entry is not present.
2.) the page fault occurs:
... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()
3.) set_huge_pte_at() will only set the first PMD entry, since the
find_num_contig just return 1 in this case. So the PMD entries
are all empty except the first one.
4.) when kernel accesses the address mapped by the second PMD entry,
a new page fault occurs:
... hugetlb_fault() --> huge_ptep_set_access_flags()
The second PMD entry is still empty now.
5.) When the kernel returns, the access will cause a page fault again.
The kernel will run like the "4)" above.
We will see a dead loop since here.
The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).
This patch removes wrong pmd check, and fixes this dead loop.
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
arch/arm64/mm/hugetlbpage.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 2e49bd2..4811ef1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
return 1;
}
pmd = pmd_offset(pud, addr);
- if (!pmd_present(*pmd)) {
- VM_BUG_ON(!pmd_present(*pmd));
- return 1;
- }
if ((pte_t *)pmd == ptep) {
*pgsize = PMD_SIZE;
return CONT_PMDS;
--
2.5.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions
2016-11-03 2:27 [PATCH 0/2] arm64: fix the bugs found in the hugetlb test Huang Shijie
2016-11-03 2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
@ 2016-11-03 2:27 ` Huang Shijie
1 sibling, 0 replies; 7+ messages in thread
From: Huang Shijie @ 2016-11-03 2:27 UTC (permalink / raw)
To: catalin.marinas
Cc: will.deacon, steve.capper, kaly.xin, nd, dwoods,
linux-arm-kernel, linux-kernel, akpm, Huang Shijie
The libhugetlbfs meets several failures since the following functions
do not use the correct address:
huge_ptep_get_and_clear()
huge_ptep_set_access_flags()
huge_ptep_set_wrprotect()
huge_ptep_clear_flush()
This patch fixes the wrong address for them.
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
---
arch/arm64/mm/hugetlbpage.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 4811ef1..0e9401b 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -208,7 +208,7 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
/* save the 1st pte to return */
pte = ptep_get_and_clear(mm, addr, cpte);
- for (i = 1; i < ncontig; ++i) {
+ for (i = 1, addr += pgsize; i < ncontig; ++i, addr += pgsize) {
/*
* If HW_AFDBM is enabled, then the HW could
* turn on the dirty bit for any of the page
@@ -246,7 +246,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
pfn = pte_pfn(*cpte);
ncontig = find_num_contig(vma->vm_mm, addr, cpte,
*cpte, &pgsize);
- for (i = 0; i < ncontig; ++i, ++cpte) {
+ for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize) {
changed = ptep_set_access_flags(vma, addr, cpte,
pfn_pte(pfn,
hugeprot),
@@ -269,7 +269,7 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm,
cpte = huge_pte_offset(mm, addr);
ncontig = find_num_contig(mm, addr, cpte, *cpte, &pgsize);
- for (i = 0; i < ncontig; ++i, ++cpte)
+ for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize)
ptep_set_wrprotect(mm, addr, cpte);
} else {
ptep_set_wrprotect(mm, addr, ptep);
@@ -287,7 +287,7 @@ void huge_ptep_clear_flush(struct vm_area_struct *vma,
cpte = huge_pte_offset(vma->vm_mm, addr);
ncontig = find_num_contig(vma->vm_mm, addr, cpte,
*cpte, &pgsize);
- for (i = 0; i < ncontig; ++i, ++cpte)
+ for (i = 0; i < ncontig; ++i, ++cpte, addr += pgsize)
ptep_clear_flush(vma, addr, cpte);
} else {
ptep_clear_flush(vma, addr, ptep);
--
2.5.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
2016-11-03 2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
@ 2016-11-04 0:16 ` Catalin Marinas
2016-11-04 2:52 ` Huang Shijie
0 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2016-11-04 0:16 UTC (permalink / raw)
To: Huang Shijie
Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
nd, linux-arm-kernel
On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 2e49bd2..4811ef1 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> return 1;
> }
> pmd = pmd_offset(pud, addr);
> - if (!pmd_present(*pmd)) {
> - VM_BUG_ON(!pmd_present(*pmd));
> - return 1;
> - }
> if ((pte_t *)pmd == ptep) {
> *pgsize = PMD_SIZE;
> return CONT_PMDS;
BTW, for the !pud_present() and !pgd_present() cases, shouldn't
find_num_contig() actually return 0? These are more likely real bugs, so
no point in setting the huge pte.
--
Catalin
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
2016-11-04 0:16 ` Catalin Marinas
@ 2016-11-04 2:52 ` Huang Shijie
2016-11-04 15:48 ` Catalin Marinas
0 siblings, 1 reply; 7+ messages in thread
From: Huang Shijie @ 2016-11-04 2:52 UTC (permalink / raw)
To: Catalin Marinas
Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
nd, linux-arm-kernel
On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > index 2e49bd2..4811ef1 100644
> > --- a/arch/arm64/mm/hugetlbpage.c
> > +++ b/arch/arm64/mm/hugetlbpage.c
> > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > return 1;
> > }
> > pmd = pmd_offset(pud, addr);
> > - if (!pmd_present(*pmd)) {
> > - VM_BUG_ON(!pmd_present(*pmd));
> > - return 1;
> > - }
> > if ((pte_t *)pmd == ptep) {
> > *pgsize = PMD_SIZE;
> > return CONT_PMDS;
>
> BTW, for the !pud_present() and !pgd_present() cases, shouldn't
The kernel will not call the find_num_contig() if the PGD/PUD are empty.
Please see the code in the hugetlb_fault().
------------------------------------------------------
ptep = huge_pte_offset(mm, address);
if (ptep) {
...............................
} else {
ptep = huge_pte_alloc(mm, address, huge_page_size(h));
if (!ptep)
return VM_FAULT_OOM;
}
------------------------------------------------------
Thanks
Huang Shijie
> find_num_contig() actually return 0? These are more likely real bugs, so
> no point in setting the huge pte.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
2016-11-04 2:52 ` Huang Shijie
@ 2016-11-04 15:48 ` Catalin Marinas
2016-11-08 2:25 ` Huang Shijie
0 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2016-11-04 15:48 UTC (permalink / raw)
To: Huang Shijie
Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
nd, linux-arm-kernel
On Fri, Nov 04, 2016 at 10:52:17AM +0800, Huang Shijie wrote:
> On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> > On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > index 2e49bd2..4811ef1 100644
> > > --- a/arch/arm64/mm/hugetlbpage.c
> > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > > return 1;
> > > }
> > > pmd = pmd_offset(pud, addr);
> > > - if (!pmd_present(*pmd)) {
> > > - VM_BUG_ON(!pmd_present(*pmd));
> > > - return 1;
> > > - }
> > > if ((pte_t *)pmd == ptep) {
> > > *pgsize = PMD_SIZE;
> > > return CONT_PMDS;
> >
> > BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> > find_num_contig() actually return 0? These are more likely real bugs, so
> > no point in setting the huge pte.
>
> The kernel will not call the find_num_contig() if the PGD/PUD are empty.
> Please see the code in the hugetlb_fault().
>
> ------------------------------------------------------
> ptep = huge_pte_offset(mm, address);
> if (ptep) {
> ...............................
> } else {
> ptep = huge_pte_alloc(mm, address, huge_page_size(h));
> if (!ptep)
> return VM_FAULT_OOM;
> }
> ------------------------------------------------------
Exactly. So what is the reason for returning 1 if !pgd_present()? Would
removing the checks entirely or adding BUG() be a better option?
--
Catalin
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig()
2016-11-04 15:48 ` Catalin Marinas
@ 2016-11-08 2:25 ` Huang Shijie
0 siblings, 0 replies; 7+ messages in thread
From: Huang Shijie @ 2016-11-08 2:25 UTC (permalink / raw)
To: Catalin Marinas
Cc: dwoods, steve.capper, will.deacon, linux-kernel, kaly.xin, akpm,
nd, linux-arm-kernel
On Fri, Nov 04, 2016 at 09:48:14AM -0600, Catalin Marinas wrote:
> On Fri, Nov 04, 2016 at 10:52:17AM +0800, Huang Shijie wrote:
> > On Thu, Nov 03, 2016 at 06:16:16PM -0600, Catalin Marinas wrote:
> > > On Thu, Nov 03, 2016 at 10:27:38AM +0800, Huang Shijie wrote:
> > > > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> > > > index 2e49bd2..4811ef1 100644
> > > > --- a/arch/arm64/mm/hugetlbpage.c
> > > > +++ b/arch/arm64/mm/hugetlbpage.c
> > > > @@ -61,10 +61,6 @@ static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> > > > return 1;
> > > > }
> > > > pmd = pmd_offset(pud, addr);
> > > > - if (!pmd_present(*pmd)) {
> > > > - VM_BUG_ON(!pmd_present(*pmd));
> > > > - return 1;
> > > > - }
> > > > if ((pte_t *)pmd == ptep) {
> > > > *pgsize = PMD_SIZE;
> > > > return CONT_PMDS;
> > >
> > > BTW, for the !pud_present() and !pgd_present() cases, shouldn't
> > > find_num_contig() actually return 0? These are more likely real bugs, so
> > > no point in setting the huge pte.
> >
> > The kernel will not call the find_num_contig() if the PGD/PUD are empty.
> > Please see the code in the hugetlb_fault().
> >
> > ------------------------------------------------------
> > ptep = huge_pte_offset(mm, address);
> > if (ptep) {
> > ...............................
> > } else {
> > ptep = huge_pte_alloc(mm, address, huge_page_size(h));
> > if (!ptep)
> > return VM_FAULT_OOM;
> > }
> > ------------------------------------------------------
>
> Exactly. So what is the reason for returning 1 if !pgd_present()? Would
I think the author was too cautious for returning 1 if !pgd_present().
:)
> removing the checks entirely or adding BUG() be a better option?
I will remove the checks in the next version.
Thanks
Huang Shijie
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2016-11-08 2:40 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-03 2:27 [PATCH 0/2] arm64: fix the bugs found in the hugetlb test Huang Shijie
2016-11-03 2:27 ` [PATCH 1/2] arm64: hugetlb: remove the wrong pmd check in find_num_contig() Huang Shijie
2016-11-04 0:16 ` Catalin Marinas
2016-11-04 2:52 ` Huang Shijie
2016-11-04 15:48 ` Catalin Marinas
2016-11-08 2:25 ` Huang Shijie
2016-11-03 2:27 ` [PATCH 2/2] arm64: hugetlb: fix the wrong address for several functions Huang Shijie
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).