* [PATCH v2 0/3] arm64: Add support for handling memory corruption @ 2017-05-17 15:23 Punit Agrawal 2017-05-17 15:23 ` [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries Punit Agrawal ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Punit Agrawal @ 2017-05-17 15:23 UTC (permalink / raw) To: catalin.marinas, will.deacon Cc: Punit Agrawal, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer Hi, This series enables memory failure handling for arm64. Previous posting can be found at [0]. Changes since v1: * Reworked Patch 1 based on Catalin's feedbak to symmetrically deal with PUD and PMD hugepages in huge_pte_offset() * Added Steve's acks With support for contiguous hugepages being turned off[1], some of the problems arising from swap entries go away[2]. This simplifies the changes needed to enable memory corruption handling for arm64 (done in this seris). In this series, we updates huge_pte_offset() to correctly deal with swap entries (Patch 1). This function will need to be updated when contiguous hugepages are re-enabled. Patch 2 adds support to send SIGBUS to processes that have their memory corrupted. With the prerequisites in place, enable memory corruption handling for arm64 (patch 3). Thanks, Punit [0] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1376052.html [1] https://lkml.org/lkml/2017/4/7/486 [2] https://lkml.org/lkml/2017/4/5/402 Jonathan (Zhixiong) Zhang (2): arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling arm64: kconfig: allow support for memory failure handling Punit Agrawal (1): arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries arch/arm64/Kconfig | 1 + arch/arm64/include/asm/pgtable.h | 2 +- arch/arm64/mm/fault.c | 22 +++++++++++++++++++--- arch/arm64/mm/hugetlbpage.c | 29 ++++++++++------------------- 4 files changed, 31 insertions(+), 23 deletions(-) -- 2.11.0 ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-05-17 15:23 [PATCH v2 0/3] arm64: Add support for handling memory corruption Punit Agrawal @ 2017-05-17 15:23 ` Punit Agrawal 2017-06-07 13:47 ` Will Deacon 2017-06-07 14:58 ` Catalin Marinas 2017-05-17 15:23 ` [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling Punit Agrawal ` (2 subsequent siblings) 3 siblings, 2 replies; 15+ messages in thread From: Punit Agrawal @ 2017-05-17 15:23 UTC (permalink / raw) To: catalin.marinas, will.deacon Cc: Punit Agrawal, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer, David Woods When memory failure is enabled, a poisoned hugepage pte is marked as a swap entry. huge_pte_offset() does not return the poisoned page table entries when it encounters PUD/PMD hugepages. This behaviour of huge_pte_offset() leads to error such as below when munmap is called on poisoned hugepages. [ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074. Fix huge_pte_offset() to return the poisoned pte which is then appropriately handled by the generic layer code. Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Woods <dwoods@mellanox.com> --- arch/arm64/include/asm/pgtable.h | 2 +- arch/arm64/mm/hugetlbpage.c | 29 ++++++++++------------------- 2 files changed, 11 insertions(+), 20 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index c213fdbd056c..6eae342ced6b 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -441,7 +441,7 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd) #define pud_none(pud) (!pud_val(pud)) #define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT)) -#define pud_present(pud) (pud_val(pud)) +#define pud_present(pud) pte_present(pud_pte(pud)) static inline void set_pud(pud_t *pudp, pud_t pud) { diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 7514a000e361..69b8200b1cfd 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) { pgd_t *pgd; pud_t *pud; - pmd_t *pmd = NULL; - pte_t *pte = NULL; + pmd_t *pmd; pgd = pgd_offset(mm, addr); pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); if (!pgd_present(*pgd)) return NULL; + pud = pud_offset(pgd, addr); - if (!pud_present(*pud)) + if (pud_none(*pud)) return NULL; - - if (pud_huge(*pud)) + /* swap or huge page */ + if (!pud_present(*pud) || pud_huge(*pud)) return (pte_t *)pud; + /* table; check the next level */ + pmd = pmd_offset(pud, addr); - if (!pmd_present(*pmd)) + if (pmd_none(*pmd)) return NULL; - - if (pte_cont(pmd_pte(*pmd))) { - pmd = pmd_offset( - pud, (addr & CONT_PMD_MASK)); - return (pte_t *)pmd; - } - if (pmd_huge(*pmd)) + if (!pmd_present(*pmd) || pmd_huge(*pmd)) return (pte_t *)pmd; - pte = pte_offset_kernel(pmd, addr); - if (pte_present(*pte) && pte_cont(*pte)) { - pte = pte_offset_kernel( - pmd, (addr & CONT_PTE_MASK)); - return pte; - } + return NULL; } -- 2.11.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-05-17 15:23 ` [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries Punit Agrawal @ 2017-06-07 13:47 ` Will Deacon 2017-06-07 14:30 ` Catalin Marinas 2017-06-07 14:58 ` Catalin Marinas 1 sibling, 1 reply; 15+ messages in thread From: Will Deacon @ 2017-06-07 13:47 UTC (permalink / raw) To: Punit Agrawal Cc: catalin.marinas, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer, David Woods On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: > When memory failure is enabled, a poisoned hugepage pte is marked as a > swap entry. huge_pte_offset() does not return the poisoned page table > entries when it encounters PUD/PMD hugepages. > > This behaviour of huge_pte_offset() leads to error such as below when > munmap is called on poisoned hugepages. > > [ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074. > > Fix huge_pte_offset() to return the poisoned pte which is then > appropriately handled by the generic layer code. > > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > Acked-by: Steve Capper <steve.capper@arm.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: David Woods <dwoods@mellanox.com> > --- > arch/arm64/include/asm/pgtable.h | 2 +- > arch/arm64/mm/hugetlbpage.c | 29 ++++++++++------------------- > 2 files changed, 11 insertions(+), 20 deletions(-) > > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h > index c213fdbd056c..6eae342ced6b 100644 > --- a/arch/arm64/include/asm/pgtable.h > +++ b/arch/arm64/include/asm/pgtable.h > @@ -441,7 +441,7 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd) > > #define pud_none(pud) (!pud_val(pud)) > #define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT)) > -#define pud_present(pud) (pud_val(pud)) > +#define pud_present(pud) pte_present(pud_pte(pud)) > > static inline void set_pud(pud_t *pudp, pud_t pud) > { > diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c > index 7514a000e361..69b8200b1cfd 100644 > --- a/arch/arm64/mm/hugetlbpage.c > +++ b/arch/arm64/mm/hugetlbpage.c > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) > { > pgd_t *pgd; > pud_t *pud; > - pmd_t *pmd = NULL; > - pte_t *pte = NULL; > + pmd_t *pmd; > > pgd = pgd_offset(mm, addr); > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); > if (!pgd_present(*pgd)) > return NULL; > + > pud = pud_offset(pgd, addr); > - if (!pud_present(*pud)) > + if (pud_none(*pud)) > return NULL; Do you actually need this special case? > - > - if (pud_huge(*pud)) > + /* swap or huge page */ > + if (!pud_present(*pud) || pud_huge(*pud)) ... couldn't you just add a '|| pud_none(*pud)' in here? > return (pte_t *)pud; > + /* table; check the next level */ > + > pmd = pmd_offset(pud, addr); > - if (!pmd_present(*pmd)) > + if (pmd_none(*pmd)) > return NULL; > - > - if (pte_cont(pmd_pte(*pmd))) { > - pmd = pmd_offset( > - pud, (addr & CONT_PMD_MASK)); > - return (pte_t *)pmd; > - } > - if (pmd_huge(*pmd)) > + if (!pmd_present(*pmd) || pmd_huge(*pmd)) Similarly here. Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-06-07 13:47 ` Will Deacon @ 2017-06-07 14:30 ` Catalin Marinas 2017-06-07 14:57 ` Will Deacon 0 siblings, 1 reply; 15+ messages in thread From: Catalin Marinas @ 2017-06-07 14:30 UTC (permalink / raw) To: Will Deacon Cc: Punit Agrawal, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer On Wed, Jun 07, 2017 at 02:47:32PM +0100, Will Deacon wrote: > On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: > > --- a/arch/arm64/mm/hugetlbpage.c > > +++ b/arch/arm64/mm/hugetlbpage.c > > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) > > { > > pgd_t *pgd; > > pud_t *pud; > > - pmd_t *pmd = NULL; > > - pte_t *pte = NULL; > > + pmd_t *pmd; > > > > pgd = pgd_offset(mm, addr); > > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); > > if (!pgd_present(*pgd)) > > return NULL; > > + > > pud = pud_offset(pgd, addr); > > - if (!pud_present(*pud)) > > + if (pud_none(*pud)) > > return NULL; > > Do you actually need this special case? > > > - > > - if (pud_huge(*pud)) > > + /* swap or huge page */ > > + if (!pud_present(*pud) || pud_huge(*pud)) > > ... couldn't you just add a '|| pud_none(*pud)' in here? > > > return (pte_t *)pud; But then you no longer return NULL if *pud == 0. -- Catalin ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-06-07 14:30 ` Catalin Marinas @ 2017-06-07 14:57 ` Will Deacon 2017-06-07 15:32 ` Punit Agrawal 0 siblings, 1 reply; 15+ messages in thread From: Will Deacon @ 2017-06-07 14:57 UTC (permalink / raw) To: Catalin Marinas Cc: Punit Agrawal, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer On Wed, Jun 07, 2017 at 03:30:37PM +0100, Catalin Marinas wrote: > On Wed, Jun 07, 2017 at 02:47:32PM +0100, Will Deacon wrote: > > On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: > > > --- a/arch/arm64/mm/hugetlbpage.c > > > +++ b/arch/arm64/mm/hugetlbpage.c > > > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) > > > { > > > pgd_t *pgd; > > > pud_t *pud; > > > - pmd_t *pmd = NULL; > > > - pte_t *pte = NULL; > > > + pmd_t *pmd; > > > > > > pgd = pgd_offset(mm, addr); > > > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); > > > if (!pgd_present(*pgd)) > > > return NULL; > > > + > > > pud = pud_offset(pgd, addr); > > > - if (!pud_present(*pud)) > > > + if (pud_none(*pud)) > > > return NULL; > > > > Do you actually need this special case? > > > > > - > > > - if (pud_huge(*pud)) > > > + /* swap or huge page */ > > > + if (!pud_present(*pud) || pud_huge(*pud)) > > > > ... couldn't you just add a '|| pud_none(*pud)' in here? > > > > > return (pte_t *)pud; > > But then you no longer return NULL if *pud == 0. Does that actually matter? The bits of hugetlb code I looked at will deferenced the returned pud and handle the huge_pte_none case correctly. Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-06-07 14:57 ` Will Deacon @ 2017-06-07 15:32 ` Punit Agrawal 2017-06-07 15:41 ` Will Deacon 2017-06-07 16:54 ` Catalin Marinas 0 siblings, 2 replies; 15+ messages in thread From: Punit Agrawal @ 2017-06-07 15:32 UTC (permalink / raw) To: Will Deacon Cc: Catalin Marinas, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer Will Deacon <will.deacon@arm.com> writes: > On Wed, Jun 07, 2017 at 03:30:37PM +0100, Catalin Marinas wrote: >> On Wed, Jun 07, 2017 at 02:47:32PM +0100, Will Deacon wrote: >> > On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: >> > > --- a/arch/arm64/mm/hugetlbpage.c >> > > +++ b/arch/arm64/mm/hugetlbpage.c >> > > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) >> > > { >> > > pgd_t *pgd; >> > > pud_t *pud; >> > > - pmd_t *pmd = NULL; >> > > - pte_t *pte = NULL; >> > > + pmd_t *pmd; >> > > >> > > pgd = pgd_offset(mm, addr); >> > > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); >> > > if (!pgd_present(*pgd)) >> > > return NULL; >> > > + >> > > pud = pud_offset(pgd, addr); >> > > - if (!pud_present(*pud)) >> > > + if (pud_none(*pud)) >> > > return NULL; >> > >> > Do you actually need this special case? >> > >> > > - >> > > - if (pud_huge(*pud)) >> > > + /* swap or huge page */ >> > > + if (!pud_present(*pud) || pud_huge(*pud)) >> > >> > ... couldn't you just add a '|| pud_none(*pud)' in here? >> > I think an earlier version took this approach but... >> > > return (pte_t *)pud; >> >> But then you no longer return NULL if *pud == 0. > > Does that actually matter? The bits of hugetlb code I looked at will > deferenced the returned pud and handle the huge_pte_none case correctly. For hugetlb fault handling (hugetlb_fault()), returning NULL vs pointer to the pud/pmd results in different behaviour. If we return the pud when pud_none(), then we lose the resulting hugepage size check we get from huge_pte_alloc(). > > Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-06-07 15:32 ` Punit Agrawal @ 2017-06-07 15:41 ` Will Deacon 2017-06-08 16:28 ` Punit Agrawal 2017-06-07 16:54 ` Catalin Marinas 1 sibling, 1 reply; 15+ messages in thread From: Will Deacon @ 2017-06-07 15:41 UTC (permalink / raw) To: Punit Agrawal Cc: Catalin Marinas, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer On Wed, Jun 07, 2017 at 04:32:28PM +0100, Punit Agrawal wrote: > Will Deacon <will.deacon@arm.com> writes: > > > On Wed, Jun 07, 2017 at 03:30:37PM +0100, Catalin Marinas wrote: > >> On Wed, Jun 07, 2017 at 02:47:32PM +0100, Will Deacon wrote: > >> > On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: > >> > > --- a/arch/arm64/mm/hugetlbpage.c > >> > > +++ b/arch/arm64/mm/hugetlbpage.c > >> > > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) > >> > > { > >> > > pgd_t *pgd; > >> > > pud_t *pud; > >> > > - pmd_t *pmd = NULL; > >> > > - pte_t *pte = NULL; > >> > > + pmd_t *pmd; > >> > > > >> > > pgd = pgd_offset(mm, addr); > >> > > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); > >> > > if (!pgd_present(*pgd)) > >> > > return NULL; > >> > > + > >> > > pud = pud_offset(pgd, addr); > >> > > - if (!pud_present(*pud)) > >> > > + if (pud_none(*pud)) > >> > > return NULL; > >> > > >> > Do you actually need this special case? > >> > > >> > > - > >> > > - if (pud_huge(*pud)) > >> > > + /* swap or huge page */ > >> > > + if (!pud_present(*pud) || pud_huge(*pud)) > >> > > >> > ... couldn't you just add a '|| pud_none(*pud)' in here? > >> > > > I think an earlier version took this approach but... > > >> > > return (pte_t *)pud; > >> > >> But then you no longer return NULL if *pud == 0. > > > > Does that actually matter? The bits of hugetlb code I looked at will > > deferenced the returned pud and handle the huge_pte_none case correctly. > > For hugetlb fault handling (hugetlb_fault()), returning NULL vs pointer > to the pud/pmd results in different behaviour. If we return the pud when > pud_none(), then we lose the resulting hugepage size check we get from > huge_pte_alloc(). Ok, so does that mean that many of the huge_pte_none checks in mm/hugetlb.c that operate on a huge_ptep_get of non-NULL output from huge_pte_offset are actually redundant? Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-06-07 15:41 ` Will Deacon @ 2017-06-08 16:28 ` Punit Agrawal 0 siblings, 0 replies; 15+ messages in thread From: Punit Agrawal @ 2017-06-08 16:28 UTC (permalink / raw) To: Will Deacon Cc: Catalin Marinas, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer Will Deacon <will.deacon@arm.com> writes: > On Wed, Jun 07, 2017 at 04:32:28PM +0100, Punit Agrawal wrote: >> Will Deacon <will.deacon@arm.com> writes: >> >> > On Wed, Jun 07, 2017 at 03:30:37PM +0100, Catalin Marinas wrote: >> >> On Wed, Jun 07, 2017 at 02:47:32PM +0100, Will Deacon wrote: >> >> > On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: >> >> > > --- a/arch/arm64/mm/hugetlbpage.c >> >> > > +++ b/arch/arm64/mm/hugetlbpage.c >> >> > > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) >> >> > > { >> >> > > pgd_t *pgd; >> >> > > pud_t *pud; >> >> > > - pmd_t *pmd = NULL; >> >> > > - pte_t *pte = NULL; >> >> > > + pmd_t *pmd; >> >> > > >> >> > > pgd = pgd_offset(mm, addr); >> >> > > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); >> >> > > if (!pgd_present(*pgd)) >> >> > > return NULL; >> >> > > + >> >> > > pud = pud_offset(pgd, addr); >> >> > > - if (!pud_present(*pud)) >> >> > > + if (pud_none(*pud)) >> >> > > return NULL; >> >> > >> >> > Do you actually need this special case? >> >> > >> >> > > - >> >> > > - if (pud_huge(*pud)) >> >> > > + /* swap or huge page */ >> >> > > + if (!pud_present(*pud) || pud_huge(*pud)) >> >> > >> >> > ... couldn't you just add a '|| pud_none(*pud)' in here? >> >> > >> >> I think an earlier version took this approach but... >> >> >> > > return (pte_t *)pud; >> >> >> >> But then you no longer return NULL if *pud == 0. >> > >> > Does that actually matter? The bits of hugetlb code I looked at will >> > deferenced the returned pud and handle the huge_pte_none case correctly. >> >> For hugetlb fault handling (hugetlb_fault()), returning NULL vs pointer >> to the pud/pmd results in different behaviour. If we return the pud when >> pud_none(), then we lose the resulting hugepage size check we get from >> huge_pte_alloc(). > > Ok, so does that mean that many of the huge_pte_none checks in mm/hugetlb.c > that operate on a huge_ptep_get of non-NULL output from huge_pte_offset are > actually redundant? Summarising our offline discussion - the semantics of huge_pte_offset() are unclear in terms of when a pointer to (p*d) is returned vs when to return NULL. As part of enabling contiguous hugepage support[0][1], I have a patch to add a size argument to huge_pte_offset() that can then help disambiguate when we have a valid pte* vs when we have an error (NULL). I'll separately look at clarifying the semantics of the generic version of huge_pte_offse() and potentially also look at unifying the huge_pte_offset() implementations across the various architectures. [0] https://lkml.org/lkml/2017/5/24/463 [1] https://www.spinics.net/lists/arm-kernel/msg583367.html > > Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-06-07 15:32 ` Punit Agrawal 2017-06-07 15:41 ` Will Deacon @ 2017-06-07 16:54 ` Catalin Marinas 1 sibling, 0 replies; 15+ messages in thread From: Catalin Marinas @ 2017-06-07 16:54 UTC (permalink / raw) To: Punit Agrawal Cc: Will Deacon, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer On Wed, Jun 07, 2017 at 04:32:28PM +0100, Punit Agrawal wrote: > Will Deacon <will.deacon@arm.com> writes: > > On Wed, Jun 07, 2017 at 03:30:37PM +0100, Catalin Marinas wrote: > >> On Wed, Jun 07, 2017 at 02:47:32PM +0100, Will Deacon wrote: > >> > On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: > >> > > --- a/arch/arm64/mm/hugetlbpage.c > >> > > +++ b/arch/arm64/mm/hugetlbpage.c > >> > > @@ -136,36 +136,27 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) > >> > > { > >> > > pgd_t *pgd; > >> > > pud_t *pud; > >> > > - pmd_t *pmd = NULL; > >> > > - pte_t *pte = NULL; > >> > > + pmd_t *pmd; > >> > > > >> > > pgd = pgd_offset(mm, addr); > >> > > pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd); > >> > > if (!pgd_present(*pgd)) > >> > > return NULL; > >> > > + > >> > > pud = pud_offset(pgd, addr); > >> > > - if (!pud_present(*pud)) > >> > > + if (pud_none(*pud)) > >> > > return NULL; > >> > > >> > Do you actually need this special case? > >> > > >> > > - > >> > > - if (pud_huge(*pud)) > >> > > + /* swap or huge page */ > >> > > + if (!pud_present(*pud) || pud_huge(*pud)) > >> > > >> > ... couldn't you just add a '|| pud_none(*pud)' in here? > >> > > > I think an earlier version took this approach but... > > >> > > return (pte_t *)pud; > >> > >> But then you no longer return NULL if *pud == 0. > > > > Does that actually matter? The bits of hugetlb code I looked at will > > deferenced the returned pud and handle the huge_pte_none case correctly. > > For hugetlb fault handling (hugetlb_fault()), returning NULL vs pointer > to the pud/pmd results in different behaviour. If we return the pud when > pud_none(), then we lose the resulting hugepage size check we get from > huge_pte_alloc(). At a quick look, there are a few other places where not returning NULL has some other effects (though I don't think any of them are fatal): - copy_huge_tlb_page_range() - unnecessary allocation of a destination pud - huge_pmd_share() - do we actually need the subsequent get_page()? - page_vma_mapped_walk() - it even has a comment: "when pud is not present, pte will be NULL". Now, that's no longer true with swap entries but we'd never return NULL for a pud_none() case Current code behaviour is to return NULL when !p*d_present(). We are slightly relaxing this for swap entries while still returning NULL for the p*d_none() case but I wouldn't go that far as to never return NULL here. -- Catalin ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries 2017-05-17 15:23 ` [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries Punit Agrawal 2017-06-07 13:47 ` Will Deacon @ 2017-06-07 14:58 ` Catalin Marinas 1 sibling, 0 replies; 15+ messages in thread From: Catalin Marinas @ 2017-06-07 14:58 UTC (permalink / raw) To: Punit Agrawal Cc: will.deacon, David Woods, steve.capper, tbaicar, linux-kernel, linux-arm-kernel, manoj.iyer On Wed, May 17, 2017 at 04:23:34PM +0100, Punit Agrawal wrote: > When memory failure is enabled, a poisoned hugepage pte is marked as a > swap entry. huge_pte_offset() does not return the poisoned page table > entries when it encounters PUD/PMD hugepages. > > This behaviour of huge_pte_offset() leads to error such as below when > munmap is called on poisoned hugepages. > > [ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074. > > Fix huge_pte_offset() to return the poisoned pte which is then > appropriately handled by the generic layer code. > > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > Acked-by: Steve Capper <steve.capper@arm.com> > Cc: Catalin Marinas <catalin.marinas@arm.com> > Cc: David Woods <dwoods@mellanox.com> Since the patch matches my suggestions in v1: Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling 2017-05-17 15:23 [PATCH v2 0/3] arm64: Add support for handling memory corruption Punit Agrawal 2017-05-17 15:23 ` [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries Punit Agrawal @ 2017-05-17 15:23 ` Punit Agrawal 2017-06-07 13:59 ` Will Deacon 2017-05-17 15:23 ` [PATCH v2 3/3] arm64: kconfig: allow support for memory failure handling Punit Agrawal [not found] ` <1495081650.3981.15@smtp.canonical.com> 3 siblings, 1 reply; 15+ messages in thread From: Punit Agrawal @ 2017-05-17 15:23 UTC (permalink / raw) To: catalin.marinas, will.deacon Cc: Jonathan (Zhixiong) Zhang, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer, Punit Agrawal From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org> Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar to VM_FAULT_OOM, the only difference is that a different si_code (BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is initialized. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> (fix new __do_user_fault call-site) Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> --- arch/arm64/mm/fault.c | 22 +++++++++++++++++++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 37b95dff0b07..a85b44343ac6 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -31,6 +31,7 @@ #include <linux/highmem.h> #include <linux/perf_event.h> #include <linux/preempt.h> +#include <linux/hugetlb.h> #include <asm/bug.h> #include <asm/cpufeature.h> @@ -239,10 +240,11 @@ static void __do_kernel_fault(struct mm_struct *mm, unsigned long addr, */ static void __do_user_fault(struct task_struct *tsk, unsigned long addr, unsigned int esr, unsigned int sig, int code, - struct pt_regs *regs) + struct pt_regs *regs, int fault) { struct siginfo si; const struct fault_info *inf; + unsigned int lsb = 0; if (unhandled_signal(tsk, sig) && show_unhandled_signals_ratelimited()) { inf = esr_to_fault_info(esr); @@ -259,6 +261,17 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr, si.si_errno = 0; si.si_code = code; si.si_addr = (void __user *)addr; + /* + * Either small page or large page may be poisoned. + * In other words, VM_FAULT_HWPOISON_LARGE and + * VM_FAULT_HWPOISON are mutually exclusive. + */ + if (fault & VM_FAULT_HWPOISON_LARGE) + lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault)); + else if (fault & VM_FAULT_HWPOISON) + lsb = PAGE_SHIFT; + si.si_addr_lsb = lsb; + force_sig_info(sig, &si, tsk); } @@ -274,7 +287,7 @@ static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *re */ if (user_mode(regs)) { inf = esr_to_fault_info(esr); - __do_user_fault(tsk, addr, esr, inf->sig, inf->code, regs); + __do_user_fault(tsk, addr, esr, inf->sig, inf->code, regs, 0); } else __do_kernel_fault(mm, addr, esr, regs); } @@ -461,6 +474,9 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, */ sig = SIGBUS; code = BUS_ADRERR; + } else if (fault & (VM_FAULT_HWPOISON | VM_FAULT_HWPOISON_LARGE)) { + sig = SIGBUS; + code = BUS_MCEERR_AR; } else { /* * Something tried to access memory that isn't in our memory @@ -471,7 +487,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr, SEGV_ACCERR : SEGV_MAPERR; } - __do_user_fault(tsk, addr, esr, sig, code, regs); + __do_user_fault(tsk, addr, esr, sig, code, regs, fault); return 0; no_context: -- 2.11.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling 2017-05-17 15:23 ` [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling Punit Agrawal @ 2017-06-07 13:59 ` Will Deacon 2017-06-07 17:47 ` Punit Agrawal 0 siblings, 1 reply; 15+ messages in thread From: Will Deacon @ 2017-06-07 13:59 UTC (permalink / raw) To: Punit Agrawal Cc: catalin.marinas, Jonathan (Zhixiong) Zhang, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer On Wed, May 17, 2017 at 04:23:35PM +0100, Punit Agrawal wrote: > From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org> > > Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault > handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar > to VM_FAULT_OOM, the only difference is that a different si_code > (BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is > initialized. > > Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> > Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> > (fix new __do_user_fault call-site) > Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> > Acked-by: Steve Capper <steve.capper@arm.com> > --- > arch/arm64/mm/fault.c | 22 +++++++++++++++++++--- > 1 file changed, 19 insertions(+), 3 deletions(-) > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c > index 37b95dff0b07..a85b44343ac6 100644 > --- a/arch/arm64/mm/fault.c > +++ b/arch/arm64/mm/fault.c > @@ -31,6 +31,7 @@ > #include <linux/highmem.h> > #include <linux/perf_event.h> > #include <linux/preempt.h> > +#include <linux/hugetlb.h> > > #include <asm/bug.h> > #include <asm/cpufeature.h> > @@ -239,10 +240,11 @@ static void __do_kernel_fault(struct mm_struct *mm, unsigned long addr, > */ > static void __do_user_fault(struct task_struct *tsk, unsigned long addr, > unsigned int esr, unsigned int sig, int code, > - struct pt_regs *regs) > + struct pt_regs *regs, int fault) > { > struct siginfo si; > const struct fault_info *inf; > + unsigned int lsb = 0; > > if (unhandled_signal(tsk, sig) && show_unhandled_signals_ratelimited()) { > inf = esr_to_fault_info(esr); > @@ -259,6 +261,17 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr, > si.si_errno = 0; > si.si_code = code; > si.si_addr = (void __user *)addr; > + /* > + * Either small page or large page may be poisoned. > + * In other words, VM_FAULT_HWPOISON_LARGE and > + * VM_FAULT_HWPOISON are mutually exclusive. > + */ > + if (fault & VM_FAULT_HWPOISON_LARGE) > + lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault)); > + else if (fault & VM_FAULT_HWPOISON) > + lsb = PAGE_SHIFT; > + si.si_addr_lsb = lsb; > + If we're going to start handling poison faults, then we should probably rejig the perf page fault accounting around here so that we follow x86: * Always report PERF_COUNT_SW_PAGE_FAULTS, * Don't report anything else for VM_FAULT_ERROR * Report PERF_COUNT_SW_PAGE_FAULTS_MAJ if VM_FAULT_MAJOR * Otherwise, report PERF_COUNT_SW_PAGE_FAULTS_MIN at the moment, I think you're accounting VM_FAULT_ERROR as PERF_COUNT_SW_PAGE_FAULTS_MIN, which doesn't feel right at all. Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling 2017-06-07 13:59 ` Will Deacon @ 2017-06-07 17:47 ` Punit Agrawal 0 siblings, 0 replies; 15+ messages in thread From: Punit Agrawal @ 2017-06-07 17:47 UTC (permalink / raw) To: Will Deacon Cc: catalin.marinas, Jonathan (Zhixiong) Zhang, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer Will Deacon <will.deacon@arm.com> writes: > On Wed, May 17, 2017 at 04:23:35PM +0100, Punit Agrawal wrote: >> From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org> >> >> Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault >> handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar >> to VM_FAULT_OOM, the only difference is that a different si_code >> (BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is >> initialized. >> >> Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> >> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> >> (fix new __do_user_fault call-site) >> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> >> Acked-by: Steve Capper <steve.capper@arm.com> >> --- >> arch/arm64/mm/fault.c | 22 +++++++++++++++++++--- >> 1 file changed, 19 insertions(+), 3 deletions(-) >> >> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c >> index 37b95dff0b07..a85b44343ac6 100644 >> --- a/arch/arm64/mm/fault.c >> +++ b/arch/arm64/mm/fault.c >> @@ -31,6 +31,7 @@ >> #include <linux/highmem.h> >> #include <linux/perf_event.h> >> #include <linux/preempt.h> >> +#include <linux/hugetlb.h> >> >> #include <asm/bug.h> >> #include <asm/cpufeature.h> >> @@ -239,10 +240,11 @@ static void __do_kernel_fault(struct mm_struct *mm, unsigned long addr, >> */ >> static void __do_user_fault(struct task_struct *tsk, unsigned long addr, >> unsigned int esr, unsigned int sig, int code, >> - struct pt_regs *regs) >> + struct pt_regs *regs, int fault) >> { >> struct siginfo si; >> const struct fault_info *inf; >> + unsigned int lsb = 0; >> >> if (unhandled_signal(tsk, sig) && show_unhandled_signals_ratelimited()) { >> inf = esr_to_fault_info(esr); >> @@ -259,6 +261,17 @@ static void __do_user_fault(struct task_struct *tsk, unsigned long addr, >> si.si_errno = 0; >> si.si_code = code; >> si.si_addr = (void __user *)addr; >> + /* >> + * Either small page or large page may be poisoned. >> + * In other words, VM_FAULT_HWPOISON_LARGE and >> + * VM_FAULT_HWPOISON are mutually exclusive. >> + */ >> + if (fault & VM_FAULT_HWPOISON_LARGE) >> + lsb = hstate_index_to_shift(VM_FAULT_GET_HINDEX(fault)); >> + else if (fault & VM_FAULT_HWPOISON) >> + lsb = PAGE_SHIFT; >> + si.si_addr_lsb = lsb; >> + > > If we're going to start handling poison faults, then we should probably > rejig the perf page fault accounting around here so that we follow x86: > > * Always report PERF_COUNT_SW_PAGE_FAULTS, > * Don't report anything else for VM_FAULT_ERROR > * Report PERF_COUNT_SW_PAGE_FAULTS_MAJ if VM_FAULT_MAJOR > * Otherwise, report PERF_COUNT_SW_PAGE_FAULTS_MIN > > at the moment, I think you're accounting VM_FAULT_ERROR as > PERF_COUNT_SW_PAGE_FAULTS_MIN, which doesn't feel right at all. Ok. I'd missed the implication of enabling poisoned pages fault handling on the perf accounting. I'll add a patch to update the reporting. > > Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 3/3] arm64: kconfig: allow support for memory failure handling 2017-05-17 15:23 [PATCH v2 0/3] arm64: Add support for handling memory corruption Punit Agrawal 2017-05-17 15:23 ` [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries Punit Agrawal 2017-05-17 15:23 ` [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling Punit Agrawal @ 2017-05-17 15:23 ` Punit Agrawal [not found] ` <1495081650.3981.15@smtp.canonical.com> 3 siblings, 0 replies; 15+ messages in thread From: Punit Agrawal @ 2017-05-17 15:23 UTC (permalink / raw) To: catalin.marinas, will.deacon Cc: Jonathan (Zhixiong) Zhang, tbaicar, steve.capper, linux-arm-kernel, linux-kernel, manoj.iyer, Punit Agrawal From: "Jonathan (Zhixiong) Zhang" <zjzhang@codeaurora.org> Declare ARCH_SUPPORTS_MEMORY_FAILURE, as arm64 does support memory failure recovery attempt. Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org> (Dropped changes to ACPI APEI Kconfig and updated commit log) Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 3dcd7ec69bca..39986b63383e 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -20,6 +20,7 @@ config ARM64 select ARCH_HAS_STRICT_MODULE_RWX select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_USE_CMPXCHG_LOCKREF + select ARCH_SUPPORTS_MEMORY_FAILURE select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_NUMA_BALANCING select ARCH_WANT_COMPAT_IPC_PARSE_VERSION -- 2.11.0 ^ permalink raw reply related [flat|nested] 15+ messages in thread
[parent not found: <1495081650.3981.15@smtp.canonical.com>]
* Re: [PATCH v2 0/3] arm64: Add support for handling memory corruption [not found] ` <1495081650.3981.15@smtp.canonical.com> @ 2017-05-18 10:27 ` Punit Agrawal 0 siblings, 0 replies; 15+ messages in thread From: Punit Agrawal @ 2017-05-18 10:27 UTC (permalink / raw) To: Manoj Iyer Cc: catalin.marinas, will.deacon, tbaicar, steve.capper, linux-arm-kernel, linux-kernel Hi Manoj, Manoj Iyer <manoj.iyer@canonical.com> writes: > On Wed, May 17, 2017 at 10:23 AM, Punit Agrawal > <punit.agrawal@arm.com> wrote: > > Hi, This series enables memory failure handling for arm64. > Previous posting can be found at [0]. Changes since v1: * Reworked > Patch 1 based on Catalin's feedbak to symmetrically deal with PUD > and PMD hugepages in huge_pte_offset() * Added Steve's acks With > support for contiguous hugepages being turned off[1], some of the > problems arising from swap entries go away[2]. This simplifies the > changes needed to enable memory corruption handling for arm64 > (done in this seris). In this series, we updates huge_pte_offset() > to correctly deal with swap entries (Patch 1). This function will > need to be updated when contiguous hugepages are re-enabled. Patch > 2 adds support to send SIGBUS to processes that have their memory > corrupted. With the prerequisites in place, enable memory > corruption handling for arm64 (patch 3). > > Thanks, Punit [0] > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1376052.html > [1] https://lkml.org/lkml/2017/4/7/486 [2] > https://lkml.org/lkml/2017/4/5/402 Jonathan (Zhixiong) Zhang (2): > arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling arm64: > kconfig: allow support for memory failure handling Punit Agrawal > (1): arm64: hugetlb: Fix huge_pte_offset to return poisoned page > table entries arch/arm64/Kconfig | 1 + > arch/arm64/include/asm/pgtable.h | 2 +- arch/arm64/mm/fault.c | 22 > +++++++++++++++++++--- arch/arm64/mm/hugetlbpage.c | 29 > ++++++++++------------------- 4 files changed, 31 insertions(+), > 23 deletions(-) > -- > 2.11.0 > > I applied Jonathans 2 patches to Ubuntu Zesty kernel (4.10) and ran > the mce-test ./run_hugepage.sh after fixing a few things in the test > case. This generated the bad pmd messages. > > linux-4.10.0/mm/pgtable-generic.c:33: bad pmd 0000000172420074. > > Then I applied Punit's patch to the kernel and re-ran the mce-test and > did not see the bad pmd messages. The tests were done on a Qualcomm > Centriq 2400 platform. > > Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Thanks for taking the patches for a spin. ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2017-06-08 16:28 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-05-17 15:23 [PATCH v2 0/3] arm64: Add support for handling memory corruption Punit Agrawal 2017-05-17 15:23 ` [PATCH v2 1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries Punit Agrawal 2017-06-07 13:47 ` Will Deacon 2017-06-07 14:30 ` Catalin Marinas 2017-06-07 14:57 ` Will Deacon 2017-06-07 15:32 ` Punit Agrawal 2017-06-07 15:41 ` Will Deacon 2017-06-08 16:28 ` Punit Agrawal 2017-06-07 16:54 ` Catalin Marinas 2017-06-07 14:58 ` Catalin Marinas 2017-05-17 15:23 ` [PATCH v2 2/3] arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling Punit Agrawal 2017-06-07 13:59 ` Will Deacon 2017-06-07 17:47 ` Punit Agrawal 2017-05-17 15:23 ` [PATCH v2 3/3] arm64: kconfig: allow support for memory failure handling Punit Agrawal [not found] ` <1495081650.3981.15@smtp.canonical.com> 2017-05-18 10:27 ` [PATCH v2 0/3] arm64: Add support for handling memory corruption Punit Agrawal
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).