From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 23 Dec 2016 10:17:25 +0100
From: Michal Hocko
To: Minchan Kim
Cc: "Kirill A. Shutemov", Andrew Morton, linux-mm@kvack.org,
    Jason Evans, Will Deacon, Catalin Marinas,
    linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    "[4.5+]", Andreas Schwab
Subject: Re: [PATCH] mm: pmd dirty emulation in page fault handler
Message-ID: <20161223091725.GA23117@dhcp22.suse.cz>
References: <1482364101-16204-1-git-send-email-minchan@kernel.org>
    <20161222081713.GA32480@node.shutemov.name>
    <20161222145203.GA18970@bbox>
In-Reply-To: <20161222145203.GA18970@bbox>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org

On Thu 22-12-16 23:52:03, Minchan Kim wrote:
[...]
> From b3ec95c0df91ad113525968a4a6b53030fd0b48d Mon Sep 17 00:00:00 2001
> From: Minchan Kim
> Date: Thu, 22 Dec 2016 23:43:49 +0900
> Subject: [PATCH v2] mm: pmd dirty emulation in page fault handler
>
> Andreas reported [1] made a test in jemalloc hang in THP mode in arm64.
> http://lkml.kernel.org/r/mvmmvfy37g1.fsf@hawking.suse.de
>
> The problem is page fault handler supports only accessed flag emulation
> for THP page of SW-dirty/accessed architecture.
>
> This patch enables dirty-bit emulation for those architectures.
> Without it, MADV_FREE makes application hang by repeated fault forever.

The changelog is rather terse. Considering that the issue is rather
subtle and that the patch is aimed at the stable tree, I think it could
use more information. How do we end up looping in the page fault, and
why does the dirty pmd stop it? Could you update the changelog to be
more verbose, please?

I am still digesting this patch but I believe it is correct, fwiw...

Thanks!

> [1] b8d3c4c3009d, mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called
>
> Cc: Jason Evans
> Cc: Kirill A. Shutemov
> Cc: Will Deacon
> Cc: Catalin Marinas
> Cc: linux-arch@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: [4.5+]
> Fixes: b8d3c4c3009d ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
> Reported-by: Andreas Schwab
> Signed-off-by: Minchan Kim
> ---
> * from v1
>   * Remove __handle_mm_fault part - Kirill
>
>  mm/huge_memory.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 10eedbf..29ec8a4 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -883,15 +883,17 @@ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd)
>  {
>  	pmd_t entry;
>  	unsigned long haddr;
> +	bool write = vmf->flags & FAULT_FLAG_WRITE;
>
>  	vmf->ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
>  	if (unlikely(!pmd_same(*vmf->pmd, orig_pmd)))
>  		goto unlock;
>
>  	entry = pmd_mkyoung(orig_pmd);
> +	if (write)
> +		entry = pmd_mkdirty(entry);
>  	haddr = vmf->address & HPAGE_PMD_MASK;
> -	if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry,
> -				vmf->flags & FAULT_FLAG_WRITE))
> +	if (pmdp_set_access_flags(vmf->vma, haddr, vmf->pmd, entry, write))
>  		update_mmu_cache_pmd(vmf->vma, vmf->address, vmf->pmd);
>
>  unlock:
> --
> 2.7.4
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: email@kvack.org

-- 
Michal Hocko
SUSE Labs
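
For readers trying to follow the question about the fault loop, here is a
rough user-space sketch of one plausible reading of the problem. It is not
kernel code and not taken from the patch. The names toy_pmd, access_faults,
toy_set_accessed and toy_store are made up for illustration, and the rule
"a store faults until the software dirty bit is set" is an assumption about
how architectures with software-managed dirty bits behave. It also assumes
that MADV_FREE has left the huge pmd young but clean.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a huge-page pmd on an architecture that emulates the
 * accessed and dirty bits in software. Hypothetical, not kernel code. */
struct toy_pmd {
	bool young;	/* software "accessed" bit */
	bool dirty;	/* software "dirty" bit */
};

/* Assumption: any access needs young set, a store also needs dirty set. */
static bool access_faults(const struct toy_pmd *pmd, bool write)
{
	return !pmd->young || (write && !pmd->dirty);
}

/* Stand-in for huge_pmd_set_accessed(); 'patched' selects the v2 logic. */
static void toy_set_accessed(struct toy_pmd *pmd, bool write, bool patched)
{
	pmd->young = true;		/* what the handler always did */
	if (patched && write)
		pmd->dirty = true;	/* what the patch adds */
}

/* Retry one store until it no longer faults or max_faults is reached.
 * Returns the number of faults taken. */
static int toy_store(bool patched, int max_faults)
{
	/* Assumed state after MADV_FREE: mapped and young, but clean. */
	struct toy_pmd pmd = { .young = true, .dirty = false };
	int faults = 0;

	assert(max_faults > 0);
	while (access_faults(&pmd, true) && faults < max_faults) {
		toy_set_accessed(&pmd, true, patched);
		faults++;
	}
	return faults;
}
```

In this model, toy_store(true, 1000) takes a single fault and then the store
goes through, while toy_store(false, 1000) keeps faulting until it hits the
cap, because nothing in the unpatched handler ever sets the dirty bit. The
cap exists only so the sketch terminates.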