From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751332AbdKMA2f (ORCPT ); Sun, 12 Nov 2017 19:28:35 -0500
Received: from LGEAMRELO12.lge.com ([156.147.23.52]:37545 "EHLO
	lgeamrelo12.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751020AbdKMA2e (ORCPT );
	Sun, 12 Nov 2017 19:28:34 -0500
X-Original-SENDERIP: 156.147.1.125
X-Original-MAILFROM: minchan@kernel.org
X-Original-SENDERIP: 10.177.220.163
X-Original-MAILFROM: minchan@kernel.org
Date: Mon, 13 Nov 2017 09:28:33 +0900
From: Minchan Kim
To: Michal Hocko
Cc: Wang Nan, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	will.deacon@arm.com, Bob Liu, Andrew Morton, David Rientjes,
	Ingo Molnar, Roman Gushchin, Konstantin Khlebnikov,
	Andrea Arcangeli
Subject: Re: [PATCH] arch, mm: introduce arch_tlb_gather_mmu_lazy (was: Re: [RESEND PATCH] mm, oom_reaper: gather each vma to prevent) leaking TLB entry
Message-ID: <20171113002833.GA18301@bbox>
References: <20171107095453.179940-1-wangnan0@huawei.com>
	<20171110001933.GA12421@bbox>
	<20171110101529.op6yaxtdke2p4bsh@dhcp22.suse.cz>
	<20171110122635.q26xdxytgdfjy5q3@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20171110122635.q26xdxytgdfjy5q3@dhcp22.suse.cz>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 10, 2017 at 01:26:35PM +0100, Michal Hocko wrote:
> On Fri 10-11-17 11:15:29, Michal Hocko wrote:
> > On Fri 10-11-17 09:19:33, Minchan Kim wrote:
> > > On Tue, Nov 07, 2017 at 09:54:53AM +0000, Wang Nan wrote:
> > > > tlb_gather_mmu(&tlb, mm, 0, -1) means gathering the whole virtual memory
> > > > space. In this case, tlb->fullmm is true. Some archs like arm64 doesn't
> > > > flush TLB when tlb->fullmm is true:
> > > >
> > > > commit 5a7862e83000 ("arm64: tlbflush: avoid flushing when fullmm == 1").
> > > >
> > > > Which makes leaking of tlb entries.
> > >
> > > That means soft-dirty which has used tlb_gather_mmu with fullmm could be
> > > broken via losing write-protection bit once it supports arm64 in future?
> > >
> > > If so, it would be better to use TASK_SIZE rather than -1 in tlb_gather_mmu.
> > > Of course, it's a off-topic.
> >
> > I wouldn't play tricks like that. And maybe the API itself could be more
> > explicit. E.g. add a lazy parameter which would allow arch specific code
> > to not flush if it is sure that nobody can actually stumble over missed
> > flush. E.g. the following?
>
> This one has a changelog and even compiles on my crosscompile test
> ---
> From 7f0fcd2cab379ddac5611b2a520cdca8a77a235b Mon Sep 17 00:00:00 2001
> From: Michal Hocko
> Date: Fri, 10 Nov 2017 11:27:17 +0100
> Subject: [PATCH] arch, mm: introduce arch_tlb_gather_mmu_lazy
>
> 5a7862e83000 ("arm64: tlbflush: avoid flushing when fullmm == 1") has
> introduced an optimization to not flush tlb when we are tearing the
> whole address space down. Will goes on to explain
>
> : Basically, we tag each address space with an ASID (PCID on x86) which
> : is resident in the TLB. This means we can elide TLB invalidation when
> : pulling down a full mm because we won't ever assign that ASID to
> : another mm without doing TLB invalidation elsewhere (which actually
> : just nukes the whole TLB).
>
> This all is nice but tlb_gather users are not aware of that and this can
> actually cause some real problems. E.g. the oom_reaper tries to reap the
> whole address space but it might race with threads accessing the memory [1].
> It is possible that soft-dirty handling might suffer from the same
> problem [2].
>
> Introduce an explicit lazy variant tlb_gather_mmu_lazy which allows the
> behavior arm64 implements for the fullmm case and replace it by an
> explicit lazy flag in the mmu_gather structure. exit_mmap path is then
> turned into the explicit lazy variant.
> Other architectures simply ignore
> the flag.
>
> [1] http://lkml.kernel.org/r/20171106033651.172368-1-wangnan0@huawei.com
> [2] http://lkml.kernel.org/r/20171110001933.GA12421@bbox
> Signed-off-by: Michal Hocko
> ---
>  arch/arm/include/asm/tlb.h   |  3 ++-
>  arch/arm64/include/asm/tlb.h |  2 +-
>  arch/ia64/include/asm/tlb.h  |  3 ++-
>  arch/s390/include/asm/tlb.h  |  3 ++-
>  arch/sh/include/asm/tlb.h    |  2 +-
>  arch/um/include/asm/tlb.h    |  2 +-
>  include/asm-generic/tlb.h    |  6 ++++--
>  include/linux/mm_types.h     |  2 ++
>  mm/memory.c                  | 17 +++++++++++++++--
>  mm/mmap.c                    |  2 +-
>  10 files changed, 31 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index d5562f9ce600..fe9042aee8e9 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -149,7 +149,8 @@ static inline void tlb_flush_mmu(struct mmu_gather *tlb)
>
>  static inline void
>  arch_tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm,
> -			unsigned long start, unsigned long end)
> +			unsigned long start, unsigned long end,
> +			bool lazy)
>

Thanks for the patch, Michal. However, it would be nice to do it
transparently, without requiring new flags from users.

When I read tlb_gather_mmu's description, fullmm is supposed to be used
only if there are no users and the range is the full address space.
That means we could do it in the API itself, like this?

void arch_tlb_gather_mmu(...)

	tlb->fullmm = !(start | (end + 1)) && atomic_read(&mm->mm_users) == 0;
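
To make the proposed condition concrete, here is a minimal userspace
sketch of it. This is only a model: model_fullmm() and its plain int
mm_users argument are hypothetical stand-ins for the kernel's
tlb->fullmm assignment and atomic_read(&mm->mm_users), and the range
values in the checks are picked purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the proposed fullmm test. All names here are
 * hypothetical stand-ins; this is not kernel code.
 */
static bool model_fullmm(unsigned long start, unsigned long end, int mm_users)
{
	/*
	 * start | (end + 1) is zero only for start == 0 and end == -1UL,
	 * because end + 1 wraps around to 0 in unsigned arithmetic.
	 */
	bool whole_range = !(start | (end + 1));

	/* mm_users stands in for atomic_read(&mm->mm_users). */
	return whole_range && mm_users == 0;
}

/* Walk through the kinds of callers discussed in the thread. */
static void model_fullmm_selfcheck(void)
{
	/* Whole range, no users left: teardown, flush may be elided. */
	assert(model_fullmm(0, -1UL, 0));

	/* Whole range but users remain: must not be treated as fullmm. */
	assert(!model_fullmm(0, -1UL, 1));

	/* Bounded end (illustrative value): not the whole range. */
	assert(!model_fullmm(0, 0x7fffffffUL, 0));

	/* Non-zero start: not the whole range either. */
	assert(!model_fullmm(0x1000, -1UL, 0));
}
```

With this shape, a caller that passes (0, -1) while other users of the
mm can still run would no longer get fullmm set, and no caller has to
pass a new flag.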