From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: TLB and PTE coherency during munmap
Date: Wed, 29 May 2013 14:27:28 +0200
Message-ID: <20130529122728.GA27176@twins.programming.kicks-ass.net>
References: <51A45861.1010008@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from merlin.infradead.org ([205.233.59.134]:52614 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S965969Ab3E2M1r (ORCPT ); Wed, 29 May 2013 08:27:47 -0400
Content-Disposition: inline
In-Reply-To: <51A45861.1010008@gmail.com>
Sender: linux-arch-owner@vger.kernel.org
List-ID:
To: Max Filippov
Cc: KAMEZAWA Hiroyuki , linux-arch@vger.kernel.org, linux-mm@kvack.org, Ralf Baechle , Chris Zankel , Marc Gauthier , linux-xtensa@linux-xtensa.org, Hugh Dickins

On Tue, May 28, 2013 at 11:10:25AM +0400, Max Filippov wrote:
> On Sun, May 26, 2013 at 6:50 AM, Max Filippov wrote:
> > Hello arch and mm people.
> >
> > Is it intentional that threads of a process that invoked munmap syscall
> > can see TLB entries pointing to already freed pages, or it is a bug?
> >
> > I'm talking about zap_pmd_range and zap_pte_range:
> >
> > zap_pmd_range
> >   zap_pte_range
> >     arch_enter_lazy_mmu_mode
> >       ptep_get_and_clear_full
> >       tlb_remove_tlb_entry
> >       __tlb_remove_page
> >     arch_leave_lazy_mmu_mode
> >   cond_resched
> >
> > With the default arch_{enter,leave}_lazy_mmu_mode, tlb_remove_tlb_entry
> > and __tlb_remove_page there is a loop in the zap_pte_range that clears
> > PTEs and frees corresponding pages, but doesn't flush TLB, and
> > surrounding loop in the zap_pmd_range that calls cond_resched. If a thread
> > of the same process gets scheduled then it is able to see TLB entries
> > pointing to already freed physical pages.
> >
> > I've noticed that with xtensa arch when I added a test before returning to
> > userspace checking that TLB contents agrees with page tables of the
> > current mm. This check reliably fires with the LTP test mtest05 that
> > maps, unmaps and accesses memory from multiple threads.
> >
> > Is there anything wrong in my description, maybe something specific to
> > my arch, or this issue really exists?
>
> Hi,
>
> I've made similar checking function for MIPS (because qemu is my only choice
> and it simulates MIPS TLB) and ran my tests on mips-malta machine in qemu.
> With MIPS I can also see this issue. I hope I did it right, the patch at the
> bottom is for the reference. The test I run and the diagnostic output are as
> follows:
>
> To me it looks like the cond_resched in the zap_pmd_range is the root cause
> of this issue (let alone SMP case for now). It was introduced in the commit
>
> commit 97a894136f29802da19a15541de3c019e1ca147e
> Author: Peter Zijlstra
> Date: Tue May 24 17:12:04 2011 -0700
>
> mm: Remove i_mmap_lock lockbreak
>
> Peter, Kamezawa, other reviewers of that commit, could you please comment?

Are you all running UP systems? I suppose the preemptible muck
invalidated the assumption that UP systems are 'easy'.

If you make tlb_fast_mode() return an unconditional false, does it all
work again?
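
For illustration only, the ordering problem described in the quoted report
can be sketched as a toy user-space model. This is not kernel code: every
name below (toy_mm, toy_mm_init, toy_zap_range, NPAGES) is hypothetical,
the "TLB" is just an array, and the fast_mode flag only loosely mirrors
the question about tlb_fast_mode() above. It assumes that in fast mode a
page is freed as soon as its PTE is cleared, while otherwise the free is
deferred until after the TLB flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Toy model (hypothetical): one page table and one software "TLB". */
struct toy_mm {
    int  pte[NPAGES];        /* frame number, or -1 if unmapped */
    int  tlb[NPAGES];        /* cached frame number, or -1 if not cached */
    bool frame_free[NPAGES]; /* true once the frame has been freed */
};

void toy_mm_init(struct toy_mm *mm)
{
    for (size_t i = 0; i < NPAGES; i++) {
        mm->pte[i] = (int)i;       /* identity mapping */
        mm->tlb[i] = (int)i;       /* pretend every page was touched */
        mm->frame_free[i] = false;
    }
}

static void toy_flush_tlb(struct toy_mm *mm)
{
    for (size_t i = 0; i < NPAGES; i++)
        mm->tlb[i] = -1;
}

/* Count TLB entries that still point at an already freed frame. */
static int toy_count_stale(const struct toy_mm *mm)
{
    int stale = 0;

    for (size_t i = 0; i < NPAGES; i++)
        if (mm->tlb[i] >= 0 && mm->frame_free[mm->tlb[i]])
            stale++;
    return stale;
}

/*
 * Clear every PTE, then report how many stale TLB entries a sibling
 * thread could observe at the point where the real code may reschedule.
 * fast_mode == true:  free each frame immediately, before any flush.
 * fast_mode == false: defer the frees until after the TLB flush.
 */
int toy_zap_range(struct toy_mm *mm, bool fast_mode)
{
    int deferred[NPAGES];
    size_t ndeferred = 0;

    for (size_t i = 0; i < NPAGES; i++) {
        int frame = mm->pte[i];

        assert(frame >= 0 && frame < NPAGES);
        mm->pte[i] = -1;                   /* clear the PTE */
        if (fast_mode)
            mm->frame_free[frame] = true;  /* freed with no flush yet */
        else
            deferred[ndeferred++] = frame; /* batch for later */
    }

    /* A sibling thread scheduled here sees this many stale entries. */
    int stale = toy_count_stale(mm);

    toy_flush_tlb(mm);
    for (size_t i = 0; i < ndeferred; i++)
        mm->frame_free[deferred[i]] = true; /* safe: TLB already flushed */

    return stale;
}
```

In this model, calling toy_mm_init() and then toy_zap_range(&mm, true)
reports NPAGES stale entries at the reschedule point, while
toy_zap_range(&mm, false) reports none, because no frame is freed until
the flush has happened.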