All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vineet Gupta <Vineet.Gupta1@synopsys.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	linux-mm@vger.kernel.org, linux-xtensa@linux-xtensa.org,
	Chris Zankel <chris@zankel.net>,
	Marc Gauthier <Marc.Gauthier@tensilica.com>
Subject: Re: TLB and PTE coherency during munmap
Date: Wed, 29 May 2013 17:23:14 +0530	[thread overview]
Message-ID: <51A5EC2A.4070903@synopsys.com> (raw)
In-Reply-To: <CAHkRjk4ZNwZvf_Cv+HqfMManodCkEpCPdZokPQ68z3nVG8-+wg@mail.gmail.com>

On 05/28/2013 08:05 PM, Catalin Marinas wrote:
> Max,
> 
> On 26 May 2013 03:42, Max Filippov <jcmvbkbc@gmail.com> wrote:
>> Hello arch and mm people.
>>
>> Is it intentional that threads of a process that invoked munmap syscall
>> can see TLB entries pointing to already freed pages, or it is a bug?
> 
> If it happens, this would be a bug. It means that a process can access
> a physical page that has been allocated to something else, possibly
> kernel data.
> 
>> I'm talking about zap_pmd_range and zap_pte_range:
>>
>>       zap_pmd_range
>>         zap_pte_range
>>           arch_enter_lazy_mmu_mode
>>             ptep_get_and_clear_full
>>             tlb_remove_tlb_entry
>>             __tlb_remove_page
>>           arch_leave_lazy_mmu_mode
>>         cond_resched
>>
>> With the default arch_{enter,leave}_lazy_mmu_mode, tlb_remove_tlb_entry
>> and __tlb_remove_page there is a loop in the zap_pte_range that clears
>> PTEs and frees corresponding pages, but doesn't flush TLB, and
>> surrounding loop in the zap_pmd_range that calls cond_resched. If a thread
>> of the same process gets scheduled then it is able to see TLB entries
>> pointing to already freed physical pages.
> 
> It looks to me like cond_resched() here introduces a possible bug but
> it depends on the actual arch code, especially the
> __tlb_remove_tlb_entry() function. On ARM we record the range in
> tlb_remove_tlb_entry() and queue the pages to be removed in
> __tlb_remove_page(). It pretty much acts like tlb_fast_mode() == 0
> even for the UP case (which is also needed for hardware speculative
> TLB loads). The tlb_finish_mmu() takes care of whatever pages are left
> to be freed.
> 
> With a dummy __tlb_remove_tlb_entry() and tlb_fast_mode() == 1,
> cond_resched() in zap_pmd_range() would cause problems.
> 
> I think possible workarounds:
> 
> 1. tlb_fast_mode() always returning 0.

This might add needless page free batching logic so not very lucrative.

> 2. add a tlb_flush_mmu(tlb) before cond_resched() in zap_pmd_range().

For !fullmm flushes it might be no-op on some arches (atleast on ARC) as we use
tlb_end_vma() to do TLB range flush. And flushing the entire TLB would be
excessive though.

Actually zap_pte_range() already has logic to flush the TLB range (if batching
runs out of space). Can we re-use that infrastructure to make sure zap_pte_range()
does it's share of TLB flushing before returning and going into cond_resched().

However with that, we need to prevent tlb_end_vma()/tlb_finish_mmu() from
duplicating the range flush - which can be done by clearing tlb->need_flush.

Now simplistically this will cause even the fullmm flushes (simple ASID increment
on ARC/ARM..) to become TLB walks to flush the individual entries so we can do
this for only for !fullmm, assuming that cond_resched() can potentially cause an
exit'ing task's thread to be scheduled in and reuse the entries.

Let me go off cook a patch to see if this might work.

-Vineet

  parent reply	other threads:[~2013-05-29 11:53 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-05-26  2:42 TLB and PTE coherency during munmap Max Filippov
2013-05-26  2:50 ` Max Filippov
2013-05-26  2:50   ` Max Filippov
2013-05-28  7:10   ` Max Filippov
2013-05-28  7:10     ` Max Filippov
2013-05-29 12:27     ` Peter Zijlstra
2013-05-29 12:27       ` Peter Zijlstra
2013-05-29 12:42       ` Vineet Gupta
2013-05-29 12:42         ` Vineet Gupta
2013-05-29 12:47         ` Peter Zijlstra
2013-05-29 12:47           ` Peter Zijlstra
2013-05-29 17:51         ` Peter Zijlstra
2013-05-29 17:51           ` Peter Zijlstra
2013-05-29 22:04           ` Catalin Marinas
2013-05-29 22:04             ` Catalin Marinas
2013-05-30  6:48             ` Peter Zijlstra
2013-05-30  6:48               ` Peter Zijlstra
2013-05-30  5:04           ` Vineet Gupta
2013-05-30  5:04             ` Vineet Gupta
2013-05-30  6:56             ` Peter Zijlstra
2013-05-30  6:56               ` Peter Zijlstra
2013-05-30  7:00               ` Vineet Gupta
2013-05-30  7:00                 ` Vineet Gupta
2013-05-30 11:03                 ` Peter Zijlstra
2013-05-30 11:03                   ` Peter Zijlstra
2013-05-31  4:09           ` Max Filippov
2013-05-31  4:09             ` Max Filippov
2013-05-31  7:55             ` Peter Zijlstra
2013-05-31  7:55               ` Peter Zijlstra
2013-06-03  9:05             ` Peter Zijlstra
2013-06-03  9:05               ` Peter Zijlstra
2013-06-03  9:16               ` Ingo Molnar
2013-06-03  9:16                 ` Ingo Molnar
2013-06-03 10:01                 ` Catalin Marinas
2013-06-03 10:01                   ` Catalin Marinas
2013-06-03 10:04                   ` Peter Zijlstra
2013-06-03 10:04                     ` Peter Zijlstra
2013-06-03 10:09                     ` Catalin Marinas
2013-06-03 10:09                       ` Catalin Marinas
2013-06-04  9:52               ` Peter Zijlstra
2013-06-04  9:52                 ` Peter Zijlstra
2013-06-05  0:05                 ` Linus Torvalds
2013-06-05  0:05                   ` Linus Torvalds
2013-06-05 10:26                   ` [PATCH] arch, mm: Remove tlb_fast_mode() Peter Zijlstra
2013-06-05 10:26                     ` Peter Zijlstra
2013-05-31  1:40       ` TLB and PTE coherency during munmap Max Filippov
2013-05-31  1:40         ` Max Filippov
2013-05-28 14:34   ` Konrad Rzeszutek Wilk
2013-05-28 14:34     ` Konrad Rzeszutek Wilk
2013-05-29  3:23     ` Max Filippov
2013-05-29  3:23       ` Max Filippov
2013-05-28 15:16   ` Michal Hocko
2013-05-28 15:16     ` Michal Hocko
2013-05-28 15:23     ` Catalin Marinas
2013-05-28 15:23       ` Catalin Marinas
2013-05-28 14:35 ` Catalin Marinas
2013-05-29  4:15   ` Max Filippov
2013-05-29  4:15     ` Max Filippov
2013-05-29 10:15     ` Catalin Marinas
2013-05-29 10:15       ` Catalin Marinas
2013-05-31  1:26       ` Max Filippov
2013-05-31  1:26         ` Max Filippov
2013-05-31  9:06         ` Catalin Marinas
2013-05-31  9:06           ` Catalin Marinas
2013-06-03  9:16         ` Max Filippov
2013-06-03  9:16           ` Max Filippov
2013-05-29 11:53   ` Vineet Gupta [this message]
2013-05-29 12:00   ` Vineet Gupta
2013-05-29 12:00     ` Vineet Gupta
2013-06-07  2:21 George Spelvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51A5EC2A.4070903@synopsys.com \
    --to=vineet.gupta1@synopsys.com \
    --cc=Marc.Gauthier@tensilica.com \
    --cc=catalin.marinas@arm.com \
    --cc=chris@zankel.net \
    --cc=jcmvbkbc@gmail.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-mm@vger.kernel.org \
    --cc=linux-xtensa@linux-xtensa.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.