From: Mel Gorman <mgorman@suse.de>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>
Subject: Re: Potential race in TLB flush batching?
Date: Wed, 12 Jul 2017 09:27:33 +0100
Message-ID: <20170712082733.ouf7yx2bnvwwcfms@suse.de>
In-Reply-To: <9ECCACFE-6006-4C19-8FC0-C387EB5F3BEE@gmail.com>

On Tue, Jul 11, 2017 at 03:27:55PM -0700, Nadav Amit wrote:
> Mel Gorman <mgorman@suse.de> wrote:
> 
> > On Tue, Jul 11, 2017 at 09:09:23PM +0100, Mel Gorman wrote:
> >> On Tue, Jul 11, 2017 at 08:18:23PM +0100, Mel Gorman wrote:
> >>> I don't think we should be particularly clever about this and instead just
> >>> flush the full mm if there is a risk of a parallel batching of flushing is
> >>> in progress resulting in a stale TLB entry being used. I think tracking mms
> >>> that are currently batching would end up being costly in terms of memory,
> >>> fairly complex, or both. Something like this?
> >> 
> >> mremap and madvise(DONTNEED) would also need to flush. Memory policies are
> >> fine as a move_pages call that hits the race will simply fail to migrate
> >> a page that is being freed and once migration starts, it'll be flushed so
> >> a stale access has no further risk. copy_page_range should also be ok as
> >> the old mm is flushed and the new mm cannot have entries yet.
> > 
> > Adding those results in
> 
> You are way too fast for me.
> 
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -637,12 +637,34 @@ static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
> > 		return false;
> > 
> > 	/* If remote CPUs need to be flushed then defer batch the flush */
> > -	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
> > +	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids) {
> > 		should_defer = true;
> > +		mm->tlb_flush_batched = true;
> > +	}
> 
> Since mm->tlb_flush_batched is set before the PTE is actually cleared, it
> still seems to leave a short window for a race.
> 
> CPU0				CPU1
> ---- 				----
> should_defer_flush
> => mm->tlb_flush_batched=true		
> 				flush_tlb_batched_pending (another PT)
> 				=> flush TLB
> 				=> mm->tlb_flush_batched=false
> ptep_get_and_clear
> ...
> 
> 				flush_tlb_batched_pending (batched PT)
> 				use the stale PTE
> ...
> try_to_unmap_flush
> 
> IOW it seems that mm->tlb_flush_batched should be set after the PTE is
> cleared (and have some compiler barrier to be on the safe side).

I'm relying on the setting and clearing of tlb_flush_batched happening
under a PTL that is contended if the race is active.

If reclaim is first, it'll take the PTL and set tlb_flush_batched while a
racing mprotect/munmap/etc spins. On release, the racing mprotect/munmap
immediately calls flush_tlb_batched_pending() before proceeding as normal,
finding pte_none with the TLB flushed.

If the mprotect/munmap/etc is first, it'll take the PTL, observe
pte_present and handle the flushing itself while reclaim potentially
spins. When reclaim acquires the lock, it'll still set tlb_flush_batched.

As it's the PTL that is taken for that field, it is possible for the
accesses to be re-ordered but only in the case where a race is not
occurring. I'll think some more about whether barriers are necessary, but
so far I have concluded they aren't needed in this instance. With the
set/clear+flush done under the PTL, the protection is similar to normal
page table operations that do not batch the flush.
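
The two orderings described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions: every
model_* name is hypothetical, the PTL is modelled as a flag because the
two critical sections never overlap, and both paths are assumed to
contend on the same PTL, which is the premise of the argument above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one PTE; these are not kernel APIs. */
struct model_mm {
	bool ptl_held;          /* stands in for the page table lock (PTL) */
	bool pte_present;       /* state of the single modelled PTE */
	bool tlb_flush_batched; /* mirrors mm->tlb_flush_batched from the patch */
	bool tlb_stale;         /* a remote TLB may still cache the old PTE */
};

static void model_ptl_lock(struct model_mm *mm)
{
	assert(!mm->ptl_held);  /* critical sections never overlap */
	mm->ptl_held = true;
}

static void model_ptl_unlock(struct model_mm *mm)
{
	assert(mm->ptl_held);
	mm->ptl_held = false;
}

/* Reclaim: set the flag and clear the PTE under the PTL, defer the flush. */
static void model_reclaim_unmap(struct model_mm *mm)
{
	model_ptl_lock(mm);
	if (mm->pte_present) {
		mm->tlb_flush_batched = true;
		mm->pte_present = false;
		mm->tlb_stale = true;   /* flushed later, in the batch */
	}
	model_ptl_unlock(mm);
}

/*
 * mprotect/munmap-style operation. Returns true if it would proceed
 * while a stale TLB entry still exists.
 */
static bool model_racing_op(struct model_mm *mm)
{
	bool stale;

	model_ptl_lock(mm);
	if (mm->tlb_flush_batched) {    /* flush_tlb_batched_pending() analogue */
		mm->tlb_stale = false;
		mm->tlb_flush_batched = false;
	}
	if (mm->pte_present)
		mm->tlb_stale = false;  /* handles its own flush, unbatched */
	stale = mm->tlb_stale;
	model_ptl_unlock(mm);
	return stale;
}

/* Run both critical sections in one order; report what the racing op saw. */
static bool model_stale_seen(bool reclaim_first)
{
	struct model_mm mm = { .pte_present = true };
	bool stale;

	if (reclaim_first) {
		model_reclaim_unmap(&mm);
		stale = model_racing_op(&mm);
	} else {
		stale = model_racing_op(&mm);
		model_reclaim_unmap(&mm);
		assert(mm.tlb_flush_batched);  /* reclaim still sets the flag */
	}
	return stale;
}
```

Calling model_stale_seen(true) and model_stale_seen(false) covers the
"reclaim first" and "mprotect/munmap first" cases. The model says nothing
about paths that take different PTLs or about memory ordering outside the
lock.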

> One more question, please: how does elevated page count or even locking the
> page help (as you mention in regard to uprobes and ksm)? Yes, the page will
> not be reclaimed, but IIUC try_to_unmap is called before the reference count
> is frozen, and the page lock is dropped on each iteration of the loop in
> shrink_page_list. In this case, it seems to me that uprobes or ksm may still
> not flush the TLB.
> 

If the page lock is held then reclaim skips the page entirely, and uprobe,
ksm and cow hold the page lock for pages that can potentially be observed
by reclaim.  That is the primary protection for those paths.
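
As a rough illustration of that skip, here is a minimal userspace sketch.
The model_* names are hypothetical and only loosely mirror the
trylock-then-skip pattern in reclaim; the sketch assumes the other path
already holds the page lock when reclaim reaches the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page; not the kernel's struct page. */
struct model_page {
	bool locked;    /* page lock held by someone */
	bool unmapped;  /* reclaim ran its unmap step on this page */
};

static bool model_trylock_page(struct model_page *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

static void model_unlock_page(struct model_page *page)
{
	assert(page->locked);
	page->locked = false;
}

/* Reclaim pass: returns how many pages it unmapped. */
static size_t model_shrink_page_list(struct model_page *pages, size_t nr)
{
	size_t i, nr_unmapped = 0;

	for (i = 0; i < nr; i++) {
		if (!model_trylock_page(&pages[i]))
			continue;   /* lock held elsewhere: skip the page entirely */
		pages[i].unmapped = true;
		nr_unmapped++;
		model_unlock_page(&pages[i]);
	}
	return nr_unmapped;
}

/* One page locked by a uprobe/ksm/cow-style path, one page idle. */
static bool model_locked_page_is_skipped(void)
{
	struct model_page pages[2] = { { .locked = true }, { .locked = false } };
	size_t nr = model_shrink_page_list(pages, 2);

	return nr == 1 && !pages[0].unmapped && pages[1].unmapped;
}
```

The locked page is never unmapped by the modelled reclaim pass, so no
deferred flush is created for it while the lock holder is working on it.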

The elevated page count is less relevant, but I was keeping it in mind
while trying to think of cases where a stale TLB entry exists and points
to the wrong page.

-- 
Mel Gorman
SUSE Labs
