From: Mel Gorman <mgorman@suse.de>
To: Nadav Amit <nadav.amit@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>
Subject: Re: Potential race in TLB flush batching?
Date: Tue, 11 Jul 2017 22:09:19 +0100
Message-ID: <20170711210919.y4odiqtfeb4e3ulz@suse.de>
In-Reply-To: <3373F577-F289-4028-B6F6-777D029A7B07@gmail.com>
On Tue, Jul 11, 2017 at 01:06:48PM -0700, Nadav Amit wrote:
> > +/*
> > + * Reclaim batches unmaps of pages under the PTL but does not flush the
> > + * TLB prior to releasing the PTL. It's possible a parallel mprotect or
> > + * munmap can race between reclaim unmapping the page and flushing the
> > + * page. If this race occurs, it potentially allows access to data via
> > + * a stale TLB entry. Tracking all mm's that have TLB batching pending
> > + * would be expensive during reclaim so instead track whether TLB batching
> > + * occurred in the past and if so then do a full mm flush here. This will
> > + * cost one additional flush per reclaim cycle paid by the first munmap or
> > + * mprotect. This assumes it's called under the PTL to synchronise access
> > + * to mm->tlb_flush_batched.
> > + */
> > +void flush_tlb_batched_pending(struct mm_struct *mm)
> > +{
> > +	if (mm->tlb_flush_batched) {
> > +		flush_tlb_mm(mm);
> > +		mm->tlb_flush_batched = false;
> > +	}
> > +}
> > #else
> > static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable)
> > {
>
> I don't know what is exactly the invariant that is kept, so it is hard for
> me to figure out all sort of questions:
>
> Should pte_accessible return true if mm->tlb_flush_batched == true?
>
It shouldn't be necessary. The contexts where we hit the path are

	uprobes: elevated page count so no parallel reclaim
	dax: PTEs are not mapping pages that would be reclaimed
	hugetlbfs: not reclaimed
	ksm: holds the page lock and elevates the count so cannot race with
	     reclaim
	cow: at the time of the flush, the page count is elevated so cannot
	     race with reclaim
	page_mkclean: only concerned with marking existing PTEs clean but in
	     any case, the batching flushes the TLB before issuing any IO so
	     there isn't space for a stale TLB entry to be used for
	     something bad.
> Does madvise_free_pte_range need to be modified as well?
>
Yes, I noticed that shortly after sending the first version and
commented upon it.
> How will future code not break anything?
>
I can't really answer that without a crystal ball. Code dealing with page
table updates would need to take some care if it can race with parallel
reclaim.
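
As a rough illustration of that care, here is a userspace toy model of the
flag protocol in the quoted patch. It is not kernel code: every toy_* name
is hypothetical, the PTL is only indicated by comments, and the TLB flush is
simulated with a counter. The only thing it is meant to show is that a page
table updater calls the pending-flush check before touching PTEs, and that
the first updater after a reclaim cycle pays exactly one full flush.

```c
#include <stdbool.h>

/* Toy stand-in for struct mm_struct: only what this sketch needs. */
struct toy_mm {
	bool tlb_flush_batched;		/* set when reclaim defers a flush */
	unsigned int full_flushes;	/* counts simulated full mm flushes */
};

/* Simulated full-mm flush; the real flush_tlb_mm() is arch code. */
static void toy_flush_tlb_mm(struct toy_mm *mm)
{
	mm->full_flushes++;
}

/* Reclaim side: unmap under the PTL, defer the flush, record that fact. */
static void toy_reclaim_unmap(struct toy_mm *mm)
{
	mm->tlb_flush_batched = true;
}

/* Mirrors the quoted flush_tlb_batched_pending(); caller holds the PTL. */
static void toy_flush_tlb_batched_pending(struct toy_mm *mm)
{
	if (mm->tlb_flush_batched) {
		toy_flush_tlb_mm(mm);
		mm->tlb_flush_batched = false;
	}
}

/* Hypothetical page table updater, e.g. an mprotect or munmap PTE loop. */
static void toy_update_ptes(struct toy_mm *mm)
{
	/* the PTL would be taken here */
	toy_flush_tlb_batched_pending(mm);
	/* ... PTEs would be modified here ... */
	/* the PTL would be released here */
}

/* Walk through one reclaim cycle and return the number of full flushes. */
unsigned int toy_demo(void)
{
	struct toy_mm mm = { .tlb_flush_batched = false, .full_flushes = 0 };

	toy_update_ptes(&mm);	/* nothing batched yet: no flush */
	toy_reclaim_unmap(&mm);
	toy_reclaim_unmap(&mm);	/* two deferred unmaps, one pending flag */
	toy_update_ptes(&mm);	/* first updater pays the full flush */
	toy_update_ptes(&mm);	/* flag already cleared: no extra flush */
	return mm.full_flushes;
}
```

In this model toy_demo() reports a single flush for the whole sequence,
which is the "one additional flush per reclaim cycle" cost described in the
patch comment. New page table code that can race with reclaim would need
the equivalent of the toy_update_ptes() ordering.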
--
Mel Gorman
SUSE Labs