From: Mel Gorman <mgorman@suse.de>
To: Rik van Riel <riel@redhat.com>
Cc: Linux-MM <linux-mm@kvack.org>, Hugh Dickins <hughd@google.com>,
Minchan Kim <minchan@kernel.org>,
Dave Hansen <dave.hansen@intel.com>,
Andi Kleen <andi@firstfloor.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/6] mm: Defer TLB flush after unmap as long as possible
Date: Tue, 21 Apr 2015 22:17:04 +0100 [thread overview]
Message-ID: <20150421211704.GC14842@suse.de> (raw)
In-Reply-To: <5536B386.4050808@redhat.com>
On Tue, Apr 21, 2015 at 04:31:02PM -0400, Rik van Riel wrote:
> On 04/21/2015 06:41 AM, Mel Gorman wrote:
> > If a PTE is unmapped and it's dirty then it was writable recently. Due
> > to deferred TLB flushing, it's best to assume a writable TLB cache entry
> > exists. With that assumption, the TLB must be flushed before any IO can
> > start or the page is freed to avoid lost writes or data corruption. Prior
> > to this patch, such PFNs were simply flushed immediately. In this patch,
> > the caller is informed that such entries potentially exist and it's up to
> > the caller to flush before pages are freed or IO can start.
> >
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
>
> > @@ -1450,10 +1455,11 @@ static int page_not_mapped(struct page *page)
> > * page, used in the pageout path. Caller must hold the page lock.
> > * Return values are:
> > *
> > - * SWAP_SUCCESS - we succeeded in removing all mappings
> > - * SWAP_AGAIN - we missed a mapping, try again later
> > - * SWAP_FAIL - the page is unswappable
> > - * SWAP_MLOCK - page is mlocked.
> > + * SWAP_SUCCESS - we succeeded in removing all mappings
> > + * SWAP_SUCCESS_CACHED - Like SWAP_SUCCESS but a writable TLB entry may exist
> > + * SWAP_AGAIN - we missed a mapping, try again later
> > + * SWAP_FAIL - the page is unswappable
> > + * SWAP_MLOCK - page is mlocked.
> > */
> > int try_to_unmap(struct page *page, enum ttu_flags flags)
> > {
> > @@ -1481,7 +1487,8 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
> > ret = rmap_walk(page, &rwc);
> >
> > if (ret != SWAP_MLOCK && !page_mapped(page))
> > - ret = SWAP_SUCCESS;
> > + ret = (ret == SWAP_AGAIN_CACHED) ? SWAP_SUCCESS_CACHED : SWAP_SUCCESS;
> > +
> > return ret;
> > }
>
> This wants a big fat comment explaining why SWAP_AGAIN_CACHED is turned
> into SWAP_SUCCESS_CACHED.
>
I'll add something in V4. SWAP_AGAIN_CACHED indicates to rmap_walk that
it should continue the rmap but that a write cached PTE was encountered.
SWAP_SUCCESS is what callers of try_to_unmap() expect so
SWAP_SUCCESS_CACHED is the equivalent.
> I think I understand why this is happening, but I am not sure how to
> explain it...
>
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 12ec298087b6..0ad3f435afdd 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -860,6 +860,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> > unsigned long nr_reclaimed = 0;
> > unsigned long nr_writeback = 0;
> > unsigned long nr_immediate = 0;
> > + bool tlb_flush_required = false;
> >
> > cond_resched();
> >
> > @@ -1032,6 +1033,9 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> > goto keep_locked;
> > case SWAP_MLOCK:
> > goto cull_mlocked;
> > + case SWAP_SUCCESS_CACHED:
> > + /* Must flush before free, fall through */
> > + tlb_flush_required = true;
> > case SWAP_SUCCESS:
> > ; /* try to free the page below */
> > }
> > @@ -1176,7 +1180,8 @@ keep:
> > }
> >
> > mem_cgroup_uncharge_list(&free_pages);
> > - try_to_unmap_flush();
> > + if (tlb_flush_required)
> > + try_to_unmap_flush();
> > free_hot_cold_page_list(&free_pages, true);
>
> Don't we have to flush the TLB before calling pageout() on the page?
>
Not any more. It got removed in patch 2 up and I forgot to reintroduce it
with a tlb_flush_required check here. Thanks for that.
> In other words, we would also have to batch up calls to pageout(), if
> we want to do batched TLB flushing.
>
> This could be accomplished by putting the SWAP_SUCCESS_CACHED pages on
> a special list, instead of calling pageout() on them immediately, and
> then calling pageout() on all the pages on that list after the batch
> flush.
>
True. We had discussed something like that on IRC. It's a good idea but
a separate patch because it's less clear-cut. We're taking a partial pass
through the list in an attempt to reduce IPIs.
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2015-04-21 21:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-21 10:41 [RFC PATCH 0/6] TLB flush multiple pages with a single IPI v3 Mel Gorman
2015-04-21 10:41 ` [PATCH 1/6] x86, mm: Trace when an IPI is about to be sent Mel Gorman
2015-04-21 10:41 ` [PATCH 2/6] mm: Send a single IPI to TLB flush multiple pages when unmapping Mel Gorman
2015-04-21 10:41 ` [PATCH 3/6] mm: Defer TLB flush after unmap as long as possible Mel Gorman
2015-04-21 20:31 ` Rik van Riel
2015-04-21 21:17 ` Mel Gorman [this message]
2015-04-21 10:41 ` [PATCH 4/6] mm, migrate: Drop references to successfully migrated pages at the same time Mel Gorman
2015-04-21 10:41 ` [PATCH 5/6] mm, migrate: Batch TLB flushing when unmapping pages for migration Mel Gorman
2015-04-21 10:41 ` [PATCH 6/6] mm: Gather more PFNs before sending a TLB to flush unmapped pages Mel Gorman
2015-04-24 14:46 ` [RFC PATCH 0/6] TLB flush multiple pages with a single IPI v3 Vlastimil Babka
2015-04-24 15:16 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150421211704.GC14842@suse.de \
--to=mgorman@suse.de \
--cc=andi@firstfloor.org \
--cc=dave.hansen@intel.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).