Linux-Fsdevel Archive on lore.kernel.org
 help / color / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux-FSDevel <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>,
	Andi Kleen <ak@linux.intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 4/8] mm: Only drain per-cpu pagevecs once per pagevec usage
Date: Thu, 19 Oct 2017 10:33:46 +0100
Message-ID: <20171019093346.ylahzdpzmoriyf4v@techsingularity.net> (raw)
In-Reply-To: <a9f2fc7c-906d-a49e-8e8f-d1024dc754ac@suse.cz>

On Thu, Oct 19, 2017 at 11:12:52AM +0200, Vlastimil Babka wrote:
> On 10/18/2017 09:59 AM, Mel Gorman wrote:
> > When a pagevec is initialised on the stack, it is generally used multiple
> > times over a range of pages, looking up entries and then releasing them.
> > On each pagevec_release, the per-cpu deferred LRU pagevecs are drained
> > on the grounds the page being released may be on those queues and the
> > pages may be cache hot. In many cases only the first drain is necessary
> > as it's unlikely that the range of pages being walked is racing against
> > LRU addition.  Even if there is such a race, the impact is marginal where
> > as constantly redraining the lru pagevecs costs.
> 
> Right, the drain is only to a local cpu, not all of them, so that kind
> of "racing" shouldn't be even possible.
> 

Potentially the user of the pagevec can be preempted and another process
modify the per-cpu pagevecs in parallel. Note even that users of a pagevec
in a loop may call cond_resched so preemption is not even necessary if
the pagevec is being used for a large enough number of operations. The
risk is still marginal.

> > This patch ensures that pagevec is only drained once in a given lifecycle
> > without increasing the cache footprint of the pagevec structure. Only
> 
> Well, strictly speaking it does prevent decreasing the cache footprint
> by removing the 'cold' field later :)
> 

Debatable. Even freeing a cold page if it was handled properly still has
a cache footprint impact because the struct page fields are accessed.
Maybe that's not what you meant and you are referring to the size of the
structure itself. If so, note that I change the type of the two fields so
they should fit in the same size as an unsigned long in many cases.

As an aside, I did at one point have a patch that removed the drained
field as well and increased the number of pagevec entries but it didn't
work out in terms of overall performance so I dropped it.

> > sparsetruncate tiny is shown here as large files have many exceptional
> > entries and calls pagecache_release less frequently.
> > 
> > sparsetruncate (tiny)
> >                               4.14.0-rc4             4.14.0-rc4
> >                         batchshadow-v1r1          onedrain-v1r1
> > Min          Time      141.00 (   0.00%)      141.00 (   0.00%)
> > 1st-qrtle    Time      142.00 (   0.00%)      142.00 (   0.00%)
> > 2nd-qrtle    Time      142.00 (   0.00%)      142.00 (   0.00%)
> > 3rd-qrtle    Time      143.00 (   0.00%)      143.00 (   0.00%)
> > Max-90%      Time      144.00 (   0.00%)      144.00 (   0.00%)
> > Max-95%      Time      146.00 (   0.00%)      145.00 (   0.68%)
> > Max-99%      Time      198.00 (   0.00%)      194.00 (   2.02%)
> > Max          Time      254.00 (   0.00%)      208.00 (  18.11%)
> > Amean        Time      145.12 (   0.00%)      144.30 (   0.56%)
> > Stddev       Time       12.74 (   0.00%)        9.62 (  24.49%)
> > Coeff        Time        8.78 (   0.00%)        6.67 (  24.06%)
> > Best99%Amean Time      144.29 (   0.00%)      143.82 (   0.32%)
> > Best95%Amean Time      142.68 (   0.00%)      142.31 (   0.26%)
> > Best90%Amean Time      142.52 (   0.00%)      142.19 (   0.24%)
> > Best75%Amean Time      142.26 (   0.00%)      141.98 (   0.20%)
> > Best50%Amean Time      141.90 (   0.00%)      141.71 (   0.13%)
> > Best25%Amean Time      141.80 (   0.00%)      141.43 (   0.26%)
> > 
> > The impact on bonnie is marginal and within the noise because a significant
> > percentage of the file being truncated has been reclaimed and consists of
> > shadow entries which reduce the hotness of the pagevec_release path.
> > 
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > ---
> >  include/linux/pagevec.h | 4 +++-
> >  mm/swap.c               | 5 ++++-
> >  2 files changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
> > index 4dcd5506f1ed..4231979be982 100644
> > --- a/include/linux/pagevec.h
> > +++ b/include/linux/pagevec.h
> > @@ -16,7 +16,8 @@ struct address_space;
> >  
> >  struct pagevec {
> >  	unsigned long nr;
> > -	unsigned long cold;
> > +	bool cold;
> > +	bool drained;
> 
> 'drained' sounds a bit misleading to me, I would expect it to refer to
> *this* pagevec. What about e.g. "lru_drained"?
> 

It's not draining the LRU as such. How about the following patch on top
of the series? If another full series repost is necessary, I'll fold it
in.

---8<---
mm, pagevec: Rename pagevec drained field

According to Vlastimil Babka, the drained field in pagevec is potentially
misleading because it might be interpreted as draining this pagevec instead
of the percpu lru pagevecs. Rename the field for clarity.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/pagevec.h | 4 ++--
 mm/swap.c               | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index 8dcde51e80ff..ba5dc27ef6bb 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -16,7 +16,7 @@ struct address_space;
 
 struct pagevec {
 	unsigned long nr;
-	bool drained;
+	bool percpu_pvec_drained;
 	struct page *pages[PAGEVEC_SIZE];
 };
 
@@ -52,7 +52,7 @@ static inline unsigned pagevec_lookup_tag(struct pagevec *pvec,
 static inline void pagevec_init(struct pagevec *pvec)
 {
 	pvec->nr = 0;
-	pvec->drained = false;
+	pvec->percpu_pvec_drained = false;
 }
 
 static inline void pagevec_reinit(struct pagevec *pvec)
diff --git a/mm/swap.c b/mm/swap.c
index b480279c760c..38e1b6374a97 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -833,9 +833,9 @@ EXPORT_SYMBOL(release_pages);
  */
 void __pagevec_release(struct pagevec *pvec)
 {
-	if (!pvec->drained) {
+	if (!pvec->percpu_pvec_drained) {
 		lru_add_drain();
-		pvec->drained = true;
+		pvec->percpu_pvec_drained = true;
 	}
 	release_pages(pvec->pages, pagevec_count(pvec));
 	pagevec_reinit(pvec);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply index

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-18  7:59 [PATCH 0/8] Follow-up for speed up page cache truncation v2 Mel Gorman
2017-10-18  7:59 ` [PATCH 1/8] mm, page_alloc: Enable/disable IRQs once when freeing a list of pages Mel Gorman
2017-10-18  9:02   ` Vlastimil Babka
2017-10-18 10:15     ` Mel Gorman
2017-10-18  7:59 ` [PATCH 2/8] mm, truncate: Do not check mapping for every page being truncated Mel Gorman
2017-10-19  8:11   ` Jan Kara
2017-10-18  7:59 ` [PATCH 3/8] mm, truncate: Remove all exceptional entries from pagevec under one lock Mel Gorman
2017-10-18  7:59 ` [PATCH 4/8] mm: Only drain per-cpu pagevecs once per pagevec usage Mel Gorman
2017-10-19  9:12   ` Vlastimil Babka
2017-10-19  9:33     ` Mel Gorman [this message]
2017-10-19 13:21       ` Vlastimil Babka
2017-10-18  7:59 ` [PATCH 5/8] mm, pagevec: Remove cold parameter for pagevecs Mel Gorman
2017-10-19  9:24   ` Vlastimil Babka
2017-10-18  7:59 ` [PATCH 6/8] mm: Remove cold parameter for release_pages Mel Gorman
2017-10-19  9:26   ` Vlastimil Babka
2017-10-18  7:59 ` [PATCH 7/8] mm, Remove cold parameter from free_hot_cold_page* Mel Gorman
2017-10-19 13:12   ` Vlastimil Babka
2017-10-19 15:43     ` Mel Gorman
2017-10-18  7:59 ` [PATCH 8/8] mm: Remove __GFP_COLD Mel Gorman
2017-10-19 13:18   ` Vlastimil Babka
2017-10-19 13:42   ` Vlastimil Babka
2017-10-19 14:32     ` Mel Gorman
  -- strict thread matches above, loose matches on Subject: below --
2017-10-12  9:30 [PATCH 0/8] Follow-up for speed up page cache truncation Mel Gorman
2017-10-12  9:30 ` [PATCH 4/8] mm: Only drain per-cpu pagevecs once per pagevec usage Mel Gorman

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171019093346.ylahzdpzmoriyf4v@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@intel.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-Fsdevel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-fsdevel/0 linux-fsdevel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-fsdevel linux-fsdevel/ https://lore.kernel.org/linux-fsdevel \
		linux-fsdevel@vger.kernel.org linux-fsdevel@archiver.kernel.org
	public-inbox-index linux-fsdevel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-fsdevel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox