linux-kernel.vger.kernel.org archive mirror
* [v3 PATCH 1/2] mm: swap: make page_evictable() inline
@ 2020-03-17 17:42 Yang Shi
  2020-03-17 17:42 ` [v3 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Yang Shi @ 2020-03-17 17:42 UTC (permalink / raw)
  To: shakeelb, vbabka, willy, akpm; +Cc: yang.shi, linux-mm, linux-kernel

When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
10% regression in a couple of vm-scalability test cases
(lru-file-readonce, lru-file-readtwice and lru-file-mmap-read).  I
didn't see that large a regression on my VM (32c-64g-2nodes).  It might
be caused by the test configuration, which is 32c-256g with NUMA
disabled and the tests run in the root memcg, so the tests actually
stress only one inactive and one active LRU.  That is not a very common
setup in modern production environments.

That commit did two major changes:
1. Call page_evictable()
2. Use smp_mb() to make the newly set PG_lru bit visible

These appear to contribute most of the overhead.  page_evictable() is an
out-of-line function that pays a full prologue and epilogue, and it used
to be called from the page reclaim path only.  However, LRU add is a
very hot path, so it is better to make it inline.  It in turn calls
page_mapping(), which is not inlined either, but the disassembly shows
that page_mapping() does no push/pop operations, and inlining it is not
as straightforward.

Other than this, the smp_mb() is unnecessary on x86 since SetPageLRU is
an atomic operation that already implies a full memory barrier there, so
the following patch replaces it with smp_mb__after_atomic().

With the two fixes applied, the tests regain around 5% on that test
bench and return to normal on my VM.  Since the test bench configuration
is not that common, and I also saw around a 6% improvement on the latest
upstream, this seems good enough IMHO.

The below is test data (lru-file-readtwice throughput) against the v5.6-rc4:
	mainline	w/ inline fix
	150MB		154MB

With this patch the throughput improves by 2.67%.  The data with
smp_mb__after_atomic() applied is shown in the following patch.

Shakeel Butt did the below test:

On a real machine, with 'dd' limited to a single node and reading a
100 GiB sparse file (smaller than a single node).  Only a single
instance was run, to avoid lru lock contention.  The command line used
was "dd if=file-100GiB of=/dev/null bs=4k".  The command was run 10
times with drop_caches in between, and the elapsed time was measured.

Without patch: 56.64143 +- 0.672 sec

With patches: 56.10 +- 0.21 sec

Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
Cc: Matthew Wilcox <willy@infradead.org>
Reviewed-and-Tested-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
v3: * Fixed the build error reported by lkp.
    * Added Shakeel's test result and his review-and-test signature.
v2: * Addressed the comments from Willy.

 include/linux/pagemap.h |  2 +-
 include/linux/swap.h    | 25 ++++++++++++++++++++++++-
 mm/vmscan.c             | 23 -----------------------
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6..654ce76 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -70,7 +70,7 @@ static inline void mapping_clear_unevictable(struct address_space *mapping)
 	clear_bit(AS_UNEVICTABLE, &mapping->flags);
 }
 
-static inline int mapping_unevictable(struct address_space *mapping)
+static inline bool mapping_unevictable(struct address_space *mapping)
 {
 	if (mapping)
 		return test_bit(AS_UNEVICTABLE, &mapping->flags);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e99f7a..e8b8bbe 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -12,6 +12,7 @@
 #include <linux/fs.h>
 #include <linux/atomic.h>
 #include <linux/page-flags.h>
+#include <linux/pagemap.h>
 #include <asm/page.h>
 
 struct notifier_block;
@@ -374,7 +375,29 @@ extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem,
 #define node_reclaim_mode 0
 #endif
 
-extern int page_evictable(struct page *page);
+/**
+ * page_evictable - test whether a page is evictable
+ * @page: the page to test
+ *
+ * Test whether page is evictable--i.e., should be placed on active/inactive
+ * lists vs unevictable list.
+ *
+ * Reasons page might not be evictable:
+ * (1) page's mapping marked unevictable
+ * (2) page is part of an mlocked VMA
+ *
+ */
+static inline bool page_evictable(struct page *page)
+{
+	bool ret;
+
+	/* Prevent address_space of inode and swap cache from being freed */
+	rcu_read_lock();
+	ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
+	rcu_read_unlock();
+	return ret;
+}
+
 extern void check_move_unevictable_pages(struct pagevec *pvec);
 
 extern int kswapd_run(int nid);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8763705..855c395 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4277,29 +4277,6 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
 }
 #endif
 
-/*
- * page_evictable - test whether a page is evictable
- * @page: the page to test
- *
- * Test whether page is evictable--i.e., should be placed on active/inactive
- * lists vs unevictable list.
- *
- * Reasons page might not be evictable:
- * (1) page's mapping marked unevictable
- * (2) page is part of an mlocked VMA
- *
- */
-int page_evictable(struct page *page)
-{
-	int ret;
-
-	/* Prevent address_space of inode and swap cache from being freed */
-	rcu_read_lock();
-	ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
-	rcu_read_unlock();
-	return ret;
-}
-
 /**
  * check_move_unevictable_pages - check pages for evictability and move to
  * appropriate zone lru list
-- 
1.8.3.1



* [v3 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set
  2020-03-17 17:42 [v3 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
@ 2020-03-17 17:42 ` Yang Shi
  2020-03-17 19:04 ` [v3 PATCH 1/2] mm: swap: make page_evictable() inline Matthew Wilcox
  2020-03-17 19:21 ` Matthew Wilcox
  2 siblings, 0 replies; 7+ messages in thread
From: Yang Shi @ 2020-03-17 17:42 UTC (permalink / raw)
  To: shakeelb, vbabka, willy, akpm; +Cc: yang.shi, linux-mm, linux-kernel

A memory barrier is needed after setting the LRU bit, but smp_mb() is
too strong.  Some architectures, e.g. x86, imply a full memory barrier
with atomic operations, so replacing it with smp_mb__after_atomic() is
better: it is a nop on strongly ordered machines and a full memory
barrier on others.  With this change the vm-scalability cases perform
better on x86; I saw a total 6% improvement with this patch plus the
previous inline fix.

The test data (lru-file-readtwice throughput) against v5.6-rc4:
	mainline	w/ inline fix	w/ both (adding this)
	150MB		154MB		159MB

Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-and-Tested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
v3: * Added Shakeel's review-and-test signature.
v2: * Addressed the comment from Vlastimil and added his Ack.

 mm/swap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index cf39d24..74ea08a 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -931,7 +931,6 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
-	SetPageLRU(page);
 	/*
 	 * Page becomes evictable in two ways:
 	 * 1) Within LRU lock [munlock_vma_page() and __munlock_pagevec()].
@@ -958,7 +957,8 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 	 * looking at the same page) and the evictable page will be stranded
 	 * in an unevictable LRU.
 	 */
-	smp_mb();
+	SetPageLRU(page);
+	smp_mb__after_atomic();
 
 	if (page_evictable(page)) {
 		lru = page_lru(page);
-- 
1.8.3.1



* Re: [v3 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-17 17:42 [v3 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
  2020-03-17 17:42 ` [v3 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
@ 2020-03-17 19:04 ` Matthew Wilcox
  2020-03-17 19:28   ` Yang Shi
  2020-03-18  8:15   ` Michal Hocko
  2020-03-17 19:21 ` Matthew Wilcox
  2 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2020-03-17 19:04 UTC (permalink / raw)
  To: Yang Shi; +Cc: shakeelb, vbabka, akpm, linux-mm, linux-kernel

On Wed, Mar 18, 2020 at 01:42:50AM +0800, Yang Shi wrote:
> v3: * Fixed the build error reported by lkp.

I'm not terribly enthusiastic about including pagemap.h from swap.h.
It's a discussion that reasonable people can disagree about, so let's
set it up:

This patch adds inline bool page_evictable(struct page *page) to swap.h.
page_evictable() uses mapping_unevictable() which is in pagemap.h.
mapping_unevictable() uses AS_UNEVICTABLE which is also in pagemap.h.

We could move mapping_evictable() and friends to fs.h (already included
by swap.h).  But how about just moving page_evictable() to pagemap.h?
pagemap.h is already included by mm/mlock.c, mm/swap.c and mm/vmscan.c,
which are the only three current callers of page_evictable().

In fact, since it's only called by those three files, perhaps it should
be in mm/internal.h?  I don't see it becoming a terribly popular function
to call outside of the core mm code.

I think I have a mild preference for it being in pagemap.h, since I don't
have a hard time convincing myself that it's part of the page cache API,
but I definitely prefer internal.h over swap.h.



* Re: [v3 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-17 17:42 [v3 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
  2020-03-17 17:42 ` [v3 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
  2020-03-17 19:04 ` [v3 PATCH 1/2] mm: swap: make page_evictable() inline Matthew Wilcox
@ 2020-03-17 19:21 ` Matthew Wilcox
  2020-03-17 19:29   ` Yang Shi
  2 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2020-03-17 19:21 UTC (permalink / raw)
  To: Yang Shi; +Cc: shakeelb, vbabka, akpm, linux-mm, linux-kernel

On Wed, Mar 18, 2020 at 01:42:50AM +0800, Yang Shi wrote:
> -static inline int mapping_unevictable(struct address_space *mapping)
> +static inline bool mapping_unevictable(struct address_space *mapping)
>  {
>  	if (mapping)
>  		return test_bit(AS_UNEVICTABLE, &mapping->flags);

Shouldn't this be:

-static inline int mapping_unevictable(struct address_space *mapping)
+static inline bool mapping_unevictable(struct address_space *mapping)
 {
-       if (mapping)
-               return test_bit(AS_UNEVICTABLE, &mapping->flags);
-       return !!mapping;
+       return mapping && test_bit(AS_UNEVICTABLE, &mapping->flags);
 }



* Re: [v3 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-17 19:04 ` [v3 PATCH 1/2] mm: swap: make page_evictable() inline Matthew Wilcox
@ 2020-03-17 19:28   ` Yang Shi
  2020-03-18  8:15   ` Michal Hocko
  1 sibling, 0 replies; 7+ messages in thread
From: Yang Shi @ 2020-03-17 19:28 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: shakeelb, vbabka, akpm, linux-mm, linux-kernel



On 3/17/20 12:04 PM, Matthew Wilcox wrote:
> On Wed, Mar 18, 2020 at 01:42:50AM +0800, Yang Shi wrote:
>> v3: * Fixed the build error reported by lkp.
> I'm not terribly enthusiastic about including pagemap.h from swap.h.
> It's a discussion that reasonable people can disagree about, so let's
> set it up:
>
> This patch adds inline bool page_evictable(struct page *page) to swap.h.
> page_evictable() uses mapping_unevictable() which is in pagemap.h.
> mapping_unevictable() uses AS_UNEVICTABLE which is also in pagemap.h.
>
> We could move mapping_evictable() and friends to fs.h (already included
> by swap.h).  But how about just moving page_evictable() to pagemap.h?
> pagemap.h is already included by mm/mlock.c, mm/swap.c and mm/vmscan.c,
> which are the only three current callers of page_evictable().
>
> In fact, since it's only called by those three files, perhaps it should
> be in mm/internal.h?  I don't see it becoming a terribly popular function
> to call outside of the core mm code.
>
> I think I have a mild preference for it being in pagemap.h, since I don't
> have a hard time convincing myself that it's part of the page cache API,
> but I definitely prefer internal.h over swap.h.

Thanks for the suggestion. Will move it to pagemap.h.




* Re: [v3 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-17 19:21 ` Matthew Wilcox
@ 2020-03-17 19:29   ` Yang Shi
  0 siblings, 0 replies; 7+ messages in thread
From: Yang Shi @ 2020-03-17 19:29 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: shakeelb, vbabka, akpm, linux-mm, linux-kernel



On 3/17/20 12:21 PM, Matthew Wilcox wrote:
> On Wed, Mar 18, 2020 at 01:42:50AM +0800, Yang Shi wrote:
>> -static inline int mapping_unevictable(struct address_space *mapping)
>> +static inline bool mapping_unevictable(struct address_space *mapping)
>>   {
>>   	if (mapping)
>>   		return test_bit(AS_UNEVICTABLE, &mapping->flags);
> Shouldn't this be:
>
> -static inline int mapping_unevictable(struct address_space *mapping)
> +static inline bool mapping_unevictable(struct address_space *mapping)
>   {
> -       if (mapping)
> -               return test_bit(AS_UNEVICTABLE, &mapping->flags);
> -       return !!mapping;
> +       return mapping && test_bit(AS_UNEVICTABLE, &mapping->flags);

Looks neater. Will take it in the new version.

>   }



* Re: [v3 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-17 19:04 ` [v3 PATCH 1/2] mm: swap: make page_evictable() inline Matthew Wilcox
  2020-03-17 19:28   ` Yang Shi
@ 2020-03-18  8:15   ` Michal Hocko
  1 sibling, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2020-03-18  8:15 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Yang Shi, shakeelb, vbabka, akpm, linux-mm, linux-kernel

On Tue 17-03-20 12:04:11, Matthew Wilcox wrote:
[...]
> In fact, since it's only called by those three files, perhaps it should
> be in mm/internal.h?  I don't see it becoming a terribly popular function
> to call outside of the core mm code.

Agreed. I do not see any strong reason why anything outside of the core
MM should care about evictability. So if moving it to internal.h is
reasonable without too much churn, I would go that way.

-- 
Michal Hocko
SUSE Labs

