linux-kernel.vger.kernel.org archive mirror
* [v2 PATCH 1/2] mm: swap: make page_evictable() inline
@ 2020-03-16 22:24 Yang Shi
  2020-03-16 22:24 ` [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Yang Shi @ 2020-03-16 22:24 UTC (permalink / raw)
  To: shakeelb, vbabka, willy, akpm; +Cc: yang.shi, linux-mm, linux-kernel

When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
10% drop with a couple of vm-scalability test cases (lru-file-readonce,
lru-file-readtwice and lru-file-mmap-read).  I didn't see that much of a
drop on my VM (32c-64g-2nodes).  It might be caused by the test
configuration, which is 32c-256g with NUMA disabled, and the tests were
run in the root memcg, so they actually stress only one inactive and one
active lru.  That is not very common in a modern production environment.

That commit made two major changes:
1. Call page_evictable() in the LRU add path
2. Use smp_mb() to make the set PG_lru bit visible

It looks like they contribute most of the overhead.  page_evictable() is
an out-of-line function which pays the cost of a function prologue and
epilogue, and it used to be called by the page reclaim path only.
However, lru add is a very hot path, so it is better to make it inline.
It also calls page_mapping(), which is not inlined either; however, the
disassembly shows it doesn't do push and pop operations, and inlining it
would not be very straightforward.
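
To illustrate the difference (a minimal userspace C sketch with made-up
names, not the kernel code): the out-of-line predicate pays a call,
prologue and epilogue per page, while the static inline one folds into
the caller's loop:

#include <stdbool.h>
#include <stddef.h>

struct fake_page { unsigned long flags; };	/* hypothetical stand-in */
#define FAKE_PG_MLOCKED	(1UL << 0)		/* hypothetical bit */

/* Forced out of line to mimic a function defined in another file: each
 * call in a hot loop pays the call/prologue/epilogue overhead. */
__attribute__((noinline)) static bool evictable_outofline(const struct fake_page *p)
{
	return !(p->flags & FAKE_PG_MLOCKED);
}

/* Inline version: the bit test typically folds into the caller's loop. */
static inline bool evictable_inline(const struct fake_page *p)
{
	return !(p->flags & FAKE_PG_MLOCKED);
}

static size_t count_outofline(const struct fake_page *pages, size_t n)
{
	size_t ret = 0;

	for (size_t i = 0; i < n; i++)
		ret += evictable_outofline(&pages[i]);
	return ret;
}

static size_t count_inline(const struct fake_page *pages, size_t n)
{
	size_t ret = 0;

	for (size_t i = 0; i < n; i++)
		ret += evictable_inline(&pages[i]);
	return ret;
}

int main(void)
{
	struct fake_page pages[4] = { {0}, {FAKE_PG_MLOCKED}, {0}, {0} };

	/* Both count 3 evictable pages; only the per-call cost differs. */
	return !(count_inline(pages, 4) == 3 && count_outofline(pages, 4) == 3);
}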

Other than this, smp_mb() is not necessary on x86 since SetPageLRU is an
atomic operation which already implies a memory barrier, so replace it
with smp_mb__after_atomic() in the following patch.

With the two fixes applied, the tests gain back around 5% on that test
bench and return to normal on my VM.  Since the test bench configuration
is not that common, and I also saw around a 6% improvement on the latest
upstream, this seems good enough IMHO.

Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
	mainline	w/ inline fix
	150MB		154MB

With this patch the throughput goes up by 2.67%.  The data with
smp_mb__after_atomic() applied is shown in the following patch.

Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
v2: * Addressed the comments from Willy.

 include/linux/pagemap.h |  2 +-
 include/linux/swap.h    | 24 +++++++++++++++++++++++-
 mm/vmscan.c             | 23 -----------------------
 3 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index ccb14b6..654ce76 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -70,7 +70,7 @@ static inline void mapping_clear_unevictable(struct address_space *mapping)
 	clear_bit(AS_UNEVICTABLE, &mapping->flags);
 }
 
-static inline int mapping_unevictable(struct address_space *mapping)
+static inline bool mapping_unevictable(struct address_space *mapping)
 {
 	if (mapping)
 		return test_bit(AS_UNEVICTABLE, &mapping->flags);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e99f7a..d296c70 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -374,7 +374,29 @@ extern unsigned long mem_cgroup_shrink_node(struct mem_cgroup *mem,
 #define node_reclaim_mode 0
 #endif
 
-extern int page_evictable(struct page *page);
+/**
+ * page_evictable - test whether a page is evictable
+ * @page: the page to test
+ *
+ * Test whether page is evictable--i.e., should be placed on active/inactive
+ * lists vs unevictable list.
+ *
+ * Reasons page might not be evictable:
+ * (1) page's mapping marked unevictable
+ * (2) page is part of an mlocked VMA
+ *
+ */
+static inline bool page_evictable(struct page *page)
+{
+	bool ret;
+
+	/* Prevent address_space of inode and swap cache from being freed */
+	rcu_read_lock();
+	ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
+	rcu_read_unlock();
+	return ret;
+}
+
 extern void check_move_unevictable_pages(struct pagevec *pvec);
 
 extern int kswapd_run(int nid);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8763705..855c395 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4277,29 +4277,6 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
 }
 #endif
 
-/*
- * page_evictable - test whether a page is evictable
- * @page: the page to test
- *
- * Test whether page is evictable--i.e., should be placed on active/inactive
- * lists vs unevictable list.
- *
- * Reasons page might not be evictable:
- * (1) page's mapping marked unevictable
- * (2) page is part of an mlocked VMA
- *
- */
-int page_evictable(struct page *page)
-{
-	int ret;
-
-	/* Prevent address_space of inode and swap cache from being freed */
-	rcu_read_lock();
-	ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
-	rcu_read_unlock();
-	return ret;
-}
-
 /**
  * check_move_unevictable_pages - check pages for evictability and move to
  * appropriate zone lru list
-- 
1.8.3.1



* [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set
  2020-03-16 22:24 [v2 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
@ 2020-03-16 22:24 ` Yang Shi
  2020-03-16 23:47   ` Shakeel Butt
  2020-03-16 23:46 ` [v2 PATCH 1/2] mm: swap: make page_evictable() inline Shakeel Butt
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Yang Shi @ 2020-03-16 22:24 UTC (permalink / raw)
  To: shakeelb, vbabka, willy, akpm; +Cc: yang.shi, linux-mm, linux-kernel

A memory barrier is needed after setting the LRU bit, but smp_mb() is
too strong.  Some architectures, e.g. x86, imply a memory barrier with
atomic operations, so replacing it with smp_mb__after_atomic() is
better; it is a no-op on strongly ordered machines and a full memory
barrier on others.  With this change the vm-scalability cases perform
better on x86; I saw a total 6% improvement with this patch and the
previous inline fix.

The test data (lru-file-readtwice throughput) against v5.6-rc4:
	mainline	w/ inline fix	w/ both (adding this)
	150MB		154MB		159MB
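
As an illustration of the ordering at stake, here is a standalone C11
userspace sketch, not the kernel code (the FLAG_* names and the two
helpers are made up): each side sets one bit and then reads the other
bit, and the store->load ordering on each side is what the barrier after
the atomic provides:

#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_LRU	(1u << 0)	/* stands in for PG_lru */
#define FLAG_MLOCKED	(1u << 1)	/* stands in for PG_mlocked */

static _Atomic unsigned int flags;	/* stands in for page->flags */

/* LRU add side: set the LRU bit, then test evictability.  The seq_cst
 * fetch_or compiles to a lock-prefixed RMW on x86, which is already a
 * full barrier, so no extra fence (smp_mb()) is needed between the bit
 * set and the load below -- the analogue of smp_mb__after_atomic(). */
static bool lru_add_side(void)
{
	atomic_fetch_or(&flags, FLAG_LRU);
	return !(atomic_load(&flags) & FLAG_MLOCKED);
}

/* mlock side: set the mlocked bit, then check whether the page is
 * already on an LRU list and must be rescued.  With both sides fully
 * ordered, at least one of them observes the other's bit, so an
 * mlocked page cannot be stranded on an evictable LRU. */
static bool mlock_side(void)
{
	atomic_fetch_or(&flags, FLAG_MLOCKED);
	return atomic_load(&flags) & FLAG_LRU;
}

int main(void)
{
	/* The two sides would race from two threads; called sequentially
	 * here just to keep the sketch self-contained. */
	return (lru_add_side() && mlock_side()) ? 0 : 1;
}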

Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
Cc: Shakeel Butt <shakeelb@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
v2: * Addressed the comment from Vlastimil and added his Ack.

 mm/swap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index cf39d24..74ea08a 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -931,7 +931,6 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
-	SetPageLRU(page);
 	/*
 	 * Page becomes evictable in two ways:
 	 * 1) Within LRU lock [munlock_vma_page() and __munlock_pagevec()].
@@ -958,7 +957,8 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 	 * looking at the same page) and the evictable page will be stranded
 	 * in an unevictable LRU.
 	 */
-	smp_mb();
+	SetPageLRU(page);
+	smp_mb__after_atomic();
 
 	if (page_evictable(page)) {
 		lru = page_lru(page);
-- 
1.8.3.1



* Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-16 22:24 [v2 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
  2020-03-16 22:24 ` [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
@ 2020-03-16 23:46 ` Shakeel Butt
  2020-03-17  2:05   ` Yang Shi
  2020-03-17  3:00 ` kbuild test robot
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2020-03-16 23:46 UTC (permalink / raw)
  To: Yang Shi; +Cc: Vlastimil Babka, Matthew Wilcox, Andrew Morton, Linux MM, LKML

On Mon, Mar 16, 2020 at 3:24 PM Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
> skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
> 10% drop with a couple of vm-scalability test cases (lru-file-readonce,
> lru-file-readtwice and lru-file-mmap-read).  I didn't see that much of a
> drop on my VM (32c-64g-2nodes).  It might be caused by the test
> configuration, which is 32c-256g with NUMA disabled, and the tests were
> run in the root memcg, so they actually stress only one inactive and one
> active lru.  That is not very common in a modern production environment.
>
> That commit made two major changes:
> 1. Call page_evictable() in the LRU add path
> 2. Use smp_mb() to make the set PG_lru bit visible
>
> It looks like they contribute most of the overhead.  page_evictable() is
> an out-of-line function which pays the cost of a function prologue and
> epilogue, and it used to be called by the page reclaim path only.
> However, lru add is a very hot path, so it is better to make it inline.
> It also calls page_mapping(), which is not inlined either; however, the
> disassembly shows it doesn't do push and pop operations, and inlining it
> would not be very straightforward.
>
> Other than this, smp_mb() is not necessary on x86 since SetPageLRU is an
> atomic operation which already implies a memory barrier, so replace it
> with smp_mb__after_atomic() in the following patch.
>
> With the two fixes applied, the tests gain back around 5% on that test
> bench and return to normal on my VM.  Since the test bench configuration
> is not that common, and I also saw around a 6% improvement on the latest
> upstream, this seems good enough IMHO.
>
> Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
>         mainline        w/ inline fix
>         150MB           154MB
>
> With this patch the throughput goes up by 2.67%.  The data with
> smp_mb__after_atomic() applied is shown in the following patch.
>
> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
> Cc: Shakeel Butt <shakeelb@google.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>


So, I tested on a real machine, limiting the 'dd' to a single node and
reading a 100 GiB sparse file (less than a single node's memory). I just
ran a single instance to avoid lru lock contention. The cmd I used is
"dd if=file-100GiB of=/dev/null bs=4k". I ran the cmd 10 times with
drop_caches in between and measured the time it took.

Without patch: 56.64143 +- 0.672 sec

With patches: 56.10 +- 0.21 sec

Reviewed-and-Tested-by: Shakeel Butt <shakeelb@google.com>


* Re: [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set
  2020-03-16 22:24 ` [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
@ 2020-03-16 23:47   ` Shakeel Butt
  0 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2020-03-16 23:47 UTC (permalink / raw)
  To: Yang Shi; +Cc: Vlastimil Babka, Matthew Wilcox, Andrew Morton, Linux MM, LKML

On Mon, Mar 16, 2020 at 3:24 PM Yang Shi <yang.shi@linux.alibaba.com> wrote:
>
> A memory barrier is needed after setting the LRU bit, but smp_mb() is
> too strong.  Some architectures, e.g. x86, imply a memory barrier with
> atomic operations, so replacing it with smp_mb__after_atomic() is
> better; it is a no-op on strongly ordered machines and a full memory
> barrier on others.  With this change the vm-scalability cases perform
> better on x86; I saw a total 6% improvement with this patch and the
> previous inline fix.
>
> The test data (lru-file-readtwice throughput) against v5.6-rc4:
>         mainline        w/ inline fix   w/ both (adding this)
>         150MB           154MB           159MB
>
> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
> Cc: Shakeel Butt <shakeelb@google.com>
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>

Reviewed-and-Tested-by: Shakeel Butt <shakeelb@google.com>


* Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-16 23:46 ` [v2 PATCH 1/2] mm: swap: make page_evictable() inline Shakeel Butt
@ 2020-03-17  2:05   ` Yang Shi
  0 siblings, 0 replies; 9+ messages in thread
From: Yang Shi @ 2020-03-17  2:05 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Vlastimil Babka, Matthew Wilcox, Andrew Morton, Linux MM, LKML



On 3/16/20 4:46 PM, Shakeel Butt wrote:
> On Mon, Mar 16, 2020 at 3:24 PM Yang Shi <yang.shi@linux.alibaba.com> wrote:
>> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
>> skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
>> 10% drop with a couple of vm-scalability test cases (lru-file-readonce,
>> lru-file-readtwice and lru-file-mmap-read).  I didn't see that much of a
>> drop on my VM (32c-64g-2nodes).  It might be caused by the test
>> configuration, which is 32c-256g with NUMA disabled, and the tests were
>> run in the root memcg, so they actually stress only one inactive and one
>> active lru.  That is not very common in a modern production environment.
>>
>> That commit made two major changes:
>> 1. Call page_evictable() in the LRU add path
>> 2. Use smp_mb() to make the set PG_lru bit visible
>>
>> It looks like they contribute most of the overhead.  page_evictable() is
>> an out-of-line function which pays the cost of a function prologue and
>> epilogue, and it used to be called by the page reclaim path only.
>> However, lru add is a very hot path, so it is better to make it inline.
>> It also calls page_mapping(), which is not inlined either; however, the
>> disassembly shows it doesn't do push and pop operations, and inlining it
>> would not be very straightforward.
>>
>> Other than this, smp_mb() is not necessary on x86 since SetPageLRU is an
>> atomic operation which already implies a memory barrier, so replace it
>> with smp_mb__after_atomic() in the following patch.
>>
>> With the two fixes applied, the tests gain back around 5% on that test
>> bench and return to normal on my VM.  Since the test bench configuration
>> is not that common, and I also saw around a 6% improvement on the latest
>> upstream, this seems good enough IMHO.
>>
>> Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
>>          mainline        w/ inline fix
>>          150MB           154MB
>>
>> With this patch the throughput goes up by 2.67%.  The data with
>> smp_mb__after_atomic() applied is shown in the following patch.
>>
>> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
>> Cc: Shakeel Butt <shakeelb@google.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>
> So, I tested on a real machine, limiting the 'dd' to a single node and
> reading a 100 GiB sparse file (less than a single node's memory). I just
> ran a single instance to avoid lru lock contention. The cmd I used is
> "dd if=file-100GiB of=/dev/null bs=4k". I ran the cmd 10 times with
> drop_caches in between and measured the time it took.
>
> Without patch: 56.64143 +- 0.672 sec
>
> With patches: 56.10 +- 0.21 sec
>
> Reviewed-and-Tested-by: Shakeel Butt <shakeelb@google.com>

Thanks Shakeel. It'd be better to add your test result to the commit log too.



* Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-16 22:24 [v2 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
  2020-03-16 22:24 ` [v2 PATCH 2/2] mm: swap: use smp_mb__after_atomic() to order LRU bit set Yang Shi
  2020-03-16 23:46 ` [v2 PATCH 1/2] mm: swap: make page_evictable() inline Shakeel Butt
@ 2020-03-17  3:00 ` kbuild test robot
  2020-03-17  3:24   ` Yang Shi
  2020-03-17  3:15 ` kbuild test robot
  2020-03-17  8:59 ` Vlastimil Babka
  4 siblings, 1 reply; 9+ messages in thread
From: kbuild test robot @ 2020-03-17  3:00 UTC (permalink / raw)
  To: Yang Shi
  Cc: kbuild-all, shakeelb, vbabka, willy, akpm, yang.shi, linux-mm,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2653 bytes --]

Hi Yang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on mmotm/master]
[also build test ERROR on linus/master v5.6-rc6 next-20200316]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/mm-swap-make-page_evictable-inline/20200317-094836
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.5.0-5) 7.5.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/suspend.h:5:0,
                    from arch/x86/kernel/asm-offsets.c:13:
   include/linux/swap.h: In function 'page_evictable':
>> include/linux/swap.h:395:9: error: implicit declaration of function 'mapping_unevictable'; did you mean 'mapping_deny_writable'? [-Werror=implicit-function-declaration]
     ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
            ^~~~~~~~~~~~~~~~~~~
            mapping_deny_writable
   cc1: some warnings being treated as errors
   make[2]: *** [scripts/Makefile.build:99: arch/x86/kernel/asm-offsets.s] Error 1
   make[2]: Target '__build' not remade because of errors.
   make[1]: *** [Makefile:1139: prepare0] Error 2
   make[1]: Target 'prepare' not remade because of errors.
   make: *** [Makefile:179: sub-make] Error 2
   11 real  4 user  5 sys  89.51% cpu 	make prepare

vim +395 include/linux/swap.h

   376	
   377	/**
   378	 * page_evictable - test whether a page is evictable
   379	 * @page: the page to test
   380	 *
   381	 * Test whether page is evictable--i.e., should be placed on active/inactive
   382	 * lists vs unevictable list.
   383	 *
   384	 * Reasons page might not be evictable:
   385	 * (1) page's mapping marked unevictable
   386	 * (2) page is part of an mlocked VMA
   387	 *
   388	 */
   389	static inline bool page_evictable(struct page *page)
   390	{
   391		bool ret;
   392	
   393		/* Prevent address_space of inode and swap cache from being freed */
   394		rcu_read_lock();
 > 395		ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
   396		rcu_read_unlock();
   397		return ret;
   398	}
   399	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 7280 bytes --]


* Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-16 22:24 [v2 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
                   ` (2 preceding siblings ...)
  2020-03-17  3:00 ` kbuild test robot
@ 2020-03-17  3:15 ` kbuild test robot
  2020-03-17  8:59 ` Vlastimil Babka
  4 siblings, 0 replies; 9+ messages in thread
From: kbuild test robot @ 2020-03-17  3:15 UTC (permalink / raw)
  To: Yang Shi
  Cc: kbuild-all, shakeelb, vbabka, willy, akpm, yang.shi, linux-mm,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2707 bytes --]

Hi Yang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on mmotm/master]
[also build test ERROR on linus/master v5.6-rc6 next-20200316]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Yang-Shi/mm-swap-make-page_evictable-inline/20200317-094836
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: openrisc-simple_smp_defconfig (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=9.2.0 make.cross ARCH=openrisc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/suspend.h:5,
                    from init/do_mounts.c:7:
   include/linux/swap.h: In function 'page_evictable':
   include/linux/swap.h:395:9: error: implicit declaration of function 'mapping_unevictable'; did you mean 'mapping_deny_writable'? [-Werror=implicit-function-declaration]
     395 |  ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
         |         ^~~~~~~~~~~~~~~~~~~
         |         mapping_deny_writable
   In file included from include/linux/mempolicy.h:16,
                    from include/linux/shmem_fs.h:7,
                    from init/do_mounts.c:21:
   include/linux/pagemap.h: At top level:
>> include/linux/pagemap.h:73:20: error: conflicting types for 'mapping_unevictable'
      73 | static inline bool mapping_unevictable(struct address_space *mapping)
         |                    ^~~~~~~~~~~~~~~~~~~
   In file included from include/linux/suspend.h:5,
                    from init/do_mounts.c:7:
   include/linux/swap.h:395:9: note: previous implicit declaration of 'mapping_unevictable' was here
     395 |  ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
         |         ^~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/mapping_unevictable +73 include/linux/pagemap.h

    72	
  > 73	static inline bool mapping_unevictable(struct address_space *mapping)
    74	{
    75		if (mapping)
    76			return test_bit(AS_UNEVICTABLE, &mapping->flags);
    77		return !!mapping;
    78	}
    79	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 8035 bytes --]


* Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-17  3:00 ` kbuild test robot
@ 2020-03-17  3:24   ` Yang Shi
  0 siblings, 0 replies; 9+ messages in thread
From: Yang Shi @ 2020-03-17  3:24 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, shakeelb, vbabka, willy, akpm, linux-mm, linux-kernel



On 3/16/20 8:00 PM, kbuild test robot wrote:
> Hi Yang,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on mmotm/master]
> [also build test ERROR on linus/master v5.6-rc6 next-20200316]
> [if your patch is applied to the wrong git tree, please drop us a note to help
> improve the system. BTW, we also suggest to use '--base' option to specify the
> base tree in git format-patch, please see https://stackoverflow.com/a/37406982]
>
> url:    https://github.com/0day-ci/linux/commits/Yang-Shi/mm-swap-make-page_evictable-inline/20200317-094836
> base:   git://git.cmpxchg.org/linux-mmotm.git master
> config: i386-tinyconfig (attached as .config)
> compiler: gcc-7 (Debian 7.5.0-5) 7.5.0
> reproduce:
>          # save the attached .config to linux build tree
>          make ARCH=i386
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot <lkp@intel.com>
>
> All errors (new ones prefixed by >>):
>
>     In file included from include/linux/suspend.h:5:0,
>                      from arch/x86/kernel/asm-offsets.c:13:
>     include/linux/swap.h: In function 'page_evictable':
>>> include/linux/swap.h:395:9: error: implicit declaration of function 'mapping_unevictable'; did you mean 'mapping_deny_writable'? [-Werror=implicit-function-declaration]

It looks like pagemap.h needs to be included. The patch below should
fix the build error:

diff --git a/include/linux/swap.h b/include/linux/swap.h
index d296c70..e8b8bbe 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -12,6 +12,7 @@
  #include <linux/fs.h>
  #include <linux/atomic.h>
  #include <linux/page-flags.h>
+#include <linux/pagemap.h>
  #include <asm/page.h>

  struct notifier_block;
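
Both robot reports boil down to this same missing include: with no
prototype in scope, the call site in swap.h implicitly declares
mapping_unevictable() as returning int, and the real bool declaration
in pagemap.h then conflicts with it.  A minimal standalone reproduction
(hypothetical names, not the kernel code):

/* repro.c -- build with:
 *   gcc -std=gnu89 -Werror=implicit-function-declaration -c repro.c */
#include <stdbool.h>

bool caller(int x)
{
	/* No prototype of helper() in scope yet, so C implicitly
	 * declares it as "int helper()" -- the first error. */
	return helper(x);
}

/* The real definition returns bool, which clashes with the implicit
 * declaration above: "conflicting types for 'helper'" -- the second
 * error, matching the mapping_unevictable() case. */
bool helper(int x)
{
	return x != 0;
}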


>       ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
>              ^~~~~~~~~~~~~~~~~~~
>              mapping_deny_writable
>     cc1: some warnings being treated as errors
>     make[2]: *** [scripts/Makefile.build:99: arch/x86/kernel/asm-offsets.s] Error 1
>     make[2]: Target '__build' not remade because of errors.
>     make[1]: *** [Makefile:1139: prepare0] Error 2
>     make[1]: Target 'prepare' not remade because of errors.
>     make: *** [Makefile:179: sub-make] Error 2
>     11 real  4 user  5 sys  89.51% cpu 	make prepare
>
> vim +395 include/linux/swap.h
>
>     376	
>     377	/**
>     378	 * page_evictable - test whether a page is evictable
>     379	 * @page: the page to test
>     380	 *
>     381	 * Test whether page is evictable--i.e., should be placed on active/inactive
>     382	 * lists vs unevictable list.
>     383	 *
>     384	 * Reasons page might not be evictable:
>     385	 * (1) page's mapping marked unevictable
>     386	 * (2) page is part of an mlocked VMA
>     387	 *
>     388	 */
>     389	static inline bool page_evictable(struct page *page)
>     390	{
>     391		bool ret;
>     392	
>     393		/* Prevent address_space of inode and swap cache from being freed */
>     394		rcu_read_lock();
>   > 395		ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page);
>     396		rcu_read_unlock();
>     397		return ret;
>     398	}
>     399	
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org



* Re: [v2 PATCH 1/2] mm: swap: make page_evictable() inline
  2020-03-16 22:24 [v2 PATCH 1/2] mm: swap: make page_evictable() inline Yang Shi
                   ` (3 preceding siblings ...)
  2020-03-17  3:15 ` kbuild test robot
@ 2020-03-17  8:59 ` Vlastimil Babka
  4 siblings, 0 replies; 9+ messages in thread
From: Vlastimil Babka @ 2020-03-17  8:59 UTC (permalink / raw)
  To: Yang Shi, shakeelb, willy, akpm; +Cc: linux-mm, linux-kernel

On 3/16/20 11:24 PM, Yang Shi wrote:
> When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more
> skipping pagevecs") to our 4.9 kernel, our test bench noticed around a
> 10% drop with a couple of vm-scalability test cases (lru-file-readonce,
> lru-file-readtwice and lru-file-mmap-read).  I didn't see that much of a
> drop on my VM (32c-64g-2nodes).  It might be caused by the test
> configuration, which is 32c-256g with NUMA disabled, and the tests were
> run in the root memcg, so they actually stress only one inactive and one
> active lru.  That is not very common in a modern production environment.
>
> That commit made two major changes:
> 1. Call page_evictable() in the LRU add path
> 2. Use smp_mb() to make the set PG_lru bit visible
>
> It looks like they contribute most of the overhead.  page_evictable() is
> an out-of-line function which pays the cost of a function prologue and
> epilogue, and it used to be called by the page reclaim path only.
> However, lru add is a very hot path, so it is better to make it inline.
> It also calls page_mapping(), which is not inlined either; however, the
> disassembly shows it doesn't do push and pop operations, and inlining it
> would not be very straightforward.
>
> Other than this, smp_mb() is not necessary on x86 since SetPageLRU is an
> atomic operation which already implies a memory barrier, so replace it
> with smp_mb__after_atomic() in the following patch.
>
> With the two fixes applied, the tests gain back around 5% on that test
> bench and return to normal on my VM.  Since the test bench configuration
> is not that common, and I also saw around a 6% improvement on the latest
> upstream, this seems good enough IMHO.
>
> Below is the test data (lru-file-readtwice throughput) against v5.6-rc4:
> 	mainline	w/ inline fix
> 	150MB		154MB
>
> With this patch the throughput goes up by 2.67%.  The data with
> smp_mb__after_atomic() applied is shown in the following patch.
> 
> Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs")
> Cc: Shakeel Butt <shakeelb@google.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks.

