From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Morton Subject: [patch 057/155] mm: swap: make page_evictable() inline Date: Wed, 01 Apr 2020 21:06:20 -0700 Message-ID: <20200402040620.mu7rVaF6c%akpm@linux-foundation.org> References: <20200401210155.09e3b9742e1c6e732f5a7250@linux-foundation.org> Reply-To: linux-kernel@vger.kernel.org Return-path: Received: from mail.kernel.org ([198.145.29.99]:55894 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725789AbgDBEGV (ORCPT ); Thu, 2 Apr 2020 00:06:21 -0400 In-Reply-To: <20200401210155.09e3b9742e1c6e732f5a7250@linux-foundation.org> Sender: mm-commits-owner@vger.kernel.org List-Id: mm-commits@vger.kernel.org To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, shakeelb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org, yang.shi@linux.alibaba.com From: Yang Shi Subject: mm: swap: make page_evictable() inline When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs") to our 4.9 kernel, our test bench noticed around 10% down with a couple of vm-scalability's test cases (lru-file-readonce, lru-file-readtwice and lru-file-mmap-read). I didn't see that much down on my VM (32c-64g-2nodes). It might be caused by the test configuration, which is 32c-256g with NUMA disabled and the tests were run in root memcg, so the tests actually stress only one inactive and active lru. It sounds not very usual in mordern production environment. That commit did two major changes: 1. Call page_evictable() 2. Use smp_mb to force the PG_lru set visible It looks they contribute the most overhead. The page_evictable() is a function which does function prologue and epilogue, and that was used by page reclaim path only. However, lru add is a very hot path, so it sounds better to make it inline. However, it calls page_mapping() which is not inlined either, but the disassemble shows it doesn't do push and pop operations and it sounds not very straightforward to inline it. Other than this, it sounds smp_mb() is not necessary for x86 since SetPageLRU is atomic which enforces memory barrier already, replace it with smp_mb__after_atomic() in the following patch. With the two fixes applied, the tests can get back around 5% on that test bench and get back normal on my VM. Since the test bench configuration is not that usual and I also saw around 6% up on the latest upstream, so it sounds good enough IMHO. The below is test data (lru-file-readtwice throughput) against the v5.6-rc4: mainline w/ inline fix 150MB 154MB With this patch the throughput gets 2.67% up. The data with using smp_mb__after_atomic() is showed in the following patch. Shakeel Butt did the below test: On a real machine with limiting the 'dd' on a single node and reading 100 GiB sparse file (less than a single node). Just ran a single instance to not cause the lru lock contention. The cmdline used is "dd if=file-100GiB of=/dev/null bs=4k". Ran the cmd 10 times with drop_caches in between and measured the time it took. Without patch: 56.64143 +- 0.672 sec With patches: 56.10 +- 0.21 sec [akpm@linux-foundation.org: move page_evictable() to internal.h] Link: http://lkml.kernel.org/r/1584500541-46817-1-git-send-email-yang.shi@linux.alibaba.com Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs") Signed-off-by: Yang Shi Reviewed-by: Shakeel Butt Tested-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Matthew Wilcox (Oracle) Acked-by: Johannes Weiner Signed-off-by: Andrew Morton --- include/linux/pagemap.h | 6 ++---- include/linux/swap.h | 1 - mm/internal.h | 23 +++++++++++++++++++++++ mm/vmscan.c | 23 ----------------------- 4 files changed, 25 insertions(+), 28 deletions(-) --- a/include/linux/pagemap.h~mm-swap-make-page_evictable-inline +++ a/include/linux/pagemap.h @@ -70,11 +70,9 @@ static inline void mapping_clear_unevict clear_bit(AS_UNEVICTABLE, &mapping->flags); } -static inline int mapping_unevictable(struct address_space *mapping) +static inline bool mapping_unevictable(struct address_space *mapping) { - if (mapping) - return test_bit(AS_UNEVICTABLE, &mapping->flags); - return !!mapping; + return mapping && test_bit(AS_UNEVICTABLE, &mapping->flags); } static inline void mapping_set_exiting(struct address_space *mapping) --- a/include/linux/swap.h~mm-swap-make-page_evictable-inline +++ a/include/linux/swap.h @@ -374,7 +374,6 @@ extern int sysctl_min_slab_ratio; #define node_reclaim_mode 0 #endif -extern int page_evictable(struct page *page); extern void check_move_unevictable_pages(struct pagevec *pvec); extern int kswapd_run(int nid); --- a/mm/internal.h~mm-swap-make-page_evictable-inline +++ a/mm/internal.h @@ -63,6 +63,29 @@ static inline unsigned long ra_submit(st ra->start, ra->size, ra->async_size); } +/** + * page_evictable - test whether a page is evictable + * @page: the page to test + * + * Test whether page is evictable--i.e., should be placed on active/inactive + * lists vs unevictable list. + * + * Reasons page might not be evictable: + * (1) page's mapping marked unevictable + * (2) page is part of an mlocked VMA + * + */ +static inline bool page_evictable(struct page *page) +{ + bool ret; + + /* Prevent address_space of inode and swap cache from being freed */ + rcu_read_lock(); + ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); + rcu_read_unlock(); + return ret; +} + /* * Turn a non-refcounted page (->_refcount == 0) into refcounted with * a count of one. --- a/mm/vmscan.c~mm-swap-make-page_evictable-inline +++ a/mm/vmscan.c @@ -4277,29 +4277,6 @@ int node_reclaim(struct pglist_data *pgd } #endif -/* - * page_evictable - test whether a page is evictable - * @page: the page to test - * - * Test whether page is evictable--i.e., should be placed on active/inactive - * lists vs unevictable list. - * - * Reasons page might not be evictable: - * (1) page's mapping marked unevictable - * (2) page is part of an mlocked VMA - * - */ -int page_evictable(struct page *page) -{ - int ret; - - /* Prevent address_space of inode and swap cache from being freed */ - rcu_read_lock(); - ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); - rcu_read_unlock(); - return ret; -} From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 040ADC43331 for ; Thu, 2 Apr 2020 04:06:23 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id AD2D820747 for ; Thu, 2 Apr 2020 04:06:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="T62nbdj/" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AD2D820747 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5EF178E0039; Thu, 2 Apr 2020 00:06:22 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 59E3E8E000D; Thu, 2 Apr 2020 00:06:22 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 48E7E8E0039; Thu, 2 Apr 2020 00:06:22 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0189.hostedemail.com [216.40.44.189]) by kanga.kvack.org (Postfix) with ESMTP id 2C8DB8E000D for ; Thu, 2 Apr 2020 00:06:22 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id E700E8248047 for ; Thu, 2 Apr 2020 04:06:21 +0000 (UTC) X-FDA: 76661577762.04.mass55_37878e93f1362 X-HE-Tag: mass55_37878e93f1362 X-Filterd-Recvd-Size: 7343 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf49.hostedemail.com (Postfix) with ESMTP for ; Thu, 2 Apr 2020 04:06:21 +0000 (UTC) Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 7E612206E9; Thu, 2 Apr 2020 04:06:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1585800380; bh=W1ssqEs4qMQgH4C9aaYzd1JfmYbEmBmfoBJGyelOMoU=; h=Date:From:To:Subject:In-Reply-To:From; b=T62nbdj/mqHbSnjqc1GxbhIuZmzh9DctxK/biR2g30gUqF9tH2URMlYnfVcx2U4Jg KKGbOlOgSWAbY8jS5zizCVyTjt/LU5LE3KL5vnrOBuT45naSfb9ve1u7o4i+sfe/gW 70vkIHQGIFcghAUozx1UEhbKrq3zDwWBmUWZt/Fk= Date: Wed, 01 Apr 2020 21:06:20 -0700 From: Andrew Morton To: akpm@linux-foundation.org, hannes@cmpxchg.org, linux-mm@kvack.org, mm-commits@vger.kernel.org, shakeelb@google.com, torvalds@linux-foundation.org, vbabka@suse.cz, willy@infradead.org, yang.shi@linux.alibaba.com Subject: [patch 057/155] mm: swap: make page_evictable() inline Message-ID: <20200402040620.mu7rVaF6c%akpm@linux-foundation.org> In-Reply-To: <20200401210155.09e3b9742e1c6e732f5a7250@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Yang Shi Subject: mm: swap: make page_evictable() inline When backporting commit 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs") to our 4.9 kernel, our test bench noticed around 10% down with a couple of vm-scalability's test cases (lru-file-readonce, lru-file-readtwice and lru-file-mmap-read). I didn't see that much down on my VM (32c-64g-2nodes). It might be caused by the test configuration, which is 32c-256g with NUMA disabled and the tests were run in root memcg, so the tests actually stress only one inactive and active lru. It sounds not very usual in mordern production environment. That commit did two major changes: 1. Call page_evictable() 2. Use smp_mb to force the PG_lru set visible It looks they contribute the most overhead. The page_evictable() is a function which does function prologue and epilogue, and that was used by page reclaim path only. However, lru add is a very hot path, so it sounds better to make it inline. However, it calls page_mapping() which is not inlined either, but the disassemble shows it doesn't do push and pop operations and it sounds not very straightforward to inline it. Other than this, it sounds smp_mb() is not necessary for x86 since SetPageLRU is atomic which enforces memory barrier already, replace it with smp_mb__after_atomic() in the following patch. With the two fixes applied, the tests can get back around 5% on that test bench and get back normal on my VM. Since the test bench configuration is not that usual and I also saw around 6% up on the latest upstream, so it sounds good enough IMHO. The below is test data (lru-file-readtwice throughput) against the v5.6-rc4: mainline w/ inline fix 150MB 154MB With this patch the throughput gets 2.67% up. The data with using smp_mb__after_atomic() is showed in the following patch. Shakeel Butt did the below test: On a real machine with limiting the 'dd' on a single node and reading 100 GiB sparse file (less than a single node). Just ran a single instance to not cause the lru lock contention. The cmdline used is "dd if=file-100GiB of=/dev/null bs=4k". Ran the cmd 10 times with drop_caches in between and measured the time it took. Without patch: 56.64143 +- 0.672 sec With patches: 56.10 +- 0.21 sec [akpm@linux-foundation.org: move page_evictable() to internal.h] Link: http://lkml.kernel.org/r/1584500541-46817-1-git-send-email-yang.shi@linux.alibaba.com Fixes: 9c4e6b1a7027 ("mm, mlock, vmscan: no more skipping pagevecs") Signed-off-by: Yang Shi Reviewed-by: Shakeel Butt Tested-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Matthew Wilcox (Oracle) Acked-by: Johannes Weiner Signed-off-by: Andrew Morton --- include/linux/pagemap.h | 6 ++---- include/linux/swap.h | 1 - mm/internal.h | 23 +++++++++++++++++++++++ mm/vmscan.c | 23 ----------------------- 4 files changed, 25 insertions(+), 28 deletions(-) --- a/include/linux/pagemap.h~mm-swap-make-page_evictable-inline +++ a/include/linux/pagemap.h @@ -70,11 +70,9 @@ static inline void mapping_clear_unevict clear_bit(AS_UNEVICTABLE, &mapping->flags); } -static inline int mapping_unevictable(struct address_space *mapping) +static inline bool mapping_unevictable(struct address_space *mapping) { - if (mapping) - return test_bit(AS_UNEVICTABLE, &mapping->flags); - return !!mapping; + return mapping && test_bit(AS_UNEVICTABLE, &mapping->flags); } static inline void mapping_set_exiting(struct address_space *mapping) --- a/include/linux/swap.h~mm-swap-make-page_evictable-inline +++ a/include/linux/swap.h @@ -374,7 +374,6 @@ extern int sysctl_min_slab_ratio; #define node_reclaim_mode 0 #endif -extern int page_evictable(struct page *page); extern void check_move_unevictable_pages(struct pagevec *pvec); extern int kswapd_run(int nid); --- a/mm/internal.h~mm-swap-make-page_evictable-inline +++ a/mm/internal.h @@ -63,6 +63,29 @@ static inline unsigned long ra_submit(st ra->start, ra->size, ra->async_size); } +/** + * page_evictable - test whether a page is evictable + * @page: the page to test + * + * Test whether page is evictable--i.e., should be placed on active/inactive + * lists vs unevictable list. + * + * Reasons page might not be evictable: + * (1) page's mapping marked unevictable + * (2) page is part of an mlocked VMA + * + */ +static inline bool page_evictable(struct page *page) +{ + bool ret; + + /* Prevent address_space of inode and swap cache from being freed */ + rcu_read_lock(); + ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); + rcu_read_unlock(); + return ret; +} + /* * Turn a non-refcounted page (->_refcount == 0) into refcounted with * a count of one. --- a/mm/vmscan.c~mm-swap-make-page_evictable-inline +++ a/mm/vmscan.c @@ -4277,29 +4277,6 @@ int node_reclaim(struct pglist_data *pgd } #endif -/* - * page_evictable - test whether a page is evictable - * @page: the page to test - * - * Test whether page is evictable--i.e., should be placed on active/inactive - * lists vs unevictable list. - * - * Reasons page might not be evictable: - * (1) page's mapping marked unevictable - * (2) page is part of an mlocked VMA - * - */ -int page_evictable(struct page *page) -{ - int ret; - - /* Prevent address_space of inode and swap cache from being freed */ - rcu_read_lock(); - ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); - rcu_read_unlock(); - return ret; -} - /** * check_move_unevictable_pages - check pages for evictability and move to * appropriate zone lru list _