* [PATCH] mm: eliminate function call overhead during copy_page_range()
@ 2023-02-05 15:06 Hao Lee
  2023-02-05 21:53 ` Matthew Wilcox
  0 siblings, 1 reply; 3+ messages in thread
From: Hao Lee @ 2023-02-05 15:06 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, haolee.swjtu, linux-kernel

vm_normal_page() is called so many times that its overhead is very high.
After changing this call site to an inline function, copy_page_range()
runs 3~5 times faster than before.

Signed-off-by: Hao Lee <haolee.swjtu@gmail.com>
---
 mm/memory.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 7a04a1130ec1..2084bb7aff85 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -562,7 +562,7 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
  * PFNMAP mappings in order to support COWable mappings.
  *
  */
-struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
+static inline struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 			    pte_t pte)
 {
 	unsigned long pfn = pte_pfn(pte);
@@ -625,6 +625,12 @@ struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
 	return pfn_to_page(pfn);
 }
 
+struct page *vm_normal_page(struct vm_area_struct *vma, unsigned long addr,
+			    pte_t pte)
+{
+	return __vm_normal_page(vma, addr, pte);
+}
+
 struct folio *vm_normal_folio(struct vm_area_struct *vma, unsigned long addr,
 			    pte_t pte)
 {
@@ -908,7 +914,7 @@ copy_present_pte(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma,
 	struct page *page;
 	struct folio *folio;
 
-	page = vm_normal_page(src_vma, addr, pte);
+	page = __vm_normal_page(src_vma, addr, pte);
 	if (page)
 		folio = page_folio(page);
 	if (page && folio_test_anon(folio)) {
-- 
2.37.3



* Re: [PATCH] mm: eliminate function call overhead during copy_page_range()
  2023-02-05 15:06 [PATCH] mm: eliminate function call overhead during copy_page_range() Hao Lee
@ 2023-02-05 21:53 ` Matthew Wilcox
  2023-02-06  9:17   ` Hao Lee
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Wilcox @ 2023-02-05 21:53 UTC (permalink / raw)
  To: Hao Lee; +Cc: akpm, linux-mm, linux-kernel

On Sun, Feb 05, 2023 at 03:06:02PM +0000, Hao Lee wrote:
> vm_normal_page() is called so many times that its overhead is very high.
> After changing this call site to an inline function, copy_page_range()
> runs 3~5 times faster than before.

So you're saying that your compiler is making bad decisions?  What
architecture, what compiler, what version?  Do you have
CONFIG_ARCH_HAS_PTE_SPECIAL set?

Is there something about inlining it that makes the compiler able to
optimise away code, or is it really the function call overhead?  Can
you share any perf results?



* Re: [PATCH] mm: eliminate function call overhead during copy_page_range()
  2023-02-05 21:53 ` Matthew Wilcox
@ 2023-02-06  9:17   ` Hao Lee
  0 siblings, 0 replies; 3+ messages in thread
From: Hao Lee @ 2023-02-06  9:17 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: akpm, linux-mm, linux-kernel

On Sun, Feb 05, 2023 at 09:53:53PM +0000, Matthew Wilcox wrote:
> On Sun, Feb 05, 2023 at 03:06:02PM +0000, Hao Lee wrote:
> > vm_normal_page() is called so many times that its overhead is very high.
> > After changing this call site to an inline function, copy_page_range()
> > runs 3~5 times faster than before.
> 
> So you're saying that your compiler is making bad decisions?  What
> architecture, what compiler, what version?  Do you have
> CONFIG_ARCH_HAS_PTE_SPECIAL set?
> 
> Is there something about inlining it that makes the compiler able to
> optimise away code, or is it really the function call overhead?  Can
> you share any perf results?

I am so embarrassed; I forgot to disable function_graph when timing the
non-inlined function, so the tracer skewed my measurement. The actual
performance improvement is only ~3%.
Please ignore this patch. Sorry...
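
The question raised earlier in the thread (is the gain plain call overhead, or does inlining let the compiler optimise code away?) can be probed in user space. What follows is only a minimal sketch, assuming GCC or Clang function attributes. normal_inline(), normal_call(), count_via_call(), count_via_inline() and time_ns() are hypothetical stand-ins that mirror the wrapper pattern of the patch; none of this is kernel code, and timings from it say nothing about copy_page_range() itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the check being inlined; not kernel code.
 * Treats every 8th pfn as "special" and all others as "normal". */
static inline __attribute__((always_inline)) uint64_t normal_inline(uint64_t pfn)
{
	return (pfn & 7) != 0;
}

/* Out-of-line wrapper around the same body, as in the patch. */
__attribute__((noinline)) uint64_t normal_call(uint64_t pfn)
{
	return normal_inline(pfn);
}

/* Count "normal" pfns in [0, n), paying one real call per iteration. */
uint64_t count_via_call(uint64_t n)
{
	uint64_t i, hits = 0;

	for (i = 0; i < n; i++)
		hits += normal_call(i);
	return hits;
}

/* Same loop with the body inlined; the compiler is free to optimise
 * across the former call boundary. */
uint64_t count_via_inline(uint64_t n)
{
	uint64_t i, hits = 0;

	for (i = 0; i < n; i++)
		hits += normal_inline(i);
	return hits;
}

/* Nanoseconds taken by one run of fn(n). The result is stored through
 * *out so the measured loop cannot be discarded as dead code. */
uint64_t time_ns(uint64_t (*fn)(uint64_t), uint64_t n, uint64_t *out)
{
	struct timespec a, b;
	int64_t ns;

	clock_gettime(CLOCK_MONOTONIC, &a);
	*out = fn(n);
	clock_gettime(CLOCK_MONOTONIC, &b);
	ns = (int64_t)(b.tv_sec - a.tv_sec) * 1000000000LL
	     + (int64_t)(b.tv_nsec - a.tv_nsec);
	return (uint64_t)ns;
}
```

Calling time_ns() with count_via_call and then count_via_inline for the same n gives two numbers to compare. If the inlined loop is faster by far more than a call/return pair could explain, the compiler has probably simplified the loop itself, which is the distinction the review asks about.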
