[PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa

linux-kernel.vger.kernel.org archive mirror

* [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
@ 2012-10-26 12:54 Kirill A. Shutemov
  2012-10-26 12:54 ` [PATCH 2/2] numa, mm: consolidate error path " Kirill A. Shutemov
  2012-10-26 13:08 ` [PATCH 1/2] numa, mm: drop redundant check " Peter Zijlstra
  0 siblings, 2 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2012-10-26 12:54 UTC (permalink / raw)
  To: linux-mm
  Cc: Will Deacon, Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, Peter Zijlstra, linux-kernel

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

We check if the pmd entry is the same as on pmd_trans_huge() in
handle_mm_fault(). That's enough.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/huge_memory.c |    6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3c14a96..9bb2c23 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -758,12 +758,6 @@ void do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (unlikely(!pmd_same(*pmd, entry)))
 		goto unlock;
 
-	if (unlikely(pmd_trans_splitting(entry))) {
-		spin_unlock(&mm->page_table_lock);
-		wait_split_huge_page(vma->anon_vma, pmd);
-		return;
-	}
-
 	page = pmd_page(entry);
 	if (page) {
 		VM_BUG_ON(!PageCompound(page) || !PageHead(page));
-- 
1.7.10.4



* [PATCH 2/2] numa, mm: consolidate error path in do_huge_pmd_numa_page()
  2012-10-26 12:54 [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page() Kirill A. Shutemov
@ 2012-10-26 12:54 ` Kirill A. Shutemov
  2012-10-26 13:10   ` Peter Zijlstra
  2012-10-26 13:08 ` [PATCH 1/2] numa, mm: drop redundant check " Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2012-10-26 12:54 UTC (permalink / raw)
  To: linux-mm
  Cc: Will Deacon, Kirill A. Shutemov, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, Peter Zijlstra, linux-kernel

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Let's move all error path code to the end of the function. It makes the code
more straightforward.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/huge_memory.c |   44 ++++++++++++++++++++------------------------
 1 file changed, 20 insertions(+), 24 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9bb2c23..95ec485 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -759,30 +759,14 @@ void do_huge_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		goto unlock;
 
 	page = pmd_page(entry);
-	if (page) {
-		VM_BUG_ON(!PageCompound(page) || !PageHead(page));
-
-		get_page(page);
-		node = mpol_misplaced(page, vma, haddr);
-		if (node != -1)
-			goto migrate;
-	}
-
-fixup:
-	/* change back to regular protection */
-	entry = pmd_modify(entry, vma->vm_page_prot);
-	set_pmd_at(mm, haddr, pmd, entry);
-	update_mmu_cache_pmd(vma, address, entry);
-
-unlock:
-	spin_unlock(&mm->page_table_lock);
-	if (page) {
-		task_numa_fault(page_to_nid(page), HPAGE_PMD_NR);
-		put_page(page);
-	}
-	return;
+	if (!page)
+		goto fixup;
+	VM_BUG_ON(!PageCompound(page) || !PageHead(page));
 
-migrate:
+	get_page(page);
+	node = mpol_misplaced(page, vma, haddr);
+	if (node == -1)
+		goto fixup;
 	spin_unlock(&mm->page_table_lock);
 
 	lock_page(page);
@@ -871,7 +855,19 @@ alloc_fail:
 		page = NULL;
 		goto unlock;
 	}
-	goto fixup;
+fixup:
+	/* change back to regular protection */
+	entry = pmd_modify(entry, vma->vm_page_prot);
+	set_pmd_at(mm, haddr, pmd, entry);
+	update_mmu_cache_pmd(vma, address, entry);
+
+unlock:
+	spin_unlock(&mm->page_table_lock);
+	if (page) {
+		task_numa_fault(page_to_nid(page), HPAGE_PMD_NR);
+		put_page(page);
+	}
+	return;
 }
 
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
-- 
1.7.10.4



* Re: [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
  2012-10-26 12:54 [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page() Kirill A. Shutemov
  2012-10-26 12:54 ` [PATCH 2/2] numa, mm: consolidate error path " Kirill A. Shutemov
@ 2012-10-26 13:08 ` Peter Zijlstra
  2012-10-26 13:41   ` Kirill A. Shutemov
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2012-10-26 13:08 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, 2012-10-26 at 15:54 +0300, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> We check if the pmd entry is the same as on pmd_trans_huge() in
> handle_mm_fault(). That's enough.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

Ah indeed, Will mentioned something like this on IRC as well, I hadn't
gotten around to looking at it -- now have, thanks!

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

That said, where in handle_mm_fault() do we wait for a split to
complete? We have a pmd_trans_huge() && !pmd_trans_splitting(), so a
fault on a currently splitting pmd will fall through.

Is it the return from the fault on unlikely(pmd_trans_huge()) ?

I'm probably missing something obvious..


* Re: [PATCH 2/2] numa, mm: consolidate error path in do_huge_pmd_numa_page()
  2012-10-26 12:54 ` [PATCH 2/2] numa, mm: consolidate error path " Kirill A. Shutemov
@ 2012-10-26 13:10   ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2012-10-26 13:10 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, 2012-10-26 at 15:54 +0300, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> 
> Let's move all error path code to the end of the function. It makes the code
> more straightforward.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/huge_memory.c |   44 ++++++++++++++++++++------------------------
>  1 file changed, 20 insertions(+), 24 deletions(-)

and smaller! Thanks!

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>


* Re: [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
  2012-10-26 13:08 ` [PATCH 1/2] numa, mm: drop redundant check " Peter Zijlstra
@ 2012-10-26 13:41   ` Kirill A. Shutemov
  2012-10-26 13:43     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2012-10-26 13:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, Oct 26, 2012 at 03:08:05PM +0200, Peter Zijlstra wrote:
> On Fri, 2012-10-26 at 15:54 +0300, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > We check if the pmd entry is the same as on pmd_trans_huge() in
> > handle_mm_fault(). That's enough.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> Ah indeed, Will mentioned something like this on IRC as well, I hadn't
> gotten around to looking at it -- now have, thanks!
> 
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> 
> That said, where in handle_mm_fault() do we wait for a split to
> complete? We have a pmd_trans_huge() && !pmd_trans_splitting(), so a
> fault on a currently splitting pmd will fall through.
> 
> Is it the return from the fault on unlikely(pmd_trans_huge()) ?

Yes, this code will catch it:

	/* if an huge pmd materialized from under us just retry later */
	if (unlikely(pmd_trans_huge(*pmd)))
		return 0;

If the pmd is under splitting it's still a pmd_trans_huge().

-- 
 Kirill A. Shutemov



* Re: [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
  2012-10-26 13:41   ` Kirill A. Shutemov
@ 2012-10-26 13:43     ` Peter Zijlstra
  2012-10-26 13:57       ` Kirill A. Shutemov
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2012-10-26 13:43 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, 2012-10-26 at 16:41 +0300, Kirill A. Shutemov wrote:
> On Fri, Oct 26, 2012 at 03:08:05PM +0200, Peter Zijlstra wrote:
> > On Fri, 2012-10-26 at 15:54 +0300, Kirill A. Shutemov wrote:
> > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > 
> > > We check if the pmd entry is the same as on pmd_trans_huge() in
> > > handle_mm_fault(). That's enough.
> > > 
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > Ah indeed, Will mentioned something like this on IRC as well, I hadn't
> > gotten around to looking at it -- now have, thanks!
> > 
> > Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > 
> > That said, where in handle_mm_fault() do we wait for a split to
> > complete? We have a pmd_trans_huge() && !pmd_trans_splitting(), so a
> > fault on a currently splitting pmd will fall through.
> > 
> > Is it the return from the fault on unlikely(pmd_trans_huge()) ?
> 
> Yes, this code will catch it:
> 
> 	/* if an huge pmd materialized from under us just retry later */
> 	if (unlikely(pmd_trans_huge(*pmd)))
> 		return 0;
> 
> If the pmd is under splitting it's still a pmd_trans_huge().

OK, so then we simply keep taking the same fault until the split is
complete? Wouldn't it be better to wait for it instead of spinning on
faults?


* Re: [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
  2012-10-26 13:43     ` Peter Zijlstra
@ 2012-10-26 13:57       ` Kirill A. Shutemov
  2012-10-26 14:07         ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Kirill A. Shutemov @ 2012-10-26 13:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, Oct 26, 2012 at 03:43:12PM +0200, Peter Zijlstra wrote:
> On Fri, 2012-10-26 at 16:41 +0300, Kirill A. Shutemov wrote:
> > On Fri, Oct 26, 2012 at 03:08:05PM +0200, Peter Zijlstra wrote:
> > > On Fri, 2012-10-26 at 15:54 +0300, Kirill A. Shutemov wrote:
> > > > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > > > 
> > > > We check if the pmd entry is the same as on pmd_trans_huge() in
> > > > handle_mm_fault(). That's enough.
> > > > 
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > 
> > > Ah indeed, Will mentioned something like this on IRC as well, I hadn't
> > > gotten around to looking at it -- now have, thanks!
> > > 
> > > Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > > 
> > > That said, where in handle_mm_fault() do we wait for a split to
> > > complete? We have a pmd_trans_huge() && !pmd_trans_splitting(), so a
> > > fault on a currently splitting pmd will fall through.
> > > 
> > > Is it the return from the fault on unlikely(pmd_trans_huge()) ?
> > 
> > Yes, this code will catch it:
> > 
> > 	/* if an huge pmd materialized from under us just retry later */
> > 	if (unlikely(pmd_trans_huge(*pmd)))
> > 		return 0;
> > 
> > If the pmd is under splitting it's still a pmd_trans_huge().
> 
> OK, so then we simply keep taking the same fault until the split is
> complete? Wouldn't it be better to wait for it instead of spinning on
> faults?

IIUC, on the next fault we will wait for the page split in follow_page().

-- 
 Kirill A. Shutemov



* Re: [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
  2012-10-26 13:57       ` Kirill A. Shutemov
@ 2012-10-26 14:07         ` Peter Zijlstra
  2012-10-26 14:34           ` Kirill A. Shutemov
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2012-10-26 14:07 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, 2012-10-26 at 16:57 +0300, Kirill A. Shutemov wrote:
> > > Yes, this code will catch it:
> > > 
> > >     /* if an huge pmd materialized from under us just retry later */
> > >     if (unlikely(pmd_trans_huge(*pmd)))
> > >             return 0;
> > > 
> > > If the pmd is under splitting it's still a pmd_trans_huge().
> > 
> > OK, so then we simply keep taking the same fault until the split is
> > complete? Wouldn't it be better to wait for it instead of spinning on
> > faults?
> 
> IIUC, on the next fault we will wait for the page split in follow_page().

What follow_page()? A regular hardware page fault will not call
follow_page() afaict, we do a down_read(), find_vma() and call
handle_mm_fault() -- with a lot of error and corner case checking in
between.




* Re: [PATCH 1/2] numa, mm: drop redundant check in do_huge_pmd_numa_page()
  2012-10-26 14:07         ` Peter Zijlstra
@ 2012-10-26 14:34           ` Kirill A. Shutemov
  0 siblings, 0 replies; 9+ messages in thread
From: Kirill A. Shutemov @ 2012-10-26 14:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-mm, Will Deacon, Andrew Morton, Andrea Arcangeli,
	Xiao Guangrong, Ingo Molnar, linux-kernel

On Fri, Oct 26, 2012 at 04:07:44PM +0200, Peter Zijlstra wrote:
> On Fri, 2012-10-26 at 16:57 +0300, Kirill A. Shutemov wrote:
> > > > Yes, this code will catch it:
> > > > 
> > > >     /* if an huge pmd materialized from under us just retry later */
> > > >     if (unlikely(pmd_trans_huge(*pmd)))
> > > >             return 0;
> > > > 
> > > > If the pmd is under splitting it's still a pmd_trans_huge().
> > > 
> > > OK, so then we simply keep taking the same fault until the split is
> > > complete? Wouldn't it be better to wait for it instead of spinning on
> > > faults?
> > 
> > IIUC, on the next fault we will wait for the page split in follow_page().
> 
> What follow_page()? A regular hardware page fault will not call
> follow_page() afaict, we do a down_read(), find_vma() and call
> handle_mm_fault() -- with a lot of error and corner case checking in
> between.

Yeah, you're right. Then it seems we're spinning on the fault until the
page is split.

I'm not sure how long splitting takes, or whether splitting itself can fix
some of the reasons for the fault.

-- 
 Kirill A. Shutemov

