All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-15 10:12 ` Mel Gorman
  0 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-15 10:12 UTC (permalink / raw)
  To: akpm
  Cc: Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

With transparent hugepage support, handle_mm_fault() has to be careful
that a normal PMD has been established before handling a PTE fault. To
achieve this, it used __pte_alloc() directly instead of pte_alloc_map
as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
is called once it is known the PMD is safe.

pte_alloc_map() is smart enough to check if a PTE is already present
before calling __pte_alloc but this check was lost. As a consequence,
PTEs may be allocated unnecessarily and the page table lock taken.
Thi useless PTE does get cleaned up but it's a performance hit which
is visible in page_test from aim9.

This patch simply re-adds the check normally done by pte_alloc_map to
check if the PTE needs to be allocated before taking the page table
lock. The effect is noticable in page_test from aim9.

AIM9
                2.6.38-vanilla 2.6.38-checkptenone
creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
exec_test      382.00 ( 0.00%)   456.90 (16.39%)
fork_test       60.11 ( 0.00%)    67.79 (11.34%)
MMTests Statistics: duration
Total Elapsed Time (seconds)                611.90    612.22

(While this affects 2.6.38, it is a performance rather than a
functional bug and normally outside the rules -stable. While the big
performance differences are to a microbench, the difference in fork
and exec performance may be significant enough that -stable wants to
consider the patch)

Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
-- 
 mm/memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5823698..1659574 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * run pte_offset_map on the pmd, if an huge pmd could
 	 * materialize from under us from a different thread.
 	 */
-	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
+	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
 		return VM_FAULT_OOM;
 	/* if an huge pmd materialized from under us just retry later */
 	if (unlikely(pmd_trans_huge(*pmd)))

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-15 10:12 ` Mel Gorman
  0 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-15 10:12 UTC (permalink / raw)
  To: akpm
  Cc: Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

With transparent hugepage support, handle_mm_fault() has to be careful
that a normal PMD has been established before handling a PTE fault. To
achieve this, it used __pte_alloc() directly instead of pte_alloc_map
as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
is called once it is known the PMD is safe.

pte_alloc_map() is smart enough to check if a PTE is already present
before calling __pte_alloc but this check was lost. As a consequence,
PTEs may be allocated unnecessarily and the page table lock taken.
Thi useless PTE does get cleaned up but it's a performance hit which
is visible in page_test from aim9.

This patch simply re-adds the check normally done by pte_alloc_map to
check if the PTE needs to be allocated before taking the page table
lock. The effect is noticable in page_test from aim9.

AIM9
                2.6.38-vanilla 2.6.38-checkptenone
creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
exec_test      382.00 ( 0.00%)   456.90 (16.39%)
fork_test       60.11 ( 0.00%)    67.79 (11.34%)
MMTests Statistics: duration
Total Elapsed Time (seconds)                611.90    612.22

(While this affects 2.6.38, it is a performance rather than a
functional bug and normally outside the rules -stable. While the big
performance differences are to a microbench, the difference in fork
and exec performance may be significant enough that -stable wants to
consider the patch)

Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
-- 
 mm/memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 5823698..1659574 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * run pte_offset_map on the pmd, if an huge pmd could
 	 * materialize from under us from a different thread.
 	 */
-	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
+	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
 		return VM_FAULT_OOM;
 	/* if an huge pmd materialized from under us just retry later */
 	if (unlikely(pmd_trans_huge(*pmd)))

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 10:12 ` Mel Gorman
@ 2011-04-15 13:23   ` Rik van Riel
  -1 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2011-04-15 13:23 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, kosaki.motohiro, lkml,
	linux-mm, stable

On 04/15/2011 06:12 AM, Mel Gorman wrote:

> This patch simply re-adds the check normally done by pte_alloc_map to
> check if the PTE needs to be allocated before taking the page table
> lock. The effect is noticable in page_test from aim9.

> Reported-by: Raz Ben Yehuda<raziebe@gmail.com>
> Signed-off-by: Mel Gorman<mgorman@suse.de>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-15 13:23   ` Rik van Riel
  0 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2011-04-15 13:23 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, kosaki.motohiro, lkml,
	linux-mm, stable

On 04/15/2011 06:12 AM, Mel Gorman wrote:

> This patch simply re-adds the check normally done by pte_alloc_map to
> check if the PTE needs to be allocated before taking the page table
> lock. The effect is noticable in page_test from aim9.

> Reported-by: Raz Ben Yehuda<raziebe@gmail.com>
> Signed-off-by: Mel Gorman<mgorman@suse.de>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 10:12 ` Mel Gorman
@ 2011-04-15 14:39   ` Andrea Arcangeli
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-15 14:39 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, raz ben yehuda, riel, kosaki.motohiro, lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index 5823698..1659574 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * run pte_offset_map on the pmd, if an huge pmd could
>  	 * materialize from under us from a different thread.
>  	 */
> -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>  		return VM_FAULT_OOM;
>  	/* if an huge pmd materialized from under us just retry later */
>  	if (unlikely(pmd_trans_huge(*pmd)))

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-15 14:39   ` Andrea Arcangeli
  0 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-15 14:39 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, raz ben yehuda, riel, kosaki.motohiro, lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index 5823698..1659574 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * run pte_offset_map on the pmd, if an huge pmd could
>  	 * materialize from under us from a different thread.
>  	 */
> -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>  		return VM_FAULT_OOM;
>  	/* if an huge pmd materialized from under us just retry later */
>  	if (unlikely(pmd_trans_huge(*pmd)))

Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 14:39   ` Andrea Arcangeli
@ 2011-04-15 15:06     ` Andrea Arcangeli
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-15 15:06 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, raz ben yehuda, riel, kosaki.motohiro, lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5823698..1659574 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> >  	 * run pte_offset_map on the pmd, if an huge pmd could
> >  	 * materialize from under us from a different thread.
> >  	 */
> > -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))

I started hacking on this and I noticed it'd be better to extend the
unlikely through the end. At first review I didn't notice the
parenthesis closure stops after pte_none and __pte_alloc is now
uncovered. I'd prefer this:

    if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))

I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
end up calling __pte_alloc or not, depends on the app. Generally it
sounds more frequent that the pte is not none, so it's not wrong, but
it's even less likely that __pte_alloc fails so that can be taken into
account too, and __pte_alloc runs still quite frequently. So either
above or:

    if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))

I generally prefer unlikely only when it's 100% sure thing it's less
likely (like the VM_FAULT_OOM), so the first version I guess it's
enough (I'm afraid unlikely for pte_none too, may make gcc generate a
far away jump possibly going out of l1 icache for a case that is only
512 times less likely at best). My point is that it's certainly hugely
more unlikely that __pte_alloc fails than the pte is none.

This is a real nitpick though ;).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-15 15:06     ` Andrea Arcangeli
  0 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-15 15:06 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, raz ben yehuda, riel, kosaki.motohiro, lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5823698..1659574 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> >  	 * run pte_offset_map on the pmd, if an huge pmd could
> >  	 * materialize from under us from a different thread.
> >  	 */
> > -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))

I started hacking on this and I noticed it'd be better to extend the
unlikely through the end. At first review I didn't notice the
parenthesis closure stops after pte_none and __pte_alloc is now
uncovered. I'd prefer this:

    if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))

I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
end up calling __pte_alloc or not, depends on the app. Generally it
sounds more frequent that the pte is not none, so it's not wrong, but
it's even less likely that __pte_alloc fails so that can be taken into
account too, and __pte_alloc runs still quite frequently. So either
above or:

    if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))

I generally prefer unlikely only when it's 100% sure thing it's less
likely (like the VM_FAULT_OOM), so the first version I guess it's
enough (I'm afraid unlikely for pte_none too, may make gcc generate a
far away jump possibly going out of l1 icache for a case that is only
512 times less likely at best). My point is that it's certainly hugely
more unlikely that __pte_alloc fails than the pte is none.

This is a real nitpick though ;).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 15:06     ` Andrea Arcangeli
@ 2011-04-18  7:21       ` raz ben yehuda
  -1 siblings, 0 replies; 30+ messages in thread
From: raz ben yehuda @ 2011-04-18  7:21 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Mel Gorman, akpm, riel, kosaki.motohiro, lkml, linux-mm, stable, shai

patch works great. thank you Andrea.


On Fri, 2011-04-15 at 17:06 +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> > On Fri, Apr 15, 2011 at 11:12:4A.M +0100, Mel Gorman wrote:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > >  	 * run pte_offset_map on the pmd, if an huge pmd could
> > >  	 * materialize from under us from a different thread.
> > >  	 */
> > > -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> 
> I started hacking on this and I noticed it'd be better to extend the
> unlikely through the end. At first review I didn't notice the
> parenthesis closure stops after pte_none and __pte_alloc is now
> uncovered. I'd prefer this:
> 
>     if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))
> 
> I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
> end up calling __pte_alloc or not, depends on the app. Generally it
> sounds more frequent that the pte is not none, so it's not wrong, but
> it's even less likely that __pte_alloc fails so that can be taken into
> account too, and __pte_alloc runs still quite frequently. So either
> above or:
> 
>     if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))
> 
> I generally prefer unlikely only when it's 100% sure thing it's less
> likely (like the VM_FAULT_OOM), so the first version I guess it's
> enough (I'm afraid unlikely for pte_none too, may make gcc generate a
> far away jump possibly going out of l1 icache for a case that is only
> 512 times less likely at best). My point is that it's certainly hugely
> more unlikely that __pte_alloc fails than the pte is none.
> 
> This is a real nitpick though ;).



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-18  7:21       ` raz ben yehuda
  0 siblings, 0 replies; 30+ messages in thread
From: raz ben yehuda @ 2011-04-18  7:21 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Mel Gorman, akpm, riel, kosaki.motohiro, lkml, linux-mm, stable, shai

patch works great. thank you Andrea.


On Fri, 2011-04-15 at 17:06 +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> > On Fri, Apr 15, 2011 at 11:12:4A.M +0100, Mel Gorman wrote:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > >  	 * run pte_offset_map on the pmd, if an huge pmd could
> > >  	 * materialize from under us from a different thread.
> > >  	 */
> > > -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> 
> I started hacking on this and I noticed it'd be better to extend the
> unlikely through the end. At first review I didn't notice the
> parenthesis closure stops after pte_none and __pte_alloc is now
> uncovered. I'd prefer this:
> 
>     if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))
> 
> I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
> end up calling __pte_alloc or not, depends on the app. Generally it
> sounds more frequent that the pte is not none, so it's not wrong, but
> it's even less likely that __pte_alloc fails so that can be taken into
> account too, and __pte_alloc runs still quite frequently. So either
> above or:
> 
>     if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))
> 
> I generally prefer unlikely only when it's 100% sure thing it's less
> likely (like the VM_FAULT_OOM), so the first version I guess it's
> enough (I'm afraid unlikely for pte_none too, may make gcc generate a
> far away jump possibly going out of l1 icache for a case that is only
> 512 times less likely at best). My point is that it's certainly hugely
> more unlikely that __pte_alloc fails than the pte is none.
> 
> This is a real nitpick though ;).


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 15:06     ` Andrea Arcangeli
@ 2011-04-18 10:23       ` Mel Gorman
  -1 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-18 10:23 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: akpm, raz ben yehuda, riel, kosaki.motohiro, lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 05:06:06PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> > On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > >  	 * run pte_offset_map on the pmd, if an huge pmd could
> > >  	 * materialize from under us from a different thread.
> > >  	 */
> > > -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> 
> I started hacking on this and I noticed it'd be better to extend the
> unlikely through the end. At first review I didn't notice the
> parenthesis closure stops after pte_none and __pte_alloc is now
> uncovered. I'd prefer this:
> 
>     if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))
> 

I had this at one point and then decided to match what we do for
pte_alloc_map(). My reasoning was that the most important part of this
check is pmd_none(). It's relatively unlikely we even call __pte_alloc
which is why I didn't think it belonged in the unlikely block. I also
preferred being consistent with pte_alloc_map.

> I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
> end up calling __pte_alloc or not, depends on the app. Generally it
> sounds more frequent that the pte is not none, so it's not wrong, but
> it's even less likely that __pte_alloc fails so that can be taken into
> account too, and __pte_alloc runs still quite frequently. So either
> above or:
> 
>     if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))
> 

I'd prefer this than putting everything inside the same unlikely block.
But if this makes a noticeable, why do we not do it for pte_alloc_map,
pmd_alloc and other similar functions?

> I generally prefer unlikely only when it's 100% sure thing it's less
> likely (like the VM_FAULT_OOM), so the first version I guess it's
> enough (I'm afraid unlikely for pte_none too, may make gcc generate a
> far away jump possibly going out of l1 icache for a case that is only
> 512 times less likely at best). My point is that it's certainly hugely
> more unlikely that __pte_alloc fails than the pte is none.
> 

For the bug fix, it's best to match what pte_alloc_map, pmd_alloc,
pud_alloc and others do in terms of how it uses unlikely. If what we are
currently doing is sub-optimal, a single patch should change all the
helpers.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-18 10:23       ` Mel Gorman
  0 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-18 10:23 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: akpm, raz ben yehuda, riel, kosaki.motohiro, lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 05:06:06PM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 15, 2011 at 04:39:16PM +0200, Andrea Arcangeli wrote:
> > On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > >  	 * run pte_offset_map on the pmd, if an huge pmd could
> > >  	 * materialize from under us from a different thread.
> > >  	 */
> > > -	if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > +	if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> 
> I started hacking on this and I noticed it'd be better to extend the
> unlikely through the end. At first review I didn't notice the
> parenthesis closure stops after pte_none and __pte_alloc is now
> uncovered. I'd prefer this:
> 
>     if (unlikely(pmd_none(*pmd) && __pte_alloc(mm, vma, pmd, address)))
> 

I had this at one point and then decided to match what we do for
pte_alloc_map(). My reasoning was that the most important part of this
check is pmd_none(). It's relatively unlikely we even call __pte_alloc
which is why I didn't think it belonged in the unlikely block. I also
preferred being consistent with pte_alloc_map.

> I mean the real unlikely thing is that we return VM_FAULT_OOM, if we
> end up calling __pte_alloc or not, depends on the app. Generally it
> sounds more frequent that the pte is not none, so it's not wrong, but
> it's even less likely that __pte_alloc fails so that can be taken into
> account too, and __pte_alloc runs still quite frequently. So either
> above or:
> 
>     if (unlikely(pmd_none(*pmd)) && unlikely(__pte_alloc(mm, vma, pmd, address)))
> 

I'd prefer this than putting everything inside the same unlikely block.
But if this makes a noticeable, why do we not do it for pte_alloc_map,
pmd_alloc and other similar functions?

> I generally prefer unlikely only when it's 100% sure thing it's less
> likely (like the VM_FAULT_OOM), so the first version I guess it's
> enough (I'm afraid unlikely for pte_none too, may make gcc generate a
> far away jump possibly going out of l1 icache for a case that is only
> 512 times less likely at best). My point is that it's certainly hugely
> more unlikely that __pte_alloc fails than the pte is none.
> 

For the bug fix, it's best to match what pte_alloc_map, pmd_alloc,
pud_alloc and others do in terms of how it uses unlikely. If what we are
currently doing is sub-optimal, a single patch should change all the
helpers.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 10:12 ` Mel Gorman
@ 2011-04-21  6:59   ` Minchan Kim
  -1 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-21  6:59 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

Hi Mel,

On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> With transparent hugepage support, handle_mm_fault() has to be careful
> that a normal PMD has been established before handling a PTE fault. To
> achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> is called once it is known the PMD is safe.
>
> pte_alloc_map() is smart enough to check if a PTE is already present
> before calling __pte_alloc but this check was lost. As a consequence,
> PTEs may be allocated unnecessarily and the page table lock taken.
> Thi useless PTE does get cleaned up but it's a performance hit which
> is visible in page_test from aim9.
>
> This patch simply re-adds the check normally done by pte_alloc_map to
> check if the PTE needs to be allocated before taking the page table
> lock. The effect is noticable in page_test from aim9.
>
> AIM9
>                2.6.38-vanilla 2.6.38-checkptenone
> creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> MMTests Statistics: duration
> Total Elapsed Time (seconds)                611.90    612.22
>
> (While this affects 2.6.38, it is a performance rather than a
> functional bug and normally outside the rules -stable. While the big
> performance differences are to a microbench, the difference in fork
> and exec performance may be significant enough that -stable wants to
> consider the patch)
>
> Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> --
>  mm/memory.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 5823698..1659574 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>         * run pte_offset_map on the pmd, if an huge pmd could
>         * materialize from under us from a different thread.
>         */
> -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>                return VM_FAULT_OOM;
>        /* if an huge pmd materialized from under us just retry later */
>        if (unlikely(pmd_trans_huge(*pmd)))
>

Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

Sorry for jumping in too late. I have a just nitpick.

We have another place, do_huge_pmd_anonymous_page.
Although it isn't workload of page_test, is it valuable to expand your
patch to cover it?
If there is workload there are many thread and share one shared anon
vma in ALWAYS THP mode, same problem would happen.


-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-21  6:59   ` Minchan Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-21  6:59 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

Hi Mel,

On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> With transparent hugepage support, handle_mm_fault() has to be careful
> that a normal PMD has been established before handling a PTE fault. To
> achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> is called once it is known the PMD is safe.
>
> pte_alloc_map() is smart enough to check if a PTE is already present
> before calling __pte_alloc but this check was lost. As a consequence,
> PTEs may be allocated unnecessarily and the page table lock taken.
> Thi useless PTE does get cleaned up but it's a performance hit which
> is visible in page_test from aim9.
>
> This patch simply re-adds the check normally done by pte_alloc_map to
> check if the PTE needs to be allocated before taking the page table
> lock. The effect is noticable in page_test from aim9.
>
> AIM9
>                2.6.38-vanilla 2.6.38-checkptenone
> creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> MMTests Statistics: duration
> Total Elapsed Time (seconds)                611.90    612.22
>
> (While this affects 2.6.38, it is a performance rather than a
> functional bug and normally outside the rules -stable. While the big
> performance differences are to a microbench, the difference in fork
> and exec performance may be significant enough that -stable wants to
> consider the patch)
>
> Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> --
>  mm/memory.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 5823698..1659574 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>         * run pte_offset_map on the pmd, if an huge pmd could
>         * materialize from under us from a different thread.
>         */
> -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>                return VM_FAULT_OOM;
>        /* if an huge pmd materialized from under us just retry later */
>        if (unlikely(pmd_trans_huge(*pmd)))
>

Reviewed-by: Minchan Kim <minchan.kim@gmail.com>

Sorry for jumping in too late. I have a just nitpick.

We have another place, do_huge_pmd_anonymous_page.
Although it isn't workload of page_test, is it valuable to expand your
patch to cover it?
If there is workload there are many thread and share one shared anon
vma in ALWAYS THP mode, same problem would happen.


-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-21  6:59   ` Minchan Kim
@ 2011-04-21 11:08     ` Mel Gorman
  -1 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-21 11:08 UTC (permalink / raw)
  To: Minchan Kim
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
> Hi Mel,
> 
> On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> > With transparent hugepage support, handle_mm_fault() has to be careful
> > that a normal PMD has been established before handling a PTE fault. To
> > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> > is called once it is known the PMD is safe.
> >
> > pte_alloc_map() is smart enough to check if a PTE is already present
> > before calling __pte_alloc but this check was lost. As a consequence,
> > PTEs may be allocated unnecessarily and the page table lock taken.
> > Thi useless PTE does get cleaned up but it's a performance hit which
> > is visible in page_test from aim9.
> >
> > This patch simply re-adds the check normally done by pte_alloc_map to
> > check if the PTE needs to be allocated before taking the page table
> > lock. The effect is noticable in page_test from aim9.
> >
> > AIM9
> >                2.6.38-vanilla 2.6.38-checkptenone
> > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> > MMTests Statistics: duration
> > Total Elapsed Time (seconds)                611.90    612.22
> >
> > (While this affects 2.6.38, it is a performance rather than a
> > functional bug and normally outside the rules -stable. While the big
> > performance differences are to a microbench, the difference in fork
> > and exec performance may be significant enough that -stable wants to
> > consider the patch)
> >
> > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > --
> >  mm/memory.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5823698..1659574 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> >         * run pte_offset_map on the pmd, if an huge pmd could
> >         * materialize from under us from a different thread.
> >         */
> > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> >                return VM_FAULT_OOM;
> >        /* if an huge pmd materialized from under us just retry later */
> >        if (unlikely(pmd_trans_huge(*pmd)))
> >
> 
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> 
> Sorry for jumping in too late. I have a just nitpick.
> 

Better late than never :)

> We have another place, do_huge_pmd_anonymous_page.
> Although it isn't workload of page_test, is it valuable to expand your
> patch to cover it?
> If there is workload there are many thread and share one shared anon
> vma in ALWAYS THP mode, same problem would happen.

We already checked pmd_none() in handle_mm_fault() before calling
into do_huge_pmd_anonymous_page(). We could race for the fault while
attempting to allocate a huge page but it wouldn't be as severe a
problem particularly as it is encountered after failing a 2M allocation.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-21 11:08     ` Mel Gorman
  0 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-21 11:08 UTC (permalink / raw)
  To: Minchan Kim
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
> Hi Mel,
> 
> On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> > With transparent hugepage support, handle_mm_fault() has to be careful
> > that a normal PMD has been established before handling a PTE fault. To
> > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> > is called once it is known the PMD is safe.
> >
> > pte_alloc_map() is smart enough to check if a PTE is already present
> > before calling __pte_alloc but this check was lost. As a consequence,
> > PTEs may be allocated unnecessarily and the page table lock taken.
> > Thi useless PTE does get cleaned up but it's a performance hit which
> > is visible in page_test from aim9.
> >
> > This patch simply re-adds the check normally done by pte_alloc_map to
> > check if the PTE needs to be allocated before taking the page table
> > lock. The effect is noticable in page_test from aim9.
> >
> > AIM9
> >                2.6.38-vanilla 2.6.38-checkptenone
> > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> > MMTests Statistics: duration
> > Total Elapsed Time (seconds)                611.90    612.22
> >
> > (While this affects 2.6.38, it is a performance rather than a
> > functional bug and normally outside the rules -stable. While the big
> > performance differences are to a microbench, the difference in fork
> > and exec performance may be significant enough that -stable wants to
> > consider the patch)
> >
> > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > --
> >  mm/memory.c |    2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 5823698..1659574 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> >         * run pte_offset_map on the pmd, if an huge pmd could
> >         * materialize from under us from a different thread.
> >         */
> > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> >                return VM_FAULT_OOM;
> >        /* if an huge pmd materialized from under us just retry later */
> >        if (unlikely(pmd_trans_huge(*pmd)))
> >
> 
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> 
> Sorry for jumping in too late. I have a just nitpick.
> 

Better late than never :)

> We have another place, do_huge_pmd_anonymous_page.
> Although it isn't workload of page_test, is it valuable to expand your
> patch to cover it?
> If there is workload there are many thread and share one shared anon
> vma in ALWAYS THP mode, same problem would happen.

We already checked pmd_none() in handle_mm_fault() before calling
into do_huge_pmd_anonymous_page(). We could race for the fault while
attempting to allocate a huge page but it wouldn't be as severe a
problem particularly as it is encountered after failing a 2M allocation.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-21 11:08     ` Mel Gorman
@ 2011-04-21 14:26       ` Minchan Kim
  -1 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-21 14:26 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Thu, Apr 21, 2011 at 12:08:41PM +0100, Mel Gorman wrote:
> On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
> > Hi Mel,
> > 
> > On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> > > With transparent hugepage support, handle_mm_fault() has to be careful
> > > that a normal PMD has been established before handling a PTE fault. To
> > > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> > > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> > > is called once it is known the PMD is safe.
> > >
> > > pte_alloc_map() is smart enough to check if a PTE is already present
> > > before calling __pte_alloc but this check was lost. As a consequence,
> > > PTEs may be allocated unnecessarily and the page table lock taken.
> > > Thi useless PTE does get cleaned up but it's a performance hit which
> > > is visible in page_test from aim9.
> > >
> > > This patch simply re-adds the check normally done by pte_alloc_map to
> > > check if the PTE needs to be allocated before taking the page table
> > > lock. The effect is noticable in page_test from aim9.
> > >
> > > AIM9
> > >                2.6.38-vanilla 2.6.38-checkptenone
> > > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> > > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> > > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> > > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> > > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> > > MMTests Statistics: duration
> > > Total Elapsed Time (seconds)                611.90    612.22
> > >
> > > (While this affects 2.6.38, it is a performance rather than a
> > > functional bug and normally outside the rules -stable. While the big
> > > performance differences are to a microbench, the difference in fork
> > > and exec performance may be significant enough that -stable wants to
> > > consider the patch)
> > >
> > > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > > --
> > >  mm/memory.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > >         * run pte_offset_map on the pmd, if an huge pmd could
> > >         * materialize from under us from a different thread.
> > >         */
> > > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> > >                return VM_FAULT_OOM;
> > >        /* if an huge pmd materialized from under us just retry later */
> > >        if (unlikely(pmd_trans_huge(*pmd)))
> > >
> > 
> > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> > 
> > Sorry for jumping in too late. I have a just nitpick.
> > 
> 
> Better late than never :)
> 
> > We have another place, do_huge_pmd_anonymous_page.
> > Although it isn't workload of page_test, is it valuable to expand your
> > patch to cover it?
> > If there is workload there are many thread and share one shared anon
> > vma in ALWAYS THP mode, same problem would happen.
> 
> We already checked pmd_none() in handle_mm_fault() before calling
> into do_huge_pmd_anonymous_page(). We could race for the fault while
> attempting to allocate a huge page but it wouldn't be as severe a
> problem particularly as it is encountered after failing a 2M allocation.

Right you are. Fail ot 2M allocation would affect as throttle.
Thanks.

As I failed let you add the check, I have to reveal my mind. :)
Actually, what I want is consistency of the code.
The code have been same in two places but you find the problem in page_test of aim9,
you changed one of them slightly. I think in future someone will
have a question about that and he will start grep git log but it will take
a long time as the log is buried other code piled up. 

I hope adding the comment in this case.

        /*
         * PTEs may be allocated unnecessarily and the page table lock taken.
         * The useless PTE does get cleaned up but it's a performance hit in
         * some micro-benchmark. Let's check pmd_none before __pte_alloc to
         * reduce the overhead. 
         */
-       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
+       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))

If you mind it as someone who have a question can find the log at last 
although he need some time, I wouldn't care of the nitpick any more. :)
It's up to you. 

Thanks, Mel.

> 
> -- 
> Mel Gorman
> SUSE Labs

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-21 14:26       ` Minchan Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-21 14:26 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Thu, Apr 21, 2011 at 12:08:41PM +0100, Mel Gorman wrote:
> On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
> > Hi Mel,
> > 
> > On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> > > With transparent hugepage support, handle_mm_fault() has to be careful
> > > that a normal PMD has been established before handling a PTE fault. To
> > > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> > > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> > > is called once it is known the PMD is safe.
> > >
> > > pte_alloc_map() is smart enough to check if a PTE is already present
> > > before calling __pte_alloc but this check was lost. As a consequence,
> > > PTEs may be allocated unnecessarily and the page table lock taken.
> > > Thi useless PTE does get cleaned up but it's a performance hit which
> > > is visible in page_test from aim9.
> > >
> > > This patch simply re-adds the check normally done by pte_alloc_map to
> > > check if the PTE needs to be allocated before taking the page table
> > > lock. The effect is noticable in page_test from aim9.
> > >
> > > AIM9
> > >                2.6.38-vanilla 2.6.38-checkptenone
> > > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> > > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> > > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> > > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> > > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> > > MMTests Statistics: duration
> > > Total Elapsed Time (seconds)                611.90    612.22
> > >
> > > (While this affects 2.6.38, it is a performance rather than a
> > > functional bug and normally outside the rules -stable. While the big
> > > performance differences are to a microbench, the difference in fork
> > > and exec performance may be significant enough that -stable wants to
> > > consider the patch)
> > >
> > > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > > --
> > >  mm/memory.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 5823698..1659574 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > >         * run pte_offset_map on the pmd, if an huge pmd could
> > >         * materialize from under us from a different thread.
> > >         */
> > > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> > >                return VM_FAULT_OOM;
> > >        /* if an huge pmd materialized from under us just retry later */
> > >        if (unlikely(pmd_trans_huge(*pmd)))
> > >
> > 
> > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> > 
> > Sorry for jumping in too late. I have a just nitpick.
> > 
> 
> Better late than never :)
> 
> > We have another place, do_huge_pmd_anonymous_page.
> > Although it isn't workload of page_test, is it valuable to expand your
> > patch to cover it?
> > If there is workload there are many thread and share one shared anon
> > vma in ALWAYS THP mode, same problem would happen.
> 
> We already checked pmd_none() in handle_mm_fault() before calling
> into do_huge_pmd_anonymous_page(). We could race for the fault while
> attempting to allocate a huge page but it wouldn't be as severe a
> problem particularly as it is encountered after failing a 2M allocation.

Right you are. Fail ot 2M allocation would affect as throttle.
Thanks.

As I failed let you add the check, I have to reveal my mind. :)
Actually, what I want is consistency of the code.
The code have been same in two places but you find the problem in page_test of aim9,
you changed one of them slightly. I think in future someone will
have a question about that and he will start grep git log but it will take
a long time as the log is buried other code piled up. 

I hope adding the comment in this case.

        /*
         * PTEs may be allocated unnecessarily and the page table lock taken.
         * The useless PTE does get cleaned up but it's a performance hit in
         * some micro-benchmark. Let's check pmd_none before __pte_alloc to
         * reduce the overhead. 
         */
-       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
+       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))

If you mind it as someone who have a question can find the log at last 
although he need some time, I wouldn't care of the nitpick any more. :)
It's up to you. 

Thanks, Mel.

> 
> -- 
> Mel Gorman
> SUSE Labs

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-21 14:26       ` Minchan Kim
@ 2011-04-21 16:00         ` Mel Gorman
  -1 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-21 16:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Thu, Apr 21, 2011 at 11:26:36PM +0900, Minchan Kim wrote:
> On Thu, Apr 21, 2011 at 12:08:41PM +0100, Mel Gorman wrote:
> > On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
> > > Hi Mel,
> > > 
> > > On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> > > > With transparent hugepage support, handle_mm_fault() has to be careful
> > > > that a normal PMD has been established before handling a PTE fault. To
> > > > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> > > > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> > > > is called once it is known the PMD is safe.
> > > >
> > > > pte_alloc_map() is smart enough to check if a PTE is already present
> > > > before calling __pte_alloc but this check was lost. As a consequence,
> > > > PTEs may be allocated unnecessarily and the page table lock taken.
> > > > Thi useless PTE does get cleaned up but it's a performance hit which
> > > > is visible in page_test from aim9.
> > > >
> > > > This patch simply re-adds the check normally done by pte_alloc_map to
> > > > check if the PTE needs to be allocated before taking the page table
> > > > lock. The effect is noticable in page_test from aim9.
> > > >
> > > > AIM9
> > > >                2.6.38-vanilla 2.6.38-checkptenone
> > > > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> > > > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> > > > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> > > > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> > > > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> > > > MMTests Statistics: duration
> > > > Total Elapsed Time (seconds)                611.90    612.22
> > > >
> > > > (While this affects 2.6.38, it is a performance rather than a
> > > > functional bug and normally outside the rules -stable. While the big
> > > > performance differences are to a microbench, the difference in fork
> > > > and exec performance may be significant enough that -stable wants to
> > > > consider the patch)
> > > >
> > > > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> > > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > > > --
> > > >  mm/memory.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 5823698..1659574 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > > >         * run pte_offset_map on the pmd, if an huge pmd could
> > > >         * materialize from under us from a different thread.
> > > >         */
> > > > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> > > >                return VM_FAULT_OOM;
> > > >        /* if an huge pmd materialized from under us just retry later */
> > > >        if (unlikely(pmd_trans_huge(*pmd)))
> > > >
> > > 
> > > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> > > 
> > > Sorry for jumping in too late. I have a just nitpick.
> > > 
> > 
> > Better late than never :)
> > 
> > > We have another place, do_huge_pmd_anonymous_page.
> > > Although it isn't workload of page_test, is it valuable to expand your
> > > patch to cover it?
> > > If there is workload there are many thread and share one shared anon
> > > vma in ALWAYS THP mode, same problem would happen.
> > 
> > We already checked pmd_none() in handle_mm_fault() before calling
> > into do_huge_pmd_anonymous_page(). We could race for the fault while
> > attempting to allocate a huge page but it wouldn't be as severe a
> > problem particularly as it is encountered after failing a 2M allocation.
> 
> Right you are. Fail ot 2M allocation would affect as throttle.
> Thanks.
> 
> As I failed let you add the check, I have to reveal my mind. :)
> Actually, what I want is consistency of the code.

This is a stronger arguement than as a performance fix. I was concerned
that if such a check was added that it would confuse someone in a years
time trying to figure out why the pmd_none check was really necessary.

> The code have been same in two places but you find the problem in page_test of aim9,
> you changed one of them slightly. I think in future someone will
> have a question about that and he will start grep git log but it will take
> a long time as the log is buried other code piled up. 
> 

Fair point.

> I hope adding the comment in this case.
> 
>         /*
>          * PTEs may be allocated unnecessarily and the page table lock taken.
>          * The useless PTE does get cleaned up but it's a performance hit in
>          * some micro-benchmark. Let's check pmd_none before __pte_alloc to
>          * reduce the overhead. 
>          */
> -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> 

I think a better justification is

	/*
	 * Even though handle_mm_fault has already checked pmd_none, we
	 * have failed a huge allocation at this point during which a
	 * valid PTE could have been inserted. Double check a PTE alloc
	 * is still necessary to avoid additional overhead
	 */

> If you mind it as someone who have a question can find the log at last 
> although he need some time, I wouldn't care of the nitpick any more. :)
> It's up to you. 
> 

If you want to create a new patch with either your comment or mine
(whichever you prefer) I'll add my ack. I'm about to drop offline
for a few days but if it's still there Tuesday, I'll put together an
appropriate patch and submit. I'd keep it separate from the other patch
because it's a performance fix (which I'd like to see in -stable) where
as this is more of a cleanup IMO.

Thanks

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-21 16:00         ` Mel Gorman
  0 siblings, 0 replies; 30+ messages in thread
From: Mel Gorman @ 2011-04-21 16:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Thu, Apr 21, 2011 at 11:26:36PM +0900, Minchan Kim wrote:
> On Thu, Apr 21, 2011 at 12:08:41PM +0100, Mel Gorman wrote:
> > On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
> > > Hi Mel,
> > > 
> > > On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
> > > > With transparent hugepage support, handle_mm_fault() has to be careful
> > > > that a normal PMD has been established before handling a PTE fault. To
> > > > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> > > > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> > > > is called once it is known the PMD is safe.
> > > >
> > > > pte_alloc_map() is smart enough to check if a PTE is already present
> > > > before calling __pte_alloc but this check was lost. As a consequence,
> > > > PTEs may be allocated unnecessarily and the page table lock taken.
> > > > Thi useless PTE does get cleaned up but it's a performance hit which
> > > > is visible in page_test from aim9.
> > > >
> > > > This patch simply re-adds the check normally done by pte_alloc_map to
> > > > check if the PTE needs to be allocated before taking the page table
> > > > lock. The effect is noticable in page_test from aim9.
> > > >
> > > > AIM9
> > > >                2.6.38-vanilla 2.6.38-checkptenone
> > > > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> > > > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> > > > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> > > > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> > > > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> > > > MMTests Statistics: duration
> > > > Total Elapsed Time (seconds)                611.90    612.22
> > > >
> > > > (While this affects 2.6.38, it is a performance rather than a
> > > > functional bug and normally outside the rules -stable. While the big
> > > > performance differences are to a microbench, the difference in fork
> > > > and exec performance may be significant enough that -stable wants to
> > > > consider the patch)
> > > >
> > > > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> > > > Signed-off-by: Mel Gorman <mgorman@suse.de>
> > > > --
> > > >  mm/memory.c |    2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/mm/memory.c b/mm/memory.c
> > > > index 5823698..1659574 100644
> > > > --- a/mm/memory.c
> > > > +++ b/mm/memory.c
> > > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> > > >         * run pte_offset_map on the pmd, if an huge pmd could
> > > >         * materialize from under us from a different thread.
> > > >         */
> > > > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> > > > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> > > >                return VM_FAULT_OOM;
> > > >        /* if an huge pmd materialized from under us just retry later */
> > > >        if (unlikely(pmd_trans_huge(*pmd)))
> > > >
> > > 
> > > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
> > > 
> > > Sorry for jumping in too late. I have a just nitpick.
> > > 
> > 
> > Better late than never :)
> > 
> > > We have another place, do_huge_pmd_anonymous_page.
> > > Although it isn't workload of page_test, is it valuable to expand your
> > > patch to cover it?
> > > If there is workload there are many thread and share one shared anon
> > > vma in ALWAYS THP mode, same problem would happen.
> > 
> > We already checked pmd_none() in handle_mm_fault() before calling
> > into do_huge_pmd_anonymous_page(). We could race for the fault while
> > attempting to allocate a huge page but it wouldn't be as severe a
> > problem particularly as it is encountered after failing a 2M allocation.
> 
> Right you are. Fail ot 2M allocation would affect as throttle.
> Thanks.
> 
> As I failed let you add the check, I have to reveal my mind. :)
> Actually, what I want is consistency of the code.

This is a stronger arguement than as a performance fix. I was concerned
that if such a check was added that it would confuse someone in a years
time trying to figure out why the pmd_none check was really necessary.

> The code have been same in two places but you find the problem in page_test of aim9,
> you changed one of them slightly. I think in future someone will
> have a question about that and he will start grep git log but it will take
> a long time as the log is buried other code piled up. 
> 

Fair point.

> I hope adding the comment in this case.
> 
>         /*
>          * PTEs may be allocated unnecessarily and the page table lock taken.
>          * The useless PTE does get cleaned up but it's a performance hit in
>          * some micro-benchmark. Let's check pmd_none before __pte_alloc to
>          * reduce the overhead. 
>          */
> -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
> +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
> 

I think a better justification is

	/*
	 * Even though handle_mm_fault has already checked pmd_none, we
	 * have failed a huge allocation at this point during which a
	 * valid PTE could have been inserted. Double check a PTE alloc
	 * is still necessary to avoid additional overhead
	 */

> If you mind it as someone who have a question can find the log at last 
> although he need some time, I wouldn't care of the nitpick any more. :)
> It's up to you. 
> 

If you want to create a new patch with either your comment or mine
(whichever you prefer) I'll add my ack. I'm about to drop offline
for a few days but if it's still there Tuesday, I'll put together an
appropriate patch and submit. I'd keep it separate from the other patch
because it's a performance fix (which I'd like to see in -stable) where
as this is more of a cleanup IMO.

Thanks

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-21 16:00         ` Mel Gorman
@ 2011-04-21 16:14           ` Andrea Arcangeli
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-21 16:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, akpm, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

On Thu, Apr 21, 2011 at 05:00:57PM +0100, Mel Gorman wrote:
> If you want to create a new patch with either your comment or mine
> (whichever you prefer) I'll add my ack. I'm about to drop offline
> for a few days but if it's still there Tuesday, I'll put together an
> appropriate patch and submit. I'd keep it separate from the other patch
> because it's a performance fix (which I'd like to see in -stable) where
> as this is more of a cleanup IMO.

I think the older patch should have more priority agreed. This one may
actually waste cpu cycles overall, rather than saving them, it
shouldn't be a common occurrence.

>From a code consistency point of view maybe we should just implement a
pte_alloc macro (to put after pte_alloc_map) and use it in both
places, and hide the glory details of the unlikely in the macro. When
implementing pte_alloc, I suggest also adding unlikely to both, I mean
we added unlikely to the fast path ok, but __pte_alloc is orders of
magnitude less likely to fail than pte_none, and it still runs 1 every
512 4k page faults, so I think __pte_alloc deserves an unlikely too.

Minchan, you suggested this cleanup, so I suggest you to send a patch,
but if you're busy we can help.

Thanks!
Andrea

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-21 16:14           ` Andrea Arcangeli
  0 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-21 16:14 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Minchan Kim, akpm, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

On Thu, Apr 21, 2011 at 05:00:57PM +0100, Mel Gorman wrote:
> If you want to create a new patch with either your comment or mine
> (whichever you prefer) I'll add my ack. I'm about to drop offline
> for a few days but if it's still there Tuesday, I'll put together an
> appropriate patch and submit. I'd keep it separate from the other patch
> because it's a performance fix (which I'd like to see in -stable) where
> as this is more of a cleanup IMO.

I think the older patch should have more priority agreed. This one may
actually waste cpu cycles overall, rather than saving them, it
shouldn't be a common occurrence.

>From a code consistency point of view maybe we should just implement a
pte_alloc macro (to put after pte_alloc_map) and use it in both
places, and hide the glory details of the unlikely in the macro. When
implementing pte_alloc, I suggest also adding unlikely to both, I mean
we added unlikely to the fast path ok, but __pte_alloc is orders of
magnitude less likely to fail than pte_none, and it still runs 1 every
512 4k page faults, so I think __pte_alloc deserves an unlikely too.

Minchan, you suggested this cleanup, so I suggest you to send a patch,
but if you're busy we can help.

Thanks!
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-21 16:14           ` Andrea Arcangeli
@ 2011-04-22  0:54             ` Minchan Kim
  -1 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-22  0:54 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Mel Gorman, akpm, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

Hi Andrea,

On Fri, Apr 22, 2011 at 1:14 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Thu, Apr 21, 2011 at 05:00:57PM +0100, Mel Gorman wrote:
>> If you want to create a new patch with either your comment or mine
>> (whichever you prefer) I'll add my ack. I'm about to drop offline
>> for a few days but if it's still there Tuesday, I'll put together an
>> appropriate patch and submit. I'd keep it separate from the other patch
>> because it's a performance fix (which I'd like to see in -stable) where
>> as this is more of a cleanup IMO.
>
> I think the older patch should have more priority agreed. This one may
> actually waste cpu cycles overall, rather than saving them, it
> shouldn't be a common occurrence.
>
> From a code consistency point of view maybe we should just implement a
> pte_alloc macro (to put after pte_alloc_map) and use it in both
> places, and hide the glory details of the unlikely in the macro. When
> implementing pte_alloc, I suggest also adding unlikely to both, I mean
> we added unlikely to the fast path ok, but __pte_alloc is orders of
> magnitude less likely to fail than pte_none, and it still runs 1 every
> 512 4k page faults, so I think __pte_alloc deserves an unlikely too.
>
> Minchan, you suggested this cleanup, so I suggest you to send a patch,
> but if you're busy we can help.

It's no problem to send a patch but I can do it at out-of-office time.
Maybe weekend. :)
Before doing that, let's clear the point. You mentioned  it shouldn't
be a common occurrence but you are suggesting we should do for code
consistency POV. Am I right?

>
> Thanks!
> Andrea
>



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-22  0:54             ` Minchan Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-22  0:54 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Mel Gorman, akpm, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

Hi Andrea,

On Fri, Apr 22, 2011 at 1:14 AM, Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Thu, Apr 21, 2011 at 05:00:57PM +0100, Mel Gorman wrote:
>> If you want to create a new patch with either your comment or mine
>> (whichever you prefer) I'll add my ack. I'm about to drop offline
>> for a few days but if it's still there Tuesday, I'll put together an
>> appropriate patch and submit. I'd keep it separate from the other patch
>> because it's a performance fix (which I'd like to see in -stable) where
>> as this is more of a cleanup IMO.
>
> I think the older patch should have more priority agreed. This one may
> actually waste cpu cycles overall, rather than saving them, it
> shouldn't be a common occurrence.
>
> From a code consistency point of view maybe we should just implement a
> pte_alloc macro (to put after pte_alloc_map) and use it in both
> places, and hide the glory details of the unlikely in the macro. When
> implementing pte_alloc, I suggest also adding unlikely to both, I mean
> we added unlikely to the fast path ok, but __pte_alloc is orders of
> magnitude less likely to fail than pte_none, and it still runs 1 every
> 512 4k page faults, so I think __pte_alloc deserves an unlikely too.
>
> Minchan, you suggested this cleanup, so I suggest you to send a patch,
> but if you're busy we can help.

It's no problem to send a patch but I can do it at out-of-office time.
Maybe weekend. :)
Before doing that, let's clear the point. You mentioned  it shouldn't
be a common occurrence but you are suggesting we should do for code
consistency POV. Am I right?

>
> Thanks!
> Andrea
>



-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-21 16:00         ` Mel Gorman
@ 2011-04-22  1:01           ` Minchan Kim
  -1 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-22  1:01 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Fri, Apr 22, 2011 at 1:00 AM, Mel Gorman <mgorman@suse.de> wrote:
> On Thu, Apr 21, 2011 at 11:26:36PM +0900, Minchan Kim wrote:
>> On Thu, Apr 21, 2011 at 12:08:41PM +0100, Mel Gorman wrote:
>> > On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
>> > > Hi Mel,
>> > >
>> > > On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
>> > > > With transparent hugepage support, handle_mm_fault() has to be careful
>> > > > that a normal PMD has been established before handling a PTE fault. To
>> > > > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
>> > > > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
>> > > > is called once it is known the PMD is safe.
>> > > >
>> > > > pte_alloc_map() is smart enough to check if a PTE is already present
>> > > > before calling __pte_alloc but this check was lost. As a consequence,
>> > > > PTEs may be allocated unnecessarily and the page table lock taken.
>> > > > Thi useless PTE does get cleaned up but it's a performance hit which
>> > > > is visible in page_test from aim9.
>> > > >
>> > > > This patch simply re-adds the check normally done by pte_alloc_map to
>> > > > check if the PTE needs to be allocated before taking the page table
>> > > > lock. The effect is noticable in page_test from aim9.
>> > > >
>> > > > AIM9
>> > > >                2.6.38-vanilla 2.6.38-checkptenone
>> > > > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
>> > > > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
>> > > > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
>> > > > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
>> > > > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
>> > > > MMTests Statistics: duration
>> > > > Total Elapsed Time (seconds)                611.90    612.22
>> > > >
>> > > > (While this affects 2.6.38, it is a performance rather than a
>> > > > functional bug and normally outside the rules -stable. While the big
>> > > > performance differences are to a microbench, the difference in fork
>> > > > and exec performance may be significant enough that -stable wants to
>> > > > consider the patch)
>> > > >
>> > > > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
>> > > > Signed-off-by: Mel Gorman <mgorman@suse.de>
>> > > > --
>> > > >  mm/memory.c |    2 +-
>> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/mm/memory.c b/mm/memory.c
>> > > > index 5823698..1659574 100644
>> > > > --- a/mm/memory.c
>> > > > +++ b/mm/memory.c
>> > > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>> > > >         * run pte_offset_map on the pmd, if an huge pmd could
>> > > >         * materialize from under us from a different thread.
>> > > >         */
>> > > > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
>> > > > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>> > > >                return VM_FAULT_OOM;
>> > > >        /* if an huge pmd materialized from under us just retry later */
>> > > >        if (unlikely(pmd_trans_huge(*pmd)))
>> > > >
>> > >
>> > > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
>> > >
>> > > Sorry for jumping in too late. I have a just nitpick.
>> > >
>> >
>> > Better late than never :)
>> >
>> > > We have another place, do_huge_pmd_anonymous_page.
>> > > Although it isn't workload of page_test, is it valuable to expand your
>> > > patch to cover it?
>> > > If there is workload there are many thread and share one shared anon
>> > > vma in ALWAYS THP mode, same problem would happen.
>> >
>> > We already checked pmd_none() in handle_mm_fault() before calling
>> > into do_huge_pmd_anonymous_page(). We could race for the fault while
>> > attempting to allocate a huge page but it wouldn't be as severe a
>> > problem particularly as it is encountered after failing a 2M allocation.
>>
>> Right you are. Fail ot 2M allocation would affect as throttle.
>> Thanks.
>>
>> As I failed let you add the check, I have to reveal my mind. :)
>> Actually, what I want is consistency of the code.
>
> This is a stronger arguement than as a performance fix. I was concerned
> that if such a check was added that it would confuse someone in a years
> time trying to figure out why the pmd_none check was really necessary.
>
>> The code have been same in two places but you find the problem in page_test of aim9,
>> you changed one of them slightly. I think in future someone will
>> have a question about that and he will start grep git log but it will take
>> a long time as the log is buried other code piled up.
>>
>
> Fair point.
>
>> I hope adding the comment in this case.
>>
>>         /*
>>          * PTEs may be allocated unnecessarily and the page table lock taken.
>>          * The useless PTE does get cleaned up but it's a performance hit in
>>          * some micro-benchmark. Let's check pmd_none before __pte_alloc to
>>          * reduce the overhead.
>>          */
>> -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
>> +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>>
>
> I think a better justification is
>
>        /*
>         * Even though handle_mm_fault has already checked pmd_none, we
>         * have failed a huge allocation at this point during which a
>         * valid PTE could have been inserted. Double check a PTE alloc
>         * is still necessary to avoid additional overhead
>         */
>

Hmm. If we disable thp, the comment about failing a huge allocation.
was not true. So I prefer mine :)
But Andrea suggested defining new pte_alloc which checks pmd_none
internally for code consistency POV.
In such case, I have no concern about comment.
Is it okay?

>> If you mind it as someone who have a question can find the log at last
>> although he need some time, I wouldn't care of the nitpick any more. :)
>> It's up to you.
>>
>
> If you want to create a new patch with either your comment or mine
> (whichever you prefer) I'll add my ack. I'm about to drop offline
> for a few days but if it's still there Tuesday, I'll put together an
> appropriate patch and submit. I'd keep it separate from the other patch
> because it's a performance fix (which I'd like to see in -stable) where
> as this is more of a cleanup IMO.

Okay.
Thanks, Mel.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-22  1:01           ` Minchan Kim
  0 siblings, 0 replies; 30+ messages in thread
From: Minchan Kim @ 2011-04-22  1:01 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Fri, Apr 22, 2011 at 1:00 AM, Mel Gorman <mgorman@suse.de> wrote:
> On Thu, Apr 21, 2011 at 11:26:36PM +0900, Minchan Kim wrote:
>> On Thu, Apr 21, 2011 at 12:08:41PM +0100, Mel Gorman wrote:
>> > On Thu, Apr 21, 2011 at 03:59:47PM +0900, Minchan Kim wrote:
>> > > Hi Mel,
>> > >
>> > > On Fri, Apr 15, 2011 at 7:12 PM, Mel Gorman <mgorman@suse.de> wrote:
>> > > > With transparent hugepage support, handle_mm_fault() has to be careful
>> > > > that a normal PMD has been established before handling a PTE fault. To
>> > > > achieve this, it used __pte_alloc() directly instead of pte_alloc_map
>> > > > as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
>> > > > is called once it is known the PMD is safe.
>> > > >
>> > > > pte_alloc_map() is smart enough to check if a PTE is already present
>> > > > before calling __pte_alloc but this check was lost. As a consequence,
>> > > > PTEs may be allocated unnecessarily and the page table lock taken.
>> > > > Thi useless PTE does get cleaned up but it's a performance hit which
>> > > > is visible in page_test from aim9.
>> > > >
>> > > > This patch simply re-adds the check normally done by pte_alloc_map to
>> > > > check if the PTE needs to be allocated before taking the page table
>> > > > lock. The effect is noticable in page_test from aim9.
>> > > >
>> > > > AIM9
>> > > >                2.6.38-vanilla 2.6.38-checkptenone
>> > > > creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
>> > > > page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
>> > > > brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
>> > > > exec_test      382.00 ( 0.00%)   456.90 (16.39%)
>> > > > fork_test       60.11 ( 0.00%)    67.79 (11.34%)
>> > > > MMTests Statistics: duration
>> > > > Total Elapsed Time (seconds)                611.90    612.22
>> > > >
>> > > > (While this affects 2.6.38, it is a performance rather than a
>> > > > functional bug and normally outside the rules -stable. While the big
>> > > > performance differences are to a microbench, the difference in fork
>> > > > and exec performance may be significant enough that -stable wants to
>> > > > consider the patch)
>> > > >
>> > > > Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
>> > > > Signed-off-by: Mel Gorman <mgorman@suse.de>
>> > > > --
>> > > >  mm/memory.c |    2 +-
>> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > > >
>> > > > diff --git a/mm/memory.c b/mm/memory.c
>> > > > index 5823698..1659574 100644
>> > > > --- a/mm/memory.c
>> > > > +++ b/mm/memory.c
>> > > > @@ -3322,7 +3322,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>> > > >         * run pte_offset_map on the pmd, if an huge pmd could
>> > > >         * materialize from under us from a different thread.
>> > > >         */
>> > > > -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
>> > > > +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>> > > >                return VM_FAULT_OOM;
>> > > >        /* if an huge pmd materialized from under us just retry later */
>> > > >        if (unlikely(pmd_trans_huge(*pmd)))
>> > > >
>> > >
>> > > Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
>> > >
>> > > Sorry for jumping in too late. I have a just nitpick.
>> > >
>> >
>> > Better late than never :)
>> >
>> > > We have another place, do_huge_pmd_anonymous_page.
>> > > Although it isn't workload of page_test, is it valuable to expand your
>> > > patch to cover it?
>> > > If there is workload there are many thread and share one shared anon
>> > > vma in ALWAYS THP mode, same problem would happen.
>> >
>> > We already checked pmd_none() in handle_mm_fault() before calling
>> > into do_huge_pmd_anonymous_page(). We could race for the fault while
>> > attempting to allocate a huge page but it wouldn't be as severe a
>> > problem particularly as it is encountered after failing a 2M allocation.
>>
>> Right you are. Fail ot 2M allocation would affect as throttle.
>> Thanks.
>>
>> As I failed let you add the check, I have to reveal my mind. :)
>> Actually, what I want is consistency of the code.
>
> This is a stronger arguement than as a performance fix. I was concerned
> that if such a check was added that it would confuse someone in a years
> time trying to figure out why the pmd_none check was really necessary.
>
>> The code have been same in two places but you find the problem in page_test of aim9,
>> you changed one of them slightly. I think in future someone will
>> have a question about that and he will start grep git log but it will take
>> a long time as the log is buried other code piled up.
>>
>
> Fair point.
>
>> I hope adding the comment in this case.
>>
>>         /*
>>          * PTEs may be allocated unnecessarily and the page table lock taken.
>>          * The useless PTE does get cleaned up but it's a performance hit in
>>          * some micro-benchmark. Let's check pmd_none before __pte_alloc to
>>          * reduce the overhead.
>>          */
>> -       if (unlikely(__pte_alloc(mm, vma, pmd, address)))
>> +       if (unlikely(pmd_none(*pmd)) && __pte_alloc(mm, vma, pmd, address))
>>
>
> I think a better justification is
>
>        /*
>         * Even though handle_mm_fault has already checked pmd_none, we
>         * have failed a huge allocation at this point during which a
>         * valid PTE could have been inserted. Double check a PTE alloc
>         * is still necessary to avoid additional overhead
>         */
>

Hmm. If we disable thp, the comment about failing a huge allocation.
was not true. So I prefer mine :)
But Andrea suggested defining new pte_alloc which checks pmd_none
internally for code consistency POV.
In such case, I have no concern about comment.
Is it okay?

>> If you mind it as someone who have a question can find the log at last
>> although he need some time, I wouldn't care of the nitpick any more. :)
>> It's up to you.
>>
>
> If you want to create a new patch with either your comment or mine
> (whichever you prefer) I'll add my ack. I'm about to drop offline
> for a few days but if it's still there Tuesday, I'll put together an
> appropriate patch and submit. I'd keep it separate from the other patch
> because it's a performance fix (which I'd like to see in -stable) where
> as this is more of a cleanup IMO.

Okay.
Thanks, Mel.

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-22  0:54             ` Minchan Kim
@ 2011-04-26 13:00               ` Andrea Arcangeli
  -1 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-26 13:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, akpm, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

On Fri, Apr 22, 2011 at 09:54:24AM +0900, Minchan Kim wrote:
> Before doing that, let's clear the point. You mentioned  it shouldn't
> be a common occurrence but you are suggesting we should do for code
> consistency POV. Am I right?

Yes, for code consistency and cleanup.

Thanks,
Andrea

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-26 13:00               ` Andrea Arcangeli
  0 siblings, 0 replies; 30+ messages in thread
From: Andrea Arcangeli @ 2011-04-26 13:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Mel Gorman, akpm, raz ben yehuda, riel, kosaki.motohiro, lkml,
	linux-mm, stable

On Fri, Apr 22, 2011 at 09:54:24AM +0900, Minchan Kim wrote:
> Before doing that, let's clear the point. You mentioned  it shouldn't
> be a common occurrence but you are suggesting we should do for code
> consistency POV. Am I right?

Yes, for code consistency and cleanup.

Thanks,
Andrea

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
  2011-04-15 10:12 ` Mel Gorman
@ 2011-04-27 13:50   ` Johannes Weiner
  -1 siblings, 0 replies; 30+ messages in thread
From: Johannes Weiner @ 2011-04-27 13:50 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> With transparent hugepage support, handle_mm_fault() has to be careful
> that a normal PMD has been established before handling a PTE fault. To
> achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> is called once it is known the PMD is safe.
> 
> pte_alloc_map() is smart enough to check if a PTE is already present
> before calling __pte_alloc but this check was lost. As a consequence,
> PTEs may be allocated unnecessarily and the page table lock taken.
> Thi useless PTE does get cleaned up but it's a performance hit which
> is visible in page_test from aim9.
> 
> This patch simply re-adds the check normally done by pte_alloc_map to
> check if the PTE needs to be allocated before taking the page table
> lock. The effect is noticable in page_test from aim9.
> 
> AIM9
>                 2.6.38-vanilla 2.6.38-checkptenone
> creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> MMTests Statistics: duration
> Total Elapsed Time (seconds)                611.90    612.22
> 
> (While this affects 2.6.38, it is a performance rather than a
> functional bug and normally outside the rules -stable. While the big
> performance differences are to a microbench, the difference in fork
> and exec performance may be significant enough that -stable wants to
> consider the patch)
> 
> Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] mm: Check if PTE is already allocated during page fault
@ 2011-04-27 13:50   ` Johannes Weiner
  0 siblings, 0 replies; 30+ messages in thread
From: Johannes Weiner @ 2011-04-27 13:50 UTC (permalink / raw)
  To: Mel Gorman
  Cc: akpm, Andrea Arcangeli, raz ben yehuda, riel, kosaki.motohiro,
	lkml, linux-mm, stable

On Fri, Apr 15, 2011 at 11:12:48AM +0100, Mel Gorman wrote:
> With transparent hugepage support, handle_mm_fault() has to be careful
> that a normal PMD has been established before handling a PTE fault. To
> achieve this, it used __pte_alloc() directly instead of pte_alloc_map
> as pte_alloc_map is unsafe to run against a huge PMD. pte_offset_map()
> is called once it is known the PMD is safe.
> 
> pte_alloc_map() is smart enough to check if a PTE is already present
> before calling __pte_alloc but this check was lost. As a consequence,
> PTEs may be allocated unnecessarily and the page table lock taken.
> Thi useless PTE does get cleaned up but it's a performance hit which
> is visible in page_test from aim9.
> 
> This patch simply re-adds the check normally done by pte_alloc_map to
> check if the PTE needs to be allocated before taking the page table
> lock. The effect is noticable in page_test from aim9.
> 
> AIM9
>                 2.6.38-vanilla 2.6.38-checkptenone
> creat-clo      446.10 ( 0.00%)   424.47 (-5.10%)
> page_test       38.10 ( 0.00%)    42.04 ( 9.37%)
> brk_test        52.45 ( 0.00%)    51.57 (-1.71%)
> exec_test      382.00 ( 0.00%)   456.90 (16.39%)
> fork_test       60.11 ( 0.00%)    67.79 (11.34%)
> MMTests Statistics: duration
> Total Elapsed Time (seconds)                611.90    612.22
> 
> (While this affects 2.6.38, it is a performance rather than a
> functional bug and normally outside the rules -stable. While the big
> performance differences are to a microbench, the difference in fork
> and exec performance may be significant enough that -stable wants to
> consider the patch)
> 
> Reported-by: Raz Ben Yehuda <raziebe@gmail.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2011-04-27 13:51 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-04-15 10:12 [PATCH] mm: Check if PTE is already allocated during page fault Mel Gorman
2011-04-15 10:12 ` Mel Gorman
2011-04-15 13:23 ` Rik van Riel
2011-04-15 13:23   ` Rik van Riel
2011-04-15 14:39 ` Andrea Arcangeli
2011-04-15 14:39   ` Andrea Arcangeli
2011-04-15 15:06   ` Andrea Arcangeli
2011-04-15 15:06     ` Andrea Arcangeli
2011-04-18  7:21     ` raz ben yehuda
2011-04-18  7:21       ` raz ben yehuda
2011-04-18 10:23     ` Mel Gorman
2011-04-18 10:23       ` Mel Gorman
2011-04-21  6:59 ` Minchan Kim
2011-04-21  6:59   ` Minchan Kim
2011-04-21 11:08   ` Mel Gorman
2011-04-21 11:08     ` Mel Gorman
2011-04-21 14:26     ` Minchan Kim
2011-04-21 14:26       ` Minchan Kim
2011-04-21 16:00       ` Mel Gorman
2011-04-21 16:00         ` Mel Gorman
2011-04-21 16:14         ` Andrea Arcangeli
2011-04-21 16:14           ` Andrea Arcangeli
2011-04-22  0:54           ` Minchan Kim
2011-04-22  0:54             ` Minchan Kim
2011-04-26 13:00             ` Andrea Arcangeli
2011-04-26 13:00               ` Andrea Arcangeli
2011-04-22  1:01         ` Minchan Kim
2011-04-22  1:01           ` Minchan Kim
2011-04-27 13:50 ` Johannes Weiner
2011-04-27 13:50   ` Johannes Weiner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.