All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss
@ 2017-04-03  6:41 Anton Blanchard
  2017-04-03 18:55 ` LEROY Christophe
  2017-04-06 13:06 ` Michael Ellerman
  0 siblings, 2 replies; 9+ messages in thread
From: Anton Blanchard @ 2017-04-03  6:41 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev

From: Anton Blanchard <anton@samba.org>

Early on in do_page_fault() we call store_updates_sp(), regardless of
the type of exception. For an instruction miss this doesn't make
sense, because we only use this information to detect if a data miss
is the result of a stack expansion instruction or not.

Worse still, it results in a data miss within every userspace
instruction miss handler, because we try and load the very instruction
we are about to install a pte for!

A simple exec microbenchmark runs 6% faster on POWER8 with this fix:

 #include <stdlib.h>
 #include <stdio.h>
 #include <unistd.h>

int main(int argc, char *argv[])
{
	unsigned long left = atol(argv[1]);
	char leftstr[16];

	if (left-- == 0)
		return 0;

	sprintf(leftstr, "%ld", left);
	execlp(argv[0], argv[0], leftstr, NULL);
	perror("exec failed\n");

	return 0;
}

Pass the number of iterations on the command line (eg 10000) and time
how long it takes to execute.

Signed-off-by: Anton Blanchard <anton@samba.org>
---
 arch/powerpc/mm/fault.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index fd6484fc2fa9..3a7d580fdc59 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -287,7 +287,7 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
 	 * can result in fault, which will cause a deadlock when called with
 	 * mmap_sem held
 	 */
-	if (user_mode(regs))
+	if (!is_exec && user_mode(regs))
 		store_update_sp = store_updates_sp(regs);
 
 	if (user_mode(regs))
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-03  6:41 [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss Anton Blanchard
@ 2017-04-03 18:55 ` LEROY Christophe
  2017-04-03 22:00   ` Anton Blanchard
  2017-04-06 13:06 ` Michael Ellerman
  1 sibling, 1 reply; 9+ messages in thread
From: LEROY Christophe @ 2017-04-03 18:55 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linuxppc-dev, mpe, paulus, benh

Anton Blanchard <anton@ozlabs.org> a =C3=A9crit=C2=A0:

> From: Anton Blanchard <anton@samba.org>
>
> Early on in do_page_fault() we call store_updates_sp(), regardless of
> the type of exception. For an instruction miss this doesn't make
> sense, because we only use this information to detect if a data miss
> is the result of a stack expansion instruction or not.
>
> Worse still, it results in a data miss within every userspace
> instruction miss handler, because we try and load the very instruction
> we are about to install a pte for!
>
> A simple exec microbenchmark runs 6% faster on POWER8 with this fix:
>
>  #include <stdlib.h>
>  #include <stdio.h>
>  #include <unistd.h>
>
> int main(int argc, char *argv[])
> {
> 	unsigned long left =3D atol(argv[1]);
> 	char leftstr[16];
>
> 	if (left-- =3D=3D 0)
> 		return 0;
>
> 	sprintf(leftstr, "%ld", left);
> 	execlp(argv[0], argv[0], leftstr, NULL);
> 	perror("exec failed\n");
>
> 	return 0;
> }
>
> Pass the number of iterations on the command line (eg 10000) and time
> how long it takes to execute.
>
> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
>  arch/powerpc/mm/fault.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index fd6484fc2fa9..3a7d580fdc59 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -287,7 +287,7 @@ int do_page_fault(struct pt_regs *regs, unsigned=20=
=20
>=20long address,
>  	 * can result in fault, which will cause a deadlock when called with
>  	 * mmap_sem held
>  	 */
> -	if (user_mode(regs))
> +	if (!is_exec && user_mode(regs))

Shouldn't it also check 'is_write' ?
If it is a store, is_write should be set, shouldn't it ?

Christophe

>  		store_update_sp =3D store_updates_sp(regs);
>
>  	if (user_mode(regs))
> --
> 2.11.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-03 18:55 ` LEROY Christophe
@ 2017-04-03 22:00   ` Anton Blanchard
  2017-04-12 16:15     ` Christophe LEROY
  0 siblings, 1 reply; 9+ messages in thread
From: Anton Blanchard @ 2017-04-03 22:00 UTC (permalink / raw)
  To: LEROY Christophe; +Cc: linuxppc-dev, mpe, paulus, benh

Hi Christophe,

> > -	if (user_mode(regs))
> > +	if (!is_exec && user_mode(regs))  
> 
> Shouldn't it also check 'is_write' ?
> If it is a store, is_write should be set, shouldn't it ?

Thanks, Ben had the same suggestion. I'll add that further optimisation
in a subsequent patch.

Anton

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-03  6:41 [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss Anton Blanchard
  2017-04-03 18:55 ` LEROY Christophe
@ 2017-04-06 13:06 ` Michael Ellerman
  2017-04-13  1:16   ` Balbir Singh
  1 sibling, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2017-04-06 13:06 UTC (permalink / raw)
  To: Anton Blanchard, benh, paulus; +Cc: linuxppc-dev

On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote:
> From: Anton Blanchard <anton@samba.org>
> 
> Early on in do_page_fault() we call store_updates_sp(), regardless of
> the type of exception. For an instruction miss this doesn't make
> sense, because we only use this information to detect if a data miss
> is the result of a stack expansion instruction or not.
> 
> Worse still, it results in a data miss within every userspace
> instruction miss handler, because we try and load the very instruction
> we are about to install a pte for!
> 
> A simple exec microbenchmark runs 6% faster on POWER8 with this fix:
> 
>  #include <stdlib.h>
>  #include <stdio.h>
>  #include <unistd.h>
> 
> int main(int argc, char *argv[])
> {
> 	unsigned long left = atol(argv[1]);
> 	char leftstr[16];
> 
> 	if (left-- == 0)
> 		return 0;
> 
> 	sprintf(leftstr, "%ld", left);
> 	execlp(argv[0], argv[0], leftstr, NULL);
> 	perror("exec failed\n");
> 
> 	return 0;
> }
> 
> Pass the number of iterations on the command line (eg 10000) and time
> how long it takes to execute.
> 
> Signed-off-by: Anton Blanchard <anton@samba.org>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-03 22:00   ` Anton Blanchard
@ 2017-04-12 16:15     ` Christophe LEROY
  2017-04-13 10:22       ` Michael Ellerman
  0 siblings, 1 reply; 9+ messages in thread
From: Christophe LEROY @ 2017-04-12 16:15 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linuxppc-dev, mpe, paulus, benh

Hi Anton,

Le 04/04/2017 à 00:00, Anton Blanchard a écrit :
> Hi Christophe,
>
>>> -	if (user_mode(regs))
>>> +	if (!is_exec && user_mode(regs))
>>
>> Shouldn't it also check 'is_write' ?
>> If it is a store, is_write should be set, shouldn't it ?
>
> Thanks, Ben had the same suggestion. I'll add that further optimisation
> in a subsequent patch.
>
> Anton
>

For your information, I made some benchmark test using 'perf stat' with 
your app on MPC8321 and MPC885, and I got the following results:

MPC8321 before the change:

  Performance counter stats for './fault 1000' (10 runs):

        4491.971466      cpu-clock (msec) 
               ( +-  0.03% )
              47386      faults 
               ( +-  0.02% )

        4.727864465 seconds time elapsed 
          ( +-  0.17% )


MPC8321 after your change:

  Performance counter stats for './fault 1000' (10 runs):

        4278.738845      cpu-clock (msec) 
               ( +-  0.02% )
              35181      faults 
               ( +-  0.02% )

        4.504443891 seconds time elapsed 
          ( +-  0.19% )


MPC8321 after changing !is_exec by is_write

  Performance counter stats for './fault 1000' (10 runs):

        4268.187261      cpu-clock (msec) 
               ( +-  0.03% )
              35181      faults 
               ( +-  0.01% )

        4.489207922 seconds time elapsed 
          ( +-  0.20% )





MPC885 before the change:

  Performance counter stats for './fault 500' (10 runs):

          726605854      cpu-cycles 
               ( +-  0.03% )
             176067      dTLB-load-misses 
               ( +-  0.08% )
              52722      iTLB-load-misses 
               ( +-  0.01% )
              25718      faults 
               ( +-  0.03% )

        5.795924654 seconds time elapsed 
          ( +-  0.14% )


MPC885 after your change:

  Performance counter stats for './fault 500' (10 runs):

          711233251      cpu-cycles 
               ( +-  0.04% )
             152462      dTLB-load-misses 
               ( +-  0.09% )
              52715      iTLB-load-misses 
               ( +-  0.01% )
              19611      faults 
               ( +-  0.02% )

        5.673784606 seconds time elapsed 
          ( +-  0.14% )


MPC885 after changing !is_exec by is_write

  Performance counter stats for './fault 500' (10 runs):

          710904083      cpu-cycles 
               ( +-  0.05% )
             147162      dTLB-load-misses 
               ( +-  0.06% )
              52716      iTLB-load-misses 
               ( +-  0.01% )
              19610      faults 
               ( +-  0.02% )

        5.672091139 seconds time elapsed 
          ( +-  0.15% )



Christophe

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-06 13:06 ` Michael Ellerman
@ 2017-04-13  1:16   ` Balbir Singh
  2017-04-13  1:29     ` Anton Blanchard
  2017-04-13 10:24     ` Michael Ellerman
  0 siblings, 2 replies; 9+ messages in thread
From: Balbir Singh @ 2017-04-13  1:16 UTC (permalink / raw)
  To: Michael Ellerman, Anton Blanchard, benh, paulus; +Cc: linuxppc-dev

On Thu, 2017-04-06 at 23:06 +1000, Michael Ellerman wrote:
> On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote:
> > From: Anton Blanchard <anton@samba.org>
> > 
> Applied to powerpc next, thanks.
> 
> https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f
>

FYI: The version you applied does not have checks for is_write

Balbir Singh. 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-13  1:16   ` Balbir Singh
@ 2017-04-13  1:29     ` Anton Blanchard
  2017-04-13 10:24     ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Anton Blanchard @ 2017-04-13  1:29 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Michael Ellerman, benh, paulus, linuxppc-dev

Hi Balbir,

> FYI: The version you applied does not have checks for is_write

Yeah, we decided to do that in a follow up patch. I'm ok if someone
gets to it before me :)

Anton

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-12 16:15     ` Christophe LEROY
@ 2017-04-13 10:22       ` Michael Ellerman
  0 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2017-04-13 10:22 UTC (permalink / raw)
  To: Christophe LEROY, Anton Blanchard; +Cc: linuxppc-dev, paulus, benh

Christophe LEROY <christophe.leroy@c-s.fr> writes:

> Hi Anton,
>
> Le 04/04/2017 =C3=A0 00:00, Anton Blanchard a =C3=A9crit :
>> Hi Christophe,
>>
>>>> -	if (user_mode(regs))
>>>> +	if (!is_exec && user_mode(regs))
>>>
>>> Shouldn't it also check 'is_write' ?
>>> If it is a store, is_write should be set, shouldn't it ?
>>
>> Thanks, Ben had the same suggestion. I'll add that further optimisation
>> in a subsequent patch.
>>
>
> For your information, I made some benchmark test using 'perf stat' with=20
> your app on MPC8321 and MPC885, and I got the following results:

 MPC8321:
 before        47386      faults=20
 after         35181      faults        -12205
 is_write      35181      faults        -12205=20

So that's good.

 MPC885:
 before:      176067      dTLB-load-misses=20
               52722      iTLB-load-misses=20
               25718      faults=20
 after:       152462      dTLB-load-misses    -23605
               52715      iTLB-load-misses        -7
               19611      faults               -6107

 is_write:    147162      dTLB-load-misses    -28905
               52716      iTLB-load-misses        -6
               19610      faults               -6108

Also good, and shows that is_write idea would be even better.

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: powerpc: Avoid taking a data miss on every userspace instruction miss
  2017-04-13  1:16   ` Balbir Singh
  2017-04-13  1:29     ` Anton Blanchard
@ 2017-04-13 10:24     ` Michael Ellerman
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2017-04-13 10:24 UTC (permalink / raw)
  To: Balbir Singh, Michael Ellerman, Anton Blanchard, benh, paulus
  Cc: linuxppc-dev

Balbir Singh <bsingharora@gmail.com> writes:

> On Thu, 2017-04-06 at 23:06 +1000, Michael Ellerman wrote:
>> On Mon, 2017-04-03 at 06:41:02 UTC, Anton Blanchard wrote:
>> > From: Anton Blanchard <anton@samba.org>
>> > 
>> Applied to powerpc next, thanks.
>> 
>> https://git.kernel.org/powerpc/c/a7a9dcd882a67b68568868b988289f
>>
>
> FYI: The version you applied does not have checks for is_write

The version I was sent didn't have it :)

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-04-13 10:24 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-03  6:41 [PATCH] powerpc: Avoid taking a data miss on every userspace instruction miss Anton Blanchard
2017-04-03 18:55 ` LEROY Christophe
2017-04-03 22:00   ` Anton Blanchard
2017-04-12 16:15     ` Christophe LEROY
2017-04-13 10:22       ` Michael Ellerman
2017-04-06 13:06 ` Michael Ellerman
2017-04-13  1:16   ` Balbir Singh
2017-04-13  1:29     ` Anton Blanchard
2017-04-13 10:24     ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.