[PATCH] powerpc/mm: Add trace point for tracking hash pte fault

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
@ 2015-01-20 11:35 Aneesh Kumar K.V
  2015-01-21  3:07 ` Michael Ellerman
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2015-01-20 11:35 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This enables us to understand how many hash fault we are taking
when running benchmarks.

For ex:
-bash-4.2# ./perf stat -e  powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30  -P -n 1000
...

 Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000':

       1,10,04,075      powerpc:hash_fault
       1,10,03,429      page-faults

      30.865978991 seconds time elapsed

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/trace.h | 21 +++++++++++++++++++++
 arch/powerpc/mm/hash_utils_64.c  |  2 ++
 2 files changed, 23 insertions(+)

diff --git a/arch/powerpc/include/asm/trace.h b/arch/powerpc/include/asm/trace.h
index c15da6073cb8..ded56b183888 100644
--- a/arch/powerpc/include/asm/trace.h
+++ b/arch/powerpc/include/asm/trace.h
@@ -142,8 +142,29 @@ TRACE_EVENT_FN(opal_exit,
 
 	opal_tracepoint_regfunc, opal_tracepoint_unregfunc
 );
+
 #endif
 
+TRACE_EVENT(hash_fault,
+
+	    TP_PROTO(unsigned long addr, unsigned long access, unsigned long trap),
+	    TP_ARGS(addr, access, trap),
+	    TP_STRUCT__entry(
+		    __field(unsigned long, addr)
+		    __field(unsigned long, access)
+		    __field(unsigned long, trap)
+		    ),
+
+	    TP_fast_assign(
+		    __entry->addr = addr;
+		    __entry->access = access;
+		    __entry->trap = trap;
+		    ),
+
+	    TP_printk("hash fault with addr 0x%lx and access = 0x%lx trap = 0x%lx",
+		      __entry->addr, __entry->access, __entry->trap)
+);
+
 #endif /* _TRACE_POWERPC_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 2c2022d16059..7e88470a876f 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -57,6 +57,7 @@
 #include <asm/fadump.h>
 #include <asm/firmware.h>
 #include <asm/tm.h>
+#include <asm/trace.h>
 
 #ifdef DEBUG
 #define DBG(fmt...) udbg_printf(fmt)
@@ -1004,6 +1005,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea,
 
 	DBG_LOW("hash_page(ea=%016lx, access=%lx, trap=%lx\n",
 		ea, access, trap);
+	trace_hash_fault(ea, access, trap);
 
 	/* Get region & vsid */
  	switch (REGION_ID(ea)) {
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-01-20 11:35 [PATCH] powerpc/mm: Add trace point for tracking hash pte fault Aneesh Kumar K.V
@ 2015-01-21  3:07 ` Michael Ellerman
  2015-01-21  8:45   ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Michael Ellerman @ 2015-01-21  3:07 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: paulus, linuxppc-dev

On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote:
> This enables us to understand how many hash fault we are taking
> when running benchmarks.
> 
> For ex:
> -bash-4.2# ./perf stat -e  powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30  -P -n 1000
> ...
> 
>  Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000':
> 
>        1,10,04,075      powerpc:hash_fault
>        1,10,03,429      page-faults
> 
>       30.865978991 seconds time elapsed

Looks good.

Can you attach some test results that show it's not hurting performance when
it's disabled.

cheers

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-01-21  3:07 ` Michael Ellerman
@ 2015-01-21  8:45   ` Aneesh Kumar K.V
  2015-01-28  6:11     ` Michael Ellerman
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2015-01-21  8:45 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: paulus, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:

> On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote:
>> This enables us to understand how many hash fault we are taking
>> when running benchmarks.
>> 
>> For ex:
>> -bash-4.2# ./perf stat -e  powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30  -P -n 1000
>> ...
>> 
>>  Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000':
>> 
>>        1,10,04,075      powerpc:hash_fault
>>        1,10,03,429      page-faults
>> 
>>       30.865978991 seconds time elapsed
>
> Looks good.
>
> Can you attach some test results that show it's not hurting performance when
> it's disabled.
>

ebizzy with -S 30 -t 1 -P gave
13627 records/s -> Without patch
13546 records/s -> With patch with tracepoint disabled
12408 records/s -> With patch with tracepoint enabled.

perf stat gave the below data for the above run.

 22,38,284      page-faults                                                 
 22,38,291      powerpc:hash_fault

I also used random_access_bench that Anton wrote, it actually
create lots of hash fault. A simple run gives.
(random_access_bench -o load -g -i -t 10 16G)

1,888      page-faults                                                 
2,64,283   powerpc:hash_fault                                          

random_access_bench gave:
1435.979 MB/s -> Without patch
1435.29  MB/s -> With patch with tracepoint disabled
1434.75  MB/s -> With patch with tracepoint enabled.

-aneesh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-01-21  8:45   ` Aneesh Kumar K.V
@ 2015-01-28  6:11     ` Michael Ellerman
  2015-02-02 10:26       ` Anton Blanchard
  2015-02-02 16:12       ` Aneesh Kumar K.V
  0 siblings, 2 replies; 10+ messages in thread
From: Michael Ellerman @ 2015-01-28  6:11 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: paulus, linuxppc-dev

On Wed, 2015-01-21 at 14:15 +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> > On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote:
> >> This enables us to understand how many hash fault we are taking
> >> when running benchmarks.
> >> 
> >> For ex:
> >> -bash-4.2# ./perf stat -e  powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30  -P -n 1000
> >> ...
> >> 
> >>  Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000':
> >> 
> >>        1,10,04,075      powerpc:hash_fault
> >>        1,10,03,429      page-faults
> >> 
> >>       30.865978991 seconds time elapsed
> >
> > Looks good.
> >
> > Can you attach some test results that show it's not hurting performance when
> > it's disabled.
> 
> ebizzy with -S 30 -t 1 -P gave
> 13627 records/s -> Without patch
> 13546 records/s -> With patch with tracepoint disabled

OK. So that's about -0.6%. Are we happy with that? I'm not sure.

Can you do a few more runs and see if that's a stable result.

> random_access_bench gave:
> 1435.979 MB/s -> Without patch
> 1435.29  MB/s -> With patch with tracepoint disabled

That's more like -0.05% which is in the noise.

cheers

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-01-28  6:11     ` Michael Ellerman
@ 2015-02-02 10:26       ` Anton Blanchard
  2015-02-02 16:21         ` Aneesh Kumar K.V
  2015-02-02 16:12       ` Aneesh Kumar K.V
  1 sibling, 1 reply; 10+ messages in thread
From: Anton Blanchard @ 2015-02-02 10:26 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, paulus, Aneesh Kumar K.V

Hi,

> > ebizzy with -S 30 -t 1 -P gave
> > 13627 records/s -> Without patch
> > 13546 records/s -> With patch with tracepoint disabled
> 
> OK. So that's about -0.6%. Are we happy with that? I'm not sure.
> 
> Can you do a few more runs and see if that's a stable result.

Surprisingly large. Is CONFIG_JUMP_LABEL enabled? That should reduce the
tracepoint to just a nop.

Anton

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-01-28  6:11     ` Michael Ellerman
  2015-02-02 10:26       ` Anton Blanchard
@ 2015-02-02 16:12       ` Aneesh Kumar K.V
  2015-04-02  8:44         ` Aneesh Kumar K.V
  1 sibling, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2015-02-02 16:12 UTC (permalink / raw)
  To: Michael Ellerman, Anton Blanchard; +Cc: paulus, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:

> On Wed, 2015-01-21 at 14:15 +0530, Aneesh Kumar K.V wrote:
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>> 
>> > On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote:
>> >> This enables us to understand how many hash fault we are taking
>> >> when running benchmarks.
>> >> 
>> >> For ex:
>> >> -bash-4.2# ./perf stat -e  powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30  -P -n 1000
>> >> ...
>> >> 
>> >>  Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000':
>> >> 
>> >>        1,10,04,075      powerpc:hash_fault
>> >>        1,10,03,429      page-faults
>> >> 
>> >>       30.865978991 seconds time elapsed
>> >
>> > Looks good.
>> >
>> > Can you attach some test results that show it's not hurting performance when
>> > it's disabled.
>> 
>> ebizzy with -S 30 -t 1 -P gave
>> 13627 records/s -> Without patch
>> 13546 records/s -> With patch with tracepoint disabled
>
> OK. So that's about -0.6%. Are we happy with that? I'm not sure.
>
> Can you do a few more runs and see if that's a stable result.

That is within the run variance for that test. Infact I found it
difficult to get a stable records/s with ebizzy run, even after fixing
the random state variable and forcing single thread. I ended up doing
a micro benchmark that allocate a large region and touch one byte per
page.

That resulted in
time perf stat -e page-faults -e powerpc:hash_fault  ./a.out

 Performance counter stats for './a.out':

         10,00,062      page-faults                                                 
         10,00,068      powerpc:hash_fault                                          

      12.414350121 seconds time elapsed


real    0m12.558s
user    0m0.577s
sys     0m11.932s

and with that test we have an average system time for 10 run
Without patch
sys: 0m11.2425

With patch:
sys: 0m11.3258

ie, a -0.7% impact 

If that impact is high we could possibly put that tracepoint within #ifdef
CONFIG_DEBUG_VM ?

-aneesh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-02-02 10:26       ` Anton Blanchard
@ 2015-02-02 16:21         ` Aneesh Kumar K.V
  2015-02-02 22:01           ` Anton Blanchard
  0 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2015-02-02 16:21 UTC (permalink / raw)
  To: Anton Blanchard, Michael Ellerman; +Cc: paulus, linuxppc-dev

Anton Blanchard <anton@samba.org> writes:

> Hi,
>
>> > ebizzy with -S 30 -t 1 -P gave
>> > 13627 records/s -> Without patch
>> > 13546 records/s -> With patch with tracepoint disabled
>> 
>> OK. So that's about -0.6%. Are we happy with that? I'm not sure.
>> 
>> Can you do a few more runs and see if that's a stable result.
>
> Surprisingly large. Is CONFIG_JUMP_LABEL enabled? That should reduce the
> tracepoint to just a nop.

yes. We do use jump label. I also verified that looking at .s

#APP
 # 23 "./arch/powerpc/include/asm/jump_label.h" 1
        1:
        nop
        .pushsection __jump_table,  "aw"
        .llong 1b, .L201, __tracepoint_hash_fault+8      #,
        .popsection

 # 0 "" 2
.....

-aneesh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-02-02 16:21         ` Aneesh Kumar K.V
@ 2015-02-02 22:01           ` Anton Blanchard
  2015-02-03  3:07             ` Aneesh Kumar K.V
  0 siblings, 1 reply; 10+ messages in thread
From: Anton Blanchard @ 2015-02-02 22:01 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: linuxppc-dev, paulus

Hi Aneesh,

> yes. We do use jump label. I also verified that looking at .s
> 
> #APP
>  # 23 "./arch/powerpc/include/asm/jump_label.h" 1
>         1:
>         nop
>         .pushsection __jump_table,  "aw"
>         .llong 1b, .L201, __tracepoint_hash_fault+8      #,
>         .popsection
> 
>  # 0 "" 2

So we insert a single nop, and the slow path is in another section. I'd
be surprised if we could measure this, unless the nop causes a branch
target alignment issue the slow path caused some hot path icache layout
issues.

> Without patch
> sys: 0m11.2425
> 
> With patch:
> sys: 0m11.3258
>
> ie, a -0.7% impact 
> 
> If that impact is high we could possibly put that tracepoint within
> #ifdef CONFIG_DEBUG_VM ?

Did the real time change? I'd be careful about comparing based on
system time.

My feeling is we should not hide it behind CONFIG_DEBUG_VM.

Anton

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-02-02 22:01           ` Anton Blanchard
@ 2015-02-03  3:07             ` Aneesh Kumar K.V
  0 siblings, 0 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2015-02-03  3:07 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: linuxppc-dev, paulus

Anton Blanchard <anton@samba.org> writes:

> Hi Aneesh,
>
>> yes. We do use jump label. I also verified that looking at .s
>> 
>> #APP
>>  # 23 "./arch/powerpc/include/asm/jump_label.h" 1
>>         1:
>>         nop
>>         .pushsection __jump_table,  "aw"
>>         .llong 1b, .L201, __tracepoint_hash_fault+8      #,
>>         .popsection
>> 
>>  # 0 "" 2
>
> So we insert a single nop, and the slow path is in another section. I'd
> be surprised if we could measure this, unless the nop causes a branch
> target alignment issue the slow path caused some hot path icache layout
> issues.
>
>> Without patch
>> sys: 0m11.2425
>> 
>> With patch:
>> sys: 0m11.3258
>>
>> ie, a -0.7% impact 
>> 
>> If that impact is high we could possibly put that tracepoint within
>> #ifdef CONFIG_DEBUG_VM ?
>
> Did the real time change? I'd be careful about comparing based on
> system time.


Yes it did. 
11.8769 with patch
11.7924 without patch

I will look at the perf stat data difference between the both runs and update.

-aneesh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] powerpc/mm: Add trace point for tracking hash pte fault
  2015-02-02 16:12       ` Aneesh Kumar K.V
@ 2015-04-02  8:44         ` Aneesh Kumar K.V
  0 siblings, 0 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2015-04-02  8:44 UTC (permalink / raw)
  To: Michael Ellerman, Anton Blanchard; +Cc: linuxppc-dev, paulus

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Michael Ellerman <mpe@ellerman.id.au> writes:
>
....
....
> With patch:
> sys: 0m11.3258
>
> ie, a -0.7% impact 
>
> If that impact is high we could possibly put that tracepoint within #ifdef
> CONFIG_DEBUG_VM ?

Since the ebizzy runs results were not stable, I did a micro benchmark
to measure this and I noticed that results observed are within the
run variance of the test. I made sure we don't have context-switches
between the runs. If I try to get large number of page-faults, we end up
with context switches.

for ex: We get without patch
--------------------------------
root@qemu-pr-host trace-fault]# bash run 

 Performance counter stats for './a.out 3000 300':

               643      page-faults               #    0.089 M/sec                  
          7.236562      task-clock (msec)         #    0.928 CPUs utilized          
         2,179,213      stalled-cycles-frontend   #    0.00% frontend cycles idle   
        17,174,367      stalled-cycles-backend    #    0.00% backend  cycles idle   
                 0      context-switches          #    0.000 K/sec                  

       0.007794658 seconds time elapsed

[root@qemu-pr-host trace-fault]# 

And with-patch:
---------------
[root@qemu-pr-host trace-fault]# bash run 

 Performance counter stats for './a.out 3000 300':

               643      page-faults               #    0.089 M/sec                  
          7.233746      task-clock (msec)         #    0.921 CPUs utilized          
                 0      context-switches          #    0.000 K/sec                  

       0.007854876 seconds time elapsed


 Performance counter stats for './a.out 3000 300':

               643      page-faults               #    0.087 M/sec                  
               649      powerpc:hash_fault        #    0.087 M/sec                  
          7.430376      task-clock (msec)         #    0.938 CPUs utilized          
         2,347,174      stalled-cycles-frontend   #    0.00% frontend cycles idle   
        17,524,282      stalled-cycles-backend    #    0.00% backend  cycles idle   
                 0      context-switches          #    0.000 K/sec                  

       0.007920284 seconds time elapsed

[root@qemu-pr-host trace-fault]# 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-04-02  8:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-20 11:35 [PATCH] powerpc/mm: Add trace point for tracking hash pte fault Aneesh Kumar K.V
2015-01-21  3:07 ` Michael Ellerman
2015-01-21  8:45   ` Aneesh Kumar K.V
2015-01-28  6:11     ` Michael Ellerman
2015-02-02 10:26       ` Anton Blanchard
2015-02-02 16:21         ` Aneesh Kumar K.V
2015-02-02 22:01           ` Anton Blanchard
2015-02-03  3:07             ` Aneesh Kumar K.V
2015-02-02 16:12       ` Aneesh Kumar K.V
2015-04-02  8:44         ` Aneesh Kumar K.V

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.