* [PATCH] Properly interpret indirect call in perf annotate.
@ 2018-08-23 12:29 Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Martin Liška @ 2018-08-23 12:29 UTC (permalink / raw)
To: linux-perf-users, lkml; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa
[-- Attachment #1: Type: text/plain, Size: 364 bytes --]
The patch changes interpretation of:
callq *0x8(%rbx)
from:
0.26 │ → callq *8
to:
0.26 │ → callq *0x8(%rbx)
in this can an address is followed by a register, thus
one can't parse only address.
Signed-off-by: Martin Liška <mliska@suse.cz>
---
tools/perf/util/annotate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
[-- Attachment #2: 0001-Properly-interpret-indirect-call-in-perf-annotate.patch --]
[-- Type: text/x-patch, Size: 668 bytes --]
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e4268b948e0e..e32ead4744bd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
indirect_call:
tok = strchr(endptr, '*');
- if (tok != NULL)
- ops->target.addr = strtoull(tok + 1, NULL, 16);
+ if (tok != NULL) {
+ endptr++;
+
+ /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
+ * Do not parse such instruction. */
+ if (strstr(endptr, "(%r") == NULL)
+ ops->target.addr = strtoull(endptr, NULL, 16);
+ }
goto find_target;
}
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
@ 2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-27 9:06 ` Martin Liška
2018-08-23 23:00 ` Kim Phillips
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-23 14:12 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
Please mention one or two functions where such sequence appears, so that
others can reproduce your before/after more quickly,
- Arnaldo
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
> + }
> goto find_target;
> }
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
@ 2018-08-23 23:00 ` Kim Phillips
2018-08-27 10:37 ` Namhyung Kim
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
3 siblings, 0 replies; 10+ messages in thread
From: Kim Phillips @ 2018-08-23 23:00 UTC (permalink / raw)
To: Martin Liška
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa
On Thu, 23 Aug 2018 14:29:34 +0200
Martin Liška <mliska@suse.cz> wrote:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
Tested this doesn't incur any grievances parsing aarch64 code:
Tested-by: Kim Phillips <kim.phillips@arm.com>
Thanks,
Kim
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
@ 2018-08-27 9:06 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 10+ messages in thread
From: Martin Liška @ 2018-08-27 9:06 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users, lkml, Jiri Olsa
[-- Attachment #1: Type: text/plain, Size: 1887 bytes --]
On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>> The patch changes interpretation of:
>> callq *0x8(%rbx)
>>
>> from:
>> 0.26 │ → callq *8
>> to:
>> 0.26 │ → callq *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
>
> Please mention one or two functions where such sequence appears, so that
> others can reproduce your before/after more quickly,
Sure, there's self-contained example on can compile (-O2) and test.
It's following call in test function:
test:
.LFB1:
.cfi_startproc
movq %rdi, %rax
subq $8, %rsp
.cfi_def_cfa_offset 16
movq %rsi, %rdi
movq %rdx, %rsi
call *8(%rax) <---- here
cmpl $1, %eax
adcl $-1, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
Martin
>
> - Arnaldo
>
>> Signed-off-by: Martin Liška <mliska@suse.cz>
>> ---
>> tools/perf/util/annotate.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>
>> indirect_call:
>> tok = strchr(endptr, '*');
>> - if (tok != NULL)
>> - ops->target.addr = strtoull(tok + 1, NULL, 16);
>> + if (tok != NULL) {
>> + endptr++;
>> +
>> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
>> + * Do not parse such instruction. */
>> + if (strstr(endptr, "(%r") == NULL)
>> + ops->target.addr = strtoull(endptr, NULL, 16);
>> + }
>> goto find_target;
>> }
>>
>>
>
[-- Attachment #2: perf-callq.c --]
[-- Type: text/x-csrc, Size: 1271 bytes --]
typedef int (*htab_eq) (const void *, const void *);
static int eq (const void *a, const void *b)
{
return a - b;
}
struct htab
{
/* Current size (in entries) of the hash table */
int size;
/* Pointer to comparison function. */
htab_eq eq_f;
/* Table itself. */
void **entries;
/* Current number of elements including also deleted elements */
int n_elements;
/* Current number of deleted elements in the table */
int n_deleted;
/* The following member is used for debugging. Its value is number
of all calls of `htab_find_slot' for the hash table. */
unsigned int searches;
/* The following member is used for debugging. Its value is number
of collisions fixed for time of work with the hash table. */
unsigned int collisions;
unsigned int max_size;
/* This is non-zero if we are allowed to return NULL for function calls
that allocate memory. */
int return_allocation_failure;
};
int
__attribute__ ((noinline))
test(struct htab *t, int *a, int *b)
{
int r = t->eq_f (a, b);
if (r)
return r - 1;
return 0;
}
struct htab mytable;
int r;
int main(int argc, char **argv)
{
mytable.eq_f = &eq;
for (unsigned i = 0; i < 100000000; i++)
r += test (&mytable, &argc, &argc);
return 0;
}
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-23 23:00 ` Kim Phillips
@ 2018-08-27 10:37 ` Namhyung Kim
2018-08-27 14:28 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
3 siblings, 1 reply; 10+ messages in thread
From: Namhyung Kim @ 2018-08-27 10:37 UTC (permalink / raw)
To: Martin Liška
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa, kernel-team
Hello,
On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
Also there's a case with no offset like: callq *%rbx
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
It seems too x86-specific, what about this? (not tested)
indirect_call:
tok = strchr(endptr, '*');
if (tok != NULL) {
endptr++;
if (!isdigit(*endptr))
return 0;
addr = strtoull(endptr, &endptr, 0);
if (*endptr != '('))
ops->target.addr = addr;
Thanks,
Namhyung
> + }
> goto find_target;
> }
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-27 10:37 ` Namhyung Kim
@ 2018-08-27 14:28 ` Martin Liška
0 siblings, 0 replies; 10+ messages in thread
From: Martin Liška @ 2018-08-27 14:28 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa, kernel-team
[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]
On 08/27/2018 12:37 PM, Namhyung Kim wrote:
> Hello,
>
> On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
>> The patch changes interpretation of:
>> callq *0x8(%rbx)
>>
>> from:
>> 0.26 │ → callq *8
>> to:
>> 0.26 │ → callq *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
>
> Also there's a case with no offset like: callq *%rbx
Yes. But this case is fine as strtoull returns 0 for that:
'If there were no digits at all, strtoul() stores the original value of nptr in *endptr (and returns 0).'
So ops->target.addr is then 0 and it's fine.
>
>
>>
>> Signed-off-by: Martin Liška <mliska@suse.cz>
>> ---
>> tools/perf/util/annotate.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>
>> indirect_call:
>> tok = strchr(endptr, '*');
>> - if (tok != NULL)
>> - ops->target.addr = strtoull(tok + 1, NULL, 16);
>> + if (tok != NULL) {
>> + endptr++;
>> +
>> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
>> + * Do not parse such instruction. */
>> + if (strstr(endptr, "(%r") == NULL)
>> + ops->target.addr = strtoull(endptr, NULL, 16);
>
> It seems too x86-specific, what about this? (not tested)
It is, I'm fine with that. I've just tested that for the callq *0x8(%rbx) example.
I'm sending patch for that version.
Martin
>
>
> indirect_call:
> tok = strchr(endptr, '*');
> if (tok != NULL) {
> endptr++;
> if (!isdigit(*endptr))
> return 0;
>
> addr = strtoull(endptr, &endptr, 0);
> if (*endptr != '('))
> ops->target.addr = addr;
>
>
> Thanks,
> Namhyung
>
>
>> + }
>> goto find_target;
>> }
>>
>>
>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Properly-interpret-indirect-call-in-perf-annotate.patch --]
[-- Type: text/x-patch; name="0001-Properly-interpret-indirect-call-in-perf-annotate.patch", Size: 1495 bytes --]
From 58a0eca544be8cc9e15b2ab5ecd9d9401ff4d2ec Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 23 Aug 2018 14:25:33 +0200
Subject: [PATCH] Properly interpret indirect call in perf annotate.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The patch changes interpretation of:
callq *0x8(%rbx)
from:
0.26 │ → callq *8
to:
0.26 │ → callq *0x8(%rbx)
in this can an address is followed by a register, thus
one can't parse only address.
Signed-off-by: Martin Liška <mliska@suse.cz>
---
tools/perf/util/annotate.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e4268b948e0e..18a8477d4664 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -212,6 +212,7 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
struct addr_map_symbol target = {
.map = map,
};
+ u64 addr;
ops->target.addr = strtoull(ops->raw, &endptr, 16);
@@ -246,8 +247,15 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
indirect_call:
tok = strchr(endptr, '*');
- if (tok != NULL)
- ops->target.addr = strtoull(tok + 1, NULL, 16);
+ if (tok != NULL) {
+ endptr++;
+ if (!isdigit(*endptr))
+ return 0;
+
+ addr = strtoull(endptr, &endptr, 0);
+ if (*endptr != '(')
+ ops->target.addr = addr;
+ }
goto find_target;
}
--
2.18.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
` (2 preceding siblings ...)
2018-08-27 10:37 ` Namhyung Kim
@ 2018-08-28 14:10 ` Arnaldo Carvalho de Melo
3 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:10 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
Please don't send the patch as an attachment, use 'git-format-patch',
I'm fixing this up this time.
- Arnaldo
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
> + }
> goto find_target;
> }
>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-27 9:06 ` Martin Liška
@ 2018-08-28 14:10 ` Arnaldo Carvalho de Melo
2018-08-28 14:18 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:10 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> >> The patch changes interpretation of:
> >> callq *0x8(%rbx)
> >>
> >> from:
> >> 0.26 │ → callq *8
> >> to:
> >> 0.26 │ → callq *0x8(%rbx)
> >>
> >> in this can an address is followed by a register, thus
> >> one can't parse only address.
> >
> > Please mention one or two functions where such sequence appears, so that
> > others can reproduce your before/after more quickly,
>
> Sure, there's self-contained example on can compile (-O2) and test.
> It's following call in test function:
>
> test:
> .LFB1:
> .cfi_startproc
> movq %rdi, %rax
> subq $8, %rsp
> .cfi_def_cfa_offset 16
> movq %rsi, %rdi
> movq %rdx, %rsi
> call *8(%rax) <---- here
> cmpl $1, %eax
> adcl $-1, %eax
> addq $8, %rsp
> .cfi_def_cfa_offset 8
> ret
> .cfi_endproc
Here I'm getting:
Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
test /home/acme/c/perf-callq [Percent: local period]
0.17 │ mov %rdx,-0x28(%rbp)
0.58 │ mov -0x18(%rbp),%rax
7.90 │ mov 0x8(%rax),%rax
8.67 │ mov -0x28(%rbp),%rcx
│ mov -0x20(%rbp),%rdx
0.08 │ mov %rcx,%rsi
6.28 │ mov %rdx,%rdi
10.50 │ → callq *%rax
1.67 │ mov %eax,-0x4(%rbp)
11.95 │ cmpl $0x0,-0x4(%rbp)
8.14 │ ↓ je 3d
│ mov -0x4(%rbp),%eax
│ sub $0x1,%eax
│ ↓ jmp 42
│3d: mov $0x0,%eax
7.84 │42: leaveq
│ ← retq
Without the patch, will check if something changes with it.
- Arnaldo
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
@ 2018-08-28 14:18 ` Arnaldo Carvalho de Melo
2018-08-28 17:55 ` Martin Liška
0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:18 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> > On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> > >> The patch changes interpretation of:
> > >> callq *0x8(%rbx)
> > >>
> > >> from:
> > >> 0.26 │ → callq *8
> > >> to:
> > >> 0.26 │ → callq *0x8(%rbx)
<SNIP>
> > > Please mention one or two functions where such sequence appears, so that
> > > others can reproduce your before/after more quickly,
> > Sure, there's self-contained example on can compile (-O2) and test.
> > It's following call in test function:
> > test:
> > .LFB1:
> > .cfi_startproc
> > movq %rdi, %rax
> > subq $8, %rsp
> > .cfi_def_cfa_offset 16
> > movq %rsi, %rdi
> > movq %rdx, %rsi
> > call *8(%rax) <---- here
> > cmpl $1, %eax
> > adcl $-1, %eax
> > addq $8, %rsp
> > .cfi_def_cfa_offset 8
> > ret
> > .cfi_endproc
>
> Here I'm getting:
>
> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
> test /home/acme/c/perf-callq [Percent: local period]
> 0.17 │ mov %rdx,-0x28(%rbp)
> 0.58 │ mov -0x18(%rbp),%rax
> 7.90 │ mov 0x8(%rax),%rax
> 8.67 │ mov -0x28(%rbp),%rcx
> │ mov -0x20(%rbp),%rdx
> 0.08 │ mov %rcx,%rsi
> 6.28 │ mov %rdx,%rdi
> 10.50 │ → callq *%rax
> 1.67 │ mov %eax,-0x4(%rbp)
> 11.95 │ cmpl $0x0,-0x4(%rbp)
> 8.14 │ ↓ je 3d
> │ mov -0x4(%rbp),%eax
> │ sub $0x1,%eax
> │ ↓ jmp 42
> │3d: mov $0x0,%eax
> 7.84 │42: leaveq
> │ ← retq
>
> Without the patch, will check if something changes with it.
No changes with the patch, but then I did another test, ran a system
wide record for a while, then tested without/with your patch, with
--stdio2 redirecting to /tmp/{before,after} and got the expected
results, see below.
Thanks, applying,
- Arnaldo
--- /tmp/before 2018-08-28 11:16:03.238384143 -0300
+++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
@@ -13274,7 +13274,7 @@
↓ jle 128
hash_value = hash_table->hash_func (key);
mov 0x8(%rsp),%rdi
- 0.91 → callq *30
+ 0.91 → callq *0x30(%r12)
mov $0x2,%r8d
cmp $0x2,%eax
node_hash = hash_table->hashes[node_index];
@@ -13848,7 +13848,7 @@
mov %r14,%rdi
sub %rbx,%r13
mov %r13,%rdx
- → callq *38
+ → callq *0x38(%r15)
cmp %rax,%r13
1.91 ↓ je 240
1b4: mov $0xffffffff,%r13d
@@ -14026,7 +14026,7 @@
mov %rcx,-0x500(%rbp)
mov %r15,%rsi
mov %r14,%rdi
- → callq *38
+ → callq *0x38(%rax)
mov -0x500(%rbp),%rcx
cmp %rax,%rcx
↓ jne 9b0
<SNIP tons of other such cases>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-28 14:18 ` Arnaldo Carvalho de Melo
@ 2018-08-28 17:55 ` Martin Liška
0 siblings, 0 replies; 10+ messages in thread
From: Martin Liška @ 2018-08-28 17:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users, lkml, Jiri Olsa
On 08/28/2018 04:18 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
>>> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>>>>> The patch changes interpretation of:
>>>>> callq *0x8(%rbx)
>>>>>
>>>>> from:
>>>>> 0.26 │ → callq *8
>>>>> to:
>>>>> 0.26 │ → callq *0x8(%rbx)
>
> <SNIP>
>
>>>> Please mention one or two functions where such sequence appears, so that
>>>> others can reproduce your before/after more quickly,
>
>>> Sure, there's self-contained example on can compile (-O2) and test.
>>> It's following call in test function:
>
>>> test:
>>> .LFB1:
>>> .cfi_startproc
>>> movq %rdi, %rax
>>> subq $8, %rsp
>>> .cfi_def_cfa_offset 16
>>> movq %rsi, %rdi
>>> movq %rdx, %rsi
>>> call *8(%rax) <---- here
>>> cmpl $1, %eax
>>> adcl $-1, %eax
>>> addq $8, %rsp
>>> .cfi_def_cfa_offset 8
>>> ret
>>> .cfi_endproc
>>
>> Here I'm getting:
>>
>> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
>> test /home/acme/c/perf-callq [Percent: local period]
>> 0.17 │ mov %rdx,-0x28(%rbp)
>> 0.58 │ mov -0x18(%rbp),%rax
>> 7.90 │ mov 0x8(%rax),%rax
>> 8.67 │ mov -0x28(%rbp),%rcx
>> │ mov -0x20(%rbp),%rdx
>> 0.08 │ mov %rcx,%rsi
>> 6.28 │ mov %rdx,%rdi
>> 10.50 │ → callq *%rax
>> 1.67 │ mov %eax,-0x4(%rbp)
>> 11.95 │ cmpl $0x0,-0x4(%rbp)
>> 8.14 │ ↓ je 3d
>> │ mov -0x4(%rbp),%eax
>> │ sub $0x1,%eax
>> │ ↓ jmp 42
>> │3d: mov $0x0,%eax
>> 7.84 │42: leaveq
>> │ ← retq
>>
>> Without the patch, will check if something changes with it.
Hi Arnaldo.
Thanks for re-sending of the patch and for the testing. The example I sent
is dependent on version of GCC compiler.
>
> No changes with the patch, but then I did another test, ran a system
> wide record for a while, then tested without/with your patch, with
> --stdio2 redirecting to /tmp/{before,after} and got the expected
> results, see below.
>
> Thanks, applying,
Good!
Martin
>
> - Arnaldo
>
> --- /tmp/before 2018-08-28 11:16:03.238384143 -0300
> +++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
> @@ -13274,7 +13274,7 @@
> ↓ jle 128
> hash_value = hash_table->hash_func (key);
> mov 0x8(%rsp),%rdi
> - 0.91 → callq *30
> + 0.91 → callq *0x30(%r12)
> mov $0x2,%r8d
> cmp $0x2,%eax
> node_hash = hash_table->hashes[node_index];
> @@ -13848,7 +13848,7 @@
> mov %r14,%rdi
> sub %rbx,%r13
> mov %r13,%rdx
> - → callq *38
> + → callq *0x38(%r15)
> cmp %rax,%r13
> 1.91 ↓ je 240
> 1b4: mov $0xffffffff,%r13d
> @@ -14026,7 +14026,7 @@
> mov %rcx,-0x500(%rbp)
> mov %r15,%rsi
> mov %r14,%rdi
> - → callq *38
> + → callq *0x38(%rax)
> mov -0x500(%rbp),%rcx
> cmp %rax,%rcx
> ↓ jne 9b0
> <SNIP tons of other such cases>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2018-08-28 17:55 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-27 9:06 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
2018-08-28 14:18 ` Arnaldo Carvalho de Melo
2018-08-28 17:55 ` Martin Liška
2018-08-23 23:00 ` Kim Phillips
2018-08-27 10:37 ` Namhyung Kim
2018-08-27 14:28 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).