* [PATCH] Properly interpret indirect call in perf annotate.
@ 2018-08-23 12:29 Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Martin Liška @ 2018-08-23 12:29 UTC (permalink / raw)
To: linux-perf-users, lkml; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa
[-- Attachment #1: Type: text/plain, Size: 364 bytes --]
The patch changes interpretation of:
callq *0x8(%rbx)
from:
0.26 │ → callq *8
to:
0.26 │ → callq *0x8(%rbx)
in this can an address is followed by a register, thus
one can't parse only address.
Signed-off-by: Martin Liška <mliska@suse.cz>
---
tools/perf/util/annotate.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
[-- Attachment #2: 0001-Properly-interpret-indirect-call-in-perf-annotate.patch --]
[-- Type: text/x-patch, Size: 668 bytes --]
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e4268b948e0e..e32ead4744bd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
indirect_call:
tok = strchr(endptr, '*');
- if (tok != NULL)
- ops->target.addr = strtoull(tok + 1, NULL, 16);
+ if (tok != NULL) {
+ endptr++;
+
+ /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
+ * Do not parse such instruction. */
+ if (strstr(endptr, "(%r") == NULL)
+ ops->target.addr = strtoull(endptr, NULL, 16);
+ }
goto find_target;
}
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
@ 2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-27 9:06 ` Martin Liška
2018-08-23 23:00 ` Kim Phillips
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-23 14:12 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
Please mention one or two functions where such sequence appears, so that
others can reproduce your before/after more quickly,
- Arnaldo
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
> + }
> goto find_target;
> }
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
@ 2018-08-23 23:00 ` Kim Phillips
2018-08-23 23:00 ` Kim Phillips
` (2 subsequent siblings)
3 siblings, 0 replies; 11+ messages in thread
From: Kim Phillips @ 2018-08-23 23:00 UTC (permalink / raw)
To: Martin Liška
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa
On Thu, 23 Aug 2018 14:29:34 +0200
Martin Liška <mliska@suse.cz> wrote:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
Tested this doesn't incur any grievances parsing aarch64 code:
Tested-by: Kim Phillips <kim.phillips@arm.com>
Thanks,
Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
@ 2018-08-23 23:00 ` Kim Phillips
0 siblings, 0 replies; 11+ messages in thread
From: Kim Phillips @ 2018-08-23 23:00 UTC (permalink / raw)
To: Martin Liška
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa
On Thu, 23 Aug 2018 14:29:34 +0200
Martin Liška <mliska@suse.cz> wrote:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
Tested this doesn't incur any grievances parsing aarch64 code:
Tested-by: Kim Phillips <kim.phillips@arm.com>
Thanks,
Kim
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
@ 2018-08-27 9:06 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 11+ messages in thread
From: Martin Liška @ 2018-08-27 9:06 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users, lkml, Jiri Olsa
[-- Attachment #1: Type: text/plain, Size: 1887 bytes --]
On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>> The patch changes interpretation of:
>> callq *0x8(%rbx)
>>
>> from:
>> 0.26 │ → callq *8
>> to:
>> 0.26 │ → callq *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
>
> Please mention one or two functions where such sequence appears, so that
> others can reproduce your before/after more quickly,
Sure, there's self-contained example on can compile (-O2) and test.
It's following call in test function:
test:
.LFB1:
.cfi_startproc
movq %rdi, %rax
subq $8, %rsp
.cfi_def_cfa_offset 16
movq %rsi, %rdi
movq %rdx, %rsi
call *8(%rax) <---- here
cmpl $1, %eax
adcl $-1, %eax
addq $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
Martin
>
> - Arnaldo
>
>> Signed-off-by: Martin Liška <mliska@suse.cz>
>> ---
>> tools/perf/util/annotate.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>
>> indirect_call:
>> tok = strchr(endptr, '*');
>> - if (tok != NULL)
>> - ops->target.addr = strtoull(tok + 1, NULL, 16);
>> + if (tok != NULL) {
>> + endptr++;
>> +
>> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
>> + * Do not parse such instruction. */
>> + if (strstr(endptr, "(%r") == NULL)
>> + ops->target.addr = strtoull(endptr, NULL, 16);
>> + }
>> goto find_target;
>> }
>>
>>
>
[-- Attachment #2: perf-callq.c --]
[-- Type: text/x-csrc, Size: 1271 bytes --]
typedef int (*htab_eq) (const void *, const void *);
static int eq (const void *a, const void *b)
{
return a - b;
}
struct htab
{
/* Current size (in entries) of the hash table */
int size;
/* Pointer to comparison function. */
htab_eq eq_f;
/* Table itself. */
void **entries;
/* Current number of elements including also deleted elements */
int n_elements;
/* Current number of deleted elements in the table */
int n_deleted;
/* The following member is used for debugging. Its value is number
of all calls of `htab_find_slot' for the hash table. */
unsigned int searches;
/* The following member is used for debugging. Its value is number
of collisions fixed for time of work with the hash table. */
unsigned int collisions;
unsigned int max_size;
/* This is non-zero if we are allowed to return NULL for function calls
that allocate memory. */
int return_allocation_failure;
};
int
__attribute__ ((noinline))
test(struct htab *t, int *a, int *b)
{
int r = t->eq_f (a, b);
if (r)
return r - 1;
return 0;
}
struct htab mytable;
int r;
int main(int argc, char **argv)
{
mytable.eq_f = &eq;
for (unsigned i = 0; i < 100000000; i++)
r += test (&mytable, &argc, &argc);
return 0;
}
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-23 23:00 ` Kim Phillips
@ 2018-08-27 10:37 ` Namhyung Kim
2018-08-27 14:28 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
3 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2018-08-27 10:37 UTC (permalink / raw)
To: Martin Liška
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa, kernel-team
Hello,
On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
Also there's a case with no offset like: callq *%rbx
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
It seems too x86-specific, what about this? (not tested)
indirect_call:
tok = strchr(endptr, '*');
if (tok != NULL) {
endptr++;
if (!isdigit(*endptr))
return 0;
addr = strtoull(endptr, &endptr, 0);
if (*endptr != '('))
ops->target.addr = addr;
Thanks,
Namhyung
> + }
> goto find_target;
> }
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-27 10:37 ` Namhyung Kim
@ 2018-08-27 14:28 ` Martin Liška
0 siblings, 0 replies; 11+ messages in thread
From: Martin Liška @ 2018-08-27 14:28 UTC (permalink / raw)
To: Namhyung Kim
Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa, kernel-team
[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]
On 08/27/2018 12:37 PM, Namhyung Kim wrote:
> Hello,
>
> On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
>> The patch changes interpretation of:
>> callq *0x8(%rbx)
>>
>> from:
>> 0.26 │ → callq *8
>> to:
>> 0.26 │ → callq *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
>
> Also there's a case with no offset like: callq *%rbx
Yes. But this case is fine as strtoull returns 0 for that:
'If there were no digits at all, strtoul() stores the original value of nptr in *endptr (and returns 0).'
So ops->target.addr is then 0 and it's fine.
>
>
>>
>> Signed-off-by: Martin Liška <mliska@suse.cz>
>> ---
>> tools/perf/util/annotate.c | 10 ++++++++--
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
>
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>
>> indirect_call:
>> tok = strchr(endptr, '*');
>> - if (tok != NULL)
>> - ops->target.addr = strtoull(tok + 1, NULL, 16);
>> + if (tok != NULL) {
>> + endptr++;
>> +
>> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
>> + * Do not parse such instruction. */
>> + if (strstr(endptr, "(%r") == NULL)
>> + ops->target.addr = strtoull(endptr, NULL, 16);
>
> It seems too x86-specific, what about this? (not tested)
It is, I'm fine with that. I've just tested that for the callq *0x8(%rbx) example.
I'm sending patch for that version.
Martin
>
>
> indirect_call:
> tok = strchr(endptr, '*');
> if (tok != NULL) {
> endptr++;
> if (!isdigit(*endptr))
> return 0;
>
> addr = strtoull(endptr, &endptr, 0);
> if (*endptr != '('))
> ops->target.addr = addr;
>
>
> Thanks,
> Namhyung
>
>
>> + }
>> goto find_target;
>> }
>>
>>
>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Properly-interpret-indirect-call-in-perf-annotate.patch --]
[-- Type: text/x-patch; name="0001-Properly-interpret-indirect-call-in-perf-annotate.patch", Size: 1495 bytes --]
From 58a0eca544be8cc9e15b2ab5ecd9d9401ff4d2ec Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 23 Aug 2018 14:25:33 +0200
Subject: [PATCH] Properly interpret indirect call in perf annotate.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The patch changes interpretation of:
callq *0x8(%rbx)
from:
0.26 │ → callq *8
to:
0.26 │ → callq *0x8(%rbx)
in this can an address is followed by a register, thus
one can't parse only address.
Signed-off-by: Martin Liška <mliska@suse.cz>
---
tools/perf/util/annotate.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e4268b948e0e..18a8477d4664 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -212,6 +212,7 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
struct addr_map_symbol target = {
.map = map,
};
+ u64 addr;
ops->target.addr = strtoull(ops->raw, &endptr, 16);
@@ -246,8 +247,15 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
indirect_call:
tok = strchr(endptr, '*');
- if (tok != NULL)
- ops->target.addr = strtoull(tok + 1, NULL, 16);
+ if (tok != NULL) {
+ endptr++;
+ if (!isdigit(*endptr))
+ return 0;
+
+ addr = strtoull(endptr, &endptr, 0);
+ if (*endptr != '(')
+ ops->target.addr = addr;
+ }
goto find_target;
}
--
2.18.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
` (2 preceding siblings ...)
2018-08-27 10:37 ` Namhyung Kim
@ 2018-08-28 14:10 ` Arnaldo Carvalho de Melo
3 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:10 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq *0x8(%rbx)
>
> from:
> 0.26 │ → callq *8
> to:
> 0.26 │ → callq *0x8(%rbx)
>
> in this can an address is followed by a register, thus
> one can't parse only address.
>
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
> tools/perf/util/annotate.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
Please don't send the patch as an attachment, use 'git-format-patch',
I'm fixing this up this time.
- Arnaldo
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>
> indirect_call:
> tok = strchr(endptr, '*');
> - if (tok != NULL)
> - ops->target.addr = strtoull(tok + 1, NULL, 16);
> + if (tok != NULL) {
> + endptr++;
> +
> + /* Indirect call can use a non-rip register and offset: callq *0x8(%rbx).
> + * Do not parse such instruction. */
> + if (strstr(endptr, "(%r") == NULL)
> + ops->target.addr = strtoull(endptr, NULL, 16);
> + }
> goto find_target;
> }
>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-27 9:06 ` Martin Liška
@ 2018-08-28 14:10 ` Arnaldo Carvalho de Melo
2018-08-28 14:18 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:10 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> >> The patch changes interpretation of:
> >> callq *0x8(%rbx)
> >>
> >> from:
> >> 0.26 │ → callq *8
> >> to:
> >> 0.26 │ → callq *0x8(%rbx)
> >>
> >> in this can an address is followed by a register, thus
> >> one can't parse only address.
> >
> > Please mention one or two functions where such sequence appears, so that
> > others can reproduce your before/after more quickly,
>
> Sure, there's self-contained example on can compile (-O2) and test.
> It's following call in test function:
>
> test:
> .LFB1:
> .cfi_startproc
> movq %rdi, %rax
> subq $8, %rsp
> .cfi_def_cfa_offset 16
> movq %rsi, %rdi
> movq %rdx, %rsi
> call *8(%rax) <---- here
> cmpl $1, %eax
> adcl $-1, %eax
> addq $8, %rsp
> .cfi_def_cfa_offset 8
> ret
> .cfi_endproc
Here I'm getting:
Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
test /home/acme/c/perf-callq [Percent: local period]
0.17 │ mov %rdx,-0x28(%rbp)
0.58 │ mov -0x18(%rbp),%rax
7.90 │ mov 0x8(%rax),%rax
8.67 │ mov -0x28(%rbp),%rcx
│ mov -0x20(%rbp),%rdx
0.08 │ mov %rcx,%rsi
6.28 │ mov %rdx,%rdi
10.50 │ → callq *%rax
1.67 │ mov %eax,-0x4(%rbp)
11.95 │ cmpl $0x0,-0x4(%rbp)
8.14 │ ↓ je 3d
│ mov -0x4(%rbp),%eax
│ sub $0x1,%eax
│ ↓ jmp 42
│3d: mov $0x0,%eax
7.84 │42: leaveq
│ ← retq
Without the patch, will check if something changes with it.
- Arnaldo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
@ 2018-08-28 14:18 ` Arnaldo Carvalho de Melo
2018-08-28 17:55 ` Martin Liška
0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:18 UTC (permalink / raw)
To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa
Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> > On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> > >> The patch changes interpretation of:
> > >> callq *0x8(%rbx)
> > >>
> > >> from:
> > >> 0.26 │ → callq *8
> > >> to:
> > >> 0.26 │ → callq *0x8(%rbx)
<SNIP>
> > > Please mention one or two functions where such sequence appears, so that
> > > others can reproduce your before/after more quickly,
> > Sure, there's self-contained example on can compile (-O2) and test.
> > It's following call in test function:
> > test:
> > .LFB1:
> > .cfi_startproc
> > movq %rdi, %rax
> > subq $8, %rsp
> > .cfi_def_cfa_offset 16
> > movq %rsi, %rdi
> > movq %rdx, %rsi
> > call *8(%rax) <---- here
> > cmpl $1, %eax
> > adcl $-1, %eax
> > addq $8, %rsp
> > .cfi_def_cfa_offset 8
> > ret
> > .cfi_endproc
>
> Here I'm getting:
>
> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
> test /home/acme/c/perf-callq [Percent: local period]
> 0.17 │ mov %rdx,-0x28(%rbp)
> 0.58 │ mov -0x18(%rbp),%rax
> 7.90 │ mov 0x8(%rax),%rax
> 8.67 │ mov -0x28(%rbp),%rcx
> │ mov -0x20(%rbp),%rdx
> 0.08 │ mov %rcx,%rsi
> 6.28 │ mov %rdx,%rdi
> 10.50 │ → callq *%rax
> 1.67 │ mov %eax,-0x4(%rbp)
> 11.95 │ cmpl $0x0,-0x4(%rbp)
> 8.14 │ ↓ je 3d
> │ mov -0x4(%rbp),%eax
> │ sub $0x1,%eax
> │ ↓ jmp 42
> │3d: mov $0x0,%eax
> 7.84 │42: leaveq
> │ ← retq
>
> Without the patch, will check if something changes with it.
No changes with the patch, but then I did another test, ran a system
wide record for a while, then tested without/with your patch, with
--stdio2 redirecting to /tmp/{before,after} and got the expected
results, see below.
Thanks, applying,
- Arnaldo
--- /tmp/before 2018-08-28 11:16:03.238384143 -0300
+++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
@@ -13274,7 +13274,7 @@
↓ jle 128
hash_value = hash_table->hash_func (key);
mov 0x8(%rsp),%rdi
- 0.91 → callq *30
+ 0.91 → callq *0x30(%r12)
mov $0x2,%r8d
cmp $0x2,%eax
node_hash = hash_table->hashes[node_index];
@@ -13848,7 +13848,7 @@
mov %r14,%rdi
sub %rbx,%r13
mov %r13,%rdx
- → callq *38
+ → callq *0x38(%r15)
cmp %rax,%r13
1.91 ↓ je 240
1b4: mov $0xffffffff,%r13d
@@ -14026,7 +14026,7 @@
mov %rcx,-0x500(%rbp)
mov %r15,%rsi
mov %r14,%rdi
- → callq *38
+ → callq *0x38(%rax)
mov -0x500(%rbp),%rcx
cmp %rax,%rcx
↓ jne 9b0
<SNIP tons of other such cases>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] Properly interpret indirect call in perf annotate.
2018-08-28 14:18 ` Arnaldo Carvalho de Melo
@ 2018-08-28 17:55 ` Martin Liška
0 siblings, 0 replies; 11+ messages in thread
From: Martin Liška @ 2018-08-28 17:55 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users, lkml, Jiri Olsa
On 08/28/2018 04:18 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
>>> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>>>>> The patch changes interpretation of:
>>>>> callq *0x8(%rbx)
>>>>>
>>>>> from:
>>>>> 0.26 │ → callq *8
>>>>> to:
>>>>> 0.26 │ → callq *0x8(%rbx)
>
> <SNIP>
>
>>>> Please mention one or two functions where such sequence appears, so that
>>>> others can reproduce your before/after more quickly,
>
>>> Sure, there's self-contained example on can compile (-O2) and test.
>>> It's following call in test function:
>
>>> test:
>>> .LFB1:
>>> .cfi_startproc
>>> movq %rdi, %rax
>>> subq $8, %rsp
>>> .cfi_def_cfa_offset 16
>>> movq %rsi, %rdi
>>> movq %rdx, %rsi
>>> call *8(%rax) <---- here
>>> cmpl $1, %eax
>>> adcl $-1, %eax
>>> addq $8, %rsp
>>> .cfi_def_cfa_offset 8
>>> ret
>>> .cfi_endproc
>>
>> Here I'm getting:
>>
>> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
>> test /home/acme/c/perf-callq [Percent: local period]
>> 0.17 │ mov %rdx,-0x28(%rbp)
>> 0.58 │ mov -0x18(%rbp),%rax
>> 7.90 │ mov 0x8(%rax),%rax
>> 8.67 │ mov -0x28(%rbp),%rcx
>> │ mov -0x20(%rbp),%rdx
>> 0.08 │ mov %rcx,%rsi
>> 6.28 │ mov %rdx,%rdi
>> 10.50 │ → callq *%rax
>> 1.67 │ mov %eax,-0x4(%rbp)
>> 11.95 │ cmpl $0x0,-0x4(%rbp)
>> 8.14 │ ↓ je 3d
>> │ mov -0x4(%rbp),%eax
>> │ sub $0x1,%eax
>> │ ↓ jmp 42
>> │3d: mov $0x0,%eax
>> 7.84 │42: leaveq
>> │ ← retq
>>
>> Without the patch, will check if something changes with it.
Hi Arnaldo.
Thanks for re-sending of the patch and for the testing. The example I sent
is dependent on version of GCC compiler.
>
> No changes with the patch, but then I did another test, ran a system
> wide record for a while, then tested without/with your patch, with
> --stdio2 redirecting to /tmp/{before,after} and got the expected
> results, see below.
>
> Thanks, applying,
Good!
Martin
>
> - Arnaldo
>
> --- /tmp/before 2018-08-28 11:16:03.238384143 -0300
> +++ /tmp/after 2018-08-28 11:15:39.335341042 -0300
> @@ -13274,7 +13274,7 @@
> ↓ jle 128
> hash_value = hash_table->hash_func (key);
> mov 0x8(%rsp),%rdi
> - 0.91 → callq *30
> + 0.91 → callq *0x30(%r12)
> mov $0x2,%r8d
> cmp $0x2,%eax
> node_hash = hash_table->hashes[node_index];
> @@ -13848,7 +13848,7 @@
> mov %r14,%rdi
> sub %rbx,%r13
> mov %r13,%rdx
> - → callq *38
> + → callq *0x38(%r15)
> cmp %rax,%r13
> 1.91 ↓ je 240
> 1b4: mov $0xffffffff,%r13d
> @@ -14026,7 +14026,7 @@
> mov %rcx,-0x500(%rbp)
> mov %r15,%rsi
> mov %r14,%rdi
> - → callq *38
> + → callq *0x38(%rax)
> mov -0x500(%rbp),%rcx
> cmp %rax,%rcx
> ↓ jne 9b0
> <SNIP tons of other such cases>
>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-08-28 17:55 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-27 9:06 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
2018-08-28 14:18 ` Arnaldo Carvalho de Melo
2018-08-28 17:55 ` Martin Liška
2018-08-23 23:00 ` Kim Phillips
2018-08-23 23:00 ` Kim Phillips
2018-08-27 10:37 ` Namhyung Kim
2018-08-27 14:28 ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.