All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Properly interpret indirect call in perf annotate.
@ 2018-08-23 12:29 Martin Liška
  2018-08-23 14:12 ` Arnaldo Carvalho de Melo
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Martin Liška @ 2018-08-23 12:29 UTC (permalink / raw)
  To: linux-perf-users, lkml; +Cc: Arnaldo Carvalho de Melo, Jiri Olsa

[-- Attachment #1: Type: text/plain, Size: 364 bytes --]

The patch changes interpretation of:
callq  *0x8(%rbx)

from:
  0.26 │     → callq  *8
to:
  0.26 │     → callq  *0x8(%rbx)

in this can an address is followed by a register, thus
one can't parse only address.

Signed-off-by: Martin Liška <mliska@suse.cz>
---
 tools/perf/util/annotate.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)



[-- Attachment #2: 0001-Properly-interpret-indirect-call-in-perf-annotate.patch --]
[-- Type: text/x-patch, Size: 668 bytes --]

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e4268b948e0e..e32ead4744bd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
 
 indirect_call:
 	tok = strchr(endptr, '*');
-	if (tok != NULL)
-		ops->target.addr = strtoull(tok + 1, NULL, 16);
+	if (tok != NULL) {
+		endptr++;
+
+		/* Indirect call can use a non-rip register and offset: callq  *0x8(%rbx).
+		 * Do not parse such instruction.  */
+		if (strstr(endptr, "(%r") == NULL)
+			ops->target.addr = strtoull(endptr, NULL, 16);
+	}
 	goto find_target;
 }
 


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
@ 2018-08-23 14:12 ` Arnaldo Carvalho de Melo
  2018-08-27  9:06   ` Martin Liška
  2018-08-23 23:00   ` Kim Phillips
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-23 14:12 UTC (permalink / raw)
  To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa

Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq  *0x8(%rbx)
> 
> from:
>   0.26 │     → callq  *8
> to:
>   0.26 │     → callq  *0x8(%rbx)
> 
> in this can an address is followed by a register, thus
> one can't parse only address.

Please mention one or two functions where such sequence appears, so that
others can reproduce your before/after more quickly,

- Arnaldo
 
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
>  tools/perf/util/annotate.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> 

> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>  
>  indirect_call:
>  	tok = strchr(endptr, '*');
> -	if (tok != NULL)
> -		ops->target.addr = strtoull(tok + 1, NULL, 16);
> +	if (tok != NULL) {
> +		endptr++;
> +
> +		/* Indirect call can use a non-rip register and offset: callq  *0x8(%rbx).
> +		 * Do not parse such instruction.  */
> +		if (strstr(endptr, "(%r") == NULL)
> +			ops->target.addr = strtoull(endptr, NULL, 16);
> +	}
>  	goto find_target;
>  }
>  
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
@ 2018-08-23 23:00   ` Kim Phillips
  2018-08-23 23:00   ` Kim Phillips
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Kim Phillips @ 2018-08-23 23:00 UTC (permalink / raw)
  To: Martin Liška
  Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa

On Thu, 23 Aug 2018 14:29:34 +0200
Martin Liška <mliska@suse.cz> wrote:

> The patch changes interpretation of:
> callq  *0x8(%rbx)
> 
> from:
>   0.26 │     → callq  *8
> to:
>   0.26 │     → callq  *0x8(%rbx)
> 
> in this can an address is followed by a register, thus
> one can't parse only address.
> 
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---

Tested this doesn't incur any grievances parsing aarch64 code:

Tested-by: Kim Phillips <kim.phillips@arm.com>

Thanks,

Kim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
@ 2018-08-23 23:00   ` Kim Phillips
  0 siblings, 0 replies; 11+ messages in thread
From: Kim Phillips @ 2018-08-23 23:00 UTC (permalink / raw)
  To: Martin Liška
  Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa

On Thu, 23 Aug 2018 14:29:34 +0200
Martin Liška <mliska@suse.cz> wrote:

> The patch changes interpretation of:
> callq  *0x8(%rbx)
> 
> from:
>   0.26 │     → callq  *8
> to:
>   0.26 │     → callq  *0x8(%rbx)
> 
> in this can an address is followed by a register, thus
> one can't parse only address.
> 
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---

Tested this doesn't incur any grievances parsing aarch64 code:

Tested-by: Kim Phillips <kim.phillips@arm.com>

Thanks,

Kim

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-23 14:12 ` Arnaldo Carvalho de Melo
@ 2018-08-27  9:06   ` Martin Liška
  2018-08-28 14:10     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Martin Liška @ 2018-08-27  9:06 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users, lkml, Jiri Olsa

[-- Attachment #1: Type: text/plain, Size: 1887 bytes --]

On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>> The patch changes interpretation of:
>> callq  *0x8(%rbx)
>>
>> from:
>>   0.26 │     → callq  *8
>> to:
>>   0.26 │     → callq  *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
> 
> Please mention one or two functions where such sequence appears, so that
> others can reproduce your before/after more quickly,

Sure, there's self-contained example on can compile (-O2) and test.
It's following call in test function:

test:
.LFB1:
        .cfi_startproc
        movq    %rdi, %rax
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movq    %rsi, %rdi
        movq    %rdx, %rsi
        call    *8(%rax) <---- here
        cmpl    $1, %eax
        adcl    $-1, %eax
        addq    $8, %rsp
        .cfi_def_cfa_offset 8
        ret
        .cfi_endproc

Martin

> 
> - Arnaldo
>  
>> Signed-off-by: Martin Liška <mliska@suse.cz>
>> ---
>>  tools/perf/util/annotate.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
> 
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>  
>>  indirect_call:
>>  	tok = strchr(endptr, '*');
>> -	if (tok != NULL)
>> -		ops->target.addr = strtoull(tok + 1, NULL, 16);
>> +	if (tok != NULL) {
>> +		endptr++;
>> +
>> +		/* Indirect call can use a non-rip register and offset: callq  *0x8(%rbx).
>> +		 * Do not parse such instruction.  */
>> +		if (strstr(endptr, "(%r") == NULL)
>> +			ops->target.addr = strtoull(endptr, NULL, 16);
>> +	}
>>  	goto find_target;
>>  }
>>  
>>
> 

[-- Attachment #2: perf-callq.c --]
[-- Type: text/x-csrc, Size: 1271 bytes --]

typedef int (*htab_eq) (const void *, const void *);

static int eq (const void *a, const void *b)
{
  return a - b;
}

struct htab
{
  /* Current size (in entries) of the hash table */
  int size;

  /* Pointer to comparison function.  */
  htab_eq eq_f;

  /* Table itself.  */
  void **entries;

  /* Current number of elements including also deleted elements */
  int n_elements;

  /* Current number of deleted elements in the table */
  int n_deleted;

  /* The following member is used for debugging. Its value is number
     of all calls of `htab_find_slot' for the hash table. */
  unsigned int searches;

  /* The following member is used for debugging.  Its value is number
     of collisions fixed for time of work with the hash table. */
  unsigned int collisions;

  unsigned int max_size;

  /* This is non-zero if we are allowed to return NULL for function calls
     that allocate memory.  */
  int return_allocation_failure;
};

int
__attribute__ ((noinline))
test(struct htab *t, int *a, int *b)
{
  int r = t->eq_f (a, b);
  if (r)
    return r - 1;

  return 0;
}

struct htab mytable;
int r;

int main(int argc, char **argv)
{
  mytable.eq_f = &eq;
  for (unsigned i = 0; i < 100000000; i++)
    r += test (&mytable, &argc, &argc);
  
  return 0;
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
  2018-08-23 14:12 ` Arnaldo Carvalho de Melo
  2018-08-23 23:00   ` Kim Phillips
@ 2018-08-27 10:37 ` Namhyung Kim
  2018-08-27 14:28   ` Martin Liška
  2018-08-28 14:10 ` Arnaldo Carvalho de Melo
  3 siblings, 1 reply; 11+ messages in thread
From: Namhyung Kim @ 2018-08-27 10:37 UTC (permalink / raw)
  To: Martin Liška
  Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa, kernel-team

Hello,

On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
> The patch changes interpretation of:
> callq  *0x8(%rbx)
> 
> from:
>   0.26 │     → callq  *8
> to:
>   0.26 │     → callq  *0x8(%rbx)
> 
> in this can an address is followed by a register, thus
> one can't parse only address.

Also there's a case with no offset like:  callq  *%rbx


> 
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
>  tools/perf/util/annotate.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> 

> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>  
>  indirect_call:
>  	tok = strchr(endptr, '*');
> -	if (tok != NULL)
> -		ops->target.addr = strtoull(tok + 1, NULL, 16);
> +	if (tok != NULL) {
> +		endptr++;
> +
> +		/* Indirect call can use a non-rip register and offset: callq  *0x8(%rbx).
> +		 * Do not parse such instruction.  */
> +		if (strstr(endptr, "(%r") == NULL)
> +			ops->target.addr = strtoull(endptr, NULL, 16);

It seems too x86-specific, what about this? (not tested)


indirect_call:
	tok = strchr(endptr, '*');
	if (tok != NULL) {
		endptr++;
		if (!isdigit(*endptr))
			return 0;

		addr = strtoull(endptr, &endptr, 0);
		if (*endptr != '('))
			ops->target.addr = addr;


Thanks,
Namhyung


> +	}
>  	goto find_target;
>  }
>  
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-27 10:37 ` Namhyung Kim
@ 2018-08-27 14:28   ` Martin Liška
  0 siblings, 0 replies; 11+ messages in thread
From: Martin Liška @ 2018-08-27 14:28 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: linux-perf-users, lkml, Arnaldo Carvalho de Melo, Jiri Olsa, kernel-team

[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]

On 08/27/2018 12:37 PM, Namhyung Kim wrote:
> Hello,
> 
> On Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška wrote:
>> The patch changes interpretation of:
>> callq  *0x8(%rbx)
>>
>> from:
>>   0.26 │     → callq  *8
>> to:
>>   0.26 │     → callq  *0x8(%rbx)
>>
>> in this can an address is followed by a register, thus
>> one can't parse only address.
> 
> Also there's a case with no offset like:  callq  *%rbx

Yes. But this case is fine as strtoull returns 0 for that:
'If there were no digits at all, strtoul() stores the original value of nptr in *endptr (and returns 0).'
So ops->target.addr is then 0 and it's fine.

> 
> 
>>
>> Signed-off-by: Martin Liška <mliska@suse.cz>
>> ---
>>  tools/perf/util/annotate.c | 10 ++++++++--
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>>
> 
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index e4268b948e0e..e32ead4744bd 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>>  
>>  indirect_call:
>>  	tok = strchr(endptr, '*');
>> -	if (tok != NULL)
>> -		ops->target.addr = strtoull(tok + 1, NULL, 16);
>> +	if (tok != NULL) {
>> +		endptr++;
>> +
>> +		/* Indirect call can use a non-rip register and offset: callq  *0x8(%rbx).
>> +		 * Do not parse such instruction.  */
>> +		if (strstr(endptr, "(%r") == NULL)
>> +			ops->target.addr = strtoull(endptr, NULL, 16);
> 
> It seems too x86-specific, what about this? (not tested)

It is, I'm fine with that. I've just tested that for the callq  *0x8(%rbx) example.
I'm sending patch for that version.

Martin

> 
> 
> indirect_call:
> 	tok = strchr(endptr, '*');
> 	if (tok != NULL) {
> 		endptr++;
> 		if (!isdigit(*endptr))
> 			return 0;
> 
> 		addr = strtoull(endptr, &endptr, 0);
> 		if (*endptr != '('))
> 			ops->target.addr = addr;
> 
> 
> Thanks,
> Namhyung
> 
> 
>> +	}
>>  	goto find_target;
>>  }
>>  
>>
> 


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: 0001-Properly-interpret-indirect-call-in-perf-annotate.patch --]
[-- Type: text/x-patch; name="0001-Properly-interpret-indirect-call-in-perf-annotate.patch", Size: 1495 bytes --]

From 58a0eca544be8cc9e15b2ab5ecd9d9401ff4d2ec Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 23 Aug 2018 14:25:33 +0200
Subject: [PATCH] Properly interpret indirect call in perf annotate.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The patch changes interpretation of:
callq  *0x8(%rbx)

from:
  0.26 │     → callq  *8
to:
  0.26 │     → callq  *0x8(%rbx)

in this can an address is followed by a register, thus
one can't parse only address.

Signed-off-by: Martin Liška <mliska@suse.cz>
---
 tools/perf/util/annotate.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index e4268b948e0e..18a8477d4664 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -212,6 +212,7 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
 	struct addr_map_symbol target = {
 		.map = map,
 	};
+  u64 addr;
 
 	ops->target.addr = strtoull(ops->raw, &endptr, 16);
 
@@ -246,8 +247,15 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
 
 indirect_call:
 	tok = strchr(endptr, '*');
-	if (tok != NULL)
-		ops->target.addr = strtoull(tok + 1, NULL, 16);
+	if (tok != NULL) {
+		endptr++;
+		if (!isdigit(*endptr))
+			return 0;
+
+		addr = strtoull(endptr, &endptr, 0);
+		if (*endptr != '(')
+			ops->target.addr = addr;
+	}
 	goto find_target;
 }
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
                   ` (2 preceding siblings ...)
  2018-08-27 10:37 ` Namhyung Kim
@ 2018-08-28 14:10 ` Arnaldo Carvalho de Melo
  3 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:10 UTC (permalink / raw)
  To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa

Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> The patch changes interpretation of:
> callq  *0x8(%rbx)
> 
> from:
>   0.26 │     → callq  *8
> to:
>   0.26 │     → callq  *0x8(%rbx)
> 
> in this can an address is followed by a register, thus
> one can't parse only address.
> 
> Signed-off-by: Martin Liška <mliska@suse.cz>
> ---
>  tools/perf/util/annotate.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

Please don't send the patch as an attachment, use 'git-format-patch',
I'm fixing this up this time.

- Arnaldo
 
> 

> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index e4268b948e0e..e32ead4744bd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -246,8 +246,14 @@ static int call__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>  
>  indirect_call:
>  	tok = strchr(endptr, '*');
> -	if (tok != NULL)
> -		ops->target.addr = strtoull(tok + 1, NULL, 16);
> +	if (tok != NULL) {
> +		endptr++;
> +
> +		/* Indirect call can use a non-rip register and offset: callq  *0x8(%rbx).
> +		 * Do not parse such instruction.  */
> +		if (strstr(endptr, "(%r") == NULL)
> +			ops->target.addr = strtoull(endptr, NULL, 16);
> +	}
>  	goto find_target;
>  }
>  
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-27  9:06   ` Martin Liška
@ 2018-08-28 14:10     ` Arnaldo Carvalho de Melo
  2018-08-28 14:18       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:10 UTC (permalink / raw)
  To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa

Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> >> The patch changes interpretation of:
> >> callq  *0x8(%rbx)
> >>
> >> from:
> >>   0.26 │     → callq  *8
> >> to:
> >>   0.26 │     → callq  *0x8(%rbx)
> >>
> >> in this can an address is followed by a register, thus
> >> one can't parse only address.
> > 
> > Please mention one or two functions where such sequence appears, so that
> > others can reproduce your before/after more quickly,
> 
> Sure, there's self-contained example on can compile (-O2) and test.
> It's following call in test function:
> 
> test:
> .LFB1:
>         .cfi_startproc
>         movq    %rdi, %rax
>         subq    $8, %rsp
>         .cfi_def_cfa_offset 16
>         movq    %rsi, %rdi
>         movq    %rdx, %rsi
>         call    *8(%rax) <---- here
>         cmpl    $1, %eax
>         adcl    $-1, %eax
>         addq    $8, %rsp
>         .cfi_def_cfa_offset 8
>         ret
>         .cfi_endproc

Here I'm getting:

Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
test  /home/acme/c/perf-callq [Percent: local period]
  0.17 │      mov    %rdx,-0x28(%rbp)
  0.58 │      mov    -0x18(%rbp),%rax
  7.90 │      mov    0x8(%rax),%rax
  8.67 │      mov    -0x28(%rbp),%rcx
       │      mov    -0x20(%rbp),%rdx
  0.08 │      mov    %rcx,%rsi
  6.28 │      mov    %rdx,%rdi
 10.50 │    → callq  *%rax
  1.67 │      mov    %eax,-0x4(%rbp)
 11.95 │      cmpl   $0x0,-0x4(%rbp)
  8.14 │    ↓ je     3d
       │      mov    -0x4(%rbp),%eax
       │      sub    $0x1,%eax
       │    ↓ jmp    42
       │3d:   mov    $0x0,%eax
  7.84 │42:   leaveq
       │    ← retq

Without the patch, will check if something changes with it.

- Arnaldo



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-28 14:10     ` Arnaldo Carvalho de Melo
@ 2018-08-28 14:18       ` Arnaldo Carvalho de Melo
  2018-08-28 17:55         ` Martin Liška
  0 siblings, 1 reply; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-28 14:18 UTC (permalink / raw)
  To: Martin Liška; +Cc: linux-perf-users, lkml, Jiri Olsa

Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
> > On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
> > > Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
> > >> The patch changes interpretation of:
> > >> callq  *0x8(%rbx)
> > >>
> > >> from:
> > >>   0.26 │     → callq  *8
> > >> to:
> > >>   0.26 │     → callq  *0x8(%rbx)

<SNIP>

> > > Please mention one or two functions where such sequence appears, so that
> > > others can reproduce your before/after more quickly,

> > Sure, there's self-contained example on can compile (-O2) and test.
> > It's following call in test function:

> > test:
> > .LFB1:
> >         .cfi_startproc
> >         movq    %rdi, %rax
> >         subq    $8, %rsp
> >         .cfi_def_cfa_offset 16
> >         movq    %rsi, %rdi
> >         movq    %rdx, %rsi
> >         call    *8(%rax) <---- here
> >         cmpl    $1, %eax
> >         adcl    $-1, %eax
> >         addq    $8, %rsp
> >         .cfi_def_cfa_offset 8
> >         ret
> >         .cfi_endproc
> 
> Here I'm getting:
> 
> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
> test  /home/acme/c/perf-callq [Percent: local period]
>   0.17 │      mov    %rdx,-0x28(%rbp)
>   0.58 │      mov    -0x18(%rbp),%rax
>   7.90 │      mov    0x8(%rax),%rax
>   8.67 │      mov    -0x28(%rbp),%rcx
>        │      mov    -0x20(%rbp),%rdx
>   0.08 │      mov    %rcx,%rsi
>   6.28 │      mov    %rdx,%rdi
>  10.50 │    → callq  *%rax
>   1.67 │      mov    %eax,-0x4(%rbp)
>  11.95 │      cmpl   $0x0,-0x4(%rbp)
>   8.14 │    ↓ je     3d
>        │      mov    -0x4(%rbp),%eax
>        │      sub    $0x1,%eax
>        │    ↓ jmp    42
>        │3d:   mov    $0x0,%eax
>   7.84 │42:   leaveq
>        │    ← retq
> 
> Without the patch, will check if something changes with it.

No changes with the patch, but then I did another test, ran a system
wide record for a while, then tested without/with your patch, with
--stdio2 redirecting to /tmp/{before,after} and got the expected
results, see below.

Thanks, applying,

- Arnaldo

--- /tmp/before 2018-08-28 11:16:03.238384143 -0300
+++ /tmp/after  2018-08-28 11:15:39.335341042 -0300
@@ -13274,7 +13274,7 @@
              ↓ jle    128     
                hash_value = hash_table->hash_func (key);
                mov    0x8(%rsp),%rdi
-  0.91       → callq  *30     
+  0.91       → callq  *0x30(%r12)
                mov    $0x2,%r8d
                cmp    $0x2,%eax
                node_hash = hash_table->hashes[node_index];
@@ -13848,7 +13848,7 @@
                 mov    %r14,%rdi
                 sub    %rbx,%r13
                 mov    %r13,%rdx
-              → callq  *38      
+              → callq  *0x38(%r15)
                 cmp    %rax,%r13
   1.91        ↓ je     240      
          1b4:   mov    $0xffffffff,%r13d
@@ -14026,7 +14026,7 @@
                 mov    %rcx,-0x500(%rbp)
                 mov    %r15,%rsi
                 mov    %r14,%rdi
-              → callq  *38      
+              → callq  *0x38(%rax)
                 mov    -0x500(%rbp),%rcx
                 cmp    %rax,%rcx
               ↓ jne    9b0
<SNIP tons of other such cases>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Properly interpret indirect call in perf annotate.
  2018-08-28 14:18       ` Arnaldo Carvalho de Melo
@ 2018-08-28 17:55         ` Martin Liška
  0 siblings, 0 replies; 11+ messages in thread
From: Martin Liška @ 2018-08-28 17:55 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo; +Cc: linux-perf-users, lkml, Jiri Olsa

On 08/28/2018 04:18 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Aug 28, 2018 at 11:10:47AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Mon, Aug 27, 2018 at 11:06:21AM +0200, Martin Liška escreveu:
>>> On 08/23/2018 04:12 PM, Arnaldo Carvalho de Melo wrote:
>>>> Em Thu, Aug 23, 2018 at 02:29:34PM +0200, Martin Liška escreveu:
>>>>> The patch changes interpretation of:
>>>>> callq  *0x8(%rbx)
>>>>>
>>>>> from:
>>>>>    0.26 │     → callq  *8
>>>>> to:
>>>>>    0.26 │     → callq  *0x8(%rbx)
> 
> <SNIP>
> 
>>>> Please mention one or two functions where such sequence appears, so that
>>>> others can reproduce your before/after more quickly,
> 
>>> Sure, there's self-contained example on can compile (-O2) and test.
>>> It's following call in test function:
> 
>>> test:
>>> .LFB1:
>>>          .cfi_startproc
>>>          movq    %rdi, %rax
>>>          subq    $8, %rsp
>>>          .cfi_def_cfa_offset 16
>>>          movq    %rsi, %rdi
>>>          movq    %rdx, %rsi
>>>          call    *8(%rax) <---- here
>>>          cmpl    $1, %eax
>>>          adcl    $-1, %eax
>>>          addq    $8, %rsp
>>>          .cfi_def_cfa_offset 8
>>>          ret
>>>          .cfi_endproc
>>
>> Here I'm getting:
>>
>> Samples: 2K of event 'cycles:uppp', 4000 Hz, Event count (approx.): 1808551484
>> test  /home/acme/c/perf-callq [Percent: local period]
>>    0.17 │      mov    %rdx,-0x28(%rbp)
>>    0.58 │      mov    -0x18(%rbp),%rax
>>    7.90 │      mov    0x8(%rax),%rax
>>    8.67 │      mov    -0x28(%rbp),%rcx
>>         │      mov    -0x20(%rbp),%rdx
>>    0.08 │      mov    %rcx,%rsi
>>    6.28 │      mov    %rdx,%rdi
>>   10.50 │    → callq  *%rax
>>    1.67 │      mov    %eax,-0x4(%rbp)
>>   11.95 │      cmpl   $0x0,-0x4(%rbp)
>>    8.14 │    ↓ je     3d
>>         │      mov    -0x4(%rbp),%eax
>>         │      sub    $0x1,%eax
>>         │    ↓ jmp    42
>>         │3d:   mov    $0x0,%eax
>>    7.84 │42:   leaveq
>>         │    ← retq
>>
>> Without the patch, will check if something changes with it.

Hi Arnaldo.

Thanks for re-sending of the patch and for the testing. The example I sent
is dependent on version of GCC compiler.

> 
> No changes with the patch, but then I did another test, ran a system
> wide record for a while, then tested without/with your patch, with
> --stdio2 redirecting to /tmp/{before,after} and got the expected
> results, see below.
> 
> Thanks, applying,

Good!
Martin

> 
> - Arnaldo
> 
> --- /tmp/before 2018-08-28 11:16:03.238384143 -0300
> +++ /tmp/after  2018-08-28 11:15:39.335341042 -0300
> @@ -13274,7 +13274,7 @@
>                ↓ jle    128
>                  hash_value = hash_table->hash_func (key);
>                  mov    0x8(%rsp),%rdi
> -  0.91       → callq  *30
> +  0.91       → callq  *0x30(%r12)
>                  mov    $0x2,%r8d
>                  cmp    $0x2,%eax
>                  node_hash = hash_table->hashes[node_index];
> @@ -13848,7 +13848,7 @@
>                   mov    %r14,%rdi
>                   sub    %rbx,%r13
>                   mov    %r13,%rdx
> -              → callq  *38
> +              → callq  *0x38(%r15)
>                   cmp    %rax,%r13
>     1.91        ↓ je     240
>            1b4:   mov    $0xffffffff,%r13d
> @@ -14026,7 +14026,7 @@
>                   mov    %rcx,-0x500(%rbp)
>                   mov    %r15,%rsi
>                   mov    %r14,%rdi
> -              → callq  *38
> +              → callq  *0x38(%rax)
>                   mov    -0x500(%rbp),%rcx
>                   cmp    %rax,%rcx
>                 ↓ jne    9b0
> <SNIP tons of other such cases>
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-08-28 17:55 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-23 12:29 [PATCH] Properly interpret indirect call in perf annotate Martin Liška
2018-08-23 14:12 ` Arnaldo Carvalho de Melo
2018-08-27  9:06   ` Martin Liška
2018-08-28 14:10     ` Arnaldo Carvalho de Melo
2018-08-28 14:18       ` Arnaldo Carvalho de Melo
2018-08-28 17:55         ` Martin Liška
2018-08-23 23:00 ` Kim Phillips
2018-08-23 23:00   ` Kim Phillips
2018-08-27 10:37 ` Namhyung Kim
2018-08-27 14:28   ` Martin Liška
2018-08-28 14:10 ` Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.