linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf test: Remove atomics from test_loop to avoid test failures
@ 2023-11-02 16:22 Nick Forrington
  2023-11-03  9:14 ` James Clark
  2023-11-24 19:57 ` Michael Petlan
  0 siblings, 2 replies; 10+ messages in thread
From: Nick Forrington @ 2023-11-02 16:22 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users
  Cc: Nick Forrington, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Ian Rogers, Adrian Hunter,
	Arnaldo Carvalho de Melo

The current use of atomics can lead to test failures, as tests (such as
tests/shell/record.sh) search for samples with "test_loop" as the
top-most stack frame, but find frames related to the atomic operation
(e.g. __aarch64_ldadd4_relax).

This change simply removes the "count" variable, as it is not necessary.

Fixes: 1962ab6f6e0b ("perf test workload thloop: Make count increments atomic")
Signed-off-by: Nick Forrington <nick.forrington@arm.com>
---
 tools/perf/tests/workloads/thloop.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/perf/tests/workloads/thloop.c b/tools/perf/tests/workloads/thloop.c
index af05269c2eb8..457b29f91c3e 100644
--- a/tools/perf/tests/workloads/thloop.c
+++ b/tools/perf/tests/workloads/thloop.c
@@ -7,7 +7,6 @@
 #include "../tests.h"
 
 static volatile sig_atomic_t done;
-static volatile unsigned count;
 
 /* We want to check this symbol in perf report */
 noinline void test_loop(void);
@@ -19,8 +18,7 @@ static void sighandler(int sig __maybe_unused)
 
 noinline void test_loop(void)
 {
-	while (!done)
-		__atomic_fetch_add(&count, 1, __ATOMIC_RELAXED);
+	while (!done);
 }
 
 static void *thfunc(void *arg)
-- 
2.42.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-02 16:22 [PATCH] perf test: Remove atomics from test_loop to avoid test failures Nick Forrington
@ 2023-11-03  9:14 ` James Clark
  2023-11-21 17:04   ` Arnaldo Carvalho de Melo
  2023-11-24 19:57 ` Michael Petlan
  1 sibling, 1 reply; 10+ messages in thread
From: James Clark @ 2023-11-03  9:14 UTC (permalink / raw)
  To: Nick Forrington, linux-kernel, linux-perf-users
  Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Arnaldo Carvalho de Melo



On 02/11/2023 16:22, Nick Forrington wrote:
> The current use of atomics can lead to test failures, as tests (such as
> tests/shell/record.sh) search for samples with "test_loop" as the
> top-most stack frame, but find frames related to the atomic operation
> (e.g. __aarch64_ldadd4_relax).
> 
> This change simply removes the "count" variable, as it is not necessary.
> 
> Fixes: 1962ab6f6e0b ("perf test workload thloop: Make count increments atomic")
> Signed-off-by: Nick Forrington <nick.forrington@arm.com>
> ---
>   tools/perf/tests/workloads/thloop.c | 4 +---
>   1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/tools/perf/tests/workloads/thloop.c b/tools/perf/tests/workloads/thloop.c
> index af05269c2eb8..457b29f91c3e 100644
> --- a/tools/perf/tests/workloads/thloop.c
> +++ b/tools/perf/tests/workloads/thloop.c
> @@ -7,7 +7,6 @@
>   #include "../tests.h"
>   
>   static volatile sig_atomic_t done;
> -static volatile unsigned count;
>   
>   /* We want to check this symbol in perf report */
>   noinline void test_loop(void);
> @@ -19,8 +18,7 @@ static void sighandler(int sig __maybe_unused)
>   
>   noinline void test_loop(void)
>   {
> -	while (!done)
> -		__atomic_fetch_add(&count, 1, __ATOMIC_RELAXED);
> +	while (!done);
>   }
>   
>   static void *thfunc(void *arg)

Reviewed-by: James Clark <james.clark@arm.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-03  9:14 ` James Clark
@ 2023-11-21 17:04   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-21 17:04 UTC (permalink / raw)
  To: James Clark
  Cc: Nick Forrington, linux-kernel, linux-perf-users, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Arnaldo Carvalho de Melo

Em Fri, Nov 03, 2023 at 09:14:45AM +0000, James Clark escreveu:
> On 02/11/2023 16:22, Nick Forrington wrote:
> > The current use of atomics can lead to test failures, as tests (such as
> > tests/shell/record.sh) search for samples with "test_loop" as the
> > top-most stack frame, but find frames related to the atomic operation
> > (e.g. __aarch64_ldadd4_relax).

> > This change simply removes the "count" variable, as it is not necessary.

> > Fixes: 1962ab6f6e0b ("perf test workload thloop: Make count increments atomic")
> > Signed-off-by: Nick Forrington <nick.forrington@arm.com>

> > +++ b/tools/perf/tests/workloads/thloop.c
> > @@ -7,7 +7,6 @@
> >   #include "../tests.h"
> >   static volatile sig_atomic_t done;
> > -static volatile unsigned count;
> >   /* We want to check this symbol in perf report */
> >   noinline void test_loop(void);
> > @@ -19,8 +18,7 @@ static void sighandler(int sig __maybe_unused)
> >   noinline void test_loop(void)
> >   {
> > -	while (!done)
> > -		__atomic_fetch_add(&count, 1, __ATOMIC_RELAXED);
> > +	while (!done);
> >   }
> >   static void *thfunc(void *arg)
 
> Reviewed-by: James Clark <james.clark@arm.com>

Thanks, applied to perf-tools-next.

- Arnaldo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-02 16:22 [PATCH] perf test: Remove atomics from test_loop to avoid test failures Nick Forrington
  2023-11-03  9:14 ` James Clark
@ 2023-11-24 19:57 ` Michael Petlan
  2023-11-25  3:05   ` Leo Yan
  2023-11-27 10:45   ` James Clark
  1 sibling, 2 replies; 10+ messages in thread
From: Michael Petlan @ 2023-11-24 19:57 UTC (permalink / raw)
  To: Nick Forrington
  Cc: linux-kernel, linux-perf-users, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter,
	Arnaldo Carvalho de Melo, vmolnaro

On Thu, 2 Nov 2023, Nick Forrington wrote:
> The current use of atomics can lead to test failures, as tests (such as
> tests/shell/record.sh) search for samples with "test_loop" as the
> top-most stack frame, but find frames related to the atomic operation
> (e.g. __aarch64_ldadd4_relax).
> 
> This change simply removes the "count" variable, as it is not necessary.

Hello.

I believe that it was there to prevent the compiler to optimize the loop
out or some reason like that. Hopefully, it will work even without that
on all architectures with all compilers that are used for building perf...

I also think that the tests should be designed in a more robust way, so
that they aren't directly dependent on exact stack frames, e.g. let's look
at the "inet_pton" testcase...

The inet_pton test result check algorithm is designed to rely on exact
stacktrace, without a possibility to specify something like "we want this
and that in the stack trace, but otherwise, it does not matter which
wrappers are aroung". Then there must be the following code to adjust
the expected output exactly per architecture:

    echo "ping[][0-9 \.:]+$event_name: \([[:xdigit:]]+\)" > $expected
    echo ".*inet_pton\+0x[[:xdigit:]]+[[:space:]]\($libc|inlined\)$" >> $expected
    case "$(uname -m)" in
      s390x)
        eventattr='call-graph=dwarf,max-stack=4'
        echo "(__GI_)?getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\($libc|inlined\)$" >> $expected
        echo "main\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$" >> $expected
      ;;
      ppc64|ppc64le)
        eventattr='max-stack=4'
        echo "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> $expected
        echo "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> $expected
        echo ".*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$" >> $expected
      ;;
      *)
        eventattr='max-stack=3'
        echo ".*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$" >> $expected
      ;;
    esac

Of course, since it relies on libc version, then we need patches, such as
    1f85d016768f ("perf test record+probe_libc_inet_pton: Fix call chain match on x86_64")
    311693ce81c9 ("perf test record+probe_libc_inet_pton: Fix call chain match on s390")
    fb710ddee75f ("perf test record_probe_libc_inet_pton: Fix test on s/390 where 'text_to_binary_address' now appears on the backtrace")
    ...
which break the test when used with older libc...

I think that a better design of such test is either probing some program
of ourselves that has predictable and stable function call sequence or
be more robust in checking the stack trace.

Thoughts?

Michael

> 
> Fixes: 1962ab6f6e0b ("perf test workload thloop: Make count increments atomic")
> Signed-off-by: Nick Forrington <nick.forrington@arm.com>
> ---
>  tools/perf/tests/workloads/thloop.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/tools/perf/tests/workloads/thloop.c b/tools/perf/tests/workloads/thloop.c
> index af05269c2eb8..457b29f91c3e 100644
> --- a/tools/perf/tests/workloads/thloop.c
> +++ b/tools/perf/tests/workloads/thloop.c
> @@ -7,7 +7,6 @@
>  #include "../tests.h"
>  
>  static volatile sig_atomic_t done;
> -static volatile unsigned count;
>  
>  /* We want to check this symbol in perf report */
>  noinline void test_loop(void);
> @@ -19,8 +18,7 @@ static void sighandler(int sig __maybe_unused)
>  
>  noinline void test_loop(void)
>  {
> -	while (!done)
> -		__atomic_fetch_add(&count, 1, __ATOMIC_RELAXED);
> +	while (!done);
>  }
>  
>  static void *thfunc(void *arg)
> -- 
> 2.42.0
> 
> 
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-24 19:57 ` Michael Petlan
@ 2023-11-25  3:05   ` Leo Yan
  2023-11-25 19:10     ` Nick Forrington
  2023-11-27 10:45   ` James Clark
  1 sibling, 1 reply; 10+ messages in thread
From: Leo Yan @ 2023-11-25  3:05 UTC (permalink / raw)
  To: Michael Petlan
  Cc: Nick Forrington, linux-kernel, linux-perf-users, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Arnaldo Carvalho de Melo, vmolnaro

Hi all,

On Fri, Nov 24, 2023 at 08:57:52PM +0100, Michael Petlan wrote:
> On Thu, 2 Nov 2023, Nick Forrington wrote:
> > The current use of atomics can lead to test failures, as tests (such as
> > tests/shell/record.sh) search for samples with "test_loop" as the
> > top-most stack frame, but find frames related to the atomic operation
> > (e.g. __aarch64_ldadd4_relax).

I am confused by above description.  As I went through the script
record.sh, which is the only test invoking the program 'test_loop',
but I don't find any test is related with stack frame.

Do I miss anything?  I went through record.sh but no clue why the
failure is caused by stack frame.  All the testings use command:

  if ! perf report -i "${perfdata}" -q | grep -q "${testsym}"
    ...
  fi

@Nick, could you narrow down which specific test case causing the
failure.

[...]

> I believe that it was there to prevent the compiler to optimize the loop
> out or some reason like that. Hopefully, it will work even without that
> on all architectures with all compilers that are used for building perf...

Agreed.

As said above, I'd like to step back a bit for making clear what's the
exactly failure caused by the program.

Thanks,
Leo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-25  3:05   ` Leo Yan
@ 2023-11-25 19:10     ` Nick Forrington
  2023-11-26  7:41       ` Leo Yan
  0 siblings, 1 reply; 10+ messages in thread
From: Nick Forrington @ 2023-11-25 19:10 UTC (permalink / raw)
  To: Leo Yan, Michael Petlan
  Cc: linux-kernel, linux-perf-users, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter,
	Arnaldo Carvalho de Melo, vmolnaro


On 25/11/2023 03:05, Leo Yan wrote:
> Hi all,
>
> On Fri, Nov 24, 2023 at 08:57:52PM +0100, Michael Petlan wrote:
>> On Thu, 2 Nov 2023, Nick Forrington wrote:
>>> The current use of atomics can lead to test failures, as tests (such as
>>> tests/shell/record.sh) search for samples with "test_loop" as the
>>> top-most stack frame, but find frames related to the atomic operation
>>> (e.g. __aarch64_ldadd4_relax).
> I am confused by above description.  As I went through the script
> record.sh, which is the only test invoking the program 'test_loop',
> but I don't find any test is related with stack frame.
>
> Do I miss anything?  I went through record.sh but no clue why the
> failure is caused by stack frame.  All the testings use command:
>
>    if ! perf report -i "${perfdata}" -q | grep -q "${testsym}"
>      ...
>    fi
>
> @Nick, could you narrow down which specific test case causing the
> failure.
>
> [...]


All checks for ${testsym} in record.sh (including the example you 
provide) can fail, as the expected symbol (test_loop) is not the 
top-most function on the stack (and therefore not the symbol associated 
with the sample).


Example perf report output:

# Overhead  Command  Shared Object          Symbol
# ........  .......  ..................... .............................
#
     99.53%  perf     perf                   [.] __aarch64_ldadd4_relax

...


You can see the issue when recording/reporting with call stacks:

# Children      Self  Command  Shared Object          Symbol
# ........  ........  .......  ..................... 
..........................................................
#
     99.52%    99.52%  perf     perf                   [.] 
__aarch64_ldadd4_relax
             |
             |--49.77%--0xffffb905a5dc
             |          0xffffb8ff0aec
             |          thfunc
             |          test_loop
             |          __aarch64_ldadd4_relax

...

>
>> I believe that it was there to prevent the compiler to optimize the loop
>> out or some reason like that. Hopefully, it will work even without that
>> on all architectures with all compilers that are used for building perf...
> Agreed.
>
> As said above, I'd like to step back a bit for making clear what's the
> exactly failure caused by the program.


I don't think this loop could be sensibly optimised away, as it depends 
on "done", which is defined at file scope (and assigned by a signal 
handler).


Cheers,
Nick

>
> Thanks,
> Leo
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-25 19:10     ` Nick Forrington
@ 2023-11-26  7:41       ` Leo Yan
  2023-11-27 13:20         ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: Leo Yan @ 2023-11-26  7:41 UTC (permalink / raw)
  To: Nick Forrington
  Cc: Michael Petlan, linux-kernel, linux-perf-users, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	Adrian Hunter, Arnaldo Carvalho de Melo, vmolnaro

On Sat, Nov 25, 2023 at 07:10:25PM +0000, Nick Forrington wrote:

[...]

> > @Nick, could you narrow down which specific test case causing the
> > failure.
> > 
> > [...]
> 
> All checks for ${testsym} in record.sh (including the example you provide)
> can fail, as the expected symbol (test_loop) is not the top-most function on
> the stack (and therefore not the symbol associated with the sample).
> 
> 
> Example perf report output:
> 
> # Overhead  Command  Shared Object          Symbol
> # ........  .......  ..................... .............................
> #
>     99.53%  perf     perf                   [.] __aarch64_ldadd4_relax

Thanks for confirmation.

I cannot reproduce the issue on my Juno board with Debian (buster).
It's likely because I don't use GCC toolchain with enabling
'-moutline-atomics' AArch64 flag [1].

> ...
> 
> 
> You can see the issue when recording/reporting with call stacks:
> 
> # Children      Self  Command  Shared Object          Symbol
> # ........  ........  .......  .....................
> ..........................................................
> #
>     99.52%    99.52%  perf     perf                   [.]
> __aarch64_ldadd4_relax
>             |
>             |--49.77%--0xffffb905a5dc
>             |          0xffffb8ff0aec
>             |          thfunc
>             |          test_loop
>             |          __aarch64_ldadd4_relax



> ...
> 
> > 
> > > I believe that it was there to prevent the compiler to optimize the loop
> > > out or some reason like that. Hopefully, it will work even without that
> > > on all architectures with all compilers that are used for building perf...
> > Agreed.
> > 
> > As said above, I'd like to step back a bit for making clear what's the
> > exactly failure caused by the program.
> 
> I don't think this loop could be sensibly optimised away, as it depends on
> "done", which is defined at file scope (and assigned by a signal handler).

I verified your patch with 'perf annotate'.

The disassembly on Arm64 is:

    noinline void test_loop(void)               
    {                                           
      stp  x29, x30, [sp, #-32]!                
      mov  x29, sp                              
      adrp x0, options+0x4c0                    
      ldr  x0, [x0, #3664]                      
      ldr  x1, [x0]                             
      str  x1, [sp, #24]                        
      mov  x1, #0x0                        // #0
    while (!done);                              
      nop                                       
20:   adrp x0, spacing.22251                    
      add  x0, x0, #0xa04                       
      ldr  w0, [x0]                             
      cmp  w0, #0x0                             
    ↑ b.eq 20                                   
    } 

The disassembly on x86_64 is:

    noinline void test_loop(void)
    {
      endbr64
      push    %rbp
      mov     %rsp,%rbp
      sub     $0x10,%rsp
      mov     %fs:0x28,%rax
      mov     %rax,-0x8(%rbp)
      xor     %eax,%eax
    while (!done);
      nop
1c:   mov     done,%eax
      test    %eax,%eax
    ↑ je      1c
    }


Maybe the commit log caused a bit confusion, the problem is after
enabling "-moutline-atomics" on aarch64, the overhead is altered into
the linked __aarch64_ldadd4_relax() function, test_loop() cannot be
sampled anymore, but it's not about stack tracing.

Anyway, the patch is fine for me.

Thanks,
Leo

[1] https://stackoverflow.com/questions/65239845/how-to-enable-mno-outline-atomics-aarch64-flag

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-24 19:57 ` Michael Petlan
  2023-11-25  3:05   ` Leo Yan
@ 2023-11-27 10:45   ` James Clark
  1 sibling, 0 replies; 10+ messages in thread
From: James Clark @ 2023-11-27 10:45 UTC (permalink / raw)
  To: Michael Petlan, Nick Forrington
  Cc: linux-kernel, linux-perf-users, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers, Adrian Hunter,
	Arnaldo Carvalho de Melo, vmolnaro



On 24/11/2023 19:57, Michael Petlan wrote:
> On Thu, 2 Nov 2023, Nick Forrington wrote:
>> The current use of atomics can lead to test failures, as tests (such as
>> tests/shell/record.sh) search for samples with "test_loop" as the
>> top-most stack frame, but find frames related to the atomic operation
>> (e.g. __aarch64_ldadd4_relax).
>>
>> This change simply removes the "count" variable, as it is not necessary.
> 
> Hello.
> 
> I believe that it was there to prevent the compiler to optimize the loop
> out or some reason like that. Hopefully, it will work even without that
> on all architectures with all compilers that are used for building perf...
> 

If that optimisation ever happens and there is a concrete case, I think
it should be treated like any other test regression and fixed at that
time when it's known what the specifics of that issue are. As Nick says
in a later comment, the loop can't be optimised out because it depends
on the done variable and the function is noinline.

> I also think that the tests should be designed in a more robust way, so
> that they aren't directly dependent on exact stack frames, e.g. let's look
> at the "inet_pton" testcase...

This one doesn't look like it's dependent on exact stack frames, but
just one frame at the end, which unless something is actually broken in
Perf causing a failure, I don't see how could be made any more robust.

> 
> The inet_pton test result check algorithm is designed to rely on exact
> stacktrace, without a possibility to specify something like "we want this
> and that in the stack trace, but otherwise, it does not matter which
> wrappers are aroung". Then there must be the following code to adjust
> the expected output exactly per architecture:
> 
>     echo "ping[][0-9 \.:]+$event_name: \([[:xdigit:]]+\)" > $expected
>     echo ".*inet_pton\+0x[[:xdigit:]]+[[:space:]]\($libc|inlined\)$" >> $expected
>     case "$(uname -m)" in
>       s390x)
>         eventattr='call-graph=dwarf,max-stack=4'
>         echo "(__GI_)?getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\($libc|inlined\)$" >> $expected
>         echo "main\+0x[[:xdigit:]]+[[:space:]]\(.*/bin/ping.*\)$" >> $expected
>       ;;
>       ppc64|ppc64le)
>         eventattr='max-stack=4'
>         echo "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> $expected
>         echo "getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\($libc\)$" >> $expected
>         echo ".*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$" >> $expected
>       ;;
>       *)
>         eventattr='max-stack=3'
>         echo ".*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$" >> $expected
>       ;;
>     esac
> 
> Of course, since it relies on libc version, then we need patches, such as
>     1f85d016768f ("perf test record+probe_libc_inet_pton: Fix call chain match on x86_64")
>     311693ce81c9 ("perf test record+probe_libc_inet_pton: Fix call chain match on s390")
>     fb710ddee75f ("perf test record_probe_libc_inet_pton: Fix test on s/390 where 'text_to_binary_address' now appears on the backtrace")
>     ...
> which break the test when used with older libc...
> 

I definitely think that relying on external libraries for stacks in
tests can be annoying to keep up to date, but that's exactly what we
just removed from test_loop.

For the inet_pton test it seems to be specifically testing probing libc,
so maybe making the test program ourselves would make the test much less
valuable. And then as long as all the tests are not like that and there
is only this one to keep up to date, maybe it's not that bad.

> I think that a better design of such test is either probing some program
> of ourselves that has predictable and stable function call sequence or
> be more robust in checking the stack trace.
> 
> Thoughts?
> 
> Michael
> 
>>
>> Fixes: 1962ab6f6e0b ("perf test workload thloop: Make count increments atomic")
>> Signed-off-by: Nick Forrington <nick.forrington@arm.com>
>> ---
>>  tools/perf/tests/workloads/thloop.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/tests/workloads/thloop.c b/tools/perf/tests/workloads/thloop.c
>> index af05269c2eb8..457b29f91c3e 100644
>> --- a/tools/perf/tests/workloads/thloop.c
>> +++ b/tools/perf/tests/workloads/thloop.c
>> @@ -7,7 +7,6 @@
>>  #include "../tests.h"
>>  
>>  static volatile sig_atomic_t done;
>> -static volatile unsigned count;
>>  
>>  /* We want to check this symbol in perf report */
>>  noinline void test_loop(void);
>> @@ -19,8 +18,7 @@ static void sighandler(int sig __maybe_unused)
>>  
>>  noinline void test_loop(void)
>>  {
>> -	while (!done)
>> -		__atomic_fetch_add(&count, 1, __ATOMIC_RELAXED);
>> +	while (!done);
>>  }
>>  
>>  static void *thfunc(void *arg)
>> -- 
>> 2.42.0
>>
>>
>>
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-26  7:41       ` Leo Yan
@ 2023-11-27 13:20         ` Arnaldo Carvalho de Melo
  2023-11-27 13:29           ` Leo Yan
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-27 13:20 UTC (permalink / raw)
  To: Leo Yan
  Cc: Nick Forrington, Michael Petlan, linux-kernel, linux-perf-users,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Arnaldo Carvalho de Melo, vmolnaro

Em Sun, Nov 26, 2023 at 03:41:14PM +0800, Leo Yan escreveu:
> Maybe the commit log caused a bit confusion, the problem is after

We'll have the Link pointing to this discussion.

> enabling "-moutline-atomics" on aarch64, the overhead is altered into
> the linked __aarch64_ldadd4_relax() function, test_loop() cannot be
> sampled anymore, but it's not about stack tracing.
> 
> Anyway, the patch is fine for me.

I'm taking this as an Acked-by: Leo

But probably this could be even a Tested-by, ok?

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] perf test: Remove atomics from test_loop to avoid test failures
  2023-11-27 13:20         ` Arnaldo Carvalho de Melo
@ 2023-11-27 13:29           ` Leo Yan
  0 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2023-11-27 13:29 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Nick Forrington, Michael Petlan, linux-kernel, linux-perf-users,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Arnaldo Carvalho de Melo, vmolnaro

On Mon, Nov 27, 2023 at 10:20:47AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Sun, Nov 26, 2023 at 03:41:14PM +0800, Leo Yan escreveu:
> > Maybe the commit log caused a bit confusion, the problem is after
> 
> We'll have the Link pointing to this discussion.

Okay, it's a good tracking.

> > enabling "-moutline-atomics" on aarch64, the overhead is altered into
> > the linked __aarch64_ldadd4_relax() function, test_loop() cannot be
> > sampled anymore, but it's not about stack tracing.
> > 
> > Anyway, the patch is fine for me.
> 
> I'm taking this as an Acked-by: Leo
> 
> But probably this could be even a Tested-by, ok?

Yes, here is my test tag:

Tested-by: Leo Yan <leo.yan@linaro.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2023-11-27 13:30 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-02 16:22 [PATCH] perf test: Remove atomics from test_loop to avoid test failures Nick Forrington
2023-11-03  9:14 ` James Clark
2023-11-21 17:04   ` Arnaldo Carvalho de Melo
2023-11-24 19:57 ` Michael Petlan
2023-11-25  3:05   ` Leo Yan
2023-11-25 19:10     ` Nick Forrington
2023-11-26  7:41       ` Leo Yan
2023-11-27 13:20         ` Arnaldo Carvalho de Melo
2023-11-27 13:29           ` Leo Yan
2023-11-27 10:45   ` James Clark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).