* blktrace: exit directly when nthreads_running != ncpus in run_tracers()
From: lijinlin @ 2021-06-28 13:25 UTC (permalink / raw)
  To: linux-btrace

From: lijinlin <lijinlin3@huawei.com>

We found that blktrace gets stuck when a cgroup restricts the CPUs that
blktrace may use. The messages and stack are:
[root@localhost ~]# blktrace -w 10 -o- /dev/sda
FAILED to start thread on CPU 1: 22/Invalid argument
FAILED to start thread on CPU 2: 22/Invalid argument
[root@localhost ~]# cat /proc/1385110/stack
[<0>] __switch_to+0xe8/0x150
[<0>] futex_wait_queue_me+0xd4/0x158
[<0>] futex_wait+0xf4/0x230
[<0>] do_futex+0x470/0x900
[<0>] __arm64_sys_futex+0x13c/0x188
[<0>] el0_svc_common+0x80/0x200
[<0>] el0_svc_handler+0x78/0xe0
[<0>] el0_svc+0x10/0x260
[<0>] 0xffffffffffffffff
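
The "22/Invalid argument" failures above are consistent with a thread
asking to run on a CPU outside the set it is allowed to use. A minimal
sketch, assuming Linux with glibc, of how the allowed set can be inspected
before binding a thread. The helper names cpu_is_allowed() and
count_allowed_cpus() are hypothetical and are not part of blktrace:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for cpu_set_t helpers and sched_getaffinity() */
#endif
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper (not blktrace code): returns 1 if the calling
 * thread may currently run on 'cpu', 0 if it may not, -1 on error.
 */
static int cpu_is_allowed(int cpu)
{
	cpu_set_t allowed;

	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return 0;

	CPU_ZERO(&allowed);
	/* pid 0 means "the calling thread" */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	return CPU_ISSET(cpu, &allowed) ? 1 : 0;
}

/* Hypothetical helper: number of CPUs the caller may run on, -1 on error. */
static int count_allowed_cpus(void)
{
	cpu_set_t allowed;

	CPU_ZERO(&allowed);
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return -1;

	return CPU_COUNT(&allowed);
}
```

Under a cpuset restriction, count_allowed_cpus() would be expected to be
smaller than the number of online CPUs, and a bind request for any CPU
where cpu_is_allowed() returns 0 would be expected to fail.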

blktrace fails to start these threads because they cannot be bound to
the restricted CPUs. In this case, blktrace does not schedule the alarm
that would set the variable 'done' to 1 after the defined time.
We debugged the code and found the call trace below:
main()
   =>run_tracers()
      =>wait_tracers()
         =>process_trace_bufs()
            =>wait_empty_entries()
               =>t_pthread_cond_wait()
blktrace was set to piped output, so the process is stuck in
wait_empty_entries(), waiting for the variable 'done' to be set to 1.

To fix the problem, set the variable 'done' to 1 in run_tracers() when
'nthreads_running' is not equal to 'ncpus'.
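
The hang and the fix can be illustrated with a small stand-alone model.
This is only a sketch under stated assumptions, not the blktrace source:
wait_for_done() and failed_start_path() are hypothetical names, and the
bounded poll count exists only so that the model terminates where the real
waiter would keep waiting.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() in strict modes */
#endif
#include <assert.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int done;	/* protected by 'lock' */

/*
 * Hypothetical model of a waiter that loops until 'done' is set.
 * Gives up after 'max_polls' timed waits of 10 ms each.
 * Returns 1 if 'done' was observed, 0 if the waiter gave up.
 */
static int wait_for_done(int max_polls)
{
	int seen;

	pthread_mutex_lock(&lock);
	for (int i = 0; !done && i < max_polls; i++) {
		struct timespec deadline;

		clock_gettime(CLOCK_REALTIME, &deadline);
		deadline.tv_nsec += 10 * 1000 * 1000;
		if (deadline.tv_nsec >= 1000000000L) {
			deadline.tv_sec += 1;
			deadline.tv_nsec -= 1000000000L;
		}
		pthread_cond_timedwait(&cond, &lock, &deadline);
	}
	seen = done;
	pthread_mutex_unlock(&lock);

	return seen;
}

/*
 * Hypothetical model of the failure branch: no alarm is armed, so
 * 'done' is only ever set if the branch sets it itself.
 */
static int failed_start_path(int set_done_on_failure, int max_polls)
{
	pthread_mutex_lock(&lock);
	done = set_done_on_failure ? 1 : 0;
	pthread_mutex_unlock(&lock);

	return wait_for_done(max_polls);
}
```

In this model, failed_start_path(0, n) never observes 'done' however large
n is, which corresponds to the reported hang, while
failed_start_path(1, n) returns without waiting.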

Signed-off-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Lixiaokeng <lixiaokeng@huawei.com>
---
 blktrace.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/blktrace.c b/blktrace.c
index 82a6aad..3444fbb 100644
--- a/blktrace.c
+++ b/blktrace.c
@@ -2705,8 +2705,10 @@ static int run_tracers(void)
                        printf("blktrace: connected!\n");
                if (stop_watch)
                        alarm(stop_watch);
-       } else
+       } else {
                stop_tracers();
+               done = 1;
+       }

        wait_tracers();
        if (nthreads_running == ncpus)
--
2.23.0


* Re: blktrace: exit directly when nthreads_running != ncpus in run_tracers()
From: Jens Axboe @ 2021-06-28 19:42 UTC (permalink / raw)
  To: linux-btrace

On 6/28/21 7:25 AM, lijinlin wrote:
> From: lijinlin <lijinlin3@huawei.com>
> 
> We found that blktrace gets stuck when a cgroup restricts the CPUs that
> blktrace may use. The messages and stack are:
> [root@localhost ~]# blktrace -w 10 -o- /dev/sda
> FAILED to start thread on CPU 1: 22/Invalid argument
> FAILED to start thread on CPU 2: 22/Invalid argument
> [root@localhost ~]# cat /proc/1385110/stack
> [<0>] __switch_to+0xe8/0x150
> [<0>] futex_wait_queue_me+0xd4/0x158
> [<0>] futex_wait+0xf4/0x230
> [<0>] do_futex+0x470/0x900
> [<0>] __arm64_sys_futex+0x13c/0x188
> [<0>] el0_svc_common+0x80/0x200
> [<0>] el0_svc_handler+0x78/0xe0
> [<0>] el0_svc+0x10/0x260
> [<0>] 0xffffffffffffffff
> 
> blktrace fails to start these threads because they cannot be bound to
> the restricted CPUs. In this case, blktrace does not schedule the alarm
> that would set the variable 'done' to 1 after the defined time.
> We debugged the code and found the call trace below:
> main()
>    =>run_tracers()
>       =>wait_tracers()
>          =>process_trace_bufs()
>             =>wait_empty_entries()
>                =>t_pthread_cond_wait()
> blktrace was set to piped output, so the process is stuck in
> wait_empty_entries(), waiting for the variable 'done' to be set to 1.
> 
> To fix the problem, set the variable 'done' to 1 in run_tracers() when
> 'nthreads_running' is not equal to 'ncpus'.

Applied, thanks.

-- 
Jens Axboe
