From mboxrd@z Thu Jan 1 00:00:00 1970
From: Olivier Matz
Subject: Re: pthread_barrier_deadlock in -rc1
Date: Wed, 2 May 2018 11:20:11 +0200
Message-ID: <20180502092011.5nxl5nbka6zfi4hb@neon>
References: <20180403130439.11151-1-olivier.matz@6wind.com>
 <20180424144651.13145-1-olivier.matz@6wind.com>
 <4256B2F0-EF9D-4B22-AC1A-D440C002360A@6wind.com>
 <39d5baf8-2bad-6df8-0419-a06c65d41475@redhat.com>
 <2d828aa1-482f-7f19-1909-c3ca4599c9b2@intel.com>
 <393a2f7e-ed20-fa28-0b07-aa3374593d5a@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Cc: "Tan, Jianfeng", dev@dpdk.org, Anatoly Burakov, Thomas Monjalon
To: Maxime Coquelin
Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67])
 by dpdk.org (Postfix) with ESMTP id B986A23B;
 Wed, 2 May 2018 11:20:17 +0200 (CEST)
Content-Disposition: inline
In-Reply-To: <393a2f7e-ed20-fa28-0b07-aa3374593d5a@redhat.com>
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

Hi Jianfeng,

On Wed, May 02, 2018 at 11:05:12AM +0200, Maxime Coquelin wrote:
>
>
> On 05/02/2018 10:50 AM, Tan, Jianfeng wrote:
> > Hi Maxime,
> >
> >
> > On 5/2/2018 4:34 PM, Maxime Coquelin wrote:
> > > Hi Jiangfeng,
> > >
> > > On 05/02/2018 10:19 AM, Tan, Jianfeng wrote:
> > > >
> > > >
> > > > On 5/1/2018 2:46 AM, Olivier Matz wrote:
> > > > > Hi Maxime,
> > > > >
> > > > > On 30 April 2018 17:45:52 GMT+02:00, Maxime Coquelin wrote:
> > > > > > Hi Olivier,
> > > > > >
> > > > > > On 04/24/2018 04:46 PM, Olivier Matz wrote:
> > > > > > > Some parts of dpdk use their own management threads. Most of the time,
> > > > > > > the affinity of the thread is not properly set: it should not be scheduled
> > > > > > > on the dataplane cores, because interrupting them can cause packet losses.
> > > > > > > This patchset introduces a new wrapper for thread creation that does > > > > > > > the job automatically, avoiding code duplication. > > > > > > > > > > > > > > v3: > > > > > > > * new patch: use this API in examples when relevant. > > > > > > > * replace pthread_kill by pthread_cancel. Note that pthread_join() > > > > > > >     is still needed. > > > > > > > * rebase: vfio and pdump do not have control pthreads anymore, and > > > > > > eal > > > > > > >     has 2 new pthreads > > > > > > > * remove all calls to snprintf/strlcpy that truncate the thread name: > > > > > > >     all strings lengths are already < 16. > > > > > > > > > > > > > > v2: > > > > > > > * set affinity to master core if no core is off, as suggested by > > > > > > >     Anatoly > > > > > > > > > > > > > > Olivier Matz (5): > > > > > > >     eal: use sizeof to avoid a double use of a define > > > > > > >     eal: new function to create control threads > > > > > > >     eal: set name when creating a control thread > > > > > > >     eal: set affinity for control threads > > > > > > >     examples: use new API to create control threads > > > > > > > > > > > > > >    drivers/net/kni/Makefile                     |  1 + > > > > > > >    drivers/net/kni/rte_eth_kni.c                |  3 +- > > > > > > >    examples/tep_termination/main.c              | 16 +++---- > > > > > > >    examples/vhost/main.c                        | 19 +++----- > > > > > > >    lib/librte_eal/bsdapp/eal/eal.c              |  4 +- > > > > > > >    lib/librte_eal/bsdapp/eal/eal_thread.c       |  2 +- > > > > > > >    lib/librte_eal/common/eal_common_proc.c      | 15 ++---- > > > > > > >    lib/librte_eal/common/eal_common_thread.c    | 72 > > > > > > ++++++++++++++++++++++++++++ > > > > > > > lib/librte_eal/common/include/rte_lcore.h    | 26 ++++++++++ > > > > > > >    lib/librte_eal/linuxapp/eal/eal.c            |  4 +- > > > > > > >    lib/librte_eal/linuxapp/eal/eal_interrupts.c | 17 ++----- > > > > > > >    
lib/librte_eal/linuxapp/eal/eal_thread.c     |  2 +- > > > > > > >    lib/librte_eal/linuxapp/eal/eal_timer.c      | 12 +---- > > > > > > >    lib/librte_eal/rte_eal_version.map           |  1 + > > > > > > >    lib/librte_vhost/socket.c                    | 25 ++-------- > > > > > > >    15 files changed, 135 insertions(+), 84 deletions(-) > > > > > > > > > > > > > I face a deadlock issue with your series, that Jianfeng patch does not > > > > > > resolve ("eal: fix threads block on barrier"). Reverting > > > > > > the series and > > > > > > Jianfeng patch makes the issue to disappear. > > > > > > > > > > > > I face the problem in a VM (not seen on the host): > > > > > > # ./install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 --proc-type > > > > > > auto > > > > > > --file-prefix pg -- --portmask=3 --forward-mode=macswap > > > > > > --port-topology=chained --disable-rss -i --rxq=1 --txq=1 --rxd=256 > > > > > > --txd=256 --nb-cores=2 --auto-start > > > > > > EAL: Detected 3 lcore(s) > > > > > > EAL: Detected 1 NUMA nodes > > > > > > EAL: Auto-detected process type: PRIMARY > > > > > > EAL: Multi-process socket /var/run/.pg_unix > > > > > > > > > > > > > > > > > > Then it is stuck. 
Attaching with GDB, I get below backtrace > > > > > > information: > > > > > > > > > > > > (gdb) info threads > > > > > >    Id   Target Id         Frame > > > > > >    3    Thread 0x7f63e1f9f700 (LWP 8808) "rte_mp_handle" > > > > > > 0x00007f63e2591bfd in recvmsg () at > > > > > > ../sysdeps/unix/syscall-template.S:81 > > > > > >    2    Thread 0x7f63e179e700 (LWP 8809) "rte_mp_async" > > > > > > pthread_barrier_wait () at > > > > > > ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 > > > > > > * 1    Thread 0x7f63e32cec00 (LWP 8807) "testpmd" pthread_barrier_wait > > > > > > () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 > > > > > > (gdb) bt full > > > > > > #0  pthread_barrier_wait () at > > > > > > ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 > > > > > > No locals. > > > > > > #1  0x0000000000520c54 in rte_ctrl_thread_create > > > > > > (thread=thread@entry=0x7ffe5c895020, name=name@entry=0x869d86 > > > > > > "rte_mp_async", attr=attr@entry=0x0, > > > > > > start_routine=start_routine@entry=0x521030 , > > > > > > arg=arg@entry=0x0) > > > > > >      at /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:207 > > > > > >          params = 0x17b1e40 > > > > > >          lcore_id = > > > > > >          cpuset = {__bits = {1, 0 }} > > > > > >          cpu_found = > > > > > >          ret = 0 > > > > > > #2  0x00000000005220b6 in rte_mp_channel_init () at > > > > > > /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:674 > > > > > >         path = "/var/run\000.pg_unix_*", '\000' > > > > > 1301 times>... 
> > > > > >          dir_fd = 4 > > > > > >          mp_handle_tid = 140066969745152 > > > > > >          async_reply_handle_tid = 140066961352448 > > > > > > #3  0x000000000050c227 in rte_eal_init (argc=argc@entry=23, > > > > > > argv=argv@entry=0x7ffe5c896378) at > > > > > > /root/src/dpdk/lib/librte_eal/linuxapp/eal/eal.c:775 > > > > > >          i = > > > > > >          fctret = 11 > > > > > >          ret = > > > > > >          thread_id = 140066989861888 > > > > > >          run_once = {cnt = 1} > > > > > >          logid = 0x17b1e00 "testpmd" > > > > > >          cpuset = "T}\211\\\376\177", '\000' , > > > > > > "\020", '\000' ... > > > > > >       thread_name = > > > > > > "X}\211\\\376\177\000\000\226\301\036\342c\177\000" > > > > > >          __func__ = "rte_eal_init" > > > > > > #4  0x0000000000473214 in main (argc=23, argv=0x7ffe5c896378) at > > > > > > /root/src/dpdk/app/test-pmd/testpmd.c:2597 > > > > > >          diag = > > > > > >          port_id = > > > > > >          ret = > > > > > >          __func__ = "main" > > > > > > (gdb) thread 2 > > > > > > [Switching to thread 2 (Thread 0x7f63e179e700 (LWP 8809))] > > > > > > #0  pthread_barrier_wait () at > > > > > > ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 > > > > > > 71        cmpl    %edx, (%rdi) > > > > > > (gdb) bt full > > > > > > #0  pthread_barrier_wait () at > > > > > > ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 > > > > > > No locals. 
> > > > > > #1  0x0000000000520777 in rte_thread_init (arg=) at > > > > > > /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:156 > > > > > >          params = > > > > > >          start_routine = 0x521030 > > > > > >          routine_arg = 0x0 > > > > > > #2  0x00007f63e258add5 in start_thread (arg=0x7f63e179e700) at > > > > > > pthread_create.c:308 > > > > > >          __res = > > > > > >          pd = 0x7f63e179e700 > > > > > >          now = > > > > > >          unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140066961352448, > > > > > > 1212869169857371576, 0, 8392704, 0, 140066961352448, > > > > > > -1291626103561052744, -1291619793368703560}, > > > > > > mask_was_saved = 0}}, priv > > > > > > > > > > > > = {pad = {0x0, 0x0, 0x0, 0x0}, data = { > > > > > >                prev = 0x0, cleanup = 0x0, canceltype = 0}}} > > > > > >          not_first_call = > > > > > >          pagesize_m1 = > > > > > >          sp = > > > > > >          freesize = > > > > > > #3  0x00007f63e22b4b3d in clone () at > > > > > > ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 > > > > > > No locals. > > > > > > (gdb) thread 3 > > > > > > [Switching to thread 3 (Thread 0x7f63e1f9f700 (LWP 8808))] > > > > > > #0  0x00007f63e2591bfd in recvmsg () at > > > > > > ../sysdeps/unix/syscall-template.S:81 > > > > > > 81    T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS) > > > > > > (gdb) bt full > > > > > > #0  0x00007f63e2591bfd in recvmsg () at > > > > > > ../sysdeps/unix/syscall-template.S:81 > > > > > > No locals. 
> > > > > > #1  0x000000000052194e in read_msg (s=0x7f63e1f9d3b0, > > > > > > m=0x7f63e1f9d5a0) > > > > > > > > > > > > at /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:258 > > > > > >          msglen = > > > > > >          control = > > > > > > "\000\000\000\000\000\000\000\000\336~\f\343c\177\000\000\005", '\000' > > > > > > , "\360\371\033\342c\177\000" > > > > > >          cmsg = > > > > > >          iov = {iov_base = 0x7f63e1f9d5a0, iov_len = 332} > > > > > >        msgh = {msg_name = 0x7f63e1f9d3b0, msg_namelen = 110, msg_iov = > > > > > > 0x7f63e1f9d370, msg_iovlen = 1, msg_control = 0x7f63e1f9d380, > > > > > > msg_controllen = 48, msg_flags = 0} > > > > > > #2  mp_handle (arg=) at > > > > > > /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:346 > > > > > >          msg = {type = 0, msg = {name = '\000' , > > > > > > len_param = 0, num_fds = 0, param = '\000' , "\002", > > > > > > '\000' , fds = {0, 0, 0, 0, 0, 0, 0, 0}}} > > > > > >          sa = {sun_family = 55104, > > > > > >            sun_path = > > > > > > "\371\341c\177\000\000\352\372\f\343c\177\000\000\000\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377\000\367\371\341c\177\000\000\030\000\000\000\000\000\000\000p\327\371\341c\177\000\000\000\367\371\341c\177\000\000\000\367\371\341c\177", > > > > > > > > > > > > > > > > > > '\000' , "\200\037\000\000\377\377"} > > > > > > #3  0x00007f63e258add5 in start_thread (arg=0x7f63e1f9f700) at > > > > > > pthread_create.c:308 > > > > > >          __res = > > > > > >          pd = 0x7f63e1f9f700 > > > > > >          now = > > > > > >          unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140066969745152, > > > > > > 1212869169857371576, 0, 8392704, 0, 140066969745152, > > > > > > -1291625004586295880, -1291619793368703560}, > > > > > > mask_was_saved = 0}}, priv > > > > > > > > > > > > = {pad = {0x0, 0x0, 0x0, 0x0}, data = { > > > > > >                prev = 0x0, cleanup = 0x0, canceltype = 0}}} > > > > > >          not_first_call = 
> > > > > >          pagesize_m1 =
> > > > > >          sp =
> > > > > >          freesize =
> > > > > > #4  0x00007f63e22b4b3d in clone () at
> > > > > > ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> > > > > > No locals.
> > > > > >
> > > > > > I don't have more info for now.
> > > > > >
> > > > >
> > > > > Thanks for the feedback on this issue. I don't see obvious
> > > > > reason for this deadlock yet.
> > > > >
> > > > > I'll investigate it asap (not tomorrow, but wednesday). In
> > > > > the worst case, we can revert the series if I cannot find
> > > > > the root cause rapidly.
> > > >
> > > > I might think that the suggestion from Stephen of destroying the
> > > > barrier can help this issue. I'll try to reproduce it and test
> > > > it before sending a patch to fix it.
> > >
> > > In case you don't reproduce, feel free to send me the patch to test it.
> >
> > Below patch can fix another strange sigsegv issue in my VM. Please check
> > if it works for you. I doubt it's use-after-free problem which could
> > lead to different issues in different env. Please have a try.
> >
> >
> > diff --git a/lib/librte_eal/common/eal_common_thread.c b/lib/librte_eal/common/eal_common_thread.c
> > index de69452..d91b67d 100644
> > --- a/lib/librte_eal/common/eal_common_thread.c
> > +++ b/lib/librte_eal/common/eal_common_thread.c
> > @@ -205,6 +205,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
> >                 goto fail;
> >
> >         pthread_barrier_wait(&params->configured);
> > +       pthread_barrier_destroy(&params->configured);
>
> Thanks Jianfeng, that fixes my issue.
> For correctness, I wonder whether we should check pthread_barrier_wait
> return, and only call destroy() if PTHREAD_BARRIER_SERIAL_THREAD?
> And so also do same the same thing in rte_thread_init().
>
> What do you think?
> Thanks,
> Maxime

Thanks for the update.
I also have a patch that replaces the barrier with a lock, which could
also work, but if Jianfeng's patch fixes the issue, I think it is the
better option.

About PTHREAD_BARRIER_SERIAL_THREAD, I'm not sure it would change
anything. From the pthread_barrier_wait() specification:

  Upon successful completion, the pthread_barrier_wait() function
  shall return PTHREAD_BARRIER_SERIAL_THREAD for a single (arbitrary)
  thread synchronized at the barrier and zero for each of the other
  threads. Otherwise, an error number shall be returned to indicate
  the error.

I understand this as guaranteeing that exactly one thread gets
PTHREAD_BARRIER_SERIAL_THREAD, but not necessarily the last one.
So if destroy() is always called in the parent thread, the result
should be the same, no?

By the way, there is also a small memory leak that was introduced by
the previous patch, maybe you can add the fix too:

-	if (ret != 0)
+	if (ret != 0) {
+		free(params);
 		return ret;
+	}

Olivier

>
> >         free(params);
> >
> >         return 0;
> >
> > Thanks,
> > Jianfeng
> >
> > >
> > > Thanks,
> > > Maxime
> > >
> > > > Thanks,
> > > > Jianfeng
> > > >
> > > > >
> > > > > Olivier
> > > > >
> > > > > >