From mboxrd@z Thu Jan 1 00:00:00 1970 From: Maxime Coquelin Subject: Re: pthread_barrier_deadlock in -rc1 Date: Wed, 2 May 2018 11:05:12 +0200 Message-ID: <393a2f7e-ed20-fa28-0b07-aa3374593d5a@redhat.com> References: <20180403130439.11151-1-olivier.matz@6wind.com> <20180424144651.13145-1-olivier.matz@6wind.com> <4256B2F0-EF9D-4B22-AC1A-D440C002360A@6wind.com> <39d5baf8-2bad-6df8-0419-a06c65d41475@redhat.com> <2d828aa1-482f-7f19-1909-c3ca4599c9b2@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Cc: Anatoly Burakov , Thomas Monjalon To: "Tan, Jianfeng" , Olivier Matz , dev@dpdk.org Return-path: Received: from mx1.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by dpdk.org (Postfix) with ESMTP id EB44716E for ; Wed, 2 May 2018 11:05:15 +0200 (CEST) In-Reply-To: <2d828aa1-482f-7f19-1909-c3ca4599c9b2@intel.com> Content-Language: en-US List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 05/02/2018 10:50 AM, Tan, Jianfeng wrote: > Hi Maxime, > > > On 5/2/2018 4:34 PM, Maxime Coquelin wrote: >> Hi Jiangfeng, >> >> On 05/02/2018 10:19 AM, Tan, Jianfeng wrote: >>> >>> >>> On 5/1/2018 2:46 AM, Olivier Matz wrote: >>>> Hi Maxime, >>>> >>>> Le 30 avril 2018 17:45:52 GMT+02:00, Maxime Coquelin >>>> a écrit : >>>>> Hi Olivier, >>>>> >>>>> On 04/24/2018 04:46 PM, Olivier Matz wrote: >>>>>> Some parts of dpdk use their own management threads. Most of the >>>>> time, >>>>>> the affinity of the thread is not properly set: it should not be >>>>> scheduled >>>>>> on the dataplane cores, because interrupting them can cause packet >>>>> losses. >>>>>> This patchset introduces a new wrapper for thread creation that does >>>>>> the job automatically, avoiding code duplication. >>>>>> >>>>>> v3: >>>>>> * new patch: use this API in examples when relevant. >>>>>> * replace pthread_kill by pthread_cancel. Note that pthread_join() >>>>>>     is still needed. >>>>>> * rebase: vfio and pdump do not have control pthreads anymore, and >>>>> eal >>>>>>     has 2 new pthreads >>>>>> * remove all calls to snprintf/strlcpy that truncate the thread name: >>>>>>     all strings lengths are already < 16. >>>>>> >>>>>> v2: >>>>>> * set affinity to master core if no core is off, as suggested by >>>>>>     Anatoly >>>>>> >>>>>> Olivier Matz (5): >>>>>>     eal: use sizeof to avoid a double use of a define >>>>>>     eal: new function to create control threads >>>>>>     eal: set name when creating a control thread >>>>>>     eal: set affinity for control threads >>>>>>     examples: use new API to create control threads >>>>>> >>>>>>    drivers/net/kni/Makefile                     |  1 + >>>>>>    drivers/net/kni/rte_eth_kni.c                |  3 +- >>>>>>    examples/tep_termination/main.c              | 16 +++---- >>>>>>    examples/vhost/main.c                        | 19 +++----- >>>>>>    lib/librte_eal/bsdapp/eal/eal.c              |  4 +- >>>>>>    lib/librte_eal/bsdapp/eal/eal_thread.c       |  2 +- >>>>>>    lib/librte_eal/common/eal_common_proc.c      | 15 ++---- >>>>>>    lib/librte_eal/common/eal_common_thread.c    | 72 >>>>> ++++++++++++++++++++++++++++ >>>>>> lib/librte_eal/common/include/rte_lcore.h    | 26 ++++++++++ >>>>>>    lib/librte_eal/linuxapp/eal/eal.c            |  4 +- >>>>>>    lib/librte_eal/linuxapp/eal/eal_interrupts.c | 17 ++----- >>>>>>    lib/librte_eal/linuxapp/eal/eal_thread.c     |  2 +- >>>>>>    lib/librte_eal/linuxapp/eal/eal_timer.c      | 12 +---- >>>>>>    lib/librte_eal/rte_eal_version.map           |  1 + >>>>>>    lib/librte_vhost/socket.c                    | 25 ++-------- >>>>>>    15 files changed, 135 insertions(+), 84 deletions(-) >>>>>> >>>>> I face a deadlock issue with your series, that Jianfeng patch does not >>>>> resolve ("eal: fix threads block on barrier"). Reverting the series >>>>> and >>>>> Jianfeng patch makes the issue to disappear. >>>>> >>>>> I face the problem in a VM (not seen on the host): >>>>> # ./install/bin/testpmd -l 0,1,2 --socket-mem 1024 -n 4 --proc-type >>>>> auto >>>>> --file-prefix pg -- --portmask=3 --forward-mode=macswap >>>>> --port-topology=chained --disable-rss -i --rxq=1 --txq=1 --rxd=256 >>>>> --txd=256 --nb-cores=2 --auto-start >>>>> EAL: Detected 3 lcore(s) >>>>> EAL: Detected 1 NUMA nodes >>>>> EAL: Auto-detected process type: PRIMARY >>>>> EAL: Multi-process socket /var/run/.pg_unix >>>>> >>>>> >>>>> Then it is stuck. Attaching with GDB, I get below backtrace >>>>> information: >>>>> >>>>> (gdb) info threads >>>>>    Id   Target Id         Frame >>>>>    3    Thread 0x7f63e1f9f700 (LWP 8808) "rte_mp_handle" >>>>> 0x00007f63e2591bfd in recvmsg () at >>>>> ../sysdeps/unix/syscall-template.S:81 >>>>>    2    Thread 0x7f63e179e700 (LWP 8809) "rte_mp_async" >>>>> pthread_barrier_wait () at >>>>> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >>>>> * 1    Thread 0x7f63e32cec00 (LWP 8807) "testpmd" pthread_barrier_wait >>>>> () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >>>>> (gdb) bt full >>>>> #0  pthread_barrier_wait () at >>>>> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >>>>> No locals. >>>>> #1  0x0000000000520c54 in rte_ctrl_thread_create >>>>> (thread=thread@entry=0x7ffe5c895020, name=name@entry=0x869d86 >>>>> "rte_mp_async", attr=attr@entry=0x0, >>>>> start_routine=start_routine@entry=0x521030 , >>>>> arg=arg@entry=0x0) >>>>>      at /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:207 >>>>>          params = 0x17b1e40 >>>>>          lcore_id = >>>>>          cpuset = {__bits = {1, 0 }} >>>>>          cpu_found = >>>>>          ret = 0 >>>>> #2  0x00000000005220b6 in rte_mp_channel_init () at >>>>> /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:674 >>>>>         path = "/var/run\000.pg_unix_*", '\000' >>>> times>... >>>>>          dir_fd = 4 >>>>>          mp_handle_tid = 140066969745152 >>>>>          async_reply_handle_tid = 140066961352448 >>>>> #3  0x000000000050c227 in rte_eal_init (argc=argc@entry=23, >>>>> argv=argv@entry=0x7ffe5c896378) at >>>>> /root/src/dpdk/lib/librte_eal/linuxapp/eal/eal.c:775 >>>>>          i = >>>>>          fctret = 11 >>>>>          ret = >>>>>          thread_id = 140066989861888 >>>>>          run_once = {cnt = 1} >>>>>          logid = 0x17b1e00 "testpmd" >>>>>          cpuset = "T}\211\\\376\177", '\000' , >>>>> "\020", '\000' ... >>>>>       thread_name = >>>>> "X}\211\\\376\177\000\000\226\301\036\342c\177\000" >>>>>          __func__ = "rte_eal_init" >>>>> #4  0x0000000000473214 in main (argc=23, argv=0x7ffe5c896378) at >>>>> /root/src/dpdk/app/test-pmd/testpmd.c:2597 >>>>>          diag = >>>>>          port_id = >>>>>          ret = >>>>>          __func__ = "main" >>>>> (gdb) thread 2 >>>>> [Switching to thread 2 (Thread 0x7f63e179e700 (LWP 8809))] >>>>> #0  pthread_barrier_wait () at >>>>> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >>>>> 71        cmpl    %edx, (%rdi) >>>>> (gdb) bt full >>>>> #0  pthread_barrier_wait () at >>>>> ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_barrier_wait.S:71 >>>>> No locals. >>>>> #1  0x0000000000520777 in rte_thread_init (arg=) at >>>>> /root/src/dpdk/lib/librte_eal/common/eal_common_thread.c:156 >>>>>          params = >>>>>          start_routine = 0x521030 >>>>>          routine_arg = 0x0 >>>>> #2  0x00007f63e258add5 in start_thread (arg=0x7f63e179e700) at >>>>> pthread_create.c:308 >>>>>          __res = >>>>>          pd = 0x7f63e179e700 >>>>>          now = >>>>>          unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140066961352448, >>>>> 1212869169857371576, 0, 8392704, 0, 140066961352448, >>>>> -1291626103561052744, -1291619793368703560}, mask_was_saved = 0}}, >>>>> priv >>>>> >>>>> = {pad = {0x0, 0x0, 0x0, 0x0}, data = { >>>>>                prev = 0x0, cleanup = 0x0, canceltype = 0}}} >>>>>          not_first_call = >>>>>          pagesize_m1 = >>>>>          sp = >>>>>          freesize = >>>>> #3  0x00007f63e22b4b3d in clone () at >>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 >>>>> No locals. >>>>> (gdb) thread 3 >>>>> [Switching to thread 3 (Thread 0x7f63e1f9f700 (LWP 8808))] >>>>> #0  0x00007f63e2591bfd in recvmsg () at >>>>> ../sysdeps/unix/syscall-template.S:81 >>>>> 81    T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS) >>>>> (gdb) bt full >>>>> #0  0x00007f63e2591bfd in recvmsg () at >>>>> ../sysdeps/unix/syscall-template.S:81 >>>>> No locals. >>>>> #1  0x000000000052194e in read_msg (s=0x7f63e1f9d3b0, >>>>> m=0x7f63e1f9d5a0) >>>>> >>>>> at /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:258 >>>>>          msglen = >>>>>          control = >>>>> "\000\000\000\000\000\000\000\000\336~\f\343c\177\000\000\005", '\000' >>>>> , "\360\371\033\342c\177\000" >>>>>          cmsg = >>>>>          iov = {iov_base = 0x7f63e1f9d5a0, iov_len = 332} >>>>>        msgh = {msg_name = 0x7f63e1f9d3b0, msg_namelen = 110, msg_iov = >>>>> 0x7f63e1f9d370, msg_iovlen = 1, msg_control = 0x7f63e1f9d380, >>>>> msg_controllen = 48, msg_flags = 0} >>>>> #2  mp_handle (arg=) at >>>>> /root/src/dpdk/lib/librte_eal/common/eal_common_proc.c:346 >>>>>          msg = {type = 0, msg = {name = '\000' , >>>>> len_param = 0, num_fds = 0, param = '\000' , "\002", >>>>> '\000' , fds = {0, 0, 0, 0, 0, 0, 0, 0}}} >>>>>          sa = {sun_family = 55104, >>>>>            sun_path = >>>>> "\371\341c\177\000\000\352\372\f\343c\177\000\000\000\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377\000\367\371\341c\177\000\000\030\000\000\000\000\000\000\000p\327\371\341c\177\000\000\000\367\371\341c\177\000\000\000\367\371\341c\177", >>>>> >>>>> >>>>> '\000' , "\200\037\000\000\377\377"} >>>>> #3  0x00007f63e258add5 in start_thread (arg=0x7f63e1f9f700) at >>>>> pthread_create.c:308 >>>>>          __res = >>>>>          pd = 0x7f63e1f9f700 >>>>>          now = >>>>>          unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140066969745152, >>>>> 1212869169857371576, 0, 8392704, 0, 140066969745152, >>>>> -1291625004586295880, -1291619793368703560}, mask_was_saved = 0}}, >>>>> priv >>>>> >>>>> = {pad = {0x0, 0x0, 0x0, 0x0}, data = { >>>>>                prev = 0x0, cleanup = 0x0, canceltype = 0}}} >>>>>          not_first_call = >>>>>          pagesize_m1 = >>>>>          sp = >>>>>          freesize = >>>>> #4  0x00007f63e22b4b3d in clone () at >>>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113 >>>>> No locals. >>>>> >>>>> I don't have more info for now. >>>>> >>>> >>>> Thanks for the feedback on this issue. I don't see obvious reason >>>> for this deadlock yet. >>>> >>>> I'll investigate it asap (not tomorrow, but wednesday). In the worst >>>> case, we can revert the series if I cannot find the root cause rapidly. >>> >>> I might think that the suggestion from Stephen of destroying the >>> barrier can help this issue. I'll try to reproduce it and test it >>> before sending a patch to fix it. >> >> In case you don't reproduce, feel free to send me the patch to test it. > > Below patch can fix another strange sigsegv issue in my VM. Please check > if it works for you. I doubt it's use-after-free problem which could > lead to different issues in different env. Please have a try. > > > diff --git a/lib/librte_eal/common/eal_common_thread.c > b/lib/librte_eal/common/eal_common_thread.c > index de69452..d91b67d 100644 > --- a/lib/librte_eal/common/eal_common_thread.c > +++ b/lib/librte_eal/common/eal_common_thread.c > @@ -205,6 +205,7 @@ rte_ctrl_thread_create(pthread_t *thread, const char > *name, >                 goto fail; > >         pthread_barrier_wait(¶ms->configured); > +       pthread_barrier_destroy(¶ms->configured); Thanks Jianfeng, that fixes my issue. For correctness, I wonder whether we should check pthread_barrier_wait return, and only call destroy() if PTHREAD_BARRIER_SERIAL_THREAD? And so also do same the same thing in rte_thread_init(). What do you think? Thanks, Maxime >         free(params); > >         return 0; > > Thanks, > Jianfeng > >> >> Thanks, >> Maxime >> >>> Thanks, >>> Jianfeng >>> >>>> >>>> Olivier >>>> >>> >