All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: What is the effect of not preloading lttng-ust-fork?
       [not found] <DB5PR07MB09496A86EB5F24D58FFBE029F01F0@DB5PR07MB0949.eurprd07.prod.outlook.com>
@ 2017-04-26 16:05 ` Jonathan Rajotte Julien
  0 siblings, 0 replies; 2+ messages in thread
From: Jonathan Rajotte Julien @ 2017-04-26 16:05 UTC (permalink / raw)
  To: lttng-dev

Hi,

On 2017-04-23 09:56 PM, Douglas Graham wrote:
> Hi,
> 
>  
> 
> We have an init-like top-level application that starts a bunch of other daemons. This top-level daemon is started with LD_PRELOAD= /usr/lib/liblttng-ust-fork.so.0 so all the daemons that it starts inherit that.  Sometimes we want to debug one of these child applications, so we arrange that instead of starting it directly, the top-level daemon starts it like so: “gdbserver :23 <path/to/real/daemon>”.  This arranges that the child daemon not actually run until a remote debugger attaches.  This scheme seemed to work reasonably well until recently.  It still works maybe half the time, but the other half, gdbserver crashes in lttng related code.  The backtrace is below.  As a quick fix, I have tried starting gdbserver and the child application like so: “/usr/bin/env -u LD_PRELOAD gdbserver :23 <path/to/real/daemon>”.  In the limited testing I’ve done so far, this is looking good, but I’m wondering what other problems might turn up if I don’t preload liblttng-ust-fork into
> gdbserver and the application it is debugging.  Gdbserver does fork after all, and the application might as well.  Does this seem like a reasonable long-term solution for using the remote debugger?
> 
>  
> 
> Here’s the crash backtrace:
> 
>  
> 
> (gdb) bt
> 
> #0  0xb6e8c810 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
> 
> #1  0xb6e9028c in __GI_abort () at abort.c:89
> 
> #2  0xb6e08198 in lttng_ust_elf_destroy (elf=0x2, elf@entry=0xb6f9a088 <lock>)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-elf.c:342
> 
> #3  0xb6e08bec in get_elf_info (bin_data=0xb6541308) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:165
> 
> #4  trace_baddr (bin_data=0xb6541308) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:175
> 
> #5  extract_bin_info_events (info=info@entry=0xb6542360, size=size@entry=40, _data=_data@entry=0xb65423b8)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:298
> 
> #6  0xb6f65a78 in __GI___dl_iterate_phdr (callback=0xb6e08a34 <extract_bin_info_events>, data=data@entry=0xb65423b8) at dl-iteratephdr.c:76
> 
> #7  0xb6e09c9c in do_baddr_statedump (owner=0xb6e5a618 <global_apps>)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:332
> 
> #8  do_lttng_ust_statedump (owner=owner@entry=0xb6e5a618 <global_apps>)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:355
> 
> #9  0xb6e007dc in lttng_handle_pending_statedump (owner=owner@entry=0xb6e5a618 <global_apps>)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-events.c:857
> 
> #10 0xb6df864c in handle_pending_statedump (sock_info=0xb6e5a618 <global_apps>)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:599
> 
> #11 handle_message (lum=0xb654286c, sock=<optimized out>, sock_info=<optimized out>)
> 
>     at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:957
> 
> #12 ust_listener_thread (arg=0xb6e5a618 <global_apps>) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:1520
> 
> #13 0xb6d49f6c in start_thread (arg=0xb6543420) at pthread_create.c:339
> 
>  
> 
> I am not able to easily examine the data in lttng_ust_elf_destroy() because as you can see from the argument list to that function, gdb is confused about the value of the “elf” pointer (I find that gdb often gets confused on ARM when unwinding from a call to abort()).  But I can see that it aborted  because a close() failed:
> 
>  
> 
>         if (close(elf->fd)) {
> 
>                 abort();
> 
>         }
> 
>  
> 
> The two other threads in the gdbserver application were here at the time of the crash:
> 
>  
> 
> #0  syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
> 
> #1  0xb6df7cd0 in futex (val3=0, uaddr2=0x0, timeout=0x0, val=0, op=0, uaddr=<optimized out>)
> 
>     at /repo/ngrbuilder/jenkins/ProductionBuild_0/xcs-build/build/xenon/bitbake_build/tmp/sysroots/xenon/usr/include/urcu/futex.h:65
> 
> #2  futex_async (op=0, val=0, timeout=0x0, uaddr2=0x0, val3=0, uaddr=<optimized out>)
> 
>     at /repo/ngrbuilder/jenkins/ProductionBuild_0/xcs-build/build/xenon/bitbake_build/tmp/sysroots/xenon/usr/include/urcu/futex.h:97
> 
> #3  wait_for_sessiond (sock_info=<optimized out>) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:1276
> 
> #4  ust_listener_thread (arg=0xb6e585ec <local_apps>) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:1330
> 
> #5  0xb6d49f6c in start_thread (arg=0xb6d43420) at pthread_create.c:339
> 
>  
> 
> #0  0xb6efd7b8 in execve () at ../sysdeps/unix/syscall-template.S:84
> 
> #1  0xb6efd8c8 in execv (path=<optimized out>, argv=<optimized out>) at execv.c:25
> 
> #2  0x00035b40 in linux_create_inferior (program=0x6bc80 "/opt/radiosw/bin/ngr2mpw.elf", allargs=0x6bc70)


I was not able to reproduce locally the problem so testing the following and reporting to us the result would be very helpful.

After some digging, linux_create_inferior seems to perform bulk closing after the fork.

    linux_create_inferior -> close_most_fds -> fdwalk with do_close.
	
lttng-ust live inside the process hence gdbserver could close the fds used by lttng-ust. Since lttng-ust can close fd already closed it would results in a double close.

I would suspect this is the primary source of problem. You will need to ld_preload the liblttng-ust-fd library available with lttng-ust 2.9+ to prevent this kind of problem.

The commit introducing liblttng-ust-fd:

commit 95e6d2685eec83a144f533a447ffd355fadf9d79
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date:   Mon Sep 12 12:04:13 2016 -0400

    Implement liblttng-ust-fd
    
    This library overrides close() and closeall() libc functions, and uses
    lttng_ust_safe_close_fd() to check whether the application can
    interact with the file descriptor or if it should be left to lttng-ust.
    
    This takes care of bugs caused by applications doing bulk close() or
    closefrom() of file descriptors soon after forking.
    
    Fixes: #253
    Fixes: #626
    
    CC: Aravind HT <aravind.ht@gmail.com>
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>


It looks like you are using lttng-ust 2.8. You can either upgrade and use liblttng-ust-fd;liblttng-ust-fork or use the wrapper functionality of gdbserver. e.g.:

/usr/bin/env -u LD_PRELOAD gdbserver --wrapper env LD_PRELOAD=<path/to/liblttng-ust-fork> -- :23 <path/to/real/daemon>

This should prevent the LD_PRELOAD for gdbserver and will LD_PRELOAD liblttng-ust-fork for the daemon.

Cheers

>     at /usr/src/debug/gdb/7.10.1-r0/gdb-7.10.1/gdb/gdbserver/linux-low.c:889
> 
> #3  0x00018e30 in start_inferior (argv=0x6bc70) at /usr/src/debug/gdb/7.10.1-r0/gdb-7.10.1/gdb/gdbserver/server.c:233
> 
> #4  0x0001aa58 in captured_main (argc=argc@entry=-1225151608, argv=argv@entry=0x3fda8 <__libc_csu_init+32>)
> 
>     at /usr/src/debug/gdb/7.10.1-r0/gdb-7.10.1/gdb/gdbserver/server.c:3516
> 
> #5  0x0001250c in main (argc=-1225151608, argv=0x3fda8 <__libc_csu_init+32>)
> 
>  
> 
> Thanks,
> 
> Doug
> 
>  
> 
>  
> 
>  
> 
> 
> 
> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
> 

-- 
Jonathan R. Julien
Efficios
_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 2+ messages in thread

* What is the effect of not preloading lttng-ust-fork?
@ 2017-04-24  1:56 Douglas Graham
  0 siblings, 0 replies; 2+ messages in thread
From: Douglas Graham @ 2017-04-24  1:56 UTC (permalink / raw)
  To: lttng-dev


[-- Attachment #1.1: Type: text/plain, Size: 5286 bytes --]

Hi,

We have an init-like top-level application that starts a bunch of other daemons. This top-level daemon is started with LD_PRELOAD= /usr/lib/liblttng-ust-fork.so.0 so all the daemons that it starts inherit that.  Sometimes we want to debug one of these child applications, so we arrange that instead of starting it directly, the top-level daemon starts it like so: "gdbserver :23 <path/to/real/daemon>".  This arranges that the child daemon not actually run until a remote debugger attaches.  This scheme seemed to work reasonably well until recently.  It still works maybe half the time, but the other half, gdbserver crashes in lttng related code.  The backtrace is below.  As a quick fix, I have tried starting gdbserver and the child application like so: "/usr/bin/env -u LD_PRELOAD gdbserver :23 <path/to/real/daemon>".  In the limited testing I've done so far, this is looking good, but I'm wondering what other problems might turn up if I don't preload liblttng-ust-fork into gdbserver and the application it is debugging.  Gdbserver does fork after all, and the application might as well.  Does this seem like a reasonable long-term solution for using the remote debugger?

Here's the crash backtrace:

(gdb) bt
#0  0xb6e8c810 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1  0xb6e9028c in __GI_abort () at abort.c:89
#2  0xb6e08198 in lttng_ust_elf_destroy (elf=0x2, elf@entry=0xb6f9a088 <lock>)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-elf.c:342
#3  0xb6e08bec in get_elf_info (bin_data=0xb6541308) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:165
#4  trace_baddr (bin_data=0xb6541308) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:175
#5  extract_bin_info_events (info=info@entry=0xb6542360, size=size@entry=40, _data=_data@entry=0xb65423b8)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:298
#6  0xb6f65a78 in __GI___dl_iterate_phdr (callback=0xb6e08a34 <extract_bin_info_events>, data=data@entry=0xb65423b8) at dl-iteratephdr.c:76
#7  0xb6e09c9c in do_baddr_statedump (owner=0xb6e5a618 <global_apps>)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:332
#8  do_lttng_ust_statedump (owner=owner@entry=0xb6e5a618 <global_apps>)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-statedump.c:355
#9  0xb6e007dc in lttng_handle_pending_statedump (owner=owner@entry=0xb6e5a618 <global_apps>)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-events.c:857
#10 0xb6df864c in handle_pending_statedump (sock_info=0xb6e5a618 <global_apps>)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:599
#11 handle_message (lum=0xb654286c, sock=<optimized out>, sock_info=<optimized out>)
    at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:957
#12 ust_listener_thread (arg=0xb6e5a618 <global_apps>) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:1520
#13 0xb6d49f6c in start_thread (arg=0xb6543420) at pthread_create.c:339

I am not able to easily examine the data in lttng_ust_elf_destroy() because as you can see from the argument list to that function, gdb is confused about the value of the "elf" pointer (I find that gdb often gets confused on ARM when unwinding from a call to abort()).  But I can see that it aborted  because a close() failed:

        if (close(elf->fd)) {
                abort();
        }

The two other threads in the gdbserver application were here at the time of the crash:

#0  syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
#1  0xb6df7cd0 in futex (val3=0, uaddr2=0x0, timeout=0x0, val=0, op=0, uaddr=<optimized out>)
    at /repo/ngrbuilder/jenkins/ProductionBuild_0/xcs-build/build/xenon/bitbake_build/tmp/sysroots/xenon/usr/include/urcu/futex.h:65
#2  futex_async (op=0, val=0, timeout=0x0, uaddr2=0x0, val3=0, uaddr=<optimized out>)
    at /repo/ngrbuilder/jenkins/ProductionBuild_0/xcs-build/build/xenon/bitbake_build/tmp/sysroots/xenon/usr/include/urcu/futex.h:97
#3  wait_for_sessiond (sock_info=<optimized out>) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:1276
#4  ust_listener_thread (arg=0xb6e585ec <local_apps>) at /usr/src/debug/lttng-ust/2.8.2-r3.eriR0/git/liblttng-ust/lttng-ust-comm.c:1330
#5  0xb6d49f6c in start_thread (arg=0xb6d43420) at pthread_create.c:339

#0  0xb6efd7b8 in execve () at ../sysdeps/unix/syscall-template.S:84
#1  0xb6efd8c8 in execv (path=<optimized out>, argv=<optimized out>) at execv.c:25
#2  0x00035b40 in linux_create_inferior (program=0x6bc80 "/opt/radiosw/bin/ngr2mpw.elf", allargs=0x6bc70)
    at /usr/src/debug/gdb/7.10.1-r0/gdb-7.10.1/gdb/gdbserver/linux-low.c:889
#3  0x00018e30 in start_inferior (argv=0x6bc70) at /usr/src/debug/gdb/7.10.1-r0/gdb-7.10.1/gdb/gdbserver/server.c:233
#4  0x0001aa58 in captured_main (argc=argc@entry=-1225151608, argv=argv@entry=0x3fda8 <__libc_csu_init+32>)
    at /usr/src/debug/gdb/7.10.1-r0/gdb-7.10.1/gdb/gdbserver/server.c:3516
#5  0x0001250c in main (argc=-1225151608, argv=0x3fda8 <__libc_csu_init+32>)

Thanks,
Doug




[-- Attachment #1.2: Type: text/html, Size: 11364 bytes --]

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2017-04-26 16:06 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <DB5PR07MB09496A86EB5F24D58FFBE029F01F0@DB5PR07MB0949.eurprd07.prod.outlook.com>
2017-04-26 16:05 ` What is the effect of not preloading lttng-ust-fork? Jonathan Rajotte Julien
2017-04-24  1:56 Douglas Graham

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.