* [lttng-dev] Userspace tracing in docker containers
From: Eqbal via lttng-dev @ 2021-04-05 18:09 UTC
To: lttng-dev

Hi,

I am trying to get userspace tracing working for an application running in a
Docker container. I am running the LTTng session daemon in another container.
I mounted the unix socket locations (either /var/run/lttng for root or
$HOME/.lttng for another user). With that in place I can run commands like
lttng create or lttng list <session-name>, but the tracepoint events from the
application don't get registered and there is no trace output.

I enabled LTTNG_UST_DEBUG and ran lttng-sessiond in verbose mode (-vvv and
--verbose-consumer) and got the following error message:

"*Unix socket credential pid=0. Refusing application in distinct, non-nested
pid namespace.*"

It appears that for some calls to the session daemon a getsockopt syscall is
made with *SO_PEERCRED*, which returns 0 for the pid, and the call fails with
the *LTTNG_UST_ERR_PEERCRED_PID* error (see the get_cred call in ustctl.c).

If I comment out the getsockopt call, my application tracing starts to work.

From what I found, Docker cannot support the getsockopt/SO_PEERCRED call to
get the peer pid on the unix socket, which would make sense as the peer is in
a separate namespace.

I have a few questions on this:

1. What is the reason for the get_cred/getsockopt call with SO_PEERCRED? I
   would like to understand why it's required for some calls and not others.
2. Is there any workaround for this problem, so that I can get this to work
   with the container topology I am working with (app in one container and
   lttng daemons in another)?
3. Related to 2, are there any gotchas to bypassing the getsockopt call in
   get_cred?

I appreciate your help with this.

Thanks,
Eqbal

_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
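For readers unfamiliar with the mechanism mentioned above, here is a minimal C sketch of reading peer credentials from an AF_UNIX socket with getsockopt and SO_PEERCRED on Linux. It is not the lttng-ust code: the helper names read_peer_cred and peer_cred_matches_self are made up for illustration, and the self-check uses a socketpair inside a single process, so both ends share the same pid namespace and the reported pid should be the caller's own. In the situation described in the message, the peer lives in a pid namespace the daemon cannot see into, which is presumably why the reported pid comes back as 0.

```c
#define _GNU_SOURCE /* exposes struct ucred in <sys/socket.h> on glibc */
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the credentials of the peer connected to `sock` (an AF_UNIX socket).
 * Returns 0 on success, -1 on failure.
 * Hypothetical helper name, for illustration only.
 */
static int read_peer_cred(int sock, struct ucred *cred)
{
	socklen_t len = sizeof(*cred);

	if (getsockopt(sock, SOL_SOCKET, SO_PEERCRED, cred, &len) < 0)
		return -1;
	if (len != sizeof(*cred))
		return -1;
	return 0;
}

/*
 * Hypothetical self-check: create a connected AF_UNIX socket pair in this
 * process and compare the credentials the kernel reports for the peer with
 * our own pid and effective uid/gid.
 * Returns 1 if they match, 0 if they do not, -1 on error.
 */
static int peer_cred_matches_self(void)
{
	int sv[2];
	struct ucred cred;
	int ret;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	if (read_peer_cred(sv[0], &cred) < 0) {
		ret = -1;
	} else {
		ret = cred.pid == getpid() &&
		      cred.uid == geteuid() &&
		      cred.gid == getegid();
	}

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Calling peer_cred_matches_self() from a small main() and printing its return value is enough to try this out. The point of the sketch is that the pid, uid and gid come from the kernel rather than from anything the peer sends over the socket.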
* Re: [lttng-dev] Userspace tracing in docker containers
From: Eqbal via lttng-dev @ 2021-04-05 18:36 UTC
To: lttng-dev

> If I comment out the getsockopt call, my application tracing starts to work.

Correction to the above: I commented out only the check that the pid is
non-zero in the get_cred call, not the whole getsockopt call.

diff --git a/liblttng-ust-ctl/ustctl.c b/liblttng-ust-ctl/ustctl.c
index 39860ebf..96aeef3c 100644
--- a/liblttng-ust-ctl/ustctl.c
+++ b/liblttng-ust-ctl/ustctl.c
@@ -1825,10 +1825,10 @@ int get_cred(int sock,
 		"application registered claiming [ pid: %u, ppid: %u, uid: %u, gid: %u ]",
 		ucred.pid, ucred.uid, ucred.gid,
 		reg_msg->pid, reg_msg->ppid, reg_msg->uid, reg_msg->gid);
-	if (!ucred.pid) {
-		ERR("Unix socket credential pid=0. Refusing application in distinct, non-nested pid namespace.");
-		return -LTTNG_UST_ERR_PEERCRED_PID;
-	}
+	// if (!ucred.pid) {
+	//	ERR("Unix socket credential pid=0. Refusing application in distinct, non-nested pid namespace.");
+	//	return -LTTNG_UST_ERR_PEERCRED_PID;
+	// }
 	*pid = ucred.pid;
 	*uid = ucred.uid;
 	*gid = ucred.gid;
* Re: [lttng-dev] Userspace tracing in docker containers
From: Jonathan Rajotte-Julien via lttng-dev @ 2021-04-06 14:07 UTC
To: Eqbal; +Cc: lttng-dev

Hi,

On Mon, Apr 05, 2021 at 11:09:39AM -0700, Eqbal via lttng-dev wrote:
> Hi,
>
> I am trying to get user space tracing working for an application running in
> a docker container. I am running lttng session daemon in another container.
> I mounted the unix socket locations (either /var/run/lttng for root or
> $HOME/.lttng for another user). By doing that I can run commands like lttng
> create or lttng list <session-name>, but the tracepoint events from the
> application don't get registered and there is no trace output.
>
> I enabled LTTNG_UST_DEBUG an ran lttng-sessiond in verbose mode (-vvv and
> --verbose-consumer) and got the following error message:
>
> "*Unix socket credential pid=0. Refusing application in distinct,
> non-nested pid namespace.*"
>
> It appears that for some calls to the session daemon there is a getsockopt
> syscall made with *SO_PEERCRED* which returns 0 for pid and the call is
> failed with *LTTNG_UST_ERR_PEERCRED_PID* error (see get_cred call in
> ustctl.c).
>
> If I comment out the getsockopt call, my application tracing starts to work.
>
> From what I found, docker cannot support getsockopt/SO_PEERCRED call to get
> peer pid on the unix socket which would make sense as it's in a separate
> namespace.
>
> I have a few questions on this:
> 1. What is the reason for the get_cred/getsockopt call with SO_PEERCRED? I
> would like to understand why it's required for some and not other calls.

More information is found in the introducing commit:

commit a834901f2890deadb815d7f9e3ab79c3ba673994
Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Date:   Mon Oct 12 16:52:03 2020 -0400

    Fix: Use unix socket peercred for pid, uid, gid credentials

    Currently, the session daemon trust the pid, ppid, uid, and gid values
    passed by the application, but should really validate the uid using unix
    socket peercred. This fix uses the peercred values rather than the
    values provided by the application on registration for:

    - pid, uid and gid on Linux,
    - uid and gid on FreeBSD.

    This should improve how the session daemon deals with containerized
    applications on Linux as well. Applications are required to be either in
    the same pid namespace, or in a pid namespace nested within the pid
    namespace of the lttng-sessiond, so the session daemon can map the
    application pid to something meaningful within its own pid namespace.
    Applications in a unrelated (disjoint) pid namespace will be refused by
    the session daemon.

    About the uid and gid with user namespaces on Linux, those will provide
    meaningful IDs if the application user namespace is either the same as
    the user namespace of the session daemon, or a nested user namespace.
    Otherwise, the IDs will be that of /proc/sys/kernel/overflowuid and
    /proc/sys/kernel/overflowgid, which typically maps to nobody.nogroup on
    current distributions.

    Given that fetching the parent pid (ppid) of the application would
    require to use /proc/<pid>/status (which is racy wrt pid reuse), expose
    the ppid provided by the application on registration instead, but only
    in situations where the application sits in the same pid namespace as
    the session daemon (on Linux), which is detected by checking if the pid
    provided by the application matches the pid obtained using unix socket
    credentials. The ppid is only used for logging/debugging purposes in the
    session daemon anyway, so it is OK to use the value provided by the
    application for that purpose.

    Fixes: #1286
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Change-Id: I94742e57dad642106908d09e2c7e395993c2c48f

As for "why it's required for some and not other calls.", there is a
difference between communicating with an lttng-sessiond daemon (using the
lttng CLI) and a userspace application registering. They are essentially two
distinct communication interfaces. Now, to be honest, I'm not certain of the
complete "security" policy for the lttng-sessiond <-> CLI interface and
whether we should be stricter or not.

> 2. Is there any workaround for this problem, so that I can get this to work
> with the container topology I am working with (app in one container and
> lttng daemons in another).

Based on the commit message, lttng-ust explicitly cannot be used across
non-nested pid namespaces.

Could you give us more information on the goal for the topology you plan to
use? This could lead to further discussion and/or alternative solutions based
on the goal and constraints of your deployment.

> 3. Related to 2, are there any gotchas to bypassing the getsockopt call in
> get_cred?

Based on the content of the mentioned bug (1286) [1], the principal concern
is:

"
This means a non-root application could theoretically impersonate a root
application from a tracing perspective, and thus access root tracing buffers
in a per-uid configuration, which is unwanted.
"

[1] https://bugs.lttng.org/issues/1286

Cheers

--
Jonathan Rajotte-Julien
EfficiOS
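The policy described in the quoted commit message can be sketched roughly as follows. This is an illustration written from the commit message only, not the actual get_cred implementation: the struct names, the resolve_ids function and the ERR_PEERCRED_PID constant are hypothetical stand-ins, and using 0 as the "unknown ppid" value is an assumption of this sketch.

```c
#include <sys/types.h>

/* Hypothetical types for illustration; these are not lttng-ust structures. */
struct peer_cred {	/* values reported by the kernel via SO_PEERCRED */
	pid_t pid;
	uid_t uid;
	gid_t gid;
};

struct claimed_ids {	/* values the application sends on registration */
	pid_t pid;
	pid_t ppid;
	uid_t uid;
	gid_t gid;
};

struct resolved_ids {	/* values the daemon ends up using */
	pid_t pid;
	pid_t ppid;
	uid_t uid;
	gid_t gid;
};

#define ERR_PEERCRED_PID 1	/* placeholder error code, assumption */

/*
 * Resolve the identity of a registering application.
 * Returns 0 on success, a negative value if the application is refused.
 */
static int resolve_ids(const struct peer_cred *peer,
		const struct claimed_ids *claimed,
		struct resolved_ids *out)
{
	/* A peer pid of 0 means the pid has no mapping in our pid namespace. */
	if (peer->pid == 0)
		return -ERR_PEERCRED_PID;

	/* pid, uid and gid always come from the kernel, never from the app. */
	out->pid = peer->pid;
	out->uid = peer->uid;
	out->gid = peer->gid;

	/*
	 * The claimed ppid is only exposed when the claimed pid matches the
	 * kernel-reported pid, i.e. the app sees the same pid as we do.
	 * Otherwise this sketch records 0 as "unknown".
	 */
	out->ppid = (claimed->pid == peer->pid) ? claimed->ppid : 0;
	return 0;
}
```

Note how the claimed uid and gid never reach the output. That is the property that addresses the impersonation concern quoted from bug 1286, and it is also what is lost if the credentials from the socket are not trusted as the source of identity.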
* Re: [lttng-dev] Userspace tracing in docker containers
From: Eqbal via lttng-dev @ 2021-05-03 23:45 UTC
To: Jonathan Rajotte-Julien; +Cc: lttng-dev

Thanks for the responses. The reasoning makes sense. We have decided to run
lttng-sessiond on the host. Our trace-generating application will run in a
container, as will our libbabeltrace-based trace consumer app (using live
sessions).
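When debugging a topology like this, it can help to confirm whether two processes actually share a pid namespace. The following C sketch compares the namespace links under /proc, which on Linux resolve to the same device and inode for processes in the same namespace. The function name same_pid_namespace is made up for illustration. The check only detects "same namespace"; it says nothing about nesting, and reading another process's ns link may require suitable privileges.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if process `pid` is in the same pid namespace as the caller,
 * 0 if it is in a different one, and -1 if either namespace link cannot
 * be examined (missing /proc, insufficient privileges, no such process).
 * Hypothetical helper name, for illustration only.
 */
static int same_pid_namespace(pid_t pid)
{
	char path[64];
	struct stat self_ns, other_ns;

	if (stat("/proc/self/ns/pid", &self_ns) < 0)
		return -1;

	snprintf(path, sizeof(path), "/proc/%ld/ns/pid", (long) pid);
	if (stat(path, &other_ns) < 0)
		return -1;

	/* Same namespace if both links resolve to the same device and inode. */
	return self_ns.st_dev == other_ns.st_dev &&
	       self_ns.st_ino == other_ns.st_ino;
}
```

Run on the host against the pid of a containerized application (as seen from the host), this would be expected to return 0 with Docker's default settings, since the container gets its own pid namespace. That namespace is a child of the host's, which is the nested case the commit message says the session daemon accepts.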