lttng-dev.lists.lttng.org archive mirror
* [lttng-dev] Userspace tracing in docker containers
@ 2021-04-05 18:09 Eqbal via lttng-dev
  2021-04-05 18:36 ` Eqbal via lttng-dev
  2021-04-06 14:07 ` Jonathan Rajotte-Julien via lttng-dev
  0 siblings, 2 replies; 4+ messages in thread
From: Eqbal via lttng-dev @ 2021-04-05 18:09 UTC (permalink / raw)
  To: lttng-dev



Hi,

I am trying to get user space tracing working for an application running in
a docker container. I am running lttng session daemon in another container.
I mounted the unix socket locations (either /var/run/lttng for root or
$HOME/.lttng for another user). By doing that I can run commands like lttng
create or lttng list <session-name>, but the tracepoint events from the
application don't get registered and there is no trace output.

I enabled LTTNG_UST_DEBUG and ran lttng-sessiond in verbose mode (-vvv and
--verbose-consumer) and got the following error message:

"*Unix socket credential pid=0. Refusing application in distinct,
non-nested pid namespace.*"

It appears that for some calls to the session daemon, a getsockopt syscall
is made with *SO_PEERCRED*, which returns 0 for the pid, and the call fails
with the *LTTNG_UST_ERR_PEERCRED_PID* error (see the get_cred call in
ustctl.c).

If I comment out the getsockopt call, my application tracing starts to work.

From what I found, getsockopt/SO_PEERCRED cannot report the peer pid on the
unix socket across docker containers, which would make sense as the peer is
in a separate pid namespace.
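
For reference, the credential lookup described above can be sketched in a
few lines of C. This is a minimal illustration, assuming Linux and glibc; it
is not the lttng-ust code, and the helper names read_peer_cred and
peer_pid_of_local_socketpair are made up for this example. Within a single
process the reported peer pid is simply the caller's own pid; as the error
message above indicates, a peer in a distinct, non-nested pid namespace is
reported with pid 0.

```c
#define _GNU_SOURCE /* exposes struct ucred in <sys/socket.h> on glibc */
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the kernel-recorded credentials of the peer connected to `sock`.
 * Returns 0 on success, -1 if getsockopt() fails or returns a short result.
 */
static int read_peer_cred(int sock, struct ucred *out)
{
	socklen_t len = sizeof(*out);

	assert(out);
	if (getsockopt(sock, SOL_SOCKET, SO_PEERCRED, out, &len) < 0)
		return -1;
	return len == sizeof(*out) ? 0 : -1;
}

/*
 * Hypothetical demo helper: create a connected AF_UNIX socket pair inside
 * one process and return the peer pid seen on one end, or -1 on error.
 */
static pid_t peer_pid_of_local_socketpair(void)
{
	int fds[2];
	struct ucred cred;
	pid_t pid = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
		return -1;
	if (read_peer_cred(fds[0], &cred) == 0)
		pid = cred.pid;
	close(fds[0]);
	close(fds[1]);
	return pid;
}
```

Calling read_peer_cred on an accepted connection from another container is
where a pid of 0 would show up.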

I have a few questions on this:
1. What is the reason for the get_cred/getsockopt call with SO_PEERCRED? I
would like to understand why it's required for some and not other calls.
2. Is there any workaround for this problem, so that I can get this to work
with the container topology I am working with (app in one container and
lttng daemons in another).
3. Related to 2, are there any gotchas to bypassing the getsockopt call in
get_cred?

Appreciate your help regarding this.

Thanks,
Eqbal


_______________________________________________
lttng-dev mailing list
lttng-dev@lists.lttng.org
https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev


* Re: [lttng-dev] Userspace tracing in docker containers
  2021-04-05 18:09 [lttng-dev] Userspace tracing in docker containers Eqbal via lttng-dev
@ 2021-04-05 18:36 ` Eqbal via lttng-dev
  2021-04-06 14:07 ` Jonathan Rajotte-Julien via lttng-dev
  1 sibling, 0 replies; 4+ messages in thread
From: Eqbal via lttng-dev @ 2021-04-05 18:36 UTC (permalink / raw)
  To: lttng-dev



> If I comment out the getsockopt call, my application tracing starts to
> work.

Correction to the above: I commented out the check that rejects a zero pid
in the get_cred call, not the whole getsockopt call.

diff --git a/liblttng-ust-ctl/ustctl.c b/liblttng-ust-ctl/ustctl.c
index 39860ebf..96aeef3c 100644
--- a/liblttng-ust-ctl/ustctl.c
+++ b/liblttng-ust-ctl/ustctl.c
@@ -1825,10 +1825,10 @@ int get_cred(int sock,
                "application registered claiming [ pid: %u, ppid: %u, uid: %u, gid: %u ]",
                ucred.pid, ucred.uid, ucred.gid,
                reg_msg->pid, reg_msg->ppid, reg_msg->uid, reg_msg->gid);
-       if (!ucred.pid) {
-               ERR("Unix socket credential pid=0. Refusing application in distinct, non-nested pid namespace.");
-               return -LTTNG_UST_ERR_PEERCRED_PID;
-       }
+       // if (!ucred.pid) {
+       //      ERR("Unix socket credential pid=0. Refusing application in distinct, non-nested pid namespace.");
+       //      return -LTTNG_UST_ERR_PEERCRED_PID;
+       // }
        *pid = ucred.pid;
        *uid = ucred.uid;
        *gid = ucred.gid;



* Re: [lttng-dev] Userspace tracing in docker containers
  2021-04-05 18:09 [lttng-dev] Userspace tracing in docker containers Eqbal via lttng-dev
  2021-04-05 18:36 ` Eqbal via lttng-dev
@ 2021-04-06 14:07 ` Jonathan Rajotte-Julien via lttng-dev
  2021-05-03 23:45   ` Eqbal via lttng-dev
  1 sibling, 1 reply; 4+ messages in thread
From: Jonathan Rajotte-Julien via lttng-dev @ 2021-04-06 14:07 UTC (permalink / raw)
  To: Eqbal; +Cc: lttng-dev

Hi,

On Mon, Apr 05, 2021 at 11:09:39AM -0700, Eqbal via lttng-dev wrote:
> I have a few questions on this:
> 1. What is the reason for the get_cred/getsockopt call with SO_PEERCRED? I
> would like to understand why it's required for some and not other calls.


More information is found in the introducing commit:

  commit a834901f2890deadb815d7f9e3ab79c3ba673994
  Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
  Date:   Mon Oct 12 16:52:03 2020 -0400

    Fix: Use unix socket peercred for pid, uid, gid credentials
    
    Currently, the session daemon trust the pid, ppid, uid, and gid values
    passed by the application, but should really validate the uid using unix
    socket peercred. This fix uses the peercred values rather than the
    values provided by the application on registration for:
    
    - pid, uid and gid on Linux,
    - uid and gid on FreeBSD.
    
    This should improve how the session daemon deals with containerized
    applications on Linux as well. Applications are required to be either in
    the same pid namespace, or in a pid namespace nested within the pid
    namespace of the lttng-sessiond, so the session daemon can map the
    application pid to something meaningful within its own pid namespace.
    Applications in a unrelated (disjoint) pid namespace will be refused by
    the session daemon.
    
    About the uid and gid with user namespaces on Linux, those will provide
    meaningful IDs if the application user namespace is either the same as
    the user namespace of the session daemon, or a nested user namespace.
    Otherwise, the IDs will be that of /proc/sys/kernel/overflowuid and
    /proc/sys/kernel/overflowgid, which typically maps to nobody.nogroup on
    current distributions.
    
    Given that fetching the parent pid (ppid) of the application would
    require to use /proc/<pid>/status (which is racy wrt pid reuse), expose
    the ppid provided by the application on registration instead, but only
    in situations where the application sits in the same pid namespace as
    the session daemon (on Linux), which is detected by checking if the pid
    provided by the application matches the pid obtained using unix socket
    credentials. The ppid is only used for logging/debugging purposes in the
    session daemon anyway, so it is OK to use the value provided by the
    application for that purpose.
    
    Fixes: #1286
    Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
    Change-Id: I94742e57dad642106908d09e2c7e395993c2c48f
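
The policy described in this commit message can be sketched as a small
decision function. This is an illustration under stated assumptions, not the
lttng-ust implementation: the struct and function names are hypothetical,
and reporting the ppid as 0 when it is not exposed is a choice made for this
example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; they are not lttng-ust types. */
struct claimed_cred { uint32_t pid, ppid, uid, gid; }; /* sent by the app */
struct kernel_cred { uint32_t pid, uid, gid; };        /* from SO_PEERCRED */
struct accepted_cred {
	uint32_t pid, ppid, uid, gid;
	bool ppid_valid; /* true only when the claimed ppid is exposed */
};

/*
 * Pick the credentials the daemon should record for a registering app.
 * Returns 0 on success, -1 when the application must be refused.
 */
static int select_cred(const struct claimed_cred *claimed,
		const struct kernel_cred *peer, struct accepted_cred *out)
{
	assert(claimed && peer && out);

	/* pid 0: peer sits in a distinct, non-nested pid namespace. */
	if (peer->pid == 0)
		return -1;

	/* pid, uid and gid always come from the kernel, never from the app. */
	out->pid = peer->pid;
	out->uid = peer->uid;
	out->gid = peer->gid;

	/*
	 * The claimed ppid is exposed only when the claimed pid matches the
	 * kernel-reported pid, i.e. both sides share a pid namespace.
	 * Storing 0 otherwise is an assumption of this sketch.
	 */
	out->ppid_valid = (claimed->pid == peer->pid);
	out->ppid = out->ppid_valid ? claimed->ppid : 0;
	return 0;
}
```

With this shape, an application claiming uid 0 while the kernel reports uid
1000 is recorded as uid 1000, which is the impersonation case the fix
targets.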

As for "why it's required for some and not other calls": communicating with
an lttng-sessiond daemon (using the lttng CLI) and registering a userspace
application are essentially two distinct communication interfaces. Now, to be
honest, I'm not certain of the complete "security" policy for the
lttng-sessiond <-> CLI interface, or whether we should be more strict there.

> 2. Is there any workaround for this problem, so that I can get this to work
> with the container topology I am working with (app in one container and
> lttng daemons in another).

Based on the commit message, lttng-ust explicitly cannot be used across
non-nested pid namespaces.
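
To check which pid namespace a process lives in, the namespace link under
/proc can be read. The sketch below is a diagnostic aid only, assuming Linux
with /proc mounted; the helper names are hypothetical. It only tells whether
two processes share the same pid namespace and does not detect nesting.
Reading the link of another process may require sufficient privileges, and
both processes must be visible in the /proc being read (for example, from
the host).

```c
#define _GNU_SOURCE /* ensures readlink() is declared under strict -std */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy the pid namespace identifier of process `pid` (a string of the
 * form "pid:[<inode>]") into buf. Pass pid 0 for the calling process.
 * Returns 0 on success, -1 on error.
 */
static int pid_namespace_id(pid_t pid, char *buf, size_t buflen)
{
	char path[64];
	ssize_t n;

	assert(buf && buflen > 1);
	if (pid == 0)
		snprintf(path, sizeof(path), "/proc/self/ns/pid");
	else
		snprintf(path, sizeof(path), "/proc/%ld/ns/pid", (long)pid);

	/* readlink() does not NUL-terminate, so reserve one byte. */
	n = readlink(path, buf, buflen - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}

/* Returns 1 if both processes share a pid namespace, 0 if not, -1 on error. */
static int same_pid_namespace(pid_t a, pid_t b)
{
	char id_a[64], id_b[64];

	if (pid_namespace_id(a, id_a, sizeof(id_a)) < 0 ||
			pid_namespace_id(b, id_b, sizeof(id_b)) < 0)
		return -1;
	return strcmp(id_a, id_b) == 0;
}
```

Run on the host, comparing the pid of lttng-sessiond with the pid of the
traced application shows whether they share a pid namespace.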

Could you give us more information on the goal of the topology you plan to
use? This could lead to further discussion and/or alternative solutions based
on the goals and constraints of your deployment.

> 3. Related to 2, are there any gotchas to bypassing the getsockopt call in
> get_cred?

Based on the content of the mentioned bug (1286) [1], the principal concern is:

"
This means a non-root application could theoretically impersonate a root
application from a tracing perspective, and thus access root tracing buffers in
a per-uid configuration, which is unwanted.
"

[1] https://bugs.lttng.org/issues/1286

Cheers

-- 
Jonathan Rajotte-Julien
EfficiOS

* Re: [lttng-dev] Userspace tracing in docker containers
  2021-04-06 14:07 ` Jonathan Rajotte-Julien via lttng-dev
@ 2021-05-03 23:45   ` Eqbal via lttng-dev
  0 siblings, 0 replies; 4+ messages in thread
From: Eqbal via lttng-dev @ 2021-05-03 23:45 UTC (permalink / raw)
  To: Jonathan Rajotte-Julien; +Cc: lttng-dev



Thanks for the responses. The reasoning makes sense. We have decided to run
lttng-sessiond on the host. Our trace-generating application will run in a
container, and so will our libbabeltrace-based trace consumer app (using
live sessions).


