kernel-hardening.lists.openwall.com archive mirror
* Isolating abstract sockets
@ 2022-12-18 19:29 Stefan Bavendiek
  2023-10-24 13:46 ` Serge E. Hallyn
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Bavendiek @ 2022-12-18 19:29 UTC (permalink / raw)
  To: kernel-hardening; +Cc: linux-hardening

When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.

While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
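
To make the distinction concrete, here is a minimal sketch, assuming Linux and Python 3; the socket name is hypothetical and picked only for illustration. An abstract address is a sun_path that starts with a NUL byte, so binding it creates no inode, and a mount namespace has nothing to hide:

```python
import os
import socket

# Hypothetical name for illustration; the leading NUL byte is what makes the
# address abstract instead of a filesystem path.
ABSTRACT_NAME = b"\0sandbox-demo-" + str(os.getpid()).encode()


def abstract_roundtrip(name: bytes = ABSTRACT_NAME) -> bytes:
    """Bind an abstract socket, connect to it, and pass one message through."""
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        server.bind(name)      # no file is created anywhere
        server.listen(1)
        client.connect(name)   # lookup is by name within the network namespace
        conn, _ = server.accept()
        with conn:
            client.sendall(b"ping")
            return conn.recv(4)
    finally:
        client.close()
        server.close()
```

Any process in the same network namespace that knows the name can connect, regardless of its mount namespace or working directory.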

Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.

Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
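
The seccomp limitation can be illustrated with a small model. A seccomp BPF filter only receives a struct seccomp_data: the syscall number, the architecture, the instruction pointer and six raw argument values. For connect() the second argument is a pointer to the sockaddr, which the filter cannot dereference. The sketch below packs that view in Python; the helper name, fd and pointer value are hypothetical, and the constants are the x86_64 ones.

```python
import struct

# struct seccomp_data { int nr; __u32 arch; __u64 instruction_pointer;
#                       __u64 args[6]; }
SECCOMP_DATA = struct.Struct("=iIQ6Q")

SYS_CONNECT_X86_64 = 42          # connect(2) on x86_64
AUDIT_ARCH_X86_64 = 0xC000003E   # architecture tag seen by the filter


def filter_view_of_connect(fd: int, addr_ptr: int, addrlen: int) -> bytes:
    """Return the bytes a filter would see for connect(fd, addr_ptr, addrlen).

    The instruction pointer is set to 0 here because it is irrelevant to the
    point being made.
    """
    return SECCOMP_DATA.pack(
        SYS_CONNECT_X86_64, AUDIT_ARCH_X86_64, 0,
        fd, addr_ptr, addrlen, 0, 0, 0,
    )


# Two connect() calls whose sockaddr_un buffers hold different names of the
# same length at the same address are indistinguishable to the filter.
BUF = 0x7FFD00001000  # hypothetical user-space address of the sockaddr_un
view_abstract = filter_view_of_connect(fd=3, addr_ptr=BUF, addrlen=20)
view_pathname = filter_view_of_connect(fd=3, addr_ptr=BUF, addrlen=20)
```

Since the name never appears in the data the filter inspects, a filter can key on the syscall number or the address length, but not on whether the name is abstract.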

Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.

The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
What would be the ideal way to implement a mechanism to disable abstract sockets, either globally or, even better, in the context of a process?
And would such a patch have a realistic chance to make it into the kernel?

- Stefan


* Re: Isolating abstract sockets
  2022-12-18 19:29 Isolating abstract sockets Stefan Bavendiek
@ 2023-10-24 13:46 ` Serge E. Hallyn
  2023-10-24 14:05   ` Boris Lukashev
                     ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Serge E. Hallyn @ 2023-10-24 13:46 UTC (permalink / raw)
  To: Stefan Bavendiek; +Cc: kernel-hardening, linux-hardening

On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.

Veeery late reply.  Have you had any productive discussions about this in
other threads or venues?

> While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> 
> Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> 
> Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> 
> Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> 
> The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> What would be the ideal way to implement a mechanism to disable abstract sockets, either globally or, even better, in the context of a process?
> And would such a patch have a realistic chance to make it into the kernel?

Disabling them altogether would break lots of things depending on them,
like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
namespaces.  There are several directions this could lead.  For one, as
Dinesh Subhraveti often points out, the current "network" namespace is
really a network device namespace.  If we instead namespace at the
bind/connect/etc calls, we end up with much different abilities.  You
can implement something like this today using seccomp-filter.
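
As a rough sketch of what a per-call policy could look like: the mail does not say which seccomp facility is meant, so this assumes a supervisor that uses seccomp user notification (SECCOMP_RET_USER_NOTIF), copies the sockaddr out of the target process, and then decides. Only the decision step is shown; the function names and the allow-list are hypothetical.

```python
import socket
import struct


def classify_sockaddr_un(raw: bytes) -> str:
    """Classify a raw struct sockaddr_un as 'abstract', 'pathname' or 'unnamed'.

    `raw` is exactly `addrlen` bytes as passed to bind()/connect().
    """
    if len(raw) < 2:
        raise ValueError("too short to hold sa_family")
    (family,) = struct.unpack_from("=H", raw)
    if family != socket.AF_UNIX:
        raise ValueError("not an AF_UNIX address")
    path = raw[2:]
    if not path:
        return "unnamed"
    # A leading NUL byte in sun_path marks an abstract address.
    return "abstract" if path[0] == 0 else "pathname"


def allow_connect(raw: bytes, allowed_abstract: frozenset) -> bool:
    """Policy sketch: restrict abstract names, leave everything else alone."""
    if classify_sockaddr_un(raw) != "abstract":
        return True  # pathname sockets are covered by filesystem isolation
    # The abstract name is every byte after the leading NUL, up to addrlen.
    return raw[3:] in allowed_abstract
```

A supervisor built this way still has to cope with time-of-check/time-of-use races, because the target can rewrite the sockaddr buffer after it has been inspected.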

-serge


* Re: Isolating abstract sockets
  2023-10-24 13:46 ` Serge E. Hallyn
@ 2023-10-24 14:05   ` Boris Lukashev
  2023-10-24 14:15     ` Serge E. Hallyn
  2023-10-24 14:14   ` Paul Moore
  2023-10-25 17:10   ` Jann Horn
  2 siblings, 1 reply; 18+ messages in thread
From: Boris Lukashev @ 2023-10-24 14:05 UTC (permalink / raw)
  To: kernel-hardening, Serge E. Hallyn, Stefan Bavendiek; +Cc: linux-hardening

Namespacing at OSI layer 4 seems a bit fraught, as the underlying route, MAC, and physdev fall outside the caller's control. Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster (likely asymmetrically too) and have bound socket collisions opaque to each other, requiring handling outside the NS/container's purview. We looked at this sort of thing during the R&D phase of our assured comms work (namespaces were young) and found a bunch of overhead and collision concerns. Not saying it can't be done, but getting consumers to play nice enough with such an approach may be a heavy lift.

Thanks,
-Boris



* Re: Isolating abstract sockets
  2023-10-24 13:46 ` Serge E. Hallyn
  2023-10-24 14:05   ` Boris Lukashev
@ 2023-10-24 14:14   ` Paul Moore
  2023-10-24 14:18     ` Serge E. Hallyn
  2023-10-25 17:10   ` Jann Horn
  2 siblings, 1 reply; 18+ messages in thread
From: Paul Moore @ 2023-10-24 14:14 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Stefan Bavendiek, kernel-hardening, linux-hardening

On Tue, Oct 24, 2023 at 9:46 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> > When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
>
> Veeery late reply.  Have you had any productive discussions about this in
> other threads or venues?
>
> > While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> > It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> >
> > Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> >
> > Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> >
> > Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> >
> > The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> > What would be the ideal way to implement a mechanism to disable abstract sockets, either globally or, even better, in the context of a process?
> > And would such a patch have a realistic chance to make it into the kernel?
>
> Disabling them altogether would break lots of things depending on them,
> like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> namespaces.  There are several directions this could lead.  For one, as
> Dinesh Subhraveti often points out, the current "network" namespace is
> really a network device namespace.  If we instead namespace at the
> bind/connect/etc calls, we end up with much different abilities.

The LSM layer supports access controls on abstract sockets, with at
least two (AppArmor, SELinux) providing abstract socket access
controls, other LSMs may provide controls as well.

-- 
paul-moore.com


* Re: Isolating abstract sockets
  2023-10-24 14:05   ` Boris Lukashev
@ 2023-10-24 14:15     ` Serge E. Hallyn
  2023-10-24 15:55       ` Boris Lukashev
  0 siblings, 1 reply; 18+ messages in thread
From: Serge E. Hallyn @ 2023-10-24 14:15 UTC (permalink / raw)
  To: Boris Lukashev
  Cc: kernel-hardening, Serge E. Hallyn, Stefan Bavendiek, linux-hardening

Thanks for the reply.  Do you have any papers which came out of this r&d
phase?  Sounds very interesting.

> Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster

Yes, but that could be a feature.  I think of it as:  I'm unprivileged
user serge, and I want to fire off firefox in a whatzit-namespace so
that I can redirect or forbid some connections.  In this case, the
admins have not agreed to let me double my resource usage, so the fact
that the new namespace is sharing mine is a feature.  And this lets
me use network-namespace-like features completely unprivileged, without
having to use a setuid-root helper to hook up a bridge.

But, I didn't send this reply to advocate this approach.  My main point
was to mention that "network namespaces are network device namespaces"
and hope that others would bring other suggestions for alternatives.

-serge


* Re: Isolating abstract sockets
  2023-10-24 14:14   ` Paul Moore
@ 2023-10-24 14:18     ` Serge E. Hallyn
  2023-10-24 14:29       ` Paul Moore
  0 siblings, 1 reply; 18+ messages in thread
From: Serge E. Hallyn @ 2023-10-24 14:18 UTC (permalink / raw)
  To: Paul Moore
  Cc: Serge E. Hallyn, Stefan Bavendiek, kernel-hardening, linux-hardening

On Tue, Oct 24, 2023 at 10:14:29AM -0400, Paul Moore wrote:
> On Tue, Oct 24, 2023 at 9:46 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> > On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> > > When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
> >
> > Veeery late reply.  Have you had any productive discussions about this in
> > other threads or venues?
> >
> > > While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> > > It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> > >
> > > Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> > >
> > > Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> > >
> > > Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> > >
> > > The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> > > What would be the ideal way to implement a mechanism to disable abstract sockets, either globally or, even better, in the context of a process?
> > > And would such a patch have a realistic chance to make it into the kernel?
> >
> > Disabling them altogether would break lots of things depending on them,
> > like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> > namespaces.  There are several directions this could lead.  For one, as
> > Dinesh Subhraveti often points out, the current "network" namespace is
> > really a network device namespace.  If we instead namespace at the
> > bind/connect/etc calls, we end up with much different abilities.
> 
> The LSM layer supports access controls on abstract sockets, with at
> least two (AppArmor, SELinux) providing abstract socket access
> controls, other LSMs may provide controls as well.

Good point.  And for Stefan that may suffice, so thanks for mentioning
that.  But the LSM layer is mandatory access control for use by the
admins.  That doesn't help an unprivileged user.


* Re: Isolating abstract sockets
  2023-10-24 14:18     ` Serge E. Hallyn
@ 2023-10-24 14:29       ` Paul Moore
  2023-10-24 16:07         ` Serge E. Hallyn
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Moore @ 2023-10-24 14:29 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Stefan Bavendiek, kernel-hardening, linux-hardening

On Tue, Oct 24, 2023 at 10:18 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> On Tue, Oct 24, 2023 at 10:14:29AM -0400, Paul Moore wrote:
> > On Tue, Oct 24, 2023 at 9:46 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> > > On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> > > > When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
> > >
> > > Veeery late reply.  Have you had any productive discussions about this in
> > > other threads or venues?
> > >
> > > > While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> > > > It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> > > >
> > > > Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> > > >
> > > > Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> > > >
> > > > Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> > > >
> > > > The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> > > > What would be the ideal way to implement a mechanism to disable abstract sockets, either globally or, even better, in the context of a process?
> > > > And would such a patch have a realistic chance to make it into the kernel?
> > >
> > > Disabling them altogether would break lots of things depending on them,
> > > like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> > > namespaces.  There are several directions this could lead.  For one, as
> > > Dinesh Subhraveti often points out, the current "network" namespace is
> > > really a network device namespace.  If we instead namespace at the
> > > bind/connect/etc calls, we end up with much different abilities.
> >
> > The LSM layer supports access controls on abstract sockets, with at
> > least two (AppArmor, SELinux) providing abstract socket access
> > controls, other LSMs may provide controls as well.
>
> Good point.  And for Stefan that may suffice, so thanks for mentioning
> that.  But the LSM layer is mandatory access control for use by the
> admins.  That doesn't help an unprivileged user.

Individual LSMs may implement mandatory access control models, but
that is not an inherent requirement imposed by the LSM layer.  While
the Landlock LSM does not (yet?) support access controls for abstract
sockets, it is a discretionary access control mechanism.

I'm not currently aware of a discretionary access control LSM that
supports abstract socket access control, but such an LSM should be
possible if someone wanted to implement one.

-- 
paul-moore.com


* Re: Isolating abstract sockets
  2023-10-24 14:15     ` Serge E. Hallyn
@ 2023-10-24 15:55       ` Boris Lukashev
  2023-10-24 16:11         ` Serge E. Hallyn
  0 siblings, 1 reply; 18+ messages in thread
From: Boris Lukashev @ 2023-10-24 15:55 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: kernel-hardening, Stefan Bavendiek, linux-hardening

Good point: from the "resources granted to a user" perspective, that does
help bound their consumption. The nomenclature distinction seems like a
good one to have, but if "network namespaces" *change the meaning of the
term* and the original definition becomes "network device namespaces," then
there would be a period where older and newer kernels have very different
functions mapped to the same conceptual name. Might it make a bit more
sense for "network namespaces" to keep meaning what they do now ("network
device namespaces," effectively), while the new concept would be "socket
namespaces" to account for the various socket-style interfaces provided?

Thanks
-Boris



-- 
Boris Lukashev
Systems Architect
Semper Victus <https://www.sempervictus.com>


* Re: Isolating abstract sockets
  2023-10-24 14:29       ` Paul Moore
@ 2023-10-24 16:07         ` Serge E. Hallyn
  2023-10-25 11:54           ` Mickaël Salaün
  2023-10-31 20:40           ` Stefan Bavendiek
  0 siblings, 2 replies; 18+ messages in thread
From: Serge E. Hallyn @ 2023-10-24 16:07 UTC (permalink / raw)
  To: Paul Moore
  Cc: Serge E. Hallyn, Stefan Bavendiek, kernel-hardening, linux-hardening

On Tue, Oct 24, 2023 at 10:29:17AM -0400, Paul Moore wrote:
> On Tue, Oct 24, 2023 at 10:18 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> > On Tue, Oct 24, 2023 at 10:14:29AM -0400, Paul Moore wrote:
> > > On Tue, Oct 24, 2023 at 9:46 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> > > > On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> > > > > When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
> > > >
> > > > Veeery late reply.  Have you had any productive discussions about this in
> > > > other threads or venues?
> > > >
> > > > > While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> > > > > It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> > > > >
> > > > > Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> > > > >
> > > > > Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> > > > >
> > > > > Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> > > > >
> > > > > The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> > > > > What would be the ideal way to implement a mechanism to disable abstract sockets, either globally or, even better, in the context of a process?
> > > > > And would such a patch have a realistic chance to make it into the kernel?
> > > >
> > > > Disabling them altogether would break lots of things depending on them,
> > > > like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> > > > namespaces.  There are several directions this could lead.  For one, as
> > > > Dinesh Subhraveti often points out, the current "network" namespace is
> > > > really a network device namespace.  If we instead namespace at the
> > > > bind/connect/etc calls, we end up with much different abilities.
> > >
> > > The LSM layer supports access controls on abstract sockets, with at
> > > least two (AppArmor, SELinux) providing abstract socket access
> > > controls, other LSMs may provide controls as well.
> >
> > Good point.  And for Stefan that may suffice, so thanks for mentioning
> > that.  But the LSM layer is mandatory access control for use by the
> > admins.  That doesn't help an unprivileged user.
> 
> Individual LSMs may implement mandatory access control models, but
> that is not an inherent requirement imposed by the LSM layer.  While
> the Landlock LSM does not (yet?) support access controls for abstract
> sockets, it is a discretionary access control mechanism.

In 2005, before namespaces were upstreamed, I posted the 'bsdjail' LSM,
which briefly made it into the -mm kernel, but was eventually rejected as
being an abuse of the LSM interface for OS level virtualization :)

It's not 100% clear to me whether Stefan only wants isolation, or
wants something closer to virtualization.

Stefan, would an LSM allowing you to isolate certain processes from
some abstract unix socket paths (or by label, whatever) suffice for you?

> I'm not currently aware of a discretionary access control LSM that
> supports abstract socket access control, but such a LSM should be
> possible if someone wanted to implement one.
> 
> -- 
> paul-moore.com

* Re: Isolating abstract sockets
  2023-10-24 15:55       ` Boris Lukashev
@ 2023-10-24 16:11         ` Serge E. Hallyn
  0 siblings, 0 replies; 18+ messages in thread
From: Serge E. Hallyn @ 2023-10-24 16:11 UTC (permalink / raw)
  To: Boris Lukashev
  Cc: Serge E. Hallyn, kernel-hardening, Stefan Bavendiek, linux-hardening

Yeah, I think I've heard the term "socket namespaces" before, and I
agree that changing the term 'network namespaces' in the kernel would
probably not be practical at this point.

On Tue, Oct 24, 2023 at 11:55:43AM -0400, Boris Lukashev wrote:
> Good point: from the "resources granted to a user" perspective, that does
> help bound their consumption. The nomenclature distinction seems like a
> good one to have, but if "network namespaces" *change the meaning of the
> term *and the original definition becomes "network device namespaces," then
> there would be a period where older and newer kernels have very different
> functions mapped to the same conceptual name. Might this make a bit more
> sense as "network namespaces" meaning what they do now - "network device
> namespaces," effectively; while the new concept would be "socket
> namespaces" to account for the various socket style interfaces provided?
> 
> Thanks
> -Boris
> 
> On Tue, Oct 24, 2023 at 10:15 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> 
> > Thanks for the reply.  Do you have any papers which came out of this r&d
> > phase?  Sounds very interesting.
> >
> > > Multiple NS' sharing an IP stack would exhaust ephemeral ranges faster
> >
> > Yes, but that could be a feature.  I think of it as:  I'm unprivileged
> > user serge, and I want to fire off firefox in a whatzit-namespace so
> > that I can redirect or forbid some connections.  In this case, the
> > admins have not agreed to let me double my resource usage, so the fact
> > that the new namespace is sharing mine is a feature.  And this lets
> > me use network-namespace-like features completely unprivileged, without
> > having to use a setuid-root helper to hook up a bridge.
> >
> > But, I didn't send this reply to advocate this approach.  My main point
> > was to mention that "network namespaces are network device namespaces"
> > and hope that others would bring other suggestions for alternatives.
> >
> > -serge
> >
> > On Tue, Oct 24, 2023 at 10:05:29AM -0400, Boris Lukashev wrote:
> > > Namespacing at OSI4 seems a bit fraught as the underlying route, mac,
> > > and physdev fall outside the caller's control. Multiple NS' sharing an IP
> > > stack would exhaust ephemeral ranges faster (likely asymmetrically too) and
> > > have bound socket collisions opaque to each other requiring handling
> > > outside the NS/container's purview. We looked at this sort of thing during
> > > the r&d phase of our assured comms work (namespaces were young) and found a
> > > bunch of overhead and collision concerns. Not saying it can't be done, but
> > > getting consumers to play nice enough with such an approach may be a heavy
> > > lift.
> > >
> > > Thanks,
> > > -Boris
> > >
> > >
> > > On October 24, 2023 9:46:08 AM EDT, "Serge E. Hallyn" <serge@hallyn.com> wrote:
> > > >On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> > > >> When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
> > > >
> > > >Veeery late reply.  Have you had any productive discussions about this in
> > > >other threads or venues?
> > > >
> > > >> While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> > > >> It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> > > >>
> > > >> Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> > > >>
> > > >> Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> > > >>
> > > >> Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> > > >>
> > > >> The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> > > >> What would be the ideal way to implement a mechanism to disable abstract sockets either globally or even better, in the context of a process?
> > > >> And would such a patch have a realistic chance to make it into the kernel?
> > > >
> > > >Disabling them altogether would break lots of things depending on them,
> > > >like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> > > >namespaces.  There are several directions this could lead.  For one, as
> > > >Dinesh Subhraveti often points out, the current "network" namespace is
> > > >really a network device namespace.  If we instead namespace at the
> > > >bind/connect/etc calls, we end up with much different abilities.  You
> > > >can implement something like this today using seccomp-filter.
> > > >
> > > >-serge
> >
> 
> 
> -- 
> Boris Lukashev
> Systems Architect
> Semper Victus <https://www.sempervictus.com>

* Re: Isolating abstract sockets
  2023-10-24 16:07         ` Serge E. Hallyn
@ 2023-10-25 11:54           ` Mickaël Salaün
  2023-10-31 20:40           ` Stefan Bavendiek
  1 sibling, 0 replies; 18+ messages in thread
From: Mickaël Salaün @ 2023-10-25 11:54 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Paul Moore, Stefan Bavendiek, kernel-hardening, linux-hardening,
	Björn Roy Baron, linux-security-module,
	Konstantin Meskhidze, Günther Noack

On Tue, Oct 24, 2023 at 11:07:14AM -0500, Serge E. Hallyn wrote:
> On Tue, Oct 24, 2023 at 10:29:17AM -0400, Paul Moore wrote:
> > On Tue, Oct 24, 2023 at 10:18 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> > > On Tue, Oct 24, 2023 at 10:14:29AM -0400, Paul Moore wrote:
> > > > On Tue, Oct 24, 2023 at 9:46 AM Serge E. Hallyn <serge@hallyn.com> wrote:
> > > > > On Sun, Dec 18, 2022 at 08:29:10PM +0100, Stefan Bavendiek wrote:
> > > > > > When building userspace application sandboxes, one issue that does not seem trivial to solve is the isolation of abstract sockets.
> > > > >
> > > > > Veeery late reply.  Have you had any productive discussions about this in
> > > > > other threads or venues?
> > > > >
> > > > > > > While most IPC mechanisms can be isolated by mechanisms like mount namespaces, abstract sockets are part of the network namespace.
> > > > > > It is possible to isolate abstract sockets by using a new network namespace, however, unprivileged processes can only create a new empty network namespace, which removes network access as well and makes this useless for network clients.
> > > > > >
> > > > > > > Some Linux sandbox projects try to solve this by bridging the existing network interfaces into the new namespace or use something like slirp4netns to achieve this, but this does not look like an ideal solution to this problem, especially since sandboxing should reduce the kernel attack surface without introducing more complexity.
> > > > > >
> > > > > > Aside from containers using namespaces, sandbox implementations based on seccomp and landlock would also run into the same problem, since landlock only provides file system isolation and seccomp cannot filter the path argument and therefore it can only be used to block new unix domain socket connections completely.
> > > > > >
> > > > > > Currently there does not seem to be any way to disable network namespaces in the kernel without also disabling unix domain sockets.
> > > > > >
> > > > > > The question is how to solve the issue of abstract socket isolation in a clean and efficient way, possibly even without namespaces.
> > > > > > > What would be the ideal way to implement a mechanism to disable abstract sockets either globally or even better, in the context of a process?
> > > > > > And would such a patch have a realistic chance to make it into the kernel?
> > > > >
> > > > > Disabling them altogether would break lots of things depending on them,
> > > > > like X :)  (@/tmp/.X11-unix/X0).  The other path is to reconsider network
> > > > > namespaces.  There are several directions this could lead.  For one, as
> > > > > Dinesh Subhraveti often points out, the current "network" namespace is
> > > > > really a network device namespace.  If we instead namespace at the
> > > > > bind/connect/etc calls, we end up with much different abilities.
> > > >
> > > > The LSM layer supports access controls on abstract sockets, with at
> > > > least two (AppArmor, SELinux) providing abstract socket access
> > > > controls, other LSMs may provide controls as well.
> > >
> > > Good point.  And for Stefan that may suffice, so thanks for mentioning
> > > > that.  But the LSM layer is mandatory access control for use by the
> > > admins.  That doesn't help an unprivileged user.
> > 
> > Individual LSMs may implement mandatory access control models, but
> > that is not an inherent requirement imposed by the LSM layer.  While
> > the Landlock LSM does not (yet?) support access controls for abstract
> > sockets, it is a discretionary access control mechanism.

A recent discussion focused on this topic:
https://lore.kernel.org/all/20231023.ahphah4Wii4v@digikod.net/

I'd like Landlock to be able to scope the use of unix sockets according
to a Landlock domain the same way it is done for ptrace. This would make
it possible to easily isolate unix sockets to a sandbox even by
unprivileged processes (without any namespace change). I'd be happy to
help implement such a mechanism.

> 
> In 2005, before namespaces were upstreamed, I posted the 'bsdjail' LSM,
> which briefly made it into the -mm kernel, but was eventually rejected as
> being an abuse of the LSM interface for OS level virtualization :)
> 
> It's not 100% clear to me whether Stefan only wants isolation, or
> wants something closer to virtualization.
> 
> Stefan, would an LSM allowing you to isolate certain processes from
> some abstract unix socket paths (or by label, whatever) suffice for you?
> 
> > I'm not currently aware of a discretionary access control LSM that
> > supports abstract socket access control, but such a LSM should be
> > possible if someone wanted to implement one.
> > 
> > -- 
> > paul-moore.com

* Re: Isolating abstract sockets
  2023-10-24 13:46 ` Serge E. Hallyn
  2023-10-24 14:05   ` Boris Lukashev
  2023-10-24 14:14   ` Paul Moore
@ 2023-10-25 17:10   ` Jann Horn
  2023-10-25 17:22     ` Serge E. Hallyn
  2 siblings, 1 reply; 18+ messages in thread
From: Jann Horn @ 2023-10-25 17:10 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Stefan Bavendiek, kernel-hardening, linux-hardening

On Tue, Oct 24, 2023 at 3:46 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> Disabling them altogether would break lots of things depending on them,
> like X :)  (@/tmp/.X11-unix/X0).

FWIW, X can connect over both filesystem-based unix domain sockets and
abstract unix domain sockets. When a normal X client tries to connect
to the server, it'll try a bunch of stuff, including an abstract unix
socket address, a filesystem-based unix socket address, and TCP:

$ DISPLAY=:12345 strace -f -e trace=connect xev >/dev/null
connect(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X12345"}, 24) = -1 ECONNREFUSED (Connection refused)
connect(3, {sa_family=AF_UNIX, sun_path="/tmp/.X11-unix/X12345"}, 110) = -1 ENOENT (No such file or directory)
[...]
connect(3, {sa_family=AF_INET, sin_port=htons(18345), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(18345), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(18345), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0), sin6_scope_id=0}, 28) = -1 ECONNREFUSED (Connection refused)
connect(3, {sa_family=AF_INET, sin_port=htons(18345), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection refused)

And the X server normally listens on both an abstract and a
filesystem-based unix socket address (see "netstat --unix -lnp").

So rejecting abstract unix socket connections shouldn't prevent an X
client from connecting to the X server, I think.
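
For readers less familiar with the two address forms in that trace, here is a
minimal sketch of how each one is built, assuming Linux and glibc. The helper
names make_abstract_addr() and make_path_addr() are made up for illustration
and do not come from Xlib or from this thread. An abstract address has a
leading NUL byte in sun_path and a length that counts exactly the name bytes,
which is where the 24 in the first connect() comes from (2 bytes of family,
1 NUL marker, 21 name bytes). The filesystem form is passed with the full
sizeof(struct sockaddr_un), which is the 110 in the second connect().

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper (illustration only): fill in an abstract-namespace
 * AF_UNIX address and return the length to pass to bind() or connect().
 * Returns 0 if the name is empty or does not fit.
 */
static socklen_t make_abstract_addr(struct sockaddr_un *addr, const char *name)
{
    size_t name_len = strlen(name);

    /* One byte of sun_path is taken by the leading NUL marker. */
    if (name_len == 0 || name_len > sizeof(addr->sun_path) - 1)
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0'; that is what makes the address abstract. */
    memcpy(addr->sun_path + 1, name, name_len);

    /*
     * The length covers the family, the NUL marker and the name bytes.
     * No trailing NUL is counted, because every byte inside the length
     * is part of an abstract name.
     */
    return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + name_len);
}

/*
 * Hypothetical helper (illustration only): the filesystem (pathname)
 * form of the same address, for comparison. Returns 0 if the path is
 * empty or does not fit together with its terminating NUL.
 */
static socklen_t make_path_addr(struct sockaddr_un *addr, const char *path)
{
    size_t path_len = strlen(path);

    if (path_len == 0 || path_len >= sizeof(addr->sun_path))
        return 0;

    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* The memset above already provides the terminating NUL. */
    memcpy(addr->sun_path, path, path_len);

    return (socklen_t)sizeof(*addr);
}
```

The abstract name lives in the network namespace rather than on disk, so
mount namespaces and filesystem permissions never see it. That is why the
two forms need different isolation mechanisms.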

* Re: Isolating abstract sockets
  2023-10-25 17:10   ` Jann Horn
@ 2023-10-25 17:22     ` Serge E. Hallyn
  2023-10-25 17:41       ` Jann Horn
  0 siblings, 1 reply; 18+ messages in thread
From: Serge E. Hallyn @ 2023-10-25 17:22 UTC (permalink / raw)
  To: Jann Horn
  Cc: Serge E. Hallyn, Stefan Bavendiek, kernel-hardening, linux-hardening

On Wed, Oct 25, 2023 at 07:10:07PM +0200, Jann Horn wrote:
> On Tue, Oct 24, 2023 at 3:46 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> > Disabling them altogether would break lots of things depending on them,
> > like X :)  (@/tmp/.X11-unix/X0).
> 
> FWIW, X can connect over both filesystem-based unix domain sockets and
> abstract unix domain sockets. When a normal X client tries to connect
> to the server, it'll try a bunch of stuff, including an abstract unix
> socket address, a filesystem-based unix socket address, and TCP:
> 
> $ DISPLAY=:12345 strace -f -e trace=connect xev >/dev/null
> connect(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X12345"}, 24)
> = -1 ECONNREFUSED (Connection refused)
> connect(3, {sa_family=AF_UNIX, sun_path="/tmp/.X11-unix/X12345"}, 110)
> = -1 ENOENT (No such file or directory)
> [...]
> connect(3, {sa_family=AF_INET, sin_port=htons(18345),
> sin_addr=inet_addr("127.0.0.1")}, 16) = 0
> connect(3, {sa_family=AF_INET6, sin6_port=htons(18345),
> inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0),
> sin6_scope_id=0}, 28) = 0
> connect(3, {sa_family=AF_INET6, sin6_port=htons(18345),
> inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0),
> sin6_scope_id=0}, 28) = -1 ECONNREFUSED (Connection refused)
> connect(3, {sa_family=AF_INET, sin_port=htons(18345),
> sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
> refused)
> 
> And the X server normally listens on both an abstract and a
> filesystem-based unix socket address (see "netstat --unix -lnp").
> 
> So rejecting abstract unix socket connections shouldn't prevent an X
> client from connecting to the X server, I think.

Well it was just an example :)  Dbus is another.  But maybe all
the users of abstract unix sockets will fall back gracefully to
something else.  That'd be nice.

For X, abstract really doesn't even make sense to me.  Has it always
supported that?

* Re: Isolating abstract sockets
  2023-10-25 17:22     ` Serge E. Hallyn
@ 2023-10-25 17:41       ` Jann Horn
  0 siblings, 0 replies; 18+ messages in thread
From: Jann Horn @ 2023-10-25 17:41 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: Stefan Bavendiek, kernel-hardening, linux-hardening

On Wed, Oct 25, 2023 at 7:22 PM Serge E. Hallyn <serge@hallyn.com> wrote:
>
> On Wed, Oct 25, 2023 at 07:10:07PM +0200, Jann Horn wrote:
> > On Tue, Oct 24, 2023 at 3:46 PM Serge E. Hallyn <serge@hallyn.com> wrote:
> > > Disabling them altogether would break lots of things depending on them,
> > > like X :)  (@/tmp/.X11-unix/X0).
> >
> > FWIW, X can connect over both filesystem-based unix domain sockets and
> > abstract unix domain sockets. When a normal X client tries to connect
> > to the server, it'll try a bunch of stuff, including an abstract unix
> > socket address, a filesystem-based unix socket address, and TCP:
> >
> > $ DISPLAY=:12345 strace -f -e trace=connect xev >/dev/null
> > connect(3, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X12345"}, 24)
> > = -1 ECONNREFUSED (Connection refused)
> > connect(3, {sa_family=AF_UNIX, sun_path="/tmp/.X11-unix/X12345"}, 110)
> > = -1 ENOENT (No such file or directory)
> > [...]
> > connect(3, {sa_family=AF_INET, sin_port=htons(18345),
> > sin_addr=inet_addr("127.0.0.1")}, 16) = 0
> > connect(3, {sa_family=AF_INET6, sin6_port=htons(18345),
> > inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0),
> > sin6_scope_id=0}, 28) = 0
> > connect(3, {sa_family=AF_INET6, sin6_port=htons(18345),
> > inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=htonl(0),
> > sin6_scope_id=0}, 28) = -1 ECONNREFUSED (Connection refused)
> > connect(3, {sa_family=AF_INET, sin_port=htons(18345),
> > sin_addr=inet_addr("127.0.0.1")}, 16) = -1 ECONNREFUSED (Connection
> > refused)
> >
> > And the X server normally listens on both an abstract and a
> > filesystem-based unix socket address (see "netstat --unix -lnp").
> >
> > So rejecting abstract unix socket connections shouldn't prevent an X
> > client from connecting to the X server, I think.
>
> Well it was just an example :)  Dbus is another.  But maybe all
> the users of abstract unix sockets will fall back gracefully to
> something else.  That'd be nice.

For what it's worth, when I try to connect to the session or system
bus on my system (like "strace -f -e trace=connect dbus-send
--session/--system /foo foo"), the connections seem to go directly to
a filesystem socket...

> For X, abstract really doesn't even make sense to me.  Has it always
> supported that?

No idea.

* Re: Isolating abstract sockets
  2023-10-24 16:07         ` Serge E. Hallyn
  2023-10-25 11:54           ` Mickaël Salaün
@ 2023-10-31 20:40           ` Stefan Bavendiek
  2023-11-01 10:56             ` Mickaël Salaün
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Bavendiek @ 2023-10-31 20:40 UTC (permalink / raw)
  To: Serge E. Hallyn; +Cc: kernel-hardening, linux-hardening

On Tue, Oct 24, 2023 at 11:07:14AM -0500, Serge E. Hallyn wrote:
> In 2005, before namespaces were upstreamed, I posted the 'bsdjail' LSM,
> which briefly made it into the -mm kernel, but was eventually rejected as
> being an abuse of the LSM interface for OS level virtualization :)
> 
> It's not 100% clear to me whether Stefan only wants isolation, or
> wants something closer to virtualization.
> 
> Stefan, would an LSM allowing you to isolate certain processes from
> some abstract unix socket paths (or by label, whatever) suffice for you?
>

My intention was to find a clean way to isolate abstract sockets in network
applications without adding dependencies like LSMs. However, the entire approach
of using namespaces for this is something I have mostly abandoned. LSMs like
AppArmor and SELinux would work fine for process isolation when you can control
the target system, but for general deployment of sandboxed processes, I found it
to be significantly easier (and more effective) to build this into the
application itself by using a multi-process approach with seccomp (basically how
OpenSSH did it).

- Stefan

* Re: Isolating abstract sockets
  2023-10-31 20:40           ` Stefan Bavendiek
@ 2023-11-01 10:56             ` Mickaël Salaün
  2023-11-01 16:23               ` Jann Horn
  0 siblings, 1 reply; 18+ messages in thread
From: Mickaël Salaün @ 2023-11-01 10:56 UTC (permalink / raw)
  To: Stefan Bavendiek; +Cc: Serge E. Hallyn, kernel-hardening, linux-hardening

On Tue, Oct 31, 2023 at 09:40:59PM +0100, Stefan Bavendiek wrote:
> On Tue, Oct 24, 2023 at 11:07:14AM -0500, Serge E. Hallyn wrote:
> > In 2005, before namespaces were upstreamed, I posted the 'bsdjail' LSM,
> > which briefly made it into the -mm kernel, but was eventually rejected as
> > being an abuse of the LSM interface for OS level virtualization :)
> > 
> > It's not 100% clear to me whether Stefan only wants isolation, or
> > wants something closer to virtualization.
> > 
> > Stefan, would an LSM allowing you to isolate certain processes from
> > some abstract unix socket paths (or by label, whatever) suffice for you?
> >
> 
> My intention was to find a clean way to isolate abstract sockets in network
> applications without adding dependencies like LSMs. However the entire approach
> of using namespaces for this is something I have mostly abandoned. LSMs like
> AppArmor and SELinux would work fine for process isolation when you can control
> the target system, but for general deployment of sandboxed processes, I found it
> to be significantly easier (and more effective) to build this into the
> application itself by using a multi process approach with seccomp (Basically how
> OpenSSH did it)

I agree that for sandbox use cases embedding such security policy into
the application itself makes sense. Landlock works the same way as
seccomp but it sandboxes applications according to the kernel semantics
(e.g. process, socket). The LSM framework is just a kernel
implementation detail. ;)

* Re: Isolating abstract sockets
  2023-11-01 10:56             ` Mickaël Salaün
@ 2023-11-01 16:23               ` Jann Horn
  2023-11-02 14:50                 ` Mickaël Salaün
  0 siblings, 1 reply; 18+ messages in thread
From: Jann Horn @ 2023-11-01 16:23 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Stefan Bavendiek, Serge E. Hallyn, kernel-hardening, linux-hardening

On Wed, Nov 1, 2023 at 11:57 AM Mickaël Salaün <mic@digikod.net> wrote:
> On Tue, Oct 31, 2023 at 09:40:59PM +0100, Stefan Bavendiek wrote:
> > On Tue, Oct 24, 2023 at 11:07:14AM -0500, Serge E. Hallyn wrote:
> > > In 2005, before namespaces were upstreamed, I posted the 'bsdjail' LSM,
> > > which briefly made it into the -mm kernel, but was eventually rejected as
> > > being an abuse of the LSM interface for OS level virtualization :)
> > >
> > > It's not 100% clear to me whether Stefan only wants isolation, or
> > > wants something closer to virtualization.
> > >
> > > Stefan, would an LSM allowing you to isolate certain processes from
> > > some abstract unix socket paths (or by label, whatever) suffice for you?
> > >
> >
> > My intention was to find a clean way to isolate abstract sockets in network
> > applications without adding dependencies like LSMs. However the entire approach
> > of using namespaces for this is something I have mostly abandoned. LSMs like
> > AppArmor and SELinux would work fine for process isolation when you can control
> > the target system, but for general deployment of sandboxed processes, I found it
> > to be significantly easier (and more effective) to build this into the
> > application itself by using a multi process approach with seccomp (Basically how
> > OpenSSH did it)
>
> I agree that for sandbox use cases embedding such security policy into
> the application itself makes sense. Landlock works the same way as
> seccomp but it sandboxes applications according to the kernel semantics
> (e.g. process, socket). The LSM framework is just a kernel
> implementation detail. ;)

(Related, it might be nice if Landlock had a way to completely deny
access to abstract unix sockets, and a way to restrict filesystem unix
sockets with filesystem rules... LANDLOCK_ACCESS_FS_MAKE_SOCK exists
for restricting bind(), but I don't think there's an analogous
permission for connect().

Currently, when you try to sandbox an application with Landlock, you
have to use seccomp to completely block access to unix domain sockets,
or alternatively use something like the seccomp_unotify feature to
interactively filter connect() calls.

On the other hand, maybe such a feature would be a bit superfluous
when we have seccomp_unotify already... idk.)
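
A rough sketch of the "use seccomp to completely block" option, assuming
Linux on x86-64 or arm64 with a libc that issues socket() as a direct
syscall. The names deny_unix_socket_creation() and check_filter_in_child()
are hypothetical. The filter can compare the address family because it is a
plain integer argument, but it cannot look at the sockaddr that connect()
receives, since seccomp BPF does not dereference pointers. So this only
stops new AF_UNIX sockets from being created with socket(); sockets the
process already holds, and socketpair(), are untouched.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__x86_64__)
#define SKETCH_AUDIT_ARCH AUDIT_ARCH_X86_64
#elif defined(__aarch64__)
#define SKETCH_AUDIT_ARCH AUDIT_ARCH_AARCH64
#else
#error "this sketch only covers x86-64 and arm64"
#endif

/*
 * Hypothetical helper: make socket(AF_UNIX, ...) fail with EACCES for the
 * calling thread and anything it later forks or execs. Returns 0 on success.
 * Limitation: on x86-64 kernels with the x32 ABI enabled, a real filter
 * would also reject syscall numbers that have __X32_SYSCALL_BIT set.
 */
static int deny_unix_socket_creation(void)
{
    struct sock_filter filter[] = {
        /* Kill the process if the syscall ABI is not the expected one. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SKETCH_AUDIT_ARCH, 1, 0),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
        /* Only socket() is inspected; every other syscall is allowed. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_socket, 0, 3),
        /* args[0] is the address family, an integer the filter can read. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, args[0])),
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AF_UNIX, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (EACCES & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(filter) / sizeof(filter[0])),
        .filter = filter,
    };

    /* Required so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/*
 * Hypothetical self-check: apply the filter in a child so the caller stays
 * unfiltered. Returns 0 if AF_UNIX worked before the filter and is denied
 * with EACCES afterwards, a small positive code otherwise, -1 on fork or
 * wait failure.
 */
static int check_filter_in_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (socket(AF_UNIX, SOCK_STREAM, 0) < 0)
            _exit(4);   /* AF_UNIX was unavailable to begin with */
        if (deny_unix_socket_creation() != 0)
            _exit(2);   /* filter could not be installed */
        if (socket(AF_UNIX, SOCK_STREAM, 0) != -1 || errno != EACCES)
            _exit(1);   /* filter did not take effect */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Other address families fall through to SECCOMP_RET_ALLOW, so a network
client keeps AF_INET and AF_INET6. The cost is that the filesystem unix
sockets go away along with the abstract ones, which is the trade-off
described above.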

* Re: Isolating abstract sockets
  2023-11-01 16:23               ` Jann Horn
@ 2023-11-02 14:50                 ` Mickaël Salaün
  0 siblings, 0 replies; 18+ messages in thread
From: Mickaël Salaün @ 2023-11-02 14:50 UTC (permalink / raw)
  To: Jann Horn
  Cc: Stefan Bavendiek, Serge E. Hallyn, kernel-hardening,
	linux-hardening, Konstantin Meskhidze, Günther Noack

On Wed, Nov 01, 2023 at 05:23:12PM +0100, Jann Horn wrote:
> On Wed, Nov 1, 2023 at 11:57 AM Mickaël Salaün <mic@digikod.net> wrote:
> > On Tue, Oct 31, 2023 at 09:40:59PM +0100, Stefan Bavendiek wrote:
> > > On Tue, Oct 24, 2023 at 11:07:14AM -0500, Serge E. Hallyn wrote:
> > > > In 2005, before namespaces were upstreamed, I posted the 'bsdjail' LSM,
> > > > which briefly made it into the -mm kernel, but was eventually rejected as
> > > > being an abuse of the LSM interface for OS level virtualization :)
> > > >
> > > > It's not 100% clear to me whether Stefan only wants isolation, or
> > > > wants something closer to virtualization.
> > > >
> > > > Stefan, would an LSM allowing you to isolate certain processes from
> > > > some abstract unix socket paths (or by label, whatever) suffice for you?
> > > >
> > >
> > > My intention was to find a clean way to isolate abstract sockets in network
> > > applications without adding dependencies like LSMs. However the entire approach
> > > of using namespaces for this is something I have mostly abandoned. LSMs like
> > > AppArmor and SELinux would work fine for process isolation when you can control
> > > the target system, but for general deployment of sandboxed processes, I found it
> > > to be significantly easier (and more effective) to build this into the
> > > application itself by using a multi process approach with seccomp (Basically how
> > > OpenSSH did it)
> >
> > I agree that for sandbox use cases embedding such security policy into
> > the application itself makes sense. Landlock works the same way as
> > seccomp but it sandboxes applications according to the kernel semantics
> > (e.g. process, socket). The LSM framework is just a kernel
> > implementation detail. ;)
> 
> (Related, it might be nice if Landlock had a way to completely deny
> access to abstract unix sockets,

I think it would make more sense to scope access to abstract unix
sockets: https://lore.kernel.org/all/20231025.eecai4uGh5Ie@digikod.net/

A complementary approach would be to restrict socket creation according
to their properties:
https://lore.kernel.org/all/b8a2045a-e7e8-d141-7c01-bf47874c7930@digikod.net/

> and a way to restrict filesystem unix
> sockets with filesystem rules... LANDLOCK_ACCESS_FS_MAKE_SOCK exists
> for restricting bind(), but I don't think there's an analogous
> permission for connect().

I agree. It should not be too difficult to add a new LSM path hook for
connect (and sendmsg) to named unix sockets with the related access
rights.  We should be careful about the impact on sendmsg calls though.

> 
> Currently, when you try to sandbox an application with Landlock, you
> have to use seccomp to completely block access to unix domain sockets,
> or alternatively use something like the seccomp_unotify feature to
> interactively filter connect() calls.
> 
> On the other hand, maybe such a feature would be a bit superfluous
> when we have seccomp_unotify already... idk.)

seccomp_unotify enables user space to emulate syscalls, which requires a
service per sandbox. seccomp is useful, but for sandboxing use cases it will
always be delicate to use, and the related filters delicate to maintain:
https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/

Anyway, I'd be happy to help improve Landlock with new access control
types.

FYI, TCP connect and bind access control should be part of Linux 6.7:
https://lore.kernel.org/all/20231102131354.263678-1-mic@digikod.net/

end of thread, other threads:[~2023-11-02 14:51 UTC | newest]

Thread overview: 18+ messages
2022-12-18 19:29 Isolating abstract sockets Stefan Bavendiek
2023-10-24 13:46 ` Serge E. Hallyn
2023-10-24 14:05   ` Boris Lukashev
2023-10-24 14:15     ` Serge E. Hallyn
2023-10-24 15:55       ` Boris Lukashev
2023-10-24 16:11         ` Serge E. Hallyn
2023-10-24 14:14   ` Paul Moore
2023-10-24 14:18     ` Serge E. Hallyn
2023-10-24 14:29       ` Paul Moore
2023-10-24 16:07         ` Serge E. Hallyn
2023-10-25 11:54           ` Mickaël Salaün
2023-10-31 20:40           ` Stefan Bavendiek
2023-11-01 10:56             ` Mickaël Salaün
2023-11-01 16:23               ` Jann Horn
2023-11-02 14:50                 ` Mickaël Salaün
2023-10-25 17:10   ` Jann Horn
2023-10-25 17:22     ` Serge E. Hallyn
2023-10-25 17:41       ` Jann Horn
