* RFC(v2): Audit Kernel Container IDs
From: Richard Guy Briggs @ 2017-10-12 14:14 UTC
  To: cgroups, Linux Containers, Linux API,
	Linux Audit, Linux FS Devel, Linux Kernel,
	Linux Network Development
  Cc: mszeredi, Steve Grubb, Andy Lutomirski,
	jlayton, Carlos O'Donell, Paul Moore,
	Al Viro, David Howells, Simo Sorce,
	trondmy, Eric Paris, Eric W. Biederman

Containers are a userspace concept.  The kernel knows nothing of them.

The Linux audit system needs a way to be able to track the container
provenance of events and actions.  Audit needs the kernel's help to do
this.

Since the concept of a container is entirely a userspace concept, a
registration from the userspace container orchestration system initiates
this.  This will define a point in time and a set of resources
associated with a particular container with an audit container ID.

The registration is a pseudo filesystem (proc, since PID tree already
exists) write of a u8[16] UUID representing the container ID to a file
representing a process that will become the first process in a new
container.  This write might require restrictions on the mount namespaces
needed to define a container, or at least careful checking of namespaces
in the kernel to verify the orchestrator's permissions so that it cannot
change its own container ID.  A bind mount of nsfs may be necessary in
the container orchestrator's mntNS.
Note: Use a 128-bit scalar rather than a string to make compares faster
and simpler.
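
To illustrate the note above, here is a minimal userspace sketch in
Python (not kernel code) of why a 16-byte scalar is simpler to compare
than its textual form.  The helper name parse_container_id is
hypothetical, and the sketch assumes the ID is an ordinary UUID.

```python
import uuid

def parse_container_id(text: str) -> bytes:
    """Turn a textual UUID into the 16 raw bytes (u8[16]) described above."""
    return uuid.UUID(text).bytes

# The same UUID can be spelled several ways as a string ...
a = parse_container_id("123e4567-e89b-12d3-a456-426614174000")
b = parse_container_id("123E4567E89B12D3A456426614174000")
c = parse_container_id("{123e4567-e89b-12d3-a456-426614174001}")

# ... but the scalar form has exactly one representation, so equality is a
# fixed-size 16-byte compare with no case folding or punctuation handling.
same = a == b
different = a != c

# The same 16 bytes viewed as a single 128-bit integer.
as_u128 = int.from_bytes(a, "big")
```

A string compare would first have to normalize case, hyphens and braces;
the scalar compare needs none of that.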

Require a new CAP_CONTAINER_ADMIN to be able to carry out the
registration.  At that time, record the target container's user-supplied
container identifier, the process ID (referenced from the initial PID
namespace) of the target container's first process (which may become the
target container's "init" process), and all namespace IDs (each in the
form of an nsfs device number and inode number tuple) in a new auxiliary
record AUDIT_CONTAINER with a qualifying op=$action field.
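
As a rough illustration of the namespace ID tuples mentioned above, the
following Python sketch reads the (device, inode) pair of each namespace
of a process from /proc/<pid>/ns, which is how userspace can already
identify a namespace today.  The record layout produced by
format_container_record, including its field names, is invented for this
example and is not a proposed format.

```python
import os

def namespace_ids(pid="self"):
    """Return {ns_name: (device, inode)} for one process, from /proc/<pid>/ns."""
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return ids  # no procfs, no such process, or not permitted
    for name in names:
        try:
            # os.stat() follows the symlink to the nsfs inode
            st = os.stat(os.path.join(ns_dir, name))
        except OSError:
            continue
        ids[name] = (st.st_dev, st.st_ino)
    return ids

def format_container_record(op, contid_hex, pid, ns_ids):
    """Build a record-like string; the field names are illustrative only."""
    ns_part = " ".join(f"{n}={dev:x}:{ino}" for n, (dev, ino) in ns_ids.items())
    return f"op={op} contid={contid_hex} pid={pid} {ns_part}".rstrip()

if __name__ == "__main__":
    print(format_container_record("register", "123e4567e89b12d3a456426614174000",
                                  os.getpid(), namespace_ids()))
```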

Issue a new auxiliary record AUDIT_CONTAINER_INFO for each valid
container ID present on an auditable action or event.

Forked and cloned processes inherit their parent's container ID,
referenced in the process' task_struct.

Mimic setns(2) and return an error if the process has already initiated
threading or forked since this registration should happen before the
process execution is started by the orchestrator and hence should not
yet have any threads or children.  If this is deemed overly restrictive,
switch all threads and children to the new containerID.
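
The inheritance and registration rules above can be modelled in a few
lines of Python.  This is only a toy model to make the intended behaviour
concrete: the Task class, its fields and the exceptions raised are all
assumptions of the sketch, and the real check would live in the kernel
and return an error code instead.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Task:
    pid: int
    contid: Optional[int] = None      # None means no container ID is set
    threads: int = 1                  # 1 means single-threaded
    children: List["Task"] = field(default_factory=list)

    def fork(self, pid: int) -> "Task":
        # A forked or cloned task inherits its parent's container ID.
        child = Task(pid=pid, contid=self.contid)
        self.children.append(child)
        return child

def register(task: Task, contid: int, has_cap: bool) -> None:
    """Model of the registration write; raises where the kernel would fail."""
    if not has_cap:
        raise PermissionError("CAP_CONTAINER_ADMIN required")
    if task.threads > 1 or task.children:
        # Same spirit as setns(2): refuse once the task has threaded or forked.
        raise RuntimeError("task already has threads or children")
    task.contid = contid
```

Under the less restrictive alternative, register() would instead walk
the task's threads and children and set the new ID on each of them.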

Trust the orchestrator to judiciously use and restrict CAP_CONTAINER_ADMIN.

Log the creation of every namespace, inheriting/adding its spawning
process' containerID(s), if applicable.  Include the spawning and
spawned namespace IDs (device and inode number tuples).
[AUDIT_NS_CREATE, AUDIT_NS_DESTROY] [clone(2), unshare(2), setns(2)]
Note: At this point it appears only network namespaces may need to track
container IDs apart from processes since incoming packets may cause an
auditable event before being associated with a process.

Log the destruction of every namespace when it is no longer used by any
process, and include the namespace IDs (device and inode number tuples).
[AUDIT_NS_DESTROY] [process exit, unshare(2), setns(2)]

Issue a new auxiliary record AUDIT_NS_CHANGE listing (opt: op=$action)
the parent and child namespace IDs for any changes to a process'
namespaces. [setns(2)]
Note: It may be possible to combine AUDIT_NS_* record formats and
distinguish them with an op=$action field depending on the fields
required for each message type.

When a container ceases to exist because the last process in that
container has exited, and hence the last namespace has been destroyed and
its refcount has dropped to zero, log the fact.
(This is likely needed for certification accountability.)  A
container object may need a list of processes and/or namespaces.

A namespace cannot directly migrate from one container to another but
could be assigned to a newly spawned container.  A namespace can be
moved from one container to another indirectly by having that namespace
used in a second process in another container and then ending all the
processes in the first container.
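
The indirect migration described above, together with the end-of-life
logging, can be sketched as a small Python model.  Everything here (the
Model class, the event names, and treating a container as gone as soon
as its last process exits) is a simplifying assumption made for the
example, not part of the proposal.

```python
class Model:
    def __init__(self):
        self.proc = {}   # pid -> (container ID, set of namespace IDs)
        self.log = []    # events that would become audit records

    def spawn(self, pid, contid, namespaces):
        self.proc[pid] = (contid, set(namespaces))

    def containers_using(self, ns):
        """Container IDs of all live processes that hold namespace ns."""
        return {c for c, nss in self.proc.values() if ns in nss}

    def exit(self, pid):
        contid, nss = self.proc.pop(pid)
        # Namespaces still held by some remaining process stay alive.
        live_ns = set().union(*(n for _, n in self.proc.values()))
        for ns in sorted(nss - live_ns):
            self.log.append(("ns_destroy", ns))
        # Simplification: the container is gone once no process carries its ID.
        if all(c != contid for c, _ in self.proc.values()):
            self.log.append(("container_gone", contid))

NET = (4, 4026532000)            # a (device, inode) namespace ID
m = Model()
m.spawn(1, "A", [NET])           # namespace starts out in container A
m.spawn(2, "B", [NET])           # a process in container B also uses it
m.exit(1)                        # container A's last process ends
```

After the last step the namespace is held only by container B, which is
the indirect move described above; it is destroyed only when pid 2 exits.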

(v2)
- switch from u64 to u128 UUID
- switch from "signal" and "trigger" to "register"
- restrict registration to single process or force all threads and children into same container

- RGB

--
Richard Guy Briggs <rgb@redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

