From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1756220AbdJLQd7 (ORCPT ); Thu, 12 Oct 2017 12:33:59 -0400
Received: from sonic306-27.consmr.mail.ne1.yahoo.com ([66.163.189.89]:40950
 "EHLO sonic306-27.consmr.mail.ne1.yahoo.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1752759AbdJLQdz (ORCPT );
 Thu, 12 Oct 2017 12:33:55 -0400
X-YMail-OSG: FCAtfbsVM1kvnrgSHBpIVOkd_9MOMx2U7e0ac4_A2bZq896LM7gVzoJ7GSW2vLp
 MgTv5WCrHppWX.XlPaDX3ehocFn3W0Cw999T.qGkc2Taq3xOvhIVMS.BInNgGyJmxOXYcNRi3k90
 z93OszYcB.yp9se3g4NvT9wifEHxn.7fn8FH8FvlSU4u4mhFCsF3a6vXOgXyo5bGbELN3aNwnc3g
 Pc4nD9rnL6gw3_H6y5frTEXMiTPQFcRNb.rH5Wq8yipEj4UqaS6ARaJAEO1MH.ygPXzNo73Z7jZT
 JmGwRRY8L7u8ulP3g21oB_mY1yE60Zt_lcadN1jQxCdRFwtin079MEzpUpxzW35BsB8Ds0hG0fAG
 SC6DPQ5l0MCuqdW9sfFMsJ14mXGwrdEhJ3xxfaiwIAY8t9Z5q3gLR_qCDEqoZ7kToYYWf5gnpVaI
 AvLqGQDrCv0kmcTrMTSBRoQI5TsTxOQh2SKV3.ANPJI.UQXsckRUSKaa3RrvGFjnaYLIE2XAkek3
 JMCCqgf0jHH8TZsL.IxgajR2P
X-Yahoo-Newman-Id: 418388.74890.bm@smtp205.mail.ne1.yahoo.com
X-Yahoo-Newman-Property: ymail-3
X-Yahoo-SMTP: OIJXglSswBDfgLtXluJ6wiAYv6_cnw--
Subject: Re: RFC(v2): Audit Kernel Container IDs
To: Richard Guy Briggs, cgroups@vger.kernel.org, Linux Containers,
 Linux API, Linux Audit, Linux FS Devel, Linux Kernel,
 Linux Network Development
Cc: mszeredi@redhat.com, Andy Lutomirski, jlayton@redhat.com,
 "Carlos O'Donell", Al Viro, David Howells, Simo Sorce,
 trondmy@primarydata.com, Eric Paris, "Serge E. Hallyn",
 "Eric W. Biederman"
References: <20171012141359.saqdtnodwmbz33b2@madcap2.tricolour.ca>
From: Casey Schaufler
Message-ID: <75b7d6a6-42ba-2dff-1836-1091c7c024e7@schaufler-ca.com>
Date: Thu, 12 Oct 2017 09:33:49 -0700
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To: <20171012141359.saqdtnodwmbz33b2@madcap2.tricolour.ca>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/12/2017 7:14 AM, Richard Guy Briggs wrote:
> Containers are a userspace concept. The kernel knows nothing of them.
>
> The Linux audit system needs a way to be able to track the container
> provenance of events and actions. Audit needs the kernel's help to do
> this.
>
> Since the concept of a container is entirely a userspace concept, a
> registration from the userspace container orchestration system initiates
> this. This will define a point in time and a set of resources
> associated with a particular container with an audit container ID.
>
> The registration is a pseudo filesystem (proc, since PID tree already
> exists) write of a u8[16] UUID representing the container ID to a file
> representing a process that will become the first process in a new
> container. This write might place restrictions on mount namespaces
> required to define a container, or at least careful checking of
> namespaces in the kernel to verify permissions of the orchestrator so it
> can't change its own container ID. A bind mount of nsfs may be
> necessary in the container orchestrator's mntNS.
> Note: Use a 128-bit scalar rather than a string to make compares faster
> and simpler.
>
> Require a new CAP_CONTAINER_ADMIN to be able to carry out the
> registration.

Hang on. If containers are a user space concept, how can
you want CAP_CONTAINER_ANYTHING? If there's no such thing as
a container, how can you be asking for a capability to manage
them?

> At that time, record the target container's user-supplied
> container identifier along with the target container's first process
> (which may become the target container's "init" process) process ID
> (referenced from the initial PID namespace), all namespace IDs (in the
> form of a nsfs device number and inode number tuple) in a new auxiliary
> record AUDIT_CONTAINER with a qualifying op=$action field.
>
> Issue a new auxiliary record AUDIT_CONTAINER_INFO for each valid
> container ID present on an auditable action or event.
>
> Forked and cloned processes inherit their parent's container ID,
> referenced in the process' task_struct.
>
> Mimic setns(2) and return an error if the process has already initiated
> threading or forked since this registration should happen before the
> process execution is started by the orchestrator and hence should not
> yet have any threads or children. If this is deemed overly restrictive,
> switch all threads and children to the new containerID.
>
> Trust the orchestrator to judiciously use and restrict CAP_CONTAINER_ADMIN.
>
> Log the creation of every namespace, inheriting/adding its spawning
> process' containerID(s), if applicable. Include the spawning and
> spawned namespace IDs (device and inode number tuples).
> [AUDIT_NS_CREATE, AUDIT_NS_DESTROY] [clone(2), unshare(2), setns(2)]
> Note: At this point it appears only network namespaces may need to track
> container IDs apart from processes since incoming packets may cause an
> auditable event before being associated with a process.
>
> Log the destruction of every namespace when it is no longer used by any
> process, include the namespace IDs (device and inode number tuples).
> [AUDIT_NS_DESTROY] [process exit, unshare(2), setns(2)]
>
> Issue a new auxiliary record AUDIT_NS_CHANGE listing (opt: op=$action)
> the parent and child namespace IDs for any changes to a process'
> namespaces. [setns(2)]
> Note: It may be possible to combine AUDIT_NS_* record formats and
> distinguish them with an op=$action field depending on the fields
> required for each message type.
>
> When a container ceases to exist because the last process in that
> container has exited and hence the last namespace has been destroyed and
> its refcount dropping to zero, log the fact.
> (This latter is likely needed for certification accountability.) A
> container object may need a list of processes and/or namespaces.
>
> A namespace cannot directly migrate from one container to another but
> could be assigned to a newly spawned container. A namespace can be
> moved from one container to another indirectly by having that namespace
> used in a second process in another container and then ending all the
> processes in the first container.
>
> (v2)
> - switch from u64 to u128 UUID
> - switch from "signal" and "trigger" to "register"
> - restrict registration to single process or force all threads and children into same container
>
> - RGB
>
> --
> Richard Guy Briggs
> Sr. S/W Engineer, Kernel Security, Base Operating Systems
> Remote, Ottawa, Red Hat Canada
> IRC: rgb, SunRaycer
> Voice: +1.647.777.2635, Internal: (81) 32635
>
> --
> Linux-audit mailing list
> Linux-audit@redhat.com
> https://www.redhat.com/mailman/listinfo/linux-audit
>
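For readers following the quoted RFC, here is a minimal userspace sketch of two ideas it mentions: comparing a u8[16] container ID as a fixed-size value rather than as a string, and identifying a namespace by its (device, inode) tuple. The type and function names (struct contid, struct ns_id, contid_equal, ns_id_equal) are hypothetical and chosen only for illustration; nothing here is taken from an actual kernel patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type: the RFC only specifies "u8[16] UUID". */
struct contid {
    uint8_t b[16];
};

/* Fixed-size compare: always 16 bytes, no terminator or length scan. */
static bool contid_equal(const struct contid *a, const struct contid *b)
{
    return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/* Hypothetical type: a namespace named by its nsfs (device, inode) tuple. */
struct ns_id {
    uint64_t dev;
    uint64_t ino;
};

/* Two namespace IDs match only if both halves of the tuple match. */
static bool ns_id_equal(const struct ns_id *a, const struct ns_id *b)
{
    return a->dev == b->dev && a->ino == b->ino;
}
```

The fixed-width comparison is the point of the RFC's note: the cost does not depend on the content of the ID, and there is no string parsing or terminator handling to get wrong. In userspace, the (device, inode) pair for a namespace could be obtained by calling stat(2) on an entry under /proc/<pid>/ns/, although the sketch above does not do that.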