On 1/16/2017 12:46 PM, James Bottomley wrote:
>
> For identity, doesn't the UTS namespace do this? If not, what is
> missing?
>
> James

James,

Thanks for posing the question.

Unless I'm missing something, the UTS namespace permits an alternate
'hostname' and NIS 'domainname' to be specified for local visibility to
the processes running in the container.

For an /afs network file system client (kafs, OpenAFS or AuriStorFS)
the kernel module must be able to associate each process with an
authentication context. The AFS family of file systems has implemented
this binding as part of its Process Authentication Group (PAG) concept.

A PAG is a set of processes that share an authentication context. The
authentication context includes:

 * network credentials necessary to establish new server connections
   to requisite network-based services. These include not only the
   backing store for files and directories but any distributed
   database services managing location independence, replication,
   failover, etc.

 * established server connections to individual servers. These
   connections are re-used for all requests from a process that shares
   the authentication context.

The network credentials might be a Kerberos ticket, a public key, the
result of a GSS-API exchange, or something else. It depends on the
requirements of the security class.

The security properties of a PAG are:

 * a new PAG may be created by any process. When a new PAG is created
   its membership is only the process that created it.

 * a process may remove itself from a PAG that it is a member of.

 * when a child process is created, it inherits a single PAG
   membership from the parent process.

 * it should not be possible to join a process to a PAG after process
   creation, although due to implementation limitations on some
   platforms you will find references to a child process being able
   to set the PAG of its parent process.

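The security properties above can be sketched as a small model. This
is only an illustration: the Pag and Process classes and their method
names are hypothetical and do not correspond to any kernel, kafs or
OpenAFS interface. Note that the model deliberately offers no way to
join an existing PAG other than inheriting it at fork time.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_pag_ids = itertools.count(1)


@dataclass(eq=False)
class Pag:
    """Hypothetical authentication context shared by its member processes."""
    pag_id: int = field(default_factory=lambda: next(_pag_ids))
    credentials: dict = field(default_factory=dict)  # e.g. service -> ticket
    connections: dict = field(default_factory=dict)  # e.g. server -> connection


class Process:
    """Hypothetical process holding at most one PAG membership."""

    def __init__(self, pag: Optional[Pag] = None):
        self.pag = pag

    def new_pag(self) -> Pag:
        # Any process may create a PAG; the creator is its only member.
        self.pag = Pag()
        return self.pag

    def leave_pag(self) -> None:
        # A process may remove itself from its PAG.
        self.pag = None

    def fork(self) -> "Process":
        # A child inherits a single PAG membership from its parent.
        return Process(self.pag)


parent = Process()
shared = parent.new_pag()
child = parent.fork()       # child shares the parent's PAG
grandchild = child.fork()   # and so does every later descendant
child.new_pag()             # child moves to a fresh PAG of its own
```

After the last call, the parent and the grandchild still share the
original PAG while the child is the sole member of a new one.
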
In the traditional PAG implementations used by AFS Unix clients, there
has been a restriction of one PAG membership per process. The Windows
client implements an extended model which is better suited to
multi-threaded processes:

 * a process can be a member of more than one PAG at a time

 * a process can select one of its PAGs as the default PAG

 * a thread can select one of the process's PAGs as its active PAG
   and if there is no active PAG, the process default PAG is used

This extended Authentication Group model works well for processes such
as web servers that need to execute requests in the authentication
context of the delegated identity and be able to rapidly switch
contexts for each request.

It is important to note that the network credentials stored in an
authentication context do not necessarily have any relationship to the
local machine. It is also important to remember that network
credentials often have a relatively short lifetime and must be renewed
or replaced on a regular basis.

For containers I envision PAGs being used in the following manner:

 * A process running in the context of the host OS or one that has
   access to keys stored in a TPM or other secure keystore creates a
   new PAG for each container it is going to launch.

 * This process will then obtain the initial network credentials
   required by the container processes and store them into the PAG.

 * The initial Container process will then be created as a child
   process and inherits the PAG membership. Each subsequent child
   process in the Container will in turn inherit the same PAG.

 * Periodically the host OS process will renew the network
   credentials for the PAG. This avoids the need for the processes in
   the container to have any access to or knowledge of the network
   identity under which they are executing.

 * A process in the container could decide to resign from the
   inherited PAG and create its own PAG using credentials available
   to that process. For example, a web server running in a container.

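The container workflow above can be sketched as follows. Everything
here is a hypothetical model rather than an existing interface: Pag,
Process, launch_container and renew are illustrative names, the
credential is an opaque string, and time is passed in explicitly so
the example is deterministic.

```python
from typing import Callable, Optional, Tuple


class Pag:
    """Hypothetical authentication context holding one expiring credential."""

    def __init__(self) -> None:
        self.credential: Optional[str] = None
        self.expires_at = 0.0

    def store(self, credential: str, lifetime: float, now: float) -> None:
        self.credential = credential
        self.expires_at = now + lifetime

    def valid(self, now: float) -> bool:
        return self.credential is not None and now < self.expires_at


class Process:
    """Hypothetical process with a single inherited PAG membership."""

    def __init__(self, name: str, pag: Optional[Pag] = None) -> None:
        self.name = name
        self.pag = pag

    def spawn(self, name: str) -> "Process":
        return Process(name, self.pag)  # children inherit the PAG

    def new_pag(self) -> Pag:
        self.pag = Pag()                # resign and start a fresh PAG
        return self.pag


def launch_container(fetch: Callable[[], str], lifetime: float,
                     now: float) -> Tuple[Process, Process]:
    """Host side: new PAG, initial credentials, then the container's init."""
    supervisor = Process("host-supervisor")
    supervisor.new_pag().store(fetch(), lifetime, now)
    return supervisor, supervisor.spawn("container-init")


def renew(supervisor: Process, fetch: Callable[[], str], lifetime: float,
          now: float) -> None:
    """Host side: refresh credentials; container processes are not involved."""
    supervisor.pag.store(fetch(), lifetime, now)


supervisor, init = launch_container(lambda: "ticket-1", 10.0, now=0.0)
worker = init.spawn("worker")           # same PAG as the host supervisor
renew(supervisor, lambda: "ticket-2", 10.0, now=9.0)
web = init.spawn("web-server")
web.new_pag()                           # resigns; must obtain its own credentials
```

The point of the design is that renewal happens entirely on the host
side, while the worker sees the refreshed credential only through the
PAG it inherited.
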
The end result is a PAG which spans both the host OS and the Container
processes. The Container processes might not even know what
credentials they are running with.

Keyrings were created as a storage facility for the network
credentials,

  https://www.infradead.org/~dhowells/kafs/#keyrings

but keyrings are not an authentication context.

While a file system can internally create an association between an
authentication context and a file descriptor once it is created, and
with pages for write-back, I believe there would be benefit from a
more generic method of tracking authentication contexts in file
descriptors and pages. In particular, it would give better-defined
behavior when a file has been opened for "write" from processes
associated with more than one authentication context.

PAG creation and PAG token set manipulation in the AFS family of file
systems traditionally took place via the use of path-based ioctls.
Providing equivalent functionality to user-land is an open question
that David Howells has submitted as a topic for LSF/MM. See
afs(setpag), VIOC_GETPAG, VIOCUNPAG, VIC*TOK* and VIOCUNLOG:

  https://www.infradead.org/~dhowells/kafs/user_interface.html

While the PAG model has worked well for many decades it does
periodically run into problems with system designs that assume that
local system identities have the same meaning to network resources.
For example, the problems that AFS is currently experiencing with
systemd. A good description of the problem by Jonathan Billings can be
found at

  https://docs.google.com/document/d/1P27fP1uj-C8QdxDKMKtI-Qh00c5_9zJa4YHjnpB6ODM/pub

I hope this letter is helpful in describing the issues that the AFS
community has experienced and how we believe that authentication
context management can be used to enhance the usability of containers.