On 1/16/2017 12:46 PM, James Bottomley wrote:
>
> For identity, doesn't the UTS namespace do this?  If not, what is
> missing?
> 
> James

James,

Thanks for posing the question.

Unless I'm missing something, the UTS namespace permits an alternate
'hostname' and NIS 'domainname' to be specified for local visibility to
the processes running in the container.

For an /afs network file system client (kafs, OpenAFS or AuriStorFS) the
kernel module must be able to associate each process with an
authentication context.  The AFS family of file systems has implemented
this binding as part of its Process Authentication Group (PAG) concept.
A PAG is a set of processes that share an authentication context.  The
authentication context includes:

 * network credentials necessary to establish new server connections
   to requisite network-based services.  These include not only the
   backing store for files and directories but any distributed database
   services managing location independence, replication, failover, etc.

 * established server connections to individual servers.  These
   connections are re-used for all requests from a process that shares
   the authentication context.

The network credentials might be a Kerberos ticket, a public key, the
result of a GSS-API exchange, or something else.  It depends on the
requirements of the security class.
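
To make the two parts of an authentication context concrete, here is a
minimal Python sketch.  It is only an illustration: the names
AuthContext, Connection and connection_for are hypothetical and do not
correspond to any structure in kafs, OpenAFS or AuriStorFS, and the
credential is treated as an opaque blob.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Stand-in for an established connection to one server."""
    server: str
    credential: bytes


@dataclass
class AuthContext:
    """Toy authentication context: credentials plus cached connections."""
    credential: bytes  # opaque: Kerberos ticket, public key, GSS-API result, ...
    connections: dict = field(default_factory=dict)

    def connection_for(self, server: str) -> Connection:
        # Re-use an established connection when one exists; otherwise
        # establish a new one with the stored network credential.
        conn = self.connections.get(server)
        if conn is None:
            conn = Connection(server, self.credential)
            self.connections[server] = conn
        return conn


ctx = AuthContext(credential=b"opaque-ticket")
first = ctx.connection_for("fileserver1.example.org")
second = ctx.connection_for("fileserver1.example.org")
```

The second lookup returns the connection created by the first one, which
is the re-use behavior described in the second bullet above.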

The security properties of a PAG are:

 * a new PAG may be created by any process.  When a new PAG is
   created its membership is only the process that created it.

 * a process may remove itself from a PAG that it is a member of.

 * when a child process is created, it inherits a single PAG
   membership from the parent process.

 * it should not be possible to join a process to a PAG after
   process creation.  Due to implementation limitations on some
   platforms, however, you will find references to a child process
   being able to set the PAG of its parent process.
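
The four properties above can be modelled in a few lines of Python.
This is a toy model, not kernel code: the Pag and Process classes and
their method names are made up for illustration, and it assumes the
traditional one-PAG-per-process model, so creating a new PAG replaces
the current membership.

```python
import itertools

_pag_ids = itertools.count(1)


class Pag:
    """Toy PAG: a set of member processes sharing one authentication context."""

    def __init__(self):
        self.id = next(_pag_ids)
        self.members = set()


class Process:
    """Toy process with at most one PAG membership."""

    def __init__(self, name, pag=None):
        self.name = name
        self.pag = pag
        if pag is not None:
            pag.members.add(self)

    def new_pag(self):
        # Any process may create a new PAG; it is the sole initial member.
        self.leave_pag()
        self.pag = Pag()
        self.pag.members.add(self)
        return self.pag

    def leave_pag(self):
        # A process may remove itself from the PAG it is a member of.
        if self.pag is not None:
            self.pag.members.discard(self)
            self.pag = None

    def fork(self, name):
        # A child inherits the parent's single PAG membership.
        return Process(name, self.pag)

    # Deliberately no join(): membership cannot be assigned from the
    # outside after process creation.


shell = Process("shell")
pag = shell.new_pag()
child = shell.fork("child")
inherited = child.pag
child.leave_pag()
```

After the last line the PAG again contains only the shell, and the child
has no PAG at all.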

In the traditional PAG implementations used by AFS Unix clients, there
has been a restriction of one PAG membership per process.  The Windows
client implements an extended model which is better suited to
multi-threaded processes:

 * a process can be a member of more than one PAG at a time

 * a process can select one of its PAGs as the default PAG

 * a thread can select one of the process' PAGs as its active
   PAG and if there is no active PAG, the process default PAG
   is used

This extended Authentication Group model works well for processes such
as web servers that need to execute requests in the authentication
context of the delegated identity and be able to rapidly switch contexts
for each request.
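
A minimal Python sketch of the extended model follows.  The class and
method names are hypothetical (this is not the Windows client's API),
and PAGs are represented by plain strings.  The per-thread active PAG is
kept in thread-local storage and the process default is the fallback.

```python
import threading


class MultiPagProcess:
    """Toy process that may belong to several PAGs at once."""

    def __init__(self):
        self.pags = set()
        self.default_pag = None
        self._active = threading.local()  # per-thread active PAG

    def add_pag(self, pag):
        self.pags.add(pag)

    def set_default(self, pag):
        if pag not in self.pags:
            raise ValueError("not a member of this PAG")
        self.default_pag = pag

    def set_active(self, pag):
        # Passing None clears the calling thread's active PAG.
        if pag is not None and pag not in self.pags:
            raise ValueError("not a member of this PAG")
        self._active.pag = pag

    def effective_pag(self):
        # The thread's active PAG wins; otherwise use the process default.
        active = getattr(self._active, "pag", None)
        return active if active is not None else self.default_pag


server = MultiPagProcess()
server.add_pag("pag-service")
server.add_pag("pag-alice")
server.set_default("pag-service")

seen = {}


def handle_request():
    # A worker thread switches to the delegated identity for one request.
    server.set_active("pag-alice")
    seen["worker"] = server.effective_pag()


worker = threading.Thread(target=handle_request)
worker.start()
worker.join()
seen["main"] = server.effective_pag()
```

The worker thread runs as "pag-alice" while the main thread, which never
selected an active PAG, still resolves to the process default.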

It is important to note that the network credentials stored in an
authentication context do not necessarily have any relationship to the
local machine.  It is also important to remember that network
credentials often have a relatively short lifetime and must be renewed
or replaced on a regular basis.

For containers I envision PAGs being used in the following manner:

 * A process running in the context of the host OS or one that has
   access to keys stored in a TPM or other secure keystore
   creates a new PAG for each container it is going to launch.

 * This process will then obtain the initial network credentials
   required by the container processes and store them into the PAG.

 * The initial Container process will then be created as a child
   process and inherits the PAG membership.  Each subsequent child
   process in the Container will in turn inherit the same PAG.

 * Periodically the host OS process will renew the network credentials
   for the PAG.  This avoids the need for the processes in the container
   to have any access to or knowledge of the network identity under
   which they are executing.

 * A process in the container could decide to resign from the
   inherited PAG and create its own PAG using credentials available
   to that process.  A web server running in a container is one
   example.

The end result is a PAG which spans both the host OS and the Container
processes.  The Container processes might not even know what credentials
they are running with.
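
The container workflow above can be walked through with a small Python
model.  Everything here is illustrative: Pag, Process, spawn and
resign_and_create_pag are hypothetical names, and the credential strings
are placeholders rather than real tickets.

```python
class Pag:
    """Toy PAG holding one opaque network credential."""

    def __init__(self, credential=None):
        self.credential = credential


class Process:
    """Toy process with a single PAG membership."""

    def __init__(self, name, pag=None):
        self.name = name
        self.pag = pag

    def spawn(self, name):
        # Child processes inherit the parent's PAG membership.
        return Process(name, self.pag)

    def resign_and_create_pag(self, credential):
        # Leave the inherited PAG and start a private one.
        self.pag = Pag(credential)


# Host-side launcher creates a PAG for the container and stores the
# initial credentials in it.
launcher = Process("launcher")
launcher.pag = Pag(credential="ticket-v1")

# The initial container process and its descendants inherit the PAG.
init = launcher.spawn("container-init")
worker = init.spawn("worker")

# Periodic renewal by the host process is visible to every member,
# without the container processes handling the credential themselves.
launcher.pag.credential = "ticket-v2"

# A web server in the container opts out and uses its own credentials.
web = init.spawn("web-server")
web.resign_and_create_pag("web-own-ticket")
```

In this model the worker sees the renewed credential through the shared
PAG, while the web server's private PAG is unaffected by later renewals.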

Keyrings were created as a storage facility for the network credentials,
https://www.infradead.org/~dhowells/kafs/#keyrings, but keyrings are not
an authentication context.

While a file system can internally create an association between an
authentication context and a file descriptor once it is created, and
with pages for write-back, I believe there would be benefit from a more
generic method of tracking authentication contexts in file descriptors
and pages.  In particular, it would give better-defined behavior when a
file has been opened for "write" from processes associated with more
than one authentication context.
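
As one possible reading of that idea, and nothing more than a sketch,
the Python model below records the authentication context on each open
file and tags every dirty page with the context of the descriptor that
wrote it.  OpenFile, PageCache and writeback_plan are hypothetical
names, and grouping write-back by context is an assumption of this
example, not a description of any existing kernel interface.

```python
from collections import defaultdict


class OpenFile:
    """Toy file description that remembers the context it was opened with."""

    def __init__(self, path, auth_context):
        self.path = path
        self.auth_context = auth_context


class PageCache:
    """Toy page cache that tracks which context dirtied each page."""

    def __init__(self):
        self.dirty = {}  # (path, page index) -> (context, data)

    def write(self, open_file, index, data):
        # Tag the dirty page with the context of the writing descriptor.
        self.dirty[(open_file.path, index)] = (open_file.auth_context, data)

    def writeback_plan(self):
        # Group dirty pages by context so that each batch can be sent
        # with the credentials of the process that wrote it.
        plan = defaultdict(list)
        for (path, index), (context, _data) in sorted(self.dirty.items()):
            plan[context].append((path, index))
        return dict(plan)


cache = PageCache()
path = "/afs/example.org/shared.dat"
a = OpenFile(path, "pag-A")
b = OpenFile(path, "pag-B")
cache.write(a, 0, b"from A")
cache.write(b, 1, b"from B")
plan = cache.writeback_plan()
```

Here the same file is open for write under two contexts, and the plan
keeps the pages written under each context apart.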

PAG creation and PAG token set manipulation in the AFS family of file
systems traditionally took place via the use of path-based ioctls.
Providing equivalent functionality to user-land is an open issue that
David Howells has submitted as a topic for LSF/MM.  See afs(setpag),
VIOC_GETPAG, VIOCUNPAG, VIC*TOK* and VIOCUNLOG:

  https://www.infradead.org/~dhowells/kafs/user_interface.html

While the PAG model has worked well for many decades, it does
periodically run into problems with system designs that assume that
local system identities have the same meaning to network resources.
One example is the set of problems that AFS is currently experiencing
with systemd.  A good description of the problem by Jonathan Billings
can be found at

https://docs.google.com/document/d/1P27fP1uj-C8QdxDKMKtI-Qh00c5_9zJa4YHjnpB6ODM/pub

I hope this letter is helpful in describing the issues that the AFS
community has experienced and how we believe that authentication context
management can be used to enhance the usability of containers.