tpmdd-devel.lists.sourceforge.net archive mirror
* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found] <James.Bottomley@HansenPartnership.com>
@ 2017-02-10 10:03 ` Dr. Greg Wettstein
       [not found]   ` <201702101003.v1AA3plF029882-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Dr. Greg Wettstein @ 2017-02-10 10:03 UTC (permalink / raw)
  To: James Bottomley, greg-R92VP3DqSWVWk0Htik3J/w, Jarkko Sakkinen
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Feb 9, 11:24am, James Bottomley wrote:
} Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global sessi

Good morning to everyone.

> On Thu, 2017-02-09 at 03:06 -0600, Dr. Greg Wettstein wrote:
> > Referring back to Ken's comments about having 20+ clients waiting to
> > get access to the hardware.  Even with the focus in TPM2 on having it
> > be more of a cryptographic accelerator are we convinced that the
> > hardware is ever going to be fast enough for a model of having it
> > directly service large numbers of transactions in something like a
> > 'cloud' model?

> It's already in use as such today:
> 
> https://tectonic.com/assets/pdf/TectonicTrustedComputing.pdf

We are familiar with this work.  I'm not sure, however, that this work
is representative of the notion of using TPM hardware to support a
transactional environment, particularly at the cloud/container level.

There is not a great deal of technical detail on the CoreOS integrity
architecture but it appears they are using TPM hardware to validate
container integrity.  I'm not sure this type of environment reflects
the ability of TPM hardware to support transactional throughputs in an
environment such as financial transaction processing.

Intel's Clear Container work cites the need to achieve container
startup times of 150 milliseconds and they are currently claiming 45
milliseconds as their optimal time.  This work was designed to
demonstrate the feasibility of providing virtual machine isolation
guarantees to containers and as such one of the mandates was to
achieve container start times comparable to standard namespaces.

I ran some very rough timing metrics on one of our Skylake development
systems with hardware TPM2 support.  Here are the elapsed times for
two common verification operations which I assume would be at the
heart of generating any type of reasonable integrity guarantee:

quote: 810 milliseconds
verify signature: 635 milliseconds

This is with the verifying key loaded into the chip.  The elapsed time
to load and validate a key into the chip averages 1200 milliseconds.
Since we are discussing a resource manager which would be shuttling
context into and out of the limited resource slots on the chip I
believe it is valid to consider this overhead as well.

This suggests that just a signature verification on the integrity of a
container takes 4.2 times as long as a well accepted start time metric
for container technology.
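
As a rough check of that factor, the arithmetic can be sketched as
below.  It uses only the figures quoted in this message (635 ms verify,
1200 ms key load, 150 ms start-time target); the constant and function
names are made up for illustration.

```python
# Figures quoted in this message, in milliseconds.
VERIFY_MS = 635           # in-TPM signature verification
KEY_LOAD_MS = 1200        # load and validate a key into the chip
CONTAINER_START_MS = 150  # cited container start-time target


def ratio(op_ms, budget_ms=CONTAINER_START_MS):
    """Return how many start-time budgets one operation consumes."""
    return op_ms / budget_ms


print(f"verify only:   {ratio(VERIFY_MS):.1f}x")
print(f"load + verify: {ratio(KEY_LOAD_MS + VERIFY_MS):.1f}x")
```

If the key also has to be loaded first, as it might be when a resource
manager has swapped it out, the same calculation gives a noticeably
larger multiple.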

Based on that I'm assuming that if TPM based integrity guarantees are
being implemented they are only on ingress of the container into the
cloud environment.  I'm assuming an alternate methodology must be in
place to protect against time of measurement/time of use issues.

Maybe people have better TPM2 hardware than what we have.  I was going
to run this on a Kaby Lake reference system but it appears that TXT is
causing some type of context depletion problems which we need to
run down.

> We're also planning something like this in the IBM Cloud.

I assume if there is an expectation of true transactional times you
will either have better hardware than current generation TPM2
technology or you will be using userspace simulators anchored with a
hardware TPM trust root.

Ken's reflection of having 21-22 competing transactions would appear
to have problematic latency issues given our measurements.

I influence engineering for a company which builds deterministically
modeled Linux platforms.  We've spent a lot of time considering TPM2
hardware bottlenecks since they constrain the rate at which we can
validate platform behavioral measurements.

We have a variation of this work which allows SGX OCALL's to validate
platform behavior in order to provide a broader TCB resource spectrum
to the enclave and hardware TPM performance is problematic there as
well.

> James

Have a good weekend.

Greg

}-- End of excerpt from James Bottomley

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg-R92VP3DqSWVWk0Htik3J/w@public.gmane.org
------------------------------------------------------------------------------
"After being a technician for 2 years, I've discovered if people took
 care of their health with the same reckless abandon as their computers,
 half would be at the kitchen table on the phone with the hospital, trying
 to remove their appendix with a butter knife."
                                -- Brian Jones

-- 

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot


* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]   ` <201702101003.v1AA3plF029882-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
@ 2017-02-10 16:46     ` James Bottomley
       [not found]       ` <1486745163.2502.26.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-02-12 20:29       ` [tpmdd-devel] " Ken Goldman
  0 siblings, 2 replies; 38+ messages in thread
From: James Bottomley @ 2017-02-10 16:46 UTC (permalink / raw)
  To: greg-R92VP3DqSWVWk0Htik3J/w, Jarkko Sakkinen
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Fri, 2017-02-10 at 04:03 -0600, Dr. Greg Wettstein wrote:
> On Feb 9, 11:24am, James Bottomley wrote:
> } Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for
> global sessi
> 
> Good morning to everyone.

Is there any way you could fix your email client?  It's setting
In-Reply-To: headers like this:

In-reply-to: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org> "Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion" (Feb  9, 11:24am)

Not using the message id breaks threading for everyone.
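
For illustration, a reply that threads correctly carries the parent's
Message-ID in In-Reply-To and appends it to References.  The sketch
below uses Python's standard email package; the make_reply helper and
the example.invalid addresses are hypothetical and not taken from any
client discussed here.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def make_reply(original: EmailMessage, body: str) -> EmailMessage:
    """Build a reply whose threading headers carry the parent's Message-ID."""
    parent_id = str(original["Message-ID"])
    subject = str(original["Subject"] or "")
    reply = EmailMessage()
    reply["Subject"] = subject if subject.lower().startswith("re:") else "Re: " + subject
    reply["Message-ID"] = make_msgid(domain="example.invalid")
    reply["In-Reply-To"] = parent_id
    # References = the parent's own References chain plus the parent's ID.
    refs = str(original["References"] or "").split()
    reply["References"] = " ".join(refs + [parent_id])
    reply.set_content(body)
    return reply


parent = EmailMessage()
parent["Subject"] = "[RFC] tpm2-space: add handling for global session exhaustion"
parent["Message-ID"] = "<parent-id@example.invalid>"
parent.set_content("original text")

reply = make_reply(parent, "reply text")
print(reply["In-Reply-To"])
```

Archive software and mail clients match on the bracketed ID, which is
why a free-form name, subject and date in that header cannot be
threaded.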

> > On Thu, 2017-02-09 at 03:06 -0600, Dr. Greg Wettstein wrote:
> > > Referring back to Ken's comments about having 20+ clients waiting
> > > to
> > > get access to the hardware.  Even with the focus in TPM2 on
> > > having it
> > > be more of a cryptographic accelerator are we convinced that the
> > > hardware is ever going to be fast enough for a model of having it
> > > directly service large numbers of transactions in something like
> > > a
> > > 'cloud' model?
> 
> > It's already in use as such today:
> > 
> > https://tectonic.com/assets/pdf/TectonicTrustedComputing.pdf
> 
> We are familiar with this work.  I'm not sure, however, that this 
> work is representative of the notion of using TPM hardware to support 
> a transactional environment, particularly at the cloud/container
> level.

It allows for cloud clients to request attestations.  The next step is
to allow containers to provision key material and PCR locked blobs
securely to the TPM for use by correctly attested containers.  All of
those are cloud scale use cases.

> There is not a great deal of technical detail on the CoreOS integrity
> architecture but it appears they are using TPM hardware to validate
> container integrity.  I'm not sure this type of environment reflects
> the ability of TPM hardware to support transactional throughputs in 
> an environment such as financial transaction processing.

OK, so in the cloud neither key provisioning nor attestation has a huge
latency requirement.  This appears to be your concern?  All I'd say is
that the fact that there are use cases that can work at cloud scale
doesn't mean that every use case can.

> Intel's Clear Container work cites the need to achieve container
> startup times of 150 milliseconds and they are currently claiming 45
> milliseconds as their optimal time.  This work was designed to
> demonstrate the feasibility of providing virtual machine isolation
> guarantees to containers and as such one of the mandates was to
> achieve container start times comparable to standard namespaces.

There are ephemeral container use cases where the lifetimes are of this
order, but they're not every use case (In fact, even in the devops
environment, they're still a minority).

> I ran some very rough timing metrics on one of our Skylake
> development systems with hardware TPM2 support.  Here are the elapsed
> times for two common verification operations which I assume would be
> at the heart of generating any type of reasonable integrity
> guarantee:
> 
> quote: 810 milliseconds
> verify signature: 635 milliseconds

That's interesting, my Skylake system has these figures down around
100ms or so ... however, I agree that 100ms is the order of this,
which is still significant compared to container start times.

> This is with the verifying key loaded into the chip.  The elapsed
> time to load and validate a key into the chip averages 1200
> milliseconds. Since we are discussing a resource manager which would
> be shuttling context into and out of the limited resource slots on
> the chip I believe it is valid to consider this overhead as well.
> 
> This suggests that just a signature verification on the integrity of
> a container takes 4.2 times as long as a well accepted start time
> metric for container technology.

Part of the way of reducing the latency is not to use the TPM for
things that don't require secrecy: container signature verification is
one such because the container is signed with a private key to which
you know the public component ... you can verify it on the host without
needing to trouble the TPM.  We only use the TPM for state quotes,
unsealing and signature generation.
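
To illustrate why verification needs no secret, here is a toy sketch
of textbook RSA with deliberately tiny, insecure parameters.  The
numbers and function names are assumptions chosen for illustration
only and have nothing to do with any real container signing scheme;
the point is that verify() touches only the public values.

```python
import hashlib

# Toy textbook RSA parameters (insecure, illustration only).
# N = 61 * 53; E is the public exponent, D the private exponent.
N, E, D = 3233, 17, 2753


def digest_as_int(data: bytes) -> int:
    """Reduce a SHA-256 digest into the toy modulus range."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Done once by the publisher, who holds the private exponent D."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Done on the host with only the public values (N, E); no TPM involved."""
    return pow(signature, E, N) == digest_as_int(data)


image = b"container image bytes"
sig = sign(image)
print(verify(image, sig))
```

A real deployment would use a vetted crypto library and a proper
padding scheme; only the split between private signing and public
verification carries over.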

> Based on that I'm assuming that if TPM based integrity guarantees are
> being implemented they are only on ingress of the container into the
> cloud environment.  I'm assuming an alternate methodology must be in
> place to protect against time of measurement/time of use issues.
> 
> Maybe people have better TPM2 hardware than what we have.  I was 
> going to run this on a Kaby Lake reference system but it appears that 
> TXT is causing some type of context depletion problems which we 
> need to run down.
> 
> > We're also planning something like this in the IBM Cloud.
> 
> I assume if there is an expectation of true transactional times you
> will either have better hardware than current generation TPM2
> technology or you will be using userspace simulators anchored with a
> hardware TPM trust root.

vTPM is a possibility, yes, so is making the TPM faster.

> Ken's reflection of having 21-22 competing transactions would appear
> to have problematic latency issues given our measurements.

Consider the canonical use case to be VPNaaS with a secure connection
back to the enterprise and the client key being the privacy guarded
material.  The signature generation is once per channel re-key and you
have up to half the re-key interval to generate the re-key over the
control channel.  In this use case, latency isn't a problem (most
re-key intervals are around 3000s) but volume is.  VPNs are long running
not short running, so start up time isn't hugely relevant either.

Anyway, precisely what we're doing and how is getting off point.  The
point is that there are existing cloud use cases for the TPM which can
cause high concurrency.

James

> I influence engineering for a company which builds deterministically
> modeled Linux platforms.  We've spent a lot of time considering TPM2
> hardware bottlenecks since they constrain the rate at which we can
> validate platform behavioral measurements.
> 
> We have a variation of this work which allows SGX OCALL's to validate
> platform behavior in order to provide a broader TCB resource spectrum
> to the enclave and hardware TPM performance is problematic there as
> well.
> 
> > James
> 
> Have a good weekend.
> 
> Greg
> 
> }-- End of excerpt from James Bottomley
> 



* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]       ` <1486745163.2502.26.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-02-10 21:13         ` Kenneth Goldman
  2017-02-14 14:38           ` [tpmdd-devel] " Dr. Greg Wettstein
  2017-02-10 21:18         ` Kenneth Goldman
  1 sibling, 1 reply; 38+ messages in thread
From: Kenneth Goldman @ 2017-02-10 21:13 UTC (permalink / raw)
  To: James Bottomley
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	greg-R92VP3DqSWVWk0Htik3J/w, linux-kernel-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 645 bytes --]

James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org> wrote on 
02/10/2017 11:46:03 AM:

> > quote: 810 milliseconds
> > verify signature: 635 milliseconds
> 
> Part of the way of reducing the latency is not to use the TPM for
> things that don't require secrecy: 

Agreed.  There are a few times one would verify a signature inside the 
TPM,
but they're far from mainstream:

1 - Early in the boot cycle, when there's no crypto library.

2 - When the crypto library doesn't support the required algorithm.

3 - When a ticket is needed to prove to the TPM later that it verified
the signature.



[-- Attachment #3: Type: text/plain, Size: 192 bytes --]

_______________________________________________
tpmdd-devel mailing list
tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/tpmdd-devel


* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]       ` <1486745163.2502.26.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-02-10 21:13         ` Kenneth Goldman
@ 2017-02-10 21:18         ` Kenneth Goldman
  1 sibling, 0 replies; 38+ messages in thread
From: Kenneth Goldman @ 2017-02-10 21:18 UTC (permalink / raw)
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 487 bytes --]

> > quote: 810 milliseconds
> > verify signature: 635 milliseconds
> 
> Part of the way of reducing the latency is not to use the TPM for
> things that don't require secrecy: 

Agreed.  There are a few times one would verify a signature inside the 
TPM,
but they're far from mainstream:

1 - Early in the boot cycle, when there's no crypto library.

2 - When the crypto library doesn't support the required algorithm.

3 - When a ticket is needed to prove to the TPM la



* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
  2017-02-10 16:46     ` James Bottomley
       [not found]       ` <1486745163.2502.26.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-02-12 20:29       ` Ken Goldman
  1 sibling, 0 replies; 38+ messages in thread
From: Ken Goldman @ 2017-02-12 20:29 UTC (permalink / raw)
  Cc: tpmdd-devel, linux-security-module, linux-kernel

On 2/10/2017 11:46 AM, James Bottomley wrote:
> On Fri, 2017-02-10 at 04:03 -0600, Dr. Greg Wettstein wrote:
>> On Feb 9, 11:24am, James Bottomley wrote:

>> quote: 810 milliseconds
>> verify signature: 635 milliseconds
> ...
>
> Part of the way of reducing the latency is not to use the TPM for
> things that don't require secrecy: container signature verification is
> one such because the container is signed with a private key to which
> ...

Agreed.  There are a few times one would verify a signature inside the 
TPM, but they're far from mainstream:

1 - Early in the boot cycle, when there's no crypto library.

2 - When the crypto library doesn't support the required algorithm.

3 - When a ticket is needed to prove to the TPM later that it verified
the signature.




* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
  2017-02-10 21:13         ` Kenneth Goldman
@ 2017-02-14 14:38           ` Dr. Greg Wettstein
       [not found]             ` <20170214143829.GA28175-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
       [not found]             ` <71dc0e80-6678-a124-9184-1f93c8532d09@linux.vnet.ibm.com>
  0 siblings, 2 replies; 38+ messages in thread
From: Dr. Greg Wettstein @ 2017-02-14 14:38 UTC (permalink / raw)
  To: Kenneth Goldman
  Cc: James Bottomley, greg, Jarkko Sakkinen, linux-kernel,
	linux-security-module, tpmdd-devel

On Fri, Feb 10, 2017 at 04:13:05PM -0500, Kenneth Goldman wrote:

Good morning to everyone.

> James Bottomley <James.Bottomley@HansenPartnership.com> wrote on 
> 02/10/2017 11:46:03 AM:
> 
> > > quote: 810 milliseconds
> > > verify signature: 635 milliseconds

For those who may be interested in this sort of thing I grabbed a few
minutes and ran these basic verification primitives against a Kaby
Lake system.

Average time for a quote is 600 milliseconds with a signature
verification clocking in at 100 milliseconds.  The latter is
consistent with what James found on his Skylake machine.

Latencies are still significant with things like container start
times.

> > Part of the way of reducing the latency is not to use the TPM for
> > things that don't require secrecy: 

> Agreed.  There are a few times one would verify a signature inside the 
> TPM,
> but they're far from mainstream:
> 
> 1 - Early in the boot cycle, when there's no crypto library.
> 
> 2 - When the crypto library doesn't support the required algorithm.
> 
> 3 - When a ticket is needed to prove to the TPM later that it verified
> the signature.

I don't think there is any doubt that running cryptographic primitives
in userspace is going to be faster than going to hardware.  Obviously
that also means there is no need for a TPM resource manager which has
been the subject of much discussion here.

The CoreOS paper makes significant reference to increased security
guarantees inherent in the use of a TPM.  Obviously whatever those
uses are, they will have the noted latency constraints.

We have extended our behavior measurement verifications to the
container level so we offer an explicit guarantee that a container has
not operated in a manner which is inconsistent with the intent of its
designer.  Getting the security guarantee we need requires a linkage
to a hardware root of trust, hence our concerns about hardware
latency.

Have a good day.

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"UNIX is simple and coherent, but it takes a genious (or at any rate,
 a programmer) to understand and appreciate its simplicity."
                                -- Dennis Ritchie
                                   USENIX '87


* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]             ` <20170214143829.GA28175-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
@ 2017-02-14 16:47               ` James Bottomley
  0 siblings, 0 replies; 38+ messages in thread
From: James Bottomley @ 2017-02-14 16:47 UTC (permalink / raw)
  To: Dr. Greg Wettstein, Kenneth Goldman
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	greg-R92VP3DqSWVWk0Htik3J/w, linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, 2017-02-14 at 08:38 -0600, Dr. Greg Wettstein wrote:
> On Fri, Feb 10, 2017 at 04:13:05PM -0500, Kenneth Goldman wrote:
> 
> Good morning to everyone.
> 
> > James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org> wrote on 
> > 02/10/2017 11:46:03 AM:
> > 
> > > > quote: 810 milliseconds
> > > > verify signature: 635 milliseconds
> 
> For those who may be interested in this sort of thing I grabbed a few
> minutes and ran these basic verification primitives against a Kaby
> Lake system.
> 
> Average time for a quote is 600 milliseconds with a signature
> verification clocking in at 100 milliseconds.  The latter is
> consistent with what James found on his Skylake machine.
> 
> Latencies are still significant with things like container start
> times.
> 
> > > Part of the way of reducing the latency is not to use the TPM for
> > > things that don't require secrecy: 
> 
> > Agreed.  There are a few times one would verify a signature inside 
> > the TPM, but they're far from mainstream:
> > 
> > 1 - Early in the boot cycle, when there's no crypto library.
> > 
> > 2 - When the crypto library doesn't support the required algorithm.
> > 
> > 3 - When a ticket is needed to prove to the TPM later that it
> > verified
> > the signature.
> 
> I don't think there is any doubt that running cryptographic 
> primitives in userspace is going to be faster than going to hardware.
> Obviously that also means there is no need for a TPM resource 
> manager which has been the subject of much discussion here.

That's a bit of a non-sequitur.  Ken's and my point was that although
you could run every crypto operation through the TPM, you don't (as you
say, because it's too slow), so you carefully select the ones that
preserve the confidentiality you're looking for.  To take the VPNaaS
use case again: the key material you're protecting is the client
identity key, so the only crypto operation you run through the TPM is
creation of the TLS client certificate verification signature.
Everything else, including the server certificate signature
verification, the symmetric key agreement and all the symmetric
encryption operations, you keep in userspace.  That means that instead
of requiring thousands of crypto operations per second from the TPM,
you basically require about one per hour per VPNaaS instance.
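
As a back-of-the-envelope sketch of that rate, take the 3000 s re-key
interval mentioned earlier in the thread and assume, purely for
illustration, about 100 ms per TPM signature (the order of magnitude
discussed above, not a measured signing time).  The function names are
hypothetical.

```python
REKEY_INTERVAL_S = 3000  # "most re-key intervals are around 3000s"
SIGN_LATENCY_S = 0.1     # assumption: roughly 100 ms per TPM signature


def signatures_per_hour(instances: int) -> float:
    """TPM signatures needed per hour for a number of VPNaaS instances."""
    return instances * 3600 / REKEY_INTERVAL_S


def tpm_utilisation(instances: int) -> float:
    """Fraction of wall-clock time the TPM would spend signing."""
    return instances * SIGN_LATENCY_S / REKEY_INTERVAL_S


for n in (1, 2, 100):
    print(n, signatures_per_hour(n), f"{tpm_utilisation(n):.4%}")
```

Under these assumptions the TPM is almost entirely idle even with many
instances, so the pressure comes from how many keys and sessions are
resident at once, not from signing throughput.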

We need a RM because without one, given the constraints of TPM2, as few
as two VPNaaS instances can cause a resource exhaustion failure.

James

> The CoreOS paper makes significant reference to increased security
> guarantees inherent in the use of a TPM.  Obviously whatever those
> uses are, they will have the noted latency constraints.
> 
> We have extended our behavior measurement verifications to the
> container level so we offer an explicit guarantee that a container 
> has not operated in a manner which is inconsistent with the intent of 
> its designer.  Getting the security guarantee we need requires a 
> linkage to a hardware root of trust, hence our concerns about 
> hardware latency.
> 
> Have a good day.
> 



* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
       [not found]             ` <71dc0e80-6678-a124-9184-1f93c8532d09@linux.vnet.ibm.com>
@ 2017-02-16 20:06               ` Dr. Greg Wettstein
  2017-02-16 20:33                 ` Jarkko Sakkinen
  0 siblings, 1 reply; 38+ messages in thread
From: Dr. Greg Wettstein @ 2017-02-16 20:06 UTC (permalink / raw)
  To: Ken Goldman; +Cc: jarkko.sakkinen, linux-kernel, tpmdd-devel

On Thu, Feb 16, 2017 at 09:04:47AM -0500, Ken Goldman wrote:

Good morning to everyone, leveraging some time between planes.

> On 2/14/2017 9:38 AM, Dr. Greg Wettstein wrote:
> >
> >I don't think there is any doubt that running cryptographic primitives
> >in userspace is going to be faster than going to hardware.  Obviously
> >that also means there is no need for a TPM resource manager which has
> >been the subject of much discussion here.

> I don't understand that comment.
>
> The resource manager schedules user space access to the TPM.  It also
> handles swapping of objects in and out of the limited number of
> TPM slots.
> 
> Without a RM, either you'd have to permit only a single TPM connection,
> blocking all other connections, or you'd have different connections
> interfering with each other.

Yes, if multiple contexts of execution require access to the TPM a
resource manager is needed to arbitrate that access.
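
A toy model of that arbitration, for readers who want something
concrete: a fixed number of slots, with the least recently used object
swapped out when a new one is needed.  This is only a sketch under
stated assumptions.  It issues no TPM commands, the class and method
names are invented, and the slot count of 3 is an arbitrary choice
rather than a property of any particular chip or of the kernel patches
under discussion.

```python
from collections import OrderedDict


class ToyResourceManager:
    """Models swapping objects through a fixed number of TPM slots (LRU)."""

    def __init__(self, slots: int):
        self.slots = slots
        self.loaded = OrderedDict()  # handle -> owner, least recently used first
        self.saved = {}              # handle -> owner, context held off-chip
        self.swaps = 0

    def use(self, owner: str, handle: str) -> None:
        """Make sure handle is resident, evicting the LRU object if needed."""
        if handle in self.loaded:
            self.loaded.move_to_end(handle)
            return
        if len(self.loaded) >= self.slots:
            victim, victim_owner = self.loaded.popitem(last=False)
            self.saved[victim] = victim_owner  # stands in for a context save
            self.swaps += 1
        self.saved.pop(handle, None)           # stands in for a context load
        self.loaded[handle] = owner


rm = ToyResourceManager(slots=3)
for i in range(5):
    rm.use(f"vpn{i}", f"key{i}")
print(sorted(rm.loaded), rm.swaps)
```

Each swap in the model corresponds to real latency on hardware, which
is why the key load times measured earlier in the thread matter for
this design.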

I think, however, that we are talking past one another a bit.

We design and build systems which implement autonomous
self-regulation.  As such we need a hardware based confirmation that
the machine is in a given behavioral state.  This requires that we
reference a hardware root of trust, ie. the TPM.

Depending on the assurance granularity requirements, that may mean a
high rate of TPM verifications.  When I noticed you and James talking
about 'cloud based' levels of transactions I was assuming you were
operating at the transaction rates we build for, i.e. tens to hundreds
per second.  That didn't seem feasible given our hardware measurements
on Skylake and Kaby Lake based systems.
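
The gap can be made concrete with a small calculation over the
latencies reported in this thread, assuming (as a simplification) that
the TPM services one command at a time so that throughput is bounded by
1/latency.  The dictionary keys and function name are illustrative.

```python
# Latencies reported in this thread, in milliseconds.
LATENCY_MS = {
    "quote (Skylake)": 810,
    "verify (Skylake)": 635,
    "quote (Kaby Lake)": 600,
    "verify (Kaby Lake)": 100,
}
TARGET_RATES = (10, 100)  # operations/second, the range mentioned above


def max_rate(latency_ms: float) -> float:
    """Upper bound on operations/second for a strictly serial device."""
    return 1000.0 / latency_ms


for name, ms in LATENCY_MS.items():
    rate = max_rate(ms)
    met = [t for t in TARGET_RATES if rate >= t]
    print(f"{name:20s} {rate:5.2f} ops/s  targets met: {met}")
```

Under that assumption only the 100 ms verification reaches even the
bottom of the range, and no quote operation does.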

James had cited the CoreOS/Tectonic white paper as an example of TPMs
working at cloud scale.  Our conversation to date seems to indicate
that the accepted modality of security appears to be to do userspace
verification of container signatures.  Given the extensive dialogue in
the paper about using TPMs for security we had mistakenly believed
that container verifications were being pinned to current platform
status which didn't correlate with expected container start time
latencies.

Our behavioral assessment code is namespaced so a supervisory system
can make statements about the behavior of a container.  We have
concluded the only way that is possible is to use userspace TPM
implementations which can meet the necessary latency requirements.

Our point in all this is that it doesn't seem to make any sense to
implement anything in the kernel more than basic resource management.
If other 'virtualization' is needed, such as session state management
and the like, the community would seem to be served better by having a
solid userspace simulation environment, with appropriate hardware
security guarantees.  That would serve needs like re-keying support
for VPNaaS applications as well as high transaction rate environments,
i.e. why load the kernel with code to virtualize a resource when a
'user' can just be given its own TPM2 instance.

Just as an aside, has anyone given any thought about TPM2 resource
management in things like TXT/tboot environments?  The current tboot
code makes a rather naive assumption that it can take a handle slot to
protect its platform verification secret.  Doing resource management
correctly will require addressing extra-OS environments such as this
which may have TPM2 state requirement issues.

Our take away from all this is that it doesn't seem that we need to
worry about the fact that someone may have invented TPM2 hardware
which is faster than what we are developing on.... :-)

Have a good weekend.

Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"If you ever teach a yodeling class, probably the hardest thing is to
 keep the students from just trying to yodel right off. You see, we build
 to that."
                                -- Jack Handey
                                   Deep Thoughts


* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
  2017-02-16 20:06               ` [tpmdd-devel] " Dr. Greg Wettstein
@ 2017-02-16 20:33                 ` Jarkko Sakkinen
  2017-02-17  9:56                   ` Dr. Greg Wettstein
  0 siblings, 1 reply; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-02-16 20:33 UTC (permalink / raw)
  To: Dr. Greg Wettstein; +Cc: Ken Goldman, linux-kernel, tpmdd-devel

On Thu, Feb 16, 2017 at 02:06:42PM -0600, Dr. Greg Wettstein wrote:
> Just as an aside, has anyone given any thought about TPM2 resource
> management in things like TXT/tboot environments?  The current tboot
> code makes a rather naive assumption that it can take a handle slot to
> protect its platform verification secret.  Doing resource management
> correctly will require addressing extra-OS environments such as this
> which may have TPM2 state requirement issues.

The current implementation handles stuff created from regular /dev/tpm0
so I do not think this would be an issue. You can only access objects
from a TPM space that are created within that space.

/Jarkko

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
  2017-02-16 20:33                 ` Jarkko Sakkinen
@ 2017-02-17  9:56                   ` Dr. Greg Wettstein
  2017-02-17 12:37                     ` Jarkko Sakkinen
  0 siblings, 1 reply; 38+ messages in thread
From: Dr. Greg Wettstein @ 2017-02-17  9:56 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dr. Greg Wettstein, Ken Goldman, linux-kernel, tpmdd-devel

On Thu, Feb 16, 2017 at 10:33:04PM +0200, Jarkko Sakkinen wrote:

Good morning to everyone.

> On Thu, Feb 16, 2017 at 02:06:42PM -0600, Dr. Greg Wettstein wrote:
> > Just as an aside, has anyone given any thought about TPM2 resource
> > management in things like TXT/tboot environments?  The current tboot
> > code makes a rather naive assumption that it can take a handle slot to
> > protect its platform verification secret.  Doing resource management
> > correctly will require addressing extra-OS environments such as this
> > which may have TPM2 state requirement issues.

> The current implementation handles stuff created from regular
> /dev/tpm0 so I do not think this would be an issue. You can only
> access objects from a TPM space that are created within that space.

Unless I misunderstand, the number of transient objects which can be
managed is a characteristic of the hardware and is a limited resource,
hence our discussion on the notion of a resource manager to shuttle
context in and out of these limited slots.

On a Kabylake system, running the following command:

getcapability -cap 6 | grep trans

After booting into a TXT mediated measured launch environment (MLE) yields
the following:

TPM_PT 0000010e value 00000003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM

TPM_PT 00000207 value 00000002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM

Booting without TXT results in the getcapability call indicating that
three slots are available.  Based on that and reading the tboot code,
we are assuming the occupied slot is the ephemeral primary key
generated by tboot which seals the verification secret.
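
The slot arithmetic above can be reproduced with a short Python sketch
that parses the getcapability output quoted in this message.  The
function names and the total_slots argument are assumptions made for
illustration; the value 3 is taken from the non-TXT boot observation,
not from any tool documentation.

```python
import re

# Matches lines such as:
#   TPM_PT 00000207 value 00000002 TPM_PT_HR_TRANSIENT_AVAIL - estimate ...
_PT_LINE = re.compile(
    r"TPM_PT\s+([0-9a-fA-F]{8})\s+value\s+([0-9a-fA-F]{8})\s+(\w+)"
)


def parse_properties(text):
    """Return {property_name: int_value} for each TPM_PT line in text."""
    props = {}
    for match in _PT_LINE.finditer(text):
        _tag, value, name = match.groups()
        props[name] = int(value, 16)
    return props


def occupied_transient_slots(text, total_slots):
    """Slots in use = assumed total minus the reported AVAIL estimate."""
    avail = parse_properties(text)["TPM_PT_HR_TRANSIENT_AVAIL"]
    return total_slots - avail


# Output captured inside the MLE, as quoted above.
mle_output = """\
TPM_PT 0000010e value 00000003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM
TPM_PT 00000207 value 00000002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM
"""

# Assumption: three slots in total, as seen when booting without TXT.
print(occupied_transient_slots(mle_output, total_slots=3))
```

Note that TPM_PT_HR_TRANSIENT_AVAIL is described as an estimate, so the
result is only as reliable as that estimate.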

In an MLE it is possible to create and then flush a new ephemeral
primary key which results in the following getcapability output:

TPM_PT 00000207 value 00000003 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM

Which is probably going to be pretty surprising to tboot in the event
that it tries to re-verify the system state after a suspend event.

So based on that it would seem there would need to be some semblance
of cooperation between the resource manager and an extra-OS
utilization of TPM2 resources such as tboot.

Thoughts?

> /Jarkko

Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"For a successful technology, reality must take precedence over public
 relations, for nature cannot be fooled."
                                -- Richard Feynman

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
  2017-02-17  9:56                   ` Dr. Greg Wettstein
@ 2017-02-17 12:37                     ` Jarkko Sakkinen
  2017-02-17 22:37                       ` Dr. Greg Wettstein
  0 siblings, 1 reply; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-02-17 12:37 UTC (permalink / raw)
  To: Dr. Greg Wettstein; +Cc: Ken Goldman, linux-kernel, tpmdd-devel

On Fri, Feb 17, 2017 at 03:56:26AM -0600, Dr. Greg Wettstein wrote:
> On Thu, Feb 16, 2017 at 10:33:04PM +0200, Jarkko Sakkinen wrote:
> 
> Good morning to everyone.
> 
> > On Thu, Feb 16, 2017 at 02:06:42PM -0600, Dr. Greg Wettstein wrote:
> > > Just as an aside, has anyone given any thought about TPM2 resource
> > > management in things like TXT/tboot environments?  The current tboot
> > > code makes a rather naive assumption that it can take a handle slot to
> > > protect its platform verification secret.  Doing resource management
> > > correctly will require addressing extra-OS environments such as this
> > > which may have TPM2 state requirement issues.
> 
> > The current implementation handles stuff created from regular
> > /dev/tpm0 so I do not think this would be an issue. You can only
> > access objects from a TPM space that are created within that space.
> 
> Unless I misunderstand the number of transient objects which can be
> managed is a characteristic of the hardware and is a limited resource,
> hence our discussion on the notion of a resource manager to shuttle
> context in and out of these limited slots.
> 
> On a Kabylake system, running the following command:
> 
> getcapability -cap 6 | grep trans
> 
> After booting into a TXT mediated measured launch environment (MLE) yields
> the following:
> 
> TPM_PT 0000010e value 00000003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM
> 
> TPM_PT 00000207 value 00000002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM
> 
> Booting without TXT results in the getcapability call indicating that
> three slots are available.  Based on that and reading the tboot code,
> we are assuming the occupied slot is the ephemeral primary key
> generated by tboot which seals the verification secret.
> 
> In an MLE it is possible to create and then flush a new ephemeral
> primary key which results in the following getcapability output:
> 
> TPM_PT 00000207 value 00000003 TPM_PT_HR_TRANSIENT_AVAIL - estimate of
> the number of additional transient objects that could be loaded into TPM RAM
> 
> Which is probably going to be pretty surprising to tboot in the event
> that it tries to re-verify the system state after a suspend event.
> 
> So based on that it would seem there would need to be some semblance
> of cooperation between the resource manager and an extra-OS
> utilization of TPM2 resources such as tboot.
> 
> Thoughts?

The driver swaps all the objects in and out for each send-receive cycle.
So unless the driver is sending a command to a TPM, the resource manager
occupies zero slots. I do not see a reason to change this pattern in the
foreseeable future.
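
A toy Python model of the swap-per-command pattern described above may
make the slot accounting easier to follow.  ToyTpm and ToySpace are
hypothetical names invented for this sketch; this is not the kernel
code, only an illustration of loading everything before a command and
saving and flushing everything afterwards.

```python
class ToyTpm:
    """Hypothetical stand-in for a TPM with a fixed number of transient slots."""

    def __init__(self, slots=3):
        self.slots = slots
        self.loaded = set()

    def context_load(self, obj):
        if len(self.loaded) >= self.slots:
            raise RuntimeError("no free transient slot")
        self.loaded.add(obj)

    def save_and_flush(self, obj):
        # Models a context save followed by a flush: the slot becomes free.
        self.loaded.discard(obj)


class ToySpace:
    """Hypothetical per-client space; its objects live off-chip between commands."""

    def __init__(self, tpm, objects):
        self.tpm = tpm
        self.objects = list(objects)

    def transmit(self, command):
        try:
            for obj in self.objects:              # swap everything in
                self.tpm.context_load(obj)
            slots_during = len(self.tpm.loaded)   # occupancy while the command runs
        finally:
            for obj in self.objects:              # swap everything back out
                self.tpm.save_and_flush(obj)
        return command, slots_during


tpm = ToyTpm(slots=3)
space_a = ToySpace(tpm, ["key-a1", "key-a2"])
space_b = ToySpace(tpm, ["key-b1", "key-b2", "key-b3"])

print(space_a.transmit("sign"))
print(space_b.transmit("unseal"))
print(len(tpm.loaded))   # slots held by the toy manager between commands
```

In this model, an object loaded by something other than the toy manager
(as tboot's key would be) stays resident and reduces the slots available
during a transmit, which is the interaction raised earlier in the thread.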

I discussed some "lazier" schemes for swapping with James and Ken
in the early Fall but came to the conclusion that it would make the RM
really complicated. There would have to be some show-stopper workload
to even start considering it.

With the capacity of current TPMs and the amount of traffic and
workloads, it is really not worth the trouble.

I guess the way we do swapping kind of indirectly sorts out the issue
you described, doesn't it?

/Jarkko

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global session exhaustion
  2017-02-17 12:37                     ` Jarkko Sakkinen
@ 2017-02-17 22:37                       ` Dr. Greg Wettstein
  0 siblings, 0 replies; 38+ messages in thread
From: Dr. Greg Wettstein @ 2017-02-17 22:37 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Dr. Greg Wettstein, Ken Goldman, linux-kernel, tpmdd-devel

On Fri, Feb 17, 2017 at 02:37:12PM +0200, Jarkko Sakkinen wrote:

Hi, I hope the week is ending well for everyone.

> On Fri, Feb 17, 2017 at 03:56:26AM -0600, Dr. Greg Wettstein wrote:
> > On Thu, Feb 16, 2017 at 10:33:04PM +0200, Jarkko Sakkinen wrote:
> > 
> > Good morning to everyone.
> > 
> > > On Thu, Feb 16, 2017 at 02:06:42PM -0600, Dr. Greg Wettstein wrote:
> > > > Just as an aside, has anyone given any thought about TPM2 resource
> > > > management in things like TXT/tboot environments?  The current tboot
> > > > code makes a rather naive assumption that it can take a handle slot to
> > > > protect its platform verification secret.  Doing resource management
> > > > correctly will require addressing extra-OS environments such as this
> > > > which may have TPM2 state requirement issues.
> > 
> > > The current implementation handles stuff created from regular
> > > /dev/tpm0 so I do not think this would be an issue. You can only
> > > access objects from a TPM space that are created within that space.
> > 
> > Unless I misunderstand the number of transient objects which can be
> > managed is a characteristic of the hardware and is a limited resource,
> > hence our discussion on the notion of a resource manager to shuttle
> > context in and out of these limited slots.
> > 
> > On a Kabylake system, running the following command:
> > 
> > getcapability -cap 6 | grep trans
> > 
> > After booting into a TXT mediated measured launch environment (MLE) yields
> > the following:
> > 
> > TPM_PT 0000010e value 00000003 TPM_PT_HR_TRANSIENT_MIN - the minimum number of transient objects that can be held in TPM RAM
> > 
> > TPM_PT 00000207 value 00000002 TPM_PT_HR_TRANSIENT_AVAIL - estimate of the number of additional transient objects that could be loaded into TPM RAM
> > 
> > Booting without TXT results in the getcapability call indicating that
> > three slots are available.  Based on that and reading the tboot code,
> > we are assuming the occupied slot is the ephemeral primary key
> > generated by tboot which seals the verification secret.
> > 
> > In an MLE it is possible to create and then flush a new ephemeral
> > primary key which results in the following getcapability output:
> > 
> > TPM_PT 00000207 value 00000003 TPM_PT_HR_TRANSIENT_AVAIL - estimate of
> > the number of additional transient objects that could be loaded into TPM RAM
> > 
> > Which is probably going to be pretty surprising to tboot in the event
> > that it tries to re-verify the system state after a suspend event.
> > 
> > So based on that it would seem there would need to be some semblance
> > of cooperation between the resource manager and an extra-OS
> > utilization of TPM2 resources such as tboot.
> > 
> > Thoughts?

> The driver swaps all the objects in and out for each send-receive
> cycle.  So unless the driver is sending a command to a TPM, the
> resource manager occupies zero slots. I do not see a reason to
> change this pattern in the foreseeable future.
>
> I discussed some "lazier" schemes for swapping with James and
> Ken in the early Fall but came to the conclusion that it would make
> the RM really complicated. There would have to be some show-stopper
> workload to even start considering it.
>
> With the capacity of current TPMs and the amount of traffic and
> workloads, it is really not worth the trouble.
>
> I guess the way we do swapping kind of indirectly sorts out the
> issue you described, doesn't it?

I'm not sure; we've pulled down your resource manager branch so we can
figure out the exact mechanics of how it works.  Based on a cursory
read of the code it appears as if it loops through all three transient
handle slots and attempts to context save each transient object it
finds.  So if it does that for each send/receive cycle it should
theoretically inter-operate with TXT/tboot.

As noted previously, with the current kernel driver, we can see that
tboot has allocated a slot for the ephemeral key which is used to seal
the memory verification secrets.  This key gets allocated to handle
80000000 as one would anticipate.  However when we attempt to issue a
context save against that handle we get an error.
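
For readers decoding handle values such as 80000000: in TPM 2.0 the
most significant byte of a handle gives its type, and 0x80 denotes a
transient object.  A small Python sketch follows; the function name is
illustrative only.

```python
# Handle type is the most significant byte of a 32-bit TPM 2.0 handle.
HANDLE_TYPES = {
    0x00: "PCR",
    0x01: "NV index",
    0x02: "HMAC session",
    0x03: "policy session",
    0x40: "permanent",
    0x80: "transient",
    0x81: "persistent",
}


def handle_type(handle):
    """Return a readable name for the type byte of a TPM 2.0 handle."""
    return HANDLE_TYPES.get((handle >> 24) & 0xFF, "unknown")


print(handle_type(0x80000000))   # first transient object handle
print(handle_type(0x81000001))   # a persistent object handle
```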

Interestingly, when we attempt to flush that handle manually we
receive an error as well, but the number of available transient
handles increases by one which suggests the context flush cleared the
slot.

It seems that we should be able to manually replicate what the
resource manager is doing with the standard kernel driver or is this
an incorrect assumption?

We will have to spin up a kernel with your patches and see how it
reacts to the presence of the extra-OS handle allocation.

> /Jarkko

Greg

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"We know that communication is a problem, but the company is not going
 to discuss it with the employees."
                                -- Switching supervisor
                                   AT&T Long Lines Division

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]             ` <20170210084837.lq3mofgfwvjx623m-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-02-10 23:13               ` Kenneth Goldman
  0 siblings, 0 replies; 38+ messages in thread
From: Kenneth Goldman @ 2017-02-10 23:13 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: greg-R92VP3DqSWVWk0Htik3J/w, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	James Bottomley, linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f


[-- Attachment #1.1: Type: text/plain, Size: 842 bytes --]

On Thu, Feb 09, 2017 at 12:04:26PM -0700, Jason Gunthorpe wrote:
Jarkko Sakkinen <jarkko.sakkinen-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote on 02/10/2017
03:48:37 AM:

> > This series should focus on allowing a user space RM to co-exist with
> > the in-kernel services - lets try and tackle the idea of a
> > policy-restricted or unpriv-safe cdev when someone comes up with a
> > comprehensive proposal..

First, does "coexist" mean in series (two layers of RM) or in parallel
(both have simultaneous access)?

Or does "in-kernel services" not include an RM?

Assuming in series, it will complicate the lower RM.  The main issue,
as always, is session context.  If you permit the upper RM to save context,
the lower RM has to track the mapping, because the lower layer can
alter the saved session context (regapping).

[-- Attachment #1.2: Type: text/html, Size: 1048 bytes --]

[-- Attachment #2: Type: text/plain, Size: 202 bytes --]

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot

[-- Attachment #3: Type: text/plain, Size: 192 bytes --]

_______________________________________________
tpmdd-devel mailing list
tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org
https://lists.sourceforge.net/lists/listinfo/tpmdd-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]                     ` <1485814388.2518.28.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-01-30 22:46                       ` Ken Goldman
  2017-01-31 13:31                       ` Jarkko Sakkinen
@ 2017-02-10 17:22                       ` Kenneth Goldman
  2 siblings, 0 replies; 38+ messages in thread
From: Kenneth Goldman @ 2017-02-10 17:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA


[-- Attachment #1.1: Type: text/plain, Size: 1127 bytes --]

> > It does. My trusted keys implementation actually uses sessions.
>
> But as I read the code, I can't find where the kernel creates a
> session.  It looks like the session and hmac are passed in as option
> arguments, aren't they?

A bit of background.

In TPM 1.2, any authorization needed a session and an HMAC.

In TPM 2.0, authorization can be done using a plaintext password
(optionally) rather than an HMAC.  To me, kernel authorization
is a good use case for a plaintext password, since there is a
trusted path to the TPM.

When using a plaintext password, the caller does not require
startauthsession.  There is a special handle number that means
"plaintext password, no HMAC".  It's always available, and does
not occupy a session slot.
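
As an illustration of the password case, the authorization area of a
command can be built without any session being started.  The sketch
below assumes the TPM 2.0 layout of a UINT32 size followed by session
handle, empty nonce, session attributes, and the password carried in
the hmac field; the special handle is TPM_RS_PW (0x40000009).  The
function name is made up for this example.

```python
import struct

TPM_RS_PW = 0x40000009   # "plaintext password, no HMAC" authorization handle


def password_auth_area(password: bytes) -> bytes:
    """Build a command authorization area that uses a plaintext password."""
    auth = struct.pack(
        ">IHBH",
        TPM_RS_PW,       # sessionHandle
        0,               # nonce size: empty nonce
        0,               # sessionAttributes
        len(password),   # size of the hmac field, which carries the password
    ) + password
    # The area is prefixed with its own length as a big-endian UINT32.
    return struct.pack(">I", len(auth)) + auth


print(password_auth_area(b"").hex())
print(password_auth_area(b"abc").hex())
```

No StartAuthSession call appears anywhere in this path, which is why no
session slot is consumed.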

However, for the future ...

TPM 2.0 also has policy sessions.  E.g., use of the EK requires
a policy.

If the kernel ever wants to use policy, it needs startauthsession.

That's why I'm thinking that perhaps the space code should just
reserve ~2 sessions for its own use, so it never blocks
because user space has occupied all the session slots.
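
A minimal Python sketch of such a reservation, assuming a fixed number
of session slots and a hypothetical SessionPool class (nothing here is
taken from the patch set):

```python
class SessionPool:
    """Toy model: `total` session slots, some held back for kernel use."""

    def __init__(self, total, reserved_for_kernel=2):
        self.total = total
        self.reserved = reserved_for_kernel
        self.user_in_use = 0
        self.kernel_in_use = 0

    def _free(self):
        return self.total - self.user_in_use - self.kernel_in_use

    def acquire(self, kernel=False):
        """Return True if a session slot was granted, False otherwise."""
        if self._free() <= 0:
            return False
        if kernel:
            self.kernel_in_use += 1
            return True
        # User space may never take the slots held back for the kernel.
        if self.user_in_use >= self.total - self.reserved:
            return False
        self.user_in_use += 1
        return True


pool = SessionPool(total=4, reserved_for_kernel=2)
print([pool.acquire() for _ in range(3)])   # user space requests
print(pool.acquire(kernel=True))            # kernel request
```

With four slots and two reserved, user space is refused on its third
request while the kernel can still obtain a session.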

[-- Attachment #1.2: Type: text/html, Size: 1443 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]               ` <1486668591.2616.45.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-02-09 21:54                 ` Jason Gunthorpe
  0 siblings, 0 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2017-02-09 21:54 UTC (permalink / raw)
  To: James Bottomley
  Cc: Ken Goldman, greg-R92VP3DqSWVWk0Htik3J/w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Thu, Feb 09, 2017 at 11:29:51AM -0800, James Bottomley wrote:
> On Thu, 2017-02-09 at 12:04 -0700, Jason Gunthorpe wrote:
> > On Thu, Feb 09, 2017 at 05:19:22PM +0200, Jarkko Sakkinen wrote:
> > > The current patch set does not define policy. The simple policy
> > > addition that could be added soon is the limit of connections
> > > because it is easy to implement in non-intrusive way.
> > 
> > It is also trivial for a userspace RM to limit the number of sessions
> > or connections or otherwise to manage this limitation. It is hard to
> > see why we'd need kernel support for this.
> 
> Because the kernel is a primary TPM user.

When I said 'this' I meant a kernel policy to limit the number of
user connections.

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]   ` <201702090906.v1996c6a015552-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
  2017-02-09 15:19     ` Jarkko Sakkinen
@ 2017-02-09 20:05     ` James Bottomley
  1 sibling, 0 replies; 38+ messages in thread
From: James Bottomley @ 2017-02-09 20:05 UTC (permalink / raw)
  To: greg-R92VP3DqSWVWk0Htik3J/w, Jarkko Sakkinen
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Thu, 2017-02-09 at 03:06 -0600, Dr. Greg Wettstein wrote:
> On Jan 30, 11:58pm, Jarkko Sakkinen wrote:
> } Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for
> global sessi
> 
> Good morning, I hope the day is going well for everyone.
> 
> > I'm kind of leaning toward the opinion that we leave this commit out
> > from the first kernel release that will contain the resource 
> > manager with similar rationale as Jason gave me for whitelisting: 
> > get the basic stuff in and once it is used with some workloads 
> > whitelisting and exhaustion will take eventually the right form.
> > 
> > How would you feel about this?
> 
> I wasn't able to locate the exact context to include but we noted 
> with interest Ken's comments about his need to support a model where 
> a client needs a TPM session for transaction purposes which can last 
> a highly variable amount of time.  That and concerns about command
> white-listing, hardware denial of service and related issues tend to
> underscore our concerns about how much TPM resource management should
> go into the kernel.
> 
> Once an API is in the kernel we live with it forever.

This actually is far too strong a statement:  Once you make API
guarantees, you have to live with them forever, but there's a
considerable difference between an API guarantee and the API itself.
For instance, the kernel overlay filesystem has gone through several
iterations of file whiteouts (showing a file as deleted above a read
only copy): we began with an inode flag, moved to an extended attribute
and finally ended up with a device.  Each of those three changes was
fairly radical to the VFS API, but didn't fundamentally alter the API
guarantee (that users wouldn't see a file after it was deleted on an
overlay).

The API guarantee /dev/tpms0 is adding is that you won't see TPM out of
memory errors based on what other people are doing, so I think it's a
simple isolation guarantee we can live with long term.  I think that's
a solidly defensible one.

However, right at the moment the guarantee isn't that you won't be
affected by *anything* another user does, so it's a weak guarantee: you
will see uncorrectable regapping errors based on what others are doing
and you will see global session exhaustion.

I think we begin with the defensible weak guarantee and discuss how to
strengthen it.

James


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]           ` <20170209190426.GA1104-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-02-09 19:29             ` James Bottomley
       [not found]               ` <1486668591.2616.45.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: James Bottomley @ 2017-02-09 19:29 UTC (permalink / raw)
  To: Jason Gunthorpe, Jarkko Sakkinen
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, greg-R92VP3DqSWVWk0Htik3J/w

On Thu, 2017-02-09 at 12:04 -0700, Jason Gunthorpe wrote:
> On Thu, Feb 09, 2017 at 05:19:22PM +0200, Jarkko Sakkinen wrote:
> > The current patch set does not define policy. The simple policy
> > addition that could be added soon is the limit of connections
> > because it is easy to implement in non-intrusive way.
> 
> It is also trivial for a userspace RM to limit the number of sessions
> or connections or otherwise to manage this limitation. It is hard to
> see why we'd need kernel support for this.

Because the kernel is a primary TPM user.  We can't have the kernel
call on the in-userspace resource manager without causing a deadlock,
so we need as much of the RM as is needed to support the kernel in the
kernel itself.

James


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]       ` <20170209151922.cqo32h4io5dqyvvw-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-02-09 19:04         ` Jason Gunthorpe
       [not found]           ` <20170209190426.GA1104-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2017-02-10  8:48           ` [tpmdd-devel] " Jarkko Sakkinen
  0 siblings, 2 replies; 38+ messages in thread
From: Jason Gunthorpe @ 2017-02-09 19:04 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Ken Goldman, greg-R92VP3DqSWVWk0Htik3J/w,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, James Bottomley,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f

On Thu, Feb 09, 2017 at 05:19:22PM +0200, Jarkko Sakkinen wrote:
> > userspace instance with subsequent relinquishment of privilege.  At
> > that point one has the freedom to implement all sorts of policy.
> 
> If you look at the patch set that I sent yesterday it exactly has a
> feature that makes it more lean for a privileged process to implement
> a resource manager.

I continue to think, based on comments like this, that you should not
implement tpms0 in the first revision either. That is also something
we have to live with forever, and it can never become the 'policy
limited' or 'unpriv safe' access point to the kernel.  I.e., go back to
something based on tpm0 with ioctl.

This series should focus on allowing a user space RM to co-exist with
the in-kernel services - let's try and tackle the idea of a
policy-restricted or unpriv-safe cdev when someone comes up with a
comprehensive proposal.

> The current patch set does not define policy. The simple policy
> addition that could be added soon is the limit of connections
> because it is easy to implement in non-intrusive way.

It is also trivial for a userspace RM to limit the number of sessions
or connections or otherwise to manage this limitation. It is hard to
see why we'd need kernel support for this.

The main issue from the kernel perspective is how to allow sessions
to be used in-kernel and continue to make progress when they start to
run out.

Jason

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]   ` <201702090906.v1996c6a015552-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
@ 2017-02-09 15:19     ` Jarkko Sakkinen
       [not found]       ` <20170209151922.cqo32h4io5dqyvvw-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-02-09 20:05     ` James Bottomley
  1 sibling, 1 reply; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-02-09 15:19 UTC (permalink / raw)
  To: greg-R92VP3DqSWVWk0Htik3J/w
  Cc: James Bottomley, tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Thu, Feb 09, 2017 at 03:06:38AM -0600, Dr. Greg Wettstein wrote:
> On Jan 30, 11:58pm, Jarkko Sakkinen wrote:
> } Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global sessi
> 
> Good morning, I hope the day is going well for everyone.
> 
> > I'm kind of leaning toward the opinion that we leave this commit out
> > from the first kernel release that will contain the resource manager
> > with similar rationale as Jason gave me for whitelisting: get the
> > basic stuff in and once it is used with some workloads whitelisting
> > and exhaustion will take eventually the right form.
> >
> > How would you feel about this?
> 
> I wasn't able to locate the exact context to include but we noted with
> interest Ken's comments about his need to support a model where a
> client needs a TPM session for transaction purposes which can last a
> highly variable amount of time.  That and concerns about command
> white-listing, hardware denial of service and related issues tend to
> underscore our concerns about how much TPM resource management should
> go into the kernel.
> 
> Once an API is in the kernel we live with it forever.  Particularly
> with respect to TPM2, our field experiences suggest it is way too
> early to bake long term functionality into the kernel.
> 
> Referring back to Ken's comments about having 20+ clients waiting to
> get access to the hardware.  Even with the focus in TPM2 on having it
> be more of a cryptographic accelerator are we convinced that the
> hardware is ever going to be fast enough for a model of having it
> directly service large numbers of transactions in something like a
> 'cloud' model?

I doubt it. Personally I would rather just limit the number of
connections to /dev/tpms0 than have a complex lease model (like the one
implemented in this commit). The limit could have a '0' setting, which
would disable it so that it doesn't cause harm to those who do not need it.
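
A sketch of what such a limit might look like, written in Python for
brevity.  The class and its max_connections parameter are hypothetical;
the only behaviour taken from the text above is that a setting of 0
disables the limit.

```python
class ConnectionLimit:
    """Toy model of a cap on simultaneously open connections."""

    def __init__(self, max_connections=0):
        self.max_connections = max_connections   # 0 means "no limit"
        self.open_count = 0

    def open(self):
        """Return True if the connection is admitted, False if refused."""
        if self.max_connections and self.open_count >= self.max_connections:
            return False
        self.open_count += 1
        return True

    def close(self):
        if self.open_count > 0:
            self.open_count -= 1


limited = ConnectionLimit(max_connections=2)
print([limited.open() for _ in range(3)])     # third open is refused

unlimited = ConnectionLimit(max_connections=0)
print(all(unlimited.open() for _ in range(100)))
```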

> The industry has very solid userspace implementations of TPM2.  It
> seems that with respect to resource management about all we would want
> in the kernel is enough management to allow multiple privileged
> userspace process to establish a root of trust for a TPM2 based
> userspace instance with subsequent relinquishment of privilege.  At
> that point one has the freedom to implement all sorts of policy.

If you look at the patch set that I sent yesterday it exactly has a
feature that makes it more lean for a privileged process to implement
a resource manager.

> Given the potential lifespan of these security technologies I think a
> kernel design needs to factor in the availability of trusted execution
> environment's such as SGX as well.  Politics aside, such environments
> do have the ability to significantly modify the guarantees which can
> be afforded to architectural models which focus on using the hardware
> TPM as a root of trust for userspace implementations of 'TPM'
> functionality and policy.

Agreed.

> We can always add functionality to the kernel but we can never
> subtract.  It is way too early to lock security architecture decisions
> into the kernel.

The current patch set does not define policy. A simple policy
addition that could be made soon is a limit on the number of
connections, because it is easy to implement in a non-intrusive way.

> 
> > /Jarkko
> 
> Have a good weekend.
> 
> Greg

Likewise!

/Jarkko

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found] <jarkko.sakkinen@linux.intel.com>
@ 2017-02-09  9:06 ` Dr. Greg Wettstein
       [not found]   ` <201702090906.v1996c6a015552-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: Dr. Greg Wettstein @ 2017-02-09  9:06 UTC (permalink / raw)
  To: Jarkko Sakkinen, James Bottomley
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Jan 30, 11:58pm, Jarkko Sakkinen wrote:
} Subject: Re: [tpmdd-devel] [RFC] tpm2-space: add handling for global sessi

Good morning, I hope the day is going well for everyone.

> I'm kind of leaning toward the opinion that we leave this commit out
> from the first kernel release that will contain the resource manager
> with similar rationale as Jason gave me for whitelisting: get the
> basic stuff in and once it is used with some workloads whitelisting
> and exhaustion will take eventually the right form.
>
> How would you feel about this?

I wasn't able to locate the exact context to include but we noted with
interest Ken's comments about his need to support a model where a
client needs a TPM session for transaction purposes which can last a
highly variable amount of time.  That and concerns about command
white-listing, hardware denial of service and related issues tend to
underscore our concerns about how much TPM resource management should
go into the kernel.

Once an API is in the kernel we live with it forever.  Particularly
with respect to TPM2, our field experiences suggest it is way too
early to bake long term functionality into the kernel.

Referring back to Ken's comments about having 20+ clients waiting to
get access to the hardware.  Even with the focus in TPM2 on having it
be more of a cryptographic accelerator are we convinced that the
hardware is ever going to be fast enough for a model of having it
directly service large numbers of transactions in something like a
'cloud' model?

The industry has very solid userspace implementations of TPM2.  It
seems that with respect to resource management about all we would want
in the kernel is enough management to allow multiple privileged
userspace processes to establish a root of trust for a TPM2 based
userspace instance with subsequent relinquishment of privilege.  At
that point one has the freedom to implement all sorts of policy.

Given the potential lifespan of these security technologies I think a
kernel design needs to factor in the availability of trusted execution
environments such as SGX as well.  Politics aside, such environments
do have the ability to significantly modify the guarantees which can
be afforded to architectural models which focus on using the hardware
TPM as a root of trust for userspace implementations of 'TPM'
functionality and policy.

We can always add functionality to the kernel but we can never
subtract.  It is way too early to lock security architecture decisions
into the kernel.

> /Jarkko

Have a good weekend.

Greg

}-- End of excerpt from Jarkko Sakkinen

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg-R92VP3DqSWVWk0Htik3J/w@public.gmane.org
------------------------------------------------------------------------------
"If I'd listened to customers, I'd have given them a faster horse."
                                -- Henry Ford


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
  2017-01-31 19:28               ` Ken Goldman
@ 2017-01-31 19:55                 ` James Bottomley
  0 siblings, 0 replies; 38+ messages in thread
From: James Bottomley @ 2017-01-31 19:55 UTC (permalink / raw)
  To: Ken Goldman, tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Tue, 2017-01-31 at 14:28 -0500, Ken Goldman wrote:
> On 1/30/2017 11:04 AM, James Bottomley wrote:
> > 
> > This depends what your threat model is.  For ssh keys, you worry
> > that someone might be watching, so you use HMAC authority even for 
> > a local TPM.
> 
> If someone can "watch" my local process, they can capture my password
> anyway.  Does using a password that the attacker knows to HMAC the 
> command help?

It's about attack surface.  If you want my password and I use TPM_RS_PW
then you either prise it out of my app or snoop the command path.  If I
always use HMAC, I know you can only prise it out of my app (reduction
in attack surface) and I can plan defences accordingly (not saying I'll
be successful, just saying I have a better idea where the attack is
coming from).

> > In the cloud, you don't quite know where the TPM is, so again you'd
> > use HMAC sessions ... however, in both use cases the sessions 
> > should be very short lived.
> 
> If your entire application is in the cloud, then I think the same 
> question as above applies.
> 
> If you have your application on one platform (that you trust) and the
> TPM is on another (that you don't trust), then I absolutely agree 
> that HMAC (and parameter encryption) are necessary.

It's attack surface again ... although lengthening the transmission
pathway, which happens in the cloud, correspondingly increases that
surface.

Look at it this way: if your TPM were network remote, would you still
think TPM_RS_PW to be appropriate?  I suspect not because the network
is seen as a very insecure pathway.  We can argue about the relative
security or insecurity of other pathways to the TPM, but it's
unarguable that using HMAC and parameter encryption means we don't have
to (and so is best practice).
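
The difference in what crosses the command path can be illustrated with
a small sketch.  This is not a TPM implementation: it assumes an
unbound, unsalted session (so the HMAC key is just the authorization
value), and the function names, the example secret and the stand-in
command hash are all hypothetical.

```python
import hashlib
import hmac
import os


def password_auth(auth_value: bytes) -> bytes:
    # Password session (TPM_RS_PW): the authorization area carries the
    # secret itself, so anyone snooping the command path sees it.
    return auth_value


def hmac_auth(auth_value: bytes, cp_hash: bytes, nonce_caller: bytes,
              nonce_tpm: bytes, session_attrs: bytes) -> bytes:
    # Simplified HMAC session: only a keyed digest over the command hash,
    # both nonces and the session attributes goes on the wire.
    msg = cp_hash + nonce_caller + nonce_tpm + session_attrs
    return hmac.new(auth_value, msg, hashlib.sha256).digest()


secret = b"my-key-password"                      # hypothetical secret
cp_hash = hashlib.sha256(b"command bytes").digest()  # stand-in for cpHash
nonce_tpm = os.urandom(16)

wire_pw = password_auth(secret)
wire_hmac_1 = hmac_auth(secret, cp_hash, os.urandom(16), nonce_tpm, b"\x01")
wire_hmac_2 = hmac_auth(secret, cp_hash, os.urandom(16), nonce_tpm, b"\x01")
```

With fresh nonces the two HMAC values differ even for an identical
command, so a captured value is of no use for a later command, whereas
the password form exposes the secret on every use.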

James



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]             ` <1485792295.2518.23.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-01-30 21:58               ` Jarkko Sakkinen
@ 2017-01-31 19:28               ` Ken Goldman
  2017-01-31 19:55                 ` James Bottomley
  1 sibling, 1 reply; 38+ messages in thread
From: Ken Goldman @ 2017-01-31 19:28 UTC (permalink / raw)
  To: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 1/30/2017 11:04 AM, James Bottomley wrote:
>
> This depends what your threat model is.  For ssh keys, you worry
> that someone might be watching, so you use HMAC authority even for a
> local TPM.

If someone can "watch" my local process, they can capture my password 
anyway.  Does using a password that the attacker knows to HMAC the 
command help?

> In the cloud, you don't quite know where the TPM is, so again you'd
> use HMAC sessions ... however, in both use cases the sessions should
> be very short lived.

If your entire application is in the cloud, then I think the same 
question as above applies.

If you have your application on one platform (that you trust) and the 
TPM is on another (that you don't trust), then I absolutely agree that 
HMAC (and parameter encryption) are necessary.







^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]                     ` <1485814388.2518.28.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-01-30 22:46                       ` Ken Goldman
@ 2017-01-31 13:31                       ` Jarkko Sakkinen
  2017-02-10 17:22                       ` Kenneth Goldman
  2 siblings, 0 replies; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-01-31 13:31 UTC (permalink / raw)
  To: James Bottomley
  Cc: Ken Goldman, linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Mon, Jan 30, 2017 at 02:13:08PM -0800, James Bottomley wrote:
> On Mon, 2017-01-30 at 23:58 +0200, Jarkko Sakkinen wrote:
> > On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> > > On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > > > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > > > 
> > > > > > Beware the nasty corner case:
> > > > > > 
> > > > > > - Application asks for a session and gets 02000000
> > > > > > 
> > > > > > - Time elapses and 02000000 gets forcibly flushed
> > > > > > 
> > > > > > - Later, app comes back, asks for a second session and again
> > > > > > gets
> > > > > > 02000000.
> > > > > > 
> > > > > > - App gets very confused.
> > > > > > 
> > > > > > May it be better to close the connection completely, which
> > > > > > the
> > > > > > application can detect, than flush a session and give this
> > > > > > corner
> > > > > > case?
> > > > > 
> > > > > if I look at the code I've written, I don't know what the
> > > > > session
> > > > > number is, I just save sessionHandle in a variable for later
> > > > > use 
> > > > > (lets say to v1).  If I got the same session number returned at
> > > > > a 
> > > > > later time and placed it in v2, all I'd notice is that an 
> > > > > authorization using v1 would fail.  I'm not averse to killing
> > > > > the 
> > > > > entire connection but, assuming you have fallback, it might be 
> > > > > kinder simply to ensure that the operations with the reclaimed 
> > > > > session fail (which is what the code currently does).
> > > > 
> > > > My worry is that this session failure cannot be detected by the 
> > > > application.  An HMAC failure could cause the app to tell a user
> > > > that
> > > > they entered the wrong password.  Misleading.  On the TPM, it
> > > > could 
> > > > trigger the dictionary attack lockout.  For a PIN index, it could
> > > > consume a failure count.  Killing a policy session that has e.g.,
> > > > a 
> > > > policy signed term could cause the application to go back to some
> > > > external entity for another authorization signature.
> > > > 
> > > > Let's go up to the stack.  What's the attack?
> > > > 
> > > > If we're worried about many simultaneous applications (wouldn't
> > > > that 
> > > > be wonderful), why not just let startauthsession fail?  The 
> > > > application can just retry periodically.
> > > 
> > > How in that scenario do we ensure that a session becomes available?
> > >  Once that's established, there's no real difference between
> > > retrying
> > > the startauthsession in the kernel when we know the session is
> > > available and forcing userspace to do the retry except that the
> > > former
> > > has a far greater chance of success (and it's only about 6 lines of
> > > code).
> > > 
> > > >   Just allocate them in triples so there's no deadlock.
> > > 
> > > Is this the application or the kernel?  If it's the kernel, that
> > > adds a
> > > lot of complexity.
> > > 
> > > > If we're worried about a DoS attack, killing a session just helps
> > > > the
> > > > attacker.  The attacker can create a few connections and spin on 
> > > > startauthsession, locking everyone out anyway.
> > > 
> > > There are two considerations here: firstly we'd need to introduce a
> > > mechanism to "kill" the connection.  Probably we'd simply error
> > > every
> > > command on the space until it was closed.  The second is which
> > > scenario
> > > is more reasonable: Say the application simply forgot to flush the
> > > session and will never use it again.  Simply reclaiming the session
> > > would produce no effect at all on the application in this scenario.
> > >  However, I have no data to say what's likely.
> > > 
> > > > ~~
> > > > 
> > > > Also, let's remember that this is a rare application.  Sessions
> > > > are 
> > > > only needed for remote access (requiring encryption, HMAC or
> > > > salt), 
> > > > or policy sessions.
> > > 
> > > This depends what your threat model is.  For ssh keys, you worry
> > > that
> > > someone might be watching, so you use HMAC authority even for a
> > > local
> > > TPM.  In the cloud, you don't quite know where the TPM is, so again
> > > you'd use HMAC sessions ... however, in both use cases the sessions
> > > should be very short lived.
> > > 
> > > > ~~
> > > > 
> > > > Should the code also reserve a session for the kernel?  Mark it
> > > > not 
> > > > kill'able?
> > > 
> > > At the moment, the kernel doesn't use sessions, so let's worry
> > > about
> > > that problem at the point it arises (if it ever arises).
> > > 
> > > James
> > 
> > It does. My trusted keys implementation actually uses sessions.
> 
> But as I read the code, I can't find where the kernel creates a
> session.  It looks like the session and hmac are passed in as option
> arguments, aren't they?

Yes. Sorry, I mixed up things.

> > I'm kind dilating to an opinion that we would leave this commit out 
> > from the first kernel release that will contain the resource manager 
> > with similar rationale as Jason gave me for whitelisting: get the 
> > basic stuff in and once it is used with some workloads whitelisting 
> > and exhaustion will take eventually the right form.
> > 
> > How would you feel about this?
> 
> As long as we get patch 1/2 then applications using sessions will
> actually work with spaces, so taking more time with 2/2 is fine by me.
> 
> James

1/2 contains code that, with a few iterations, will be in a form that
I'm able to merge.

With 2/2 I'm not saying it is the wrong approach, but I cannot yet say
that I'm confident that it would be the best approach.

I think that the transient object and infrastructure stuff that is
already in the patch set and 1/2 session is the subset of commits where
we can be fairly confident that we are doing the right thing.

I'll start preparing a patch set with this content without RFC tag.

/Jarkko


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]                     ` <1485814388.2518.28.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-01-30 22:46                       ` Ken Goldman
  2017-01-31 13:31                       ` Jarkko Sakkinen
  2017-02-10 17:22                       ` Kenneth Goldman
  2 siblings, 0 replies; 38+ messages in thread
From: Ken Goldman @ 2017-01-30 22:46 UTC (permalink / raw)
  To: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 1/30/2017 5:13 PM, James Bottomley wrote:
>
> But as I read the code, I can't find where the kernel creates a
> session.  It looks like the session and hmac are passed in as option
> arguments, aren't they?

A bit of background.

Unlike TPM 1.2, which always required an HMAC, TPM 2.0 has plaintext
password sessions, with the session number TPM_RS_PW.  This type of
session does not have to be created or flushed.  Since the kernel has a 
presumed trusted path to the TPM, I don't see any need for an HMAC session.

However, TPM 2.0 does have policy sessions.  These do have to be
created.  The kernel use case may be in the future.

The first use I encountered for a policy session is use of the EK.  The 
EK has no password of its own, but rather has a policy that points to 
the endorsement hierarchy authorization - policy secret.
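
As a rough illustration of the three kinds of authorization handle
discussed here, the sketch below classifies a handle by its top byte.
The constant values are the ones the TPM 2.0 specification assigns; the
helper functions themselves are hypothetical and only meant to show
which handles a resource manager would need to track and flush.

```python
TPM_RS_PW = 0x40000009          # permanent handle for password authorization
TPM_HT_HMAC_SESSION = 0x02      # handle type (top byte) of HMAC sessions
TPM_HT_POLICY_SESSION = 0x03    # handle type (top byte) of policy sessions


def session_kind(handle: int) -> str:
    """Classify an authorization handle by its handle-type byte."""
    if handle == TPM_RS_PW:
        return "password"
    handle_type = handle >> 24
    if handle_type == TPM_HT_HMAC_SESSION:
        return "hmac"
    if handle_type == TPM_HT_POLICY_SESSION:
        return "policy"
    return "not a session"


def needs_flush(handle: int) -> bool:
    # Only sessions that were created with StartAuthSession occupy a
    # session slot; the password pseudo-session never has to be flushed.
    return session_kind(handle) in ("hmac", "policy")
```

For example, needs_flush(0x02000000) is true while
needs_flush(TPM_RS_PW) is false, which is why password authorization
never contributes to session exhaustion.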





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]                 ` <20170130215815.4lr42ob7e4cycwgi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-01-30 22:13                   ` James Bottomley
       [not found]                     ` <1485814388.2518.28.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: James Bottomley @ 2017-01-30 22:13 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: Ken Goldman, linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Mon, 2017-01-30 at 23:58 +0200, Jarkko Sakkinen wrote:
> On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> > On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > > 
> > > > > Beware the nasty corner case:
> > > > > 
> > > > > - Application asks for a session and gets 02000000
> > > > > 
> > > > > - Time elapses and 02000000 gets forcibly flushed
> > > > > 
> > > > > - Later, app comes back, asks for a second session and again
> > > > > gets
> > > > > 02000000.
> > > > > 
> > > > > - App gets very confused.
> > > > > 
> > > > > May it be better to close the connection completely, which
> > > > > the
> > > > > application can detect, than flush a session and give this
> > > > > corner
> > > > > case?
> > > > 
> > > > if I look at the code I've written, I don't know what the
> > > > session
> > > > number is, I just save sessionHandle in a variable for later
> > > > use 
> > > > (lets say to v1).  If I got the same session number returned at
> > > > a 
> > > > later time and placed it in v2, all I'd notice is that an 
> > > > authorization using v1 would fail.  I'm not averse to killing
> > > > the 
> > > > entire connection but, assuming you have fallback, it might be 
> > > > kinder simply to ensure that the operations with the reclaimed 
> > > > session fail (which is what the code currently does).
> > > 
> > > My worry is that this session failure cannot be detected by the 
> > > application.  An HMAC failure could cause the app to tell a user
> > > that
> > > they entered the wrong password.  Misleading.  On the TPM, it
> > > could 
> > > trigger the dictionary attack lockout.  For a PIN index, it could
> > > consume a failure count.  Killing a policy session that has e.g.,
> > > a 
> > > policy signed term could cause the application to go back to some
> > > external entity for another authorization signature.
> > > 
> > > Let's go up to the stack.  What's the attack?
> > > 
> > > If we're worried about many simultaneous applications (wouldn't
> > > that 
> > > be wonderful), why not just let startauthsession fail?  The 
> > > application can just retry periodically.
> > 
> > How in that scenario do we ensure that a session becomes available?
> >  Once that's established, there's no real difference between
> > retrying
> > the startauthsession in the kernel when we know the session is
> > available and forcing userspace to do the retry except that the
> > former
> > has a far greater chance of success (and it's only about 6 lines of
> > code).
> > 
> > >   Just allocate them in triples so there's no deadlock.
> > 
> > Is this the application or the kernel?  If it's the kernel, that
> > adds a
> > lot of complexity.
> > 
> > > If we're worried about a DoS attack, killing a session just helps
> > > the
> > > attacker.  The attacker can create a few connections and spin on 
> > > startauthsession, locking everyone out anyway.
> > 
> > There are two considerations here: firstly we'd need to introduce a
> > mechanism to "kill" the connection.  Probably we'd simply error
> > every
> > command on the space until it was closed.  The second is which
> > scenario
> > is more reasonable: Say the application simply forgot to flush the
> > session and will never use it again.  Simply reclaiming the session
> > would produce no effect at all on the application in this scenario.
> >  However, I have no data to say what's likely.
> > 
> > > ~~
> > > 
> > > Also, let's remember that this is a rare application.  Sessions
> > > are 
> > > only needed for remote access (requiring encryption, HMAC or
> > > salt), 
> > > or policy sessions.
> > 
> > This depends what your threat model is.  For ssh keys, you worry
> > that
> > someone might be watching, so you use HMAC authority even for a
> > local
> > TPM.  In the cloud, you don't quite know where the TPM is, so again
> > you'd use HMAC sessions ... however, in both use cases the sessions
> > should be very short lived.
> > 
> > > ~~
> > > 
> > > Should the code also reserve a session for the kernel?  Mark it
> > > not 
> > > kill'able?
> > 
> > At the moment, the kernel doesn't use sessions, so let's worry
> > about
> > that problem at the point it arises (if it ever arises).
> > 
> > James
> 
> It does. My trusted keys implementation actually uses sessions.

But as I read the code, I can't find where the kernel creates a
session.  It looks like the session and hmac are passed in as option
arguments, aren't they?

> I'm kind dilating to an opinion that we would leave this commit out 
> from the first kernel release that will contain the resource manager 
> with similar rationale as Jason gave me for whitelisting: get the 
> basic stuff in and once it is used with some workloads whitelisting 
> and exhaustion will take eventually the right form.
> 
> How would you feel about this?

As long as we get patch 1/2 then applications using sessions will
actually work with spaces, so taking more time with 2/2 is fine by me.

James




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]             ` <1485792295.2518.23.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-01-30 21:58               ` Jarkko Sakkinen
       [not found]                 ` <20170130215815.4lr42ob7e4cycwgi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-01-31 19:28               ` Ken Goldman
  1 sibling, 1 reply; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-01-30 21:58 UTC (permalink / raw)
  To: James Bottomley
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Mon, Jan 30, 2017 at 08:04:55AM -0800, James Bottomley wrote:
> On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> > On 1/27/2017 5:04 PM, James Bottomley wrote:
> > 
> > > > Beware the nasty corner case:
> > > > 
> > > > - Application asks for a session and gets 02000000
> > > > 
> > > > - Time elapses and 02000000 gets forcibly flushed
> > > > 
> > > > - Later, app comes back, asks for a second session and again gets
> > > > 02000000.
> > > > 
> > > > - App gets very confused.
> > > > 
> > > > May it be better to close the connection completely, which the
> > > > application can detect, than flush a session and give this corner
> > > > case?
> > > 
> > > if I look at the code I've written, I don't know what the session
> > > number is, I just save sessionHandle in a variable for later use 
> > > (lets say to v1).  If I got the same session number returned at a 
> > > later time and placed it in v2, all I'd notice is that an 
> > > authorization using v1 would fail.  I'm not averse to killing the 
> > > entire connection but, assuming you have fallback, it might be 
> > > kinder simply to ensure that the operations with the reclaimed 
> > > session fail (which is what the code currently does).
> > 
> > My worry is that this session failure cannot be detected by the 
> > application.  An HMAC failure could cause the app to tell a user that
> > they entered the wrong password.  Misleading.  On the TPM, it could 
> > trigger the dictionary attack lockout.  For a PIN index, it could 
> > consume a failure count.  Killing a policy session that has e.g., a 
> > policy signed term could cause the application to go back to some 
> > external entity for another authorization signature.
> > 
> > Let's go up to the stack.  What's the attack?
> > 
> > If we're worried about many simultaneous applications (wouldn't that 
> > be wonderful), why not just let startauthsession fail?  The 
> > application can just retry periodically.
> 
> How in that scenario do we ensure that a session becomes available? 
>  Once that's established, there's no real difference between retrying
> the startauthsession in the kernel when we know the session is
> available and forcing userspace to do the retry except that the former
> has a far greater chance of success (and it's only about 6 lines of
> code).
> 
> >   Just allocate them in triples so there's no deadlock.
> 
> Is this the application or the kernel?  If it's the kernel, that adds a
> lot of complexity.
> 
> > If we're worried about a DoS attack, killing a session just helps the
> > attacker.  The attacker can create a few connections and spin on 
> > startauthsession, locking everyone out anyway.
> 
> There are two considerations here: firstly we'd need to introduce a
> mechanism to "kill" the connection.  Probably we'd simply error every
> command on the space until it was closed.  The second is which scenario
> is more reasonable: Say the application simply forgot to flush the
> session and will never use it again.  Simply reclaiming the session
> would produce no effect at all on the application in this scenario. 
>  However, I have no data to say what's likely.
> 
> > ~~
> > 
> > Also, let's remember that this is a rare application.  Sessions are 
> > only needed for remote access (requiring encryption, HMAC or salt), 
> > or policy sessions.
> 
> This depends what your threat model is.  For ssh keys, you worry that
> someone might be watching, so you use HMAC authority even for a local
> TPM.  In the cloud, you don't quite know where the TPM is, so again
> you'd use HMAC sessions ... however, in both use cases the sessions
> should be very short lived.
> 
> > ~~
> > 
> > Should the code also reserve a session for the kernel?  Mark it not 
> > kill'able?
> 
> At the moment, the kernel doesn't use sessions, so let's worry about
> that problem at the point it arises (if it ever arises).
> 
> James

It does. My trusted keys implementation actually uses sessions.

I'm kind of inclined to the opinion that we should leave this commit
out of the first kernel release that will contain the resource manager,
with a similar rationale to the one Jason gave me for whitelisting: get
the basic stuff in, and once it is used with some workloads,
whitelisting and exhaustion handling will eventually take the right form.

How would you feel about this?

/Jarkko


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
  2017-01-30  0:52         ` Ken Goldman
@ 2017-01-30 16:04           ` James Bottomley
       [not found]             ` <1485792295.2518.23.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: James Bottomley @ 2017-01-30 16:04 UTC (permalink / raw)
  To: Ken Goldman, tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Sun, 2017-01-29 at 19:52 -0500, Ken Goldman wrote:
> On 1/27/2017 5:04 PM, James Bottomley wrote:
> 
> > > Beware the nasty corner case:
> > > 
> > > - Application asks for a session and gets 02000000
> > > 
> > > - Time elapses and 02000000 gets forcibly flushed
> > > 
> > > - Later, app comes back, asks for a second session and again gets
> > > 02000000.
> > > 
> > > - App gets very confused.
> > > 
> > > May it be better to close the connection completely, which the
> > > application can detect, than flush a session and give this corner
> > > case?
> > 
> > if I look at the code I've written, I don't know what the session
> > number is, I just save sessionHandle in a variable for later use 
> > (lets say to v1).  If I got the same session number returned at a 
> > later time and placed it in v2, all I'd notice is that an 
> > authorization using v1 would fail.  I'm not averse to killing the 
> > entire connection but, assuming you have fallback, it might be 
> > kinder simply to ensure that the operations with the reclaimed 
> > session fail (which is what the code currently does).
> 
> My worry is that this session failure cannot be detected by the 
> application.  An HMAC failure could cause the app to tell a user that
> they entered the wrong password.  Misleading.  On the TPM, it could 
> trigger the dictionary attack lockout.  For a PIN index, it could 
> consume a failure count.  Killing a policy session that has e.g., a 
> policy signed term could cause the application to go back to some 
> external entity for another authorization signature.
> 
> Let's go up to the stack.  What's the attack?
> 
> If we're worried about many simultaneous applications (wouldn't that 
> be wonderful), why not just let startauthsession fail?  The 
> application can just retry periodically.

How in that scenario do we ensure that a session becomes available? 
 Once that's established, there's no real difference between retrying
the startauthsession in the kernel when we know the session is
available and forcing userspace to do the retry except that the former
has a far greater chance of success (and it's only about 6 lines of
code).

>   Just allocate them in triples so there's no deadlock.

Is this the application or the kernel?  If it's the kernel, that adds a
lot of complexity.

> If we're worried about a DoS attack, killing a session just helps the
> attacker.  The attacker can create a few connections and spin on 
> startauthsession, locking everyone out anyway.

There are two considerations here: firstly we'd need to introduce a
mechanism to "kill" the connection.  Probably we'd simply error every
command on the space until it was closed.  The second is which scenario
is more reasonable: Say the application simply forgot to flush the
session and will never use it again.  Simply reclaiming the session
would produce no effect at all on the application in this scenario. 
 However, I have no data to say what's likely.
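
The two options can be contrasted in a toy model.  This is only a
sketch and not the patch's code: the ToySpace class, its method names
and the exception are hypothetical.  Reclaiming removes one session
silently; killing makes every later command on the space fail until
the application closes it.

```python
class SpaceKilledError(Exception):
    """Raised for every command once the space has been killed."""


class ToySpace:
    def __init__(self):
        self.killed = False
        self.sessions = set()

    def _check(self):
        if self.killed:
            raise SpaceKilledError("space killed; close and reopen")

    def start_session(self, handle):
        self._check()
        self.sessions.add(handle)
        return handle

    def use_session(self, handle):
        self._check()
        if handle not in self.sessions:
            # After a reclaim the application only sees a failed command.
            raise KeyError("unknown session handle")
        return "ok"

    def reclaim(self, handle):
        # Option 1: take a single session away, leave the space usable.
        self.sessions.discard(handle)

    def kill(self):
        # Option 2: error every command on the space until it is closed.
        self.killed = True
        self.sessions.clear()
```

An application that forgot to flush a session never notices reclaim(),
while kill() is unmistakable but also takes down every other session
the application still holds in that space.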

> ~~
> 
> Also, let's remember that this is a rare application.  Sessions are 
> only needed for remote access (requiring encryption, HMAC or salt), 
> or policy sessions.

This depends what your threat model is.  For ssh keys, you worry that
someone might be watching, so you use HMAC authority even for a local
TPM.  In the cloud, you don't quite know where the TPM is, so again
you'd use HMAC sessions ... however, in both use cases the sessions
should be very short lived.

> ~~
> 
> Should the code also reserve a session for the kernel?  Mark it not 
> kill'able?

At the moment, the kernel doesn't use sessions, so let's worry about
that problem at the point it arises (if it ever arises).

James



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]       ` <1485554699.3229.20.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-01-27 23:35         ` Jason Gunthorpe
@ 2017-01-30  0:52         ` Ken Goldman
  2017-01-30 16:04           ` James Bottomley
  1 sibling, 1 reply; 38+ messages in thread
From: Ken Goldman @ 2017-01-30  0:52 UTC (permalink / raw)
  To: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 1/27/2017 5:04 PM, James Bottomley wrote:

>> Beware the nasty corner case:
>>
>> - Application asks for a session and gets 02000000
>>
>> - Time elapses and 02000000 gets forcibly flushed
>>
>> - Later, app comes back, asks for a second session and again gets
>> 02000000.
>>
>> - App gets very confused.
>>
>> May it be better to close the connection completely, which the
>> application can detect, than flush a session and give this corner
>> case?
>
> if I look at the code I've written, I don't know what the session
> number is, I just save sessionHandle in a variable for later use (lets
> say to v1).  If I got the same session number returned at a later time
> and placed it in v2, all I'd notice is that an authorization using v1
> would fail.  I'm not averse to killing the entire connection but,
> assuming you have fallback, it might be kinder simply to ensure that
> the operations with the reclaimed session fail (which is what the code
> currently does).

My worry is that this session failure cannot be detected by the 
application.  An HMAC failure could cause the app to tell a user that 
they entered the wrong password.  Misleading.  On the TPM, it could 
trigger the dictionary attack lockout.  For a PIN index, it could 
consume a failure count.  Killing a policy session that has e.g., a 
policy signed term could cause the application to go back to some 
external entity for another authorization signature.

Let's go up the stack.  What's the attack?

If we're worried about many simultaneous applications (wouldn't that be 
wonderful), why not just let startauthsession fail?  The application can 
just retry periodically.  Just allocate them in triples so there's no 
deadlock.
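
A minimal sketch of the "allocate them in triples" idea, assuming a hypothetical userspace session_pool (none of these names come from the patch or from any TPM library).  The caller takes all the sessions it needs in one step or none at all, so no application sits on a partial set while waiting for the rest; on failure it simply retries later.  The figure of three sessions per application is the one used elsewhere in this thread.

```c
#include <assert.h>

/* Hypothetical pool of TPM session slots shared by all applications. */
struct session_pool {
	unsigned int total;	/* slots the TPM offers in total */
	unsigned int free;	/* slots not currently handed out */
};

/*
 * Take n slots only if all n are available; otherwise take none, so a
 * caller never holds a partial set while waiting for the remainder.
 * Returns 0 on success, -1 if the caller should retry later.
 */
static int pool_take_all(struct session_pool *p, unsigned int n)
{
	if (p->free < n)
		return -1;
	p->free -= n;
	return 0;
}

/* Give n slots back to the pool. */
static void pool_release(struct session_pool *p, unsigned int n)
{
	assert(p->free + n <= p->total);
	p->free += n;
}

/* How many applications can hold a full set at the same time. */
static unsigned int pool_max_apps(unsigned int total, unsigned int per_app)
{
	return per_app ? total / per_app : 0;
}
```

With 64 sessions and three per application, pool_max_apps() gives 21, matching the estimate made elsewhere in the thread.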

If we're worried about a DoS attack, killing a session just helps the 
attacker.  The attacker can create a few connections and spin on 
startauthsession, locking everyone out anyway.

~~

Also, let's remember that this is a rare application.  Sessions are only 
needed for remote access (requiring encryption, HMAC or salt), or policy 
sessions.

~~

Should the code also reserve a session for the kernel?  Mark it not 
kill'able?





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]           ` <20170127233513.GA28995-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2017-01-27 23:48             ` James Bottomley
  0 siblings, 0 replies; 38+ messages in thread
From: James Bottomley @ 2017-01-27 23:48 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Fri, 2017-01-27 at 16:35 -0700, Jason Gunthorpe wrote:
> On Fri, Jan 27, 2017 at 02:04:59PM -0800, James Bottomley wrote:
> 
> > If I look at the code I've written, I don't know what the session
> > number is, I just save sessionHandle in a variable for later use
> > (let's say to v1).  If I got the same session number returned at a
> > later time and placed it in v2, all I'd notice is that an 
> > authorization using v1 would fail.
> 
> Is there any way that could be used to cause an op thinking it is
> using v1 to authorize something it shouldn't?

Not really: in the parameter or HMAC case, you have to compute based on
the initial nonce given by the TPM when the session was created.
Assuming the initial nonce belonged to the evicted session, the HMAC
will now fail because the nonce of the v2 session is different.  There
is a corner case where you track the nonce in a table indexed by
handle, so when v2 is created, its nonce replaces the old v1 nonce in
the table.  Now you can use v1 and v2 without error (because use picks
up the correct nonce) and effectively they're interchangeable as the
same session.  Even in this case, you're not authorising something you
shouldn't, you're just using one session for the authorisations where
you thought you had two.
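
The table-indexed corner case can be sketched as follows.  This is only an illustration under stated assumptions: nonce_table, its size and its helpers are hypothetical client-side bookkeeping, not part of the patch or of any TPM library.  Because the table is keyed by handle, storing the nonce for a reissued handle overwrites the old entry, so lookups through v1 and v2 return the same nonce.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NONCE_LEN	16
#define TABLE_SLOTS	8

/* Hypothetical client-side table: one nonce per session handle. */
struct nonce_table {
	int used[TABLE_SLOTS];
	uint32_t handle[TABLE_SLOTS];
	uint8_t nonce[TABLE_SLOTS][NONCE_LEN];
};

/*
 * Record the nonce for a handle.  An existing entry with the same
 * handle is overwritten, which is what makes v1 and v2 alias.
 * Returns the slot index used, or -1 if the table is full.
 */
static int nonce_table_store(struct nonce_table *t, uint32_t handle,
			     const uint8_t *nonce)
{
	int i, slot = -1;

	assert(t && nonce);
	for (i = 0; i < TABLE_SLOTS; i++) {
		if (t->used[i] && t->handle[i] == handle) {
			slot = i;	/* same handle: overwrite */
			break;
		}
		if (!t->used[i] && slot < 0)
			slot = i;	/* remember the first free slot */
	}
	if (slot < 0)
		return -1;

	t->used[slot] = 1;
	t->handle[slot] = handle;
	memcpy(t->nonce[slot], nonce, NONCE_LEN);
	return slot;
}

/* Return the nonce stored for a handle, or NULL if there is none. */
static const uint8_t *nonce_table_lookup(const struct nonce_table *t,
					 uint32_t handle)
{
	int i;

	for (i = 0; i < TABLE_SLOTS; i++)
		if (t->used[i] && t->handle[i] == handle)
			return t->nonce[i];
	return NULL;
}
```

If v1 and v2 both hold 02000000, a lookup through either returns the nonce of the second session, so the application is using one session where it believed it had two.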

James




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]       ` <1485554699.3229.20.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-01-27 23:35         ` Jason Gunthorpe
       [not found]           ` <20170127233513.GA28995-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  2017-01-30  0:52         ` Ken Goldman
  1 sibling, 1 reply; 38+ messages in thread
From: Jason Gunthorpe @ 2017-01-27 23:35 UTC (permalink / raw)
  To: James Bottomley
  Cc: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f,
	linux-security-module-u79uwXL29TY76Z2rM5mHXA, Ken Goldman,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Fri, Jan 27, 2017 at 02:04:59PM -0800, James Bottomley wrote:

> If I look at the code I've written, I don't know what the session
> number is, I just save sessionHandle in a variable for later use (let's
> say to v1).  If I got the same session number returned at a later time
> and placed it in v2, all I'd notice is that an authorization using v1
> would fail.

Is there any way that could be used to cause an op thinking it is
using v1 to authorize something it shouldn't?

Jason


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
  2017-01-27 21:42   ` Ken Goldman
@ 2017-01-27 22:04     ` James Bottomley
       [not found]       ` <1485554699.3229.20.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: James Bottomley @ 2017-01-27 22:04 UTC (permalink / raw)
  To: Ken Goldman, tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Fri, 2017-01-27 at 16:42 -0500, Ken Goldman wrote:
> On 1/18/2017 3:48 PM, James Bottomley wrote:
> > In a TPM2, sessions can be globally exhausted once there are
> > TPM_PT_ACTIVE_SESSIONS_MAX of them (even if they're all context
> > saved).  The strategy for handling this is to keep a global count
> > of all the sessions along with their creation time.  Then if we see
> > the TPM run out of sessions (via the TPM_RC_SESSION_HANDLES) we
> > first wait for one to become free, but if it doesn't, we forcibly
> > evict an existing one.  The eviction strategy waits until the
> > current command is repeated to evict the session which should
> > guarantee there is an available slot.
> 
> Beware the nasty corner case:
> 
> - Application asks for a session and gets 02000000
> 
> - Time elapses and 02000000 gets forcibly flushed
> 
> - Later, app comes back, asks for a second session and again gets
> 02000000.
> 
> - App gets very confused.
> 
> May it be better to close the connection completely, which the 
> application can detect, than flush a session and give this corner
> case?

If I look at the code I've written, I don't know what the session
number is, I just save sessionHandle in a variable for later use (let's
say to v1).  If I got the same session number returned at a later time
and placed it in v2, all I'd notice is that an authorization using v1
would fail.  I'm not averse to killing the entire connection but,
assuming you have fallback, it might be kinder simply to ensure that
the operations with the reclaimed session fail (which is what the code
currently does).

> ~~~~
> 
> Part of me says to defer this.  That is:
> 
> 64 sessions / 3 = 21 simultaneous applications.  If we have 21 
> simultaneous TCG applications, we'll all celebrate.  For the DoS,
> chmod and chgrp /dev/tpm and let only well behaved applications in 
> the group.
> 
> Agreed, it's not a long term solution.

My use case is secret protection in the cloud.  I can certainly see more
than 21 applications wanting to do this at roughly the same time.  However,
the periods over which they actually all need sessions should be very
short, hence the leasing proposal which would stagger them.

James



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
  2017-01-27 21:20           ` Ken Goldman
@ 2017-01-27 21:59             ` James Bottomley
  0 siblings, 0 replies; 38+ messages in thread
From: James Bottomley @ 2017-01-27 21:59 UTC (permalink / raw)
  To: Ken Goldman, tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On Fri, 2017-01-27 at 16:20 -0500, Ken Goldman wrote:
> On 1/19/2017 7:41 AM, Jarkko Sakkinen wrote:
> > 
> > I actually think that the very best solution would be such that
> > sessions would be *always* lease based. So when you create a
> > session you would always lose it within a time limit.
> >
> > There would not be any special victim selection mechanism. You
> > would just lose your session within a time limit.
> 
> I worry about the time limit.
> 
> I have a proposed use case (policy signed) where the user sends the 
> session nonce along with a "payment" to a vendor and receives back a 
> signature authorization over the nonce.
> 
> The time could be minutes or even hours.

So the problem is that sessions are a limited resource and we need a
way to allocate them when under resource pressure.  Leasing is the
fairest way I can think of but I'm open to other mechanisms if you
propose them.

Note that the lease mechanism doesn't mean every session expires after
the limit, it just means that every session becomes eligible for
reclaim after the limit.  If there's no-one else waiting, you can keep
your session for hours.
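
The eligibility rule described here can be sketched as below.  This is a userspace illustration, not the patch's code: lease_session, session_age() and session_reclaimable() are hypothetical names, and time is an abstract tick counter.  A session may be reclaimed only when its lease has run out and somebody is actually waiting for a slot.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record kept per session. */
struct lease_session {
	unsigned long created;	/* tick count when the session was created */
};

/*
 * Age of the session in ticks.  Unsigned subtraction keeps the result
 * correct even if the tick counter has wrapped since creation.
 */
static unsigned long session_age(const struct lease_session *s,
				 unsigned long now)
{
	return now - s->created;
}

/*
 * A session is eligible for reclaim only if its lease has expired AND
 * at least one other user is waiting; with no waiters it is left alone
 * however old it is.
 */
static bool session_reclaimable(const struct lease_session *s,
				unsigned long now, unsigned long lease,
				unsigned int waiters)
{
	assert(s);
	return waiters > 0 && session_age(s, now) >= lease;
}
```

With no waiters the function returns false regardless of age, which is the "keep your session for hours" case.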

James




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found] ` <1484772489.2396.2.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  2017-01-19 12:25   ` Jarkko Sakkinen
@ 2017-01-27 21:42   ` Ken Goldman
  2017-01-27 22:04     ` James Bottomley
  1 sibling, 1 reply; 38+ messages in thread
From: Ken Goldman @ 2017-01-27 21:42 UTC (permalink / raw)
  To: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 1/18/2017 3:48 PM, James Bottomley wrote:
> In a TPM2, sessions can be globally exhausted once there are
> TPM_PT_ACTIVE_SESSIONS_MAX of them (even if they're all context saved).
> The strategy for handling this is to keep a global count of all the
> sessions along with their creation time.  Then if we see the TPM run
> out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for one
> to become free, but if it doesn't, we forcibly evict an existing one.
> The eviction strategy waits until the current command is repeated to
> evict the session which should guarantee there is an available slot.

Beware the nasty corner case:

- Application asks for a session and gets 02000000

- Time elapses and 02000000 gets forcibly flushed

- Later, app comes back, asks for a second session and again gets 02000000.

- App gets very confused.

May it be better to close the connection completely, which the 
application can detect, than flush a session and give this corner case?

~~~~

Part of me says to defer this.  That is:

64 sessions / 3 = 21 simultaneous applications.  If we have 21 
simultaneous TCG applications, we'll all celebrate.  For the DoS,
chmod and chgrp /dev/tpm and let only well behaved applications in the 
group.

Agreed, it's not a long term solution.





^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]         ` <20170119124101.nw7a7m735zhiivfo-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-01-27 21:20           ` Ken Goldman
  2017-01-27 21:59             ` James Bottomley
  0 siblings, 1 reply; 38+ messages in thread
From: Ken Goldman @ 2017-01-27 21:20 UTC (permalink / raw)
  To: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

On 1/19/2017 7:41 AM, Jarkko Sakkinen wrote:
>
> I actually think that the very best solution would be such that
> sessions would be *always* lease based. So when you create a
> session you would always lose it within a time limit.
>
> There would not be any special victim selection mechanism. You
> would just lose your session within a time limit.

I worry about the time limit.

I have a proposed use case (policy signed) where the user sends the 
session nonce along with a "payment" to a vendor and receives back a 
signature authorization over the nonce.

The time could be minutes or even hours.






^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]     ` <20170119122533.d7h5rgatpwl3qmcl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-01-19 12:41       ` Jarkko Sakkinen
@ 2017-01-19 12:59       ` James Bottomley
  1 sibling, 0 replies; 38+ messages in thread
From: James Bottomley @ 2017-01-19 12:59 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, open list

On Thu, 2017-01-19 at 14:25 +0200, Jarkko Sakkinen wrote:
> On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> > In a TPM2, sessions can be globally exhausted once there are
> > TPM_PT_ACTIVE_SESSIONS_MAX of them (even if they're all context
> > saved).  The strategy for handling this is to keep a global count
> > of all the sessions along with their creation time.  Then if we see
> > the TPM run out of sessions (via the TPM_RC_SESSION_HANDLES) we
> > first wait for one to become free, but if it doesn't, we forcibly
> > evict an existing one.  The eviction strategy waits until the
> > current command is repeated to evict the session which should
> > guarantee there is an available slot.
> > 
> > On the force eviction case, we make sure that the victim session is
> > at least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue
> > for session slots is a FIFO one, ensuring that once we run out of
> > sessions, everyone will get a session in a bounded time and once
> > they get one, they'll have SESSION_TIMEOUT to use it before it may
> > be subject to eviction.
> > 
> > Signed-off-by: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> 
> I didn't yet read the code properly. I'll do a more proper review
> once I have v4 of my patch set together. This comment is solely
> based on your commit message.
> 
> I'm just thinking that do we need this complicated timeout stuff
> or could you just kick a session out in LRU fashion as we run
> out of them?
> 
> Or one variation of what you are doing: couldn't the session that
> needs a session handle to do something sleep for 2 seconds and then
> take the oldest session? It would have essentially the same effect
> but no waitqueue needed.
> 
> Yeah, as I said, this is just commentary based on the description.

If you don't have a wait queue you lose fairness in resource allocation
on starvation.  What happens is that you get RC_SESSION_HANDLES,
sleep for 2s and retry.  Meanwhile someone frees a session, the next
user grabs it while you were sleeping, and when you wake you still get
RC_SESSION_HANDLES.  I can basically DoS your process if I understand
this.  The only way to allocate the resource fairly (i.e. the first
person to sleep waiting for a session is the one who gets it when they
wake) is to make sure that you wake one waiter as soon as a free
session comes in so that, probabilistically, they get the session.  If
you look, there are two mechanisms for ensuring fairness: one is the
FIFO wait queue (probabilistic) and the other is the reserved session,
which really ensures it belongs to you when you wake (deterministic but
expensive, so this is only activated on the penultimate go around).
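
The FIFO ordering argument can be modelled in a few lines.  This is a userspace sketch under stated assumptions, not the kernel wait queue implementation: fifo_waitq and its helpers are hypothetical, and waiters are represented by plain integer ids.  When a session is freed, the waiter that went to sleep first is the one chosen to wake, which is what sleep-and-retry without a queue cannot guarantee.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 16

/* Hypothetical FIFO of waiter ids, stored as a ring buffer. */
struct fifo_waitq {
	int waiter[MAX_WAITERS];
	size_t head;	/* index of the longest-waiting entry */
	size_t count;	/* number of queued waiters */
};

/* Queue a waiter behind everyone already waiting; -1 if the queue is full. */
static int waitq_enqueue(struct fifo_waitq *q, int id)
{
	assert(q);
	if (q->count == MAX_WAITERS)
		return -1;
	q->waiter[(q->head + q->count) % MAX_WAITERS] = id;
	q->count++;
	return 0;
}

/*
 * Called when a session is freed: pick the waiter that has been
 * queued longest.  Returns its id, or -1 if nobody is waiting.
 */
static int waitq_wake_one(struct fifo_waitq *q)
{
	int id;

	assert(q);
	if (q->count == 0)
		return -1;
	id = q->waiter[q->head];
	q->head = (q->head + 1) % MAX_WAITERS;
	q->count--;
	return id;
}
```

Waiters queued in the order 7, 3, 9 are woken in that same order, so a late arrival cannot overtake a process that has already been sleeping.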

James




^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found]     ` <20170119122533.d7h5rgatpwl3qmcl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-01-19 12:41       ` Jarkko Sakkinen
       [not found]         ` <20170119124101.nw7a7m735zhiivfo-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-01-19 12:59       ` James Bottomley
  1 sibling, 1 reply; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-01-19 12:41 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, open list

On Thu, Jan 19, 2017 at 02:25:33PM +0200, Jarkko Sakkinen wrote:
> On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> > In a TPM2, sessions can be globally exhausted once there are
> > TPM_PT_ACTIVE_SESSIONS_MAX of them (even if they're all context saved).
> > The strategy for handling this is to keep a global count of all the
> > sessions along with their creation time.  Then if we see the TPM run
> > out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for one
> > to become free, but if it doesn't, we forcibly evict an existing one.
> > The eviction strategy waits until the current command is repeated to
> > evict the session which should guarantee there is an available slot.
> > 
> > On the force eviction case, we make sure that the victim session is at
> > least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue for
> > session slots is a FIFO one, ensuring that once we run out of
> > sessions, everyone will get a session in a bounded time and once they
> > get one, they'll have SESSION_TIMEOUT to use it before it may be
> > subject to eviction.
> > 
> > Signed-off-by: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
> 
> I didn't yet read the code properly. I'll do a more proper review
> once I have v4 of my patch set together. This comment is solely
> based on your commit message.
> 
> I'm just thinking that do we need this complicated timeout stuff
> or could you just kick a session out in LRU fashion as we run
> out of them?
> 
> Or one variation of what you are doing: couldn't the session that
> needs a session handle to do something sleep for 2 seconds and then
> take the oldest session? It would have essentially the same effect
> but no waitqueue needed.
> 
> Yeah, as I said, this is just commentary based on the description.

I actually think that the very best solution would be such that
sessions would be *always* lease based. So when you create a
session you would always lose it within a time limit.

There would not be any special victim selection mechanism. You
would just lose your session within a time limit.

This could be already part of the session isolation and would
actually make only isolation usable.

We have not locked down the API yet, so why not make an API that
models the nature of the resource?  Given that the number of sessions
is always fixed, leases make sense.

You then just need a wait queue for those waiting for leases.
They don't need to do any victim selection or whatever. Any session
held longer than the lease gets flushed.

I strongly feel that this would be the best long term solution.

/Jarkko


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC] tpm2-space: add handling for global session exhaustion
       [not found] ` <1484772489.2396.2.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
@ 2017-01-19 12:25   ` Jarkko Sakkinen
       [not found]     ` <20170119122533.d7h5rgatpwl3qmcl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-01-27 21:42   ` Ken Goldman
  1 sibling, 1 reply; 38+ messages in thread
From: Jarkko Sakkinen @ 2017-01-19 12:25 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA,
	tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, open list

On Wed, Jan 18, 2017 at 03:48:09PM -0500, James Bottomley wrote:
> In a TPM2, sessions can be globally exhausted once there are
> TPM_PT_ACTIVE_SESSIONS_MAX of them (even if they're all context saved).
> The strategy for handling this is to keep a global count of all the
> sessions along with their creation time.  Then if we see the TPM run
> out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for one
> to become free, but if it doesn't, we forcibly evict an existing one.
> The eviction strategy waits until the current command is repeated to
> evict the session which should guarantee there is an available slot.
> 
> On the force eviction case, we make sure that the victim session is at
> least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue for
> session slots is a FIFO one, ensuring that once we run out of
> sessions, everyone will get a session in a bounded time and once they
> get one, they'll have SESSION_TIMEOUT to use it before it may be
> subject to eviction.
> 
> Signed-off-by: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>

I didn't yet read the code properly. I'll do a more proper review
once I have v4 of my patch set together. This comment is solely
based on your commit message.

I'm just thinking that do we need this complicated timeout stuff
or could you just kick a session out in LRU fashion as we run
out of them?

Or one variation of what you are doing: couldn't the session that
needs a session handle to do something sleep for 2 seconds and then
take the oldest session? It would have essentially the same effect
but no waitqueue needed.

Yeah, as I said, this is just commentary based on the description.

/Jarkko


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC] tpm2-space: add handling for global session exhaustion
@ 2017-01-18 20:48 James Bottomley
       [not found] ` <1484772489.2396.2.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
  0 siblings, 1 reply; 38+ messages in thread
From: James Bottomley @ 2017-01-18 20:48 UTC (permalink / raw)
  To: tpmdd-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
  Cc: linux-security-module-u79uwXL29TY76Z2rM5mHXA, open list

In a TPM2, sessions can be globally exhausted once there are
TPM_PT_ACTIVE_SESSIONS_MAX of them (even if they're all context saved).
The strategy for handling this is to keep a global count of all the
sessions along with their creation time.  Then if we see the TPM run
out of sessions (via the TPM_RC_SESSION_HANDLES) we first wait for one
to become free, but if it doesn't, we forcibly evict an existing one.
The eviction strategy waits until the current command is repeated to
evict the session which should guarantee there is an available slot.

On the force eviction case, we make sure that the victim session is at
least SESSION_TIMEOUT old (currently 2 seconds).  The wait queue for
session slots is a FIFO one, ensuring that once we run out of
sessions, everyone will get a session in a bounded time and once they
get one, they'll have SESSION_TIMEOUT to use it before it may be
subject to eviction.

Signed-off-by: James Bottomley <James.Bottomley-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index a625884..c959b09 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -164,6 +164,7 @@ struct tpm_chip *tpm_chip_alloc(struct device *pdev,
 
 	mutex_init(&chip->tpm_mutex);
 	init_rwsem(&chip->ops_sem);
+	init_waitqueue_head(&chip->session_wait);
 
 	chip->ops = ops;
 
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 9923daa..38cc21c 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -95,6 +95,7 @@ enum tpm2_return_codes {
 	TPM2_RC_HANDLE		= 0x008B,
 	TPM2_RC_INITIALIZE	= 0x0100, /* RC_VER1 */
 	TPM2_RC_DISABLED	= 0x0120,
+	TPM2_RC_SESSION_HANDLES	= 0x0905,
 	TPM2_RC_TESTING		= 0x090A, /* RC_WARN */
 };
 
@@ -136,7 +137,8 @@ enum tpm2_capabilities {
 };
 
 enum tpm2_properties {
-	TPM_PT_TOTAL_COMMANDS	= 0x0129,
+	TPM_PT_TOTAL_COMMANDS		= 0x0129,
+	TPM_PT_ACTIVE_SESSIONS_MAX	= 0x0111,
 };
 
 enum tpm2_startup_types {
@@ -160,8 +162,24 @@ struct tpm_space {
 	u8 *context_buf;
 	u32 session_tbl[6];
 	u8 *session_buf;
+	u32 reserved_handle;
 };
 
+#define TPM2_HANDLE_FORCE_EVICT 0xFFFFFFFF
+
+static inline void tpm2_session_force_evict(struct tpm_space *space)
+{
+	/* if reserved handle is not empty, we already have a
+	 * session for eviction, so no need to force one
+	 */
+	if (space->reserved_handle == 0)
+		space->reserved_handle = TPM2_HANDLE_FORCE_EVICT;
+}
+static inline bool tpm2_is_session_force_evict(struct tpm_space *space)
+{
+	return space->reserved_handle == TPM2_HANDLE_FORCE_EVICT;
+}
+
 enum tpm_chip_flags {
 	TPM_CHIP_FLAG_TPM2		= BIT(1),
 	TPM_CHIP_FLAG_IRQ		= BIT(2),
@@ -174,6 +192,12 @@ struct tpm_chip_seqops {
 	const struct seq_operations *seqops;
 };
 
+struct tpm_sessions {
+	struct tpm_space *space;
+	u32 handle;
+	unsigned long created;
+};
+
 struct tpm_chip {
 	struct device dev, devrm;
 	struct cdev cdev, cdevrm;
@@ -214,8 +238,12 @@ struct tpm_chip {
 #endif /* CONFIG_ACPI */
 
 	struct tpm_space work_space;
+	struct tpm_space *space;
 	u32 nr_commands;
 	u32 *cc_attrs_tbl;
+	struct tpm_sessions *sessions;
+	int max_sessions;
+	wait_queue_head_t session_wait;
 };
 
 #define to_tpm_chip(d) container_of(d, struct tpm_chip, dev)
@@ -568,6 +596,13 @@ int tpm2_pcr_extend(struct tpm_chip *chip, int pcr_idx, const u8 *hash);
 int tpm2_get_random(struct tpm_chip *chip, u8 *out, size_t max);
 void tpm2_flush_context_cmd(struct tpm_chip *chip, u32 handle,
 			    unsigned int flags);
+static inline void tpm2_session_clear_reserved(struct tpm_chip *chip,
+					       struct tpm_space *space)
+{
+	if (space->reserved_handle && !tpm2_is_session_force_evict(space))
+		tpm2_flush_context_cmd(chip, space->reserved_handle, 0);
+	space->reserved_handle = 0;
+}
 int tpm2_seal_trusted(struct tpm_chip *chip,
 		      struct trusted_key_payload *payload,
 		      struct trusted_key_options *options);
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index e1c1bbd..ac5c0a2 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -1007,6 +1007,7 @@ int tpm2_auto_startup(struct tpm_chip *chip)
 {
 	struct tpm_buf buf;
 	u32 nr_commands;
+	u32 nr_sessions;
 	int rc;
 	int i;
 
@@ -1067,6 +1068,20 @@ int tpm2_auto_startup(struct tpm_chip *chip)
 	chip->nr_commands = nr_commands;
 	tpm_buf_destroy(&buf);
 
+	rc = tpm2_get_tpm_pt(chip, TPM_PT_ACTIVE_SESSIONS_MAX,
+			     &nr_sessions, NULL);
+	if (rc)
+		goto out;
+
+	if (nr_sessions > 256)
+		nr_sessions = 256;
+
+	chip->max_sessions = nr_sessions;
+	chip->sessions = devm_kzalloc(&chip->dev,
+				      nr_sessions * sizeof(*chip->sessions),
+				      GFP_KERNEL);
+	if (!chip->sessions)
+		rc = -ENOMEM;
 out:
 	if (rc > 0)
 		rc = -ENODEV;
diff --git a/drivers/char/tpm/tpm2-space.c b/drivers/char/tpm/tpm2-space.c
index 04c9431..42c8c84 100644
--- a/drivers/char/tpm/tpm2-space.c
+++ b/drivers/char/tpm/tpm2-space.c
@@ -34,6 +34,169 @@ struct tpm2_context {
 	__be16 blob_size;
 } __packed;
 
+static struct tpm_sessions *tpm2_session_chip_get(struct tpm_chip *chip)
+{
+	int i;
+
+	for (i = 0; i < chip->max_sessions; i++)
+		if (chip->sessions[i].space == NULL)
+			return &chip->sessions[i];
+
+	return NULL;
+}
+
+static struct tpm_sessions *tpm2_session_chip_find_oldest(struct tpm_chip *chip)
+{
+	struct tpm_sessions *sess = NULL;
+	int i;
+
+	for (i = 0; i < chip->max_sessions; i++) {
+		if (chip->sessions[i].space == NULL)
+			continue;
+
+		if (!sess || time_after(sess->created,
+					chip->sessions[i].created))
+			sess = &chip->sessions[i];
+	}
+
+	return sess;
+}
+
+static void tpm2_session_chip_add(struct tpm_chip *chip,
+				  struct tpm_space *space, u32 h)
+{
+	struct tpm_sessions *sess = tpm2_session_chip_get(chip);
+
+	sess->space = space;
+	sess->handle = h;
+	sess->created = jiffies;
+	dev_info(&chip->dev, "Added Session at %td, handle %08x\n", sess - chip->sessions, h);
+}
+
+static void tpm2_session_chip_remove(struct tpm_chip *chip, u32 h)
+{
+	int i;
+
+	for (i = 0; i < chip->max_sessions; i++)
+		if (chip->sessions[i].handle == h)
+			break;
+	if (i == chip->max_sessions) {
+		dev_warn(&chip->dev, "Missing session %08x\n", h);
+		return;
+	}
+
+	memset(&chip->sessions[i], 0, sizeof(chip->sessions[i]));
+	dev_info(&chip->dev, "Removed session at %d\n", i);
+	wake_up(&chip->session_wait);
+}
+
+static int tpm2_session_forget(struct tpm_chip *chip, struct tpm_space *space,
+			       u32 handle)
+{
+	int i, j;
+	struct tpm2_context *ctx;
+
+	for (i = 0, j = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
+		if (space->session_tbl[i] == 0)
+			continue;
+
+		ctx = (struct tpm2_context *)&space->session_buf[j];
+		j += sizeof(*ctx) + get_unaligned_be16(&ctx->blob_size);
+
+		if (space->session_tbl[i] != handle)
+			continue;
+
+		/* forget the session context */
+		memmove(ctx, &space->session_buf[j], PAGE_SIZE - j);
+		tpm2_session_chip_remove(chip, handle);
+		space->session_tbl[i] = 0;
+		break;
+	}
+	if (i == ARRAY_SIZE(space->session_tbl))
+		return -EINVAL;
+	return 0;
+}
+
+static int tpm2_session_wait(struct tpm_chip *chip, struct tpm_space *space)
+{
+	int rc, failed;
+	struct tpm_sessions *sess;
+	const unsigned long min_timeout = msecs_to_jiffies(2000);
+	unsigned long timeout = min_timeout;
+	DEFINE_WAIT(wait);
+
+	for (failed = 0; ; ) {
+		prepare_to_wait(&chip->session_wait, &wait, TASK_INTERRUPTIBLE);
+
+		mutex_unlock(&chip->tpm_mutex);
+		rc = schedule_timeout_interruptible(timeout);
+		mutex_lock(&chip->tpm_mutex);
+
+		finish_wait(&chip->session_wait, &wait);
+
+		if (signal_pending(current))
+			/* got interrupted */
+			return -EINTR;
+
+		if (rc > 0 && !tpm2_is_session_force_evict(space))
+			/* got woken, so slot is free.  We don't
+			 * reserve the slot here because a) we can't
+			 * (no pending session in the TPM to evict)
+			 * and b) no-one is hogging sessions, so no
+			 * evidence of need.
+			 */
+			return 0;
+
+		/* timed out or victim required; select a victim
+		 * session to kill
+		 */
+		sess = tpm2_session_chip_find_oldest(chip);
+		if (sess == NULL) {
+			/* we get here when we can't create a session
+			 * but there are no listed active sessions
+			 * meaning they're all in various space
+			 * structures as victim sessions.  The wait
+			 * queue is a fair sequence, so we need to
+			 * wait a bit harder
+			 */
+			if (failed++ > 3)
+				break;
+			timeout *= 2;
+			dev_info(&chip->dev, "failed to get session, waiting for %us\n", jiffies_to_msecs(timeout)/1000);
+			continue;
+		}
+		/* is the victim old enough? */
+		timeout = jiffies - sess->created;
+		if (timeout > min_timeout)
+			break;
+		/* otherwise wait until the victim is old enough */
+		timeout = min_timeout - timeout;
+	}
+	if (sess == NULL)
+		/* still can't get a victim, give up */
+		return -EINVAL;
+
+	/* store the physical handle */
+	space->reserved_handle = sess->handle;
+	dev_info(&chip->dev, "Selecting handle %08x for eviction\n",
+		 space->reserved_handle);
+
+	/* cause a mapping failure if this session handle is
+	 * ever used in the victim space again
+	 */
+	tpm2_session_forget(chip, sess->space, sess->handle);
+	/* clear the session, but don't wake any other waiters */
+	memset(sess, 0, sizeof(*sess));
+	/* We now hold a saved physical handle, but the session is
+	 * still loaded in the TPM.  The caller repeats the command
+	 * and, once it holds the tpm_mutex again, flushes this
+	 * handle first, so in theory a free session slot is
+	 * available when the command is re-executed.
+	 */
+
+	return 0;
+}
+
 static int tpm2_context_save(struct tpm_chip *chip, u8 *area,
 			     int *offset, u32 handle)
 {
@@ -124,9 +287,9 @@ static int tpm2_session_find(struct tpm_space *space, u32 handle)
 	return i;
 }
 
-static int tpm2_session_add(struct tpm_chip *chip,
-			    struct tpm_space *space, u32 handle)
+static int tpm2_session_add(struct tpm_chip *chip, u32 handle)
 {
+	struct tpm_space *space = &chip->work_space;
 	int i;
 
 	for (i = 0; i < ARRAY_SIZE(space->session_tbl); i++)
@@ -139,35 +302,11 @@ static int tpm2_session_add(struct tpm_chip *chip,
 	}
 
 	space->session_tbl[i] = handle;
+	tpm2_session_chip_add(chip, chip->space, handle);
 
 	return 0;
 }
 
-static int tpm2_session_forget(struct tpm_space *space, u32 handle)
-{
-	int i, j;
-	struct tpm2_context *ctx;
-
-	for (i = 0, j = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
-		if (space->session_tbl[i] == 0)
-			continue;
-
-		ctx = (struct tpm2_context *)&space->session_buf[j];
-		j += sizeof(*ctx) + get_unaligned_be16(&ctx->blob_size);
-
-		if (space->session_tbl[i] != handle)
-			continue;
-
-		/* forget the session context */
-		memcpy(ctx, &space->session_buf[j], PAGE_SIZE - j);
-		space->session_tbl[i] = 0;
-		break;
-	}
-	if (i == ARRAY_SIZE(space->session_tbl))
-		return -EINVAL;
-	return 0;
-}
-
 /* if a space is active, emulate some commands */
 static int tpm2_intercept(struct tpm_chip *chip, u32 cc, u8 *buf, size_t bufsiz)
 {
@@ -187,7 +326,7 @@ static int tpm2_intercept(struct tpm_chip *chip, u32 cc, u8 *buf, size_t bufsiz)
 		/* let the TPM figure out and return the error */
 		return 0;
 
-	return tpm2_session_forget(space, handle);
+	return tpm2_session_forget(chip, space, handle);
 }
 
 void tpm2_flush_space(struct tpm_chip *chip, struct tpm_space *space)
@@ -200,10 +339,19 @@ void tpm2_flush_space(struct tpm_chip *chip, struct tpm_space *space)
 					       TPM_TRANSMIT_UNLOCKED);
 
 	for (i = 0; i < ARRAY_SIZE(space->session_tbl); i++) {
+		if (!space->session_tbl[i])
+			continue;
+
 		space->session_tbl[i] &= ~TPM2_HT_TAG_FOR_FLUSH;
-		if (space->session_tbl[i])
-			tpm2_flush_context_cmd(chip, space->session_tbl[i],
-					       TPM_TRANSMIT_UNLOCKED);
+		tpm2_session_chip_remove(chip, space->session_tbl[i]);
+		tpm2_flush_context_cmd(chip, space->session_tbl[i],
+				       TPM_TRANSMIT_UNLOCKED);
+	}
+	if (space->reserved_handle && !tpm2_is_session_force_evict(space)) {
+		tpm2_flush_context_cmd(chip, space->reserved_handle,
+				       TPM_TRANSMIT_UNLOCKED);
+		space->reserved_handle = 0;
+		/* subtlety here: if force evict is set, we don't clear it */
 	}
 }
 
@@ -264,11 +412,13 @@ static void tpm2_unmap_sessions(struct tpm_chip *chip, u32 rc)
 		if ((space->session_tbl[i] & TPM2_HT_TAG_FOR_FLUSH) !=
 		    TPM2_HT_TAG_FOR_FLUSH)
 			continue;
-		if (rc == TPM2_RC_SUCCESS)
+
+		/* untag; the session is kept unless the command succeeded */
+		space->session_tbl[i] &= ~TPM2_HT_TAG_FOR_FLUSH;
+		if (rc == TPM2_RC_SUCCESS) {
+			tpm2_session_chip_remove(chip, space->session_tbl[i]);
 			space->session_tbl[i] = 0;
-		else
-			/* for unsuccessful command, keep session */
-			space->session_tbl[i] &= ~TPM2_HT_TAG_FOR_FLUSH;
+		}
 	}
 }
 
@@ -387,6 +537,7 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space,
 	       sizeof(space->session_tbl));
 	memcpy(chip->work_space.context_buf, space->context_buf, PAGE_SIZE);
 	memcpy(chip->work_space.session_buf, space->session_buf, PAGE_SIZE);
+	chip->space = space;
 
 	rc = tpm2_intercept(chip, cc, buf, bufsiz);
 	if (rc)
@@ -400,16 +551,28 @@ int tpm2_prepare_space(struct tpm_chip *chip, struct tpm_space *space,
 	if (rc)
 		return rc;
 
+	if (space->reserved_handle && !tpm2_is_session_force_evict(space)) {
+		/* this is a trick to allow a previous command which
+		 * failed because it was out of handle space to
+		 * succeed.  The handle is still in the TPM, so now we
+		 * flush it under the tpm_mutex which should ensure we
+		 * can create a new one
+		 */
+		tpm2_flush_context_cmd(chip, space->reserved_handle,
+				       TPM_TRANSMIT_UNLOCKED);
+		space->reserved_handle = 0;
+	}
+
 	return 0;
 }
 
-static int tpm2_map_response(struct tpm_chip *chip, u32 cc, u8 *rsp, size_t len)
+static int tpm2_map_response(struct tpm_chip *chip, u32 cc, u8 *rsp, size_t len,
+			     u32 return_code)
 {
 	struct tpm_space *space = &chip->work_space;
 	u32 phandle, phandle_type;
 	u32 vhandle;
 	u32 attrs;
-	u32 return_code = get_unaligned_be32((__be32 *)&rsp[6]);
 	u16 tag = get_unaligned_be16((__be16 *)rsp);
 	int i;
 	int rc;
@@ -439,7 +602,7 @@ static int tpm2_map_response(struct tpm_chip *chip, u32 cc, u8 *rsp, size_t len)
 		return 0;
 
 	if (phandle_type != TPM2_HT_TRANSIENT)
-		return tpm2_session_add(chip, space, phandle);
+		return tpm2_session_add(chip, phandle);
 
 	/* Garbage collect a dead context. */
 	for (i = 0; i < ARRAY_SIZE(space->context_tbl); i++) {
@@ -521,11 +684,12 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
 		      u32 cc, u8 *buf, size_t bufsiz)
 {
 	int rc;
+	u32 return_code = get_unaligned_be32((__be32 *)&buf[6]);
 
 	if (!space)
 		return 0;
 
-	rc = tpm2_map_response(chip, cc, buf, bufsiz);
+	rc = tpm2_map_response(chip, cc, buf, bufsiz, return_code);
 	if (rc)
 		return rc;
 
@@ -539,6 +703,12 @@ int tpm2_commit_space(struct tpm_chip *chip, struct tpm_space *space,
 	       sizeof(space->session_tbl));
 	memcpy(space->context_buf, chip->work_space.context_buf, PAGE_SIZE);
 	memcpy(space->session_buf, chip->work_space.session_buf, PAGE_SIZE);
+	chip->space = NULL;
+
+	if (return_code == TPM2_RC_SESSION_HANDLES) {
+		tpm2_session_wait(chip, space);
+		return -EAGAIN;
+	}
 
 	return 0;
 }
diff --git a/drivers/char/tpm/tpms-dev.c b/drivers/char/tpm/tpms-dev.c
index 12b6e34..b13b000 100644
--- a/drivers/char/tpm/tpms-dev.c
+++ b/drivers/char/tpm/tpms-dev.c
@@ -56,8 +56,23 @@ ssize_t tpms_write(struct file *file, const char __user *buf,
 {
 	struct file_priv *fpriv = file->private_data;
 	struct tpms_priv *priv = container_of(fpriv, struct tpms_priv, priv);
+	int count = 0;
+	const int max_count = 3; /* total attempts, not retries */
+	int rc;
 
-	return tpm_common_write(file, buf, size, off, &priv->space);
+	for (count = 0; count < max_count; count++) {
+		rc = tpm_common_write(file, buf, size, off, &priv->space);
+		if (rc != -EAGAIN)
+			break;
+		if (count == max_count - 2)
+			/* the second to last attempt failed: force an
+			 * eviction so the final attempt should succeed
+			 */
+			tpm2_session_force_evict(&priv->space);
+	}
+	tpm2_session_clear_reserved(fpriv->chip, &priv->space);
+
+	return rc;
 }
 
 const struct file_operations tpm_rm_fops = {
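
For readers following the retry logic in tpms_write() above, here is a
minimal user-space sketch of the same control flow. It is an
illustration only, not code from the patch: struct sim, sim_submit()
and submit_with_retry() are hypothetical names, and the model assumes
that a forced eviction always lets the next attempt succeed, which the
real driver cannot guarantee.

```c
#include <errno.h>

#define MAX_ATTEMPTS 3	/* total attempts, including the first */

/* Hypothetical model of a contended resource. */
struct sim {
	int busy_attempts;	/* attempts that fail before a slot frees up */
	int escalated;		/* set once a forced eviction was requested */
	int calls;		/* how many times sim_submit() ran */
};

/*
 * Stand-in for the write path: returns -EAGAIN while the resource is
 * busy, and 0 once a slot freed up or an eviction was forced.
 */
static int sim_submit(struct sim *s)
{
	s->calls++;
	if (s->escalated || s->calls > s->busy_attempts)
		return 0;
	return -EAGAIN;
}

/*
 * Bounded retry: try up to MAX_ATTEMPTS times, and request a forced
 * eviction after the second to last attempt has failed.
 */
static int submit_with_retry(struct sim *s)
{
	int rc = -EAGAIN;
	int attempt;

	for (attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
		rc = sim_submit(s);
		if (rc != -EAGAIN)
			break;
		if (attempt == MAX_ATTEMPTS - 2)
			s->escalated = 1;
	}
	return rc;
}
```

With MAX_ATTEMPTS set to 3, escalation happens only after the second
attempt has failed, so a caller that obtains a slot on its own within
two attempts never forces an eviction.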

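The compaction step in tpm2_session_forget() removes one
variable-length record from a packed buffer by sliding the tail of the
buffer down over it. A minimal user-space sketch of that technique
follows. The record layout (a 2-byte big-endian length followed by the
payload), BUF_SIZE and the function names are assumptions made for
illustration; the real struct tpm2_context layout is not reproduced
here. Source and destination overlap, so the sketch uses memmove().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 64	/* stand-in for PAGE_SIZE */

/* Assumed layout: 2-byte big-endian payload length, then the payload. */
static size_t record_len(const uint8_t *rec)
{
	return 2 + (((size_t)rec[0] << 8) | rec[1]);
}

/*
 * Remove record number `index` from a packed buffer holding `count`
 * records.  Returns 0 on success, -1 if the record does not exist or
 * the buffer is malformed.
 */
static int forget_record(uint8_t buf[BUF_SIZE], size_t count, size_t index)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		size_t next;

		if (off + 2 > BUF_SIZE)
			return -1;
		next = off + record_len(&buf[off]);
		if (next > BUF_SIZE)
			return -1;

		if (i == index) {
			/* slide the tail down; the regions overlap */
			memmove(&buf[off], &buf[next], BUF_SIZE - next);
			/* zero the bytes vacated at the end */
			memset(&buf[BUF_SIZE - (next - off)], 0, next - off);
			return 0;
		}
		off = next;
	}
	return -1;
}
```

Removing the middle of three records leaves the first and third
adjacent in the buffer, with the freed bytes zeroed at the end.
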
end of thread, other threads:[~2017-02-17 22:37 UTC | newest]

Thread overview: 38+ messages
     [not found] <James.Bottomley@HansenPartnership.com>
2017-02-10 10:03 ` [RFC] tpm2-space: add handling for global session exhaustion Dr. Greg Wettstein
     [not found]   ` <201702101003.v1AA3plF029882-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
2017-02-10 16:46     ` James Bottomley
     [not found]       ` <1486745163.2502.26.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-02-10 21:13         ` Kenneth Goldman
2017-02-14 14:38           ` [tpmdd-devel] " Dr. Greg Wettstein
     [not found]             ` <20170214143829.GA28175-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
2017-02-14 16:47               ` James Bottomley
     [not found]             ` <71dc0e80-6678-a124-9184-1f93c8532d09@linux.vnet.ibm.com>
2017-02-16 20:06               ` [tpmdd-devel] " Dr. Greg Wettstein
2017-02-16 20:33                 ` Jarkko Sakkinen
2017-02-17  9:56                   ` Dr. Greg Wettstein
2017-02-17 12:37                     ` Jarkko Sakkinen
2017-02-17 22:37                       ` Dr. Greg Wettstein
2017-02-10 21:18         ` Kenneth Goldman
2017-02-12 20:29       ` [tpmdd-devel] " Ken Goldman
     [not found] <jarkko.sakkinen@linux.intel.com>
2017-02-09  9:06 ` Dr. Greg Wettstein
     [not found]   ` <201702090906.v1996c6a015552-DHO+NtfOqB5PEDpkEIzg7wC/G2K4zDHf@public.gmane.org>
2017-02-09 15:19     ` Jarkko Sakkinen
     [not found]       ` <20170209151922.cqo32h4io5dqyvvw-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-02-09 19:04         ` Jason Gunthorpe
     [not found]           ` <20170209190426.GA1104-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-02-09 19:29             ` James Bottomley
     [not found]               ` <1486668591.2616.45.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-02-09 21:54                 ` Jason Gunthorpe
2017-02-10  8:48           ` [tpmdd-devel] " Jarkko Sakkinen
     [not found]             ` <20170210084837.lq3mofgfwvjx623m-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-02-10 23:13               ` Kenneth Goldman
2017-02-09 20:05     ` James Bottomley
2017-01-18 20:48 James Bottomley
     [not found] ` <1484772489.2396.2.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-01-19 12:25   ` Jarkko Sakkinen
     [not found]     ` <20170119122533.d7h5rgatpwl3qmcl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-01-19 12:41       ` Jarkko Sakkinen
     [not found]         ` <20170119124101.nw7a7m735zhiivfo-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-01-27 21:20           ` Ken Goldman
2017-01-27 21:59             ` James Bottomley
2017-01-19 12:59       ` James Bottomley
2017-01-27 21:42   ` Ken Goldman
2017-01-27 22:04     ` James Bottomley
     [not found]       ` <1485554699.3229.20.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-01-27 23:35         ` Jason Gunthorpe
     [not found]           ` <20170127233513.GA28995-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
2017-01-27 23:48             ` James Bottomley
2017-01-30  0:52         ` Ken Goldman
2017-01-30 16:04           ` James Bottomley
     [not found]             ` <1485792295.2518.23.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-01-30 21:58               ` Jarkko Sakkinen
     [not found]                 ` <20170130215815.4lr42ob7e4cycwgi-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-01-30 22:13                   ` James Bottomley
     [not found]                     ` <1485814388.2518.28.camel-d9PhHud1JfjCXq6kfMZ53/egYHeGw8Jk@public.gmane.org>
2017-01-30 22:46                       ` Ken Goldman
2017-01-31 13:31                       ` Jarkko Sakkinen
2017-02-10 17:22                       ` Kenneth Goldman
2017-01-31 19:28               ` Ken Goldman
2017-01-31 19:55                 ` James Bottomley