All of lore.kernel.org
 help / color / mirror / Atom feed
* SVSM Attestation and vTPM specification additions - v0.60
@ 2023-01-10 18:54 Tom Lendacky
  2023-01-10 19:37 ` Tom Lendacky
                   ` (7 more replies)
  0 siblings, 8 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 18:54 UTC (permalink / raw)
  To: linux-coco, amd-sev-snp

[-- Attachment #1: Type: text/plain, Size: 277 bytes --]

Attached is an updated draft version of the SVSM specification with added 
support for an attestation protocol and a vTPM protocol as well as other 
miscellaneous changes (all identified by change bar). Please take a look 
and reply with any feedback you may have.

Thanks,
Tom

[-- Attachment #2: 58019-Secure_VM_Service_Module_Specification.pdf --]
[-- Type: application/pdf, Size: 521587 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
@ 2023-01-10 19:37 ` Tom Lendacky
  2023-01-10 19:40 ` Dionna Amalie Glaze
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 19:37 UTC (permalink / raw)
  To: linux-coco, amd-sev-snp

On 1/10/23 12:54, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with added 
> support for an attestation protocol and a vTPM protocol as well as other 
> miscellaneous changes (all identified by change bar). Please take a look 
> and reply with any feedback you may have.

First feedback is that I missed adding the vTPM service attestation GUID 
and contents. I'll work on that and send out an update in a few days.

Thanks,
Tom

> 
> Thanks,
> Tom

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
  2023-01-10 19:37 ` Tom Lendacky
@ 2023-01-10 19:40 ` Dionna Amalie Glaze
  2023-01-10 21:03   ` Tom Lendacky
  2023-01-10 20:29 ` James Bottomley
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 48+ messages in thread
From: Dionna Amalie Glaze @ 2023-01-10 19:40 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-coco, amd-sev-snp

typo: "oridnal"

For the statement "Locality usage for the vTPM is not currently
defined." should this be interpreted as version 1 of the vTPM protocol
will not support locality, or simply that version 1 might have the
affordance to add behavior for non-zero locality in a future revision
of version 1, such that the result is not specified as
SVSM_ERR_INVALID_PARAMETER? I think the latter is probably a dangerous
interpretation unless v0.60 of this document is strictly considered
"unstable" and shouldn't be used upstream, so I'd recommend clarifying
that "currently" in a document that might later be outdated should be
precise about its specified behavior in a versioned fashion.

On Tue, Jan 10, 2023 at 10:54 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> Attached is an updated draft version of the SVSM specification with added
> support for an attestation protocol and a vTPM protocol as well as other
> miscellaneous changes (all identified by change bar). Please take a look
> and reply with any feedback you may have.
>
> Thanks,
> Tom



-- 
-Dionna Glaze, PhD (she/her)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
  2023-01-10 19:37 ` Tom Lendacky
  2023-01-10 19:40 ` Dionna Amalie Glaze
@ 2023-01-10 20:29 ` James Bottomley
  2023-01-10 20:37   ` James Bottomley
  2023-01-10 21:32   ` Tom Lendacky
  2023-01-11 16:39 ` Christophe de Dinechin
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 48+ messages in thread
From: James Bottomley @ 2023-01-10 20:29 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp

On Tue, 2023-01-10 at 12:54 -0600, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with
> added support for an attestation protocol and a vTPM protocol as well
> as other  miscellaneous changes (all identified by change bar).
> Please take a look and reply with any feedback you may have.


I think the first feedback I'd have is that the word "ordinal" isn't
used anywhere in the MS reference implementation.  It confusingly
refers to them as "commands".  To avoid confusion, most TSS
implementations call them "platform commands" to distinguish them from
actual TPM commands.

Then in section 7 "Attestation Protocol", the way we've currently coded
keylime is to validate the specific TPM attestation.  It is possible to
code it to extract that from a bundle of attestations, but that bundle
size grows with every new service and the verification is remote.  It
would be ideal to be able to request only a single service attestation,
so we can keep the current VTPM only attestation in keylime.  You seem
to be identifying services by protocol number so perhaps we could use
the protocol number instead of the GUID and also have a call per
service protocol to attest only that service, say as function id 1 of
protocol 1 taking the protocol number and report version (starting at
zero)?

You suggested that because REPORT_DATA is 64 bytes that we should do
the sha512 hash of (nonce || data), which is how the current VTPM is
coded ... is there a reason to go back to sha256 like the new document
proposes?

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 20:29 ` James Bottomley
@ 2023-01-10 20:37   ` James Bottomley
  2023-01-10 21:33     ` Tom Lendacky
  2023-01-10 21:32   ` Tom Lendacky
  1 sibling, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-10 20:37 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp

I also think there's a typo in 8.1: in the first table RCX appears
twice even though the text beneath implies that the second RCX should
actually be RDX.

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 19:40 ` Dionna Amalie Glaze
@ 2023-01-10 21:03   ` Tom Lendacky
  2023-01-10 22:14     ` James Bottomley
  0 siblings, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 21:03 UTC (permalink / raw)
  To: Dionna Amalie Glaze; +Cc: linux-coco, amd-sev-snp

On 1/10/23 13:40, Dionna Amalie Glaze wrote:
> typo: "oridnal"

Will fix.

> 
> For the statement "Locality usage for the vTPM is not currently
> defined." should this be interpreted as version 1 of the vTPM protocol
> will not support locality, or simply that version 1 might have the
> affordance to add behavior for non-zero locality in a future revision
> of version 1, such that the result is not specified as
> SVSM_ERR_INVALID_PARAMETER? I think the latter is probably a dangerous
> interpretation unless v0.60 of this document is strictly considered
> "unstable" and shouldn't be used upstream, so I'd recommend clarifying
> that "currently" in a document that might later be outdated should be
> precise about its specified behavior in a versioned fashion.

Version 1 of the vTPM protocol will not support locality, so I'll remove 
the "currently." If locality is to be supported, it would be in a post 
version 1 of the vTPM protocol and will likely require invoking a new call 
id (unless we somehow manage to figure out locality before v1.0 of the 
SVSM specification).

v0.60 of the SVSM specification is not to be used upstream. Once v1.0 is 
reached, then it can be considered stable for usage upstream.

Thanks for the feedback.

Tom

> 
> On Tue, Jan 10, 2023 at 10:54 AM Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> Attached is an updated draft version of the SVSM specification with added
>> support for an attestation protocol and a vTPM protocol as well as other
>> miscellaneous changes (all identified by change bar). Please take a look
>> and reply with any feedback you may have.
>>
>> Thanks,
>> Tom
> 
> 
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 20:29 ` James Bottomley
  2023-01-10 20:37   ` James Bottomley
@ 2023-01-10 21:32   ` Tom Lendacky
  2023-01-10 21:47     ` James Bottomley
  1 sibling, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 21:32 UTC (permalink / raw)
  To: James Bottomley, linux-coco, amd-sev-snp

On 1/10/23 14:29, James Bottomley wrote:
> On Tue, 2023-01-10 at 12:54 -0600, Tom Lendacky wrote:
>> Attached is an updated draft version of the SVSM specification with
>> added support for an attestation protocol and a vTPM protocol as well
>> as other  miscellaneous changes (all identified by change bar).
>> Please take a look and reply with any feedback you may have.
> 
> 
> I think the first feedback I'd have is that the word "ordinal" isn't
> used anywhere in the MS reference implementation.  It confusingly
> refers to them as "commands".  To avoid confusion, most TSS
> implementations call them "platform commands" to distinguish them from
> actual TPM commands.

I'll change over to using platform command instead of ordinal.

> 
> Then in section 7 "Attestation Protocol", the way we've currently coded
> keylime is to validate the specific TPM attestation.  It is possible to
> code it to extract that from a bundle of attestations, but that bundle
> size grows with every new service and the verification is remote.  It
> would be ideal to be able to request only a single service attestation,
> so we can keep the current VTPM only attestation in keylime.  You seem
> to be identifying services by protocol number so perhaps we could use
> the protocol number instead of the GUID and also have a call per
> service protocol to attest only that service, say as function id 1 of
> protocol 1 taking the protocol number and report version (starting at
> zero)?

Yes, it would be easy enough to create a single service attestation 
request call. Using the protocol ID would be simple, but do we envision 
any service that may be running in the SVSM that doesn't necessarily have 
a calling protocol? In that case we wouldn't be able to individually 
attest it. I'm not sure there is a case for that - maybe a live migration 
service? I think GUID works well in this case.

I'm assuming report version is to always be able to retrieve a specific 
version/format of a report. I can include that in the service specific 
attestation call. If the version isn't supported, then return 
SVSM_ERR_INVALID_PARAMETER.

I'm also assuming you have an idea of what you would like that report 
version to look like.

My thoughts were to again use GUIDs to identify the key and key-type 
present, i.e. EK RSA-2048 or EK EC (NIST-P256), etc:

	16-byte data GUID (e.g. EK EC (NIST-P256))
	4-byte data length
	data

	16-byte data GUID (e.g. SRK EC (NIST-P256))
	4-byte data length
	data

> 
> You suggested that because REPORT_DATA is 64 bytes that we should do
> the sha512 hash of (nonce || data), which is how the current VTPM is
> coded ... is there a reason to go back to sha256 like the new document
> proposes?

I think I originally suggested sha256 and Dov suggested sha512. We didn't 
really say definitively which one to use. I'm ok going with sha512.

Thanks for the feedback.

Tom

> 
> James
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 20:37   ` James Bottomley
@ 2023-01-10 21:33     ` Tom Lendacky
  0 siblings, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 21:33 UTC (permalink / raw)
  To: James Bottomley, linux-coco, amd-sev-snp

On 1/10/23 14:37, James Bottomley wrote:
> I also think there's a typo in 8.1: in the first table RCX appears
> twice even though the text beneath implies that the second RCX should
> actually be RDX.

Yup, will fix.

Thanks,
Tom

> 
> James
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 21:32   ` Tom Lendacky
@ 2023-01-10 21:47     ` James Bottomley
  2023-01-10 23:00       ` Tom Lendacky
  0 siblings, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-10 21:47 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp

On Tue, 2023-01-10 at 15:32 -0600, Tom Lendacky wrote:
[...]
> My thoughts were to again use GUIDs to identify the key and key-type 
> present, i.e. EK RSA-2048 or EK EC (NIST-P256), etc:
> 
>         16-byte data GUID (e.g. EK EC (NIST-P256))
>         4-byte data length
>         data
> 
>         16-byte data GUID (e.g. SRK EC (NIST-P256))
>         4-byte data length
>         data

I really wouldn't use GUIDS because the TPM structures should all be
self describing.  The structure we're thinking of packing into the
attestation report to be hashed with the nonce is the TPMT_PUBLIC of
the EK.  The algorithm, attributes, policy and name hash which are in
the public area should define the exact template that was used to
construct the public key.  Note that there are about 2^70 possible
combinations, which is why I don't think you want a GUID for each one
...

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 21:03   ` Tom Lendacky
@ 2023-01-10 22:14     ` James Bottomley
  2023-01-10 22:45       ` Tom Lendacky
  0 siblings, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-10 22:14 UTC (permalink / raw)
  To: Tom Lendacky, Dionna Amalie Glaze; +Cc: linux-coco, amd-sev-snp

On Tue, 2023-01-10 at 15:03 -0600, Tom Lendacky wrote:
> On 1/10/23 13:40, Dionna Amalie Glaze wrote:
> > typo: "oridnal"
> 
> Will fix.
> 
> > 
> > For the statement "Locality usage for the vTPM is not currently
> > defined." should this be interpreted as version 1 of the vTPM
> > protocol will not support locality, or simply that version 1 might
> > have the affordance to add behavior for non-zero locality in a
> > future revision of version 1, such that the result is not specified
> > as SVSM_ERR_INVALID_PARAMETER? I think the latter is probably a
> > dangerous interpretation unless v0.60 of this document is strictly
> > considered "unstable" and shouldn't be used upstream, so I'd
> > recommend clarifying that "currently" in a document that might
> > later be outdated should be precise about its specified behavior in
> > a versioned fashion.
> 
> Version 1 of the vTPM protocol will not support locality, so I'll
> remove the "currently." If locality is to be supported, it would be
> in a post version 1 of the vTPM protocol and will likely require
> invoking a new call id (unless we somehow manage to figure out
> locality before v1.0 of the SVSM specification).

Actually, that's not entirely correct:  The current SVSM vTPM
implements locality as a number just fine.  However, if all TPM
consumers can access all localities without restriction locality
becomes a totally useless thing.  To give a meaning to locality, you
have to have some restrictions about how components can access it.  The
only current user of localities I know is dynamic launch and that's
usually done by restricting locality 4 to the CPU microcode in a
physical system.

Until we can agree what dynamic launch (or some other locality
consumer) might mean in a confidential VM (and that the SVSM can police
it) there's no real point wiring locality up in the linux kernel
driver.  I mean the linux kernel itself doesn't use localities either,
it just sets them to 0 unless the TPM indicates a different default
value.

If we do find a use for localites, whatever we use them for would be
described by TPM event log entries, so there would be no need of a new
versioned SVSM call.

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 22:14     ` James Bottomley
@ 2023-01-10 22:45       ` Tom Lendacky
  2023-01-10 23:52         ` James Bottomley
  0 siblings, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 22:45 UTC (permalink / raw)
  To: jejb, Dionna Amalie Glaze; +Cc: linux-coco, amd-sev-snp

On 1/10/23 16:14, James Bottomley wrote:
> On Tue, 2023-01-10 at 15:03 -0600, Tom Lendacky wrote:
>> On 1/10/23 13:40, Dionna Amalie Glaze wrote:
>>> typo: "oridnal"
>>
>> Will fix.
>>
>>>
>>> For the statement "Locality usage for the vTPM is not currently
>>> defined." should this be interpreted as version 1 of the vTPM
>>> protocol will not support locality, or simply that version 1 might
>>> have the affordance to add behavior for non-zero locality in a
>>> future revision of version 1, such that the result is not specified
>>> as SVSM_ERR_INVALID_PARAMETER? I think the latter is probably a
>>> dangerous interpretation unless v0.60 of this document is strictly
>>> considered "unstable" and shouldn't be used upstream, so I'd
>>> recommend clarifying that "currently" in a document that might
>>> later be outdated should be precise about its specified behavior in
>>> a versioned fashion.
>>
>> Version 1 of the vTPM protocol will not support locality, so I'll
>> remove the "currently." If locality is to be supported, it would be
>> in a post version 1 of the vTPM protocol and will likely require
>> invoking a new call id (unless we somehow manage to figure out
>> locality before v1.0 of the SVSM specification).
> 
> Actually, that's not entirely correct:  The current SVSM vTPM
> implements locality as a number just fine.  However, if all TPM

I wasn't saying it doesn't. I'm saying v1.0 of the SVSM specification 
won't support requests with a non-zero locality since we don't know what a 
non-zero locality means.

But this is a specification for any SVSM, so the current SVSM vTPM is 
relative to what you're working on, but maybe not what someone else is 
working on.

> consumers can access all localities without restriction locality
> becomes a totally useless thing.  To give a meaning to locality, you
> have to have some restrictions about how components can access it.  The
> only current user of localities I know is dynamic launch and that's
> usually done by restricting locality 4 to the CPU microcode in a
> physical system.
> 
> Until we can agree what dynamic launch (or some other locality
> consumer) might mean in a confidential VM (and that the SVSM can police
> it) there's no real point wiring locality up in the linux kernel
> driver.  I mean the linux kernel itself doesn't use localities either,
> it just sets them to 0 unless the TPM indicates a different default
> value.
> 
> If we do find a use for localites, whatever we use them for would be
> described by TPM event log entries, so there would be no need of a new
> versioned SVSM call.

I think there would be, though. Right now the call says any non-zero 
locality value returns an error because, as you alluded, there is no 
policing by the SVSM. The API suddenly starting to support non-zero 
localities breaks the API, no?

Thanks,
Tom

> 
> James
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 21:47     ` James Bottomley
@ 2023-01-10 23:00       ` Tom Lendacky
  2023-01-10 23:09         ` James Bottomley
  2023-01-10 23:14         ` James Bottomley
  0 siblings, 2 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-10 23:00 UTC (permalink / raw)
  To: James Bottomley, linux-coco, amd-sev-snp



On 1/10/23 15:47, James Bottomley wrote:
> On Tue, 2023-01-10 at 15:32 -0600, Tom Lendacky wrote:
> [...]
>> My thoughts were to again use GUIDs to identify the key and key-type
>> present, i.e. EK RSA-2048 or EK EC (NIST-P256), etc:
>>
>>          16-byte data GUID (e.g. EK EC (NIST-P256))
>>          4-byte data length
>>          data
>>
>>          16-byte data GUID (e.g. SRK EC (NIST-P256))
>>          4-byte data length
>>          data
> 
> I really wouldn't use GUIDS because the TPM structures should all be
> self describing.  The structure we're thinking of packing into the
> attestation report to be hashed with the nonce is the TPMT_PUBLIC of
> the EK.  The algorithm, attributes, policy and name hash which are in
> the public area should define the exact template that was used to
> construct the public key.  Note that there are about 2^70 possible
> combinations, which is why I don't think you want a GUID for each one

Ok, so this is different from what was originally talked about being just 
the EK EC key andr the SRK EC key.

A GUID still works, though, to describe that the TPMT_PUBLIC supplied is 
for the EK - unless you want to go with the known handles, e.g. 0x81010001 
for the EK RSA handle or 0x81010002 for the EK EC handle, etc. You still 
need to identify what key is represented by the TPMT_PUBLIC structure, 
right, or am I missing something about the TPMT_PUBLIC structure?

Thanks,
Tom

> ...
> 
> James
> 
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 23:00       ` Tom Lendacky
@ 2023-01-10 23:09         ` James Bottomley
  2023-01-11 14:49           ` Tom Lendacky
  2023-01-10 23:14         ` James Bottomley
  1 sibling, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-10 23:09 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp

On Tue, 2023-01-10 at 17:00 -0600, Tom Lendacky wrote:
> A GUID still works, though, to describe that the TPMT_PUBLIC supplied
> is for the EK - unless you want to go with the known handles, e.g.
> 0x81010001 for the EK RSA handle or 0x81010002 for the EK EC handle,
> etc.

You should probably use the hierarchy handle for that case: 0x40000001
for the Owner (storage) seed and 0x40000006 for the endorsement seed. 
However, I'd just specify the endorsement seed in the spec and be done
with it.  Once you know the EK, you can use it to certify any SRK you
create from the owner seed.

>  You still  need to identify what key is represented by the
> TPMT_PUBLIC structure, right, or am I missing something about the
> TPMT_PUBLIC structure?

All you need to know is what hierarchy seed (which is why I propose
endorsement).  After that the TPMT_PUBLIC defines exactly the
properties of the derived key, including the algorithm (RSA or EC
curve, etc).

James



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 23:00       ` Tom Lendacky
  2023-01-10 23:09         ` James Bottomley
@ 2023-01-10 23:14         ` James Bottomley
  1 sibling, 0 replies; 48+ messages in thread
From: James Bottomley @ 2023-01-10 23:14 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp

On Tue, 2023-01-10 at 17:00 -0600, Tom Lendacky wrote:
[...]
> Ok, so this is different from what was originally talked about being
> just the EK EC key andr the SRK EC key.

Just to clarify, the IBM research attestation team is still doing
active prototypes here.  I suspect when the dust settles we'll want to
use a signing EK with no policy to avoid the make credential/activate
credential dance, but first we're using an dencryption EK with
hierarchy password policy to use the standard protocols with keylime. 
I just don't want us to get into the position where we settle on a key
type and then find there isn't a defined GUID for it.

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 22:45       ` Tom Lendacky
@ 2023-01-10 23:52         ` James Bottomley
  2023-01-11  9:15           ` Christophe de Dinechin Dupont de Dinechin
  0 siblings, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-10 23:52 UTC (permalink / raw)
  To: Tom Lendacky, Dionna Amalie Glaze; +Cc: linux-coco, amd-sev-snp

On Tue, 2023-01-10 at 16:45 -0600, Tom Lendacky wrote:
> On 1/10/23 16:14, James Bottomley wrote:
[...]
> > consumers can access all localities without restriction locality
> > becomes a totally useless thing.  To give a meaning to locality,
> > you have to have some restrictions about how components can access
> > it. The only current user of localities I know is dynamic launch
> > and that's usually done by restricting locality 4 to the CPU
> > microcode in a physical system.
> > 
> > Until we can agree what dynamic launch (or some other locality
> > consumer) might mean in a confidential VM (and that the SVSM can
> > police it) there's no real point wiring locality up in the linux
> > kernel driver.  I mean the linux kernel itself doesn't use
> > localities either, it just sets them to 0 unless the TPM indicates
> > a different default value.
> > 
> > If we do find a use for localites, whatever we use them for would
> > be described by TPM event log entries, so there would be no need of
> > a new versioned SVSM call.
> 
> I think there would be, though. Right now the call says any non-zero 
> locality value returns an error because, as you alluded, there is no 
> policing by the SVSM. The API suddenly starting to support non-zero 
> localities breaks the API, no?

Right, but that makes it a doctor it hurts problem.  Just because we
currently don't know how we'd police locality in the SVSM doesn't mean
an OS upward of us can't.  For instance the current argument about
hibernate keys could be solved by making the linux kernel access the
TPM at a different locality from the one it exposes to user space, so
creation data would definitively tell us if the kernel or a user
created a given TPM key.  Hmm, perhaps I should wire up locality in the
driver ... so the kernel can set it, then we won't fail unexpectedly if
a kernel use case actually comes along. 

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 23:52         ` James Bottomley
@ 2023-01-11  9:15           ` Christophe de Dinechin Dupont de Dinechin
  0 siblings, 0 replies; 48+ messages in thread
From: Christophe de Dinechin Dupont de Dinechin @ 2023-01-11  9:15 UTC (permalink / raw)
  To: James Bottomley
  Cc: Tom Lendacky, Dionna Amalie Glaze, linux-coco, amd-sev-snp



> On 11 Jan 2023, at 00:52, James Bottomley <jejb@linux.ibm.com> wrote:
> 
> On Tue, 2023-01-10 at 16:45 -0600, Tom Lendacky wrote:
>> On 1/10/23 16:14, James Bottomley wrote:
> [...]
>>> consumers can access all localities without restriction locality
>>> becomes a totally useless thing.  To give a meaning to locality,
>>> you have to have some restrictions about how components can access
>>> it. The only current user of localities I know is dynamic launch
>>> and that's usually done by restricting locality 4 to the CPU
>>> microcode in a physical system.
>>> 
>>> Until we can agree what dynamic launch (or some other locality
>>> consumer) might mean in a confidential VM (and that the SVSM can
>>> police it) there's no real point wiring locality up in the linux
>>> kernel driver.  I mean the linux kernel itself doesn't use
>>> localities either, it just sets them to 0 unless the TPM indicates
>>> a different default value.
>>> 
>>> If we do find a use for localites, whatever we use them for would
>>> be described by TPM event log entries, so there would be no need of
>>> a new versioned SVSM call.
>> 
>> I think there would be, though. Right now the call says any non-zero 
>> locality value returns an error because, as you alluded, there is no 
>> policing by the SVSM. The API suddenly starting to support non-zero 
>> localities breaks the API, no?
> 
> Right, but that makes it a doctor it hurts problem.  Just because we
> currently don't know how we'd police locality in the SVSM doesn't mean
> an OS upward of us can't.

+1.

>  For instance the current argument about
> hibernate keys could be solved by making the linux kernel access the
> TPM at a different locality from the one it exposes to user space, so
> creation data would definitively tell us if the kernel or a user
> created a given TPM key.  Hmm, perhaps I should wire up locality in the
> driver ... so the kernel can set it, then we won't fail unexpectedly if
> a kernel use case actually comes along. 

But then you’d need some kind of test case to exercise that path,
otherwise chances are it won’t be accepted.

The use case you just described (user-space vs. kernel locality)
looks like it would need to be implemented to justify the wiring
in the driver, no?

> 
> James
> 
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 23:09         ` James Bottomley
@ 2023-01-11 14:49           ` Tom Lendacky
  2023-01-11 14:56             ` James Bottomley
  0 siblings, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-11 14:49 UTC (permalink / raw)
  To: James Bottomley, linux-coco, amd-sev-snp

On 1/10/23 17:09, James Bottomley wrote:
> On Tue, 2023-01-10 at 17:00 -0600, Tom Lendacky wrote:
>> A GUID still works, though, to describe that the TPMT_PUBLIC supplied
>> is for the EK - unless you want to go with the known handles, e.g.
>> 0x81010001 for the EK RSA handle or 0x81010002 for the EK EC handle,
>> etc.
> 
> You should probably use the hierarchy handle for that case: 0x40000001
> for the Owner (storage) seed and 0x40000006 for the endorsement seed.

Looking at the spec, do you mean 0x4000000B for the endorsement seed?

Table 28 of the TPM 2.0 Structures document has TPM_RH_EK (0x40000006) as 
"not used", but TPM_RH_ENDORSEMENT (0x4000000B) as "references the 
Endorsement Primary Seed (EPS), endorsementAuth, and endorsementPolicy"

Thanks,
Tom

> However, I'd just specify the endorsement seed in the spec and be done
> with it.  Once you know the EK, you can use it to certify any SRK you
> create from the owner seed.
> 
>>   You still  need to identify what key is represented by the
>> TPMT_PUBLIC structure, right, or am I missing something about the
>> TPMT_PUBLIC structure?
> 
> All you need to know is what hierarchy seed (which is why I propose
> endorsement).  After that the TPMT_PUBLIC defines exactly the
> properties of the derived key, including the algorithm (RSA or EC
> curve, etc).
> 
> James
> 
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-11 14:49           ` Tom Lendacky
@ 2023-01-11 14:56             ` James Bottomley
  0 siblings, 0 replies; 48+ messages in thread
From: James Bottomley @ 2023-01-11 14:56 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp

On Wed, 2023-01-11 at 08:49 -0600, Tom Lendacky wrote:
> On 1/10/23 17:09, James Bottomley wrote:
> > On Tue, 2023-01-10 at 17:00 -0600, Tom Lendacky wrote:
> > > A GUID still works, though, to describe that the TPMT_PUBLIC
> > > supplied is for the EK - unless you want to go with the known
> > > handles, e.g. 0x81010001 for the EK RSA handle or 0x81010002 for
> > > the EK EC handle, etc.
> > 
> > You should probably use the hierarchy handle for that case:
> > 0x40000001 for the Owner (storage) seed and 0x40000006 for the
> > endorsement seed.
> 
> Looking at the spec, do you mean 0x4000000B for the endorsement seed?
> 
> Table 28 of the TPM 2.0 Structures document has TPM_RH_EK
> (0x40000006) as "not used", but TPM_RH_ENDORSEMENT (0x4000000B) as
> "references the Endorsement Primary Seed (EPS), endorsementAuth, and
> endorsementPolicy"

Yes, probably ... I don't really ever use it except as input the the
command line tsscreateprimary -hi e or create_tpm2_key --parent
endorsement.

But I still say don't bother: the endorsement key is all that needs to
be attested because there are well known processes to use it to attest
every other TPM key.  If you're using the SVSM vTPM because you care
about measurements, you need the endorsement key not the storage key
(you only need the latter if you're inserting other keys).

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
                   ` (2 preceding siblings ...)
  2023-01-10 20:29 ` James Bottomley
@ 2023-01-11 16:39 ` Christophe de Dinechin
  2023-01-11 23:00   ` Tom Lendacky
  2023-01-12 13:57   ` James Bottomley
  2023-01-12  8:19 ` Dov Murik
                   ` (3 subsequent siblings)
  7 siblings, 2 replies; 48+ messages in thread
From: Christophe de Dinechin @ 2023-01-11 16:39 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-coco, amd-sev-snp

Hi Tom,


On 2023-01-10 at 12:54 -06, Tom Lendacky <thomas.lendacky@amd.com> wrote...
> Attached is an updated draft version of the SVSM specification with added
> support for an attestation protocol and a vTPM protocol as well as other
> miscellaneous changes (all identified by change bar). Please take a look and
> reply with any feedback you may have.
>
> Thanks,
> Tom

Thanks for sharing.

This is the first time I actually review that document, so my feedback will
be a bit longer than most. Also, I read it at a time where I had lost
network access to Internet, so RTFM wasn't an option...

First, the actual errors:
p9: Typo VMLP1+ instead of VMPL1+
p18: "bit1=0" and "bit1=1": That seems to be bit 2

Then the more mundane comments

p9: "expected that, but not limited to": the wording sounds strange to me
p9: Undefined acronym (*) VMSA

p10: "certain forms of RMPADJUST": The body of the document seems to
     indicate that the required RMPADJUST are performed as part of the
     various other services. There is no explicit need (apparently) for a
     separate guest-accessible RMPADJUST. Maybe expand a little bit on this
     topic, and explain if the guest is supposed to do any RMPADJUST if
     running at VMPL1.
p10: gPA space of the guest: "of the guest" seems redundant, since it's gPAs
p10: lower VMPL -> less privileged VMPL, or explain that "lower" privilege
     levels have a higher number.
p10: "The initial SVSM memory configuration...": Unclear what "required"
     means in that sentence. Is that the core protocol? Can "create VCPU"
     request additional memory, for example?

p11: No explanation of how the SVSM knows where the secrets page is.
     Probably need an xref to some other doc.
p11: Who does the initial construction of the secrets page? What happens if
     that other actor does not write zeroes? What attacks can the host
     perform on the secrets page if any?
p11: undefined acronym VMPCK
p11: Why should the guest OS "capture" the SVSM_CAA value? In what sense?
     Is it because it can change afterwards, or because the secrets page
     becomes unavailable, or for another reason?
p11: Byte offset in secrets page fields starts at 0x140. Explain why this is
     safe, and how other possible users of the secrets page would avoid
     stomping on that area.

p12: How does the SVSM "terminate"? Is it a regular guest shutdown, or some
     other mechanism, or is unspecified? Can the SVSM log anything, and if
     so, where would it be found? (I assume that would be host platform
     specific, but would still be noteworthy in this specification)

p13: "Use of the Calling Area is necessary..." effectively, this is a single
     byte in the calling area, right? So it's not really "ensuring", maybe
     "detect" spurious invokation (1). I think malicious invokations are not
     made too difficult by this mechanism, since there are only two states,
     and the host still controls when vCPUs run.

p14: Undefined acronyms: GHCB, MSR
p14: I think GHCB Specification is an external reference, worthy having
     italics and a precise reference / link (can't check, no network ATM)
p14: "If the host illegally entered the SVSM, this field will be zero": I
     believe that the conditions enforcing this should be precisely spelled
     out, including for a host with malicious intent. If the mechanism is
     indeed robust, then we are not protecting against "spurious" calls but
     against "spurious or malicious" calls. Otherwise, "will be zero" should
     be replaced by "should normally be zero".
p14: "only after VMSA.RAX and SVSM_CALL_PENDING": this suggests that there
     is some kind of possible race condition here. If that is true, then
     maybe there is a need to specify memory ordering semantics on the three
     relevant fields?

p16: PVALIDATE: What happens if a guest is unaware of SVSM and executes
     PVALIDATE directly? Is the SVSM supposed to emulate that, or to punish
     the guest, or something else? Is that even possible for the SVSM to
     trap and emulate relevant VMPL1 instructions? Also see note on page 10
     regarding RMPADJUST.
p16: "It affects the Calling Area for calling vCPU only": This seems slighly
     inconsistent with page 9 "other SVSM implementations may choose a
     single execution context that services all guest VCPUs".
p16: Alignment for RCX is 8 bytes, but alignment for RDX in
     SVSM_CORE_CREATE_VCPU is 4K. Is that not the same calling area?
     Also, what is the use case for moving the calling area?

p17: The table links appear in a strange colour (some kind of weird cyan).
     It seems clickable too, so I suspect a hyperlink, but since the link is
     always on the same page, it's not super-useful.

p18: Is the CAA seen as assigned to the SVSM? I believe the answer is no
p18: For increased readability, I suggest naming the error codes for the
     SVSM_CORE_PVALIDATE call, and putting them in a table.
     Also, why tag the specific errors right after the architectural
     PVALIDATE errors? In case of architectural extension, you'd always get
     0x8000_1011, which is not super helpful. Instead, you could reserve
     0x8000_1xxx for protocol errors, and put PVALIDATE errors at
     0x9nnn_nnnn, which probably gives you enough room at least for the
     coming 6 months.
p18: "VMPL of the VCPU making the request": wouldn't it make sense to add a
     VMPL field in the PVALIDATE operation descriptor, so that the guest
     could control less-privileged VMPLs?
p18: As indicated earlier, I'm confused by the 4K alginment requirement for
     RDX (Calling Area gPA)
p18: What is the APIC ID of the vCPU used for? I see no mention in the
     explanatory text. Is that an internal index, or does the SVSM
     implementation need it for some reason (I was not clever enough to
     imagine why)

p19: What specification defines FAIL_INUSE? Add xref?

p20: Evidently, SVSM_CORE_DEPOSIT_MEM is intended to be used when another
     service returns 0x4mmm_mmmm. However, in the presence of a flag
     indicating "I may no longer need this memory", and given the
     limitations "cannot cross a page", I am concerned about possible lack
     of forward progress if two vCPUs start parallel operations where one
     vCPU says "Hey, I need X terabytes of RAM to do that" (which will then
     be split into umteen DEPOSIT_MEM calls due to page limit), while
     another says "Hey guys, I'm done" and sets the MEM_AVAILABLE flag. I
     see nothing in the spec that would prevent the second CPU from actively
     withdrawing the memory that the first one is trying to deposit. I think
     that the spec should clarify a forward-progress logic that prevents
     that from happening.
p20: It would be interesting to have at least a vague idea of what
     operations can actually request more memory, just to set expectations.
p20: Suppose that CREATE_VCPU requests more memory. It has no obvious
     "restart" field, unlike things like PVALIDATE. That means that there
     should be a rather strong guarantee that all SVSM calls that can
     potentially return 0x4mmm_mmmm either have no effect when they return
     such a request, or are idempotent if called again after providing more
     memory.

p21: The MEM_AVAILABLE flag is set in the calling area of the startup vCPU.
     Is there any requirement that WITHDRAW_MEM should only be called from
     the startup vCPU, or from only one vCPU at a time?

p22: The writable area ends at a page boundary. What could be a vald
     rationale for setting the RCX pointer in the middle of a page?
     Maybe simpler to require that the pointer be page-aligned than have a
     spec that mentions page offset 0xFF8 as a special case...
p22: Rationale for not returning incomplete? I'm trying to see how the guest
     could efficiently let secondary vCPUs withdraw memory with the protocol
     as specified, without a little additional wording regarding either the
     memory semantics of the MEM_AVAILABLE flag, and telling if there is
     indeed more work to be done by this vCPU using SVSM_ERR_INCOMPLETE.

p23: Table 9 is mnissing RDX, R8 and R9 rows (as input for configuration)
p23: Table 9 title should be "configuration or query"
p23: RCX result is the same for query and configuration, or is that for
     query only?
p23: At the specified RIP. If Bit 3 is not set in the configuration case,
     does it return after the VMGEXIT?

p29: Table 14: How can RAX be used as command ordinal and command response
     size if it's already used for call identifier / result value?

p30: Why would you need 4 bytes for the TPM command ordinal? This causes the
     TPM command size to be misaligned. What about 2 bytes for command
     ordinal, one byte for locality, and one reserved byte?


(*) Like most readers of this document, I know what it means, but since you
defined VMM just above, or VM the page before, I interpreted your intent to
be that every acronym should be defined on first use.

(1) As I am writing this, I have a doubt what happens if the host writes to
    the secrets page, and can't verify easily without a network.

>
> [2. application/pdf; 58019-Secure_VM_Service_Module_Specification.pdf]...


--
Cheers,
Christophe de Dinechin (https://c3d.github.io)
Theory of Incomplete Measurements (https://c3d.github.io/TIM)


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-11 16:39 ` Christophe de Dinechin
@ 2023-01-11 23:00   ` Tom Lendacky
  2023-01-12  1:27     ` [EXTERNAL] " Jon Lange
  2023-01-12 13:57   ` James Bottomley
  1 sibling, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-11 23:00 UTC (permalink / raw)
  To: Christophe de Dinechin, Jon Lange, James Bottomley
  Cc: linux-coco, amd-sev-snp

Adding @Jon Lange and @James Bottomley on the To: line.

@Jon Lange, please search for your name to answer some questions below.
@James Bottomley, ditto.

On 1/11/23 10:39, Christophe de Dinechin wrote:
> Hi Tom,
> 
> 
> On 2023-01-10 at 12:54 -06, Tom Lendacky <thomas.lendacky@amd.com> wrote...
>> Attached is an updated draft version of the SVSM specification with added
>> support for an attestation protocol and a vTPM protocol as well as other
>> miscellaneous changes (all identified by change bar). Please take a look and
>> reply with any feedback you may have.
>>
>> Thanks,
>> Tom
> 
> Thanks for sharing.
> 
> This is the first time I actually review that document, so my feedback will
> be a bit longer than most. Also, I read it at a time where I had lost
> network access to Internet, so RTFM wasn't an option...
> 
> First, the actual errors:
> p9: Typo VMLP1+ instead of VMPL1+

Fixed

> p18: "bit1=0" and "bit1=1": That seems to be bit 2

Fixed

> 
> Then the more mundane comments
> 
> p9: "expected that, but not limited to": the wording sounds strange to me

Ok, I'll work on re-wording this.

> p9: Undefined acronym (*) VMSA

Fixed

> 
> p10: "certain forms of RMPADJUST": The body of the document seems to
>       indicate that the required RMPADJUST are performed as part of the
>       various other services. There is no explicit need (apparently) for a
>       separate guest-accessible RMPADJUST. Maybe expand a little bit on this
>       topic, and explain if the guest is supposed to do any RMPADJUST if
>       running at VMPL1.

RMPADJUST can be performed when not running at VMPL0, but only RMPADJUST 
executed at VMPL0 is allowed to turn a page into a VMSA page. I'll re-word 
this.

> p10: gPA space of the guest: "of the guest" seems redundant, since it's gPAs

Fixed

> p10: lower VMPL -> less privileged VMPL, or explain that "lower" privilege
>       levels have a higher number.

Fixed

> p10: "The initial SVSM memory configuration...": Unclear what "required"
>       means in that sentence. Is that the core protocol? Can "create VCPU"
>       request additional memory, for example?

I'll work on clarifying that.

> 
> p11: No explanation of how the SVSM knows where the secrets page is.
>       Probably need an xref to some other doc.

That is implemenation specific. For example, in the Linux SVSM prototype, 
the address of where to place the secrets page is communicated to the 
hypervisor via a GUID structure at the end of the SVSM binary. Other 
implementations may have a different method.

> p11: Who does the initial construction of the secrets page? What happens if
>       that other actor does not write zeroes? What attacks can the host
>       perform on the secrets page if any?

The SEV firmware constructs the secrets page directly in guest memory. 
Standard SNP practices regarding memory usage and access within the guest 
apply.

> p11: undefined acronym VMPCK

Fixed

> p11: Why should the guest OS "capture" the SVSM_CAA value? In what sense?
>       Is it because it can change afterwards, or because the secrets page
>       becomes unavailable, or for another reason?

I'll reword this, but it basically means that the OS is expected to use 
that value (unless it changes it) for the BSP. The OS is free to do what 
it wants with the memory where the secrets page is located.

> p11: Byte offset in secrets page fields starts at 0x140. Explain why this is
>       safe, and how other possible users of the secrets page would avoid
>       stomping on that area.

Hmmm... what other users? The SNP spec defines the area as reserved for 
guest usage (intended for use by the SVSM, but not explicitly stated). The 
SVSM specifies how it uses it and identifies itself to the guest OS. If 
the guest OS wants to trash it later, it can do that, but it should save 
off any information it needs.

In the Linux SVSM prototype, the secrets page created by the SEV firmware 
lives in the SVSM memory. The SVSM locates where the OVMF secrets page is 
supposed to live and copies the secret page to that location, modifies it 
and then begins OVMF exececution. So the OS is using a different secrets 
page than the SVSM.

> 
> p12: How does the SVSM "terminate"? Is it a regular guest shutdown, or some
>       other mechanism, or is unspecified? Can the SVSM log anything, and if
>       so, where would it be found? (I assume that would be host platform
>       specific, but would still be noteworthy in this specification)

As it is implementation dependent, I'd rather not put that type of 
information in the specification.

In the current Linux SVSM prototype, regular guest shutdown terminates the 
SVSM and the it logs to a fixed serial port.

> 
> p13: "Use of the Calling Area is necessary..." effectively, this is a single
>       byte in the calling area, right? So it's not really "ensuring", maybe
>       "detect" spurious invokation (1). I think malicious invokations are not
>       made too difficult by this mechanism, since there are only two states,
>       and the host still controls when vCPUs run.

I can change ensure to detect.

> 
> p14: Undefined acronyms: GHCB, MSR

Fixed


> p14: I think GHCB Specification is an external reference, worthy having
>       italics and a precise reference / link (can't check, no network ATM)

Fixed

> p14: "If the host illegally entered the SVSM, this field will be zero": I
>       believe that the conditions enforcing this should be precisely spelled
>       out, including for a host with malicious intent. If the mechanism is
>       indeed robust, then we are not protecting against "spurious" calls but
>       against "spurious or malicious" calls. Otherwise, "will be zero" should
>       be replaced by "should normally be zero".

Standard SNP practices are to be applied here. The field will be zero, as 
the only way it can be non-zero is from the guest setting the value or the 
host taking ownership of the CAA page and changing it. However, the act of 
the hypervisor taking ownership of the page will result in a #VC in the 
guest when attempting to access the page. At which point the guest knows 
that the host is being malicious and should terminate.

> p14: "only after VMSA.RAX and SVSM_CALL_PENDING": this suggests that there
>       is some kind of possible race condition here. If that is true, then
>       maybe there is a need to specify memory ordering semantics on the three
>       relevant fields?

The race is that once VMSA.EFER.SVME is set to 1, the guest VMSA can be 
used on a VMRUN instruction. So all updates to the guest VMSA contents 
must be performed before making the VMSA runnable, again.

> 
> p16: PVALIDATE: What happens if a guest is unaware of SVSM and executes
>       PVALIDATE directly? Is the SVSM supposed to emulate that, or to punish
>       the guest, or something else? Is that even possible for the SVSM to
>       trap and emulate relevant VMPL1 instructions? Also see note on page 10
>       regarding RMPADJUST.

If the guest is unaware of the SVSM (meaning it is running at VMPL1, for 
example, but doesn't know it), the PVALIDATE will fail with a #GP and the 
guest will crash. The current upstream Linux implementation checks to 
ensure it is running at VMPL0 or else terminates.

> p16: "It affects the Calling Area for calling vCPU only": This seems slighly
>       inconsistent with page 9 "other SVSM implementations may choose a
>       single execution context that services all guest VCPUs".

Not really. It is used to identify the calling vCPU and thus the VMSA that 
contains the call information. So, even if a single execution context is 
being used, the SVSM still needs to have a unique calling area for each 
vCPU in the guest.

> p16: Alignment for RCX is 8 bytes, but alignment for RDX in
>       SVSM_CORE_CREATE_VCPU is 4K. Is that not the same calling area?
>       Also, what is the use case for moving the calling area?

As the original author, maybe @Jon Lange can explain these requirements.

> 
> p17: The table links appear in a strange colour (some kind of weird cyan).
>       It seems clickable too, so I suspect a hyperlink, but since the link is
>       always on the same page, it's not super-useful.

I can remove that.

> 
> p18: Is the CAA seen as assigned to the SVSM? I believe the answer is no

Correct, it is not.

> p18: For increased readability, I suggest naming the error codes for the
>       SVSM_CORE_PVALIDATE call, and putting them in a table.
>       Also, why tag the specific errors right after the architectural
>       PVALIDATE errors? In case of architectural extension, you'd always get
>       0x8000_1011, which is not super helpful. Instead, you could reserve
>       0x8000_1xxx for protocol errors, and put PVALIDATE errors at
>       0x9nnn_nnnn, which probably gives you enough room at least for the
>       coming 6 months.

I'll look into this, no promises, though :)

> p18: "VMPL of the VCPU making the request": wouldn't it make sense to add a
>       VMPL field in the PVALIDATE operation descriptor, so that the guest
>       could control less-privileged VMPLs?

After the PVALIDATE, the guest can use the RMPADJUST instruction for 
adjusting page permissions for less-privileged VMPLs.

> p18: As indicated earlier, I'm confused by the 4K alginment requirement for
>       RDX (Calling Area gPA)

Lets wait for the response from @Jon Lange.

> p18: What is the APIC ID of the vCPU used for? I see no mention in the
>       explanatory text. Is that an internal index, or does the SVSM
>       implementation need it for some reason (I was not clever enough to
>       imagine why)

It's needed to ensure we are using the correct VMSA and CAA for a specific 
vCPU. Since the BSP creates all of the OS APs, were it to not supply the 
APIC ID when calling the CREATE_VCPU call, the SVSM would not have a way 
to associate the VMSA and CAA being created with a vCPU.

> 
> p19: What specification defines FAIL_INUSE? Add xref?

That's the FAIL_INUSE error code, 3, of the RMPUPDATE instruction. The 
PSMASH, PVALIDATE and RMPADJUST instructions share the same return codes, 
where applicable.

> 
> p20: Evidently, SVSM_CORE_DEPOSIT_MEM is intended to be used when another
>       service returns 0x4mmm_mmmm. However, in the presence of a flag
>       indicating "I may no longer need this memory", and given the
>       limitations "cannot cross a page", I am concerned about possible lack
>       of forward progress if two vCPUs start parallel operations where one
>       vCPU says "Hey, I need X terabytes of RAM to do that" (which will then
>       be split into umteen DEPOSIT_MEM calls due to page limit), while
>       another says "Hey guys, I'm done" and sets the MEM_AVAILABLE flag. I
>       see nothing in the spec that would prevent the second CPU from actively
>       withdrawing the memory that the first one is trying to deposit. I think
>       that the spec should clarify a forward-progress logic that prevents
>       that from happening.
> p20: It would be interesting to have at least a vague idea of what
>       operations can actually request more memory, just to set expectations.
> p20: Suppose that CREATE_VCPU requests more memory. It has no obvious
>       "restart" field, unlike things like PVALIDATE. That means that there
>       should be a rather strong guarantee that all SVSM calls that can
>       potentially return 0x4mmm_mmmm either have no effect when they return
>       such a request, or are idempotent if called again after providing more
>       memory.
> 
> p21: The MEM_AVAILABLE flag is set in the calling area of the startup vCPU.
>       Is there any requirement that WITHDRAW_MEM should only be called from
>       the startup vCPU, or from only one vCPU at a time?
> 
> p22: The writable area ends at a page boundary. What could be a vald
>       rationale for setting the RCX pointer in the middle of a page?
>       Maybe simpler to require that the pointer be page-aligned than have a
>       spec that mentions page offset 0xFF8 as a special case... > p22: Rationale for not returning incomplete? I'm trying to see how the 
guest
>       could efficiently let secondary vCPUs withdraw memory with the protocol
>       as specified, without a little additional wording regarding either the
>       memory semantics of the MEM_AVAILABLE flag, and telling if there is
>       indeed more work to be done by this vCPU using SVSM_ERR_INCOMPLETE.

Let's see what @Jon Lange's view on the p20 - p22 comments.

> 
> p23: Table 9 is mnissing RDX, R8 and R9 rows (as input for configuration)
> p23: Table 9 title should be "configuration or query"
> p23: RCX result is the same for query and configuration, or is that for
>       query only?

I'll update the table to make everything more clear.

> p23: At the specified RIP. If Bit 3 is not set in the configuration case,
>       does it return after the VMGEXIT?

Yes, it will return to the current value of RIP in the VMSA.

> 
> p29: Table 14: How can RAX be used as command ordinal and command response
>       size if it's already used for call identifier / result value?

Typo, that should have been 0x000 (since the column is Byte Offset).

> 
> p30: Why would you need 4 bytes for the TPM command ordinal? This causes the
>       TPM command size to be misaligned. What about 2 bytes for command
>       ordinal, one byte for locality, and one reserved byte?

This is following the MS simulator protocol for the TPM 2.0 reference 
implementation as proposed by @James Bottomley. He also submitted an RFC 
TPM platform driver implementation to the linux-coco list following this 
protocol. It would be best to involve him in the discussion.

> 
> 
> (*) Like most readers of this document, I know what it means, but since you
> defined VMM just above, or VM the page before, I interpreted your intent to
> be that every acronym should be defined on first use.
> 
> (1) As I am writing this, I have a doubt what happens if the host writes to
>      the secrets page, and can't verify easily without a network.

The host can't write to the secrets page without first performing an 
RMPUPDATE to make it a hypervisor page. If it then were to modify the page 
and then re-assign it to the guest, the guest would receive a #VC when 
accessing the page because it is no longer validated, allowing the guest 
to detect the condition and terminate.

Thanks for the feedback!

Tom

> 
>>
>> [2. application/pdf; 58019-Secure_VM_Service_Module_Specification.pdf]...
> 
> 
> --
> Cheers,
> Christophe de Dinechin (https://c3d.github.io)
> Theory of Incomplete Measurements (https://c3d.github.io/TIM)
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-11 23:00   ` Tom Lendacky
@ 2023-01-12  1:27     ` Jon Lange
  2023-01-13 16:10       ` Tom Lendacky
  0 siblings, 1 reply; 48+ messages in thread
From: Jon Lange @ 2023-01-12  1:27 UTC (permalink / raw)
  To: Tom Lendacky, Christophe de Dinechin, James Bottomley
  Cc: linux-coco, amd-sev-snp

> p16: Alignment for RCX is 8 bytes, but alignment for RDX in
>       SVSM_CORE_CREATE_VCPU is 4K. Is that not the same calling area?

This is an inconsistency in the spec.  The first draft defined the calling area as 8-byte aligned but was redefined as 4k aligned after discussion about using the calling page as a parameter page.  The 4k change was made on page 13 but not here.  Tom, can you fix that, and can you look for other similar places I forgot to clean up before handing the spec off to you?

>       Also, what is the use case for moving the calling area?

It's about flexibility for OS memory management.  The calling area for the BSP is configured by the SVSM using whatever gPA it chooses.  That could be in the middle of EFI code or data, or some other random place.  By the time the OS has exited boot services and has started managing memory, it may find that that calling page is a fixed island in a sea of memory that it would rather use for something else.  Perhaps the OS wants to make sure that the calling pages for all vCPUs are contiguous to minimize the fragmentation of physical memory.  Best to give the OS a mechanism to relocate the calling page wherever it wants for maximum control of memory.

> p18: As indicated earlier, I'm confused by the 4K alginment requirement for
>       RDX (Calling Area gPA)

This is consistent with p13.

> p20: Evidently, SVSM_CORE_DEPOSIT_MEM is intended to be used when another
>       service returns 0x4mmm_mmmm. However, in the presence of a flag
>       indicating "I may no longer need this memory", and given the
>       limitations "cannot cross a page", I am concerned about possible lack
>       of forward progress if two vCPUs start parallel operations where one
>       vCPU says "Hey, I need X terabytes of RAM to do that" (which will then
>       be split into umteen DEPOSIT_MEM calls due to page limit), while
>       another says "Hey guys, I'm done" and sets the MEM_AVAILABLE flag. I
>       see nothing in the spec that would prevent the second CPU from actively
>       withdrawing the memory that the first one is trying to deposit. I think
>       that the spec should clarify a forward-progress logic that prevents
>       that from happening.

The expectation is that the OS is smart enough to solve this on its own.  It is certainly the case that the SVSM is designed to be stateless with respect to the core protocol elements, and not to make assumptions about how the OS intends to behave.  If the SVSM knows that it has memory to spare, it should be free to signal that to the OS for the OS to process as it chooses.  It would be pretty simple for the OS to have a gate that counts the number of SVSM operations that are underway at one time so that any one vCPU can avoid returning memory if there is a chance that other vCPUs may end up needing it.  This is best solved using conventions in the OS instead of crafting a delicate multi-threading protocol that the SVSM has to use to signal and arbitrate cross-vCPU contention.

It wouldn't hurt to make this more clear in the spec, for sure.

> p20: It would be interesting to have at least a vague idea of what
>       operations can actually request more memory, just to set expectations.

This depends on the architecture of the SVSM.  It may be the case that the SVSM has to allocate a bunch of internal data structures per vCPU in order to manage whatever state is required to do whatever it's going to do, and in that case we would expect CREATE_VCPU to require memory.  Of course, this may not be required of all SVSM implementations.  It could be the case that the vTPM requires a bunch of memory but the SVSM doesn't reserve that memory because it doesn't want to waste memory if the vTPM is never used - in this case we would expect the first call with the vTPM protocol to request memory to instantiate the vTPM data structures.  It could be that the SVSM wants to maintain a sparse array of validated memory state and therefore every PVALIDATE request has to touch that array, which may require additional memory to materialize a new page of the sparse array upon first access - and in that case the PVALIDATE request may request more memory.  It's good to be prepared for additional memory demands for everything if possible.

> p20: Suppose that CREATE_VCPU requests more memory. It has no obvious
>       "restart" field, unlike things like PVALIDATE. That means that there
>       should be a rather strong guarantee that all SVSM calls that can
>       potentially return 0x4mmm_mmmm either have no effect when they return
>       such a request, or are idempotent if called again after providing more
>       memory.

There is absolutely an expectation that each call can either describe the progress that has been made (like PVALIDATE) or else it must be idempotent.  I agree that this needs to be explicitly called out in the spec.  Tom, can you find a place to include this language?

> p21: The MEM_AVAILABLE flag is set in the calling area of the startup vCPU.
>       Is there any requirement that WITHDRAW_MEM should only be called from
>       the startup vCPU, or from only one vCPU at a time?

There is no such constraint in the specification.  The SVSM is expected to serialize multiple simultaneous WITHDRAW_MEM calls to ensure that the same page doesn't show up in multiple withdrawn lists due to some race condition.  From the point of view of the calling OS, if multiple vCPUs make simultaneous calls to WITHDRAW_MEM, then the lists observed by each of them should be disjoint and consistent, so that each vCPU can return those pages to the local free pool correctly.

> p22: The writable area ends at a page boundary. What could be a vald
>       rationale for setting the RCX pointer in the middle of a page?
>       Maybe simpler to require that the pointer be page-aligned than have a
>       spec that mentions page offset 0xFF8 as a special case...

The rationale is to permit the parameters to inhabit the same page as the calling area - this reduces the amount of memory that the vCPU has to dedicate to an operation that is underway.  In this case, RCX would have a byte offset of +008 (just past the end of the calling area).  There is nothing magic about the +FF8 value here other than to serve as an illustration of the example of an invalid call which leaves enough room in a page for the call header but no room for any of the array elements.

> p22: Rationale for not returning incomplete? I'm trying to see how the 
guest
>       could efficiently let secondary vCPUs withdraw memory with the protocol
>       as specified, without a little additional wording regarding either the
>       memory semantics of the MEM_AVAILABLE flag, and telling if there is
>       indeed more work to be done by this vCPU using SVSM_ERR_INCOMPLETE.

The MEM_AVAILABLE flag in each vCPU's calling area is effectively equivalent to indicating that the withdrawal operation is incomplete.  The SVSM sets that flag upon the completion of every call to indicate whether memory was available for withdrawal at the completion of that last call.  If any vCPU observes MEM_AVAILABLE, then it can kick off a loop across as many vCPUs as it wants where each vCPU will call WITHDRAW_MEM (do-style and not while-style) until either a failure occurs or until the local MEM_AVAILABLE is cleared.  It doesn't matter if one vCPU finishes due to MEM_AVAILABLE while the others are still going; if one of the others observes MEM_AVAILABLE and attempts withdrawal after there is nothing left to withdraw, that call will just complete with an empty page list and will clear the local MEM_AVAILABLE to signal completion of the withdrawal process.  Defining SVSM_ERR_INCOMPLETE as a valid error code here doesn't change any of this logic; it only changes what the local vCPU checks (error code vs. local flag) to determine whether there is more work to do at the time it makes that check.

-Jon

-----Original Message-----
From: Tom Lendacky <thomas.lendacky@amd.com> 
Sent: Wednesday, January 11, 2023 3:01 PM
To: Christophe de Dinechin <dinechin@redhat.com>; Jon Lange <jlange@microsoft.com>; James Bottomley <jejb@linux.ibm.com>
Cc: linux-coco@lists.linux.dev; amd-sev-snp@lists.suse.com
Subject: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60

Adding @Jon Lange and @James Bottomley on the To: line.

@Jon Lange, please search for your name to answer some questions below.
@James Bottomley, ditto.

On 1/11/23 10:39, Christophe de Dinechin wrote:
> Hi Tom,
> 
> 
> On 2023-01-10 at 12:54 -06, Tom Lendacky <thomas.lendacky@amd.com> wrote...
>> Attached is an updated draft version of the SVSM specification with 
>> added support for an attestation protocol and a vTPM protocol as well 
>> as other miscellaneous changes (all identified by change bar). Please 
>> take a look and reply with any feedback you may have.
>>
>> Thanks,
>> Tom
> 
> Thanks for sharing.
> 
> This is the first time I actually review that document, so my feedback 
> will be a bit longer than most. Also, I read it at a time where I had 
> lost network access to Internet, so RTFM wasn't an option...
> 
> First, the actual errors:
> p9: Typo VMLP1+ instead of VMPL1+

Fixed

> p18: "bit1=0" and "bit1=1": That seems to be bit 2

Fixed

> 
> Then the more mundane comments
> 
> p9: "expected that, but not limited to": the wording sounds strange to 
> me

Ok, I'll work on re-wording this.

> p9: Undefined acronym (*) VMSA

Fixed

> 
> p10: "certain forms of RMPADJUST": The body of the document seems to
>       indicate that the required RMPADJUST are performed as part of the
>       various other services. There is no explicit need (apparently) for a
>       separate guest-accessible RMPADJUST. Maybe expand a little bit on this
>       topic, and explain if the guest is supposed to do any RMPADJUST if
>       running at VMPL1.

RMPADJUST can be performed when not running at VMPL0, but only RMPADJUST executed at VMPL0 is allowed to turn a page into a VMSA page. I'll re-word this.

> p10: gPA space of the guest: "of the guest" seems redundant, since 
> it's gPAs

Fixed

> p10: lower VMPL -> less privileged VMPL, or explain that "lower" privilege
>       levels have a higher number.

Fixed

> p10: "The initial SVSM memory configuration...": Unclear what "required"
>       means in that sentence. Is that the core protocol? Can "create VCPU"
>       request additional memory, for example?

I'll work on clarifying that.

> 
> p11: No explanation of how the SVSM knows where the secrets page is.
>       Probably need an xref to some other doc.

That is implemenation specific. For example, in the Linux SVSM prototype, the address of where to place the secrets page is communicated to the hypervisor via a GUID structure at the end of the SVSM binary. Other implementations may have a different method.

> p11: Who does the initial construction of the secrets page? What happens if
>       that other actor does not write zeroes? What attacks can the host
>       perform on the secrets page if any?

The SEV firmware constructs the secrets page directly in guest memory. 
Standard SNP practices regarding memory usage and access within the guest apply.

> p11: undefined acronym VMPCK

Fixed

> p11: Why should the guest OS "capture" the SVSM_CAA value? In what sense?
>       Is it because it can change afterwards, or because the secrets page
>       becomes unavailable, or for another reason?

I'll reword this, but it basically means that the OS is expected to use that value (unless it changes it) for the BSP. The OS is free to do what it wants with the memory where the secrets page is located.

> p11: Byte offset in secrets page fields starts at 0x140. Explain why this is
>       safe, and how other possible users of the secrets page would avoid
>       stomping on that area.

Hmmm... what other users? The SNP spec defines the area as reserved for guest usage (intended for use by the SVSM, but not explicitly stated). The SVSM specifies how it uses it and identifies itself to the guest OS. If the guest OS wants to trash it later, it can do that, but it should save off any information it needs.

In the Linux SVSM prototype, the secrets page created by the SEV firmware lives in the SVSM memory. The SVSM locates where the OVMF secrets page is supposed to live and copies the secret page to that location, modifies it and then begins OVMF exececution. So the OS is using a different secrets page than the SVSM.

> 
> p12: How does the SVSM "terminate"? Is it a regular guest shutdown, or some
>       other mechanism, or is unspecified? Can the SVSM log anything, and if
>       so, where would it be found? (I assume that would be host platform
>       specific, but would still be noteworthy in this specification)

As it is implementation dependent, I'd rather not put that type of information in the specification.

In the current Linux SVSM prototype, regular guest shutdown terminates the SVSM and the it logs to a fixed serial port.

> 
> p13: "Use of the Calling Area is necessary..." effectively, this is a single
>       byte in the calling area, right? So it's not really "ensuring", maybe
>       "detect" spurious invokation (1). I think malicious invokations are not
>       made too difficult by this mechanism, since there are only two states,
>       and the host still controls when vCPUs run.

I can change ensure to detect.

> 
> p14: Undefined acronyms: GHCB, MSR

Fixed


> p14: I think GHCB Specification is an external reference, worthy having
>       italics and a precise reference / link (can't check, no network 
> ATM)

Fixed

> p14: "If the host illegally entered the SVSM, this field will be zero": I
>       believe that the conditions enforcing this should be precisely spelled
>       out, including for a host with malicious intent. If the mechanism is
>       indeed robust, then we are not protecting against "spurious" calls but
>       against "spurious or malicious" calls. Otherwise, "will be zero" should
>       be replaced by "should normally be zero".

Standard SNP practices are to be applied here. The field will be zero, as the only way it can be non-zero is from the guest setting the value or the host taking ownership of the CAA page and changing it. However, the act of the hypervisor taking ownership of the page will result in a #VC in the guest when attempting to access the page. At which point the guest knows that the host is being malicious and should terminate.

> p14: "only after VMSA.RAX and SVSM_CALL_PENDING": this suggests that there
>       is some kind of possible race condition here. If that is true, then
>       maybe there is a need to specify memory ordering semantics on the three
>       relevant fields?

The race is that once VMSA.EFER.SVME is set to 1, the guest VMSA can be used on a VMRUN instruction. So all updates to the guest VMSA contents must be performed before making the VMSA runnable, again.

> 
> p16: PVALIDATE: What happens if a guest is unaware of SVSM and executes
>       PVALIDATE directly? Is the SVSM supposed to emulate that, or to punish
>       the guest, or something else? Is that even possible for the SVSM to
>       trap and emulate relevant VMPL1 instructions? Also see note on page 10
>       regarding RMPADJUST.

If the guest is unaware of the SVSM (meaning it is running at VMPL1, for example, but doesn't know it), the PVALIDATE will fail with a #GP and the guest will crash. The current upstream Linux implementation checks to ensure it is running at VMPL0 or else terminates.

> p16: "It affects the Calling Area for calling vCPU only": This seems slighly
>       inconsistent with page 9 "other SVSM implementations may choose a
>       single execution context that services all guest VCPUs".

Not really. It is used to identify the calling vCPU and thus the VMSA that contains the call information. So, even if a single execution context is being used, the SVSM still needs to have a unique calling area for each vCPU in the guest.

> p16: Alignment for RCX is 8 bytes, but alignment for RDX in
>       SVSM_CORE_CREATE_VCPU is 4K. Is that not the same calling area?
>       Also, what is the use case for moving the calling area?

As the original author, maybe @Jon Lange can explain these requirements.

> 
> p17: The table links appear in a strange colour (some kind of weird cyan).
>       It seems clickable too, so I suspect a hyperlink, but since the link is
>       always on the same page, it's not super-useful.

I can remove that.

> 
> p18: Is the CAA seen as assigned to the SVSM? I believe the answer is 
> no

Correct, it is not.

> p18: For increased readability, I suggest naming the error codes for the
>       SVSM_CORE_PVALIDATE call, and putting them in a table.
>       Also, why tag the specific errors right after the architectural
>       PVALIDATE errors? In case of architectural extension, you'd always get
>       0x8000_1011, which is not super helpful. Instead, you could reserve
>       0x8000_1xxx for protocol errors, and put PVALIDATE errors at
>       0x9nnn_nnnn, which probably gives you enough room at least for the
>       coming 6 months.

I'll look into this, no promises, though :)

> p18: "VMPL of the VCPU making the request": wouldn't it make sense to add a
>       VMPL field in the PVALIDATE operation descriptor, so that the guest
>       could control less-privileged VMPLs?

After the PVALIDATE, the guest can use the RMPADJUST instruction for adjusting page permissions for less-privileged VMPLs.

> p18: As indicated earlier, I'm confused by the 4K alginment requirement for
>       RDX (Calling Area gPA)

Lets wait for the response from @Jon Lange.

> p18: What is the APIC ID of the vCPU used for? I see no mention in the
>       explanatory text. Is that an internal index, or does the SVSM
>       implementation need it for some reason (I was not clever enough to
>       imagine why)

It's needed to ensure we are using the correct VMSA and CAA for a specific vCPU. Since the BSP creates all of the OS APs, were it to not supply the APIC ID when calling the CREATE_VCPU call, the SVSM would not have a way to associate the VMSA and CAA being created with a vCPU.

> 
> p19: What specification defines FAIL_INUSE? Add xref?

That's the FAIL_INUSE error code, 3, of the RMPUPDATE instruction. The 
PSMASH, PVALIDATE and RMPADJUST instructions share the same return codes, 
where applicable.

> 
> p20: Evidently, SVSM_CORE_DEPOSIT_MEM is intended to be used when another
>       service returns 0x4mmm_mmmm. However, in the presence of a flag
>       indicating "I may no longer need this memory", and given the
>       limitations "cannot cross a page", I am concerned about possible lack
>       of forward progress if two vCPUs start parallel operations where one
>       vCPU says "Hey, I need X terabytes of RAM to do that" (which will then
>       be split into umteen DEPOSIT_MEM calls due to page limit), while
>       another says "Hey guys, I'm done" and sets the MEM_AVAILABLE flag. I
>       see nothing in the spec that would prevent the second CPU from actively
>       withdrawing the memory that the first one is trying to deposit. I think
>       that the spec should clarify a forward-progress logic that prevents
>       that from happening.
> p20: It would be interesting to have at least a vague idea of what
>       operations can actually request more memory, just to set expectations.
> p20: Suppose that CREATE_VCPU requests more memory. It has no obvious
>       "restart" field, unlike things like PVALIDATE. That means that there
>       should be a rather strong guarantee that all SVSM calls that can
>       potentially return 0x4mmm_mmmm either have no effect when they return
>       such a request, or are idempotent if called again after providing more
>       memory.
> 
> p21: The MEM_AVAILABLE flag is set in the calling area of the startup vCPU.
>       Is there any requirement that WITHDRAW_MEM should only be called from
>       the startup vCPU, or from only one vCPU at a time?
> 
> p22: The writable area ends at a page boundary. What could be a vald
>       rationale for setting the RCX pointer in the middle of a page?
>       Maybe simpler to require that the pointer be page-aligned than have a
>       spec that mentions page offset 0xFF8 as a special case... > p22: Rationale for not returning incomplete? I'm trying to see how the 
guest
>       could efficiently let secondary vCPUs withdraw memory with the protocol
>       as specified, without a little additional wording regarding either the
>       memory semantics of the MEM_AVAILABLE flag, and telling if there is
>       indeed more work to be done by this vCPU using SVSM_ERR_INCOMPLETE.

Let's see what @Jon Lange's view on the p20 - p22 comments.

> 
> p23: Table 9 is mnissing RDX, R8 and R9 rows (as input for configuration)
> p23: Table 9 title should be "configuration or query"
> p23: RCX result is the same for query and configuration, or is that for
>       query only?

I'll update the table to make everything more clear.

> p23: At the specified RIP. If Bit 3 is not set in the configuration case,
>       does it return after the VMGEXIT?

Yes, it will return to the current value of RIP in the VMSA.

> 
> p29: Table 14: How can RAX be used as command ordinal and command response
>       size if it's already used for call identifier / result value?

Typo, that should have been 0x000 (since the column is Byte Offset).

> 
> p30: Why would you need 4 bytes for the TPM command ordinal? This causes the
>       TPM command size to be misaligned. What about 2 bytes for command
>       ordinal, one byte for locality, and one reserved byte?

This is following the MS simulator protocol for the TPM 2.0 reference 
implementation as proposed by @James Bottomley. He also submitted an RFC 
TPM platform driver implementation to the linux-coco list following this 
protocol. It would be best to involve him in the discussion.

> 
> 
> (*) Like most readers of this document, I know what it means, but since you
> defined VMM just above, or VM the page before, I interpreted your intent to
> be that every acronym should be defined on first use.
> 
> (1) As I am writing this, I have a doubt what happens if the host writes to
>      the secrets page, and can't verify easily without a network.

The host can't write to the secrets page without first performing an 
RMPUPDATE to make it a hypervisor page. If it then were to modify the page 
and then re-assign it to the guest, the guest would receive a #VC when 
accessing the page because it is no longer validated, allowing the guest 
to detect the condition and terminate.

Thanks for the feedback!

Tom

> 
>>
>> [2. application/pdf; 58019-Secure_VM_Service_Module_Specification.pdf]...
> 
> 
> --
> Cheers,
> Christophe de Dinechin (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fc3d.github.io%2F&data=05%7C01%7Cjlange%40microsoft.com%7Cd9d72740aee74b848eb208daf427aa92%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638090748477137485%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=%2BO71GzOCiSNlnO5gC5QOGOBmOH9IK4vJZA9k06RTNwo%3D&reserved=0)
> Theory of Incomplete Measurements (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fc3d.github.io%2FTIM&data=05%7C01%7Cjlange%40microsoft.com%7Cd9d72740aee74b848eb208daf427aa92%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638090748477137485%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ghf1u8FdX9WtlfFjYaax07awau35J963AQrPASZenZw%3D&reserved=0)
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
                   ` (3 preceding siblings ...)
  2023-01-11 16:39 ` Christophe de Dinechin
@ 2023-01-12  8:19 ` Dov Murik
  2023-01-12 12:18   ` James Bottomley
  2023-01-13 16:16   ` Tom Lendacky
  2023-01-13 11:50 ` Nicolai Stange
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 48+ messages in thread
From: Dov Murik @ 2023-01-12  8:19 UTC (permalink / raw)
  To: Tom Lendacky, linux-coco, amd-sev-snp, James Bottomley; +Cc: Dov Murik

Thanks Tom for these additions.

On 10/01/2023 20:54, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with
> added support for an attestation protocol and a vTPM protocol as well as
> other miscellaneous changes (all identified by change bar). Please take
> a look and reply with any feedback you may have.
> 

Few comments/questions:

Page 25: 7.1 SVSM_ATTEST_SERVICES

1. Should we add two fields in Table 11 for certs_addr and certs_len? If
certs_len input is > 0, then SVSM will perform guest_ext_request and
retrieve the host/VMM certs into certs_addr.  Provided certs_len must be
4KB-aligned.  Not sure about certs_addr.

If the size of the certs exceeds the size of the supplied certs buffer,
R8 will be set to the size of the certs (in bytes) and the call will
return SVSM_ERR_INVALID_PARAMETER.

2. Table 11: The 'Attestation report buffer gPA': should have enough
space to hold the resulting SNP attestation report (0x500 bytes).

(I assume the implementation will ask the PSP to generate the report
into an SVSM-HV shared page, and then copy the result to the
caller-provided buffer in guest private memory.  So the caller doesn't
need to worry about page alignment/crossing?)

3. Consider stating that the SNP attestation report is always generated
with MSG_REPORT_REQ.VMPL=0.

4. Services Manifest (page 26): Can we require that the same SVSM source
code will produce the same (binary) manifest buffer (in different VMs)?
For example, at which order do the entries appear? I think James was
aiming at durable/repeatable attestations.



Page 28: vTPM Protocol

5. Will the SVSM update any PCRs on its own? For example, will it
measure the content of OVMF into PCR0?  Or are relying on the SNP
launch-update measurement (which currently includes both SVSM and OVMF)
to attest that part of the guest?


-Dov

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-12  8:19 ` Dov Murik
@ 2023-01-12 12:18   ` James Bottomley
  2023-01-13 16:16   ` Tom Lendacky
  1 sibling, 0 replies; 48+ messages in thread
From: James Bottomley @ 2023-01-12 12:18 UTC (permalink / raw)
  To: Dov Murik, Tom Lendacky, linux-coco, amd-sev-snp

On Thu, 2023-01-12 at 10:19 +0200, Dov Murik wrote:
> Page 28: vTPM Protocol
> 
> 5. Will the SVSM update any PCRs on its own? For example, will it
> measure the content of OVMF into PCR0?  Or are relying on the SNP
> launch-update measurement (which currently includes both SVSM and
> OVMF) to attest that part of the guest?

If the SVSM were to load OVMF then it would have to because the SVSM
would become the root of trust.  While SVSM+OVMF are loaded together
(and attested by the PSP) there's no need because the combined entity
is the root of trust and thus measurements can begin in OVMF.

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-11 16:39 ` Christophe de Dinechin
  2023-01-11 23:00   ` Tom Lendacky
@ 2023-01-12 13:57   ` James Bottomley
  2023-01-12 15:13     ` Tom Lendacky
  1 sibling, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-12 13:57 UTC (permalink / raw)
  To: Christophe de Dinechin, Tom Lendacky; +Cc: linux-coco, amd-sev-snp

On Wed, 2023-01-11 at 17:39 +0100, Christophe de Dinechin wrote:
[...]
> p29: Table 14: How can RAX be used as command ordinal and command
> response size if it's already used for call identifier / result
> value?

I think there's a bug on the first table: in the prototype we use RAX
[in] (not RCX [in]) for the pointer to the request/response buffer and
RAX [out] as the SVSM return code.

The shared response buffer has the platform command as the first word
when it goes down and the response length as the first word when it
comes back, which I think is what the second table is meant to show.

> p30: Why would you need 4 bytes for the TPM command ordinal? This
> causes the TPM command size to be misaligned. What about 2 bytes for
> command ordinal, one byte for locality, and one reserved byte?

It comes straight from the MSSIM protocol.  The actual vTPM service
routine sends the buffer down unmodified to the vTPM command routines.
If the size of this gets changed, the entire command would have to be
copied to a new buffer to send it to the MSSIM vTPM.

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-12 13:57   ` James Bottomley
@ 2023-01-12 15:13     ` Tom Lendacky
  2023-01-12 15:24       ` James Bottomley
  0 siblings, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-12 15:13 UTC (permalink / raw)
  To: James Bottomley, Christophe de Dinechin; +Cc: linux-coco, amd-sev-snp

On 1/12/23 07:57, James Bottomley wrote:
> On Wed, 2023-01-11 at 17:39 +0100, Christophe de Dinechin wrote:
> [...]
>> p29: Table 14: How can RAX be used as command ordinal and command
>> response size if it's already used for call identifier / result
>> value?
> 
> I think there's a bug on the first table: in the prototype we use RAX
> [in] (not RCX [in]) for the pointer to the request/response buffer and
> RAX [out] as the SVSM return code.

No, it's not a bug. You'll need to use RCX as the pointer to the 
request/response buffer because RAX, per calling convention, contains the 
protocol id and call id.

> 
> The shared response buffer has the platform command as the first word
> when it goes down and the response length as the first word when it
> comes back, which I think is what the second table is meant to show.

Right, I commented already that it was a typo and should have been an 
offset with a value of 0.

Thanks,
Tom

> 
>> p30: Why would you need 4 bytes for the TPM command ordinal? This
>> causes the TPM command size to be misaligned. What about 2 bytes for
>> command ordinal, one byte for locality, and one reserved byte?
> 
> It comes straight from the MSSIM protocol.  The actual vTPM service
> routine sends the buffer down unmodified to the vTPM command routines.
> If the size of this gets changed, the entire command would have to be
> copied to a new buffer to send it to the MSSIM vTPM.
> 
> James
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-12 15:13     ` Tom Lendacky
@ 2023-01-12 15:24       ` James Bottomley
  2023-01-13 16:12         ` Tom Lendacky
  0 siblings, 1 reply; 48+ messages in thread
From: James Bottomley @ 2023-01-12 15:24 UTC (permalink / raw)
  To: Tom Lendacky, Christophe de Dinechin; +Cc: linux-coco, amd-sev-snp

On Thu, 2023-01-12 at 09:13 -0600, Tom Lendacky wrote:
> On 1/12/23 07:57, James Bottomley wrote:
> > On Wed, 2023-01-11 at 17:39 +0100, Christophe de Dinechin wrote:
> > [...]
> > > p29: Table 14: How can RAX be used as command ordinal and command
> > > response size if it's already used for call identifier / result
> > > value?
> > 
> > I think there's a bug on the first table: in the prototype we use
> > RAX [in] (not RCX [in]) for the pointer to the request/response
> > buffer and RAX [out] as the SVSM return code.
> 
> No, it's not a bug. You'll need to use RCX as the pointer to the 
> request/response buffer because RAX, per calling convention, contains
> the protocol id and call id.

Oh, right, I should have looked at the code first before commenting we
do indeed use RCX for the shared buffer.

Do you have a timetable for updating the linux-svsm github for the new
protocol based calling convention?  We'd like to rebase the new vTPM
code off it when it's available.

James


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
                   ` (4 preceding siblings ...)
  2023-01-12  8:19 ` Dov Murik
@ 2023-01-13 11:50 ` Nicolai Stange
  2023-01-13 17:20   ` Tom Lendacky
  2023-01-24  9:35 ` Jörg Rödel
  2023-01-24  9:45 ` Jörg Rödel
  7 siblings, 1 reply; 48+ messages in thread
From: Nicolai Stange @ 2023-01-13 11:50 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-coco, amd-sev-snp

Hi Tom,

Tom Lendacky <thomas.lendacky@amd.com> writes:

> Please take a look and reply with any feedback you may have.

Perhaps I simply missed it, but the available SVSM_VTPM_CMD response
buffer size seems to remain kind of unspecified. That is, the proposal
from [1] was to just require a complete page for the buffer, but I can't
find that explicitly stated anywhere (except for the required alignment
of %rcx giving a hint).


For the table on p. 28 in sec. 8.1 "SVSM_VTPM_QUERY Call", the
"Supported vTPM features" are meant get returned in %rdx, not %rcx, I
think.


And finally a question re the addition on p.9 ("Scope of the document"),
which reads as
 "Items measured at VMLP1+
  o Firmware binary
 "

In light of the sentence immediately preceeding the above and explicitly
stating that the svsm would get measured as part of the initial image,
does this conversely imply that the firmware binary would typically not
get measured as part of the initial guest image? I.e. that it would get
loaded with PAGE_TYPE_UNMEASURED? If so the above could be read as if
the firmware was supposed to measure itself at VMPL1. I think that's not
what's being meant here, but the wording is a bit misleading IMO.

Thanks!

Nicolai

[1] https://lore.kernel.org/linux-coco/b488a79617beed8913df61186e8e263c40f2330b.camel@linux.ibm.com/

-- 
SUSE Software Solutions Germany GmbH, Frankenstraße 146, 90461 Nürnberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
(HRB 36809, AG Nürnberg)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-12  1:27     ` [EXTERNAL] " Jon Lange
@ 2023-01-13 16:10       ` Tom Lendacky
  0 siblings, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-13 16:10 UTC (permalink / raw)
  To: Jon Lange, Christophe de Dinechin, James Bottomley
  Cc: linux-coco, amd-sev-snp

On 1/11/23 19:27, Jon Lange wrote:
>> p16: Alignment for RCX is 8 bytes, but alignment for RDX in
>>        SVSM_CORE_CREATE_VCPU is 4K. Is that not the same calling area?
> 
> This is an inconsistency in the spec.  The first draft defined the calling area as 8-byte aligned but was redefined as 4k aligned after discussion about using the calling page as a parameter page.  The 4k change was made on page 13 but not here.  Tom, can you fix that, and can you look for other similar places I forgot to clean up before handing the spec off to you?

Yep, I can do that.

> 
>>        Also, what is the use case for moving the calling area?
> 
> It's about flexibility for OS memory management.  The calling area for the BSP is configured by the SVSM using whatever gPA it chooses.  That could be in the middle of EFI code or data, or some other random place.  By the time the OS has exited boot services and has started managing memory, it may find that that calling page is a fixed island in a sea of memory that it would rather use for something else.  Perhaps the OS wants to make sure that the calling pages for all vCPUs are contiguous to minimize the fragmentation of physical memory.  Best to give the OS a mechanism to relocate the calling page wherever it wants for maximum control of memory.
> 
>> p18: As indicated earlier, I'm confused by the 4K alginment requirement for
>>        RDX (Calling Area gPA)
> 
> This is consistent with p13.
> 
>> p20: Evidently, SVSM_CORE_DEPOSIT_MEM is intended to be used when another
>>        service returns 0x4mmm_mmmm. However, in the presence of a flag
>>        indicating "I may no longer need this memory", and given the
>>        limitations "cannot cross a page", I am concerned about possible lack
>>        of forward progress if two vCPUs start parallel operations where one
>>        vCPU says "Hey, I need X terabytes of RAM to do that" (which will then
>>        be split into umteen DEPOSIT_MEM calls due to page limit), while
>>        another says "Hey guys, I'm done" and sets the MEM_AVAILABLE flag. I
>>        see nothing in the spec that would prevent the second CPU from actively
>>        withdrawing the memory that the first one is trying to deposit. I think
>>        that the spec should clarify a forward-progress logic that prevents
>>        that from happening.
> 
> The expectation is that the OS is smart enough to solve this on its own.  It is certainly the case that the SVSM is designed to be stateless with respect to the core protocol elements, and not to make assumptions about how the OS intends to behave.  If the SVSM knows that it has memory to spare, it should be free to signal that to the OS for the OS to process as it chooses.  It would be pretty simple for the OS to have a gate that counts the number of SVSM operations that are underway at one time so that any one vCPU can avoid returning memory if there is a chance that other vCPUs may end up needing it.  This is best solved using conventions in the OS instead of crafting a delicate multi-threading protocol that the SVSM has to use to signal and arbitrate cross-vCPU contention.
> 
> It wouldn't hurt to make this more clear in the spec, for sure.

I'll update the specification to say that the OS is responsible for 
coordinating requests to DEPOSIT_MEM and WITHDRAW_MEM.

> 
>> p20: It would be interesting to have at least a vague idea of what
>>        operations can actually request more memory, just to set expectations.
> 
> This depends on the architecture of the SVSM.  It may be the case that the SVSM has to allocate a bunch of internal data structures per vCPU in order to manage whatever state is required to do whatever it's going to do, and in that case we would expect CREATE_VCPU to require memory.  Of course, this may not be required of all SVSM implementations.  It could be the case that the vTPM requires a bunch of memory but the SVSM doesn't reserve that memory because it doesn't want to waste memory if the vTPM is never used - in this case we would expect the first call with the vTPM protocol to request memory to instantiate the vTPM data structures.  It could be that the SVSM wants to maintain a sparse array of validated memory state and therefore every PVALIDATE request has to touch that array, which may require additional memory to materialize a new page of the sparse array upon first access - and in that case the PVALIDATE request may request more memory.  It's good to be prepared for additional memory demands for everything if possible.
> 
>> p20: Suppose that CREATE_VCPU requests more memory. It has no obvious
>>        "restart" field, unlike things like PVALIDATE. That means that there
>>        should be a rather strong guarantee that all SVSM calls that can
>>        potentially return 0x4mmm_mmmm either have no effect when they return
>>        such a request, or are idempotent if called again after providing more
>>        memory.
> 
> There is absolutely an expectation that each call can either describe the progress that has been made (like PVALIDATE) or else it must be idempotent.  I agree that this needs to be explicitly called out in the spec.  Tom, can you find a place to include this language?

Will do.

> 
>> p21: The MEM_AVAILABLE flag is set in the calling area of the startup vCPU.
>>        Is there any requirement that WITHDRAW_MEM should only be called from
>>        the startup vCPU, or from only one vCPU at a time?
> 
> There is no such constraint in the specification.  The SVSM is expected to serialize multiple simultaneous WITHDRAW_MEM calls to ensure that the same page doesn't show up in multiple withdrawn lists due to some race condition.  From the point of view of the calling OS, if multiple vCPUs make simultaneous calls to WITHDRAW_MEM, then the lists observed by each of them should be disjoint and consistent, so that each vCPU can return those pages to the local free pool correctly.

I'll remove the startup vCPU reference and add some language to the spec 
about this.

Thanks,
Tom

>  >> p22: The writable area ends at a page boundary. What could be a vald
>>        rationale for setting the RCX pointer in the middle of a page?
>>        Maybe simpler to require that the pointer be page-aligned than have a
>>        spec that mentions page offset 0xFF8 as a special case...
> 
> The rationale is to permit the parameters to inhabit the same page as the calling area - this reduces the amount of memory that the vCPU has to dedicate to an operation that is underway.  In this case, RCX would have a byte offset of +008 (just past the end of the calling area).  There is nothing magic about the +FF8 value here other than to serve as an illustration of the example of an invalid call which leaves enough room in a page for the call header but no room for any of the array elements.
> 
>> p22: Rationale for not returning incomplete? I'm trying to see how the
> guest
>>        could efficiently let secondary vCPUs withdraw memory with the protocol
>>        as specified, without a little additional wording regarding either the
>>        memory semantics of the MEM_AVAILABLE flag, and telling if there is
>>        indeed more work to be done by this vCPU using SVSM_ERR_INCOMPLETE.
> 
> The MEM_AVAILABLE flag in each vCPU's calling area is effectively equivalent to indicating that the withdrawal operation is incomplete.  The SVSM sets that flag upon the completion of every call to indicate whether memory was available for withdrawal at the completion of that last call.  If any vCPU observes MEM_AVAILABLE, then it can kick off a loop across as many vCPUs as it wants where each vCPU will call WITHDRAW_MEM (do-style and not while-style) until either a failure occurs or until the local MEM_AVAILABLE is cleared.  It doesn't matter if one vCPU finishes due to MEM_AVAILABLE while the others are still going; if one of the others observes MEM_AVAILABLE and attempts withdrawal after there is nothing left to withdraw, that call will just complete with an empty page list and will clear the local MEM_AVAILABLE to signal completion of the withdrawal process.  Defining SVSM_ERR_INCOMPLETE as a valid error code here doesn't change any of this logic; it only changes what the local vCPU checks (error code vs. local flag) to determine whether there is more work to do at the time it makes that check.
> 
> -Jon
> 
> -----Original Message-----
> From: Tom Lendacky <thomas.lendacky@amd.com>
> Sent: Wednesday, January 11, 2023 3:01 PM
> To: Christophe de Dinechin <dinechin@redhat.com>; Jon Lange <jlange@microsoft.com>; James Bottomley <jejb@linux.ibm.com>
> Cc: linux-coco@lists.linux.dev; amd-sev-snp@lists.suse.com
> Subject: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
> 
> Adding @Jon Lange and @James Bottomley on the To: line.
> 
> @Jon Lange, please search for your name to answer some questions below.
> @James Bottomley, ditto.
> 
> On 1/11/23 10:39, Christophe de Dinechin wrote:
>> Hi Tom,
>>
>>
>> On 2023-01-10 at 12:54 -06, Tom Lendacky <thomas.lendacky@amd.com> wrote...
>>> Attached is an updated draft version of the SVSM specification with
>>> added support for an attestation protocol and a vTPM protocol as well
>>> as other miscellaneous changes (all identified by change bar). Please
>>> take a look and reply with any feedback you may have.
>>>
>>> Thanks,
>>> Tom
>>
>> Thanks for sharing.
>>
>> This is the first time I actually review that document, so my feedback
>> will be a bit longer than most. Also, I read it at a time where I had
>> lost network access to Internet, so RTFM wasn't an option...
>>
>> First, the actual errors:
>> p9: Typo VMLP1+ instead of VMPL1+
> 
> Fixed
> 
>> p18: "bit1=0" and "bit1=1": That seems to be bit 2
> 
> Fixed
> 
>>
>> Then the more mundane comments
>>
>> p9: "expected that, but not limited to": the wording sounds strange to
>> me
> 
> Ok, I'll work on re-wording this.
> 
>> p9: Undefined acronym (*) VMSA
> 
> Fixed
> 
>>
>> p10: "certain forms of RMPADJUST": The body of the document seems to
>>        indicate that the required RMPADJUST are performed as part of the
>>        various other services. There is no explicit need (apparently) for a
>>        separate guest-accessible RMPADJUST. Maybe expand a little bit on this
>>        topic, and explain if the guest is supposed to do any RMPADJUST if
>>        running at VMPL1.
> 
> RMPADJUST can be performed when not running at VMPL0, but only RMPADJUST executed at VMPL0 is allowed to turn a page into a VMSA page. I'll re-word this.
> 
>> p10: gPA space of the guest: "of the guest" seems redundant, since
>> it's gPAs
> 
> Fixed
> 
>> p10: lower VMPL -> less privileged VMPL, or explain that "lower" privilege
>>        levels have a higher number.
> 
> Fixed
> 
>> p10: "The initial SVSM memory configuration...": Unclear what "required"
>>        means in that sentence. Is that the core protocol? Can "create VCPU"
>>        request additional memory, for example?
> 
> I'll work on clarifying that.
> 
>>
>> p11: No explanation of how the SVSM knows where the secrets page is.
>>        Probably need an xref to some other doc.
> 
> That is implemenation specific. For example, in the Linux SVSM prototype, the address of where to place the secrets page is communicated to the hypervisor via a GUID structure at the end of the SVSM binary. Other implementations may have a different method.
> 
>> p11: Who does the initial construction of the secrets page? What happens if
>>        that other actor does not write zeroes? What attacks can the host
>>        perform on the secrets page if any?
> 
> The SEV firmware constructs the secrets page directly in guest memory.
> Standard SNP practices regarding memory usage and access within the guest apply.
> 
>> p11: undefined acronym VMPCK
> 
> Fixed
> 
>> p11: Why should the guest OS "capture" the SVSM_CAA value? In what sense?
>>        Is it because it can change afterwards, or because the secrets page
>>        becomes unavailable, or for another reason?
> 
> I'll reword this, but it basically means that the OS is expected to use that value (unless it changes it) for the BSP. The OS is free to do what it wants with the memory where the secrets page is located.
> 
>> p11: Byte offset in secrets page fields starts at 0x140. Explain why this is
>>        safe, and how other possible users of the secrets page would avoid
>>        stomping on that area.
> 
> Hmmm... what other users? The SNP spec defines the area as reserved for guest usage (intended for use by the SVSM, but not explicitly stated). The SVSM specifies how it uses it and identifies itself to the guest OS. If the guest OS wants to trash it later, it can do that, but it should save off any information it needs.
> 
> In the Linux SVSM prototype, the secrets page created by the SEV firmware lives in the SVSM memory. The SVSM locates where the OVMF secrets page is supposed to live and copies the secret page to that location, modifies it and then begins OVMF exececution. So the OS is using a different secrets page than the SVSM.
> 
>>
>> p12: How does the SVSM "terminate"? Is it a regular guest shutdown, or some
>>        other mechanism, or is unspecified? Can the SVSM log anything, and if
>>        so, where would it be found? (I assume that would be host platform
>>        specific, but would still be noteworthy in this specification)
> 
> As it is implementation dependent, I'd rather not put that type of information in the specification.
> 
> In the current Linux SVSM prototype, regular guest shutdown terminates the SVSM and the it logs to a fixed serial port.
> 
>>
>> p13: "Use of the Calling Area is necessary..." effectively, this is a single
>>        byte in the calling area, right? So it's not really "ensuring", maybe
>>        "detect" spurious invokation (1). I think malicious invokations are not
>>        made too difficult by this mechanism, since there are only two states,
>>        and the host still controls when vCPUs run.
> 
> I can change ensure to detect.
> 
>>
>> p14: Undefined acronyms: GHCB, MSR
> 
> Fixed
> 
> 
>> p14: I think GHCB Specification is an external reference, worthy having
>>        italics and a precise reference / link (can't check, no network
>> ATM)
> 
> Fixed
> 
>> p14: "If the host illegally entered the SVSM, this field will be zero": I
>>        believe that the conditions enforcing this should be precisely spelled
>>        out, including for a host with malicious intent. If the mechanism is
>>        indeed robust, then we are not protecting against "spurious" calls but
>>        against "spurious or malicious" calls. Otherwise, "will be zero" should
>>        be replaced by "should normally be zero".
> 
> Standard SNP practices are to be applied here. The field will be zero, as the only way it can be non-zero is from the guest setting the value or the host taking ownership of the CAA page and changing it. However, the act of the hypervisor taking ownership of the page will result in a #VC in the guest when attempting to access the page. At which point the guest knows that the host is being malicious and should terminate.
> 
>> p14: "only after VMSA.RAX and SVSM_CALL_PENDING": this suggests that there
>>        is some kind of possible race condition here. If that is true, then
>>        maybe there is a need to specify memory ordering semantics on the three
>>        relevant fields?
> 
> The race is that once VMSA.EFER.SVME is set to 1, the guest VMSA can be used on a VMRUN instruction. So all updates to the guest VMSA contents must be performed before making the VMSA runnable, again.
> 
>>
>> p16: PVALIDATE: What happens if a guest is unaware of SVSM and executes
>>        PVALIDATE directly? Is the SVSM supposed to emulate that, or to punish
>>        the guest, or something else? Is that even possible for the SVSM to
>>        trap and emulate relevant VMPL1 instructions? Also see note on page 10
>>        regarding RMPADJUST.
> 
> If the guest is unaware of the SVSM (meaning it is running at VMPL1, for example, but doesn't know it), the PVALIDATE will fail with a #GP and the guest will crash. The current upstream Linux implementation checks to ensure it is running at VMPL0 or else terminates.
> 
>> p16: "It affects the Calling Area for calling vCPU only": This seems slighly
>>        inconsistent with page 9 "other SVSM implementations may choose a
>>        single execution context that services all guest VCPUs".
> 
> Not really. It is used to identify the calling vCPU and thus the VMSA that contains the call information. So, even if a single execution context is being used, the SVSM still needs to have a unique calling area for each vCPU in the guest.
> 
>> p16: Alignment for RCX is 8 bytes, but alignment for RDX in
>>        SVSM_CORE_CREATE_VCPU is 4K. Is that not the same calling area?
>>        Also, what is the use case for moving the calling area?
> 
> As the original author, maybe @Jon Lange can explain these requirements.
> 
>>
>> p17: The table links appear in a strange colour (some kind of weird cyan).
>>        It seems clickable too, so I suspect a hyperlink, but since the link is
>>        always on the same page, it's not super-useful.
> 
> I can remove that.
> 
>>
>> p18: Is the CAA seen as assigned to the SVSM? I believe the answer is
>> no
> 
> Correct, it is not.
> 
>> p18: For increased readability, I suggest naming the error codes for the
>>        SVSM_CORE_PVALIDATE call, and putting them in a table.
>>        Also, why tag the specific errors right after the architectural
>>        PVALIDATE errors? In case of architectural extension, you'd always get
>>        0x8000_1011, which is not super helpful. Instead, you could reserve
>>        0x8000_1xxx for protocol errors, and put PVALIDATE errors at
>>        0x9nnn_nnnn, which probably gives you enough room at least for the
>>        coming 6 months.
> 
> I'll look into this, no promises, though :)
> 
>> p18: "VMPL of the VCPU making the request": wouldn't it make sense to add a
>>        VMPL field in the PVALIDATE operation descriptor, so that the guest
>>        could control less-privileged VMPLs?
> 
> After the PVALIDATE, the guest can use the RMPADJUST instruction for adjusting page permissions for less-privileged VMPLs.
> 
>> p18: As indicated earlier, I'm confused by the 4K alginment requirement for
>>        RDX (Calling Area gPA)
> 
> Lets wait for the response from @Jon Lange.
> 
>> p18: What is the APIC ID of the vCPU used for? I see no mention in the
>>        explanatory text. Is that an internal index, or does the SVSM
>>        implementation need it for some reason (I was not clever enough to
>>        imagine why)
> 
> It's needed to ensure we are using the correct VMSA and CAA for a specific vCPU. Since the BSP creates all of the OS APs, were it to not supply the APIC ID when calling the CREATE_VCPU call, the SVSM would not have a way to associate the VMSA and CAA being created with a vCPU.
> 
>>
>> p19: What specification defines FAIL_INUSE? Add xref?
> 
> That's the FAIL_INUSE error code, 3, of the RMPUPDATE instruction. The
> PSMASH, PVALIDATE and RMPADJUST instructions share the same return codes,
> where applicable.
> 
>>
>> p20: Evidently, SVSM_CORE_DEPOSIT_MEM is intended to be used when another
>>        service returns 0x4mmm_mmmm. However, in the presence of a flag
>>        indicating "I may no longer need this memory", and given the
>>        limitations "cannot cross a page", I am concerned about possible lack
>>        of forward progress if two vCPUs start parallel operations where one
>>        vCPU says "Hey, I need X terabytes of RAM to do that" (which will then
>>        be split into umteen DEPOSIT_MEM calls due to page limit), while
>>        another says "Hey guys, I'm done" and sets the MEM_AVAILABLE flag. I
>>        see nothing in the spec that would prevent the second CPU from actively
>>        withdrawing the memory that the first one is trying to deposit. I think
>>        that the spec should clarify a forward-progress logic that prevents
>>        that from happening.
>> p20: It would be interesting to have at least a vague idea of what
>>        operations can actually request more memory, just to set expectations.
>> p20: Suppose that CREATE_VCPU requests more memory. It has no obvious
>>        "restart" field, unlike things like PVALIDATE. That means that there
>>        should be a rather strong guarantee that all SVSM calls that can
>>        potentially return 0x4mmm_mmmm either have no effect when they return
>>        such a request, or are idempotent if called again after providing more
>>        memory.
>>
>> p21: The MEM_AVAILABLE flag is set in the calling area of the startup vCPU.
>>        Is there any requirement that WITHDRAW_MEM should only be called from
>>        the startup vCPU, or from only one vCPU at a time?
>>
>> p22: The writable area ends at a page boundary. What could be a vald
>>        rationale for setting the RCX pointer in the middle of a page?
>>        Maybe simpler to require that the pointer be page-aligned than have a
>>        spec that mentions page offset 0xFF8 as a special case... > p22: Rationale for not returning incomplete? I'm trying to see how the
> guest
>>        could efficiently let secondary vCPUs withdraw memory with the protocol
>>        as specified, without a little additional wording regarding either the
>>        memory semantics of the MEM_AVAILABLE flag, and telling if there is
>>        indeed more work to be done by this vCPU using SVSM_ERR_INCOMPLETE.
> 
> Let's see what @Jon Lange's view on the p20 - p22 comments.
> 
>>
>> p23: Table 9 is mnissing RDX, R8 and R9 rows (as input for configuration)
>> p23: Table 9 title should be "configuration or query"
>> p23: RCX result is the same for query and configuration, or is that for
>>        query only?
> 
> I'll update the table to make everything more clear.
> 
>> p23: At the specified RIP. If Bit 3 is not set in the configuration case,
>>        does it return after the VMGEXIT?
> 
> Yes, it will return to the current value of RIP in the VMSA.
> 
>>
>> p29: Table 14: How can RAX be used as command ordinal and command response
>>        size if it's already used for call identifier / result value?
> 
> Typo, that should have been 0x000 (since the column is Byte Offset).
> 
>>
>> p30: Why would you need 4 bytes for the TPM command ordinal? This causes the
>>        TPM command size to be misaligned. What about 2 bytes for command
>>        ordinal, one byte for locality, and one reserved byte?
> 
> This is following the MS simulator protocol for the TPM 2.0 reference
> implementation as proposed by @James Bottomley. He also submitted an RFC
> TPM platform driver implementation to the linux-coco list following this
> protocol. It would be best to involve him in the discussion.
> 
>>
>>
>> (*) Like most readers of this document, I know what it means, but since you
>> defined VMM just above, or VM the page before, I interpreted your intent to
>> be that every acronym should be defined on first use.
>>
>> (1) As I am writing this, I have a doubt what happens if the host writes to
>>       the secrets page, and can't verify easily without a network.
> 
> The host can't write to the secrets page without first performing an
> RMPUPDATE to make it a hypervisor page. If it then were to modify the page
> and then re-assign it to the guest, the guest would receive a #VC when
> accessing the page because it is no longer validated, allowing the guest
> to detect the condition and terminate.
> 
> Thanks for the feedback!
> 
> Tom
> 
>>
>>>
>>> [2. application/pdf; 58019-Secure_VM_Service_Module_Specification.pdf]...
>>
>>
>> --
>> Cheers,
>> Christophe de Dinechin (https://c3d.github.io/
>> Theory of Incomplete Measurements (https://c3d.github.io/TIM
>>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-12 15:24       ` James Bottomley
@ 2023-01-13 16:12         ` Tom Lendacky
  0 siblings, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-13 16:12 UTC (permalink / raw)
  To: James Bottomley, Christophe de Dinechin; +Cc: linux-coco, amd-sev-snp

On 1/12/23 09:24, James Bottomley wrote:
> On Thu, 2023-01-12 at 09:13 -0600, Tom Lendacky wrote:
>> On 1/12/23 07:57, James Bottomley wrote:
>>> On Wed, 2023-01-11 at 17:39 +0100, Christophe de Dinechin wrote:
>>> [...]
>>>> p29: Table 14: How can RAX be used as command ordinal and command
>>>> response size if it's already used for call identifier / result
>>>> value?
>>>
>>> I think there's a bug on the first table: in the prototype we use
>>> RAX [in] (not RCX [in]) for the pointer to the request/response
>>> buffer and RAX [out] as the SVSM return code.
>>
>> No, it's not a bug. You'll need to use RCX as the pointer to the
>> request/response buffer because RAX, per calling convention, contains
>> the protocol id and call id.
> 
> Oh, right, I should have looked at the code first before commenting we
> do indeed use RCX for the shared buffer.
> 
> Do you have a timetable for updating the linux-svsm github for the new
> protocol based calling convention?  We'd like to rebase the new vTPM
> code off it when it's available.

It would mean adding support for at least the attestation protocol and 
encryption support to perform that call. I don't have a timetable at the 
moment.

Thanks,
Tom

> 
> James
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-12  8:19 ` Dov Murik
  2023-01-12 12:18   ` James Bottomley
@ 2023-01-13 16:16   ` Tom Lendacky
  1 sibling, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-13 16:16 UTC (permalink / raw)
  To: Dov Murik, linux-coco, amd-sev-snp, James Bottomley

On 1/12/23 02:19, Dov Murik wrote:
> Thanks Tom for these additions.
> 
> On 10/01/2023 20:54, Tom Lendacky wrote:
>> Attached is an updated draft version of the SVSM specification with
>> added support for an attestation protocol and a vTPM protocol as well as
>> other miscellaneous changes (all identified by change bar). Please take
>> a look and reply with any feedback you may have.
>>
> 
> Few comments/questions:
> 
> Page 25: 7.1 SVSM_ATTEST_SERVICES
> 
> 1. Should we add two fields in Table 11 for certs_addr and certs_len? If
> certs_len input is > 0, then SVSM will perform guest_ext_request and
> retrieve the host/VMM certs into certs_addr.  Provided certs_len must be
> 4KB-aligned.  Not sure about certs_addr.
> 
> If the size of the certs exceeds the size of the supplied certs buffer,
> R8 will be set to the size of the certs (in bytes) and the call will
> return SVSM_ERR_INVALID_PARAMETER.

Yes, I forgot about the extended guest request comments in the other 
thread, I'll add those.

> 
> 2. Table 11: The 'Attestation report buffer gPA': should have enough
> space to hold the resulting SNP attestation report (0x500 bytes).

I'll add some language to the spec to indicate that it is assumed the full 
page is available to the SVSM.

> 
> (I assume the implementation will ask the PSP to generate the report
> into an SVSM-HV shared page, and then copy the result to the
> caller-provided buffer in guest private memory.  So the caller doesn't
> need to worry about page alignment/crossing?)

Correct.

> 
> 3. Consider stating that the SNP attestation report is always generated
> with MSG_REPORT_REQ.VMPL=0.

Will do.

> 
> 4. Services Manifest (page 26): Can we require that the same SVSM source
> code will produce the same (binary) manifest buffer (in different VMs)?
> For example, at which order do the entries appear? I think James was
> aiming at durable/repeatable attestations.

Yes, we can do that.

> 
> 
> 
> Page 28: vTPM Protocol
> 
> 5. Will the SVSM update any PCRs on its own? For example, will it
> measure the content of OVMF into PCR0?  Or are relying on the SNP
> launch-update measurement (which currently includes both SVSM and OVMF)
> to attest that part of the guest?

The latter.

Thanks for the feedback!

Tom

> 
> 
> -Dov

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-13 11:50 ` Nicolai Stange
@ 2023-01-13 17:20   ` Tom Lendacky
  0 siblings, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-01-13 17:20 UTC (permalink / raw)
  To: Nicolai Stange; +Cc: linux-coco, amd-sev-snp

On 1/13/23 05:50, Nicolai Stange wrote:
> Hi Tom,
> 
> Tom Lendacky <thomas.lendacky@amd.com> writes:
> 
>> Please take a look and reply with any feedback you may have.
> 
> Perhaps I simply missed it, but the available SVSM_VTPM_CMD response
> buffer size seems to remain kind of unspecified. That is, the proposal
> from [1] was to just require a complete page for the buffer, but I can't
> find that explicitly stated anywhere (except for the required alignment
> of %rcx giving a hint).

I'll add some language to the specification that it is expected that the 
request/response buffer is assumed to be large enough to hold the request 
and response of the specified vTPM command and will be treated as 
contiguous if spanning multiple pages.

> 
> 
> For the table on p. 28 in sec. 8.1 "SVSM_VTPM_QUERY Call", the
> "Supported vTPM features" are meant get returned in %rdx, not %rcx, I
> think.

Yep, already taken care of.

> 
> 
> And finally a question re the addition on p.9 ("Scope of the document"),
> which reads as
>   "Items measured at VMLP1+
>    o Firmware binary
>   "
> 
> In light of the sentence immediately preceeding the above and explicitly
> stating that the svsm would get measured as part of the initial image,
> does this conversely imply that the firmware binary would typically not
> get measured as part of the initial guest image? I.e. that it would get
> loaded with PAGE_TYPE_UNMEASURED? If so the above could be read as if
> the firmware was supposed to measure itself at VMPL1. I think that's not
> what's being meant here, but the wording is a bit misleading IMO.

Right, the firmware is expected to be loaded and measured as part of the 
initial guest image. I'll update the language in the specification to 
clarify that.

Thanks for the feedback!

Tom

> 
> Thanks!
> 
> Nicolai
> 
> [1] https://lore.kernel.org/linux-coco/b488a79617beed8913df61186e8e263c40f2330b.camel@linux.ibm.com/
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
                   ` (5 preceding siblings ...)
  2023-01-13 11:50 ` Nicolai Stange
@ 2023-01-24  9:35 ` Jörg Rödel
  2023-01-26 14:36   ` Tom Lendacky
  2023-02-01 10:50   ` Jörg Rödel
  2023-01-24  9:45 ` Jörg Rödel
  7 siblings, 2 replies; 48+ messages in thread
From: Jörg Rödel @ 2023-01-24  9:35 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-coco, amd-sev-snp

Hi,

On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with added
> support for an attestation protocol and a vTPM protocol as well as other
> miscellaneous changes (all identified by change bar). Please take a look and
> reply with any feedback you may have.

Thanks for putting this together, Tom! I think the review comments
which have been posted cover a good amount of improvements, but I'd like
to propose another addition:

It would be great if we have an equivalent to EBUSY in the return codes
to the guest. Something like SVSM_ERR_BUSY or SVSM_ERR_AGAIN, which
tells the guest that some resources needed to fulfill the request are
currently in-use and that the guest should try again later.

The reasoning here is that in a setup with multiple VCPUs one CPU does a
call to the SVSM which can take some time to complete (e.g. asking vTPM
to generate an RSA key) and then another VCPU comes along with a second
request to the vTPM. In that case the SVSM would have to busy-wait until
the other request is finished. I think it would be better to return to
the guest in this situation and try again later.

Thoughts?

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
                   ` (6 preceding siblings ...)
  2023-01-24  9:35 ` Jörg Rödel
@ 2023-01-24  9:45 ` Jörg Rödel
  2023-01-26 14:51   ` Tom Lendacky
  7 siblings, 1 reply; 48+ messages in thread
From: Jörg Rödel @ 2023-01-24  9:45 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-coco, amd-sev-snp

On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
> Attached is an updated draft version of the SVSM specification with added
> support for an attestation protocol and a vTPM protocol as well as other
> miscellaneous changes (all identified by change bar). Please take a look and
> reply with any feedback you may have.

Another addition I'd like to propose:

It would be nice to have a protocol number reserved for implementation
specific requests. The protocol number only defines one standardized
request to identify the specific SVSM implemenation in use. Other
requests are implementation specific and can be used to manage the SVSM
from the guest.

One use-case would be, for example, to read the SVSM log buffer from the
guest side, but depending on the implementation there could be more
requests defined.

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-24  9:35 ` Jörg Rödel
@ 2023-01-26 14:36   ` Tom Lendacky
  2023-01-26 16:45     ` Christophe de Dinechin Dupont de Dinechin
  2023-02-01 10:50   ` Jörg Rödel
  1 sibling, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-26 14:36 UTC (permalink / raw)
  To: Jörg Rödel; +Cc: linux-coco, amd-sev-snp

On 1/24/23 03:35, Jörg Rödel wrote:
> Hi,
> 
> On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
>> Attached is an updated draft version of the SVSM specification with added
>> support for an attestation protocol and a vTPM protocol as well as other
>> miscellaneous changes (all identified by change bar). Please take a look and
>> reply with any feedback you may have.
> 
> Thanks for putting this together, Tom! I think the review comments
> which have been posted cover a good amount of improvements, but I'd like
> to propose another addition:
> 
> It would be great if we have an equivalent to EBUSY in the return codes
> to the guest. Something like SVSM_ERR_BUSY or SVSM_ERR_AGAIN, which
> tells the guest that some resources needed to fulfill the request are
> currently in-use and that the guest should try again later.
> 
> The reasoning here is that in a setup with multiple VCPUs one CPU does a
> call to the SVSM which can take some time to complete (e.g. asking vTPM
> to generate an RSA key) and then another VCPU comes along with a second
> request to the vTPM. In that case the SVSM would have to busy-wait until
> the other request is finished. I think it would be better to return to
> the guest in this situation and try again later.
> 
> Thoughts?

That certainly reasonable to add. I'll add SVSM_ERR_BUSY to the spec along 
with a statement that says the implementation of any protocol function can 
return this result code and, similar to another comment, that the function 
must be able to describe the progress made or be idempotent.

Does anyone have any concerns over this?

Thanks,
Tom

> 
> Regards,
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-24  9:45 ` Jörg Rödel
@ 2023-01-26 14:51   ` Tom Lendacky
  2023-01-26 16:49     ` Christophe de Dinechin Dupont de Dinechin
  0 siblings, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-26 14:51 UTC (permalink / raw)
  To: Jörg Rödel; +Cc: linux-coco, amd-sev-snp

On 1/24/23 03:45, Jörg Rödel wrote:
> On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
>> Attached is an updated draft version of the SVSM specification with added
>> support for an attestation protocol and a vTPM protocol as well as other
>> miscellaneous changes (all identified by change bar). Please take a look and
>> reply with any feedback you may have.
> 
> Another addition I'd like to propose:
> 
> It would be nice to have a protocol number reserved for implementation
> specific requests. The protocol number only defines one standardized
> request to identify the specific SVSM implemenation in use. Other
> requests are implementation specific and can be used to manage the SVSM
> from the guest.

Would returning an implementation GUID as part of SVSM_CORE_QUERY_PROTCOL 
or adding a SVSM_CORE_GET_IMPLEMENTATION call to the core protocol be 
better? Then we could reserve a range of protocols for use SVSM 
implementation specific protocols as opposed to just one. Since the 
protocol ID is 32-bits, maybe make 0x8000_0000 to 0x8FFF_FFFF be SVSM 
implementation specific.

Which would folks prefer? A new protocol to retrieve the implementation, 
modify SVSM_CORE_QUERY_PROTCOL or add SVSM_CORE_GET_IMPLEMENTATION to the 
core protocol?

Or any strong feelings about why this wouldn't be good?

Thanks,
Tom

> 
> One use-case would be, for example, to read the SVSM log buffer from the
> guest side, but depending on the implementation there could be more
> requests defined.
> 
> Regards,
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-26 14:36   ` Tom Lendacky
@ 2023-01-26 16:45     ` Christophe de Dinechin Dupont de Dinechin
  0 siblings, 0 replies; 48+ messages in thread
From: Christophe de Dinechin Dupont de Dinechin @ 2023-01-26 16:45 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: Jörg Rödel, linux-coco, amd-sev-snp



> On 26 Jan 2023, at 15:36, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> 
> On 1/24/23 03:35, Jörg Rödel wrote:
>> Hi,
>> On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
>>> Attached is an updated draft version of the SVSM specification with added
>>> support for an attestation protocol and a vTPM protocol as well as other
>>> miscellaneous changes (all identified by change bar). Please take a look and
>>> reply with any feedback you may have.
>> Thanks for putting this together, Tom! I think the review comments
>> which have been posted cover a good amount of improvements, but I'd like
>> to propose another addition:
>> It would be great if we have an equivalent to EBUSY in the return codes
>> to the guest. Something like SVSM_ERR_BUSY or SVSM_ERR_AGAIN, which
>> tells the guest that some resources needed to fulfill the request are
>> currently in-use and that the guest should try again later.
>> The reasoning here is that in a setup with multiple VCPUs one CPU does a
>> call to the SVSM which can take some time to complete (e.g. asking vTPM
>> to generate an RSA key) and then another VCPU comes along with a second
>> request to the vTPM. In that case the SVSM would have to busy-wait until
>> the other request is finished. I think it would be better to return to
>> the guest in this situation and try again later.
>> Thoughts?
> 
> That certainly reasonable to add. I'll add SVSM_ERR_BUSY to the spec along with a statement that says the implementation of any protocol function can return this result code and, similar to another comment, that the function must be able to describe the progress made or be idempotent.
> 
> Does anyone have any concerns over this?

Personally, I would welcome the addition. That would alleviate at least
one of my earlier concerns regarding multi-VCPU implementations.
Specifically that would make it somewhat easier for the caller to
detect contention, as opposed to having to try and predict it.

> 
> Thanks,
> Tom
> 
>> Regards,



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-26 14:51   ` Tom Lendacky
@ 2023-01-26 16:49     ` Christophe de Dinechin Dupont de Dinechin
  2023-01-26 17:33       ` [EXTERNAL] " Jon Lange
  0 siblings, 1 reply; 48+ messages in thread
From: Christophe de Dinechin Dupont de Dinechin @ 2023-01-26 16:49 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: Jörg Rödel, linux-coco, amd-sev-snp



> On 26 Jan 2023, at 15:51, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> 
> On 1/24/23 03:45, Jörg Rödel wrote:
>> On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
>>> Attached is an updated draft version of the SVSM specification with added
>>> support for an attestation protocol and a vTPM protocol as well as other
>>> miscellaneous changes (all identified by change bar). Please take a look and
>>> reply with any feedback you may have.
>> Another addition I'd like to propose:
>> It would be nice to have a protocol number reserved for implementation
>> specific requests. The protocol number only defines one standardized
>> request to identify the specific SVSM implemenation in use. Other
>> requests are implementation specific and can be used to manage the SVSM
>> from the guest.
> 
> Would returning an implementation GUID as part of SVSM_CORE_QUERY_PROTCOL or adding a SVSM_CORE_GET_IMPLEMENTATION call to the core protocol be better? Then we could reserve a range of protocols for use SVSM implementation specific protocols as opposed to just one. Since the protocol ID is 32-bits, maybe make 0x8000_0000 to 0x8FFF_FFFF be SVSM implementation specific.
> 
> Which would folks prefer? A new protocol to retrieve the implementation, modify SVSM_CORE_QUERY_PROTCOL or add SVSM_CORE_GET_IMPLEMENTATION to the core protocol?
> 
> Or any strong feelings about why this wouldn't be good?

Intuitively, I have a slight preference for the reserved range. It is the simplest to understand, has a larger potential for an implementation encoding some info in the protocol, and also means that testing for it can be static (on both sides).

> 
> Thanks,
> Tom
> 
>> One use-case would be, for example, to read the SVSM log buffer from the
>> guest side, but depending on the implementation there could be more
>> requests defined.
>> Regards,



^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-26 16:49     ` Christophe de Dinechin Dupont de Dinechin
@ 2023-01-26 17:33       ` Jon Lange
  2023-01-27  8:35         ` Jörg Rödel
  0 siblings, 1 reply; 48+ messages in thread
From: Jon Lange @ 2023-01-26 17:33 UTC (permalink / raw)
  To: Christophe de Dinechin Dupont de Dinechin, Tom Lendacky
  Cc: linux-coco, amd-sev-snp

One of the design goals of SVSM was to maximize the compatibility between all guest OSes and all SVSM implementations.  The idea that there are SVSM-specific protocols seems, in general, to be directly contradictory to this goal.  Why would it be desirable for a guest to have a conversation with its SVSM that is specific to the architecture of that SVSM?  I would think it far superior to define every sort of interaction as a generic contract that could be supported by every SVSM implementation to maximize compatibility in accordance with the stated goal.

Put another way, seeing any upstream implementation that provides functionality that is complete with SVSM A but which cannot be achieved with SVSM B should not be viewed as a desirable feature of the protocol. 

Attestation itself is a generic concept that should be applicable to every SVSM.  Why would anything related to attestation be implemented as specific to a single SVSM architecture?

-Jon

-----Original Message-----
From: AMD-SEV-SNP <amd-sev-snp-bounces+jlange=microsoft.com@lists.suse.com> On Behalf Of Christophe de Dinechin Dupont de Dinechin
Sent: Thursday, January 26, 2023 8:49 AM
To: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-coco@lists.linux.dev; amd-sev-snp@lists.suse.com
Subject: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60



> On 26 Jan 2023, at 15:51, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> 
> On 1/24/23 03:45, Jörg Rödel wrote:
>> On Tue, Jan 10, 2023 at 12:54:27PM -0600, Tom Lendacky wrote:
>>> Attached is an updated draft version of the SVSM specification with 
>>> added support for an attestation protocol and a vTPM protocol as 
>>> well as other miscellaneous changes (all identified by change bar). 
>>> Please take a look and reply with any feedback you may have.
>> Another addition I'd like to propose:
>> It would be nice to have a protocol number reserved for 
>> implementation specific requests. The protocol number only defines 
>> one standardized request to identify the specific SVSM implemenation 
>> in use. Other requests are implementation specific and can be used to 
>> manage the SVSM from the guest.
> 
> Would returning an implementation GUID as part of SVSM_CORE_QUERY_PROTCOL or adding a SVSM_CORE_GET_IMPLEMENTATION call to the core protocol be better? Then we could reserve a range of protocols for use SVSM implementation specific protocols as opposed to just one. Since the protocol ID is 32-bits, maybe make 0x8000_0000 to 0x8FFF_FFFF be SVSM implementation specific.
> 
> Which would folks prefer? A new protocol to retrieve the implementation, modify SVSM_CORE_QUERY_PROTCOL or add SVSM_CORE_GET_IMPLEMENTATION to the core protocol?
> 
> Or any strong feelings about why this wouldn't be good?

Intuitively, I have a slight preference for the reserved range. It is the simplest to understand, has a larger potential for an implementation encoding some info in the protocol, and also means that testing for it can be static (on both sides).

> 
> Thanks,
> Tom
> 
>> One use-case would be, for example, to read the SVSM log buffer from 
>> the guest side, but depending on the implementation there could be 
>> more requests defined.
>> Regards,



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-26 17:33       ` [EXTERNAL] " Jon Lange
@ 2023-01-27  8:35         ` Jörg Rödel
  2023-01-27 16:11           ` Jon Lange
  0 siblings, 1 reply; 48+ messages in thread
From: Jörg Rödel @ 2023-01-27  8:35 UTC (permalink / raw)
  To: Jon Lange
  Cc: Christophe de Dinechin Dupont de Dinechin, Tom Lendacky,
	linux-coco, amd-sev-snp

Hi Jon,

On Thu, Jan 26, 2023 at 05:33:54PM +0000, Jon Lange wrote:
> One of the design goals of SVSM was to maximize the compatibility
> between all guest OSes and all SVSM implementations.  The idea that
> there are SVSM-specific protocols seems, in general, to be directly
> contradictory to this goal.  Why would it be desirable for a guest to
> have a conversation with its SVSM that is specific to the architecture
> of that SVSM?  I would think it far superior to define every sort of
> interaction as a generic contract that could be supported by every
> SVSM implementation to maximize compatibility in accordance with the
> stated goal.

I agree in general that the SVSM implementations need to be compatible in
their guest-side interface and that a guest should not need to care which
SVSM implementation it is running on.

But I see still a need for implementation specific commands, which will
allow the guest owner to get information about the SVSM and which could
help with debugging in case of problems. These commands are tied to the
architecture of the underlying SVSM implementation and can not be
generic.

For example, an SVSM implementation can have one or more log buffers, it
can have a trace buffer (or not). Another SVSM might even allow the guest
OS to change some configuration data. In that area I see a clear need
for implemenation specific commands.

It probably makes sense to specify them in a way which allows the guest
OS to have a generic char driver to send implemenation specific commands
to every SVSM. On the guest OS side per-SVSM user-space tools can be used
with that char device.

> Put another way, seeing any upstream implementation that provides
> functionality that is complete with SVSM A but which cannot be
> achieved with SVSM B should not be viewed as a desirable feature of
> the protocol.

Implementation specific commands can also be used as a playground to
experiment with features that later can become part of the standard.

> Attestation itself is a generic concept that should be applicable to
> every SVSM.  Why would anything related to attestation be implemented
> as specific to a single SVSM architecture?

I agree that things like attestation and vTPM must be part of the
generic protocol.

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-27  8:35         ` Jörg Rödel
@ 2023-01-27 16:11           ` Jon Lange
  2023-01-30 11:29             ` Jörg Rödel
  0 siblings, 1 reply; 48+ messages in thread
From: Jon Lange @ 2023-01-27 16:11 UTC (permalink / raw)
  To: Jörg Rödel
  Cc: Christophe de Dinechin Dupont de Dinechin, Tom Lendacky,
	linux-coco, amd-sev-snp

-----Original Message-----
> From: Jörg Rödel <jroedel@suse.de> 
> Sent: Friday, January 27, 2023 12:35 AM
> Subject: Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60

Joerg, thanks for your thoughts.  

> But I see still a need for implementation specific commands, which will
> allow the guest owner to get information about the SVSM and which could
> help with debugging in case of problems. These commands are tied to the
> architecture of the underlying SVSM implementation and can not be generic.

I agree that some amount of implementation-specific SVSM communication may be unavoidable, but for the sake of a robust standard, this position should be accepted reluctantly rather than being embraced.  As long as the crutch of implementation-specific calls is available, it will be too easy for contributors to classify new features in this way when doing so may be the easy way out.  It would be much better if the specification of an implementation-specific call were the harder path, so that the default option would always be in the direction of standardization.

> For example, an SVSM implementation can have one or more log buffers,
> it can have a trace buffer (or not). Another SVSM might even allow the guest
> OS to change some configuration data. In that area I see a clear need for
> implemenation specific commands.

Logging is something that every SVSM implementation will want to be able to offer, and so the aspiration should be to design a standard log configuration and retrieval calling convention so that log management can be done in a consistent manner across all guests and across all SVSM implementations.  The reflex to call this "implementation-specific" is exactly the sort of deviation from a standard that worries me.

Perhaps not every SVSM will want to offer logging, at least not at first, and that's fine; those implementations can simply decline to offer the logging protocol.  But the key point here is that the set of features present in an SVSM should be at the discretion of the implementation - and may change over time - and must be discoverable in a standard way.  We want a design that lets SVSM implementors pick and choose the features that are most important to their users and not be forced into a position where they cannot offer certain features because of the underlying implementation platform.

> It probably makes sense to specify them in a way which allows the guest
> OS to have a generic char driver to send implementation specific commands
> to every SVSM. On the guest OS side per-SVSM user-space tools can be
> used with that char device.

Great idea, and one that can be standardized without having to couple it to a specific SVSM architecture.

> Implementation specific commands can also be used as a playground to
> experiment with features that later can become part of the standard.

Experimentation with new feature ideas should be done with an eye towards standardization.  Calling these "implementation-specific" during the prototyping phase just makes it harder to standardize later.  It does not seem difficult to allocate a new protocol number (or a new protocol version) every time a new feature is considered so that the prototyping can begin on a path towards standardization.  If the protocol registration process is unclear or not conducive to prototyping, then let's fix that instead of falling back to a model that diverges based on implementation.

Separately, I struggle to understand how a guest OS is supposed to know which SVSM implementation it's running with.  I don't recall a proposal to define a call to get the SVSM type, nor a proposal to define a registry of SVSM types.  Without such a mechanism, how could a guest have any idea which implementation-specific calls are expected to succeed?  Again, moving functionality into a standard calling convention avoids these questions, leaves protocol implementation choices in the hands of individual SVSM designers, and is fully enumerable and optional.  Shouldn't this be the primary design model for all SVSM interactions?

-Jon

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-27 16:11           ` Jon Lange
@ 2023-01-30 11:29             ` Jörg Rödel
  2023-01-31  4:44               ` Jon Lange
  0 siblings, 1 reply; 48+ messages in thread
From: Jörg Rödel @ 2023-01-30 11:29 UTC (permalink / raw)
  To: Jon Lange
  Cc: Christophe de Dinechin Dupont de Dinechin, Tom Lendacky,
	linux-coco, amd-sev-snp

Hi Jon,

On Fri, Jan 27, 2023 at 04:11:57PM +0000, Jon Lange wrote:
> I agree that some amount of implementation-specific SVSM communication
> may be unavoidable, but for the sake of a robust standard, this
> position should be accepted reluctantly rather than being embraced.

I totally agree with that. How about specifying that implementation
specific calls must be designed in a way that allows the guest OS to
treat them as optional. Or in other words, an SVSM must be able to run a
guest OS which does not use implementation specific call at all (and
even doesn't know about them)?

> As long as the crutch of implementation-specific calls is available,
> it will be too easy for contributors to classify new features
> in this way when doing so may be the easy way out.  It would
> be much better if the specification of an
> implementation-specific call were the harder path, so that the
> default option would always be in the direction of
> standardization.

Right, but I think this is hard to achieve. It depends on how the
standardization process works.

> Logging is something that every SVSM implementation will want to be
> able to offer, and so the aspiration should be to design a standard
> log configuration and retrieval calling convention so that log
> management can be done in a consistent manner across all guests and
> across all SVSM implementations.  The reflex to call this
> "implementation-specific" is exactly the sort of deviation from a
> standard that worries me.

On the other side, too much standardization on these matters could lock
down SVSM implementation details to a point that makes it hard to
innovate.

For example, consider these questions about logging alone:

	* Support one log or many?
	* In case of many logs, one per protocol? Or divide it
	  differently? Separate event and error logs?
	* How to enumerate the logs, by protocol number or by a name or
	  even by a GUID?

I think such questions are implementation specific, and it can be even
worse with other possible extensions, like tracing for example.

> Perhaps not every SVSM will want to offer logging, at least not at
> first, and that's fine; those implementations can simply decline to
> offer the logging protocol.  But the key point here is that the set of
> features present in an SVSM should be at the discretion of the
> implementation - and may change over time - and must be discoverable
> in a standard way.

I agree on the discoverability. There should be a standard way to
discover the specific SVSM implementation and the extensions it possibly
offers. The extension remain optional, of course.

> Separately, I struggle to understand how a guest OS is supposed to
> know which SVSM implementation it's running with.  I don't recall a
> proposal to define a call to get the SVSM type, nor a proposal to
> define a registry of SVSM types.  Without such a mechanism, how could
> a guest have any idea which implementation-specific calls are expected
> to succeed?  Again, moving functionality into a standard calling
> convention avoids these questions, leaves protocol implementation
> choices in the hands of individual SVSM designers, and is fully
> enumerable and optional.  Shouldn't this be the primary design model
> for all SVSM interactions?

As I said, the implementation specific parts must remain optional, but I
think the standard should leave room for implementers to have SVSM
specific calls without violating the standard.

As of discovery, this is what this sub-thread is about :) We can add a
call to the standard which allows to identify the SVSM implementation.

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-30 11:29             ` Jörg Rödel
@ 2023-01-31  4:44               ` Jon Lange
  2023-01-31 15:06                 ` Tom Lendacky
  2023-02-01 15:20                 ` [EXTERNAL] " Christophe de Dinechin Dupont de Dinechin
  0 siblings, 2 replies; 48+ messages in thread
From: Jon Lange @ 2023-01-31  4:44 UTC (permalink / raw)
  To: Jörg Rödel
  Cc: Christophe de Dinechin Dupont de Dinechin, Tom Lendacky,
	linux-coco, amd-sev-snp

Joerg, I'm encouraged to see that we agree about most of the points related to implementation-specific protocol extensions.  It seems like the places where we have yet to come together have less to do with how the standard is defined but about how to approach implementation choices.  That's not really something that can be legislated into a standard (as you suggest) so we'll just have to trust that the right thing happens.

As far as how to enumerate implementation-specific extensions, why not just embrace the core concept of the protocol's extensibility, which is that individual protocol numbers are optional?  SVSM implementation 1A could propose protocol number 0x123 is its own extensions, and SVSM implementation X could propose protocol number 0xABC as its extensions, etc.  This would eliminate any need to define any first-class concept of implementation identity (thus eliminating the need for an identity registry) and would also eliminate the question of deciding which commands within a given protocol are implemented by a given SVSM.  Instead, a guest OS could query for the implementation protocol it wants, and if it's present, it can use it, and if it is absent, it won't.  A given implementation could choose to lump all of its various implementation-specific extensions into a single protocol, or add a different protocol number for every set of extensions (one for logging, a separate one for other configuration - whatever).  With a 32-bit protocol ID space to choose from, I suspect we don't have to worry too much about running out of IDs quickly, as long as we can construct some logical defense for every separate protocol ID.  And, as a bonus, those individual "implementation-specific" constructs that are easily embraced by another implementation can easily transform into standards without requiring an all-or-nothing adoption of an implementation's complete extension set - from your earlier comments about prototyping, this seems like one of the extensibility features you were hoping to achieve.

-Jon

-----Original Message-----
From: Jörg Rödel <jroedel@suse.de> 
Sent: Monday, January 30, 2023 3:29 AM
To: Jon Lange <jlange@microsoft.com>
Cc: Christophe de Dinechin Dupont de Dinechin <cdupontd@redhat.com>; Tom Lendacky <thomas.lendacky@amd.com>; linux-coco@lists.linux.dev; amd-sev-snp@lists.suse.com
Subject: Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60

Hi Jon,

On Fri, Jan 27, 2023 at 04:11:57PM +0000, Jon Lange wrote:
> I agree that some amount of implementation-specific SVSM communication
> may be unavoidable, but for the sake of a robust standard, this
> position should be accepted reluctantly rather than being embraced.

I totally agree with that. How about specifying that implementation
specific calls must be designed in a way that allows the guest OS to
treat them as optional. Or in other words, an SVSM must be able to run a
guest OS which does not use implementation specific call at all (and
even doesn't know about them)?

> As long as the crutch of implementation-specific calls is available,
> it will be too easy for contributors to classify new features
> in this way when doing so may be the easy way out.  It would
> be much better if the specification of an
> implementation-specific call were the harder path, so that the
> default option would always be in the direction of
> standardization.

Right, but I think this is hard to achieve. It depends on how the
standardization process works.

> Logging is something that every SVSM implementation will want to be
> able to offer, and so the aspiration should be to design a standard
> log configuration and retrieval calling convention so that log
> management can be done in a consistent manner across all guests and
> across all SVSM implementations.  The reflex to call this
> "implementation-specific" is exactly the sort of deviation from a
> standard that worries me.

On the other side, too much standardization on these matters could lock
down SVSM implementation details to a point that makes it hard to
innovate.

For example, consider these questions about logging alone:

	* Support one log or many?
	* In case of many logs, one per protocol? Or divide it
	  differently? Separate event and error logs?
	* How to enumerate the logs, by protocol number or by a name or
	  even by a GUID?

I think such questions are implementation specific, and it can be even
worse with other possible extensions, like tracing for example.

> Perhaps not every SVSM will want to offer logging, at least not at
> first, and that's fine; those implementations can simply decline to
> offer the logging protocol.  But the key point here is that the set of
> features present in an SVSM should be at the discretion of the
> implementation - and may change over time - and must be discoverable
> in a standard way.

I agree on the discoverability. There should be a standard way to
discover the specific SVSM implementation and the extensions it possibly
offers. The extension remain optional, of course.

> Separately, I struggle to understand how a guest OS is supposed to
> know which SVSM implementation it's running with.  I don't recall a
> proposal to define a call to get the SVSM type, nor a proposal to
> define a registry of SVSM types.  Without such a mechanism, how could
> a guest have any idea which implementation-specific calls are expected
> to succeed?  Again, moving functionality into a standard calling
> convention avoids these questions, leaves protocol implementation
> choices in the hands of individual SVSM designers, and is fully
> enumerable and optional.  Shouldn't this be the primary design model
> for all SVSM interactions?

As I said, the implementation specific parts must remain optional, but I
think the standard should leave room for implementers to have SVSM
specific calls without violating the standard.

As of discovery, this is what this sub-thread is about :) We can add a
call to the standard which allows to identify the SVSM implementation.

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-31  4:44               ` Jon Lange
@ 2023-01-31 15:06                 ` Tom Lendacky
  2023-01-31 15:34                   ` Jon Lange
  2023-02-01 15:20                 ` [EXTERNAL] " Christophe de Dinechin Dupont de Dinechin
  1 sibling, 1 reply; 48+ messages in thread
From: Tom Lendacky @ 2023-01-31 15:06 UTC (permalink / raw)
  To: Jon Lange, Jörg Rödel
  Cc: Christophe de Dinechin Dupont de Dinechin, linux-coco, amd-sev-snp

On 1/30/23 22:44, Jon Lange wrote:
> Joerg, I'm encouraged to see that we agree about most of the points related to implementation-specific protocol extensions.  It seems like the places where we have yet to come together have less to do with how the standard is defined but about how to approach implementation choices.  That's not really something that can be legislated into a standard (as you suggest) so we'll just have to trust that the right thing happens.
> 
> As far as how to enumerate implementation-specific extensions, why not just embrace the core concept of the protocol's extensibility, which is that individual protocol numbers are optional?  SVSM implementation 1A could propose protocol number 0x123 is its own extensions, and SVSM implementation X could propose protocol number 0xABC as its extensions, etc.  This would eliminate any need to define any first-class concept of implementation identity (thus eliminating the need for an identity registry) and would also eliminate the question of deciding which commands within a given protocol are implemented by a given SVSM.  Instead, a guest OS could query for the implementation protocol it wants, and if it's present, it can use it, and if it is absent, it won't.  A given implementation could choose to lump all of its various implementation-specific extensions into a single protocol, or add a different protocol number for every set of extensions (one for logging, a separate one for other configuration - whatever).  With a 32-bit protocol ID space to choose from, I suspect we don't have to worry too much about running out of IDs quickly, as long as we can construct some logical defense for every separate protocol ID.  And, as a bonus, those individual "implementation-specific" constructs that are easily embraced by another implementation can easily transform into standards without requiring an all-or-nothing adoption of an implementation's complete extension set - from your earlier comments about prototyping, this seems like one of the extensibility features you were hoping to achieve.

I don't think the SVSM specification would be the place to reserve the 
protocol numbers per implementation. So you would need to still somehow 
coordinate on what implementations get what protocol range.

We could just add an optional "identity" protocol with a single call that 
returns a 16-byte GUID. There should be little to no chance that a GUID 
would be re-used. The GUID could be used by the guest to know what 
implementation is being used and what protocols may be available - the 
query call should still be used to determine their presence. Thoughts?

Thanks,
Tom

> 
> -Jon
> 
> -----Original Message-----
> From: Jörg Rödel <jroedel@suse.de>
> Sent: Monday, January 30, 2023 3:29 AM
> To: Jon Lange <jlange@microsoft.com>
> Cc: Christophe de Dinechin Dupont de Dinechin <cdupontd@redhat.com>; Tom Lendacky <thomas.lendacky@amd.com>; linux-coco@lists.linux.dev; amd-sev-snp@lists.suse.com
> Subject: Re: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
> 
> Hi Jon,
> 
> On Fri, Jan 27, 2023 at 04:11:57PM +0000, Jon Lange wrote:
>> I agree that some amount of implementation-specific SVSM communication
>> may be unavoidable, but for the sake of a robust standard, this
>> position should be accepted reluctantly rather than being embraced.
> 
> I totally agree with that. How about specifying that implementation
> specific calls must be designed in a way that allows the guest OS to
> treat them as optional. Or in other words, an SVSM must be able to run a
> guest OS which does not use implementation specific call at all (and
> even doesn't know about them)?
> 
>> As long as the crutch of implementation-specific calls is available,
>> it will be too easy for contributors to classify new features
>> in this way when doing so may be the easy way out.  It would
>> be much better if the specification of an
>> implementation-specific call were the harder path, so that the
>> default option would always be in the direction of
>> standardization.
> 
> Right, but I think this is hard to achieve. It depends on how the
> standardization process works.
> 
>> Logging is something that every SVSM implementation will want to be
>> able to offer, and so the aspiration should be to design a standard
>> log configuration and retrieval calling convention so that log
>> management can be done in a consistent manner across all guests and
>> across all SVSM implementations.  The reflex to call this
>> "implementation-specific" is exactly the sort of deviation from a
>> standard that worries me.
> 
> On the other side, too much standardization on these matters could lock
> down SVSM implementation details to a point that makes it hard to
> innovate.
> 
> For example, consider these questions about logging alone:
> 
> 	* Support one log or many?
> 	* In case of many logs, one per protocol? Or divide it
> 	  differently? Separate event and error logs?
> 	* How to enumerate the logs, by protocol number or by a name or
> 	  even by a GUID?
> 
> I think such questions are implementation specific, and it can be even
> worse with other possible extensions, like tracing for example.
> 
>> Perhaps not every SVSM will want to offer logging, at least not at
>> first, and that's fine; those implementations can simply decline to
>> offer the logging protocol.  But the key point here is that the set of
>> features present in an SVSM should be at the discretion of the
>> implementation - and may change over time - and must be discoverable
>> in a standard way.
> 
> I agree on the discoverability. There should be a standard way to
> discover the specific SVSM implementation and the extensions it possibly
> offers. The extension remain optional, of course.
> 
>> Separately, I struggle to understand how a guest OS is supposed to
>> know which SVSM implementation it's running with.  I don't recall a
>> proposal to define a call to get the SVSM type, nor a proposal to
>> define a registry of SVSM types.  Without such a mechanism, how could
>> a guest have any idea which implementation-specific calls are expected
>> to succeed?  Again, moving functionality into a standard calling
>> convention avoids these questions, leaves protocol implementation
>> choices in the hands of individual SVSM designers, and is fully
>> enumerable and optional.  Shouldn't this be the primary design model
>> for all SVSM interactions?
> 
> As I said, the implementation specific parts must remain optional, but I
> think the standard should leave room for implementers to have SVSM
> specific calls without violating the standard.
> 
> As of discovery, this is what this sub-thread is about :) We can add a
> call to the standard which allows to identify the SVSM implementation.
> 
> Regards,
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [EXTERNAL] Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-31 15:06                 ` Tom Lendacky
@ 2023-01-31 15:34                   ` Jon Lange
  0 siblings, 0 replies; 48+ messages in thread
From: Jon Lange @ 2023-01-31 15:34 UTC (permalink / raw)
  To: Tom Lendacky, Jörg Rödel
  Cc: Christophe de Dinechin Dupont de Dinechin, linux-coco, amd-sev-snp

-----Original Message-----
From: Tom Lendacky <thomas.lendacky@amd.com> 
Sent: Tuesday, January 31, 2023 7:07 AM

> I don't think the SVSM specification would be the place to reserve the 
> protocol numbers per implementation. So you would need to still somehow 
> coordinate on what implementations get what protocol range.

Given the size of the protocol ID space, what if we just reserved a range of protocol numbers (say 64K) per known SVSM implementation for that implementation to use as it chooses?  This avoids the problem with a central registry.  We could start by saying that 0x40000000-0x4000FFFF are reserved for the AMD reference implementation, and then as more SVSM implementations come into existence, we could go through a registration process once for each to reserve another range.

> We could just add an optional "identity" protocol with a single call that 
> returns a 16-byte GUID. There should be little to no chance that a GUID 
> would be re-used. The GUID could be used by the guest to know what 
> implementation is being used and what protocols may be available - the 
> query call should still be used to determine their presence. Thoughts?

The downside to an identity-based scheme like this is the "negative implication" of identity: any declaration that "I am FOO" necessarily implies that "I do not support BAR's extensions" (since the meaning of the declaration "I am FOO" is specifically to say "I support FOO's extensions").  One of the stated desires, at least that I observed from the discussion, is the ability to evolve an implementation-specific detail into a standard when doing so is feasible across multiple implementations.  A GUID-based scheme like you suggest will necessarily constrain extensions into implementation-specific choices, which means that if "FOO function #3" becomes an attractive option for standardization, BAR will not be able to make the choice to implement it because BAR would have to declare itself as FOO in order to make "FOO #3" accessible - and that both makes all of the BAR extensions inaccessible, and requires BAR to implement all of the FOO extensions.  That may be OK if we know that 100% of extensions are implementation-specific 100% of the time, but I think that's contrary to the long-term desire.  Contrast this with the range-based scheme described above; if FOO #3 was enumerated as 0x40010003, then a guest could query for that interface and any implementation could choose to expose it regardless of whether it chooses to expose any other FOO-based extensions.

-Jon

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-01-24  9:35 ` Jörg Rödel
  2023-01-26 14:36   ` Tom Lendacky
@ 2023-02-01 10:50   ` Jörg Rödel
  2023-02-20 15:10     ` Tom Lendacky
  1 sibling, 1 reply; 48+ messages in thread
From: Jörg Rödel @ 2023-02-01 10:50 UTC (permalink / raw)
  To: Tom Lendacky; +Cc: linux-coco, amd-sev-snp

On Tue, Jan 24, 2023 at 10:35:50AM +0100, Jörg Rödel wrote:
> It would be great if we have an equivalent to EBUSY in the return codes
> to the guest. Something like SVSM_ERR_BUSY or SVSM_ERR_AGAIN, which
> tells the guest that some resources needed to fulfill the request are
> currently in-use and that the guest should try again later.

On a related issue, do we need an extra return code for the case that
the SVSM got a fault when trying to access VMPL1 memory? Request
processing can fault for various reasons when accessing addresses passed
in via requests, should this be reported a SVSM_ERR_INVALID_ADDRESS or
is another code needed?

Regards,

-- 
Jörg Rödel
jroedel@suse.de

SUSE Software Solutions Germany GmbH
Frankenstraße 146
90461 Nürnberg
Germany

(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [EXTERNAL] SVSM Attestation and vTPM specification additions - v0.60
  2023-01-31  4:44               ` Jon Lange
  2023-01-31 15:06                 ` Tom Lendacky
@ 2023-02-01 15:20                 ` Christophe de Dinechin Dupont de Dinechin
  2023-02-02  6:04                   ` Jon Lange
  1 sibling, 1 reply; 48+ messages in thread
From: Christophe de Dinechin Dupont de Dinechin @ 2023-02-01 15:20 UTC (permalink / raw)
  To: Jon Lange; +Cc: Jörg Rödel, Tom Lendacky, linux-coco, amd-sev-snp



> On 31 Jan 2023, at 05:44, Jon Lange <jlange@microsoft.com> wrote:
> 
> Joerg, I'm encouraged to see that we agree about most of the points related to implementation-specific protocol extensions.  It seems like the places where we have yet to come together have less to do with how the standard is defined but about how to approach implementation choices.  That's not really something that can be legislated into a standard (as you suggest) so we'll just have to trust that the right thing happens.
> 
> As far as how to enumerate implementation-specific extensions, why not just embrace the core concept of the protocol's extensibility, which is that individual protocol numbers are optional?  SVSM implementation 1A could propose protocol number 0x123 is its own extensions, and SVSM implementation X could propose protocol number 0xABC as its extensions, etc.  This would eliminate any need to define any first-class concept of implementation identity (thus eliminating the need for an identity registry) and would also eliminate the question of deciding which commands within a given protocol are implemented by a given SVSM.  Instead, a guest OS could query for the implementation protocol it wants, and if it's present, it can use it, and if it is absent, it won't.  A given implementation could choose to lump all of its various implementation-specific extensions into a single protocol, or add a different protocol number for every set of extensions (one for logging, a separate one for other configuration - whatever).  With a 32-bit protocol ID space to choose from, I suspect we don't have to worry too much about running out of IDs quickly, as long as we can construct some logical defense for every separate protocol ID.  And, as a bonus, those individual "implementation-specific" constructs that are easily embraced by another implementation can easily transform into standards without requiring an all-or-nothing adoption of an implementation's complete extension set - from your earlier comments about prototyping, this seems like one of the extensibility features you were hoping to achieve.

This is a common problem in the industry. For example, OpenGL had various extensions that could be queried, each exposing an interface (similar to the protocols here). C and C++ compilers exposed “working group” features or compiler-specific extensions in various ways.

In all cases, some extensions are meant to be and remain “implementation-dependent” whereas others may be work in progress with the intent to become part of the standard later. In OpenGL, GL_NV_draw_vulkan_image would be an example of the former, being specific to NVIDIA, whereas GL_ARB_arrays_of_arrays an example of the latter, ARB standing for Architecture Review Board). Another important aspect is to be able to query the available protocols in a portable way.

With IDs, I would split the ID space into three segments:
- The standard part, which we have today
- The experimental part, to become standard with possibly some changes
- The implementation-dependent part, never to be used by portable code

I must admit that I also thought of logging as an obvious extension to the current protocols, and I was also torn about how to standardize that. The risk of not having an “experimental” part to the ID space is that there will be a lot of discussion and bike shedding before we all agree on something. By contrast, pushing an experiment and having people test it should be comparatively painless.

I’m also considering future changes in the underlying hardware, exposing capabilities that may not exist today. One example that was discussed in another thread is the possibility, some day, to attest not just the guest code, but the host hypervisor and kernel as well. In the AMD case, I suspect this would involve having some of their code running at VMPL0, and maybe having a “VMPL1 SVSM” interface.


Cheers,
Christophe


^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [EXTERNAL] SVSM Attestation and vTPM specification additions - v0.60
  2023-02-01 15:20                 ` [EXTERNAL] " Christophe de Dinechin Dupont de Dinechin
@ 2023-02-02  6:04                   ` Jon Lange
  0 siblings, 0 replies; 48+ messages in thread
From: Jon Lange @ 2023-02-02  6:04 UTC (permalink / raw)
  To: Christophe de Dinechin Dupont de Dinechin
  Cc: Jörg Rödel, Tom Lendacky, linux-coco, amd-sev-snp

-----Original Message-----
From: Christophe de Dinechin Dupont de Dinechin <cdupontd@redhat.com> 
Sent: Wednesday, February 1, 2023 7:20 AM

> I must admit that I also thought of logging as an obvious extension to the
> current protocols, and I was also torn about how to standardize that. The
> risk of not having an “experimental” part to the ID space is that there will
> be a lot of discussion and bike shedding before we all agree on something.
> By contrast, pushing an experiment and having people test it should be
> comparatively painless.

The downside is that prototyping in the "experimental" part of the ID space means that once the prototype is ready to be declared "stable", then all upstream changes that have already committed to the "experimental" ID - in all projects - will have to be modified to switch to the new, permanent ID.  And while those changes are flowing, there will be incompatibilities until all components have migrated to the new IDs.  That sounds pretty painful.  Why not just assign every new feature its own proper ID and consume it from the beginning?  As long as it remains experimental, there should be an understanding that the specific call formats of that protocol may change, but once the dust settles, it would be desirable to not have to turn around and change everything one more time.

Put another way, the ID space is big enough that we can afford to prototype protocols that end up getting thrown away.  We shouldn't feel like registering a protocol ID is a promise that the interface is stable at the time the ID is assigned; we can easily decorate various specifications with enough information to know how much churn to expect before it can be broadly consumed.  It's just that when the protocol reaches that point, we won't the ultimate churn of changing IDs one last time since that can be so easily avoided.

-Jon

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: SVSM Attestation and vTPM specification additions - v0.60
  2023-02-01 10:50   ` Jörg Rödel
@ 2023-02-20 15:10     ` Tom Lendacky
  0 siblings, 0 replies; 48+ messages in thread
From: Tom Lendacky @ 2023-02-20 15:10 UTC (permalink / raw)
  To: Jörg Rödel; +Cc: linux-coco, amd-sev-snp



On 2/1/23 04:50, Jörg Rödel wrote:
> On Tue, Jan 24, 2023 at 10:35:50AM +0100, Jörg Rödel wrote:
>> It would be great if we have an equivalent to EBUSY in the return codes
>> to the guest. Something like SVSM_ERR_BUSY or SVSM_ERR_AGAIN, which
>> tells the guest that some resources needed to fulfill the request are
>> currently in-use and that the guest should try again later.
> 
> On a related issue, do we need an extra return code for the case that
> the SVSM got a fault when trying to access VMPL1 memory? Request
> processing can fault for various reasons when accessing addresses passed
> in via requests, should this be reported a SVSM_ERR_INVALID_ADDRESS or
> is another code needed?

I think using SVSM_ERR_INVALID_ADDRESS should be ok. I'll add something to 
the specification to state that any faults that occur when accessing guest 
memory should return SVSM_ERR_INVALID_ADDRESS.

Thanks,
Tom

> 
> Regards,
> 

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2023-02-20 15:10 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-10 18:54 SVSM Attestation and vTPM specification additions - v0.60 Tom Lendacky
2023-01-10 19:37 ` Tom Lendacky
2023-01-10 19:40 ` Dionna Amalie Glaze
2023-01-10 21:03   ` Tom Lendacky
2023-01-10 22:14     ` James Bottomley
2023-01-10 22:45       ` Tom Lendacky
2023-01-10 23:52         ` James Bottomley
2023-01-11  9:15           ` Christophe de Dinechin Dupont de Dinechin
2023-01-10 20:29 ` James Bottomley
2023-01-10 20:37   ` James Bottomley
2023-01-10 21:33     ` Tom Lendacky
2023-01-10 21:32   ` Tom Lendacky
2023-01-10 21:47     ` James Bottomley
2023-01-10 23:00       ` Tom Lendacky
2023-01-10 23:09         ` James Bottomley
2023-01-11 14:49           ` Tom Lendacky
2023-01-11 14:56             ` James Bottomley
2023-01-10 23:14         ` James Bottomley
2023-01-11 16:39 ` Christophe de Dinechin
2023-01-11 23:00   ` Tom Lendacky
2023-01-12  1:27     ` [EXTERNAL] " Jon Lange
2023-01-13 16:10       ` Tom Lendacky
2023-01-12 13:57   ` James Bottomley
2023-01-12 15:13     ` Tom Lendacky
2023-01-12 15:24       ` James Bottomley
2023-01-13 16:12         ` Tom Lendacky
2023-01-12  8:19 ` Dov Murik
2023-01-12 12:18   ` James Bottomley
2023-01-13 16:16   ` Tom Lendacky
2023-01-13 11:50 ` Nicolai Stange
2023-01-13 17:20   ` Tom Lendacky
2023-01-24  9:35 ` Jörg Rödel
2023-01-26 14:36   ` Tom Lendacky
2023-01-26 16:45     ` Christophe de Dinechin Dupont de Dinechin
2023-02-01 10:50   ` Jörg Rödel
2023-02-20 15:10     ` Tom Lendacky
2023-01-24  9:45 ` Jörg Rödel
2023-01-26 14:51   ` Tom Lendacky
2023-01-26 16:49     ` Christophe de Dinechin Dupont de Dinechin
2023-01-26 17:33       ` [EXTERNAL] " Jon Lange
2023-01-27  8:35         ` Jörg Rödel
2023-01-27 16:11           ` Jon Lange
2023-01-30 11:29             ` Jörg Rödel
2023-01-31  4:44               ` Jon Lange
2023-01-31 15:06                 ` Tom Lendacky
2023-01-31 15:34                   ` Jon Lange
2023-02-01 15:20                 ` [EXTERNAL] " Christophe de Dinechin Dupont de Dinechin
2023-02-02  6:04                   ` Jon Lange

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.