linux-coco.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86/tdx: Mark TSC reliable
@ 2023-08-08 16:23 Kirill A. Shutemov
  2023-08-08 17:13 ` Dave Hansen
  0 siblings, 1 reply; 16+ messages in thread
From: Kirill A. Shutemov @ 2023-08-08 16:23 UTC (permalink / raw)
  To: Thomas Gleixner, Dave Hansen, Borislav Petkov, Andy Lutomirski
  Cc: Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel, Kirill A. Shutemov

In x86 virtualization environments, including TDX, RDTSC instruction is
handled without causing a VM exit, resulting in minimal overhead and
jitters. On the other hand, other clock sources (such as HPET, ACPI
timer, APIC, etc.) necessitate VM exits to implement, resulting in more
fluctuating measurements compared to TSC. Thus, those clock sources are
not effective for calibrating TSC.

In TD guests, TSC is virtualized by the TDX module, which ensures:

  - Virtual TSC values are consistent among all the TD’s VCPUs;
  - Monotonously incrementing for any single VCPU;
  - The frequency is determined by TD configuration. The host TSC is
    invariant on platforms where TDX is available.

Use TSC as the only reliable clock source in TD guests, bypassing
unstable calibration.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/coco/tdx/tdx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..1583ec64d92e 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -769,6 +769,9 @@ void __init tdx_early_init(void)
 
 	setup_force_cpu_cap(X86_FEATURE_TDX_GUEST);
 
+	/* TSC is the only reliable clock in TDX guest */
+	setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
+
 	cc_vendor = CC_VENDOR_INTEL;
 	tdx_parse_tdinfo(&cc_mask);
 	cc_set_mask(cc_mask);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-08 16:23 [PATCH] x86/tdx: Mark TSC reliable Kirill A. Shutemov
@ 2023-08-08 17:13 ` Dave Hansen
  2023-08-08 20:01   ` Kirill A. Shutemov
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Hansen @ 2023-08-08 17:13 UTC (permalink / raw)
  To: Kirill A. Shutemov, Thomas Gleixner, Borislav Petkov, Andy Lutomirski
  Cc: Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel

On 8/8/23 09:23, Kirill A. Shutemov wrote:
...
> On the other hand, other clock sources (such as HPET, ACPI timer,
> APIC, etc.) necessitate VM exits to implement, resulting in more 
> fluctuating measurements compared to TSC. Thus, those clock sources
> are not effective for calibrating TSC.

Do we need to do anything to _those_ to mark them as slightly stinky?

> In TD guests, TSC is virtualized by the TDX module, which ensures:
> 
>   - Virtual TSC values are consistent among all the TD’s VCPUs;
>   - Monotonously incrementing for any single VCPU;
>   - The frequency is determined by TD configuration. The host TSC is
>     invariant on platforms where TDX is available.

I take it this is carved in stone in the TDX specs somewhere.  A
reference would be nice.

We've got VMWare and Hyper-V code basically doing the same thing today.
So TDX is in kinda good company.  But this still makes me rather
nervous.  Do you have any encouraging words about how unlikely future
hardware is to screw this up, especially as TDX-supporting hardware gets
more diverse?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-08 17:13 ` Dave Hansen
@ 2023-08-08 20:01   ` Kirill A. Shutemov
  2023-08-09  5:44     ` Reshetova, Elena
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Kirill A. Shutemov @ 2023-08-08 20:01 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, Borislav Petkov, Andy Lutomirski,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel

On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> On 8/8/23 09:23, Kirill A. Shutemov wrote:
> ...
> > On the other hand, other clock sources (such as HPET, ACPI timer,
> > APIC, etc.) necessitate VM exits to implement, resulting in more 
> > fluctuating measurements compared to TSC. Thus, those clock sources
> > are not effective for calibrating TSC.
> 
> Do we need to do anything to _those_ to mark them as slightly stinky?

I don't know what the rules here. As far as I can see, all other clock
sources relevant for TDX guest have lower rating. I guess we are fine?

There's notable exception to the rating order is kvmclock which is higher
than tsc. It has to be disabled, but it is not clear to me how. This topic
is related to how we are going to filter allowed devices/drivers, so I
would postpone the decision until we settle on wider filtering schema.

> > In TD guests, TSC is virtualized by the TDX module, which ensures:
> > 
> >   - Virtual TSC values are consistent among all the TD’s VCPUs;
> >   - Monotonously incrementing for any single VCPU;
> >   - The frequency is determined by TD configuration. The host TSC is
> >     invariant on platforms where TDX is available.
> 
> I take it this is carved in stone in the TDX specs somewhere.  A
> reference would be nice.

TDX Module 1.0 spec:

	5.3.5. Time Stamp Counter (TSC)

	TDX provides a trusted virtual TSC to the guest TDs. TSC value is
	monotonously incrementing, starting from 0 on TD initialization by the
	host VMM. The deviation between virtual TSC values read by each VCPU is
	small.

	A guest TD should disable mechanisms that are used in non-trusted
	environment, which attempt to synchronize TSC between VCPUs, and should
	not revert to using untrusted time mechanisms.

...

	13.13.1. TSC Virtualization

	For virtual time stamp counter (TSC) values read by guest TDs, the Intel
	TDX module is designed to achieve the following:

	• Virtual TSC values are consistent among all the TD’s VCPUs at
	  the level supported by the CPU, see below.
	• The virtual TSC value for any single VCPU is monotonously
	  incrementing (except roll over from 264-1 to 0).
	• The virtual TSC frequency is determined by TD configuration.

...

> We've got VMWare and Hyper-V code basically doing the same thing today.
> So TDX is in kinda good company.  But this still makes me rather
> nervous.  Do you have any encouraging words about how unlikely future
> hardware is to screw this up, especially as TDX-supporting hardware gets
> more diverse?

Wording in the spec looks okay to me. We can only hope that implementation
going to be sane.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-08 20:01   ` Kirill A. Shutemov
@ 2023-08-09  5:44     ` Reshetova, Elena
  2023-08-09  6:13       ` Kirill A. Shutemov
  2023-08-24 15:49     ` Thomas Gleixner
  2023-08-24 19:31     ` Thomas Gleixner
  2 siblings, 1 reply; 16+ messages in thread
From: Reshetova, Elena @ 2023-08-09  5:44 UTC (permalink / raw)
  To: Kirill A. Shutemov, Hansen, Dave
  Cc: Thomas Gleixner, Borislav Petkov, Lutomirski, Andy,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, x86, linux-coco,
	linux-kernel


> On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> > On 8/8/23 09:23, Kirill A. Shutemov wrote:
> > ...
> > > On the other hand, other clock sources (such as HPET, ACPI timer,
> > > APIC, etc.) necessitate VM exits to implement, resulting in more
> > > fluctuating measurements compared to TSC. Thus, those clock sources
> > > are not effective for calibrating TSC.
> >
> > Do we need to do anything to _those_ to mark them as slightly stinky?

IMO from pure security pov yes. It would be good secure default that 
TDX guests (and other CoCo guests also) are using only trusted source time. 
There are issues with this though and would need to understand where
to draw the line. Things like hpet and such we hoped to disable via
device filtering. For some other time sources we
have used patches below. But then there are things like RTC that would
be great to disable also, but without a proper remote time server
that breaks any date/timing for the guest, so we have not done it
and probably should not by default, but we recommend not using it
in docs we have:
https://intel.github.io/ccc-linux-guest-hardening-docs/security-spec.html#tsc-and-other-timers

> 
> I don't know what the rules here. As far as I can see, all other clock
> sources relevant for TDX guest have lower rating. I guess we are fine?

What about acpi_pm? 
See this:
https://github.com/intel/tdx/commit/045692772ab4ef75062a83cc6e4ffa22cab40226

> 
> There's notable exception to the rating order is kvmclock which is higher
> than tsc. It has to be disabled, but it is not clear to me how. This topic
> is related to how we are going to filter allowed devices/drivers, so I
> would postpone the decision until we settle on wider filtering schema.

One option is to include "no-kvmclock" into kernel command line, which
is attested. Another option is to try to disable it explicitly, like we had
in past: 
https://github.com/intel/tdx/commit/6b0357f2115c1bdd158c0c8836f4f541517bf375

The obvious issues with command line is that it is going to 1) grow 
considerably if we try to disable everything we can via command line
and 2) there is a high chance that in practice people will not use secure default
and/or forget to verify the correct status of cmd line. But this is to be
expected I guess for any security method that involves attestation unfortunately.

Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-09  5:44     ` Reshetova, Elena
@ 2023-08-09  6:13       ` Kirill A. Shutemov
  2023-08-22 23:39         ` Erdem Aktas
  0 siblings, 1 reply; 16+ messages in thread
From: Kirill A. Shutemov @ 2023-08-09  6:13 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Hansen, Dave, Thomas Gleixner, Borislav Petkov, Lutomirski, Andy,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, x86, linux-coco,
	linux-kernel

On Wed, Aug 09, 2023 at 05:44:37AM +0000, Reshetova, Elena wrote:
> > 
> > I don't know what the rules here. As far as I can see, all other clock
> > sources relevant for TDX guest have lower rating. I guess we are fine?
> 
> What about acpi_pm? 
> See this:
> https://github.com/intel/tdx/commit/045692772ab4ef75062a83cc6e4ffa22cab40226

clocksource_acpi_pm.rating is 200 while TSC is 300.

> > There's notable exception to the rating order is kvmclock which is higher
> > than tsc. It has to be disabled, but it is not clear to me how. This topic
> > is related to how we are going to filter allowed devices/drivers, so I
> > would postpone the decision until we settle on wider filtering schema.
> 
> One option is to include "no-kvmclock" into kernel command line, which
> is attested. Another option is to try to disable it explicitly, like we had
> in past: 
> https://github.com/intel/tdx/commit/6b0357f2115c1bdd158c0c8836f4f541517bf375
> 
> The obvious issues with command line is that it is going to 1) grow 
> considerably if we try to disable everything we can via command line
> and 2) there is a high chance that in practice people will not use secure default
> and/or forget to verify the correct status of cmd line. But this is to be
> expected I guess for any security method that involves attestation unfortunately.

I guess command line is fine, until we have coherent solution on
filtering.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-09  6:13       ` Kirill A. Shutemov
@ 2023-08-22 23:39         ` Erdem Aktas
  0 siblings, 0 replies; 16+ messages in thread
From: Erdem Aktas @ 2023-08-22 23:39 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Reshetova, Elena, Hansen, Dave, Thomas Gleixner, Borislav Petkov,
	Lutomirski, Andy, Kuppuswamy Sathyanarayanan, Nakajima, Jun, x86,
	linux-coco, linux-kernel

Reviewed-by: Erdem Aktas <erdemaktas@google.com>

Thanks Kirill for this patch.  I think this patch is necessary to
prevent guest marking TSC as unreliable due to the possible
calibration failures at runtime. We also tested it.




On Tue, Aug 8, 2023 at 11:14 PM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Wed, Aug 09, 2023 at 05:44:37AM +0000, Reshetova, Elena wrote:
> > >
> > > I don't know what the rules here. As far as I can see, all other clock
> > > sources relevant for TDX guest have lower rating. I guess we are fine?
> >
> > What about acpi_pm?
> > See this:
> > https://github.com/intel/tdx/commit/045692772ab4ef75062a83cc6e4ffa22cab40226
>
> clocksource_acpi_pm.rating is 200 while TSC is 300.
>
> > > There's notable exception to the rating order is kvmclock which is higher
> > > than tsc. It has to be disabled, but it is not clear to me how. This topic
> > > is related to how we are going to filter allowed devices/drivers, so I
> > > would postpone the decision until we settle on wider filtering schema.
> >
> > One option is to include "no-kvmclock" into kernel command line, which
> > is attested. Another option is to try to disable it explicitly, like we had
> > in past:
> > https://github.com/intel/tdx/commit/6b0357f2115c1bdd158c0c8836f4f541517bf375
> >
> > The obvious issues with command line is that it is going to 1) grow
> > considerably if we try to disable everything we can via command line
> > and 2) there is a high chance that in practice people will not use secure default
> > and/or forget to verify the correct status of cmd line. But this is to be
> > expected I guess for any security method that involves attestation unfortunately.
>
> I guess command line is fine, until we have coherent solution on
> filtering.
>
> --
>   Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-08 20:01   ` Kirill A. Shutemov
  2023-08-09  5:44     ` Reshetova, Elena
@ 2023-08-24 15:49     ` Thomas Gleixner
  2023-08-25 13:52       ` Kirill A. Shutemov
  2023-08-24 19:31     ` Thomas Gleixner
  2 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2023-08-24 15:49 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Hansen
  Cc: Borislav Petkov, Andy Lutomirski, Kuppuswamy Sathyanarayanan,
	Elena Reshetova, Jun Nakajima, x86, linux-coco, linux-kernel

On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
> On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
>> I take it this is carved in stone in the TDX specs somewhere.  A
>> reference would be nice.
>
> TDX Module 1.0 spec:
>
> 	5.3.5. Time Stamp Counter (TSC)
>
> 	TDX provides a trusted virtual TSC to the guest TDs. TSC value is
> 	monotonously incrementing, starting from 0 on TD initialization by the
> 	host VMM. The deviation between virtual TSC values read by each VCPU is
> 	small.

Nice weasel wording. What's the definition of "small"?

Any OS needs a guarantee that vCPUs cannot observe time going backwards,
which is obviously possible when the deviation is not small enough.

> Wording in the spec looks okay to me. We can only hope that implementation
> going to be sane.

Hope dies last :)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-08 20:01   ` Kirill A. Shutemov
  2023-08-09  5:44     ` Reshetova, Elena
  2023-08-24 15:49     ` Thomas Gleixner
@ 2023-08-24 19:31     ` Thomas Gleixner
  2023-08-25 13:47       ` Kirill A. Shutemov
  2 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2023-08-24 19:31 UTC (permalink / raw)
  To: Kirill A. Shutemov, Dave Hansen
  Cc: Borislav Petkov, Andy Lutomirski, Kuppuswamy Sathyanarayanan,
	Elena Reshetova, Jun Nakajima, x86, linux-coco, linux-kernel,
	Paolo Bonzini

On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
> On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
>> On 8/8/23 09:23, Kirill A. Shutemov wrote:
>> ...
>> > On the other hand, other clock sources (such as HPET, ACPI timer,
>> > APIC, etc.) necessitate VM exits to implement, resulting in more 
>> > fluctuating measurements compared to TSC. Thus, those clock sources
>> > are not effective for calibrating TSC.
>> 
>> Do we need to do anything to _those_ to mark them as slightly stinky?
>
> I don't know what the rules here. As far as I can see, all other clock
> sources relevant for TDX guest have lower rating. I guess we are fine?

Ideally they are not enumerated in the first place, which prevents the
kernel from trying.

> There's notable exception to the rating order is kvmclock which is higher
> than tsc.

Which is silly aside of TDX.

> It has to be disabled, but it is not clear to me how. This topic
> is related to how we are going to filter allowed devices/drivers, so I
> would postpone the decision until we settle on wider filtering schema.

TDX aside it might be useful to have a mechanism to select TSC over KVM
clock in general.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-24 19:31     ` Thomas Gleixner
@ 2023-08-25 13:47       ` Kirill A. Shutemov
  2023-08-25 15:16         ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Kirill A. Shutemov @ 2023-08-25 13:47 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dave Hansen, Borislav Petkov, Andy Lutomirski,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel, Paolo Bonzini, Sean Christopherson

On Thu, Aug 24, 2023 at 09:31:39PM +0200, Thomas Gleixner wrote:
> On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
> > On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> >> On 8/8/23 09:23, Kirill A. Shutemov wrote:
> >> ...
> >> > On the other hand, other clock sources (such as HPET, ACPI timer,
> >> > APIC, etc.) necessitate VM exits to implement, resulting in more 
> >> > fluctuating measurements compared to TSC. Thus, those clock sources
> >> > are not effective for calibrating TSC.
> >> 
> >> Do we need to do anything to _those_ to mark them as slightly stinky?
> >
> > I don't know what the rules here. As far as I can see, all other clock
> > sources relevant for TDX guest have lower rating. I guess we are fine?
> 
> Ideally they are not enumerated in the first place, which prevents the
> kernel from trying.

We can ask QEMU/KVM not to advertise them to TDX guest, but guest has to
protect itself as the VMM is not trusted. And we are back to device
filtering...

> > There's notable exception to the rating order is kvmclock which is higher
> > than tsc.
> 
> Which is silly aside of TDX.
> 
> > It has to be disabled, but it is not clear to me how. This topic
> > is related to how we are going to filter allowed devices/drivers, so I
> > would postpone the decision until we settle on wider filtering schema.
> 
> TDX aside it might be useful to have a mechanism to select TSC over KVM
> clock in general.

Sean, Paolo, any comment on this?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-24 15:49     ` Thomas Gleixner
@ 2023-08-25 13:52       ` Kirill A. Shutemov
  2023-08-25 17:09         ` Thomas Gleixner
  0 siblings, 1 reply; 16+ messages in thread
From: Kirill A. Shutemov @ 2023-08-25 13:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dave Hansen, Borislav Petkov, Andy Lutomirski,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel

On Thu, Aug 24, 2023 at 05:49:05PM +0200, Thomas Gleixner wrote:
> On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
> > On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> >> I take it this is carved in stone in the TDX specs somewhere.  A
> >> reference would be nice.
> >
> > TDX Module 1.0 spec:
> >
> > 	5.3.5. Time Stamp Counter (TSC)
> >
> > 	TDX provides a trusted virtual TSC to the guest TDs. TSC value is
> > 	monotonously incrementing, starting from 0 on TD initialization by the
> > 	host VMM. The deviation between virtual TSC values read by each VCPU is
> > 	small.
> 
> Nice weasel wording. What's the definition of "small"?

The newer spec says "Virtual TSC values are consistent among all the TD’s
VCPUs at the level supported by the CPU".

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-25 13:47       ` Kirill A. Shutemov
@ 2023-08-25 15:16         ` Sean Christopherson
  2023-09-07 17:25           ` Paolo Bonzini
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2023-08-25 15:16 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Dave Hansen, Borislav Petkov, Andy Lutomirski,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel, Paolo Bonzini

On Fri, Aug 25, 2023, Kirill A. Shutemov wrote:
> On Thu, Aug 24, 2023 at 09:31:39PM +0200, Thomas Gleixner wrote:
> > On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
> > > On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
> > >> On 8/8/23 09:23, Kirill A. Shutemov wrote:
> > >> ...
> > >> > On the other hand, other clock sources (such as HPET, ACPI timer,
> > >> > APIC, etc.) necessitate VM exits to implement, resulting in more 
> > >> > fluctuating measurements compared to TSC. Thus, those clock sources
> > >> > are not effective for calibrating TSC.
> > >> 
> > >> Do we need to do anything to _those_ to mark them as slightly stinky?
> > >
> > > I don't know what the rules here. As far as I can see, all other clock
> > > sources relevant for TDX guest have lower rating. I guess we are fine?
> > 
> > Ideally they are not enumerated in the first place, which prevents the
> > kernel from trying.
> 
> We can ask QEMU/KVM not to advertise them to TDX guest, but guest has to
> protect itself as the VMM is not trusted. And we are back to device
> filtering...
> 
> > > There's notable exception to the rating order is kvmclock which is higher
> > > than tsc.
> > 
> > Which is silly aside of TDX.

It made a lot more sense back when stable TSC weren't a thing, which is why
kvmclock_init() drops its rating below the TSC's "300" rating when the TSC is
stable and nonstop.

	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
	    !check_tsc_unstable())
		kvm_clock.rating = 299;

Xen and Hyper-V also have a paravirt clock with a rating that is initially higher
than the TSC, but xen_time_init() and hv_init_tsc_clocksource() have similar behavior
to lower their rating when the TSC is deemed to be safe/stable.

Note, because KVM clock isn't marked VALID_FOR_HRES, even if kvmclock didn't
drop its rating, most guests will end up selecting the TSC anyways.

> > > It has to be disabled, but it is not clear to me how. This topic
> > > is related to how we are going to filter allowed devices/drivers, so I
> > > would postpone the decision until we settle on wider filtering schema.
> > 
> > TDX aside it might be useful to have a mechanism to select TSC over KVM
> > clock in general.
> 
> Sean, Paolo, any comment on this?

I would expect the VMM to not advertise KVM clock if the VM is going to run on
hosts with stable TSCs, i.e. the guest really shouldn't need to do anything in.
But I avoid clocks and timekeeping like the plague, so take that with a grain of
salt, e.g. maybe there's a good reason to always advertise kvmclock.

For TDX and other paranoid guests, I assume the kernel command line is captured
as part of attestation.   And so the existing "no-kvmclock" param should be
sufficient to ensure the guest doesn't use KVM clock over the TSC, though IIRC
TDX requires a constant, non-stop TSC, so it's likely not strictly necessary.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-25 13:52       ` Kirill A. Shutemov
@ 2023-08-25 17:09         ` Thomas Gleixner
  2023-08-29 16:01           ` Nakajima, Jun
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2023-08-25 17:09 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Dave Hansen, Borislav Petkov, Andy Lutomirski,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima, x86,
	linux-coco, linux-kernel

On Fri, Aug 25 2023 at 16:52, Kirill A. Shutemov wrote:
> On Thu, Aug 24, 2023 at 05:49:05PM +0200, Thomas Gleixner wrote:
>> On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
>> > On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
>> >> I take it this is carved in stone in the TDX specs somewhere.  A
>> >> reference would be nice.
>> >
>> > TDX Module 1.0 spec:
>> >
>> > 	5.3.5. Time Stamp Counter (TSC)
>> >
>> > 	TDX provides a trusted virtual TSC to the guest TDs. TSC value is
>> > 	monotonously incrementing, starting from 0 on TD initialization by the
>> > 	host VMM. The deviation between virtual TSC values read by each VCPU is
>> > 	small.
>> 
>> Nice weasel wording. What's the definition of "small"?
>
> The newer spec says "Virtual TSC values are consistent among all the TD’s
> VCPUs at the level supported by the CPU".

That means what? It's not a guarantee for consistency either. :(

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-25 17:09         ` Thomas Gleixner
@ 2023-08-29 16:01           ` Nakajima, Jun
  2023-08-30  7:33             ` Thomas Gleixner
  0 siblings, 1 reply; 16+ messages in thread
From: Nakajima, Jun @ 2023-08-29 16:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kirill A. Shutemov, Hansen, Dave, Borislav Petkov, Lutomirski,
	Andy, Kuppuswamy Sathyanarayanan, Reshetova, Elena, x86,
	linux-coco, linux-kernel

> On Aug 25, 2023, at 10:09 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> On Fri, Aug 25 2023 at 16:52, Kirill A. Shutemov wrote:
>> On Thu, Aug 24, 2023 at 05:49:05PM +0200, Thomas Gleixner wrote:
>>> On Tue, Aug 08 2023 at 23:01, Kirill A. Shutemov wrote:
>>>> On Tue, Aug 08, 2023 at 10:13:05AM -0700, Dave Hansen wrote:
>>>>> I take it this is carved in stone in the TDX specs somewhere.  A
>>>>> reference would be nice.
>>>> 
>>>> TDX Module 1.0 spec:
>>>> 
>>>> 5.3.5. Time Stamp Counter (TSC)
>>>> 
>>>> TDX provides a trusted virtual TSC to the guest TDs. TSC value is
>>>> monotonously incrementing, starting from 0 on TD initialization by the
>>>> host VMM. The deviation between virtual TSC values read by each VCPU is
>>>> small.
>>> 
>>> Nice weasel wording. What's the definition of "small"?
>> 
>> The newer spec says "Virtual TSC values are consistent among all the TD’s
>> VCPUs at the level supported by the CPU".
> 
> That means what? It's not a guarantee for consistency either. :(

Actually (in TDX Module 1.5 spec), the sentence is "Virtual TSC values are consistent among all the TD’s VCPUs at the level supported by the CPU, see below”. 

And the below:
---
The host VMM is required to do the following:
• Set up the same IA32_TSC_ADJUST values on all LPs before initializing the Intel TDX module.
• Make sure IA32_TSC_ADJUST is not modified from its initial value before calling SEAMCALL.

The Intel TDX module checks the above as part of TDH.VP.ENTER and any other SEAMCALL leaf function that reads TSC.

The virtualized TSC is designed to have the following characteristics:
• The virtual TSC frequency is specified by the host VMM as an input to TDH.MNG.INIT in units of 25MHz – it can be between 4 and 400 (corresponding to a range of 100MHz to 10GHz).
• The virtual TSC starts counting from 0 at TDH.MNG.INIT time.
...


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-29 16:01           ` Nakajima, Jun
@ 2023-08-30  7:33             ` Thomas Gleixner
  2023-08-31 15:16               ` Nakajima, Jun
  0 siblings, 1 reply; 16+ messages in thread
From: Thomas Gleixner @ 2023-08-30  7:33 UTC (permalink / raw)
  To: Nakajima, Jun
  Cc: Kirill A. Shutemov, Hansen, Dave, Borislav Petkov, Lutomirski,
	Andy, Kuppuswamy Sathyanarayanan, Reshetova, Elena, x86,
	linux-coco, linux-kernel

On Tue, Aug 29 2023 at 16:01, Jun Nakajima wrote:
>> On Aug 25, 2023, at 10:09 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>> The newer spec says "Virtual TSC values are consistent among all the TD’s
>>> VCPUs at the level supported by the CPU".
>> 
>> That means what? It's not a guarantee for consistency either. :(
>
> Actually (in TDX Module 1.5 spec), the sentence is "Virtual TSC values
> are consistent among all the TD’s VCPUs at the level supported by the
> CPU, see below”.
>
> And the below:
> ---
> The host VMM is required to do the following:
> • Set up the same IA32_TSC_ADJUST values on all LPs before initializing the Intel TDX module.
> • Make sure IA32_TSC_ADJUST is not modified from its initial value before calling SEAMCALL.
>
> The Intel TDX module checks the above as part of TDH.VP.ENTER and any
> other SEAMCALL leaf function that reads TSC.

What happens when the check detects that the host modified TSC ADJUST?

What validates the VMCS TSC offset field?

> The virtualized TSC is designed to have the following characteristics:
> • The virtual TSC frequency is specified by the host VMM as an input
> to TDH.MNG.INIT in units of 25MHz – it can be between 4 and 400
> (corresponding to a range of 100MHz to 10GHz).

What validates that the frequency is correct?

How is ensured that the host does not change TSC scaling?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-30  7:33             ` Thomas Gleixner
@ 2023-08-31 15:16               ` Nakajima, Jun
  0 siblings, 0 replies; 16+ messages in thread
From: Nakajima, Jun @ 2023-08-31 15:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Kirill A. Shutemov, Hansen, Dave, Borislav Petkov, Lutomirski,
	Andy, Kuppuswamy Sathyanarayanan, Reshetova, Elena, x86,
	linux-coco, linux-kernel


> On Aug 30, 2023, at 12:33 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> On Tue, Aug 29 2023 at 16:01, Jun Nakajima wrote:
>>> On Aug 25, 2023, at 10:09 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>>> The newer spec says "Virtual TSC values are consistent among all the TD’s
>>>> VCPUs at the level supported by the CPU".
>>> 
>>> That means what? It's not a guarantee for consistency either. :(
>> 
>> Actually (in TDX Module 1.5 spec), the sentence is "Virtual TSC values
>> are consistent among all the TD’s VCPUs at the level supported by the
>> CPU, see below”.
>> 
>> And the below:
>> ---
>> The host VMM is required to do the following:
>> • Set up the same IA32_TSC_ADJUST values on all LPs before initializing the Intel TDX module.
>> • Make sure IA32_TSC_ADJUST is not modified from its initial value before calling SEAMCALL.
>> 
>> The Intel TDX module checks the above as part of TDH.VP.ENTER and any
>> other SEAMCALL leaf function that reads TSC.
> 
> What happens when the check detects that the host modified TSC ADJUST?

Such a SEAMCALL, e.g., TDH.VP.ENTER will fail with an error code (TDX_INCONSISTENT_MSR and MSR index of TSC ADJUST).

> 
> What validates the VMCS TSC offset field?

TDX module. The VMCSs of TDs are in private (protected) memory and accessed by the TDX module only. 
The host has no direct access to them.

> 
>> The virtualized TSC is designed to have the following characteristics:
>> • The virtual TSC frequency is specified by the host VMM as an input
>> to TDH.MNG.INIT in units of 25MHz – it can be between 4 and 400
>> (corresponding to a range of 100MHz to 10GHz).
> 
> What validates that the frequency is correct?

Validation of the real/hardware TSC frequency is part of hardware validation.

> 
> How is ensured that the host does not change TSC scaling?

I guess you mean virtual TSC scaling, which is used for calculation of the TSC observed by the guest.
This is a VMCS field set by the TDX module, based on the configured virtual TSC frequency and the real TSC frequency. So, the host cannot change it (as it has no direct access to the VMCS).


---
Jun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] x86/tdx: Mark TSC reliable
  2023-08-25 15:16         ` Sean Christopherson
@ 2023-09-07 17:25           ` Paolo Bonzini
  0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2023-09-07 17:25 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Kirill A. Shutemov, Thomas Gleixner, Dave Hansen,
	Borislav Petkov, Andy Lutomirski, Kuppuswamy Sathyanarayanan,
	Elena Reshetova, Jun Nakajima, x86, linux-coco, linux-kernel

On Fri, Aug 25, 2023 at 5:25 PM Sean Christopherson <seanjc@google.com> wrote:
> > > TDX aside it might be useful to have a mechanism to select TSC over KVM
> > > clock in general.
> >
> > Sean, Paolo, any comment on this?
>
> I would expect the VMM to not advertise KVM clock if the VM is going to run on
> hosts with stable TSCs, i.e. the guest really shouldn't need to do anything in.
> But I avoid clocks and timekeeping like the plague, so take that with a grain of
> salt, e.g. maybe there's a good reason to always advertise kvmclock.

Mostly because users avoid clocks and timekeeping _even more_;
advertising kvmclock is safe in general, so userspace does it in case
the VM is later migrated to a machine with unstable TSC.

> For TDX and other paranoid guests, I assume the kernel command line is captured
> as part of attestation.   And so the existing "no-kvmclock" param should be
> sufficient to ensure the guest doesn't use KVM clock over the TSC, though IIRC
> TDX requires a constant, non-stop TSC, so it's likely not strictly necessary.

As Kirill said, the guest still has to protect itself, so the patch
makes sense (I see a v2 has been posted in the meanwhile).

Paolo


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-09-07 17:25 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-08 16:23 [PATCH] x86/tdx: Mark TSC reliable Kirill A. Shutemov
2023-08-08 17:13 ` Dave Hansen
2023-08-08 20:01   ` Kirill A. Shutemov
2023-08-09  5:44     ` Reshetova, Elena
2023-08-09  6:13       ` Kirill A. Shutemov
2023-08-22 23:39         ` Erdem Aktas
2023-08-24 15:49     ` Thomas Gleixner
2023-08-25 13:52       ` Kirill A. Shutemov
2023-08-25 17:09         ` Thomas Gleixner
2023-08-29 16:01           ` Nakajima, Jun
2023-08-30  7:33             ` Thomas Gleixner
2023-08-31 15:16               ` Nakajima, Jun
2023-08-24 19:31     ` Thomas Gleixner
2023-08-25 13:47       ` Kirill A. Shutemov
2023-08-25 15:16         ` Sean Christopherson
2023-09-07 17:25           ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).