linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] x86/random: Retry on RDSEED failure
@ 2024-01-30  8:30 Kirill A. Shutemov
  2024-01-30  8:30 ` [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails Kirill A. Shutemov
                   ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Kirill A. Shutemov @ 2024-01-30  8:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o, Jason A. Donenfeld
  Cc: Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel, Kirill A. Shutemov

The function rdrand_long() retries 10 times before returning failure to
the caller. On the other hand, rdseed_long() gives up on the first
failure.

According to the Intel SDM, both instructions should follow the same
retry approach. This information can be found in the section titled
"Random Number Generator Instructions".

To align the behavior of rdseed_long() with rdrand_long(), it should be
modified to retry 10 times before giving up.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/archrandom.h | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..918c5880de9e 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -33,11 +33,19 @@ static inline bool __must_check rdrand_long(unsigned long *v)
 
 static inline bool __must_check rdseed_long(unsigned long *v)
 {
+	unsigned int retry = RDRAND_RETRY_LOOPS;
 	bool ok;
-	asm volatile("rdseed %[out]"
-		     CC_SET(c)
-		     : CC_OUT(c) (ok), [out] "=r" (*v));
-	return ok;
+
+	do {
+		asm volatile("rdseed %[out]"
+			     CC_SET(c)
+			     : CC_OUT(c) (ok), [out] "=r" (*v));
+
+		if (ok)
+			return true;
+	} while (--retry);
+
+	return false;
 }
 
 /*
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30  8:30 [PATCH 1/2] x86/random: Retry on RDSEED failure Kirill A. Shutemov
@ 2024-01-30  8:30 ` Kirill A. Shutemov
  2024-01-30 12:37   ` Jason A. Donenfeld
  2024-01-30 15:50   ` Kuppuswamy Sathyanarayanan
  2024-01-30 12:29 ` [PATCH 1/2] x86/random: Retry on RDSEED failure Jason A. Donenfeld
  2024-01-30 15:44 ` Kuppuswamy Sathyanarayanan
  2 siblings, 2 replies; 99+ messages in thread
From: Kirill A. Shutemov @ 2024-01-30  8:30 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o, Jason A. Donenfeld
  Cc: Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel, Kirill A. Shutemov

RDRAND and RDSEED instructions rarely fail. Ten retries should be
sufficient to account for occasional failures.

If the instruction fails more than ten times, it is likely that the
hardware is broken or someone is attempting to exceed the rate at which
the random number generator hardware can provide random numbers.

Issue a warning if ten retries were not enough.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/archrandom.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 918c5880de9e..fc8d837fb3b9 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -13,6 +13,12 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 
+#ifdef KASLR_COMPRESSED_BOOT
+#define rd_warn(msg) warn(msg)
+#else
+#define rd_warn(msg) WARN_ONCE(1, msg)
+#endif
+
 #define RDRAND_RETRY_LOOPS	10
 
 /* Unconditional execution of RDRAND and RDSEED */
@@ -28,6 +34,9 @@ static inline bool __must_check rdrand_long(unsigned long *v)
 		if (ok)
 			return true;
 	} while (--retry);
+
+	rd_warn("RDRAND failed\n");
+
 	return false;
 }
 
@@ -45,6 +54,8 @@ static inline bool __must_check rdseed_long(unsigned long *v)
 			return true;
 	} while (--retry);
 
+	rd_warn("RDSEED failed\n");
+
 	return false;
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30  8:30 [PATCH 1/2] x86/random: Retry on RDSEED failure Kirill A. Shutemov
  2024-01-30  8:30 ` [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails Kirill A. Shutemov
@ 2024-01-30 12:29 ` Jason A. Donenfeld
  2024-01-30 12:51   ` Jason A. Donenfeld
  2024-01-30 13:10   ` Reshetova, Elena
  2024-01-30 15:44 ` Kuppuswamy Sathyanarayanan
  2 siblings, 2 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 12:29 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Hi Kirill,

I've been following the other discussion closely thinking about the
matter, but I suppose I'll jump in here directly on this patch, if
this is the approach the discussion is congealing around.

A comment below:

On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>  static inline bool __must_check rdseed_long(unsigned long *v)
>  {
> +       unsigned int retry = RDRAND_RETRY_LOOPS;
>         bool ok;
> -       asm volatile("rdseed %[out]"
> -                    CC_SET(c)
> -                    : CC_OUT(c) (ok), [out] "=r" (*v));
> -       return ok;
> +
> +       do {
> +               asm volatile("rdseed %[out]"
> +                            CC_SET(c)
> +                            : CC_OUT(c) (ok), [out] "=r" (*v));
> +
> +               if (ok)
> +                       return true;
> +       } while (--retry);
> +
> +       return false;
>  }

So, my understanding of RDRAND vs RDSEED -- deliberately leaving out
any cryptographic discussion here -- is roughly that RDRAND will
expand the seed material for longer, while RDSEED will mostly always
try to sample more bits from the environment. AES is fast, while
sampling is slow, so RDRAND gives better performance and is less
likely to fail, whereas RDSEED always has to wait on the hardware to
collect some bits, so is more likely to fail.

For that reason, most of the usage of RDRAND and RDSEED inside of
random.c is something to the tune of `if (!rdseed(out)) rdrand(out);`,
first trying RDSEED but falling back to RDRAND if it's busy. That
still seems to me like a reasonable approach, which this patch would
partly undermine (in concert with the next patch, which I'll comment
on in a follow up email there).

So maybe this patch #1 (of 2) can be dropped?

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30  8:30 ` [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails Kirill A. Shutemov
@ 2024-01-30 12:37   ` Jason A. Donenfeld
  2024-01-30 13:45     ` Reshetova, Elena
  2024-01-30 15:50   ` Kuppuswamy Sathyanarayanan
  1 sibling, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 12:37 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Hi Kirill,

Picking up from my last email on patch 1/2:

On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> RDRAND and RDSEED instructions rarely fail. Ten retries should be
> sufficient to account for occasional failures.
>
> If the instruction fails more than ten times, it is likely that the
> hardware is broken or someone is attempting to exceed the rate at which
> the random number generator hardware can provide random numbers.

You're the Intel employee so you can find out about this with much
more assurance than me, but I understand the sentence above to be _way
more_ true for RDRAND than for RDSEED. If your informed opinion is,
"RDRAND failing can only be due to totally broken hardware" then a
WARN_ON seems like an appropriate solution, consistent with what other
drivers do for totally broken hardware. I'm less convinced that this
is the case also for RDSEED, but you know better than me.

However, there's one potentially concerning aspect to consider: if the
statement is "RDRAND only fails when the hardware fails", that's fine,
but if the statement is "RDRAND only fails when the hardware fails or
a user hammers on RDRAND in a busy loop," then this seems like a
potential DoS vector from userspace, since RDRAND is not a privileged
instruction. Unless there's different pools and rate limiting and
hardware and such depending on which ring the instruction is called
from? But I've never read about that. What's your feeling on this
concern?

And if the DoS thing _is_ a concern, and the use case for this WARN_ON
in the first place is the trusted computing scenario, so we basically
only care about early boot, then one addendum would be to only warn if
we're in early boot, which would work because seeding via RDRAND is
attempted pretty early on in init.c.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 12:29 ` [PATCH 1/2] x86/random: Retry on RDSEED failure Jason A. Donenfeld
@ 2024-01-30 12:51   ` Jason A. Donenfeld
  2024-01-30 13:10   ` Reshetova, Elena
  1 sibling, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 12:51 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 01:29:10PM +0100, Jason A. Donenfeld wrote:
> Hi Kirill,
> 
> I've been following the other discussion closely thinking about the
> matter, but I suppose I'll jump in here directly on this patch, if
> this is the approach the discussion is congealing around.
> 
> A comment below:
> 
> On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >  static inline bool __must_check rdseed_long(unsigned long *v)
> >  {
> > +       unsigned int retry = RDRAND_RETRY_LOOPS;
> >         bool ok;
> > -       asm volatile("rdseed %[out]"
> > -                    CC_SET(c)
> > -                    : CC_OUT(c) (ok), [out] "=r" (*v));
> > -       return ok;
> > +
> > +       do {
> > +               asm volatile("rdseed %[out]"
> > +                            CC_SET(c)
> > +                            : CC_OUT(c) (ok), [out] "=r" (*v));
> > +
> > +               if (ok)
> > +                       return true;
> > +       } while (--retry);
> > +
> > +       return false;
> >  }
> 
> So, my understanding of RDRAND vs RDSEED -- deliberately leaving out
> any cryptographic discussion here -- is roughly that RDRAND will
> expand the seed material for longer, while RDSEED will mostly always
> try to sample more bits from the environment. AES is fast, while
> sampling is slow, so RDRAND gives better performance and is less
> likely to fail, whereas RDSEED always has to wait on the hardware to
> collect some bits, so is more likely to fail.
> 
> For that reason, most of the usage of RDRAND and RDSEED inside of
> random.c is something to the tune of `if (!rdseed(out)) rdrand(out);`,
> first trying RDSEED but falling back to RDRAND if it's busy. That
> still seems to me like a reasonable approach, which this patch would
> partly undermine (in concert with the next patch, which I'll comment
> on in a follow up email there).
> 
> So maybe this patch #1 (of 2) can be dropped?

Unless there's a difference between ring 0 and ring 3, this simple test
is telling:

  #include <stdio.h>
  #include <immintrin.h>
  
  int main(int argc, char *argv[])
  {
    unsigned long long rand;
    unsigned int i, success_rand = 0, success_seed = 0;
    enum { TOTAL = 1000000 };
  
    for (i = 0; i < TOTAL; ++i)
    	success_rand += !!_rdrand64_step(&rand);
    for (i = 0; i < TOTAL; ++i)
    	success_seed += !!_rdseed64_step(&rand);
    
    printf("RDRAND: %.2f%%, RDSEED: %.2f%%\n", success_rand * 100.0 / TOTAL, success_seed * 100.0 / TOTAL);
    return 0;
  }

Result on my i7-11850H:

  RDRAND: 100.00%, RDSEED: 29.26%

And this doesn't even test multicore stuff.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 12:29 ` [PATCH 1/2] x86/random: Retry on RDSEED failure Jason A. Donenfeld
  2024-01-30 12:51   ` Jason A. Donenfeld
@ 2024-01-30 13:10   ` Reshetova, Elena
  2024-01-30 14:06     ` Jason A. Donenfeld
  2024-01-30 15:20     ` H. Peter Anvin
  1 sibling, 2 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-30 13:10 UTC (permalink / raw)
  To: Jason A. Donenfeld, Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

 
> Hi Kirill,
> 
> I've been following the other discussion closely thinking about the
> matter, but I suppose I'll jump in here directly on this patch, if
> this is the approach the discussion is congealing around.
> 
> A comment below:
> 
> On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >  static inline bool __must_check rdseed_long(unsigned long *v)
> >  {
> > +       unsigned int retry = RDRAND_RETRY_LOOPS;
> >         bool ok;
> > -       asm volatile("rdseed %[out]"
> > -                    CC_SET(c)
> > -                    : CC_OUT(c) (ok), [out] "=r" (*v));
> > -       return ok;
> > +
> > +       do {
> > +               asm volatile("rdseed %[out]"
> > +                            CC_SET(c)
> > +                            : CC_OUT(c) (ok), [out] "=r" (*v));
> > +
> > +               if (ok)
> > +                       return true;
> > +       } while (--retry);
> > +
> > +       return false;
> >  }
> 
> So, my understanding of RDRAND vs RDSEED -- deliberately leaving out
> any cryptographic discussion here -- is roughly that RDRAND will
> expand the seed material for longer, while RDSEED will mostly always
> try to sample more bits from the environment. AES is fast, while
> sampling is slow, so RDRAND gives better performance and is less
> likely to fail, whereas RDSEED always has to wait on the hardware to
> collect some bits, so is more likely to fail.

The internals of Intel DRBG behind RDRAND/RDSEED has been publicly
documented, so the structure is no secret. Please see [1] for overall
structure and other aspects. So, yes, your overall understanding is correct
(there are many more details though). 

[1] https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html


> 
> For that reason, most of the usage of RDRAND and RDSEED inside of
> random.c is something to the tune of `if (!rdseed(out)) rdrand(out);`,
> first trying RDSEED but falling back to RDRAND if it's busy. That
> still seems to me like a reasonable approach, which this patch would
> partly undermine (in concert with the next patch, which I'll comment
> on in a follow up email there).

I agree that for the purpose of extracting entropy for Linux RNG falling
back to RDRAND (current behavior) is perfectly ok, so I think you are doing
the right thing. However, in principle it is not always the case, there are
situations when a fallback to RDRAND should not be used, but it is also
true that the user of this interface should know/understand this situation. 

> 
> So maybe this patch #1 (of 2) can be dropped?

Before we start debating this patchset, what is your opinion on the original
problem we raised for CoCo VMs when both RDRAND/RDSEED are made to
fail deliberately? 

Best Regards,
Elena.


^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 12:37   ` Jason A. Donenfeld
@ 2024-01-30 13:45     ` Reshetova, Elena
  2024-01-30 14:21       ` Jason A. Donenfeld
  2024-01-30 17:31       ` Dave Hansen
  0 siblings, 2 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-30 13:45 UTC (permalink / raw)
  To: Jason A. Donenfeld, Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

> Hi Kirill,
> 
> Picking up from my last email on patch 1/2:
> 
> On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> > RDRAND and RDSEED instructions rarely fail. Ten retries should be
> > sufficient to account for occasional failures.
> >
> > If the instruction fails more than ten times, it is likely that the
> > hardware is broken or someone is attempting to exceed the rate at which
> > the random number generator hardware can provide random numbers.
> 
> You're the Intel employee so you can find out about this with much
> more assurance than me, but I understand the sentence above to be _way
> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> "RDRAND failing can only be due to totally broken hardware"

No, this is not the case per Intel SDM. I think we can live under a simple
assumption that both of these instructions can fail not just due to broken
HW, but also due to enough pressure put into the whole DRBG construction
that supplies random numbers via RDRAND/RDSEED. 

 then a
> WARN_ON seems like an appropriate solution, consistent with what other
> drivers do for totally broken hardware. I'm less convinced that this
> is the case also for RDSEED, but you know better than me.

I do agree that due to internal structure of DRBG it is easier to create
a situation when RDSEED will fail. But for the purpose of Linux RNG and 
confidential computing it actually doesn’t make a difference if we get an
output from RDRAND or RDSEED, as soon as we get either of them. 
Problems only start imo when both of them are made to fail. 

> 
> However, there's one potentially concerning aspect to consider: if the
> statement is "RDRAND only fails when the hardware fails", that's fine,
> but if the statement is "RDRAND only fails when the hardware fails or
> a user hammers on RDRAND in a busy loop," then this seems like a
> potential DoS vector from userspace, since RDRAND is not a privileged
> instruction. Unless there's different pools and rate limiting and
> hardware and such depending on which ring the instruction is called
> from? But I've never read about that. What's your feeling on this
> concern?

RDRAND can fail with enough load as I already said above. 
I am also not aware about any ring separation or anything like this for
RDRAND/RDSEED instructions. 

I guess your concern about DoS here is for the case when we don’t
trust the host/VMM *and* assume malicious userspace, correct? 
Because in non-confidential computing case, the Linux RNG in such
case will just use non-RDRAND fallbacks, no DoS will happen and we
should have enough entropy that is outside of userspace control.

I guess this is indeed difficult situation because we don’t have any other
entropy sources anymore (unless assume some special HW). 
But you bring a very valid point that in this case we make it easier for userspace
to make a DoS to the kernel if we require RDRAND/RDSEED to 
succeed, which is not acceptable (with exception of early boot when
we don’t have the userspace problem). 

> 
> And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> in the first place is the trusted computing scenario, so we basically
> only care about early boot, then one addendum would be to only warn if
> we're in early boot, which would work because seeding via RDRAND is
> attempted pretty early on in init.c.

I don’t think we are only concerned with initial early boot and initial seeding. 
What about periodic reseeding of ChaCha CSPRNG? If you don’t get
RDRAND/RDSEED output during this step, don’t we formally loose the forward
prediction resistance property of Linux RNG assuming this is the only source
of entropy that is outside of attacker control? 

Best Regards,
Elena. 

> 
> Jason


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 13:10   ` Reshetova, Elena
@ 2024-01-30 14:06     ` Jason A. Donenfeld
  2024-01-30 14:43       ` Daniel P. Berrangé
  2024-01-30 18:35       ` Jason A. Donenfeld
  2024-01-30 15:20     ` H. Peter Anvin
  1 sibling, 2 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 14:06 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 2:10 PM Reshetova, Elena
<elena.reshetova@intel.com> wrote:
> The internals of Intel DRBG behind RDRAND/RDSEED has been publicly
> documented, so the structure is no secret. Please see [1] for overall
> structure and other aspects. So, yes, your overall understanding is correct
> (there are many more details though).

Indeed, have read it.

> > So maybe this patch #1 (of 2) can be dropped?
>
> Before we start debating this patchset, what is your opinion on the original
> problem we raised for CoCo VMs when both RDRAND/RDSEED are made to
> fail deliberately?

My general feeling is that this seems like a hardware problem.

If you have a VM, the hypervisor should provide a seed. But with CoCo,
you can't trust the host to do that. But can't the host do anything to
the VM that it wants, like fiddle with its memory? No, there are
special new hardware features to encrypt and protect ram to prevent
this. So if you've found yourself in a situation where you absolutely
cannot trust the host, AND the hardware already has working guest
protections from the host, then it would seem you also need a hardware
solution to handle seeding. And you're claiming that RDRAND/RDSEED is
the *only* hardware solution available for it.

Is that an accurate summary? If it is, then the actual problem is that
the hardware provided to solve this problem doesn't actually solve it
that well, so we're caught deciding between guest-guest DoS (some
other guest on the system uses all RDRAND resources) and cryptographic
failure because of a malicious host creating a deterministic
environment.

But I have two questions:

1) Is this CoCo VM stuff even real? Is protecting guests from hosts
actually possible in the end? Is anybody doing this? I assume they
are, so maybe ignore this question, but I would like to register my
gut feeling that on the Intel platform this seems like an endless
whack-a-mole problem like SGX.

2) Can a malicious host *actually* create a fully deterministic
environment? One that'll produce the same timing for the jitter
entropy creation, and all the other timers and interrupts and things?
I imagine the attestation part of CoCo means these VMs need to run on
real Intel silicon and so it can't be single stepped in TCG or
something, right? So is this problem actually a real one? And to what
degree? Any good experimental research on this?

Either way, if you're convinced RDRAND is the *only* way here, adding
a `WARN_ON(is_in_early_boot)` to the RDRAND (but not RDSEED) failure
path seems a fairly lightweight bandaid. I just wonder if the hardware
people could come up with something more reliable that we wouldn't
have to agonize over in the kernel.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 13:45     ` Reshetova, Elena
@ 2024-01-30 14:21       ` Jason A. Donenfeld
  2024-01-30 14:55         ` Reshetova, Elena
  2024-01-30 17:31       ` Dave Hansen
  1 sibling, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 14:21 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 2:45 PM Reshetova, Elena
<elena.reshetova@intel.com> wrote:
> No, this is not the case per Intel SDM. I think we can live under a simple
> assumption that both of these instructions can fail not just due to broken
> HW, but also due to enough pressure put into the whole DRBG construction
> that supplies random numbers via RDRAND/RDSEED.

Yea, thought so.

> I guess your concern about DoS here is for the case when we don’t
> trust the host/VMM *and* assume malicious userspace, correct?
> Because in non-confidential computing case, the Linux RNG in such
> case will just use non-RDRAND fallbacks, no DoS will happen and we
> should have enough entropy that is outside of userspace control.

Don't think about the RNG for just one second. The basic principle is
simpler: if you have a
`WARN_ON(unprivd_userspace_triggerable_condition)`, that's usually
considered a DoS - panic_on_warn and such.

> >
> > And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> > in the first place is the trusted computing scenario, so we basically
> > only care about early boot, then one addendum would be to only warn if
> > we're in early boot, which would work because seeding via RDRAND is
> > attempted pretty early on in init.c.
>
> I don’t think we are only concerned with initial early boot and initial seeding.
> What about periodic reseeding of ChaCha CSPRNG? If you don’t get
> RDRAND/RDSEED output during this step, don’t we formally loose the forward
> prediction resistance property of Linux RNG assuming this is the only source
> of entropy that is outside of attacker control?

If you never add new material, and you have the initial seed, then
it's deterministic. But you still mostly can't backtrack if the state
leaks at some future point in time.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 14:06     ` Jason A. Donenfeld
@ 2024-01-30 14:43       ` Daniel P. Berrangé
  2024-01-30 15:12         ` Jason A. Donenfeld
  2024-01-30 18:35       ` Jason A. Donenfeld
  1 sibling, 1 reply; 99+ messages in thread
From: Daniel P. Berrangé @ 2024-01-30 14:43 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 03:06:14PM +0100, Jason A. Donenfeld wrote:
> Is that an accurate summary? If it is, then the actual problem is that
> the hardware provided to solve this problem doesn't actually solve it
> that well, so we're caught deciding between guest-guest DoS (some
> other guest on the system uses all RDRAND resources) and cryptographic
> failure because of a malicious host creating a deterministic
> environment.

In a CoCo VM environment, a guest DoS is not a unique threat
scenario, as it is unrelated to confidentiality. Ensuring
fair subdivision of resources between competeing guests is
just a general VM threat. There are many easy ways a host
admin can stop a guest making computational progress. Simply
not scheduling the guest vCPU threads is one. CoCo doesn't
try to solve this problem.

Preserving confidentiality is the primary aim of CoCo.

IOW, if the guest boot is stalled because the kernel is spinning
waiting on RDRAND to return data, that's fine. If the kernel
panics after "n" RDRAND failures in a row that's fine too. They
are both just yet another DoS scenario. 

If the kernel ignores the RDRAND failure and lets it boot with
degraded RNG state there were susceptible to attacks, that would
not be OK for CoCo. 

> But I have two questions:
> 
> 1) Is this CoCo VM stuff even real? Is protecting guests from hosts
> actually possible in the end? Is anybody doing this? I assume they
> are, so maybe ignore this question, but I would like to register my
> gut feeling that on the Intel platform this seems like an endless
> whack-a-mole problem like SGX.

It is real, but it is also not perfect. I expect it /will/ be an
endless whack-a-mole problem though.

None the less, it is a significant layer of defence, as compared
to traditional VMs where the guest RAM is nothing more than a
'cat' command away from host admin exposure.

> 2) Can a malicious host *actually* create a fully deterministic
> environment? One that'll produce the same timing for the jitter
> entropy creation, and all the other timers and interrupts and things?
> I imagine the attestation part of CoCo means these VMs need to run on
> real Intel silicon and so it can't be single stepped in TCG or
> something, right? So is this problem actually a real one? And to what
> degree? Any good experimental research on this?
> 
> Either way, if you're convinced RDRAND is the *only* way here, adding
> a `WARN_ON(is_in_early_boot)` to the RDRAND (but not RDSEED) failure
> path seems a fairly lightweight bandaid. I just wonder if the hardware
> people could come up with something more reliable that we wouldn't
> have to agonize over in the kernel.

If RDRAND failure is more of a theoretical problem than a practical
real world problem, I'd be inclined to just let the kernel loop on
RDRAND failure until it suceeds, with a WARN after 'n' iterations to
aid diagnosis of the stall in the unlikely even it did hit.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 14:21       ` Jason A. Donenfeld
@ 2024-01-30 14:55         ` Reshetova, Elena
  2024-01-30 15:00           ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-30 14:55 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel



> On Tue, Jan 30, 2024 at 2:45 PM Reshetova, Elena
> <elena.reshetova@intel.com> wrote:
> > No, this is not the case per Intel SDM. I think we can live under a simple
> > assumption that both of these instructions can fail not just due to broken
> > HW, but also due to enough pressure put into the whole DRBG construction
> > that supplies random numbers via RDRAND/RDSEED.
> 
> Yea, thought so.
> 
> > I guess your concern about DoS here is for the case when we don’t
> > trust the host/VMM *and* assume malicious userspace, correct?
> > Because in non-confidential computing case, the Linux RNG in such
> > case will just use non-RDRAND fallbacks, no DoS will happen and we
> > should have enough entropy that is outside of userspace control.
> 
> Don't think about the RNG for just one second. The basic principle is
> simpler: if you have a
> `WARN_ON(unprivd_userspace_triggerable_condition)`, that's usually
> considered a DoS - panic_on_warn and such.

Ok, agree, you do bring a valid point that we should not create new 
DoS attack vectors from  userspace in such cases. 

> 
> > >
> > > And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> > > in the first place is the trusted computing scenario, so we basically
> > > only care about early boot, then one addendum would be to only warn if
> > > we're in early boot, which would work because seeding via RDRAND is
> > > attempted pretty early on in init.c.
> >
> > I don’t think we are only concerned with initial early boot and initial seeding.
> > What about periodic reseeding of ChaCha CSPRNG? If you don’t get
> > RDRAND/RDSEED output during this step, don’t we formally loose the forward
> > prediction resistance property of Linux RNG assuming this is the only source
> > of entropy that is outside of attacker control?
> 
> If you never add new material, and you have the initial seed, then
> it's deterministic. But you still mostly can't backtrack if the state
> leaks at some future point in time.

I am not talking about backtrack resistance, i.e. when attacker learns about 
RNG state and then can recover the past output. I was talking about an attacker
learning the RNG state at some point of time (RNG compromise) and
then for RNG being able to recover over time from this state to a secure state using 
fresh entropy input that is outside of attacker control/observance.
Does Linux RNG aim to provide this property? Do people care about this? 
If noone cares about this one and Linux RNG doesn’t aim to provide it anyhow,
then I agree that we should just ensure that early entropy collection includes
RDRAND/RDSEED input for confidential VMs one way or another.

Best Regards,
Elena.

> 
> Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 14:55         ` Reshetova, Elena
@ 2024-01-30 15:00           ` Jason A. Donenfeld
  0 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 15:00 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 02:55:08PM +0000, Reshetova, Elena wrote:
> 
> 
> > On Tue, Jan 30, 2024 at 2:45 PM Reshetova, Elena
> > <elena.reshetova@intel.com> wrote:
> > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > assumption that both of these instructions can fail not just due to broken
> > > HW, but also due to enough pressure put into the whole DRBG construction
> > > that supplies random numbers via RDRAND/RDSEED.
> > 
> > Yea, thought so.
> > 
> > > I guess your concern about DoS here is for the case when we don’t
> > > trust the host/VMM *and* assume malicious userspace, correct?
> > > Because in non-confidential computing case, the Linux RNG in such
> > > case will just use non-RDRAND fallbacks, no DoS will happen and we
> > > should have enough entropy that is outside of userspace control.
> > 
> > Don't think about the RNG for just one second. The basic principle is
> > simpler: if you have a
> > `WARN_ON(unprivd_userspace_triggerable_condition)`, that's usually
> > considered a DoS - panic_on_warn and such.
> 
> Ok, agree, you do bring a valid point that we should not create new 
> DoS attack vectors from  userspace in such cases. 
> 
> > 
> > > >
> > > > And if the DoS thing _is_ a concern, and the use case for this WARN_ON
> > > > in the first place is the trusted computing scenario, so we basically
> > > > only care about early boot, then one addendum would be to only warn if
> > > > we're in early boot, which would work because seeding via RDRAND is
> > > > attempted pretty early on in init.c.
> > >
> > > I don’t think we are only concerned with initial early boot and initial seeding.
> > > What about periodic reseeding of ChaCha CSPRNG? If you don’t get
> > > RDRAND/RDSEED output during this step, don’t we formally loose the forward
> > > prediction resistance property of Linux RNG assuming this is the only source
> > > of entropy that is outside of attacker control?
> > 
> > If you never add new material, and you have the initial seed, then
> > it's deterministic. But you still mostly can't backtrack if the state
> > leaks at some future point in time.
> 
> I am not talking about backtrack resistance, i.e. when attacker learns about 
> RNG state and then can recover the past output. I was talking about an attacker
> learning the RNG state at some point of time (RNG compromise) and
> then for RNG being able to recover over time from this state to a secure state using 
> fresh entropy input that is outside of attacker control/observance.
> Does Linux RNG aim to provide this property? Do people care about this? 
> If noone cares about this one and Linux RNG doesn’t aim to provide it anyhow,
> then I agree that we should just ensure that early entropy collection includes
> RDRAND/RDSEED input for confidential VMs one way or another.

That's the first thing I mentioned -- "If you never add new material,
and you have the initial seed, then it's deterministic." The property
you mention is a good one to have and Linux usually has it. 


> 
> Best Regards,
> Elena.
> 
> > 
> > Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 14:43       ` Daniel P. Berrangé
@ 2024-01-30 15:12         ` Jason A. Donenfeld
  0 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 15:12 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 02:43:19PM +0000, Daniel P. Berrangé wrote:
> On Tue, Jan 30, 2024 at 03:06:14PM +0100, Jason A. Donenfeld wrote:
> > Is that an accurate summary? If it is, then the actual problem is that
> > the hardware provided to solve this problem doesn't actually solve it
> > that well, so we're caught deciding between guest-guest DoS (some
> > other guest on the system uses all RDRAND resources) and cryptographic
> > failure because of a malicious host creating a deterministic
> > environment.
> 
> In a CoCo VM environment, a guest DoS is not a unique threat
> scenario, as it is unrelated to confidentiality. Ensuring
> fair subdivision of resources between competeing guests is
> just a general VM threat. There are many easy ways a host
> admin can stop a guest making computational progress. Simply
> not scheduling the guest vCPU threads is one. CoCo doesn't
> try to solve this problem.
> 
> Preserving confidentiality is the primary aim of CoCo.
> 
> IOW, if the guest boot is stalled because the kernel is spinning
> waiting on RDRAND to return data, that's fine. If the kernel
> panics after "n" RDRAND failures in a row that's fine too. They
> are both just yet another DoS scenario. 
> 
> If the kernel ignores the RDRAND failure and lets it boot with
> degraded RNG state there were susceptible to attacks, that would
> not be OK for CoCo. 

Yea, that's why I said "we're caught deciding..." One case is a DoS that
would affect all VMs, so while one guest preventing new guests from
booting seems like not a CoCo problem, yes, it is still a problem.

At least in theory. And in practice this is easy with RDSEED too. In
practice, could you actually indefinably starve RDRAND between guests?
Is this pretty easy to do with a little tinkering, or is this a
practically impossible DoS vector? I don't actually know.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 13:10   ` Reshetova, Elena
  2024-01-30 14:06     ` Jason A. Donenfeld
@ 2024-01-30 15:20     ` H. Peter Anvin
  1 sibling, 0 replies; 99+ messages in thread
From: H. Peter Anvin @ 2024-01-30 15:20 UTC (permalink / raw)
  To: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On January 30, 2024 5:10:20 AM PST, "Reshetova, Elena" <elena.reshetova@intel.com> wrote:
> 
>> Hi Kirill,
>> 
>> I've been following the other discussion closely thinking about the
>> matter, but I suppose I'll jump in here directly on this patch, if
>> this is the approach the discussion is congealing around.
>> 
>> A comment below:
>> 
>> On Tue, Jan 30, 2024 at 9:30 AM Kirill A. Shutemov
>> <kirill.shutemov@linux.intel.com> wrote:
>> >  static inline bool __must_check rdseed_long(unsigned long *v)
>> >  {
>> > +       unsigned int retry = RDRAND_RETRY_LOOPS;
>> >         bool ok;
>> > -       asm volatile("rdseed %[out]"
>> > -                    CC_SET(c)
>> > -                    : CC_OUT(c) (ok), [out] "=r" (*v));
>> > -       return ok;
>> > +
>> > +       do {
>> > +               asm volatile("rdseed %[out]"
>> > +                            CC_SET(c)
>> > +                            : CC_OUT(c) (ok), [out] "=r" (*v));
>> > +
>> > +               if (ok)
>> > +                       return true;
>> > +       } while (--retry);
>> > +
>> > +       return false;
>> >  }
>> 
>> So, my understanding of RDRAND vs RDSEED -- deliberately leaving out
>> any cryptographic discussion here -- is roughly that RDRAND will
>> expand the seed material for longer, while RDSEED will mostly always
>> try to sample more bits from the environment. AES is fast, while
>> sampling is slow, so RDRAND gives better performance and is less
>> likely to fail, whereas RDSEED always has to wait on the hardware to
>> collect some bits, so is more likely to fail.
>
>The internals of Intel DRBG behind RDRAND/RDSEED has been publicly
>documented, so the structure is no secret. Please see [1] for overall
>structure and other aspects. So, yes, your overall understanding is correct
>(there are many more details though). 
>
>[1] https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html
>
>
>> 
>> For that reason, most of the usage of RDRAND and RDSEED inside of
>> random.c is something to the tune of `if (!rdseed(out)) rdrand(out);`,
>> first trying RDSEED but falling back to RDRAND if it's busy. That
>> still seems to me like a reasonable approach, which this patch would
>> partly undermine (in concert with the next patch, which I'll comment
>> on in a follow up email there).
>
>I agree that for the purpose of extracting entropy for Linux RNG falling
>back to RDRAND (current behavior) is perfectly ok, so I think you are doing
>the right thing. However, in principle it is not always the case, there are
>situations when a fallback to RDRAND should not be used, but it is also
>true that the user of this interface should know/understand this situation. 
>
>> 
>> So maybe this patch #1 (of 2) can be dropped?
>
>Before we start debating this patchset, what is your opinion on the original
>problem we raised for CoCo VMs when both RDRAND/RDSEED are made to
>fail deliberately? 
>
>Best Regards,
>Elena.
>
>

I have a real concern with this. We already have the option to let the entropy pool fill before the boot can proceed. This would have the risk of massively increasing the interrupt latency for what will be retried anyway.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30  8:30 [PATCH 1/2] x86/random: Retry on RDSEED failure Kirill A. Shutemov
  2024-01-30  8:30 ` [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails Kirill A. Shutemov
  2024-01-30 12:29 ` [PATCH 1/2] x86/random: Retry on RDSEED failure Jason A. Donenfeld
@ 2024-01-30 15:44 ` Kuppuswamy Sathyanarayanan
  2 siblings, 0 replies; 99+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2024-01-30 15:44 UTC (permalink / raw)
  To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Jason A. Donenfeld
  Cc: Elena Reshetova, Jun Nakajima, Tom Lendacky, Kalra, Ashish,
	Sean Christopherson, linux-coco, linux-kernel


On 1/30/24 12:30 AM, Kirill A. Shutemov wrote:
> The function rdrand_long() retries 10 times before returning failure to
> the caller. On the other hand, rdseed_long() gives up on the first
> failure.
>
> According to the Intel SDM, both instructions should follow the same
> retry approach. This information can be found in the section titled
> "Random Number Generator Instructions".
>
> To align the behavior of rdseed_long() with rdrand_long(), it should be
> modified to retry 10 times before giving up.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---

Change looks good to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

Wondering whether this needs to go to stable trees?

>  arch/x86/include/asm/archrandom.h | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 02bae8e0758b..918c5880de9e 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -33,11 +33,19 @@ static inline bool __must_check rdrand_long(unsigned long *v)
>  
>  static inline bool __must_check rdseed_long(unsigned long *v)
>  {
> +	unsigned int retry = RDRAND_RETRY_LOOPS;
>  	bool ok;
> -	asm volatile("rdseed %[out]"
> -		     CC_SET(c)
> -		     : CC_OUT(c) (ok), [out] "=r" (*v));
> -	return ok;
> +
> +	do {
> +		asm volatile("rdseed %[out]"
> +			     CC_SET(c)
> +			     : CC_OUT(c) (ok), [out] "=r" (*v));
> +
> +		if (ok)
> +			return true;
> +	} while (--retry);
> +
> +	return false;
>  }
>  
>  /*

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30  8:30 ` [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails Kirill A. Shutemov
  2024-01-30 12:37   ` Jason A. Donenfeld
@ 2024-01-30 15:50   ` Kuppuswamy Sathyanarayanan
  1 sibling, 0 replies; 99+ messages in thread
From: Kuppuswamy Sathyanarayanan @ 2024-01-30 15:50 UTC (permalink / raw)
  To: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Jason A. Donenfeld
  Cc: Elena Reshetova, Jun Nakajima, Tom Lendacky, Kalra, Ashish,
	Sean Christopherson, linux-coco, linux-kernel


On 1/30/24 12:30 AM, Kirill A. Shutemov wrote:
> RDRAND and RDSEED instructions rarely fail. Ten retries should be
> sufficient to account for occasional failures.
>
> If the instruction fails more than ten times, it is likely that the
> hardware is broken or someone is attempting to exceed the rate at which
> the random number generator hardware can provide random numbers.
>
> Issue a warning if ten retries were not enough.

Did you come across a case where it fails? Wondering why add this
warning now?

>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  arch/x86/include/asm/archrandom.h | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 918c5880de9e..fc8d837fb3b9 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,6 +13,12 @@
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
>  
> +#ifdef KASLR_COMPRESSED_BOOT
> +#define rd_warn(msg) warn(msg)

Why not use warn_once in both cases?

> +#else
> +#define rd_warn(msg) WARN_ONCE(1, msg)
> +#endif
> +
>  #define RDRAND_RETRY_LOOPS	10
>  
>  /* Unconditional execution of RDRAND and RDSEED */
> @@ -28,6 +34,9 @@ static inline bool __must_check rdrand_long(unsigned long *v)
>  		if (ok)
>  			return true;
>  	} while (--retry);
> +
> +	rd_warn("RDRAND failed\n");
> +
>  	return false;
>  }
>  
> @@ -45,6 +54,8 @@ static inline bool __must_check rdseed_long(unsigned long *v)
>  			return true;
>  	} while (--retry);
>  
> +	rd_warn("RDSEED failed\n");
> +
>  	return false;
>  }
>  

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 13:45     ` Reshetova, Elena
  2024-01-30 14:21       ` Jason A. Donenfeld
@ 2024-01-30 17:31       ` Dave Hansen
  2024-01-30 17:49         ` Jason A. Donenfeld
  1 sibling, 1 reply; 99+ messages in thread
From: Dave Hansen @ 2024-01-30 17:31 UTC (permalink / raw)
  To: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On 1/30/24 05:45, Reshetova, Elena wrote:
>> You're the Intel employee so you can find out about this with much
>> more assurance than me, but I understand the sentence above to be _way
>> more_ true for RDRAND than for RDSEED. If your informed opinion is,
>> "RDRAND failing can only be due to totally broken hardware"
> No, this is not the case per Intel SDM. I think we can live under a simple
> assumption that both of these instructions can fail not just due to broken
> HW, but also due to enough pressure put into the whole DRBG construction
> that supplies random numbers via RDRAND/RDSEED. 

I don't think the SDM is the right thing to look at for guidance here.

Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
to be exceedingly rare by design.  If they're not, we're going to get
our trusty torches and pitchforks and go after the folks who built the
broken hardware.

Repeat after me:

	Regular RDRAND/RDSEED failures only occur on broken hardware

If it's nice hardware that's gone bad, then we WARN() and try to make
the best of it.  If it turns out that WARN() was because of a broken
hardware _design_ then we go sharpen the pitchforks.

Anybody disagree?

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 17:31       ` Dave Hansen
@ 2024-01-30 17:49         ` Jason A. Donenfeld
  2024-01-30 17:58           ` Dave Hansen
  2024-01-30 18:05           ` Daniel P. Berrangé
  0 siblings, 2 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 17:49 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 1/30/24 05:45, Reshetova, Elena wrote:
> >> You're the Intel employee so you can find out about this with much
> >> more assurance than me, but I understand the sentence above to be _way
> >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> >> "RDRAND failing can only be due to totally broken hardware"
> > No, this is not the case per Intel SDM. I think we can live under a simple
> > assumption that both of these instructions can fail not just due to broken
> > HW, but also due to enough pressure put into the whole DRBG construction
> > that supplies random numbers via RDRAND/RDSEED.
>
> I don't think the SDM is the right thing to look at for guidance here.
>
> Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> to be exceedingly rare by design.  If they're not, we're going to get
> our trusty torches and pitchforks and go after the folks who built the
> broken hardware.
>
> Repeat after me:
>
>         Regular RDRAND/RDSEED failures only occur on broken hardware
>
> If it's nice hardware that's gone bad, then we WARN() and try to make
> the best of it.  If it turns out that WARN() was because of a broken
> hardware _design_ then we go sharpen the pitchforks.
>
> Anybody disagree?

Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
in a busy loop. So at the very least, your statement holds true only
for RDRAND.

But, anyway, if the statement "RDRAND failures only occur on broken
hardware" is true, then a WARN() in the failure path there presents no
DoS potential of any kind, and so that's a straightforward conclusion
to this discussion. However, that really hinges on  "RDRAND failures
only occur on broken hardware" being a true statement.

Also, I don't know how much heavy lifting the word "regular" was doing
in your original statement. Because my example shows that that
irregular RDSEED usage from malicious users can hinder regular users.
If that also applies to RDRAND, the "regular" makes the statement not
as useful for taking conclusive action here.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 17:49         ` Jason A. Donenfeld
@ 2024-01-30 17:58           ` Dave Hansen
  2024-01-30 18:15             ` H. Peter Anvin
  2024-01-30 18:23             ` Jason A. Donenfeld
  2024-01-30 18:05           ` Daniel P. Berrangé
  1 sibling, 2 replies; 99+ messages in thread
From: Dave Hansen @ 2024-01-30 17:58 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On 1/30/24 09:49, Jason A. Donenfeld wrote:
>> Anybody disagree?
> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> in a busy loop. So at the very least, your statement holds true only
> for RDRAND.

Well, darn. :)

Any chance you could share some more information about the environment
where you're seeing this?  It'd be good to reconcile what you're seeing
with how the hardware is expected to behave.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 17:49         ` Jason A. Donenfeld
  2024-01-30 17:58           ` Dave Hansen
@ 2024-01-30 18:05           ` Daniel P. Berrangé
  2024-01-30 18:24             ` Jason A. Donenfeld
                               ` (3 more replies)
  1 sibling, 4 replies; 99+ messages in thread
From: Daniel P. Berrangé @ 2024-01-30 18:05 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Dave Hansen, Reshetova, Elena, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >
> > On 1/30/24 05:45, Reshetova, Elena wrote:
> > >> You're the Intel employee so you can find out about this with much
> > >> more assurance than me, but I understand the sentence above to be _way
> > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > >> "RDRAND failing can only be due to totally broken hardware"
> > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > assumption that both of these instructions can fail not just due to broken
> > > HW, but also due to enough pressure put into the whole DRBG construction
> > > that supplies random numbers via RDRAND/RDSEED.
> >
> > I don't think the SDM is the right thing to look at for guidance here.
> >
> > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > to be exceedingly rare by design.  If they're not, we're going to get
> > our trusty torches and pitchforks and go after the folks who built the
> > broken hardware.
> >
> > Repeat after me:
> >
> >         Regular RDRAND/RDSEED failures only occur on broken hardware
> >
> > If it's nice hardware that's gone bad, then we WARN() and try to make
> > the best of it.  If it turns out that WARN() was because of a broken
> > hardware _design_ then we go sharpen the pitchforks.
> >
> > Anybody disagree?
> 
> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> in a busy loop. So at the very least, your statement holds true only
> for RDRAND.
> 
> But, anyway, if the statement "RDRAND failures only occur on broken
> hardware" is true, then a WARN() in the failure path there presents no
> DoS potential of any kind, and so that's a straightforward conclusion
> to this discussion. However, that really hinges on  "RDRAND failures
> only occur on broken hardware" being a true statement.

There's a useful comment here from an Intel engineer

https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed

  "RDRAND is, indeed, faster than RDSEED because it comes
   from a hardware-based pseudorandom number generator.
   One seed value (effectively, the output from one RDSEED
   command) can provide up to 511 128-bit random values
   before forcing a reseed"

We know we can exhaust RDSEED directly pretty trivially. Making your
test program run in parallel across 20 cpus, I got a mere 3% success
rate from RDSEED.

If RDRAND is reseeding every 511 values, RDRAND output would have
to be consumed significantly faster than RDSEED in order that the
reseed will happen frequently enough to exhaust the seeds.

This looks pretty hard, but maybe with a large enough CPU count
this will be possible in extreme load ?

So I'm not convinced we can blindly wave away RDRAND failures as
guaranteed to mean broken hardware.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 17:58           ` Dave Hansen
@ 2024-01-30 18:15             ` H. Peter Anvin
  2024-01-30 18:23               ` Jason A. Donenfeld
  2024-01-30 18:23             ` Jason A. Donenfeld
  1 sibling, 1 reply; 99+ messages in thread
From: H. Peter Anvin @ 2024-01-30 18:15 UTC (permalink / raw)
  To: Dave Hansen, Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On January 30, 2024 9:58:09 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>On 1/30/24 09:49, Jason A. Donenfeld wrote:
>>> Anybody disagree?
>> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
>> in a busy loop. So at the very least, your statement holds true only
>> for RDRAND.
>
>Well, darn. :)
>
>Any chance you could share some more information about the environment
>where you're seeing this?  It'd be good to reconcile what you're seeing
>with how the hardware is expected to behave.

What CPU is this and could you clarify exactly how you run your busy loop?

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 17:58           ` Dave Hansen
  2024-01-30 18:15             ` H. Peter Anvin
@ 2024-01-30 18:23             ` Jason A. Donenfeld
  2024-01-30 18:37               ` Dave Hansen
  1 sibling, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 18:23 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 6:58 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 1/30/24 09:49, Jason A. Donenfeld wrote:
> >> Anybody disagree?
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
>
> Well, darn. :)
>
> Any chance you could share some more information about the environment
> where you're seeing this?  It'd be good to reconcile what you're seeing
> with how the hardware is expected to behave.

That is already in this thread already. Maybe catch up on the whole
thing and then jump back in?
https://lore.kernel.org/all/Zbjw5hRHr_E6k18r@zx2c4.com/

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 18:15             ` H. Peter Anvin
@ 2024-01-30 18:23               ` Jason A. Donenfeld
  0 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 18:23 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, Reshetova, Elena, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Jan 30, 2024 at 7:16 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On January 30, 2024 9:58:09 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
> >On 1/30/24 09:49, Jason A. Donenfeld wrote:
> >>> Anybody disagree?
> >> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> >> in a busy loop. So at the very least, your statement holds true only
> >> for RDRAND.
> >
> >Well, darn. :)
> >
> >Any chance you could share some more information about the environment
> >where you're seeing this?  It'd be good to reconcile what you're seeing
> >with how the hardware is expected to behave.
>
> What CPU is this and could you clarify exactly how you run your busy loop?

That is already in this thread already. Maybe catch up on the whole
thing and then jump back in?
https://lore.kernel.org/all/Zbjw5hRHr_E6k18r@zx2c4.com/

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 18:05           ` Daniel P. Berrangé
@ 2024-01-30 18:24             ` Jason A. Donenfeld
  2024-01-30 18:31             ` Jason A. Donenfeld
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 18:24 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Dave Hansen, Reshetova, Elena, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Tue, Jan 30, 2024 at 7:06 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> So I'm not convinced we can blindly wave away RDRAND failures as
> guaranteed to mean broken hardware.

Indeed, and now I'm further disturbed by the @intel.com people on the
thread making these claims that are demonstratively false.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 18:05           ` Daniel P. Berrangé
  2024-01-30 18:24             ` Jason A. Donenfeld
@ 2024-01-30 18:31             ` Jason A. Donenfeld
  2024-01-30 18:40             ` H. Peter Anvin
  2024-01-31  8:16             ` Reshetova, Elena
  3 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 18:31 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Dave Hansen, Reshetova, Elena, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Tue, Jan 30, 2024 at 7:06 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > >> You're the Intel employee so you can find out about this with much
> > > >> more assurance than me, but I understand the sentence above to be _way
> > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > assumption that both of these instructions can fail not just due to broken
> > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > that supplies random numbers via RDRAND/RDSEED.
> > >
> > > I don't think the SDM is the right thing to look at for guidance here.
> > >
> > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > to be exceedingly rare by design.  If they're not, we're going to get
> > > our trusty torches and pitchforks and go after the folks who built the
> > > broken hardware.
> > >
> > > Repeat after me:
> > >
> > >         Regular RDRAND/RDSEED failures only occur on broken hardware
> > >
> > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > the best of it.  If it turns out that WARN() was because of a broken
> > > hardware _design_ then we go sharpen the pitchforks.
> > >
> > > Anybody disagree?
> >
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
> >
> > But, anyway, if the statement "RDRAND failures only occur on broken
> > hardware" is true, then a WARN() in the failure path there presents no
> > DoS potential of any kind, and so that's a straightforward conclusion
> > to this discussion. However, that really hinges on  "RDRAND failures
> > only occur on broken hardware" being a true statement.
>
> There's a useful comment here from an Intel engineer
>
> https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
>
>   "RDRAND is, indeed, faster than RDSEED because it comes
>    from a hardware-based pseudorandom number generator.
>    One seed value (effectively, the output from one RDSEED
>    command) can provide up to 511 128-bit random values
>    before forcing a reseed"
>
> We know we can exhaust RDSEED directly pretty trivially. Making your
> test program run in parallel across 20 cpus, I got a mere 3% success
> rate from RDSEED.
>
> If RDRAND is reseeding every 511 values, RDRAND output would have
> to be consumed significantly faster than RDSEED in order that the
> reseed will happen frequently enough to exhaust the seeds.
>
> This looks pretty hard, but maybe with a large enough CPU count
> this will be possible in extreme load ?

So what this suggests is that the guest-guest DoS caused by looping
forever (or panic-on-warn'ing) is at least possible on large enough
hardware for some non-zero amount of time, depending on whatever hard
to hit environmental factors.

Another approach would be to treat this as a hardware flaw, in that
the RDRAND does not provide a universally reliable interface, and so
something like CoCo doesn't work with the current design, and so Intel
should issue some microcode updates that gives some separated pools
and separated rate limiting on a per-VMX ring 0 basis. Or something
like that. I dunno; maybe it's unrealistic to hope Intel will repair
their interface. But I think we've got to acknowledge that it's sort
of broken/irreliable.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 14:06     ` Jason A. Donenfeld
  2024-01-30 14:43       ` Daniel P. Berrangé
@ 2024-01-30 18:35       ` Jason A. Donenfeld
  2024-01-30 19:06         ` Reshetova, Elena
  1 sibling, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 18:35 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Elena,

On Tue, Jan 30, 2024 at 3:06 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> 2) Can a malicious host *actually* create a fully deterministic
> environment? One that'll produce the same timing for the jitter
> entropy creation, and all the other timers and interrupts and things?
> I imagine the attestation part of CoCo means these VMs need to run on
> real Intel silicon and so it can't be single stepped in TCG or
> something, right? So is this problem actually a real one? And to what
> degree? Any good experimental research on this?

I'd like to re-up this question. It seems like assessing the reality
of the concern would be worthwhile.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 18:23             ` Jason A. Donenfeld
@ 2024-01-30 18:37               ` Dave Hansen
  0 siblings, 0 replies; 99+ messages in thread
From: Dave Hansen @ 2024-01-30 18:37 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On 1/30/24 10:23, Jason A. Donenfeld wrote:
>> Any chance you could share some more information about the environment
>> where you're seeing this?  It'd be good to reconcile what you're seeing
>> with how the hardware is expected to behave.
> That is already in this thread already. Maybe catch up on the whole
> thing and then jump back in?
> https://lore.kernel.org/all/Zbjw5hRHr_E6k18r@zx2c4.com/

Gah, sorry about that.  I can reproduce what you're seeing, and it does
seem widespread.  Let me do some digging and see where we got our wires
crossed.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 18:05           ` Daniel P. Berrangé
  2024-01-30 18:24             ` Jason A. Donenfeld
  2024-01-30 18:31             ` Jason A. Donenfeld
@ 2024-01-30 18:40             ` H. Peter Anvin
  2024-01-31  8:16             ` Reshetova, Elena
  3 siblings, 0 replies; 99+ messages in thread
From: H. Peter Anvin @ 2024-01-30 18:40 UTC (permalink / raw)
  To: Daniel P. Berrangé, Jason A. Donenfeld
  Cc: Dave Hansen, Reshetova, Elena, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On January 30, 2024 10:05:59 AM PST, "Daniel P. Berrangé" <berrange@redhat.com> wrote:
>On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
>> On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
>> >
>> > On 1/30/24 05:45, Reshetova, Elena wrote:
>> > >> You're the Intel employee so you can find out about this with much
>> > >> more assurance than me, but I understand the sentence above to be _way
>> > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
>> > >> "RDRAND failing can only be due to totally broken hardware"
>> > > No, this is not the case per Intel SDM. I think we can live under a simple
>> > > assumption that both of these instructions can fail not just due to broken
>> > > HW, but also due to enough pressure put into the whole DRBG construction
>> > > that supplies random numbers via RDRAND/RDSEED.
>> >
>> > I don't think the SDM is the right thing to look at for guidance here.
>> >
>> > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
>> > to be exceedingly rare by design.  If they're not, we're going to get
>> > our trusty torches and pitchforks and go after the folks who built the
>> > broken hardware.
>> >
>> > Repeat after me:
>> >
>> >         Regular RDRAND/RDSEED failures only occur on broken hardware
>> >
>> > If it's nice hardware that's gone bad, then we WARN() and try to make
>> > the best of it.  If it turns out that WARN() was because of a broken
>> > hardware _design_ then we go sharpen the pitchforks.
>> >
>> > Anybody disagree?
>> 
>> Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
>> in a busy loop. So at the very least, your statement holds true only
>> for RDRAND.
>> 
>> But, anyway, if the statement "RDRAND failures only occur on broken
>> hardware" is true, then a WARN() in the failure path there presents no
>> DoS potential of any kind, and so that's a straightforward conclusion
>> to this discussion. However, that really hinges on  "RDRAND failures
>> only occur on broken hardware" being a true statement.
>
>There's a useful comment here from an Intel engineer
>
>https://web.archive.org/web/20190219074642/https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
>
>  "RDRAND is, indeed, faster than RDSEED because it comes
>   from a hardware-based pseudorandom number generator.
>   One seed value (effectively, the output from one RDSEED
>   command) can provide up to 511 128-bit random values
>   before forcing a reseed"
>
>We know we can exhaust RDSEED directly pretty trivially. Making your
>test program run in parallel across 20 cpus, I got a mere 3% success
>rate from RDSEED.
>
>If RDRAND is reseeding every 511 values, RDRAND output would have
>to be consumed significantly faster than RDSEED in order that the
>reseed will happen frequently enough to exhaust the seeds.
>
>This looks pretty hard, but maybe with a large enough CPU count
>this will be possible in extreme load ?
>
>So I'm not convinced we can blindly wave away RDRAND failures as
>guaranteed to mean broken hardware.
>
>With regards,
>Daniel

The general approach has been "don't credit entropy and try again on the next interrupt." We can, of course, be much more aggressive during boot.

We only need 256-512 bits for the kernel random pool to be equivalent to breaking mainstream crypto primitives even if it is a PRNG only from that point on (which is extremely unlikely.) The Linux PRNG has a very large state, which helps buffer entropy variations.

Again, applications *should* be using /dev/[u]random as appropriate, and if they opt to use lower level primitives in user space they need to implement them correctly – there is literally nothing the kernel can do at that point.

If the probability of success is 3% per your CPU that is still 2 bits of true entropy per invocation. However, the probability of failure after 16 loops is over 60%. I think this validates the concept of continuing to poll periodically rather than looping in time critical paths.


^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 18:35       ` Jason A. Donenfeld
@ 2024-01-30 19:06         ` Reshetova, Elena
  2024-01-30 19:16           ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-30 19:06 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel


> Elena,
> 
> On Tue, Jan 30, 2024 at 3:06 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > 2) Can a malicious host *actually* create a fully deterministic
> > environment? One that'll produce the same timing for the jitter
> > entropy creation, and all the other timers and interrupts and things?
> 
> I'd like to re-up this question. It seems like assessing the reality
> of the concern would be worthwhile.

Yes, sorry, I am just behind answering this thread and it is getting late here. 
This is exactly what I would like to have an open discussion about
with inputs from everyone. 
We have to remember that it is not only about host 'producing' 
a fully deterministic environment, but also about host being able to 
*observe* the entropy input. So the more precise question to ask is
how much can a host observe? My personal understanding is that host can
observe all guest interrupts and their timings, including APIC timer interrupts
(and IPIs), so what is actually left for the guest as unobservable entropy
input? 
And let's also please remember that this is by no means Intel-specific,
we have other confidential computing vendors, so we need a common
agreement on what is the superset of attacker powers that we can
assume.

> > I imagine the attestation part of CoCo means these VMs need to run on
> > real Intel silicon and so it can't be single stepped in TCG or
> > something, right? 

Yes, there is an attestation of a confidential VM and some protections in place
that helps against single-stepping attacks. But I am not sure how this is relevant
for this, could you please clarify? 

Best Regards,
Elena.


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 19:06         ` Reshetova, Elena
@ 2024-01-30 19:16           ` Jason A. Donenfeld
  2024-01-31  7:56             ` Reshetova, Elena
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-30 19:16 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Hi Elena,

On Tue, Jan 30, 2024 at 8:06 PM Reshetova, Elena
<elena.reshetova@intel.com> wrote:
> Yes, sorry, I am just behind answering this thread and it is getting late here.
> This is exactly what I would like to have an open discussion about
> with inputs from everyone.
> We have to remember that it is not only about host 'producing'
> a fully deterministic environment, but also about host being able to
> *observe* the entropy input. So the more precise question to ask is
> how much can a host observe?

Right, observation is just as relevant.

> My personal understanding is that host can
> observe all guest interrupts and their timings, including APIC timer interrupts
> (and IPIs), so what is actually left for the guest as unobservable entropy
> input?

Check out try_to_generate_entropy() and random_get_entropy(), for
example. How observable is RDTSC? Other HPTs?

> > > I imagine the attestation part of CoCo means these VMs need to run on
> > > real Intel silicon and so it can't be single stepped in TCG or
> > > something, right?
>
> Yes, there is an attestation of a confidential VM and some protections in place
> that helps against single-stepping attacks. But I am not sure how this is relevant
> for this, could you please clarify?

I was just thinking that if this didn't require genuine Intel hardware
with prebaked keys in it that you could emulate a CPU and all its
peripherals and ram with defined latencies and such, and run the VM in
a very straightforwardly deterministic environment, because nothing
would be real. But if this does have to hit metal somewhere, then
there's some possibility you at least interact with some hard-to-model
physical hardware.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-30 19:16           ` Jason A. Donenfeld
@ 2024-01-31  7:56             ` Reshetova, Elena
  2024-01-31 13:14               ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-31  7:56 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

> Hi Elena,
> 
> On Tue, Jan 30, 2024 at 8:06 PM Reshetova, Elena
> <elena.reshetova@intel.com> wrote:
> > Yes, sorry, I am just behind answering this thread and it is getting late here.
> > This is exactly what I would like to have an open discussion about
> > with inputs from everyone.
> > We have to remember that it is not only about host 'producing'
> > a fully deterministic environment, but also about host being able to
> > *observe* the entropy input. So the more precise question to ask is
> > how much can a host observe?
> 
> Right, observation is just as relevant.
> 
> > My personal understanding is that host can
> > observe all guest interrupts and their timings, including APIC timer interrupts
> > (and IPIs), so what is actually left for the guest as unobservable entropy
> > input?
> 
> Check out try_to_generate_entropy() and random_get_entropy(), for
> example. How observable is RDTSC? Other HPTs?

Ok, here imo it gets arch-specific and so please treat my answers only 
with Intel TDX arch in mind. I do know that for example AMD behavior
for TSC is different, albeit I am not sure of details. Other archs might also
have different behavior.

For Intel TDX, when a guest executes RDTSC, it gets a virtual TSC value that
is calculated deterministically based on a bunch of inputs that are either
platform HW specific or VMM/host configured. The physical TSC value is taken 
into account also in calculations.  The guest itself is not able to
use usual controls (such as IA32_TSC_ADJUST and such). For details (albeit not
exact calculations) please see [1]. If you are interested in exact calculations, 
the public source code of TDX module is a better reference [2], check
calculate_virt_tsc() or just grep with "tsc" it would show you both comments
explaining what is happening and calculations.
So given this, I would personally consider the virtual guest TSC value
observable by host/VMM. 

 [1] TDX module spec, section 11.13 Time Stamp Counter (TSC)
https://cdrdv2.intel.com/v1/dl/getContent/733575 
[2] TDX module source code:
https://www.intel.com/content/www/us/en/download/738875/782152/intel-trust-domain-extension-intel-tdx-module.html

For the high resolution timers, host controls guest apic timers and interrupts fully. 
So, it has the power to see and even affect when a certain interrupt happens
or doesnt happen in the guest. It can delay guest timers at its will on pretty
extensive time periods. This seems powerful enough for me. 
Things like HPET are also fully under host control. 

> > > > I imagine the attestation part of CoCo means these VMs need to run on
> > > > real Intel silicon and so it can't be single stepped in TCG or
> > > > something, right?
> >
> > Yes, there is an attestation of a confidential VM and some protections in place
> > that helps against single-stepping attacks. But I am not sure how this is relevant
> > for this, could you please clarify?
> 
> I was just thinking that if this didn't require genuine Intel hardware
> with prebaked keys in it that you could emulate a CPU and all its
> peripherals and ram with defined latencies and such, and run the VM in
> a very straightforwardly deterministic environment, because nothing
> would be real. But if this does have to hit metal somewhere, then
> there's some possibility you at least interact with some hard-to-model
> physical hardware.

Yes, in practice there will be physical hw underneath, but the problem imo is
that the host is in between and still very powerful when it comes to interrupts and
timers at the moment. So, I want to make sure people understand the potential
implications overall, and in this case the potential implications on such a
critical security component as Linux RNG. 

Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-30 18:05           ` Daniel P. Berrangé
                               ` (2 preceding siblings ...)
  2024-01-30 18:40             ` H. Peter Anvin
@ 2024-01-31  8:16             ` Reshetova, Elena
  2024-01-31 11:59               ` Dr. Greg
                                 ` (2 more replies)
  3 siblings, 3 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-31  8:16 UTC (permalink / raw)
  To: Daniel P. Berrangé, Jason A. Donenfeld
  Cc: Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

> On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > On Tue, Jan 30, 2024 at 6:32 PM Dave Hansen <dave.hansen@intel.com> wrote:
> > >
> > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > >> You're the Intel employee so you can find out about this with much
> > > >> more assurance than me, but I understand the sentence above to be _way
> > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > assumption that both of these instructions can fail not just due to broken
> > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > that supplies random numbers via RDRAND/RDSEED.
> > >
> > > I don't think the SDM is the right thing to look at for guidance here.
> > >
> > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > to be exceedingly rare by design.  If they're not, we're going to get
> > > our trusty torches and pitchforks and go after the folks who built the
> > > broken hardware.
> > >
> > > Repeat after me:
> > >
> > >         Regular RDRAND/RDSEED failures only occur on broken hardware
> > >
> > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > the best of it.  If it turns out that WARN() was because of a broken
> > > hardware _design_ then we go sharpen the pitchforks.
> > >
> > > Anybody disagree?
> >
> > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > in a busy loop. So at the very least, your statement holds true only
> > for RDRAND.
> >
> > But, anyway, if the statement "RDRAND failures only occur on broken
> > hardware" is true, then a WARN() in the failure path there presents no
> > DoS potential of any kind, and so that's a straightforward conclusion
> > to this discussion. However, that really hinges on  "RDRAND failures
> > only occur on broken hardware" being a true statement.
> 
> There's a useful comment here from an Intel engineer
> 
> https://web.archive.org/web/20190219074642/https://software.intel.com/en-
> us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
> 
>   "RDRAND is, indeed, faster than RDSEED because it comes
>    from a hardware-based pseudorandom number generator.
>    One seed value (effectively, the output from one RDSEED
>    command) can provide up to 511 128-bit random values
>    before forcing a reseed"
> 
> We know we can exhaust RDSEED directly pretty trivially. Making your
> test program run in parallel across 20 cpus, I got a mere 3% success
> rate from RDSEED.
> 
> If RDRAND is reseeding every 511 values, RDRAND output would have
> to be consumed significantly faster than RDSEED in order that the
> reseed will happen frequently enough to exhaust the seeds.
> 
> This looks pretty hard, but maybe with a large enough CPU count
> this will be possible in extreme load ?
> 
> So I'm not convinced we can blindly wave away RDRAND failures as
> guaranteed to mean broken hardware.

This matches both my understanding (I do have cryptography background
and understanding how cryptographic RNGs work)
and official public docs that Intel published on this matter. 
Given that the physical entropy source is limited anyhow, and by giving
enough pressure on the whole construction you should be able to
make RDRAND fail because if the intermediate AES-CBC MAC extractor/
conditioner is not getting its min entropy input rate, it wont
produce a proper seed for AES CTR DRBG. 
Of course exact details/numbers can wary between different generations of 
Intel DRNG implementation, and the platforms where it is running on,
so be careful to sticking to concrete numbers. 

That said, I have taken an AR to follow up internally on what can be done
to improve our situation with RDRAND/RDSEED. But I would still like to
finish the discussion on what people think should be done in the
meanwhile keeping in mind that the problem is not intel specific, despite us
intel people bringing it for public discussion first. The old saying is still here:
"Please don’t shoot the messenger" )) We are actually trying to be open 
about these things and create a public discussion. 

Best Regards,
Elena. 

> 
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31  8:16             ` Reshetova, Elena
@ 2024-01-31 11:59               ` Dr. Greg
  2024-01-31 13:06               ` Jason A. Donenfeld
  2024-02-06  1:12               ` Dr. Greg
  2 siblings, 0 replies; 99+ messages in thread
From: Dr. Greg @ 2024-01-31 11:59 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Daniel P. Berrang??,
	Jason A. Donenfeld, Hansen, Dave, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 08:16:56AM +0000, Reshetova, Elena wrote:

Good morning, I hope the week is going well for everyone.

> > On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > > On Tue, Jan 30, 2024 at 6:32???PM Dave Hansen <dave.hansen@intel.com> wrote:
> > > >
> > > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > > >> You're the Intel employee so you can find out about this with much
> > > > >> more assurance than me, but I understand the sentence above to be _way
> > > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > > assumption that both of these instructions can fail not just due to broken
> > > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > > that supplies random numbers via RDRAND/RDSEED.
> > > >
> > > > I don't think the SDM is the right thing to look at for guidance here.
> > > >
> > > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > > to be exceedingly rare by design.  If they're not, we're going to get
> > > > our trusty torches and pitchforks and go after the folks who built the
> > > > broken hardware.
> > > >
> > > > Repeat after me:
> > > >
> > > >         Regular RDRAND/RDSEED failures only occur on broken hardware
> > > >
> > > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > > the best of it.  If it turns out that WARN() was because of a broken
> > > > hardware _design_ then we go sharpen the pitchforks.
> > > >
> > > > Anybody disagree?
> > >
> > > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > > in a busy loop. So at the very least, your statement holds true only
> > > for RDRAND.
> > >
> > > But, anyway, if the statement "RDRAND failures only occur on broken
> > > hardware" is true, then a WARN() in the failure path there presents no
> > > DoS potential of any kind, and so that's a straightforward conclusion
> > > to this discussion. However, that really hinges on  "RDRAND failures
> > > only occur on broken hardware" being a true statement.
> > 
> > There's a useful comment here from an Intel engineer
> > 
> > https://web.archive.org/web/20190219074642/https://software.intel.com/en-
> > us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
> > 
> >   "RDRAND is, indeed, faster than RDSEED because it comes
> >    from a hardware-based pseudorandom number generator.
> >    One seed value (effectively, the output from one RDSEED
> >    command) can provide up to 511 128-bit random values
> >    before forcing a reseed"
> > 
> > We know we can exhaust RDSEED directly pretty trivially. Making your
> > test program run in parallel across 20 cpus, I got a mere 3% success
> > rate from RDSEED.
> > 
> > If RDRAND is reseeding every 511 values, RDRAND output would have
> > to be consumed significantly faster than RDSEED in order that the
> > reseed will happen frequently enough to exhaust the seeds.
> > 
> > This looks pretty hard, but maybe with a large enough CPU count
> > this will be possible in extreme load ?
> > 
> > So I'm not convinced we can blindly wave away RDRAND failures as
> > guaranteed to mean broken hardware.

> This matches both my understanding (I do have cryptography
> background and understanding how cryptographic RNGs work) and
> official public docs that Intel published on this matter.  Given
> that the physical entropy source is limited anyhow, and by giving
> enough pressure on the whole construction you should be able to make
> RDRAND fail because if the intermediate AES-CBC MAC extractor/
> conditioner is not getting its min entropy input rate, it wont
> produce a proper seed for AES CTR DRBG.  Of course exact
> details/numbers can wary between different generations of Intel DRNG
> implementation, and the platforms where it is running on, so be
> careful to sticking to concrete numbers.
>
> That said, I have taken an AR to follow up internally on what can be
> done to improve our situation with RDRAND/RDSEED. But I would still
> like to finish the discussion on what people think should be done in
> the meanwhile keeping in mind that the problem is not intel
> specific, despite us intel people bringing it for public discussion
> first. The old saying is still here: "Please don't shoot the
> messenger" )) We are actually trying to be open about these things
> and create a public discussion.

Based on Dave Hansen's comments above, it appears that the COCO
community needs to break out the oil and whetstones and hone the tips
of their pitchforks.. :-)

The positive issue in all of this is that, to date, TDX hardware has
not seen significant public availability.  I suspect that when that
happens, if this problem isn't corrected, there will be the usual
flood of papers demonstrating quasi-practical lab attacks that stem
from the fruits of a poisonable random number source.

The problem reproduces pretty easily, albeit on somewhat dated
hardware.

One of our lab machines, that reports a model name of 'Intel(R)
Core(TM) i5-6500 CPU @ 3.20GHz', shows a consistent failure rate of
65% for RDSEED on a single-threaded test.  It gets worse when more
simultaneous demand is placed on the hardware randomness source, as
was demonstrated elsewhere.

Corrupted randomness has the potential to strike pretty deeply into
the TDX attestation architecture, given the need to generate signed
attestation reports and the SGX/TDX key generation architecture that
requires 'wearout protection'.  Beyond that, there is the subsequent
need to conduct userspace attestation, currently by IMA as I believe
is the intent, that in turn requires cryptography with undeniable
integrity.

At this point, given that all this is about confidentiality, that in
turn implies a trusted platform, there is only one option, panic and
hard fail the boot if there is any indication that the hardware has
not been able to provide sound instruction based randomness.  Doing
anything else breaks the 'contract' that a user is pushing a workload
into a trusted/confidential environment.

RDSEED is the root of hardware instruction based randomness and its
randomness comes from quantum noise across a diode junction
(simplistically).  The output of RDSEED drives the AES derived RDRAND
randomness.

Additional clarification from inside of Intel on this is needed, but
the problem would appear to be that the above infrastructure
(RDSEED/RDRAND) is part of the 'Uncore' architecture, rather than
being core specific.  This creates an incestuous relationship across
all of the cores sharing a resource, that as in the past, creates
security issues.

This issue was easily anticipated and foreshadowed by the
demonstration of the CVE-2020-0543/CrossTalk vulnerability.

If the above interpretion is correct, a full fix should be 'straight
forward', for some definition of 'straight forward'... :-)

On TDX capable hardare, the RDSEED noise source needs to come from a
piece of core specific silicon.  If the boot of a TDX VM is core
locked, this would create an environment where a socket based sibling
adversary would be unable to compromise the root of the randomness
source.

Once the Linux random number generator is properly seeded, the issue
should be over, given that by now, everyone has agreed that a properly
initialized Linux RNG cannot 'run out' of randomness.

Given that it seems pretty clear that timing and other 'noise'
information in a COCO environment can't be trusted, having core
specific randomness would be a win for the overall cryptographic
health of VM's that are running in a COCO environment.

Once again, an attack doesn't need to be practical, only demonstrable.
Once demonstrated, faith is lost in the technology, SGX clearly
demonstrated that, as did the micro-architectural attacks.

Both SGX and TDX operate from the notion of 'you trust Intel and the
silicon', so the fix is for Intel to implement a secure silicon based
source of randomness.

AMD will probably need the same thing.

> Best Regards,
> Elena. 

Hopefuly the above is helpful in furthering these discussions.

Have a good remainder of the week.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31  8:16             ` Reshetova, Elena
  2024-01-31 11:59               ` Dr. Greg
@ 2024-01-31 13:06               ` Jason A. Donenfeld
  2024-01-31 18:02                 ` Reshetova, Elena
  2024-01-31 20:35                 ` Dr. Greg
  2024-02-06  1:12               ` Dr. Greg
  2 siblings, 2 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-31 13:06 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Daniel P. Berrangé,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena
<elena.reshetova@intel.com> wrote:
> This matches both my understanding (I do have cryptography background
> and understanding how cryptographic RNGs work)
> and official public docs that Intel published on this matter.
> Given that the physical entropy source is limited anyhow, and by giving
> enough pressure on the whole construction you should be able to
> make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> conditioner is not getting its min entropy input rate, it wont
> produce a proper seed for AES CTR DRBG.
> Of course exact details/numbers can wary between different generations of
> Intel DRNG implementation, and the platforms where it is running on,
> so be careful to sticking to concrete numbers.

Alright, so RDRAND is not reliable. The question for us now is: do we
want RDRAND unreliability to translate to another form of
unreliability elsewhere, e.g. DoS/infiniteloop/latency/WARN_ON()? Or
would it be better to declare the hardware simply broken and ask Intel
to fix it? (I don't know the answer to that question.)

> That said, I have taken an AR to follow up internally on what can be done
> to improve our situation with RDRAND/RDSEED.

Specifying this is an interesting question. What exactly might our
requirements be for a "non-broken" RDRAND? It seems like we have two
basic ones:

- One VMX (or host) context can't DoS another one.
- Ring 3 can't DoS ring 0.

I don't know whether that'd be implemented with context-tied rate
limiting or more state or what. But I think, short of just making
RDRAND never fail, that's basically what's needed.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31  7:56             ` Reshetova, Elena
@ 2024-01-31 13:14               ` Jason A. Donenfeld
  2024-01-31 14:07                 ` Theodore Ts'o
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-31 13:14 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Wed, Jan 31, 2024 at 8:56 AM Reshetova, Elena
<elena.reshetova@intel.com> wrote:
> So given this, I would personally consider the virtual guest TSC value
> observable by host/VMM.
> [2] TDX module source code:
> https://www.intel.com/content/www/us/en/download/738875/782152/intel-trust-domain-extension-intel-tdx-module.html

Thanks for the explanation and link. Indeed if this is all mediated by
the host, we're in bad shape.

> For the high resolution timers, host controls guest apic timers and interrupts fully.
> So, it has the power to see and even affect when a certain interrupt happens
> or doesnt happen in the guest. It can delay guest timers at its will on pretty
> extensive time periods. This seems powerful enough for me.
> Things like HPET are also fully under host control.

And I suppose RDPMC is similar?

And it's not like the guest can just take an excessive amount of TSC
samples and randomly select which ones it uses because chickens and
eggs...

The situation you paint is that all of our entropy inputs -- timers,
rdrand, etc -- are either host controllable, host observable, or host
(and guest sibling) DoS'able, so if you don't trust the host, there
are no good inputs. That's not a great position to be in, and I wonder
if something can be done on the hardware side to remedy it, as this
seems like a major shortcoming in TDX. So far, all of the proposed
mitigations introduce some other DoS.

> Yes, in practice there will be physical hw underneath, but the problem imo is
> that the host is in between and still very powerful when it comes to interrupts and
> timers at the moment.

Sure sounds like it.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 13:14               ` Jason A. Donenfeld
@ 2024-01-31 14:07                 ` Theodore Ts'o
  2024-01-31 14:45                   ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-01-31 14:07 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

What about simply treating boot-time initialization of the /dev/random
state as special.  That is, on x86, if the hardware promises that
RDSEED or RDRAND is available, we use them to initialization our RNG
state at boot.  On bare metal, there can't be anyone else trying to
exhaust the on-chip RNG's entropy supply, so if RDSEED or RDRAND
aren't working available --- panic, since the hardware is clearly
busted.

On a guest OS, if confidential compute is enabled, and if RDSEED and
RDRAND don't work after N retries, and we know CC is enabled, panic,
since the kernel can't provide the promised security gaurantees, and
the CC developers and users are cordially invited to sharpen their
pitchforks and to send their tender regards to the Intel RNG
engineers.

For non-confidential compute guests, the question is what is the
appropriate reaction if another VM, possibly belonging to a different
user/customer, is carrying out a RDRAND DOS attack.  I'd argue that in
these cases, if the guest VM is using virtio-random, then the host's
/dev/random should be able to cover for cases of Intel RNG exhaustion,
and allowing other customer to be able to prevent other user's VM's
from being able to boot is the the greater evil, so we shouldn't treat
boot-time RDRAND/RDSEED failures as panic-worthy.

						- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 14:07                 ` Theodore Ts'o
@ 2024-01-31 14:45                   ` Jason A. Donenfeld
  2024-01-31 14:52                     ` Jason A. Donenfeld
  2024-01-31 17:10                     ` Theodore Ts'o
  0 siblings, 2 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-31 14:45 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 09:07:56AM -0500, Theodore Ts'o wrote:
> What about simply treating boot-time initialization of the /dev/random
> state as special.  That is, on x86, if the hardware promises that
> RDSEED or RDRAND is available, we use them to initialization our RNG
> state at boot.  On bare metal, there can't be anyone else trying to
> exhaust the on-chip RNG's entropy supply, so if RDSEED or RDRAND
> aren't working available --- panic, since the hardware is clearly
> busted.

This is the first thing I suggested here: https://lore.kernel.org/all/CAHmME9qsfOdOEHHw_MOBmt6YAtncbbqP9LPK2dRjuOp1CrHzRA@mail.gmail.com/

But Elena found this dissatisfying because we still can't guarantee new
material later.

> On a guest OS, if confidential compute is enabled, and if RDSEED and
> RDRAND don't work after N retries, and we know CC is enabled, panic,
> since the kernel can't provide the promised security gaurantees, and
> the CC developers and users are cordially invited to sharpen their
> pitchforks and to send their tender regards to the Intel RNG
> engineers.

Yea, maybe bubbling the RDRAND DoS up to another DoS in the CoCo case is
a good tradeoff that will produce the right pitchforkers without
breaking anything real.

> For non-confidential compute guests, the question is what is the
> appropriate reaction if another VM, possibly belonging to a different
> user/customer, is carrying out a RDRAND DOS attack.  I'd argue that in
> these cases, if the guest VM is using virtio-random, then the host's
> /dev/random should be able to cover for cases of Intel RNG exhaustion,
> and allowing other customer to be able to prevent other user's VM's
> from being able to boot is the the greater evil, so we shouldn't treat
> boot-time RDRAND/RDSEED failures as panic-worthy.

The non-CoCo case is fine, because guests can trust hosts, so things are
as they have been forever.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 14:45                   ` Jason A. Donenfeld
@ 2024-01-31 14:52                     ` Jason A. Donenfeld
  2024-01-31 17:10                     ` Theodore Ts'o
  1 sibling, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-31 14:52 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 03:45:06PM +0100, Jason A. Donenfeld wrote:
> On Wed, Jan 31, 2024 at 09:07:56AM -0500, Theodore Ts'o wrote:
> > What about simply treating boot-time initialization of the /dev/random
> > state as special.  That is, on x86, if the hardware promises that
> > RDSEED or RDRAND is available, we use them to initialization our RNG
> > state at boot.  On bare metal, there can't be anyone else trying to
> > exhaust the on-chip RNG's entropy supply, so if RDSEED or RDRAND
> > aren't working available --- panic, since the hardware is clearly
> > busted.
> 
> This is the first thing I suggested here: https://lore.kernel.org/all/CAHmME9qsfOdOEHHw_MOBmt6YAtncbbqP9LPK2dRjuOp1CrHzRA@mail.gmail.com/
> 
> But Elena found this dissatisfying because we still can't guarantee new
> material later.
> 
> > On a guest OS, if confidential compute is enabled, and if RDSEED and
> > RDRAND don't work after N retries, and we know CC is enabled, panic,
> > since the kernel can't provide the promised security gaurantees, and
> > the CC developers and users are cordially invited to sharpen their
> > pitchforks and to send their tender regards to the Intel RNG
> > engineers.
> 
> Yea, maybe bubbling the RDRAND DoS up to another DoS in the CoCo case is
> a good tradeoff that will produce the right pitchforkers without
> breaking anything real.

One problem, though, is userspace can DoS the kernel's use of RDRAND.
So probably infinitely retrying in CoCo environments is better than
panicing/warning, since ostensibly a kthread will eventually succeed.

Maybe, though, the Intel platform just simply isn't ready for CoCo, and
marketing got a little bit ahead of the tech?

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 14:45                   ` Jason A. Donenfeld
  2024-01-31 14:52                     ` Jason A. Donenfeld
@ 2024-01-31 17:10                     ` Theodore Ts'o
  2024-01-31 17:37                       ` Reshetova, Elena
  1 sibling, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-01-31 17:10 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 03:45:06PM +0100, Jason A. Donenfeld wrote:
> On Wed, Jan 31, 2024 at 09:07:56AM -0500, Theodore Ts'o wrote:
> > What about simply treating boot-time initialization of the /dev/random
> > state as special.  That is, on x86, if the hardware promises that
> > RDSEED or RDRAND is available, we use them to initialization our RNG
> > state at boot.  On bare metal, there can't be anyone else trying to
> > exhaust the on-chip RNG's entropy supply, so if RDSEED or RDRAND
> > aren't working available --- panic, since the hardware is clearly
> > busted.
> 
> This is the first thing I suggested here: https://lore.kernel.org/all/CAHmME9qsfOdOEHHw_MOBmt6YAtncbbqP9LPK2dRjuOp1CrHzRA@mail.gmail.com/
> 
> But Elena found this dissatisfying because we still can't guarantee new
> material later.

Right, but this is good enough that modulo in-kernel RNG state
compromise, or the ability to attack the underlying cryptographic
primitives (in which case we have much bigger vulnerabilities than
this largely theoretical one), even if we don't have new material
later, the in-kernel RNG for the CC VM should be sufficiently
trustworthy for government work.

> Yea, maybe bubbling the RDRAND DoS up to another DoS in the CoCo case is
> a good tradeoff that will produce the right pitchforkers without
> breaking anything real.

<Evil Grin>

					- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 17:10                     ` Theodore Ts'o
@ 2024-01-31 17:37                       ` Reshetova, Elena
  2024-01-31 18:01                         ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-31 17:37 UTC (permalink / raw)
  To: Theodore Ts'o, Jason A. Donenfeld
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel



> On Wed, Jan 31, 2024 at 03:45:06PM +0100, Jason A. Donenfeld wrote:
> > On Wed, Jan 31, 2024 at 09:07:56AM -0500, Theodore Ts'o wrote:
> > > What about simply treating boot-time initialization of the /dev/random
> > > state as special.  That is, on x86, if the hardware promises that
> > > RDSEED or RDRAND is available, we use them to initialization our RNG
> > > state at boot.  On bare metal, there can't be anyone else trying to
> > > exhaust the on-chip RNG's entropy supply, so if RDSEED or RDRAND
> > > aren't working available --- panic, since the hardware is clearly
> > > busted.
> >
> > This is the first thing I suggested here:
> https://lore.kernel.org/all/CAHmME9qsfOdOEHHw_MOBmt6YAtncbbqP9LPK2dRjuO
> p1CrHzRA@mail.gmail.com/
> >
> > But Elena found this dissatisfying because we still can't guarantee new
> > material later.
> 
> Right, but this is good enough that modulo in-kernel RNG state
> compromise, or the ability to attack the underlying cryptographic
> primitives (in which case we have much bigger vulnerabilities than
> this largely theoretical one), even if we don't have new material
> later, the in-kernel RNG for the CC VM should be sufficiently
> trustworthy for government work.

I agree, this is probably the best we can do at the moment. 
I did want to point out the runtime need of fresh entropy also, but
as we discussed in this thread we might not be able to get it 
without introducing a DoS path for the userspace. 
In this case, it is the best to only loose the forward prediction property
vs. the whole Linux RNG. 

Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 17:37                       ` Reshetova, Elena
@ 2024-01-31 18:01                         ` Jason A. Donenfeld
  2024-02-01  4:57                           ` Theodore Ts'o
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-01-31 18:01 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Theodore Ts'o, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 6:37 PM Reshetova, Elena
<elena.reshetova@intel.com> wrote:
>
>
>
> > On Wed, Jan 31, 2024 at 03:45:06PM +0100, Jason A. Donenfeld wrote:
> > > On Wed, Jan 31, 2024 at 09:07:56AM -0500, Theodore Ts'o wrote:
> > > > What about simply treating boot-time initialization of the /dev/random
> > > > state as special.  That is, on x86, if the hardware promises that
> > > > RDSEED or RDRAND is available, we use them to initialization our RNG
> > > > state at boot.  On bare metal, there can't be anyone else trying to
> > > > exhaust the on-chip RNG's entropy supply, so if RDSEED or RDRAND
> > > > aren't working available --- panic, since the hardware is clearly
> > > > busted.
> > >
> > > This is the first thing I suggested here:
> > https://lore.kernel.org/all/CAHmME9qsfOdOEHHw_MOBmt6YAtncbbqP9LPK2dRjuO
> > p1CrHzRA@mail.gmail.com/
> > >
> > > But Elena found this dissatisfying because we still can't guarantee new
> > > material later.
> >
> > Right, but this is good enough that modulo in-kernel RNG state
> > compromise, or the ability to attack the underlying cryptographic
> > primitives (in which case we have much bigger vulnerabilities than
> > this largely theoretical one), even if we don't have new material
> > later, the in-kernel RNG for the CC VM should be sufficiently
> > trustworthy for government work.
>
> I agree, this is probably the best we can do at the moment.
> I did want to point out the runtime need of fresh entropy also, but
> as we discussed in this thread we might not be able to get it
> without introducing a DoS path for the userspace.
> In this case, it is the best to only loose the forward prediction property
> vs. the whole Linux RNG.

So if this is what we're congealing around, I guess we can:

0) Leave RDSEED alone and focus on RDRAND.
1) Add `WARN_ON_ONCE(in_early_boot);` to the failure path of RDRAND
(and simply hope this doesn't get exploited for guest-guest boot DoS).
2) Loop forever in RDRAND on CoCo VMs, post-boot, with the comments
and variable naming making it clear that this is a hardware bug
workaround, not a "feature" added for "extra security".
3) Complain loudly to Intel and get them to fix the hardware.

Though, a large part of me would really like to skip that step (2),
first because it's a pretty gross bandaid that adds lots of
complexity, and second because it'll make (3) less poignant.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31 13:06               ` Jason A. Donenfeld
@ 2024-01-31 18:02                 ` Reshetova, Elena
  2024-01-31 20:35                 ` Dr. Greg
  1 sibling, 0 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-01-31 18:02 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Daniel P. Berrangé,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

> On Wed, Jan 31, 2024 at 9:17 AM Reshetova, Elena
> <elena.reshetova@intel.com> wrote:
> > This matches both my understanding (I do have cryptography background
> > and understanding how cryptographic RNGs work)
> > and official public docs that Intel published on this matter.
> > Given that the physical entropy source is limited anyhow, and by giving
> > enough pressure on the whole construction you should be able to
> > make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> > conditioner is not getting its min entropy input rate, it wont
> > produce a proper seed for AES CTR DRBG.
> > Of course exact details/numbers can wary between different generations of
> > Intel DRNG implementation, and the platforms where it is running on,
> > so be careful to sticking to concrete numbers.
> 
> Alright, so RDRAND is not reliable. 

Correction here: "... not reliable *in theory*". Because in practice it
all depends on amount of pressure you are able to put on the overall 
construction, which goes into concrete numbers I warned about.
That would depend on the number of available cores, and some other
platform specific factors. I will work on getting this clarified externally
so that there is no confusion. 


The question for us now is: do we
> want RDRAND unreliability to translate to another form of
> unreliability elsewhere, e.g. DoS/infiniteloop/latency/WARN_ON()? Or
> would it be better to declare the hardware simply broken and ask Intel
> to fix it? (I don't know the answer to that question.)
> 
> > That said, I have taken an AR to follow up internally on what can be done
> > to improve our situation with RDRAND/RDSEED.
> 
> Specifying this is an interesting question. What exactly might our
> requirements be for a "non-broken" RDRAND? It seems like we have two
> basic ones:
> 
> - One VMX (or host) context can't DoS another one.
> - Ring 3 can't DoS ring 0.
> 
> I don't know whether that'd be implemented with context-tied rate
> limiting or more state or what. But I think, short of just making
> RDRAND never fail, that's basically what's needed.

I agree.

Best Regards,
Elena.

> 
> Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31 13:06               ` Jason A. Donenfeld
  2024-01-31 18:02                 ` Reshetova, Elena
@ 2024-01-31 20:35                 ` Dr. Greg
  2024-02-01  4:47                   ` Theodore Ts'o
  2024-02-01  7:26                   ` Reshetova, Elena
  1 sibling, 2 replies; 99+ messages in thread
From: Dr. Greg @ 2024-01-31 20:35 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Daniel P. Berrang??,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Wed, Jan 31, 2024 at 02:06:13PM +0100, Jason A. Donenfeld wrote:

Hi again to everyone, beautiful day here in North Dakota.

> On Wed, Jan 31, 2024 at 9:17???AM Reshetova, Elena
> <elena.reshetova@intel.com> wrote:
> > This matches both my understanding (I do have cryptography background
> > and understanding how cryptographic RNGs work)
> > and official public docs that Intel published on this matter.
> > Given that the physical entropy source is limited anyhow, and by giving
> > enough pressure on the whole construction you should be able to
> > make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> > conditioner is not getting its min entropy input rate, it wont
> > produce a proper seed for AES CTR DRBG.
> > Of course exact details/numbers can wary between different generations of
> > Intel DRNG implementation, and the platforms where it is running on,
> > so be careful to sticking to concrete numbers.

> Alright, so RDRAND is not reliable. The question for us now is: do
> we want RDRAND unreliability to translate to another form of
> unreliability elsewhere, e.g. DoS/infiniteloop/latency/WARN_ON()? Or
> would it be better to declare the hardware simply broken and ask
> Intel to fix it? (I don't know the answer to that question.)

I think it would demonstrate a lack of appropriate engineering
diligence on the part of our community to declare RDRAND 'busted' at
this point.

While it appeares to be trivially easy to force RDSEED into depletion,
there does not seem to be a suggestion, at least in the open
literature, that this directly or easily translates into stalling
output from RDRAND in any type of relevant adversarial fashion.

If this were the case, given what CVE's seem to be worth on a resume,
someone would have rented a cloud machine and come up with a POC
against RDRAND in a multi-tenant environment and then promptly put up
a web-site called 'Random Starve' or something equally ominous.

This is no doubt secondary to the 1022x amplication factor inherent in
the 'Bull Mountain' architecture.

I'm a bit surprised that no one from the Intel side of this
conversation didn't pitch this over the wall as soon as this
conversation came up, but I would suggest that everyone concerned
about this issue give the following a thorough read:

https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html

Relevant highlights:

- As I suggested in my earlier e-mail, random number generation is a
  socket based resource, hence an adversarial domain limited to only
  the cores on a common socket.

- There is a maximum randomness throughput rate of 800 MB/s over all
  cores sharing common random number infrastructure.  Single thread
  throughput rates of 70-200 MB/s are demonstratable.

- A failure of RDRAND over 10 re-tries is 'astronomically' small, with
  no definition of astronomical provided, one would assume really
  small, given they are using the word astronomical.

> > That said, I have taken an AR to follow up internally on what can be done
> > to improve our situation with RDRAND/RDSEED.

I think I can save you some time Elena.

> Specifying this is an interesting question. What exactly might our
> requirements be for a "non-broken" RDRAND? It seems like we have two
> basic ones:
> 
> - One VMX (or host) context can't DoS another one.
> - Ring 3 can't DoS ring 0.
> 
> I don't know whether that'd be implemented with context-tied rate
> limiting or more state or what. But I think, short of just making
> RDRAND never fail, that's basically what's needed.

I think we probably have that, for all intents and purposes, given
that we embrace the following methodogy:

- Use RDRAND exclusively.

- Be willing to take 10 swings at the plate.

- Given the somewhat demanding requirements for TDX/COCO, fail and
  either deadlock or panic after 10 swings since that would seem to
  suggest the hardware is broken, ie. RMA time.

Either deadlock or panic would be appropriate.  The objective in the
COCO environment is to get the person who clicked on the 'Enable Azure
Confidential' checkbox, or its equivalent, on their cloud dashboard,
to call the HelpDesk and ask them why their confidential application
won't come up.

After the user confirms to the HelpDesk that their computer is plugged
in, the problem will get fixed.  Either the broken hardware will be
identified and idled out or the mighty sword of vengeance will be
summoned down on whoever has all of the other cores on the socket
pegged.

Final thoughts:

- RDSEED is probably a poor thing to be using.

- There may be a reasonable argument that RDSEED shouldn't have been
  exposed above ring 0, but that ship has sailed.  Brownie points
  moving forward for an RDsomething that is ring 0 and has guaranteed
  access to some amount of functionally reasonable entropy.

- Intel and AMD are already doing a lot of 'special' stuff with their
  COCO hardware in order to defy the long standing adage of: 'You
  can't have security without physical security'.  Access to per core thermal
  noise, as I suggested, is probably a big lift but clever engineers can
  probably cook up some type of fairness doctrine for randomness in
  TDX or SEV_SNP, given the particular importance of instruction based
  randomness in COCO.

- Perfection is the enemy of good.

> Jason

Have a good day.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31 20:35                 ` Dr. Greg
@ 2024-02-01  4:47                   ` Theodore Ts'o
  2024-02-01  9:54                     ` Dr. Greg
  2024-02-01  7:26                   ` Reshetova, Elena
  1 sibling, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-01  4:47 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Jason A. Donenfeld, Reshetova, Elena, Daniel P. Berrang??,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 02:35:32PM -0600, Dr. Greg wrote:
> I think it would demonstrate a lack of appropriate engineering
> diligence on the part of our community to declare RDRAND 'busted' at
> this point.
> 
> While it appeares to be trivially easy to force RDSEED into depletion,
> there does not seem to be a suggestion, at least in the open
> literature, that this directly or easily translates into stalling
> output from RDRAND in any type of relevant adversarial fashion.
> 
> If this were the case, given what CVE's seem to be worth on a resume,
> someone would have rented a cloud machine and come up with a POC
> against RDRAND in a multi-tenant environment and then promptly put up
> a web-site called 'Random Starve' or something equally ominous.

I suspect the reason why DOS attacks aren't happening in practice, is
because of concerns over the ability to trust the RDRAND (how do you
prove that the NSA didn't put a backdoor into the hardware with
Intel's acquisence --- after all, the NSA absolutely positively didn't
encourage the kneecaping of WEP and absolutely didn't put a trapdoor
into DUAL_EC_DRBG...)  since it can not externally audited and verfied
by a third party, in contrast to the source code for the /dev/random
driver or the RNG used in OpenSSL.

As a result, most random number generators use RDRAND in combination
with other techniques.  If RDRAND is absolutely trustworthy, the extra
sources won't hurt --- and if it isn't trustworthy mixing in other
sources will likely make things harder for Fort Meade.  And even if
these other sources might be observable for someone who can listen in
on the inter-packet arrival times on the LAN (for example), it might
not be so easy for an analyst sitting at their desk in Fort Meade.

And once you do _that_, you don't need to necessarily loop on RDRAND,
because it's one of multiple sources of entropies that are getting
mixed togethwer.  Hence, even if someone drives RDRAND into depletion,
if they are using getrandom(2), it's not a big deal.

There's a special case with Confidential Compute VM's, since the
assumption is that you want to protect against even a malicious
hypervisor who could theoretically control all other sources of timing
uncertainty.  And so, yes, in that case, the only thing we can do is
Panic if RDRAND fails.

   	    	  		     	 - Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-01-31 18:01                         ` Jason A. Donenfeld
@ 2024-02-01  4:57                           ` Theodore Ts'o
  2024-02-01 18:09                             ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-01  4:57 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 07:01:01PM +0100, Jason A. Donenfeld wrote:
> So if this is what we're congealing around, I guess we can:
> 
> 0) Leave RDSEED alone and focus on RDRAND.
> 1) Add `WARN_ON_ONCE(in_early_boot);` to the failure path of RDRAND
> (and simply hope this doesn't get exploited for guest-guest boot DoS).
> 2) Loop forever in RDRAND on CoCo VMs, post-boot, with the comments
> and variable naming making it clear that this is a hardware bug
> workaround, not a "feature" added for "extra security".
> 3) Complain loudly to Intel and get them to fix the hardware.
> 
> Though, a large part of me would really like to skip that step (2),
> first because it's a pretty gross bandaid that adds lots of
> complexity, and second because it'll make (3) less poignant

If we need to loop more than, say, 10 seconds in a CoCo VM, I'd just
panic with a repeated RDRAND failure message.  This makes the point of
(3) that much pointed, and it's better than having a CoCo VM
mysteriously hang in the face of a DOS attack.

I'll note that it should be relatively easy for Intel to make sure
that if there is an undue draw on RDRAND, to at that point enforce
"fair share" mode where each of the N cores get at most 1/N of the
available entropy.  So if you have single core CoCo VM on a 256 core
machine trying to boot, and the evil attacker has purchased 255 cores
worth of VM's, all of which are busy-looping on RDRAND, while the CoCo
VM is booting, if it is looping on RDRAND, it should be getting
1/256th of the availabe RDRAND output, and since it is only trying to
grab enough randomness to seed the /dev/random CRNG, if it can't get
enough randomness in 10 seconds --- well, Intel's customers should be
finding another vendor's CPU that can do a better job.

			     	      	 - Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31 20:35                 ` Dr. Greg
  2024-02-01  4:47                   ` Theodore Ts'o
@ 2024-02-01  7:26                   ` Reshetova, Elena
  2024-02-01 10:52                     ` Dr. Greg
  1 sibling, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-01  7:26 UTC (permalink / raw)
  To: Dr. Greg, Jason A. Donenfeld
  Cc: Daniel P. Berrang??,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

> On Wed, Jan 31, 2024 at 02:06:13PM +0100, Jason A. Donenfeld wrote:
> 
> Hi again to everyone, beautiful day here in North Dakota.
> 
> > On Wed, Jan 31, 2024 at 9:17???AM Reshetova, Elena
> > <elena.reshetova@intel.com> wrote:
> > > This matches both my understanding (I do have cryptography background
> > > and understanding how cryptographic RNGs work)
> > > and official public docs that Intel published on this matter.
> > > Given that the physical entropy source is limited anyhow, and by giving
> > > enough pressure on the whole construction you should be able to
> > > make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> > > conditioner is not getting its min entropy input rate, it wont
> > > produce a proper seed for AES CTR DRBG.
> > > Of course exact details/numbers can wary between different generations of
> > > Intel DRNG implementation, and the platforms where it is running on,
> > > so be careful to sticking to concrete numbers.
> 
> > Alright, so RDRAND is not reliable. The question for us now is: do
> > we want RDRAND unreliability to translate to another form of
> > unreliability elsewhere, e.g. DoS/infiniteloop/latency/WARN_ON()? Or
> > would it be better to declare the hardware simply broken and ask
> > Intel to fix it? (I don't know the answer to that question.)
> 
> I think it would demonstrate a lack of appropriate engineering
> diligence on the part of our community to declare RDRAND 'busted' at
> this point.
> 
> While it appeares to be trivially easy to force RDSEED into depletion,
> there does not seem to be a suggestion, at least in the open
> literature, that this directly or easily translates into stalling
> output from RDRAND in any type of relevant adversarial fashion.
> 
> If this were the case, given what CVE's seem to be worth on a resume,
> someone would have rented a cloud machine and come up with a POC
> against RDRAND in a multi-tenant environment and then promptly put up
> a web-site called 'Random Starve' or something equally ominous.
> 
> This is no doubt secondary to the 1022x amplication factor inherent in
> the 'Bull Mountain' architecture.
> 
> I'm a bit surprised that no one from the Intel side of this
> conversation didn't pitch this over the wall as soon as this
> conversation came up, but I would suggest that everyone concerned
> about this issue give the following a thorough read:
> 
> https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-
> random-number-generator-drng-software-implementation-guide.html
> 
> Relevant highlights:
> 
> - As I suggested in my earlier e-mail, random number generation is a
>   socket based resource, hence an adversarial domain limited to only
>   the cores on a common socket.
> 
> - There is a maximum randomness throughput rate of 800 MB/s over all
>   cores sharing common random number infrastructure.  Single thread
>   throughput rates of 70-200 MB/s are demonstratable.
> 
> - A failure of RDRAND over 10 re-tries is 'astronomically' small, with
>   no definition of astronomical provided, one would assume really
>   small, given they are using the word astronomical.

As I said, I want to investigate this properly before stating anything.
In a CoCo VM we cannot guarantee that a victim guest is able to execute
this 10 re-try loop (there is also a tightness requirement listed in official 
guide that is not further specified) without interruption since all guest
scheduling is under the host control. Again, this is the angle that was not
present before and I want to make sure we are protected against this case. 

> 
> > > That said, I have taken an AR to follow up internally on what can be done
> > > to improve our situation with RDRAND/RDSEED.
> 
> I think I can save you some time Elena.
> 
> > Specifying this is an interesting question. What exactly might our
> > requirements be for a "non-broken" RDRAND? It seems like we have two
> > basic ones:
> >
> > - One VMX (or host) context can't DoS another one.
> > - Ring 3 can't DoS ring 0.
> >
> > I don't know whether that'd be implemented with context-tied rate
> > limiting or more state or what. But I think, short of just making
> > RDRAND never fail, that's basically what's needed.
> 
> I think we probably have that, for all intents and purposes, given
> that we embrace the following methodogy:
> 
> - Use RDRAND exclusively.
> 
> - Be willing to take 10 swings at the plate.
> 
> - Given the somewhat demanding requirements for TDX/COCO, fail and
>   either deadlock or panic after 10 swings since that would seem to
>   suggest the hardware is broken, ie. RMA time.

Again, my worry here that a CoCo guest is not in control of its own scheduling
and this might make an impact on the above statement, i.e. it might
theoretical be possible to cause this without physically broken HW. 

Best Regards,
Elena.

> 
> Either deadlock or panic would be appropriate.  The objective in the
> COCO environment is to get the person who clicked on the 'Enable Azure
> Confidential' checkbox, or its equivalent, on their cloud dashboard,
> to call the HelpDesk and ask them why their confidential application
> won't come up.
> 
> After the user confirms to the HelpDesk that their computer is plugged
> in, the problem will get fixed.  Either the broken hardware will be
> identified and idled out or the mighty sword of vengeance will be
> summoned down on whoever has all of the other cores on the socket
> pegged.
> 
> Final thoughts:
> 
> - RDSEED is probably a poor thing to be using.
> 
> - There may be a reasonable argument that RDSEED shouldn't have been
>   exposed above ring 0, but that ship has sailed.  Brownie points
>   moving forward for an RDsomething that is ring 0 and has guaranteed
>   access to some amount of functionally reasonable entropy.
> 
> - Intel and AMD are already doing a lot of 'special' stuff with their
>   COCO hardware in order to defy the long standing adage of: 'You
>   can't have security without physical security'.  Access to per core thermal
>   noise, as I suggested, is probably a big lift but clever engineers can
>   probably cook up some type of fairness doctrine for randomness in
>   TDX or SEV_SNP, given the particular importance of instruction based
>   randomness in COCO.
> 
> - Perfection is the enemy of good.
> 
> > Jason
> 
> Have a good day.
> 
> As always,
> Dr. Greg
> 
> The Quixote Project - Flailing at the Travails of Cybersecurity
>               https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-01  4:47                   ` Theodore Ts'o
@ 2024-02-01  9:54                     ` Dr. Greg
  2024-02-01 11:08                       ` Daniel P. Berrangé
  0 siblings, 1 reply; 99+ messages in thread
From: Dr. Greg @ 2024-02-01  9:54 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jason A. Donenfeld, Reshetova, Elena, Daniel P. Berrang??,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 11:47:35PM -0500, Theodore Ts'o wrote:
> On Wed, Jan 31, 2024 at 02:35:32PM -0600, Dr. Greg wrote:
> > I think it would demonstrate a lack of appropriate engineering
> > diligence on the part of our community to declare RDRAND 'busted' at
> > this point.
> > 
> > While it appeares to be trivially easy to force RDSEED into depletion,
> > there does not seem to be a suggestion, at least in the open
> > literature, that this directly or easily translates into stalling
> > output from RDRAND in any type of relevant adversarial fashion.
> > 
> > If this were the case, given what CVE's seem to be worth on a resume,
> > someone would have rented a cloud machine and come up with a POC
> > against RDRAND in a multi-tenant environment and then promptly put up
> > a web-site called 'Random Starve' or something equally ominous.

> I suspect the reason why DOS attacks aren't happening in practice, is
> because of concerns over the ability to trust the RDRAND (how do you
> prove that the NSA didn't put a backdoor into the hardware with
> Intel's acquisence --- after all, the NSA absolutely positively didn't
> encourage the kneecaping of WEP and absolutely didn't put a trapdoor
> into DUAL_EC_DRBG...)  since it can not externally audited and verfied
> by a third party, in contrast to the source code for the /dev/random
> driver or the RNG used in OpenSSL.
> 
> As a result, most random number generators use RDRAND in combination
> with other techniques.  If RDRAND is absolutely trustworthy, the extra
> sources won't hurt --- and if it isn't trustworthy mixing in other
> sources will likely make things harder for Fort Meade.  And even if
> these other sources might be observable for someone who can listen in
> on the inter-packet arrival times on the LAN (for example), it might
> not be so easy for an analyst sitting at their desk in Fort Meade.
> 
> And once you do _that_, you don't need to necessarily loop on RDRAND,
> because it's one of multiple sources of entropies that are getting
> mixed togethwer.  Hence, even if someone drives RDRAND into depletion,
> if they are using getrandom(2), it's not a big deal.

All well taken points, the Linux RNG and associated community has
benefited from your and Jason's concerns and work on all of this.

However, whether or not DOS attacks based on RNG depletion are
happening in the wild, and the reasons they are not, are orthogonal to
whether or not they can be proven to exist, which was the point I was
trying to make.

Demonstrating a vulnerability in something as critical as Intel's RNG
implementation would be a big motivation for some research group.  The
fact that hasn't occurred would seem to suggest that the RDRAND
resource depletion we are concerned with is not adversarially
exploitable.

I suspect that the achievable socket core count cannot effectively
overwhelm the 1022x amplification factor inherent in the design of the
RDSEED based seeding of RDRAND.

We will see if Elena can come up with what Intel engineering's
definition of 'astronomical' is.. :-)

> There's a special case with Confidential Compute VM's, since the
> assumption is that you want to protect against even a malicious
> hypervisor who could theoretically control all other sources of
> timing uncertainty.  And so, yes, in that case, the only thing we
> can do is Panic if RDRAND fails.

Indeed.

The bigger question, which I will respond to Elena with, is how much
this issue calls the entire question of confidential computing into
question.

>    	    	  		     	 - Ted

Have a good day, it has been a long time since you and I were standing
around with Phil Hughes in the Galleria in Atlanta argueing about
whether or not Active Directory was going to dominate enterprise
computing.

It does seem to have gained some significant amount of traction... :-)

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-01  7:26                   ` Reshetova, Elena
@ 2024-02-01 10:52                     ` Dr. Greg
  0 siblings, 0 replies; 99+ messages in thread
From: Dr. Greg @ 2024-02-01 10:52 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Jason A. Donenfeld, Daniel P. Berrang??,
	Hansen, Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Thu, Feb 01, 2024 at 07:26:15AM +0000, Reshetova, Elena wrote:

Good morning to everyone.

> > On Wed, Jan 31, 2024 at 02:06:13PM +0100, Jason A. Donenfeld wrote:
> > 
> > Hi again to everyone, beautiful day here in North Dakota.
> > 
> > > On Wed, Jan 31, 2024 at 9:17???AM Reshetova, Elena
> > > <elena.reshetova@intel.com> wrote:
> > > > This matches both my understanding (I do have cryptography background
> > > > and understanding how cryptographic RNGs work)
> > > > and official public docs that Intel published on this matter.
> > > > Given that the physical entropy source is limited anyhow, and by giving
> > > > enough pressure on the whole construction you should be able to
> > > > make RDRAND fail because if the intermediate AES-CBC MAC extractor/
> > > > conditioner is not getting its min entropy input rate, it wont
> > > > produce a proper seed for AES CTR DRBG.
> > > > Of course exact details/numbers can wary between different generations of
> > > > Intel DRNG implementation, and the platforms where it is running on,
> > > > so be careful to sticking to concrete numbers.
> > 
> > > Alright, so RDRAND is not reliable. The question for us now is: do
> > > we want RDRAND unreliability to translate to another form of
> > > unreliability elsewhere, e.g. DoS/infiniteloop/latency/WARN_ON()? Or
> > > would it be better to declare the hardware simply broken and ask
> > > Intel to fix it? (I don't know the answer to that question.)
> > 
> > I think it would demonstrate a lack of appropriate engineering
> > diligence on the part of our community to declare RDRAND 'busted' at
> > this point.
> > 
> > While it appeares to be trivially easy to force RDSEED into depletion,
> > there does not seem to be a suggestion, at least in the open
> > literature, that this directly or easily translates into stalling
> > output from RDRAND in any type of relevant adversarial fashion.
> > 
> > If this were the case, given what CVE's seem to be worth on a resume,
> > someone would have rented a cloud machine and come up with a POC
> > against RDRAND in a multi-tenant environment and then promptly put up
> > a web-site called 'Random Starve' or something equally ominous.
> > 
> > This is no doubt secondary to the 1022x amplication factor inherent in
> > the 'Bull Mountain' architecture.
> > 
> > I'm a bit surprised that no one from the Intel side of this
> > conversation didn't pitch this over the wall as soon as this
> > conversation came up, but I would suggest that everyone concerned
> > about this issue give the following a thorough read:
> > 
> > https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-
> > random-number-generator-drng-software-implementation-guide.html
> > 
> > Relevant highlights:
> > 
> > - As I suggested in my earlier e-mail, random number generation is a
> >   socket based resource, hence an adversarial domain limited to only
> >   the cores on a common socket.
> > 
> > - There is a maximum randomness throughput rate of 800 MB/s over all
> >   cores sharing common random number infrastructure.  Single thread
> >   throughput rates of 70-200 MB/s are demonstratable.
> > 
> > - A failure of RDRAND over 10 re-tries is 'astronomically' small, with
> >   no definition of astronomical provided, one would assume really
> >   small, given they are using the word astronomical.

> As I said, I want to investigate this properly before stating
> anything.  In a CoCo VM we cannot guarantee that a victim guest is
> able to execute this 10 re-try loop (there is also a tightness
> requirement listed in official guide that is not further specified)
> without interruption since all guest scheduling is under the host
> control. Again, this is the angle that was not present before and I
> want to make sure we are protected against this case.

I suspect that all of this may be the source of interesting
discussions inside of Intel, see my closing question below.

If nothing else, we will wait with baited breath for a definition of
astronomical, if of course, the definition of that value is
unprivileged and you would be free to forward it along... :-)

> > > > That said, I have taken an AR to follow up internally on what can be done
> > > > to improve our situation with RDRAND/RDSEED.
> > 
> > I think I can save you some time Elena.
> > 
> > > Specifying this is an interesting question. What exactly might our
> > > requirements be for a "non-broken" RDRAND? It seems like we have two
> > > basic ones:
> > >
> > > - One VMX (or host) context can't DoS another one.
> > > - Ring 3 can't DoS ring 0.
> > >
> > > I don't know whether that'd be implemented with context-tied rate
> > > limiting or more state or what. But I think, short of just making
> > > RDRAND never fail, that's basically what's needed.
> > 
> > I think we probably have that, for all intents and purposes, given
> > that we embrace the following methodogy:
> > 
> > - Use RDRAND exclusively.
> > 
> > - Be willing to take 10 swings at the plate.
> > 
> > - Given the somewhat demanding requirements for TDX/COCO, fail and
> >   either deadlock or panic after 10 swings since that would seem to
> >   suggest the hardware is broken, ie. RMA time.

> Again, my worry here that a CoCo guest is not in control of its own
> scheduling and this might make an impact on the above statement,
> i.e. it might theoretical be possible to cause this without
> physically broken HW.

So all of this leaves open a very significant question that would seem
to be worthy of further enlightenment from inside the bowels of
Intel engineering.

Our discussion has now led us to a point where there appears to be a
legitimate concern that the hypervisor has such significant control
over a confidential VM that the integrity of a simple re-try loop is
an open question.

Let us posit for argument, that confidential computing resolves down
to the implementation of a trusted computing platform that in turn
resolves to a requirement for competent and robust cryptography for
initial and ongoing attestation, let alone confidentiality in the face
of possible side-channel and timing attacks.

I'm sure there would be a great deal of interest in any information
that can be provided that this scenario is possible, given the level
of control that is being suggested that a hypervisor would enjoy over
an ostensibly confidential and trusted guest.

> Best Regards,
> Elena.

Have a good day.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-01  9:54                     ` Dr. Greg
@ 2024-02-01 11:08                       ` Daniel P. Berrangé
  2024-02-01 21:04                         ` Dr. Greg
  0 siblings, 1 reply; 99+ messages in thread
From: Daniel P. Berrangé @ 2024-02-01 11:08 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Theodore Ts'o, Jason A. Donenfeld, Reshetova, Elena, Hansen,
	Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Thu, Feb 01, 2024 at 03:54:51AM -0600, Dr. Greg wrote:
> I suspect that the achievable socket core count cannot effectively
> overwhelm the 1022x amplification factor inherent in the design of the
> RDSEED based seeding of RDRAND.

In testing I could get RDSEED down to < 3% success rate when
running on 20 cores in parallel on a laptop class i7. If that
failure rate can be improved by a little more than one order
of magnitude to 0.1% we're starting to get to the point where
it might be enough to make RDRAND re-seed fail.

Intel's Sierra Forest CPUs are said to have a variant with 288
cores per socket, which is an order of magnitude larger. It is
conceivable this might be large enough to demonstrate RDRAND
failure in extreme load. Then again who knows what else has
changed that might alter the equation, maybe the DRBG is also
better / faster. Only real world testing can say for sure.
One thing is certain though, core counts per socket keep going
up, so the potential worst case load on RDSEED will increase...

> We will see if Elena can come up with what Intel engineering's
> definition of 'astronomical' is.. :-)
> 
> > There's a special case with Confidential Compute VM's, since the
> > assumption is that you want to protect against even a malicious
> > hypervisor who could theoretically control all other sources of
> > timing uncertainty.  And so, yes, in that case, the only thing we
> > can do is Panic if RDRAND fails.
> 
> Indeed.
> 
> The bigger question, which I will respond to Elena with, is how much
> this issue calls the entire question of confidential computing into
> question.

A denial of service (from a panic on RDRAND fail) doesn't undermine
confidental computing. Guest data confidentiality is maintained by
panicing on RDRAND failure and DoS protection isn't a threat that CC
claims to be able to mitigate in general.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-01  4:57                           ` Theodore Ts'o
@ 2024-02-01 18:09                             ` Jason A. Donenfeld
  2024-02-01 18:46                               ` Dave Hansen
                                                 ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-01 18:09 UTC (permalink / raw)
  To: Theodore Ts'o, Reshetova, Elena, Dave Hansen
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel

Hi Ted, Elena, Dave,

On Thu, Feb 1, 2024 at 5:57 AM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Wed, Jan 31, 2024 at 07:01:01PM +0100, Jason A. Donenfeld wrote:
> > So if this is what we're congealing around, I guess we can:
> >
> > 0) Leave RDSEED alone and focus on RDRAND.
> > 1) Add `WARN_ON_ONCE(in_early_boot);` to the failure path of RDRAND
> > (and simply hope this doesn't get exploited for guest-guest boot DoS).
> > 2) Loop forever in RDRAND on CoCo VMs, post-boot, with the comments
> > and variable naming making it clear that this is a hardware bug
> > workaround, not a "feature" added for "extra security".
> > 3) Complain loudly to Intel and get them to fix the hardware.
> >
> > Though, a large part of me would really like to skip that step (2),
> > first because it's a pretty gross bandaid that adds lots of
> > complexity, and second because it'll make (3) less poignant
>
> If we need to loop more than, say, 10 seconds in a CoCo VM, I'd just
> panic with a repeated RDRAND failure message.  This makes the point of
> (3) that much pointed, and it's better than having a CoCo VM
> mysteriously hang in the face of a DOS attack.

Yea, true. Problem is that in theory, userspace can DoS the kernel's
use of RDRAND. Of course in practice, a userspace process preempting a
kthread for >10 seconds is probably a larger problem.

Anyway, I want to lay out the various potential solutions discussed.
As they all have some drawback, it's worth enumerating them.

==

Solution A) WARN_ON_ONCE(is_early_boot)/BUG_ON(is_early_boot) in the
RDRAND failure path (> 10 retries).

The biggest advantage here is that this is super simple and isn't
CoCo-specific. The premise is that if RDRAND fails 10 times in a row
before userspace has started, it's most definitely a hardware problem.
Systems-wise, the drawback is that, in a VM, it alternatively might be
a guest-guest DoS attack on RDRAND, or in the CoCo case, a host-guest
DoS attack (which is presumably easier because the host controls
scheduling). In the CoCo case, not booting is better than losing
confidentiality. In the non-CoCo case, that seems like theoretically a
DoS we might not want. RNG-wise, the drawback is that this doesn't
help deal with secure reseeding later in time, which is a RNG property
that we otherwise enjoy.

Solution B) BUG_ON(is_early_boot && is_coco_system) in the RDRAND
failure path (> 10 retries).

This is slightly less simple than A, because we have to plumb
CoCo-detection through to the RDRAND helper. [Side note: I feel
ridiculous typing 'CoCo'.] Systems-wise, I don't see drawbacks.
RNG-wise, the drawback is that this doesn't help deal with secure
reseeding later in time, which is a RNG property that we otherwise
enjoy.

Solution C) WARN_ONCE()/BUG() in the RDRAND failure path (> 10 retries).

The advantage here is also simplicity, and the fact that it "ensures"
we'll be able to securely reseed later on. Systems-wise, the drawback
is that userspace can in theory DoS the kernel's RDRAND and cause a
crash.

Solution D) BUG_ON(is_coco_system) in the RDRAND failure path (> 10 retries).

This is slightly less simple than A, because we have to plumb
CoCo-detection through to the RDRAND helper, but it "ensures" we'll be
able to securely reseed later on. Systems-wise, the drawback is that
userspace can in theory DoS the kernel's RDRAND and cause a crash.

Solution E) BUG() in a new time-based RDRAND failure path on CoCo
systems (> 10 seconds).

This adds a lot of complexity, and we'd need some alternative code
path for CoCo with an infinite loop that breaks on a jiffies
comparison. But it at least makes it harder for userspace to DoS the
kernel's use of RDRAND, because it seems hard for a user thread to
preempt a kthread for that long, though maybe somebody has some nasty
scheduler tricks here that would break that hope.

Solution F) Loop forever in RDRAND on CoCo systems.

This makes debugging harder because of lockups (though I suppose we
could WARN after some amount of time), but at least it's somewhat
"sound".

==

I am currently leaning toward (B) as being the lightest touch that has
the least potential to break anything. (F) is also tempting because it
doesn't have the RNG-drawback. The others seem complex or incomplete
or otherwise annoying somehow.

There is also "Solution G" -- do nothing and raise a fuss and let
security researchers go to town and hope Intel gets their act
together. Given that the CoCo thing seems kind of
imaginary/aspirational anyway at this point, I'm very attracted by
this. I don't mean to say that I intend to mount a large argument that
we *should* do nothing, but it's just sort of sitting there in the
back of my mind as an appealing possibility.

Also, I wanted to enumerate currently open questions:

==

Question i) Just how deterministic can these CoCo VMs be? Elena
pointed to some TDX code regarding RDTSC that seemed fairly damning,
but I also wonder what gotchas a motivated researcher might run into
and how those could help us (or not).

Question ii) Just how DoS-able is RDRAND? From host to guest, where
the host controls scheduling, that seems easier, but how much so, and
what's the granularity of these operations, and could retries still
help, or not at all? What about from guest to guest, where the
scheduling is out of control; in that case is there a value of N for
which N retries makes it actually impossible to DoS? What about from
userspace to kernelspace; good value of N?

Question iii) How likely is Intel to actually fix this in a
satisfactory way (see "specifying this is an interesting question" in
[1])? And if they would, what would the timeline even be?

==

Anyway, that's about where I'm at. I figure I'll wait to see if the
internal inquiry within Intel yields anything interesting, and then
maybe we can move forward with solutions (B) or (F) or (G) or a
different Roald Dahl novel instead.

Jason

[1] https://lore.kernel.org/all/CAHmME9ps6W5snQrYeNVMFgfhMKFKciky=-UxxGFbAx_RrxSHoA@mail.gmail.com/

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-01 18:09                             ` Jason A. Donenfeld
@ 2024-02-01 18:46                               ` Dave Hansen
  2024-02-01 19:02                                 ` H. Peter Anvin
  2024-02-02  7:25                               ` Reshetova, Elena
  2024-02-02 15:47                               ` James Bottomley
  2 siblings, 1 reply; 99+ messages in thread
From: Dave Hansen @ 2024-02-01 18:46 UTC (permalink / raw)
  To: Jason A. Donenfeld, Theodore Ts'o, Reshetova, Elena, Dave Hansen
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel

On 2/1/24 10:09, Jason A. Donenfeld wrote:
> Question ii) Just how DoS-able is RDRAND? From host to guest, where
> the host controls scheduling, that seems easier, but how much so, and
> what's the granularity of these operations, and could retries still
> help, or not at all? What about from guest to guest, where the
> scheduling is out of control; in that case is there a value of N for
> which N retries makes it actually impossible to DoS? What about from
> userspace to kernelspace; good value of N?

So far, in practice, I haven't seen a single failure of RDRAND.  It's
been limited to RDSEED.  In a perfect world, I'd change the architecture
docs to say, "RDRAND only fails when the hardware breaks" and leave
RDSEED defined to be the one that fails easily.

Dealing with a fragile RDSEED seems like a much easier problem than
dealing with a fragile RDRAND since RDSEED is used _much_ more sparingly
in the kernel today.

But I'm not sure if the hardware implementations fit into this perfect
world I've conjured up.  We're going to wrangle up the folks at Intel
who can hopefully tell me if I'm totally deluded.

Has anyone seen RDRAND failures in practice?  Or just RDSEED?

> Question iii) How likely is Intel to actually fix this in a
> satisfactory way (see "specifying this is an interesting question" in
> [1])? And if they would, what would the timeline even be?

If the fix is pure documentation, it's on the order of months.  I'm
holding out hope that some kind of anti-DoS claims like you mentioned:

> Specifying this is an interesting question. What exactly might our
> requirements be for a "non-broken" RDRAND? It seems like we have two
> basic ones:
> 
> - One VMX (or host) context can't DoS another one.
> - Ring 3 can't DoS ring 0.

are still possible on existing hardware, at least for RDRAND.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-01 18:46                               ` Dave Hansen
@ 2024-02-01 19:02                                 ` H. Peter Anvin
  0 siblings, 0 replies; 99+ messages in thread
From: H. Peter Anvin @ 2024-02-01 19:02 UTC (permalink / raw)
  To: Dave Hansen, Jason A. Donenfeld, Theodore Ts'o, Reshetova,
	Elena, Dave Hansen
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On February 1, 2024 10:46:06 AM PST, Dave Hansen <dave.hansen@intel.com> wrote:
>On 2/1/24 10:09, Jason A. Donenfeld wrote:
>> Question ii) Just how DoS-able is RDRAND? From host to guest, where
>> the host controls scheduling, that seems easier, but how much so, and
>> what's the granularity of these operations, and could retries still
>> help, or not at all? What about from guest to guest, where the
>> scheduling is out of control; in that case is there a value of N for
>> which N retries makes it actually impossible to DoS? What about from
>> userspace to kernelspace; good value of N?
>
>So far, in practice, I haven't seen a single failure of RDRAND.  It's
>been limited to RDSEED.  In a perfect world, I'd change the architecture
>docs to say, "RDRAND only fails when the hardware breaks" and leave
>RDSEED defined to be the one that fails easily.
>
>Dealing with a fragile RDSEED seems like a much easier problem than
>dealing with a fragile RDRAND since RDSEED is used _much_ more sparingly
>in the kernel today.
>
>But I'm not sure if the hardware implementations fit into this perfect
>world I've conjured up.  We're going to wrangle up the folks at Intel
>who can hopefully tell me if I'm totally deluded.
>
>Has anyone seen RDRAND failures in practice?  Or just RDSEED?
>
>> Question iii) How likely is Intel to actually fix this in a
>> satisfactory way (see "specifying this is an interesting question" in
>> [1])? And if they would, what would the timeline even be?
>
>If the fix is pure documentation, it's on the order of months.  I'm
>holding out hope that some kind of anti-DoS claims like you mentioned:
>
>> Specifying this is an interesting question. What exactly might our
>> requirements be for a "non-broken" RDRAND? It seems like we have two
>> basic ones:
>> 
>> - One VMX (or host) context can't DoS another one.
>> - Ring 3 can't DoS ring 0.
>
>are still possible on existing hardware, at least for RDRAND.

The real question is: what do we actually need?

During startup, we could afford a *lot* of looping to collect enough entropy before giving up. After that, even if RDSEED fails 99% of the time, it will still produce far more entropy than a typical external randomness source. We don't want to loop that long, obviously (*), but instead try periodically and let the entropy accumulate.

(*) We *could* of course choose to aggressively loop in task context if there task would otherwise block on /dev/random.


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-01 11:08                       ` Daniel P. Berrangé
@ 2024-02-01 21:04                         ` Dr. Greg
  2024-02-02  7:56                           ` Reshetova, Elena
  0 siblings, 1 reply; 99+ messages in thread
From: Dr. Greg @ 2024-02-01 21:04 UTC (permalink / raw)
  To: Daniel P. Berrang??
  Cc: Theodore Ts'o, Jason A. Donenfeld, Reshetova, Elena, Hansen,
	Dave, Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Thu, Feb 01, 2024 at 11:08:09AM +0000, Daniel P. Berrang?? wrote:

Hi Dan, thanks for the thoughts.

> On Thu, Feb 01, 2024 at 03:54:51AM -0600, Dr. Greg wrote:
> > I suspect that the achievable socket core count cannot effectively
> > overwhelm the 1022x amplification factor inherent in the design of the
> > RDSEED based seeding of RDRAND.

> In testing I could get RDSEED down to < 3% success rate when
> running on 20 cores in parallel on a laptop class i7. If that
> failure rate can be improved by a little more than one order
> of magnitude to 0.1% we're starting to get to the point where
> it might be enough to make RDRAND re-seed fail.
> 
> Intel's Sierra Forest CPUs are said to have a variant with 288
> cores per socket, which is an order of magnitude larger. It is
> conceivable this might be large enough to demonstrate RDRAND
> failure in extreme load. Then again who knows what else has
> changed that might alter the equation, maybe the DRBG is also
> better / faster. Only real world testing can say for sure.
> One thing is certain though, core counts per socket keep going
> up, so the potential worst case load on RDSEED will increase...

Indeed, that would seem to be the important and operative question
that Intel could answer, maybe Dave and Elena will be able to provide
some guidance.

Until someone can actually demonstrate a sustained RDRAND depletion
attack we don't have an issue, only a lot of wringing of hands and
other handwaving on what we should do.

The thing that intrigues me is that we have two AMD engineers
following this, do you guys have any comments, reflections?  Unless I
misunderstand, SEV-SNP has the same challenges and issues.

As of late you guys have been delivering higher core counts that would
make your platform more susceptible.  Does your hardware design not
have a socket common RNG architecture that makes RDSEED vulnerable to
socket adversarial depletion?  Is this a complete non-issue in
practice?

Big opportunity here to proclaim: "Just buy AMD"... :-)

> > We will see if Elena can come up with what Intel engineering's
> > definition of 'astronomical' is.. :-)
> > 
> > > There's a special case with Confidential Compute VM's, since the
> > > assumption is that you want to protect against even a malicious
> > > hypervisor who could theoretically control all other sources of
> > > timing uncertainty.  And so, yes, in that case, the only thing we
> > > can do is Panic if RDRAND fails.
> > 
> > Indeed.
> > 
> > The bigger question, which I will respond to Elena with, is how much
> > this issue calls the entire question of confidential computing into
> > question.

> A denial of service (from a panic on RDRAND fail) doesn't undermine
> confidental computing. Guest data confidentiality is maintained by
> panicing on RDRAND failure and DoS protection isn't a threat that CC
> claims to be able to mitigate in general.

Yes, if there is a problem with RDRAND we have a CoCo solution, full
stop.

The issue that I was raising with Elena is more generic, to wit:

Her expressed concern is that a code construct looking something like this,
rdrand() returning 0 on success:

for (i= 0; i < 9; ++i)
	if (!rdrand(&seed))
		break;
	sleep(some time);
}
if (i == 9)
	BUG("No entropy");

do_something_with(seed);

Could be sufficiently manipulated by a malicious hypervisor in a TDX
environment so as to compromise its functionality.

If this level of control is indeed possible, given the long history of
timing and side-channel attacks against cryptography, this would seem
to pose significant questions as to whether or not CoCo can deliver on
its stated goals.

> With regards,
> Daniel

Have a good evening.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-01 18:09                             ` Jason A. Donenfeld
  2024-02-01 18:46                               ` Dave Hansen
@ 2024-02-02  7:25                               ` Reshetova, Elena
  2024-02-02 15:39                                 ` Theodore Ts'o
  2024-02-14 15:18                                 ` Reshetova, Elena
  2024-02-02 15:47                               ` James Bottomley
  2 siblings, 2 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-02  7:25 UTC (permalink / raw)
  To: Jason A. Donenfeld, Theodore Ts'o, Dave Hansen
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel


> Hi Ted, Elena, Dave,
> 
> On Thu, Feb 1, 2024 at 5:57 AM Theodore Ts'o <tytso@mit.edu> wrote:
> >
> > On Wed, Jan 31, 2024 at 07:01:01PM +0100, Jason A. Donenfeld wrote:
> > > So if this is what we're congealing around, I guess we can:
> > >
> > > 0) Leave RDSEED alone and focus on RDRAND.
> > > 1) Add `WARN_ON_ONCE(in_early_boot);` to the failure path of RDRAND
> > > (and simply hope this doesn't get exploited for guest-guest boot DoS).
> > > 2) Loop forever in RDRAND on CoCo VMs, post-boot, with the comments
> > > and variable naming making it clear that this is a hardware bug
> > > workaround, not a "feature" added for "extra security".
> > > 3) Complain loudly to Intel and get them to fix the hardware.
> > >
> > > Though, a large part of me would really like to skip that step (2),
> > > first because it's a pretty gross bandaid that adds lots of
> > > complexity, and second because it'll make (3) less poignant
> >
> > If we need to loop more than, say, 10 seconds in a CoCo VM, I'd just
> > panic with a repeated RDRAND failure message.  This makes the point of
> > (3) that much pointed, and it's better than having a CoCo VM
> > mysteriously hang in the face of a DOS attack.
> 
> Yea, true. Problem is that in theory, userspace can DoS the kernel's
> use of RDRAND. Of course in practice, a userspace process preempting a
> kthread for >10 seconds is probably a larger problem.
> 
> Anyway, I want to lay out the various potential solutions discussed.
> As they all have some drawback, it's worth enumerating them.
> 
> ==
> 
> Solution A) WARN_ON_ONCE(is_early_boot)/BUG_ON(is_early_boot) in the
> RDRAND failure path (> 10 retries).
> 
> The biggest advantage here is that this is super simple and isn't
> CoCo-specific. The premise is that if RDRAND fails 10 times in a row
> before userspace has started, it's most definitely a hardware problem.
> Systems-wise, the drawback is that, in a VM, it alternatively might be
> a guest-guest DoS attack on RDRAND, or in the CoCo case, a host-guest
> DoS attack (which is presumably easier because the host controls
> scheduling). In the CoCo case, not booting is better than losing
> confidentiality. In the non-CoCo case, that seems like theoretically a
> DoS we might not want. RNG-wise, the drawback is that this doesn't
> help deal with secure reseeding later in time, which is a RNG property
> that we otherwise enjoy.
> 
> Solution B) BUG_ON(is_early_boot && is_coco_system) in the RDRAND
> failure path (> 10 retries).
> 
> This is slightly less simple than A, because we have to plumb
> CoCo-detection through to the RDRAND helper. [Side note: I feel
> ridiculous typing 'CoCo'.] Systems-wise, I don't see drawbacks.
> RNG-wise, the drawback is that this doesn't help deal with secure
> reseeding later in time, which is a RNG property that we otherwise
> enjoy.
> 
> Solution C) WARN_ONCE()/BUG() in the RDRAND failure path (> 10 retries).
> 
> The advantage here is also simplicity, and the fact that it "ensures"
> we'll be able to securely reseed later on. Systems-wise, the drawback
> is that userspace can in theory DoS the kernel's RDRAND and cause a
> crash.
> 
> Solution D) BUG_ON(is_coco_system) in the RDRAND failure path (> 10 retries).
> 
> This is slightly less simple than A, because we have to plumb
> CoCo-detection through to the RDRAND helper, but it "ensures" we'll be
> able to securely reseed later on. Systems-wise, the drawback is that
> userspace can in theory DoS the kernel's RDRAND and cause a crash.
> 
> Solution E) BUG() in a new time-based RDRAND failure path on CoCo
> systems (> 10 seconds).
> 
> This adds a lot of complexity, and we'd need some alternative code
> path for CoCo with an infinite loop that breaks on a jiffies
> comparison. But it at least makes it harder for userspace to DoS the
> kernel's use of RDRAND, because it seems hard for a user thread to
> preempt a kthread for that long, though maybe somebody has some nasty
> scheduler tricks here that would break that hope.
> 
> Solution F) Loop forever in RDRAND on CoCo systems.
> 
> This makes debugging harder because of lockups (though I suppose we
> could WARN after some amount of time), but at least it's somewhat
> "sound".
> 
> ==

This is a great summary of options, thank you Jason!
My proposal would be to wait on result of our internal investigation 
before proceeding to choose the approach. 

> 
> I am currently leaning toward (B) as being the lightest touch that has
> the least potential to break anything. (F) is also tempting because it
> doesn't have the RNG-drawback. The others seem complex or incomplete
> or otherwise annoying somehow.
> 
> There is also "Solution G" -- do nothing and raise a fuss and let
> security researchers go to town and hope Intel gets their act
> together. Given that the CoCo thing seems kind of
> imaginary/aspirational anyway at this point, I'm very attracted by
> this. I don't mean to say that I intend to mount a large argument that
> we *should* do nothing, but it's just sort of sitting there in the
> back of my mind as an appealing possibility.
> 
> Also, I wanted to enumerate currently open questions:
> 
> ==
> 
> Question i) Just how deterministic can these CoCo VMs be? Elena
> pointed to some TDX code regarding RDTSC that seemed fairly damning,
> but I also wonder what gotchas a motivated researcher might run into
> and how those could help us (or not).

This would be great imo to have a discussion on. I don’t think the internal
design or implementation of TDX module is complicated to scare
anyone off. So I think it would be a question on how practical would
be for VMM to make such an attack on guest kernel? A lot of times such
things are about precision, reliability and an ability to filter out the noise.  
So questions like how precisely *in practice* can VMM measure guest's
virtual TSC and other parameters that are used as entropy inputs?

But overall both in crypto and security, we don’t like to be too near the
security bounds, because we always assume our understanding might be
incomplete, so putting a reasonable and clear countermeasure is usually the
better approach. 

> 
> Question ii) Just how DoS-able is RDRAND? From host to guest, where
> the host controls scheduling, that seems easier, but how much so, and
> what's the granularity of these operations, and could retries still
> help, or not at all? What about from guest to guest, where the
> scheduling is out of control; in that case is there a value of N for
> which N retries makes it actually impossible to DoS? What about from
> userspace to kernelspace; good value of N?

All valid questions that I am also trying to understand the answers. 

Best Regards,
Elena.

> 
> Question iii) How likely is Intel to actually fix this in a
> satisfactory way (see "specifying this is an interesting question" in
> [1])? And if they would, what would the timeline even be?
> 
> ==
> 
> Anyway, that's about where I'm at. I figure I'll wait to see if the
> internal inquiry within Intel yields anything interesting, and then
> maybe we can move forward with solutions (B) or (F) or (G) or a
> different Roald Dahl novel instead.
> 
> Jason
> 
> [1] https://lore.kernel.org/all/CAHmME9ps6W5snQrYeNVMFgfhMKFKciky=-
> UxxGFbAx_RrxSHoA@mail.gmail.com/

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-01 21:04                         ` Dr. Greg
@ 2024-02-02  7:56                           ` Reshetova, Elena
  0 siblings, 0 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-02  7:56 UTC (permalink / raw)
  To: Dr. Greg, Daniel P. Berrang??
  Cc: Theodore Ts'o, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

> > > The bigger question, which I will respond to Elena with, is how much
> > > this issue calls the entire question of confidential computing into
> > > question.
> 
> > A denial of service (from a panic on RDRAND fail) doesn't undermine
> > confidental computing. Guest data confidentiality is maintained by
> > panicing on RDRAND failure and DoS protection isn't a threat that CC
> > claims to be able to mitigate in general.
> 
> Yes, if there is a problem with RDRAND we have a CoCo solution, full
> stop.
> 
> The issue that I was raising with Elena is more generic, to wit:
> 
> Her expressed concern is that a code construct looking something like this,
> rdrand() returning 0 on success:
> 
> for (i= 0; i < 9; ++i)
> 	if (!rdrand(&seed))
> 		break;
> 	sleep(some time);
> }
> if (i == 9)
> 	BUG("No entropy");
> 
> do_something_with(seed);
> 
> Could be sufficiently manipulated by a malicious hypervisor in a TDX
> environment so as to compromise its functionality.

This is not what I had in mind. How does the above can be manipulated
by a malicious hypervisor? If the above construction can be
logically manipulated we have other issues than rdrand, this is imo
already a control flow manipulation attack that you are stating here. 

What a malicious hypervisor can *in theory* do is to insert the execution
delays and make the above loop fail even if we assume that the probability
of falling the 10 retry loop is negligible in normal cases (assuming tightness
or other timing requirements). But again, this is theoretical at this point.
But if the SW refuses to proceed and panics in such cases, we have a DoS
as we already discussed.

Best Regards,
Elena.


> 
> If this level of control is indeed possible, given the long history of
> timing and side-channel attacks against cryptography, this would seem
> to pose significant questions as to whether or not CoCo can deliver on
> its stated goals.
> 
> > With regards,
> > Daniel
> 
> Have a good evening.
> 
> As always,
> Dr. Greg
> 
> The Quixote Project - Flailing at the Travails of Cybersecurity
>               https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-02  7:25                               ` Reshetova, Elena
@ 2024-02-02 15:39                                 ` Theodore Ts'o
  2024-02-03 10:12                                   ` Jason A. Donenfeld
  2024-02-14 15:18                                 ` Reshetova, Elena
  1 sibling, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-02 15:39 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Jason A. Donenfeld, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

On Fri, Feb 02, 2024 at 07:25:42AM +0000, Reshetova, Elena wrote:
> This is a great summary of options, thank you Jason!
> My proposal would be to wait on result of our internal investigation 
> before proceeding to choose the approach. 

I'm happy for the option "Do nothing for now", but if we do want to do
something in the absence of more detailed information, I'd suggest
doing something simple for first, on the theory that it doesn't make
things worse, and we can always do something more complicated if it
turns out to be needed.

In that vein, my suggestion is:

> > Solution B) BUG_ON(is_early_boot && is_coco_system) in the RDRAND
> > failure path (> 10 retries).
> > 
> > This is slightly less simple than A, because we have to plumb
> > CoCo-detection through to the RDRAND helper. [Side note: I feel
> > ridiculous typing 'CoCo'.] Systems-wise, I don't see drawbacks.
> > RNG-wise, the drawback is that this doesn't help deal with secure
> > reseeding later in time, which is a RNG property that we otherwise
> > enjoy.

If there isn't a global variable we can test to see if Confidential
Compute is enabled, I suspect we should just add one.  I would assume
that /dev/random isn't the only place where we might need to do
whether Confidential Compute is enabled.

So I don't think plumbing CC into the /dev/random code, and since we
are only doing this in early boot, I wouldn't put it in the RDRAND
helper, but rather in the caller of the RDRAND helper that gets used
in the early boot path.

(Side note, internally, at least in my part of my company, we use CC
as the acronym of convenience.  And any comments that I make are my
own opinion, and do not reflect the positions or desires of my
employer...)

> > Question iii) How likely is Intel to actually fix this in a
> > satisfactory way (see "specifying this is an interesting question" in
> > [1])? And if they would, what would the timeline even be?

Here are at least two obvious ways that Intel could fix or mitigate
this issue:

(1) Add more hardware RNG IP's for chips with a huge number of cores.
This is the *obvious* way to address the problem with hundreds of CPU
cores, although it's only something that can be done on newer chips.

(2) Have a per-core throttle where a core is not allowed to issue
RDRAND or RDSEED instructions more than N times per millisecond (or
some other unit of time).  So long as N is larger than the maximum
number of SSL connections that a front-end server can actually
terminate, it's not going to impact legitimate workloads.  This can be
approximated by the number of public key operations per unit time that
a single CPU core achieve.  And if RDRAND isn't sufficient to support
that today, then see solution (1), or CPU customers should switch to
some other CPU vendor that can...  

(3) Provide some kind of perf counter so the host can see which cores
are issuing a huge number of RDRAND/RDSEED instructions, and which
cores have been suffering from entropy exhaustion RDRAND/RDSEED
failures.  This would allow the operator of the host to detect which
VM's might be carrying out DOS attacks, so that the operator can kill
those VM's, and disable the customer account that was launching these
abusive VM's.


Hopefully mitigation #2 in particular (and maybe mitigation #3) is
something that Intel could implement as a firmware update; I'd love
comments from Intel if that is the case.

I'll also note that the threat model where customer A is trying to
start a CC VM, and customer B has purchased VM's that use all of the
other cores on a server, is primarily the sort of thing that a public
cloud vendor would need to worry about.  And *if* this become a real
issue, where some researcher demonstrates that there is a problem, the
cloud provider will be hugely incentivized to apply host-side
mitigations, and to lean on the guest OS providers to apply guest-side
mitigations.

So if this is only a DOS which applies for CC VM's, and it turns out
that solution (B) is not sufficient, we can do something more
complicated, such as having the guest retry the RDRAND instruction for
ten seconds.  And if some hypothetical "RandExhaust" attack is being
written about by the New York Times, I suspect it won't be that hard
to get Red Hat to apply mitigations to the RHEL kernel.  :-)

So I don't think it really is *that* big of a deal; if it turns out to
be an issue, we will be able to deal with it.

      	     	     	     	     - Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-01 18:09                             ` Jason A. Donenfeld
  2024-02-01 18:46                               ` Dave Hansen
  2024-02-02  7:25                               ` Reshetova, Elena
@ 2024-02-02 15:47                               ` James Bottomley
  2024-02-02 16:05                                 ` Theodore Ts'o
  2 siblings, 1 reply; 99+ messages in thread
From: James Bottomley @ 2024-02-02 15:47 UTC (permalink / raw)
  To: Jason A. Donenfeld, Theodore Ts'o, Reshetova, Elena, Dave Hansen
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra,  Ashish, Sean Christopherson,
	linux-coco, linux-kernel

On Thu, 2024-02-01 at 19:09 +0100, Jason A. Donenfeld wrote:
[...]
> Anyway, that's about where I'm at. I figure I'll wait to see if the
> internal inquiry within Intel yields anything interesting, and then
> maybe we can move forward with solutions (B) or (F) or (G) or a
> different Roald Dahl novel instead.

It's a lot to quote, so I cut it, but all of your solutions assume a
rdseed/rdrand failure equates to a system one but it really doesn't: in
most systems there are other entropy sources.  In confidential
computing it is an issue because we have no other trusted sources.  The
problem with picking on rdseed/rdrand is that there are bound to be
older CPUs somewhere that have rng generation bugs that this will
expose.    How about making the failure contingent on the entropy pool
not having any entropy when the first random number is requested?  That
way systems with more than one usable entropy source won't flag a bug,
but it will still flag up confidential computing systems where there's
a malicious entropy depleter.

James


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-02 15:47                               ` James Bottomley
@ 2024-02-02 16:05                                 ` Theodore Ts'o
  2024-02-02 21:28                                   ` James Bottomley
  0 siblings, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-02 16:05 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jason A. Donenfeld, Reshetova, Elena, Dave Hansen,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel

On Fri, Feb 02, 2024 at 04:47:11PM +0100, James Bottomley wrote:
> 
> It's a lot to quote, so I cut it, but all of your solutions assume a
> rdseed/rdrand failure equates to a system one but it really doesn't: in
> most systems there are other entropy sources.  In confidential
> computing it is an issue because we have no other trusted sources.  The
> problem with picking on rdseed/rdrand is that there are bound to be
> older CPUs somewhere that have rng generation bugs that this will
> expose.

I'm not sure what you're concerned about.  As far as I know, all of
the CPU's have some variant of Confidential Compute have some kind of
RDRAND-like command.  And while we're using the term RDRAND, I'd
extend this to any CPU architecture-level RNG instruction which can
return failure if it is subject to exhaustion attacks.

> How about making the failure contingent on the entropy pool
> not having any entropy when the first random number is requested?

We have tried to avoid characterizing entropy sources as "valid" or
"invalid".  First of all, it's rarely quite so black-and-white.
Something which is vulnerable to someone who can spy on inter-packet
arrival times by having a hardware tap between the CPU and the network
switch, or a wireless radio right next to the device being attacked,
might not be easily carried out by someone who doesn't have local
physical access.

So we may be measuring various things that might or might not have
"entropy".  In the case of Confidential Compute, we have declared that
none of those other sources constitute "entropy".  But that's not a
decision that can be made by the computer, or at least until we've
tracked the AGI problem.  (At which point, we might have other
problems --- "I'm sorry, I'm afraid I can't do that.")

	     	  	     	      - Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-02 16:05                                 ` Theodore Ts'o
@ 2024-02-02 21:28                                   ` James Bottomley
  2024-02-03 14:35                                     ` Theodore Ts'o
  0 siblings, 1 reply; 99+ messages in thread
From: James Bottomley @ 2024-02-02 21:28 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Jason A. Donenfeld, Reshetova, Elena, Dave Hansen,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel

On Fri, 2024-02-02 at 11:05 -0500, Theodore Ts'o wrote:
> On Fri, Feb 02, 2024 at 04:47:11PM +0100, James Bottomley wrote:
> > 
> > It's a lot to quote, so I cut it, but all of your solutions assume
> > a rdseed/rdrand failure equates to a system one but it really
> > doesn't: in most systems there are other entropy sources.  In
> > confidential computing it is an issue because we have no other
> > trusted sources. The problem with picking on rdseed/rdrand is that
> > there are bound to be older CP s somewhere that have rng generation
> > bugs that this will
> > expose.
> 
> I'm not sure what you're concerned about.  As far as I know, all of
> the CPU's have some variant of Confidential Compute have some kind of
> RDRAND-like command.  And while we're using the term RDRAND, I'd
> extend this to any CPU architecture-level RNG instruction which can
> return failure if it is subject to exhaustion attacks.

My big concern is older cpus where rdrand/rdseed don't produce useful
entropy.  Exhaustion attacks are going to be largely against VMs not
physical systems, so I worry about physical systems with older CPUs
that might have rdrand issues which then trip our Confidential
Computing checks.


> > How about making the failure contingent on the entropy pool
> > not having any entropy when the first random number is requested?
> 
> We have tried to avoid characterizing entropy sources as "valid" or
> "invalid".  First of all, it's rarely quite so black-and-white.
> Something which is vulnerable to someone who can spy on inter-packet
> arrival times by having a hardware tap between the CPU and the
> network switch, or a wireless radio right next to the device being
> attacked, might not be easily carried out by someone who doesn't have
> local physical access.
> 
> So we may be measuring various things that might or might not have
> "entropy".  In the case of Confidential Compute, we have declared
> that none of those other sources constitute "entropy".  But that's
> not a decision that can be made by the computer, or at least until
> we've tracked the AGI problem.  (At which point, we might have other
> problems --- "I'm sorry, I'm afraid I can't do that.")

The signal for rdseed failing is fairly clear, so if the node has other
entropy sources, it should continue otherwise it should signal failure.
Figuring out how a confidential computing environment signals that
failure is TBD.

James


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-02 15:39                                 ` Theodore Ts'o
@ 2024-02-03 10:12                                   ` Jason A. Donenfeld
  2024-02-09 19:53                                     ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-03 10:12 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Reshetova, Elena, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

Hi Ted, Kirill,

On Fri, Feb 02, 2024 at 10:39:27AM -0500, Theodore Ts'o wrote:
> On Fri, Feb 02, 2024 at 07:25:42AM +0000, Reshetova, Elena wrote:
> > This is a great summary of options, thank you Jason!
> > My proposal would be to wait on result of our internal investigation 
> > before proceeding to choose the approach. 
> 
> I'm happy for the option "Do nothing for now", but if we do want to do
> something in the absence of more detailed information, I'd suggest
> doing something simple for first, on the theory that it doesn't make
> things worse, and we can always do something more complicated if it
> turns out to be needed.
> 
> In that vein, my suggestion is:
> 
> > > Solution B) BUG_ON(is_early_boot && is_coco_system) in the RDRAND
> > > failure path (> 10 retries).
> > > 
> > > This is slightly less simple than A, because we have to plumb
> > > CoCo-detection through to the RDRAND helper. [Side note: I feel
> > > ridiculous typing 'CoCo'.] Systems-wise, I don't see drawbacks.
> > > RNG-wise, the drawback is that this doesn't help deal with secure
> > > reseeding later in time, which is a RNG property that we otherwise
> > > enjoy.
> 
> If there isn't a global variable we can test to see if Confidential
> Compute is enabled, I suspect we should just add one.  I would assume
> that /dev/random isn't the only place where we might need to do
> whether Confidential Compute is enabled.
> 
> So I don't think plumbing CC into the /dev/random code, and since we
> are only doing this in early boot, I wouldn't put it in the RDRAND
> helper, but rather in the caller of the RDRAND helper that gets used
> in the early boot path.

Yea, actually, I had a pretty similar idea for something like that
that's very non-invasive, where none of this even touches the RDRAND
core code, much less random.c. Specifically, we consider "adding some
extra RDRAND to the pool" like any other driver that wants to add some
of its own seeds to the pool, with add_device_randomness(), a call that
lives in various driver code, doesn't influence any entropy readiness
aspects of random.c, and can safely be sprinkled in any device or
platform driver.

Specifically what I'm thinking about is something like:

void coco_main_boottime_init_function_somewhere_deep_in_arch_code(void)
{
  // [...]
  // bring up primary CoCo nuts
  // [...]

  /* CoCo requires an explicit RDRAND seed, because the host can make the
   * rest of the system deterministic.
   */
  unsigned long seed[32 / sizeof(long)];
  size_t i, longs;
  for (i = 0; i < ARRAY_SIZE(seed); i += longs) {
    longs = arch_get_random_longs(&seed[i], ARRAY_SIZE(seed) - i);
    /* If RDRAND is being DoS'd, panic, because we can't ensure
     * confidentiality.
     */
    BUG_ON(!longs);
  }
  add_device_randomness(seed, sizeof(seed));
  memzero_explicit(seed, sizeof(seed));

  // [...]
  // do other CoCo things
  // [...]
}

I would have no objection to the CoCo people adding something like this
and would give it my Ack, but more importantly, my Ack for that doesn't
even matter, because add_device_randomness() is pretty innocuous.

So Kirill, if nobody else here objects to that approach, and you want to
implement it in some super minimal way like that, that would be fine
with me. Or maybe we want to wait for that internal inquiry at Intel to
return some answers first. But either way, this might be an easy
approach that doesn't add too much complexity.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-02 21:28                                   ` James Bottomley
@ 2024-02-03 14:35                                     ` Theodore Ts'o
  2024-02-06 19:12                                       ` H. Peter Anvin
  0 siblings, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-03 14:35 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jason A. Donenfeld, Reshetova, Elena, Dave Hansen,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel

On Fri, Feb 02, 2024 at 10:28:01PM +0100, James Bottomley wrote:
> 
> My big concern is older cpus where rdrand/rdseed don't produce useful
> entropy.  Exhaustion attacks are going to be largely against VMs not
> physical systems, so I worry about physical systems with older CPUs
> that might have rdrand issues which then trip our Confidential
> Computing checks.

For (non-CC) VM's the answer is virtio-rng.  This solves the
exhaustion problem, since if you can't trust the host, the VM's
security is taost anyway (again, ignoring Confidential Compute).

> The signal for rdseed failing is fairly clear, so if the node has other
> entropy sources, it should continue otherwise it should signal failure.
> Figuring out how a confidential computing environment signals that
> failure is TBD.

That's a design decision, and I believe we've been converging on a
panic during early boot.  Post boot, if we've successfully succeeded
in initializing the guest kernel's RNG, we're secure so long as the
cryptographic primitives haven't been defeated --- and if we have,
such as if Quantuum Computing because practical, we've got bigger
problems anyway.

					- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-01-31  8:16             ` Reshetova, Elena
  2024-01-31 11:59               ` Dr. Greg
  2024-01-31 13:06               ` Jason A. Donenfeld
@ 2024-02-06  1:12               ` Dr. Greg
  2024-02-06  8:04                 ` Daniel P. Berrangé
  2 siblings, 1 reply; 99+ messages in thread
From: Dr. Greg @ 2024-02-06  1:12 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Daniel P. Berrang??,
	Jason A. Donenfeld, Hansen, Dave, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Jan 31, 2024 at 08:16:56AM +0000, Reshetova, Elena wrote:

Good evening, I hope the week has started well for everyone.

> > On Tue, Jan 30, 2024 at 06:49:15PM +0100, Jason A. Donenfeld wrote:
> > > On Tue, Jan 30, 2024 at 6:32???PM Dave Hansen <dave.hansen@intel.com> wrote:
> > > >
> > > > On 1/30/24 05:45, Reshetova, Elena wrote:
> > > > >> You're the Intel employee so you can find out about this with much
> > > > >> more assurance than me, but I understand the sentence above to be _way
> > > > >> more_ true for RDRAND than for RDSEED. If your informed opinion is,
> > > > >> "RDRAND failing can only be due to totally broken hardware"
> > > > > No, this is not the case per Intel SDM. I think we can live under a simple
> > > > > assumption that both of these instructions can fail not just due to broken
> > > > > HW, but also due to enough pressure put into the whole DRBG construction
> > > > > that supplies random numbers via RDRAND/RDSEED.
> > > >
> > > > I don't think the SDM is the right thing to look at for guidance here.
> > > >
> > > > Despite the SDM allowing it, we (software) need RDRAND/RDSEED failures
> > > > to be exceedingly rare by design.  If they're not, we're going to get
> > > > our trusty torches and pitchforks and go after the folks who built the
> > > > broken hardware.
> > > >
> > > > Repeat after me:
> > > >
> > > >         Regular RDRAND/RDSEED failures only occur on broken hardware
> > > >
> > > > If it's nice hardware that's gone bad, then we WARN() and try to make
> > > > the best of it.  If it turns out that WARN() was because of a broken
> > > > hardware _design_ then we go sharpen the pitchforks.
> > > >
> > > > Anybody disagree?
> > >
> > > Yes, I disagree. I made a trivial test that shows RDSEED breaks easily
> > > in a busy loop. So at the very least, your statement holds true only
> > > for RDRAND.
> > >
> > > But, anyway, if the statement "RDRAND failures only occur on broken
> > > hardware" is true, then a WARN() in the failure path there presents no
> > > DoS potential of any kind, and so that's a straightforward conclusion
> > > to this discussion. However, that really hinges on  "RDRAND failures
> > > only occur on broken hardware" being a true statement.
> > 
> > There's a useful comment here from an Intel engineer
> > 
> > https://web.archive.org/web/20190219074642/https://software.intel.com/en-
> > us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
> > 
> >   "RDRAND is, indeed, faster than RDSEED because it comes
> >    from a hardware-based pseudorandom number generator.
> >    One seed value (effectively, the output from one RDSEED
> >    command) can provide up to 511 128-bit random values
> >    before forcing a reseed"
> > 
> > We know we can exhaust RDSEED directly pretty trivially. Making your
> > test program run in parallel across 20 cpus, I got a mere 3% success
> > rate from RDSEED.
> > 
> > If RDRAND is reseeding every 511 values, RDRAND output would have
> > to be consumed significantly faster than RDSEED in order that the
> > reseed will happen frequently enough to exhaust the seeds.
> > 
> > This looks pretty hard, but maybe with a large enough CPU count
> > this will be possible in extreme load ?
> > 
> > So I'm not convinced we can blindly wave away RDRAND failures as
> > guaranteed to mean broken hardware.

> This matches both my understanding (I do have cryptography
> background and understanding how cryptographic RNGs work) and
> official public docs that Intel published on this matter.  Given
> that the physical entropy source is limited anyhow, and by giving
> enough pressure on the whole construction you should be able to make
> RDRAND fail because if the intermediate AES-CBC MAC extractor/
> conditioner is not getting its min entropy input rate, it wont
> produce a proper seed for AES CTR DRBG.  Of course exact
> details/numbers can wary between different generations of Intel DRNG
> implementation, and the platforms where it is running on, so be
> careful to sticking to concrete numbers.

In the spirit of that philosophy we proffer the response below.

> That said, I have taken an AR to follow up internally on what can be
> done to improve our situation with RDRAND/RDSEED. But I would still
> like to finish the discussion on what people think should be done in
> the meanwhile keeping in mind that the problem is not intel
> specific, despite us intel people bringing it for public discussion
> first. The old saying is still here: "Please don't shoot the
> messenger" )) We are actually trying to be open about these things
> and create a public discussion.

Actually, I now believe there is clear evidence that the problem is
indeed Intel specific.  In light of our testing, it will be
interesting to see what your 'AR' returns with respect to an official
response from Intel engineering on this issue.

One of the very bright young engineers collaborating on Quixote, who
has been following this conversation, took it upon himself to do some
very methodical engineering analysis on this issue.  I'm the messenger
but this is very much his work product.

Executive summary is as follows:

- No RDRAND depletion failures were observable with either the Intel
  or AMD hardware that was load tested.

- RDSEED depletion is an Intel specific issue, AMD's RDSEED
  implementation could not be provoked into failure.

- AMD's RDRAND/RDSEED implementation is significantly slower than
  Intel's.

Here are the engineer's lab notes verbatim:

---------------------------------------------------------------------------
I tested both the single-threaded and OMP-multithreaded versions of
the RDSEED/RDRAND depletion loop on each of the machines below.

AMD: 2X AMD EPYC 7713 (Milan) 64-Core Processor @ 2.0 GHz, 128
physical cores total

Intel: 2X Intel Xeon Gold 6140 (Skylake) 18-Core Processor @ 2.3 GHz,
36 physical cores total

Single-threaded results:

Test case: 1,000,000 iterations each for RDRAND and RDSEED, n=100
tests, single-threaded.

AMD: 100% success rate for both RDRAND and RDSEED for all tests,
runtime 0.909-1.055s (min-max).

Intel: 100% success rate for RDRAND for all tests, 20.01-20.12%
(min-max) success rate for RSEED, runtime 0.256-0.281s (min-max)

OMP multithreaded results:

Test case: 1,000,000 iterations per thread, for both RDRAND and
RDSEED, n=100 tests, OMP multithreaded with OMP_NUM_THREADS=<total
physical cores> (i.e. 128 for AMD and 36 for Intel)

AMD: 100% success rate for both RDRAND and RDSEED for all tests,
runtime 47.229-47.603s (min-max).

Intel: 100% success rate for RDRAND for all tests, 1.77-5.62%
(min-max) success rate for RSEED, runtime 0.562-0.595s (min-max)

CONCLUSION

RDSEED failure was reproducibly induced on the Intel Skylake platform,
for both single- and multithreaded tests, whereas RDSEED failure could
not be induced on the AMD platform for either test. RDRAND did not
fail on either platform for either test.

AMD execution time was roughly 4x slower than Intel (1s vs 0.25s) for
the single-threaded test, and almost 100x slower than Intel (47s vs
0.5s) for the multithreaded test. The difference in clock rates (2.0
GHz for AMD vs 2.3 GHz for Intel) is not sufficient to explain these
runtime differences. So it seems likely that AMD is gating the rate at
which a new RDSEED value can be requested.
---------------------------------------------------------------------------

Speaking now with my voice:

Unless additional information shows up, despite our collective
handwringing, as long as the RDRAND instruction is used as the
cryptographic primitive, there appears to be little likelihood of a
DOS randomness attack against a TDX based CoCo virtual machine.

While it is highly unlikely we will ever get an 'official' readout on
this issue, I suspect there is a high probability that Intel
engineering favored performance with their RDSEED/RDRAND
implementation.

AMD 'appears', and without engineering feedback from AMD I would
emphasize the notion of 'appears', to have embraced the principal of
taking steps to eliminate the possibility of a socket based adversary
attack against their RNG infrastructure.

> Elena.

Hopefully the above is useful for everyone interested in this issue.

Once again, a thank you to our version of 'Sancho' for his legwork on
this, who has also read Cervantes at length... :-)

Have a good remainder of the week.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06  1:12               ` Dr. Greg
@ 2024-02-06  8:04                 ` Daniel P. Berrangé
  2024-02-06 12:04                   ` Dr. Greg
  0 siblings, 1 reply; 99+ messages in thread
From: Daniel P. Berrangé @ 2024-02-06  8:04 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> 
> Actually, I now believe there is clear evidence that the problem is
> indeed Intel specific.  In light of our testing, it will be
> interesting to see what your 'AR' returns with respect to an official
> response from Intel engineering on this issue.
> 
> One of the very bright young engineers collaborating on Quixote, who
> has been following this conversation, took it upon himself to do some
> very methodical engineering analysis on this issue.  I'm the messenger
> but this is very much his work product.
> 
> Executive summary is as follows:
> 
> - No RDRAND depletion failures were observable with either the Intel
>   or AMD hardware that was load tested.
> 
> - RDSEED depletion is an Intel specific issue, AMD's RDSEED
>   implementation could not be provoked into failure.

My colleague ran a multithread parallel stress test program on
his 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate
in RDSEED.

> - AMD's RDRAND/RDSEED implementation is significantly slower than
>   Intel's.

Yes, we also noticed the AMD impl is horribly slow compared to Intel,
had to cut test iterations x100

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06  8:04                 ` Daniel P. Berrangé
@ 2024-02-06 12:04                   ` Dr. Greg
  2024-02-06 13:00                     ` Daniel P. Berrangé
                                       ` (3 more replies)
  0 siblings, 4 replies; 99+ messages in thread
From: Dr. Greg @ 2024-02-06 12:04 UTC (permalink / raw)
  To: Daniel P. Berrang??
  Cc: Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrang?? wrote:

Good morning to everyone.

> On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> > 
> > Actually, I now believe there is clear evidence that the problem is
> > indeed Intel specific.  In light of our testing, it will be
> > interesting to see what your 'AR' returns with respect to an official
> > response from Intel engineering on this issue.
> > 
> > One of the very bright young engineers collaborating on Quixote, who
> > has been following this conversation, took it upon himself to do some
> > very methodical engineering analysis on this issue.  I'm the messenger
> > but this is very much his work product.
> > 
> > Executive summary is as follows:
> > 
> > - No RDRAND depletion failures were observable with either the Intel
> >   or AMD hardware that was load tested.
> > 
> > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> >   implementation could not be provoked into failure.

> My colleague ran a multithread parallel stress test program on his
> 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate in
> RDSEED.

Interesting datapoint, thanks for forwarding it along, so the issue
shows up on at least some AMD platforms as well.

On the 18 core/socket Intel Skylake platform, the parallelized
depletion test forces RDSEED success rates down to around 2%.  It
would appear that your tests suggest that the AMD platform fairs
better than the Intel platform.

So this is turning into even more of a morass, given that RDSEED
depletion on AMD may be a function of the micro-architecture the
platform is based on.  The other variable is that our AMD test
platform had a substantially higher core count per socket, one would
assume that would result in higher depletion rates, if the operative
theory of socket common RNG infrastructure is valid.

Unless AMD engineering understands the problem and has taken some type
of action on higher core count systems to address the issue.

Of course, the other variable may be how the parallelized stress test
is conducted.  If you would like to share your implementation source
we could give it a twirl on the systems we have access to.

The continuing operative question is whether or not any of this ever
leads to an RDRAND failure.

We've conducted some additional tests on the Intel platform where
RDSEED depletion was driven low as possible, ~1-2% success rates,
while RDRAND depletion tests were being run simultaneously.  No RDRAND
failures have been noted.

So the operative question remains, why worry about this if RDRAND is
used as the randomness primitive.

We haven't seen anything out of Intel yet on this, maybe AMD has a
quantifying definition for 'astronomical' when it comes to RDRAND
failures.

The silence appears to be deafening out of the respective engineering
camps... :-)

> > - AMD's RDRAND/RDSEED implementation is significantly slower than
> >   Intel's.

> Yes, we also noticed the AMD impl is horribly slow compared to
> Intel, had to cut test iterations x100.

The operative question is the impact of 'slow', in the absence of
artifical stress tests.

It would seem that a major question is what are or were the
engineering thought processes on the throughput of the hardware
randomness instructions.

Intel documents the following randomness throughput rates:

RDSEED: 3 Gbit/second
RDRAND: 6.4 Gbit/second

If there is the possibility of over-harvesting randomness, why not
design the implementations to be clamped at some per core value such
as a megabit/second.  In the case of the documented RDSEED generation
rates, that would allow the servicing of 3222 cores, if my math at
0530 in the morning is correct.

Would a core need more than 128 kilobytes of randomness, ie. one
second of output, to effectively seed a random number generator?

A cynical conclusion would suggest engineering acquiesing to marketing
demands... :-)

> With regards,
> Daniel

Have a good day.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 12:04                   ` Dr. Greg
@ 2024-02-06 13:00                     ` Daniel P. Berrangé
  2024-02-08 10:31                       ` Dr. Greg
  2024-02-06 13:50                     ` Daniel P. Berrangé
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 99+ messages in thread
From: Daniel P. Berrangé @ 2024-02-06 13:00 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrang?? wrote:
> 
> Good morning to everyone.
> 
> > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> > > 
> > > Actually, I now believe there is clear evidence that the problem is
> > > indeed Intel specific.  In light of our testing, it will be
> > > interesting to see what your 'AR' returns with respect to an official
> > > response from Intel engineering on this issue.
> > > 
> > > One of the very bright young engineers collaborating on Quixote, who
> > > has been following this conversation, took it upon himself to do some
> > > very methodical engineering analysis on this issue.  I'm the messenger
> > > but this is very much his work product.
> > > 
> > > Executive summary is as follows:
> > > 
> > > - No RDRAND depletion failures were observable with either the Intel
> > >   or AMD hardware that was load tested.
> > > 
> > > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> > >   implementation could not be provoked into failure.
> 
> > My colleague ran a multithread parallel stress test program on his
> > 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate in
> > RDSEED.
> 
> Interesting datapoint, thanks for forwarding it along, so the issue
> shows up on at least some AMD platforms as well.
> 
> On the 18 core/socket Intel Skylake platform, the parallelized
> depletion test forces RDSEED success rates down to around 2%.  It
> would appear that your tests suggest that the AMD platform fairs
> better than the Intel platform.

Yes, given the speed of the AMD RDRAND/RDSEED ops, compared to my
Intel test platforms, their DRBG looks better able to keep up with
the demand for bits.

> Of course, the other variable may be how the parallelized stress test
> is conducted.  If you would like to share your implementation source
> we could give it a twirl on the systems we have access to.

It is just Jason's earlier test program, but moved into one thread
for each core....

$ cat cpurngstress.c
#include <stdio.h>
#include <immintrin.h>
#include <pthread.h>
#include <unistd.h>

/*
 * Gives about 25 seconds walllock time on my Alderlake CPU
 *
 * Probably want to reduce this x10, or possibly even x100
 * on AMD due to much slower ops.
 */
#define MAX_ITER 10000000

#define MAX_CPUS 4096

void *doit(void *f) {
    unsigned long long rand;
    unsigned int i, success_rand = 0, success_seed = 0;

    for (i = 0; i < MAX_ITER; ++i) {
        success_seed += !!_rdseed64_step(&rand);
    }
    for (i = 0; i < MAX_ITER; ++i) {
        success_rand += !!_rdrand64_step(&rand);
    }

    fprintf(stderr,
	    "RDRAND: %.2f%%, RDSEED: %.2f%%\n",
	    success_rand * 100.0 / MAX_ITER,
	    success_seed * 100.0 / MAX_ITER);

    return NULL;
}


int main(int argc, char *argv[])
{
    pthread_t th[MAX_CPUS];
    int nproc = sysconf(_SC_NPROCESSORS_ONLN);
    if (nproc > MAX_CPUS) {
      nproc = MAX_CPUS;
    }
    fprintf(stderr, "Stressing RDRAND/RDSEED across %d CPUs\n", nproc);

    for (int i = 0 ; i < nproc;i ++) {
      pthread_create(&th[i], NULL, doit,NULL);
    }

    for (int i = 0 ; i < nproc;i ++) {
      pthread_join(th[i], NULL);
    }

    return 0;
}

$ gcc -march=native -o cpurngstress cpurngstress.c


> If there is the possibility of over-harvesting randomness, why not
> design the implementations to be clamped at some per core value such
> as a megabit/second.  In the case of the documented RDSEED generation
> rates, that would allow the servicing of 3222 cores, if my math at
> 0530 in the morning is correct.
> 
> Would a core need more than 128 kilobytes of randomness, ie. one
> second of output, to effectively seed a random number generator?
> 
> A cynical conclusion would suggest engineering acquiesing to marketing
> demands... :-)

My assumption is that it was simply easier to not implement a
rate limiting feature at the CPU level and punt the starvation
problem to software :-)

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 12:04                   ` Dr. Greg
  2024-02-06 13:00                     ` Daniel P. Berrangé
@ 2024-02-06 13:50                     ` Daniel P. Berrangé
  2024-02-06 15:35                     ` Borislav Petkov
  2024-02-06 18:49                     ` H. Peter Anvin
  3 siblings, 0 replies; 99+ messages in thread
From: Daniel P. Berrangé @ 2024-02-06 13:50 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrang?? wrote:
> 
> Good morning to everyone.
> 
> > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> > > 
> > > Actually, I now believe there is clear evidence that the problem is
> > > indeed Intel specific.  In light of our testing, it will be
> > > interesting to see what your 'AR' returns with respect to an official
> > > response from Intel engineering on this issue.
> > > 
> > > One of the very bright young engineers collaborating on Quixote, who
> > > has been following this conversation, took it upon himself to do some
> > > very methodical engineering analysis on this issue.  I'm the messenger
> > > but this is very much his work product.
> > > 
> > > Executive summary is as follows:
> > > 
> > > - No RDRAND depletion failures were observable with either the Intel
> > >   or AMD hardware that was load tested.
> > > 
> > > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> > >   implementation could not be provoked into failure.
> 
> > My colleague ran a multithread parallel stress test program on his
> > 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate in
> > RDSEED.
> 
> Interesting datapoint, thanks for forwarding it along, so the issue
> shows up on at least some AMD platforms as well.

I got access to a couple more AMD machines. An EPYC 24core/2HT
(Zen-1 uarch) and an EPYC 2socket/16core/2HT (Zen-3 uarch).

Both of these show 100% success with RDSEED. So there's clearly
some variance across AMD SKUs. So perhaps this is an EPYC vs Ryzen
distinction, with the server focused EPYCs able to sustain RDSEED.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 12:04                   ` Dr. Greg
  2024-02-06 13:00                     ` Daniel P. Berrangé
  2024-02-06 13:50                     ` Daniel P. Berrangé
@ 2024-02-06 15:35                     ` Borislav Petkov
  2024-02-08 11:44                       ` Dr. Greg
  2024-02-06 18:49                     ` H. Peter Anvin
  3 siblings, 1 reply; 99+ messages in thread
From: Borislav Petkov @ 2024-02-06 15:35 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Daniel P. Berrang??,
	Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> The silence appears to be deafening out of the respective engineering
> camps... :-)

I usually wait for those threads to "relax" themselves first. :)

So, what do you wanna know?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 12:04                   ` Dr. Greg
                                       ` (2 preceding siblings ...)
  2024-02-06 15:35                     ` Borislav Petkov
@ 2024-02-06 18:49                     ` H. Peter Anvin
  2024-02-08 16:38                       ` Dr. Greg
  3 siblings, 1 reply; 99+ messages in thread
From: H. Peter Anvin @ 2024-02-06 18:49 UTC (permalink / raw)
  To: Dr. Greg, Daniel P. Berrang??
  Cc: Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On February 6, 2024 4:04:45 AM PST, "Dr. Greg" <greg@enjellic.com> wrote:
>On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrang?? wrote:
>
>Good morning to everyone.
>
>> On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
>> > 
>> > Actually, I now believe there is clear evidence that the problem is
>> > indeed Intel specific.  In light of our testing, it will be
>> > interesting to see what your 'AR' returns with respect to an official
>> > response from Intel engineering on this issue.
>> > 
>> > One of the very bright young engineers collaborating on Quixote, who
>> > has been following this conversation, took it upon himself to do some
>> > very methodical engineering analysis on this issue.  I'm the messenger
>> > but this is very much his work product.
>> > 
>> > Executive summary is as follows:
>> > 
>> > - No RDRAND depletion failures were observable with either the Intel
>> >   or AMD hardware that was load tested.
>> > 
>> > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
>> >   implementation could not be provoked into failure.
>
>> My colleague ran a multithread parallel stress test program on his
>> 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate in
>> RDSEED.
>
>Interesting datapoint, thanks for forwarding it along, so the issue
>shows up on at least some AMD platforms as well.
>
>On the 18 core/socket Intel Skylake platform, the parallelized
>depletion test forces RDSEED success rates down to around 2%.  It
>would appear that your tests suggest that the AMD platform fairs
>better than the Intel platform.
>
>So this is turning into even more of a morass, given that RDSEED
>depletion on AMD may be a function of the micro-architecture the
>platform is based on.  The other variable is that our AMD test
>platform had a substantially higher core count per socket, one would
>assume that would result in higher depletion rates, if the operative
>theory of socket common RNG infrastructure is valid.
>
>Unless AMD engineering understands the problem and has taken some type
>of action on higher core count systems to address the issue.
>
>Of course, the other variable may be how the parallelized stress test
>is conducted.  If you would like to share your implementation source
>we could give it a twirl on the systems we have access to.
>
>The continuing operative question is whether or not any of this ever
>leads to an RDRAND failure.
>
>We've conducted some additional tests on the Intel platform where
>RDSEED depletion was driven low as possible, ~1-2% success rates,
>while RDRAND depletion tests were being run simultaneously.  No RDRAND
>failures have been noted.
>
>So the operative question remains, why worry about this if RDRAND is
>used as the randomness primitive.
>
>We haven't seen anything out of Intel yet on this, maybe AMD has a
>quantifying definition for 'astronomical' when it comes to RDRAND
>failures.
>
>The silence appears to be deafening out of the respective engineering
>camps... :-)
>
>> > - AMD's RDRAND/RDSEED implementation is significantly slower than
>> >   Intel's.
>
>> Yes, we also noticed the AMD impl is horribly slow compared to
>> Intel, had to cut test iterations x100.
>
>The operative question is the impact of 'slow', in the absence of
>artifical stress tests.
>
>It would seem that a major question is what are or were the
>engineering thought processes on the throughput of the hardware
>randomness instructions.
>
>Intel documents the following randomness throughput rates:
>
>RDSEED: 3 Gbit/second
>RDRAND: 6.4 Gbit/second
>
>If there is the possibility of over-harvesting randomness, why not
>design the implementations to be clamped at some per core value such
>as a megabit/second.  In the case of the documented RDSEED generation
>rates, that would allow the servicing of 3222 cores, if my math at
>0530 in the morning is correct.
>
>Would a core need more than 128 kilobytes of randomness, ie. one
>second of output, to effectively seed a random number generator?
>
>A cynical conclusion would suggest engineering acquiesing to marketing
>demands... :-)
>
>> With regards,
>> Daniel
>
>Have a good day.
>
>As always,
>Dr. Greg
>
>The Quixote Project - Flailing at the Travails of Cybersecurity
>              https://github.com/Quixote-Project

You do realize, right, that the "deafening silence" is due to the need for research and discussions on our part, and presumably AMD's.

In addition, quite frankly, your rather abusive language isn't exactly encouraging people to speak publicly based on immediately available and therefore inherently incomplete and/or dated information, meaning that we have had to take even what discussions we might have been able to have in public without IP concerns behind the scenes.

Yes, we work for Intel. No, we don't know every detail about every Intel chip ever created off the top of my head, nor do we necessarily know the exact person that is *currently* in charge of the architecture of a particular unit, nor is it necessarily true that even *that* person knows all the exact properties of the behavior of their unit when integrated into a particular SoC, as units are modular by design.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-03 14:35                                     ` Theodore Ts'o
@ 2024-02-06 19:12                                       ` H. Peter Anvin
  0 siblings, 0 replies; 99+ messages in thread
From: H. Peter Anvin @ 2024-02-06 19:12 UTC (permalink / raw)
  To: Theodore Ts'o, James Bottomley
  Cc: Jason A. Donenfeld, Reshetova, Elena, Dave Hansen,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On February 3, 2024 6:35:47 AM PST, Theodore Ts'o <tytso@mit.edu> wrote:
>On Fri, Feb 02, 2024 at 10:28:01PM +0100, James Bottomley wrote:
>> 
>> My big concern is older cpus where rdrand/rdseed don't produce useful
>> entropy.  Exhaustion attacks are going to be largely against VMs not
>> physical systems, so I worry about physical systems with older CPUs
>> that might have rdrand issues which then trip our Confidential
>> Computing checks.
>
>For (non-CC) VM's the answer is virtio-rng.  This solves the
>exhaustion problem, since if you can't trust the host, the VM's
>security is taost anyway (again, ignoring Confidential Compute).
>
>> The signal for rdseed failing is fairly clear, so if the node has other
>> entropy sources, it should continue otherwise it should signal failure.
>> Figuring out how a confidential computing environment signals that
>> failure is TBD.
>
>That's a design decision, and I believe we've been converging on a
>panic during early boot.  Post boot, if we've successfully succeeded
>in initializing the guest kernel's RNG, we're secure so long as the
>cryptographic primitives haven't been defeated --- and if we have,
>such as if Quantuum Computing because practical, we've got bigger
>problems anyway.
>
>					- Ted

I also want to emphasize that there is a huge difference between boot (initialization) time and runtime. Runtime harvesting has always been opportunistic in Linux, and so if RDSEED fails, try again later – unless perhaps a task is blocked on /dev/random in which case it might make sense to aggressively loop on the blocked core instead of just putting the process to sleep.

Initialization time is a different game entirely. Until we have accumulated about 256-512 bits of seed data, even the best PRNG can't really be considered "completely random." Thus a far more aggressive approach may be called for; furthermore, this is the time to look for total failure of the NRBG if after some number N attempts (where I believe N should be quite large, if we spend a full second in the very worst case that is probably better than declaring failure and optionally panic the system) we have not acquired enough entropy then warn and optionally panic the system.

By setting the limit in terms of time rather than iterations, this avoids the awkward issue of "the interface to the RDSEED unit is too fast and so it returns failure too often." I don't think anyone would argue that the right thing would be to slow down the response time of RDSEED for that reason, even though it would most likely radically reduce the failure rate (because the NRBG would have more time to produce entropy between queries at the maximum rate.)

Let's say, entirely hypothetically (as of right now I have absolutely *no* insider information of the RNG unit roadmap), that we were to implement a prefetch buffer in the core, such that a single or a handful of RD* instructions could execute in a handful of cycles, with the core itself issuing the request to the RNG unit when there is space in the queue. Such a prefetch buffer could rather obviously get *very* quickly exhausted because the poll rate could be dramatically increased, and having the core stall until there is data may or may not be a good solution (advantage: the CPU can go to a lower power state while waiting; disadvantage: opportunistic harvesting would prefer a "poll and fail fast" variation, *especially* if the CPU is going to fulfill the request autonomously anyway.)

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 13:00                     ` Daniel P. Berrangé
@ 2024-02-08 10:31                       ` Dr. Greg
  0 siblings, 0 replies; 99+ messages in thread
From: Dr. Greg @ 2024-02-08 10:31 UTC (permalink / raw)
  To: Daniel P. Berrang??
  Cc: Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Feb 06, 2024 at 01:00:03PM +0000, Daniel P. Berrang?? wrote:

Good morning.

> On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> > On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrang?? wrote:
> > 
> > Good morning to everyone.
> > 
> > > On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> > > > 
> > > > Actually, I now believe there is clear evidence that the problem is
> > > > indeed Intel specific.  In light of our testing, it will be
> > > > interesting to see what your 'AR' returns with respect to an official
> > > > response from Intel engineering on this issue.
> > > > 
> > > > One of the very bright young engineers collaborating on Quixote, who
> > > > has been following this conversation, took it upon himself to do some
> > > > very methodical engineering analysis on this issue.  I'm the messenger
> > > > but this is very much his work product.
> > > > 
> > > > Executive summary is as follows:
> > > > 
> > > > - No RDRAND depletion failures were observable with either the Intel
> > > >   or AMD hardware that was load tested.
> > > > 
> > > > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> > > >   implementation could not be provoked into failure.
> > 
> > > My colleague ran a multithread parallel stress test program on his
> > > 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate in
> > > RDSEED.
> > 
> > Interesting datapoint, thanks for forwarding it along, so the issue
> > shows up on at least some AMD platforms as well.
> > 
> > On the 18 core/socket Intel Skylake platform, the parallelized
> > depletion test forces RDSEED success rates down to around 2%.  It
> > would appear that your tests suggest that the AMD platform fairs
> > better than the Intel platform.

> Yes, given the speed of the AMD RDRAND/RDSEED ops, compared to my
> Intel test platforms, their DRBG looks better able to keep up with
> the demand for bits.

We now believe the observed resiliency of AMD's RNG infrastructure
comes down to the fact that the completion times of their RNG
instructions are significantly slower than Intel's.

SkyLake and KabyLake instruction completion times are documented at
463 clock cycles, regardless of operand size.

AMD Ryzen documents variable completion times based on operand size.
16 and 32 bit transfers complete in 1200 clock cycles with 64 bit
requests completing in 2500 clock cycles.

Given that Jason's test program was issueing 64-bit RNG requests, the
AMD platforms are going to be approximately 5.4 times slower than
Intel platforms, provided the results are corrected for CPU clock
rates.

AMD's entropy source is execution jitter time over a bank of inverter
based ring oscillors, presumably sampled by a constant clock rate
sampler.  Slower instruction retirement times consumes less of the
constant rate entropy production.

Intel uses thermal/quantum noise across a diode junction retrieved by
a self-clocked sampler.  Faster instruction retirement translates into
increased bandwidth demands on the sampler.

> > Of course, the other variable may be how the parallelized stress test
> > is conducted.  If you would like to share your implementation source
> > we could give it a twirl on the systems we have access to.
> 
> It is just Jason's earlier test program, but moved into one thread
> for each core....
> 
> $ cat cpurngstress.c
> #include <stdio.h>
> #include <immintrin.h>
> #include <pthread.h>
> #include <unistd.h>
> 
> /*
>  * Gives about 25 seconds walllock time on my Alderlake CPU
>  *
>  * Probably want to reduce this x10, or possibly even x100
>  * on AMD due to much slower ops.
>  */
> #define MAX_ITER 10000000
> 
> #define MAX_CPUS 4096
> 
> void *doit(void *f) {
>     unsigned long long rand;
>     unsigned int i, success_rand = 0, success_seed = 0;
> 
>     for (i = 0; i < MAX_ITER; ++i) {
>         success_seed += !!_rdseed64_step(&rand);
>     }
>     for (i = 0; i < MAX_ITER; ++i) {
>         success_rand += !!_rdrand64_step(&rand);
>     }
> 
>     fprintf(stderr,
> 	    "RDRAND: %.2f%%, RDSEED: %.2f%%\n",
> 	    success_rand * 100.0 / MAX_ITER,
> 	    success_seed * 100.0 / MAX_ITER);
> 
>     return NULL;
> }
> 
> 
> int main(int argc, char *argv[])
> {
>     pthread_t th[MAX_CPUS];
>     int nproc = sysconf(_SC_NPROCESSORS_ONLN);
>     if (nproc > MAX_CPUS) {
>       nproc = MAX_CPUS;
>     }
>     fprintf(stderr, "Stressing RDRAND/RDSEED across %d CPUs\n", nproc);
> 
>     for (int i = 0 ; i < nproc;i ++) {
>       pthread_create(&th[i], NULL, doit,NULL);
>     }
> 
>     for (int i = 0 ; i < nproc;i ++) {
>       pthread_join(th[i], NULL);
>     }
> 
>     return 0;
> }
> 
> $ gcc -march=native -o cpurngstress cpurngstress.c

Thanks for forwarding your test code along, we've added it to our
tests for comparison.

> > If there is the possibility of over-harvesting randomness, why not
> > design the implementations to be clamped at some per core value such
> > as a megabit/second.  In the case of the documented RDSEED generation
> > rates, that would allow the servicing of 3222 cores, if my math at
> > 0530 in the morning is correct.
> > 
> > Would a core need more than 128 kilobytes of randomness, ie. one
> > second of output, to effectively seed a random number generator?
> > 
> > A cynical conclusion would suggest engineering acquiesing to marketing
> > demands... :-)

> My assumption is that it was simply easier to not implement a rate
> limiting feature at the CPU level and punt the starvation problem to
> software :-)

Could be, it does seem unlikely that random number generation speed
would be seen as fertile ground for marketing types.

Punting to software is certainly rationale, perhaps problematic in a
CoCo environment depending on the definition of 'astronomical'.  See
my response to Borislav who was kind enough to respond to all of this.

> With regards,
> Daniel

Have a good day.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 15:35                     ` Borislav Petkov
@ 2024-02-08 11:44                       ` Dr. Greg
  2024-02-09 17:31                         ` Borislav Petkov
  0 siblings, 1 reply; 99+ messages in thread
From: Dr. Greg @ 2024-02-08 11:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Daniel P. Berrang??,
	Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Tue, Feb 06, 2024 at 04:35:29PM +0100, Borislav Petkov wrote:

Good morning, or perhaps afternoon, thanks for taking the time to
reply.

> On Tue, Feb 06, 2024 at 06:04:45AM -0600, Dr. Greg wrote:
> > The silence appears to be deafening out of the respective engineering
> > camps... :-)

> I usually wait for those threads to "relax" themselves first. :)

Indeed, my standard practice is to wait 24 hours before replying to
any public e-mail, hence the delay in my response.

> So, what do you wanna know?

I guess a useful starting point would be if AMD would like to offer
any type of quantification for 'astronomically small' when it comes to
the probability of failure over 10 RDRAND attempts... :-)

Secondly, given our test findings and those of RedHat, would it be
safe to assume that EPYC has engineering that prevents RDSEED failures
that Ryzen does not?

Given HPA's response in this thread, I do appreciate that all of this
may be shrouded in trade secrets and other issues.  With an
acknowledgement to that fact, let me see if I can extend the
discussion in a generic manner that may prove useful to the community
without being 'abusive'.

Both AMD and Intel designs start with a hardware based entropy source.
Intel samples thermal/quantum junction noise, AMD samples execution
jitter over a bank of inverter based oscillators.  An assumption of
constant clocked sampling implies a maximum randomness bandwidth
limit.

None of this implies that randomness is a finite resource, it will
always become available, with the caveat that a core may have to stand
in line, cup in hand, waiting for a dollop.

So this leaves the fundamental question of what does an RDRAND or
RDSEED failure return actually imply?

Silicon is a expensive resource, which would imply a queue depth
limitation for access to the socket common RNG infastructure.  If the
queue is full when an instruction issues, it would be a logical
response to signal an instruction failure quickly and let software try
again.

An alternate theory would be a requirement for constant instruction
time completion.  In that case a 'buffer' of cycles would be included
in the RNG instruction cycle allocation count.  If the instruction
would need to 'sleep', waiting for randomness, beyond this cycle
buffer, a failure would be returned.

Absent broken hardware, astronomical then becomes the probability of a
core being unlucky enough to run into these or alternate
implementation scenarios 10 times in a row.  Particularly given the
recommendation to sleep between attempts, which implies getting
scheduled onto different cores for the attempts.

Any enlightenment along these lines would seem to be useful in
facilitating an understanding of the issues at hand.

Given the time and engineering invested in the engineering behind both
TDX and SEV-SNP, it would seem unlikely that really smart engineers at
both Intel and AMD didn't anticipate this issue and its proper
resolution for CoCo environments.

> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

All the best from the Upper Midwest.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-06 18:49                     ` H. Peter Anvin
@ 2024-02-08 16:38                       ` Dr. Greg
  0 siblings, 0 replies; 99+ messages in thread
From: Dr. Greg @ 2024-02-08 16:38 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Daniel P. Berrang??,
	Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Tue, Feb 06, 2024 at 10:49:59AM -0800, H. Peter Anvin wrote:

Good morning HPA, I hope your week is going well, thanks for taking
the time to extend comments.

> On February 6, 2024 4:04:45 AM PST, "Dr. Greg" <greg@enjellic.com> wrote:
> >On Tue, Feb 06, 2024 at 08:04:57AM +0000, Daniel P. Berrang?? wrote:
> >
> >Good morning to everyone.
> >
> >> On Mon, Feb 05, 2024 at 07:12:47PM -0600, Dr. Greg wrote:
> >> > 
> >> > Actually, I now believe there is clear evidence that the problem is
> >> > indeed Intel specific.  In light of our testing, it will be
> >> > interesting to see what your 'AR' returns with respect to an official
> >> > response from Intel engineering on this issue.
> >> > 
> >> > One of the very bright young engineers collaborating on Quixote, who
> >> > has been following this conversation, took it upon himself to do some
> >> > very methodical engineering analysis on this issue.  I'm the messenger
> >> > but this is very much his work product.
> >> > 
> >> > Executive summary is as follows:
> >> > 
> >> > - No RDRAND depletion failures were observable with either the Intel
> >> >   or AMD hardware that was load tested.
> >> > 
> >> > - RDSEED depletion is an Intel specific issue, AMD's RDSEED
> >> >   implementation could not be provoked into failure.
> >
> >> My colleague ran a multithread parallel stress test program on his
> >> 16core/2HT AMD Ryzen (Zen4 uarch) and saw a 80% failure rate in
> >> RDSEED.
> >
> >Interesting datapoint, thanks for forwarding it along, so the issue
> >shows up on at least some AMD platforms as well.
> >
> >On the 18 core/socket Intel Skylake platform, the parallelized
> >depletion test forces RDSEED success rates down to around 2%.  It
> >would appear that your tests suggest that the AMD platform fairs
> >better than the Intel platform.
> >
> >So this is turning into even more of a morass, given that RDSEED
> >depletion on AMD may be a function of the micro-architecture the
> >platform is based on.  The other variable is that our AMD test
> >platform had a substantially higher core count per socket, one would
> >assume that would result in higher depletion rates, if the operative
> >theory of socket common RNG infrastructure is valid.
> >
> >Unless AMD engineering understands the problem and has taken some type
> >of action on higher core count systems to address the issue.
> >
> >Of course, the other variable may be how the parallelized stress test
> >is conducted.  If you would like to share your implementation source
> >we could give it a twirl on the systems we have access to.
> >
> >The continuing operative question is whether or not any of this ever
> >leads to an RDRAND failure.
> >
> >We've conducted some additional tests on the Intel platform where
> >RDSEED depletion was driven low as possible, ~1-2% success rates,
> >while RDRAND depletion tests were being run simultaneously.  No RDRAND
> >failures have been noted.
> >
> >So the operative question remains, why worry about this if RDRAND is
> >used as the randomness primitive.
> >
> >We haven't seen anything out of Intel yet on this, maybe AMD has a
> >quantifying definition for 'astronomical' when it comes to RDRAND
> >failures.
> >
> >The silence appears to be deafening out of the respective engineering
> >camps... :-)
> >
> >> > - AMD's RDRAND/RDSEED implementation is significantly slower than
> >> >   Intel's.
> >
> >> Yes, we also noticed the AMD impl is horribly slow compared to
> >> Intel, had to cut test iterations x100.
> >
> >The operative question is the impact of 'slow', in the absence of
> >artifical stress tests.
> >
> >It would seem that a major question is what are or were the
> >engineering thought processes on the throughput of the hardware
> >randomness instructions.
> >
> >Intel documents the following randomness throughput rates:
> >
> >RDSEED: 3 Gbit/second
> >RDRAND: 6.4 Gbit/second
> >
> >If there is the possibility of over-harvesting randomness, why not
> >design the implementations to be clamped at some per core value such
> >as a megabit/second.  In the case of the documented RDSEED generation
> >rates, that would allow the servicing of 3222 cores, if my math at
> >0530 in the morning is correct.
> >
> >Would a core need more than 128 kilobytes of randomness, ie. one
> >second of output, to effectively seed a random number generator?
> >
> >A cynical conclusion would suggest engineering acquiesing to marketing
> >demands... :-)
> >
> >> With regards,
> >> Daniel
> >
> >Have a good day.
> >
> >As always,
> >Dr. Greg
> >
> >The Quixote Project - Flailing at the Travails of Cybersecurity
> >              https://github.com/Quixote-Project

> You do realize, right, that the "deafening silence" is due to the
> need for research and discussions on our part, and presumably AMD's.

That would certainly be anticipated if not embraced, while those
discussions ensue, let me explain where we are coming from on this
issue.

I have a long time friend and valued personal consigliere who is a
tremendous attorney and legal scholar.  She has long advised me that
two basic concepts are instilled in law school; how to make sense out
of legal writing and don't ask any questions you don't know the answer
to.

CoCo is an engineering endeavor to defy the long held, and difficult
to deny, premise in information technology that if you don't have
physical security of a platform you don't have security.

We value ourselves as a good engineering team with considerable
interest and experience in all of this.  Any suggestion that there may
be some type of, even subtle, concern over the behavior of fundamental
hardware security primitives causes us to start asking questions and
testing things.

In this case the testing, quickly and easily, caused even more
questions to emerge.

> In addition, quite frankly, your rather abusive language isn't
> exactly encouraging people to speak publicly based on immediately
> available and therefore inherently incomplete and/or dated
> information, meaning that we have had to take even what discussions we
> might have been able to have in public without IP concerns behind the
> scenes.

Abusive?

I will freely don the moniker of being a practioner and purveyor of
rapier cynicism and wit, however, everyone who knows me beyond e-mail
would tell you that abusive would be the last definition they would
use in describing my intent and character.

I had the opportunity to sit next to Jim Gordon at dinner, who at the
time ran the Platform Security Division for Intel, at the SGX
Development Outreach Meeting that was held in Tel Aviv.  That was
after a pretty direct, but productive, technical exchange with the SGX
hardware engineers from Haifa.

I told him that I didn't mean to put the engineers on the spot but we
had been asked to voice our concerns as SGX infrastructure developers.
He told me the purpose of the meeting was for Intel to get tough and
demanding questions on issues of concern to developers so Intel could
deliver better and more relevant technology.

Just for the record and to close the abusive issue.  A review of this
thread will show that I never threw out accusations that hardware was
busted, backdoored nor did I advocate that the solution was to find
a better hardware vendor.

You fix engineering problems with engineering facts, hence our
interest in seeing how the question that got asked, perhaps
inadvertently, gets answered, so appropriate engineering changes can
be made in security dependent products.

> Yes, we work for Intel. No, we don't know every detail about every
> Intel chip ever created off the top of my head, nor do we
> necessarily know the exact person that is *currently* in charge of
> the architecture of a particular unit, nor is it necessarily true
> that even *that* person knows all the exact properties of the
> behavior of their unit when integrated into a particular SoC, as
> units are modular by design.

Interesting.

From the outside looking in, as engineers, this raises the obvious
question if the 'bus factor' for Bull Mountain has been exceeded.

Let me toss out, hopefully as a positive contribution, a 'Gedanken
Thought Dilemma' that the Intel team can take into their internal
deliberations.

One of the other products of this thread was the suggestion that a
CoCo hypervisor/hardware contribution could exert sufficient timing or
scheduling control so as to defeat the SDM's 10 try RDRAND
recommendation and induce a denial-of-service condition.

If that is indeed a possibility, given the long history of timing
based observation attacks on confidentiality, what guidance can be
offered to consumers of the relevant technologies that CoCo is indeed
a valid concept.  Particularly given the fact that the hardware that
consumers are trusting is physically in the hands of highly skilled
personnel, who have both the skills and phsyical control of the
hardware needed, to mount such an attack?

This is obviously a somewhat larger question than if RDRAND depletion
can be practically induced, so no need to rush the deliberations on
our behalf.

We will stand by in a quiet and decidedly non-abusive and
non-threatening posture, waiting to see what reflections that Borislav
might have on all of this... :-)

Have a good weekend.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-08 11:44                       ` Dr. Greg
@ 2024-02-09 17:31                         ` Borislav Petkov
  2024-02-09 19:49                           ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Borislav Petkov @ 2024-02-09 17:31 UTC (permalink / raw)
  To: Dr. Greg
  Cc: Daniel P. Berrang??,
	Reshetova, Elena, Jason A. Donenfeld, Hansen, Dave,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, Dave Hansen,
	H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Thu, Feb 08, 2024 at 05:44:44AM -0600, Dr. Greg wrote:
> I guess a useful starting point would be if AMD would like to offer
> any type of quantification for 'astronomically small' when it comes to
> the probability of failure over 10 RDRAND attempts... :-)

Right, let's establish the common ground first: please have a look at
this, albeit a bit outdated whitepaper:

https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/white-papers/amd-random-number-generator.pdf

in case you haven't seen it yet.

Now, considering that this is a finite resource, you can imagine that
there can be scenarios where that source can be depleted.

And newer Zen generations perform significantly better. So much so that
on Zen3 and later 10 retries should never observe a failure unless it
is bad hardware. Also, I agree with hpa's note that any and all retries
should be time based.

> Secondly, given our test findings and those of RedHat, would it be
> safe to assume that EPYC has engineering that prevents RDSEED failures
> that Ryzen does not?

Well, roughly speaking, client is a less beefier and less performant
version of server. You can extrapolate that to the topic at hand.

But at least on AMD, any potential DoSing of RDRAND on client doesn't
matter for CoCo because client doesn't enable SEV*.

> Both AMD and Intel designs start with a hardware based entropy source.
> Intel samples thermal/quantum junction noise, AMD samples execution
> jitter over a bank of inverter based oscillators.

See above paper for the AMD side.

> An assumption of constant clocked sampling implies a maximum
> randomness bandwidth limit.

You said it.

> None of this implies that randomness is a finite resource

Huh? This contradicts with what you just said in the above sentence.

Or maybe I'm reading this wrong...

> So this leaves the fundamental question of what does an RDRAND or
> RDSEED failure return actually imply?

Simple: if no random data is ready at the time the insn executes, it
says "invalid". Because the generator is a finite resource as you said
above, if the software tries to pull random data faster than it can
generate, this is the main case for CF=0.

> Silicon is a expensive resource, which would imply a queue depth
> limitation for access to the socket common RNG infastructure.  If the
> queue is full when an instruction issues, it would be a logical
> response to signal an instruction failure quickly and let software try
> again.

That's actually in the APM documenting RDRAND:

"If the returned value is invalid, software must execute the instruction
again."

> Given the time and engineering invested in the engineering behind both
> TDX and SEV-SNP, it would seem unlikely that really smart engineers at
> both Intel and AMD didn't anticipate this issue and its proper
> resolution for CoCo environments.

You can probably imagine that no one can do a fully secure system in one
single attempt but rather needs to do an iterative process.

And I don't know how much you've followed those technologies but they
*are* the perfect example for such an iterative improvement process.

I hope this answers at least some of your questions.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-09 17:31                         ` Borislav Petkov
@ 2024-02-09 19:49                           ` Jason A. Donenfeld
  2024-02-09 20:37                             ` Dave Hansen
  2024-02-09 21:45                             ` Borislav Petkov
  0 siblings, 2 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-09 19:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Dr. Greg, Daniel P. Berrang??,
	Reshetova, Elena, Hansen, Dave, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Hey Boris,

While you're here, I was wondering if you could comment on one thing related:

On Fri, Feb 9, 2024 at 6:31 PM Borislav Petkov <bp@alien8.de> wrote:
> Now, considering that this is a finite resource, you can imagine that
> there can be scenarios where that source can be depleted.

Yea, this makes sense.

[As an aside, I would like to note that a different construction of
RDRAND could keep outputting good random numbers for a reeeeeallly
long time without needing to reseed, or without penalty if RDSEED is
depleted, and so could be made to actually never fail. But given the
design goals of RDRAND, this kind of crypto is highly likely to never
be implemented, so I'm not even moving to suggest that AMD/Intel just
'fix' the crypto design goals of the instruction. It's not gonna
happen for lots of reasons.]

So assuming that RDSEED and hence RDRAND can never be made to never
fail, the options are:

1. Finite resource that refills faster than whatever instruction
issuance latency, so it's never observably empty. (Seems unlikely)
2. More secure sharing of the finite resource.

It's this second option I wanted to ask you about. I wrote down what I
thought "secure sharing" meant here [1]:

> - One VMX (or host) context can't DoS another one.
> - Ring 3 can't DoS ring 0.

It's a bit of a scheduling/queueing thing, where different security
contexts shouldn't be able to starve others out of the finite resource
indefinitely.

What I'm wondering is if that kind of fairness is even possible to
achieve in the hardware or the microcode. I don't really know how that
all works under the covers and what sorts of "policies" and such are
feasible to implement. In suggesting it, I feel like a bit of a
presumptuous kernel developer talking to hardware people, not fully
appreciating their domain and its challenges. For, if this were just a
C program, I know exactly what I'd do, but we're talking about a CPU
here.

Is it actually possible to make RDRAND usage "fair" between different
security contexts? Or am I totally delusional and this is not how the
hardware works or can ever work?

Jason

[1] https://lore.kernel.org/all/CAHmME9ps6W5snQrYeNVMFgfhMKFKciky=-UxxGFbAx_RrxSHoA@mail.gmail.com/

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-03 10:12                                   ` Jason A. Donenfeld
@ 2024-02-09 19:53                                     ` Jason A. Donenfeld
  2024-02-12  8:25                                       ` Reshetova, Elena
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-09 19:53 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Reshetova, Elena, Theodore Ts'o, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

Hi Kirill,

On Sat, Feb 3, 2024 at 11:12 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Yea, actually, I had a pretty similar idea for something like that
> that's very non-invasive, where none of this even touches the RDRAND
> core code, much less random.c. Specifically, we consider "adding some
> extra RDRAND to the pool" like any other driver that wants to add some
> of its own seeds to the pool, with add_device_randomness(), a call that
> lives in various driver code, doesn't influence any entropy readiness
> aspects of random.c, and can safely be sprinkled in any device or
> platform driver.
>
> Specifically what I'm thinking about is something like:
>
> void coco_main_boottime_init_function_somewhere_deep_in_arch_code(void)
> {
>   // [...]
>   // bring up primary CoCo nuts
>   // [...]
>
>   /* CoCo requires an explicit RDRAND seed, because the host can make the
>    * rest of the system deterministic.
>    */
>   unsigned long seed[32 / sizeof(long)];
>   size_t i, longs;
>   for (i = 0; i < ARRAY_SIZE(seed); i += longs) {
>     longs = arch_get_random_longs(&seed[i], ARRAY_SIZE(seed) - i);
>     /* If RDRAND is being DoS'd, panic, because we can't ensure
>      * confidentiality.
>      */
>     BUG_ON(!longs);
>   }
>   add_device_randomness(seed, sizeof(seed));
>   memzero_explicit(seed, sizeof(seed));
>
>   // [...]
>   // do other CoCo things
>   // [...]
> }
>
> I would have no objection to the CoCo people adding something like this
> and would give it my Ack, but more importantly, my Ack for that doesn't
> even matter, because add_device_randomness() is pretty innocuous.
>
> So Kirill, if nobody else here objects to that approach, and you want to
> implement it in some super minimal way like that, that would be fine
> with me. Or maybe we want to wait for that internal inquiry at Intel to
> return some answers first. But either way, this might be an easy
> approach that doesn't add too much complexity.

I went ahead and implemented this just to have something concrete out there:
https://lore.kernel.org/all/20240209164946.4164052-1-Jason@zx2c4.com/

I probably screwed up some x86 platform conventions/details, but
that's the general idea I had in mind.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-09 19:49                           ` Jason A. Donenfeld
@ 2024-02-09 20:37                             ` Dave Hansen
  2024-02-09 21:45                             ` Borislav Petkov
  1 sibling, 0 replies; 99+ messages in thread
From: Dave Hansen @ 2024-02-09 20:37 UTC (permalink / raw)
  To: Jason A. Donenfeld, Borislav Petkov
  Cc: Dr. Greg, Daniel P. Berrang??,
	Reshetova, Elena, Kirill A. Shutemov, Thomas Gleixner,
	Ingo Molnar, Dave Hansen, H. Peter Anvin, x86, Theodore Ts'o,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On 2/9/24 11:49, Jason A. Donenfeld wrote:
> [As an aside, I would like to note that a different construction of
> RDRAND could keep outputting good random numbers for a reeeeeallly
> long time without needing to reseed, or without penalty if RDSEED is
> depleted, and so could be made to actually never fail. But given the
> design goals of RDRAND, this kind of crypto is highly likely to never
> be implemented, so I'm not even moving to suggest that AMD/Intel just
> 'fix' the crypto design goals of the instruction. It's not gonna
> happen for lots of reasons.]

Intel's RDRAND reseeding behavior is spelled out here:

> https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html

In the "Guaranteeing DBRG Reseeding" section.

> It's a bit of a scheduling/queueing thing, where different security
> contexts shouldn't be able to starve others out of the finite resource
> indefinitely.
> 
> What I'm wondering is if that kind of fairness is even possible to
> achieve in the hardware or the microcode.
...

Even ignoring different security contexts, Intel's whitepaper claims
that no starvation happens with RDRAND:

> If multiple threads are invoking RDRAND simultaneously, total RDRAND
> throughput (across all threads) scales approximately linearly with
> the number of threads until no more hardware threads remain, the bus
> limits of the processor are reached, or the DRNG interface is fully
> saturated. Past this point, the maximum throughput is divided equally
> among the active threads. No threads get starved.

800 MB/sec of total RDRAND throughput across all threads, guaranteed
reseeding, and no starvation sounds pretty good to me.

Does that need improving?

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails
  2024-02-09 19:49                           ` Jason A. Donenfeld
  2024-02-09 20:37                             ` Dave Hansen
@ 2024-02-09 21:45                             ` Borislav Petkov
  1 sibling, 0 replies; 99+ messages in thread
From: Borislav Petkov @ 2024-02-09 21:45 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Dr. Greg, Daniel P. Berrang??,
	Reshetova, Elena, Hansen, Dave, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Dave Hansen, H. Peter Anvin, x86,
	Theodore Ts'o, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Fri, Feb 09, 2024 at 08:49:40PM +0100, Jason A. Donenfeld wrote:
> While you're here,

I was here the whole time, lurking in the shadows. :)

> Is it actually possible to make RDRAND usage "fair" between different
> security contexts? Or am I totally delusional and this is not how the
> hardware works or can ever work?

Yeah, I know exactly what you mean and I won't go into details for
obvious reasons. Two things:

* Starting with Zen3, provided properly configured hw RDRAND will never
fail. It is also fair when feeding the different contexts.

* My hardware engineers tell me that this is tough to do for RDSEED

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-09 19:53                                     ` Jason A. Donenfeld
@ 2024-02-12  8:25                                       ` Reshetova, Elena
  2024-02-12 16:32                                         ` Theodore Ts'o
  2024-02-14 17:30                                         ` Jason A. Donenfeld
  0 siblings, 2 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-12  8:25 UTC (permalink / raw)
  To: Jason A. Donenfeld, Kirill A. Shutemov, H. Peter Anvin
  Cc: Theodore Ts'o, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

> Hi Kirill,
> 
> On Sat, Feb 3, 2024 at 11:12 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > Yea, actually, I had a pretty similar idea for something like that
> > that's very non-invasive, where none of this even touches the RDRAND
> > core code, much less random.c. Specifically, we consider "adding some
> > extra RDRAND to the pool" like any other driver that wants to add some
> > of its own seeds to the pool, with add_device_randomness(), a call that
> > lives in various driver code, doesn't influence any entropy readiness
> > aspects of random.c, and can safely be sprinkled in any device or
> > platform driver.
> >
> > Specifically what I'm thinking about is something like:
> >
> > void coco_main_boottime_init_function_somewhere_deep_in_arch_code(void)
> > {
> >   // [...]
> >   // bring up primary CoCo nuts
> >   // [...]
> >
> >   /* CoCo requires an explicit RDRAND seed, because the host can make the
> >    * rest of the system deterministic.
> >    */
> >   unsigned long seed[32 / sizeof(long)];
> >   size_t i, longs;
> >   for (i = 0; i < ARRAY_SIZE(seed); i += longs) {
> >     longs = arch_get_random_longs(&seed[i], ARRAY_SIZE(seed) - i);
> >     /* If RDRAND is being DoS'd, panic, because we can't ensure
> >      * confidentiality.
> >      */
> >     BUG_ON(!longs);
> >   }
> >   add_device_randomness(seed, sizeof(seed));
> >   memzero_explicit(seed, sizeof(seed));
> >
> >   // [...]
> >   // do other CoCo things
> >   // [...]
> > }
> >
> > I would have no objection to the CoCo people adding something like this
> > and would give it my Ack, but more importantly, my Ack for that doesn't
> > even matter, because add_device_randomness() is pretty innocuous.
> >
> > So Kirill, if nobody else here objects to that approach, and you want to
> > implement it in some super minimal way like that, that would be fine
> > with me. Or maybe we want to wait for that internal inquiry at Intel to
> > return some answers first. But either way, this might be an easy
> > approach that doesn't add too much complexity.
> 
> I went ahead and implemented this just to have something concrete out there:
> https://lore.kernel.org/all/20240209164946.4164052-1-Jason@zx2c4.com/
> 
> I probably screwed up some x86 platform conventions/details, but
> that's the general idea I had in mind.
> 

Thank you Jason!
I want to bring another potential idea here for a discussion, which Peter Anvin
proposed in our internal discussions, and I like it conceptually better
than any options we discussed so far since it is much more generic. 

What if we instead of doing some special treatment on rdrand/seed, we
try to fix the underneath problem of Linux RNG not supporting CoCo threat
model. Linux RNG has almost set in stone definition of what sources contribute
entropy and what don’t (with some additional flexibility with flags like trust_cpu).
This works well for the current fixed threat model, but doesn’t work for
CoCo because some sources are suddenly not trusted anymore to contribute
entropy. However, some are still trusted and that is not just rdrand/rdseed,
but we would also trust add_hwgenerator_randomness (given that we use
TEE IO device here or have a way to get this input securely). So, even in
theoretical scenario that both rdrand/rdseed is broken (let's say HW failure),
a Linux RNG can actually boot securely in the guest if we have enough
entropy from add_hwgenerator_randomness.

So the change would be around adding the notion of conditional entropy
counting (we will always take input as we do now because it wont hurt),
which would automatically give us a correct behavior in _credit_init_bits()
for initial seeding of crng. Also we need to have a generic way to stop the
boot if the entropy is not increasing (for any reasons) and prevent booting
with insecurely seeded crng. 

I do understand that this is going to be much bigger change than anything we
are discussing so far, but conceptually it sounds right to be able to have a say
what sources of entropy one trusts in runtime (probably applicable beyond
CoCo in the future also) and what is the action when we cannot collect the
entropy from these sources. 

What does everyone think? 

Best Regards,
Elena.





^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-12  8:25                                       ` Reshetova, Elena
@ 2024-02-12 16:32                                         ` Theodore Ts'o
  2024-02-13  7:28                                           ` Dan Williams
  2024-02-14 17:30                                         ` Jason A. Donenfeld
  1 sibling, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-12 16:32 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Jason A. Donenfeld, Kirill A. Shutemov, H. Peter Anvin,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Mon, Feb 12, 2024 at 08:25:33AM +0000, Reshetova, Elena wrote:
> What if we instead of doing some special treatment on rdrand/seed, we
> try to fix the underneath problem of Linux RNG not supporting CoCo threat
> model. Linux RNG has almost set in stone definition of what sources contribute
> entropy and what don’t (with some additional flexibility with flags like trust_cpu).
> This works well for the current fixed threat model, but doesn’t work for
> CoCo because some sources are suddenly not trusted anymore to contribute
> entropy. However, some are still trusted and that is not just rdrand/rdseed,
> but we would also trust add_hwgenerator_randomness (given that we use
> TEE IO device here or have a way to get this input securely). So, even in
> theoretical scenario that both rdrand/rdseed is broken (let's say HW failure),
> a Linux RNG can actually boot securely in the guest if we have enough
> entropy from add_hwgenerator_randomness.

So the problem with this is that there is now way we can authenticate
the hardware RNG.  For example, the hypervisor could claim that there
is a ChaosKey USB key attached, and at the moment, unlike all other
hardware random number generators, the Linux kernel is configured to
blindly trust the ChaosKey because it was designed by Keith Packard
and Bdale Garbee, and "It Must Be Good".  But the only way that we
know that it is a ChaosKey is by its USB major and minor id numbers
--- and a malicious hypervisor could fake up such a device.

And of course, that's not unique to the hypervisor --- someone could
create a hardware USB key that claimed to be a ChaosKey, but which
generated a fixed sequence, say 3,1,4,1,5,9,2,6,... and it would pass
most RNG quality checkers, since it's obviously not a repeated
sequence of digits, so the mandated FIPS required check would give it
a thumbs up.  And it doesn't have to be a faked ChaosKey device; a
hypervisor could claim that there is a virtual TPM with its hardware
random number generator, but it's also gimmicked to always give the
same fixed sequence, and there's no way the guest OS could know
otherwise.

Hence, for the unique requirements of Confidential Compute, I'm afraid
it's RDRAND/RSEED or bust....

						- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-12 16:32                                         ` Theodore Ts'o
@ 2024-02-13  7:28                                           ` Dan Williams
  2024-02-13 23:13                                             ` Theodore Ts'o
  0 siblings, 1 reply; 99+ messages in thread
From: Dan Williams @ 2024-02-13  7:28 UTC (permalink / raw)
  To: Theodore Ts'o, Reshetova, Elena
  Cc: Jason A. Donenfeld, Kirill A. Shutemov, H. Peter Anvin,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

Theodore Ts'o wrote:
> On Mon, Feb 12, 2024 at 08:25:33AM +0000, Reshetova, Elena wrote:
> > What if we instead of doing some special treatment on rdrand/seed, we
> > try to fix the underneath problem of Linux RNG not supporting CoCo threat
> > model. Linux RNG has almost set in stone definition of what sources contribute
> > entropy and what don’t (with some additional flexibility with flags like trust_cpu).
> > This works well for the current fixed threat model, but doesn’t work for
> > CoCo because some sources are suddenly not trusted anymore to contribute
> > entropy. However, some are still trusted and that is not just rdrand/rdseed,
> > but we would also trust add_hwgenerator_randomness (given that we use
> > TEE IO device here or have a way to get this input securely). So, even in
> > theoretical scenario that both rdrand/rdseed is broken (let's say HW failure),
> > a Linux RNG can actually boot securely in the guest if we have enough
> > entropy from add_hwgenerator_randomness.
> 
> So the problem with this is that there is now way we can authenticate
> the hardware RNG.

Sure there is, that is what, for example, PCI TDISP (TEE Device
Interface Security Protocol) is about. Set aside the difficulty of doing
the PCI TDISP flow early in boot, and validating the device certficate
and measurements based on golden values without talking to a remote
verifier etc..., but if such a device has been accepted and its driver
calls hwrng_register() it should be added as an entropy source.

Now maybe there is something fatal in that "etc", and RDRAND needs to
work for early entropy, but if a PCI device passes guest acceptance
there should be no additional concerns for it to be considered a CC
approved RNG.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-13  7:28                                           ` Dan Williams
@ 2024-02-13 23:13                                             ` Theodore Ts'o
  2024-02-14  0:53                                               ` Dan Williams
  0 siblings, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-13 23:13 UTC (permalink / raw)
  To: Dan Williams
  Cc: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov,
	H. Peter Anvin, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Mon, Feb 12, 2024 at 11:28:31PM -0800, Dan Williams wrote:
> Sure there is, that is what, for example, PCI TDISP (TEE Device
> Interface Security Protocol) is about. Set aside the difficulty of doing
> the PCI TDISP flow early in boot, and validating the device certficate
> and measurements based on golden values without talking to a remote
> verifier etc..., but if such a device has been accepted and its driver
> calls hwrng_register() it should be added as an entropy source.

How real is TDISP?  What hardware exists today and how much of this
support is ready to land in the kernel?  Looking at the news articles,
it appears to me like bleeding edge technology, and what an unkind
person might call "vaporware"?  Is that an unfair characterization?

There have plenty of things that have squirted out of standards
bodies, like for example, "objected base storage", which has turned
out to be a complete commercial failure and was never actually
deployed in any real numbers, other than sample hardare being provided
to academic researchers.  How can we be sure that PCI TDISP won't end
up going down that route?

In any case, if we are going to go down this path, we will need to
have some kind of policy engine hwrng_register() reject
non-authenticated hardware if Confidential Compute is enabled (and
possibly in other cases).

				- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-13 23:13                                             ` Theodore Ts'o
@ 2024-02-14  0:53                                               ` Dan Williams
  2024-02-14  4:32                                                 ` Theodore Ts'o
  0 siblings, 1 reply; 99+ messages in thread
From: Dan Williams @ 2024-02-14  0:53 UTC (permalink / raw)
  To: Theodore Ts'o, Dan Williams
  Cc: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov,
	H. Peter Anvin, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Theodore Ts'o wrote:
> On Mon, Feb 12, 2024 at 11:28:31PM -0800, Dan Williams wrote:
> > Sure there is, that is what, for example, PCI TDISP (TEE Device
> > Interface Security Protocol) is about. Set aside the difficulty of doing
> > the PCI TDISP flow early in boot, and validating the device certficate
> > and measurements based on golden values without talking to a remote
> > verifier etc..., but if such a device has been accepted and its driver
> > calls hwrng_register() it should be added as an entropy source.
> 
> How real is TDISP?  What hardware exists today and how much of this
> support is ready to land in the kernel?  Looking at the news articles,
> it appears to me like bleeding edge technology, and what an unkind
> person might call "vaporware"?  Is that an unfair characterization?

Indeed it is. Typically when you have x86, riscv, arm, and s390 folks
all show up at a Linux Plumbers session [1] to talk about their approach
to handling a new platform paradigm, that is a decent indication that
the technology is more real than not. Point taken that it is not here
today, but it is also not multiple hardware generations away as the
Plumbers participation indicated.

> There have plenty of things that have squirted out of standards
> bodies, like for example, "objected base storage", which has turned
> out to be a complete commercial failure and was never actually
> deployed in any real numbers, other than sample hardare being provided
> to academic researchers.  How can we be sure that PCI TDISP won't end
> up going down that route?

Of course, that is always a risk. History is littered with obsolesence,
some of it before seeing any commercial uptake, some after.

> In any case, if we are going to go down this path, we will need to
> have some kind of policy engine hwrng_register() reject
> non-authenticated hardware if Confidential Compute is enabled (and
> possibly in other cases).

Sounds reasonable, that recognition is all I wanted from mentioning PCI
TDISP.

[1]: https://lpc.events/event/17/contributions/1633/

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14  0:53                                               ` Dan Williams
@ 2024-02-14  4:32                                                 ` Theodore Ts'o
  2024-02-14  6:48                                                   ` Dan Williams
                                                                     ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-14  4:32 UTC (permalink / raw)
  To: Dan Williams
  Cc: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov,
	H. Peter Anvin, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

On Tue, Feb 13, 2024 at 04:53:06PM -0800, Dan Williams wrote:
> 
> Indeed it is. Typically when you have x86, riscv, arm, and s390 folks
> all show up at a Linux Plumbers session [1] to talk about their approach
> to handling a new platform paradigm, that is a decent indication that
> the technology is more real than not. Point taken that it is not here
> today, but it is also not multiple hardware generations away as the
> Plumbers participation indicated.

My big concerns with TDISP which make me believe it may not be a
silver bullet is that (a) it's hyper-complex (although to be fair
Confidential Compute isn't exactly simple, and (b) it's one thing to
digitally sign software so you know that it comes from a trusted
source; but it's a **lot** harder to prove that hardware hasn't been
tampered with --- a digital siganture can't tell you much about
whether or not the hardware is in an as-built state coming from the
factory --- this requires things like wrapping the device with
resistive wire in multiple directions with a whetstone bridge to
detect if the wire has gotten cut or shorted, then dunking the whole
thing in epoxy, so that any attempt to tamper with the hardware will
result it self-destructing (via a thermite charge or equivalent :-)

Remember, the whole conceit of Confidential Compute is that you don't
trust the cloud provider --- but if that entity controls the PCI cards
installed in their servers, and and that entity has the ability to
*modify* the PCI cards in the server, all of the digital signatures
and fancy-schmancy TDISP complexity isn't necessarily going to save
you.

The final concern is that it may take quite a while before these
devices become real, and then for cloud providers like Amazon, Azure,
to actually deploy them.  And in the meantime, Confidential Compute
VM's are already something which are available for customers to
purchase *today*.  So we need some kind of solution right now, and
preferably, something which is simple enough that it is likely to be
back-portable to RHEL.

(And I fear that even if TDISP hardware existed today, it is so
complicated that it may be a heavy lift to get it backported into
enterprise distro kernels.)

Ultimately, if CPU's can actually have an architectgural RNG ala
RDRAND/RDSEED that actually can do the right thing in the face of
entropy draining attacks, that seems to be a **much** simpler
solution.  And even if it requires waiting for the next generation of
CPU's, this might be faster than waiting for the TDISP ecosystem
mature.

					- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14  4:32                                                 ` Theodore Ts'o
@ 2024-02-14  6:48                                                   ` Dan Williams
  2024-02-14  6:54                                                   ` Reshetova, Elena
  2024-02-14  8:34                                                   ` Nikolay Borisov
  2 siblings, 0 replies; 99+ messages in thread
From: Dan Williams @ 2024-02-14  6:48 UTC (permalink / raw)
  To: Theodore Ts'o, Dan Williams
  Cc: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov,
	H. Peter Anvin, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel

Theodore Ts'o wrote:
> On Tue, Feb 13, 2024 at 04:53:06PM -0800, Dan Williams wrote:
> > 
> > Indeed it is. Typically when you have x86, riscv, arm, and s390 folks
> > all show up at a Linux Plumbers session [1] to talk about their approach
> > to handling a new platform paradigm, that is a decent indication that
> > the technology is more real than not. Point taken that it is not here
> > today, but it is also not multiple hardware generations away as the
> > Plumbers participation indicated.
> 
> My big concerns with TDISP which make me believe it may not be a
> silver bullet is that (a) it's hyper-complex (although to be fair
> Confidential Compute isn't exactly simple, and (b) it's one thing to
> digitally sign software so you know that it comes from a trusted
> source; but it's a **lot** harder to prove that hardware hasn't been
> tampered with --- a digital siganture can't tell you much about
> whether or not the hardware is in an as-built state coming from the
> factory --- this requires things like wrapping the device with
> resistive wire in multiple directions with a whetstone bridge to
> detect if the wire has gotten cut or shorted, then dunking the whole
> thing in epoxy, so that any attempt to tamper with the hardware will
> result it self-destructing (via a thermite charge or equivalent :-)
> 
> Remember, the whole conceit of Confidential Compute is that you don't
> trust the cloud provider --- but if that entity controls the PCI cards
> installed in their servers, and and that entity has the ability to
> *modify* the PCI cards in the server, all of the digital signatures
> and fancy-schmancy TDISP complexity isn't necessarily going to save
> you.
>
> The final concern is that it may take quite a while before these
> devices become real, and then for cloud providers like Amazon, Azure,
> to actually deploy them.  And in the meantime, Confidential Compute
> VM's are already something which are available for customers to
> purchase *today*.  So we need some kind of solution right now, and
> preferably, something which is simple enough that it is likely to be
> back-portable to RHEL.
> 
> (And I fear that even if TDISP hardware existed today, it is so
> complicated that it may be a heavy lift to get it backported into
> enterprise distro kernels.)

No lies detected.

Something is broken if you need to rely on TDISP to get a reliable
random number in a guest. All it can enforce is that the VMM is not
emulating a HWRNG. Also, VMM denial of service is outside of the TDISP
threat model, so if VMM can steal all the entropy, or DoS RDSEED, you
are back at square one. The only reason for jumping in on this tangent
was to counterpoint the implication that the RNG core must always hard
code a dependency on CPU HWRNG for confidential computing.

However, yes, given the timelines for TDISP Linux could hard code that
choice in the near term for expediency and leave it to the TDISP folks
to unwind it later.

> Ultimately, if CPU's can actually have an architectgural RNG ala
> RDRAND/RDSEED that actually can do the right thing in the face of
> entropy draining attacks, that seems to be a **much** simpler
> solution.  And even if it requires waiting for the next generation of
> CPU's, this might be faster than waiting for the TDISP ecosystem
> mature.

Yes, please. I am happy if TDISP flies below the hype cycle so that its
implications can be considered carefullly. At the same time I will keep
an eye out for discussions like this where guest attestation of hardware
provenance is raised.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14  4:32                                                 ` Theodore Ts'o
  2024-02-14  6:48                                                   ` Dan Williams
@ 2024-02-14  6:54                                                   ` Reshetova, Elena
  2024-02-14  8:34                                                   ` Nikolay Borisov
  2 siblings, 0 replies; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-14  6:54 UTC (permalink / raw)
  To: Theodore Ts'o, Williams, Dan J
  Cc: Jason A. Donenfeld, Kirill A. Shutemov, H. Peter Anvin,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

 
> Ultimately, if CPU's can actually have an architectgural RNG ala
> RDRAND/RDSEED that actually can do the right thing in the face of
> entropy draining attacks, that seems to be a **much** simpler
> solution.  

I don’t think anyone would object that the rdrand approach we are
discussing here is simpler. My point (and also Peter original idea) was
that if we want to do it correctly and generically (and *not* just about 
confidential computing), we ought to provide a way for users to define
what entropy sources for Linux RNG they are willing to trust or not.
This should not be a policy decision that kernel hardcodes (we try hard
to avoid policies in kernel), but left for users to decide/configure based
on their preferences, trust notions, fears of backdooring, whatelse.
This of course has the flip part that some users will get it wrong, but
reasonable secure defaults can be provided also. 

Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14  4:32                                                 ` Theodore Ts'o
  2024-02-14  6:48                                                   ` Dan Williams
  2024-02-14  6:54                                                   ` Reshetova, Elena
@ 2024-02-14  8:34                                                   ` Nikolay Borisov
  2024-02-14  9:34                                                     ` Dr. Greg
  2 siblings, 1 reply; 99+ messages in thread
From: Nikolay Borisov @ 2024-02-14  8:34 UTC (permalink / raw)
  To: Theodore Ts'o, Dan Williams
  Cc: Reshetova, Elena, Jason A. Donenfeld, Kirill A. Shutemov,
	H. Peter Anvin, Dave Hansen, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun,
	Tom Lendacky, Kalra, Ashish, Sean Christopherson, linux-coco,
	linux-kernel



On 14.02.24 г. 6:32 ч., Theodore Ts'o wrote:
> On Tue, Feb 13, 2024 at 04:53:06PM -0800, Dan Williams wrote:
>>
>> Indeed it is. Typically when you have x86, riscv, arm, and s390 folks
>> all show up at a Linux Plumbers session [1] to talk about their approach
>> to handling a new platform paradigm, that is a decent indication that
>> the technology is more real than not. Point taken that it is not here
>> today, but it is also not multiple hardware generations away as the
>> Plumbers participation indicated.
> 
> My big concerns with TDISP which make me believe it may not be a
> silver bullet is that (a) it's hyper-complex (although to be fair
> Confidential Compute isn't exactly simple, and (b) it's one thing to
> digitally sign software so you know that it comes from a trusted
> source; but it's a **lot** harder to prove that hardware hasn't been
> tampered with --- a digital siganture can't tell you much about
> whether or not the hardware is in an as-built state coming from the
> factory --- this requires things like wrapping the device with
> resistive wire in multiple directions with a whetstone bridge to
> detect if the wire has gotten cut or shorted, then dunking the whole
> thing in epoxy, so that any attempt to tamper with the hardware will
> result it self-destructing (via a thermite charge or equivalent :-)

This really reminds me of the engineering that goes into the omnipresent 
POS terminals ate every store, since they store certificates from the 
card (Visa/Master) operators. So I wonder if at somepoint we'll have a 
pos-like device (by merit of its engineering) in every server....

> 
> Remember, the whole conceit of Confidential Compute is that you don't
> trust the cloud provider --- but if that entity controls the PCI cards
> installed in their servers, and and that entity has the ability to
> *modify* the PCI cards in the server, all of the digital signatures
> and fancy-schmancy TDISP complexity isn't necessarily going to save
> you.

Can't the same argument go for the CPU, though it's a lot more 
"integrated" into the silicong substrate, yet we somehow believe CoCo 
ascertains that a vm is running on trusted hardware? But ultimately the 
CPU is still a part that comes from the untrusted CSP.


<snip>

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14  8:34                                                   ` Nikolay Borisov
@ 2024-02-14  9:34                                                     ` Dr. Greg
  0 siblings, 0 replies; 99+ messages in thread
From: Dr. Greg @ 2024-02-14  9:34 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Theodore Ts'o, Dan Williams, Reshetova, Elena,
	Jason A. Donenfeld, Kirill A. Shutemov, H. Peter Anvin,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Wed, Feb 14, 2024 at 10:34:48AM +0200, Nikolay Borisov wrote:

Hi, I hope the week is going well for everyone.

> On 14.02.24 ??. 6:32 ??., Theodore Ts'o wrote:
> >On Tue, Feb 13, 2024 at 04:53:06PM -0800, Dan Williams wrote:
> >>
> >>Indeed it is. Typically when you have x86, riscv, arm, and s390 folks
> >>all show up at a Linux Plumbers session [1] to talk about their approach
> >>to handling a new platform paradigm, that is a decent indication that
> >>the technology is more real than not. Point taken that it is not here
> >>today, but it is also not multiple hardware generations away as the
> >>Plumbers participation indicated.
> >
> >My big concerns with TDISP which make me believe it may not be a
> >silver bullet is that (a) it's hyper-complex (although to be fair
> >Confidential Compute isn't exactly simple, and (b) it's one thing to
> >digitally sign software so you know that it comes from a trusted
> >source; but it's a **lot** harder to prove that hardware hasn't been
> >tampered with --- a digital siganture can't tell you much about
> >whether or not the hardware is in an as-built state coming from the
> >factory --- this requires things like wrapping the device with
> >resistive wire in multiple directions with a whetstone bridge to
> >detect if the wire has gotten cut or shorted, then dunking the whole
> >thing in epoxy, so that any attempt to tamper with the hardware will
> >result it self-destructing (via a thermite charge or equivalent :-)

> This really reminds me of the engineering that goes into the
> omnipresent POS terminals ate every store, since they store
> certificates from the card (Visa/Master) operators. So I wonder if
> at somepoint we'll have a pos-like device (by merit of its
> engineering) in every server....

It already exists.  CoCo, at least the Intel implementation, is
dependent on what amounts to this concept.

> >Remember, the whole conceit of Confidential Compute is that you don't
> >trust the cloud provider --- but if that entity controls the PCI cards
> >installed in their servers, and and that entity has the ability to
> >*modify* the PCI cards in the server, all of the digital signatures
> >and fancy-schmancy TDISP complexity isn't necessarily going to save
> >you.

> Can't the same argument go for the CPU, though it's a lot more
> "integrated" into the silicong substrate, yet we somehow believe
> CoCo ascertains that a vm is running on trusted hardware? But
> ultimately the CPU is still a part that comes from the untrusted
> CSP.

The attestation model for TDX is largely built on top of SGX.

The Intel predicate with respect to SGX/TDX is that you have to trust
the CPU silicon implementation, if you can't entertain that level of
trust, it is game over for security.

To support that security model, Intel provides infrastructure that
proves that the software is running on a 'Genuine Intel' CPU.

Roughly, a root key is burned into the silicon that is used as the
basis for additional derived keys.  The key access and derivation
processes can only occur when the process is running software with a
known signature in a protected region of memory (enclave).

The model is to fill a structure with data that defines the
hardware/software state.  A keyed checksum is run over the structure
that allows a relying party to verify that the data structure contents
could have only been generated on a valid Intel CPU.

This process verifies that the CPU is from a known vendor, which is of
course only the initial starting point for verifying that something
like a VM is running in a known and trusted state.  But, if you can't
start with that predicate you have nothing to build on.

The actual implementation nowadays is a bit more complex, given that
all of this has to happen on multi-socket systems which involve more
than one CPU, but the concept is the same.

Have a good day.

As always,
Dr. Greg

The Quixote Project - Flailing at the Travails of Cybersecurity
              https://github.com/Quixote-Project

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-02  7:25                               ` Reshetova, Elena
  2024-02-02 15:39                                 ` Theodore Ts'o
@ 2024-02-14 15:18                                 ` Reshetova, Elena
  2024-02-14 17:21                                   ` Jason A. Donenfeld
  1 sibling, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-14 15:18 UTC (permalink / raw)
  To: Reshetova, Elena, Jason A. Donenfeld, Theodore Ts'o, Dave Hansen
  Cc: Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, Kuppuswamy Sathyanarayanan,
	Nakajima, Jun, Tom Lendacky, Kalra, Ashish, Sean Christopherson,
	linux-coco, linux-kernel

 
> This is a great summary of options, thank you Jason!
> My proposal would be to wait on result of our internal investigation
> before proceeding to choose the approach.

Hi everyone, 

I am finally able to share the result of my AR and here is the statement
about rdrand/rdseed on Intel platforms:

"The RdRand in a non-defective device is designed to be faster than the bus,
so when a core accesses the output from the DRNG, it will always get a
random number.
As a result, it is hard to envision a scenario where the RdRand, on a fully
functional device, will underflow.
The carry flag after RdRand signals an underflow so in the case of a defective chip,
this will prevent the code thinking it has a random number when it does not.

RdSeed however is limited by the speed of the noise source. So it is not faster
than the bus and there may be an underflow signaled by the carry flag. 
When reading for multiple values, the total throughput of RdSeed random
numbers varies over different products due to variation in the silicon processes,
operating voltage and speed vs power tradeoffs.
The throughput is shared between the cores"

In addition there is a plan to publish a whitepaper and add clarifications to
Intel official documentation on this topic, but this would obviously take longer. 

Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 15:18                                 ` Reshetova, Elena
@ 2024-02-14 17:21                                   ` Jason A. Donenfeld
  2024-02-14 17:59                                     ` Reshetova, Elena
                                                       ` (2 more replies)
  0 siblings, 3 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-14 17:21 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

Hi Elena,

On Wed, Feb 14, 2024 at 4:18 PM Reshetova, Elena <elena.reshetova@intel.com> wrote:
> "The RdRand in a non-defective device is designed to be faster than the bus,
> so when a core accesses the output from the DRNG, it will always get a
> random number.
> As a result, it is hard to envision a scenario where the RdRand, on a fully
> functional device, will underflow.
> The carry flag after RdRand signals an underflow so in the case of a defective chip,
> this will prevent the code thinking it has a random number when it does not.

That's really great news, especially combined with a very similar
statement from Borislav about AMD chips:

On Fri, Feb 9, 2024 at 10:45 PM Borislav Petkov <bp@alien8.de> wrote:
> Yeah, I know exactly what you mean and I won't go into details for
> obvious reasons. Two things:
>
> * Starting with Zen3, provided properly configured hw RDRAND will never
> fail. It is also fair when feeding the different contexts.

I assume that this faster-than-the-bus-ness also takes into account the
various accesses required to even switch contexts when scheduling VMs,
so your proposed host-guest scheduling attack can't really happen
either. Correct?

One clarifying question in all of this: what is the point of the "try 10
times" advice? Is the "faster than the bus" statement actually "faster
than the bus if you try 10 times"? Or is the "10 times" advice just old
and not relevant.

In other words, is the following a reasonable patch?

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index 02bae8e0758b..2d5bf5aa9774 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -13,22 +13,16 @@
 #include <asm/processor.h>
 #include <asm/cpufeature.h>
 
-#define RDRAND_RETRY_LOOPS	10
-
 /* Unconditional execution of RDRAND and RDSEED */
 
 static inline bool __must_check rdrand_long(unsigned long *v)
 {
 	bool ok;
-	unsigned int retry = RDRAND_RETRY_LOOPS;
-	do {
-		asm volatile("rdrand %[out]"
-			     CC_SET(c)
-			     : CC_OUT(c) (ok), [out] "=r" (*v));
-		if (ok)
-			return true;
-	} while (--retry);
-	return false;
+	asm volatile("rdrand %[out]"
+		     CC_SET(c)
+		     : CC_OUT(c) (ok), [out] "=r" (*v));
+	WARN_ON(!ok);
+	return ok;
 }
 
 static inline bool __must_check rdseed_long(unsigned long *v)

(As for the RDSEED clarification, that also matches Borislav's reply, is
what we expected and knew experimentally, and doesn't really have any
bearing on Linux's RNG or this discussion, since RDRAND is all we need
anyway.)

Regards,
Jason

^ permalink raw reply related	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-12  8:25                                       ` Reshetova, Elena
  2024-02-12 16:32                                         ` Theodore Ts'o
@ 2024-02-14 17:30                                         ` Jason A. Donenfeld
  1 sibling, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-14 17:30 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Kirill A. Shutemov, H. Peter Anvin, Theodore Ts'o,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky, Kalra,
	Ashish, Sean Christopherson, linux-coco, linux-kernel

On Mon, Feb 12, 2024 at 08:25:33AM +0000, Reshetova, Elena wrote:
> So the change would be around adding the notion of conditional entropy
> counting (we will always take input as we do now because it wont hurt),
> which would automatically give us a correct behavior in _credit_init_bits()
> for initial seeding of crng.

I basically have zero interest in this kind of highly complex addition,
and I think that'll lead us back toward how the RNG was in the past.
"Entropy counting" is mostly an illusion, at least in terms of doing so
from measurement. We've got some heuristics to mitigate "premature
first" but these things will mostly only ever be heuristic. If a
platform like CoCo knows nothing else will work, then a
platform-specific choice like the one in this patch is sufficient to
do the trick. And in general, this seems like a weird thing to design
around: if the CPU is actually just totally broken and defective, maybe
CoCo shouldn't continue executing anyway? So I'm pretty loathe to go in
this direction of highly complex policy frameworks and such.

Anyway, based on your last email (and my reply to it), it seems like
we're mostly in the clear anyway, and we can rely on RDRAND failure ==>
hardware failure.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 17:21                                   ` Jason A. Donenfeld
@ 2024-02-14 17:59                                     ` Reshetova, Elena
  2024-02-14 19:32                                       ` Jason A. Donenfeld
  2024-02-14 19:46                                     ` Tom Lendacky
  2024-02-14 20:14                                     ` Dave Hansen
  2 siblings, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-14 17:59 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

 Hi Elena,
> 
> On Wed, Feb 14, 2024 at 4:18 PM Reshetova, Elena <elena.reshetova@intel.com>
> wrote:
> > "The RdRand in a non-defective device is designed to be faster than the bus,
> > so when a core accesses the output from the DRNG, it will always get a
> > random number.
> > As a result, it is hard to envision a scenario where the RdRand, on a fully
> > functional device, will underflow.
> > The carry flag after RdRand signals an underflow so in the case of a defective chip,
> > this will prevent the code thinking it has a random number when it does not.
> 
> That's really great news, especially combined with a very similar
> statement from Borislav about AMD chips:
> 
> On Fri, Feb 9, 2024 at 10:45 PM Borislav Petkov <bp@alien8.de> wrote:
> > Yeah, I know exactly what you mean and I won't go into details for
> > obvious reasons. Two things:
> >
> > * Starting with Zen3, provided properly configured hw RDRAND will never
> > fail. It is also fair when feeding the different contexts.
> 
> I assume that this faster-than-the-bus-ness also takes into account the
> various accesses required to even switch contexts when scheduling VMs,
> so your proposed host-guest scheduling attack can't really happen
> either. Correct?

Yes, this attack wont be possible for rdrand, so we are good. 

> 
> One clarifying question in all of this: what is the point of the "try 10
> times" advice? Is the "faster than the bus" statement actually "faster
> than the bus if you try 10 times"? Or is the "10 times" advice just old
> and not relevant.

The whitepaper should clarify this more in the future, but in short 
10 times retry is not relevant based on the above statement. 
"when core accesses the output from the DRNG, it will always get a
random number" - there are no statements of re-try here. 

> 
> In other words, is the following a reasonable patch?
> 
> diff --git a/arch/x86/include/asm/archrandom.h
> b/arch/x86/include/asm/archrandom.h
> index 02bae8e0758b..2d5bf5aa9774 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,22 +13,16 @@
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
> 
> -#define RDRAND_RETRY_LOOPS	10
> -
>  /* Unconditional execution of RDRAND and RDSEED */
> 
>  static inline bool __must_check rdrand_long(unsigned long *v)
>  {
>  	bool ok;
> -	unsigned int retry = RDRAND_RETRY_LOOPS;
> -	do {
> -		asm volatile("rdrand %[out]"
> -			     CC_SET(c)
> -			     : CC_OUT(c) (ok), [out] "=r" (*v));
> -		if (ok)
> -			return true;
> -	} while (--retry);
> -	return false;
> +	asm volatile("rdrand %[out]"
> +		     CC_SET(c)
> +		     : CC_OUT(c) (ok), [out] "=r" (*v));
> +	WARN_ON(!ok);
> +	return ok;
>  }

Do you intend this as a generic rdrand change or also a fix for CoCo
case problem? I personally don’t like WARN_ON from security
pov, but I know I am in minority with this. 

> 
>  static inline bool __must_check rdseed_long(unsigned long *v)
> 
> (As for the RDSEED clarification, that also matches Borislav's reply, is
> what we expected and knew experimentally, and doesn't really have any
> bearing on Linux's RNG or this discussion, since RDRAND is all we need
> anyway.)

Agree. Just wanted to have it also included for the overall picture. 

> 
> Regards,
> Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 17:59                                     ` Reshetova, Elena
@ 2024-02-14 19:32                                       ` Jason A. Donenfeld
  2024-02-15  7:07                                         ` Reshetova, Elena
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-14 19:32 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

Hi Elena,

On Wed, Feb 14, 2024 at 05:59:48PM +0000, Reshetova, Elena wrote:
> > 
> > In other words, is the following a reasonable patch?
> > 
> > diff --git a/arch/x86/include/asm/archrandom.h
> > b/arch/x86/include/asm/archrandom.h
> > index 02bae8e0758b..2d5bf5aa9774 100644
> > --- a/arch/x86/include/asm/archrandom.h
> > +++ b/arch/x86/include/asm/archrandom.h
> > @@ -13,22 +13,16 @@
> >  #include <asm/processor.h>
> >  #include <asm/cpufeature.h>
> > 
> > -#define RDRAND_RETRY_LOOPS	10
> > -
> >  /* Unconditional execution of RDRAND and RDSEED */
> > 
> >  static inline bool __must_check rdrand_long(unsigned long *v)
> >  {
> >  	bool ok;
> > -	unsigned int retry = RDRAND_RETRY_LOOPS;
> > -	do {
> > -		asm volatile("rdrand %[out]"
> > -			     CC_SET(c)
> > -			     : CC_OUT(c) (ok), [out] "=r" (*v));
> > -		if (ok)
> > -			return true;
> > -	} while (--retry);
> > -	return false;
> > +	asm volatile("rdrand %[out]"
> > +		     CC_SET(c)
> > +		     : CC_OUT(c) (ok), [out] "=r" (*v));
> > +	WARN_ON(!ok);
> > +	return ok;
> >  }
> 
> Do you intend this as a generic rdrand change or also a fix for CoCo
> case problem?

I was thinking generic, since in all cases, RDRAND failing points to a
hardware bug in the CPU ITSELF (!), which is solid grounds for a WARN().

> I personally don’t like WARN_ON from security
> pov, but I know I am in minority with this. 

I share the same opinion as you, that WARN_ON() is a little weak and we
should BUG_ON() or panic() or whatever, but I also know that this ship
has really sailed long ago, that in lots of ways Linus is also right
that BUG() is bad and shouldn't be used for much, and this just isn't a
hill to die on. And the "panic_on_warn" flag exists and "security guides"
sometimes say to turn this on, etc, so I think WARN_ON() remains the
practical compromise that won't get everyone's feathers ruffelled up.


By the way, there is still one question lingering in the back of my
mind, but I don't know if answering it would divulge confidential
implementation details.

You said that RDRAND is faster than the bus, so failures won't be
observable, while RDSEED is not because it requires collecting entropy
from the ether which is slow. That makes intuitive sense on a certain
dumb simplistic level: AES is just an algorithm so is fast, while
entropy collection is a more physical thing so is slow. But if you read
the implementation details, RDRAND is supposed to reseed after 511
calls. So what's to stop you from exhausting RDSEED in one place, while
also getting RDRAND to the end of its 511 calls, and *then* having your
victim make the subsequent RDRAND call, which tries to reseed (or is in
progress of doing so), finds that RDSEED is out of batteries, and
underflows? What's the magic detail that makes this scenario not
possible?

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 17:21                                   ` Jason A. Donenfeld
  2024-02-14 17:59                                     ` Reshetova, Elena
@ 2024-02-14 19:46                                     ` Tom Lendacky
  2024-02-14 20:04                                       ` Jason A. Donenfeld
  2024-02-14 20:14                                     ` Dave Hansen
  2 siblings, 1 reply; 99+ messages in thread
From: Tom Lendacky @ 2024-02-14 19:46 UTC (permalink / raw)
  To: Jason A. Donenfeld, Reshetova, Elena
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Kalra, Ashish,
	Sean Christopherson, linux-coco, linux-kernel

On 2/14/24 11:21, Jason A. Donenfeld wrote:
> Hi Elena,
> 
> On Wed, Feb 14, 2024 at 4:18 PM Reshetova, Elena <elena.reshetova@intel.com> wrote:
>> "The RdRand in a non-defective device is designed to be faster than the bus,
>> so when a core accesses the output from the DRNG, it will always get a
>> random number.
>> As a result, it is hard to envision a scenario where the RdRand, on a fully
>> functional device, will underflow.
>> The carry flag after RdRand signals an underflow so in the case of a defective chip,
>> this will prevent the code thinking it has a random number when it does not.
> 
> That's really great news, especially combined with a very similar
> statement from Borislav about AMD chips:
> 
> On Fri, Feb 9, 2024 at 10:45 PM Borislav Petkov <bp@alien8.de> wrote:
>> Yeah, I know exactly what you mean and I won't go into details for
>> obvious reasons. Two things:
>>
>> * Starting with Zen3, provided properly configured hw RDRAND will never
>> fail. It is also fair when feeding the different contexts.
> 
> I assume that this faster-than-the-bus-ness also takes into account the
> various accesses required to even switch contexts when scheduling VMs,
> so your proposed host-guest scheduling attack can't really happen
> either. Correct?
> 
> One clarifying question in all of this: what is the point of the "try 10
> times" advice? Is the "faster than the bus" statement actually "faster
> than the bus if you try 10 times"? Or is the "10 times" advice just old
> and not relevant.
> 
> In other words, is the following a reasonable patch?
> 
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 02bae8e0758b..2d5bf5aa9774 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,22 +13,16 @@
>   #include <asm/processor.h>
>   #include <asm/cpufeature.h>
>   
> -#define RDRAND_RETRY_LOOPS	10
> -
>   /* Unconditional execution of RDRAND and RDSEED */
>   
>   static inline bool __must_check rdrand_long(unsigned long *v)
>   {
>   	bool ok;
> -	unsigned int retry = RDRAND_RETRY_LOOPS;
> -	do {
> -		asm volatile("rdrand %[out]"
> -			     CC_SET(c)
> -			     : CC_OUT(c) (ok), [out] "=r" (*v));
> -		if (ok)
> -			return true;
> -	} while (--retry);
> -	return false;
> +	asm volatile("rdrand %[out]"
> +		     CC_SET(c)
> +		     : CC_OUT(c) (ok), [out] "=r" (*v));
> +	WARN_ON(!ok);
> +	return ok;

Don't forget that Linux will run on older hardware as well, so the 10 
retries might be valid for that. Or do you intend this change purely for CVMs?

Thanks,
Tom

>   }
>   
>   static inline bool __must_check rdseed_long(unsigned long *v)
> 
> (As for the RDSEED clarification, that also matches Borislav's reply, is
> what we expected and knew experimentally, and doesn't really have any
> bearing on Linux's RNG or this discussion, since RDRAND is all we need
> anyway.)
> 
> Regards,
> Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 19:46                                     ` Tom Lendacky
@ 2024-02-14 20:04                                       ` Jason A. Donenfeld
  2024-02-14 20:11                                         ` Theodore Ts'o
  0 siblings, 1 reply; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-14 20:04 UTC (permalink / raw)
  To: Tom Lendacky, Reshetova, Elena, Borislav Petkov
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Kuppuswamy Sathyanarayanan, Nakajima, Jun, Kalra, Ashish,
	Sean Christopherson, linux-coco, linux-kernel

Hi Tom,

On Wed, Feb 14, 2024 at 8:46 PM Tom Lendacky <thomas.lendacky@amd.com> wrote:
> Don't forget that Linux will run on older hardware as well, so the 10
> retries might be valid for that. Or do you intend this change purely for CVMs?

Oh, grr, darnit. That is indeed a very important detail. I meant this
for generic code, so yea, if it's actually just Zen3+, then this won't
fly.

AMD people, Intel people: what are the fullest statements we can rely
on here? Do the following two statements work?

1) On newer chips, RDRAND never fails.
2) On older chips, RDRAND never fails if you try 10 times in a loop,
unless you consider host->guest attacks, which we're not, because CoCo
is only a thing on the newer chips.

If those hold true, then the course of action would be to just add a
WARN_ON(!ok) but keep the loop as-is.

(Anyway, I posted
https://lore.kernel.org/lkml/20240214195744.8332-1-Jason@zx2c4.com/
just before seeing this message.)

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 20:04                                       ` Jason A. Donenfeld
@ 2024-02-14 20:11                                         ` Theodore Ts'o
  2024-02-15 13:01                                           ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Theodore Ts'o @ 2024-02-14 20:11 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Tom Lendacky, Reshetova, Elena, Borislav Petkov, Dave Hansen,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Kalra, Ashish,
	Sean Christopherson, linux-coco, linux-kernel

On Wed, Feb 14, 2024 at 09:04:34PM +0100, Jason A. Donenfeld wrote:
> AMD people, Intel people: what are the fullest statements we can rely
> on here? Do the following two statements work?
> 
> 1) On newer chips, RDRAND never fails.
> 2) On older chips, RDRAND never fails if you try 10 times in a loop,
> unless you consider host->guest attacks, which we're not, because CoCo
> is only a thing on the newer chips.
> 
> If those hold true, then the course of action would be to just add a
> WARN_ON(!ok) but keep the loop as-is.

I think we may only want to do the WARN_ON in early boot.  Otherwise,
on older chips, if a userspace process executes RDRAND is a tight
loop, it might cause the WARN_ON to trigger, which is considered
undesirable (and is certainly going to be something that could result
in a syzbot complaint).

					- Ted

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 17:21                                   ` Jason A. Donenfeld
  2024-02-14 17:59                                     ` Reshetova, Elena
  2024-02-14 19:46                                     ` Tom Lendacky
@ 2024-02-14 20:14                                     ` Dave Hansen
  2 siblings, 0 replies; 99+ messages in thread
From: Dave Hansen @ 2024-02-14 20:14 UTC (permalink / raw)
  To: Jason A. Donenfeld, Reshetova, Elena
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

On 2/14/24 09:21, Jason A. Donenfeld wrote:
> One clarifying question in all of this: what is the point of the "try 10
> times" advice? Is the "faster than the bus" statement actually "faster
> than the bus if you try 10 times"? Or is the "10 times" advice just old
> and not relevant.
> 
> In other words, is the following a reasonable patch?
> 
> diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
> index 02bae8e0758b..2d5bf5aa9774 100644
> --- a/arch/x86/include/asm/archrandom.h
> +++ b/arch/x86/include/asm/archrandom.h
> @@ -13,22 +13,16 @@
>  #include <asm/processor.h>
>  #include <asm/cpufeature.h>
>  
> -#define RDRAND_RETRY_LOOPS	10
> -
>  /* Unconditional execution of RDRAND and RDSEED */
>  
>  static inline bool __must_check rdrand_long(unsigned long *v)
>  {
>  	bool ok;
> -	unsigned int retry = RDRAND_RETRY_LOOPS;
> -	do {
> -		asm volatile("rdrand %[out]"
> -			     CC_SET(c)
> -			     : CC_OUT(c) (ok), [out] "=r" (*v));
> -		if (ok)
> -			return true;
> -	} while (--retry);
> -	return false;
> +	asm volatile("rdrand %[out]"
> +		     CC_SET(c)
> +		     : CC_OUT(c) (ok), [out] "=r" (*v));
> +	WARN_ON(!ok);
> +	return ok;
>  }

The key question here is if RDRAND can ever fail on perfectly good hardware.

I think it's theoretically possible for the entropy source health checks
to fail on perfectly good hardware for an arbitrarily long time.  But
the odds of this happening to the point of it affecting RDRAND are
rather small.

There's a reason that the guidance says: "the odds of ten failures in a
row are astronomically small" _instead_ of claiming the same about a
single RDRAND.

Given the scale that the kernel operates at, I think we should leave the
loop.

^ permalink raw reply	[flat|nested] 99+ messages in thread

* RE: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 19:32                                       ` Jason A. Donenfeld
@ 2024-02-15  7:07                                         ` Reshetova, Elena
  2024-02-15 12:58                                           ` Jason A. Donenfeld
  0 siblings, 1 reply; 99+ messages in thread
From: Reshetova, Elena @ 2024-02-15  7:07 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

> You said that RDRAND is faster than the bus, so failures won't be
> observable, while RDSEED is not because it requires collecting entropy
> from the ether which is slow. That makes intuitive sense on a certain
> dumb simplistic level: AES is just an algorithm so is fast, while
> entropy collection is a more physical thing so is slow. But if you read
> the implementation details, RDRAND is supposed to reseed after 511
> calls. So what's to stop you from exhausting RDSEED in one place, while
> also getting RDRAND to the end of its 511 calls, and *then* having your
> victim make the subsequent RDRAND call, which tries to reseed (or is in
> progress of doing so), finds that RDSEED is out of batteries, and
> underflows? What's the magic detail that makes this scenario not
> possible?

This was on my list of scenarios to double check whenever it is possible
or not, and the answer is that it is not possible (at least for Intel).
This scenario is also briefly described in the public doc [1]:

" Note that the conditioner does not send the same seed values to both the
 DRBG and the ENRNG. This pathway can be thought of as an alternating
switch, with one seed going to the DRGB and the next seed going to the ENRNG. 
*This construction ensures* that a software application can never obtain the
 value used to seed the DRBG, *nor can it launch a Denial of Service (DoS) 
attack against the DRBG through repeated executions of the RDSEED instruction.*"

The upcoming whitepaper hopefully should provide more details on this also.

[1] https://www.intel.com/content/www/us/en/developer/articles/guide/intel-digital-random-number-generator-drng-software-implementation-guide.html

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-15  7:07                                         ` Reshetova, Elena
@ 2024-02-15 12:58                                           ` Jason A. Donenfeld
  0 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-15 12:58 UTC (permalink / raw)
  To: Reshetova, Elena
  Cc: Theodore Ts'o, Dave Hansen, Kirill A. Shutemov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Tom Lendacky,
	Kalra, Ashish, Sean Christopherson, linux-coco, linux-kernel

On Thu, Feb 15, 2024 at 07:07:45AM +0000, Reshetova, Elena wrote:
> > You said that RDRAND is faster than the bus, so failures won't be
> > observable, while RDSEED is not because it requires collecting entropy
> > from the ether which is slow. That makes intuitive sense on a certain
> > dumb simplistic level: AES is just an algorithm so is fast, while
> > entropy collection is a more physical thing so is slow. But if you read
> > the implementation details, RDRAND is supposed to reseed after 511
> > calls. So what's to stop you from exhausting RDSEED in one place, while
> > also getting RDRAND to the end of its 511 calls, and *then* having your
> > victim make the subsequent RDRAND call, which tries to reseed (or is in
> > progress of doing so), finds that RDSEED is out of batteries, and
> > underflows? What's the magic detail that makes this scenario not
> > possible?
> 
> This was on my list of scenarios to double check whenever it is possible
> or not, and the answer is that it is not possible (at least for Intel).
> This scenario is also briefly described in the public doc [1]:
> 
> " Note that the conditioner does not send the same seed values to both the
>  DRBG and the ENRNG. This pathway can be thought of as an alternating
> switch, with one seed going to the DRGB and the next seed going to the ENRNG. 
> *This construction ensures* that a software application can never obtain the
>  value used to seed the DRBG, *nor can it launch a Denial of Service (DoS) 
> attack against the DRBG through repeated executions of the RDSEED instruction.*"

Interesting, and good to hear. So also implicit must be that the time
required by 511 calls to RDRAND exceeds the reseeding time, so that you
couldn't exhaust the seeds indirectly by flushing RDRAND.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

* Re: [PATCH 1/2] x86/random: Retry on RDSEED failure
  2024-02-14 20:11                                         ` Theodore Ts'o
@ 2024-02-15 13:01                                           ` Jason A. Donenfeld
  0 siblings, 0 replies; 99+ messages in thread
From: Jason A. Donenfeld @ 2024-02-15 13:01 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Tom Lendacky, Reshetova, Elena, Borislav Petkov, Dave Hansen,
	Kirill A. Shutemov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Kuppuswamy Sathyanarayanan, Nakajima, Jun, Kalra, Ashish,
	Sean Christopherson, linux-coco, linux-kernel

On Wed, Feb 14, 2024 at 03:11:03PM -0500, Theodore Ts'o wrote:
> On Wed, Feb 14, 2024 at 09:04:34PM +0100, Jason A. Donenfeld wrote:
> > AMD people, Intel people: what are the fullest statements we can rely
> > on here? Do the following two statements work?
> > 
> > 1) On newer chips, RDRAND never fails.
> > 2) On older chips, RDRAND never fails if you try 10 times in a loop,
> > unless you consider host->guest attacks, which we're not, because CoCo
> > is only a thing on the newer chips.
> > 
> > If those hold true, then the course of action would be to just add a
> > WARN_ON(!ok) but keep the loop as-is.
> 
> I think we may only want to do the WARN_ON in early boot.  Otherwise,
> on older chips, if a userspace process executes RDRAND is a tight
> loop, it might cause the WARN_ON to trigger, which is considered
> undesirable (and is certainly going to be something that could result
> in a syzbot complaint).

Yea, seems reasonable. Or maybe we just don't bother adding any WARN
there and just address the CoCo thing with the patch 2/2. As it turns
out, on normal systems, the RNG is designed anyway to deal with a broken
or missing RDRAND. So maybe adding these heuristics to warn when the CPU
is broken isn't worth it? Or maybe that's an interesting thing to do?
Dunno, I'm indifferent about it I suppose. But I agree if it's added,
doing it at early boot only makes most sense.

Jason

^ permalink raw reply	[flat|nested] 99+ messages in thread

end of thread, other threads:[~2024-02-15 13:01 UTC | newest]

Thread overview: 99+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-30  8:30 [PATCH 1/2] x86/random: Retry on RDSEED failure Kirill A. Shutemov
2024-01-30  8:30 ` [PATCH 2/2] x86/random: Issue a warning if RDRAND or RDSEED fails Kirill A. Shutemov
2024-01-30 12:37   ` Jason A. Donenfeld
2024-01-30 13:45     ` Reshetova, Elena
2024-01-30 14:21       ` Jason A. Donenfeld
2024-01-30 14:55         ` Reshetova, Elena
2024-01-30 15:00           ` Jason A. Donenfeld
2024-01-30 17:31       ` Dave Hansen
2024-01-30 17:49         ` Jason A. Donenfeld
2024-01-30 17:58           ` Dave Hansen
2024-01-30 18:15             ` H. Peter Anvin
2024-01-30 18:23               ` Jason A. Donenfeld
2024-01-30 18:23             ` Jason A. Donenfeld
2024-01-30 18:37               ` Dave Hansen
2024-01-30 18:05           ` Daniel P. Berrangé
2024-01-30 18:24             ` Jason A. Donenfeld
2024-01-30 18:31             ` Jason A. Donenfeld
2024-01-30 18:40             ` H. Peter Anvin
2024-01-31  8:16             ` Reshetova, Elena
2024-01-31 11:59               ` Dr. Greg
2024-01-31 13:06               ` Jason A. Donenfeld
2024-01-31 18:02                 ` Reshetova, Elena
2024-01-31 20:35                 ` Dr. Greg
2024-02-01  4:47                   ` Theodore Ts'o
2024-02-01  9:54                     ` Dr. Greg
2024-02-01 11:08                       ` Daniel P. Berrangé
2024-02-01 21:04                         ` Dr. Greg
2024-02-02  7:56                           ` Reshetova, Elena
2024-02-01  7:26                   ` Reshetova, Elena
2024-02-01 10:52                     ` Dr. Greg
2024-02-06  1:12               ` Dr. Greg
2024-02-06  8:04                 ` Daniel P. Berrangé
2024-02-06 12:04                   ` Dr. Greg
2024-02-06 13:00                     ` Daniel P. Berrangé
2024-02-08 10:31                       ` Dr. Greg
2024-02-06 13:50                     ` Daniel P. Berrangé
2024-02-06 15:35                     ` Borislav Petkov
2024-02-08 11:44                       ` Dr. Greg
2024-02-09 17:31                         ` Borislav Petkov
2024-02-09 19:49                           ` Jason A. Donenfeld
2024-02-09 20:37                             ` Dave Hansen
2024-02-09 21:45                             ` Borislav Petkov
2024-02-06 18:49                     ` H. Peter Anvin
2024-02-08 16:38                       ` Dr. Greg
2024-01-30 15:50   ` Kuppuswamy Sathyanarayanan
2024-01-30 12:29 ` [PATCH 1/2] x86/random: Retry on RDSEED failure Jason A. Donenfeld
2024-01-30 12:51   ` Jason A. Donenfeld
2024-01-30 13:10   ` Reshetova, Elena
2024-01-30 14:06     ` Jason A. Donenfeld
2024-01-30 14:43       ` Daniel P. Berrangé
2024-01-30 15:12         ` Jason A. Donenfeld
2024-01-30 18:35       ` Jason A. Donenfeld
2024-01-30 19:06         ` Reshetova, Elena
2024-01-30 19:16           ` Jason A. Donenfeld
2024-01-31  7:56             ` Reshetova, Elena
2024-01-31 13:14               ` Jason A. Donenfeld
2024-01-31 14:07                 ` Theodore Ts'o
2024-01-31 14:45                   ` Jason A. Donenfeld
2024-01-31 14:52                     ` Jason A. Donenfeld
2024-01-31 17:10                     ` Theodore Ts'o
2024-01-31 17:37                       ` Reshetova, Elena
2024-01-31 18:01                         ` Jason A. Donenfeld
2024-02-01  4:57                           ` Theodore Ts'o
2024-02-01 18:09                             ` Jason A. Donenfeld
2024-02-01 18:46                               ` Dave Hansen
2024-02-01 19:02                                 ` H. Peter Anvin
2024-02-02  7:25                               ` Reshetova, Elena
2024-02-02 15:39                                 ` Theodore Ts'o
2024-02-03 10:12                                   ` Jason A. Donenfeld
2024-02-09 19:53                                     ` Jason A. Donenfeld
2024-02-12  8:25                                       ` Reshetova, Elena
2024-02-12 16:32                                         ` Theodore Ts'o
2024-02-13  7:28                                           ` Dan Williams
2024-02-13 23:13                                             ` Theodore Ts'o
2024-02-14  0:53                                               ` Dan Williams
2024-02-14  4:32                                                 ` Theodore Ts'o
2024-02-14  6:48                                                   ` Dan Williams
2024-02-14  6:54                                                   ` Reshetova, Elena
2024-02-14  8:34                                                   ` Nikolay Borisov
2024-02-14  9:34                                                     ` Dr. Greg
2024-02-14 17:30                                         ` Jason A. Donenfeld
2024-02-14 15:18                                 ` Reshetova, Elena
2024-02-14 17:21                                   ` Jason A. Donenfeld
2024-02-14 17:59                                     ` Reshetova, Elena
2024-02-14 19:32                                       ` Jason A. Donenfeld
2024-02-15  7:07                                         ` Reshetova, Elena
2024-02-15 12:58                                           ` Jason A. Donenfeld
2024-02-14 19:46                                     ` Tom Lendacky
2024-02-14 20:04                                       ` Jason A. Donenfeld
2024-02-14 20:11                                         ` Theodore Ts'o
2024-02-15 13:01                                           ` Jason A. Donenfeld
2024-02-14 20:14                                     ` Dave Hansen
2024-02-02 15:47                               ` James Bottomley
2024-02-02 16:05                                 ` Theodore Ts'o
2024-02-02 21:28                                   ` James Bottomley
2024-02-03 14:35                                     ` Theodore Ts'o
2024-02-06 19:12                                       ` H. Peter Anvin
2024-01-30 15:20     ` H. Peter Anvin
2024-01-30 15:44 ` Kuppuswamy Sathyanarayanan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).