* BMCWeb policy for HTTPS site identity certificate
@ 2020-07-23 15:25 Joseph Reynolds
  2020-07-26 20:35 ` Michael Richardson
  2020-07-27 17:32 ` Patrick Williams
  0 siblings, 2 replies; 9+ messages in thread
From: Joseph Reynolds @ 2020-07-23 15:25 UTC (permalink / raw)
  To: openbmc

This is a followup to the OpenBMC security working group meeting 
discussion on 2020-07-22 
(https://docs.google.com/document/d/1b7x9BaxsfcukQDqbvZsU2ehMq4xoJRQvLxxsDUWmAOI).

Background:
Per [BMCWeb configuration 
policy](https://github.com/openbmc/bmcweb#configuration), BMCWeb 
generates a new HTTPS site identity certificate if a usable one cannot 
be found.  You can upload one via APIs described here: 
https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Certs/README.md#redfish-certificate-support

Problem:
BMCWeb apparently treats certificates that are either expired or not 
valid until a future date as unusable (investigation needed).  And 
BMCWeb deletes unusable certificates.  This can confuse the 
administrator, especially considering the BMC's time-of-day clock may 
not be set as expected.

Proposal:
What certificate management policy should BMCWeb use?  Here is an 
initial proposal:
1. certificate is perfectly good - Use the certificate.
2. certificate is good but expired or not yet valid - Use the 
certificate and log a warning.
3. certificate is missing or bad format or algorithm too old - Use 
another certificate or self-generate a certificate (and log that action).
In no case should BMCWeb delete any certificate.
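
As a rough illustration, the three cases could be expressed as a small decision function. This is only a sketch of the proposal, not BMCWeb code: `CertState`, `Action`, and `decide` are hypothetical names, and classifying the certificate into a state is assumed to happen elsewhere.

```python
from enum import Enum, auto


class CertState(Enum):
    GOOD = auto()            # loads fine, dates are valid
    EXPIRED = auto()         # loads fine, notAfter is in the past
    NOT_YET_VALID = auto()   # loads fine, notBefore is in the future
    MISSING = auto()         # no certificate file found
    BAD_FORMAT = auto()      # file present but cannot be parsed
    WEAK_ALGORITHM = auto()  # algorithm or key size no longer accepted


class Action(Enum):
    USE = auto()
    USE_AND_WARN = auto()
    FALL_BACK_AND_LOG = auto()  # use another cert or self-generate one


def decide(state: CertState) -> Action:
    """Map a certificate state to the proposed action (cases 1 to 3)."""
    if state is CertState.GOOD:
        return Action.USE
    if state in (CertState.EXPIRED, CertState.NOT_YET_VALID):
        return Action.USE_AND_WARN
    # Everything under case 3: fall back, but never delete the old file.
    return Action.FALL_BACK_AND_LOG
```

Note that no branch deletes anything; the worst case only selects a different certificate and logs the fact.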

Discussion?


* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-23 15:25 BMCWeb policy for HTTPS site identity certificate Joseph Reynolds
@ 2020-07-26 20:35 ` Michael Richardson
  2020-07-27 15:15   ` Bruce Mitchell
  2020-07-27 15:36   ` Ed Tanous
  2020-07-27 17:32 ` Patrick Williams
  1 sibling, 2 replies; 9+ messages in thread
From: Michael Richardson @ 2020-07-26 20:35 UTC (permalink / raw)
  To: openbmc



Joseph Reynolds <jrey@linux.ibm.com> wrote:
    > Problem:
    > BMCWeb apparently treats certificates that are either expired or not valid
    > until a future date as unusable (investigation needed).  And BMCWeb deletes
    > unusable certificates.  This can confuse the administrator, especially
    > considering the BMC's time-of-day clock may not be set as expected.

    > Proposal:
    > What certificate management policy should BMCWeb use?  Here is an initial
    > proposal:
    > 1. certificate is perfectly good - Use the certificate.

okay.

    > 2. certificate is good but expired or not yet valid - Use the certificate and
    > log a warning.

very good.

    > 3. certificate is missing or bad format or algorithm too old - Use another
    > certificate or self-generate a certificate (and log that action).
    > In no case should BMCWeb delete any certificate.

I think that there is a problem in 3.

"certificate is missing" is pretty much unambiguous.
"bad format" depends a bit upon evolution of libraries.
In particular, a new version of libssl might support some new algorithm, and
then should the firmware be rolled back, it will be "bad format".

So I suggest that the certificate+keypair is never deleted, but may be renamed.
I think that we could have a debate about getting telemetry about bad
certificates back via HTTP.
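
A minimal sketch of the "rename, never delete" idea, assuming the certificate and key live in a single file on a POSIX filesystem. The function name `set_aside` and the `.unusable-<timestamp>` suffix are made up for illustration; nothing here is taken from bmcweb.

```python
import os
import time
from pathlib import Path


def set_aside(cert_path: Path) -> Path:
    """Rename an unusable certificate file instead of deleting it.

    Returns the new path. The ".unusable-<timestamp>" suffix is an
    assumed naming convention, chosen only for this example.
    """
    stamp = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    target = cert_path.with_name(f"{cert_path.name}.unusable-{stamp}")
    # os.rename is atomic on POSIX when both paths are on the same filesystem.
    os.rename(cert_path, target)
    return target
```

Keeping the old file around means a later firmware version (or an administrator doing root-cause analysis) can still inspect or restore it.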

I think that there are some operational considerations relating to
determining root cause that may trump some security issues relating to
telling bad actors whether they have succeeded in damaging a certificate.

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [





* RE: BMCWeb policy for HTTPS site identity certificate
  2020-07-26 20:35 ` Michael Richardson
@ 2020-07-27 15:15   ` Bruce Mitchell
  2020-07-27 15:36   ` Ed Tanous
  1 sibling, 0 replies; 9+ messages in thread
From: Bruce Mitchell @ 2020-07-27 15:15 UTC (permalink / raw)
  To: Michael Richardson, openbmc



> -----Original Message-----
> From: openbmc [mailto:openbmc-
> bounces+bruce_mitchell=phoenix.com@lists.ozlabs.org] On Behalf Of
> Michael Richardson
> Sent: Sunday, July 26, 2020 13:35
> To: openbmc
> Subject: Re: BMCWeb policy for HTTPS site identity certificate
> 
> 
> Joseph Reynolds <jrey@linux.ibm.com> wrote:
>     > Problem:
>     > BMCWeb apparently treats certificates that are either expired or not valid
>     > until a future date as unusable (investigation needed).  And BMCWeb deletes
>     > unusable certificates.  This can confuse the administrator, especially
>     > considering the BMC's time-of-day clock may not be set as expected.
>
>     > Proposal:
>     > What certificate management policy should BMCWeb use?  Here is an initial
>     > proposal:
>     > 1. certificate is perfectly good - Use the certificate.
>
> okay.
>
>     > 2. certificate is good but expired or not yet valid - Use the certificate and
>     > log a warning.
>
> very good.
>
>     > 3. certificate is missing or bad format or algorithm too old - Use another
>     > certificate or self-generate a certificate (and log that action).
>     > In no case should BMCWeb delete any certificate.
>
> I think that there is a problem in 3.
>
> "certificate is missing" is pretty much unambiguous.
> "bad format" depends a bit upon evolution of libraries.
> In particular, a new version of libssl might support some new algorithm, and
> then should the firmware be rolled back, it will be "bad format".
>
> So I suggest that the certificate+keypair is never deleted, but may be renamed.
> I think that we could have a debate about getting telemetry about bad
> certificates back via HTTP.
>
> I think that there are some operational considerations relating to
> determining root cause that may trump some security issues relating to
> telling bad actors whether they have succeeded in damaging a certificate.

One more thing for 3: the incident must be logged.

> 
> --
> ]               Never tell me the odds!                 | ipv6 mesh networks [
> ]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
> ]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [



* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-26 20:35 ` Michael Richardson
  2020-07-27 15:15   ` Bruce Mitchell
@ 2020-07-27 15:36   ` Ed Tanous
  2020-07-28 17:03     ` Michael Richardson
  1 sibling, 1 reply; 9+ messages in thread
From: Ed Tanous @ 2020-07-27 15:36 UTC (permalink / raw)
  To: Michael Richardson; +Cc: openbmc

As I said in the other thread, the current behavior is a regression
from what the bmcweb behavior was (and was designed to be).

On Sun, Jul 26, 2020 at 1:37 PM Michael Richardson <mcr@sandelman.ca> wrote:
>
>
> Joseph Reynolds <jrey@linux.ibm.com> wrote:
>     > Problem:
>     > BMCWeb apparently treats certificates that are either expired or not valid
>     > until a future date as unusable (investigation needed).  And BMCWeb deletes
>     > unusable certificates.  This can confuse the administrator, especially
>     > considering the BMC's time-of-day clock may not be set as expected.
>
>     > Proposal:
>     > What certificate management policy should BMCWeb use?  Here is an initial
>     > proposal:
>     > 1. certificate is perfectly good - Use the certificate.
>
> okay.
>
>     > 2. certificate is good but expired or not yet valid - Use the certificate and
>     > log a warning.
>
> very good.
>
>     > 3. certificate is missing or bad format or algorithm too old - Use another
>     > certificate or self-generate a certificate (and log that action).
>     > In no case should BMCWeb delete any certificate.
>
> I think that there is a problem in 3.
>
> "certificate is missing" is pretty much unambiguous.

Unfortunately, this ambiguity comes with the territory.  On first
boot, bmcweb has no certificate, and doesn't know the difference
between "missing" and "was never there".  Regardless, to bring up TLS
it needs _some_ certificate, so the original behavior was that it
generated a new one in all cases where the existing one either didn't
exist or couldn't be used.  This also allows people to start bmcweb up
in "developer" mode, by only sending the binary over, and is useful
for doing A/B compares.
(note, an SSLContext can still be created with a certificate with bad
dates or an unknown certificate chain)

> "bad format" depends a bit upon evolution of libraries.

Today this is defined as the above.  "Can we use this certificate file
for starting up an SSL context?"  If the answer is no, we regenerate.
In theory, the only library we rely on for this is OpenSSL, which I
would hope doesn't have a backward incompatible evolution in this
area.
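
The "can we start an SSL context with this file?" test can be illustrated with Python's standard `ssl` module, which wraps OpenSSL. This is an analogy rather than bmcweb's C++ implementation; `classify_cert_file` and its return strings are hypothetical. It also shows one way to keep "missing" distinct from "present but unusable".

```python
import ssl
from typing import Optional


def classify_cert_file(certfile: str, keyfile: Optional[str] = None) -> str:
    """Return 'usable', 'missing', 'inaccessible', or 'unusable'."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # Loading does not check notBefore/notAfter or the trust chain,
        # so a certificate with bad dates still counts as usable here.
        ctx.load_cert_chain(certfile, keyfile)
    except FileNotFoundError:
        return "missing"
    except PermissionError:
        return "inaccessible"
    except OSError:
        # ssl.SSLError is a subclass of OSError: bad PEM, key mismatch,
        # or a key that the linked OpenSSL build refuses to load.
        return "unusable"
    return "usable"
```

Under the behavior described above, every result other than "usable" would lead to regenerating a certificate; the finer-grained result is only useful for logging.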

> In particular, a new version of libssl might support some new algorithm, and
> then should the firmware be rolled back, it will be "bad format".

In this hypothetical, you're thinking about a new, non x509
certificate file format?  I vote let's cross that bridge when we get
there, as it seems like there's a lot more discussion that would need
to happen around upgrades and downgrades.  Today the assumption we
make is that x509 certificate reading is backward and forward
compatible since the beginning of openbmc, which, to my knowledge, it
is.
In this hypothetical, if x509 instituted a backward incompatible
change AND previous OpenBMC instances were unable to read it, bmcweb
would simply generate a new default certificate.  I don't know if
we've instituted a firmware rollback policy for that case, but I'm
guessing it would be possible (but difficult to maintain).

>
> So I suggest that the certificate+keypair is never deleted, but may be renamed.
> I think that we could have a debate about getting telemetry about bad
> certificates back via HTTP.

We can have a discussion, but I suspect a lot of people would be very
against using unencrypted HTTP for this purpose.

>
> I think that there are some operational considerations relating to
> determining root cause that may trump some security issues relating to
> telling bad actors whether they have succeeded in damaging a certificate.
>
> --
> ]               Never tell me the odds!                 | ipv6 mesh networks [
> ]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
> ]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [
>
>
>


* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-23 15:25 BMCWeb policy for HTTPS site identity certificate Joseph Reynolds
  2020-07-26 20:35 ` Michael Richardson
@ 2020-07-27 17:32 ` Patrick Williams
  2020-07-28 17:04   ` Michael Richardson
  1 sibling, 1 reply; 9+ messages in thread
From: Patrick Williams @ 2020-07-27 17:32 UTC (permalink / raw)
  To: Joseph Reynolds; +Cc: openbmc


On Thu, Jul 23, 2020 at 10:25:40AM -0500, Joseph Reynolds wrote:
> 2. certificate is good but expired or not yet valid - Use the 
> certificate and log a warning.

I suspect that "not yet valid" is a more common case than might be
assumed on the surface.  I agree with the recommended action.

Many of the Facebook server designs do not have a hardware RTC available
to the BMC.  We have an RTC accessible by the BIOS and we also sync with
NTP.  That means there is always a period of time after we first plug in
the rack where the servers in the rack have a date that is way wrong.

It is reasonable to assume the date is just wrong and the certificate is
valid.  The clients can validate a certificate which is actually out of
date.

I'm less settled on using a certificate which is clearly expired, but it
is still likely better than using a newly-generated self-signed
certificate.

-- 
Patrick Williams


* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-27 15:36   ` Ed Tanous
@ 2020-07-28 17:03     ` Michael Richardson
  2020-07-29  2:31       ` Ed Tanous
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Richardson @ 2020-07-28 17:03 UTC (permalink / raw)
  To: Ed Tanous, openbmc



Ed Tanous <ed@tanous.net> wrote:
    >> "certificate is missing" is pretty much unambiguous.

    > Unfortunately, this ambiguity comes with the territory.  On first
    > boot, bmcweb has no certificate, and doesn't know the difference
    > between "missing" and "was never there".  Regardless, to bring up TLS
    > it needs _some_ certificate, so the original behavior was that it

This is reasonable behaviour, but given that browsers are trying very hard to
make the certificate exception box go away, this does not really help
long-term in my opinion.

Missing means: "ENOENT", not "Can we use this certificate file for starting
up an SSL context".

    >> "bad format" depends a bit upon evolution of libraries.

    > Today this is defined as the above.  "Can we use this certificate file
    > for starting up an SSL context?"  If the answer is no, we regenerate.
    > In theory, the only library we rely on for this is OpenSSL, which I
    > would hope doesn't have a backward incompatible evolution in this
    > area.

Yes, it does.
For instance, you can't load 1024-bit RSA keys with 1.1.1.
It refuses to start.
Meanwhile, 1.0.x does not have any ECDSA support, and you won't find this out
until the TLS session actually tries to start, at which point, it logs an
obscure message to stderr, and returns an error that most programs don't know
what to do with.
(And the TCP connection just ends)

    >> In particular, a new version of libssl might support some new algorithm, and
    >> then should the firmware be rolled back, it will be "bad format".

    > In this hypothetical, you're thinking about a new, non x509
    > certificate file format?  I vote let's cross that bridge when we get

Nope, not about non-X.509.
Algorithms and keysize changes.

    > there, as it seems like there's a lot more discussion that would need
    > to happen around upgrades and downgrades.  Today the assumption we
    > make is that x509 certificate reading is backward and forward
    > compatible since the beginning of openbmc, which, to my knowledge, it
    > is.

Until... it isn't.
But, the proposal would have considered a certificate with an invalid date as
being invalid, and generated a new one.

    >> So I suggest that the certificate+keypair is never deleted, but may be renamed.
    >> I think that we could have a debate about getting telemetry about bad
    >> certificates back via HTTP.

    > We can have a discussion, but I suspect a lot of people would be very
    > against using unencrypted HTTP for this purpose.

I agree.
So, how do you get information at this point?

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [



* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-27 17:32 ` Patrick Williams
@ 2020-07-28 17:04   ` Michael Richardson
  2020-07-29  2:28     ` Ed Tanous
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Richardson @ 2020-07-28 17:04 UTC (permalink / raw)
  To: Patrick Williams, Joseph Reynolds, openbmc



Patrick Williams <patrick@stwcx.xyz> wrote:
    > On Thu, Jul 23, 2020 at 10:25:40AM -0500, Joseph Reynolds wrote:
    >> 2. certificate is good but expired or not yet valid - Use the
    >> certificate and log a warning.

    > I suspect that "not yet valid" is a more common case than might be
    > assumed on the surface.  I agree with the recommended action.

    > Many of the Facebook server designs do not have a hardware RTC available
    > to the BMC.  We have an RTC accessible by the BIOS and we also sync with
    > NTP.  That means there is always a period of time after we first plug in
    > the rack where the servers in the rack have a date that is way wrong.

    > It is reasonable to assume the date is just wrong and the certificate is
    > valid.  The clients can validate a certificate which is actually out of
    > date.

An additional design idea, if you think you have no valid time, is to set the
time to the notBefore of the certificate you have.  It's probably at least
that date :-)
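
The idea amounts to using the certificate's notBefore as a floor for the clock. A small sketch, assuming both timestamps are already available as timezone-aware values (the function name is hypothetical):

```python
from datetime import datetime, timezone


def effective_time(system_now: datetime, cert_not_before: datetime) -> datetime:
    """Return the later of the system clock and the certificate's notBefore.

    If the clock reads earlier than notBefore, the clock is assumed to be
    wrong, since the certificate cannot have been issued in the future.
    """
    return max(system_now, cert_not_before)


# Example: a BMC without an RTC boots at the epoch, before NTP has synced.
boot_clock = datetime(1970, 1, 1, 0, 5, tzinfo=timezone.utc)
not_before = datetime(2020, 7, 1, tzinfo=timezone.utc)
print(effective_time(boot_clock, not_before).isoformat())
```

Once NTP or the host provides a later time, `max` simply returns the real clock and the floor has no further effect.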

    > I'm less settled on using a certificate which is clearly expired, but it
    > is still likely better than using a newly-generated self-signed
    > certificate.

+1.

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
]     mcr@sandelman.ca  http://www.sandelman.ca/        |   ruby on rails    [



* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-28 17:04   ` Michael Richardson
@ 2020-07-29  2:28     ` Ed Tanous
  0 siblings, 0 replies; 9+ messages in thread
From: Ed Tanous @ 2020-07-29  2:28 UTC (permalink / raw)
  To: Michael Richardson; +Cc: Patrick Williams, Joseph Reynolds, openbmc

On Tue, Jul 28, 2020 at 10:06 AM Michael Richardson <mcr@sandelman.ca> wrote:
>
>     > I'm less settled on using a certificate which is clearly expired, but it
>     > is still likely better than using a newly-generated self-signed
>     > certificate.

The original implementation just caught the
X509_V_ERR_CERT_NOT_YET_VALID error and ignored it, but your idea
would work as well.
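
The "catch and ignore" approach might look roughly like the check below, applied to the error code left behind by certificate verification. The function name and the `tolerate_expired` switch are hypothetical and this is not bmcweb's code; the numeric values mirror the `X509_V_ERR_*` constants in OpenSSL's `x509_vfy.h`.

```python
# Values as defined in OpenSSL's <openssl/x509_vfy.h>.
X509_V_OK = 0
X509_V_ERR_CERT_NOT_YET_VALID = 9
X509_V_ERR_CERT_HAS_EXPIRED = 10


def verification_acceptable(err: int, tolerate_expired: bool = False) -> bool:
    """Decide whether a verification result should block use of the cert."""
    if err == X509_V_OK:
        return True
    if err == X509_V_ERR_CERT_NOT_YET_VALID:
        # Most likely the BMC clock is behind, not the certificate ahead.
        return True
    if tolerate_expired and err == X509_V_ERR_CERT_HAS_EXPIRED:
        return True
    return False
```

Whether expiry should be tolerated as well is the open question in this thread, which is why it is left as an explicit switch here.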

One thing we had considered is requiring that the CERT date be at
minimum AFTER the firmware build date, under the assumption that the
build machine had a good grasp on what time it was at the time.  We
could use this for gating the upload of a new cert, but can't use it
for invalidating a cert that already exists, as we run into the
"upgrade causes denial of service" problem.
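
A sketch of that gate, under the assumption that "the CERT date" means the certificate's notAfter: a certificate that expired before the firmware was even built is certainly expired now. The function name is hypothetical, and the check is meant for uploads only, never for certificates already installed.

```python
from datetime import datetime, timezone


def upload_allowed(cert_not_after: datetime, firmware_build: datetime) -> bool:
    """Gate a certificate upload on the firmware build date.

    The build date is a lower bound for the current time that does not
    depend on the BMC clock. Assumption: the date compared is notAfter.
    """
    return cert_not_after > firmware_build


build = datetime(2020, 7, 1, tzinfo=timezone.utc)
print(upload_allowed(datetime(2021, 7, 1, tzinfo=timezone.utc), build))
print(upload_allowed(datetime(2019, 7, 1, tzinfo=timezone.utc), build))
```

Applying the same test to an existing certificate after a firmware upgrade would reject a certificate that was acceptable under the older build date, which is the denial-of-service case mentioned above.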


* Re: BMCWeb policy for HTTPS site identity certificate
  2020-07-28 17:03     ` Michael Richardson
@ 2020-07-29  2:31       ` Ed Tanous
  0 siblings, 0 replies; 9+ messages in thread
From: Ed Tanous @ 2020-07-29  2:31 UTC (permalink / raw)
  To: Michael Richardson; +Cc: openbmc

On Tue, Jul 28, 2020 at 10:03 AM Michael Richardson <mcr@sandelman.ca> wrote:
>
>
> Ed Tanous <ed@tanous.net> wrote:
>     >> "certificate is missing" is pretty much unambiguous.
>
>     > Unfortunately, this ambiguity comes with the territory.  On first
>     > boot, bmcweb has no certificate, and doesn't know the difference
>     > between "missing" and "was never there".  Regardless, to bring up TLS
>     > it needs _some_ certificate, so the original behavior was that it
>
> This is reasonable behaviour, but given that browsers are trying very hard to
> make the certificate exception box go away, this does not really help
> long-term in my opinion.

I'd still be very surprised if this ever happened to browsers in the
long term without any control server side.  I get where their
motivation is, and I agree with it in principle, but without some
mechanism for initial embedded system provisioning, I don't know how
you completely disable that bypass.  With that said, I'm not a browser
developer, so if it does happen, we'll need to figure out another way
to handle initial boot and provisioning.  If you have a proposal for
how to handle this without self-signed certs, it would be an
interesting discussion to have.

>
> Missing means: "ENOFILE", not "Can we use this certificate file for starting
> up an SSL Connect".

Today, that's not the definition bmcweb uses to determine whether to
generate a new cert.  I can't tell if you're proposing a different
behavior here, or making a statement about current behavior.  If the
file is present but corrupt, or inaccessible due to permissions, what
are you proposing the behavior should be?
Flash corruption does happen, and in that case, we need a way to bring
up the (sometimes only) configuration interface in a way that is
usable to exchange the certs with valid ones, even if it's sub-optimal
for security.

>
>     >> "bad format" depends a bit upon evolution of libraries.
>
>     > Today this is defined as the above.  "Can we use this certificate file
>     > for starting up an SSL context?"  If the answer is no, we regenerate.
>     > In theory, the only library we rely on for this is OpenSSL, which I
>     > would hope doesn't have a backward incompatible evolution in this
>     > area.
>
> Yes, it does.
> For instance, you can't load 1024-bit RSA keys with 1.1.1.
> It refuses to start.

When I get a free second, I'll look up where we landed in the "should we
allow 1k keys" discussion we had a long time back.  I know we had
talked about disallowing them and I think the conclusion was that we
disallow them at upload time.  With that said, maybe 2k keys fail to
load in the future?

> Meanwhile, 1.0.x does not have any ECDSA support,

bmcweb has never targeted OpenSSL 1.0, and has always generated
self-signed EC keys, so this shouldn't be an issue in practice, but your
point about "could've broken us if" is well taken.

> and you won't find this out
> until the TLS session actually tries to start, at which point, it logs an
> obscure message to stderr, and returns an error that most programs don't know
> what to do with.
> (And the TCP connection just ends)

I could've sworn that EVP_PKEY_get1_RSA returns NULL if it's an EC key
(which is a call that bmcweb explicitly checks).  That call is one of
our "can we build an SSL context" checks today.  Maybe OpenSSL 1.0 is
different?  Regardless, it's really hard to talk about backward
compatibility with hypothetical openbmc + openssl 1.0 builds that to
my knowledge have never existed.  If this situation presents itself in
the future on another OpenSSL upgrade, I suspect that is the best time
to discuss it.

>
>     >> In particular, a new version of libssl might support some new algorithm, and
>     >> then should the firmware be rolled back, it will be "bad format".
>
>     > In this hypothetical, you're thinking about a new, non x509
>     > certificate file format?  I vote let's cross that bridge when we get
>
> Nope, not about non-X.509.
> Algorithms and keysize changes.

Agreed, there are possible changes that could break us in the future
(if openssl stops accepting 2k keys for example).

>
>     > there, as it seems like there's a lot more discussion that would need
>     > to happen around upgrades and downgrades.  Today the assumption we
>     > make is that x509 certificate reading is backward and forward
>     > compatible since the beginning of openbmc, which, to my knowledge, it
>     > is.
>
> Until... it isn't.
> But, the proposal would have considered a certificate with an invalid date as
> being invalid, and generated a new one.

Yes, I do not believe the date or the cert chain should be part of the
definition of "valid"; "Can we use this certificate file for starting
up an SSL context?" answers yes, even if the date and/or cert chain is
invalid, so I think the definition still works.
With that said, I think all of the above is covered by the general idea of
"upgrades are guaranteed, downgrades are best effort" that most BMC
implementations (including OpenBMC at this point) tend to take.  (yes,
sometimes we break the upgrade path and have to fix it).  I don't
think I've seen anywhere in the project where we include both a
forward and backward path for nonvolatile schema migrations.  Are you
proposing something different we should do to handle these types of
situations?

With all of this said, I'm open to the possibility that we have a
backward incompatible openssl change that invalidates the cert.  Do
you think you could code up a patch with what you're hoping the
behavior to be?  It might be easier to approach it from a patchset.

>
>     >> So I suggest that the certificate+keypair is never deleted, but may be renamed.
>     >> I think that we could have a debate about getting telemetry about bad
>     >> certificates back via HTTP.
>
>     > We can have a discussion, but I suspect a lot of people would be very
>     > against using unencrypted HTTP for this purpose.
>
> I agree.
> So, how do you get information at this point?
>

I'm not following.  At the point where you've downgraded, and your key
is no longer valid?  bmcweb will regenerate a self-signed one, and a
user can connect to the HTTPS port insecurely.  Hopefully their next
step is to set up a valid key again, but I don't know how to force
that on people.  Is there a better behavior you'd like?  Brick the
system until it's factory reset?


end of thread, other threads:[~2020-07-29  2:31 UTC | newest]

Thread overview: 9+ messages
2020-07-23 15:25 BMCWeb policy for HTTPS site identity certificate Joseph Reynolds
2020-07-26 20:35 ` Michael Richardson
2020-07-27 15:15   ` Bruce Mitchell
2020-07-27 15:36   ` Ed Tanous
2020-07-28 17:03     ` Michael Richardson
2020-07-29  2:31       ` Ed Tanous
2020-07-27 17:32 ` Patrick Williams
2020-07-28 17:04   ` Michael Richardson
2020-07-29  2:28     ` Ed Tanous
