* [Design] PSU firmware update
@ 2019-06-03  8:54 Lei YU
  2019-06-03 13:31 ` Andrew Geissler
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Lei YU @ 2019-06-03  8:54 UTC (permalink / raw)
  To: OpenBMC Maillist

Hi All,

This is a proposed design of PSU firmware update.
It will be posted to gerrit for review after we have resolved comments
in the mailing list.

# PSU firmware update

Author:
   Lei YU <mine260309@gmail.com> <LeiYU>
Primary assignee:
   None
Other contributors:
   Su Xiao <suxiao@inspur.com>
   Derek Howard <derekh@us.ibm.com>
Created:
   2019-06-03


## Problem Description

There is no support in OpenBMC to update the firmware for PSUs.


## Background and References

In OpenBMC, there is an existing interface for [software update][1].

The update process consists of:
1. Uploading an image to BMC;
2. Processing the image to check the version and purpose of the image;
3. Verifying and activating the image.

Currently, BMC and PNOR firmware updates are supported:
* [phosphor-bmc-code-mgmt][2] implements BMC code update, and it supports all
  3 of the above steps.
* [openpower-pnor-code-mgmt][3] implements PNOR code update, and it only
  implements "verifying and activating" the image. It shares the implementation
  of steps 1 & 2 above.

For PSU firmware code update, it is preferred to re-use the same implementation
for steps 1 & 2.


## Requirements

To mitigate the risk of power loss, the PSU firmware code update shall meet the
following pre-conditions:
1. The host is powered off;
2. The redundant PSUs are all connected;
3. The AC input and DC standby output shall be OK on all the PSUs;

And during updating:
4. After the update is done on a PSU, the AC input and DC standby output shall
be checked.
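
A minimal sketch of how these checks could be expressed, assuming a
hypothetical `PsuStatus` snapshot per PSU (the type and its field names are
illustrative and not part of any existing OpenBMC interface):

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PsuStatus:
    """Hypothetical snapshot of one PSU; field names are illustrative."""
    name: str
    present: bool      # PSU is connected
    ac_ok: bool        # AC input is OK
    standby_ok: bool   # DC standby output is OK


def precondition_failures(host_powered_on: bool,
                          psus: List[PsuStatus]) -> List[str]:
    """Return the reasons why the update must not start (empty if none)."""
    failures = []
    if host_powered_on:
        failures.append("host is powered on")
    for psu in psus:
        if not psu.present:
            failures.append(f"{psu.name}: not connected")
        elif not (psu.ac_ok and psu.standby_ok):
            failures.append(f"{psu.name}: AC input or DC standby output not OK")
    return failures


def post_update_ok(psu: PsuStatus) -> bool:
    """Requirement 4: re-check a PSU right after it has been updated."""
    return psu.present and psu.ac_ok and psu.standby_ok
```

Returning a list of reasons rather than a single boolean would make it easier
to put the failing condition into the error event log.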


## Proposed Design

The PSU firmware code update will re-use the current interfaces to upload,
verify, and activate the image.

1. The "Version" interface needs to be extended:
   * Add a new [VersionPurpose][4] for PSU;
   * Re-use the existing ExtendedVersion as an additional string for
     vendor-specific purpose, e.g. to indicate the PSU model.
2. Re-use the existing functions implemented by [phosphor-bmc-code-mgmt][2] for
uploading and processing the image.
   * The PSU update image shall be a tarball consisting of a MANIFEST, images,
     and signatures.
3. There will be a new service that implements the [Activation][5] interface to
update the PSU firmware.
   * It shall run all the checks described in [Requirements] before performing
     the code update;
   * It shall run the checks after each PSU code update is done.
   * The service will verify the signature of the image;
   * The service shall check the ExtendedVersion to make sure the image matches
     the PSU model.
   * The service will call a configurable and vendor-specific tool to perform
     the code update.
   * When any check fails, or the vendor-specific tool returns an error, the PSU
     code update will be aborted and an error event log shall be created.
   * When the PSU code update is completed, an informational event log shall be
     created.
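
The control flow of item 3 could be sketched roughly as below. This is only an
illustration: `checks_ok`, `run_vendor_tool` and `log_event` are hypothetical
callables standing in for the real checks, the vendor-specific tool and the
event logging, and signature/ExtendedVersion verification is assumed to have
happened before this function is called.

```python
from typing import Callable, List


def update_all_psus(
    psus: List[str],
    checks_ok: Callable[[], bool],
    run_vendor_tool: Callable[[str], int],
    log_event: Callable[[str, str], None],
) -> bool:
    """Update PSUs one at a time; abort on the first failed check or tool error."""
    if not checks_ok():
        log_event("error", "pre-conditions not met, update not started")
        return False
    for psu in psus:
        rc = run_vendor_tool(psu)  # assumed to return 0 on success
        if rc != 0:
            log_event("error", f"{psu}: vendor tool failed with code {rc}")
            return False
        if not checks_ok():  # re-run the checks after each PSU update
            log_event("error", f"{psu}: post-update check failed")
            return False
    log_event("info", "PSU code update completed")
    return True
```

Aborting on the first failure leaves the PSUs that have not been touched yet on
their known-good firmware.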


## Alternatives Considered

### General implementation

The PSU firmware update could be implemented by separate recipes that only
call vendor-specific tools.
This would be a bit simpler, but it loses the unified interface provided by
OpenBMC's existing [software update interface][1], so it would become difficult
to drive the PSU firmware update through a standard API.

### VersionPurpose
It is possible to re-use VersionPurpose.Other to represent the PSU image's
version purpose.
But that requires additional information about the image; otherwise, there is
no way to tell whether the image is for a PSU, a CPLD, or another peripheral.
A new VersionPurpose.PSU is more specific, easier to implement, and friendlier
for the user.

### Additional string
The design proposal uses ExtendedVersion as the additional string for
vendor-specific purposes, e.g. to indicate the PSU model, so that the
implementation can check whether the image matches the PSU model.
It is possible to make this additional string optional or to remove it, in
which case the implementation will not verify that the image matches the PSU.
That could be OK if we trust the user to upload the correct image, especially
since the image shall be signed.
But there is always a risk that the image does not match the PSU, and flashing
the wrong PSU firmware could cause unintended damage.
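
As an illustration of the model check, the sketch below parses a MANIFEST of
`key=value` lines and compares its fields against the PSU model. The key names
`purpose` and `extended_version`, the `...VersionPurpose.PSU` string (which
this design only proposes) and the idea that ExtendedVersion holds the bare
model string are all assumptions made for the example.

```python
from typing import Dict

# Proposed by this design; assumed here for illustration only.
PSU_PURPOSE = "xyz.openbmc_project.Software.Version.VersionPurpose.PSU"


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


def image_matches_psu(manifest: Dict[str, str], psu_model: str) -> bool:
    """True only for a PSU-purpose image whose extended version is the model."""
    return (manifest.get("purpose") == PSU_PURPOSE
            and manifest.get("extended_version") == psu_model)
```

A missing `extended_version` key makes the match fail, which corresponds to
treating the additional string as mandatory rather than optional.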


## Impacts

This design only introduces a new VersionPurpose enum into the dbus interfaces.
The newly introduced PSU firmware update service will be a new service that
implements existing [Activation][5] interface.
So the impacts are minimal.


## Testing

Manual tests are required to verify the PSU code update process:
* Verify the PSU code update will not start if the pre-conditions are not
  met;
* Verify the PSU code update is done on all PSUs successfully when the
  pre-conditions are met;
* Verify the PSU code update fails if any PSU's AC input or DC standby output
  is lost during the code update.


[1]: https://github.com/openbmc/phosphor-dbus-interfaces/tree/master/xyz/openbmc_project/Software
[2]: https://github.com/openbmc/phosphor-bmc-code-mgmt/
[3]: https://github.com/openbmc/openpower-pnor-code-mgmt/
[4]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/57b878d048f929643276f1bf7fdf750abc4bde8b/xyz/openbmc_project/Software/Version.interface.yaml#L14
[5]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Software/Activation.interface.yaml

* Re: [Design] PSU firmware update
  2019-06-03  8:54 [Design] PSU firmware update Lei YU
@ 2019-06-03 13:31 ` Andrew Geissler
  2019-06-03 17:23   ` Neeraj Ladkani
  2019-06-04  2:42   ` Lei YU
  2019-06-03 20:48 ` Derek Howard
  2019-06-04 18:26 ` Vernon Mauery
  2 siblings, 2 replies; 24+ messages in thread
From: Andrew Geissler @ 2019-06-03 13:31 UTC (permalink / raw)
  To: Lei YU; +Cc: OpenBMC Maillist

On Mon, Jun 3, 2019 at 3:54 AM Lei YU <mine260309@gmail.com> wrote:
>
> Hi All,
>
> This is a proposed design of PSU firmware update.
> It will be posted to gerrit for review after we have resolved comments
> in the mailing list.
>
> # PSU firmware update
>
> Author:
>    Lei YU <mine260309@gmail.com> <LeiYU>
> Primary assignee:
>    None
> Other contributors:
>    Su Xiao <suxiao@inspur.com>
>    Derek Howard <derekh@us.ibm.com>
> Created:
>    2019-06-03
>
>
> ## Problem Description
>
> There is no support in OpenBMC to update the firmware for PSUs.
>
>
> ## Background and References
>
> In OpenBMC, there is an existing interface for [software update][1].
>
> The update process consists of:
> 1. Uploading an image to BMC;
> 2. Processing the image to check the version and purpose of the image;
> 3. Verifying and activating the image.
>
> Currently, BMC and PNOR firmware update are supported:
> * [phosphor-bmc-code-mgmt][2] implements BMC code update, and it supports all
>   the above 3 processes.
> * [openpower-pnor-code-mgmt][3] implements PNOR code update, and it only
>   implements "verifying and activating" the image. It shares the function of
>   the above 1 & 2 processes.
>
> For PSU firmware code update, it is preferred to re-use the same function for
> the above 1 & 2.
>
>
> ## Requirements
>
> To mitigate the risk of power loss, the PSU firmware code update shall meet
> pre-conditions:
> 1. The host is powered off;
> 2. The redundant PSUs are all connected;
> 3. The AC input and DC standby output shall be OK on all the PSUs;
>
> And during updating:
> 4. After the update is done on a PSU, the AC input and DC standby output shall
> be checked.

What happens if this fails? Auto rollback or just an error log?

>
>
> ## Proposed Design
>
> The PSU firmware code update will re-use the current interfaces to upload,
> verify, and activate the image.

Yes, this ensures the existing Redfish firmware update APIs implemented
within bmcweb will also work for this without any changes required.

>
> 1. The "Version" interface needs to be extended:
>    * Add a new [VersionPurpose][4] for PSU;
>    * Re-use the existing ExtendedVersion as an additional string for
>      vendor-specific purpose, e.g. to indicate the PSU model.
> 2. Re-use the existing functions implemented by [phosphor-bmc-code-mgmt][2] for
> uploading and processing the image.
>    * The PSU update image shall be a tarball consists of a MANIFEST, images,
>      and signatures
> 3. There will be a new service that implements the [Activation][5] interface to
> update the PSU firmware.
>    * It shall run all the checks described in [Requirements] before performing
>      the code update;
>    * It shall run the checks after each PSU code update is done.
>    * The service will verify the signature of the image;
>    * The service shall check the ExtendedVersion to make sure the image matches
>      the PSU model.
>    * The service will call a configurable and vendor-specific tool to perform
>      the code update.
>    * When each check fails, or the vendor-specific tool returns errors, the PSU
>      code update will be aborted and an error event log shall be created.
>    * When the PSU code update is completed, an informational event log shall be
>      created.
Is this a normal requirement when it comes to PSUs? We don't do this for BMC
or PNOR.

>
>
<snip>

* RE: [Design] PSU firmware update
  2019-06-03 13:31 ` Andrew Geissler
@ 2019-06-03 17:23   ` Neeraj Ladkani
  2019-06-04  2:58     ` Lei YU
  2019-06-04  2:42   ` Lei YU
  1 sibling, 1 reply; 24+ messages in thread
From: Neeraj Ladkani @ 2019-06-03 17:23 UTC (permalink / raw)
  To: Andrew Geissler, Lei YU; +Cc: OpenBMC Maillist

1. Why is host power-off a pre-condition? We should instead add a PSU pre-requisite: support for live upgrade and activation.
2. How should a PSU update impact PSU and battery monitoring? Should there be coordination with the sensor monitoring task during the update?
3. A PSU may have multiple regions, such as a bootloader, an active region, and an inactive region. We should design to support multi-region updates.
4. Can you propose the required SEL logs and telemetry requirements as well?

Thanks 
Neeraj

* Re: [Design] PSU firmware update
  2019-06-03  8:54 [Design] PSU firmware update Lei YU
  2019-06-03 13:31 ` Andrew Geissler
@ 2019-06-03 20:48 ` Derek Howard
  2019-06-04  3:19   ` Lei YU
  2019-06-04 18:26 ` Vernon Mauery
  2 siblings, 1 reply; 24+ messages in thread
From: Derek Howard @ 2019-06-03 20:48 UTC (permalink / raw)
  To: openbmc


On 6/3/2019 3:54 AM, Lei YU wrote:
> Hi All,
>
> This is a proposed design of PSU firmware update.
> It will be posted to gerrit for review after we have resolved comments
> in the mailing list.
>
> # PSU firmware update
>
> Author:
>     Lei YU <mine260309@gmail.com> <LeiYU>
> Primary assignee:
>     None
> Other contributors:
>     Su Xiao <suxiao@inspur.com>
>     Derek Howard <derekh@us.ibm.com>
> Created:
>     2019-06-03
>
>
> ## Problem Description
>
> There is no support in OpenBMC to update the firmware for PSUs.
>
>
> ## Background and References
>
> In OpenBMC, there is an existing interface for [software update][1].
>
> The update process consists of:
> 1. Uploading an image to BMC;
> 2. Processing the image to check the version and purpose of the image;
> 3. Verifying and activating the image.
>
> Currently, BMC and PNOR firmware update are supported:
> * [phosphor-bmc-code-mgmt][2] implements BMC code update, and it supports all
>    the above 3 processes.
> * [openpower-pnor-code-mgmt][3] implements PNOR code update, and it only
>    implements "verifying and activating" the image. It shares the function of
>    the above 1 & 2 processes.
>
> For PSU firmware code update, it is preferred to re-use the same function for
> the above 1 & 2.
>
>
> ## Requirements
>
> To mitigate the risk of power loss, the PSU firmware code update shall meet
> pre-conditions:
> 1. The host is powered off;
> 2. The redundant PSUs are all connected;
> 3. The AC input and DC standby output shall be OK on all the PSUs;

As part of the PSU code update, the PSU will turn off its control supply.
If only 1 PSU has AC applied, and it turns off the control supply, then
the BMC would lose power and get reset, which is not wanted.  That being
said, some systems may have 4 PSUs instead of 2.  So not ALL of the redundant
PSUs need to be connected, but at least 1 other PSU should be connected
and have AC applied, so that when 1 PSU is reset due to the download, at
least 1 other PSU will hold up the control supply and keep providing
standby power to the BMC.  It should still be OK to download to the
remaining PSUs even if they don't have AC applied.

So #2 and #3 aren't always valid, including in systems with more than 2
PSUs attached.  It's probably better to say that whenever downloading to a
PSU, at least 1 other PSU is connected and has AC attached.
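
A minimal sketch of this relaxed rule, assuming a hypothetical map from PSU
name to "connected and has AC applied" (the function and the map are
illustrative only, not an existing interface):

```python
from typing import Dict


def can_update(target: str, ac_ok: Dict[str, bool]) -> bool:
    """True if at least one PSU other than `target` is connected with AC applied.

    `ac_ok` maps a PSU name to True when that PSU is connected and has AC;
    absent PSUs are either missing from the map or mapped to False.
    """
    return any(ok for name, ok in ac_ok.items() if name != target)
```

With four PSUs of which only `psu0` has AC, `can_update("psu2", ...)` is true
while `can_update("psu0", ...)` is false, because nothing else would hold up
the standby power to the BMC.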


* Re: [Design] PSU firmware update
  2019-06-03 13:31 ` Andrew Geissler
  2019-06-03 17:23   ` Neeraj Ladkani
@ 2019-06-04  2:42   ` Lei YU
  1 sibling, 0 replies; 24+ messages in thread
From: Lei YU @ 2019-06-04  2:42 UTC (permalink / raw)
  To: Andrew Geissler; +Cc: OpenBMC Maillist

> > And during updating:
> > 4. After the update is done on a PSU, the AC input and DC standby output shall
> > be checked.
>
> What happens if this fail? Auto roll back or just an error log?

I do not think there will be a way to roll back (and the rollback itself could
fail as well).
So probably just log an error and do not update the other PSUs, so that the
system's other PSUs remain good.

> > The PSU firmware code update will re-use the current interfaces to upload,
> > verify, and activate the image.
>
> Yes, this ensures the existing Redfish firmware update API's implemented
> within bmcweb will also work for this without any changes required.

Yup.

> >    * When the PSU code update is completed, an informational event log shall be
> >      created.
> Is this a normal requirement when it comes to PSU's? We don't do this for BMC
> or PNOR.

I expect that an informational log is a good way to tell that something is
done, something like an audit log.
But we could remove this requirement if it is not mandatory.

* Re: [Design] PSU firmware update
  2019-06-03 17:23   ` Neeraj Ladkani
@ 2019-06-04  2:58     ` Lei YU
  2019-06-04  6:43       ` Neeraj Ladkani
                         ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Lei YU @ 2019-06-04  2:58 UTC (permalink / raw)
  To: Neeraj Ladkani; +Cc: Andrew Geissler, OpenBMC Maillist

On Tue, Jun 4, 2019 at 1:23 AM Neeraj Ladkani <neladk@microsoft.com> wrote:
>
> 1. Why host power off is a pre-condition? We should add this a PSU pre-requisite to support Live upgrade and activation.

Derek's reply explains the reason why we want the host to be powered off as a
pre-condition.

> 2. How should PSU update impact PSU and battery monitoring ? should there be coordination between sensor monitoring task during update ?

This is a good point. During the PSU update, the driver should probably be
unbound, and rebound after the update is done.
Does that sound OK?

> 3. PSU may have multiple regions like bootloader, active region and inactive region. We should design to support multiple region update.

I do not have detailed information about this; it seems more suitable to let
the vendor-specific tool handle the multiple regions.
What do you think?

> 4. Can you propose required SEL logs and telemetry requirements as well ?

While I was writing this design doc, I was not thinking about the detailed SEL
logs.
I will need some time to discuss this and see whether it shall be covered in
this doc or not.

* Re: [Design] PSU firmware update
  2019-06-03 20:48 ` Derek Howard
@ 2019-06-04  3:19   ` Lei YU
  0 siblings, 0 replies; 24+ messages in thread
From: Lei YU @ 2019-06-04  3:19 UTC (permalink / raw)
  To: Derek Howard; +Cc: OpenBMC Maillist

> > ## Requirements
> >
> > To mitigate the risk of power loss, the PSU firmware code update shall meet
> > pre-conditions:
> > 1. The host is powered off;
> > 2. The redundant PSUs are all connected;
> > 3. The AC input and DC standby output shall be OK on all the PSUs;
>
> As part of the PSU code update, it will turn off it's control supply.
> If only 1 PSU has AC applied, and it turns off the control supply, then
> the BMC would lose power and get reset, which is not wanted.  That being
> said, some systems may have 4 PSU instead of 2.  So ALL of the redundant
> PSUs wouldn't be connected, but at least 1 other PSU should be connected
> and have AC applied, so that when 1 PSU is reset due to the download, at
> least 1 other PSU will hold up the control supply and be providing
> standby power to the BMC.  It should still be ok to download the
> remaining PSU even if they don't have AC applied.
>
> So #2 and #3 aren't always valid, including in systems with more than 2
> PSUs attached.  It's probably better to say that whenever downloading a
> PSU, that at least 1 other PSU is connected and has AC attached.

Yes, technically, as long as at least 1 other PSU has AC, it should
be safe to do the PSU update.
But that involves a certain complexity, and it is safer to require all PSUs to
have AC.
So I chose the simpler way of requiring 2 & 3 above.

The requirements could be relaxed; let's see what others think.

* RE: [Design] PSU firmware update
  2019-06-04  2:58     ` Lei YU
@ 2019-06-04  6:43       ` Neeraj Ladkani
  2019-06-04  7:02         ` Lei YU
  2019-06-04  7:28       ` Alexander Amelkin
  2019-06-04 18:22       ` Vernon Mauery
  2 siblings, 1 reply; 24+ messages in thread
From: Neeraj Ladkani @ 2019-06-04  6:43 UTC (permalink / raw)
  To: Lei YU; +Cc: Andrew Geissler, OpenBMC Maillist

Are you proposing that if a PSU FW update is attempted while the system is powered on, the FW update will not start? We should not tie the framework to these requirements. If this is really required for a particular platform design, then the vendor-specific tool can perform the right checks before triggering the update.

Also, how do we tie this in with IPMI? How does the payload reach the BMC, and how do we know the progress of the FW update?

Thanks
Neeraj

* Re: [Design] PSU firmware update
  2019-06-04  6:43       ` Neeraj Ladkani
@ 2019-06-04  7:02         ` Lei YU
  2019-06-04  7:20           ` Neeraj Ladkani
  0 siblings, 1 reply; 24+ messages in thread
From: Lei YU @ 2019-06-04  7:02 UTC (permalink / raw)
  To: Neeraj Ladkani; +Cc: Andrew Geissler, OpenBMC Maillist

On Tue, Jun 4, 2019 at 2:43 PM Neeraj Ladkani <neladk@microsoft.com> wrote:
>
> Are you proposing that if PSU FW is attempted and if system is powered on, the FW update will not start?

Yes, do not perform a PSU FW update while the system is powered on; it is
considered risky.

> We should not tie framework with these requirements.  If this is really required for a particular platform design then vendor specific tool can have right checks before triggering the update.

This is a good point. I would like to know if there are cases where the PSU
can be updated safely while the system is powered up.
If there really are such cases, then it's true that the framework should not
require this and should leave it to vendor-specific tools.

> Also how do we tie this with IPMI?  How does the payload reach BMC and How do we know progress of FW update ?

This design does not involve IPMI at all. The payload is uploaded, processed,
and activated through the same interfaces as the BMC code update.
See the doc here: https://github.com/openbmc/docs/blob/master/code-update/code-update.md
So you could use the REST APIs or Redfish to do the PSU code update.
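
For illustration, the two REST requests could be described as below. The
endpoint paths and the `RequestedActivations.Active` value follow my reading of
the code-update doc linked above and should be checked against it; the helper
names and the returned dict layout are hypothetical, and nothing is actually
sent over the network.

```python
import json
from typing import Dict

# Value written to RequestedActivation to start the activation (per the doc).
ACTIVE = "xyz.openbmc_project.Software.Activation.RequestedActivations.Active"


def upload_request(bmc: str) -> Dict[str, object]:
    """Describe the request that uploads the image tarball to the BMC."""
    return {
        "method": "POST",
        "url": f"https://{bmc}/upload/image",
        "headers": {"Content-Type": "application/octet-stream"},
    }


def activate_request(bmc: str, image_id: str) -> Dict[str, object]:
    """Describe the request that asks the BMC to activate an uploaded image."""
    return {
        "method": "PUT",
        "url": (f"https://{bmc}/xyz/openbmc_project/software/"
                f"{image_id}/attr/RequestedActivation"),
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"data": ACTIVE}),
    }
```

The same two steps would apply to a PSU image as to a BMC image; only the
service that picks up the activation differs.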

* RE: [Design] PSU firmware update
  2019-06-04  7:02         ` Lei YU
@ 2019-06-04  7:20           ` Neeraj Ladkani
  0 siblings, 0 replies; 24+ messages in thread
From: Neeraj Ladkani @ 2019-06-04  7:20 UTC (permalink / raw)
  To: Lei YU; +Cc: Andrew Geissler, OpenBMC Maillist

Yes, for cloud infrastructure use cases, all firmware updates need to be live and impactless. Any thoughts on how this can be extended to IPMI?

Neeraj

* Re: [Design] PSU firmware update
  2019-06-04  2:58     ` Lei YU
  2019-06-04  6:43       ` Neeraj Ladkani
@ 2019-06-04  7:28       ` Alexander Amelkin
  2019-06-04 18:22       ` Vernon Mauery
  2 siblings, 0 replies; 24+ messages in thread
From: Alexander Amelkin @ 2019-06-04  7:28 UTC (permalink / raw)
  To: Lei YU, Neeraj Ladkani; +Cc: OpenBMC Maillist


[-- Attachment #1.1: Type: text/plain, Size: 2177 bytes --]

04.06.2019 5:58, Lei YU wrote:
> On Tue, Jun 4, 2019 at 1:23 AM Neeraj Ladkani <neladk@microsoft.com> wrote:
>> 1. Why host power off is a pre-condition? We should add this a PSU pre-requisite to support Live upgrade and activation.
> Derek's reply explains the reason why we want the host power off as a
> pre-condition.
>
>> 2. How should PSU update impact PSU and battery monitoring ? should there be coordination between sensor monitoring task during update ?
> This is a good point. During PSU update, the driver probably should be unbound,
> and after the update is done, the driver should be rebound.
> Does that sound OK?

Unbinding the telemetry driver (as in kernel driver) isn't a good idea because telemetry for a PSU can be provided by the same driver that provides firmware update facilities.

In YADRO we have developed a mechanism that renders certain sensors 'invalid' (or alternatively changes their thresholds) in certain states of other sensors. For us that allows for avoiding a failure state for chassis fan sensors when the host is off and also lets us live fine with zero main 12V output of PSUs when the host is off. I suppose this mechanism could be adopted by OpenBMC and adapted to this task to just disable some telemetry during PSU firmware update.
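
To illustrate the idea only (this is not the YADRO implementation, and every name below is hypothetical), a sensor can report "no reading" while it is masked, so that threshold checks are skipped for the duration of the PSU firmware update:

```python
from typing import Optional


class MaskableSensor:
    """Hypothetical sensor wrapper: the reading is None (invalid) while masked."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._masked = False
        self._raw: Optional[float] = None

    def set_raw(self, value: float) -> None:
        self._raw = value

    def set_masked(self, masked: bool) -> None:
        self._masked = masked

    def reading(self) -> Optional[float]:
        # A masked sensor never exposes a value.
        return None if self._masked else self._raw


def below_threshold(sensor: MaskableSensor, low: float) -> bool:
    """An invalid reading never trips the threshold."""
    value = sensor.reading()
    return value is not None and value < low


psu_vout = MaskableSensor("psu0_vout")
psu_vout.set_raw(0.0)          # output drops while the PSU is being flashed
psu_vout.set_masked(True)      # update started: telemetry is disabled
alarm_during_update = below_threshold(psu_vout, low=11.0)
psu_vout.set_masked(False)     # update finished: telemetry is back
alarm_after_update = below_threshold(psu_vout, low=11.0)
```

The driver stays bound the whole time; only the interpretation of the readings changes.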

>
>> 3. PSU may have multiple regions like bootloader, active region and inactive region. We should design to support multiple region update.
> I do not have detailed information about this; it seems more suitable to let
> the vendor-specific tool handle the multiple regions.
> What do you think?

It's definitely up to the PSU vendor specific update tool to handle all the layout and update interface peculiarities.

>
>> 4. Can you propose required SEL logs and telemetry requirements as well ?
> While I was writing this design doc, I was not thinking about the detailed SEL
> logs.
> Will need some time to discuss this and see if it shall be covered in this doc
> or not.

The only event that I could find in IPMI spec is 'Version change' (sensor type code 0x2B).

With best regards,
Alexander Amelkin,
Leading BMC Software Engineer, YADRO
https://yadro.com




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-04  2:58     ` Lei YU
  2019-06-04  6:43       ` Neeraj Ladkani
  2019-06-04  7:28       ` Alexander Amelkin
@ 2019-06-04 18:22       ` Vernon Mauery
  2 siblings, 0 replies; 24+ messages in thread
From: Vernon Mauery @ 2019-06-04 18:22 UTC (permalink / raw)
  To: Lei YU; +Cc: Neeraj Ladkani, OpenBMC Maillist

On 04-Jun-2019 10:58 AM, Lei YU wrote:
>On Tue, Jun 4, 2019 at 1:23 AM Neeraj Ladkani <neladk@microsoft.com> wrote:
>>
>> 1. Why host power off is a pre-condition? We should add this a PSU pre-requisite to support Live upgrade and activation.
>
>Derek's reply explains the reason why we want the host power off as a
>pre-condition.

I don't think host power off should be a requirement. Our power supplies 
run safely in their bootloader while doing a firmware update and can 
keep the power on.

--Vernon

>> 2. How should PSU update impact PSU and battery monitoring ? should there be coordination between sensor monitoring task during update ?
>
>This is a good point. During PSU update, the driver probably should be unbound,
>and after the update is done, the driver should be rebound.
>Does that sound OK?
>
>> 3. PSU may have multiple regions like bootloader, active region and inactive region. We should design to support multiple region update.
>
>I do not have detailed information about this; it seems more suitable to let
>the vendor-specific tool handle the multiple regions.
>What do you think?
>
>> 4. Can you propose required SEL logs and telemetry requirements as well ?
>
>While I was writing this design doc, I was not thinking about the detailed SEL
>logs.
>Will need some time to discuss this and see if it shall be covered in this doc
>or not.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-03  8:54 [Design] PSU firmware update Lei YU
  2019-06-03 13:31 ` Andrew Geissler
  2019-06-03 20:48 ` Derek Howard
@ 2019-06-04 18:26 ` Vernon Mauery
  2019-06-05  6:18   ` Lei YU
  2 siblings, 1 reply; 24+ messages in thread
From: Vernon Mauery @ 2019-06-04 18:26 UTC (permalink / raw)
  To: Lei YU; +Cc: OpenBMC Maillist

On 03-Jun-2019 04:54 PM, Lei YU wrote:
>Hi All,
>
>This is a proposed design of PSU firmware update.
>It will be posted to gerrit for review after we have resolved comments
>in the mailing list.
>
># PSU firmware update
>
>Author:
>   Lei YU <mine260309@gmail.com> <LeiYU>
>Primary assignee:
>   None
>Other contributors:
>   Su Xiao <suxiao@inspur.com>
>   Derek Howard <derekh@us.ibm.com>
>Created:
>   2019-06-03
>
>
>## Problem Description
>
>There is no support in OpenBMC to update the firmware for PSUs.
>
>
>## Background and References
>
>In OpenBMC, there is an existing interface for [software update][1].
>
>The update process consists of:
>1. Uploading an image to BMC;
>2. Processing the image to check the version and purpose of the image;
>3. Verifying and activating the image.
>
>Currently, BMC and PNOR firmware update are supported:
>* [phosphor-bmc-code-mgmt][2] implements BMC code update, and it supports all
>  the above 3 processes.
>* [openpower-pnor-code-mgmt][3] implements PNOR code update, and it only
>  implements "verifying and activating" the image. It shares the function of
>  the above 1 & 2 processes.
>
>For PSU firmware code update, it is preferred to re-use the same function for
>the above 1 & 2.
>
>
>## Requirements
>
>To mitigate the risk of power loss, the PSU firmware code update shall meet
>pre-conditions:
>1. The host is powered off;
>2. The redundant PSUs are all connected;
>3. The AC input and DC standby output shall be OK on all the PSUs;
>
>And during updating:
>4. After the update is done on a PSU, the AC input and DC standby output shall
>be checked.
>
>
>## Proposed Design
>
>The PSU firmware code update will re-use the current interfaces to upload,
>verify, and activate the image.

We would like the option to be able to ship the PSU firmware as part of 
the BMC image (in the root filesystem). This means that it is already 
present and authenticated when the BMC boots. In this way, we know that 
the current BMC firmware plays well with the PSU firmware and have fewer 
variables to test for when making a release.

I suppose this could be done by skipping the download phase and simply 
creating an activation object at boot and then initiating the FW 
activation automatically.
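
A rough Python sketch of that boot-time step, under stated assumptions: the bundled images live in one directory per image under a root directory chosen by the platform, and each directory carries a MANIFEST of simple key=value lines. The function names, the directory layout, and the "Ready" state string are illustrative, not an existing OpenBMC API.

```python
from pathlib import Path
from typing import Dict, List


def parse_manifest(path: Path) -> Dict[str, str]:
    """Parse simple key=value lines; lines without '=' are skipped."""
    fields: Dict[str, str] = {}
    for line in path.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def find_bundled_images(image_root: Path) -> List[Dict[str, str]]:
    """Return one pending-activation record per image directory under image_root."""
    pending: List[Dict[str, str]] = []
    if not image_root.is_dir():
        return pending
    for image_dir in sorted(image_root.iterdir()):
        manifest = image_dir / "MANIFEST"
        if manifest.is_file():
            record = parse_manifest(manifest)
            record["path"] = str(image_dir)
            # No upload step: the image shipped inside the BMC root filesystem.
            record["state"] = "Ready"
            pending.append(record)
    return pending
```

Each returned record would then be handed to whatever creates the activation object and requests the activation.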

--Vernon

>1. The "Version" interface needs to be extended:
>   * Add a new [VersionPurpose][4] for PSU;
>   * Re-use the existing ExtendedVersion as an additional string for
>     vendor-specific purpose, e.g. to indicate the PSU model.
>2. Re-use the existing functions implemented by [phosphor-bmc-code-mgmt][2] for
>uploading and processing the image.
>   * The PSU update image shall be a tarball consists of a MANIFEST, images,
>     and signatures
>3. There will be a new service that implements the [Activation][5] interface to
>update the PSU firmware.
>   * It shall run all the checks described in [Requirements] before performing
>     the code update;
>   * It shall run the checks after each PSU code update is done.
>   * The service will verify the signature of the image;
>   * The service shall check the ExtendedVersion to make sure the image matches
>     the PSU model.
>   * The service will call a configurable and vendor-specific tool to perform
>     the code update.
>   * When each check fails, or the vendor-specific tool returns errors, the PSU
>     code update will be aborted and an error event log shall be created.
>   * When the PSU code update is completed, an informational event log shall be
>     created.
>
>
>## Alternatives Considered
>
>### General implementation
>
>The PSU firmware update could be implemented by separated recipes that only
>call vendor-specific tools.
>It will be a bit simpler but loses the unified interface provided by OpenBMC's
>existing [software update interface][1], and thus it will become difficult to
>use a standard API to the PSU firmware update.
>
>### VersionPurpose
>It is possible to re-use the VersionPurpose.Other to represent the PSU image's
>version purpose.
>But that requires additional information about the image, otherwise, there is
>no way to tell if the image is for PSU, or CPLD, or other peripherals.
>A new VersionPurpose.PSU is more specific and makes it easier to implement and
>friendly for the user.
>
>### Additional string
>The design proposal uses ExtendedVersion as the additional string for
>vendor-specific purpose, e.g. to indicate the PSU model, so the implementation
>could check and compare if the image matches the PSU model.
>It is possible to make it optional or remove this additional string, then the
>implementation will not verify if the image matches the PSU. It could be OK if
>we trust the user who is uploading the correct image, especially the image
>shall be signed.
>But it is always risky in case the image does not match the PSU, and cause
>unintended damage if the incorrect PSU firmware is updated.
>
>
>## Impacts
>
>This design only introduces a new VersionPurpose enum into the dbus interfaces.
>The newly introduced PSU firmware update service will be a new service that
>implements existing [Activation][5] interface.
>So the impacts are minimal.
>
>
>## Testing
>
>It requires the manual tests to verify the PSU code update process.
>* Verify the PSU code update will not start in case the pre-conditions are not
>  met;
>* Verify the PSU code update is done on all PSUs successfully when the
>  pre-conditions are met.
>* Verify the PSU code update will fail in the case that any PSU's AC input or
>  DC standby output is lost during code update.
>
>
>[1]: https://github.com/openbmc/phosphor-dbus-interfaces/tree/master/xyz/openbmc_project/Software
>[2]: https://github.com/openbmc/phosphor-bmc-code-mgmt/
>[3]: https://github.com/openbmc/openpower-pnor-code-mgmt/
>[4]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/57b878d048f929643276f1bf7fdf750abc4bde8b/xyz/openbmc_project/Software/Version.interface.yaml#L14
>[5]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Software/Activation.interface.yaml

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-04 18:26 ` Vernon Mauery
@ 2019-06-05  6:18   ` Lei YU
  2019-06-05 14:25     ` Matt Spinler
  0 siblings, 1 reply; 24+ messages in thread
From: Lei YU @ 2019-06-05  6:18 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: OpenBMC Maillist

> >The PSU firmware code update will re-use the current interfaces to upload,
> >verify, and activate the image.
>
> We would like the option to be able to ship the PSU firmware as part of
> the BMC image (in the root filesystem). This means that it is already
> present and authenticated when the BMC boots. In this way, we know that
> the current BMC firmware plays well with the PSU firmware and have fewer
> variables to test for when making a release.

Because the PSU firmware is part of the BMC image, this seems like a completely
different approach, more like a part of the BMC image update, doesn't it?
I would expect this not to be part of this design. What do you think?

>
> I suppose this could be done by skipping the download phase and simply
> creating an activation object at boot and then initiating the FW
> activation automatically.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-05  6:18   ` Lei YU
@ 2019-06-05 14:25     ` Matt Spinler
  2019-06-05 14:42       ` Adriana Kobylak
  2019-06-06  3:31       ` Lei YU
  0 siblings, 2 replies; 24+ messages in thread
From: Matt Spinler @ 2019-06-05 14:25 UTC (permalink / raw)
  To: openbmc


On 6/5/2019 1:18 AM, Lei YU wrote:
>>> The PSU firmware code update will re-use the current interfaces to upload,
>>> verify, and activate the image.
>> We would like the option to be able to ship the PSU firmware as part of
>> the BMC image (in the root filesystem). This means that it is already
>> present and authenticated when the BMC boots. In this way, we know that
>> the current BMC firmware plays well with the PSU firmware and have fewer
>> variables to test for when making a release.
> Because the PSU firmware is part of BMC image, this seems a completely
> different approach, and more like part of BMC image update, is it?
> I would expect this should not be part of this design, what do you think?

FYI, I am 99% sure this is how IBM needs its systems to work as well.
That being the case, will you also be handling this design?



>> I suppose this could be done by skipping the download phase and simply
>> creating an activation object at boot and then initiating the FW
>> activation automatically.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-05 14:25     ` Matt Spinler
@ 2019-06-05 14:42       ` Adriana Kobylak
  2019-06-06  3:31       ` Lei YU
  1 sibling, 0 replies; 24+ messages in thread
From: Adriana Kobylak @ 2019-06-05 14:42 UTC (permalink / raw)
  To: Matt Spinler; +Cc: openbmc, openbmc

On 2019-06-05 09:25, Matt Spinler wrote:
> On 6/5/2019 1:18 AM, Lei YU wrote:
>>>> The PSU firmware code update will re-use the current interfaces to 
>>>> upload,
>>>> verify, and activate the image.
>>> We would like the option to be able to ship the PSU firmware as part 
>>> of
>>> the BMC image (in the root filesystem). This means that it is already
>>> present and authenticated when the BMC boots. In this way, we know 
>>> that
>>> the current BMC firmware plays well with the PSU firmware and have 
>>> fewer
>>> variables to test for when making a release.
>> Because the PSU firmware is part of BMC image, this seems a completely
>> different approach, and more like part of BMC image update, is it?
>> I would expect this should not be part of this design, what do you 
>> think?
> 
> FYI, I am 99% sure this is how IBM needs its systems to work as

There's a Version Purpose of "System"[1] that "is an aggregate for the 
system as a whole." IBM is planning to use this to bundle the BMC and 
host firmware in a single file, which as Vernon mentions would ensure 
that one is compatible with the other.

I'm not saying the proposed PSU design has to include the 'combined image'
details. We can just mention that it can be an option, and we can re-discuss
what a BMC+PSU image could look like once I send some details on the BMC+host
image in the next few months.

[1] 
https://github.com/openbmc/phosphor-dbus-interfaces/blob/57b878d048f929643276f1bf7fdf750abc4bde8b/xyz/openbmc_project/Software/Version.interface.yaml#L24

> well.  That being the case,
> 
> will you also be handling this design?
> 
> 
> 
>>> I suppose this could be done by skipping the download phase and 
>>> simply
>>> creating an activation object at boot and then initiating the FW
>>> activation automatically.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-05 14:25     ` Matt Spinler
  2019-06-05 14:42       ` Adriana Kobylak
@ 2019-06-06  3:31       ` Lei YU
  2019-06-06 20:31         ` Adriana Kobylak
  1 sibling, 1 reply; 24+ messages in thread
From: Lei YU @ 2019-06-06  3:31 UTC (permalink / raw)
  To: Matt Spinler; +Cc: OpenBMC Maillist

On Wed, Jun 5, 2019 at 10:25 PM Matt Spinler <mspinler@linux.ibm.com> wrote:
>
>
> On 6/5/2019 1:18 AM, Lei YU wrote:
> >>> The PSU firmware code update will re-use the current interfaces to upload,
> >>> verify, and activate the image.
> >> We would like the option to be able to ship the PSU firmware as part of
> >> the BMC image (in the root filesystem). This means that it is already
> >> present and authenticated when the BMC boots. In this way, we know that
> >> the current BMC firmware plays well with the PSU firmware and have fewer
> >> variables to test for when making a release.
> > Because the PSU firmware is part of BMC image, this seems a completely
> > different approach, and more like part of BMC image update, is it?
> > I would expect this should not be part of this design, what do you think?
>
> FYI, I am 99% sure this is how IBM needs its systems to work as well.
> That being the case,
>
> will you also be handling this design?

Good to know.

Then a question comes up:
In which cases shall the PSU firmware update be done?
1. It is updated together with the BMC firmware update, as described by Vernon
   Mauery;
2. It is updated independently with APIs, as described in this design doc.

Will 1 and 2 both be valid, or is only 1 the real case, so that we do not need
to support 2?

I ask because, if we could get clear requirements, it may be possible to
simplify the design.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-06  3:31       ` Lei YU
@ 2019-06-06 20:31         ` Adriana Kobylak
  2019-06-07 14:35           ` Matt Spinler
  2019-06-10  3:18           ` Lei YU
  0 siblings, 2 replies; 24+ messages in thread
From: Adriana Kobylak @ 2019-06-06 20:31 UTC (permalink / raw)
  To: Lei YU; +Cc: Matt Spinler, OpenBMC Maillist, openbmc

On 2019-06-05 22:31, Lei YU wrote:
> On Wed, Jun 5, 2019 at 10:25 PM Matt Spinler <mspinler@linux.ibm.com> 
> wrote:
>> 
>> 
>> On 6/5/2019 1:18 AM, Lei YU wrote:
>> >>> The PSU firmware code update will re-use the current interfaces to upload,
>> >>> verify, and activate the image.
>> >> We would like the option to be able to ship the PSU firmware as part of
>> >> the BMC image (in the root filesystem). This means that it is already
>> >> present and authenticated when the BMC boots. In this way, we know that
>> >> the current BMC firmware plays well with the PSU firmware and have fewer
>> >> variables to test for when making a release.
>> > Because the PSU firmware is part of BMC image, this seems a completely
>> > different approach, and more like part of BMC image update, is it?
>> > I would expect this should not be part of this design, what do you think?
>> 
>> FYI, I am 99% sure this is how IBM needs its systems to work as well.
>> That being the case,
>> 
>> will you also be handling this design?
> 
> Good to know.
> 
> Then a question comes up:
> In which cases PSU firmware update shall be done?
> 1. It is updated together with BMC firmware update as described by 
> Vernon
>    Mauery;
> 2. It is updated independently with APIs, as described in this design 
> doc.
> 
> Will 1 and 2 both be valid, or only 1 is the real case and we do not 
> need to
> support 2?
> 

I see it as having a single tarball file that has the required files to update
the BMC and the PSU. When this tarball is uploaded, then a new Version with a
Purpose of System or some other name is created. When this Version is
activated, this triggers the BMC updater (existing) and the PSU updater (new)
to check if all the necessary files to perform the update of their component
exist. If yes, each updater updates their piece and if any one fails it'd mark
the Version as Failed (TBD on synchronizing the updaters to mark the Version as
Active or Failed).
So the PSU would be updated at the same time as the BMC, but done by its own
updater application.
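
The flow above could look roughly like this in Python. This is a sketch, not a design: the synchronization is still TBD, so the two-step ordering (check all files first, then run every updater) is an assumption, and `activate_system_image`, the file names, and the state strings are hypothetical.

```python
from typing import Callable, Dict, Set

# An updater takes the set of file names in the tarball and returns True on success.
Updater = Callable[[Set[str]], bool]


def activate_system_image(
    files: Set[str],
    updaters: Dict[str, Updater],
    required: Dict[str, Set[str]],
) -> str:
    """Return the resulting state of the combined Version: "Active" or "Failed"."""
    # Step 1: every component checks that its files are present before
    # anything is flashed (assumed ordering; the mail leaves this open).
    for component in updaters:
        if not required[component] <= files:
            return "Failed"

    # Step 2: each updater handles its own piece. All updaters run even if an
    # earlier one failed; stopping early would be an equally valid choice.
    results = [update(files) for update in updaters.values()]
    return "Active" if all(results) else "Failed"
```

The combined Version only becomes Active when both the BMC updater and the PSU updater report success.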

Thoughts?

> The reason I ask is because if we could get clear requirements, it is 
> possible
> to simplify the design.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-06 20:31         ` Adriana Kobylak
@ 2019-06-07 14:35           ` Matt Spinler
  2019-06-07 15:52             ` Derek Howard
  2019-06-10  3:18           ` Lei YU
  1 sibling, 1 reply; 24+ messages in thread
From: Matt Spinler @ 2019-06-07 14:35 UTC (permalink / raw)
  To: Adriana Kobylak, Lei YU; +Cc: OpenBMC Maillist, openbmc


On 6/6/2019 3:31 PM, Adriana Kobylak wrote:
> On 2019-06-05 22:31, Lei YU wrote:
>> On Wed, Jun 5, 2019 at 10:25 PM Matt Spinler <mspinler@linux.ibm.com> 
>> wrote:
>>>
>>>
>>> On 6/5/2019 1:18 AM, Lei YU wrote:
>>> >>> The PSU firmware code update will re-use the current interfaces 
>>> to upload,
>>> >>> verify, and activate the image.
>>> >> We would like the option to be able to ship the PSU firmware as 
>>> part of
>>> >> the BMC image (in the root filesystem). This means that it is 
>>> already
>>> >> present and authenticated when the BMC boots. In this way, we 
>>> know that
>>> >> the current BMC firmware plays well with the PSU firmware and 
>>> have fewer
>>> >> variables to test for when making a release.
>>> > Because the PSU firmware is part of BMC image, this seems a 
>>> completely
>>> > different approach, and more like part of BMC image update, is it?
>>> > I would expect this should not be part of this design, what do you 
>>> think?
>>>
>>> FYI, I am 99% sure this is how IBM needs its systems to work as well.
>>> That being the case,
>>>
>>> will you also be handling this design?
>>
>> Good to know.
>>
>> Then a question comes up:
>> In which cases PSU firmware update shall be done?
>> 1. It is updated together with BMC firmware update as described by 
>> Vernon
>>    Mauery;
>> 2. It is updated independently with APIs, as described in this design 
>> doc.
>>
>> Will 1 and 2 both be valid, or only 1 is the real case and we do not 
>> need to
>> support 2?
>>
>
> I see it as having a single tarball file that has the required files 
> to update the
> BMC and the PSU. When this tarball is uploaded, then a new Version 
> with a Purpose
> of System or some other name is created. When this Version is 
> activated, this
> triggers the BMC updater (existing) and the PSU updater (new) to check 
> if all
> the necessary files to perform the update of their component exist. If 
> yes, each
> updater updates their piece and if any one fails it'd mark the Version 
> as Failed
> (TBD on synchronizing the updaters to mark the Version as Active or 
> Failed).
> So the PSU would be updated at the same time as the BMC, but done by 
> its own
> updater application.
>
> Thoughts?

3 more quick notes:

1) PSs can be hot pluggable, so when a new one is detected, the code
update should run then too if the new PS needs one, assuming all other
conditions are met.

2) A single system may support multiple models of PS (will definitely
happen for us), so this design should be able to store multiple PS
images and send the correct image to the correct model.

3) You mentioned the combined image stuff before.  We should just check
that the timeline for that support aligns with this one.
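
Notes 1) and 2) together amount to a lookup that runs whenever a PS is detected. A minimal Python sketch, assuming the BMC keeps one stored image per model and that "needs one" means "runs a different version than the stored image"; `PsuImage` and `image_for_psu` are made-up names:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PsuImage:
    model: str
    version: str
    path: str


def image_for_psu(
    model: str, running_version: str, stored: Dict[str, PsuImage]
) -> Optional[PsuImage]:
    """Return the stored image to flash, or None when no update is needed."""
    image = stored.get(model)
    if image is None:
        return None  # no image kept for this model
    if image.version == running_version:
        return None  # already at the stored level
    return image
```

The caller would still have to check the other pre-conditions before actually starting the update on the newly plugged PS.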






>
>> The reason I ask is because if we could get clear requirements, it is 
>> possible
>> to simplify the design.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-07 14:35           ` Matt Spinler
@ 2019-06-07 15:52             ` Derek Howard
  2019-06-10  3:16               ` Lei YU
  0 siblings, 1 reply; 24+ messages in thread
From: Derek Howard @ 2019-06-07 15:52 UTC (permalink / raw)
  To: openbmc


On 6/7/2019 9:35 AM, Matt Spinler wrote:
>
> On 6/6/2019 3:31 PM, Adriana Kobylak wrote:
>> On 2019-06-05 22:31, Lei YU wrote:
>>> On Wed, Jun 5, 2019 at 10:25 PM Matt Spinler 
>>> <mspinler@linux.ibm.com> wrote:
>>>>
>>>>
>>>> On 6/5/2019 1:18 AM, Lei YU wrote:
>>>> >>> The PSU firmware code update will re-use the current interfaces 
>>>> to upload,
>>>> >>> verify, and activate the image.
>>>> >> We would like the option to be able to ship the PSU firmware as 
>>>> part of
>>>> >> the BMC image (in the root filesystem). This means that it is 
>>>> already
>>>> >> present and authenticated when the BMC boots. In this way, we 
>>>> know that
>>>> >> the current BMC firmware plays well with the PSU firmware and 
>>>> have fewer
>>>> >> variables to test for when making a release.
>>>> > Because the PSU firmware is part of BMC image, this seems a 
>>>> completely
>>>> > different approach, and more like part of BMC image update, is it?
>>>> > I would expect this should not be part of this design, what do 
>>>> you think?
>>>>
>>>> FYI, I am 99% sure this is how IBM needs its systems to work as well.
>>>> That being the case,
>>>>
>>>> will you also be handling this design?
>>>
>>> Good to know.
>>>
>>> Then a question comes up:
>>> In which cases PSU firmware update shall be done?
>>> 1. It is updated together with BMC firmware update as described by 
>>> Vernon
>>>    Mauery;
>>> 2. It is updated independently with APIs, as described in this 
>>> design doc.
>>>
>>> Will 1 and 2 both be valid, or only 1 is the real case and we do not 
>>> need to
>>> support 2?
>>>
>>
>> I see it as having a single tarball file that has the required files 
>> to update the
>> BMC and the PSU. When this tarball is uploaded, then a new Version 
>> with a Purpose
>> of System or some other name is created. When this Version is 
>> activated, this
>> triggers the BMC updater (existing) and the PSU updater (new) to 
>> check if all
>> the necessary files to perform the update of their component exist. 
>> If yes, each
>> updater updates their piece and if any one fails it'd mark the 
>> Version as Failed
>> (TBD on synchronizing the updaters to mark the Version as Active or 
>> Failed).
>> So the PSU would be updated at the same time as the BMC, but done by 
>> its own
>> updater application.
>>
>> Thoughts?
>
> 3 more quick notes:
>
> 1) PSs can be hot pluggable, so when a new one is detected, the code
> update should run then too if the new PS needs one, assuming all other
> conditions are met.
>
> 2) A single system may support multiple models of PS (will definitely
> happen for us), so this design should be able to store multiple PS
> images and send the correct image to the correct model.
>
> 3) You mentioned the combined image stuff before.  We should just check
> the timeline for that support aligns with this one.
>
>
Good point Matt on the PS install.  It would probably be a good idea to 
get the newly installed PS to the same image as the rest of the PS's in 
the system.

We do support PS's that don't provide control supply (standby voltage) 
when reset at the end of the update, while other PS's do.  Therefore for 
the former case, if only 1 PS has AC attached, we cannot update/reset 
that PS, so please let that be selectable by the user (e.g. vendor-specific
tool).
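
As a sketch of that rule in Python (the `Psu` fields and `can_update` are hypothetical, and a real check would likely live in the vendor-specific tool): a PS that drops standby voltage when it resets can only be updated if some other PS with AC attached can carry the standby rail meanwhile.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Psu:
    name: str
    ac_ok: bool
    keeps_standby_on_reset: bool


def can_update(target: Psu, all_psus: List[Psu]) -> bool:
    """Return True when resetting `target` after the update cannot drop standby power."""
    if not target.ac_ok:
        return False  # a PS without AC cannot be updated at all
    if target.keeps_standby_on_reset:
        return True
    # Otherwise another PS with AC attached must hold the standby voltage.
    return any(p.ac_ok for p in all_psus if p.name != target.name)
```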

Also, please provide a way to know that the updates have finished.  As 
we don't want to update the PS's when the power is on (this is vendor 
specific as well), we also do not want to power the system on in the 
middle of an update.  For example, if after a BMC update the PS's are 
being updated, we want to hold off the next system power on until the PS 
updates have finished. Thanks.

>
>
>>
>>> The reason I ask is because if we could get clear requirements, it 
>>> is possible
>>> to simplify the design.
>
Would it be possible to support both methods?  The general use case 
being done during/after BMC code update, but also support the more 
manual method that could be used perhaps in the lab to test new psu 
images or in the field if there are problems with an existing image? Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-07 15:52             ` Derek Howard
@ 2019-06-10  3:16               ` Lei YU
  2019-06-10 20:43                 ` Derek Howard
  0 siblings, 1 reply; 24+ messages in thread
From: Lei YU @ 2019-06-10  3:16 UTC (permalink / raw)
  To: Derek Howard; +Cc: OpenBMC Maillist

> > 3 more quick notes:
> >
> > 1) PSs can be hot pluggable, so when a new one is detected, the code
> > update should run then too if the new PS needs one, assuming all other
> > conditions are met.
> >
> > 2) A single system may support multiple models of PS (will definitely
> > happen for us), so this design should be able to store multiple PS
> > images and send the correct image to the correct model.
> >
> > 3) You mentioned the combined image stuff before.  We should just check
> > the timeline for that support aligns with this one.
> >
> >
> Good point Matt on the PS install.  It would probably be a good idea to
> get the newly installed PS to the same image as the rest of the PS's in
> the system.

Yup, really good point.
This implies that the BMC shall keep a local copy of the PSU image for future
updates.

> We do support PS's that don't provide control supply (standby voltage)
> when reset at the end of the update, while other PS's do.  Therefore for
> the former case, if only 1 PS has AC attached, we cannot update/reset
> that PS, so please let that be selectable by the user (eg vendor
> specific tool).

This is somewhat complex, but if we could defer this to the vendor-specific
tool, that's OK.
However, if a system has multiple models of PS, I am not sure how the
vendor-specific tool will handle that.
Should we defer that to the vendor-specific tool, too?

>
> Also, please provide a way to know that the updates have finished.  As
> we don't want to update the PS's when the power is on (this is vendor
> specific as well), we also do not want to power the system on in the
> middle of an update.  For example, if after a BMC update the PS's are
> being updated, we want to hold off the next system power on until the PS
> updates have finished. Thanks.

This is already supported by the existing interface.
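
For illustration, assuming the state of each software object is available as a string and that "Activating" marks an update in progress (as the Activation property of the existing interface is understood to report), the power-on gate could be sketched in Python as below. `power_on_allowed` and the object paths are hypothetical.

```python
from typing import Mapping


def power_on_allowed(activations: Mapping[str, str]) -> bool:
    """Return False while any software object is still being activated.

    `activations` maps an object path to its activation state string.
    """
    return all(state != "Activating" for state in activations.values())


# Example: one PSU image is still being written, so power-on is held off.
in_progress = {
    "/software/bmc_1": "Active",
    "/software/psu_1": "Activating",
}
finished = {
    "/software/bmc_1": "Active",
    "/software/psu_1": "Active",
}
```

A failed activation does not block power-on in this sketch; whether it should is a policy decision for the platform.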


> >
> >
> >>
> >>> The reason I ask is because if we could get clear requirements, it
> >>> is possible
> >>> to simplify the design.
> >
> Would it be possible to support both methods?  The general use case
> being done during/after BMC code update, but also support the more
> manual method that could be used perhaps in the lab to test new psu
> images or in the field if there are problems with an existing image? Thanks.

This design doc will be updated to support both cases.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-06 20:31         ` Adriana Kobylak
  2019-06-07 14:35           ` Matt Spinler
@ 2019-06-10  3:18           ` Lei YU
  1 sibling, 0 replies; 24+ messages in thread
From: Lei YU @ 2019-06-10  3:18 UTC (permalink / raw)
  To: Adriana Kobylak; +Cc: Matt Spinler, OpenBMC Maillist, openbmc

> I see it as having a single tarball file that has the required files to
> update the
> BMC and the PSU. When this tarball is uploaded, then a new Version with
> a Purpose
> of System or some other name is created. When this Version is activated,
> this
> triggers the BMC updater (existing) and the PSU updater (new) to check
> if all
> the necessary files to perform the update of their component exist. If
> yes, each
> updater updates their piece and if any one fails it'd mark the Version
> as Failed
> (TBD on synchronizing the updaters to mark the Version as Active or
> Failed).
> So the PSU would be updated at the same time as the BMC, but done by its
> own
> updater application.
>
> Thoughts?

A single tarball containing multiple images for multiple purposes is not
implemented yet.
So this design doc is not expected to cover that. Is that OK?

As long as the PSU purpose is implemented correctly with the same interface,
it should not be difficult to support it in the multiple-purpose case.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [Design] PSU firmware update
  2019-06-10  3:16               ` Lei YU
@ 2019-06-10 20:43                 ` Derek Howard
  0 siblings, 0 replies; 24+ messages in thread
From: Derek Howard @ 2019-06-10 20:43 UTC (permalink / raw)
  To: Lei YU; +Cc: OpenBMC Maillist


On 6/9/2019 10:16 PM, Lei YU wrote:
>>> 3 more quick notes:
>>>
>>> 1) PSs can be hot pluggable, so when a new one is detected, the code
>>> update should run then too if the new PS needs one, assuming all other
>>> conditions are met.
>>>
>>> 2) A single system may support multiple models of PS (will definitely
>>> happen for us), so this design should be able to store multiple PS
>>> images and send the correct image to the correct model.
>>>
>>> 3) You mentioned the combined image stuff before.  We should just check
>>> the timeline for that support aligns with this one.
>>>
>>>
>> Good point Matt on the PS install.  It would probably be a good idea to
>> get the newly installed PS to the same image as the rest of the PS's in
>> the system.
> Yup, really good point.
> This implies that BMC shall keep a local copy of the PSU image for future
> updates.
>
>> We do support PS's that don't provide control supply (standby voltage)
>> when reset at the end of the update, while other PS's do.  Therefore for
>> the former case, if only 1 PS has AC attached, we cannot update/reset
>> that PS, so please let that be selectable by the user (eg vendor
>> specific tool).
> This is somewhat complex, but if we could defer this to a vendor-specific
> tool, that's OK.
> However, if a system has multiple models of PS, I am not sure what the
> vendor-specific tool would look like.
> Should we defer that to the vendor-specific tool, too?
Sounds fine to me, thanks.
>> Also, please provide a way to know that the updates have finished.  As
>> we don't want to update the PS's when the power is on (this is vendor
>> specific as well), we also do not want to power the system on in the
>> middle of an update.  For example, if after a BMC update the PS's are
>> being updated, we want to hold off the next system power on until the PS
>> updates have finished. Thanks.
> This is already supported by the existing interface.
Excellent!
>
>>>
>>>>> The reason I ask is because if we could get clear requirements, it
>>>>> is possible
>>>>> to simplify the design.
>> Would it be possible to support both methods?  The general use case
>> being done during/after BMC code update, but also support the more
>> manual method that could be used perhaps in the lab to test new psu
>> images or in the field if there are problems with an existing image? Thanks.
> This design doc will be updated to support both cases.
>


* [Design] PSU firmware update
       [not found] <mailman.2630.1559853029.4162.openbmc@lists.ozlabs.org>
@ 2019-06-10  8:28 ` yuan.li
  0 siblings, 0 replies; 24+ messages in thread
From: yuan.li @ 2019-06-10  8:28 UTC (permalink / raw)
  To: openbmc

>On 2019-06-05 22:31, Lei YU wrote:
>> On Wed, Jun 5, 2019 at 10:25 PM Matt Spinler <mspinler@linux.ibm.com> wrote:
>>> 
>>> 
>>> On 6/5/2019 1:18 AM, Lei YU wrote:
>>> >>> The PSU firmware code update will re-use the current interfaces to upload,
>>> >>> verify, and activate the image.
>>> >> We would like the option to be able to ship the PSU firmware as part of
>>> >> the BMC image (in the root filesystem). This means that it is already
>>> >> present and authenticated when the BMC boots. In this way, we know that
>>> >> the current BMC firmware plays well with the PSU firmware and have fewer
>>> >> variables to test for when making a release.
>>> > Because the PSU firmware is part of BMC image, this seems a completely
>>> > different approach, and more like part of BMC image update, is it?
>>> > I would expect this should not be part of this design, what do you think?
>>> 
>>> FYI, I am 99% sure this is how IBM needs its systems to work as well.
>>> That being the case, will you also be handling this design?
>> 
>> Good to know.
>> 
>> Then a question comes up:
>> In which cases PSU firmware update shall be done?
>> 1. It is updated together with BMC firmware update as described by Vernon
>>    Mauery;
>> 2. It is updated independently with APIs, as described in this design doc.
>>
>> Will 1 and 2 both be valid, or only 1 is the real case and we do not need to
>> support 2?
>> 
> 
> I see it as having a single tarball file that has the required files to
> update the BMC and the PSU. When this tarball is uploaded, then a new Version
> with a Purpose of System or some other name is created. When this Version is
> activated, this triggers the BMC updater (existing) and the PSU updater (new)
> to check if all the necessary files to perform the update of their component
> exist. If yes, each updater updates their piece and if any one fails it'd
> mark the Version as Failed (TBD on synchronizing the updaters to mark the
> Version as Active or Failed). So the PSU would be updated at the same time as
> the BMC, but done by its own updater application.
>
> Thoughts?
>

I have a different opinion on this. In current practice the image is not a
tarball that can be decompressed easily. The embedded BMC update image is
signed, and the PSU firmware is part of the root filesystem (as a file). In
this case the whole update flow would look like:
1. Upload and update the BMC firmware itself.
2. Boot into the new version of the BMC firmware.
3. The BMC reads the PSU firmware version from the PSU and compares it with
    the file shipped with this BMC firmware.
4. If an update is needed, the update tool is launched.

The benefit is that the PSU firmware update process is transparent to the end
user.
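
Steps 3 and 4 could look roughly like the sketch below. This is only an
illustration under stated assumptions: the function names, the dictionary of
installed versions and the `run_update_tool` callback are all hypothetical, and
"update needed" is taken to mean "the version differs from the shipped one",
which is only one possible policy.

```python
from typing import Callable, Dict, List


def psus_needing_update(installed: Dict[str, str], shipped_version: str) -> List[str]:
    """Return the names of PSUs whose running version differs from the shipped one.

    Assumption: any mismatch triggers an update, including a PSU that is
    running something newer than the shipped file.
    """
    return sorted(name for name, ver in installed.items() if ver != shipped_version)


def update_on_boot(
    installed: Dict[str, str],
    shipped_version: str,
    run_update_tool: Callable[[str], None],
) -> List[str]:
    """Launch the (hypothetical) update tool for every PSU that needs it."""
    pending = psus_needing_update(installed, shipped_version)
    for name in pending:
        run_update_tool(name)
    return pending
```

Here `installed` stands for the versions read back from each PSU after the BMC
has booted into the new firmware, and `shipped_version` for the version of the
PSU firmware file in the root filesystem.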

What do you think?

Yuan Li

>> The reason I ask is because if we could get clear requirements, it is
>> possible to simplify the design.




end of thread, other threads:[~2019-06-10 20:43 UTC | newest]

Thread overview: 24+ messages
2019-06-03  8:54 [Design] PSU firmware update Lei YU
2019-06-03 13:31 ` Andrew Geissler
2019-06-03 17:23   ` Neeraj Ladkani
2019-06-04  2:58     ` Lei YU
2019-06-04  6:43       ` Neeraj Ladkani
2019-06-04  7:02         ` Lei YU
2019-06-04  7:20           ` Neeraj Ladkani
2019-06-04  7:28       ` Alexander Amelkin
2019-06-04 18:22       ` Vernon Mauery
2019-06-04  2:42   ` Lei YU
2019-06-03 20:48 ` Derek Howard
2019-06-04  3:19   ` Lei YU
2019-06-04 18:26 ` Vernon Mauery
2019-06-05  6:18   ` Lei YU
2019-06-05 14:25     ` Matt Spinler
2019-06-05 14:42       ` Adriana Kobylak
2019-06-06  3:31       ` Lei YU
2019-06-06 20:31         ` Adriana Kobylak
2019-06-07 14:35           ` Matt Spinler
2019-06-07 15:52             ` Derek Howard
2019-06-10  3:16               ` Lei YU
2019-06-10 20:43                 ` Derek Howard
2019-06-10  3:18           ` Lei YU
     [not found] <mailman.2630.1559853029.4162.openbmc@lists.ozlabs.org>
2019-06-10  8:28 ` yuan.li
