* Exposing POST codes
@ 2018-02-28 18:56 Patrick Venture
  2018-03-01 17:31 ` Patrick Venture
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Patrick Venture @ 2018-02-28 18:56 UTC (permalink / raw)
  To: OpenBMC Maillist

I talked to Nancy and I think it's time to revisit previous conversations
about POST codes.  We have a simple daemon that exposes the information
over Dbus associated with https://gerrit.openbmc-project.xyz/#/c/5006

If there's interest, I can stage it against skeleton for review and comment.

We have a patch to the kernel character device for the aspeed-lpc-snoop
that is required and enables reading the post codes.
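
A minimal sketch of how a userspace reader might consume such a device,
assuming the patched driver hands out one byte per POST code. The device
path and the helper names are assumptions for illustration and are not
taken from the patch:

```python
import io
from typing import BinaryIO, Iterator

# Assumed device node; the real name depends on the kernel patch and board.
SNOOP_DEVICE = "/dev/aspeed-lpc-snoop0"


def iter_post_codes(stream: BinaryIO) -> Iterator[int]:
    """Yield POST codes one byte at a time until the stream ends."""
    while True:
        chunk = stream.read(1)
        if not chunk:  # end of data; a real device would normally block instead
            return
        yield chunk[0]


def format_code(code: int) -> str:
    """Render a code as two hex digits, e.g. 0x9b."""
    return f"0x{code:02x}"


if __name__ == "__main__":
    # Stand-in for open(SNOOP_DEVICE, "rb", buffering=0) so the sketch runs anywhere.
    fake_device = io.BytesIO(bytes([0x35, 0x72, 0x9B]))
    print([format_code(c) for c in iter_post_codes(fake_device)])
```

On real hardware the BytesIO stand-in would be replaced by the opened
device node, and the daemon would publish each code instead of printing it.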

Patrick

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Exposing POST codes
  2018-02-28 18:56 Exposing POST codes Patrick Venture
@ 2018-03-01 17:31 ` Patrick Venture
  2018-03-05  5:19 ` Brad Bishop
  2018-04-14  0:47 ` Timothy Pearson
  2 siblings, 0 replies; 12+ messages in thread
From: Patrick Venture @ 2018-03-01 17:31 UTC (permalink / raw)
  To: OpenBMC Maillist

Here's a patch staged that provides our snoop daemon source:

https://gerrit.openbmc-project.xyz/9298

I have to fix my git send-email to send out any kernel patches.

Patrick

On Wed, Feb 28, 2018 at 10:56 AM, Patrick Venture <venture@google.com> wrote:
> I talked to Nancy and I think it's time to revisit previous conversations
> about POST codes.  We have a simple daemon that exposes the information
> over Dbus associated with https://gerrit.openbmc-project.xyz/#/c/5006
>
> If there's interest, I can stage it against skeleton for review and comment.
>
> We have a patch to the kernel character device for the aspeed-lpc-snoop
> that is required and enables reading the post codes.
>
> Patrick

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Exposing POST codes
  2018-02-28 18:56 Exposing POST codes Patrick Venture
  2018-03-01 17:31 ` Patrick Venture
@ 2018-03-05  5:19 ` Brad Bishop
  2018-03-05 21:34   ` Tanous, Ed
  2018-04-14  0:47 ` Timothy Pearson
  2 siblings, 1 reply; 12+ messages in thread
From: Brad Bishop @ 2018-03-05  5:19 UTC (permalink / raw)
  To: Patrick Venture, Tanous, Ed, Mohit Gupta (QDT)
  Cc: OpenBMC Maillist, Joel Stanley

Hi Patrick

Thanks for bringing this up again.

> On Feb 28, 2018, at 1:56 PM, Patrick Venture <venture@google.com> wrote:
> 
> I talked to Nancy and I think it's time to revisit previous conversations
> about POST codes.  We have a simple daemon that exposes the information
> over Dbus associated with https://gerrit.openbmc-project.xyz/#/c/5006

So we already have this xyz.openbmc_project.State.Boot.Progress DBus
interface (and others).  This is what we implemented on POWER.  Can
we talk about how that API falls short for your use case(s)?  Can I fix
it to enable your use case(s)?

My concern is I have to at some point implement this API on POWER
so that POWER can take advantage of applications targeting your new API.
If doing that would be nonsensical then this probably shouldn’t be an
API, or at least not in the common xyz namespace.

On POWER, this kind of information comes down through the IPMI boot
progress indicator.  Could someone briefly educate me on what the
difference between the IPMI boot progress indicator and POST codes
is on x86?  Is there overlap or are they used for different things?

If we went this route I’d be inclined to scrap the existing
xyz.openbmc_project.State.Boot.Progress API and implement
your proposed API on POWER, again, so that POWER can make use of the
applications targeting this API, if the community is unable to write
applications targeting the existing one.  That would mean changing
code that today takes actions on meaningful enumerations to instead
take actions on platform specific numbers.  Does that seem like
the right thing for me to do?  Or does it make sense for the project to
have both of these APIs?  If that is the case can you speculate on what
situations an application would target one API over the other?

It just seems like the first thing anyone is going to do with these
numbers is look them up and map them to something.  Wouldn’t it make
sense to have done that mapping already at the API level so that
every user and piece of code using this API doesn’t have to do it
themselves?

Mohit - do ARM chips have POST code-like functionality?  Can you
conceive of a way to write an application that could implement
the proposed API on an ARM server?

> 
> If there's interest, I can stage it against skeleton for review and comment.

I think what I’m hinting at here is that you could add a per-platform config
file to your app that maps the codes to some enumerations in the DBus
interface, and apply that mapping before you emit the signal. If you
wanted to go back to numbers later you could just reverse the mapping
using the same config file.  Please poke holes.
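
The mapping idea above could be sketched roughly as follows. The
enumeration values, the JSON layout and the function names are all
hypothetical; they are not the real Boot.Progress values or any existing
config format:

```python
import json
from enum import Enum
from typing import Dict, Optional


class BootStage(Enum):
    """Hypothetical enumeration; not the real Boot.Progress values."""
    UNSPECIFIED = "Unspecified"
    MEMORY_INIT = "MemoryInit"
    PCI_INIT = "PCIInit"
    OS_START = "OSStart"


# Hypothetical per-platform config: raw code (hex string) -> stage name.
PLATFORM_CONFIG = '{"0x35": "MemoryInit", "0x72": "PCIInit", "0x9b": "OSStart"}'


def load_mapping(text: str) -> Dict[int, BootStage]:
    """Parse the config into a code -> stage table."""
    return {int(code, 16): BootStage(name) for code, name in json.loads(text).items()}


def to_stage(mapping: Dict[int, BootStage], code: int) -> BootStage:
    """Map a raw code to a stage; unmapped codes become UNSPECIFIED."""
    return mapping.get(code, BootStage.UNSPECIFIED)


def to_code(mapping: Dict[int, BootStage], stage: BootStage) -> Optional[int]:
    """Reverse lookup; returns None when no code maps to the stage."""
    for code, mapped in mapping.items():
        if mapped is stage:
            return code
    return None
```

One hole the sketch makes visible: the reverse lookup is only well defined
when each stage is reached from exactly one code, and every unmapped code
collapses to the same UNSPECIFIED value, so the original number is lost.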

-brad

> 
> We have a patch to the kernel character device for the aspeed-lpc-snoop
> that is required and enables reading the post codes.
> 
> Patrick

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Exposing POST codes
  2018-03-05  5:19 ` Brad Bishop
@ 2018-03-05 21:34   ` Tanous, Ed
  2018-03-05 21:35     ` Patrick Venture
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Tanous, Ed @ 2018-03-05 21:34 UTC (permalink / raw)
  To: 'Brad Bishop', Patrick Venture, Mohit Gupta (QDT)
  Cc: OpenBMC Maillist, Joel Stanley

I'm not Patrick, but the input on my use cases for this is below.


> So we already have this xyz.openbmc_project.State.Boot.Progress DBus
> interface (and others).  This is what we implemented on POWER.  Can we talk
> about how that API falls short for your use case(s)?  Can I fix it to enable your
> use case(s)?
> 
The enum definition is very limiting, and seems to assume that POST code definitions are the same across all platforms.  New POST codes are added all the time, to the point where even internal teams struggle to keep up with their definitions.  Having an API that would need to be changed for every POST code type seems error prone and very likely to be out of date.
If we adjusted the boot progress interface to be a string rather than an enum, that _could_ meet the use case, but that doesn't seem to be the purpose of that API.

> My concern is I have to at some point implement this API on POWER so that
> POWER can take advantage of applications targeting your new API.
> If doing that would be non-sensical then this probably shouldn’t be an API, or
> at least not in the common xyz namespace.
Isn't this the reason for it being in an API all its own, so it can be optionally included?  I suspect any applications consuming it would likely be logging type applications, and any application using it to take action would be platform specific, even if it were a string.
Does POWER have any way of reporting detailed boot progress?  If, for example, the USB link training starts processor init flows, is that logged in a POWER system?  On x86, it would be logged as a POST code.

> 
> On POWER, this kind of information comes down through the IPMI boot
> progress indicator.  Could someone briefly educate me on what the
> difference between the IPMI boot progress indicator and POST codes are on
> x86?  Is there overlap or are they used for different things?
The POST code indicator is used for very fine grained feedback of boot progress, and is generally not looked at by a user unless there is a problem with boot.  At that point, it is generally (on my systems) used for debug of the system, as it gives some information about what steps the host system took before it failed. 
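
As a rough illustration of that debug use, a consumer could keep a short trail of the most recent codes so the steps leading up to a hang can be inspected afterwards. The class name and the default depth are assumptions made for the sketch:

```python
from collections import deque
from typing import List, Optional


class PostCodeHistory:
    """Keep only the most recent POST codes seen during a boot."""

    def __init__(self, depth: int = 8) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._codes = deque(maxlen=depth)

    def record(self, code: int) -> None:
        self._codes.append(code)

    def last(self) -> Optional[int]:
        """The code the host was on when it stopped, if any was seen."""
        return self._codes[-1] if self._codes else None

    def trail(self) -> List[int]:
        """Oldest-to-newest list of the retained codes."""
        return list(self._codes)
```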

> 
> It just seems like the first thing anyone is going to do with these numbers is
> look them up and map them to something.  Wouldn’t it make sense to have
> done that mapping already at the API level so that every user and piece of
> code using this API doesn’t have to do it themselves?
That seems like a reasonable assumption, but in practice it isn't always an option.  In general the POST code mappings are difficult to come by, especially in initial system bring-up, and that is when they are most valuable.  If attempted, the ability to provide a mapping should be made optional, which means the proposed interface still needs to exist.

> I think what I’m hinting at here is that you could add a per-platform config file
> to your app that maps the codes to some enumerations in the DBus
> interface, and apply that mapping before you emit the signal. If you wanted
> to go back to numbers later you could just reverse the mapping using the
> same config file.  Please poke holes.

I would argue that this functionality is outside the scope of Patrick's patch.  We could very clearly do as you're suggesting, but it would be error prone, and make per-platform configuration more difficult to port, and would likely take a number of months to get correct for all platforms.  As is, Patrick's patch adds value outside of his direct platform, as other teams would have an immediate use of it, and is very clear and clean to implement.  Building the platform configurable API you suggest would take a lot more time and effort, for only a little incremental value.  This seems like a case of "Perfect is the enemy of good".  As is, both the API and the daemon are things that I would use today on my platforms.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Exposing POST codes
  2018-03-05 21:34   ` Tanous, Ed
@ 2018-03-05 21:35     ` Patrick Venture
  2018-03-06  1:05     ` Stewart Smith
  2018-03-06  1:06     ` Brad Bishop
  2 siblings, 0 replies; 12+ messages in thread
From: Patrick Venture @ 2018-03-05 21:35 UTC (permalink / raw)
  To: Tanous, Ed; +Cc: Brad Bishop, guptam, OpenBMC Maillist, Joel Stanley

On Mon, Mar 5, 2018 at 1:34 PM Tanous, Ed <ed.tanous@intel.com> wrote:

> I'm not Patrick, but the input on my use cases for this is below.


> > So we already have this xyz.openbmc_project.State.Boot.Progress DBus
> > interface (and others).  This is what we implemented on POWER.  Can we
> > talk about how that API falls short for your use case(s)?  Can I fix it
> > to enable your use case(s)?
> >
> The enum definition is very limiting, and seems to assume that POST code
> definitions are the same across all platforms.  New POST codes are added
> all the time, to the point where even internal teams struggle to keep up
> with their definitions.  Having an API that would need to be changed for
> every POST code type seems error prone and very likely to be out of date.
> If we adjusted the boot progress interface to be a string rather than an
> enum, that _could_ meet the use case, but that doesn't seem to be the
> purpose of that API.

> > My concern is I have to at some point implement this API on POWER so
> > that POWER can take advantage of applications targeting your new API.
> > If doing that would be non-sensical then this probably shouldn’t be an
> > API, or at least not in the common xyz namespace.
> Isn't this the reason for it being in an API all its own, so it can be
> optionally included?  I suspect any applications consuming it would likely
> be logging type applications, and any application using it to take action
> would be platform specific, even if it were a string.
> Does POWER have any way of reporting detailed boot progress?  If, for
> example, the USB link training starts processor init flows, is that logged
> in a POWER system?  On x86, it would be logged as a POST code.

> >
> > On POWER, this kind of information comes down through the IPMI boot
> > progress indicator.  Could someone briefly educate me on what the
> > difference between the IPMI boot progress indicator and POST codes are
> > on x86?  Is there overlap or are they used for different things?
> The POST code indicator is used for very fine grained feedback of boot
> progress, and is generally not looked at by a user unless there is a
> problem with boot.  At that point, it is generally (on my systems) used for
> debug of the system, as it gives some information about what steps the host
> system took before it failed.

> >
> > It just seems like the first thing anyone is going to do with these
> > numbers is look them up and map them to something.  Wouldn’t it make
> > sense to have done that mapping already at the API level so that every
> > user and piece of code using this API doesn’t have to do it themselves?
> That seems like a reasonable assumption, but practically isn't always an
> option;  In general the POST code mappings are difficult to come by,
> especially in initial system bringup, and that is when they are most
> valuable.  If attempted, the ability to provide a mapping should be made
> optional, which means the proposed interface still needs to exist.

> > I think what I’m hinting at here is that you could add a per-platform
> > config file to your app that maps the codes to some enumerations in the
> > DBus interface, and apply that mapping before you emit the signal. If
> > you wanted to go back to numbers later you could just reverse the
> > mapping using the same config file.  Please poke holes.

> I would argue that this functionality is outside the scope of Patricks
> patch.  We could very clearly do as you're suggesting, but it would be
> error prone, and make per-platform configuration more difficult to port,
> and would likely take a number of months to get correct for all platforms.
> As is, Patricks patch adds value outside of his direct platform, as other
> teams would have an immediate use of it, and is very clear and clean to
> implement.  Building the platform configurable API you suggest would take a
> lot more time and effort, for only a little incremental value.  This seems
> like a case of "Perfect is the enemy of good".  As is, both the API and the
> daemon are things that I would use today on my platforms.

Thanks Ed for your thorough response.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: Exposing POST codes
  2018-03-05 21:34   ` Tanous, Ed
  2018-03-05 21:35     ` Patrick Venture
@ 2018-03-06  1:05     ` Stewart Smith
  2018-03-07  1:05       ` Rob Lippert
  2018-03-06  1:06     ` Brad Bishop
  2 siblings, 1 reply; 12+ messages in thread
From: Stewart Smith @ 2018-03-06  1:05 UTC (permalink / raw)
  To: Tanous, Ed, 'Brad Bishop', Patrick Venture, Mohit Gupta (QDT)
  Cc: OpenBMC Maillist

"Tanous, Ed" <ed.tanous@intel.com> writes:
> Does POWER have any way of reporting detailed boot progress?  If, for
> example, the USB link training starts processor init flows, is that
> logged in a POWER system?  On x86, it would be logged as a POST code.

On POWER (currently at least) there's a few things in play.

On OpenPOWER systems the only thing we currently actively communicate to
the BMC is the IPMI FW progress sensor, which isn't especially fine
grained, but it's what we have hooked up.

We do print out more detailed progress information to the console
though. What we print out to the console is roughly in two categories:
a) ISTEPs (probably the closest thing we have to POST codes, in that
   they're numbers), but these also have names because text is more
   descriptive than numbers.
b) log messages from OPAL (words, mostly around what we've probed/are
initing)

One thing to note about the istep numbers is that they can go
*backwards* if our firmware needs to do a reconfigure loop (e.g. we're
after a firmware update and needing to flash a seeprom inside the chip,
or we've discovered a problem with one of the cores and we're going to
disable it).
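
A small sketch of what that means for any consumer of these numbers: it
should not assume the sequence only increases. Treating istep numbers as
plain integers is a simplification made for the example, and the function
name is hypothetical:

```python
from typing import Iterable, List, Tuple


def find_regressions(steps: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (previous, current) pairs where the step number decreased."""
    regressions = []
    previous = None
    for current in steps:
        if previous is not None and current < previous:
            regressions.append((previous, current))
        previous = current
    return regressions
```

For the sequence 6, 7, 8, 6, 7, 8, 9 this reports one regression, from 8
back to 6, which a progress display would have to treat as a normal
reconfigure loop and not as an error.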

On the more enterprise-y POWER systems, there's SRC codes, which
are a set of incomprehensible hexadecimal numbers in a seemingly random
order designed to a) fit on a tiny LCD screen on the front of the
machine and b) not be strings that would have to be translated.
(I *always* have to google them, and even then, I don't think it helps)

If there's a problem during boot, we'd generally look at the console
output.... unless boot failure is *really* *REALLY* early, in which case
it's before we have any communications channel to the BMC open (and you
have to go and poke at the chip through one of the debug
interfaces... although we would like to improve this situation)

>> It just seems like the first thing anyone is going to do with these numbers is
>> look them up and map them to something.  Wouldn’t it make sense to have
>> done that mapping already at the API level so that every user and piece of
>> code using this API doesn’t have to do it themselves?
> That seems like a reasonable assumption, but practically isn't always
> an option;  In general the POST code mappings are difficult to come
> by, especially in initial system bringup, and that is when they are
> most valuable.  If attempted, the ability to provide a mapping should
> be made optional, which means the proposed interface still needs to
> exist.

What if it was a "number and/or string" kind of interface? Would that work? On
x86 if you only have the method of getting a number out, you could just
have the numbers (unless you have a mapping somewhere), but on POWER we
could hook this up to get a number and/or string from firmware.

>> I think what I’m hinting at here is that you could add a per-platform config file
>> to your app that maps the codes to some enumerations in the DBus
>> interface, and apply that mapping before you emit the signal. If you wanted
>> to go back to numbers later you could just reverse the mapping using the
>> same config file.  Please poke holes.
>
> I would argue that this functionality is outside the scope of Patricks
> patch.  We could very clearly do as you're suggesting, but it would be
> error prone, and make per-platform configuration more difficult to
> port, and would likely take a number of months to get correct for all
> platforms.  As is, Patricks patch adds value outside of his direct
> platform, as other teams would have an immediate use of it, and is
> very clear and clean to implement.  Building the platform configurable
> API you suggest would take a lot more time and effort, for only a
> little incremental value.  This seems like a case of "Perfect is the
> enemy of good".  As is, both the API and the daemon are things that I
> would use today on my platforms.

Would a universal interface look something like this:

- enum ProgressStages
  (to support things like IPMI fw progress, i.e. generic and well
  accepted what these mean)
- int (descriptive integer, platform specific, 0=unknown)
- string (descriptive, platform specific, can be null)

with each platform implementing whatever parts of that they can.

Looks like x86 post codes would go in the int, maybe a lookup table for
the string (if available).

For POWER, we'd poke the istep number into the int, and a description
into the string (from the host, some unknown mechanism to do that).
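
A minimal sketch of the three-part record described above, with each
platform filling in only the parts it has. The stage names and helper
functions are placeholders for illustration, not a proposed D-Bus
definition:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ProgressStage(Enum):
    """Placeholder generic stages; a real list would follow IPMI FW progress."""
    UNKNOWN = 0
    MEMORY_INIT = 1
    OS_BOOT = 2


@dataclass(frozen=True)
class BootProgress:
    stage: ProgressStage = ProgressStage.UNKNOWN
    code: int = 0                      # platform specific, 0 = unknown
    description: Optional[str] = None  # platform specific, may be absent


def from_x86_post_code(code: int,
                       names: Optional[Dict[int, str]] = None) -> BootProgress:
    """x86: the POST code goes in the int; the string comes from an optional table."""
    names = names or {}
    return BootProgress(code=code, description=names.get(code))


def from_power_istep(istep: int, name: str) -> BootProgress:
    """POWER: the istep number goes in the int, its name in the string."""
    return BootProgress(code=istep, description=name)
```

A platform on which 0 is itself a meaningful code would need a different
"unknown" marker than the one used here.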

thoughts?
-- 
Stewart Smith
OPAL Architect, IBM.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Exposing POST codes
  2018-03-05 21:34   ` Tanous, Ed
  2018-03-05 21:35     ` Patrick Venture
  2018-03-06  1:05     ` Stewart Smith
@ 2018-03-06  1:06     ` Brad Bishop
  2 siblings, 0 replies; 12+ messages in thread
From: Brad Bishop @ 2018-03-06  1:06 UTC (permalink / raw)
  To: Tanous, Ed; +Cc: Patrick Venture, Mohit Gupta (QDT), OpenBMC Maillist


> On Mar 5, 2018, at 4:34 PM, Tanous, Ed <ed.tanous@intel.com> wrote:
> 
> I'm not Patrick, but the input on my use cases for this is below.
> 
> 
>> So we already have this xyz.openbmc_project.State.Boot.Progress DBus
>> interface (and others).  This is what we implemented on POWER.  Can we talk
>> about how that API falls short for your use case(s)?  Can I fix it to enable your
>> use case(s)?
>> 
> The enum definition is very limiting, and seems to assume that POST code definitions are the same across all platforms.  New POST codes are added all the time, to the point where even internal teams struggle to keep up with their definitions.  Having an API that would need to be changed for every POST code type seems error prone and very likely to be out of date.
> If we adjusted the boot progress interface to be a string rather than an enum, that _could_ meet the use case, but that doesn't seem to be the purpose of that API.

Thanks Ed.  With the background it's apparent to me now that the
existing API doesn’t line up.

I just want to be careful about the content in our data model.  I’m
trying to make sure we don’t inadvertently establish a norm and send
the project down the path of something like:

platform A implements API a, b and c
platform B implements API d, e and f
platform C implements API g, h and i

Obviously a setup like that doesn’t lead to meaningful collaboration
or code reuse.

> 
>> My concern is I have to at some point implement this API on POWER so that
>> POWER can take advantage of applications targeting your new API.
>> If doing that would be non-sensical then this probably shouldn’t be an API, or
>> at least not in the common xyz namespace.
> Isn't this the reason for it being in an API all its own, so it can be optionally included?

I’m not sure.  We should probably define what exactly it means
to have something in the DBus data model.  When I hear optional
my thoughts jump to missing function on the platforms that don’t
implement it and/or lack of compatibility with the rest of the
project's code.  As an IBM guy, I know I want all the APIs that
the community comes up with to have implementations for POWER.

As I type this, I realize if I really wanted to I could emit the
istep number using this API.

>  I suspect any applications consuming it would likely be logging type applications, and any application using it to take action would be platform specific, even if it were a string.
> Does POWER have any way of reporting detailed boot progress?  If, for example, the USB link training starts processor init flows, is that logged in a POWER system?  On x86, it would be logged as a POST code.

We do!  It is the istep thing I mentioned above.  Before OpenBMC that
kind of data flowed from the BIOS with proprietary protocols over
proprietary transports to proprietary BMC chips.  First pass at
that on an OpenBMC was to just map a subset of those to IPMI boot
progress sensor updates.

> 
>> 
>> On POWER, this kind of information comes down through the IPMI boot
>> progress indicator.  Could someone briefly educate me on what the
>> difference between the IPMI boot progress indicator and POST codes are on
>> x86?  Is there overlap or are they used for different things?
> The POST code indicator is used for very fine grained feedback of boot progress, and is generally not looked at by a user unless there is a problem with boot.  At that point, it is generally (on my systems) used for debug of the system, as it gives some information about what steps the host system took before it failed. 
> 
>> 
>> It just seems like the first thing anyone is going to do with these numbers is
>> look them up and map them to something.  Wouldn’t it make sense to have
>> done that mapping already at the API level so that every user and piece of
>> code using this API doesn’t have to do it themselves?
> That seems like a reasonable assumption, but practically isn't always an option;  In general the POST code mappings are difficult to come by, especially in initial system bringup, and that is when they are most valuable.  If attempted, the ability to provide a mapping should be made optional, which means the proposed interface still needs to exist.
> 
>> I think what I’m hinting at here is that you could add a per-platform config file
>> to your app that maps the codes to some enumerations in the DBus
>> interface, and apply that mapping before you emit the signal. If you wanted
>> to go back to numbers later you could just reverse the mapping using the
>> same config file.  Please poke holes.
> 
> I would argue that this functionality is outside the scope of Patricks patch.  We could very clearly do as you're suggesting, but it would be error prone, and make per-platform configuration more difficult to port, and would likely take a number of months to get correct for all platforms.  As is, Patricks patch adds value outside of his direct platform, as other teams would have an immediate use of it, and is very clear and clean to implement.  Building the platform configurable API you suggest would take a lot more time and effort, for only a little incremental value.  This seems like a case of "Perfect is the enemy of good".  As is, both the API and the daemon are things that I would use today on my platforms.

This was really just a straw man to move the conversation along, but yeah
I tend to agree now that I have the background.  Thanks for explaining it
to me.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Exposing POST codes
  2018-03-06  1:05     ` Stewart Smith
@ 2018-03-07  1:05       ` Rob Lippert
  2018-03-13  6:48         ` Stewart Smith
  0 siblings, 1 reply; 12+ messages in thread
From: Rob Lippert @ 2018-03-07  1:05 UTC (permalink / raw)
  To: Stewart Smith
  Cc: Tanous, Ed, Brad Bishop, Patrick Venture, Mohit Gupta (QDT),
	OpenBMC Maillist

On Mon, Mar 5, 2018 at 5:05 PM, Stewart Smith <stewart@linux.vnet.ibm.com>
wrote:

> "Tanous, Ed" <ed.tanous@intel.com> writes:
> > Does POWER have any way of reporting detailed boot progress?  If, for
> > example, the USB link training starts processor init flows, is that
> > logged in a POWER system?  On x86, it would be logged as a POST code.
>
> On POWER (currently at least) there's a few things in play.
>
> On OpenPOWER systems the only thing we currently actively communicate to
> the BMC is the IPMI FW progress sensor, which isn't especially fine
> grained, but it's what we have hooked up.
>
> We do print out more detailed progress information to the console
> though. What we print out to the console is roughly in two categories:
> a) ISTEPs (probably the closest thing we have to POST codes, in that
>    they're numbers), but these also have names because text is more
>    descriptive than numbers.
> b) log messages from OPAL (words, mostly around what we've probed/are
> initing)
>
> One thing to note about the istep numbers is that they can go
> *backwards* if our firmware needs to do a reconfigure loop (e.g. we're
> after a firmware update and needing to flash a seeprom inside the chip,
> or we've discovered a problem with one of the cores and we're going to
> disable it).
>
> On the more enterprise-y POWER systems, there's SRC codes, which
> are a set of incomprehensible hexadecimal numbers in a seemingly random
> order designed to a) fit on a tiny LCD screen on the front of the
> machine and b) not be strings that would have to be translated.
> (I *always* have to google them, and even then, I don't think it helps)
>
> If there's a problem during boot, we'd generally look at the console
> output.... unless boot failure is *really* *REALLY* early, in which case
> it's before we have any communications channel to the BMC open (and you
> have to go and poke at the chip through one of the debug
> interfaces... although we would like to improve this situation)
>
> >> It just seems like the first thing anyone is going to do with these
> >> numbers is look them up and map them to something.  Wouldn’t it make
> >> sense to have done that mapping already at the API level so that every
> >> user and piece of code using this API doesn’t have to do it themselves?
> > That seems like a reasonable assumption, but practically isn't always
> > an option;  In general the POST code mappings are difficult to come
> > by, especially in initial system bringup, and that is when they are
> > most valuable.  If attempted, the ability to provide a mapping should
> > be made optional, which means the proposed interface still needs to
> > exist.
>
> What if it was a "number and/or string" kind of interface? Would that
> work? On x86 if you only have the method of getting a number out, you
> could just have the numbers (unless you have a mapping somewhere), but on
> POWER we could hook this up to get a number and/or string from firmware.
>
> >> I think what I’m hinting at here is that you could add a per-platform
> >> config file to your app that maps the codes to some enumerations in the
> >> DBus interface, and apply that mapping before you emit the signal. If
> >> you wanted to go back to numbers later you could just reverse the
> >> mapping using the same config file.  Please poke holes.
> >
> > I would argue that this functionality is outside the scope of Patricks
> > patch.  We could very clearly do as you're suggesting, but it would be
> > error prone, and make per-platform configuration more difficult to
> > port, and would likely take a number of months to get correct for all
> > platforms.  As is, Patricks patch adds value outside of his direct
> > platform, as other teams would have an immediate use of it, and is
> > very clear and clean to implement.  Building the platform configurable
> > API you suggest would take a lot more time and effort, for only a
> > little incremental value.  This seems like a case of "Perfect is the
> > enemy of good".  As is, both the API and the daemon are things that I
> > would use today on my platforms.
>
> Would a universal interface look something like this:
>
> - enum ProgressStages
>   (to support things like IPMI fw progress, i.e. generic and well
>   accepted what these mean)
> - int (descriptive integer, platform specific, 0=unknown)
> - string (descriptive, platform specific, can be null)
>
> with each platform implementing whatever parts of that they can.
>
> Looks like x86 post codes would go in the int, maybe a lookup table for
> the string (if available).
>
> For POWER, we'd poke the istep number into the int, and a description
> into the string (from the host, some unknown mechanism to do that).
>
> thoughts?
>

I implemented port 80h POST codes for POWER9 hostboot a while back:
https://github.com/open-power/hostboot/blob/c93bef31ae6ce781f9e0a11bb9224b6728ff120f/src/usr/initservice/istepdispatcher/istepdispatcher.C#L2312

On Zaius machines we are using that support with Patrick's snoop daemon and
a separate daemon that receives the code via dbus and outputs it over the
front 7seg debug display.
It has proven useful for getting early error/debug reports from technicians
at scale e.g. "5 machines stopped at code 35h, 2 at 72h" provides a quick
overview of what the problems are for me to debug further (since I have the
decoder ring, and the istep names would be useless to them anyways).
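
A sketch of how such a fleet summary could be produced from the last code
reported by each machine. The machine names and the input shape (a dict of
machine name to last code) are assumptions for the example:

```python
from collections import Counter
from typing import Dict


def summarize(last_codes: Dict[str, int]) -> str:
    """Summarize how many machines stopped at each code, most common first."""
    counts = Counter(last_codes.values())
    parts = [f"{n} at {code:02X}h" for code, n in counts.most_common()]
    return ", ".join(parts)


if __name__ == "__main__":
    stuck = {"m1": 0x35, "m2": 0x35, "m3": 0x72, "m4": 0x35,
             "m5": 0x35, "m6": 0x72, "m7": 0x35}
    print(summarize(stuck))
```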

-Rob

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Exposing POST codes
  2018-03-07  1:05       ` Rob Lippert
@ 2018-03-13  6:48         ` Stewart Smith
  2018-03-26 22:57           ` Rob Lippert
  0 siblings, 1 reply; 12+ messages in thread
From: Stewart Smith @ 2018-03-13  6:48 UTC (permalink / raw)
  To: Rob Lippert
  Cc: Tanous, Ed, Brad Bishop, Patrick Venture, Mohit Gupta (QDT),
	OpenBMC Maillist

Rob Lippert <rlippert@google.com> writes:
> I implemented port 80h POST codes for POWER9 hostboot a while back:
> https://github.com/open-power/hostboot/blob/c93bef31ae6ce781f9e0a11bb9224b6728ff120f/src/usr/initservice/istepdispatcher/istepdispatcher.C#L2312
>
> On Zaius machines we are using that support with Patrick's snoop daemon and
> a separate daemon that receives the code via dbus and outputs it over the
> front 7seg debug display.
> It has proven useful for getting early error/debug reports from technicians
> at scale e.g. "5 machines stopped at code 35h, 2 at 72h" provides a quick
> overview of what the problems are for me to debug further (since I have the
> decoder ring, and the istep names would be useless to them anyways).

Neat!

Anything we should add to skiboot or petitboot environment for this? Or
do we not fail in IPL enough to warrant it?

-- 
Stewart Smith
OPAL Architect, IBM.


* Re: Exposing POST codes
  2018-03-13  6:48         ` Stewart Smith
@ 2018-03-26 22:57           ` Rob Lippert
  2018-04-03  6:08             ` Stewart Smith
  0 siblings, 1 reply; 12+ messages in thread
From: Rob Lippert @ 2018-03-26 22:57 UTC (permalink / raw)
  To: Stewart Smith
  Cc: Tanous, Ed, Brad Bishop, Patrick Venture, Mohit Gupta (QDT),
	OpenBMC Maillist

On Mon, Mar 12, 2018 at 11:48 PM, Stewart Smith <stewart@linux.vnet.ibm.com>
wrote:

> Rob Lippert <rlippert@google.com> writes:
> > I implemented port 80h POST codes for POWER9 hostboot a while back:
> > https://github.com/open-power/hostboot/blob/c93bef31ae6ce781f9e0a11bb9224b6728ff120f/src/usr/initservice/istepdispatcher/istepdispatcher.C#L2312
> >
> > On Zaius machines we are using that support with Patrick's snoop daemon and
> > a separate daemon that receives the code via dbus and outputs it over the
> > front 7seg debug display.
> > It has proven useful for getting early error/debug reports from technicians
> > at scale e.g. "5 machines stopped at code 35h, 2 at 72h" provides a quick
> > overview of what the problems are for me to debug further (since I have the
> > decoder ring, and the istep names would be useless to them anyways).
>
> Neat!
>
> Anything we should add to skiboot or petitboot environment for this? Or
> do we not fail in IPL enough to warrant it?
>

I've never seen a hang in skiboot/petitboot, so I haven't done the work
there to add POST codes yet...

If you look at the picture published at the OpenPOWER conference you can see
the POST code display on the machines in a rack:
https://www.top500.org/news/openpower-gathers-momentum-with-major-deployments/

All the machines in that picture are at 0x9b, which is the end of hostboot,
aka "good" :)
(except the one that seems to be soft off for some reason... oops)
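
For illustration, a rough sketch of how a display daemon could turn a code
such as 0x9b into two 7-segment digit patterns. This is not the Zaius
implementation: the segment bit layout (bit 0 = segment a through bit 6 =
segment g, active high) and the `encode_post_code` helper are assumptions,
and real hardware may wire or drive the display differently.

```python
# Conventional hex-digit patterns for a 7-segment display.
# Assumed bit layout: bit 0 = a, bit 1 = b, ..., bit 6 = g (active high).
SEGMENTS = (
    0x3F, 0x06, 0x5B, 0x4F,  # 0 1 2 3
    0x66, 0x6D, 0x7D, 0x07,  # 4 5 6 7
    0x7F, 0x6F, 0x77, 0x7C,  # 8 9 A b
    0x39, 0x5E, 0x79, 0x71,  # C d E F
)


def encode_post_code(code):
    """Return (left_digit, right_digit) segment patterns for one byte."""
    if not 0 <= code <= 0xFF:
        raise ValueError(f"POST code out of range: {code!r}")
    # High nibble goes on the left digit, low nibble on the right.
    return SEGMENTS[code >> 4], SEGMENTS[code & 0x0F]


left, right = encode_post_code(0x9B)
print(f"left={left:#04x} right={right:#04x}")
```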

-Rob


* Re: Exposing POST codes
  2018-03-26 22:57           ` Rob Lippert
@ 2018-04-03  6:08             ` Stewart Smith
  0 siblings, 0 replies; 12+ messages in thread
From: Stewart Smith @ 2018-04-03  6:08 UTC (permalink / raw)
  To: Rob Lippert
  Cc: Tanous, Ed, Brad Bishop, Patrick Venture, Mohit Gupta (QDT),
	OpenBMC Maillist

Rob Lippert <rlippert@google.com> writes:
> On Mon, Mar 12, 2018 at 11:48 PM, Stewart Smith <stewart@linux.vnet.ibm.com>
> wrote:
>
>> Rob Lippert <rlippert@google.com> writes:
>> > I implemented port 80h POST codes for POWER9 hostboot a while back:
>> > https://github.com/open-power/hostboot/blob/c93bef31ae6ce781f9e0a11bb9224b6728ff120f/src/usr/initservice/istepdispatcher/istepdispatcher.C#L2312
>> >
>> > On Zaius machines we are using that support with Patrick's snoop daemon and
>> > a separate daemon that receives the code via dbus and outputs it over the
>> > front 7seg debug display.
>> > It has proven useful for getting early error/debug reports from technicians
>> > at scale e.g. "5 machines stopped at code 35h, 2 at 72h" provides a quick
>> > overview of what the problems are for me to debug further (since I have the
>> > decoder ring, and the istep names would be useless to them anyways).
>>
>> Neat!
>>
>> Anything we should add to skiboot or petitboot environment for this? Or
>> do we not fail in IPL enough to warrant it?
>>
>
> I've never seen a hang in skiboot/petitboot yet so haven't done the work
> there to add POST codes yet...

We'll have to try harder :)

I manage to get us to die in plenty of ways, so maybe you're just lucky
enough to not pick things up when that's the case.

-- 
Stewart Smith
OPAL Architect, IBM.


* Re: Exposing POST codes
  2018-02-28 18:56 Exposing POST codes Patrick Venture
  2018-03-01 17:31 ` Patrick Venture
  2018-03-05  5:19 ` Brad Bishop
@ 2018-04-14  0:47 ` Timothy Pearson
  2 siblings, 0 replies; 12+ messages in thread
From: Timothy Pearson @ 2018-04-14  0:47 UTC (permalink / raw)
  To: Patrick Venture; +Cc: OpenBMC Maillist

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sounds like multiple folks have been working on similar code.  Here's ours:

https://git.raptorcs.com/git/talos-skeleton/log/pyiplobserver?h=04-13-2018

The current version just parses the hostboot codes from the serial
console.  It used to use pdbg, but over time we found that using pdbg,
even with the BMC kernel driver, destabilizes the IPL process at key
points.  It does, however, also include state monitoring via BMC mailbox
bits for skiboot and skiroot, allowing the BMC to know when the full IPL
process is complete.

I like the approach of hostboot sending data over LPC, and we're willing
to rework the observer daemon to use that approach.  Is there any way to
send status codes over LPC from the SBE?
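
As a rough idea of what the LPC approach could look like on the observer
side, here is a sketch that consumes POST codes from a byte stream.  It
assumes the snoop character device hands back one byte per snooped write;
the device path in the comment and the helpers `read_post_codes` and
`last_code` are hypothetical, and an in-memory buffer stands in for the
device so the sketch runs anywhere.

```python
import io


def read_post_codes(stream, chunk_size=64):
    """Yield POST codes (ints 0-255) from a binary stream until EOF."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Iterating over a bytes object yields one int per byte.
        yield from chunk


def last_code(stream):
    """Return the most recent POST code in the stream, or None if empty."""
    code = None
    for code in read_post_codes(stream):
        pass
    return code


# Stand-in for the device.  On a BMC this might instead be something like
# open("/dev/aspeed-lpc-snoop0", "rb", buffering=0); that path is an
# assumption and should be checked against the kernel in use.
fake_device = io.BytesIO(bytes([0x01, 0x35, 0x72, 0x9B]))
codes = list(read_post_codes(fake_device))
print([f"{c:02X}h" for c in codes])
```

A real character device may block on read rather than report end of file,
so a daemon would normally poll it or read it from a dedicated loop.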

Thanks!

On 02/28/2018 12:56 PM, Patrick Venture wrote:
> I talked to Nancy and I think it's time to revisit previous conversations
> about POST codes.  We have a simple daemon that exposes the information
> over Dbus associated with https://gerrit.openbmc-project.xyz/#/c/5006
> 
> If there's interest, I can stage it against skeleton for review and comment.
> 
> We have a patch to the kernel character device for the aspeed-lpc-snoop
> that is required and enables reading the post codes.
> 
> Patrick


- -- 
Timothy Pearson
Raptor Engineering
+1 (415) 727-8645 (direct line)
+1 (512) 690-0200 (switchboard)
https://www.raptorengineering.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJa0U+sAAoJEK+E3vEXDOFbOM4H/3WJK9SbA6ly4Q/CioC9Yz9L
dzwsKSClKkK1pHru86wPYHadAmatn/ksA5dl4pEzQlfh5RKGBAaoFdVdVG/BdbVA
upKn9WBudQyLCHFWtD3xg4vzX3cCsi3hFgvgGIKkdrxj4uMv+56Fp5Fwh2eWuun8
ITadBt1LVdZlZT7z56zi2gt0eC4QSaAIBjBXq4KYIhXdc/xX9suw7pLKHAfxpb3v
ygSlE+SGeD1jZQfA5y2CxhOYyl+ac3AmCHXlDe70WQgbyPHU6aohuJaoh1+AtJeo
ys2mULcL9Sz2y8YTBXbEQ2QVMCJHw2BpKBmy6gxuDKChUNndqNJBselSVu7jrXU=
=OWnM
-----END PGP SIGNATURE-----


Thread overview: 12+ messages
2018-02-28 18:56 Exposing POST codes Patrick Venture
2018-03-01 17:31 ` Patrick Venture
2018-03-05  5:19 ` Brad Bishop
2018-03-05 21:34   ` Tanous, Ed
2018-03-05 21:35     ` Patrick Venture
2018-03-06  1:05     ` Stewart Smith
2018-03-07  1:05       ` Rob Lippert
2018-03-13  6:48         ` Stewart Smith
2018-03-26 22:57           ` Rob Lippert
2018-04-03  6:08             ` Stewart Smith
2018-03-06  1:06     ` Brad Bishop
2018-04-14  0:47 ` Timothy Pearson
