All of lore.kernel.org
 help / color / mirror / Atom feed
* Feature request in phosphor-pid-control
@ 2018-06-14  7:24 Lei YU
  2018-06-14 16:41 ` Patrick Venture
  2018-06-14 17:21 ` Tanous, Ed
  0 siblings, 2 replies; 9+ messages in thread
From: Lei YU @ 2018-06-14  7:24 UTC (permalink / raw)
  To: OpenBMC Maillist, Patrick Venture

[-- Attachment #1: Type: text/plain, Size: 1286 bytes --]

This email is about a feature request in phosphor-pid-control, that to
support
optional sensors.
It is mentioned in comment of https://gerrit.openbmc-project.xyz/#/c/11003/

Background:
Current phosphor-pid-control checks the temperature sensors and calculate
the
expected fan speed.
It assumes the temperature sensors are always available. And if the sensor
is
not there, the service throws exceptions.

The feature request:
I would like phosphor-pid-control to support "optional" sensors, that it may
be unavailable when system is running.
Such sensor is counted when it is available, and is ignored when it is
missing.

An use case:
I want to adjust fan speed based on cpu core or dimm temperature.
For a typical OpenPOWER P9 system, there could be at most 48 cores, but they
can be missing depending on the CPU, e.g. if a system is plugged CPUs with
22
cores, 4 cores temperature will be missing. And in run time, a core may be
garded if it has problem it will be missing as well.
The same for dimms, for a system with at most 16 dimms, it depends on how
many
dimms are attached.
In such case, the config will specify all the temperatures for cpu cores and
dimms, and phosphor-pid-control shall ignore the missing temperatures.

Patrick, what do you think about such case?

Thanks!

[-- Attachment #2: Type: text/html, Size: 1699 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Feature request in phosphor-pid-control
  2018-06-14  7:24 Feature request in phosphor-pid-control Lei YU
@ 2018-06-14 16:41 ` Patrick Venture
  2018-06-15  2:20   ` Lei YU
  2018-06-14 17:21 ` Tanous, Ed
  1 sibling, 1 reply; 9+ messages in thread
From: Patrick Venture @ 2018-06-14 16:41 UTC (permalink / raw)
  To: Lei YU; +Cc: OpenBMC Maillist

It follows a similar downstream feature where a sensor being
"unavailable" because the mapper is swamped shouldn't except the
program.  The only difference here is that this is a sensor which also
needs to be unable to throw the system into failsafe mode.  The system
starts in failsafe mode until it's heard from all the sensors it's
expecting.  This would need to check to see if that sensor supports
the notion of a failsafe timeout -- which is already supported.

A sensor triggers failsafe IFF the value is older than the timeout
value and the timeout value is non-zero.  So, in theory, the
initialization loop could check that the timeout value for a sensor is
non-zero before adding it to the fail-safe check-in list.

So, this change really is just, "don't fail if you can't find the
sensor during start-up, and keep trying until it shows up"

Does that follow?

Patrick

On Thu, Jun 14, 2018 at 12:24 AM, Lei YU <mine260309@gmail.com> wrote:
> This email is about a feature request in phosphor-pid-control, that to
> support
> optional sensors.
> It is mentioned in comment of https://gerrit.openbmc-project.xyz/#/c/11003/
>
> Background:
> Current phosphor-pid-control checks the temperature sensors and calculate
> the
> expected fan speed.
> It assumes the temperature sensors are always available. And if the sensor
> is
> not there, the service throws exceptions.
>
> The feature request:
> I would like phosphor-pid-control to support "optional" sensors, that it may
> be unavailable when system is running.
> Such sensor is counted when it is available, and is ignored when it is
> missing.
>
> An use case:
> I want to adjust fan speed based on cpu core or dimm temperature.
> For a typical OpenPOWER P9 system, there could be at most 48 cores, but they
> can be missing depending on the CPU, e.g. if a system is plugged CPUs with
> 22
> cores, 4 cores temperature will be missing. And in run time, a core may be
> garded if it has problem it will be missing as well.
> The same for dimms, for a system with at most 16 dimms, it depends on how
> many
> dimms are attached.
> In such case, the config will specify all the temperatures for cpu cores and
> dimms, and phosphor-pid-control shall ignore the missing temperatures.
>
> Patrick, what do you think about such case?
>
> Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: Feature request in phosphor-pid-control
  2018-06-14  7:24 Feature request in phosphor-pid-control Lei YU
  2018-06-14 16:41 ` Patrick Venture
@ 2018-06-14 17:21 ` Tanous, Ed
  2018-06-14 18:36   ` Brad Bishop
  1 sibling, 1 reply; 9+ messages in thread
From: Tanous, Ed @ 2018-06-14 17:21 UTC (permalink / raw)
  To: Lei YU, OpenBMC Maillist, Patrick Venture

We've had a few discussions about this in previous code reviews.  I think the conclusion was that we need to add a "status" field/interface to sensors, so we can report OK/Unavailable/Error statuses outside of the existing value interface.  So far as I know, no one has proposed one of these yet, but I would happily help review a patch.

Also, (not Patrick) I would support a parameter in phosphor-pid-control that makes sensors optional.  The only part that worries me here is that there is a big difference at a system level between "unavailable" and "faulted", but I think that could be easily fixed in a future patchset if we get the full DBus support for status interfaces.

-Ed





From: openbmc [mailto:openbmc-bounces+ed.tanous=intel.com@lists.ozlabs.org] On Behalf Of Lei YU
Sent: Thursday, June 14, 2018 12:24 AM
To: OpenBMC Maillist <openbmc@lists.ozlabs.org>; Patrick Venture <venture@google.com>
Subject: Feature request in phosphor-pid-control

This email is about a feature request in phosphor-pid-control, that to support
optional sensors.
It is mentioned in comment of https://gerrit.openbmc-project.xyz/#/c/11003/

Background:
Current phosphor-pid-control checks the temperature sensors and calculate the
expected fan speed.
It assumes the temperature sensors are always available. And if the sensor is
not there, the service throws exceptions.

The feature request:
I would like phosphor-pid-control to support "optional" sensors, that it may
be unavailable when system is running.
Such sensor is counted when it is available, and is ignored when it is
missing.

An use case:
I want to adjust fan speed based on cpu core or dimm temperature.
For a typical OpenPOWER P9 system, there could be at most 48 cores, but they
can be missing depending on the CPU, e.g. if a system is plugged CPUs with 22
cores, 4 cores temperature will be missing. And in run time, a core may be
garded if it has problem it will be missing as well.
The same for dimms, for a system with at most 16 dimms, it depends on how many
dimms are attached.
In such case, the config will specify all the temperatures for cpu cores and
dimms, and phosphor-pid-control shall ignore the missing temperatures.

Patrick, what do you think about such case?

Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Feature request in phosphor-pid-control
  2018-06-14 17:21 ` Tanous, Ed
@ 2018-06-14 18:36   ` Brad Bishop
  2018-06-14 18:44     ` Tanous, Ed
  0 siblings, 1 reply; 9+ messages in thread
From: Brad Bishop @ 2018-06-14 18:36 UTC (permalink / raw)
  To: Tanous, Ed; +Cc: Lei YU, OpenBMC Maillist, Patrick Venture

> 
> On Jun 14, 2018, at 1:21 PM, Tanous, Ed <ed.tanous@intel.com> wrote:
> 
> So far as I know, no one has proposed one of these yet, but I would happily help review a patch.

I’m pretty sure Matt Barth added support for xyz.openbmc_project.State.Decorator.OperationalStatus
to phosphor-hwmon already.  It certainly doesn’t cover something like mapper timeouts though (nor
should it, IMO).

> 
> Also, (not Patrick) I would support a parameter in phosphor-pid-control that makes sensors optional.  The only part that worries me here is that there is a big difference at a system level between "unavailable" and "faulted", but I think that could be easily fixed in a future patchset if we get the full DBus support for status interfaces.
> 
> -Ed

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: Feature request in phosphor-pid-control
  2018-06-14 18:36   ` Brad Bishop
@ 2018-06-14 18:44     ` Tanous, Ed
  0 siblings, 0 replies; 9+ messages in thread
From: Tanous, Ed @ 2018-06-14 18:44 UTC (permalink / raw)
  To: Brad Bishop; +Cc: Lei YU, OpenBMC Maillist, Patrick Venture

> 
> I’m pretty sure Matt Barth added support for
> xyz.openbmc_project.State.Decorator.OperationalStatus
> to phosphor-hwmon already.  It certainly doesn’t cover something like
> mapper timeouts though (nor should it, IMO).
> 

Interesting.....  I had no idea.  That I think that solves half of the issue (operational versus not), but still seems to miss the distinction between unavailable and error.

With that said, it sounds like it might solve the issue in the context of this discussion, so that's pretty cool.

-Ed

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Feature request in phosphor-pid-control
  2018-06-14 16:41 ` Patrick Venture
@ 2018-06-15  2:20   ` Lei YU
  2018-08-30 20:17     ` Patrick Venture
  0 siblings, 1 reply; 9+ messages in thread
From: Lei YU @ 2018-06-15  2:20 UTC (permalink / raw)
  To: Patrick Venture; +Cc: OpenBMC Maillist

> So, this change really is just, "don't fail if you can't find the
> sensor during start-up, and keep trying until it shows up"
>
> Does that follow?

Pretty much close.
More like "in any case, don't fail if you can't find the the sensor, and keep
trying".
Note this is applicable to sensors that are unavailable.

For sensors that are expected to be treated as fault, it should trigger
failsafe, and this is discussed in Ed's mail.

Putting it together, the feature request is splitted:
1. For sensors without operation status, we can config them as optional, so
   they do not trigger failsafe;
2. For sensors with operation status, do not trigger failsafe if it is
   unavailable; trigger failsafe if it is fault.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Feature request in phosphor-pid-control
  2018-06-15  2:20   ` Lei YU
@ 2018-08-30 20:17     ` Patrick Venture
  2018-08-30 20:25       ` Patrick Venture
  0 siblings, 1 reply; 9+ messages in thread
From: Patrick Venture @ 2018-08-30 20:17 UTC (permalink / raw)
  To: Lei YU; +Cc: OpenBMC Maillist

On Thu, Jun 14, 2018 at 7:21 PM Lei YU <mine260309@gmail.com> wrote:
>
> > So, this change really is just, "don't fail if you can't find the
> > sensor during start-up, and keep trying until it shows up"
> >
> > Does that follow?
>
> Pretty much close.
> More like "in any case, don't fail if you can't find the the sensor, and keep
> trying".
> Note this is applicable to sensors that are unavailable.
>
> For sensors that are expected to be treated as fault, it should trigger
> failsafe, and this is discussed in Ed's mail.
>
> Putting it together, the feature request is splitted:
> 1. For sensors without operation status, we can config them as optional, so
>    they do not trigger failsafe;
> 2. For sensors with operation status, do not trigger failsafe if it is
>    unavailable; trigger failsafe if it is fault.

Just bumping this back to the top of my list.  I don't know that I'll
have to time to restructure the initialization of sensors until a few
weeks from now.  However, I'm bumping this now because I'm doing some
phosphor-pid-control work again and it's a good time to think about
these features -- since I'm already cracking it open.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Feature request in phosphor-pid-control
  2018-08-30 20:17     ` Patrick Venture
@ 2018-08-30 20:25       ` Patrick Venture
  0 siblings, 0 replies; 9+ messages in thread
From: Patrick Venture @ 2018-08-30 20:25 UTC (permalink / raw)
  To: Lei YU; +Cc: OpenBMC Maillist

On Thu, Aug 30, 2018 at 1:17 PM Patrick Venture <venture@google.com> wrote:
>
> On Thu, Jun 14, 2018 at 7:21 PM Lei YU <mine260309@gmail.com> wrote:
> >
> > > So, this change really is just, "don't fail if you can't find the
> > > sensor during start-up, and keep trying until it shows up"
> > >
> > > Does that follow?
> >
> > Pretty much close.
> > More like "in any case, don't fail if you can't find the the sensor, and keep
> > trying".
> > Note this is applicable to sensors that are unavailable.
> >
> > For sensors that are expected to be treated as fault, it should trigger
> > failsafe, and this is discussed in Ed's mail.
> >
> > Putting it together, the feature request is splitted:
> > 1. For sensors without operation status, we can config them as optional, so
> >    they do not trigger failsafe;
> > 2. For sensors with operation status, do not trigger failsafe if it is
> >    unavailable; trigger failsafe if it is fault.
>
> Just bumping this back to the top of my list.  I don't know that I'll
> have to time to restructure the initialization of sensors until a few
> weeks from now.  However, I'm bumping this now because I'm doing some
> phosphor-pid-control work again and it's a good time to think about
> these features -- since I'm already cracking it open.

https://github.com/openbmc/phosphor-pid-control/issues/2

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Feature request in phosphor-pid-control
@ 2018-08-30 20:22 Patrick Venture
  0 siblings, 0 replies; 9+ messages in thread
From: Patrick Venture @ 2018-08-30 20:22 UTC (permalink / raw)
  To: Lei YU, Tanous, Ed; +Cc: OpenBMC Maillist

Currently we expose a zone's state, whether it's in failsafe or manual
mode.  One can read or write this information over IPMI via OEM
commands.  With the PID objects, it may also be helpful to read the
PID values over IPMI periodically (or later read/write them).

A zone has PIDs, so we can expose an interface on each PID that
exports the values onto dbus.  Obviously, values that are updated
regularly should only be updated in dbus if someone tries to read them
-- that's what I'm thinking there.  Keep it more of a report on read
interface instead of one that broadcasts for others.

I'm going to start hacking out an initial design over the next few
days and update this email thread.

Let me know if anyone is interested in this information or has already
worked it out.

Patrick

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-08-30 20:26 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-14  7:24 Feature request in phosphor-pid-control Lei YU
2018-06-14 16:41 ` Patrick Venture
2018-06-15  2:20   ` Lei YU
2018-08-30 20:17     ` Patrick Venture
2018-08-30 20:25       ` Patrick Venture
2018-06-14 17:21 ` Tanous, Ed
2018-06-14 18:36   ` Brad Bishop
2018-06-14 18:44     ` Tanous, Ed
2018-08-30 20:22 Patrick Venture

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.