* Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
@ 2020-12-16  7:33 Thu Nguyen
  2020-12-23 15:32 ` Thu Nguyen
  0 siblings, 1 reply; 4+ messages in thread
From: Thu Nguyen @ 2020-12-16  7:33 UTC
  To: openbmc

Hi All,


I'm working with fan sensors on the Ampere MtJade platform.

On this platform I have multiple fans, named FAN3_1, FAN3_2, FAN4_1,
FAN4_2, FAN5_1, ...

I added the configuration for those fans to phosphor-hwmon and also
added the option "--enable-update-functional-on-fail" to the
phosphor-hwmon build flags. I'm trying to set a fan's Functional
property to false when the fan is unplugged.

After flashing the new image to the board, I read the Functional
property of the fans. Reading the D-Bus property takes about 0.05 to
0.1 seconds:

root@mtjade:~# time busctl get-property 
xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
/xyz/openbmc_project/sensors/fan_tach/FAN4_2 
xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true

real    0m0.078s
user    0m0.002s
sys    0m0.032s
root@mtjade:~# time busctl get-property 
xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
/xyz/openbmc_project/sensors/fan_tach/FAN3_2 
xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true


real    0m0.044s
user    0m0.001s
sys    0m0.034s

After unplugging one fan (FAN4_2), I can see that the Functional
property of FAN4_2 is set to false, as expected, and that the Functional
property of the other fans stays true. But the time to get the D-Bus
properties of all fans increases hugely, even for the working fans.

~# time busctl get-property xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
/xyz/openbmc_project/sensors/fan_tach/FAN4_2 
xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b false

real    0m1.189s
user    0m0.001s
sys    0m0.036s

~# time busctl get-property xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
/xyz/openbmc_project/sensors/fan_tach/FAN3_2 
xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true

real    0m3.285s
user    0m0.010s
sys    0m0.028s

The "ipmitool sdr type 0x4" command also fails because of this
increase.

~$ time ipmitool -I lanplus -U root -P 0penBmc -C 17 -H <BMCIP> sdr type 0x4
FAN3_1           | 25h | ok  | 29.13 | 5100 RPM
FAN3_2           | 28h | ok  | 29.16 | 4700 RPM
FAN4_1           | 2Bh | ns  | 29.19 | No Reading
FAN4_2           | 2Eh | ns  | 29.22 | No Reading
FAN5_1           | 31h | ns  | 29.25 | No Reading
FAN5_2           | 34h | ns  | 29.28 | No Reading
FAN6_1           | 37h | ns  | 29.31 | No Reading
FAN6_2           | 3Ah | ns  | 29.34 | No Reading
FAN7_1           | 3Dh | ns  | 29.37 | No Reading
FAN7_2           | 40h | ns  | 29.40 | No Reading
FAN8_1           | 43h | ns  | 29.43 | No Reading
FAN8_2           | 46h | ns  | 29.46 | No Reading
PSU0_fan1        | F5h | ns  | 29.60 | No Reading
PSU1_fan1        | F6h | ns  | 29.61 | No Reading

real    2m43.704s
user    0m0.046s
sys    0m0.057s

The cause of this increase is that when phosphor-hwmon fails to read a
sensor, it keeps retrying the read, with 10 retries and a 100 ms delay
between retries.
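
As a rough check of these numbers, the retry behaviour can be modelled in a few lines. This is a Python sketch, not phosphor-hwmon's C++ code: RETRIES and DELAY_MS mirror the 10 retries and 100 ms delay mentioned above, while read_with_retries, unplugged_fan, the injectable sleep argument, and the exact placement of the delay are assumptions made for illustration.

```python
import time

RETRIES = 10    # retry count quoted in the message above
DELAY_MS = 100  # delay between retries quoted above, in milliseconds


def read_with_retries(read, retries=RETRIES, delay_ms=DELAY_MS, sleep=time.sleep):
    """Call read() until it succeeds or the retry budget is used up.

    Returns (value, waited_ms); value is None when every attempt failed.
    `sleep` is injectable so the model can run without really waiting.
    """
    waited_ms = 0
    remaining = retries
    while True:
        try:
            return read(), waited_ms
        except OSError:
            if remaining == 0:
                return None, waited_ms
            remaining -= 1
            sleep(delay_ms / 1000)
            waited_ms += delay_ms


def unplugged_fan():
    # Hypothetical stand-in for a sysfs read that fails because the fan is absent.
    raise OSError("no tach reading")


value, waited_ms = read_with_retries(unplugged_fan, sleep=lambda s: None)
print(value, waited_ms)  # None 1000
```

In this model a fan that always fails blocks its reader for 10 x 100 ms = 1 s, which is in line with the 1.189 s measured for FAN4_2. If the daemon reads its sensors one after another in a single loop (an assumption here, not something verified in this thread), each failing sensor would add about one second to every polling pass.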

Should we reduce the retry count for non-functional sensors?


Regards.

Thu Nguyen.








* Re: Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
  2020-12-16  7:33 Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional Thu Nguyen
@ 2020-12-23 15:32 ` Thu Nguyen
  2020-12-24  1:52   ` Lei Yu
  0 siblings, 1 reply; 4+ messages in thread
From: Thu Nguyen @ 2020-12-23 15:32 UTC
  To: openbmc

On 12/16/20 14:33, Thu Nguyen wrote:
> The cause of this increase is that when phosphor-hwmon fails to read a
> sensor, it keeps retrying the read, with 10 retries and a 100 ms delay
> between retries.
>
> Should we reduce the retry count for non-functional sensors?
>
>
> Regards.
>
> Thu Nguyen

Hi All,

Any feedback on this?

Thu Nguyen



* Re: Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
  2020-12-23 15:32 ` Thu Nguyen
@ 2020-12-24  1:52   ` Lei Yu
  2020-12-24  2:32     ` Thu Nguyen
  0 siblings, 1 reply; 4+ messages in thread
From: Lei Yu @ 2020-12-24  1:52 UTC
  To: Thu Nguyen; +Cc: openbmc

On Wed, Dec 23, 2020 at 11:33 PM Thu Nguyen
<thu@amperemail.onmicrosoft.com> wrote:
>
> On 12/16/20 14:33, Thu Nguyen wrote:
> > The cause of this increase is that when phosphor-hwmon fails to read a
> > sensor, it keeps retrying the read, with 10 retries and a 100 ms delay
> > between retries.
> >
> > Should we reduce the retry count for non-functional sensors?

When a fan is unplugged, its "Present" property should be false as well.
Maybe you could check that property and skip such fans?


-- 
BRs,
Lei YU


* Re: Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
  2020-12-24  1:52   ` Lei Yu
@ 2020-12-24  2:32     ` Thu Nguyen
  0 siblings, 0 replies; 4+ messages in thread
From: Thu Nguyen @ 2020-12-24  2:32 UTC
  To: Lei Yu; +Cc: openbmc

On 12/24/20 08:52, Lei Yu wrote:
> On Wed, Dec 23, 2020 at 11:33 PM Thu Nguyen
> <thu@amperemail.onmicrosoft.com> wrote:
>> On 12/16/20 14:33, Thu Nguyen wrote:
>>> The cause of this increase is that when phosphor-hwmon fails to read a
>>> sensor, it keeps retrying the read, with 10 retries and a 100 ms delay
>>> between retries.
>>>
>>> Should we reduce the retry count for non-functional sensors?
> When a fan is unplugged, its "Present" property should be false as well.
> Maybe you could check that property and skip such fans?
>
The sensor D-Bus object doesn't have a Present property. The Present
property belongs to the inventory object of phosphor-fan.

If we use the Present property, we have to map each fan sensor name to
the corresponding inventory object, which would break the generic
nature of phosphor-hwmon.
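
To make the mapping concern concrete, here is a minimal Python sketch of what a presence check would need. Everything in it is hypothetical: the SENSOR_TO_INVENTORY table, the placeholder inventory paths, and the should_read and is_present names are invented for illustration and are not taken from phosphor-hwmon, phosphor-fan, or any platform configuration.

```python
# Hypothetical per-platform table: hwmon sensor name -> inventory object path.
# The paths are placeholders, not real inventory paths from any platform.
SENSOR_TO_INVENTORY = {
    "FAN4_1": "/xyz/openbmc_project/inventory/example/fan4",
    "FAN4_2": "/xyz/openbmc_project/inventory/example/fan4",
}


def should_read(sensor_name, is_present, table=SENSOR_TO_INVENTORY):
    """Decide whether to read a sensor based on its inventory object.

    is_present(path) is a caller-supplied lookup of the Present property.
    Sensors without a table entry are always read.
    """
    path = table.get(sensor_name)
    if path is None:
        return True
    return is_present(path)
```

The table is the part that would have to be written and maintained for each platform, which is the loss of generality described above.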

In my opinion, for devices that support hotplug, such as fans, we should
not retry when a read fails, because a fan sensor that fails to read
cannot be distinguished from one that has been unplugged together with
its fan.

Is it reasonable to retry reading a failed sensor every 0.1 seconds?
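
One way to picture the proposal is a retry budget that depends on the sensor's current Functional state. The Python sketch below is only an illustration under stated assumptions: retry_budget and worst_case_stall_ms are hypothetical names, the default of 10 retries and 100 ms comes from the first message in this thread, and nothing here reflects how phosphor-hwmon is actually structured.

```python
def retry_budget(functional, default_retries=10):
    """Number of retries allowed for the next read of a sensor.

    A sensor already marked non-functional gets one attempt and no
    retries; a functional sensor keeps the full budget.
    """
    return default_retries if functional else 0


def worst_case_stall_ms(functional_flags, delay_ms=100, default_retries=10):
    """Upper bound on time spent sleeping in one polling pass,
    assuming every listed sensor fails to read."""
    return sum(retry_budget(f, default_retries) * delay_ms
               for f in functional_flags)


# One sensor that was still functional plus one already marked non-functional:
print(worst_case_stall_ms([True, False]))  # 1000
```

Under this policy a fan pays the full retry cost only on the pass where it first fails. Because every pass still makes one attempt, a fan that is plugged back in would be noticed on the next successful read.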


