* Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
From: Thu Nguyen @ 2020-12-16 7:33 UTC
To: openbmc
Hi All,
I'm working with fan sensors on the Ampere MtJade platform.
This platform has multiple fans, named FAN3_1, FAN3_2,
FAN4_1, FAN4_2, FAN5_1...
I added the configuration for those fans in phosphor-hwmon and also
added the option "--enable-update-functional-on-fail" to the phosphor-hwmon
build flags. I want a fan's Functional property to be set to false when
the fan is unplugged.
After flashing the new image to the board, I read the Functional property
of the fans. Reading the D-Bus property takes about 0.05 to 0.1 seconds:
root@mtjade:~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN4_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true
real 0m0.078s
user 0m0.002s
sys 0m0.032s
root@mtjade:~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN3_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true
real 0m0.044s
user 0m0.001s
sys 0m0.034s
After unplugging one fan (FAN4_2), the Functional property of FAN4_2
is set to false as expected, and the other fans stay true.
But the time to get the D-Bus properties of all fans increases hugely,
even for the working fans.
~# time busctl get-property xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN4_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b false
real 0m1.189s
user 0m0.001s
sys 0m0.036s
~# time busctl get-property xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN3_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true
real 0m3.285s
user 0m0.010s
sys 0m0.028s
The "ipmitool sdr type 0x4" command also fails because of this
increase.
~$ time ipmitool -I lanplus -U root -P 0penBmc -C 17 -H <BMCIP> sdr type 0x4
FAN3_1 | 25h | ok | 29.13 | 5100 RPM
FAN3_2 | 28h | ok | 29.16 | 4700 RPM
FAN4_1 | 2Bh | ns | 29.19 | No Reading
FAN4_2 | 2Eh | ns | 29.22 | No Reading
FAN5_1 | 31h | ns | 29.25 | No Reading
FAN5_2 | 34h | ns | 29.28 | No Reading
FAN6_1 | 37h | ns | 29.31 | No Reading
FAN6_2 | 3Ah | ns | 29.34 | No Reading
FAN7_1 | 3Dh | ns | 29.37 | No Reading
FAN7_2 | 40h | ns | 29.40 | No Reading
FAN8_1 | 43h | ns | 29.43 | No Reading
FAN8_2 | 46h | ns | 29.46 | No Reading
PSU0_fan1 | F5h | ns | 29.60 | No Reading
PSU1_fan1 | F6h | ns | 29.61 | No Reading
real 2m43.704s
user 0m0.046s
sys 0m0.057s
The cause of this increase is that when phosphor-hwmon fails to read a
sensor, it keeps retrying the read, with 10 retries and a 100 ms delay
between retries.
Should we reduce the retry count for non-functional sensors?
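To make the cost concrete, here is a minimal sketch of a retry loop with
the numbers quoted above (10 retries, 100 ms apart). It is not the
phosphor-hwmon code: the function and exception names are hypothetical,
and counting "one initial attempt plus 10 retries" is an assumption. Under
that assumption a single unreadable sensor costs about 1 second of waiting
per read, which is in line with the 1.189 s measured for FAN4_2.

```python
class SensorReadError(Exception):
    """Raised when a single read of the sensor fails (hypothetical name)."""


def read_with_retries(read_once, retries=10, delay_ms=100, sleep_ms=lambda ms: None):
    """Try read_once(); after each failure wait delay_ms and retry, up to `retries` times.

    Returns (value, waited_ms); value is None if every attempt failed.
    sleep_ms is injectable so the sketch runs without really sleeping.
    """
    waited_ms = 0
    for attempt in range(retries + 1):  # first attempt plus `retries` retries
        try:
            return read_once(), waited_ms
        except SensorReadError:
            if attempt < retries:  # no wait after the final failed attempt
                sleep_ms(delay_ms)
                waited_ms += delay_ms
    return None, waited_ms


def unplugged_fan():
    raise SensorReadError("tach input not readable")


def working_fan():
    return 5100


print(read_with_retries(working_fan))    # (5100, 0)
print(read_with_retries(unplugged_fan))  # (None, 1000)
```

If the daemon reads its sensors one after another, every unreadable
sensor adds roughly this much delay to each polling pass, which would
also hold up requests for the working fans.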
Regards.
Thu Nguyen.
* Re: Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
From: Thu Nguyen @ 2020-12-23 15:32 UTC
To: openbmc
On 12/16/20 14:33, Thu Nguyen wrote:
> [...]
> Should we reduce the retry for non-functional sensors?
>
>
> Regards.
>
> Thu Nguyen
Hi All,
Any feedback on this?
Thu Nguyen
* Re: Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
From: Lei Yu @ 2020-12-24 1:52 UTC
To: Thu Nguyen; +Cc: openbmc
On Wed, Dec 23, 2020 at 11:33 PM Thu Nguyen
<thu@amperemail.onmicrosoft.com> wrote:
>
> On 12/16/20 14:33, Thu Nguyen wrote:
> > [...]
> > Should we reduce the retry for non-functional sensors?
When a fan is unplugged, its "Present" property should be false as well.
Maybe you could check that property and skip such fans?
> >
> >
> > Regards.
> >
> > Thu Nguyen
> Hi All,
>
> Any feed back on this?
>
> Thu Nguyen,
>
--
BRs,
Lei YU
* Re: Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
From: Thu Nguyen @ 2020-12-24 2:32 UTC
To: Lei Yu; +Cc: openbmc
On 12/24/20 08:52, Lei Yu wrote:
> On Wed, Dec 23, 2020 at 11:33 PM Thu Nguyen
> <thu@amperemail.onmicrosoft.com> wrote:
>> On 12/16/20 14:33, Thu Nguyen wrote:
>>> [...]
>>> Should we reduce the retry for non-functional sensors?
> When a fan is unplugged, its "Present" property should be false as well.
> Maybe you could check that property and skip such fans?
>
The sensor D-Bus object does not have a Present property. The
Present property belongs to the inventory object of phosphor-fan.
To use the Present property, we would have to map each fan sensor name to
the corresponding inventory object, which would break the generic nature
of phosphor-hwmon.
In my opinion, for devices that support hotplug, such as fans, we should
not retry when a read fails, because there is no difference between
a fan sensor that fails to read and a fan sensor that was unplugged along
with the fan.
Is it reasonable to retry reading the failed sensors every 0.1
seconds?
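One possible shape for this, as a rough sketch only: keep the full retry
budget while a sensor is marked functional, and drop to a single attempt
per poll once it has been marked nonfunctional. The class and names below
are hypothetical and do not come from phosphor-hwmon; the delay between
attempts is left out so the sketch runs instantly.

```python
class SensorReadError(Exception):
    """Raised when a single read of the sensor fails (hypothetical name)."""


class PolledSensor:
    """Tracks a Functional flag and adapts the retry budget to it."""

    def __init__(self, read_once, retries=10):
        self.read_once = read_once
        self.retries = retries
        self.functional = True

    def poll(self):
        """Return (value, failed_attempts) for one polling pass."""
        # A sensor already marked nonfunctional gets one attempt and no retries.
        budget = self.retries if self.functional else 0
        failures = 0
        while True:
            try:
                value = self.read_once()
            except SensorReadError:
                failures += 1
                if failures > budget:
                    self.functional = False
                    return None, failures
                continue  # a real loop would wait here before retrying
            self.functional = True  # a successful read restores the flag
            return value, failures


state = {"plugged": False}


def fan_tach():
    if not state["plugged"]:
        raise SensorReadError("tach input not readable")
    return 5100


fan = PolledSensor(fan_tach)
first = fan.poll()    # full budget: 1 attempt + 10 retries, then marked nonfunctional
second = fan.poll()   # already nonfunctional: a single attempt only
state["plugged"] = True
third = fan.poll()    # fan is back: read succeeds, flag restored
print(first, second, third)
```

With this policy the unplugged fan still costs the full retry budget on
the first failing pass, but only one failed read on every later pass, and
it is picked up again on the first pass after it is plugged back in.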
>>>
>>> Regards.
>>>
>>> Thu Nguyen
>> Hi All,
>>
>> Any feed back on this?
>>
>> Thu Nguyen,
>>
>