* Fault handling(Threshold exceeds/low) in Fan and NIC sensors
From: Kumar Thangavel @ 2020-11-13 16:30 UTC
To: openbmc
Cc: Zhikui Ren, Jae Hyun Yoo, Patrick Venture, Ed Tanous, Vernon Mauery, Velumani T-ERS, HCLTech, Patrick Williams

Classification: Internal

Hi All,

We want to power off the 12 V supply of the hosts/BMC if the Fan and NIC sensors cross their threshold levels. This behaviour would be platform specific.

In dbus-sensors, most of the sensors handle the threshold checks and throw an error if a threshold is crossed. So we are planning to add a new field in entity manager to identify the particular sensors that should handle this fault condition.

We are also planning to add a default script in dbus-sensors to handle this fault condition, which could be overridden from the machine layer.

Could you please provide your suggestions on this?

Thanks,
Kumar.

::DISCLAIMER::
________________________________
The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates. Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any email and/or attachments, please check them for viruses and other defects.
________________________________
* Re: Fault handling(Threshold exceeds/low) in Fan and NIC sensors
From: Ed Tanous @ 2020-11-13 20:44 UTC
To: Kumar Thangavel
Cc: Zhikui Ren, Jae Hyun Yoo, Patrick Venture, openbmc, Vernon Mauery, Velumani T-ERS, HCLTech, Patrick Williams

On Fri, Nov 13, 2020 at 8:31 AM Kumar Thangavel <thangavel.k@hcl.com> wrote:
>
> Could you please provide your suggestions on this.

I'm having a little trouble following your email. Dbus-sensors has the ability to mask thresholds where appropriate, the platform specifics of which are already captured in the config file definition. If there's some configurable masking needed that's new, we can certainly add it, but I'd recommend looking at the existing threshold masking before adding anything new to see if what's there meets your needs.

If you have some concrete things you'd like to see added, I'm happy to talk in more detail. At this point, though, I have no idea what you're looking to solve, so you might want to be slightly more specific, and reference the existing threshold masking in your proposed changes.

Cheers,
-Ed
* RE: Fault handling(Threshold exceeds/low) in Fan and NIC sensors
From: Kumar Thangavel @ 2020-11-16 13:05 UTC
To: Ed Tanous
Cc: Zhikui Ren, Jae Hyun Yoo, Patrick Venture, openbmc, Vernon Mauery, Velumani T-ERS, HCLTech, Patrick Williams

Classification: Internal

Hi Ed,

In short, our requirement is to take action when a fan fails. That action is platform specific.

Fan failure: This is based on the fan sensors. If a fan sensor's tach value is less than 33%, we will consider it a fan failure. We will then take actions to reduce the heat production in the system. These actions would apply to the hosts, NIC and other power-consuming modules.

dbus-sensors already handles the threshold masking. We would just use that threshold masking to take the platform-specific actions.

Please let us know if any clarification is needed.

Thanks,
Kumar.
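A minimal Python sketch of the fan-failure rule described in Kumar's 2020-11-16 message, under stated assumptions. The thread only says that a tach value below 33% counts as a failure; here the percentage is assumed to be relative to a rated maximum RPM. `FanReading`, `max_rpm`, `is_fan_failed` and `failed_fans` are hypothetical names for illustration, not dbus-sensors APIs.

```python
from dataclasses import dataclass
from typing import Iterable, List

# From the thread: a tach value below 33% is treated as a fan failure.
FAN_FAILURE_FRACTION = 0.33


@dataclass
class FanReading:
    name: str
    rpm: float      # current tach reading
    max_rpm: float  # assumed rated full-speed value (hypothetical field)


def is_fan_failed(reading: FanReading,
                  fraction: float = FAN_FAILURE_FRACTION) -> bool:
    """Return True when the tach reading is below the failure fraction."""
    if reading.max_rpm <= 0:
        raise ValueError("max_rpm must be positive")
    return reading.rpm < fraction * reading.max_rpm


def failed_fans(readings: Iterable[FanReading]) -> List[str]:
    """Names of all fans that currently count as failed."""
    return [r.name for r in readings if is_fan_failed(r)]
```

The platform-specific reaction (for example, powering off hosts) would hang off the result of `failed_fans`; what that reaction is, and where it lives, is exactly what the thread leaves open.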
* Re: Fault handling(Threshold exceeds/low) in Fan and NIC sensors
From: Ed Tanous @ 2020-11-16 15:59 UTC
To: Kumar Thangavel
Cc: Zhikui Ren, Jae Hyun Yoo, Patrick Venture, openbmc, Vernon Mauery, Velumani T-ERS, HCLTech, Patrick Williams

On Mon, Nov 16, 2020 at 5:05 AM Kumar Thangavel <thangavel.k@hcl.com> wrote:
>
> Classification: Internal
>
> Hi Ed,
>
> In short, Our requirement is to take the actions when the fan fails. That action is platform specific.
>
> Fan failure : This is based on Fan sensors. If fan sensor's tach values is less than 33%, will consider as a fan failure. So will take the actions to reduce the heat production in the system.

dbus-sensors and phosphor-pid-control already have mechanisms for handling fan failure in these ways. Take a look at the existing config files, and they'll guide you on what you need to do next.

> So that, hosts, NIC and other power consuming modules.
>
> Dbus-sensor's already handles the threshold masking. We just use that threshold masking to take the platform specific actions.
>
> Please let us know if any clarifications needed.
>
> Thanks,
> Kumar.

P.S. Please don't top-post.
* RE: Fault handling(Threshold exceeds/low) in Fan and NIC sensors
From: Kumar Thangavel @ 2020-11-17 11:59 UTC
To: Ed Tanous
Cc: Zhikui Ren, Jae Hyun Yoo, Patrick Venture, openbmc, Vernon Mauery, Velumani T-ERS, HCLTech, Patrick Williams

Classification: Internal

Hi Ed,

Please find my response inline below.

Thanks,
Kumar.

-----Original Message-----
From: Ed Tanous <ed@tanous.net>
Sent: Monday, November 16, 2020 9:29 PM
To: Kumar Thangavel <thangavel.k@hcl.com>
Cc: openbmc@lists.ozlabs.org; Velumani T-ERS,HCLTech <velumanit@hcl.com>; sdasari@fb.com; Patrick Williams <patrickw3@fb.com>; Patrick Venture <venture@google.com>; Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>; Vernon Mauery <vernon.mauery@linux.intel.com>; Zhikui Ren <zhikui.ren@intel.com>
Subject: Re: Fault handling(Threshold exceeds/low) in Fan and NIC sensors

On Mon, Nov 16, 2020 at 5:05 AM Kumar Thangavel <thangavel.k@hcl.com> wrote:
>
> In short, Our requirement is to take the actions when the fan fails. That action is platform specific.
>
> Fan failure : This is based on Fan sensors. If fan sensor's tach values is less than 33%, will consider as a fan failure. So will take the actions to reduce the heat production in the system.

dbus-sensors and phosphor-pid-control already have mechanisms for handling fan failure in these ways. Take a look at the existing config files, and they'll guide you on what you need to do next.

Kumar: Are you referring to the checkThresholds function in dbus-sensors? The high/low threshold levels are handled in that function. Please confirm.
In that function, we are planning to add a service to handle the platform-specific actions.
Also, we are planning to add a new field in entity manager to identify the particular sensors that should handle this fault condition.
* Re: Fault handling(Threshold exceeds/low) in Fan and NIC sensors
From: Matt Spinler @ 2020-11-20 17:10 UTC
To: Kumar Thangavel, Ed Tanous
Cc: Zhikui Ren, Jae Hyun Yoo, Patrick Venture, openbmc, Vernon Mauery, Velumani T-ERS, HCLTech, Patrick Williams

On 11/17/2020 5:59 AM, Kumar Thangavel wrote:
> On Mon, Nov 16, 2020 at 5:05 AM Kumar Thangavel <thangavel.k@hcl.com> wrote:
>> In short, Our requirement is to take the actions when the fan fails. That action is platform specific.
>>
>> Fan failure : This is based on Fan sensors. If fan sensor's tach values is less than 33%, will consider as a fan failure. So will take the actions to reduce the heat production in the system.
> dbus-sensors and phosphor-pid-control already have mechanisms for handling fan failure in these ways. Take a look at the existing config files, and they'll guide you on what you need to do next.
>
> Kumar : Are you saying about dbus-sensor's checkThresholds function ? In that function, high/low threshold levels are handled. Please confirm once.
> In that function, planning to add the service to handle the platform specific actions.
> Also, planning to add a new field in entity manager to identify the particular sensors to handle this fault condition.

I have a need to monitor some temperature sensor thresholds and take various actions, such as creating phosphor-logging event logs and doing soft and hard shutdowns after various delays. In fact, not all sensors I need to monitor will be provided by D-Bus sensors, but I do need to use data provided by entity manager to tell me things like how long to delay, etc.

I don't think dbus-sensors is the appropriate place to put this code, since the code isn't putting any sensors on D-Bus and won't necessarily be monitoring sensors provided by that repo.

Does anyone have a good idea of where a daemon like this could go? If nowhere else, I could put it in phosphor-fan, even though it is not fan related, since our platforms will always use the fan-monitor app provided there, which already does similar things for fan errors.
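The monitoring behaviour Matt describes (log an event when a threshold trips, then escalate to a soft and later a hard shutdown after configured delays) could be sketched roughly as follows. This is an illustration only, in Python, not code from phosphor-fan or dbus-sensors: `ShutdownPolicy`, `ThresholdMonitor`, the action strings, and the idea that the delays arrive as two numbers from entity-manager data are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShutdownPolicy:
    # Hypothetical delays, imagined as coming from entity-manager config.
    soft_delay_s: float
    hard_delay_s: float


def action_for(policy: ShutdownPolicy, seconds_tripped: Optional[float]) -> str:
    """Map how long a threshold has been exceeded to an action name."""
    if seconds_tripped is None:
        return "none"
    if seconds_tripped >= policy.hard_delay_s:
        return "hard_shutdown"
    if seconds_tripped >= policy.soft_delay_s:
        return "soft_shutdown"
    return "log_event"


class ThresholdMonitor:
    """Tracks one high threshold and escalates while it stays exceeded."""

    def __init__(self, limit: float, policy: ShutdownPolicy) -> None:
        self.limit = limit
        self.policy = policy
        self.tripped_at: Optional[float] = None

    def update(self, value: float, now: float) -> str:
        if value <= self.limit:
            # Back within limits: clear the trip time and stop escalating.
            self.tripped_at = None
            return "none"
        if self.tripped_at is None:
            self.tripped_at = now
        return action_for(self.policy, now - self.tripped_at)
```

Keeping the escalation logic in a pure function such as `action_for` means it does not care which daemon publishes the sensor value, which matches Matt's point that the monitored sensors need not come from dbus-sensors.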
end of thread, other threads:[~2020-11-20 17:12 UTC]

Thread overview: 6+ messages
2020-11-13 16:30 Fault handling(Threshold exceeds/low) in Fan and NIC sensors Kumar Thangavel
2020-11-13 20:44 ` Ed Tanous
2020-11-16 13:05   ` Kumar Thangavel
2020-11-16 15:59     ` Ed Tanous
2020-11-17 11:59       ` Kumar Thangavel
2020-11-20 17:10         ` Matt Spinler