* thermald for OpenBMC
From: Patrick Venture @ 2017-04-17 20:21 UTC
To: openbmc

I'm working on a thermal control loop that'll operate within the openbmc framework(s) and wanted to provide a somewhat high level overview for thoughts.

The general design is to have a daemon that reads fans and temperatures from dbus (reaching out to phosphor-hwmon) as well as being able to receive temperatures and other sensor information over an OEM IPMI command.

The system will support zones defined (yes, probably in YAML). A zone will have at least one exclusion fan, and at least one thermal sensor. The thermal sensor can be shared. There will be defaults provided in this configuration to act as fallbacks.

The thermal loop will be margin based and attempt to drive the fans to maintain the temperature within the operating temperature of the zones. Each zone will be independently managed.

Because not all thermal sensors can necessarily be read by the BMC, we need a method of getting that information from the host. From a previous project, we have the notion of sending thermal margins for slow and quick (heat change) devices to a controller.

Regards,
Patrick
* Re: thermald for OpenBMC
From: Patrick Williams @ 2017-04-18 2:31 UTC
To: Patrick Venture; +Cc: openbmc

Patrick,

On Mon, Apr 17, 2017 at 01:21:29PM -0700, Patrick Venture wrote:
> I'm working on a thermal control loop that'll operate within the openbmc
> framework(s) and wanted to provide a somewhat high level overview for
> thoughts.

We should connect you with Matt Spinler (mspinler) and Matt Barth (msbarth) on IRC. They have been working on implementing the "IBM fan control algorithm" but I suspect there is a significant amount of overlap. Our intention was that you'd be able to reuse our implementation and insert a different (low-level detailed) algorithm.

> The general design is to have a daemon that reads fans and temperatures
> from dbus (reaching out to phosphor-hwmon) as well as being able to receive
> temperatures and other sensor information over an OEM IPMI command.

Sounds good. This is how it is supposed to work.

For the IPMI commands, the expectation would be that either the IPMI provider or an application fed by the IPMI provider for these OEM commands would implement the same xyz.openbmc_project.Sensor.Value interface as phosphor-hwmon. This way the thermal algorithm really doesn't need to know where the data comes from.

> The system will support zones defined (yes, probably in YAML). A zone will
> have at least one exclusion fan, and at least one thermal sensor. The
> thermal sensor can be shared. There will be defaults provided in this
> configuration to act as fallbacks.

There is some code available to define zones via YAML. Matt Spinler can point you at it.

> The thermal loop will be margin based and attempt to drive the fans to
> maintain the temperature within operating temperature of the zones. Each
> zone will be independently managed.

This sounds very similar to what their intended design is as well. For a zone there is a lower-threshold and an upper-threshold. When the temperature is above the upper-threshold, the fan speed is increased, and it is decreased when the temperature is below the lower-threshold. Again, the Matts can give you details on what the "IBM fan control algorithm" design is.

> Because not all thermal sensors can necessarily be read by the BMC, we
> need a method of getting that information from the host. From a previous
> project, we have the notion of sending thermal margins for slow and quick
> (heat change) devices to a controller.

Is this the Host->BMC via IPMI you mentioned earlier or does the BMC need to actively query the host in some cases? Hopefully it is always one direction.

--
Patrick Williams
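The two-threshold behaviour described in this reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the "IBM fan control algorithm": the class name `ThresholdZone`, the fixed `step_rpm` per evaluation, and the RPM limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThresholdZone:
    """Hypothetical zone with a lower and an upper temperature threshold."""
    lower_c: float   # below this, the fans may slow down
    upper_c: float   # above this, the fans must speed up
    step_rpm: int    # assumed fixed change per evaluation
    min_rpm: int
    max_rpm: int
    target_rpm: int

    def update(self, temp_c: float) -> int:
        """Return the new fan target after one temperature reading."""
        if temp_c > self.upper_c:
            self.target_rpm = min(self.max_rpm, self.target_rpm + self.step_rpm)
        elif temp_c < self.lower_c:
            self.target_rpm = max(self.min_rpm, self.target_rpm - self.step_rpm)
        # Between the two thresholds the target is left unchanged.
        return self.target_rpm
```

The band between the thresholds acts as a dead zone, so a temperature hovering near one limit does not cause the target to flip on every reading.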
* Re: thermald for OpenBMC
From: Patrick Venture @ 2017-04-18 3:20 UTC
To: Patrick Williams; +Cc: openbmc

Patrick,

>> I'm working on a thermal control loop that'll operate within the openbmc
>> framework(s) and wanted to provide a somewhat high level overview for
>> thoughts.

> We should connect you with Matt Spinler (mspinler) and Matt Barth
> (msbarth) on IRC. They have been working on implementing the "IBM fan
> control algorithm" but I suspect there is a significant amount of
> overlap. Our intention was that you'd be able to reuse our
> implementation and insert a different (low-level detailed) algorithm.

Definitely. I know there's a Google algorithm we use for thermal control that's based on proportional–integral–derivative. I'll ping them on IRC to get a peek at their design, roadmap and timeline. It's also possible that, because of our specific data center requirements based on configurations, it may be more work to plug in a different low-level algorithm. But without seeing the design, it's impossible to say.

>> The general design is to have a daemon that reads fans and temperatures
>> from dbus (reaching out to phosphor-hwmon) as well as being able to receive
>> temperatures and other sensor information over an OEM IPMI command.

> Sounds good. This is how it is supposed to work.

Good. Yeah. I'll end up running some performance experiments to make sure things are handled quickly enough going through dbus for everything, but I'm sure it will be reasonably quick.

> For the IPMI commands, the expectation would be that either the IPMI
> provider or an application fed by the IPMI provider for these OEM
> commands would implement the same xyz.openbmc_project.Sensor.Value
> interface as the phosphor-hwmon. This way the thermal algorithm really
> doesn't need to know where the data comes from.

Right. I just need to verify the exact design of the information required. Discussions today indicated I'd be provided with the temperature margin for the fastest device and the slowest (in terms of thermal adjustment) per zone. The YAML definition will need to allow for indicating whether a sensor is available to the BMC or is "outside."

>> The system will support zones defined (yes, probably in YAML). A zone will
>> have at least one exclusion fan, and at least one thermal sensor. The
>> thermal sensor can be shared. There will be defaults provided in this
>> configuration to act as fallbacks.

> There is some code available to define zones via YAML. Matt Spinler can
> point you at these.

Ok.

>> The thermal loop will be margin based and attempt to drive the fans to
>> maintain the temperature within operating temperature of the zones. Each
>> zone will be independently managed.

> This sounds very similar to what their intended design is as well. For
> a zone there is a lower-threshold and an upper-threshold. When the
> temperature is above the upper-threshold, the fan speed is increased and
> the fans are decreased when the temperature is below the
> lower-threshold. Again, the Matts can give you details on what the "IBM
> fan control algorithm" design is.

That's the basic idea.

>> Because not all thermal sensors can necessarily be read by the BMC, we
>> need a method of getting that information from the host. From a previous
>> project, we have the notion of sending thermal margins for slow and quick
>> (heat change) devices to a controller.

> Is this the Host->BMC via IPMI you mentioned earlier or does the BMC
> need to actively query the host in some cases? Hopefully it is always
> one direction.

The plan is for Host->BMC only. The host just feeds thermal information on a cycle to the BMC for those sensors out of reach.

I'm very interested in seeing the design doc, or any code that exists, and especially a timeline.

Regards,
Patrick
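The point that the thermal algorithm should not need to know where a reading comes from can be sketched in Python as follows. This is not the real xyz.openbmc_project.Sensor.Value D-Bus API: `SensorSource`, `LocalSensor`, `HostFedSensor` and `worst_margin` are hypothetical names, and treating the smallest margin as the worst case is an assumption made for the example.

```python
from typing import Callable, Dict, Protocol


class SensorSource(Protocol):
    """Hypothetical stand-in for anything exposing one current reading."""

    def value(self) -> float: ...


class LocalSensor:
    """A sensor the BMC reads itself; the reader callable is an assumption."""

    def __init__(self, reader: Callable[[], float]) -> None:
        self._reader = reader

    def value(self) -> float:
        return float(self._reader())


class HostFedSensor:
    """An "outside" sensor whose margin the host pushes to the BMC."""

    def __init__(self, fallback: float) -> None:
        # The configured default is used until the host sends an update.
        self._value = fallback

    def push(self, margin: float) -> None:
        # Host -> BMC only: the BMC never queries the host.
        self._value = margin

    def value(self) -> float:
        return self._value


def worst_margin(sensors: Dict[str, SensorSource]) -> float:
    """The control loop sees only values, never where they came from."""
    return min(sensor.value() for sensor in sensors.values())
```

With this shape, swapping a locally read sensor for a host-fed one changes only how the zone is constructed, not the control loop.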
* OpenBMC Thermal Design
From: Matthew Barth @ 2017-05-02 18:07 UTC
To: Patrick Venture, Patrick Williams; +Cc: openbmc

Patrick,

As Patrick mentioned, a few of us have been working on the fan control infrastructure and wrote up a quick outline to share our thoughts on the design layout. Let us know if there are areas where you'd like to see more detail.

Matt

[-- Attachment #2: ThermalDesign.pdf --]
[-- Type: application/pdf, Size: 73786 bytes --]
* Re: OpenBMC Thermal Design
From: Patrick Venture @ 2017-05-02 19:33 UTC
To: Matthew Barth; +Cc: Patrick Williams, openbmc

Please elaborate on the format or layout of the control configuration file.
Please elaborate on the mechanism planned to import it into code. Y'all often use python programs to make c++, will that be the case here?
Please elaborate on how the control program will execute its "pluggable" algorithm.
Please elaborate on how the fans will be controlled. Will that be as a group per system or independently per zone?
-- What is a zone?

Patrick
* Re: OpenBMC Thermal Design
From: Patrick Venture @ 2017-05-02 20:46 UTC
To: Matthew Barth; +Cc: Patrick Williams, openbmc

Just to give some details, the present design is as follows:

- A zone is a group of fans controlled independently.
-- The configuration specifies the inputs to the PID loops.
-- Each loop takes an input and a goal, and outputs an RPM to achieve the goal. (The loop input can be the margin.)
-- The PID loops all feed into a maximum function, which then feeds a fan PID loop that tweaks the duty cycles, writing them and reading the fan tachs back until the speed is where it needs to be.

So we'll be dynamically building a list of PIDs to run and drive the output for controlling the fans.

Patrick
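The arrangement in this message (several thermal PID loops, a maximum function, then one fan PID loop) can be sketched in Python roughly as below. It is a sketch under assumptions rather than the actual design: the textbook PID form, the names `Pid`, `zone_rpm_request` and `fan_duty`, the gains, and the 0 to 100 percent duty clamp are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Pid:
    """Textbook PID controller; gains and time step are illustrative."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, goal: float, measured: float, dt: float) -> float:
        error = goal - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def zone_rpm_request(loops: Sequence[Pid], goals: Sequence[float],
                     inputs: Sequence[float], dt: float) -> float:
    """Run every thermal loop and keep the largest RPM request."""
    return max(pid.step(goal, value, dt)
               for pid, goal, value in zip(loops, goals, inputs))


def fan_duty(fan_pid: Pid, duty: float, rpm_goal: float,
             tach_rpm: float, dt: float) -> float:
    """Nudge the duty cycle toward the RPM goal using the tach reading."""
    duty += fan_pid.step(rpm_goal, tach_rpm, dt)
    return min(100.0, max(0.0, duty))  # assumed duty range in percent
```

Here each thermal loop treats the margin as its input, so a margin below the goal yields a positive RPM request. Taking the maximum means the neediest sensor in the zone decides the fan speed.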
* Re: OpenBMC Thermal Design
From: Matthew Barth @ 2017-05-02 21:21 UTC
To: Patrick Venture; +Cc: Patrick Williams, openbmc

On 05/02/17 2:33 PM, Patrick Venture wrote:
> Please elaborate on the format or layout of the control configuration
> file.

This will be a yaml file containing the zone and fan definitions with the associated set of parameters that feed into the control algorithm. Currently this contains the zone number, the zone initial speed, and the list of fans (including their inventory path, sensor name, etc.).

> Please elaborate on the mechanism planned to import it into code?
> Y'all often use python programs to make c++, will that be the case here?

Correct, that will be the case here as well.

> Please elaborate on how the control program will execute it's
> "pluggable" algorithm?

Filling out the associated yaml file for the control application defines how the algorithm will control the fan speeds based on the values, sensors listed, delays, etc. that will be supported as parameter inputs to the algorithm.

> Please elaborate on how the fans will be controlled? Will that be as
> a group per system or independently per zone?

This will be configurable by the yaml file, where the fan speeds are set on the zone and a zone is a group of fans. A zone can contain one or many fans, so fans can be controlled individually or as a group that is controlled together.

> -- What is a zone?

A grouping of fans

> Patrick
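The flow described in this reply (zone and fan definitions in YAML, turned into C++ by a Python program) could look roughly like the sketch below. The key names, the inventory paths, and the `ZoneDefinition` C++ type are assumptions, not the project's real schema. A dict literal stands in for the parsed YAML file so that the sketch needs no YAML library.

```python
# Assumed shape of the parsed YAML; field names and paths are hypothetical.
ZONES = [
    {
        "zone": 0,
        "initial_speed": 10500,
        "fans": [
            {"name": "fan0", "inventory": "/system/chassis/fan0"},
            {"name": "fan1", "inventory": "/system/chassis/fan1"},
        ],
    },
]


def render_cpp(zones) -> str:
    """Emit a C++ initializer for a hypothetical ZoneDefinition type."""
    lines = ["const std::vector<ZoneDefinition> zones = {"]
    for zone in zones:
        num, speed = zone["zone"], zone["initial_speed"]
        # Opens one zone entry and its nested list of fans.
        lines.append(f"    {{{num}, {speed}, {{")
        for fan in zone["fans"]:
            inv, name = fan["inventory"], fan["name"]
            lines.append(f'        {{"{inv}", "{name}"}},')
        # Closes the fan list and the zone entry.
        lines.append("    }},")
    lines.append("};")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_cpp(ZONES))
```

Generating the table at build time keeps the daemon free of a YAML parser at runtime, at the cost of a rebuild whenever the configuration changes.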
* Re: OpenBMC Thermal Design
From: Patrick Venture @ 2017-05-02 21:33 UTC
To: Matthew Barth; +Cc: Patrick Williams, openbmc

Fantastic! Thanks for elaborating.

>> Please elaborate on how the control program will execute it's "pluggable" algorithm?

> Filling out the associated yaml file for the control application defines
> how the algorithm will control the fan speeds based on the values, sensors
> listed, delays, etc.. that will be supported as parameter inputs to the
> algorithm.

What do you mean by "delays?"

Patrick
* Re: OpenBMC Thermal Design
From: Matthew Barth @ 2017-05-03 14:05 UTC
To: Patrick Venture; +Cc: Patrick Williams, openbmc

On 05/02/17 4:33 PM, Patrick Venture wrote:
> Fantastic! Thanks for elaborating.
>
> >> Please elaborate on how the control program will execute it's
> >> "pluggable" algorithm?
> > Filling out the associated yaml file for the control application
> > defines how the algorithm will control the fan speeds based on the
> > values, sensors listed, delays, etc.. that will be supported as
> > parameter inputs to the algorithm.
>
> What do you mean by "delays?"

No prob, so in our design each temperature sensor that's an input to the control algorithm has a defined RPM delta for each degree above or below a defined "nominal" temperature range. For each sensor read, the maximum RPM delta based on the reported temps is written as the new speed target for the zone the sensors are included in. After that occurs, any more RPM deltas determined from the sensor readings are ignored for a set amount of time (delay), unless the RPM delta is larger than the previous RPM change. If that happens, the difference in RPM deltas is then written as the new speed target for the zone they are included in again.

Not sure if that describes it well enough, but for example:

Given a core temp's "nominal" temperature range is 75-78C, if that core reports a temp of 79C with a defined RPM increase delta of 300rpms per degree above, then the fans in the zone containing this core are increased 300rpms. After that increase is requested, no increase requests of 300rpm or less are acted on until after the delay interval has passed. Whereas if another core is reported at 80C, resulting in a 600rpm increase request during that delay interval, then an additional 300rpms are added to the previous target and the delay interval restarts. After the interval expires it considers all incoming deltas again.

The delay interval is there to help eliminate unnecessary speed change requests while the fans are going to their target speed. Also, this minimizes the possibility for speed oscillations.

Matt
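The delay rule in this message can be traced with a short Python sketch that covers only speed increases. The names `rpm_delta` and `DelayedZone`, the 30 second delay, and the 8000 RPM starting target are assumptions for illustration. Applying a delta in full once the interval has expired is one reading of "considers all incoming deltas again".

```python
from dataclasses import dataclass


def rpm_delta(temp_c: int, nominal_high_c: int, rpm_per_degree: int) -> int:
    """RPM increase requested for a reading above the nominal range."""
    return max(0, temp_c - nominal_high_c) * rpm_per_degree


@dataclass
class DelayedZone:
    """Increase-only sketch of the delay rule; all names are hypothetical."""
    target_rpm: int
    delay_s: float
    last_delta: int = 0                  # delta honoured in the current window
    window_end: float = float("-inf")    # time at which the delay expires

    def request_increase(self, delta_rpm: int, now: float) -> int:
        if delta_rpm <= 0:
            return self.target_rpm
        if now >= self.window_end:
            extra = delta_rpm                      # delay expired: apply in full
        elif delta_rpm > self.last_delta:
            extra = delta_rpm - self.last_delta    # only the difference is added
        else:
            return self.target_rpm                 # ignored during the delay
        self.target_rpm += extra
        self.last_delta = delta_rpm
        self.window_end = now + self.delay_s       # the delay interval restarts
        return self.target_rpm
```

Replaying the example from the message: a 79C reading at time 0 raises the target by 300, a repeated 300 request at 5 seconds is ignored, and an 80C reading at 10 seconds adds only the extra 300 and restarts the delay.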