* dbus-sensors:hwmontemp: additional attribute proposal @ 2020-08-05 23:42 Jason Ling 2020-08-06 6:07 ` Zbyszek 2020-08-06 20:40 ` James Feist 0 siblings, 2 replies; 26+ messages in thread From: Jason Ling @ 2020-08-05 23:42 UTC (permalink / raw) To: OpenBMC Maillist, James Feist [-- Attachment #1: Type: text/plain, Size: 972 bytes --] *Problem:* There is a use case where temp1_input should be omitted from being exposed to dbus. A concrete example is if you have a temp sensor with 10 channels but only want to expose 2..10. Currently dbus/hwmontemp doesn't allow this. *Solution:* In order to maintain backwards compatibility I am proposing an OmitList attribute that hwmontemp will attempt to retrieve. If the "Name"s of any temp sensor appears in the list, it will be skip sensor creation. I am proposing a list to support other use cases such as... * you're doing BMC development and for whatever reason want to temporarily suppress a temperature and do some tests..you can add it to this list and then remove it instead of deleting and re-inserting. * lets you have non-contiguous temp sensors exposed (e.g temp2_input, temp5_input, temp7_input) . There is a better solution to this; but for now this enables this use-case. *etc..* It's a simple feature; plan to have something within O(hours). [-- Attachment #2: Type: text/html, Size: 1234 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-05 23:42 dbus-sensors:hwmontemp: additional attribute proposal Jason Ling @ 2020-08-06 6:07 ` Zbyszek 2020-08-06 20:40 ` James Feist 1 sibling, 0 replies; 26+ messages in thread From: Zbyszek @ 2020-08-06 6:07 UTC (permalink / raw) To: Jason Ling; +Cc: OpenBMC Maillist, James Feist [-- Attachment #1: Type: text/plain, Size: 1411 bytes --] Doing some code changes I encountered this comment in PSUSensor. https://github.com/openbmc/dbus-sensors/blob/dfad1ffcf4d41d6b5c0fd578d1817c0540631e29/src/PSUSensorMain.cpp#L762 I didn't verify it but it looks like the mechanism for disabling the selected sensor is already there. czw., 6 sie 2020 o 01:43 Jason Ling <jasonling@google.com> napisał(a): > *Problem:* > There is a use case where temp1_input should be omitted from being exposed > to dbus. > A concrete example is if you have a temp sensor with 10 channels but only > want to expose 2..10. > > Currently dbus/hwmontemp doesn't allow this. > > *Solution:* > In order to maintain backwards compatibility I am proposing an OmitList > attribute that hwmontemp will attempt to retrieve. > If the "Name"s of any temp sensor appears in the list, it will be skip > sensor creation. > > I am proposing a list to support other use cases such as... > > * you're doing BMC development and for whatever reason want to temporarily > suppress a temperature and do some tests..you can add it to this list and > then remove it instead of deleting and re-inserting. > > * lets you have non-contiguous temp sensors exposed (e.g temp2_input, > temp5_input, temp7_input) . There is a better solution to this; but for now > this enables this use-case. > > *etc..* > It's a simple feature; plan to have something within O(hours). > [-- Attachment #2: Type: text/html, Size: 2115 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-05 23:42 dbus-sensors:hwmontemp: additional attribute proposal Jason Ling 2020-08-06 6:07 ` Zbyszek @ 2020-08-06 20:40 ` James Feist 2020-08-06 22:52 ` Jason Ling 1 sibling, 1 reply; 26+ messages in thread From: James Feist @ 2020-08-06 20:40 UTC (permalink / raw) To: Jason Ling, OpenBMC Maillist On 8/5/2020 4:42 PM, Jason Ling wrote: > In order to maintain backwards compatibility I am proposing an OmitList > attribute that hwmontemp will attempt to retrieve. > If the "Name"s of any temp sensor appears in the list, it will be skip > sensor creation. Why not just make it so that "Name.*" is required, that way you can skip the first one? Either that, or something like PSU sensor and have a list of what labels you want to support? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-06 20:40 ` James Feist @ 2020-08-06 22:52 ` Jason Ling 2020-08-07 16:24 ` James Feist 0 siblings, 1 reply; 26+ messages in thread From: Jason Ling @ 2020-08-06 22:52 UTC (permalink / raw) To: James Feist; +Cc: OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 1166 bytes --] Assuming that you mean "Omit Name attribute from the sensor configuration definition and then change hwmontemp to require any Name.*" This won't work since Entity-Manager requires Name (tried it, entity-manager does indeed complain about not finding name). My rationale for an omit list vs permit list (1) if it's a permit list then everytime you add another temp you want to monitor you need to add to this list..if you want to drop a temp then you have to modify the list again. (2) General assumption is that the primary use case is to display all named temperatures which means a permit list is typically large (3) adding a permit list also breaks all existing code. Everyone has to go back into their json config and add all the sensor values to the list. My rationale for using the value for the "Name" attribute rather than labels or referencing sysfs attributes (1) Looking at just the config , it's obvious as to what you're omitting. (2) If it's label base, a label change in a driver would mean a breakage in the userspace daemon. Not a big deal; but it can be annoying. (3) if it's sysfs attribute based then it's my opinion that it's not as readable. [-- Attachment #2: Type: text/html, Size: 1400 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-06 22:52 ` Jason Ling @ 2020-08-07 16:24 ` James Feist 2020-08-07 16:34 ` Jason Ling 2020-08-07 16:37 ` Ed Tanous 0 siblings, 2 replies; 26+ messages in thread From: James Feist @ 2020-08-07 16:24 UTC (permalink / raw) To: Jason Ling; +Cc: OpenBMC Maillist On 8/6/2020 3:52 PM, Jason Ling wrote: > Assuming that you mean "Omit Name attribute from the sensor > configuration definition and then change hwmontemp to require any Name.*" > This won't work since Entity-Manager requires Name (tried it, > entity-manager does indeed complain about not finding name). > > My rationale for an omit list vs permit list > (1) if it's a permit list then everytime you add another temp you want > to monitor you need to add to this list..if you want to drop a temp then > you have to modify the list again. > (2) General assumption is that the primary use case is to display all > named temperatures which means a permit list is typically large > (3) adding a permit list also breaks all existing code. Everyone has to > go back into their json config and add all the sensor values to the list. > > My rationale for using the value for the "Name" attribute rather than > labels or referencing sysfs attributes > (1) Looking at just the config , it's obvious as to what you're omitting. > (2) If it's label base, a label change in a driver would mean a breakage > in the userspace daemon. Not a big deal; but it can be annoying. > (3) if it's sysfs attribute based then it's my opinion that it's not as > readable. > I'm not a huge fan of this as the PSU sensor already has a way of handling this, and it adds a new way of handling it. I'd rather follow what is already there. It's already confusing enough that hwmontemp and psu do things in slightly different ways. > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:24 ` James Feist @ 2020-08-07 16:34 ` Jason Ling 2020-08-07 16:39 ` Ed Tanous 2020-08-07 17:01 ` James Feist 2020-08-07 16:37 ` Ed Tanous 1 sibling, 2 replies; 26+ messages in thread From: Jason Ling @ 2020-08-07 16:34 UTC (permalink / raw) To: James Feist; +Cc: OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 1844 bytes --] ack, sounds like I'm better off just moving temp sensors over to being handled by PSUSensors. Slightly different topic: What about making the device/type lists in PSUSensors extendable using JSON? On Fri, Aug 7, 2020 at 9:24 AM James Feist <james.feist@linux.intel.com> wrote: > On 8/6/2020 3:52 PM, Jason Ling wrote: > > Assuming that you mean "Omit Name attribute from the sensor > > configuration definition and then change hwmontemp to require any Name.*" > > This won't work since Entity-Manager requires Name (tried it, > > entity-manager does indeed complain about not finding name). > > > > My rationale for an omit list vs permit list > > (1) if it's a permit list then everytime you add another temp you want > > to monitor you need to add to this list..if you want to drop a temp then > > you have to modify the list again. > > (2) General assumption is that the primary use case is to display all > > named temperatures which means a permit list is typically large > > (3) adding a permit list also breaks all existing code. Everyone has to > > go back into their json config and add all the sensor values to the list. > > > > My rationale for using the value for the "Name" attribute rather than > > labels or referencing sysfs attributes > > (1) Looking at just the config , it's obvious as to what you're omitting. > > (2) If it's label base, a label change in a driver would mean a breakage > > in the userspace daemon. Not a big deal; but it can be annoying. > > (3) if it's sysfs attribute based then it's my opinion that it's not as > > readable. > > > > I'm not a huge fan of this as the PSU sensor already has a way of > handling this, and it adds a new way of handling it. I'd rather follow > what is already there. It's already confusing enough that hwmontemp and > psu do things in slightly different ways. > > > > [-- Attachment #2: Type: text/html, Size: 2447 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:34 ` Jason Ling @ 2020-08-07 16:39 ` Ed Tanous 2020-08-07 16:54 ` Jason Ling 2020-08-07 17:01 ` James Feist 1 sibling, 1 reply; 26+ messages in thread From: Ed Tanous @ 2020-08-07 16:39 UTC (permalink / raw) To: Jason Ling; +Cc: James Feist, OpenBMC Maillist On Fri, Aug 7, 2020 at 9:36 AM Jason Ling <jasonling@google.com> wrote: > > Slightly different topic: > What about making the device/type lists in PSUSensors extendable using JSON? > Say more about what you're wanting to do here..... Can you give an example of what you would use it for? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:39 ` Ed Tanous @ 2020-08-07 16:54 ` Jason Ling 2020-08-07 17:06 ` James Feist 2020-08-07 17:24 ` Ed Tanous 0 siblings, 2 replies; 26+ messages in thread From: Jason Ling @ 2020-08-07 16:54 UTC (permalink / raw) To: Ed Tanous; +Cc: James Feist, OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 1301 bytes --] > What about making the device/type lists in PSUSensors extendable using JSON? > Say more about what you're wanting to do here..... Take https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L43 and https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L59 and make it so those can be extended using (for example) without an upstream code change. I picked JSON as the easiest example. IIRC PSUSensors does validity checks to make sure that the device emits a name in its 'permit list' (hwmontempsensor is less picky) so tricking PSUSensors into gathering telemetry for a non-public device is tricky. Can you give an > example of what you would use it for? Sure, the primary use case would be It's a non public device. Would rather not broadcast information about it or have to obfuscate the device name. Really would rather not maintain patches until the device is made public. On Fri, Aug 7, 2020 at 9:39 AM Ed Tanous <ed@tanous.net> wrote: > On Fri, Aug 7, 2020 at 9:36 AM Jason Ling <jasonling@google.com> wrote: > > > > Slightly different topic: > > What about making the device/type lists in PSUSensors extendable using > JSON? > > > > Say more about what you're wanting to do here..... Can you give an > example of what you would use it for? > [-- Attachment #2: Type: text/html, Size: 2260 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:54 ` Jason Ling @ 2020-08-07 17:06 ` James Feist 2020-08-07 17:17 ` Jason Ling 2020-08-07 17:24 ` Ed Tanous 1 sibling, 1 reply; 26+ messages in thread From: James Feist @ 2020-08-07 17:06 UTC (permalink / raw) To: Jason Ling, Ed Tanous; +Cc: OpenBMC Maillist On 8/7/2020 9:54 AM, Jason Ling wrote: >> What about making the device/type lists in PSUSensors extendable using JSON? > > Say more about what you're wanting to do here..... > > Take > https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L43 and > https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L59 and > make it so those can be extended using (for example) without an upstream > code change. I picked JSON as the easiest example. > IIRC PSUSensors does validity checks to make sure that the device emits > a name in its 'permit list' (hwmontempsensor is less picky) so tricking > PSUSensors into gathering telemetry for a non-public device is tricky. I feel like this approach already isn't optimal as in reality most systems aren't going to have half of those sensors, and you're then creating useless matches for things that don't matter to your platform. Maybe something like allowing a device to be exported with a different keyword other then 'type' in Entity Manager would allow us to just use one PSUSensor config type, then your export could be hidden in your EM config? > > Can you give an > example of what you would use it for? > > Sure, the primary use case would be > It's a non public device. Would rather not broadcast information about > it or have to obfuscate the device name. Really would rather not > maintain patches until the device is made public. > > > > On Fri, Aug 7, 2020 at 9:39 AM Ed Tanous <ed@tanous.net > <mailto:ed@tanous.net>> wrote: > > On Fri, Aug 7, 2020 at 9:36 AM Jason Ling <jasonling@google.com > <mailto:jasonling@google.com>> wrote: > > > > Slightly different topic: > > What about making the device/type lists in PSUSensors extendable > using JSON? > > > > Say more about what you're wanting to do here..... Can you give an > example of what you would use it for? > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 17:06 ` James Feist @ 2020-08-07 17:17 ` Jason Ling 2020-08-07 17:40 ` James Feist 0 siblings, 1 reply; 26+ messages in thread From: Jason Ling @ 2020-08-07 17:17 UTC (permalink / raw) To: James Feist; +Cc: Ed Tanous, OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 3223 bytes --] > > feel like this approach already isn't optimal as in reality most > systems aren't going to have half of those sensors, and you're then > creating useless matches for things that don't matter to your platform. > I'm assuming that you're referring to the general approach of PSUSensors and not referring to extending the list using a config file? If so then I agree. I'm also a bit confused about scanning every hwmon attribute on the system and seeing if has a name that is in the permit list..then seeing if it has an appropriate configuration. Why not just grab the configuration and using that information navigate to the appropriate hwmon path and take what is there? (Why even bother with validating the driver name?) . Any context would be helpful; I'm trying to grok the design. > Maybe something like allowing a device to be exported with a different > keyword other then 'type' in Entity Manager would allow us to just use > one PSUSensor config type, then your export could be hidden in your EM > config? Sure, or maybe a catchall type for something PSUSensor can consume (and bypass the name check?) On Fri, Aug 7, 2020 at 10:06 AM James Feist <james.feist@linux.intel.com> wrote: > On 8/7/2020 9:54 AM, Jason Ling wrote: > >> What about making the device/type lists in PSUSensors extendable using > JSON? > > > > Say more about what you're wanting to do here..... > > > > Take > > > https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L43 and > > > > https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L59 and > > > make it so those can be extended using (for example) without an upstream > > code change. I picked JSON as the easiest example. > > IIRC PSUSensors does validity checks to make sure that the device emits > > a name in its 'permit list' (hwmontempsensor is less picky) so tricking > > PSUSensors into gathering telemetry for a non-public device is tricky. > > I feel like this approach already isn't optimal as in reality most > systems aren't going to have half of those sensors, and you're then > creating useless matches for things that don't matter to your platform. > Maybe something like allowing a device to be exported with a different > keyword other then 'type' in Entity Manager would allow us to just use > one PSUSensor config type, then your export could be hidden in your EM > config? > > > > > > Can you give an > > example of what you would use it for? > > > > Sure, the primary use case would be > > It's a non public device. Would rather not broadcast information about > > it or have to obfuscate the device name. Really would rather not > > maintain patches until the device is made public. > > > > > > > > On Fri, Aug 7, 2020 at 9:39 AM Ed Tanous <ed@tanous.net > > <mailto:ed@tanous.net>> wrote: > > > > On Fri, Aug 7, 2020 at 9:36 AM Jason Ling <jasonling@google.com > > <mailto:jasonling@google.com>> wrote: > > > > > > Slightly different topic: > > > What about making the device/type lists in PSUSensors extendable > > using JSON? > > > > > > > Say more about what you're wanting to do here..... Can you give an > > example of what you would use it for? > > > [-- Attachment #2: Type: text/html, Size: 4723 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 17:17 ` Jason Ling @ 2020-08-07 17:40 ` James Feist 2020-08-07 18:04 ` Jason Ling 0 siblings, 1 reply; 26+ messages in thread From: James Feist @ 2020-08-07 17:40 UTC (permalink / raw) To: Jason Ling; +Cc: OpenBMC Maillist, Ed Tanous On 8/7/2020 10:17 AM, Jason Ling wrote: > feel like this approach already isn't optimal as in reality most > systems aren't going to have half of those sensors, and you're then > creating useless matches for things that don't matter to your platform. > > I'm assuming that you're referring to the general approach of PSUSensors > and not referring to extending the list using a config file? > If so then I agree. > I'm also a bit confused about scanning every hwmon attribute on the > system and seeing if has a name that is in the permit list..then seeing > if it has an appropriate configuration. Why not just grab the > configuration and using that information navigate to the appropriate > hwmon path and take what is there? (Why even bother with validating the > driver name?) . Any context would be helpful; I'm trying to grok the design. > > Maybe something like allowing a device to be exported with a different > keyword other then 'type' in Entity Manager would allow us to just use > one PSUSensor config type, then your export could be hidden in your EM > config? > > Sure, or maybe a catchall type for something PSUSensor can consume (and > bypass the name check?) Yeah that's what I meant, a generic PSUSensor Type with a field called 'Export' or something that EM can use to get the Export type. > > On Fri, Aug 7, 2020 at 10:06 AM James Feist <james.feist@linux.intel.com > <mailto:james.feist@linux.intel.com>> wrote: > > On 8/7/2020 9:54 AM, Jason Ling wrote: > >> What about making the device/type lists in PSUSensors extendable > using JSON? > > > > Say more about what you're wanting to do here..... > > > > Take > > > https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L43 and > > > > https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L59 and > > > make it so those can be extended using (for example) without an > upstream > > code change. I picked JSON as the easiest example. > > IIRC PSUSensors does validity checks to make sure that the device > emits > > a name in its 'permit list' (hwmontempsensor is less picky) so > tricking > > PSUSensors into gathering telemetry for a non-public device is > tricky. > > I feel like this approach already isn't optimal as in reality most > systems aren't going to have half of those sensors, and you're then > creating useless matches for things that don't matter to your platform. > Maybe something like allowing a device to be exported with a different > keyword other then 'type' in Entity Manager would allow us to just use > one PSUSensor config type, then your export could be hidden in your EM > config? > > > > > > Can you give an > > example of what you would use it for? > > > > Sure, the primary use case would be > > It's a non public device. Would rather not broadcast information > about > > it or have to obfuscate the device name. Really would rather not > > maintain patches until the device is made public. > > > > > > > > On Fri, Aug 7, 2020 at 9:39 AM Ed Tanous <ed@tanous.net > <mailto:ed@tanous.net> > > <mailto:ed@tanous.net <mailto:ed@tanous.net>>> wrote: > > > > On Fri, Aug 7, 2020 at 9:36 AM Jason Ling > <jasonling@google.com <mailto:jasonling@google.com> > > <mailto:jasonling@google.com <mailto:jasonling@google.com>>> > wrote: > > > > > > Slightly different topic: > > > What about making the device/type lists in PSUSensors > extendable > > using JSON? > > > > > > > Say more about what you're wanting to do here..... Can you > give an > > example of what you would use it for? > > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 17:40 ` James Feist @ 2020-08-07 18:04 ` Jason Ling 2020-08-07 18:17 ` James Feist 2020-08-07 19:17 ` Ed Tanous 0 siblings, 2 replies; 26+ messages in thread From: Jason Ling @ 2020-08-07 18:04 UTC (permalink / raw) To: James Feist, Alex Qiu; +Cc: OpenBMC Maillist, Ed Tanous [-- Attachment #1: Type: text/plain, Size: 1350 bytes --] > Yeah that's what I meant, a generic PSUSensor Type with a field called > 'Export' or something that EM can use to get the Export type. > Sure, I think that handles the EM side of things. I'm conflicted on that. Part of the reason that list is there, and > not in the config files directly, is to convey that those are the > types that have been tested to work correctly, and the types that the > config files "promise" to work sanely. The other thing is, if you've > done the testing, adding to that list is (should be) relatively > trivial and straightforward. Opening up that list to runtime parsing > opens a lot of security questions I'm not prepared to answer. Sure, let's scrap runtime parsing and go for something far simpler. (1) have a base type, devices list that represents the approved device list. (2) have an empty extended type, device list that represents user specified extensions. (3) factor these out into separate files and provide a method that returns the union of the base and extended types/devices. now we have a base type/device list that contains supported guaranteed devices and another extended type/device list that is easier to maintain patches for. I think that's a rather small change and accomplishes what I'd like while preserving the intent of keeping a list of supported types/devices. What do you think? [-- Attachment #2: Type: text/html, Size: 1864 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 18:04 ` Jason Ling @ 2020-08-07 18:17 ` James Feist 2020-08-07 18:29 ` Jason Ling 2020-08-07 19:17 ` Ed Tanous 1 sibling, 1 reply; 26+ messages in thread From: James Feist @ 2020-08-07 18:17 UTC (permalink / raw) To: Jason Ling, Alex Qiu; +Cc: OpenBMC Maillist, Ed Tanous On 8/7/2020 11:04 AM, Jason Ling wrote: > > Yeah that's what I meant, a generic PSUSensor Type with a field called > 'Export' or something that EM can use to get the Export type. > > Sure, I think that handles the EM side of things. > > I'm conflicted on that. Part of the reason that list is there, and > not in the config files directly, is to convey that those are the > types that have been tested to work correctly, and the types that the > config files "promise" to work sanely. The other thing is, if you've > done the testing, adding to that list is (should be) relatively > trivial and straightforward. Opening up that list to runtime parsing > opens a lot of security questions I'm not prepared to answer. > > Sure, let's scrap runtime parsing and go for something far simpler. > (1) have a base type, devices list that represents the approved device list. > (2) have an empty extended type, device list that represents user > specified extensions. > (3) factor these out into separate files and provide a method that > returns the union of the base and extended types/devices. > > now we have a base type/device list that contains supported guaranteed > devices and another extended type/device list that is easier to maintain > patches for. > I think that's a rather small change and accomplishes what I'd like > while preserving the intent of keeping a list of supported types/devices. > What do you think? What would happen if something wasn't in the approved devices list? Print a warning? Assert? I don't have any problems with this approach. It would be nice if this list could be (in the future) extended to all sensor types. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 18:17 ` James Feist @ 2020-08-07 18:29 ` Jason Ling 2020-08-07 19:27 ` Ed Tanous 0 siblings, 1 reply; 26+ messages in thread From: Jason Ling @ 2020-08-07 18:29 UTC (permalink / raw) To: James Feist; +Cc: Alex Qiu, OpenBMC Maillist, Ed Tanous [-- Attachment #1: Type: text/plain, Size: 2857 bytes --] > > What would happen if something wasn't in the approved devices list? > Print a warning? Assert? I don't have any problems with this approach. > I think currently in PSUSensors if the name of the device isn't in pmBusNames <https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L59> then PSUSensors refuses to monitor that particular device ( https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L269 ). It would be nice if this list could be (in the future) extended to all > sensor types. Sure, I'll start with PSUSensors first. A quick word about the original topic of this thread; we picked hwmontemp sensors where possible because we've seen performance issues for PSUSensors (could be N seconds before Sensor.Value gets updated to dbus). This has undesirable implications for PID loops.. thought was that spreading temp sensors over into hwmontemp sensor where possible would help mitigate the performance issue until we can figure out a better long term solution. On Fri, Aug 7, 2020 at 11:17 AM James Feist <james.feist@linux.intel.com> wrote: > On 8/7/2020 11:04 AM, Jason Ling wrote: > > > > Yeah that's what I meant, a generic PSUSensor Type with a field > called > > 'Export' or something that EM can use to get the Export type. > > > > Sure, I think that handles the EM side of things. > > > > I'm conflicted on that. Part of the reason that list is there, and > > not in the config files directly, is to convey that those are the > > types that have been tested to work correctly, and the types that the > > config files "promise" to work sanely. The other thing is, if you've > > done the testing, adding to that list is (should be) relatively > > trivial and straightforward. Opening up that list to runtime parsing > > opens a lot of security questions I'm not prepared to answer. > > > > Sure, let's scrap runtime parsing and go for something far simpler. > > (1) have a base type, devices list that represents the approved device > list. > > (2) have an empty extended type, device list that represents user > > specified extensions. > > (3) factor these out into separate files and provide a method that > > returns the union of the base and extended types/devices. > > > > now we have a base type/device list that contains supported guaranteed > > devices and another extended type/device list that is easier to maintain > > patches for. > > I think that's a rather small change and accomplishes what I'd like > > while preserving the intent of keeping a list of supported types/devices. > > What do you think? > > What would happen if something wasn't in the approved devices list? > Print a warning? Assert? I don't have any problems with this approach. > It would be nice if this list could be (in the future) extended to all > sensor types. > [-- Attachment #2: Type: text/html, Size: 3917 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 18:29 ` Jason Ling @ 2020-08-07 19:27 ` Ed Tanous 0 siblings, 0 replies; 26+ messages in thread From: Ed Tanous @ 2020-08-07 19:27 UTC (permalink / raw) To: Jason Ling; +Cc: James Feist, Alex Qiu, OpenBMC Maillist On Fri, Aug 7, 2020 at 11:30 AM Jason Ling <jasonling@google.com> wrote: > > A quick word about the original topic of this thread; we picked hwmontemp sensors where possible because we've seen performance issues for PSUSensors (could be N seconds before Sensor.Value gets updated to dbus). This has undesirable implications for PID loops.. > thought was that spreading temp sensors over into hwmontemp sensor where possible would help mitigate the performance issue until we can figure out a better long term solution. > Have you filed a bug, or asked on the mailing list before now? This is the kind of feedback the authors of that sensor need (Ideally before you move over to another subsystem like hwmontemp). If I didn't see your message/bug and you did post it, I apologize, I'm not trying to call you out. If you have specifics, like the value of N, and the details around what chips you're interacting with and whatever debugging you've done, it would be helpful to put that in a bug for triage. Keep in mind, PSUSensor by default has a 1 second scan rate. https://github.com/openbmc/dbus-sensors/blob/41061e2c3198c0f597d4f6bb702b690a273ab45d/include/PSUSensor.hpp#L38 If it's not obeying that, clearly there's a bug to fix somewhere. On some platforms, I have seen very high rate polling of power values on the PSU I2c bus by other devices, and that tends to hold up transactions for other components. If that bus is misbehaving or overloaded on your platform, it might have triggered a weird condition within the PSU sensor (like the scans taking longer than the scan rate). If you have any more details here, it's quite possible someone will have an idea where to look, or know exactly where the problem is. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 18:04 ` Jason Ling 2020-08-07 18:17 ` James Feist @ 2020-08-07 19:17 ` Ed Tanous 2020-08-07 21:05 ` Jason Ling 1 sibling, 1 reply; 26+ messages in thread From: Ed Tanous @ 2020-08-07 19:17 UTC (permalink / raw) To: Jason Ling; +Cc: James Feist, Alex Qiu, OpenBMC Maillist On Fri, Aug 7, 2020 at 11:05 AM Jason Ling <jasonling@google.com> wrote: > > >> Yeah that's what I meant, a generic PSUSensor Type with a field called >> 'Export' or something that EM can use to get the Export type. > > Sure, I think that handles the EM side of things. > >> I'm conflicted on that. Part of the reason that list is there, and >> not in the config files directly, is to convey that those are the >> types that have been tested to work correctly, and the types that the >> config files "promise" to work sanely. The other thing is, if you've >> done the testing, adding to that list is (should be) relatively >> trivial and straightforward. Opening up that list to runtime parsing >> opens a lot of security questions I'm not prepared to answer. > > Sure, let's scrap runtime parsing and go for something far simpler. Unless you have a way to handle the cases where things need to be configurable at runtime (like adding a new entity config or fan control control without a recompile) I don't think we can "scrap" runtime parsing entirely, but if you have a plan for dealing with the aforementioned, I'd love to hear ideas! > (1) have a base type, devices list that represents the approved device list. > (2) have an empty extended type, device list that represents user specified extensions. > (3) factor these out into separate files and provide a method that returns the union of the base and extended types/devices. > > now we have a base type/device list that contains supported guaranteed devices and another extended type/device list that is easier to maintain patches for. > I think that's a rather small change and accomplishes what I'd like while preserving the intent of keeping a list of supported types/devices. > What do you think? I don't like the merging of base lists with extended lists, as it adds a dependency between how we represent that, and implies that we have a published plugin interface, which we definitely don't, nor do we want to maintain it at the entity manager level. It also means that upstream never tests the "extended" list, which means it's a lot more likely to break. I originally wrote a big long idea about how to make the above work, but got to the end, and realized that I'm still trying to understand what we're trying to avoid here with the downstream/upstream lists thing. It's easy enough to patch the existing list to add your new custom types, then when it comes time to merge because the project/component is public, the patch is ready and good to upstream. What are we buying by moving that info to a non-patch format? I think moving it to a file means it's a lot less likely to be upstreamed, as it requires the next person to understand it to use it, and modify the patch rather than simply cleaning up the commit message, testing it, and firing it at gerrit. Having done this pattern many times (including with that list specifically) I think the only thing we could improve would be to move that list to its own file (but still C++ code), so it gets fewer merge conflicts, and you can completely replace it if you like, but even that doesn't solve the problem if code is never upstreamed. If you want to code that up, I'd support it (and I'd guess James would too). Keep in mind, all your downstream patches against that list will break when you do that, causing the world irony meter to spike :) With that said, i still think it's worth it. Disclaimer time: The way the project recommends to avoid this problem is to upstream your code. Anything else is a half measure, and we can make no guarantees about your downstream fork. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 19:17 ` Ed Tanous @ 2020-08-07 21:05 ` Jason Ling 2020-08-07 21:53 ` Ed Tanous 0 siblings, 1 reply; 26+ messages in thread From: Jason Ling @ 2020-08-07 21:05 UTC (permalink / raw) To: Ed Tanous; +Cc: James Feist, Alex Qiu, OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 5952 bytes --] > > don't like the merging of base lists with extended lists, as it adds > a dependency between how we represent that, and implies that we have a > published plugin interface, which we definitely don't, nor do we want > to maintain it at the entity manager level. It also means that > upstream never tests the "extended" list, which means it's a lot more > likely to break. > My stance on devices added to the extended list is the same as devices added by downstream patches. Upstream maintainers aren't responsible for testing those, if you're patching in devices then you take the responsibility of testing those. My objective is to make it so devices that we don't want to upstream yet can be maintained more easily. First idea was a json file that extends types/lists but there are concerns with runtime parsing for devices for the purposes of exposing telemetry (I'd think runtime parsing of things like PID configurations would be more worrisome). So second idea was just to move data structures around in dbus-sensors/PSUSensor to make upstream changes less likely to result in merge conflicts for those who are maintaining downstream patches. Furthermore just split the data structure up into a portion that is upstream and downstream. Yes, it's definitely making it friendlier for those who want to maintain downstream patches to extend devices. I don't see how this increases maintenance or testing burden for the maintainers though. If however , the intent is to explicitly send the message "You shouldn't try to use this service for any devices that do not have explicit upstream support. Any patches that make it easier to do so will be rejected." then I agree with your earlier suggestion that maybe the best approach is to create another service for those devices. I originally wrote a big long idea about how to make the above work, > but got to the end, and realized that I'm still trying to understand > what we're trying to avoid here with the downstream/upstream lists > thing. It's easy enough to patch the existing list to add your new > custom types, then when it comes time to merge because the > project/component is public, the patch is ready and good to upstream. > What are we buying by moving that info to a non-patch format? You get the benefit of separating devices into two classes.. (1) types that are upstream , have been tested by someone else and they are ready to go without additional work. (2) types that are not upstream, because the devices are not public yet or may never be public and need to be kept downstream for a long period of time (or forever). > I think > moving it to a file means it's a lot less likely to be upstreamed, as > it requires the next person to understand it to use it, and modify the > patch rather than simply cleaning up the commit message, testing it, > and firing it at gerrit. Yes, the file would be for those things that are never meant to be upstreamed or won't be upstreamed for a long while. > Having done this pattern many times > (including with that list specifically) I think the only thing we > could improve would be to move that list to its own file (but still > C++ code), so it gets fewer merge conflicts, and you can completely > replace it if you like, but even that doesn't solve the problem if > code is never upstreamed. Yup, the problem here are the following "I have patches I keep downstream and they keep getting broken whenever the types/device list gets updated. I wish these data structures were in a file that doesn't get patched often" and "I have patches to add devices to the type/list data structure and I can't upstream them for a long time (or ever). I don't want to waste time constantly fixing my broken patches everytime someone adds a new public supported type." Both approaches (parsing json and extending the list runtime and separating the data structs into a separate file + returning the union of base + extended) accomplish the same thing. One requires a recipe to copy a ExtendPSUSensorConfig.json in a recipe somewhere to usr/share/PSUSensors (or something) and the other is just a patch that gets applied. Talking it through, I now realize that the slight code refactoring approach is a lot less work and a lot more simple..and something that I'd actually have time to contribute. Have you filed a bug, or asked on the mailing list before now? This > is the kind of feedback the authors of that sensor need (Ideally > before you move over to another subsystem like hwmontemp). I never really considered hwmontemp a different sub-system since they live in the same repo and seemed to be more specific towards monitoring temperature telemetry. As far as bugs go, I CC'ed Alex Qiu who has more experience with the performance of PSUSensors. I haven't personally ran into this problem; just know about it from talk amongst the larger team. > If I > didn't see your message/bug and you did post it, I apologize, I'm not > trying to call you out. > If you have specifics, like the value of N, and the details around > what chips you're interacting with and whatever debugging you've done, > it would be helpful to put that in a bug for triage. > Alex, maybe you can add some color here? > Keep in mind, PSUSensor by default has a 1 second scan rate. > > https://github.com/openbmc/dbus-sensors/blob/41061e2c3198c0f597d4f6bb702b690a273ab45d/include/PSUSensor.hpp#L38 > If it's not obeying that, clearly there's a bug to fix somewhere. > On some platforms, I have seen very high rate polling of power values > on the PSU I2c bus by other devices, and that tends to hold up > transactions for other components. If that bus is misbehaving or > overloaded on your platform, it might have triggered a weird condition > within the PSU sensor (like the scans taking longer than the scan > rate). > If you have any more details here, it's quite possible someone will > have an idea where to look, or know exactly where the problem is. [-- Attachment #2: Type: text/html, Size: 7496 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 21:05 ` Jason Ling @ 2020-08-07 21:53 ` Ed Tanous 2020-08-11 0:08 ` Alex Qiu 0 siblings, 1 reply; 26+ messages in thread From: Ed Tanous @ 2020-08-07 21:53 UTC (permalink / raw) To: Jason Ling; +Cc: James Feist, Alex Qiu, OpenBMC Maillist On Fri, Aug 7, 2020 at 2:05 PM Jason Ling <jasonling@google.com> wrote: >> >> don't like the merging of base lists with extended lists, as it adds >> a dependency between how we represent that, and implies that we have a >> published plugin interface, which we definitely don't, nor do we want >> to maintain it at the entity manager level. It also means that >> upstream never tests the "extended" list, which means it's a lot more >> likely to break. > > My stance on devices added to the extended list is the same as devices added by downstream patches. Upstream maintainers aren't responsible for testing those, if you're patching in devices then you take the responsibility of testing those. Sure, but upstream does bear the responsibility for testing that this "extended types" system works? > My objective is to make it so devices that we don't want to upstream yet can be maintained more easily. Ever? To be frank for a moment, you're going to have a hard time pushing that one forward in an open source project. If there are a lot of users that are planning on maintaining private forks forever, maybe there's a case for this, but I think most uses of private forks tends to be temporary, even if it's longer term, and the long term intention is to upstream. > First idea was a json file that extends types/lists but there are concerns with runtime parsing for devices for the purposes of exposing telemetry (I'd think runtime parsing of things like PID configurations would be more worrisome). > So second idea was just to move data structures around in dbus-sensors/PSUSensor to make upstream changes less likely to result in merge conflicts for those who are maintaining downstream patches. Furthermore just split the data structure up into a portion that is upstream and downstream. > Yes, it's definitely making it friendlier for those who want to maintain downstream patches to extend devices. I don't see how this increases maintenance or testing burden for the maintainers though. It increases burden because any time the maintainer wants to change the name of that file, change the name of the structure in that file, add a field, rename fields, or change compiler flags, that file will break in your downstream, and the maintainers will have no idea. It definitely increases the maintenance burden, and all of the things I've mentioned and more have happened over the short life of entity manager. I'm sure they will continue to happen as they evolve (or until EM is replaced by something better). > If however , the intent is to explicitly send the message > "You shouldn't try to use this service for any devices that do not have explicit upstream support. Any patches that make it easier to do so will be rejected." > then I agree with your earlier suggestion that maybe the best approach is to create another service for those devices. I'm not the maintainer of this project anymore, so my opinion is just that, and I have no ability to reject patches :) I would rephrase my position as: Modifying the source code directly is not an adequate long term API for making permanent, never to be upstreamed changes. The closest guarantee to that kind of API that is dbus, the second closest is an Entity manager config, each of which have their own pros and cons. If we as a project can do anything to make the transition between downstream code to upstream code easier, like trying to make merge conflicts less likely on commonly modified files, without the expense of maintainability or complexity, I'm absolutely for it, but creating explicit data structures and hard guarantees about downstream code on a binary boundary needs a much larger discussion, and speaks to some of the project's main principles about the "Open" part of OpenBMC. > >> I originally wrote a big long idea about how to make the above work, >> but got to the end, and realized that I'm still trying to understand >> what we're trying to avoid here with the downstream/upstream lists >> thing. It's easy enough to patch the existing list to add your new >> custom types, then when it comes time to merge because the >> project/component is public, the patch is ready and good to upstream. >> What are we buying by moving that info to a non-patch format? > > You get the benefit of separating devices into two classes.. > (1) types that are upstream , have been tested by someone else and they are ready to go without additional work. > (2) types that are not upstream, because the devices are not public yet or may never be public and need to be kept downstream for a long period of time (or forever). Don't you already have that distinction today? Devices that are in your downstream patch fall into category 2, devices that are on openbmc master fall into category 1. Maybe I'm missing something? >> >> I think >> moving it to a file means it's a lot less likely to be upstreamed, as >> it requires the next person to understand it to use it, and modify the >> patch rather than simply cleaning up the commit message, testing it, >> and firing it at gerrit. > > Yes, the file would be for those things that are never meant to be upstreamed or won't be upstreamed for a long while. See above. >> >> Having done this pattern many times >> (including with that list specifically) I think the only thing we >> could improve would be to move that list to its own file (but still >> C++ code), so it gets fewer merge conflicts, and you can completely >> replace it if you like, but even that doesn't solve the problem if >> code is never upstreamed. > > Yup, the problem here are the following > "I have patches I keep downstream and they keep getting broken whenever the types/device list gets updated. I wish these data structures were in a file that doesn't get patched often" Moving that structure to its own file sounds totally reasonable, although (if I were the maintainer) it would have no guarantees granted. Said file may change name, structure, naming, or compiler magic that will cause downstream to break. Funnily enough, in the act of implementing that, you will ironically break a lot of peoples downstreams, and I'm personally ok with that. You (and your team) needs to be ok when people break your downstream for similar reasons. On a personal note, if you haven't already, I highly recommend spending some learning time on getting a good setup and mental model for merging conflicting patches. As a useful skill, it comes up ridiculously often in software, especially if you're a maintainer. If you have the ability to resolve conflicts quickly and correctly it puts you at a significant time advantage to your peers that don't. > and > "I have patches to add devices to the type/list data structure and I can't upstream them for a long time (or ever). I don't want to waste time constantly fixing my broken patches everytime someone adds a new public supported type." > Both approaches (parsing json and extending the list runtime and separating the data structs into a separate file + returning the union of base + extended) accomplish the same thing. One requires a recipe to copy a ExtendPSUSensorConfig.json in a recipe somewhere to usr/share/PSUSensors (or something) and the other is just a patch that gets applied. Talking it through, I now realize that the slight code refactoring approach is a lot less work and a lot more simple..and something that I'd actually have time to contribute. EXCELLENT! Add me to the patch, and I'd be happy to review it for you (With that said, James is pretty fast and sometimes beats me to the punch). > > >> Have you filed a bug, or asked on the mailing list before now? This >> is the kind of feedback the authors of that sensor need (Ideally >> before you move over to another subsystem like hwmontemp). > > I never really considered hwmontemp a different sub-system since they live in the same repo and seemed to be more specific towards monitoring temperature telemetry. Ignore the above comment about subsystem. I thought you were talking about phosphor-hwmon. My bad. > As far as bugs go, I CC'ed Alex Qiu who has more experience with the performance of PSUSensors. I haven't personally ran into this problem; just know about it from talk amongst the larger team. Excellent. Looking forward to details. >> >> If I >> didn't see your message/bug and you did post it, I apologize, I'm not >> trying to call you out. >> If you have specifics, like the value of N, and the details around >> what chips you're interacting with and whatever debugging you've done, >> it would be helpful to put that in a bug for triage. > > Alex, maybe you can add some color here? >> >> Keep in mind, PSUSensor by default has a 1 second scan rate. >> https://github.com/openbmc/dbus-sensors/blob/41061e2c3198c0f597d4f6bb702b690a273ab45d/include/PSUSensor.hpp#L38 >> If it's not obeying that, clearly there's a bug to fix somewhere. >> On some platforms, I have seen very high rate polling of power values >> on the PSU I2c bus by other devices, and that tends to hold up >> transactions for other components. If that bus is misbehaving or >> overloaded on your platform, it might have triggered a weird condition >> within the PSU sensor (like the scans taking longer than the scan >> rate). >> If you have any more details here, it's quite possible someone will >> have an idea where to look, or know exactly where the problem is. > > ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 21:53 ` Ed Tanous @ 2020-08-11 0:08 ` Alex Qiu 2020-08-11 23:02 ` Ed Tanous 0 siblings, 1 reply; 26+ messages in thread From: Alex Qiu @ 2020-08-11 0:08 UTC (permalink / raw) To: Ed Tanous, James Feist; +Cc: Jason Ling, OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 9956 bytes --] Hi Ed and James, Is it acceptable for a device to be listed in both HwmonTempSensor and PSUSensor? - Alex Qiu On Fri, Aug 7, 2020 at 2:53 PM Ed Tanous <ed@tanous.net> wrote: > On Fri, Aug 7, 2020 at 2:05 PM Jason Ling <jasonling@google.com> wrote: > >> > >> don't like the merging of base lists with extended lists, as it adds > >> a dependency between how we represent that, and implies that we have a > >> published plugin interface, which we definitely don't, nor do we want > >> to maintain it at the entity manager level. It also means that > >> upstream never tests the "extended" list, which means it's a lot more > >> likely to break. > > > > My stance on devices added to the extended list is the same as devices > added by downstream patches. Upstream maintainers aren't responsible for > testing those, if you're patching in devices then you take the > responsibility of testing those. > > Sure, but upstream does bear the responsibility for testing that this > "extended types" system works? > > > My objective is to make it so devices that we don't want to upstream yet > can be maintained more easily. > > Ever? To be frank for a moment, you're going to have a hard time > pushing that one forward in an open source project. If there are a > lot of users that are planning on maintaining private forks forever, > maybe there's a case for this, but I think most uses of private forks > tends to be temporary, even if it's longer term, and the long term > intention is to upstream. > > > First idea was a json file that extends types/lists but there are > concerns with runtime parsing for devices for the purposes of exposing > telemetry (I'd think runtime parsing of things like PID configurations > would be more worrisome). > > So second idea was just to move data structures around in > dbus-sensors/PSUSensor to make upstream changes less likely to result in > merge conflicts for those who are maintaining downstream patches. > Furthermore just split the data structure up into a portion that is > upstream and downstream. > > Yes, it's definitely making it friendlier for those who want to maintain > downstream patches to extend devices. I don't see how this increases > maintenance or testing burden for the maintainers though. > > It increases burden because any time the maintainer wants to change > the name of that file, change the name of the structure in that file, > add a field, rename fields, or change compiler flags, that file will > break in your downstream, and the maintainers will have no idea. It > definitely increases the maintenance burden, and all of the things > I've mentioned and more have happened over the short life of entity > manager. I'm sure they will continue to happen as they evolve (or > until EM is replaced by something better). > > > If however , the intent is to explicitly send the message > > "You shouldn't try to use this service for any devices that do not have > explicit upstream support. Any patches that make it easier to do so will be > rejected." > > then I agree with your earlier suggestion that maybe the best approach > is to create another service for those devices. > > I'm not the maintainer of this project anymore, so my opinion is just > that, and I have no ability to reject patches :) I would rephrase my > position as: Modifying the source code directly is not an adequate > long term API for making permanent, never to be upstreamed changes. > The closest guarantee to that kind of API that is dbus, the second > closest is an Entity manager config, each of which have their own pros > and cons. If we as a project can do anything to make the transition > between downstream code to upstream code easier, like trying to make > merge conflicts less likely on commonly modified files, without the > expense of maintainability or complexity, I'm absolutely for it, but > creating explicit data structures and hard guarantees about downstream > code on a binary boundary needs a much larger discussion, and speaks > to some of the project's main principles about the "Open" part of > OpenBMC. > > > > >> I originally wrote a big long idea about how to make the above work, > >> but got to the end, and realized that I'm still trying to understand > >> what we're trying to avoid here with the downstream/upstream lists > >> thing. It's easy enough to patch the existing list to add your new > >> custom types, then when it comes time to merge because the > >> project/component is public, the patch is ready and good to upstream. > >> What are we buying by moving that info to a non-patch format? > > > > You get the benefit of separating devices into two classes.. > > (1) types that are upstream , have been tested by someone else and they > are ready to go without additional work. > > (2) types that are not upstream, because the devices are not public yet > or may never be public and need to be kept downstream for a long period of > time (or forever). > > Don't you already have that distinction today? Devices that are in > your downstream patch fall into category 2, devices that are on > openbmc master fall into category 1. Maybe I'm missing something? > > >> > >> I think > >> moving it to a file means it's a lot less likely to be upstreamed, as > >> it requires the next person to understand it to use it, and modify the > >> patch rather than simply cleaning up the commit message, testing it, > >> and firing it at gerrit. > > > > Yes, the file would be for those things that are never meant to be > upstreamed or won't be upstreamed for a long while. > > See above. > > >> > >> Having done this pattern many times > >> (including with that list specifically) I think the only thing we > >> could improve would be to move that list to its own file (but still > >> C++ code), so it gets fewer merge conflicts, and you can completely > >> replace it if you like, but even that doesn't solve the problem if > >> code is never upstreamed. > > > > Yup, the problem here are the following > > "I have patches I keep downstream and they keep getting broken whenever > the types/device list gets updated. I wish these data structures were in a > file that doesn't get patched often" > > Moving that structure to its own file sounds totally reasonable, > although (if I were the maintainer) it would have no guarantees > granted. Said file may change name, structure, naming, or compiler > magic that will cause downstream to break. Funnily enough, in the act > of implementing that, you will ironically break a lot of peoples > downstreams, and I'm personally ok with that. You (and your team) > needs to be ok when people break your downstream for similar reasons. > > On a personal note, if you haven't already, I highly recommend > spending some learning time on getting a good setup and mental model > for merging conflicting patches. As a useful skill, it comes up > ridiculously often in software, especially if you're a maintainer. If > you have the ability to resolve conflicts quickly and correctly it > puts you at a significant time advantage to your peers that don't. > > > and > > "I have patches to add devices to the type/list data structure and I > can't upstream them for a long time (or ever). I don't want to waste time > constantly fixing my broken patches everytime someone adds a new public > supported type." > > Both approaches (parsing json and extending the list runtime and > separating the data structs into a separate file + returning the union of > base + extended) accomplish the same thing. One requires a recipe to copy a > ExtendPSUSensorConfig.json in a recipe somewhere to usr/share/PSUSensors > (or something) and the other is just a patch that gets applied. Talking it > through, I now realize that the slight code refactoring approach is a lot > less work and a lot more simple..and something that I'd actually have time > to contribute. > > EXCELLENT! Add me to the patch, and I'd be happy to review it for you > (With that said, James is pretty fast and sometimes beats me to the > punch). > > > > > > >> Have you filed a bug, or asked on the mailing list before now? This > >> is the kind of feedback the authors of that sensor need (Ideally > >> before you move over to another subsystem like hwmontemp). > > > > I never really considered hwmontemp a different sub-system since they > live in the same repo and seemed to be more specific towards monitoring > temperature telemetry. > > Ignore the above comment about subsystem. I thought you were talking > about phosphor-hwmon. My bad. > > > As far as bugs go, I CC'ed Alex Qiu who has more experience with the > performance of PSUSensors. I haven't personally ran into this problem; just > know about it from talk amongst the larger team. > > Excellent. Looking forward to details. > > >> > >> If I > >> didn't see your message/bug and you did post it, I apologize, I'm not > >> trying to call you out. > >> If you have specifics, like the value of N, and the details around > >> what chips you're interacting with and whatever debugging you've done, > >> it would be helpful to put that in a bug for triage. > > > > Alex, maybe you can add some color here? > >> > >> Keep in mind, PSUSensor by default has a 1 second scan rate. > >> > https://github.com/openbmc/dbus-sensors/blob/41061e2c3198c0f597d4f6bb702b690a273ab45d/include/PSUSensor.hpp#L38 > >> If it's not obeying that, clearly there's a bug to fix somewhere. > >> On some platforms, I have seen very high rate polling of power values > >> on the PSU I2c bus by other devices, and that tends to hold up > >> transactions for other components. If that bus is misbehaving or > >> overloaded on your platform, it might have triggered a weird condition > >> within the PSU sensor (like the scans taking longer than the scan > >> rate). > >> If you have any more details here, it's quite possible someone will > >> have an idea where to look, or know exactly where the problem is. > > > > > [-- Attachment #2: Type: text/html, Size: 11670 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-11 0:08 ` Alex Qiu @ 2020-08-11 23:02 ` Ed Tanous 2020-08-12 0:46 ` Alex Qiu 0 siblings, 1 reply; 26+ messages in thread From: Ed Tanous @ 2020-08-11 23:02 UTC (permalink / raw) To: Alex Qiu; +Cc: James Feist, Jason Ling, OpenBMC Maillist On Mon, Aug 10, 2020 at 5:09 PM Alex Qiu <xqiu@google.com> wrote: > > Hi Ed and James, > > Is it acceptable for a device to be listed in both HwmonTempSensor and PSUSensor? > Can you lay out a little bit of info about why you would need the same type in both places? My knee jerk reaction would be to say no, under the idea that we shouldn't be duplicating functionality and code between sensors, but if there are good technical reasons that amount to more than "one has bugs, the other doesn't" then I definitely would be for allowing it. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-11 23:02 ` Ed Tanous @ 2020-08-12 0:46 ` Alex Qiu 2020-08-12 1:42 ` Ed Tanous 0 siblings, 1 reply; 26+ messages in thread From: Alex Qiu @ 2020-08-12 0:46 UTC (permalink / raw) To: Ed Tanous; +Cc: James Feist, Jason Ling, OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 1550 bytes --] Hi Ed, The question was more of a general ask on whether dbus-sensors plans for this on record. If so, individual daemon should have the config option to ignore a device completely. Currently, I think PSUSensor has the ability, but HwmonTempSensor does not. The reason behind it may be complicated. One is if we can fix the PSUSensor performance issue on time so that we can use it directly for PID control based on VR temperatures. And then, if we can't fix it on time, what work around can we have? Is it upstream-able or local patches? What's more, can we have different polling rates for temperature and voltage/current/power by using multiple daemon for the same device? Of course, ideally, we can go for a fast feature-enriched PSUSensor to take care of everything and deprecate HwmonTempSensor, but you know I have been talking about schedule for multiple times with you before... Thanks. - Alex Qiu On Tue, Aug 11, 2020 at 4:02 PM Ed Tanous <ed@tanous.net> wrote: > On Mon, Aug 10, 2020 at 5:09 PM Alex Qiu <xqiu@google.com> wrote: > > > > Hi Ed and James, > > > > Is it acceptable for a device to be listed in both HwmonTempSensor and > PSUSensor? > > > > Can you lay out a little bit of info about why you would need the same > type in both places? My knee jerk reaction would be to say no, under > the idea that we shouldn't be duplicating functionality and code > between sensors, but if there are good technical reasons that amount > to more than "one has bugs, the other doesn't" then I definitely would > be for allowing it. > [-- Attachment #2: Type: text/html, Size: 2162 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-12 0:46 ` Alex Qiu @ 2020-08-12 1:42 ` Ed Tanous 2020-08-12 16:19 ` Alex Qiu 0 siblings, 1 reply; 26+ messages in thread From: Ed Tanous @ 2020-08-12 1:42 UTC (permalink / raw) To: Alex Qiu; +Cc: James Feist, Jason Ling, OpenBMC Maillist On Tue, Aug 11, 2020 at 5:46 PM Alex Qiu <xqiu@google.com> wrote: > > The question was more of a general ask on whether dbus-sensors plans for this on record. If so, individual daemon should have the config option to ignore a device completely. Currently, I think PSUSensor has the ability, but HwmonTempSensor does not. > You're talking about ignoring a specific leg of a device? Like ignoring a particular sensor on an installed device? Or are you talking about ignoring a device class entirely? Maybe calling out the specific use case of what you're wanting to ignore might help here. > The reason behind it may be complicated. One is if we can fix the PSUSensor performance issue on time so that we can use it directly for PID control based on VR temperatures. And then, if we can't fix it on time, what work around can we have? I'm assuming "on time" here refers to some product schedule? Without knowing your particular timelines, I will say this: we've solved several orders of magnitude worse performance problems in dbus sensors in the past. I'm assuming this one just requires the proper application of debugging, thought, and engineering. I'm not sure how to answer your question about workarounds: You would have whatever workarounds you're willing to build, that's the beauty of open source. If using HwMonTempSensor is a workaround, and you're willing to live with the patch, by all means, use it. > Is it upstream-able or local patches That would depend on what your workarounds entail. If your workarounds follow the principles of the project, don't duplicate functionality, don't break any of the existing use cases, and are maintainable, then they're probably upstreamable. If you're duplicating features between daemons unnecessarily for lack of wanting to hunt down and understand a performance bug, that's probably something you need to keep in a local patch until you have time to do the appropriate debugging. > ? What's more, can we have different polling rates for temperature and voltage/current/power by using multiple daemon for the same device? Why don't we just make the polling rate configurable in the EM config? Each sensor has its own timer for exactly this reason. In the original design of dbus-sensors, each sensor instance also ran in its own process. We moved it to each sensor type running in a shared process because the context switches were getting really bad, and a lot of the sensors of similar types had very similar event matches, so it allowed us to reduce the dbus load for things like power on. We could certainly revisit this, but for what you're wanting, I suspect we can just configure the polling rates per sensor. We hardcoded them under the assumption that we could build one reasonable default that worked for everyone, but clearly you've broken that assumption, so lets throw it in a config file, put some reasonable defaults on it, and call it good. (note, this is subject to James' opinion as maintainer, I'm not sure if he'll agree here). > Of course, ideally, we can go for a fast feature-enriched PSUSensor to take care of everything and deprecate HwmonTempSensor, but you know I have been talking about schedule for multiple times with you before... > I understand short schedules. It always feels like there's not enough time. The best advice I have here is to try to break your problem space down into small problems such that you can get a patch written, tested, and pushed to gerrit out for that day. Then use those patches to slowly move toward what you want, even if you keep stacking them up in upstream. Eventually the maintainer will review them, or you'll have solved someone else's problem before they hit it. If enough people do this, we'll be way ahead on these types of bugs. At some point James might need to school us on what the theoretical difference is between PSUsensor and HwmonSensor. I know one was originally built for PSU modules, and the other was built for on board hardware, but they're getting pretty similar, maybe it does make sense to have certain devices in both? ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-12 1:42 ` Ed Tanous @ 2020-08-12 16:19 ` Alex Qiu 0 siblings, 0 replies; 26+ messages in thread From: Alex Qiu @ 2020-08-12 16:19 UTC (permalink / raw) To: Ed Tanous; +Cc: James Feist, Jason Ling, OpenBMC Maillist [-- Attachment #1: Type: text/plain, Size: 4804 bytes --] Hi Ed, I was talking about, if having one device in multiple daemons is expected, it would make sense to have some switch in the config to turn off the support for one device according to the setup. From your reply, I think I see this is not really preferred or expected from the design, so I'll leave it to Sui to come up with more data on the issue that we are having, and progress on the actual issue. In case the time is running out, we'll try local patches for workarounds. Thanks! - Alex Qiu On Tue, Aug 11, 2020 at 6:42 PM Ed Tanous <ed@tanous.net> wrote: > On Tue, Aug 11, 2020 at 5:46 PM Alex Qiu <xqiu@google.com> wrote: > > > > The question was more of a general ask on whether dbus-sensors plans for > this on record. If so, individual daemon should have the config option to > ignore a device completely. Currently, I think PSUSensor has the ability, > but HwmonTempSensor does not. > > > > You're talking about ignoring a specific leg of a device? Like > ignoring a particular sensor on an installed device? Or are you > talking about ignoring a device class entirely? Maybe calling out the > specific use case of what you're wanting to ignore might help here. > > > The reason behind it may be complicated. One is if we can fix the > PSUSensor performance issue on time so that we can use it directly for PID > control based on VR temperatures. And then, if we can't fix it on time, > what work around can we have? > > I'm assuming "on time" here refers to some product schedule? Without > knowing your particular timelines, I will say this: we've solved > several orders of magnitude worse performance problems in dbus sensors > in the past. I'm assuming this one just requires the proper > application of debugging, thought, and engineering. I'm not sure how > to answer your question about workarounds: You would have whatever > workarounds you're willing to build, that's the beauty of open source. > If using HwMonTempSensor is a workaround, and you're willing to live > with the patch, by all means, use it. > > > Is it upstream-able or local patches > > That would depend on what your workarounds entail. If your > workarounds follow the principles of the project, don't duplicate > functionality, don't break any of the existing use cases, and are > maintainable, then they're probably upstreamable. If you're > duplicating features between daemons unnecessarily for lack of wanting > to hunt down and understand a performance bug, that's probably > something you need to keep in a local patch until you have time to do > the appropriate debugging. > > > ? What's more, can we have different polling rates for temperature and > voltage/current/power by using multiple daemon for the same device? > > Why don't we just make the polling rate configurable in the EM config? > Each sensor has its own timer for exactly this reason. In the > original design of dbus-sensors, each sensor instance also ran in its > own process. We moved it to each sensor type running in a shared > process because the context switches were getting really bad, and a > lot of the sensors of similar types had very similar event matches, so > it allowed us to reduce the dbus load for things like power on. We > could certainly revisit this, but for what you're wanting, I suspect > we can just configure the polling rates per sensor. We hardcoded them > under the assumption that we could build one reasonable default that > worked for everyone, but clearly you've broken that assumption, so > lets throw it in a config file, put some reasonable defaults on it, > and call it good. (note, this is subject to James' opinion as > maintainer, I'm not sure if he'll agree here). > > > Of course, ideally, we can go for a fast feature-enriched PSUSensor to > take care of everything and deprecate HwmonTempSensor, but you know I have > been talking about schedule for multiple times with you before... > > > > I understand short schedules. It always feels like there's not enough > time. The best advice I have here is to try to break your problem > space down into small problems such that you can get a patch written, > tested, and pushed to gerrit out for that day. Then use those patches > to slowly move toward what you want, even if you keep stacking them up > in upstream. Eventually the maintainer will review them, or you'll > have solved someone else's problem before they hit it. If enough > people do this, we'll be way ahead on these types of bugs. > > At some point James might need to school us on what the theoretical > difference is between PSUsensor and HwmonSensor. I know one was > originally built for PSU modules, and the other was built for on board > hardware, but they're getting pretty similar, maybe it does make sense > to have certain devices in both? > [-- Attachment #2: Type: text/html, Size: 5661 bytes --] ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:54 ` Jason Ling 2020-08-07 17:06 ` James Feist @ 2020-08-07 17:24 ` Ed Tanous 1 sibling, 0 replies; 26+ messages in thread From: Ed Tanous @ 2020-08-07 17:24 UTC (permalink / raw) To: Jason Ling; +Cc: James Feist, OpenBMC Maillist On Fri, Aug 7, 2020 at 9:54 AM Jason Ling <jasonling@google.com> wrote: > > > What about making the device/type lists in PSUSensors extendable using JSON? >> >> Say more about what you're wanting to do here..... > > Take https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L43 and https://github.com/openbmc/dbus-sensors/blob/master/src/PSUSensorMain.cpp#L59 and make it so those can be extended using (for example) without an upstream code change. I picked JSON as the easiest example. > IIRC PSUSensors does validity checks to make sure that the device emits a name in its 'permit list' (hwmontempsensor is less picky) so tricking PSUSensors into gathering telemetry for a non-public device is tricky. > I'm conflicted on that. Part of the reason that list is there, and not in the config files directly, is to convey that those are the types that have been tested to work correctly, and the types that the config files "promise" to work sanely. The other thing is, if you've done the testing, adding to that list is (should be) relatively trivial and straightforward. Opening up that list to runtime parsing opens a lot of security questions I'm not prepared to answer. >> Can you give an >> example of what you would use it for? > > Sure, the primary use case would be > It's a non public device. Would rather not broadcast information about it or have to obfuscate the device name. Really would rather not maintain patches until the device is made public. First off, you will always have merge conflicts when you do this. Moving the config to json doesn't eliminate the possibility of merge conficts. I'd highly recommend doing the research in your org to understand why these things can't be public. In a lot of cases it's legitimate, in others it's just dogma. The intent behind EM was that even if the configuration files have to live downstream, a majority of the daemon code can be upstream. Unless you're working on custom silicon (and please don't tell me if you are) this should work just fine with most commercially available devices. In theory, the intended "recommended by the authors" way to handle a "custom, can't be published anywhere" device would be to create a totally separate daemon that implements that new device functionality, and adds an exposes record specifically for that to your config in a patch. In theory, the config file patch is small, and pretty easy to merge, and should be the only conflict you're likely to get if your app is self-contained. That's not to say this is supported by the project, or that the EM maintainers make any promise to not break you, but it tends to be a lot less likely, and has been a pattern that I've seen work well in the past. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:34 ` Jason Ling 2020-08-07 16:39 ` Ed Tanous @ 2020-08-07 17:01 ` James Feist 1 sibling, 0 replies; 26+ messages in thread From: James Feist @ 2020-08-07 17:01 UTC (permalink / raw) To: Jason Ling; +Cc: OpenBMC Maillist On 8/7/2020 9:34 AM, Jason Ling wrote: > ack, > sounds like I'm better off just moving temp sensors over to being > handled by PSUSensors. > I'm not sure this is a great idea either, I think taking the label parsing out of PSU sensor and making something generic that both can use would be the best approach. ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: dbus-sensors:hwmontemp: additional attribute proposal 2020-08-07 16:24 ` James Feist 2020-08-07 16:34 ` Jason Ling @ 2020-08-07 16:37 ` Ed Tanous 1 sibling, 0 replies; 26+ messages in thread From: Ed Tanous @ 2020-08-07 16:37 UTC (permalink / raw) To: James Feist; +Cc: Jason Ling, OpenBMC Maillist On Fri, Aug 7, 2020 at 9:28 AM James Feist <james.feist@linux.intel.com> wrote: > > On 8/6/2020 3:52 PM, Jason Ling wrote: > > Assuming that you mean "Omit Name attribute from the sensor > > configuration definition and then change hwmontemp to require any Name.*" > > This won't work since Entity-Manager requires Name (tried it, > > entity-manager does indeed complain about not finding name). > > > > My rationale for an omit list vs permit list > > (1) if it's a permit list then everytime you add another temp you want > > to monitor you need to add to this list..if you want to drop a temp then > > you have to modify the list again. > > (2) General assumption is that the primary use case is to display all > > named temperatures which means a permit list is typically large > > (3) adding a permit list also breaks all existing code. Everyone has to > > go back into their json config and add all the sensor values to the list. > > > > My rationale for using the value for the "Name" attribute rather than > > labels or referencing sysfs attributes > > (1) Looking at just the config , it's obvious as to what you're omitting. > > (2) If it's label base, a label change in a driver would mean a breakage > > in the userspace daemon. Not a big deal; but it can be annoying. > > (3) if it's sysfs attribute based then it's my opinion that it's not as > > readable. > > > > I'm not a huge fan of this as the PSU sensor already has a way of > handling this, and it adds a new way of handling it. I'd rather follow > what is already there. It's already confusing enough that hwmontemp and > psu do things in slightly different ways. > +1 If we're talking about adding this to PSU sensor. For dbus-sensors, this feels like an already solved problem, even if the particular sensor you're using might not have implemented the pattern yet. If we're talking about phosphor-hwmon, I think that's a very different discussion, and one I don't have a strong opinion on. ^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2020-08-12 16:19 UTC | newest] Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-05 23:42 dbus-sensors:hwmontemp: additional attribute proposal Jason Ling 2020-08-06 6:07 ` Zbyszek 2020-08-06 20:40 ` James Feist 2020-08-06 22:52 ` Jason Ling 2020-08-07 16:24 ` James Feist 2020-08-07 16:34 ` Jason Ling 2020-08-07 16:39 ` Ed Tanous 2020-08-07 16:54 ` Jason Ling 2020-08-07 17:06 ` James Feist 2020-08-07 17:17 ` Jason Ling 2020-08-07 17:40 ` James Feist 2020-08-07 18:04 ` Jason Ling 2020-08-07 18:17 ` James Feist 2020-08-07 18:29 ` Jason Ling 2020-08-07 19:27 ` Ed Tanous 2020-08-07 19:17 ` Ed Tanous 2020-08-07 21:05 ` Jason Ling 2020-08-07 21:53 ` Ed Tanous 2020-08-11 0:08 ` Alex Qiu 2020-08-11 23:02 ` Ed Tanous 2020-08-12 0:46 ` Alex Qiu 2020-08-12 1:42 ` Ed Tanous 2020-08-12 16:19 ` Alex Qiu 2020-08-07 17:24 ` Ed Tanous 2020-08-07 17:01 ` James Feist 2020-08-07 16:37 ` Ed Tanous
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.