* Sensor Value PropertiesChanged Events @ 2021-02-02 0:42 Bills, Jason M 2021-02-02 1:26 ` Ed Tanous 0 siblings, 1 reply; 7+ messages in thread From: Bills, Jason M @ 2021-02-02 0:42 UTC (permalink / raw) To: openbmc Hi All, There is an issue and idea that James Feist and I chatted about to maybe relieve some of our D-Bus traffic. A major contributor to our D-Bus traffic (as seen in dbus-monitor) is the polling sensors updating the xyz.openbmc_project.Sensor.Value.Value property on each polling loop, which generates a PropertiesChanged signal for every sensor on every polling loop (once per second?). The concern is that more important D-Bus messages could be getting delayed as D-Bus processes these Sensor Value signals. The idea to fix this is to change the sensors with a custom getter on the Value property, so the last read can be pulled from D-Bus using a get-property call, but it would no longer signal a PropertiesChanged event. I pushed a proposed change here: https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40199. Our original assumption was that nobody was matching on this PropertiesChanged signal for the Value property; however, it was pointed out to me today, that PID control has a match for it and may be using it. So, I wanted to start a broader community discussion about this issue: 1. Is this a real concern or are PropertiesChanged signals so lightweight that removing them won't help with D-Bus load? 2. Does anyone need to match on sensor Value property updates or is reading them with get-property enough? 3. Does PID control use the Value match? If so and there are benefits to removing these signals, could PID control manage without them? As a side note, I still have two remaining services that publish PropertiesChanged events on sensor Value properties: PWM Sensors. I have a proposed (and untested) change here: gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40200. A Power sensor, that I will track down based on this discussion. Thanks! -Jason ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Sensor Value PropertiesChanged Events 2021-02-02 0:42 Sensor Value PropertiesChanged Events Bills, Jason M @ 2021-02-02 1:26 ` Ed Tanous 2021-02-02 10:02 ` Ambrozewicz, Adrian ` (2 more replies) 0 siblings, 3 replies; 7+ messages in thread From: Ed Tanous @ 2021-02-02 1:26 UTC (permalink / raw) To: Bills, Jason M; +Cc: openbmc On Mon, Feb 1, 2021 at 4:44 PM Bills, Jason M <jason.m.bills@linux.intel.com> wrote: > > Hi All, > > There is an issue and idea that James Feist and I chatted about to maybe > relieve some of our D-Bus traffic. > > A major contributor to our D-Bus traffic (as seen in dbus-monitor) is > the polling sensors updating the xyz.openbmc_project.Sensor.Value.Value > property on each polling loop, which generates a PropertiesChanged > signal for every sensor on every polling loop (once per second?). > > The concern is that more important D-Bus messages could be getting > delayed as D-Bus processes these Sensor Value signals. > > The idea to fix this is to change the sensors with a custom getter on > the Value property, so the last read can be pulled from D-Bus using a > get-property call, but it would no longer signal a PropertiesChanged event. Doesn't this break..... like... everything that relies on sensor values changing over time? > > I pushed a proposed change here: > https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40199. > > Our original assumption was that nobody was matching on this > PropertiesChanged signal for the Value property; however, it was pointed > out to me today, that PID control has a match for it and may be using it. Pid control, telemetry, redfish event service are the ones that immediately come to mind. It should be noted, dbus-sensors even uses that message internally for things like CFM sensor, which has to base its output on a transform of other sensor values, so there'd be a lot of stuff to fix if we did this, we'd have to make sure it's worth it. > > So, I wanted to start a broader community discussion about this issue: > > 1. Is this a real concern or are PropertiesChanged signals so > lightweight that removing them won't help with D-Bus load? It's a valid concern IMO. It's arguably the current limit on how fast we can scan sensors on large-sensor-count BMCs. In terms of dbus-sensors architecture stuff, it's next in line on my priority list after the whole "reading sensors from hwmon blocks the thread" problem. > > 2. Does anyone need to match on sensor Value property updates or is > reading them with get-property enough? See above. Lots of stuff needs property value updates, and moving all that stuff back to polling doesn't really seem like an option, or a good thing. > > 3. Does PID control use the Value match? If so and there are benefits > to removing these signals, could PID control manage without them? Phosphor-pid-control and friends could theoretically move to polling, but that seems much much worse, and increases the dbus traffic overall, given that every poll would now have to go round trip through both processes, instead of one way. > > > As a side note, I still have two remaining services that publish > PropertiesChanged events on sensor Value properties: > > PWM Sensors. I have a proposed (and untested) change here: > gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40200. > > A Power sensor, that I will track down based on this discussion. One thought I've had in this area was to rearrange/redefine the dbus interfaces such that a single properties changed (or equivalent) could batch multiple sensor values together, thus avoiding the per-message cost, while still keeping the eventing available. The three basic strawman ideas I've had for this in phosphor-dbus-interfaces yaml would be something like NewSensorValueValueAPI - name: Names type: array<string> - name: Values type: array<double> description: > The sensor reading. - name: MaxValues type: array<double> description: > The Maximum supported sensor reading. - name: MinValues type: array<double> description: > The Minimum supported sensor reading. - name: Units type: array<enum[self.Unit]> Because all properties are vector<double>, and Names are specified in a single property, the PropertiesChanged events could batch together 10s of sensors in a single message, and use a tombstone value for "haven't updated since the last update" The other thought was we could completely delete the Value property from Sensor.Value, and replace it with a SensorManager that would live at /xyz/openbmc_project/sensors, which would publish a SensorValuesUpdated signal with a dict of name and value, again, allowing sensor producers to batch the sensor reads, but still keeping it reasonably close to the existing API. In both cases, the implementation in dbus-sensors or phosphor-hwmon would be something like "read as many sensors as I can in 250ms, then batch it up and send out one event" the 250ms timer would help with stuck sensors, and avoid a lot of latency if the system is overloaded. The third one is a little more out there. We could change the sensor.Value.Value property into an FD type, and point to a memmapped area of memory into which we write the current sensor value. That way, the "sensor value" on dbus rarely changes, but you can always read the current state of the sensor with zero overhead or context switching to the sensor processes. If this works, it has the potential to speed up most sensor polling operations by an order of magnitude, but seems hard to do, and has a lot of questions. Does inotify work on mmaped files? How do FD permissions work when sent over dbus? How do you "lock" the memory region to write it (or do you not have to)? With all of the above said, I think we really need to take a look and profile why individual dbus messages are so slow, and if it's fixable at a lower layer than the interfaces. I know there were some efforts to put the broker into the kernel that kinda petered out, but I was never clear as to why. I wanted to try grabbing those patches to see what performance benefit they gave at some point. > > Thanks! > -Jason ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Sensor Value PropertiesChanged Events 2021-02-02 1:26 ` Ed Tanous @ 2021-02-02 10:02 ` Ambrozewicz, Adrian 2021-02-04 15:55 ` Patrick Williams 2021-02-02 20:39 ` Bills, Jason M 2021-02-03 0:11 ` Andrew Jeffery 2 siblings, 1 reply; 7+ messages in thread From: Ambrozewicz, Adrian @ 2021-02-02 10:02 UTC (permalink / raw) To: Ed Tanous, Bills, Jason M; +Cc: openbmc W dniu 2/2/2021 o 01:42, Bills, Jason M pisze: > 1. Is this a real concern or are PropertiesChanged signals so lightweight that removing them won't help with D-Bus load? Without proper measurements I don't believe we can be sure. Can we reliably benchmark how much CPU time and memory is used by PropertiesChanged roaming through the system? I would be interested in investing some time at profiling existing dbus-sensors services, as from time to time we're experiencing performance issues with them. My trust in systemd D-Bus implementation is that signals are implemented in optimal way and broker doesn't broadcast it to services without proper 'match' defined. It should be checked though Moving to polling might in fact increase D-Bus utilization by introducing message-response communication between producer and consumer. In certain cases of sensors which tend to update slowly, introducing a getter with faster interval would increase the traffic, if the interval would be faster than the sensor update rate. > 2. Does anyone need to match on sensor Value property updates or is reading them with get-property enough? TelemetryService specifies 'OnChange' mode for monitoring Metrics - "The metric report is generated when any of the metric values change.". My current experience with TelemetryService adopters, shows that administrators and data scientist want to get all the information they can (all sensors, possibly with highest rate). With more focus on telemetry these days I would expect more (hundreds of thousands) sensors to come. Polling each of them individually could be not feasible. Furthermore - having reliable event-based updates is crucial for catching short-lived anomalies (voltage spikes etc). I believe they are events of particular interest for data-center admins, while being easy to miss with polling-based updates. Algorithms working based on sensor updates would be also crippled in this case, as missing samples means worse accuracy. To sum up - I believe moving away from PropertiesChanged event would limit pretty important use cases for system telemetry. I wouldn't mind however on working out more performant solution, while keeping the same (or better) features: - optimal and efficient one-to-many broadcasting, - queuing multiple updates to be consumed by receiver, - low overhead of yet another side-band channel. Regards, Adrian ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Sensor Value PropertiesChanged Events 2021-02-02 10:02 ` Ambrozewicz, Adrian @ 2021-02-04 15:55 ` Patrick Williams 0 siblings, 0 replies; 7+ messages in thread From: Patrick Williams @ 2021-02-04 15:55 UTC (permalink / raw) To: Ambrozewicz, Adrian; +Cc: Bills, Jason M, Ed Tanous, openbmc [-- Attachment #1: Type: text/plain, Size: 1344 bytes --] On Tue, Feb 02, 2021 at 11:02:53AM +0100, Ambrozewicz, Adrian wrote: > W dniu 2/2/2021 o 01:42, Bills, Jason M pisze: > > My trust in systemd D-Bus implementation is that signals are implemented > in optimal way and broker doesn't broadcast it to services without > proper 'match' defined. It should be checked though > Moving to polling might in fact increase D-Bus utilization by > introducing message-response communication between producer and > consumer. In certain cases of sensors which tend to update slowly, > introducing a getter with faster interval would increase the traffic, if > the interval would be faster than the sensor update rate. You raise a very good point here. When I review code that contains a signal match almost every time I see code with very little in the match and doing a bunch of filtering in C++ code and I have to point this out. It is like doing a `SELECT *` in SQL. I wouldn't be surprised if there are lots of cases where we're effectively broadcasting signals and then dropping them on the receive side rather than allowing dbus-broker to filter out who gets them. We might be able to code something up in sdbusplus to warn when the filtering is likely insufficient but that'd probably become a runtime check that gets lost in a journal message. -- Patrick Williams [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Sensor Value PropertiesChanged Events 2021-02-02 1:26 ` Ed Tanous 2021-02-02 10:02 ` Ambrozewicz, Adrian @ 2021-02-02 20:39 ` Bills, Jason M 2021-02-03 0:11 ` Andrew Jeffery 2 siblings, 0 replies; 7+ messages in thread From: Bills, Jason M @ 2021-02-02 20:39 UTC (permalink / raw) To: Ed Tanous; +Cc: openbmc On 2/1/2021 5:26 PM, Ed Tanous wrote: > On Mon, Feb 1, 2021 at 4:44 PM Bills, Jason M > <jason.m.bills@linux.intel.com> wrote: >> >> Hi All, >> >> There is an issue and idea that James Feist and I chatted about to maybe >> relieve some of our D-Bus traffic. >> >> A major contributor to our D-Bus traffic (as seen in dbus-monitor) is >> the polling sensors updating the xyz.openbmc_project.Sensor.Value.Value >> property on each polling loop, which generates a PropertiesChanged >> signal for every sensor on every polling loop (once per second?). >> >> The concern is that more important D-Bus messages could be getting >> delayed as D-Bus processes these Sensor Value signals. >> >> The idea to fix this is to change the sensors with a custom getter on >> the Value property, so the last read can be pulled from D-Bus using a >> get-property call, but it would no longer signal a PropertiesChanged event. > > Doesn't this break..... like... everything that relies on sensor > values changing over time? I think this was my incorrect assumption that the PropertiesChanged signal for sensor value updates was not used and could be removed without significant impact. I will abandon my proposed change, but I'm glad there are other thoughts and discussion around this issue. Thanks! -Jason > >> >> I pushed a proposed change here: >> https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40199. >> >> Our original assumption was that nobody was matching on this >> PropertiesChanged signal for the Value property; however, it was pointed >> out to me today, that PID control has a match for it and may be using it. > snip... ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Sensor Value PropertiesChanged Events 2021-02-02 1:26 ` Ed Tanous 2021-02-02 10:02 ` Ambrozewicz, Adrian 2021-02-02 20:39 ` Bills, Jason M @ 2021-02-03 0:11 ` Andrew Jeffery 2021-02-03 1:28 ` Andrew Jeffery 2 siblings, 1 reply; 7+ messages in thread From: Andrew Jeffery @ 2021-02-03 0:11 UTC (permalink / raw) To: Ed Tanous, Bills, Jason M; +Cc: Jeremy Kerr, openbmc On Tue, 2 Feb 2021, at 11:56, Ed Tanous wrote: > > The third one is a little more out there. We could change the > sensor.Value.Value property into an FD type, and point to a memmapped > area of memory into which we write the current sensor value. That > way, the "sensor value" on dbus rarely changes, but you can always > read the current state of the sensor with zero overhead or context > switching to the sensor processes. If this works, it has the > potential to speed up most sensor polling operations by an order of > magnitude, but seems hard to do, and has a lot of questions. Does > inotify work on mmaped files? How do FD permissions work when sent > over dbus? How do you "lock" the memory region to write it (or do you > not have to)? > So... a little repo that I pushed recently might help out with this approach: https://github.com/amboar/shmapper It's a shared-memory implementation of the mapper daemon, but setting that aside for the moment, the implementation contains a shared library, libshmap, discussed in the README: https://github.com/amboar/shmapper#libshmap libshmap allows you to treat process-shared-memory like regular heap memory*, and also wraps up the pthread locking and condition signalling primitives in a way that they can be placed into the shared memory region without concern. So in brief, you could implement your out-there idea on top of libshmap, and it has solutions for most of the questions you list there already: * Permissions would be enforced by ownership of the sensor shared library (libraries?) built on top of libshmap. * Locking the memory regions can be done with shmap_mutex or shmap_rwlock, and * Notification can be done using shmap_mutex and shmap_condition I've been thinking about how I could do condition notification over a file-descriptor so you could poll it for updates to the shared memory. I haven't got anything concrete yet, but that would provide async monitoring too. Cheers, Andrew ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Sensor Value PropertiesChanged Events 2021-02-03 0:11 ` Andrew Jeffery @ 2021-02-03 1:28 ` Andrew Jeffery 0 siblings, 0 replies; 7+ messages in thread From: Andrew Jeffery @ 2021-02-03 1:28 UTC (permalink / raw) To: Ed Tanous, Bills, Jason M; +Cc: Jeremy Kerr, openbmc On Wed, 3 Feb 2021, at 10:41, Andrew Jeffery wrote: > > > On Tue, 2 Feb 2021, at 11:56, Ed Tanous wrote: > > > > The third one is a little more out there. We could change the > > sensor.Value.Value property into an FD type, and point to a memmapped > > area of memory into which we write the current sensor value. That > > way, the "sensor value" on dbus rarely changes, but you can always > > read the current state of the sensor with zero overhead or context > > switching to the sensor processes. If this works, it has the > > potential to speed up most sensor polling operations by an order of > > magnitude, but seems hard to do, and has a lot of questions. Does > > inotify work on mmaped files? How do FD permissions work when sent > > over dbus? How do you "lock" the memory region to write it (or do you > > not have to)? > > > > So... a little repo that I pushed recently might help out with this approach: > > https://github.com/amboar/shmapper > > It's a shared-memory implementation of the mapper daemon, but setting that > aside for the moment, the implementation contains a shared library, libshmap, > discussed in the README: > > https://github.com/amboar/shmapper#libshmap > > libshmap allows you to treat process-shared-memory like regular heap memory*, > and also wraps up the pthread locking and condition signalling primitives in a > way that they can be placed into the shared memory region without concern. > > So in brief, you could implement your out-there idea on top of libshmap, and it > has solutions for most of the questions you list there already: > > * Permissions would be enforced by ownership of the sensor shared library > (libraries?) built on top of libshmap. Alternatively, set permissions on the sem or shm files (currently libshmap just sets them to 0660), or given that you're probably going to wrap access up in an API anyway, split the producer and consumer APIs and have the consumer API only expose const values. Anyway, interested in your thoughts. Andrew ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-02-04 15:57 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-02-02 0:42 Sensor Value PropertiesChanged Events Bills, Jason M 2021-02-02 1:26 ` Ed Tanous 2021-02-02 10:02 ` Ambrozewicz, Adrian 2021-02-04 15:55 ` Patrick Williams 2021-02-02 20:39 ` Bills, Jason M 2021-02-03 0:11 ` Andrew Jeffery 2021-02-03 1:28 ` Andrew Jeffery
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).