All of lore.kernel.org
 help / color / mirror / Atom feed
* Sensor Value PropertiesChanged Events
@ 2021-02-02  0:42 Bills, Jason M
  2021-02-02  1:26 ` Ed Tanous
  0 siblings, 1 reply; 7+ messages in thread
From: Bills, Jason M @ 2021-02-02  0:42 UTC (permalink / raw)
  To: openbmc

Hi All,

There is an issue and idea that James Feist and I chatted about to maybe 
relieve some of our D-Bus traffic.

A major contributor to our D-Bus traffic (as seen in dbus-monitor) is 
the polling sensors updating the xyz.openbmc_project.Sensor.Value.Value 
property on each polling loop, which generates a PropertiesChanged 
signal for every sensor on every polling loop (once per second?).

The concern is that more important D-Bus messages could be getting 
delayed as D-Bus processes these Sensor Value signals.

The idea to fix this is to change the sensors with a custom getter on 
the Value property, so the last read can be pulled from D-Bus using a 
get-property call, but it would no longer signal a PropertiesChanged event.

I pushed a proposed change here: 
https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40199.

Our original assumption was that nobody was matching on this 
PropertiesChanged signal for the Value property; however, it was pointed 
out to me today, that PID control has a match for it and may be using it.

So, I wanted to start a broader community discussion about this issue:

1. Is this a real concern or are PropertiesChanged signals so 
lightweight that removing them won't help with D-Bus load?

2. Does anyone need to match on sensor Value property updates or is 
reading them with get-property enough?

3. Does PID control use the Value match?  If so and there are benefits 
to removing these signals, could PID control manage without them?


As a side note, I still have two remaining services that publish 
PropertiesChanged events on sensor Value properties:

PWM Sensors.  I have a proposed (and untested) change here: 
gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40200.

A Power sensor, that I will track down based on this discussion.

Thanks!
-Jason

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Sensor Value PropertiesChanged Events
  2021-02-02  0:42 Sensor Value PropertiesChanged Events Bills, Jason M
@ 2021-02-02  1:26 ` Ed Tanous
  2021-02-02 10:02   ` Ambrozewicz, Adrian
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Ed Tanous @ 2021-02-02  1:26 UTC (permalink / raw)
  To: Bills, Jason M; +Cc: openbmc

On Mon, Feb 1, 2021 at 4:44 PM Bills, Jason M
<jason.m.bills@linux.intel.com> wrote:
>
> Hi All,
>
> There is an issue and idea that James Feist and I chatted about to maybe
> relieve some of our D-Bus traffic.
>
> A major contributor to our D-Bus traffic (as seen in dbus-monitor) is
> the polling sensors updating the xyz.openbmc_project.Sensor.Value.Value
> property on each polling loop, which generates a PropertiesChanged
> signal for every sensor on every polling loop (once per second?).
>
> The concern is that more important D-Bus messages could be getting
> delayed as D-Bus processes these Sensor Value signals.
>
> The idea to fix this is to change the sensors with a custom getter on
> the Value property, so the last read can be pulled from D-Bus using a
> get-property call, but it would no longer signal a PropertiesChanged event.

Doesn't this break..... like... everything that relies on sensor
values changing over time?

>
> I pushed a proposed change here:
> https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40199.
>
> Our original assumption was that nobody was matching on this
> PropertiesChanged signal for the Value property; however, it was pointed
> out to me today, that PID control has a match for it and may be using it.

Pid control, telemetry, redfish event service are the ones that
immediately come to mind.  It should be noted, dbus-sensors even uses
that message internally for things like CFM sensor, which has to base
its output on a transform of other sensor values, so there'd be a lot
of stuff to fix if we did this, we'd have to make sure it's worth it.

>
> So, I wanted to start a broader community discussion about this issue:
>
> 1. Is this a real concern or are PropertiesChanged signals so
> lightweight that removing them won't help with D-Bus load?

It's a valid concern IMO.  It's arguably the current limit on how fast
we can scan sensors on large-sensor-count BMCs.  In terms of
dbus-sensors architecture stuff, it's next in line on my priority list
after the whole "reading sensors from hwmon blocks the thread"
problem.

>
> 2. Does anyone need to match on sensor Value property updates or is
> reading them with get-property enough?

See above.  Lots of stuff needs property value updates, and moving all
that stuff back to polling doesn't really seem like an option, or a
good thing.

>
> 3. Does PID control use the Value match?  If so and there are benefits
> to removing these signals, could PID control manage without them?

Phosphor-pid-control and friends could theoretically move to polling,
but that seems much much worse, and increases the dbus traffic
overall, given that every poll would now have to go round trip through
both processes, instead of one way.

>
>
> As a side note, I still have two remaining services that publish
> PropertiesChanged events on sensor Value properties:
>
> PWM Sensors.  I have a proposed (and untested) change here:
> gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40200.
>
> A Power sensor, that I will track down based on this discussion.


One thought I've had in this area was to rearrange/redefine the dbus
interfaces such that a single properties changed (or equivalent) could
batch multiple sensor values together, thus avoiding the per-message
cost, while still keeping the eventing available.

The three basic strawman ideas I've had for this in
phosphor-dbus-interfaces yaml would be something like

NewSensorValueValueAPI
- name: Names
type: array<string>
- name: Values
type: array<double>
description: >
The sensor reading.
- name: MaxValues
type: array<double>
description: >
The Maximum supported sensor reading.
- name: MinValues
type: array<double>
description: >
The Minimum supported sensor reading.
- name: Units
type: array<enum[self.Unit]>


Because all properties are vector<double>, and Names are specified in
a single property, the PropertiesChanged events could batch together
10s of sensors in a single message, and use a tombstone value for
"haven't updated since the last update"

The other thought was we could completely delete the Value property
from Sensor.Value, and replace it with a SensorManager that would live
at /xyz/openbmc_project/sensors, which would publish a
SensorValuesUpdated signal with a dict of name and value, again,
allowing sensor producers to batch the sensor reads, but still keeping
it reasonably close to the existing API.

In both cases, the implementation in dbus-sensors or phosphor-hwmon
would be something like "read as many sensors as I can in 250ms, then
batch it up and send out one event" the 250ms timer would help with
stuck sensors, and avoid a lot of latency if the system is overloaded.

The third one is a little more out there.  We could change the
sensor.Value.Value property into an FD type, and point to a memmapped
area of memory into which we write the current sensor value.  That
way, the "sensor value" on dbus rarely changes, but you can always
read the current state of the sensor with zero overhead or context
switching to the sensor processes.  If this works, it has the
potential to speed up most sensor polling operations by an order of
magnitude, but seems hard to do, and has a lot of questions.  Does
inotify work on mmaped files?  How do FD permissions work when sent
over dbus?  How do you "lock" the memory region to write it (or do you
not have to)?



With all of the above said, I think we really need to take a look and
profile why individual dbus messages are so slow, and if it's fixable
at a lower layer than the interfaces.  I know there were some efforts
to put the broker into the kernel that kinda petered out, but I was
never clear as to why.  I wanted to try grabbing those patches to see
what performance benefit they gave at some point.

>
> Thanks!
> -Jason

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Sensor Value PropertiesChanged Events
  2021-02-02  1:26 ` Ed Tanous
@ 2021-02-02 10:02   ` Ambrozewicz, Adrian
  2021-02-04 15:55     ` Patrick Williams
  2021-02-02 20:39   ` Bills, Jason M
  2021-02-03  0:11   ` Andrew Jeffery
  2 siblings, 1 reply; 7+ messages in thread
From: Ambrozewicz, Adrian @ 2021-02-02 10:02 UTC (permalink / raw)
  To: Ed Tanous, Bills, Jason M; +Cc: openbmc

W dniu 2/2/2021 o 01:42, Bills, Jason M pisze:

 > 1. Is this a real concern or are PropertiesChanged signals so 
lightweight that removing them won't help with D-Bus load?
Without proper measurements I don't believe we can be sure. Can we 
reliably benchmark how much CPU time and memory is used by 
PropertiesChanged roaming through the system?

I would be interested in investing some time at profiling existing 
dbus-sensors services, as from time to time we're experiencing 
performance issues with them.

My trust in systemd D-Bus implementation is that signals are implemented 
in optimal way and broker doesn't broadcast it to services without 
proper 'match' defined. It should be checked though
Moving to polling might in fact increase D-Bus utilization by 
introducing message-response communication between producer and 
consumer. In certain cases of sensors which tend to update slowly, 
introducing a getter with faster interval would increase the traffic, if 
the interval would be faster than the sensor update rate.


 > 2. Does anyone need to match on sensor Value property updates or is 
reading them with get-property enough?
TelemetryService specifies 'OnChange' mode for monitoring Metrics - "The 
metric report is generated when any of the metric values change.".

My current experience with TelemetryService adopters, shows that 
administrators and data scientist want to get all the information they 
can (all sensors, possibly with highest rate). With more focus on 
telemetry these days I would expect more (hundreds of thousands) sensors 
to come. Polling each of them individually could be not feasible.

Furthermore - having reliable event-based updates is crucial for 
catching short-lived anomalies (voltage spikes etc). I believe they are 
events of particular interest for data-center admins, while being easy 
to miss with polling-based updates.
Algorithms working based on sensor updates would be also crippled in 
this case, as missing samples means worse accuracy.

To sum up - I believe moving away from PropertiesChanged event would 
limit pretty important use cases for system telemetry. I wouldn't mind 
however on working out more performant solution, while keeping the same 
(or better) features:
- optimal and efficient one-to-many broadcasting,
- queuing multiple updates to be consumed by receiver,
- low overhead of yet another side-band channel.

Regards,
Adrian

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Sensor Value PropertiesChanged Events
  2021-02-02  1:26 ` Ed Tanous
  2021-02-02 10:02   ` Ambrozewicz, Adrian
@ 2021-02-02 20:39   ` Bills, Jason M
  2021-02-03  0:11   ` Andrew Jeffery
  2 siblings, 0 replies; 7+ messages in thread
From: Bills, Jason M @ 2021-02-02 20:39 UTC (permalink / raw)
  To: Ed Tanous; +Cc: openbmc



On 2/1/2021 5:26 PM, Ed Tanous wrote:
> On Mon, Feb 1, 2021 at 4:44 PM Bills, Jason M
> <jason.m.bills@linux.intel.com> wrote:
>>
>> Hi All,
>>
>> There is an issue and idea that James Feist and I chatted about to maybe
>> relieve some of our D-Bus traffic.
>>
>> A major contributor to our D-Bus traffic (as seen in dbus-monitor) is
>> the polling sensors updating the xyz.openbmc_project.Sensor.Value.Value
>> property on each polling loop, which generates a PropertiesChanged
>> signal for every sensor on every polling loop (once per second?).
>>
>> The concern is that more important D-Bus messages could be getting
>> delayed as D-Bus processes these Sensor Value signals.
>>
>> The idea to fix this is to change the sensors with a custom getter on
>> the Value property, so the last read can be pulled from D-Bus using a
>> get-property call, but it would no longer signal a PropertiesChanged event.
> 
> Doesn't this break..... like... everything that relies on sensor
> values changing over time?

I think this was my incorrect assumption that the PropertiesChanged 
signal for sensor value updates was not used and could be removed 
without significant impact.  I will abandon my proposed change, but I'm 
glad there are other thoughts and discussion around this issue.

Thanks!
-Jason

> 
>>
>> I pushed a proposed change here:
>> https://gerrit.openbmc-project.xyz/c/openbmc/dbus-sensors/+/40199.
>>
>> Our original assumption was that nobody was matching on this
>> PropertiesChanged signal for the Value property; however, it was pointed
>> out to me today, that PID control has a match for it and may be using it.
> 
snip...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Sensor Value PropertiesChanged Events
  2021-02-02  1:26 ` Ed Tanous
  2021-02-02 10:02   ` Ambrozewicz, Adrian
  2021-02-02 20:39   ` Bills, Jason M
@ 2021-02-03  0:11   ` Andrew Jeffery
  2021-02-03  1:28     ` Andrew Jeffery
  2 siblings, 1 reply; 7+ messages in thread
From: Andrew Jeffery @ 2021-02-03  0:11 UTC (permalink / raw)
  To: Ed Tanous, Bills, Jason M; +Cc: Jeremy Kerr, openbmc



On Tue, 2 Feb 2021, at 11:56, Ed Tanous wrote:
> 
> The third one is a little more out there.  We could change the
> sensor.Value.Value property into an FD type, and point to a memmapped
> area of memory into which we write the current sensor value.  That
> way, the "sensor value" on dbus rarely changes, but you can always
> read the current state of the sensor with zero overhead or context
> switching to the sensor processes.  If this works, it has the
> potential to speed up most sensor polling operations by an order of
> magnitude, but seems hard to do, and has a lot of questions.  Does
> inotify work on mmaped files?  How do FD permissions work when sent
> over dbus?  How do you "lock" the memory region to write it (or do you
> not have to)?
> 

So... a little repo that I pushed recently might help out with this approach:

https://github.com/amboar/shmapper

It's a shared-memory implementation of the mapper daemon, but setting that 
aside for the moment, the implementation contains a shared library, libshmap, 
discussed in the README:

https://github.com/amboar/shmapper#libshmap

libshmap allows you to treat process-shared-memory like regular heap memory*, 
and also wraps up the pthread locking and condition signalling primitives in a 
way that they can be placed into the shared memory region without concern.

So in brief, you could implement your out-there idea on top of libshmap, and it 
has solutions for most of the questions you list there already:

* Permissions would be enforced by ownership of the sensor shared library 
(libraries?) built on top of libshmap.
* Locking the memory regions can be done with shmap_mutex or shmap_rwlock, and
* Notification can be done using shmap_mutex and shmap_condition

I've been thinking about how I could do condition notification over a 
file-descriptor so you could poll it for updates to the shared memory. I 
haven't got anything concrete yet, but that would provide async monitoring too.

Cheers,

Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Sensor Value PropertiesChanged Events
  2021-02-03  0:11   ` Andrew Jeffery
@ 2021-02-03  1:28     ` Andrew Jeffery
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Jeffery @ 2021-02-03  1:28 UTC (permalink / raw)
  To: Ed Tanous, Bills, Jason M; +Cc: Jeremy Kerr, openbmc



On Wed, 3 Feb 2021, at 10:41, Andrew Jeffery wrote:
> 
> 
> On Tue, 2 Feb 2021, at 11:56, Ed Tanous wrote:
> > 
> > The third one is a little more out there.  We could change the
> > sensor.Value.Value property into an FD type, and point to a memmapped
> > area of memory into which we write the current sensor value.  That
> > way, the "sensor value" on dbus rarely changes, but you can always
> > read the current state of the sensor with zero overhead or context
> > switching to the sensor processes.  If this works, it has the
> > potential to speed up most sensor polling operations by an order of
> > magnitude, but seems hard to do, and has a lot of questions.  Does
> > inotify work on mmaped files?  How do FD permissions work when sent
> > over dbus?  How do you "lock" the memory region to write it (or do you
> > not have to)?
> > 
> 
> So... a little repo that I pushed recently might help out with this approach:
> 
> https://github.com/amboar/shmapper
> 
> It's a shared-memory implementation of the mapper daemon, but setting that 
> aside for the moment, the implementation contains a shared library, libshmap, 
> discussed in the README:
> 
> https://github.com/amboar/shmapper#libshmap
> 
> libshmap allows you to treat process-shared-memory like regular heap memory*, 
> and also wraps up the pthread locking and condition signalling primitives in a 
> way that they can be placed into the shared memory region without concern.
> 
> So in brief, you could implement your out-there idea on top of libshmap, and it 
> has solutions for most of the questions you list there already:
> 
> * Permissions would be enforced by ownership of the sensor shared library 
> (libraries?) built on top of libshmap.

Alternatively, set permissions on the sem or shm files (currently libshmap just 
sets them to 0660), or given that you're probably going to wrap access up in an 
API anyway, split the producer and consumer APIs and have the consumer API only 
expose const values.

Anyway, interested in your thoughts.

Andrew

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Sensor Value PropertiesChanged Events
  2021-02-02 10:02   ` Ambrozewicz, Adrian
@ 2021-02-04 15:55     ` Patrick Williams
  0 siblings, 0 replies; 7+ messages in thread
From: Patrick Williams @ 2021-02-04 15:55 UTC (permalink / raw)
  To: Ambrozewicz, Adrian; +Cc: Bills, Jason M, Ed Tanous, openbmc

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]

On Tue, Feb 02, 2021 at 11:02:53AM +0100, Ambrozewicz, Adrian wrote:
> W dniu 2/2/2021 o 01:42, Bills, Jason M pisze:
>
> My trust in systemd D-Bus implementation is that signals are implemented 
> in optimal way and broker doesn't broadcast it to services without 
> proper 'match' defined. It should be checked though
> Moving to polling might in fact increase D-Bus utilization by 
> introducing message-response communication between producer and 
> consumer. In certain cases of sensors which tend to update slowly, 
> introducing a getter with faster interval would increase the traffic, if 
> the interval would be faster than the sensor update rate.

You raise a very good point here.  When I review code that contains a
signal match almost every time I see code with very little in the match
and doing a bunch of filtering in C++ code and I have to point this out.
It is like doing a `SELECT *` in SQL.

I wouldn't be surprised if there are lots of cases where we're effectively
broadcasting signals and then dropping them on the receive side rather than
allowing dbus-broker to filter out who gets them.  We might be able to
code something up in sdbusplus to warn when the filtering is likely
insufficient but that'd probably become a runtime check that gets lost
in a journal message.

-- 
Patrick Williams

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-02-04 15:57 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-02  0:42 Sensor Value PropertiesChanged Events Bills, Jason M
2021-02-02  1:26 ` Ed Tanous
2021-02-02 10:02   ` Ambrozewicz, Adrian
2021-02-04 15:55     ` Patrick Williams
2021-02-02 20:39   ` Bills, Jason M
2021-02-03  0:11   ` Andrew Jeffery
2021-02-03  1:28     ` Andrew Jeffery

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.