openbmc.lists.ozlabs.org archive mirror
* BMC Performance Profiler
@ 2020-10-05 17:57 Pasha Ghabussi
  2020-10-08 21:44 ` Pasha Ghabussi
  2020-10-26  2:02 ` Andrew Jeffery
  0 siblings, 2 replies; 6+ messages in thread
From: Pasha Ghabussi @ 2020-10-05 17:57 UTC (permalink / raw)
  To: openbmc; +Cc: Ed Tanous, Sui Chen, Ofer Yehielli

Hello all,

We would really appreciate it if you can take a few minutes to read the
following proposal and let us know your thoughts and suggestions.

We are developing a tool to investigate performance problems by looking at
DBus traffic dumps. Current DBus inspection and visualization tools do not
present DBus events the way a typical performance profiler does.
Additionally, these tools do not address typical BMC workloads such as IPMI
and ASIO. Hence, identifying potential performance problems requires
inspecting the raw BMC DBus traffic, which can become a long and complex
process. To address this, we want to add a graphical interface to webui-vue
that visualizes the DBus traffic.

There have been DBus and IPMI performance-related discussions in the
OpenBMC community, both of which can be helped by this work. IPMI-related
issues started to appear as early as 2017. One issue (#2630)
<https://github.com/openbmc/openbmc/issues/2630> describes a problem
related to large numbers of sensors. Its follow-up (#3098)
<https://github.com/openbmc/openbmc/issues/3098> mentions “hostboot crashes
due to poor IPMI performance”. Another issue (#2519)
<https://github.com/openbmc/openbmc/issues/2519> describes a commonly seen
problem of IPMI taking very long to respond (> 5s).
There are also discussions of Redfish performance
<https://lists.ozlabs.org/pipermail/openbmc/2018-February/010735.html> on
the mailing list, and a patch
<https://lists.ozlabs.org/pipermail/openbmc/2016-June/003380.html>
optimized DBus performance by introducing a cache for name translation.

All the performance investigations listed above involve DBus and may be
helped by this work.

We are planning to use the BMCweb file hosting functionality to access the
DBus event dumps and visualize the events in the web UI. The available
profiling tools such as dbus-pcap
<https://github.com/openbmc/openbmc-tools/tree/master/amboar/obmc-scripts/dbus-pcap>,
Wireshark <https://www.wireshark.org/>, Bustle
<https://gitlab.freedesktop.org/bustle/bustle>, Snyh
<https://github.com/snyh/dbus-profiler>, or DFeet
<https://wiki.gnome.org/action/show/Apps/DFeet?action=show&redirect=DFeet>
do not provide the exact functionality we are looking for. Our goal is to
develop functionality similar to that of widely used profilers such as
GPUView or VTune Profiler.

One alternative solution considered was to stream DBus requests over a
websocket, but the existing websocket endpoints available on the BMC web
server do not provide the exact information we need.

Requirements and Scalability:

   - Should provide adequate functionality to filter the DBus traffic,
     visualize the event timeline, and group the traffic by multiple
     criteria such as type, source, destination, path, interface, daemon
     signatures, and more.
   - Should support capturing DBus messages using as few resources as
     possible.
   - Should be able to show many (~thousands of) entries on screen
     simultaneously.
   - Should integrate with webui-vue.
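
To make the filtering and grouping requirement a bit more concrete, here
is a minimal sketch in Python. It is only an illustration under stated
assumptions: the BusEvent record and the group_events helper are
hypothetical names, not part of dbus-pcap or any existing tool, and the
fields simply mirror the criteria listed above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class BusEvent:
    """Hypothetical, already-parsed DBus message (not a real library type)."""
    timestamp: float   # seconds since the start of the capture
    msg_type: str      # "method_call", "method_return", "signal" or "error"
    sender: str
    destination: str
    path: str
    interface: str
    member: str


def group_events(
    events: Iterable[BusEvent],
    keys: Sequence[str],
    keep: Callable[[BusEvent], bool] = lambda ev: True,
) -> Dict[Tuple[str, ...], List[BusEvent]]:
    """Filter events with `keep`, then group them by the named fields."""
    groups: Dict[Tuple[str, ...], List[BusEvent]] = defaultdict(list)
    for ev in events:
        if keep(ev):
            # The group key is the tuple of the selected field values.
            groups[tuple(getattr(ev, k) for k in keys)].append(ev)
    return dict(groups)
```

For example, group_events(events, ("sender", "interface")) would give one
timeline row per sender/interface pair, and the keep argument can narrow
the input to method calls only.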


Thank you


* Re: BMC Performance Profiler
  2020-10-05 17:57 BMC Performance Profiler Pasha Ghabussi
@ 2020-10-08 21:44 ` Pasha Ghabussi
  2020-10-09 14:37   ` Andrew Geissler
  2020-10-26  2:02 ` Andrew Jeffery
  1 sibling, 1 reply; 6+ messages in thread
From: Pasha Ghabussi @ 2020-10-08 21:44 UTC (permalink / raw)
  To: openbmc

Hello all,

We would really appreciate it if you can take a few minutes to read the
proposal sent earlier and let us know your thoughts and suggestions.

Thank you



* Re: BMC Performance Profiler
  2020-10-08 21:44 ` Pasha Ghabussi
@ 2020-10-09 14:37   ` Andrew Geissler
  2020-10-09 15:45     ` Pasha Ghabussi
  2020-10-09 18:53     ` Sui Chen
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Geissler @ 2020-10-09 14:37 UTC (permalink / raw)
  To: Pasha Ghabussi; +Cc: openbmc



> On Oct 8, 2020, at 4:44 PM, Pasha Ghabussi <pashag@google.com> wrote:
> 
> Hello all,
> 
> We would really appreciate it if you can take a few minutes to read the proposal sent earlier and let us know your thoughts and suggestions.
> 
> Thank you
> 
> On Mon, Oct 5, 2020 at 1:57 PM Pasha Ghabussi <pashag@google.com> wrote:
> Hello all,
> We would really appreciate it if you can take a few minutes to read the following proposal and let us know your thoughts and suggestions.
> We are developing a tool to investigate performance problems by looking at DBus traffic dumps.

I definitely think this could be a very useful tool. Performance issues have hindered us from day 1 with OpenBMC and countless hours have gone into trying to identify the different issues. One area we’ve seen a lot of issues with is on BMC startup, especially after a firmware update. If you could provide a way to enable the needed profiling debug, and then reboot the BMC and capture the data for analysis, it would be appreciated.

> Current DBus inspection and visualization tools do not represent the DBus events similar to a typical performance profiler. Additionally, these tools do not address typical BMC workloads such as IPMI and ASIO. Hence, identifying potential performance problems requires inspecting the raw BMC DBus traffic, which can become a long and complex process. We want to add a graphical interface to webui-vue to visualize the DBus traffic to address the abovementioned problem.

Will you be using something like "busctl capture" to capture the data? I hope you don’t have to write a new tool to get the data.


> 
> There have been DBus and IPMI performance-related discussions in the OpenBMC community, both of which can be helped by this work: IPMI-related issues started to appear as early as in 2017. One issue (#2630) describes a problem related to large numbers of sensors. Its follow-up (#3098) mentions “hostboot crashes due to poor IPMI performance”. Another issue (#2519) describes a commonly-seen problem of IPMI taking very long to respond (> 5s).
> There are also discussions on RedFish performance on the mailing list; A patch optimized DBus performance by introducing a cache for name translation.
> All the performance investigations listed above involve DBus and may be helped by this work.

Agreed

> 
> We are planning to use the BMCweb file hosting functionality to access the DBus event dumps and visualize the events in the web UI. The available profiling tools such as dbus-pcap, Wireshark, Bustle, Snyh, or DFeet do not provide the exact functionality we are looking for. Our goal is to develop functionalities similar to other widely used profilers such as GPUView or VTune Profiler.
> 

For the analysis and visualization side, I’m never a big fan of writing something from scratch. Have you looked into enhancing some of the existing tools out there vs. writing your own?

Although having it in the web UI could be useful, I don’t really see it as a requirement. Could your tool be simpler to write or be made more generic for others to use if it was not tied to the web UI?

> One alternative solution considered was to stream DBus requests over websocket, but the existing websocket endpoints available on BMC webserver do not provide the exact information we need.
> 
> Requirements and Scalability:
> 	• Should provide the adequate functionalities to filter, visualize the events timeline, and group the DBus traffic based on multiple criteria such as type, source, destination, path, interface, demon signatures, and more.
> 	• Should support capture of DBus messages using as little resources as possible.
> 	• Should be able to show many (~thousands of) entries on screen simultaneously
> 	• Integration with webui-vue
> 
> Thank you



* Re: BMC Performance Profiler
  2020-10-09 14:37   ` Andrew Geissler
@ 2020-10-09 15:45     ` Pasha Ghabussi
  2020-10-09 18:53     ` Sui Chen
  1 sibling, 0 replies; 6+ messages in thread
From: Pasha Ghabussi @ 2020-10-09 15:45 UTC (permalink / raw)
  To: Andrew Geissler; +Cc: openbmc

Thanks Andrew for the feedback.

We're planning to capture the data using something like "dbus-monitor" or
"busctl capture", and we certainly don't want to write a new tool.

For the analysis and visualization part, we looked into multiple options
including the ones I mentioned in the previous email and none of them
provide the exact functionality that we need. There are a few DBus
profilers, but they do not allow grouping the events. Hence, they are not
capable of visualizing many potential performance problems.

Regarding the implementation, we have a working prototype that can
visualize DBus pcap dumps. However, we were thinking that including this in
the UI would be more useful to the community.



* Re: BMC Performance Profiler
  2020-10-09 14:37   ` Andrew Geissler
  2020-10-09 15:45     ` Pasha Ghabussi
@ 2020-10-09 18:53     ` Sui Chen
  1 sibling, 0 replies; 6+ messages in thread
From: Sui Chen @ 2020-10-09 18:53 UTC (permalink / raw)
  To: Andrew Geissler; +Cc: OpenBMC Maillist, Pasha Ghabussi

On Fri, Oct 9, 2020 at 7:38 AM Andrew Geissler <geissonator@gmail.com> wrote:
>
>
>
> > On Oct 8, 2020, at 4:44 PM, Pasha Ghabussi <pashag@google.com> wrote:
> >
> > Hello all,
> >
> > We would really appreciate it if you can take a few minutes to read the proposal sent earlier and let us know your thoughts and suggestions.
> >
> > Thank you
> >
> > On Mon, Oct 5, 2020 at 1:57 PM Pasha Ghabussi <pashag@google.com> wrote:
> > Hello all,
> > We would really appreciate it if you can take a few minutes to read the following proposal and let us know your thoughts and suggestions.
> > We are developing a tool to investigate performance problems by looking at DBus traffic dumps.
>
> I definitely think this could be a very useful tool. Performance issues have hindered us from day 1 with OpenBMC and countless hours have gone into trying to identify the different issues. One area we’ve seen a lot of issues with is on BMC startup, especially after a firmware update. If you could provide a way to enable the needed profiling debug, and then reboot the BMC and capture the data for analysis, it would be appreciated.
>
> > Current DBus inspection and visualization tools do not represent the DBus events similar to a typical performance profiler. Additionally, these tools do not address typical BMC workloads such as IPMI and ASIO. Hence, identifying potential performance problems requires inspecting the raw BMC DBus traffic, which can become a long and complex process. We want to add a graphical interface to webui-vue to visualize the DBus traffic to address the abovementioned problem.
>
> Will you be using something like "busctl capture” to capture the data? I hope you don’t have to write a new tool to get the data?
>

We will be using "busctl capture" to capture the data rather than
writing a new tool, as Pasha mentioned.
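
As an illustration, a time-limited capture could be scripted roughly as
below. This is a sketch, not what the prototype does: the function names
and the output path are made up for the example, and it assumes that
"busctl capture" and "dbus-monitor --pcap" write the capture to standard
output, as their man pages describe.

```python
import subprocess
from typing import List


def build_capture_command(tool: str = "busctl") -> List[str]:
    """Return the argv for an existing capture tool (no new tool needed)."""
    if tool == "busctl":
        # busctl talks to the system bus by default.
        return ["busctl", "capture"]
    if tool == "dbus-monitor":
        # dbus-monitor defaults to the session bus, so select the system bus.
        return ["dbus-monitor", "--system", "--pcap"]
    raise ValueError(f"unsupported capture tool: {tool}")


def capture(path: str, seconds: float, tool: str = "busctl") -> None:
    """Capture bus traffic into `path` for roughly `seconds` seconds."""
    with open(path, "wb") as out:
        proc = subprocess.Popen(build_capture_command(tool), stdout=out)
        try:
            proc.wait(timeout=seconds)
        except subprocess.TimeoutExpired:
            # The capture tools run until stopped, so end them explicitly.
            proc.terminate()
            proc.wait()
```

Something like capture("/tmp/dbus.pcap", 30.0) on the BMC would then
produce a dump that dbus-pcap can read.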

>
> >
> > There have been DBus and IPMI performance-related discussions in the OpenBMC community, both of which can be helped by this work: IPMI-related issues started to appear as early as in 2017. One issue (#2630) describes a problem related to large numbers of sensors. Its follow-up (#3098) mentions “hostboot crashes due to poor IPMI performance”. Another issue (#2519) describes a commonly-seen problem of IPMI taking very long to respond (> 5s).
> > There are also discussions on RedFish performance on the mailing list; A patch optimized DBus performance by introducing a cache for name translation.
> > All the performance investigations listed above involve DBus and may be helped by this work.
>
> Agreed
>
> >
> > We are planning to use the BMCweb file hosting functionality to access the DBus event dumps and visualize the events in the web UI. The available profiling tools such as dbus-pcap, Wireshark, Bustle, Snyh, or DFeet do not provide the exact functionality we are looking for. Our goal is to develop functionalities similar to other widely used profilers such as GPUView or VTune Profiler.
> >
>
> For the analysis and visualization side, I’m never a big fan of writing something from scratch. Have you looked into enhancing some of the existing tools out there vs. writing your own?
>

One existing tool on the visualization side that resembles what we are
looking for is the ChromeDevTools performance profiler UI
(https://github.com/ChromeDevTools/devtools-frontend/tree/master/front_end/perf_ui),
in that it is capable of showing thousands of events in an interactive
way (allowing the user to pan/scale the timeline and inspect
individual events). If we plugged that UI into existing DBus-related
tools, we would basically get something similar to the prototype (
https://gerrit.openbmc-project.xyz/c/openbmc/openbmc-tools/+/34263 )
but a lot more polished.

The Perf UI mentioned above seems to have many dependencies and is
tightly integrated into Chrome, so we feel it might take less
effort to write a basic implementation from scratch (covering basic
functionalities such as timelines and histograms) than to integrate it
into existing tools. Actually, both the Perf UI and the rendering
routine in the prototype use the HTML canvas element for
visualization, which is typically hardware-accelerated and can render
many thousands of objects (basic shapes, images, text, etc) at
interactive frame rates. The visualization runs in the user's browser
and does not consume the BMC's processing power.
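
To give a rough idea of the histogram side, the sketch below counts
events per fixed-width time bucket, which is the kind of data a
histogram strip above the timeline would draw. It is written in Python
for brevity and is not taken from the prototype; bucket_counts is a
hypothetical helper name.

```python
from typing import List, Sequence


def bucket_counts(timestamps: Sequence[float], bucket_width: float) -> List[int]:
    """Count events per time bucket, starting at the earliest timestamp."""
    if bucket_width <= 0:
        raise ValueError("bucket_width must be positive")
    if not timestamps:
        return []
    start = min(timestamps)
    # Number of buckets needed to cover the whole capture.
    n_buckets = int((max(timestamps) - start) // bucket_width) + 1
    counts = [0] * n_buckets
    for t in timestamps:
        counts[int((t - start) // bucket_width)] += 1
    return counts
```

Aggregating like this before drawing also keeps the number of shapes on
screen bounded by the number of buckets rather than by the number of
messages in the dump.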

> Although having in the web UI could be useful, I don’t really see it as a requirement. Could your tool be simpler to write or be made more generic for others to use if it was not tied to the web UI?
>

WebUI was considered for the following reasons: 1) web technologies,
in particular the hardware-accelerated HTML canvas, are convenient and
performant enough for the visualization we are looking for, and it
makes reusing the code in the prototype (which was also HTML based)
very easy; 2) accessing the BMC through WebUI saves the user the
trouble of having to manually start DBus capture and transfer the dump
file back to the host for doing visualization; 3) it might make it
easier to integrate this tool with future technologies such as
Redfish.

To untie the visualizer from the WebUI, there could be a few alternatives:
1) visualize the data using a text-based UI. In that case, the tool
would function similarly to tools like "top".
2) generate the visualization in SVG or HTML format similarly to FlameGraph.
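
A minimal sketch of alternative 2), assuming the events have already
been reduced to (label, start, end) spans in seconds. The function name
and the layout (one row per span) are choices made for this example
only; FlameGraph itself stacks frames differently.

```python
from html import escape
from typing import List, Tuple

Span = Tuple[str, float, float]  # (label, start time, end time) in seconds


def render_svg_timeline(spans: List[Span], width: int = 800,
                        row_height: int = 18) -> str:
    """Render one rectangle per span into a standalone SVG string."""
    header = '<svg xmlns="http://www.w3.org/2000/svg" width="{}" height="{}">'
    if not spans:
        return header.format(width, 0) + "</svg>"
    t0 = min(start for _, start, _ in spans)
    t1 = max(end for _, _, end in spans)
    # Pixels per second; fall back to 1.0 when all spans are instantaneous.
    scale = width / (t1 - t0) if t1 > t0 else 1.0
    parts = [header.format(width, row_height * len(spans))]
    for row, (label, start, end) in enumerate(spans):
        x = (start - t0) * scale
        w = max((end - start) * scale, 1.0)  # keep very short spans visible
        parts.append(
            f'<rect x="{x:.1f}" y="{row * row_height}" width="{w:.1f}" '
            f'height="{row_height - 2}" fill="steelblue">'
            f'<title>{escape(label)}</title></rect>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

The resulting file can be opened in any browser, and the title elements
show up as hover tooltips, but there is no panning or zooming without
extra scripting.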

In any case, the visualization part and the integrated performance
profiling experience would be our main contribution, and they are the
extra step we are taking on top of existing text-based tools like
dbus-pcap (which already parses DBus dumps and which the prototype
depends on).

> > One alternative solution considered was to stream DBus requests over websocket, but the existing websocket endpoints available on BMC webserver do not provide the exact information we need.
> >
> > Requirements and Scalability:
> >       • Should provide the adequate functionalities to filter, visualize the events timeline, and group the DBus traffic based on multiple criteria such as type, source, destination, path, interface, demon signatures, and more.
> >       • Should support capture of DBus messages using as little resources as possible.
> >       • Should be able to show many (~thousands of) entries on screen simultaneously
> >       • Integration with webui-vue
> >
> > Thank you
>


* Re: BMC Performance Profiler
  2020-10-05 17:57 BMC Performance Profiler Pasha Ghabussi
  2020-10-08 21:44 ` Pasha Ghabussi
@ 2020-10-26  2:02 ` Andrew Jeffery
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Jeffery @ 2020-10-26  2:02 UTC (permalink / raw)
  To: Pasha Ghabussi, openbmc; +Cc: Ed Tanous, Sui Chen, Ofer Yehielli


> Requirements and Scalability:
> 
>  * Should provide the adequate functionalities to filter, visualize the 
> events timeline, and group the DBus traffic based on multiple criteria 
> such as type, source, destination, path, interface, demon signatures, 
> and more.

Probably the most common thing I've used dbus-pcap for is finding ugly latencies in long IPC call chains (i.e. more than one hop). This, among other insane ideas (boot process simulation via IPMI message replay), was the motivation for writing it.

Probably the most useful thing I've implemented in dbus-pcap (aside from the general filtering capabilities) was method call tracking. A harder problem is identifying complete call trees in the message timeline (filtering out unrelated messages). Generally this requires a bunch of manual work with getting the filters right in the dbus-pcap invocation, as it requires knowledge of the implementation of each daemon. Did you have any ideas for making this easier? My brief thought is to recursively identify the service targeted by a method call and to track calls from this service until it sends the reply message to the caller, though this gets messier with ASIO-based daemons.
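
A rough sketch of that idea, in Python, under stated assumptions. The
Msg and Call records and the three functions are hypothetical and are
not dbus-pcap's actual code. Replies are matched to calls through the
reply serial, and a call is attributed to a parent purely by time
overlap, which is exactly the heuristic that over-attributes when an
ASIO-based daemon serves several requests concurrently.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Msg:
    """Hypothetical parsed DBus message; sender is a unique bus name."""
    timestamp: float
    msg_type: str                      # "method_call", "method_return", "error"
    sender: str
    destination: str
    serial: int
    reply_serial: Optional[int] = None
    member: str = ""


@dataclass(frozen=True)
class Call:
    call: Msg
    reply: Msg

    @property
    def latency(self) -> float:
        return self.reply.timestamp - self.call.timestamp


def match_calls(messages: Iterable[Msg]) -> List[Call]:
    """Pair each method call with its reply, in order of completion."""
    pending: Dict[Tuple[str, int], Msg] = {}
    done: List[Call] = []
    for m in messages:
        if m.msg_type == "method_call":
            pending[(m.sender, m.serial)] = m
        elif m.msg_type in ("method_return", "error") and m.reply_serial is not None:
            # A reply is addressed to the caller and carries the call's serial.
            call = pending.pop((m.destination, m.reply_serial), None)
            if call is not None:
                done.append(Call(call, m))
    return done


def nested_calls(root: Call, calls: List[Call]) -> List[Call]:
    """Calls the answering service issued while `root` was outstanding."""
    callee = root.reply.sender  # unique name of the service that replied
    return [c for c in calls
            if c.call.sender == callee
            and root.call.timestamp < c.call.timestamp <= root.reply.timestamp]


def call_tree(root: Call, calls: List[Call]) -> tuple:
    """Recursively build (call, [subtrees]) for one top-level call."""
    return (root, [call_tree(c, calls) for c in nested_calls(root, calls)])
```

Sorting the result of match_calls by latency gives the slow hops
directly, and call_tree on a slow top-level call shows which downstream
calls it was probably waiting on.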

Andrew
