From: Kun Yi <kunyi@google.com>
To: OpenBMC Maillist <openbmc@lists.ozlabs.org>
Subject: BMC health metrics (again!)
Date: Tue, 9 Apr 2019 09:25:46 -0700
Message-ID: <CAGMNF6VHifnF8qC61HN2bboY8duArOuQ1FvK3mP1gA6Xbazcow@mail.gmail.com>

Hello there,

This topic has come up several times on the mailing list and
offline, but as a community we have not reached a consensus on
what would be most valuable to monitor, or how to monitor it.
While a general-purpose monitoring infrastructure for OpenBMC seems to be
a hard problem, I have some simple ideas that I hope can provide immediate
and direct benefits.

1. Monitoring host IPMI link reliability (host side)

The essentials I want are "IPMI commands sent" and "IPMI commands
succeeded" counts over time. More metrics, such as response time, would
be helpful as well. The issue to address here: when some IPMI sensor
readings are flaky, the IPMI command stats would help determine
whether it is a hardware issue or an IPMI issue. Moreover,
these counts would be a very useful regression test metric when rolling
out new BMC software.
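
To make the counting idea concrete, here is a minimal sketch, assuming
the counters live in whatever host-side code issues the commands. The
names IpmiCommandStats and timed_call are hypothetical, and send_command
stands in for any callable that raises on failure; none of this is an
existing ipmitool or OpenBMC API.

```python
import time
from dataclasses import dataclass


@dataclass
class IpmiCommandStats:
    """Hypothetical counters for commands issued over the host IPMI link."""
    sent: int = 0
    succeeded: int = 0
    total_latency_s: float = 0.0

    def record(self, ok: bool, latency_s: float) -> None:
        # Every attempt counts as sent; only clean returns count as succeeded.
        self.sent += 1
        if ok:
            self.succeeded += 1
        self.total_latency_s += latency_s

    def success_rate(self) -> float:
        # Report 1.0 when nothing has been sent, to avoid dividing by zero.
        return self.succeeded / self.sent if self.sent else 1.0

    def mean_latency_s(self) -> float:
        return self.total_latency_s / self.sent if self.sent else 0.0


def timed_call(stats, send_command, *args):
    """Run send_command(*args) and record its outcome and latency in stats.

    Assumption: send_command signals failure by raising an exception.
    """
    start = time.monotonic()
    try:
        result = send_command(*args)
    except Exception:
        stats.record(False, time.monotonic() - start)
        raise
    stats.record(True, time.monotonic() - start)
    return result
```

Sampling success_rate() periodically would give the "sent" versus
"succeeded" trend over time.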

Looking at the host IPMI side, there are some metrics exposed
through /proc/ipmi/0/si_stats if the ipmi_si driver is used, but I haven't
dug into whether they contain information mapping to the interrupts. Time
to read the source code I guess.

Another idea would be to instrument caller libraries like the interfaces in
ipmitool, though I feel that approach is harder due to fragmentation of
IPMI libraries.

2. Read and expose core BMC performance metrics from procfs

This is straightforward: have a smallish daemon (or bmc-state-manager)
read, parse, and process procfs and put values on D-Bus. Core metrics I'm
interested in getting this way: load average, memory, disk
used/available, net stats... The values can then simply be exported as IPMI
sensors or Redfish resource properties.
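
As a rough sketch of the read-and-parse step only (the D-Bus publishing
is left out), the following assumes the usual /proc/loadavg and
/proc/meminfo layouts. The function names and the property names in the
returned dictionary are hypothetical.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Return {field: integer value} from /proc/meminfo text.

    Values are in kB for the fields where the kernel prints a unit.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            values[name.strip()] = int(fields[0])
    return values


def read_core_metrics(proc_root="/proc"):
    """Read a few core metrics; the keys are hypothetical property names."""
    with open(f"{proc_root}/loadavg") as f:
        load1, load5, load15 = parse_loadavg(f.read())
    with open(f"{proc_root}/meminfo") as f:
        mem = parse_meminfo(f.read())
    return {
        "LoadAverage1Min": load1,
        "LoadAverage5Min": load5,
        "LoadAverage15Min": load15,
        "MemTotalKiB": mem.get("MemTotal"),
        "MemAvailableKiB": mem.get("MemAvailable"),
    }
```

Keeping the parsers as pure functions over text means they can be unit
tested without a real procfs.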

A nice byproduct of this effort would be a procfs parsing library. Since
different platforms would probably have different monitoring requirements
and procfs output format has no standard, I'm thinking the user would just
provide a configuration file containing a list of (procfs path, property
regex, D-Bus property name) entries, and compile-time generated code would
provide an object for each property.
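
The (procfs path, property regex, D-Bus property name) triple could look
roughly like the sketch below. Note that it evaluates the configuration
at run time rather than generating code at compile time, purely to keep
the example short. MetricSpec, SPECS, and the property names are all
hypothetical, and the step that would actually set the D-Bus property is
omitted.

```python
import re
from typing import NamedTuple


class MetricSpec(NamedTuple):
    procfs_path: str    # file to read
    pattern: str        # regex with one capture group for the value
    property_name: str  # hypothetical D-Bus property name


# Example configuration; a real one would come from a per-platform file.
SPECS = [
    MetricSpec("/proc/loadavg", r"^(\S+)", "LoadAverage1Min"),
    MetricSpec("/proc/meminfo", r"^MemFree:\s+(\d+)", "MemFreeKiB"),
]


def extract(spec, text):
    """Return the first captured value for spec in text, or None."""
    match = re.search(spec.pattern, text, re.MULTILINE)
    return match.group(1) if match else None


def read_text(path):
    with open(path) as f:
        return f.read()


def collect(specs, read_file=read_text):
    """Return {property_name: captured string} for every spec that matched."""
    properties = {}
    for spec in specs:
        value = extract(spec, read_file(spec.procfs_path))
        if value is not None:
            properties[spec.property_name] = value
    return properties
```

Passing read_file as a parameter lets the same code run against canned
text in tests and against the real procfs on a BMC.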

All of these are merely thoughts, with nothing concrete yet. With that
said, it would be really great if you could provide some feedback such as
"I want this, but I really need that feature", or let me know it's all
implemented already :)

If this seems valuable, after gathering more feedback on feature
requirements, I'm going to turn these ideas into design docs and upload
them for review.

-- 
Regards,
Kun


Thread overview: 13+ messages
2019-04-09 16:25 Kun Yi [this message]
2019-04-11 12:56 ` BMC health metrics (again!) Sivas Srr
2019-04-20  1:04   ` Kun Yi
2019-04-12 13:02 ` Andrew Geissler
2019-04-20  1:08   ` Kun Yi
2019-05-08  8:11 ` vishwa
2019-05-17  6:30   ` Neeraj Ladkani
2019-05-17  7:17     ` vishwa
2019-05-17  7:23       ` Neeraj Ladkani
2019-05-17  7:27         ` vishwa
2019-05-17 15:50           ` Kun Yi
2019-05-17 18:25             ` vishwa
2019-05-20 21:29               ` Neeraj Ladkani
