From: Andrew Geissler <geissonator@gmail.com>
To: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com>
Cc: Jayanth Othayoth <ojayanth@gmail.com>,
	OpenBMC Maillist <openbmc@lists.ozlabs.org>
Subject: Re: Add support to debug unresponsive host
Date: Thu, 16 May 2019 08:01:00 -0500
Message-ID: <CALLMt=q+PHM09zDeM5hBGRm7sTmPF42QSo6fYB=CmL5DHP_rKg@mail.gmail.com>
In-Reply-To: <c8826cab-42d1-85d7-4eb0-50e79857a205@linux.vnet.ibm.com>

On Thu, May 16, 2019 at 1:36 AM Deepak Kodihalli
<dkodihal@linux.vnet.ibm.com> wrote:
>
> On 15/05/19 6:09 PM, Jayanth Othayoth wrote:
> > ## Problem Description
> > Issue #457:  Add support to debug unresponsive host.
> >
> > Scope: High-level design direction to solve this problem.
> >
> > ## Background and References
> > There are situations at customer sites where OPAL/Linux becomes
> > unresponsive, causing a system hang, and there is no way to figure out
> > what went wrong with the Linux kernel or OPAL. We are looking for a way
> > to trigger a dump capture on the Linux host so that we can capture the
> > OS dump for post-mortem analysis.
> >
> > ## Proposed Design for POWER processor based systems:
> > Get all host CPUs into the reset vector; Linux then has a mechanism to
> > patch this into the panic-kdump path to trigger dump capture. This will
> > enable us to analyze and fix customer issues where Linux hangs and the
> > system is unresponsive.
> >
> > ### Redfish Schema used:
> > * Reference: DSP2046 2018.3
> > * The ComputerSystem 1.6.0 schema provides an action called
> > “#ComputerSystem.Reset”. This action is used to reset the system.
> > The ResetType parameter indicates the type of reset to be
> > performed. In this use case we can use the “Nmi” type:
> >      * Nmi: Generate a Diagnostic Interrupt (usually an NMI on x86
> > systems) to cease normal operations, perform diagnostic actions and
> > typically halt the system.
> >
> > ### D-Bus:
> >
> > Option 1: Extend the existing D-Bus interface in the State.Host
> > namespace (
> > /openbmc/phosphor-dbus-interfaces/xyz/openbmc_project/State/Host.interface.yaml
> > ) to support a new RequestedHostTransition value called “Nmi”. The D-Bus
> > backend can internally invoke a processor-specific target to do an Sreset
> > (equivalent to an x86 NMI) and associated actions.
>
> I don't prefer this option, because this would mean adding host-specific
> code in phosphor-state-manager, which I think until now is host-agnostic.

Yeah, this was my main concern with tying it into phosphor-state-manager.
The fact that Redfish put it in with their other state-related commands
(which are implemented by phosphor-state-manager) is the only reason I'm a
little wishy-washy here. We could just create a generic systemd target
"host-nmi" or something, and phosphor-state-manager could call that to
abstract any of the specifics, but it still doesn't really feel like it
fits to me.

I think I prefer option 2, and then we can just map bmcweb to that API when
the Redfish command comes in. Sounds like for ppc64 systems we can just
use pdbg to issue the NMI.
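
To make that mapping concrete, here is a minimal sketch (not actual bmcweb
code) of how a Redfish reset request with ResetType "Nmi" could be routed to
the Option 2 interface, while every other reset type stays with
phosphor-state-manager. Assumptions: the URI assumes a single system named
"system"; the interface name is just the proposed YAML path written in dotted
form; the object path and method name are hypothetical placeholders.

```python
import json

# Assumption: a single host exposed under the Redfish id "system".
RESET_ACTION_URI = "/redfish/v1/Systems/system/Actions/ComputerSystem.Reset"

# Dotted form of the YAML path proposed in Option 2 (not a merged API).
NMI_INTERFACE = "xyz.openbmc_project.Control.Host.NMI"
# Hypothetical object path and method name, for illustration only.
NMI_OBJECT_PATH = "/xyz/openbmc_project/control/host0/nmi"
NMI_METHOD = "NMI"


def build_reset_request(reset_type):
    """Return (uri, json_body) for a Redfish ComputerSystem.Reset action."""
    return RESET_ACTION_URI, json.dumps({"ResetType": reset_type})


def route_reset(body):
    """Return the (object path, interface, method) to call for an Nmi reset.

    Returns None for any other ResetType, meaning the request should keep
    going to phosphor-state-manager as it does today.
    """
    reset_type = json.loads(body).get("ResetType")
    if reset_type == "Nmi":
        return (NMI_OBJECT_PATH, NMI_INTERFACE, NMI_METHOD)
    return None
```

The point of the split is that the web layer only needs to know one extra
D-Bus call; whatever sits behind it (pdbg on ppc64, something else on other
hosts) stays out of phosphor-state-manager.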

> So for that reason, Option 2 sounds better. There are some good
> questions from Neeraj as well, so I would suggest adding this as a
> design template on Gerrit to gather better feedback.
>
> Thanks,
> Deepak
>
> > Option 2: Introduce a new D-Bus interface in the Control.Host namespace
> > (
> > /openbmc/phosphor-dbus-interfaces/xyz/openbmc_project/Control/Host/NMI.interface.yaml)
> > and implement the new D-Bus back-end for the respective
> > processor-specific targets.
>
