* Out-of-band SRESET
@ 2017-03-16  8:02 Ananth N Mavinakayanahalli
  2017-03-16 15:37 ` Patrick Williams
  0 siblings, 1 reply; 5+ messages in thread
From: Ananth N Mavinakayanahalli @ 2017-03-16  8:02 UTC (permalink / raw)
  To: openbmc; +Cc: mahesh, hegdevasant, vsainath

Hi,

One requirement from an OpenPOWER service point-of-view is to be able to
trigger an out-of-band SRESET on an unresponsive system. We can then have
the necessary plumbing in the host Linux kernel to either drop the
machine into a debugger or trigger a dump capture, if configured.

On P9, this would translate to a series of SCOM operations for the SBE.
It would be good to have a REST API defined to cater to this specific
purpose.

The API should cater to:
- SRESET a core
- SRESET a chip
- SRESET all cores
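
The three scopes above could be modeled as one validated request type.
This is only a sketch: the names Scope, SresetRequest and the payload
keys are hypothetical and not part of any existing OpenBMC interface.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    """The three reset scopes listed in the proposal."""
    CORE = "core"
    CHIP = "chip"
    ALL_CORES = "all-cores"


@dataclass(frozen=True)
class SresetRequest:
    """Hypothetical request body for an out-of-band SRESET."""
    scope: Scope
    chip: Optional[int] = None
    core: Optional[int] = None

    def __post_init__(self):
        # Reject combinations that do not match the chosen scope.
        if self.scope is Scope.ALL_CORES:
            if self.chip is not None or self.core is not None:
                raise ValueError("all-cores takes no chip or core index")
        elif self.scope is Scope.CHIP:
            if self.chip is None or self.core is not None:
                raise ValueError("chip scope needs a chip index only")
        elif self.chip is None or self.core is None:
            raise ValueError("core scope needs chip and core indices")

    def to_payload(self) -> dict:
        """Build the (assumed) JSON-style payload, omitting unset fields."""
        payload = {"scope": self.scope.value}
        if self.chip is not None:
            payload["chip"] = self.chip
        if self.core is not None:
            payload["core"] = self.core
        return payload
```

Validating at construction time keeps a malformed target from ever
reaching whatever code would drive the SCOM sequence.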

Thoughts?

Regards,
Ananth


* Re: Out-of-band SRESET
  2017-03-16  8:02 Out-of-band SRESET Ananth N Mavinakayanahalli
@ 2017-03-16 15:37 ` Patrick Williams
  2017-03-16 16:06   ` Rick Altherr
                     ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Patrick Williams @ 2017-03-16 15:37 UTC (permalink / raw)
  To: Ananth N Mavinakayanahalli; +Cc: openbmc, mahesh, vsainath


On Thu, Mar 16, 2017 at 01:32:52PM +0530, Ananth N Mavinakayanahalli wrote:
> Hi,
> 
> One requirement from a OpenPOWER service point-of-view is to be able to
> trigger an out-of-band SRESET on a unresponsive system. We can then have
> the necessary plumbing in the host Linux kernel to either drop the
> machine into a debugger or trigger a dump capture, if configured.
> 
> On P9, this would translate to a series of SCOM operations for the SBE
> It would be good to have a REST API defined to cater to this specific
> purpose.
> 
> The API should cater to:
> - SRESET a core
> - SRESET a chip
> - SRESET all cores
> 
> Thoughts?
> 
> Regards,
> Ananth
> 

Ananth,

I understand the desire from your end with respect to debugging the
host.  Is there something we can do to model this better from a REST
perspective to make this less Power-specific?  Do other architectures
also have a "send debug interrupt"?

Do you need to SRESET targeting an SMT thread?  We will need to come up
with some kind of identifier for sending the debug interrupts.

-- 
Patrick Williams



* Re: Out-of-band SRESET
  2017-03-16 15:37 ` Patrick Williams
@ 2017-03-16 16:06   ` Rick Altherr
  2017-03-16 21:54   ` Stewart Smith
  2017-03-17  5:25   ` Ananth N Mavinakayanahalli
  2 siblings, 0 replies; 5+ messages in thread
From: Rick Altherr @ 2017-03-16 16:06 UTC (permalink / raw)
  To: Patrick Williams
  Cc: Ananth N Mavinakayanahalli, mahesh, OpenBMC Maillist, vsainath

I know x86 has debug modes but I'm unfamiliar with them.  I'll ask my
teammates who know more for some details to see if and how the BMC is
involved.



* Re: Out-of-band SRESET
  2017-03-16 15:37 ` Patrick Williams
  2017-03-16 16:06   ` Rick Altherr
@ 2017-03-16 21:54   ` Stewart Smith
  2017-03-17  5:25   ` Ananth N Mavinakayanahalli
  2 siblings, 0 replies; 5+ messages in thread
From: Stewart Smith @ 2017-03-16 21:54 UTC (permalink / raw)
  To: Patrick Williams, Ananth N Mavinakayanahalli; +Cc: mahesh, openbmc, vsainath

Patrick Williams <patrick@stwcx.xyz> writes:
> On Thu, Mar 16, 2017 at 01:32:52PM +0530, Ananth N Mavinakayanahalli wrote:
>> Hi,
>> 
>> One requirement from a OpenPOWER service point-of-view is to be able to
>> trigger an out-of-band SRESET on a unresponsive system. We can then have
>> the necessary plumbing in the host Linux kernel to either drop the
>> machine into a debugger or trigger a dump capture, if configured.
>> 
>> On P9, this would translate to a series of SCOM operations for the SBE
>> It would be good to have a REST API defined to cater to this specific
>> purpose.
>> 
>> The API should cater to:
>> - SRESET a core
>> - SRESET a chip
>> - SRESET all cores
>> 
>> Thoughts?
>> 
>> Regards,
>> Ananth
>> 
>
> Ananth,
>
> I understand the desire from your end with respect to debugging the
> host.  Is there something we can do to model this better from a REST
> perspective to make this less Power-specific?  Do other architectures
> also have a "send debug interrupt"?

On x86 there's the NMI, which can be sent via "ipmitool power diag".

It also exists in Redfish as a type of reset (On, ForceOff,
GracefulRestart, Nmi, etc.).
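
For illustration, a Redfish NMI is requested through the
ComputerSystem.Reset action with ResetType set to "Nmi". The sketch
below only builds the request path and body; the helper name and the
system id "system" are assumptions, and nothing here implies that a
given BMC maps Nmi to SRESET.

```python
import json
from typing import Tuple

# Subset of the ResetType values defined by the Redfish schema.
RESET_TYPES = {
    "On",
    "ForceOff",
    "GracefulShutdown",
    "GracefulRestart",
    "ForceRestart",
    "Nmi",
}


def build_reset_request(system_id: str, reset_type: str) -> Tuple[str, str]:
    """Return (path, JSON body) for a ComputerSystem.Reset POST."""
    if reset_type not in RESET_TYPES:
        raise ValueError(f"unsupported ResetType: {reset_type!r}")
    path = f"/redfish/v1/Systems/{system_id}/Actions/ComputerSystem.Reset"
    body = json.dumps({"ResetType": reset_type})
    return path, body


# Hypothetical system id; a real service lists its ids under
# /redfish/v1/Systems.
path, body = build_reset_request("system", "Nmi")
```

The body would be sent as an authenticated HTTP POST to the returned
path on the BMC.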

-- 
Stewart Smith
OPAL Architect, IBM.


* Re: Out-of-band SRESET
  2017-03-16 15:37 ` Patrick Williams
  2017-03-16 16:06   ` Rick Altherr
  2017-03-16 21:54   ` Stewart Smith
@ 2017-03-17  5:25   ` Ananth N Mavinakayanahalli
  2 siblings, 0 replies; 5+ messages in thread
From: Ananth N Mavinakayanahalli @ 2017-03-17  5:25 UTC (permalink / raw)
  To: Patrick Williams; +Cc: openbmc, mahesh, vsainath

On Thu, Mar 16, 2017 at 10:37:56AM -0500, Patrick Williams wrote:
> On Thu, Mar 16, 2017 at 01:32:52PM +0530, Ananth N Mavinakayanahalli wrote:
> > Hi,
> > 
> > One requirement from a OpenPOWER service point-of-view is to be able to
> > trigger an out-of-band SRESET on a unresponsive system. We can then have
> > the necessary plumbing in the host Linux kernel to either drop the
> > machine into a debugger or trigger a dump capture, if configured.
> > 
> > On P9, this would translate to a series of SCOM operations for the SBE
> > It would be good to have a REST API defined to cater to this specific
> > purpose.
> > 
> > The API should cater to:
> > - SRESET a core
> > - SRESET a chip
> > - SRESET all cores
> > 
> > Thoughts?
> > 
> > Regards,
> > Ananth
> > 
> 
> Ananth,
> 
> I understand the desire from your end with respect to debugging the
> host.  Is there something we can do to model this better from a REST
> perspective to make this less Power-specific?  Do other architectures
> also have a "send debug interrupt"?

Any option that says NMI for x86 can apply here, IMO.

> Do you need to SRESET targeting an SMT thread?  We will need to come up
> with some kind of identifier for sending the debug interrupts.

For starters, we will treat SRESET as unrecoverable, an option of last
resort. SRESET of all cores will be the most used, but I can envisage
cases where we would need specific cores/threads to be forced into xmon
or similar. While it is good for the design to be able to accommodate
it, targeted SMT thread reset isn't a 'must have' to begin with.
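
One way to leave room for thread targeting without requiring it is a
hierarchical identifier whose trailing levels are optional. The sketch
below assumes a made-up syntax ("all", "chip0", "chip0/core3",
"chip0/core3/thread1"); neither the syntax nor the names Target and
parse_target come from an existing interface.

```python
import re
from typing import NamedTuple, Optional


class Target(NamedTuple):
    """Parsed target; None at a level means 'everything below'."""
    chip: Optional[int]
    core: Optional[int]
    thread: Optional[int]


# chipN, optionally followed by /coreN, optionally followed by /threadN.
_PATTERN = re.compile(r"chip(\d+)(?:/core(\d+)(?:/thread(\d+))?)?")


def parse_target(text: str) -> Target:
    """Parse the hypothetical identifier syntax into a Target."""
    if text == "all":
        return Target(None, None, None)
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"bad target identifier: {text!r}")
    chip, core, thread = (
        int(group) if group is not None else None
        for group in match.groups()
    )
    return Target(chip, core, thread)
```

Because the thread level is optional, an initial implementation could
reject any Target with a thread set and still keep the same identifier
format later.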

Ananth

