From: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com>
To: Jeremy Kerr <jk@ozlabs.org>, openbmc <openbmc@lists.ozlabs.org>
Cc: Emily Shaffer <emilyshaffer@google.com>,
	David Thompson <dthompson@mellanox.com>,
	Dong Wei <Dong.Wei@arm.com>,
	Supreeth Venkatesh <Supreeth.Venkatesh@arm.com>,
	"Naidoo, Nilan" <nilan.naidoo@intel.com>
Subject: Re: Initial MCTP design proposal
Date: Fri, 7 Dec 2018 10:43:48 +0530
Message-ID: <94639b69-3f3c-a606-ae68-f7e1461097e9@linux.vnet.ibm.com>
In-Reply-To: <a75abc9a6c5c2ae50b39576a60a2bb07c65567d4.camel@ozlabs.org>

On 07/12/18 8:11 AM, Jeremy Kerr wrote:
> Hi OpenBMCers!
> 
> In an earlier thread, I promised to sketch out a design for a MCTP
> implementation in OpenBMC, and I've included it below.


Thanks, Jeremy, for sending this out. This looks good (I have just one
comment below).

Question for everyone : do you have plans to employ PLDM over MCTP?

We are interested in PLDM for various "inside the box" communications
(at the moment for host <-> BMC communication). I'd like to propose a
design for a PLDM stack on OpenBMC, and will send a design document for
review on the mailing list soon (I've just started on some initial
sketches). I'd also like to know if others have embarked on a similar
activity, so that we can collaborate early and avoid duplicating work.

> This is roughly in the OpenBMC design document format (thanks for the
> reminder Andrew), but I've sent it to the list for initial review before
> proposing to gerrit - mainly because there were a lot of folks who
> expressed interest on the list. I suggest we move to gerrit once we get
> specific feedback coming in. Let me know if you have general comments
> whenever you like though.
> 
> In parallel, I've been developing a prototype for the MCTP library
> mentioned below, including a serial transport binding. I'll push to
> github soon and post a link, once I have it in a
> slightly-more-consumable form.
> 
> Cheers,
> 
> 
> Jeremy
> 
> --------------------------------------------------------
> 
> # Host/BMC communication channel: MCTP & PLDM
> 
> Author: Jeremy Kerr <jk@ozlabs.org> <jk>
> 
> ## Problem Description
> 
> Currently, we have a few different methods of communication between host
> and BMC. This is primarily IPMI-based, but also includes a few
> hardware-specific side-channels, like hiomap. On OpenPOWER hardware at
> least, we've definitely started to hit some of the limitations of IPMI
> (for example, we have need for >255 sensors), as well as limitations of
> the hardware channels that IPMI typically uses.
> 
> This design aims to use the Management Component Transport Protocol
> (MCTP) to provide a common transport layer over the multiple channels
> that OpenBMC platforms provide. Then, on top of MCTP, we have the
> opportunity to move to newer host/BMC messaging protocols to overcome
> some of the limitations we've encountered with IPMI.
> 
> ## Background and References
> 
> Separating the "transport" and "messaging protocol" parts of the current
> stack allows us to design these parts separately. Currently, IPMI
> defines both of these; we currently have BT and KCS (both defined as
> part of the IPMI 2.0 standard) as the transports, and IPMI itself as the
> messaging protocol.
> 
> Some efforts to improve the hardware transport mechanism of IPMI have
> been attempted, but not in a cross-implementation manner so far. These
> efforts also do not address the limitations of the IPMI data model itself.
> 
> MCTP defines a standard transport protocol, plus a number of separate
> hardware bindings for the actual transport of MCTP packets. These are
> defined by the DMTF's Platform Management Working group; standards are
> available at:
> 
>    https://www.dmtf.org/standards/pmci
> 
> I have included a small diagram of how these standards may fit together
> in an OpenBMC system. The DSP numbers there are references to DMTF
> standards.
> 
> One of the key concepts here is the separation of the transport protocol
> from the hardware bindings; this means that an MCTP "stack" may be using
> either an I2C, PCI, serial or custom hardware channel, without the higher
> layers of that stack needing to be aware of the hardware implementation.
> These higher levels only need to be aware that they are communicating
> with a certain endpoint, identified by an Endpoint ID (MCTP EID).
> 
> I've mainly focussed on the "transport" part of the design here. While
> this does enable new messaging protocols (mainly PLDM), I haven't
> covered that much; we will propose those details for a separate design
> effort.
> 
> As part of the design, I have referred to MCTP "messages" and "packets";
> this is intentional, to match the definitions in the MCTP standard. MCTP
> messages are the higher-level data transferred between MCTP endpoints,
> while packets are typically smaller, and are what is sent over the
> hardware. Messages that are larger than the hardware MTU are split into
> individual packets by the transmit implementation, and reassembled at
> the receive implementation.
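
For anyone who wants a concrete picture of the packet level: the MCTP
transport header defined in DSP0236 is only four bytes, and the SOM/EOM
bits plus the 2-bit sequence number are what the fragmentation and
reassembly described above hang off. A rough sketch (the field names
below are mine, not from any existing code):

    /* Sketch of the 4-byte MCTP transport header per DSP0236.
     * For illustration only; field names are made up here. */
    #include <stdint.h>

    struct mctp_hdr {
        uint8_t ver;            /* [7:4] reserved, [3:0] header version */
        uint8_t dest_eid;       /* destination endpoint ID */
        uint8_t src_eid;        /* source endpoint ID */
        uint8_t flags_seq_tag;  /* [7] SOM, [6] EOM, [5:4] pkt sequence,
                                 * [3] tag owner, [2:0] message tag */
    };

    #define MCTP_HDR_SOM (1 << 7)  /* set on the first packet of a message */
    #define MCTP_HDR_EOM (1 << 6)  /* set on the last packet of a message */

The transmit side sets SOM on the first packet and EOM on the last, and
the receive side uses the sequence number and message tag to reassemble.
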
> 
> A final important point is that this design is for the host <--> BMC
> channel *only*. Even if we do replace IPMI for the host interface, we
> will certainly need an IPMI interface available for external system
> management.
> 
> ## Requirements
> 
> Any channel between host and BMC should:
> 
>   - Have a simple serialisation and deserialisation format, to enable
>     implementations in host firmware environments, which have widely
>     varying runtime capabilities
> 
>   - Allow different hardware channels, as we have a wide variety of
>     target platforms for OpenBMC
> 
>   - Be usable over simple hardware implementations, but have a facility
>     for higher bandwidth messaging on platforms that require it.
> 
>   - Ideally, integrate with newer messaging protocols
> 
> ## Proposed Design
> 
> The MCTP core specification just provides the packetisation, routing and
> addressing mechanisms. The actual transmit/receive of those packets is
> up to the hardware binding of the MCTP transport.
> 
> For OpenBMC, we would introduce an MCTP daemon, which implements the
> transport over a configurable hardware channel (e.g., serial UART, I2C or
> PCI). This daemon is responsible for the packetisation and routing of
> MCTP messages to and from host firmware.
> 
> I see two options for the "inbound" or "application" interface of the
> MCTP daemon:
> 
>   - it could handle upper parts of the stack (eg PLDM) directly, through
>     in-process handlers that register for certain MCTP message types; or

We'd like to somehow ensure (at least via documentation) that the
handlers don't block the MCTP daemon from processing incoming traffic.
The handlers might end up making IPC calls (via D-Bus) to other
processes anyway. The second approach below seems to alleviate this
problem.

>   - it could channel raw MCTP messages (reassembled from MCTP packets) to
>     DBUS messages (similar to the current IPMI host daemons), where the
>     upper layers receive and act on those DBUS events.
> 
> I have a preference for the former, but I would be interested to hear
> from the IPMI folks about how the latter structure has worked in the
> past.
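
To make the second option a bit more concrete: with sd-bus, forwarding a
reassembled message could look roughly like the sketch below. The object
path, interface and signal names are placeholders I've made up for
illustration, not a proposed API.

    /* Sketch only: forward a reassembled MCTP message as a D-Bus signal,
     * similar in spirit to how the IPMI host daemons hand off messages.
     * Path/interface/signal names are placeholders. */
    #include <stddef.h>
    #include <stdint.h>
    #include <systemd/sd-bus.h>

    static int emit_mctp_message(sd_bus *bus, uint8_t src_eid,
                                 uint8_t msg_type,
                                 const uint8_t *msg, size_t len)
    {
        sd_bus_message *m = NULL;
        int rc;

        rc = sd_bus_message_new_signal(bus, &m,
                                       "/xyz/openbmc_project/mctp",
                                       "xyz.openbmc_project.MCTP.Rx",
                                       "MessageReceived");
        if (rc < 0)
            return rc;

        /* source EID and message type, then the raw message payload */
        rc = sd_bus_message_append(m, "yy", src_eid, msg_type);
        if (rc >= 0)
            rc = sd_bus_message_append_array(m, 'y', msg, len);
        if (rc >= 0)
            rc = sd_bus_send(bus, m, NULL);

        sd_bus_message_unref(m);
        return rc;
    }

The in-process handler option would avoid that extra hop, which is why
the blocking concern I raised above matters more there.
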
> 
> The proposed implementation here is to produce an MCTP "library" which
> provides the packetisation and routing functions, between:
> 
>   - an "upper" messaging transmit/receive interface, for tx/rx of a full
>     message to a specific endpoint
> 
>   - a "lower" hardware binding for transmit/receive of individual
>     packets, providing a method for the core to tx/rx each packet to
>     hardware
> 
> The lower interface would be plugged in to one of a number of
> hardware-specific binding implementations (most of which would be
> included in the library source tree, but others can be plugged in too).
> 
> The reason for a library is to allow the same MCTP implementation to be
> used in both OpenBMC and host firmware; the library should be
> bidirectional. To allow this, the library would be written in portable C
> (structured in a way that can be compiled as extern "C" in C++
> codebases), and be able to be configured to suit those runtime
> environments (for example, POSIX IO may not be available on all
> platforms; we should be able to compile the library to suit). The
> licence for the library should also allow this re-use; I'd suggest a
> dual Apache & GPL licence.
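
As a reader, here's how I picture that upper/lower split; a hypothetical
header just to illustrate it (every name below is a placeholder, not
taken from the prototype Jeremy mentions):

    /* Hypothetical core interface, for discussion only. */
    #ifndef MCTP_CORE_H
    #define MCTP_CORE_H

    #include <stdint.h>
    #include <stddef.h>

    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef uint8_t mctp_eid_t;

    struct mctp;   /* opaque core context: routing + reassembly state */
    struct mctp_binding;

    /* --- upper interface: whole messages to/from an endpoint --- */

    /* called by the core with a fully reassembled message */
    typedef void (*mctp_message_rx_fn)(mctp_eid_t src, void *data,
                                       const void *msg, size_t len);

    struct mctp *mctp_core_init(mctp_eid_t local_eid);
    void mctp_core_set_message_rx(struct mctp *mctp,
                                  mctp_message_rx_fn fn, void *data);
    int mctp_core_message_tx(struct mctp *mctp, mctp_eid_t dest,
                             const void *msg, size_t len);

    /* --- lower interface: individual packets to/from a binding --- */

    struct mctp_binding {
        const char *name;
        size_t mtu;   /* largest packet the hardware can carry */
        /* core calls this to push one packet out to the hardware */
        int (*packet_tx)(struct mctp_binding *b, const void *pkt,
                         size_t len);
        void *binding_data;
    };

    int mctp_core_register_binding(struct mctp *mctp,
                                   struct mctp_binding *binding);
    /* a binding calls this for each packet received from the hardware */
    void mctp_core_packet_rx(struct mctp *mctp, struct mctp_binding *binding,
                             const void *pkt, size_t len);

    #ifdef __cplusplus
    }
    #endif

    #endif /* MCTP_CORE_H */

The extern "C" guards and the absence of any POSIX calls in the
interface itself are what would keep it usable from both C++ codebases
and bare-metal host firmware.
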
> 
> As for the hardware bindings, we would want to implement a serial
> transport binding first, to allow easy prototyping in simulation. For
> OpenPOWER, we'd want to implement a "raw LPC" binding for better
> performance, and later PCIe for large transfers. I imagine that there is
> a need for an I2C binding implementation for other hardware platforms
> too.
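
Continuing the sketch above: a serial binding would then mostly be glue
around that lower interface, something like the following (again purely
illustrative, and ignoring the framing/escaping that the DMTF serial
transport binding defines):

    /* Illustrative serial binding stub against the hypothetical
     * interface sketched earlier; real framing per the DMTF serial
     * binding spec is omitted. */
    #include <stdint.h>
    #include <unistd.h>
    #include <errno.h>

    struct serial_binding {
        struct mctp_binding binding;   /* must stay the first member */
        int fd;                        /* opened UART device */
    };

    static int serial_packet_tx(struct mctp_binding *b, const void *pkt,
                                size_t len)
    {
        struct serial_binding *s = (struct serial_binding *)b;
        ssize_t rc = write(s->fd, pkt, len);
        return rc == (ssize_t)len ? 0 : -EIO;
    }

    /* daemon receive loop, heavily simplified: each (already de-framed)
     * chunk read from the UART is handed to the core for reassembly */
    static void serial_rx_loop(struct mctp *mctp, struct serial_binding *s)
    {
        uint8_t buf[256];
        ssize_t len;

        while ((len = read(s->fd, buf, sizeof(buf))) > 0)
            mctp_core_packet_rx(mctp, &s->binding, buf, (size_t)len);
    }

Swapping the read()/write() calls for an LPC window or a PCIe path is
where the per-platform work would go.
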
> 
> Lastly, I don't want to exclude any currently-used interfaces by
> implementing MCTP - this should be an optional component of OpenBMC, and
> not require platforms to implement it.
> 
> ## Alternatives Considered
> 
> There have been two main alternatives to this approach:
> 
> Continue using IPMI, but start making more use of OEM extensions to
> suit the requirements of new platforms. However, given that the IPMI
> standard is no longer under active development, we would likely end up
> with a large amount of platform-specific customisations. This also does
> not solve the hardware channel issues in a standard manner.
> 
> Redfish between host and BMC. This would mean that host firmware needs an
> HTTP client, a TCP/IP stack, a JSON (de)serialiser, and support for
> Redfish schema. This is not feasible for all host firmware
> implementations; certainly not for OpenPOWER. It's possible that we
> could run a simplified Redfish stack - indeed, the DMTF has a proposal for a
> Redfish-over-MCTP protocol, which uses simplified serialisation and no
> requirement on HTTP. However, this still introduces a large amount of
> complexity in host firmware.
> 
> ## Impacts
> 
> Development would be required to implement the MCTP transport, plus any
> new users of the MCTP messaging (eg, a PLDM implementation). These would
> somewhat duplicate the work we have in IPMI handlers.
> 
> We'd want to keep IPMI running in parallel, so the "upgrade" path should
> be fairly straightforward.
> 
> Design and development needs to involve potential host firmware
> implementations.
> 
> ## Testing
> 
> For the core MCTP library, we are able to run tests in complete
> isolation (I have already been able to run a prototype MCTP stack
> through the afl fuzzer) to ensure that the core transport protocol
> works.
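
For anyone curious how that isolated fuzzing can be wired up: a harness
can be as small as feeding stdin into the packet receive path. The
sketch below uses the hypothetical interface from earlier in this reply,
not the actual prototype's API.

    /* Minimal afl-style harness sketch: treat stdin as a stream of
     * packets and push each chunk into the core receive path. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        struct mctp *mctp = mctp_core_init(8 /* arbitrary local EID */);
        /* rx-only binding: no packet_tx needed for this harness */
        struct mctp_binding fuzz_binding = { .name = "fuzz", .mtu = 64 };
        uint8_t pkt[64];
        size_t len;

        mctp_core_register_binding(mctp, &fuzz_binding);

        /* afl mutates stdin; every chunk is treated as one packet */
        while ((len = fread(pkt, 1, sizeof(pkt), stdin)) > 0)
            mctp_core_packet_rx(mctp, &fuzz_binding, pkt, len);

        return 0;
    }

Building that with afl-gcc or afl-clang and pointing afl-fuzz at a
corpus of valid packets exercises the header parsing and reassembly code
with no hardware involved.
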
> 
> For MCTP hardware bindings, we would develop channel-specific tests that
> would be run in CI on both host and BMC.
> 
> For the OpenBMC MCTP daemon implementation, testing models would depend
> on the structure we adopt in the design section.
> 

Regards,
Deepak
