From: Supreeth Venkatesh <supreeth.venkatesh@arm.com>
To: Deepak Kodihalli <dkodihal@linux.vnet.ibm.com>,
	Jeremy Kerr <jk@ozlabs.org>,  openbmc <openbmc@lists.ozlabs.org>
Cc: Emily Shaffer <emilyshaffer@google.com>,
	David Thompson <dthompson@mellanox.com>,
	Dong Wei <Dong.Wei@arm.com>,
	"Naidoo, Nilan" <nilan.naidoo@intel.com>
Subject: Re: Initial MCTP design proposal
Date: Wed, 12 Dec 2018 16:50:49 -0600
Message-ID: <00af4681d9b2b912cc461480d8e6f20e4032cb5b.camel@arm.com>
In-Reply-To: <42032c4e-4594-5803-67b4-79ba8622fc71@linux.vnet.ibm.com>

On Tue, 2018-12-11 at 13:08 +0530, Deepak Kodihalli wrote:
> On 10/12/18 11:10 PM, Supreeth Venkatesh wrote:
> > On Mon, 2018-12-10 at 11:44 +0530, Deepak Kodihalli wrote:
> > > On 07/12/18 10:39 PM, Supreeth Venkatesh wrote:
> > > > On Fri, 2018-12-07 at 10:43 +0530, Deepak Kodihalli wrote:
> > > > > On 07/12/18 8:11 AM, Jeremy Kerr wrote:
> > > > > > Hi OpenBMCers!
> > > > > > 
> > > > > > In an earlier thread, I promised to sketch out a design for
> > > > > > an MCTP implementation in OpenBMC, and I've included it
> > > > > > below.
> > > > > 
> > > > > 
> > > > > Thanks Jeremy for sending this out. This looks good (I have
> > > > > just one comment below).
> > > > > 
> > > > > Question for everyone: do you have plans to employ PLDM over
> > > > > MCTP?
> > > > 
> > > > Yes, Deepak, we do eventually.
> > > 
> > > 
> > > Thanks for letting me know, Supreeth!
> > 
> > My pleasure.
> > 
> > > 
> > > > > 
> > > > > We are interested in PLDM for various "inside the box"
> > > > > communications (at the moment for the Host <-> BMC
> > > > > communication). I'd like to propose a design for a PLDM stack
> > > > > on OpenBMC, and will send a design template for review on the
> > > > > mailing list soon (I've just started with some initial
> > > > > sketches). I'd also like to know if others have embarked on a
> > > > > similar activity, so that we can collaborate earlier and
> > > > > avoid duplicate work.
> > > > 
> > > > Yes, we're interested in collaborating.
> > > > Which portion of PLDM are you working on, other than base?
> > > > Platform Monitoring and Control?
> > > > Firmware Update?
> > > > BIOS Control and Configuration?
> > > > SMBIOS Transfer?
> > > > FRU Data?
> > > > Redfish Device Enablement?
> > > > 
> > > > We are currently interested in Platform Monitoring and Control.
> > > 
> > > 
> > > We're interested in each of these profiles for the BMC-host
> > > communications. Are you interested in Platform Monitoring and
> > > Control for communications involving the BMC and the host
> > > firmware, or the BMC and other devices?
> > 
> > BMC and the host firmware initially.
> > 
> > > 
> > > Also, I have been thinking about the usefulness/feasibility of a
> > > common PLDM library (just the protocol piece - encoding and
> > > decoding PLDM messages), so as to be able to share code between
> > > the BMC and host firmware. This of course sets expectations on
> > > the library from both OpenBMC and the various host firmware
> > > stacks. Do you have an opinion on this?
> > 
> > Glad that we are on the same page.
> > My thinking at this point is to come up with a generic, standalone
> > C library which processes PLDM messages without regard to whether
> > the message contains a payload for sensors, firmware update, etc.,
> > so that it can be ported to host firmware if needed.
> 
> I was thinking of a C lib as well (given the lack of, or limited,
> C++ stdlib support on some host firmware stacks). Although, when you
> say a lib that processes the PLDM messages, do you mean just the
> parsing part?
> 
The example you gave below aptly sums up what I had in mind.
 
> I think the processing/handling of a PLDM message would be platform
> specific, because that involves mapping PLDM concepts to platform
> concepts (e.g. to D-Bus on OpenBMC). What I believe can go into the
> common lib is the marshalling and unmarshalling of PLDM messages. So,
> for example, if the platform has all the necessary information to
> make a PLDM message, it can rely on this lib to actually prepare the
> message for it. Plus the reverse flow - decode an incoming PLDM
> message into C-style data types. We'd have to work out what these
> APIs look like. Consumers of this lib would be the PLDM
> app(s)/daemon(s).
Yes, exactly.
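
To make that concrete, here's the rough shape I have in mind for the
encode/decode piece - all names below are hypothetical, purely to
illustrate the idea, and not taken from any spec:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /* A minimal PLDM message header; real layouts would follow
     * the DMTF specs. No allocation and no I/O, so the same code
     * can be ported to host firmware. */
    struct pldm_msg_hdr {
        uint8_t instance_id;
        uint8_t pldm_type;  /* base, monitoring, FRU, ... */
        uint8_t command;
    };

    /* Encode header + payload into a caller-provided buffer.
     * Returns total length, or -1 if the buffer is too small. */
    int pldm_encode_msg(const struct pldm_msg_hdr *hdr,
                        const void *payload, size_t payload_len,
                        uint8_t *buf, size_t buf_len)
    {
        if (buf_len < sizeof(*hdr) + payload_len)
            return -1;
        memcpy(buf, hdr, sizeof(*hdr));
        memcpy(buf + sizeof(*hdr), payload, payload_len);
        return (int)(sizeof(*hdr) + payload_len);
    }

    /* Decode in place: fill in the header, and point the caller
     * at the payload within the received buffer. */
    int pldm_decode_msg(const uint8_t *buf, size_t buf_len,
                        struct pldm_msg_hdr *hdr,
                        const uint8_t **payload, size_t *payload_len)
    {
        if (buf_len < sizeof(*hdr))
            return -1;
        memcpy(hdr, buf, sizeof(*hdr));
        *payload = buf + sizeof(*hdr);
        *payload_len = buf_len - sizeof(*hdr);
        return 0;
    }

The important property is that the library stays free of allocation,
I/O and OS dependencies; the platform-specific mapping (D-Bus on
OpenBMC) stays entirely outside it.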

> 
> > > 
> > > > > 
> > > > > > This is roughly in the OpenBMC design document format
> > > > > > (thanks for the reminder, Andrew), but I've sent it to the
> > > > > > list for initial review before proposing it to gerrit -
> > > > > > mainly because a lot of folks expressed interest on the
> > > > > > list. I suggest we move to gerrit once specific feedback
> > > > > > starts coming in. Let me know if you have general comments
> > > > > > whenever you like, though.
> > > > > > 
> > > > > > In parallel, I've been developing a prototype of the MCTP
> > > > > > library mentioned below, including a serial transport
> > > > > > binding. I'll push it to GitHub and post a link soon, once
> > > > > > I have it in a slightly more consumable form.
> > > > > > 
> > > > > > Cheers,
> > > > > > 
> > > > > > 
> > > > > > Jeremy
> > > > > > 
> > > > > > --------------------------------------------------------
> > > > > > 
> > > > > > # Host/BMC communication channel: MCTP & PLDM
> > > > > > 
> > > > > > Author: Jeremy Kerr <jk@ozlabs.org> <jk>
> > > > > > 
> > > > > > ## Problem Description
> > > > > > 
> > > > > > Currently, we have a few different methods of
> > > > > > communication between host and BMC. This is primarily
> > > > > > IPMI-based, but also includes a few hardware-specific side
> > > > > > channels, like hiomap. On OpenPOWER hardware at least,
> > > > > > we've definitely started to hit some of the limitations of
> > > > > > IPMI (for example, we need more than 255 sensors), as well
> > > > > > as of the hardware channels that IPMI typically uses.
> > > > > > 
> > > > > > This design aims to use the Management Component Transport
> > > > > > Protocol (MCTP) to provide a common transport layer over
> > > > > > the multiple channels that OpenBMC platforms provide. Then,
> > > > > > on top of MCTP, we have the opportunity to move to newer
> > > > > > host/BMC messaging protocols to overcome some of the
> > > > > > limitations we've encountered with IPMI.
> > > > > > 
> > > > > > ## Background and References
> > > > > > 
> > > > > > Separating the "transport" and "messaging protocol" parts
> > > > > > of the current stack allows us to design these parts
> > > > > > separately. Currently, IPMI defines both: we have BT and
> > > > > > KCS (both defined as part of the IPMI 2.0 standard) as the
> > > > > > transports, and IPMI itself as the messaging protocol.
> > > > > > 
> > > > > > Some efforts to improve the hardware transport mechanism
> > > > > > of IPMI have been attempted, but not in a
> > > > > > cross-implementation manner so far. This also does not
> > > > > > address the limitations of the IPMI data model.
> > > > > > 
> > > > > > MCTP defines a standard transport protocol, plus a number
> > > > > > of separate hardware bindings for the actual transport of
> > > > > > MCTP packets. These are defined by the DMTF's Platform
> > > > > > Management Working Group; the standards are available at:
> > > > > > 
> > > > > >      https://www.dmtf.org/standards/pmci
> > > > > > 
> > > > > > I have included a small diagram of how these standards may
> > > > > > fit together in an OpenBMC system. The DSP numbers there
> > > > > > are references to DMTF standards.
> > > > > > 
> > > > > > One of the key concepts here is the separation of the
> > > > > > transport protocol from the hardware bindings; this means
> > > > > > that an MCTP "stack" may be using an I2C, PCI, serial or
> > > > > > custom hardware channel, without the higher layers of that
> > > > > > stack needing to be aware of the hardware implementation.
> > > > > > These higher levels only need to be aware that they are
> > > > > > communicating with a certain endpoint, identified by its
> > > > > > Endpoint ID (MCTP EID).
> > > > > > 
> > > > > > I've mainly focussed on the "transport" part of the design
> > > > > > here. While this does enable new messaging protocols
> > > > > > (mainly PLDM), I haven't covered that much; we will propose
> > > > > > those details in a separate design effort.
> > > > > > 
> > > > > > As part of the design, I have referred to MCTP "messages"
> > > > > > and "packets"; this is intentional, to match the
> > > > > > definitions in the MCTP standard. MCTP messages are the
> > > > > > higher-level data transferred between MCTP endpoints, while
> > > > > > packets are typically smaller, and are what is sent over
> > > > > > the hardware. Messages that are larger than the hardware
> > > > > > MTU are split into individual packets by the transmit
> > > > > > implementation, and reassembled at the receive
> > > > > > implementation.
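> > > > > > 
> > > > > > As a rough sketch of the transmit-side split (purely
> > > > > > illustrative - the real packet header layout is defined in
> > > > > > DSP0236, and these names are hypothetical):
> > > > > > 
> > > > > >     #include <stddef.h>
> > > > > >     #include <stdint.h>
> > > > > >     #include <string.h>
> > > > > > 
> > > > > >     #define PKT_MTU 64 /* example binding payload limit */
> > > > > > 
> > > > > >     struct pkt {
> > > > > >         uint8_t dest_eid;
> > > > > >         uint8_t flags; /* bit 7: SOM, bit 6: EOM */
> > > > > >         uint8_t seq;   /* 2-bit packet sequence number */
> > > > > >         uint8_t len;
> > > > > >         uint8_t data[PKT_MTU];
> > > > > >     };
> > > > > > 
> > > > > >     /* Split one message into SOM/EOM-flagged packets,
> > > > > >      * handing each to the binding's transmit hook. */
> > > > > >     void tx_fragments(uint8_t dest, const uint8_t *msg,
> > > > > >                       size_t len,
> > > > > >                       void (*pkt_tx)(const struct pkt *))
> > > > > >     {
> > > > > >         size_t off = 0;
> > > > > >         uint8_t seq = 0;
> > > > > > 
> > > > > >         while (off < len) {
> > > > > >             struct pkt p = { .dest_eid = dest };
> > > > > >             size_t n = len - off < PKT_MTU
> > > > > >                            ? len - off : PKT_MTU;
> > > > > > 
> > > > > >             p.flags = (off == 0 ? 0x80 : 0) |
> > > > > >                       (off + n == len ? 0x40 : 0);
> > > > > >             p.seq = seq++ & 0x03;
> > > > > >             p.len = (uint8_t)n;
> > > > > >             memcpy(p.data, msg + off, n);
> > > > > >             pkt_tx(&p);
> > > > > >             off += n;
> > > > > >         }
> > > > > >     }
> > > > > > 
> > > > > > The receive side does the inverse: append each payload to
> > > > > > a reassembly buffer, and hand the full message up once a
> > > > > > packet with EOM set arrives.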
> > > > > > 
> > > > > > A final important point is that this design is for the
> > > > > > host <--> BMC channel *only*. Even if we do replace IPMI
> > > > > > for the host interface, we will certainly need an IPMI
> > > > > > interface available for external system management.
> > > > > > 
> > > > > > ## Requirements
> > > > > > 
> > > > > > Any channel between host and BMC should:
> > > > > > 
> > > > > >     - Have a simple serialisation and deserialisation
> > > > > >       format, to enable implementations in host firmware,
> > > > > >       which have widely varying runtime capabilities
> > > > > > 
> > > > > >     - Allow different hardware channels, as we have a wide
> > > > > >       variety of target platforms for OpenBMC
> > > > > > 
> > > > > >     - Be usable over simple hardware implementations, but
> > > > > >       have a facility for higher-bandwidth messaging on
> > > > > >       platforms that require it
> > > > > > 
> > > > > >     - Ideally, integrate with newer messaging protocols
> > > > > > 
> > > > > > ## Proposed Design
> > > > > > 
> > > > > > The MCTP core specification just provides the
> > > > > > packetisation, routing and addressing mechanisms. The
> > > > > > actual transmit/receive of those packets is up to the
> > > > > > hardware binding of the MCTP transport.
> > > > > > 
> > > > > > For OpenBMC, we would introduce an MCTP daemon, which
> > > > > > implements the transport over a configurable hardware
> > > > > > channel (e.g. serial UART, I2C or PCI). This daemon is
> > > > > > responsible for the packetisation and routing of MCTP
> > > > > > messages to and from host firmware.
> > > > > > 
> > > > > > I see two options for the "inbound" or "application"
> > > > > > interface of the MCTP daemon:
> > > > > > 
> > > > > >     - it could handle upper parts of the stack (e.g. PLDM)
> > > > > >       directly, through in-process handlers that register
> > > > > >       for certain MCTP message types; or
> > > > > 
> > > > > We'd like to somehow ensure (at least via documentation)
> > > > > that the handlers don't block the MCTP daemon from
> > > > > processing incoming traffic. The handlers might anyway end
> > > > > up making IPC calls (via D-Bus) to other processes. The
> > > > > second approach below seems to alleviate this problem.
> > > > > 
> > > > > >     - it could channel raw MCTP messages (reassembled from
> > > > > >       MCTP packets) to D-Bus messages (similar to the
> > > > > >       current IPMI host daemons), where the upper layers
> > > > > >       receive and act on those D-Bus events.
> > > > > > 
> > > > > > I have a preference for the former, but I would be
> > > > > > interested to hear from the IPMI folks about how the
> > > > > > latter structure has worked in the past.
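> > > > > > 
> > > > > > For the former, a sketch of what in-process handler
> > > > > > registration could look like (hypothetical API, not the
> > > > > > prototype's actual names):
> > > > > > 
> > > > > >     #include <stddef.h>
> > > > > >     #include <stdint.h>
> > > > > > 
> > > > > >     struct mctp; /* the daemon's MCTP stack context */
> > > > > > 
> > > > > >     /* Called with a fully reassembled message of the
> > > > > >      * registered MCTP message type; handlers must not
> > > > > >      * block the daemon's receive loop. */
> > > > > >     typedef void (*mctp_rx_fn)(uint8_t src_eid,
> > > > > >                                const uint8_t *msg,
> > > > > >                                size_t len, void *ctx);
> > > > > > 
> > > > > >     int mctp_register_handler(struct mctp *mctp,
> > > > > >                               uint8_t msg_type,
> > > > > >                               mctp_rx_fn fn, void *ctx);
> > > > > > 
> > > > > >     /* Usage, e.g. from a PLDM daemon, with 0x01 being
> > > > > >      * the MCTP message type assigned to PLDM:
> > > > > >      *   mctp_register_handler(mctp, 0x01, pldm_rx, NULL);
> > > > > >      */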
> > > > > > 
> > > > > > The proposed implementation here is to produce an MCTP
> > > > > > "library" which provides the packetisation and routing
> > > > > > functions, between:
> > > > > > 
> > > > > >     - an "upper" messaging transmit/receive interface, for
> > > > > >       tx/rx of a full message to a specific endpoint
> > > > > > 
> > > > > >     - a "lower" hardware binding for transmit/receive of
> > > > > >       individual packets, providing a method for the core
> > > > > >       to tx/rx each packet to hardware
> > > > > > 
> > > > > > The lower interface would be plugged in to one of a number
> > > > > > of hardware-specific binding implementations (most of
> > > > > > which would be included in the library source tree, but
> > > > > > others could be plugged in too).
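> > > > > > 
> > > > > > A sketch of how those two interfaces might look in C
> > > > > > (names are illustrative only, not the prototype's API):
> > > > > > 
> > > > > >     #include <stddef.h>
> > > > > >     #include <stdint.h>
> > > > > > 
> > > > > >     struct mctp;     /* core library context */
> > > > > >     struct mctp_pkt; /* one packet, up to the MTU */
> > > > > > 
> > > > > >     /* "Lower" interface: each binding supplies packet
> > > > > >      * transmit, and feeds received packets back into the
> > > > > >      * core via mctp_pkt_rx(). */
> > > > > >     struct mctp_binding {
> > > > > >         const char *name; /* "serial", "lpc", "i2c" */
> > > > > >         size_t pkt_mtu;
> > > > > >         int (*pkt_tx)(struct mctp_binding *binding,
> > > > > >                       struct mctp_pkt *pkt);
> > > > > >     };
> > > > > > 
> > > > > >     void mctp_pkt_rx(struct mctp *mctp,
> > > > > >                      struct mctp_pkt *pkt);
> > > > > > 
> > > > > >     /* "Upper" interface: whole-message tx/rx against a
> > > > > >      * remote EID; the core fragments and reassembles
> > > > > >      * across the binding in between. */
> > > > > >     int mctp_message_tx(struct mctp *mctp,
> > > > > >                         uint8_t dest_eid,
> > > > > >                         const void *msg, size_t len);
> > > > > >     void mctp_set_rx(struct mctp *mctp,
> > > > > >                      void (*rx)(uint8_t src_eid,
> > > > > >                                 const void *msg,
> > > > > >                                 size_t len));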
> > > > > > 
> > > > > > The reason for a library is to allow the same MCTP
> > > > > > implementation to be used in both OpenBMC and host
> > > > > > firmware; the library should be bidirectional. To allow
> > > > > > this, the library would be written in portable C
> > > > > > (structured in a way that can be compiled as "extern C" in
> > > > > > C++ codebases), and be configurable to suit those runtime
> > > > > > environments (for example, POSIX IO may not be available
> > > > > > on all platforms; we should be able to compile the library
> > > > > > to suit). The licence for the library should also allow
> > > > > > this re-use; I'd suggest a dual Apache & GPL licence.
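> > > > > > 
> > > > > > For example, the public header would carry the usual
> > > > > > guards so that C++ firmware can include it directly
> > > > > > (header name hypothetical):
> > > > > > 
> > > > > >     /* libmctp.h - hypothetical public header */
> > > > > >     #ifndef LIBMCTP_H
> > > > > >     #define LIBMCTP_H
> > > > > > 
> > > > > >     #ifdef __cplusplus
> > > > > >     extern "C" {
> > > > > >     #endif
> > > > > > 
> > > > > >     /* public API declarations here */
> > > > > > 
> > > > > >     #ifdef __cplusplus
> > > > > >     }
> > > > > >     #endif
> > > > > > 
> > > > > >     #endif /* LIBMCTP_H */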
> > > > > > 
> > > > > > As for the hardware bindings, we would want to implement a
> > > > > > serial transport binding first, to allow easy prototyping
> > > > > > in simulation. For OpenPOWER, we'd want to implement a
> > > > > > "raw LPC" binding for better performance, and later PCIe
> > > > > > for large transfers. I imagine that there is a need for an
> > > > > > I2C binding implementation for other hardware platforms
> > > > > > too.
> > > > > > 
> > > > > > Lastly, I don't want to exclude any currently-used
> > > > > > interfaces by implementing MCTP - this should be an
> > > > > > optional component of OpenBMC, and should not require
> > > > > > platforms to implement it.
> > > > > > 
> > > > > > ## Alternatives Considered
> > > > > > 
> > > > > > There have been two main alternatives to this approach:
> > > > > > 
> > > > > > Continue using IPMI, but start making more use of OEM
> > > > > > extensions to suit the requirements of new platforms.
> > > > > > However, given that the IPMI standard is no longer under
> > > > > > active development, we would likely end up with a large
> > > > > > number of platform-specific customisations. This also does
> > > > > > not solve the hardware channel issues in a standard
> > > > > > manner.
> > > > > > 
> > > > > > Redfish between host and BMC. This would mean that host
> > > > > > firmware needs an HTTP client, a TCP/IP stack, a JSON
> > > > > > (de)serialiser, and support for the Redfish schema. This
> > > > > > is not feasible for all host firmware implementations;
> > > > > > certainly not for OpenPOWER. It's possible that we could
> > > > > > run a simplified Redfish stack - indeed, MCTP has a
> > > > > > proposal for a Redfish-over-MCTP protocol, which uses a
> > > > > > simplified serialisation and has no requirement on HTTP.
> > > > > > However, this still introduces a large amount of
> > > > > > complexity in host firmware.
> > > > > > 
> > > > > > ## Impacts
> > > > > > 
> > > > > > Development would be required to implement the MCTP
> > > > > > transport, plus any new users of the MCTP messaging
> > > > > > (e.g. a PLDM implementation). These would somewhat
> > > > > > duplicate the work we have in the IPMI handlers.
> > > > > > 
> > > > > > We'd want to keep IPMI running in parallel, so the
> > > > > > "upgrade" path should be fairly straightforward.
> > > > > > 
> > > > > > Design and development need to involve potential host
> > > > > > firmware implementations.
> > > > > > 
> > > > > > ## Testing
> > > > > > 
> > > > > > For the core MCTP library, we are able to run tests in
> > > > > > complete isolation (I have already been able to run a
> > > > > > prototype MCTP stack through the afl fuzzer) to ensure
> > > > > > that the core transport protocol works.
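> > > > > > 
> > > > > > An isolated fuzzing entry point could be as simple as the
> > > > > > following (function names hypothetical; afl feeds the
> > > > > > harness via stdin):
> > > > > > 
> > > > > >     #include <stddef.h>
> > > > > >     #include <stdint.h>
> > > > > >     #include <stdio.h>
> > > > > > 
> > > > > >     /* Hypothetical library entry points. */
> > > > > >     struct mctp;
> > > > > >     struct mctp *mctp_init(void);
> > > > > >     void mctp_rx_raw(struct mctp *mctp,
> > > > > >                      const uint8_t *buf, size_t len);
> > > > > > 
> > > > > >     int main(void)
> > > > > >     {
> > > > > >         uint8_t buf[4096];
> > > > > >         size_t len = fread(buf, 1, sizeof(buf), stdin);
> > > > > >         struct mctp *mctp = mctp_init();
> > > > > > 
> > > > > >         /* Treat the fuzzer input as a received packet
> > > > > >          * stream; the transport code must never crash on
> > > > > >          * malformed data. */
> > > > > >         mctp_rx_raw(mctp, buf, len);
> > > > > >         return 0;
> > > > > >     }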
> > > > > > 
> > > > > > For MCTP hardware bindings, we would develop
> > > > > > channel-specific tests that would be run in CI on both
> > > > > > host and BMC.
> > > > > > 
> > > > > > For the OpenBMC MCTP daemon implementation, testing
> > > > > > models would depend on the structure we adopt in the
> > > > > > design section.
> > > > > > 
> > > > > 
> > > > > Regards,
> > > > > Deepak
> > > > > 
> > > 
> > > 
> 
> 
