* MCTP/PLDM BMC-host communication design
@ 2022-01-17 16:33 Tung Nguyen OS
  2022-01-19 22:51 ` Andrew Jeffery
  2022-01-20  0:53 ` Jeremy Kerr
  0 siblings, 2 replies; 6+ messages in thread
From: Tung Nguyen OS @ 2022-01-17 16:33 UTC (permalink / raw)
  To: openbmc; +Cc: Thu Nguyen OS, Thang Nguyen OS

Dear community,
[Switched the email to plain-text format]

We are using the community PLDM/MCTP code to build our MCTP/PLDM stack over SMBus on an aarch64 system. Basically, we have two CPU sockets, each with its own SMBus address, and the MCTP/PLDM stack looks like this image:
https://github.com/tungnguyen-ampere/images/blob/7dba355b4742d0ffab9cd39303bbb6e9c8a6f646/current_design.png

Since the discovery process is not supported, we are using fixed EIDs for the host. During the implementation we realised that this approach has a limitation with the pldmd and host-bmc code in https://github.com/openbmc/pldm, where only a single Host_EID is used. (This might be improved later.)

A new approach we are considering is shown in this image: https://github.com/tungnguyen-ampere/images/blob/7dba355b4742d0ffab9cd39303bbb6e9c8a6f646/new_design.png

In this approach, we use a single EID for the host to communicate. The master socket 0 would then manage the host devices itself: sensors, effecters, boot progress, etc.

I would like to hear your opinions on whether this approach is applicable to your systems: the pros and cons, the performance impact, which design you would prefer, and any other suggestions.

Our goal is to make this feature work and to support the community.

Thank you & best regards,
Tung




* Re: MCTP/PLDM BMC-host communication design
  2022-01-17 16:33 MCTP/PLDM BMC-host communication design Tung Nguyen OS
@ 2022-01-19 22:51 ` Andrew Jeffery
  2022-01-20  0:53 ` Jeremy Kerr
  1 sibling, 0 replies; 6+ messages in thread
From: Andrew Jeffery @ 2022-01-19 22:51 UTC (permalink / raw)
  To: Tung Nguyen OS, openbmc, Jeremy Kerr, Matt Johnston
  Cc: Thu Nguyen OS, Thang Q. Nguyen



On Tue, 18 Jan 2022, at 03:03, Tung Nguyen OS wrote:
> Dear community,
> [Switched the email in PlainText format]
>
> We are using the community PLDM/MCTP code to build our MCTP/PLDM stack
> over SMBus on an aarch64 system. Basically, we have two CPU sockets,
> each with its own SMBus address, and the MCTP/PLDM stack looks like
> this image:
> https://github.com/tungnguyen-ampere/images/blob/7dba355b4742d0ffab9cd39303bbb6e9c8a6f646/current_design.png
>
> Since the discovery process is not supported, we are using fixed EIDs
> for the host.

This is true if you're using the libmctp userspace solution.

This isn't true if you're using the in-kernel MCTP stack with the associated utilities from Code Construct:

https://codeconstruct.com.au/docs/mctp-on-linux-introduction/

Specifically, mctpd can handle discovery and network setup for you.

The kernel solution is the future of MCTP in OpenBMC, so if this isn't part of your plan then I'd encourage you to consider it.

There might be some work to do to get the latest work from Jeremy and Matt into the OpenBMC kernel tree, but I'm sure they will be happy to help out.
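
For a sense of what the kernel stack gives you, here's a rough, untested sketch of a requester using the AF_MCTP sockets API from the kernel documentation. The EID (9) and the GetTID request bytes are just placeholders, not anything your platform requires, and the exact payload format is worth checking against Documentation/networking/mctp.rst:

/*
 * Minimal AF_MCTP requester sketch (illustrative only). Assumes a
 * kernel with the MCTP stack enabled and headers that define AF_MCTP
 * and <linux/mctp.h>. The MCTP message type is given via smctp_type.
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/mctp.h>

int main(void)
{
	/* placeholder PLDM GetTID request: Rq=1, type 0, command 0x02 */
	unsigned char req[] = { 0x80, 0x00, 0x02 };
	unsigned char rsp[1024];
	struct sockaddr_mctp addr = { 0 };
	ssize_t len;
	int sd;

	sd = socket(AF_MCTP, SOCK_DGRAM, 0);
	if (sd < 0) {
		perror("socket(AF_MCTP)");
		return 1;
	}

	addr.smctp_family = AF_MCTP;
	addr.smctp_network = MCTP_NET_ANY;  /* default MCTP network */
	addr.smctp_addr.s_addr = 9;         /* placeholder remote EID */
	addr.smctp_type = 1;                /* MCTP message type 1: PLDM */
	addr.smctp_tag = MCTP_TAG_OWNER;    /* let the kernel allocate a tag */

	len = sendto(sd, req, sizeof(req), 0,
		     (struct sockaddr *)&addr, sizeof(addr));
	if (len < 0) {
		perror("sendto");
		return 1;
	}

	/* the response arrives on the same socket, matched by tag */
	len = recvfrom(sd, rsp, sizeof(rsp), 0, NULL, NULL);
	if (len >= 0)
		printf("got %zd byte response\n", len);

	close(sd);
	return 0;
}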

Cheers,

Andrew


* Re: MCTP/PLDM BMC-host communication design
  2022-01-17 16:33 MCTP/PLDM BMC-host communication design Tung Nguyen OS
  2022-01-19 22:51 ` Andrew Jeffery
@ 2022-01-20  0:53 ` Jeremy Kerr
  2022-01-21  4:15   ` Tung Nguyen OS
  1 sibling, 1 reply; 6+ messages in thread
From: Jeremy Kerr @ 2022-01-20  0:53 UTC (permalink / raw)
  To: Tung Nguyen OS, openbmc; +Cc: Thu Nguyen OS, Thang Nguyen OS

Hi Tung,

> We are using the community PLDM/MCTP code to build our MCTP/PLDM stack
> over SMBus on an aarch64 system. Basically, we have two CPU sockets,
> each with its own SMBus address, and the MCTP/PLDM stack looks like
> this image:
>  
> https://github.com/tungnguyen-ampere/images/blob/7dba355b4742d0ffab9cd39303bbb6e9c8a6f646/current_design.png

That looks good to me, but a couple of notes:

 - EID 0 and EID 1 are reserved addresses according to the spec; the
   usable range starts at 8

 - therefore, the *convention* so far for EID allocation is to assign
   EID 8 to the BMC, as the top-level bus owner, and allocate onwards
   from there. However, that's certainly not fixed if you require
   something different for your design.

 - you don't necessarily need two EIDs (0 and 1) for the BMC there.
   Even if there are two interfaces, you can use a single EID on the
   BMC, which simplifies things.

> Since the discovery process is not supported, we are using fixed EIDs
> for the host.

As Andrew has mentioned, we have the in-kernel stack working, including
the EID discovery process using MCTP Control Protocol messaging.

If you'd like to experiment with this, we have a couple of backport
branches for 5.10 and 5.15 kernels, depending on which you're working
with:

 https://codeconstruct.com.au/docs/mctp-on-linux-introduction/#our-development-branches

It's still possible to use fixed EID(s) for remote endpoints though, if
your host MCTP stack does not support the control protocol. You'll just
need to set up (on the BMC) some static routes for the fixed remote
EIDs. I'm happy to help out with configuring that if you like.

> A new approach we are considering is shown in this image:
> https://github.com/tungnguyen-ampere/images/blob/7dba355b4742d0ffab9cd39303bbb6e9c8a6f646/new_design.png

That approach has some considerable drawbacks, though:

 - you'll now need to implement MCTP bridging between the SMBus link
   (between host and socket 0) and whatever interface you're using to
   communicate between socket 0 and socket 1. This may then require you
   to implement more of the control protocol stack on the host (for
   example, as you'll need to allocate EID pools from the top-level bus
   owner, if you're doing dynamic addressing).

   That's all still possible, but it means more firmware that you'll
   need to implement.

 - if there's an issue with socket 0's link (say, if the host has
   offlined the CPUs in socket 0), you might lose MCTP connectivity
   between the BMC and socket 1 too.

That said, it's still feasible, but I'd suggest your first design as a
simpler and more reliable solution.

Regards,


Jeremy



* Re: MCTP/PLDM BMC-host communication design
  2022-01-20  0:53 ` Jeremy Kerr
@ 2022-01-21  4:15   ` Tung Nguyen OS
  2022-01-21  4:23     ` Tung Nguyen OS
  0 siblings, 1 reply; 6+ messages in thread
From: Tung Nguyen OS @ 2022-01-21  4:15 UTC (permalink / raw)
  To: Jeremy Kerr, openbmc; +Cc: Thu Nguyen OS, Thang Nguyen OS


Dear Jeremy, Andrew,
Thank you for your comments. We are using the userspace MCTP stack and will consider moving to the in-kernel MCTP stack as suggested.
Because of our specific requirements, we are looking for the simpler way. In our case, we have on-chip sensors and events on both sockets, and we must send PLDM commands to poll that data. If we use two interfaces to communicate with the host, I think sending to multiple sockets would be complex.
The things to consider are:
+ If a socket has a problem at runtime, does the MCTP/PLDM process still work?
+ If one or more sockets have a problem, can we reboot the whole system to recover?

When using one interface, I think:
+ On the host side, socket 0 (the master) should manage the other sockets (perhaps not via SMBus, but over a faster inter-socket link). Of course, more work would have to be implemented in the firmware, as you have pointed out.
+ The BMC just recovers the system (via reboot) when socket 0 has an issue; otherwise it does properly.

Do you see any further issues with the communication performance?

Thanks,


* Re: MCTP/PLDM BMC-host communication design
  2022-01-21  4:15   ` Tung Nguyen OS
@ 2022-01-21  4:23     ` Tung Nguyen OS
  2022-01-21  4:37       ` Jeremy Kerr
  0 siblings, 1 reply; 6+ messages in thread
From: Tung Nguyen OS @ 2022-01-21  4:23 UTC (permalink / raw)
  To: Jeremy Kerr, andrew; +Cc: Thu Nguyen OS, Thang Nguyen OS, openbmc

Dear Jeremy, Andrew,
Thank you for your comments. We are using the userspace MCTP stack and will consider moving to the in-kernel MCTP stack as suggested.
Because of our specific requirements, we are looking for the simpler way. In our case, we have on-chip sensors and events on both sockets, and we must send PLDM commands to poll that data. If we use two interfaces to communicate with the host, I think sending to multiple sockets would be complex.
The things to consider are:
+ If a socket has a problem at runtime, does the MCTP/PLDM process still work?
+ If one or more sockets have a problem, can we reboot the whole system to recover?

When using one interface, I think:
+ On the host side, socket 0 (the master) should manage the other sockets (perhaps not via SMBus, but over a faster inter-socket link). Of course, more work would have to be implemented in the firmware, as you have pointed out.
+ The BMC just recovers the system (via reboot) when socket 0 has an issue; otherwise it does properly.

Do you see any further issues with the communication performance?

Thanks,


* Re: MCTP/PLDM BMC-host communication design
  2022-01-21  4:23     ` Tung Nguyen OS
@ 2022-01-21  4:37       ` Jeremy Kerr
  0 siblings, 0 replies; 6+ messages in thread
From: Jeremy Kerr @ 2022-01-21  4:37 UTC (permalink / raw)
  To: Tung Nguyen OS, andrew; +Cc: Thu Nguyen OS, Thang Nguyen OS, openbmc

Hi Tung,

> Thank you for your comments. We are using the userspace MCTP stack
> and will consider moving to the in-kernel MCTP stack as suggested.
> Because of our specific requirements, we are looking for the simpler
> way. In our case, we have on-chip sensors and events on both sockets,
> and we must send PLDM commands to poll that data.

Yes, that all sounds fine.

> If we use two interfaces to communicate with the host, I think
> sending to multiple sockets would be complex.

[We're at risk of overloading the term "socket" here, as it also refers
to the kernel interface to the MCTP stack - the sockets API. So I'll use
the word "CPU" instead, referring to the physical device, being the
MCTP/PLDM endpoint]

If you're using the kernel stack, there's no real additional complexity
with the two-interface model; you would just configure each interface,
and set up routes to each CPU EID. This is a once-off configuration at
BMC boot time. If you're using dynamic addressing, mctpd takes care of
that for you.

The PLDM application only needs to have knowledge of the EIDs of the
CPUs - the kernel handles the routing of which interface to transmit
packets over, based on the packets' destination EIDs.
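
To illustrate (an untested sketch rather than anything definitive, with EIDs 9 and 10 as placeholders), a poll cycle on the BMC could look roughly like this - one socket, no per-interface handling:

/*
 * Sketch: send the same PLDM request to both CPU endpoints over a
 * single AF_MCTP socket. The kernel's MCTP routing table decides
 * which interface each request goes out on.
 */
#include <sys/socket.h>
#include <linux/mctp.h>

static ssize_t send_pldm_request(int sd, unsigned char eid,
				 const void *req, size_t len)
{
	struct sockaddr_mctp addr = { 0 };

	addr.smctp_family = AF_MCTP;
	addr.smctp_network = MCTP_NET_ANY;
	addr.smctp_addr.s_addr = eid;    /* destination CPU EID */
	addr.smctp_type = 1;             /* MCTP message type 1: PLDM */
	addr.smctp_tag = MCTP_TAG_OWNER; /* request; kernel assigns the tag */

	return sendto(sd, req, len, 0,
		      (struct sockaddr *)&addr, sizeof(addr));
}

void poll_cpus(int sd)
{
	/* placeholder PLDM request (GetTID) */
	const unsigned char req[] = { 0x80, 0x00, 0x02 };
	const unsigned char cpu_eids[] = { 9, 10 };  /* placeholder EIDs */
	size_t i;

	for (i = 0; i < sizeof(cpu_eids); i++)
		send_pldm_request(sd, cpu_eids[i], req, sizeof(req));

	/*
	 * Responses arrive on this same socket; recvfrom() fills in a
	 * sockaddr_mctp with the source EID, so replies from the two
	 * CPUs can be told apart.
	 */
}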

> The things to consider are:
> + If a socket has a problem at runtime, does the MCTP/PLDM process
> still work?

The MCTP stack on the BMC will be fine. I assume the BMC PLDM
application will time out any pending requests, and should handle that
gracefully too.
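
That can be an ordinary poll() on the MCTP socket; roughly (untested, with an arbitrary one-second timeout):

/* Sketch: wait for a PLDM response, and carry on gracefully if the
 * endpoint never answers. */
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* returns the response length, 0 on timeout, -1 on error */
ssize_t wait_for_response(int sd, void *buf, size_t buflen)
{
	struct pollfd pfd = { .fd = sd, .events = POLLIN };
	int rc;

	rc = poll(&pfd, 1, 1000 /* ms, arbitrary for this sketch */);
	if (rc <= 0)
		return rc;  /* 0: timed out - log and move on; <0: error */

	return recvfrom(sd, buf, buflen, 0, NULL, NULL);
}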

> + If one or more sockets have a problem, can we reboot the whole
> system to recover?

You could, but that's pretty heavy-handed. There should be no need to
reboot the BMC at all. And for the CPU's MCTP implementation, I assume
there's a way to perform recovery there, rather than requiring a host
reboot.

The two-interface architecture does give you more fault-tolerance there;
if one CPU's MCTP stack is not reachable, it doesn't prevent
communication with the other.

> When using one interface, I think:
> + On the host side, socket 0 (the master) should manage the other
> sockets (perhaps not via SMBus, but over a faster inter-socket link).
> Of course, more work would have to be implemented in the firmware, as
> you have pointed out.
> + The BMC just recovers the system (via reboot) when socket 0 has an
> issue; otherwise it does properly

Not sure what you mean by "it does properly" there - but I think avoiding
host reboots would definitely be a good thing. Also, if the fault on
CPU0 isn't recoverable, you won't be able to perform any communication
with CPU1.

Regards,


Jeremy

