* BMC redundancy
@ 2018-01-29 15:52 Brad Bishop
       [not found] ` <7eb0f506-1dd7-1a28-cc0a-9f7813c28562@yadro.com>
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Brad Bishop @ 2018-01-29 15:52 UTC (permalink / raw)
  To: OpenBMC Maillist

I know we have a lot of work to do with the basics before tackling something
like supporting multiple BMCs in a single system, but it's never too early to
brainstorm.

Quick community poll:  Please share any thoughts you may have around supporting
systems with multiple BMCs.  Does your organization care?  Thoughts on how it
could/should be done?  System designs that are a non-starter for OpenBMC?

Would love to see some brain-dumps here.

thx - brad

* Re: BMC redundancy
       [not found]   ` <524CA01B-1D8E-4C15-B5DB-A27157FBECB7@fuzziesquirrel.com>
@ 2018-01-29 18:14     ` Brad Bishop
  0 siblings, 0 replies; 16+ messages in thread
From: Brad Bishop @ 2018-01-29 18:14 UTC (permalink / raw)
  To: OpenBMC Maillist; +Cc: Alexander Amelkin

+cc: list

> On Jan 29, 2018, at 1:13 PM, Brad Bishop <bradleyb@fuzziesquirrel.com> wrote:
> 
> Thanks for the quick reply!
> 
>> On Jan 29, 2018, at 11:45 AM, Alexander Amelkin <a.amelkin@yadro.com> wrote:
>> 
>> Brad, do you have any examples of existing systems having multiple BMCs with any commercial firmware like MegaRAC?
> 
> Not really.  I’m just familiar with IBM systems and this is something
> we have done in our commercial BMC stacks for a long time.
> 
>> 
>> I can't think of a problem you're proposing to address with multiple BMCs for a single host, but I can imagine a number of problems it may add.
> 
> Indeed.  Obviously complexity is one.  I’d be interested in hearing about
> other problems.  You’ve exposed my agenda - I’m looking for ways for IBM to
> be able to support something like this in OpenBMC but at the same time
> minimize the complexity burden for everyone else.
> 
>> 
>> BMC lockup? Solved by hardware watchdog.
>> BMC firmware corruption? Solved by read-only golden image on a separate flash IC (well supported by at least Aspeed).
>> BMC DoS attack? Solved by network isolation and overall correct network environment configuration.
>> BMC chip burnout? Does it happen at all? Isn't this an indicator of some major hardware design flaw? Does adding another chip actually solve this problem?
> 
> Yeah on the surface I agree with all your points here, assuming the
> definition of a system is a single board.  It doesn’t make sense.
> 
> We do have some modular system designs though where N discrete chassis
> can be connected together with high speed cabling or a backplane for
> a single SMP fabric across the N chassis for example.  It's these kinds
> of system designs where multiple BMCs make a little more sense.
> 
>> 
>> What else?
> 
> It's really the connections between the BMC and the host hardware on
> these larger systems.  These busses can and do have both transient and
> hard failures, and IBM needs a way to maintain connections from a BMC
> to the host hardware in that state.  Also, while not really a redundancy
> statement, it simply isn’t economical for BMC vendors to develop SOCs
> that can provide enough pins on systems this large.
> 
> -brad

* Re: BMC redundancy
  2018-01-29 15:52 BMC redundancy Brad Bishop
       [not found] ` <7eb0f506-1dd7-1a28-cc0a-9f7813c28562@yadro.com>
@ 2018-01-29 20:43 ` Vernon Mauery
  2018-01-29 21:38   ` Brad Bishop
  2018-01-31  6:27 ` Deepak Kodihalli
  2 siblings, 1 reply; 16+ messages in thread
From: Vernon Mauery @ 2018-01-29 20:43 UTC (permalink / raw)
  To: Brad Bishop; +Cc: OpenBMC Maillist

On 29-Jan-2018 10:52 AM, Brad Bishop wrote:
>I know we have a lot of work to do with the basics before tackling something
>like supporting multiple BMCs in a single system, but it's never too early to
>brainstorm.
>
>Quick community poll:  Please share any thoughts you may have around supporting
>systems with multiple BMCs.  Does your organization care?  Thoughts on how it
>could/should be done?  System designs that are a non-starter for OpenBMC?

Intel has supported systems like this in the past, and it is likely that 
we will have need of multi-node/multi-bmc systems in the future.

Our systems in the past have been sled-based with four sleds (BMC+host) 
per chassis and the BMCs connected via I2C over the backplane.  One of 
the BMCs was elected (based on availability and ID) to be the master for 
the system to control the shared resources (power supplies and other 
common stuff). 
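
A minimal Python sketch of an election like the one described above. The
rule used here (the lowest sled ID among the available BMCs wins) is an
assumption for illustration; the text only says the election was based on
availability and ID. The Bmc class and elect_master function are
hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bmc:
    """One sled's BMC as seen over the backplane (hypothetical model)."""
    sled_id: int
    available: bool


def elect_master(bmcs):
    """Return the available BMC with the lowest sled ID, or None if none is up."""
    candidates = [b for b in bmcs if b.available]
    return min(candidates, key=lambda b: b.sled_id, default=None)


# Four sleds per chassis; sled 1 is down, so sled 2 takes over the shared resources.
sleds = [Bmc(1, False), Bmc(2, True), Bmc(3, True), Bmc(4, True)]
master = elect_master(sleds)
print(master.sled_id)
```

Because every BMC can evaluate the same rule on the same inputs, each one
reaches the same answer without a separate coordinator, provided they
agree on which peers are available.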

Each BMC was individually accessible in the normal ways (KCS from host 
or RMCP/web over the network).

Is this the sort of stuff that you are looking for or were you thinking 
of BMC redundancy for a single system?

--Vernon

* Re: BMC redundancy
  2018-01-29 20:43 ` Vernon Mauery
@ 2018-01-29 21:38   ` Brad Bishop
  0 siblings, 0 replies; 16+ messages in thread
From: Brad Bishop @ 2018-01-29 21:38 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: OpenBMC Maillist


> On Jan 29, 2018, at 3:43 PM, Vernon Mauery <vernon.mauery@linux.intel.com> wrote:
> 
> On 29-Jan-2018 10:52 AM, Brad Bishop wrote:
>> I know we have a lot of work to do with the basics before tackling something
>> like supporting multiple BMCs in a single system, but it's never too early to
>> brainstorm.
>> 
>> Quick community poll:  Please share any thoughts you may have around supporting
>> systems with multiple BMCs.  Does your organization care?  Thoughts on how it
>> could/should be done?  System designs that are a non-starter for OpenBMC?
> 
> Intel has supported systems like this in the past, and it is likely that we will have need of multi-node/multi-bmc systems in the future.

Thanks Vernon - it's good to hear there might be room to collaborate on this.

> 
> Our systems in the past have been sled-based with four sleds (BMC+host) per chassis and the BMCs connected via I2C over the backplane.  One of the BMCs was elected (based on availability and ID) to be the master for the system to control the shared resources (power supplies and other common stuff). 
> Each BMC was individually accessible in the normal ways (KCS from host or RMCP/web over the network).
> 
> Is this the sort of stuff that you are looking for or were you thinking of BMC redundancy for a single system?

I think so yes.  Very similar to what you describe but with a single
SMP fabric across all the sleds.  I think in practice it doesn’t
make much difference - the SMP fabric would just be another shared
resource amongst the sleds.

> 
> --Vernon

* Re: BMC redundancy
  2018-01-29 15:52 BMC redundancy Brad Bishop
       [not found] ` <7eb0f506-1dd7-1a28-cc0a-9f7813c28562@yadro.com>
  2018-01-29 20:43 ` Vernon Mauery
@ 2018-01-31  6:27 ` Deepak Kodihalli
  2018-02-02  0:48   ` Andrew Jeffery
  2 siblings, 1 reply; 16+ messages in thread
From: Deepak Kodihalli @ 2018-01-31  6:27 UTC (permalink / raw)
  To: openbmc

On 29/01/18 9:22 pm, Brad Bishop wrote:
> I know we have a lot of work to do with the basics before tackling something
> like supporting multiple BMCs in a single system, but it's never too early to
> brainstorm.
> 
> Quick community poll:  Please share any thoughts you may have around supporting
> systems with multiple BMCs.  Does your organization care?  Thoughts on how it
> could/should be done?  System designs that are a non-starter for OpenBMC?
> 
> Would love to see some brain-dumps here.
> 
> thx - brad
> 

I had some thoughts/questions on messaging protocols that multiple BMCs 
within the same system (potentially managing their own sleds) may employ 
to talk to each other. I'm not getting into how the BMCs discover and 
address each other. I see two broad use-cases for a peer BMC 
communication in such a system:

- A group operation: an external agent talks to a single BMC (to avoid 
having to query each BMC) but wants an aggregate of information. For 
example, if the agent wants to collect error logs, it talks to a single 
BMC and expects that BMC to return an aggregation of errors across all 
peer BMCs.

- There may be reasons for a BMC to itself initiate communication with 
one or more peers.


I was interested in brainstorming which protocol the BMCs should use for 
peer communication. What I had in mind, though, was whether this can be 
done by leveraging the existing D-Bus/REST API, as opposed to employing a 
new protocol (I've looked at some, such as Protobufs/gRPC, MQTT, etc).


So several of the existing OpenBMC apps implement specific D-Bus 
services. What does it take to make remote D-Bus calls to such apps?
- It doesn't look like the D-Bus spec or libdbus officially has anything 
for D-Bus across computers. There are some good notes at 
https://www.freedesktop.org/wiki/Software/DBusRemote/.
- There are ways to achieve this via Qt D-Bus, but it would involve some 
amount of tweaking of the D-Bus configs.
- I'm not aware of any open/active project implementing remote D-Bus.
- Thoughts on doing remote D-Bus over WebSockets? For instance one could 
write a serialized D-Bus message over a WebSocket that has been 
established between two BMCs. Since WebSockets are full-duplex, they 
could also be used to receive events/D-Bus signals from peer BMCs 
asynchronously.


How about using REST for the peer-to-peer communication? This would 
mean, though, that each of the peer BMCs in the system would need a REST 
server and client implementation. The 'group operation' I've mentioned 
above seems a logical fit for this, presuming an external agent would be 
using a REST API to talk to one of the BMCs, and the REST server 
implementation on that BMC has to propagate/multicast that API call to 
all its peers, and then aggregate the responses.
If it's a D-Bus app wanting to talk to a counterpart app on a peer BMC, 
then it would have to use the REST client to send an HTTPS request.
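
The 'group operation' fan-out could look roughly like the Python sketch
below. It is only an illustration: the aggregate function, the peer names
and the shape of a log entry are all hypothetical, and the fetch callable
is injected so that no particular REST client or URL layout is implied.
In a real system fetch would send an HTTPS request to the peer.

```python
from concurrent.futures import ThreadPoolExecutor


def aggregate(peers, fetch):
    """Call fetch(peer) for every peer in parallel and merge the results.

    Returns (entries, failed): entries are tagged with the BMC they came
    from, failed lists the peers whose request raised an exception.
    """
    merged, failed = [], []
    with ThreadPoolExecutor(max_workers=max(1, len(peers))) as pool:
        futures = {peer: pool.submit(fetch, peer) for peer in peers}
    # Leaving the with-block waits for all requests to finish.
    for peer, future in futures.items():
        try:
            for entry in future.result():
                merged.append({"bmc": peer, **entry})
        except Exception:
            failed.append(peer)
    return merged, failed


def fake_fetch(peer):
    """Stand-in for an HTTPS call; an unknown peer simulates an unreachable BMC."""
    logs = {"bmc0": [{"id": 1, "msg": "fan fault"}], "bmc1": []}
    return logs[peer]


entries, unreachable = aggregate(["bmc0", "bmc1", "bmc2"], fake_fetch)
print(entries, unreachable)
```

Reporting unreachable peers separately, rather than failing the whole
request, lets the external agent tell a partial aggregate from a complete
one.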


Thanks,
Deepak

* Re: BMC redundancy
  2018-01-31  6:27 ` Deepak Kodihalli
@ 2018-02-02  0:48   ` Andrew Jeffery
  2018-02-02  6:28     ` Deepak Kodihalli
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Andrew Jeffery @ 2018-02-02  0:48 UTC (permalink / raw)
  To: openbmc

Hi Deepak,

> So several of the existing OpenBMC apps implement specific D-Bus 
> services. What does it take to make remote D-Bus calls to such apps?
> - It doesn't look like the D-Bus spec or libdbus officially has anything 
> for D-Bus across computers. There are some good notes at 
> https://www.freedesktop.org/wiki/Software/DBusRemote/.

Applications can connect to remote dbus servers; the --address option to dbus-daemon allows it to listen on a TCP socket and setting DBUS_SESSION_BUS_ADDRESS will point applications in the right direction. So there are probably two ways we could do this:

1. Slave BMCs connect to the master's DBus daemon, and applications namespace their objects appropriately. Multi-BMC aware applications on the master access the namespaced objects as required
2. Slave BMCs are willfully ignorant of their role, with the master connecting to the slaves' DBus daemons to form a coherent global view of the bus for its multi-BMC aware applications, which access the remote objects as required.

Given the support DBus has today it might be easier to go for 1 than for 2, if we go down this path at all.

[1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
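
To make the TCP transport concrete, here is a small Python sketch that
only builds the address string, the daemon command line and the client
environment; it does not start anything. The host 192.0.2.10 (a
documentation address) and port 55556 are made-up values, and the helper
names are hypothetical. The daemon's authentication configuration would
most likely need attention as well, which is not shown here.

```python
import os


def tcp_bus_address(host, port):
    """Build a D-Bus server address for the TCP transport."""
    return f"tcp:host={host},port={port}"


def daemon_argv(address):
    """Command line for a bus daemon listening on the given address."""
    return ["dbus-daemon", "--session", f"--address={address}", "--nofork"]


def client_env(address, base=None):
    """Copy of the environment with DBUS_SESSION_BUS_ADDRESS set to the bus."""
    env = dict(os.environ if base is None else base)
    env["DBUS_SESSION_BUS_ADDRESS"] = address
    return env


# Option 1 from above: the master listens, the other BMCs point at it.
master_bus = tcp_bus_address("192.0.2.10", 55556)
print(" ".join(daemon_argv(master_bus)))
print(client_env(master_bus, base={})["DBUS_SESSION_BUS_ADDRESS"])
```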

> - There are ways to achieve this via Qt D-Bus, but it would involve some 
> amount tweaking with the D-Bus configs.
> - I'm not aware of any open/active project implementing remote D-Bus.

Here is someone's attempt at making it easier: http://gabriel.sourceforge.net/howto.html though you would struggle to say it's active given the last contribution was 2013-05-14.

> - Thoughts on doing remote D-Bus over WebSockets?

How do websockets come into the picture? Why do we need the extra complication vs normal sockets?

Cheers,

Andrew

* Re: BMC redundancy
  2018-02-02  0:48   ` Andrew Jeffery
@ 2018-02-02  6:28     ` Deepak Kodihalli
  2018-02-02  9:48     ` Ratan Gupta
  2018-02-02 21:10     ` Vernon Mauery
  2 siblings, 0 replies; 16+ messages in thread
From: Deepak Kodihalli @ 2018-02-02  6:28 UTC (permalink / raw)
  To: Andrew Jeffery, openbmc

On 02/02/18 6:18 am, Andrew Jeffery wrote:
> Hi Deepak,
> 
>> So several of the existing OpenBMC apps implement specific D-Bus
>> services. What does it take to make remote D-Bus calls to such apps?
>> - It doesn't look like the D-Bus spec or libdbus officially has anything
>> for D-Bus across computers. There are some good notes at
>> https://www.freedesktop.org/wiki/Software/DBusRemote/.
> 
> Applications can connect to remote dbus servers; the --address option to dbus-daemon allows it to listen on a TCP socket and setting DBUS_SESSION_BUS_ADDRESS will point applications in the right direction. So there are probably two ways we could do this:
> 
> 1. Slave BMCs connect to the master's DBus daemon, and applications namespace their objects appropriately. Multi-BMC aware applications on the master access the namespaced objects as required
> 2. Slave BMCs are willfully ignorant of their role, with the master connecting to the slaves' DBus daemons to form a coherent global view of the bus for its multi-BMC aware applications, which access the remote objects as required.
> 
> Given the support DBus has today it might be easier to go for 1 than for 2, if we go down this path at all.

I'll have to look at whether and how this is possible in a scenario where 
any BMC can take on the role of master dynamically.

> [1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
> 
>> - There are ways to achieve this via Qt D-Bus, but it would involve some
>> amount tweaking with the D-Bus configs.
>> - I'm not aware of any open/active project implementing remote D-Bus.
> 
> Here is someone's attempt at making it easier: http://gabriel.sourceforge.net/howto.html though you would struggle to say it's active given the last contribution was 2013-05-14.

Yep, did look at this.

>> - Thoughts on doing remote D-Bus over WebSockets?
> 
> How do websockets come into the picture? Why do we need the extra complication vs normal sockets?

I agree that WebSockets are not the same as plain sockets: there's the 
additional connection scheme and framing, but the protocol itself 
doesn't preclude non-browser clients. The reason I brought up WebSockets, 
though, is for things like sending/receiving (D-Bus) signals 
asynchronously across BMCs and ping/pong heartbeat messages (surveillance).
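
The surveillance part can be sketched independently of the transport.
The Python below only tracks when each peer last answered a heartbeat
and reports the ones that have gone quiet; the HeartbeatMonitor class,
the peer names and the 5 second timeout are hypothetical, and the
WebSocket connection itself is deliberately left out.

```python
class HeartbeatMonitor:
    """Track the last time each peer BMC answered a heartbeat."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record_pong(self, peer, now):
        """Note that a heartbeat reply from peer arrived at time now (seconds)."""
        self.last_seen[peer] = now

    def dead_peers(self, now):
        """Peers whose last reply is older than the timeout, in sorted order."""
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=5.0)
monitor.record_pong("bmc1", now=100.0)
monitor.record_pong("bmc2", now=100.0)
monitor.record_pong("bmc1", now=104.0)  # bmc2 has stopped answering
print(monitor.dead_peers(now=106.0))
```

Passing the current time in explicitly keeps the logic easy to test; a
caller would normally supply a monotonic clock reading.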

> Cheers,
> 
> Andrew
> 

Regards,
Deepak

* Re: BMC redundancy
  2018-02-02  0:48   ` Andrew Jeffery
  2018-02-02  6:28     ` Deepak Kodihalli
@ 2018-02-02  9:48     ` Ratan Gupta
  2018-02-02 14:42       ` Brad Bishop
  2018-02-02 21:10     ` Vernon Mauery
  2 siblings, 1 reply; 16+ messages in thread
From: Ratan Gupta @ 2018-02-02  9:48 UTC (permalink / raw)
  To: Andrew Jeffery, openbmc

Hi Andrew,


On Friday 02 February 2018 06:18 AM, Andrew Jeffery wrote:
> Hi Deepak,
>
>> So several of the existing OpenBMC apps implement specific D-Bus
>> services. What does it take to make remote D-Bus calls to such apps?
>> - It doesn't look like the D-Bus spec or libdbus officially has anything
>> for D-Bus across computers. There are some good notes at
>> https://www.freedesktop.org/wiki/Software/DBusRemote/.
> Applications can connect to remote dbus servers; the --address option to dbus-daemon allows it to listen on a TCP socket and setting DBUS_SESSION_BUS_ADDRESS will point applications in the right direction. So there are probably two ways we could do this:
>
> 1. Slave BMCs connect to the master's DBus daemon, and applications namespace their objects appropriately. Multi-BMC aware applications on the master access the namespaced objects as required
> 2. Slave BMCs are willfully ignorant of their role, with the master connecting to the slaves' DBus daemons to form a coherent global view of the bus for its multi-BMC aware applications, which access the remote objects as required.
>
> Given the support DBus has today it might be easier to go for 1 than for 2, if we go down this path at all.
>
> [1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
>
>> - There are ways to achieve this via Qt D-Bus, but it would involve some
>> amount tweaking with the D-Bus configs.
>> - I'm not aware of any open/active project implementing remote D-Bus.
> Here is someone's attempt at making it easier: http://gabriel.sourceforge.net/howto.html though you would struggle to say it's active given the last contribution was 2013-05-14.
How would sending D-Bus packets over SSH be beneficial versus the current 
REST architecture, where URLs are mapped to D-Bus objects?
>
>> - Thoughts on doing remote D-Bus over WebSockets?
> How do websockets come into the picture? Why do we need the extra complication vs normal sockets?
>
> Cheers,
>
> Andrew
>
Regards
Ratan

* Re: BMC redundancy
  2018-02-02  9:48     ` Ratan Gupta
@ 2018-02-02 14:42       ` Brad Bishop
  0 siblings, 0 replies; 16+ messages in thread
From: Brad Bishop @ 2018-02-02 14:42 UTC (permalink / raw)
  To: Ratan Gupta; +Cc: Andrew Jeffery, openbmc


> On Feb 2, 2018, at 4:48 AM, Ratan Gupta <ratagupt@linux.vnet.ibm.com> wrote:
> 
> Hi Andrew,
> 
> 
> On Friday 02 February 2018 06:18 AM, Andrew Jeffery wrote:
>> Hi Deepak,
>> 
>>> So several of the existing OpenBMC apps implement specific D-Bus
>>> services. What does it take to make remote D-Bus calls to such apps?
>>> - It doesn't look like the D-Bus spec or libdbus officially has anything
>>> for D-Bus across computers. There are some good notes at
>>> https://www.freedesktop.org/wiki/Software/DBusRemote/.
>> Applications can connect to remote dbus servers; the --address option to dbus-daemon allows it to listen on a TCP socket and setting DBUS_SESSION_BUS_ADDRESS will point applications in the right direction. So there are probably two ways we could do this:
>> 
>> 1. Slave BMCs connect to the master's DBus daemon, and applications namespace their objects appropriately. Multi-BMC aware applications on the master access the namespaced objects as required
>> 2. Slave BMCs are willfully ignorant of their role, with the master connecting to the slaves' DBus daemons to form a coherent global view of the bus for its multi-BMC aware applications, which access the remote objects as required.
>> 
>> Given the support DBus has today it might be easier to go for 1 than for 2, if we go down this path at all.
>> 
>> [1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
>> 
>>> - There are ways to achieve this via Qt D-Bus, but it would involve some
>>> amount tweaking with the D-Bus configs.
>>> - I'm not aware of any open/active project implementing remote D-Bus.
>> Here is someone's attempt at making it easier: http://gabriel.sourceforge.net/howto.html though you would struggle to say it's active given the last contribution was 2013-05-14.
> How would sending D-Bus packets over SSH be beneficial versus the current REST architecture, where URLs are mapped to D-Bus objects?

1 - Less code maintained by us in the path.
2 - Would like to see the current REST API evolve into a development/testing/debug tool
      and not used in any other capacity (like production) - We don’t want to have to
      maintain an API contract at the DBus level.
3 - The current REST API hides DBus interface information, which will likely be needed
      in this environment.

>> 
>>> - Thoughts on doing remote D-Bus over WebSockets?
>> How do websockets come into the picture? Why do we need the extra complication vs normal sockets?
>> 
>> Cheers,
>> 
>> Andrew
>> 
> Regards
> Ratan

* Re: BMC redundancy
  2018-02-02  0:48   ` Andrew Jeffery
  2018-02-02  6:28     ` Deepak Kodihalli
  2018-02-02  9:48     ` Ratan Gupta
@ 2018-02-02 21:10     ` Vernon Mauery
  2018-02-03  8:08       ` Deepak Kodihalli
                         ` (2 more replies)
  2 siblings, 3 replies; 16+ messages in thread
From: Vernon Mauery @ 2018-02-02 21:10 UTC (permalink / raw)
  To: Andrew Jeffery; +Cc: openbmc

On 02-Feb-2018 11:18 AM, Andrew Jeffery wrote:
>Hi Deepak,
>
>> So several of the existing OpenBMC apps implement specific D-Bus
>> services. What does it take to make remote D-Bus calls to such apps?
>> - It doesn't look like the D-Bus spec or libdbus officially has anything
>> for D-Bus across computers. There are some good notes at
>> https://www.freedesktop.org/wiki/Software/DBusRemote/.
>
>Applications can connect to remote dbus servers; the --address option to dbus-daemon allows it to listen on a TCP socket and setting DBUS_SESSION_BUS_ADDRESS will point applications in the right direction. So there are probably two ways we could do this:

Putting DBus on an externally-available TCP socket is a security 
architect's nightmare. All command and control of the entire BMC is done 
over DBus; we cannot put that on an externally-available address. I 
suppose if you have an internal connection and switching fabric between 
the nodes, this would be possible.

--Vernon

>1. Slave BMCs connect to the master's DBus daemon, and applications namespace their objects appropriately. Multi-BMC aware applications on the master access the namespaced objects as required
>2. Slave BMCs are willfully ignorant of their role, with the master connecting to the slaves' DBus daemons to form a coherent global view of the bus for its multi-BMC aware applications, which access the remote objects as required.
>
>Given the support DBus has today it might be easier to go for 1 than for 2, if we go down this path at all.
>
>[1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
>
>> - There are ways to achieve this via Qt D-Bus, but it would involve some
>> amount tweaking with the D-Bus configs.
>> - I'm not aware of any open/active project implementing remote D-Bus.
>
>Here is someone's attempt at making it easier: http://gabriel.sourceforge.net/howto.html though you would struggle to say it's active given the last contribution was 2013-05-14.
>
>> - Thoughts on doing remote D-Bus over WebSockets?
>
>How do websockets come into the picture? Why do we need the extra complication vs normal sockets?
>
>Cheers,
>
>Andrew

* Re: BMC redundancy
  2018-02-02 21:10     ` Vernon Mauery
@ 2018-02-03  8:08       ` Deepak Kodihalli
  2018-02-03  8:52         ` Ratan Gupta
  2018-02-05  1:38       ` Andrew Jeffery
  2018-02-06  6:10       ` Michael E Brown
  2 siblings, 1 reply; 16+ messages in thread
From: Deepak Kodihalli @ 2018-02-03  8:08 UTC (permalink / raw)
  To: openbmc

On 03/02/18 2:40 am, Vernon Mauery wrote:
> On 02-Feb-2018 11:18 AM, Andrew Jeffery wrote:
>> Hi Deepak,
>>
>>> So several of the existing OpenBMC apps implement specific D-Bus
>>> services. What does it take to make remote D-Bus calls to such apps?
>>> - It doesn't look like the D-Bus spec or libdbus officially has anything
>>> for D-Bus across computers. There are some good notes at
>>> https://www.freedesktop.org/wiki/Software/DBusRemote/.
>>
>> Applications can connect to remote dbus servers; the --address option 
>> to dbus-daemon allows it to listen on a TCP socket and setting 
>> DBUS_SESSION_BUS_ADDRESS will point applications in the right 
>> direction. So there are probably two ways we could do this:
> 
> Putting DBus on an externally-available TCP socket is a security 
> architect's nightmare. All command and control of the entire BMC is done 
> over DBus; we cannot put that on an externally-available address. I 
> suppose if you have an internal connection and switching fabric between 
> the nodes, this would be possible.

This shouldn't be a problem though with SSH forwarding, with a proxy 
D-Bus daemon for example. 
https://www.freedesktop.org/wiki/Software/DBusRemote/ talks about 
another issue with SSH forwarding D-Bus, which I haven't fully 
understood. I know that the Gabriel project took the SSH forwarding route.

Regards,
Deepak

> --Vernon
> 
>> 1. Slave BMCs connect to the master's DBus daemon, and applications 
>> namespace their objects appropriately. Multi-BMC aware applications on 
>> the master access the namespaced objects as required
>> 2. Slave BMCs are willfully ignorant of their role, with the master 
>> connecting to the slaves' DBus daemons to form a coherent global view 
>> of the bus for its multi-BMC aware applications, which access the 
>> remote objects as required.
>>
>> Given the support DBus has today it might be easier to go for 1 than 
>> for 2, if we go down this path at all.
>>
>> [1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
>>
>>> - There are ways to achieve this via Qt D-Bus, but it would involve some
>>> amount tweaking with the D-Bus configs.
>>> - I'm not aware of any open/active project implementing remote D-Bus.
>>
>> Here is someone's attempt at making it easier: 
>> http://gabriel.sourceforge.net/howto.html though you would struggle to 
>> say it's active given the last contribution was 2013-05-14.
>>
>>> - Thoughts on doing remote D-Bus over WebSockets?
>>
>> How do websockets come into the picture? Why do we need the extra 
>> complication vs normal sockets?
>>
>> Cheers,
>>
>> Andrew
> 

* Re: BMC redundancy
  2018-02-03  8:08       ` Deepak Kodihalli
@ 2018-02-03  8:52         ` Ratan Gupta
  0 siblings, 0 replies; 16+ messages in thread
From: Ratan Gupta @ 2018-02-03  8:52 UTC (permalink / raw)
  To: openbmc



On Saturday 03 February 2018 01:38 PM, Deepak Kodihalli wrote:
> On 03/02/18 2:40 am, Vernon Mauery wrote:
>> On 02-Feb-2018 11:18 AM, Andrew Jeffery wrote:
>>> Hi Deepak,
>>>
>>>> So several of the existing OpenBMC apps implement specific D-Bus
>>>> services. What does it take to make remote D-Bus calls to such apps?
>>>> - It doesn't look like the D-Bus spec or libdbus officially has 
>>>> anything
>>>> for D-Bus across computers. There are some good notes at
>>>> https://www.freedesktop.org/wiki/Software/DBusRemote/.
>>>
>>> Applications can connect to remote dbus servers; the --address 
>>> option to dbus-daemon allows it to listen on a TCP socket and 
>>> setting DBUS_SESSION_BUS_ADDRESS will point applications in the 
>>> right direction. So there are probably two ways we could do this:
>>
>> Putting DBus on an externally-available TCP socket is a security 
>> architect's nightmare. All command and control of the entire BMC is 
>> done over DBus; we cannot put that on an externally-available 
>> address. I suppose if you have an internal connection and switching 
>> fabric between the nodes, this would be possible.
>
> This shouldn't be a problem though with SSH forwarding, with a proxy 
> D-Bus daemon for example. 
> https://www.freedesktop.org/wiki/Software/DBusRemote/ talks about 
> another issue with SSH forwarding D-Bus, which I haven't fully 
> understood. I know that the Gabriel project took the SSH forwarding 
> route.
Forwarding D-Bus packets over SSH may become a bottleneck, as we need 
to check whether libssh is thread-safe. The Gabriel README at the 
following link mentions that libssh is not thread-safe, so multiple 
clients cannot connect.
http://gabriel.sourceforge.net/README.
However, another link describes how libssh can be made 
thread-safe.
http://api.libssh.org/master/libssh_tutor_threads.html.

On another note, do we really need to be concerned about security for 
internal (private network) BMC communication?

>
> Regards,
> Deepak
>
>> --Vernon
>>
>>> 1. Slave BMCs connect to the master's DBus daemon, and applications 
>>> namespace their objects appropriately. Multi-BMC aware applications 
>>> on the master access the namespaced objects as required
>>> 2. Slave BMCs are willfully ignorant of their role, with the master 
>>> connecting to the slaves' DBus daemons to form a coherent global 
>>> view of the bus for its multi-BMC aware applications, which access 
>>> the remote objects as required.
>>>
>>> Given the support DBus has today it might be easier to go for 1 than 
>>> for 2, if we go down this path at all.
>>>
>>> [1] https://dbus.freedesktop.org/doc/dbus-daemon.1.html
>>>
>>>> - There are ways to achieve this via Qt D-Bus, but it would involve 
>>>> some
>>>> amount tweaking with the D-Bus configs.
>>>> - I'm not aware of any open/active project implementing remote D-Bus.
>>>
>>> Here is someone's attempt at making it easier: 
>>> http://gabriel.sourceforge.net/howto.html though you would struggle 
>>> to say it's active given the last contribution was 2013-05-14.
>>>
>>>> - Thoughts on doing remote D-Bus over WebSockets?
>>>
>>> How do websockets come into the picture? Why do we need the extra 
>>> complication vs normal sockets?
>>>
>>> Cheers,
>>>
>>> Andrew
>>
>

* Re: BMC redundancy
  2018-02-02 21:10     ` Vernon Mauery
  2018-02-03  8:08       ` Deepak Kodihalli
@ 2018-02-05  1:38       ` Andrew Jeffery
  2018-02-06  6:10       ` Michael E Brown
  2 siblings, 0 replies; 16+ messages in thread
From: Andrew Jeffery @ 2018-02-05  1:38 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: openbmc

On Fri, 2018-02-02 at 13:10 -0800, Vernon Mauery wrote:
> On 02-Feb-2018 11:18 AM, Andrew Jeffery wrote:
> > Hi Deepak,
> > 
> > > So several of the existing OpenBMC apps implement specific D-Bus
> > > services. What does it take to make remote D-Bus calls to such apps?
> > > - It doesn't look like the D-Bus spec or libdbus officially has anything
> > > for D-Bus across computers. There are some good notes at
> > > https://www.freedesktop.org/wiki/Software/DBusRemote/.
> > 
> > Applications can connect to remote dbus servers; the --address
> > option to dbus-daemon allows it to listen on a TCP socket and
> > setting DBUS_SESSION_BUS_ADDRESS will point applications in the
> > right direction. So there are probably two ways we could do this:
> 
> Putting DBus on an externally-available TCP socket is a security 
> architect's nightmare. All command and control of the entire BMC is done 
> over DBus; we cannot put that on an externally-available address.

I think what you're actually suggesting is that whatever interface is
exposed, it needs authentication/authorisation or for the system design
to ensure that no unexpected actors can connect.

It's true that the TCP socket option doesn't provide authentication, so
yeah, the suggestion isn't secure, but the suggestion was mainly to
counter the claim that "It doesn't look like the D-Bus spec or libdbus
officially has anything for D-Bus across computers". Support for this
is built into the spec:

https://dbus.freedesktop.org/doc/dbus-specification.html#transports-tcp-sockets

There are other ways to provide authentication and transport security,
so I don't think we have a huge design concern on our hands.

Cheers,

Andrew

* Re: BMC redundancy
  2018-02-02 21:10     ` Vernon Mauery
  2018-02-03  8:08       ` Deepak Kodihalli
  2018-02-05  1:38       ` Andrew Jeffery
@ 2018-02-06  6:10       ` Michael E Brown
  2018-02-06  6:44         ` Deepak Kodihalli
  2 siblings, 1 reply; 16+ messages in thread
From: Michael E Brown @ 2018-02-06  6:10 UTC (permalink / raw)
  To: Vernon Mauery; +Cc: Andrew Jeffery, openbmc

On Fri, Feb 02, 2018 at 01:10:43PM -0800, Vernon Mauery wrote:
> On 02-Feb-2018 11:18 AM, Andrew Jeffery wrote:
> > Hi Deepak,
> > 
> > > So several of the existing OpenBMC apps implement specific D-Bus
> > > services. What does it take to make remote D-Bus calls to such apps?
> > > - It doesn't look like the D-Bus spec or libdbus officially has anything
> > > for D-Bus across computers. There are some good notes at
> > > https://www.freedesktop.org/wiki/Software/DBusRemote/.
> > 
> > Applications can connect to remote dbus servers; the --address option to dbus-daemon allows it to listen on a TCP socket and setting DBUS_SESSION_BUS_ADDRESS will point applications in the right direction. So there are probably two ways we could do this:
> 
> Putting DBus on an externally-available TCP socket is a security architect's
> nightmare. All command and control of the entire BMC is done over DBus; we
> cannot put that on an externally-available address. I suppose if you have an
> internal connection and switching fabric between the nodes, this would be
> possible.

Vernon, I completely and wholeheartedly agree with this assessment. Most of the
things I've heard so far start way too high up in the stack and try to solve
the issue there. I believe that we ought to start at the networking layer and
build up from there.

Here is a dump of my thoughts related to what we talked about on the conference
call.

Step 0: Internal network
	Definition: This is a private network that is internal to a chassis and
cannot be reached from the external world. This does not make this network
implicitly secure, however. I would strongly suggest that we create standards
for addressing this network and for how components communicate at the IP level
between BMCs.
	Proposal: all nodes on this network use IPv6 SLAAC, with one node (or a
redundant pair of nodes) designated to run "radvd" and provide stable
site-local address space assignments.
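
As a rough illustration of what SLAAC produces, the sketch below combines
an advertised /64 prefix with a modified EUI-64 interface ID derived from
a MAC address (RFC 4291, Appendix A). The prefix and MAC are invented
examples. I've assumed a unique local (fd00::/8, RFC 4193) prefix, since
the old site-local range was deprecated by RFC 3879, and I've assumed
EUI-64 interface IDs, although a real stack may be configured to generate
them differently.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("EUI-64 based SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC.
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return network.network_address + int.from_bytes(eui64, "big")


# Hypothetical prefix advertised by radvd, hypothetical BMC MAC address.
print(slaac_address("fd00:1234:5678:1::/64", "00:11:22:33:44:55"))
```

The point is that a node's address is stable as long as the advertised
prefix and the MAC are stable, which is what makes a well-known radvd
configuration attractive here.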

Step 1: Discovery
	Definition: This is how you figure out that there are other BMCs on the
internal network. Zeroconf/Avahi/mDNS are three names for one method. This also
includes figuring out which role any specific node may fill (chassis-level BMC
vs node-level BMC, for example).
	Proposal: mDNS/DNS-SD with TXT records indicating the BMC role
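
For a feel of what "TXT records indicating BMC role" could look like on
the wire, here is a small sketch of the DNS-SD TXT record layout from
RFC 6763 (a sequence of length-prefixed "key=value" strings). The key
name "role" and the values "chassis" and "node" are assumptions for
illustration only; nothing here defines an actual OpenBMC convention.

```python
def encode_txt(pairs):
    """Encode key/value pairs as DNS-SD TXT record data."""
    out = bytearray()
    for key, value in pairs.items():
        entry = f"{key}={value}".encode("utf-8")
        if len(entry) > 255:
            raise ValueError("each TXT string is limited to 255 bytes")
        out.append(len(entry))  # one length byte, then the string itself
        out += entry
    return bytes(out)


def decode_txt(data):
    """Decode DNS-SD TXT record data back into a dict."""
    pairs = {}
    pos = 0
    while pos < len(data):
        length = data[pos]
        entry = data[pos + 1:pos + 1 + length]
        pos += 1 + length
        key, _, value = entry.partition(b"=")
        pairs[key.decode("ascii")] = value.decode("utf-8")
    return pairs


# Hypothetical advertisement from a chassis-level BMC.
record = encode_txt({"role": "chassis", "slot": "0"})
print(decode_txt(record))
```

In practice Avahi would build and parse these records for us; the sketch
is only meant to show how little data is needed to advertise a role.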

Step 2: Exchanging certificates
	Definition: To start a crypto system, one needs to exchange public
keys, at a minimum. Other information about a machine can also be useful. Note
that exchanging certificates does not imply "trust" of those certificates, but
only provides the basis upon which you later decide if you trust or not.
	Proposal: OAuth 2.0 Dynamic Client Registration

Step 3: Trust
	Definition: based on many factors, a machine decides if it trusts
another machine. There is the implication that end-users may wish to disable
new trust relationships from forming, but that's not a requirement. Factors
that determine trust: 1) which network initiated the connection, 2) machine
identity certs, 3) any signatures on the machine identity certs (for example, a
vendor signature probably implies higher trust), 4) user wishes/configuration,
and 5) any possible remote attestation (TPM) or similar. These factors could
certainly be extended to include many other things, but these are a baseline
for starting to talk about this. Depending on user settings, devices might
require OAuth 2.0 device flow or be implicitly trusted based on vendor-signed
device certificates.
	Proposal: The OAuth 2.0 Client Credentials grant or the OAuth 2.0
device flow. (Depending on policy)
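
One way to picture the policy decision is as a small function over the
factors listed above. This is only a sketch: the PeerEvidence fields, the
two policy switches and the three outcomes are all names I made up for
illustration, and a real policy would likely weigh attestation and other
factors as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeerEvidence:
    """Hypothetical summary of what we know about a peer BMC."""
    from_internal_network: bool
    has_identity_cert: bool
    vendor_signed: bool


def decide_trust(peer, allow_new_peers=True, trust_vendor_signed=True):
    """Return "reject", "client-credentials" or "device-flow"."""
    # User configuration can forbid new trust relationships entirely.
    if not allow_new_peers:
        return "reject"
    # Without an identity cert from the internal network there is no basis.
    if not (peer.has_identity_cert and peer.from_internal_network):
        return "reject"
    # Vendor-signed certs may be trusted implicitly, if policy allows it.
    if trust_vendor_signed and peer.vendor_signed:
        return "client-credentials"
    # Otherwise fall back to explicit approval by a user.
    return "device-flow"


peer = PeerEvidence(from_internal_network=True, has_identity_cert=True,
                    vendor_signed=True)
print(decide_trust(peer))
print(decide_trust(peer, trust_vendor_signed=False))
```

The useful property is that the outcome maps directly onto one of the
two proposed OAuth 2.0 flows, so the policy stays separate from the
mechanism.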

Step 4: RPC
	Definition: After establishing trust, you need a mechanism to do remote
calls between machines. This could be as simple as REST (using OAuth tokens
granted in #3), or as complex as a full D-Bus interface.
	Proposal: None at this time

Step 5: Clustering
	Definition: Clustering is generally divided into a cluster
communication protocol and a cluster resource manager. The resource manager has
the job of taking developer constraints about which daemons have to run where
and what resources they need, and running the jobs on the machines available.
For example, you might specify that only one machine should run sensors
connected to physical hardware i2c lines, and specify the list of daemons that
depend on these hardware resources. The resource manager would be responsible
for running the daemons in the correct order on the correct number of machines.
	Proposal: Corosync/Pacemaker have fairly reliable and flexible
clustering, and can describe complex requirements.
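
To illustrate the resource manager's job (not how Pacemaker actually
does it or is configured), here is a toy planner that takes ordering and
placement constraints and works out a start order and a machine
assignment. The daemon names, the "after"/"needs"/"instances" keys and
the machine names are all hypothetical.

```python
from graphlib import TopologicalSorter


def plan(daemons, machines):
    """Return (start_order, placement) for the given constraints.

    daemons:  name -> {"after": [names], "needs": {resources}, "instances": n}
    machines: name -> set of resources available on that machine
    """
    graph = {name: spec.get("after", []) for name, spec in daemons.items()}
    order = list(TopologicalSorter(graph).static_order())
    placement = {}
    for name in order:
        spec = daemons[name]
        needs = spec.get("needs", set())
        eligible = sorted(m for m, res in machines.items() if needs <= res)
        # Without an explicit count, run on every eligible machine.
        count = spec.get("instances", len(eligible))
        if len(eligible) < count:
            raise RuntimeError(f"not enough machines can run {name}")
        placement[name] = eligible[:count]
    return order, placement


daemons = {
    "i2c-sensors": {"needs": {"i2c"}, "instances": 1},
    "fan-control": {"after": ["i2c-sensors"], "needs": {"i2c"}, "instances": 1},
    "rest-api": {"after": ["fan-control"]},
}
machines = {"bmc0": {"i2c"}, "bmc1": set()}
order, placement = plan(daemons, machines)
print(order)
print(placement)
```

Here only bmc0 owns the i2c lines, so the sensor and fan daemons land
there exactly once, while the REST API runs on both machines.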

The one thing to keep in mind is that everything up to step 4/5 is part of your
forwards/backwards compatibility guarantee for all time for everything in the
chassis. To make sure it is supportable for a very long time, try to keep it as
simple as possible, but no simpler.

Another part of the design is figuring out if you need Active/Active
clustering, or Active/Standby. If you can get away with Active/Standby, you can
greatly minimize your RPC requirements between the machines, to the point you
don't really need much other than REST.

--
Michael

* Re: BMC redundancy
  2018-02-06  6:10       ` Michael E Brown
@ 2018-02-06  6:44         ` Deepak Kodihalli
       [not found]           ` <d78c6a15-8736-7641-47af-714559cc48f5@linux.vnet.ibm.com>
  0 siblings, 1 reply; 16+ messages in thread
From: Deepak Kodihalli @ 2018-02-06  6:44 UTC (permalink / raw)
  To: openbmc

On 06/02/18 11:40 am, Michael E Brown wrote:
> Step 4: RPC
> 	After establishing trust, you need a mechanism to do remote calls
> between machines. This could be as simple as REST (using oauth tokens granted
> in #3), or as complex as a full DBUS interface.
> 	Proposal: None at this time
> 
> Another part of the design is figuring out if you need Active/Active
> clustering, or Active/Standby. If you can get away with Active/Standby, you can
> greatly minimize your RPC requirements between the machines, to the point you
> don't really need much other than REST.
> 
> --
> Michael


Thanks for this breakdown and summary, Michael. I'm trying to collect the
factors that can help us weigh up the pros and cons of the RPC
mechanism (REST vs D-Bus vs something else). This is what I've gathered
so far:

- Some of the other layers in the picture can influence the RPC
mechanism. You've brought up the active/active vs active/standby
design, but I'm not sure it's possible to identify an RPC mechanism
that fits all such designs, because such configurations could depend on
the system design, which can vary. So REST, for example, may not fit the
bill for a multi-master setup.
- One thing about D-Bus that people have brought up is that our
REST/Redfish API might be based around the existing D-Bus interface, so
it seems natural for the same interface to serve as the RPC
interface as well.
- Security aspects of D-Bus vs REST: these depend on the design/mechanisms
chosen for the steps prior to RPC.
- Any other factors?

Regards,
Deepak

* Re: BMC redundancy
       [not found]           ` <d78c6a15-8736-7641-47af-714559cc48f5@linux.vnet.ibm.com>
@ 2018-04-13 11:57             ` Deepak Kodihalli
  0 siblings, 0 replies; 16+ messages in thread
From: Deepak Kodihalli @ 2018-04-13 11:57 UTC (permalink / raw)
  To: Michael_E_Brown, Andrew Jeffery, rosedahl, Brad Bishop, ratagupt,
	vernon.mauery
  Cc: OpenBMC Maillist

On 13/04/18 5:15 pm, Deepak Kodihalli wrote:
> Hello,
> 
> I'd like to resurrect this topic. A quick recap: this is applicable to 
> multi-BMC (or peer-BMC) systems, where typically each BMC is managing a 
> host.
> 
> While I agree about solving this problem across different layers (the 
> internal network between BMCs, RPC, trust, clustering, etc.), some of 
> these have well-known solutions that can be easily integrated into 
> OpenBMC (such as Avahi for discovery and Corosync for clustering). We 
> still need discussions in those areas, but with this email I'd like to 
> propose a peer-BMC RPC mechanism based on a D-Bus model. The motivation 
> behind using D-Bus is that most BMC apps should be minimally impacted by, 
> or be agnostic to, the fact that there are peer BMCs in the system.
> 
> I've been thinking of various use-cases to see how a D-Bus based RPC 
> fits in. Consider one such use case where an external application wants 
> to issue a power-on command to each of the peer BMCs. Further, the 
> external app wants to communicate with a specific point-of-contact (POC) 
> BMC, expecting the POC to broadcast the command across peers and to 
> aggregate responses. Let's also assume that the external application 
> is using the Redfish API to communicate with the POC, so it might send 
> something like /redfish/v1/system/control with {"power" : "on"} (I don't 
> know the exact Redfish API for this).
> 
> The POC has two tasks here: translating the Redfish request to a D-Bus 
> method call (which would have to be done even for a single BMC system), 
> and then propagating that call across BMCs. The proposal is that the 
> same D-Bus model is created on every peer BMC, with appropriately named 
> object paths. So in this case, say on a 4-BMC system, you could have the 
> following:
> 
> /bmc0/xyz/openbmc_project/host/control/power
> /bmc1/xyz/openbmc_project/host/control/power
> /bmc2/xyz/openbmc_project/host/control/power
> /bmc3/xyz/openbmc_project/host/control/power
> 
> On every BMC, the objects in the model that point to other BMCs are 
> proxies: they'll route D-Bus calls to the relevant BMC and retrieve the 
> response. The advantage of this approach is that, for example, the 
> REST/Redfish code that attempts to find D-Bus objects corresponding to a 
> REST URI (say by making a mapper query based on the D-Bus interface) will 
> still work as before, except that proxy objects will now be found as well. 
> These proxy objects will implement the same interface that a 
> /xyz/openbmc_project/host/control/power D-Bus object would. This 
> approach also works with native D-Bus apps wanting to communicate with 
> peer BMCs; native and remote paths can be told apart 
> based on the path itself, or the proxy objects could implement an 
> interface indicating they're remote objects.
> 
> So effectively most of the well-known D-Bus model would reside on each 
> peer BMC. In terms of how this scales, I guess it's comparable to a 
> single BMC managing multiple nodes.
> 
> With the previous example, the object paths to construct are well-known, 
> but this may not apply to objects such as error logs. In that case, I 
> think it should be possible to implement proxy "object managers" under 
> well known D-Bus roots. So a call to retrieve all objects under the 
> D-Bus root will retrieve objects across all the peer BMCs, with the 
> proxy object managers routing the request to the appropriate BMCs.
> 
> For the actual remote D-Bus calls, one possibility is to use the API 
> offered by sd-bus (it can transport D-Bus messages over ssh).
> 
> Thanks,
> Deepak
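
The path scheme in the proposal above can be sketched roughly as follows.
This is only an illustration of the naming and routing idea, assuming the
/bmcN prefix from the example paths; the function names are hypothetical
and no real D-Bus calls are made.

```python
import re

# Matches the "/bmcN" prefix used in the example object paths.
_PROXY_RE = re.compile(r"^/bmc(\d+)(/.*)$")


def proxy_path(bmc_index, native_path):
    """Build the per-BMC object path for a well-known native path."""
    return f"/bmc{bmc_index}{native_path}"


def split_proxy_path(path):
    """Return (bmc_index, native_path); bmc_index is None for native paths."""
    match = _PROXY_RE.match(path)
    if match is None:
        return None, path
    return int(match.group(1)), match.group(2)


def fan_out(native_path, bmc_count):
    """List the object paths a POC would call to reach every peer."""
    return [proxy_path(i, native_path) for i in range(bmc_count)]


def route(path, local_index):
    """Decide whether a call is served locally or forwarded to a peer."""
    index, native = split_proxy_path(path)
    if index is None or index == local_index:
        return "local", native
    return f"bmc{index}", native


power = "/xyz/openbmc_project/host/control/power"
for path in fan_out(power, 4):
    print(path, "->", route(path, local_index=0))
```

On bmc0 this serves /bmc0/... locally and marks the other three paths
for forwarding, which is the job the proxy objects would do.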

Forgot to copy the list.

Regards,
Deepak

Thread overview: 16+ messages
2018-01-29 15:52 BMC redundancy Brad Bishop
     [not found] ` <7eb0f506-1dd7-1a28-cc0a-9f7813c28562@yadro.com>
     [not found]   ` <524CA01B-1D8E-4C15-B5DB-A27157FBECB7@fuzziesquirrel.com>
2018-01-29 18:14     ` Brad Bishop
2018-01-29 20:43 ` Vernon Mauery
2018-01-29 21:38   ` Brad Bishop
2018-01-31  6:27 ` Deepak Kodihalli
2018-02-02  0:48   ` Andrew Jeffery
2018-02-02  6:28     ` Deepak Kodihalli
2018-02-02  9:48     ` Ratan Gupta
2018-02-02 14:42       ` Brad Bishop
2018-02-02 21:10     ` Vernon Mauery
2018-02-03  8:08       ` Deepak Kodihalli
2018-02-03  8:52         ` Ratan Gupta
2018-02-05  1:38       ` Andrew Jeffery
2018-02-06  6:10       ` Michael E Brown
2018-02-06  6:44         ` Deepak Kodihalli
     [not found]           ` <d78c6a15-8736-7641-47af-714559cc48f5@linux.vnet.ibm.com>
2018-04-13 11:57             ` Deepak Kodihalli
