From: "Viacheslav A.Dubeyko" <viacheslav.dubeyko@bytedance.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Viacheslav Dubeyko <slava@dubeyko.com>,
	linux-cxl@vger.kernel.org, a.manzanares@samsung.com
Subject: Re: [External] CXL Fabric Manager (FM) architecture diagrams
Date: Thu, 9 Mar 2023 14:28:51 -0800
Message-ID: <F6C3F73D-904C-4667-AB97-E12E64F4681F@bytedance.com>
In-Reply-To: <20230307172105.00006132@Huawei.com>



> On Mar 7, 2023, at 9:21 AM, Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> On Mon,  6 Mar 2023 10:59:13 -0800
> Viacheslav Dubeyko <slava@dubeyko.com> wrote:
> 
>> Hi Jonathan,
>> 
>> You can find diagram online now:
>> 
>> (1) Diagram 1: single host with Fabric Manager (FM)
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram1-cxl-fm-single-host-with-fm.pdf
> 
> Whilst accurate that you might only be able to control the switch, I would add one MLD to this
> diagram so that tunneling can be discussed.  It's minimal in the sense of a near-minimal number of
> components that might exist in a switched system.
>
> Thanks to asciiflow.com + some editing of the result - fingers crossed this works.
> Obviously these can't be as rich as your nice diagrams so I've
> left out what the connections are, but having them inline has
> its advantages as well.
> 

I’ve copied the ASCII diagrams into a plain-text document, and they are properly aligned there. :)

> I've played fast and loose with some of the terminology and not checked
> it against the spec naming.  I've also left it vague how an application
> talks to the orchestrator.  That was via agent in your terms I think
> 
> ┌───────────────────┐    ┌───────────────┐   ┌────────────────┐
> │                   │    │               │   │                │
> │                   │    │   ┌───────┐   │   │ ┌────────────┐ │
> │                   │    │   │       ├───┼───┼─►FM Owned LD │ │
> │           ┌───┐   │    │   │Switch │   │   │ └────────────┘ │
> │           │FM ├───┼────┼───►  CCI  │   ├───┤                │
> │           └───┘   │    │   │       │   │   │ MLD 1          │
> │                   │    │   └──────┬┘   │   └────────────────┘
> │              ┌────┤    │          │    │
> │              │RP0 ├────┤          │    │   ┌────────────────┐
> │              └────┤    │          │    │   │                │
> │                   │    │          │    │   │ ┌────────────┐ │
> │              ┌────┤    │          └────┼───┼─►FM Owned LD │ │
> │              │RP1 ├────┤               ├───┤ └────────────┘ │
> │              └────┤    │               │   │                │
> │ Host A            │    │ SWITCH        │   │ MLD 2          │
> └───────────────────┘    └───────────────┘   └────────────────┘
> 

I assume that RP0/RP1 are Root Ports. Should RP0/RP1 be on the switch side?
I think the APP should be shown on this diagram too. What do you think?

>> 
>> (2) Diagram 2: single host with orchestrator
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram2-cxl-fm-single-host-with-orhestrator.pdf
> 
> Actually having an FM in a switch might happen, but there is no spec defined way of doing it.
> From a software architecture point of view it's no different from another host talking to the
> switch - just think of sticking a BMC down next to the switch chip.
> Having the orchestrator in the host is also rather odd though it could in theory happen.
> Typically the orchestrator is considered a 'cloud' level thing floating way above individual host.
> Without any loss of generality I'd always have the orchestrator as something on its own
> machine chatting over a network to the Agents and FM-API accessed devices.
> 

I agree that the orchestrator could be a standalone system. But diagram 2 is an example of the simple
case where the FM is located outside of the host (in the CXL switch or somewhere else). So what I mean
here is a simple subsystem (a dedicated application or library) that can talk to the FM. I believe
that a standalone orchestrator could be overkill for such a case. What do you think?
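
To make the "simple subsystem" idea a bit more concrete, here is a minimal Python sketch of a thin
in-host library that the application calls instead of a standalone orchestrator. All names
(FabricManagerStub, FmClient, request_ld, release_ld) are hypothetical and exist only for
illustration; this is not a binding to any real FM API.

```python
from dataclasses import dataclass, field


@dataclass
class FabricManagerStub:
    """Hypothetical stand-in for an FM that lives outside the host."""
    # Maps (mld_name, ld_id) -> name of the host that currently owns the LD.
    bindings: dict = field(default_factory=dict)

    def bind(self, mld: str, ld_id: int, host: str) -> bool:
        key = (mld, ld_id)
        if key in self.bindings:
            return False  # LD is already bound to some host
        self.bindings[key] = host
        return True

    def unbind(self, mld: str, ld_id: int) -> bool:
        return self.bindings.pop((mld, ld_id), None) is not None


class FmClient:
    """Hypothetical thin library used by the application on the host."""

    def __init__(self, fm: FabricManagerStub, host: str):
        self._fm = fm
        self._host = host

    def request_ld(self, mld: str, ld_id: int) -> bool:
        # Forward the request directly to the FM; no orchestrator in between.
        return self._fm.bind(mld, ld_id, self._host)

    def release_ld(self, mld: str, ld_id: int) -> bool:
        return self._fm.unbind(mld, ld_id)


fm = FabricManagerStub()
app_lib = FmClient(fm, host="Host A")
first = app_lib.request_ld("MLD 1", 0)   # LD is free, so the bind succeeds
second = app_lib.request_ld("MLD 1", 0)  # same LD again, so the bind is refused
```

The point of the sketch is only that, for a single host, the application-facing part can be a few
calls that forward to the FM.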

> Something like
> 
> 
>    ┌──────────────────────┐    ┌────────┐
>    │                      │    │        │
>    │                      │    │        │
>    │   Orchestrator ──────┼────►  FM    │
>    │                      │    │        │
>    └───▲──────────────────┘    └─┬──────┘
>        │                         │
>  ┌─────┼─────────────┐    ┌──────┼────────┐   ┌────────────────┐
>  │     │             │    │      │        │   │                │
>  │     │             │    │  ┌───▼────┐   │   │ ┌────────────┐ │
>  │     │             │    │  │ Switch ├───┼───┼─►FM Owned LD │ │
>  │ ┌───┴──┐          │    │  │   CCI  │   │   │ └────────────┘ │
>  │ │ APP  │          │    │  │ or MCTP│   ├───┤                │
>  │ │      │          │    │  │   CCI  │   │   │ MLD 1          │
>  │ └──────┘          │    │  └───────┬┘   │   └────────────────┘
>  │              ┌────┤    │          │    │
>  │              │RP0 ├────┤          │    │   ┌────────────────┐
>  │              └────┤    │          │    │   │                │
>  │                   │    │          │    │   │ ┌────────────┐ │
>  │              ┌────┤    │          └────┼───┼─►FM Owned LD │ │
>  │              │RP1 ├────┤               ├───┤ └────────────┘ │
>  │              └────┤    │               │   │                │
>  │ Host A            │    │ SWITCH        │   │ MLD 2          │
>  └───────────────────┘    └───────────────┘   └────────────────┘
> 

Most probably, an FM outside of the CXL switch will talk to it by means of the Redfish protocol.
Do you mean that MCTP CCI is a subprotocol of Redfish? Should we introduce some subsystem in the
CXL switch that can talk by means of Redfish?
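
One way to keep the FM logic independent of this question is to hide the link to the switch behind
a small interface. The Python sketch below treats MCTP CCI and Redfish as two separate, pluggable
ways of reaching the switch; whether that split matches the specs is exactly the open question
above. The class names and the "identify-switch" command string are hypothetical, and the
transports are placeholders that do no real MCTP framing or HTTP.

```python
from abc import ABC, abstractmethod


class SwitchTransport(ABC):
    """Hypothetical interface: some way for the FM to reach a CXL switch."""

    @abstractmethod
    def send(self, command: str) -> str:
        ...


class MctpCciTransport(SwitchTransport):
    # Placeholder only: no real MCTP framing is done here.
    def send(self, command: str) -> str:
        return f"mctp-cci:{command}"


class RedfishTransport(SwitchTransport):
    # Placeholder only: no real HTTP request is made here.
    def send(self, command: str) -> str:
        return f"redfish:{command}"


class FabricManager:
    def __init__(self, transport: SwitchTransport):
        self._transport = transport

    def query_switch(self) -> str:
        # The FM logic does not care which transport is underneath.
        return self._transport.send("identify-switch")


via_mctp = FabricManager(MctpCciTransport()).query_switch()
via_redfish = FabricManager(RedfishTransport()).query_switch()
```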

>> 
>> (3) Diagram3: Multi-headed device
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram3-cxl-fm-multi-headed-device.pdf
> 
> Looks good though there is a simpler single host version.
> 
> ┌───────────────────┐    ┌─────────────────────┐
> │           ┌───┐   │    │ ┌────────────┐      │
> │           │FM ├───┼────┼─► Mailbox CCI│      │
> │           └───┘   │    │ └─────┬──────┘      │
> │                   │    │ ┌─────▼───────┐     │
> │              ┌────┤    │ │ LD Pool CCI │     │
> │              │RP0 ├────┤ └─────────────┘     │
> │              └────┤    │                     │
> │              ┌────┤    │                     │
> │              │RP1 ├────┤                     │
> │              └────┤    │                     │
> │ Host A            │    │ MHD 1               │
> └───────────────────┘    └─────────────────────┘
> 
> I'd jump from single host to the multi host with external FM and
> orchestrator.
> 
>        ┌──────────────────────┐    ┌────────┐
>        │                      │    │        │
>        │                      │    │        │
>  ┌─────►   Orchestrator ──────┼────►  FM    │
>  │     │                      │    │        │
>  │     └───▲──────────────────┘    └────┬───┘
>  │         │                            │
>  │   ┌─────┼─────────────┐      ┌───────┼─────────────┐
>  │   │     │             │      │ ┌─────▼───────┐     │
>  │   │     │             │      │ │ Mailbox CCI │     │
>  │   │ ┌───┴──┐      ┌───┤      │ └─────┬───────┘     │
>  │   │ │ APP  │      │RP0├──────┤       │             │
>  │   │ │      │      └───┤      │ ┌─────▼───────┐     │
>  │   │ └──────┘          │      │ │ LD Pool CCI │     │
>  │   │                   │      │ └─────────────┘     │
>  │   │ Host A            │      │                     │
>  │   └───────────────────┘      │                     │
>  │   ┌───────────────────┐      │                     │
>  └───┼─┬──────┐      ┌───┤      │                     │
>      │ │ APP  │      │RP0├──────┤                     │
>      │ │      │      └───┤      │                     │
>      │ └──────┘          │      │                     │
>      │ Host B            │      │ MHD 1               │
>      └───────────────────┘      └─────────────────────┘
> 
> 

It looks like the FM needs to use network + Redfish to interact with the Multi-Headed Device (MHD).
Could a Multi-Headed Device support a network protocol? That sounds like overkill to me.

> 
>> 
>> (4) Diagram 4: Multi-headed device + Orchestrator
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram4-cxl-fm-multi-headed-device-orchestrator.pdf
> I'd put the orchestrator in its own 'host' as above...  I've drawn it with a mailbox CCI but
> could be an MCTP CCI.
>> 
>> (5) Diagram5: Multiple hosts with Fabric Manager (FM)
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram5-cxl-fm-multiple-hosts-with-fm.pdf
> 
> Another one where I'd separate the FM from the switch.  It may be near the switch but
> it's talking FM-API to the switch and that's an interface that is well defined.
> 
>       ┌──────────────────────┐    ┌────────┐
> ┌─────►   Orchestrator ──────┼────►  FM    │
> │     └───▲──────────────────┘    └────┬───┘
> │   ┌─────┼─────────────┐              │
> │   │     │             │       ┌──────┼────────┐   ┌────────────────┐
> │   │     │             │       │  ┌───▼────┐   │   │ ┌────────────┐ │
> │   │ ┌───┴──┐      ┌───┤       │  │ Switch ├───┼───┼─►FM Owned LD │ │
> │   │ │ APP  │      │RP0├───────┤  │   CCI  │   │   │ └────────────┘ │
> │   │ │      │      └───┤       │  │ or MCTP│   ├───┤                │
> │   │ └──────┘          │       │  │   CCI  │   │   │ MLD 1          │
> │   │                   │       │  └───────┬┘   │   └────────────────┘
> │   │ Host A            │       │          │    │
> │   └───────────────────┘       │          │    │   ┌────────────────┐
> │   ┌───────────────────┐       │          │    │   │ ┌────────────┐ │
> │   │                   │       │          └────┼───┼─►FM Owned LD │ │
> │   │                   │       │               ├───┤ └────────────┘ │
> │   │                   │       │               │   │                │
> └───┼─┬──────┐      ┌───┤       │ SWITCH        │   │ MLD 2          │
>     │ │ APP  │      │RP0├───────┤               │   └────────────────┘
>     │ │      │      └───┤       │               │
>     │ └──────┘          │       └───────────────┘
>     │ Host B            │
>     └───────────────────┘

The same question applies here to the interaction of the FM with the CXL switch via Redfish.
Is MCTP CCI compatible with Redfish?

>> 
>> (6) Diagram 6: Multiple hosts with orchestrator
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram6-cxl-fm-multiple-hosts-with-orhestrator.pdf
> 
> I'm not sure there is a need to separate the case where there
> is an orchestrator in the loop from when there isn't.  Hence I just threw one on the diagram above.
> 

The same logic applies here. The application needs to use some library or simple subsystem that
can talk to the FM, and a standalone orchestrator could be overkill here.

>> 
>> (7) Diagram 7: Distributed Fabric Manager (FM)
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram7-cxl-fm-distributed-fm.pdf
>> 
> Looks like a set of FMs, not what I'd think of as a distributed FM.
> Only makes sense to me if more like this... 

I don’t think I follow what you mean by a distributed FM. I see the same set of FMs in your
version of the diagram. What’s the difference? :)

> 
>     ┌──────────────────────┐
> ┌───►   Orchestrator ──────┼─────────────────────────────────────────┐
> │   │   ▲                  │                                         │
> │   └───┬──────────┬───────┘                                         │
> │       │          │                                                 │
> │       │          │            ┌────────┐                           │
> │       │          └────────────►  FM1   │                           │
> │       │                       └────┬───┘                           │
> │ ┌─────┼─────────────┐              │                               │
> │ │     │             │       ┌──────┼────────┐   ┌────────────────┐ │
> │ │     │         ┌───┤       │      │        │   │                │ │
> │ │     │         │RP0├───────┤  ┌───▼────┐   │   │ ┌────────────┐ │ │
> │ │ ┌───┴──┐      └───┤       │  │ Switch ├───┼───┼─►FM Owned LD │ │ │
> │ │ │ APP  │          │       │  │   CCI  │   │   │ └────────────┘ │ │
> │ │ │      │      ┌───┤       │  │ or MCTP│   ├───┤                │ │
> │ │ └──────┘      │RP1├─────┐ │  │   CCI  │   │   │ MLD 1          │ │
> │ │               └───┤     │ │  └───────┬┘   │   └────────────────┘ │
> │ │ Host A            │     │ │          │    │                      │
> │ └───────────────────┘     │ │          │    │   ┌────────────────┐ │
> │                           │ │          │    │   │                │ │
> │ ┌───────────────────┐     │ │          │    │   │ ┌────────────┐ │ │
> │ │                   │     │ │          └────┼───┼─►FM Owned LD │ │ │
> │ │               ┌───┤     │ │               ├───┤ └────────────┘ │ │
> │ │               │RP0├─────┼─┤               │   │                │ │
> │ │ ┌──────┐      └───┤     │ │ SWITCH 1      │   │ MLD 2          │ │
> └─┼─┤ APP  │          │     │ │               │   └────────────────┘ │
>   │ │      │      ┌───┤     │ │               │                      │
>   │ └──────┘      │RP1├──┐  │ └───────────────┘                      │
>   │               └───┤  │  │                                        │
>   │ Host B            │  │  │                                        │
>   └───────────────────┘  │  │                                        │
>                          │  │   ┌────────┐                           │
>                          │  │   │  FM2   ◄───────────────────────────┘
>                          │  │   └────┬───┘
>                          │  │ ┌──────┼────────┐   ┌────────────────┐
>                          │  │ │  ┌───▼────┐   │   │ ┌────────────┐ │
>                          │  │ │  │ Switch ├───┼───┼─►FM Owned LD │ │
>                          │  └─┤  │   CCI  │   │   │ └────────────┘ │
>                          │    │  │ or MCTP│   ├───┤                │
>                          │    │  │   CCI  │   │   │ MLD 3          │
>                          │    │  └───────┬┘   │   └────────────────┘
>                          │    │          │    │   ┌────────────────┐
>                          │    │          │    │   │ ┌────────────┐ │
>                          │    │          └────┼───┼─►FM Owned LD │ │
>                          │    │               ├───┤ └────────────┘ │
>                          │    │               │   │                │
>                          │    │ SWITCH 2      │   │ MLD 4          │
>                          └────┤               │   └────────────────┘
>                               └───────────────┘
> 
>> (8) Diagram 8: Layered Fabric Manager (FM) and separate orchestrator
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram8-cxl-fm-layered-fm-and-separate-orchestrator.pdf
> 
> I don't follow the layered part on this diagram.  My interpretation would
> be something like.
> 

Yeah, your version looks more logical. But the names top FM and bottom FM are not quite
obvious. Maybe we should name the top FM the root FM and the bottom FM a leaf FM?
However, leaf FM sounds confusing too.

But why do we need the top FM at all? I considered the orchestrator as the root of the hierarchy that
can keep the knowledge about all FMs. Am I wrong here?
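
To compare the two shapes side by side, here is a small Python sketch that builds the hierarchy
from your diagram with and without a top FM. The Node class and the helper names are hypothetical;
the box names come from the diagram. It does not answer the question, it only shows that both
trees reach the same MLDs and differ in who sits directly under the orchestrator.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One box in the diagram: orchestrator, FM, switch or MLD."""
    name: str
    children: list = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def leaves(node: Node) -> list:
    """Return the names of all leaf nodes reachable from `node`."""
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def build(with_top_fm: bool) -> Node:
    root = Node("Orchestrator")
    # Either insert a top FM between the orchestrator and the per-switch FMs,
    # or let the orchestrator track the per-switch FMs itself.
    parent = root.add(Node("TOP FM")) if with_top_fm else root
    fm1 = parent.add(Node("FM1"))
    fm2 = parent.add(Node("FM2"))
    sw1 = fm1.add(Node("SWITCH 1"))
    sw2 = fm2.add(Node("SWITCH 2"))
    for name in ("MLD 1", "MLD 2"):
        sw1.add(Node(name))
    for name in ("MLD 3", "MLD 4"):
        sw2.add(Node(name))
    return root


layered = build(with_top_fm=True)
flat = build(with_top_fm=False)
same_devices = leaves(layered) == leaves(flat)
```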


> 
>       ┌──────────────────────┐    ┌────────┐
> ┌─────►   Orchestrator ──────┼────► TOP FM ├──────────────────────────────┐
> │     │   ▲                  │    │        │                              │
> │     └───┬──────────────────┘    └────┬───┘                              │
> │         │                            │                                  │
> │         │                       ┌────┴───┐                              │
> │         │                       │ BOTTOM │                              │
> │         │                       │  FM1   │                              │
> │         │                       └────┬───┘                              │
> │   ┌─────┼─────────────┐              │                                  │
> │   │     │             │       ┌──────┼────────┐   ┌────────────────┐    │
> │   │     │         ┌───┤       │      │        │   │                │    │
> │   │     │         │RP0├───────┤  ┌───▼────┐   │   │ ┌────────────┐ │    │
> │   │ ┌───┴──┐      └───┤       │  │ Switch ├───┼───┼─►FM Owned LD │ │    │
> │   │ │ APP  │          │       │  │   CCI  │   │   │ └────────────┘ │    │
> │   │ │      │      ┌───┤       │  │ or MCTP│   ├───┤                │    │
> │   │ └──────┘      │RP1├─────┐ │  │   CCI  │   │   │ MLD 1          │    │
> │   │               └───┤     │ │  └───────┬┘   │   └────────────────┘    │
> │   │ Host A            │     │ │          │    │                         │
> │   └───────────────────┘     │ │          │    │   ┌────────────────┐    │
> │                             │ │          │    │   │                │    │
> │                             │ │          │    │   │ ┌────────────┐ │    │
> │   ┌───────────────────┐     │ │          └────┼───┼─►FM Owned LD │ │    │
> │   │               ┌───┤     │ │               ├───┤ └────────────┘ │    │
> │   │               │RP0├─────┼─┤               │   │                │    │
> │   │ ┌──────┐      └───┤     │ │ SWITCH 1      │   │ MLD 2          │    │
> └───┼─┤ APP  │          │     │ │               │   └────────────────┘    │
>     │ │      │      ┌───┤     │ │               │                         │
>     │ └──────┘      │RP1├──┐  │ └───────────────┘                         │
>     │               └───┤  │  │                                           │
>     │ Host B            │  │  │                                           │
>     └───────────────────┘  │  │                                           │
>                            │  │   ┌────────┐                              │
>                            │  │   │ BOTTOM │                              │
>                            │  │   │  FM2   ◄──────────────────────────────┘
>                            │  │   └────┬───┘
>                            │  │ ┌──────┼────────┐   ┌────────────────┐
>                            │  │ │  ┌───▼────┐   │   │ ┌────────────┐ │
>                            │  │ │  │ Switch ├───┼───┼─►FM Owned LD │ │
>                            │  └─┤  │   CCI  │   │   │ └────────────┘ │
>                            │    │  │ or MCTP│   ├───┤                │
>                            │    │  │   CCI  │   │   │ MLD 3          │
>                            │    │  └───────┬┘   │   └────────────────┘
>                            │    │          │    │   ┌────────────────┐
>                            │    │          │    │   │ ┌────────────┐ │
>                            │    │          └────┼───┼─►FM Owned LD │ │
>                            │    │               ├───┤ └────────────┘ │
>                            │    │ SWITCH 2      │   │ MLD 4          │
>                            └────┤               │   └────────────────┘
>                                 └───────────────┘
>> 
>> (9) Diagram 9: BMC based layered Fabric Manager (FM)
>> https://github.com/computexpresslink/cxl-fm-architecture/blob/main/diagram9-cxl-fm-bmc-based-layered-fm.pdf
> 
> I don't follow what the BMC is in this diagram.
> 
> The BMC is just a (cheap) host that happens to have some elements of the overall management
> infrastructure on it.  Let it be any of the FMs floating around on their own in the
> diagrams above.  The diagram immediately above might be built with 3 BMCs or
> as a single BMC like this...
> 
>      ┌──────────────────────┐
>  ┌───►   Orchestrator       │
>  │   └───▲──────────┬───────┘    ┌────────┐
>  │       │          │            │ ┌────┐ │
>  │       │          └────────────►─┤FM  │ ├───────────────────────────┐
>  │       │                       │ └──┬─┘ │                           │
>  │       │                       │BMC │   │                           │
>  │       │                       └────┼───┘                           │
>  │ ┌─────┼─────────────┐              │                               │
>  │ │     │             │       ┌──────┼────────┐   ┌────────────────┐ │
>  │ │     │         ┌───┤       │      │        │   │                │ │
>  │ │     │         │RP0├───────┤  ┌───▼────┐   │   │ ┌────────────┐ │ │
>  │ │ ┌───┴──┐      └───┤       │  │ Switch ├───┼───┼─►FM Owned LD │ │ │
>  │ │ │ APP  │          │       │  │   CCI  │   │   │ └────────────┘ │ │
>  │ │ │      │      ┌───┤       │  │ or MCTP│   ├───┤                │ │
>  │ │ └──────┘      │RP1├─────┐ │  │   CCI  │   │   │ MLD 1          │ │
>  │ │               └───┤     │ │  └───────┬┘   │   └────────────────┘ │
>  │ │ Host A            │     │ │          │    │                      │
>  │ └───────────────────┘     │ │          │    │   ┌────────────────┐ │
>  │                           │ │          │    │   │                │ │
>  │ ┌───────────────────┐     │ │          │    │   │ ┌────────────┐ │ │
>  │ │                   │     │ │          └────┼───┼─►FM Owned LD │ │ │
>  │ │               ┌───┤     │ │               ├───┤ └────────────┘ │ │
>  │ │               │RP0├─────┼─┤               │   │                │ │
>  │ │ ┌──────┐      └───┤     │ │ SWITCH 1      │   │ MLD 2          │ │
>  └─┼─┤ APP  │          │     │ │               │   └────────────────┘ │
>    │ │      │      ┌───┤     │ │               │                      │
>    │ └──────┘      │RP1├──┐  │ └───────────────┘                      │
>    │               └───┤  │  │                                        │
>    │ Host B            │  │  │                                        │
>    └───────────────────┘  │  │                                        │
>                           │  │        ┌───────────────────────────────┘
>                           │  │ ┌──────┼────────┐   ┌────────────────┐
>                           │  │ │  ┌───▼────┐   │   │ ┌────────────┐ │
>                           │  │ │  │ Switch ├───┼───┼─►FM Owned LD │ │
>                           │  └─┤  │   CCI  │   │   │ └────────────┘ │
>                           │    │  │ or MCTP│   ├───┤                │
>                           │    │  │   CCI  │   │   │ MLD 3          │
>                           │    │  └───────┬┘   │   └────────────────┘
>                           │    │          │    │   ┌────────────────┐
>                           │    │          │    │   │ ┌────────────┐ │
>                           │    │          └────┼───┼─►FM Owned LD │ │
>                           │    │               ├───┤ └────────────┘ │
>                           │    │ SWITCH 2      │   │ MLD 4          │
>                           └────┤               │   └────────────────┘
>                                └───────────────┘
>> 
>> So, have I missed something?
>> Should we correct something on diagrams?
>> Does it look good?
> 
> There are far too many things we should perhaps show on these diagrams, but
> I suspect it will mostly fall out of any layered design.
> 
> We could draw one incredibly complex diagram that does everything :)
> 

I believe several simple diagrams are better than one really complex one. :)

> *crossed fingers the ascii art fun above looks ok*
> 
> Thanks for getting this started btw! I was completely failing to start
> whereas once there was a proposal it became easier to have a go!
> 

My pleasure. :)

Thanks,
Slava.

