* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-22  7:37 
  0 siblings, 0 replies; 9+ messages in thread
From:  @ 2019-05-22  7:37 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 9423 bytes --]

Hi Anil,

I'm trying to fill the gap between FC and RDMA/TCP as much as I can.

Is the following correct?

The RDMA and TCP transports schedule the created qpair to an NVMe-oF poll group.

The FC transport schedules the created qpair (FC connection) to an FC HWQP.
Each FC port has one HWQP for the LS queue and multiple HWQPs for IO queues.
HWQPs are already scheduled to FC transport poll groups when the corresponding FC port is brought online.
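
As a rough sketch of that per-port layout in C (all struct and field names below are illustrative guesses, not taken from the FC patch):

#include <stdint.h>
#include <sys/queue.h>

struct spdk_thread;           /* generic SPDK thread, declared in spdk/thread.h */
struct spdk_nvmf_fc_conn;     /* FC connection (qpair) from the patch; opaque here */

/* Illustrative upper bound on IO HWQPs per port. */
#define FC_PORT_MAX_IO_HWQP 16

/* One hardware queue pair; bound to a poll group / thread when the port goes online. */
struct fc_hwqp {
    uint32_t hwqp_id;
    struct spdk_thread *thread;                       /* thread whose poll group polls this hwqp */
    TAILQ_HEAD(, spdk_nvmf_fc_conn) connections;      /* FC connections currently mapped here */
};

/* One FC port: a single LS queue plus several IO queues. */
struct fc_port {
    uint64_t wwnn;
    uint64_t wwpn;
    struct fc_hwqp ls_hwqp;                           /* FC-4 LS (link service) queue */
    struct fc_hwqp io_hwqp[FC_PORT_MAX_IO_HWQP];      /* IO queues, pre-assigned at port online */
    uint32_t num_io_hwqp;
};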

Thanks,
Shuhei


________________________________
From: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
Sent: May 22, 2019 6:46
To: Walker, Benjamin
CC: 松本周平 / MATSUMOTO,SHUUHEI; Harris, James R; spdk(a)lists.01.org
Subject: [!]Re: Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP

Hi Ben,
     Please find the attached diagram which outlines FC node/link initialization and FC-NVMe connection setup sequence.
Here are some highlights,

  *   Each FC port logs in to the switch to identify itself, establish trust, and make use of the directory service provided by the switch
  *   Storage network intelligence is embedded in FC switches in the form of a directory or name service
  *   An FC-NVMe initiator and target are allowed to set up connections only if the storage admin permits it by zoning them together.

On Fri, May 17, 2019 at 11:00 AM Walker, Benjamin <benjamin.walker(a)intel.com> wrote:
On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
> Hi Shuhei,
>     My response is inline below.
>

I don't know much about Fibre Channel, but surely there must be some step that
tells the FC driver whether to accept a connection or not, right? Even if that's
just the particular FC port to use on the HBA, it's still something. That
information needs to come from an RPC and be passed down through
spdk_nvmf_tgt_listen() to the FC transport.

Step 50 (which again is dependent on step 1 and step 21) has to be completed before NVMe connections can be established.
So bind()/listen() involves an external entity (the switch) and multiple PDU exchanges.
We are exploring the feasibility of moving step 21 to spdk_nvmf_tgt_listen().


> That is the reason we have created a channel or queue to receive these FC-4 LS
> frames for processing. Similarly, ABTS (Abort exchange) is again an FC-LS command.
> As you can see, the FC-4 LS does more work than its RDMA / TCP counterparts.
> That brings us to the second difference - "Management of master thread": the
> amount of work involved led us to implement a dedicated master thread
> to handle all non-IO related processing.

Once a connection is established, you need to tell the generic SPDK NVMe-oF
library about it (so the qpair can get assigned to a poll group). That means
that spdk_nvmf_tgt_accept() is going to need to do something to the transport
that generates the new_qpair callback.
This is exactly how the FC transport implements the accept() API. The transport polls the FC-4 LS queue for
incoming "Create Association", "Create Connection" and "Disconnect" requests.
After accepting / setting up the connection, the transport assigns it to a poll group by
calling spdk_nvmf_poll_group_add().
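
A minimal sketch of that path (the fc_* names here are placeholders, not the actual patch code; spdk_nvmf_poll_group_add() is the real generic API):

#include "spdk/nvmf.h"

/* Placeholder for an FC-4 LS request delivered by the low level driver. */
enum fc_ls_op { FC_LS_CREATE_ASSOCIATION, FC_LS_CREATE_CONNECTION, FC_LS_DISCONNECT };

struct fc_ls_request {
    enum fc_ls_op op;
    struct spdk_nvmf_qpair *qpair;    /* the FC connection set up for a Create Connection */
};

/* Called by the transport's accept()/LS-queue poller for each incoming LS request. */
static void
fc_handle_ls_request(struct spdk_nvmf_poll_group *group, struct fc_ls_request *req)
{
    switch (req->op) {
    case FC_LS_CREATE_CONNECTION:
        /* Connection is accepted and set up; hand its qpair to the generic
         * NVMe-oF layer so the chosen poll group starts polling it. */
        spdk_nvmf_poll_group_add(group, req->qpair);
        break;
    default:
        /* Create Association / Disconnect handling omitted from this sketch. */
        break;
    }
}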


I think what would be helpful is a basic overview of what the FC primitives are
and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK NVMe-oF
primitives are, including mappings for RDMA/TCP:

spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)

That would be the FC WWNN and WWPN, as shown below:
  {
    "nqn": "nqn.2016-06.io.spdk:cnode1",
    "subtype": "NVMe",
    "listen_addresses": [
      {
        "transport": "FC",
        "trtype": "FC",
        "adrfam": "FC",
        "traddr": "nn-0x200000109b6460e1:pn-0x100000109b6460e1",
        "trsvcid": "none"
      },
      {
        "transport": "FC",
        "trtype": "FC",
        "adrfam": "FC",
        "traddr": "nn-0x200000109b6460e2:pn-0x100000109b6460e2",
        "trsvcid": "none"
      }
    ],
    "allow_any_host": true,
    "hosts": [],
    "serial_number": "SPDK00000000000001",
    "model_number": "SPDK bdev Controller",
    "max_namespaces": 20,
    "namespaces": [
      {
        "nsid": 1,
        "bdev_name": "Malloc0",
        "name": "Malloc0",
        "uuid": "b24aa2f5-ff6e-425e-828e-145d03f410cc"
      }
    ]
  }
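
For reference, the first listen address above expressed as an spdk_nvme_transport_id in C would look roughly like the sketch below (assuming the FC values of the trtype/adrfam enums are available in this SPDK tree):

#include <stdio.h>
#include <string.h>
#include "spdk/nvme.h"
#include "spdk/nvmf_spec.h"

/* Sketch only: a transport ID for the first FC listen address above.
 * The traddr carries both the node name ("nn-...") and port name ("pn-..."). */
static void
fc_build_trid(struct spdk_nvme_transport_id *trid)
{
    memset(trid, 0, sizeof(*trid));
    trid->trtype = SPDK_NVME_TRANSPORT_FC;
    trid->adrfam = SPDK_NVMF_ADRFAM_FC;
    snprintf(trid->traddr, sizeof(trid->traddr),
             "nn-0x200000109b6460e1:pn-0x100000109b6460e1");
    snprintf(trid->trsvcid, sizeof(trid->trsvcid), "none");    /* FC has no service ID */
}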


spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)

That would be the NVMe-oFC connection, defined in spdk_nvmf_fc_conn{}.

spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
epoll/kqueue group)
In the FC transport this is the hardware queue pair (hwqp). Logical connections are added to a hwqp as they are created and removed when the connections are torn down.
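
A toy illustration of that behaviour (types and function names invented for this sketch, not the patch's real ones):

#include <stdint.h>
#include <sys/queue.h>

struct fc_conn {                      /* stand-in for the transport's connection object */
    TAILQ_ENTRY(fc_conn) link;
};

struct fc_hwqp {                      /* the FC equivalent of a poll group */
    TAILQ_HEAD(, fc_conn) conns;      /* connections currently polled on this hwqp */
    uint32_t num_conns;
};

/* Add a logical connection to the hwqp when it is created/accepted. */
static void
fc_hwqp_add_conn(struct fc_hwqp *hwqp, struct fc_conn *conn)
{
    TAILQ_INSERT_TAIL(&hwqp->conns, conn, link);
    hwqp->num_conns++;
}

/* Remove it again when the connection (or its association) is torn down. */
static void
fc_hwqp_remove_conn(struct fc_hwqp *hwqp, struct fc_conn *conn)
{
    TAILQ_REMOVE(&hwqp->conns, conn, link);
    hwqp->num_conns--;
}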


>
> Thanks,
> - Anil
>
>
> > 2. Management of master thread
> > The FC transport layer has its own master thread. This may match the thread of
> > the NVMe-oF target's
> > acceptor poller, but they are independent of each other.
> > All FC administration events and link services are executed on the FC
> > transport's master thread.
> >
> >
> > Thanks,
> > Shuhei
> >
> >
> > From: 松本周平 / MATSUMOTO,SHUUHEI
> > Sent: May 14, 2019 14:47
> > To: Anil Veerabhadrappa
> > CC: spdk(a)lists.01.org
> > Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport
> >
> > Hi Anil,
> >
> > Thank you for your kind feedback. Your thought is very reasonable.
> >
> > OK, let's continue to focus on the current patch, and add the two work items
> > in Trello as enhancements.
> > I added them to NVMe-oF target backlog for now.
> > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog
> >
> > Thanks,
> > Shuhei
> >
> > From: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
> > Sent: May 14, 2019 8:11
> > To: 松本周平 / MATSUMOTO,SHUUHEI
> > CC: spdk(a)lists.01.org
> > Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport
> >
> > Hi Shuhei,
> >      Your understanding about WWNN and WWPN is right. Initially there was
> > some consideration of adding FC listener code, but we decided against it.
> > There are a couple of reasons for it:
> > Unlike the RDMA/TCP transports, there is no easy way for the user to obtain worldwide
> > names from, say, a Linux bash shell. There isn't an 'ip address list' or
> > 'ifconfig' equivalent for FC in Linux distributions, so each vendor has to
> > provide their own utility.
> > The FC protocol suite includes a Directory Service via the Name Server, which controls
> > how connected FC ports are discovered in a SAN. Also, zoning controls which
> > ports are allowed to connect to each other. So FC-NVMe inherits an external
> > pseudo listening feature from native FC, and it is OK for the FC NVMf target to
> > listen on all FC ports.
> > Our drivers can support changing WWNN and WWPN. We will work on this feature
> > enhancement after FC-NVMe is merged into the SPDK code. Also, it is worth noting
> > that this feature will introduce some new transport APIs which would be
> > NOPs for the RDMA and TCP transports.
> >
> >     Sure, we can add these 2 work items in Trello as enhancements to FC-NVMe
> > transport to be addressed after it is merged into SPDK master.
> >
> > Thanks,
> > - Anil
> >
> >
> > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
> > shuhei.matsumoto.xt(a)hitachi.com<mailto:shuhei.matsumoto.xt(a)hitachi.com>> wrote:
> > > Hi Anil,
> > >
> > > Thank you for improving the patch continuously. I have seen great
> > > improvement.
> > >
> > > I have an item to discuss with you, and I send it to the mailing list
> > > first.
> > > Please correct me if I'm wrong, or let's discuss on Trello by creating the
> > > board for FC if this question is reasonable.
> > >
> > > NVMe-oF FC transport utilizes two types of WWN.
> > >
> > > WWPN is a World Wide Port Name; a WWPN is a unique ID for each FC port, and
> > > each port on an FC HBA has a unique WWPN.
> > > WWNN is a World Wide Node Name; a WWNN is assigned to an FC HBA.
> > >
> > > If I understand correctly,
> > > The FC low level driver (LLD) reads the persistent WWPN and WWNN and reports
> > > them to the SPDK NVMe-oF transport,
> > > then the SPDK NVMe-oF transport configures listeners according to them.
> > > Besides, nvmf_fc_listen is implemented as a NOP.
> > >
> > > So WWPN and WWNN are read-only for the SPDK NVMe-oF transport.
> > >
> > > But it would be very desirable if we could change the WWNN and WWPN to suit our own needs.
> > > The .INI config file has been deprecated; could you consider adding FC code
> > > to the nvmf_subsystem_add_listener RPC?
> > >
> > > Implementation options may be, for example:
> > > - pass the WWNN and WWPN pair to the LLD, and the LLD changes the WWPN of the
> > > HBA whose WWNN matches,
> > > or
> > > - each FC port has its own PCI address,
> > > - the user passes the trio: PCI address, WWNN, WWPN,
> > > - if the PCI address is the lowest of the FC HBA, the WWNN can be changed.
> > >
> > > If the FC HBA doesn't allow changing the WWNN or WWPN, we can output an error
> > > message.
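
A sketch of what the parameters of such an RPC might look like (the method name, parameter names, and struct below are hypothetical, not an existing SPDK interface):

#include <stddef.h>
#include "spdk/json.h"

/* Hypothetical parameter block for a WWNN/WWPN-changing RPC (e.g. "nvmf_fc_set_wwn"). */
struct rpc_fc_set_wwn {
    char *pci_address;    /* which FC port/HBA function to reprogram */
    char *wwnn;           /* e.g. "nn-0x200000109b6460e1" */
    char *wwpn;           /* e.g. "pn-0x100000109b6460e1" */
};

static const struct spdk_json_object_decoder rpc_fc_set_wwn_decoders[] = {
    {"pci_address", offsetof(struct rpc_fc_set_wwn, pci_address), spdk_json_decode_string},
    {"wwnn", offsetof(struct rpc_fc_set_wwn, wwnn), spdk_json_decode_string},
    {"wwpn", offsetof(struct rpc_fc_set_wwn, wwpn), spdk_json_decode_string},
};

The handler would pass these values to the LLD, which reprograms the matching port, and the RPC would fail if the HBA does not support changing the WWNN/WWPN.
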
> > >
> > > Thanks,
> > > Shuhei
> > >


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-22 10:58 
  0 siblings, 0 replies; 9+ messages in thread
From:  @ 2019-05-22 10:58 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 11679 bytes --]

And additionally (this will be the last one today): scheduling all connections
of an association to a hwqp is done when creating the association. Is the reason
you do that to satisfy a requirement of the FC-NVMe specification?

On Wed, May 22, 2019 at 19:46 松本周平 <shuheimatsumoto(a)gmail.com> wrote:

> But it will be nice if we unify fc hwqp and nvmf poll group more.
>
> And it may be nice if you share again about FC don’t have ifconfig
> equivalent and so all listener equivalent are added to all nvmf subsystems
> by LLD for now.
>
> 2019年5月22日(水) 19:38 松本周平 <shuheimatsumoto(a)gmail.com>:
>
>> Hi Anil,
>>
>> I read your reply again and found that you already had answered to the
>> following questions.
>>
>> So you don’t have to answer again.
>>
>> Thanks for your help
>> Shuhei
>>
>> 2019年5月22日(水) 16:46 松本周平 / MATSUMOTO,SHUUHEI <
>> shuhei.matsumoto.xt(a)hitachi.com>:
>>
>>> Hi Anil,
>>>
>>> I'm trying to fill the gap between FC and RDMA/TCP as possible as I can.
>>>
>>> The following is correct?
>>>
>>> RDMA and TCP transports schedule the created qpair to NVMe-oF poll group.
>>>
>>> FC transport schedules the created qpair (FC connection) to FC HWQP.​
>>> Each FC port has a HWQP for LS queue and multiple HWQPs for IO queues.​
>>> HWQPs are already scheduled to FC transport poll group when changing the
>>> corresponding FC port to online.​
>>>
>>> Thanks,
>>> Shuhei
>>>
>>>
>>> ________________________________
>>> 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
>>> 送信日時: 2019年5月22日 6:46
>>> 宛先: Walker, Benjamin
>>> CC: 松本周平 / MATSUMOTO,SHUUHEI; Harris, James R; spdk(a)lists.01.org
>>> 件名: [!]Re: Difference of thread and transport object management in
>>> NVMe-oFC, RDMA, and TCP
>>>
>>> Hi Ben,
>>>      Please find the attached diagram which outlines FC node/link
>>> initialization and FC-NVMe connection setup sequence.
>>> Here are some highlights,
>>>
>>>   *   Each FC port login to switch to identify itself, establish trust
>>> and make use of directory service provided by switch
>>>   *   Storage network intelligence is embedded in FC switches in the
>>> form of directory or name service
>>>   *   FC-NVMe initiator and target are allowed to setup connections only
>>> if storage admin permits by zoning them together.
>>>
>>> On Fri, May 17, 2019 at 11:00 AM Walker, Benjamin <
>>> benjamin.walker(a)intel.com<mailto:benjamin.walker(a)intel.com>> wrote:
>>> On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
>>> > Hi Shuhei,
>>> >     My response is inline below.
>>> >
>>>
>>> I don't know much about Fibre Channel, but surely there must be some
>>> step that
>>> tells the FC driver whether to accept a connection or not, right? Even
>>> if that's
>>> just the particular FC port to use on the HBA, it's still something. That
>>> information needs to come from an RPC and be passed down through
>>> spdk_nvmf_tgt_listen() to the FC transport.
>>>
>>> Step 50 (which again is dependent on step .1 & step 21.) has to
>>> completed before NVMe connections can be established.
>>> So bind()/listen() involves external entity (switch) and multiple PDU
>>> exchanges.
>>> We are exploring the feasibility for moving step 21. to
>>> spdk_nvmf_tgt_listen().
>>>
>>>
>>> > That is the reason we have created a channel or queue to receive these
>>> FC-4 LS
>>> > frames for processing. Similarly ABTS (Abort exchange)
>>> > is again a FC-LS command. As you can see the FC-4 LS does more work
>>> than it's
>>> > RDMA / TCP counterparts. That brings us to the
>>> > second difference - "Management of master thread", because of the
>>> amount of
>>> > work involved lead us to implement a dedicated master thread
>>> > to handle all non-IO related processing.
>>>
>>> Once a connection is established, you need to tell the generic SPDK
>>> NVMe-oF
>>> library about it (so the qpair can get assigned to a poll group). That
>>> means
>>> that spdk_nvmf_tgt_accept() is going to need to do something to the
>>> transport
>>> that generates the new_qpair callback.
>>> This is exactly fc transport implements accept() api. Transport polls on
>>> FC-4 LS queue for
>>> incoming "Create Association", "Create Connection" and "Disconnect"
>>> requests.
>>> After accepting / setting up connection, transport would assign it to a
>>> poll group by
>>> calling spdk_nvmf_poll_group_add().
>>>
>>>
>>> I think what would be helpful is a basic overview of what the FC
>>> primitives are
>>> and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK
>>> NVMe-oF
>>> primitives are, including mappings for RDMA/TCP:
>>>
>>> spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)
>>>
>>> That would be FC WWNN and WWPN as show below,
>>>   {
>>>     "nqn": "nqn.2016-06.io.spdk:cnode1",
>>>     "subtype": "NVMe",
>>>     "listen_addresses": [
>>>       {
>>>         "transport": "FC",
>>>         "trtype": "FC",
>>>         "adrfam": "FC",
>>>         "traddr": "nn-0x200000109b6460e1:pn-0x100000109b6460e1",
>>>         "trsvcid": "none"
>>>       },
>>>       {
>>>         "transport": "FC",
>>>         "trtype": "FC",
>>>         "adrfam": "FC",
>>>         "traddr": "nn-0x200000109b6460e2:pn-0x100000109b6460e2",
>>>         "trsvcid": "none"
>>>       }
>>>     ],
>>>     "allow_any_host": true,
>>>     "hosts": [],
>>>     "serial_number": "SPDK00000000000001",
>>>     "model_number": "SPDK bdev Controller",
>>>     "max_namespaces": 20,
>>>     "namespaces": [
>>>       {
>>>         "nsid": 1,
>>>         "bdev_name": "Malloc0",
>>>         "name": "Malloc0",
>>>         "uuid": "b24aa2f5-ff6e-425e-828e-145d03f410cc"
>>>       }
>>>     ]
>>>   }
>>>
>>>
>>> spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)
>>>
>>> That would be NVMeoFC connection defined in spdk_nvmf_fc_conn{}
>>>
>>> spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
>>> epoll/kqueue group)
>>> In FC transport this is defined as hardware queue pair (hwqp). Logical
>>> connections  are added to hwqp as they are created and removed when the
>>> connection are torn down.
>>>
>>>
>>> >
>>> > Thanks,
>>> > - Anil
>>> >
>>> >
>>> > > 2. Management of master thread
>>> > > FC transport layer has its own master thread. This may match the
>>> thread of
>>> > > the NVMe-oF target's
>>> > > acceptor poller but is independent each other.
>>> > > All FC administration events and link services are executed on the FC
>>> > > transport's master thread.
>>> > >
>>> > >
>>> > > Thanks,
>>> > > Shuhei
>>> > >
>>> > >
>>> > > 差出人: 松本周平 / MATSUMOTO,SHUUHEI
>>> > > 送信日時: 2019年5月14日 14:47
>>> > > 宛先: Anil Veerabhadrappa
>>> > > CC: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
>>> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
>>> transport
>>> > >
>>> > > Hi Anil,
>>> > >
>>> > > Thank you for your kind feedback. Your thought is very reasonable.
>>> > >
>>> > > OK, let's continue to focus on the current patch, and add the two
>>> work items
>>> > > in Trello as enhancements.
>>> > > I added them to NVMe-oF target backlog for now.
>>> > > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog<
>>> https://clicktime.symantec.com/3LZb3ioxXqvWtnsLh19Lr1r7Vc?u=https%3A%2F%2Ftrello.com%2Fb%2FG2f3dVcs%2Fnvme-of-target-backlog
>>> >
>>> > >
>>> > > Thanks,
>>> > > Shuhei
>>> > >
>>> > > 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com<mailto:
>>> anil.veerabhadrappa(a)broadcom.com>>
>>> > > 送信日時: 2019年5月14日 8:11
>>> > > 宛先: 松本周平 / MATSUMOTO,SHUUHEI
>>> > > CC: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
>>> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
>>> transport
>>> > >
>>> > > Hi Shuhei,
>>> > >      Your understanding about WWNN and WWPN is right. Initially
>>> there was
>>> > > some considerations about adding FC listener code but decided
>>> against it.
>>> > > There are couple reasons for it,
>>> > > Unlike RDMA/TCP transport, there is no easy way for user to obtain
>>> worldwide
>>> > > names from say Linux bash shell. There isn't a 'ip address list' or
>>> > > 'ifconfig' equivalent for FC in Linux distributions. So each vendor
>>> has to
>>> > > provide their own utility.
>>> > > FC protocol suite includes Directory Service via' Name Server which
>>> controls
>>> > > how connected FC ports as discovered in a SNA. Also zoning controls
>>> which
>>> > > ports are allowed to connect to each other. So FC-NVMe inherits an
>>> external
>>> > > pseudo listening feature from native FC. So it is ok for FC NVMf
>>> target to
>>> > > listen on all FC ports.
>>> > > Our drivers can support changing WWNN and WWPN. We will work on this
>>> feature
>>> > > enhancement after FC-NVMe is merged into SPDK code. Also it is worth
>>> noting
>>> > > that this feature will introduce some new transport API's which
>>> would be
>>> > > NOP's for RDMA and TCP transports.
>>> > >
>>> > >     Sure, we can add these 2 work items in Trello as enhancements to
>>> FC-NVMe
>>> > > transport to be addressed after it is merged into SPDK master.
>>> > >
>>> > > Thanks,
>>> > > - Anil
>>> > >
>>> > >
>>> > > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
>>> > > shuhei.matsumoto.xt(a)hitachi.com<mailto:
>>> shuhei.matsumoto.xt(a)hitachi.com>> wrote:
>>> > > > Hi Anil,
>>> > > >
>>> > > > Thank you for improving the patch continuously. I have seen great
>>> > > > improvement.
>>> > > >
>>> > > > I have an item to discuss with you, and I send it to the mailing
>>> list
>>> > > > first.
>>> > > > Please correct me if I'm wrong or let's discusss on Trello by
>>> creating the
>>> > > > board for FC if this question is reasonable.
>>> > > >
>>> > > > NVMe-oF FC transport utilizes two types of WWN.
>>> > > >
>>> > > > WWPN is a World Wide Port Name, WWPN is an unique ID for each FC
>>> port, and
>>> > > > each port on a FC HBA has a unique WWPN.
>>> > > > WWNN is a World Wide Node Name and WWNN is assigned to a FC HBA.
>>> > > >
>>> > > > If I understand correctly,
>>> > > > The FC low level driver (LLD) reads persistent WWPN and WWNN and
>>> informs
>>> > > > them to SPDK NVMe-oF transport,
>>> > > > then SPDK NVMe-oF transport configures listeners according to them.
>>> > > > Besides, nvmf_fc_listen is implemented as NOP.
>>> > > >
>>> > > > So WWPN and WWNN is read-only for SPDK NVMe-oF transport.
>>> > > >
>>> > > > But it is very desirable if we can change WWNN and WWPN as our own
>>> needs.
>>> > > > .INI config file has been deprecated and could you consider to add
>>> FC code
>>> > > > to the nvmf_subsystem_add_listener RPC?
>>> > > >
>>> > > > Implementation options may be for example
>>> > > > - to pass the pair WWNN and WWPN to LLD, and LLD change the WWPN
>>> of the
>>> > > > HBA which matches WWNN.
>>> > > > or
>>> > > > - FC port has its own PCI address.
>>> > > > - user passes the trio, PCI address, WWNN, WWPN.
>>> > > > - if the PCI address is the lowest of the FC HBA, WWNN can be
>>> changed.
>>> > > >
>>> > > > If FC HBA doesn't allow changing WWNN or WWPN, we can output error
>>> > > > message.
>>> > > >
>>> > > > Thanks,
>>> > > > Shuhei
>>> > > >
>>>
>>> _______________________________________________
>>> SPDK mailing list
>>> SPDK(a)lists.01.org
>>> https://lists.01.org/mailman/listinfo/spdk
>>>
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-22 10:46 
  0 siblings, 0 replies; 9+ messages in thread
From:  @ 2019-05-22 10:46 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 11065 bytes --]

But it will be nice if we unify the FC hwqp and the nvmf poll group more.

And it may be nice if you share again that FC doesn't have an ifconfig
equivalent, and so the listener equivalents are added to all nvmf subsystems
by the LLD for now.

On Wed, May 22, 2019 at 19:38 松本周平 <shuheimatsumoto(a)gmail.com> wrote:

> Hi Anil,
>
> I read your reply again and found that you already had answered to the
> following questions.
>
> So you don’t have to answer again.
>
> Thanks for your help
> Shuhei
>
> 2019年5月22日(水) 16:46 松本周平 / MATSUMOTO,SHUUHEI <
> shuhei.matsumoto.xt(a)hitachi.com>:
>
>> Hi Anil,
>>
>> I'm trying to fill the gap between FC and RDMA/TCP as possible as I can.
>>
>> The following is correct?
>>
>> RDMA and TCP transports schedule the created qpair to NVMe-oF poll group.
>>
>> FC transport schedules the created qpair (FC connection) to FC HWQP.​
>> Each FC port has a HWQP for LS queue and multiple HWQPs for IO queues.​
>> HWQPs are already scheduled to FC transport poll group when changing the
>> corresponding FC port to online.​
>>
>> Thanks,
>> Shuhei
>>
>>
>> ________________________________
>> 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
>> 送信日時: 2019年5月22日 6:46
>> 宛先: Walker, Benjamin
>> CC: 松本周平 / MATSUMOTO,SHUUHEI; Harris, James R; spdk(a)lists.01.org
>> 件名: [!]Re: Difference of thread and transport object management in
>> NVMe-oFC, RDMA, and TCP
>>
>> Hi Ben,
>>      Please find the attached diagram which outlines FC node/link
>> initialization and FC-NVMe connection setup sequence.
>> Here are some highlights,
>>
>>   *   Each FC port login to switch to identify itself, establish trust
>> and make use of directory service provided by switch
>>   *   Storage network intelligence is embedded in FC switches in the form
>> of directory or name service
>>   *   FC-NVMe initiator and target are allowed to setup connections only
>> if storage admin permits by zoning them together.
>>
>> On Fri, May 17, 2019 at 11:00 AM Walker, Benjamin <
>> benjamin.walker(a)intel.com<mailto:benjamin.walker(a)intel.com>> wrote:
>> On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
>> > Hi Shuhei,
>> >     My response is inline below.
>> >
>>
>> I don't know much about Fibre Channel, but surely there must be some step
>> that
>> tells the FC driver whether to accept a connection or not, right? Even if
>> that's
>> just the particular FC port to use on the HBA, it's still something. That
>> information needs to come from an RPC and be passed down through
>> spdk_nvmf_tgt_listen() to the FC transport.
>>
>> Step 50 (which again is dependent on step .1 & step 21.) has to completed
>> before NVMe connections can be established.
>> So bind()/listen() involves external entity (switch) and multiple PDU
>> exchanges.
>> We are exploring the feasibility for moving step 21. to
>> spdk_nvmf_tgt_listen().
>>
>>
>> > That is the reason we have created a channel or queue to receive these
>> FC-4 LS
>> > frames for processing. Similarly ABTS (Abort exchange)
>> > is again a FC-LS command. As you can see the FC-4 LS does more work
>> than it's
>> > RDMA / TCP counterparts. That brings us to the
>> > second difference - "Management of master thread", because of the
>> amount of
>> > work involved lead us to implement a dedicated master thread
>> > to handle all non-IO related processing.
>>
>> Once a connection is established, you need to tell the generic SPDK
>> NVMe-oF
>> library about it (so the qpair can get assigned to a poll group). That
>> means
>> that spdk_nvmf_tgt_accept() is going to need to do something to the
>> transport
>> that generates the new_qpair callback.
>> This is exactly fc transport implements accept() api. Transport polls on
>> FC-4 LS queue for
>> incoming "Create Association", "Create Connection" and "Disconnect"
>> requests.
>> After accepting / setting up connection, transport would assign it to a
>> poll group by
>> calling spdk_nvmf_poll_group_add().
>>
>>
>> I think what would be helpful is a basic overview of what the FC
>> primitives are
>> and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK
>> NVMe-oF
>> primitives are, including mappings for RDMA/TCP:
>>
>> spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)
>>
>> That would be FC WWNN and WWPN as show below,
>>   {
>>     "nqn": "nqn.2016-06.io.spdk:cnode1",
>>     "subtype": "NVMe",
>>     "listen_addresses": [
>>       {
>>         "transport": "FC",
>>         "trtype": "FC",
>>         "adrfam": "FC",
>>         "traddr": "nn-0x200000109b6460e1:pn-0x100000109b6460e1",
>>         "trsvcid": "none"
>>       },
>>       {
>>         "transport": "FC",
>>         "trtype": "FC",
>>         "adrfam": "FC",
>>         "traddr": "nn-0x200000109b6460e2:pn-0x100000109b6460e2",
>>         "trsvcid": "none"
>>       }
>>     ],
>>     "allow_any_host": true,
>>     "hosts": [],
>>     "serial_number": "SPDK00000000000001",
>>     "model_number": "SPDK bdev Controller",
>>     "max_namespaces": 20,
>>     "namespaces": [
>>       {
>>         "nsid": 1,
>>         "bdev_name": "Malloc0",
>>         "name": "Malloc0",
>>         "uuid": "b24aa2f5-ff6e-425e-828e-145d03f410cc"
>>       }
>>     ]
>>   }
>>
>>
>> spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)
>>
>> That would be NVMeoFC connection defined in spdk_nvmf_fc_conn{}
>>
>> spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
>> epoll/kqueue group)
>> In FC transport this is defined as hardware queue pair (hwqp). Logical
>> connections  are added to hwqp as they are created and removed when the
>> connection are torn down.
>>
>>
>> >
>> > Thanks,
>> > - Anil
>> >
>> >
>> > > 2. Management of master thread
>> > > FC transport layer has its own master thread. This may match the
>> thread of
>> > > the NVMe-oF target's
>> > > acceptor poller but is independent each other.
>> > > All FC administration events and link services are executed on the FC
>> > > transport's master thread.
>> > >
>> > >
>> > > Thanks,
>> > > Shuhei
>> > >
>> > >
>> > > 差出人: 松本周平 / MATSUMOTO,SHUUHEI
>> > > 送信日時: 2019年5月14日 14:47
>> > > 宛先: Anil Veerabhadrappa
>> > > CC: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
>> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
>> transport
>> > >
>> > > Hi Anil,
>> > >
>> > > Thank you for your kind feedback. Your thought is very reasonable.
>> > >
>> > > OK, let's continue to focus on the current patch, and add the two
>> work items
>> > > in Trello as enhancements.
>> > > I added them to NVMe-oF target backlog for now.
>> > > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog<
>> https://clicktime.symantec.com/3LZb3ioxXqvWtnsLh19Lr1r7Vc?u=https%3A%2F%2Ftrello.com%2Fb%2FG2f3dVcs%2Fnvme-of-target-backlog
>> >
>> > >
>> > > Thanks,
>> > > Shuhei
>> > >
>> > > 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com<mailto:
>> anil.veerabhadrappa(a)broadcom.com>>
>> > > 送信日時: 2019年5月14日 8:11
>> > > 宛先: 松本周平 / MATSUMOTO,SHUUHEI
>> > > CC: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
>> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
>> transport
>> > >
>> > > Hi Shuhei,
>> > >      Your understanding about WWNN and WWPN is right. Initially there
>> was
>> > > some considerations about adding FC listener code but decided against
>> it.
>> > > There are couple reasons for it,
>> > > Unlike RDMA/TCP transport, there is no easy way for user to obtain
>> worldwide
>> > > names from say Linux bash shell. There isn't a 'ip address list' or
>> > > 'ifconfig' equivalent for FC in Linux distributions. So each vendor
>> has to
>> > > provide their own utility.
>> > > FC protocol suite includes Directory Service via' Name Server which
>> controls
>> > > how connected FC ports as discovered in a SNA. Also zoning controls
>> which
>> > > ports are allowed to connect to each other. So FC-NVMe inherits an
>> external
>> > > pseudo listening feature from native FC. So it is ok for FC NVMf
>> target to
>> > > listen on all FC ports.
>> > > Our drivers can support changing WWNN and WWPN. We will work on this
>> feature
>> > > enhancement after FC-NVMe is merged into SPDK code. Also it is worth
>> noting
>> > > that this feature will introduce some new transport API's which would
>> be
>> > > NOP's for RDMA and TCP transports.
>> > >
>> > >     Sure, we can add these 2 work items in Trello as enhancements to
>> FC-NVMe
>> > > transport to be addressed after it is merged into SPDK master.
>> > >
>> > > Thanks,
>> > > - Anil
>> > >
>> > >
>> > > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
>> > > shuhei.matsumoto.xt(a)hitachi.com<mailto:
>> shuhei.matsumoto.xt(a)hitachi.com>> wrote:
>> > > > Hi Anil,
>> > > >
>> > > > Thank you for improving the patch continuously. I have seen great
>> > > > improvement.
>> > > >
>> > > > I have an item to discuss with you, and I send it to the mailing
>> list
>> > > > first.
>> > > > Please correct me if I'm wrong or let's discusss on Trello by
>> creating the
>> > > > board for FC if this question is reasonable.
>> > > >
>> > > > NVMe-oF FC transport utilizes two types of WWN.
>> > > >
>> > > > WWPN is a World Wide Port Name, WWPN is an unique ID for each FC
>> port, and
>> > > > each port on a FC HBA has a unique WWPN.
>> > > > WWNN is a World Wide Node Name and WWNN is assigned to a FC HBA.
>> > > >
>> > > > If I understand correctly,
>> > > > The FC low level driver (LLD) reads persistent WWPN and WWNN and
>> informs
>> > > > them to SPDK NVMe-oF transport,
>> > > > then SPDK NVMe-oF transport configures listeners according to them.
>> > > > Besides, nvmf_fc_listen is implemented as NOP.
>> > > >
>> > > > So WWPN and WWNN is read-only for SPDK NVMe-oF transport.
>> > > >
>> > > > But it is very desirable if we can change WWNN and WWPN as our own
>> needs.
>> > > > .INI config file has been deprecated and could you consider to add
>> FC code
>> > > > to the nvmf_subsystem_add_listener RPC?
>> > > >
>> > > > Implementation options may be for example
>> > > > - to pass the pair WWNN and WWPN to LLD, and LLD change the WWPN of
>> the
>> > > > HBA which matches WWNN.
>> > > > or
>> > > > - FC port has its own PCI address.
>> > > > - user passes the trio, PCI address, WWNN, WWPN.
>> > > > - if the PCI address is the lowest of the FC HBA, WWNN can be
>> changed.
>> > > >
>> > > > If FC HBA doesn't allow changing WWNN or WWPN, we can output error
>> > > > message.
>> > > >
>> > > > Thanks,
>> > > > Shuhei
>> > > >
>>
>> _______________________________________________
>> SPDK mailing list
>> SPDK(a)lists.01.org
>> https://lists.01.org/mailman/listinfo/spdk
>>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-22 10:38 
  0 siblings, 0 replies; 9+ messages in thread
From:  @ 2019-05-22 10:38 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 10439 bytes --]

Hi Anil,

I read your reply again and found that you had already answered the
following questions.

So you don't have to answer them again.

Thanks for your help
Shuhei

On Wed, May 22, 2019 at 16:46 松本周平 / MATSUMOTO,SHUUHEI <
shuhei.matsumoto.xt(a)hitachi.com> wrote:

> Hi Anil,
>
> I'm trying to fill the gap between FC and RDMA/TCP as possible as I can.
>
> The following is correct?
>
> RDMA and TCP transports schedule the created qpair to NVMe-oF poll group.
>
> FC transport schedules the created qpair (FC connection) to FC HWQP.​
> Each FC port has a HWQP for LS queue and multiple HWQPs for IO queues.​
> HWQPs are already scheduled to FC transport poll group when changing the
> corresponding FC port to online.​
>
> Thanks,
> Shuhei
>
>
> ________________________________
> 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
> 送信日時: 2019年5月22日 6:46
> 宛先: Walker, Benjamin
> CC: 松本周平 / MATSUMOTO,SHUUHEI; Harris, James R; spdk(a)lists.01.org
> 件名: [!]Re: Difference of thread and transport object management in
> NVMe-oFC, RDMA, and TCP
>
> Hi Ben,
>      Please find the attached diagram which outlines FC node/link
> initialization and FC-NVMe connection setup sequence.
> Here are some highlights,
>
>   *   Each FC port login to switch to identify itself, establish trust and
> make use of directory service provided by switch
>   *   Storage network intelligence is embedded in FC switches in the form
> of directory or name service
>   *   FC-NVMe initiator and target are allowed to setup connections only
> if storage admin permits by zoning them together.
>
> On Fri, May 17, 2019 at 11:00 AM Walker, Benjamin <
> benjamin.walker(a)intel.com<mailto:benjamin.walker(a)intel.com>> wrote:
> On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
> > Hi Shuhei,
> >     My response is inline below.
> >
>
> I don't know much about Fibre Channel, but surely there must be some step
> that
> tells the FC driver whether to accept a connection or not, right? Even if
> that's
> just the particular FC port to use on the HBA, it's still something. That
> information needs to come from an RPC and be passed down through
> spdk_nvmf_tgt_listen() to the FC transport.
>
> Step 50 (which again is dependent on step .1 & step 21.) has to completed
> before NVMe connections can be established.
> So bind()/listen() involves external entity (switch) and multiple PDU
> exchanges.
> We are exploring the feasibility for moving step 21. to
> spdk_nvmf_tgt_listen().
>
>
> > That is the reason we have created a channel or queue to receive these
> FC-4 LS
> > frames for processing. Similarly ABTS (Abort exchange)
> > is again a FC-LS command. As you can see the FC-4 LS does more work than
> it's
> > RDMA / TCP counterparts. That brings us to the
> > second difference - "Management of master thread", because of the amount
> of
> > work involved lead us to implement a dedicated master thread
> > to handle all non-IO related processing.
>
> Once a connection is established, you need to tell the generic SPDK NVMe-oF
> library about it (so the qpair can get assigned to a poll group). That
> means
> that spdk_nvmf_tgt_accept() is going to need to do something to the
> transport
> that generates the new_qpair callback.
> This is exactly fc transport implements accept() api. Transport polls on
> FC-4 LS queue for
> incoming "Create Association", "Create Connection" and "Disconnect"
> requests.
> After accepting / setting up connection, transport would assign it to a
> poll group by
> calling spdk_nvmf_poll_group_add().
>
>
> I think what would be helpful is a basic overview of what the FC
> primitives are
> and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK
> NVMe-oF
> primitives are, including mappings for RDMA/TCP:
>
> spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)
>
> That would be FC WWNN and WWPN as show below,
>   {
>     "nqn": "nqn.2016-06.io.spdk:cnode1",
>     "subtype": "NVMe",
>     "listen_addresses": [
>       {
>         "transport": "FC",
>         "trtype": "FC",
>         "adrfam": "FC",
>         "traddr": "nn-0x200000109b6460e1:pn-0x100000109b6460e1",
>         "trsvcid": "none"
>       },
>       {
>         "transport": "FC",
>         "trtype": "FC",
>         "adrfam": "FC",
>         "traddr": "nn-0x200000109b6460e2:pn-0x100000109b6460e2",
>         "trsvcid": "none"
>       }
>     ],
>     "allow_any_host": true,
>     "hosts": [],
>     "serial_number": "SPDK00000000000001",
>     "model_number": "SPDK bdev Controller",
>     "max_namespaces": 20,
>     "namespaces": [
>       {
>         "nsid": 1,
>         "bdev_name": "Malloc0",
>         "name": "Malloc0",
>         "uuid": "b24aa2f5-ff6e-425e-828e-145d03f410cc"
>       }
>     ]
>   }
>
>
> spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)
>
> That would be NVMeoFC connection defined in spdk_nvmf_fc_conn{}
>
> spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
> epoll/kqueue group)
> In FC transport this is defined as hardware queue pair (hwqp). Logical
> connections  are added to hwqp as they are created and removed when the
> connection are torn down.
>
>
> >
> > Thanks,
> > - Anil
> >
> >
> > > 2. Management of master thread
> > > FC transport layer has its own master thread. This may match the
> thread of
> > > the NVMe-oF target's
> > > acceptor poller but is independent each other.
> > > All FC administration events and link services are executed on the FC
> > > transport's master thread.
> > >
> > >
> > > Thanks,
> > > Shuhei
> > >
> > >
> > > 差出人: 松本周平 / MATSUMOTO,SHUUHEI
> > > 送信日時: 2019年5月14日 14:47
> > > 宛先: Anil Veerabhadrappa
> > > CC: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> transport
> > >
> > > Hi Anil,
> > >
> > > Thank you for your kind feedback. Your thought is very reasonable.
> > >
> > > OK, let's continue to focus on the current patch, and add the two work
> items
> > > in Trello as enhancements.
> > > I added them to NVMe-oF target backlog for now.
> > > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog<
> https://clicktime.symantec.com/3LZb3ioxXqvWtnsLh19Lr1r7Vc?u=https%3A%2F%2Ftrello.com%2Fb%2FG2f3dVcs%2Fnvme-of-target-backlog
> >
> > >
> > > Thanks,
> > > Shuhei
> > >
> > > 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com<mailto:
> anil.veerabhadrappa(a)broadcom.com>>
> > > 送信日時: 2019年5月14日 8:11
> > > 宛先: 松本周平 / MATSUMOTO,SHUUHEI
> > > CC: spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> transport
> > >
> > > Hi Shuhei,
> > >      Your understanding about WWNN and WWPN is right. Initially there
> was
> > > some considerations about adding FC listener code but decided against
> it.
> > > There are couple reasons for it,
> > > Unlike RDMA/TCP transport, there is no easy way for user to obtain
> worldwide
> > > names from say Linux bash shell. There isn't a 'ip address list' or
> > > 'ifconfig' equivalent for FC in Linux distributions. So each vendor
> has to
> > > provide their own utility.
> > > FC protocol suite includes Directory Service via' Name Server which
> controls
> > > how connected FC ports as discovered in a SNA. Also zoning controls
> which
> > > ports are allowed to connect to each other. So FC-NVMe inherits an
> external
> > > pseudo listening feature from native FC. So it is ok for FC NVMf
> target to
> > > listen on all FC ports.
> > > Our drivers can support changing WWNN and WWPN. We will work on this
> feature
> > > enhancement after FC-NVMe is merged into SPDK code. Also it is worth
> noting
> > > that this feature will introduce some new transport API's which would
> be
> > > NOP's for RDMA and TCP transports.
> > >
> > >     Sure, we can add these 2 work items in Trello as enhancements to
> FC-NVMe
> > > transport to be addressed after it is merged into SPDK master.
> > >
> > > Thanks,
> > > - Anil
> > >
> > >
> > > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
> > > shuhei.matsumoto.xt(a)hitachi.com<mailto:shuhei.matsumoto.xt(a)hitachi.com>>
> wrote:
> > > > Hi Anil,
> > > >
> > > > Thank you for improving the patch continuously. I have seen great
> > > > improvement.
> > > >
> > > > I have an item to discuss with you, and I send it to the mailing list
> > > > first.
> > > > Please correct me if I'm wrong or let's discusss on Trello by
> creating the
> > > > board for FC if this question is reasonable.
> > > >
> > > > NVMe-oF FC transport utilizes two types of WWN.
> > > >
> > > > WWPN is a World Wide Port Name, WWPN is an unique ID for each FC
> port, and
> > > > each port on a FC HBA has a unique WWPN.
> > > > WWNN is a World Wide Node Name and WWNN is assigned to a FC HBA.
> > > >
> > > > If I understand correctly,
> > > > The FC low level driver (LLD) reads persistent WWPN and WWNN and
> informs
> > > > them to SPDK NVMe-oF transport,
> > > > then SPDK NVMe-oF transport configures listeners according to them.
> > > > Besides, nvmf_fc_listen is implemented as NOP.
> > > >
> > > > So WWPN and WWNN is read-only for SPDK NVMe-oF transport.
> > > >
> > > > But it is very desirable if we can change WWNN and WWPN as our own
> needs.
> > > > .INI config file has been deprecated and could you consider to add
> FC code
> > > > to the nvmf_subsystem_add_listener RPC?
> > > >
> > > > Implementation options may be for example
> > > > - to pass the pair WWNN and WWPN to LLD, and LLD change the WWPN of
> the
> > > > HBA which matches WWNN.
> > > > or
> > > > - FC port has its own PCI address.
> > > > - user passes the trio, PCI address, WWNN, WWPN.
> > > > - if the PCI address is the lowest of the FC HBA, WWNN can be
> changed.
> > > >
> > > > If FC HBA doesn't allow changing WWNN or WWPN, we can output error
> > > > message.
> > > >
> > > > Thanks,
> > > > Shuhei
> > > >
>
> _______________________________________________
> SPDK mailing list
> SPDK(a)lists.01.org
> https://lists.01.org/mailman/listinfo/spdk
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-21 21:46 Anil Veerabhadrappa
  0 siblings, 0 replies; 9+ messages in thread
From: Anil Veerabhadrappa @ 2019-05-21 21:46 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 8799 bytes --]

Hi Ben,
     Please find the attached diagram which outlines FC node/link
initialization and FC-NVMe connection setup sequence.
Here are some highlights,

   - Each FC port logs in to the switch to identify itself, establish trust, and
   make use of the directory service provided by the switch
   - Storage network intelligence is embedded in FC switches in the form of a
   directory or name service
   - An FC-NVMe initiator and target are allowed to set up connections only if
   the storage admin permits it by zoning them together.


On Fri, May 17, 2019 at 11:00 AM Walker, Benjamin <benjamin.walker(a)intel.com>
wrote:

> On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
> > Hi Shuhei,
> >     My response is inline below.
> >
>
> I don't know much about Fibre Channel, but surely there must be some step
> that
> tells the FC driver whether to accept a connection or not, right? Even if
> that's
> just the particular FC port to use on the HBA, it's still something. That
> information needs to come from an RPC and be passed down through
> spdk_nvmf_tgt_listen() to the FC transport.
>

Step 50 (which again is dependent on step 1 and step 21) has to be completed
before NVMe connections can be established.
So bind()/listen() involves an external entity (the switch) and multiple PDU
exchanges.
We are exploring the feasibility of moving step 21 to
spdk_nvmf_tgt_listen().


>
> > That is the reason we have created a channel or queue to receive these
> FC-4 LS
> > frames for processing. Similarly ABTS (Abort exchange)
> > is again a FC-LS command. As you can see the FC-4 LS does more work than
> it's
> > RDMA / TCP counterparts. That brings us to the
> > second difference - "Management of master thread", because of the amount
> of
> > work involved lead us to implement a dedicated master thread
> > to handle all non-IO related processing.
>
> Once a connection is established, you need to tell the generic SPDK NVMe-oF
> library about it (so the qpair can get assigned to a poll group). That
> means
> that spdk_nvmf_tgt_accept() is going to need to do something to the
> transport
> that generates the new_qpair callback.
>
This is exactly how the FC transport implements the accept() API. The transport
polls the FC-4 LS queue for
incoming "Create Association", "Create Connection" and "Disconnect"
requests.
After accepting / setting up the connection, the transport assigns it to a
poll group by
calling spdk_nvmf_poll_group_add().


> I think what would be helpful is a basic overview of what the FC
> primitives are
> and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK
> NVMe-oF
> primitives are, including mappings for RDMA/TCP:
>
> spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)
>

That would be the FC WWNN and WWPN, as shown below:

  {
    "nqn": "nqn.2016-06.io.spdk:cnode1",
    "subtype": "NVMe",
    "listen_addresses": [
      {
        "transport": "FC",
        "trtype": "FC",
        "adrfam": "FC",
        "traddr": "nn-0x200000109b6460e1:pn-0x100000109b6460e1",
        "trsvcid": "none"
      },
      {
        "transport": "FC",
        "trtype": "FC",
        "adrfam": "FC",
        "traddr": "nn-0x200000109b6460e2:pn-0x100000109b6460e2",
        "trsvcid": "none"
      }
    ],
    "allow_any_host": true,
    "hosts": [],
    "serial_number": "SPDK00000000000001",
    "model_number": "SPDK bdev Controller",
    "max_namespaces": 20,
    "namespaces": [
      {
        "nsid": 1,
        "bdev_name": "Malloc0",
        "name": "Malloc0",
        "uuid": "b24aa2f5-ff6e-425e-828e-145d03f410cc"
      }
    ]
  }




> spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)
>

That would be the NVMe-oFC connection, defined in spdk_nvmf_fc_conn{}.


> spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
> epoll/kqueue group)
>
In the FC transport this is the hardware queue pair (hwqp). Logical
connections are added to a hwqp as they are created and removed when the
connections are torn down.



> >
> > Thanks,
> > - Anil
> >
> >
> > > 2. Management of master thread
> > > FC transport layer has its own master thread. This may match the
> thread of
> > > the NVMe-oF target's
> > > acceptor poller but is independent each other.
> > > All FC administration events and link services are executed on the FC
> > > transport's master thread.
> > >
> > >
> > > Thanks,
> > > Shuhei
> > >
> > >
> > > 差出人: 松本周平 / MATSUMOTO,SHUUHEI
> > > 送信日時: 2019年5月14日 14:47
> > > 宛先: Anil Veerabhadrappa
> > > CC: spdk(a)lists.01.org
> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> transport
> > >
> > > Hi Anil,
> > >
> > > Thank you for your kind feedback. Your thought is very reasonable.
> > >
> > > OK, let's continue to focus on the current patch, and add the two work
> items
> > > in Trello as enhancements.
> > > I added them to NVMe-oF target backlog for now.
> > > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog
> > >
> > > Thanks,
> > > Shuhei
> > >
> > > 差出人: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
> > > 送信日時: 2019年5月14日 8:11
> > > 宛先: 松本周平 / MATSUMOTO,SHUUHEI
> > > CC: spdk(a)lists.01.org
> > > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> transport
> > >
> > > Hi Shuhei,
> > >      Your understanding about WWNN and WWPN is right. Initially there
> was
> > > some considerations about adding FC listener code but decided against
> it.
> > > There are couple reasons for it,
> > > Unlike RDMA/TCP transport, there is no easy way for user to obtain
> worldwide
> > > names from say Linux bash shell. There isn't a 'ip address list' or
> > > 'ifconfig' equivalent for FC in Linux distributions. So each vendor
> has to
> > > provide their own utility.
> > > FC protocol suite includes Directory Service via' Name Server which
> controls
> > > how connected FC ports as discovered in a SNA. Also zoning controls
> which
> > > ports are allowed to connect to each other. So FC-NVMe inherits an
> external
> > > pseudo listening feature from native FC. So it is ok for FC NVMf
> target to
> > > listen on all FC ports.
> > > Our drivers can support changing WWNN and WWPN. We will work on this
> feature
> > > enhancement after FC-NVMe is merged into SPDK code. Also it is worth
> noting
> > > that this feature will introduce some new transport API's which would
> be
> > > NOP's for RDMA and TCP transports.
> > >
> > >     Sure, we can add these 2 work items in Trello as enhancements to
> FC-NVMe
> > > transport to be addressed after it is merged into SPDK master.
> > >
> > > Thanks,
> > > - Anil
> > >
> > >
> > > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
> > > shuhei.matsumoto.xt(a)hitachi.com> wrote:
> > > > Hi Anil,
> > > >
> > > > Thank you for improving the patch continuously. I have seen great
> > > > improvement.
> > > >
> > > > I have an item to discuss with you, and I send it to the mailing list
> > > > first.
> > > > Please correct me if I'm wrong or let's discusss on Trello by
> creating the
> > > > board for FC if this question is reasonable.
> > > >
> > > > NVMe-oF FC transport utilizes two types of WWN.
> > > >
> > > > WWPN is a World Wide Port Name, WWPN is an unique ID for each FC
> port, and
> > > > each port on a FC HBA has a unique WWPN.
> > > > WWNN is a World Wide Node Name and WWNN is assigned to a FC HBA.
> > > >
> > > > If I understand correctly,
> > > > The FC low level driver (LLD) reads persistent WWPN and WWNN and
> informs
> > > > them to SPDK NVMe-oF transport,
> > > > then SPDK NVMe-oF transport configures listeners according to them.
> > > > Besides, nvmf_fc_listen is implemented as NOP.
> > > >
> > > > So WWPN and WWNN is read-only for SPDK NVMe-oF transport.
> > > >
> > > > But it is very desirable if we can change WWNN and WWPN as our own
> needs.
> > > > .INI config file has been deprecated and could you consider to add
> FC code
> > > > to the nvmf_subsystem_add_listener RPC?
> > > >
> > > > Implementation options may be for example
> > > > - to pass the pair WWNN and WWPN to LLD, and LLD change the WWPN of
> the
> > > > HBA which matches WWNN.
> > > > or
> > > > - FC port has its own PCI address.
> > > > - user passes the trio, PCI address, WWNN, WWPN.
> > > > - if the PCI address is the lowest of the FC HBA, WWNN can be
> changed.
> > > >
> > > > If FC HBA doesn't allow changing WWNN or WWPN, we can output error
> > > > message.
> > > >
> > > > Thanks,
> > > > Shuhei
> > > >
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-17 18:00 Walker, Benjamin
  0 siblings, 0 replies; 9+ messages in thread
From: Walker, Benjamin @ 2019-05-17 18:00 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 9207 bytes --]

On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
> Hi Shuhei,
>     My response is inline below.
> 
> 
> On Thu, May 16, 2019 at 11:08 PM 松本周平 / MATSUMOTO,SHUUHEI <
> shuhei.matsumoto.xt(a)hitachi.com> wrote:
> > Hi Anil, Ben, Jim, and All,
> > 
> > Thanks to Anil's great help, I am making steady progress in understanding the NVMe-
> > oFC patch.
> > https://review.gerrithub.io/#/c/spdk/spdk/+/450077/
> > 
> > I noticed a few differences about management of transport object and thread.
> > 
> > I think the difference comes from the difference in transports, as far as I
> > can tell from reading
> > https://s3.us-east-2.amazonaws.com/intel-builders/day_2_spdk_nvme_over_fabrics.pdf
> > by John Meneghini and Madhu Pai of NetApp.
> > 
> > Can we change the FC transport to align with the RDMA and TCP transports a little more,
> > or
> > can all transports coexist even if there are differences?
> > 
> > 
> > I added a Trello card as follows too, and we will be able to discuss it there.
> > https://trello.com/c/q42Nt2X3/54-difference-of-transport-object-and-thread-management-between-fc-transport-and-rdma-tcp-transport
> > 
> > 
> > I'm afraid my understanding may be wrong, but I hope Anil will correct it 🙂
> > Any feedback is appreciated.
> > 
> > 1. Management of FC transport object
> > Add/remove listeners
> > For the RDMA/TCP transports,
> > we add/remove listeners in the RDMA/TCP transport layer via the
> > nvmf_subsystem_add_listener RPC.
> > The NVMe-oF target has all transport objects, and the nvmf_subsystem_add_listener RPC
> > calls the NVMe-oF
> > target. So the NVMe-oF target calls nvmf_rdma/tcp_listen() to add a listener and
> > passes the
> > corresponding transport object to it.
> > For the FC transport,
> > the FC transport layer implements add/removal of listeners as the callback
> > nvmf_fc_adm_add_rem_nport_listener() from the LLD (low level driver); the LLD calls
> > the callback for each
> > WWNN-WWPN pair. The LLD doesn't have any reference to the FC transport object.
> > The FC transport layer has the reference to the FC transport object instead, and
> > the invoked callback nvmf_fc_adm_add_rem_nport_listener() gets the stored
> > transport object.
> > So, nvmf_fc_listen() is a NOP.
> > Accept new connections
> > For the RDMA/TCP transports,
> > the SPDK NVMe-oF target polls spdk_nvmf_tgt_accept(), and spdk_nvmf_tgt_accept()
> > calls
> > nvmf_rdma/tcp_accept() and passes the corresponding transport object to it.
> > For the FC transport,
> > spdk_nvmf_tgt_accept() calls nvmf_fc_accept() and passes the corresponding
> > transport object to it.
> > But nvmf_fc_accept() doesn't pass the FC transport object to the LLD's link
> > service poller.
> > The LLD's link service poller polls the hardware link service queue and invokes the FC
> > transport layer's
> > callbacks. The LLD doesn't have any reference to the FC transport object, so
> > the callbacks get the
> > stored transport object.
> > 
> 
>  
> All controller events such as Create_Association, Disconnect_Association,
> Create_Connection, etc. are provided by FC-4 LS commands.

I don't know much about Fibre Channel, but surely there must be some step that
tells the FC driver whether to accept a connection or not, right? Even if that's
just the particular FC port to use on the HBA, it's still something. That
information needs to come from an RPC and be passed down through
spdk_nvmf_tgt_listen() to the FC transport.

> That is the reason we have created a channel or queue to receive these FC-4 LS
> frames for processing. Similarly ABTS (Abort exchange)
> is again a FC-LS command. As you can see the FC-4 LS does more work than it's
> RDMA / TCP counterparts. That brings us to the 
> second difference - "Management of master thread", because of the amount of
> work involved lead us to implement a dedicated master thread
> to handle all non-IO related processing.

Once a connection is established, you need to tell the generic SPDK NVMe-oF
library about it (so the qpair can get assigned to a poll group). That means
that spdk_nvmf_tgt_accept() is going to need to do something to the transport
that generates the new_qpair callback.

I think what would be helpful is a basic overview of what the FC primitives are
and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK NVMe-oF
primitives are, including mappings for RDMA/TCP:

spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)
spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)
spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
epoll/kqueue group)

> 
> Thanks,
> - Anil
> 
>  
> > 2. Management of master thread
> > FC transport layer has its own master thread. This may match the thread of
> > the NVMe-oF target's
> > acceptor poller but is independent each other.
> > All FC administration events and link services are executed on the FC
> > transport's master thread.
> > 
> > 
> > Thanks,
> > Shuhei
> > 
> > 
> > 差出人: 松本周平 / MATSUMOTO,SHUUHEI
> > 送信日時: 2019年5月14日 14:47
> > 宛先: Anil Veerabhadrappa
> > CC: spdk(a)lists.01.org
> > 件名: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport
> >  
> > Hi Anil,
> > 
> > Thank you for your kind feedback. Your thought is very reasonable.
> > 
> > OK, let's continue to focus on the current patch, and add the two work items
> > in Trello as enhancements.
> > I added them to NVMe-oF target backlog for now.
> > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog
> > 
> > Thanks,
> > Shuhei
> > 
> > From: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
> > Sent: May 14, 2019 8:11
> > To: 松本周平 / MATSUMOTO,SHUUHEI
> > CC: spdk(a)lists.01.org
> > Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport
> >  
> > Hi Shuhei,
> >      Your understanding about WWNN and WWPN is right. Initially there were
> > some considerations about adding FC listener code, but we decided against it.
> > There are a couple of reasons for it:
> > Unlike the RDMA/TCP transports, there is no easy way for a user to obtain worldwide
> > names from, say, a Linux bash shell. There isn't an 'ip address list' or
> > 'ifconfig' equivalent for FC in Linux distributions. So each vendor has to
> > provide its own utility.
> > The FC protocol suite includes a Directory Service via a Name Server, which controls
> > how connected FC ports are discovered in a SAN. Also, zoning controls which
> > ports are allowed to connect to each other. So FC-NVMe inherits an external
> > pseudo listening feature from native FC. So it is OK for the FC NVMf target to
> > listen on all FC ports.
> > Our drivers can support changing WWNN and WWPN. We will work on this feature
> > enhancement after FC-NVMe is merged into the SPDK code. Also, it is worth noting
> > that this feature will introduce some new transport APIs which would be
> > NOPs for the RDMA and TCP transports.
> > 
> >     Sure, we can add these 2 work items in Trello as enhancements to FC-NVMe 
> > transport to be addressed after it is merged into SPDK master.
> > 
> > Thanks,
> > - Anil
> > 
> > 
> > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
> > shuhei.matsumoto.xt(a)hitachi.com> wrote:
> > > Hi Anil,
> > > 
> > > Thank you for improving the patch continuously. I have seen great
> > > improvement.
> > > 
> > > I have an item to discuss with you, and I'm sending it to the mailing list
> > > first.
> > > Please correct me if I'm wrong, or let's discuss on Trello by creating a
> > > board for FC if this question is reasonable.
> > > 
> > > The NVMe-oF FC transport utilizes two types of WWN.
> > > 
> > > A WWPN is a World Wide Port Name; it is a unique ID for each FC port, and
> > > each port on an FC HBA has a unique WWPN.
> > > A WWNN is a World Wide Node Name and is assigned to an FC HBA.
> > > 
> > > If I understand correctly,
> > > the FC low level driver (LLD) reads the persistent WWPN and WWNN and reports
> > > them to the SPDK NVMe-oF transport,
> > > then the SPDK NVMe-oF transport configures listeners according to them.
> > > Besides, nvmf_fc_listen is implemented as a NOP.
> > > 
> > > So the WWPN and WWNN are read-only for the SPDK NVMe-oF transport.
> > > 
> > > But it would be very desirable if we could change the WWNN and WWPN according
> > > to our own needs.
> > > The .INI config file has been deprecated, so could you consider adding FC code
> > > to the nvmf_subsystem_add_listener RPC?
> > > 
> > > Implementation options may be, for example:
> > > - pass the WWNN-WWPN pair to the LLD, and the LLD changes the WWPN of the
> > > HBA whose WWNN matches.
> > > or
> > > - each FC port has its own PCI address.
> > > - the user passes the trio of PCI address, WWNN, and WWPN.
> > > - if the PCI address is the lowest of the FC HBA, the WWNN can be changed.
> > > 
> > > If the FC HBA doesn't allow changing the WWNN or WWPN, we can output an error
> > > message.
> > > 
> > > Thanks,
> > > Shuhei
> > > 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-17 17:40 Anil Veerabhadrappa
  0 siblings, 0 replies; 9+ messages in thread
From: Anil Veerabhadrappa @ 2019-05-17 17:40 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 7889 bytes --]

Hi Shuhei,
    My response is inline below.


On Thu, May 16, 2019 at 11:08 PM 松本周平 / MATSUMOTO,SHUUHEI <
shuhei.matsumoto.xt(a)hitachi.com> wrote:

> Hi Anil, Ben, Jim, and All,
>
> Thanks to Anil's great help, I have made steady progress in understanding the
> NVMe-oFC patch.
> https://review.gerrithub.io/#/c/spdk/spdk/+/450077/
>
> I noticed a few differences in the management of the transport object and
> thread.
>
> I think the differences come from the differences between the transports, as far
> as I can tell from reading
>
> https://s3.us-east-2.amazonaws.com/intel-builders/day_2_spdk_nvme_over_fabrics.pdf
> by John Meneghini and Madhu Pai of NetApp.
>
> Can we change the FC transport to align with the RDMA and TCP transports a
> little more, or can all transports coexist even if there are differences?
>
>
> I added a Trello card as well, so we will be able to discuss it there.
>
> https://trello.com/c/q42Nt2X3/54-difference-of-transport-object-and-thread-management-between-fc-transport-and-rdma-tcp-transport
>
>
> I'm afraid my understanding may be wrong, but I hope Anil will correct it
> 🙂
> Any feedback is appreciated.
>
> 1. Management of FC transport object
> Add/remove listeners
> For RDMA/TCP transport,
> we add/remove listeners to the RDMA/TCP transport layer by the
> nvmf_subsystem_add_listener RPC.
> The NVMe-oF target has all transport objects, and the nvmf_subsystem_add_listener
> RPC calls the NVMe-oF target. So the NVMe-oF target calls nvmf_rdma/tcp_listen()
> to add a listener and passes the corresponding transport object to it.
> For FC transport,
> the FC transport layer implements add/removal of listeners as the callback
> nvmf_fc_adm_add_rem_nport_listener() from the LLD (low level driver); the LLD
> calls the callback for each WWNN-WWPN pair.
> The LLD doesn't have any reference to the FC transport object.
> The FC transport layer holds the reference to the FC transport object instead,
> and the invoked callback nvmf_fc_adm_add_rem_nport_listener() gets the stored
> transport object.
> So, nvmf_fc_listen() is a NOP.
> Accept new connections
> For RDMA/TCP transport,
> the SPDK NVMe-oF target polls spdk_nvmf_tgt_accept(), and spdk_nvmf_tgt_accept()
> calls nvmf_rdma/tcp_accept() and passes the corresponding transport object to it.
> For FC transport,
> spdk_nvmf_tgt_accept() calls nvmf_fc_accept() and passes the corresponding
> transport object to it.
> But nvmf_fc_accept() doesn't pass the FC transport object to the LLD's
> link service poller.
> The LLD's link service poller polls the hardware link service queue and invokes
> the FC transport layer's callbacks. The LLD doesn't have any reference to the FC
> transport object, so the callbacks get the stored transport object.
>

All controller events such as Create_Association, Disconnect_Association,
Create_Connection, etc. are provided by FC-4 LS commands.
That is the reason we have created a channel or queue to receive these FC-4
LS frames for processing. Similarly, ABTS (Abort exchange)
is again an FC-LS command. As you can see, the FC-4 LS path does more work than
its RDMA / TCP counterparts. That brings us to the
second difference - "Management of master thread": the amount of
work involved led us to implement a dedicated master thread
to handle all non-IO related processing.
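
To make that concrete, a rough sketch of the kind of dispatch such a master-thread
LS poller ends up doing. All of the names below are illustrative placeholders, not
the actual symbols in the FC patch:

/* Illustrative pseudo-structure only; types and handlers are hypothetical. */
enum fc_ls_cmd {
    FC_LS_CREATE_ASSOCIATION,
    FC_LS_CREATE_CONNECTION,
    FC_LS_DISCONNECT,
    FC_LS_ABTS,
};

struct fc_ls_frame {
    enum fc_ls_cmd cmd;
    void *payload;
};

/* Runs on the FC transport's master thread, fed by the FC-4 LS queue. */
static void
fc_handle_ls_frame(struct fc_ls_frame *frame)
{
    switch (frame->cmd) {
    case FC_LS_CREATE_ASSOCIATION:
        /* Set up the association and its admin queue pair, then hand the
         * qpair to the generic target layer so it lands in a poll group. */
        break;
    case FC_LS_CREATE_CONNECTION:
        /* Set up an additional I/O queue pair for the association. */
        break;
    case FC_LS_DISCONNECT:
        /* Tear the association down. */
        break;
    case FC_LS_ABTS:
        /* Abort the exchange; non-I/O work handled here as well. */
        break;
    }
}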

Thanks,
- Anil



> 2. Management of master thread
> The FC transport layer has its own master thread. This may match the thread of
> the NVMe-oF target's acceptor poller, but the two are independent of each other.
> All FC administration events and link services are executed on the FC
> transport's master thread.
>
>
> Thanks,
> Shuhei
>
>
> ------------------------------
> *From:* 松本周平 / MATSUMOTO,SHUUHEI
> *Sent:* May 14, 2019 14:47
> *To:* Anil Veerabhadrappa
> *CC:* spdk(a)lists.01.org
> *Subject:* Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> transport
>
> Hi Anil,
>
> Thank you for your kind feedback. Your thought is very reasonable.
>
> OK, let's continue to focus on the current patch, and add the two work
> items in Trello as enhancements.
> I added them to NVMe-oF target backlog for now.
> https://trello.com/b/G2f3dVcs/nvme-of-target-backlog
>
> Thanks,
> Shuhei
>
> ------------------------------
> *From:* Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
> *Sent:* May 14, 2019 8:11
> *To:* 松本周平 / MATSUMOTO,SHUUHEI
> *CC:* spdk(a)lists.01.org
> *Subject:* Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> transport
>
> Hi Shuhei,
>      Your understanding about WWNN and WWPN is right. Initially there were
> some considerations about adding FC listener code, but we decided against it.
> There are a couple of reasons for it:
>
>    - Unlike the RDMA/TCP transports, there is no easy way for a user to obtain
>    worldwide names from, say, a Linux bash shell. There isn't an 'ip address list'
>    or 'ifconfig' equivalent for FC in Linux distributions. So each vendor has
>    to provide its own utility.
>    - The FC protocol suite includes a Directory Service via a Name Server, which
>    controls how connected FC ports are discovered in a SAN. Also, zoning
>    controls which ports are allowed to connect to each other. So FC-NVMe
>    inherits an external pseudo listening feature from native FC. So it is OK
>    for the FC NVMf target to listen on all FC ports.
>
> Our drivers can support changing WWNN and WWPN. We will work on this
> feature enhancement after FC-NVMe is merged into the SPDK code. Also, it is
> worth noting that this feature will introduce some new transport APIs
> which would be NOPs for the RDMA and TCP transports.
>
>     Sure, we can add these 2 work items in Trello as enhancements to
> FC-NVMe transport to be addressed after it is merged into SPDK master.
>
> Thanks,
> - Anil
>
>
> On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
> shuhei.matsumoto.xt(a)hitachi.com> wrote:
>
> Hi Anil,
>
> Thank you for improving the patch continuously. I have seen great
> improvement.
>
> I have an item to discuss with you, and I'm sending it to the mailing list
> first.
> Please correct me if I'm wrong, or let's discuss on Trello by creating a
> board for FC if this question is reasonable.
>
> The NVMe-oF FC transport utilizes two types of WWN.
>
> A WWPN is a World Wide Port Name; it is a unique ID for each FC port, and
> each port on an FC HBA has a unique WWPN.
> A WWNN is a World Wide Node Name and is assigned to an FC HBA.
>
> If I understand correctly,
> the FC low level driver (LLD) reads the persistent WWPN and WWNN and reports
> them to the SPDK NVMe-oF transport,
> then the SPDK NVMe-oF transport configures listeners according to them.
> Besides, nvmf_fc_listen is implemented as a NOP.
>
> So the WWPN and WWNN are read-only for the SPDK NVMe-oF transport.
>
> But it would be very desirable if we could change the WWNN and WWPN according
> to our own needs.
> The .INI config file has been deprecated, so could you consider adding FC code
> to the nvmf_subsystem_add_listener RPC?
>
> Implementation options may be, for example:
> - pass the WWNN-WWPN pair to the LLD, and the LLD changes the WWPN of the
> HBA whose WWNN matches.
> or
> - each FC port has its own PCI address.
> - the user passes the trio of PCI address, WWNN, and WWPN.
> - if the PCI address is the lowest of the FC HBA, the WWNN can be changed.
>
> If the FC HBA doesn't allow changing the WWNN or WWPN, we can output an error
> message.
>
> Thanks,
> Shuhei
>
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-17  6:29 
  0 siblings, 0 replies; 9+ messages in thread
From:  @ 2019-05-17  6:29 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 8007 bytes --]

Hi Anil, Ben, Jim, and All,

That is,

for the RDMA and TCP transports, all operations start from the NVMe-oF target layer and go down to the RDMA or TCP transport layer,
but for the FC transport, many management operations start from the FC low level driver (LLD) via callbacks
into the FC transport layer.

The LLD doesn't know the transport object or the master thread (on which the acceptor poller may run), so
the FC transport layer remembers its transport object and master thread and passes them to the callbacks.

Some of the LLD's pollers are registered to the NVMe-oF target layer's poll groups indirectly,
but that information is not passed to the pollers.
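
For reference, a minimal sketch of the two patterns contrasted above. Only the
explicit-transport style mirrors the generic transport API; the FC-side names
(g_fc_transport, fc_lld_callback_example) are placeholders assumed for illustration:

#include "spdk/nvmf.h"

/* RDMA/TCP style: the target layer owns the transport object and passes it
 * explicitly into every transport operation. */
static int
explicit_transport_listen(struct spdk_nvmf_transport *transport,
                          const struct spdk_nvme_transport_id *trid)
{
    (void)transport;    /* nothing global is needed */
    (void)trid;
    return 0;
}

/* FC style as described above: the LLD callback carries no transport pointer,
 * so the FC transport layer stashes it when the transport is created. */
static struct spdk_nvmf_transport *g_fc_transport;

static void
fc_lld_callback_example(void *lld_ctx)
{
    /* The LLD knows nothing about the transport object; the callback recovers
     * it from the stored reference instead. */
    struct spdk_nvmf_transport *transport = g_fc_transport;

    (void)transport;    /* ...add/remove a listener, accept a connection, etc. */
    (void)lld_ctx;
}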

What can we do so that the FC transport can coexist with the others in the long term?

Thanks,
Shuhei



________________________________
From: 松本周平 / MATSUMOTO,SHUUHEI
Sent: May 17, 2019 15:08
To: Anil Veerabhadrappa; benjamin.walker(a)intel.com; james.r.harris(a)intel.com
CC: spdk(a)lists.01.org
Subject: Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP

Hi Anil, Ben, Jim, and All,

Thanks to Anil's great help, I have made steady progress in understanding the NVMe-oFC patch.
https://review.gerrithub.io/#/c/spdk/spdk/+/450077/

I noticed a few differences in the management of the transport object and thread.

I think the differences come from the differences between the transports, as far as I can tell from reading
https://s3.us-east-2.amazonaws.com/intel-builders/day_2_spdk_nvme_over_fabrics.pdf
by John Meneghini and Madhu Pai of NetApp.

Can we change the FC transport to align with the RDMA and TCP transports a little more, or
can all transports coexist even if there are differences?


I added a Trello card as well, so we will be able to discuss it there.
https://trello.com/c/q42Nt2X3/54-difference-of-transport-object-and-thread-management-between-fc-transport-and-rdma-tcp-transport


I'm afraid my understanding may be wrong, but I hope Anil will correct it 🙂
Any feedback is appreciated.

1. Management of FC transport object

Add/remove listeners

For RDMA/TCP transport,
we add/remove listeners to the RDMA/TCP transport layer by the nvmf_subsystem_add_listener RPC.
The NVMe-oF target has all transport objects, and the nvmf_subsystem_add_listener RPC calls the NVMe-oF
target. So the NVMe-oF target calls nvmf_rdma/tcp_listen() to add a listener and passes the
corresponding transport object to it.

For FC transport,
the FC transport layer implements add/removal of listeners as the callback
nvmf_fc_adm_add_rem_nport_listener() from the LLD (low level driver); the LLD calls the callback for each
WWNN-WWPN pair. The LLD doesn't have any reference to the FC transport object.
The FC transport layer holds the reference to the FC transport object instead, and
the invoked callback nvmf_fc_adm_add_rem_nport_listener() gets the stored transport object.
So, nvmf_fc_listen() is a NOP.

Accept new connections

For RDMA/TCP transport,
the SPDK NVMe-oF target polls spdk_nvmf_tgt_accept(), and spdk_nvmf_tgt_accept() calls
nvmf_rdma/tcp_accept() and passes the corresponding transport object to it.

For FC transport,
spdk_nvmf_tgt_accept() calls nvmf_fc_accept() and passes the corresponding transport object to it.
But nvmf_fc_accept() doesn't pass the FC transport object to the LLD's link service poller.
The LLD's link service poller polls the hardware link service queue and invokes the FC transport layer's
callbacks. The LLD doesn't have any reference to the FC transport object, so the callbacks get the
stored transport object.

2. Management of master thread

The FC transport layer has its own master thread. This may match the thread of the NVMe-oF target's
acceptor poller, but the two are independent of each other.
All FC administration events and link services are executed on the FC transport's master thread.


Thanks,
Shuhei


________________________________
From: 松本周平 / MATSUMOTO,SHUUHEI
Sent: May 14, 2019 14:47
To: Anil Veerabhadrappa
CC: spdk(a)lists.01.org
Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport

Hi Anil,

Thank you for your kind feedback. Your thought is very reasonable.

OK, let's continue to focus on the current patch, and add the two work items in Trello as enhancements.
I added them to NVMe-oF target backlog for now.
https://trello.com/b/G2f3dVcs/nvme-of-target-backlog

Thanks,
Shuhei

________________________________
From: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
Sent: May 14, 2019 8:11
To: 松本周平 / MATSUMOTO,SHUUHEI
CC: spdk(a)lists.01.org
Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport

Hi Shuhei,
     Your understanding about WWNN and WWPN is right. Initially there were some considerations about adding FC listener code, but we decided against it.
There are a couple of reasons for it:

  *   Unlike the RDMA/TCP transports, there is no easy way for a user to obtain worldwide names from, say, a Linux bash shell. There isn't an 'ip address list' or 'ifconfig' equivalent for FC in Linux distributions. So each vendor has to provide its own utility.
  *   The FC protocol suite includes a Directory Service via a Name Server, which controls how connected FC ports are discovered in a SAN. Also, zoning controls which ports are allowed to connect to each other. So FC-NVMe inherits an external pseudo listening feature from native FC. So it is OK for the FC NVMf target to listen on all FC ports.

Our drivers can support changing WWNN and WWPN. We will work on this feature enhancement after FC-NVMe is merged into the SPDK code. Also, it is worth noting that this feature will introduce some new transport APIs which would be NOPs for the RDMA and TCP transports.

    Sure, we can add these 2 work items in Trello as enhancements to FC-NVMe transport to be addressed after it is merged into SPDK master.

Thanks,
- Anil


On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <shuhei.matsumoto.xt(a)hitachi.com<mailto:shuhei.matsumoto.xt(a)hitachi.com>> wrote:
Hi Anil,

Thank you for improving the patch continuously. I have seen great improvement.

I have an item to discuss with you, and I'm sending it to the mailing list first.
Please correct me if I'm wrong, or let's discuss on Trello by creating a board for FC if this question is reasonable.

The NVMe-oF FC transport utilizes two types of WWN.

A WWPN is a World Wide Port Name; it is a unique ID for each FC port, and each port on an FC HBA has a unique WWPN.
A WWNN is a World Wide Node Name and is assigned to an FC HBA.

If I understand correctly,
the FC low level driver (LLD) reads the persistent WWPN and WWNN and reports them to the SPDK NVMe-oF transport,
then the SPDK NVMe-oF transport configures listeners according to them.
Besides, nvmf_fc_listen is implemented as a NOP.

So the WWPN and WWNN are read-only for the SPDK NVMe-oF transport.

But it would be very desirable if we could change the WWNN and WWPN according to our own needs.
The .INI config file has been deprecated, so could you consider adding FC code to the nvmf_subsystem_add_listener RPC?

Implementation options may be, for example:
- pass the WWNN-WWPN pair to the LLD, and the LLD changes the WWPN of the HBA whose WWNN matches.
or
- each FC port has its own PCI address.
- the user passes the trio of PCI address, WWNN, and WWPN.
- if the PCI address is the lowest of the FC HBA, the WWNN can be changed.

If the FC HBA doesn't allow changing the WWNN or WWPN, we can output an error message.

Thanks,
Shuhei


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
@ 2019-05-17  6:08 
  0 siblings, 0 replies; 9+ messages in thread
From:  @ 2019-05-17  6:08 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 6953 bytes --]

Hi Anil, Ben, Jim, and All,

Thanks to Anil's great help, I have made steady progress in understanding the NVMe-oFC patch.
https://review.gerrithub.io/#/c/spdk/spdk/+/450077/

I noticed a few differences in the management of the transport object and thread.

I think the differences come from the differences between the transports, as far as I can tell from reading
https://s3.us-east-2.amazonaws.com/intel-builders/day_2_spdk_nvme_over_fabrics.pdf
by John Meneghini and Madhu Pai of NetApp.

Can we change the FC transport to align with the RDMA and TCP transports a little more, or
can all transports coexist even if there are differences?


I added a Trello card as well, so we will be able to discuss it there.
https://trello.com/c/q42Nt2X3/54-difference-of-transport-object-and-thread-management-between-fc-transport-and-rdma-tcp-transport


I'm afraid my understanding may be wrong, but I hope Anil will correct it 🙂
Any feedback is appreciated.

1. Management of FC transport object

Add/remove listeners

For RDMA/TCP transport,
we add/remove listeners to the RDMA/TCP transport layer by the nvmf_subsystem_add_listener RPC.
The NVMe-oF target has all transport objects, and the nvmf_subsystem_add_listener RPC calls the NVMe-oF
target. So the NVMe-oF target calls nvmf_rdma/tcp_listen() to add a listener and passes the
corresponding transport object to it.

For FC transport,
the FC transport layer implements add/removal of listeners as the callback
nvmf_fc_adm_add_rem_nport_listener() from the LLD (low level driver); the LLD calls the callback for each
WWNN-WWPN pair. The LLD doesn't have any reference to the FC transport object.
The FC transport layer holds the reference to the FC transport object instead, and
the invoked callback nvmf_fc_adm_add_rem_nport_listener() gets the stored transport object.
So, nvmf_fc_listen() is a NOP.

Accept new connections

For RDMA/TCP transport,
the SPDK NVMe-oF target polls spdk_nvmf_tgt_accept(), and spdk_nvmf_tgt_accept() calls
nvmf_rdma/tcp_accept() and passes the corresponding transport object to it.

For FC transport,
spdk_nvmf_tgt_accept() calls nvmf_fc_accept() and passes the corresponding transport object to it.
But nvmf_fc_accept() doesn't pass the FC transport object to the LLD's link service poller.
The LLD's link service poller polls the hardware link service queue and invokes the FC transport layer's
callbacks. The LLD doesn't have any reference to the FC transport object, so the callbacks get the
stored transport object.

2. Management of master thread

The FC transport layer has its own master thread. This may match the thread of the NVMe-oF target's
acceptor poller, but the two are independent of each other.
All FC administration events and link services are executed on the FC transport's master thread.
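
As a tiny sketch of what "executed on the FC transport's master thread" amounts to
in SPDK's threading model, assuming the standard spdk_thread message-passing
primitives (the event structure and all names here are placeholders for illustration):

#include "spdk/thread.h"

static struct spdk_thread *g_fc_master_thread;  /* illustrative name */

struct fc_adm_event {
    int type;
    void *ctx;
};

static void
fc_adm_event_on_master(void *arg)
{
    struct fc_adm_event *event = arg;

    /* Executes on g_fc_master_thread, no matter which thread (an LLD poller,
     * the acceptor, an RPC handler) queued the event. */
    (void)event;
}

static void
fc_queue_adm_event(struct fc_adm_event *event)
{
    spdk_thread_send_msg(g_fc_master_thread, fc_adm_event_on_master, event);
}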


Thanks,
Shuhei


________________________________
From: 松本周平 / MATSUMOTO,SHUUHEI
Sent: May 14, 2019 14:47
To: Anil Veerabhadrappa
CC: spdk(a)lists.01.org
Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport

Hi Anil,

Thank you for your kind feedback. Your thought is very reasonable.

OK, let's continue to focus on the current patch, and add the two work items in Trello as enhancements.
I added them to NVMe-oF target backlog for now.
https://trello.com/b/G2f3dVcs/nvme-of-target-backlog

Thanks,
Shuhei

________________________________
From: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
Sent: May 14, 2019 8:11
To: 松本周平 / MATSUMOTO,SHUUHEI
CC: spdk(a)lists.01.org
Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC transport

Hi Shuhei,
     Your understanding about WWNN and WWPN is right. Initially there were some considerations about adding FC listener code, but we decided against it.
There are a couple of reasons for it:

  *   Unlike the RDMA/TCP transports, there is no easy way for a user to obtain worldwide names from, say, a Linux bash shell. There isn't an 'ip address list' or 'ifconfig' equivalent for FC in Linux distributions. So each vendor has to provide its own utility.
  *   The FC protocol suite includes a Directory Service via a Name Server, which controls how connected FC ports are discovered in a SAN. Also, zoning controls which ports are allowed to connect to each other. So FC-NVMe inherits an external pseudo listening feature from native FC. So it is OK for the FC NVMf target to listen on all FC ports.

Our drivers can support changing WWNN and WWPN. We will work on this feature enhancement after FC-NVMe is merged into the SPDK code. Also, it is worth noting that this feature will introduce some new transport APIs which would be NOPs for the RDMA and TCP transports.

    Sure, we can add these 2 work items in Trello as enhancements to FC-NVMe transport to be addressed after it is merged into SPDK master.

Thanks,
- Anil


On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <shuhei.matsumoto.xt(a)hitachi.com<mailto:shuhei.matsumoto.xt(a)hitachi.com>> wrote:
Hi Anil,

Thank you for improving the patch continuously. I have seen great improvement.

I have an item to discuss with you, and I'm sending it to the mailing list first.
Please correct me if I'm wrong, or let's discuss on Trello by creating a board for FC if this question is reasonable.

The NVMe-oF FC transport utilizes two types of WWN.

A WWPN is a World Wide Port Name; it is a unique ID for each FC port, and each port on an FC HBA has a unique WWPN.
A WWNN is a World Wide Node Name and is assigned to an FC HBA.

If I understand correctly,
the FC low level driver (LLD) reads the persistent WWPN and WWNN and reports them to the SPDK NVMe-oF transport,
then the SPDK NVMe-oF transport configures listeners according to them.
Besides, nvmf_fc_listen is implemented as a NOP.

So the WWPN and WWNN are read-only for the SPDK NVMe-oF transport.

But it would be very desirable if we could change the WWNN and WWPN according to our own needs.
The .INI config file has been deprecated, so could you consider adding FC code to the nvmf_subsystem_add_listener RPC?

Implementation options may be, for example:
- pass the WWNN-WWPN pair to the LLD, and the LLD changes the WWPN of the HBA whose WWNN matches.
or
- each FC port has its own PCI address.
- the user passes the trio of PCI address, WWNN, and WWPN.
- if the PCI address is the lowest of the FC HBA, the WWNN can be changed.

If the FC HBA doesn't allow changing the WWNN or WWPN, we can output an error message.

Thanks,
Shuhei


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-05-22 10:58 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-22  7:37 [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP 
  -- strict thread matches above, loose matches on Subject: below --
2019-05-22 10:58 
2019-05-22 10:46 
2019-05-22 10:38 
2019-05-21 21:46 Anil Veerabhadrappa
2019-05-17 18:00 Walker, Benjamin
2019-05-17 17:40 Anil Veerabhadrappa
2019-05-17  6:29 
2019-05-17  6:08 
