From: Anil Veerabhadrappa <anil.veerabhadrappa at broadcom.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] Difference of thread and transport object management in NVMe-oFC, RDMA, and TCP
Date: Tue, 21 May 2019 14:46:08 -0700
Message-ID: <CAFUp95eQKCbfndQGOhHRcM-8BdFmscXU4BUcU9Voabm1ZQGo0A@mail.gmail.com>
In-Reply-To: 51b1bb1e1bb55babe834eeba9c2989121844c83b.camel@intel.com


Hi Ben,
     Please find the attached diagram, which outlines the FC node/link
initialization and FC-NVMe connection setup sequence.
Here are some highlights:

   - Each FC port logs into the switch to identify itself, establish trust,
   and make use of the directory service provided by the switch.
   - Storage network intelligence is embedded in the FC switches in the form
   of a directory (name) service.
   - An FC-NVMe initiator and target are allowed to set up connections only
   if the storage admin permits it by zoning them together.


On Fri, May 17, 2019 at 11:00 AM Walker, Benjamin <benjamin.walker(a)intel.com>
wrote:

> On Fri, 2019-05-17 at 10:40 -0700, Anil Veerabhadrappa wrote:
> > Hi Shuhei,
> >     My response is inline below.
> >
>
> I don't know much about Fibre Channel, but surely there must be some step
> that tells the FC driver whether to accept a connection or not, right? Even
> if that's just the particular FC port to use on the HBA, it's still
> something. That information needs to come from an RPC and be passed down
> through spdk_nvmf_tgt_listen() to the FC transport.
>

Step 50 (which again depends on step 1 and step 21) has to be completed
before NVMe connections can be established.
So bind()/listen() involves an external entity (the switch) and multiple PDU
exchanges.
We are exploring the feasibility of moving step 21 to spdk_nvmf_tgt_listen().
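
As a rough sketch only (the helper below is a placeholder I made up, not code
from the patch), the FC listen address could be expressed as a transport ID
along these lines before being handed to spdk_nvmf_tgt_listen():

    #include <stdio.h>
    #include <string.h>

    #include "spdk/nvme.h"
    #include "spdk/nvmf_spec.h"

    /* Hypothetical helper, not part of the patch: fill a transport ID for
     * an FC listener from a WWNN/WWPN pair. */
    static void
    fc_build_listen_trid(struct spdk_nvme_transport_id *trid,
                         const char *wwnn, const char *wwpn)
    {
        memset(trid, 0, sizeof(*trid));
        trid->trtype = SPDK_NVME_TRANSPORT_FC;
        trid->adrfam = SPDK_NVMF_ADRFAM_FC;

        /* FC has no IP port; the address is the node/port name pair. */
        snprintf(trid->traddr, sizeof(trid->traddr), "nn-%s:pn-%s", wwnn, wwpn);
        snprintf(trid->trsvcid, sizeof(trid->trsvcid), "none");

        /* This trid would then be passed to spdk_nvmf_tgt_listen(); the FC
         * transport's listen hook is one place step 21 (the name-server
         * registration) could eventually be triggered from. */
    }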


>
> > That is the reason we have created a channel or queue to receive these
> > FC-4 LS frames for processing. Similarly, ABTS (Abort Exchange) is again
> > an FC-LS command. As you can see, the FC-4 LS does more work than its
> > RDMA / TCP counterparts. That brings us to the second difference,
> > "Management of master thread": the amount of work involved led us to
> > implement a dedicated master thread to handle all non-IO related
> > processing.
>
> Once a connection is established, you need to tell the generic SPDK NVMe-oF
> library about it (so the qpair can get assigned to a poll group). That means
> that spdk_nvmf_tgt_accept() is going to need to do something to the transport
> that generates the new_qpair callback.
>
This is exactly how the FC transport implements the accept() API. The
transport polls the FC-4 LS queue for incoming "Create Association", "Create
Connection" and "Disconnect" requests.
After accepting and setting up a connection, the transport assigns it to a
poll group by calling spdk_nvmf_poll_group_add().
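
A rough sketch of that flow; the fc_* poller below is a hypothetical
placeholder for the FC-4 LS handling, while spdk_nvmf_poll_group_add() is the
existing SPDK call mentioned above:

    #include "spdk/nvmf.h"

    /* Hypothetical: drain the FC-4 LS queue and, when a "Create Connection"
     * request has been accepted, return the new connection's generic qpair
     * (NULL when nothing is pending). */
    struct spdk_nvmf_qpair *fc_ls_poll_for_new_qpair(void);

    static void
    fc_accept_poll(struct spdk_nvmf_poll_group *group)
    {
        struct spdk_nvmf_qpair *qpair;

        /* Runs on the FC master thread: handles Create Association /
         * Create Connection / Disconnect link services. */
        while ((qpair = fc_ls_poll_for_new_qpair()) != NULL) {
            /* Hand the new connection to a poll group so its I/O is
             * serviced on that group's thread. */
            spdk_nvmf_poll_group_add(group, qpair);
        }
    }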


> I think what would be helpful is a basic overview of what the FC primitives
> are and how you plan to map them to the SPDK NVMe-oF primitives. The SPDK
> NVMe-oF primitives are, including mappings for RDMA/TCP:
>
> spdk_nvme_transport_id (RDMA: IP/port TCP: IP/port)
>

Those would be the FC WWNN and WWPN, as shown below:

  {
    "nqn": "nqn.2016-06.io.spdk:cnode1",
    "subtype": "NVMe",
    "listen_addresses": [
      {
        "transport": "FC",
        "trtype": "FC",
        "adrfam": "FC",
        "traddr": "nn-0x200000109b6460e1:pn-0x100000109b6460e1",
        "trsvcid": "none"
      },
      {
        "transport": "FC",
        "trtype": "FC",
        "adrfam": "FC",
        "traddr": "nn-0x200000109b6460e2:pn-0x100000109b6460e2",
        "trsvcid": "none"
      }
    ],
    "allow_any_host": true,
    "hosts": [],
    "serial_number": "SPDK00000000000001",
    "model_number": "SPDK bdev Controller",
    "max_namespaces": 20,
    "namespaces": [
      {
        "nsid": 1,
        "bdev_name": "Malloc0",
        "name": "Malloc0",
        "uuid": "b24aa2f5-ff6e-425e-828e-145d03f410cc"
      }
    ]
  }
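
For illustration only, one way that traddr string could be split back into
its WWNN/WWPN components (the actual parsing in the FC transport patch may
differ):

    #include <inttypes.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical parser for "nn-0x<wwnn>:pn-0x<wwpn>" as shown above. */
    static bool
    fc_parse_traddr(const char *traddr, uint64_t *wwnn, uint64_t *wwpn)
    {
        return sscanf(traddr, "nn-0x%16" SCNx64 ":pn-0x%16" SCNx64,
                      wwnn, wwpn) == 2;
    }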




> spdk_nvmf_qpair (RDMA: ibv_qp. TCP: socket)
>

That would be the FC-NVMe connection, defined in spdk_nvmf_fc_conn{}.
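
Roughly, and only assuming the same embedding pattern the RDMA/TCP transports
use (the real spdk_nvmf_fc_conn in the patch carries more FC-specific state,
and the header name below is an assumption):

    #include <stdint.h>
    #include "spdk/nvmf_transport.h"  /* header name assumed; provides struct spdk_nvmf_qpair */

    struct spdk_nvmf_fc_conn {
        struct spdk_nvmf_qpair qpair;   /* generic NVMe-oF qpair, embedded first so the
                                         * transport can convert between the two views */
        uint64_t               conn_id; /* illustrative FC-NVMe connection identifier */
        /* ... hwqp pointer, association, SQ head/size, etc. ... */
    };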


> spdk_nvmf_poll_group (RDMA: shared ibv_cq, optionally shared ibv_srq TCP:
> epoll/kqueue group)
>
In the FC transport this is defined as a hardware queue pair (hwqp). Logical
connections are added to the hwqp as they are created and removed when the
connections are torn down.
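
Again only a sketch with made-up field names, to show how the hwqp plays the
poll-group role:

    #include <sys/queue.h>
    #include <stdint.h>

    struct spdk_nvmf_fc_conn;   /* per-connection state, as in the earlier sketch */

    /* Hypothetical hardware queue pair: the FC transport's unit of a poll group. */
    struct spdk_nvmf_fc_hwqp {
        uint32_t hwqp_id;    /* illustrative */
        uint32_t num_conns;  /* live connections currently on this hwqp */
        /* Connections are linked in when created and unlinked on teardown. */
        TAILQ_HEAD(, spdk_nvmf_fc_conn) connection_list;
    };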



> >
> > Thanks,
> > - Anil
> >
> >
> > > 2. Management of master thread
> > > FC transport layer has its own master thread. This may match the thread
> > > of the NVMe-oF target's acceptor poller, but they are independent of
> > > each other.
> > > All FC administration events and link services are executed on the FC
> > > transport's master thread.
> > >
> > >
> > > Thanks,
> > > Shuhei
> > >
> > >
> > > From: 松本周平 / MATSUMOTO,SHUUHEI
> > > Sent: May 14, 2019, 14:47
> > > To: Anil Veerabhadrappa
> > > CC: spdk(a)lists.01.org
> > > Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> > > transport
> > >
> > > Hi Anil,
> > >
> > > Thank you for your kind feedback. Your thought is very reasonable.
> > >
> > > OK, let's continue to focus on the current patch, and add the two work
> items
> > > in Trello as enhancements.
> > > I added them to NVMe-oF target backlog for now.
> > > https://trello.com/b/G2f3dVcs/nvme-of-target-backlog
> > >
> > > Thanks,
> > > Shuhei
> > >
> > > From: Anil Veerabhadrappa <anil.veerabhadrappa(a)broadcom.com>
> > > Sent: May 14, 2019, 8:11
> > > To: 松本周平 / MATSUMOTO,SHUUHEI
> > > CC: spdk(a)lists.01.org
> > > Subject: Re: Configuring listener (especially WWPN) in SPDK NVMe-oF FC
> > > transport
> > >
> > > Hi Shuhei,
> > >      Your understanding about WWNN and WWPN is right. Initially there was
> > > some consideration of adding FC listener code, but we decided against it.
> > > There are a couple of reasons for this:
> > > Unlike the RDMA/TCP transports, there is no easy way for a user to obtain
> > > worldwide names from, say, a Linux bash shell. There isn't an 'ip address
> > > list' or 'ifconfig' equivalent for FC in Linux distributions, so each
> > > vendor has to provide their own utility.
> > > The FC protocol suite includes a Directory Service via the Name Server,
> > > which controls how connected FC ports are discovered in a SAN. Also,
> > > zoning controls which ports are allowed to connect to each other. So
> > > FC-NVMe inherits an external pseudo-listening feature from native FC, and
> > > it is OK for the FC NVMf target to listen on all FC ports.
> > > Our drivers can support changing WWNN and WWPN. We will work on this
> > > feature enhancement after FC-NVMe is merged into the SPDK code. Also it
> > > is worth noting that this feature will introduce some new transport APIs
> > > which would be NOPs for the RDMA and TCP transports.
> > >
> > >     Sure, we can add these 2 work items in Trello as enhancements to the
> > > FC-NVMe transport, to be addressed after it is merged into SPDK master.
> > >
> > > Thanks,
> > > - Anil
> > >
> > >
> > > On Mon, May 13, 2019 at 1:38 AM 松本周平 / MATSUMOTO,SHUUHEI <
> > > shuhei.matsumoto.xt(a)hitachi.com> wrote:
> > > > Hi Anil,
> > > >
> > > > Thank you for improving the patch continuously. I have seen great
> > > > improvement.
> > > >
> > > > I have an item to discuss with you, and I'm sending it to the mailing
> > > > list first.
> > > > Please correct me if I'm wrong, or let's discuss on Trello by creating
> > > > a board for FC if this question is reasonable.
> > > >
> > > > NVMe-oF FC transport utilizes two types of WWN.
> > > >
> > > > WWPN is a World Wide Port Name; it is a unique ID for each FC port, and
> > > > each port on an FC HBA has a unique WWPN.
> > > > WWNN is a World Wide Node Name; it is assigned to an FC HBA.
> > > >
> > > > If I understand correctly,
> > > > the FC low-level driver (LLD) reads the persistent WWPN and WWNN and
> > > > reports them to the SPDK NVMe-oF transport,
> > > > then the SPDK NVMe-oF transport configures listeners according to them.
> > > > Besides, nvmf_fc_listen is implemented as a NOP.
> > > >
> > > > So WWPN and WWNN are read-only for the SPDK NVMe-oF transport.
> > > >
> > > > But it is very desirable if we can change WWNN and WWPN for our own
> > > > needs.
> > > > The .INI config file has been deprecated, so could you consider adding
> > > > FC code to the nvmf_subsystem_add_listener RPC?
> > > >
> > > > Implementation options may be, for example:
> > > > - to pass the WWNN/WWPN pair to the LLD, and the LLD changes the WWPN
> > > > of the HBA which matches the WWNN,
> > > > or
> > > > - the FC port has its own PCI address,
> > > > - the user passes the trio of PCI address, WWNN, and WWPN,
> > > > - if the PCI address is the lowest of the FC HBA, the WWNN can be
> > > > changed.
> > > >
> > > > If the FC HBA doesn't allow changing the WWNN or WWPN, we can output an
> > > > error message.
> > > >
> > > > Thanks,
> > > > Shuhei
> > > >
>
>
