From: Walker, Benjamin
Subject: Re: [SPDK] NVMe vendor specific commands
Date: Mon, 09 Sep 2019 21:42:32 +0000
To: spdk@lists.01.org

On Thu, 2019-09-05 at 09:00 -0600, Michael Haeuptle wrote:
> Thanks for the reply, Ben.
>
> Couple of follow-up questions/comments:
> - I assume the per vendor command callbacks are attached to the
> spdk_nvmf_subsystem. So the API would look like:
> spdk_nvmf_subsystem_add_admin_hdlr(spdk_nvmf_subsystem *ss, uint8_t
> vendor_cmd, (*hdlr_fct)(spdk_nvmf_cmd *cmd, spdk_nvmf_cpl *cpl, void *buf,
> uint32_t len))?

Yeah, something like that.
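To make that concrete, here's a rough sketch of the shape I'm imagining. To
be clear, none of this exists in the tree today -- the registration function,
the handler table, and the dispatch helper are all placeholder names, and a
real version would hang the table off the subsystem rather than a global:

/* Sketch only: a hypothetical public API for registering handlers for
 * vendor-specific admin commands. None of these names exist in SPDK today.
 * Note that struct spdk_nvmf_request and the SPDK_NVMF_REQUEST_EXEC_STATUS_*
 * values currently live in lib/nvmf/nvmf_internal.h, so a real patch would
 * have to re-export whatever the public handler signature needs. */

#include <errno.h>
#include <stdint.h>

#include "spdk/nvmf.h"

#define VENDOR_ADMIN_OPC_FIRST 0xC0

/* Give the handler the whole request so it can run asynchronously and
 * complete the request later, just like the IO path does. */
typedef int (*spdk_nvmf_vendor_admin_hdlr)(struct spdk_nvmf_request *req);

/* One slot per opcode in the vendor-specific admin range (C0h-FFh). */
static spdk_nvmf_vendor_admin_hdlr
g_vendor_admin_hdlrs[0x100 - VENDOR_ADMIN_OPC_FIRST];

int
spdk_nvmf_subsystem_add_admin_hdlr(struct spdk_nvmf_subsystem *ss,
				   uint8_t opc,
				   spdk_nvmf_vendor_admin_hdlr fn)
{
	(void)ss; /* a real version would store the table per subsystem */

	if (opc < VENDOR_ADMIN_OPC_FIRST) {
		return -EINVAL;
	}
	g_vendor_admin_hdlrs[opc - VENDOR_ADMIN_OPC_FIRST] = fn;
	return 0;
}

/* Then, in spdk_nvmf_ctrlr_process_admin_cmd(), instead of unconditionally
 * failing opcodes in the vendor-specific range: */
static int
nvmf_dispatch_vendor_admin_cmd(struct spdk_nvmf_request *req, uint8_t opc)
{
	spdk_nvmf_vendor_admin_hdlr fn;

	fn = g_vendor_admin_hdlrs[opc - VENDOR_ADMIN_OPC_FIRST];
	if (fn == NULL) {
		/* No handler registered; the caller falls through to the
		 * existing Invalid Opcode failure path. */
		return -ENOTSUP;
	}

	/* A handler may return SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS
	 * and call spdk_nvmf_request_complete() from its own callback once
	 * the (e.g. epoll-driven) work finishes -- that covers the cage
	 * management flow you describe below. */
	return fn(req);
}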
> - One use case we have in mind is handling cage management requests. This
> means that we have to interact with another component within the handler
> (e.g. via sockets/epoll). So the flow/code would look similar to the IO
> path, where we always do async processing
> (SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS). The hdlr_fct would need to
> use a completion callback to actually finish the request and post the reply.
>
> - It seems that the admin command handling would follow the IO processing
> very closely and replace the actual bdev processing with the hdlr_fct. On
> the surface this seems fairly straightforward, but I'm somewhat concerned
> that we'd be duplicating a lot of what's already there for the IO
> processing, especially dealing with corner cases. What's your take on
> that, any concerns? Do you think this is a small/medium effort?

I think it has to be off to the side to maintain any semblance of modularity.
A lot of the handlers will likely turn around and make calls into the bdev
layer, so they'll work out fine, but we can't know for sure what specifically
the handlers will need to do. I'm not that worried about corner cases here,
and I think if any common operation begins to emerge we can probably come up
with a generic way to provide that via the nvmf library.

> - The API to set the handler is a little bit weird since it is outside of
> the current way of implementing bdevs... Maybe another approach would be to
> have a global function in the bdev module structure that is responsible for
> admin commands, with the subsystem as the context? But that would break the
> generic bdev abstraction even more.
>
> Thanks.
>
> -- Michael
>
> On Wed, Sep 4, 2019 at 11:59 AM Walker, Benjamin wrote:
>
> > On Fri, 2019-08-30 at 16:11 -0600, Michael Haeuptle wrote:
> > > Hello,
> > >
> > > I have a couple of observations/questions regarding vendor specific
> > > commands over NVMf and I was hoping to get your take on this.
> > >
> > > - I can get vendor specific *IO* commands to work in my custom bdev by
> > > supporting SPDK_BDEV_IO_TYPE_NVME_IO. However, doing the same for
> > > *Admin* commands fails. Is there a reason why we are blocking the
> > > vendor specific range of C0h-FFh for Admin commands (see
> > > spdk_nvmf_ctrlr_process_admin_cmd)?
> >
> > I'm a bit torn on what to do about vendor specific commands. In the
> > NVMe-oF target, each bdev is treated as a namespace in a virtualized
> > controller. For physical NVMe devices, each namespace on the device
> > becomes a bdev. The issues all arise when a vendor specific command is
> > performing some operation that targets the whole controller instead of
> > just a namespace. For example:
> >
> > 1) It's entirely possible that a device with two namespaces places each
> > of those namespaces in an entirely different subsystem on the NVMe-oF
> > side. The hosts then think these namespaces aren't related to one
> > another. If a vendor specific command that changes some global state on
> > the SSD is allowed through, this can cause serious problems for the
> > other unwitting subsystem.
> >
> > 2) The NVMe-oF target allows multiple hosts to connect simultaneously,
> > emulating a separate NVMe controller for each. Even in the case where
> > the physical SSD has a single namespace, surfaced as a bdev and added
> > to a single subsystem, allowing a vendor specific command through may
> > have an impact on both controllers presented to the two hosts, but
> > there's no way for the NVMe-oF code to know that.
> >
> > I think the reality is that all vendor-specific commands will likely
> > need to be handled in the NVMe-oF software emulation instead of simply
> > being passed through to the backing device blindly. The cleanest way to
> > do this is probably to add a public API function to the nvmf library
> > that allows users to register a handler for a vendor specific command,
> > and then spdk_nvmf_ctrlr_process_admin_cmd looks up whether a handler
> > is present and calls it rather than just failing everything in the
> > vendor-specific range.
> >
> > > - The current bdev API to complete an NVMe-specific request is
> > > spdk_bdev_io_complete_nvme_status(struct spdk_bdev_io *bdev_io, int
> > > sct, int sc), but it only takes the two status codes that are written
> > > to the completion queue status field. I would also like to set CDW0
> > > of the completion queue entry. Are there any plans to support this,
> > > or do we want to keep the bdev API as front-end protocol agnostic as
> > > possible?
> > > I guess one way to support it would be to add an additional field to
> > > the bdev_io structure like there is for the NVMe/SCSI specific status
> > > code handling, but again, do we want to add more protocol specifics
> > > to bdev_io? Are there any other options to set CDW0?
> > >
> > > Thanks for your help.
> > >
> > > -- Michael
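Coming back to the CDW0 question at the bottom, since it never got a direct
answer: if we decide the extra protocol leakage into bdev_io is acceptable,
the least invasive shape I can picture is one extra parameter on the
completion call plus a matching field in bdev_io's NVMe status. Purely
hypothetical -- this is not what's in the tree today:

/* Today: spdk_bdev_io_complete_nvme_status(bdev_io, sct, sc).
 * Hypothetical extension that also carries CDW0 back to the initiator: */
void spdk_bdev_io_complete_nvme_status(struct spdk_bdev_io *bdev_io,
				       uint32_t cdw0, int sct, int sc);

/* A vendor-specific handler completing successfully with data in CDW0: */
spdk_bdev_io_complete_nvme_status(bdev_io, 0x1234,
				  SPDK_NVME_SCT_GENERIC,
				  SPDK_NVME_SC_SUCCESS);

On the read-out side, spdk_bdev_io_get_nvme_status() would grow a matching
cdw0 out-parameter so the nvmf ctrlr code can copy it into the completion.
Plumbing one more dword through is mostly mechanical; the real question is
whether we want bdev_io to accumulate any more protocol-specific state.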