* [SPDK] NVMe vendor specific commands
@ 2019-08-30 22:11 Michael Haeuptle
From: Michael Haeuptle @ 2019-08-30 22:11 UTC (permalink / raw)
  To: spdk


Hello,

I have a couple of observations/questions regarding vendor specific
commands over NVMf and I was hoping to get your take on this.

- I can get vendor-specific *IO* commands to work in my custom bdev by
supporting SPDK_BDEV_IO_TYPE_NVME_IO. However, doing the same for *Admin*
commands fails. Is there a reason why we are blocking the vendor-specific
range of C0h-FFh for Admin commands (see spdk_nvmf_ctrlr_process_admin_cmd)?
- The current bdev API to complete an NVMe-specific request is
spdk_bdev_io_complete_nvme_status(struct spdk_bdev_io *bdev_io, int sct,
int sc), but it only takes the two status codes that are written to the
status field of the completion queue entry. I would also like to set CDW0
of the completion queue entry. Are there any plans to support this, or do
we want to keep the bdev API as front-end protocol agnostic as possible?
I guess one way to support it would be to add an additional field to the
bdev_io structure, like there is for the NVMe/SCSI-specific status code
handling, but again, do we want to add more protocol specifics to
bdev_io? Are there any other options to set CDW0? (One possible shape is
sketched after this list.)
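
To make the CDW0 idea concrete, here is one possible shape for an
extended completion call. The name and signature below are made up for
this discussion; only spdk_bdev_io_complete_nvme_status() from
spdk/bdev_module.h exists today:

/* Hypothetical extension of spdk_bdev_io_complete_nvme_status():
 * identical, except it also carries the value to place in dword 0
 * (CDW0) of the completion queue entry. */
void spdk_bdev_io_complete_nvme_status_ext(struct spdk_bdev_io *bdev_io,
                                           uint32_t cdw0, int sct, int sc);

/* A vendor-specific IO handler in a custom bdev could then complete a
 * request with a command-defined result in CDW0 like this: */
static void
my_vendor_io_done(struct spdk_bdev_io *bdev_io, uint32_t result)
{
        spdk_bdev_io_complete_nvme_status_ext(bdev_io, result,
                                              SPDK_NVME_SCT_GENERIC,
                                              SPDK_NVME_SC_SUCCESS);
}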

Thanks for your help.

-- Michael


* Re: [SPDK] NVMe vendor specific commands
@ 2019-09-09 21:42 Walker, Benjamin
From: Walker, Benjamin @ 2019-09-09 21:42 UTC (permalink / raw)
  To: spdk


On Thu, 2019-09-05 at 09:00 -0600, Michael Haeuptle wrote:
> Thanks for the reply, Ben.
> 
> A couple of follow-up questions/comments:
> - I assume the per-vendor-command callbacks are attached to the
> spdk_nvmf_subsystem. So the API would look like:
> spdk_nvmf_subsystem_add_admin_hdlr(spdk_nvmf_subsystem *ss, uint8_t
> vendor_cmd, (*hdlr_fct)(spdk_nvmf_cmd *cmd, spdk_nvmf_cpl *cpl, void *buf,
> uint32_t len))?

Yeah, something like that.

> 
> - One use case we have in mind is handling cage management requests. This
> means that we have to interact with another component within the handler
> (e.g. via sockets/epoll). So the flow/code would look similar to the IO
> path where we always do async processing
> (SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS). The hdlr_fct would need to
> use a completion callback to actually finish the request and post the reply.
> 
> - It seems that the admin command handling would follow the IO
> processing very closely, replacing the actual bdev processing with the
> hdlr_fct. On the surface this seems fairly straightforward, but I'm
> somewhat concerned that we'd be duplicating a lot of what's already
> there for the IO processing, especially when dealing with corner cases.
> What's your take on that? Any concerns? Do you think this is a
> small/medium effort?

I think it has to be off to the side to maintain any semblance of modularity. A
lot of the handlers will likely turn around and make calls into the bdev layer,
so they'll work out fine, but we can't know for sure what specifically the
handlers will need to do. I'm not that worried about corner cases here, and I
think that if any common operations begin to emerge, we can probably come up
with a generic way to provide them via the nvmf library.

> 
> - The API to set the handler is a little bit weird since it is outside of
> the current way of implementing bdevs... Maybe another approach would be to
> have a global function in the bdev module structure that is responsible for
> admin commands, with the subsystem as the context? But that would break the
> generic bdev abstraction even more.
> 
> Thanks.
> 
> -- Michael
> 
> On Wed, Sep 4, 2019 at 11:59 AM Walker, Benjamin <benjamin.walker(a)intel.com>
> wrote:
> 
> > On Fri, 2019-08-30 at 16:11 -0600, Michael Haeuptle wrote:
> > > Hello,
> > > 
> > > I have a couple of observations/questions regarding vendor specific
> > > commands over NVMf and I was hoping to get your take on this.
> > > 
> > > - I can get vendor-specific *IO* commands to work in my custom bdev by
> > > supporting SPDK_BDEV_IO_TYPE_NVME_IO. However, doing the same for *Admin*
> > > commands fails. Is there a reason why we are blocking the vendor-specific
> > > range of C0h-FFh for Admin commands (see
> > > spdk_nvmf_ctrlr_process_admin_cmd)?
> > 
> > I'm a bit torn on what to do about vendor specific commands. In the
> > NVMe-oF target, each bdev is treated as a namespace in a virtualized
> > controller. For physical NVMe devices, each namespace on the device
> > becomes a bdev. The issues all arise when a vendor specific command
> > performs some operation that targets the whole controller instead of
> > just a namespace. For example:
> > 
> > 1) It's entirely possible that a device with two namespaces places each
> > of those namespaces in an entirely different subsystem on the NVMe-oF
> > side. The hosts then think these namespaces aren't related to one
> > another. If a vendor specific command that changes some global state on
> > the SSD is allowed through, this can cause serious problems for the
> > other, unwitting subsystem.
> > 
> > 2) The NVMe-oF target allows multiple hosts to connect simultaneously,
> > emulating a separate NVMe controller for each. Even in the case where
> > the physical SSD has a single namespace, surfaced as a bdev and added to
> > a single subsystem, allowing a vendor specific command through may have
> > an impact on both controllers presented to the two hosts, but there's no
> > way for the NVMe-oF code to know that.
> > 
> > I think the reality is that all vendor-specific commands will likely
> > need to be handled in the NVMe-oF software emulation instead of simply
> > passing through to the backing device blindly. The cleanest way to do
> > this is probably to add a public API function to the nvmf library that
> > allows users to register a handler for a vendor-specific command;
> > spdk_nvmf_ctrlr_process_admin_cmd then looks up whether a handler is
> > present and calls it rather than just failing everything in the
> > vendor-specific range.
> > 
> > > - The current bdev API to complete an NVMe-specific request is
> > > spdk_bdev_io_complete_nvme_status(struct spdk_bdev_io *bdev_io, int sct,
> > > int sc), but it only takes the two status codes that are written to the
> > > status field of the completion queue entry. I would also like to set
> > > CDW0 of the completion queue entry. Are there any plans to support this,
> > > or do we want to keep the bdev API as front-end protocol agnostic as
> > > possible? I guess one way to support it would be to add an additional
> > > field to the bdev_io structure, like there is for the NVMe/SCSI-specific
> > > status code handling, but again, do we want to add more protocol
> > > specifics to bdev_io? Are there any other options to set CDW0?
> > > 
> > > Thanks for your help.
> > > 
> > > -- Michael



* Re: [SPDK] NVMe vendor specific commands
@ 2019-09-05 15:00 Michael Haeuptle
From: Michael Haeuptle @ 2019-09-05 15:00 UTC (permalink / raw)
  To: spdk


Thanks for the reply, Ben.

A couple of follow-up questions/comments:
- I assume the per-vendor-command callbacks are attached to the
spdk_nvmf_subsystem. So the API would look like:
spdk_nvmf_subsystem_add_admin_hdlr(spdk_nvmf_subsystem *ss, uint8_t
vendor_cmd, (*hdlr_fct)(spdk_nvmf_cmd *cmd, spdk_nvmf_cpl *cpl, void *buf,
uint32_t len))? (See the sketch after this list.)

- One use case we have in mind is handling cage management requests. This
means that we have to interact with another component within the handler
(e.g. via sockets/epoll). So the flow/code would look similar to the IO
path where we always do async processing
(SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS). The hdlr_fct would need to
use a completion callback to actually finish the request and post the reply.

- It seems that the admin command handling would follow the IO
processing very closely, replacing the actual bdev processing with the
hdlr_fct. On the surface this seems fairly straightforward, but I'm
somewhat concerned that we'd be duplicating a lot of what's already
there for the IO processing, especially when dealing with corner cases.
What's your take on that? Any concerns? Do you think this is a
small/medium effort?

- The API to set the handler is a little bit weird since it is outside of
the current way of implementing bdevs... Maybe another approach would be to
have a global function in the bdev module structure that is responsible for
admin commands, with the subsystem as the context? But that would break the
generic bdev abstraction even more.
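
To make this concrete, here is roughly the shape I am picturing for the
registration API and the async cage management flow. All of these names
(spdk_nvmf_subsystem_add_admin_hdlr, cage_mgmt_submit, and so on) are
invented for this discussion, nothing here exists in SPDK today, and the
sketch glosses over the nvmf request internals:

/* Invented registration API: one handler per vendor-specific admin
 * opcode, attached to the subsystem. */
typedef void (*spdk_nvmf_vendor_admin_hdlr)(struct spdk_nvmf_request *req,
                                            void *buf, uint32_t len);

int spdk_nvmf_subsystem_add_admin_hdlr(struct spdk_nvmf_subsystem *ss,
                                       uint8_t vendor_opc,
                                       spdk_nvmf_vendor_admin_hdlr hdlr);

/* Completion callback, invoked from our socket/epoll event when the
 * cage management component replies. This mirrors how the IO path
 * finishes requests that returned
 * SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS. */
static void
cage_mgmt_done(void *ctx, int status)
{
        struct spdk_nvmf_request *req = ctx;
        struct spdk_nvme_cpl *cpl = &req->rsp->nvme_cpl;

        cpl->status.sct = SPDK_NVME_SCT_GENERIC;
        cpl->status.sc = (status == 0) ? SPDK_NVME_SC_SUCCESS
                                       : SPDK_NVME_SC_INTERNAL_DEVICE_ERROR;
        spdk_nvmf_request_complete(req);
}

/* The handler itself only kicks off the exchange with the external
 * component (cage_mgmt_submit() stands in for our socket/epoll
 * plumbing) and returns without completing the request. */
static void
cage_mgmt_hdlr(struct spdk_nvmf_request *req, void *buf, uint32_t len)
{
        cage_mgmt_submit(buf, len, cage_mgmt_done, req);
}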

Thanks.

-- Michael

On Wed, Sep 4, 2019 at 11:59 AM Walker, Benjamin <benjamin.walker(a)intel.com>
wrote:

> On Fri, 2019-08-30 at 16:11 -0600, Michael Haeuptle wrote:
> > Hello,
> >
> > I have a couple of observations/questions regarding vendor specific
> > commands over NVMf and I was hoping to get your take on this.
> >
> > - I can get vendor-specific *IO* commands to work in my custom bdev by
> > supporting SPDK_BDEV_IO_TYPE_NVME_IO. However, doing the same for *Admin*
> > commands fails. Is there a reason why we are blocking the vendor-specific
> > range of C0h-FFh for Admin commands (see
> > spdk_nvmf_ctrlr_process_admin_cmd)?
>
> I'm a bit torn on what to do about vendor specific commands. In the NVMe-oF
> target, each bdev is treated as a namespace in a virtualized controller. For
> physical NVMe devices, each namespace on the device becomes a bdev. The
> issues all arise when a vendor specific command performs some operation that
> targets the whole controller instead of just a namespace. For example:
>
> 1) It's entirely possible that a device with two namespaces places each of
> those namespaces in an entirely different subsystem on the NVMe-oF side.
> The hosts then think these namespaces aren't related to one another. If a
> vendor specific command that changes some global state on the SSD is
> allowed through, this can cause serious problems for the other, unwitting
> subsystem.
>
> 2) The NVMe-oF target allows multiple hosts to connect simultaneously,
> emulating a separate NVMe controller for each. Even in the case where the
> physical SSD has a single namespace, surfaced as a bdev and added to a
> single subsystem, allowing a vendor specific command through may have an
> impact on both controllers presented to the two hosts, but there's no way
> for the NVMe-oF code to know that.
>
> I think the reality is that all vendor-specific commands will likely need
> to be handled in the NVMe-oF software emulation instead of simply passing
> through to the backing device blindly. The cleanest way to do this is
> probably to add a public API function to the nvmf library that allows
> users to register a handler for a vendor-specific command;
> spdk_nvmf_ctrlr_process_admin_cmd then looks up whether a handler is
> present and calls it rather than just failing everything in the
> vendor-specific range.
>
> > - The current bdev API to complete an NVMe-specific request is
> > spdk_bdev_io_complete_nvme_status(struct spdk_bdev_io *bdev_io, int sct,
> > int sc), but it only takes the two status codes that are written to the
> > status field of the completion queue entry. I would also like to set CDW0
> > of the completion queue entry. Are there any plans to support this, or do
> > we want to keep the bdev API as front-end protocol agnostic as possible?
> > I guess one way to support it would be to add an additional field to the
> > bdev_io structure, like there is for the NVMe/SCSI-specific status code
> > handling, but again, do we want to add more protocol specifics to
> > bdev_io? Are there any other options to set CDW0?
> >
> > Thanks for your help.
> >
> > -- Michael


* Re: [SPDK] NVMe vendor specific commands
@ 2019-09-04 17:59 Walker, Benjamin
From: Walker, Benjamin @ 2019-09-04 17:59 UTC (permalink / raw)
  To: spdk


On Fri, 2019-08-30 at 16:11 -0600, Michael Haeuptle wrote:
> Hello,
> 
> I have a couple of observations/questions regarding vendor specific
> commands over NVMf and I was hoping to get your take on this.
> 
> - I can get vendor-specific *IO* commands to work in my custom bdev by
> supporting SPDK_BDEV_IO_TYPE_NVME_IO. However, doing the same for *Admin*
> commands fails. Is there a reason why we are blocking the vendor-specific
> range of C0h-FFh for Admin commands (see spdk_nvmf_ctrlr_process_admin_cmd)?

I'm a bit torn on what to do about vendor specific commands. In the NVMe-oF
target, each bdev is treated as a namespace in a virtualized controller. For
physical NVMe devices, each namespace on the device becomes a bdev. The issues
all arise when a vendor specific command performs some operation that targets
the whole controller instead of just a namespace. For example:

1) It's entirely possible that a device with two namespaces places each of those
namespaces in an entirely different subsystem on the NVMe-oF side. The hosts
then think these namespaces aren't related to one another. If a vendor specific
command that changes some global state on the SSD is allowed through, this can
cause serious problems for the other, unwitting subsystem.

2) The NVMe-oF target allows multiple hosts to connect simultaneously, emulating
a separate NVMe controller for each. Even in the case where the physical SSD has
a single namespace, surfaced as a bdev and added to a single subsystem, allowing
a vendor specific command through may have an impact on both controllers
presented to the two hosts, but there's no way for the NVMe-oF code to know
that.

I think the reality is that all vendor-specific commands will likely need to be
handled in the NVMe-oF software emulation instead of simply passing through to
the backing device blindly. The cleanest way to do this is probably to add a
public API function to the nvmf library that allows users to register a handler
for a vendor-specific command; spdk_nvmf_ctrlr_process_admin_cmd then looks up
whether a handler is present and calls it rather than just failing everything
in the vendor-specific range.
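
As a very rough sketch of the dispatch side (invented names throughout;
in particular, the per-subsystem vendor_admin_hdlrs table does not exist
today), spdk_nvmf_ctrlr_process_admin_cmd could do something like this
instead of failing the command outright:

/* Sketch only: consult a per-subsystem handler table for opcodes in
 * the vendor-specific admin range (C0h-FFh). */
static int
nvmf_ctrlr_process_vendor_admin_cmd(struct spdk_nvmf_request *req)
{
        struct spdk_nvmf_subsystem *subsys = req->qpair->ctrlr->subsys;
        struct spdk_nvme_cmd *cmd = &req->cmd->nvme_cmd;
        struct spdk_nvme_cpl *rsp = &req->rsp->nvme_cpl;
        void (*hdlr)(struct spdk_nvmf_request *req, void *buf,
                     uint32_t len);

        assert(cmd->opc >= 0xC0);
        hdlr = subsys->vendor_admin_hdlrs[cmd->opc - 0xC0];
        if (hdlr == NULL) {
                /* No handler registered: fail as we do today. */
                rsp->status.sct = SPDK_NVME_SCT_GENERIC;
                rsp->status.sc = SPDK_NVME_SC_INVALID_OPCODE;
                return SPDK_NVMF_REQUEST_EXEC_STATUS_COMPLETE;
        }

        /* The handler is responsible for completing the request,
         * possibly asynchronously. */
        hdlr(req, req->data, req->length);
        return SPDK_NVMF_REQUEST_EXEC_STATUS_ASYNCHRONOUS;
}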

> - The current bdev API to complete an NVMe-specific request is
> spdk_bdev_io_complete_nvme_status(struct spdk_bdev_io *bdev_io, int sct,
> int sc), but it only takes the two status codes that are written to the
> status field of the completion queue entry. I would also like to set CDW0
> of the completion queue entry. Are there any plans to support this, or do
> we want to keep the bdev API as front-end protocol agnostic as possible?
> I guess one way to support it would be to add an additional field to the
> bdev_io structure, like there is for the NVMe/SCSI-specific status code
> handling, but again, do we want to add more protocol specifics to
> bdev_io? Are there any other options to set CDW0?
> 
> Thanks for your help.
> 
> -- Michael

