Hi Shahar, Thank you for reply to my feedback. I’m sorry but I don’t have enough to describe about the following now, I will read your update and If I have anything hereafter I’ll update to you. Thanks, Shuhei > On May 10, 2018, at 22:33, Shahar Salzman wrote: > > Hi Shuhei, > > You raise some good comments. > If possible, I'd also like feadback on bdev_user.h and how you envision it. > This is probably the most important aspect, and everything else is the gritty details. > > Shahar > From: SPDK on behalf of 松本周平 / MATSUMOTO,SHUUHEI > Sent: Thursday, May 10, 2018 10:36:26 AM > To: spdk(a)lists.01.org > Subject: Re: [SPDK] SPDK + user space appliance > > Hi Shahar, > > I wrote a comment in your patch. > > Unfortunately the value is not be clear for me yet. > I may understand if you redirect IO from SPDK to application in the layer upon the bdev layer, that is nvmf passthrough without bdev. > However if you go through bdev layer, conversion from bdev_io to alternative software protocol will be done. > > I'm happy if you revise me when I'm wrong. > > Thanks, > Shuhei > > > 差出人: 松本周平 / MATSUMOTO,SHUUHEI > 送信日時: 2018年5月10日 15:34 > 宛先: spdk(a)lists.01.org > 件名: RE: [SPDK] SPDK + user space appliance > > Hi Shahar, > > First I apologize I didn't look the patch in detail before reply. > > I know updating SPDK from 17.07 to 18.01 is not easy task. > However creating own framework only on the DPDK lockless ring looks cumbersome for me. > > Thanks for your interesting proposal again and I'm sorry that I can't do enough effort to the communitization in this direction immediately. > > Thanks, > Shuhei > > 差出人: SPDK が Shahar Salzman の代理で送信 > 送信日時: 2018年5月10日 4:45 > 宛先: spdk(a)lists.01.org > 件名: [!]Re: [SPDK] SPDK + user space appliance > > Hi Ben, > > You make a good point. > We are looking for a stable API, so given a stable API which is not under bdev_internal, I can port the entire bdev_user module into my glue API, and use it as my spdk gateway. > But there is a benefit to pushing this to the community, since it would seem that the semantics (e.g. register_device, submit/poll for IO, complete IO) are very similar to those attempting to integrate spdk to a storage appliance (I would love to hear from Shuhei Matsumoto or Eyal Harari which seem to be attempting a similar integration of spdk to their products). > > Using this sort of API is much easier than having to understand an elaborate bdev API, spdk threading concerns, reactor issues etc. > > In addition, since the semantics are similar, I would imagine that if each user writes their own integration module, we end up with multple versions of the same code. > All parties eventually benefit from the "communitization" since we can openly discuss the interface, and code quality of a common module should be much better than any "private" copy. > > I also agree, that if this module goes upstream it needs to be paired with a good unit test running in CI so that as the internal interface evolves and as bdev changes, module breakages are easily detected, otherwise issues need to be solved upon integrating a new version. > > Shahar > > > From: SPDK on behalf of Walker, Benjamin > Sent: Wednesday, May 9, 2018 7:52:38 PM > To: spdk(a)lists.01.org > Subject: Re: [SPDK] SPDK + user space appliance > > On Wed, 2018-05-09 at 07:48 +0000, Shahar Salzman wrote: > > Hi Ben, > > > > To answer your question first, the main reason to have a generic spdk IO API > > for storage appliance is being able to maintain functionality over versions. > > We started with our module in version 17.07, but when we started integrating > > it into 17.07.01, and the 18.X version we started running into issues. Since > > the bdev layer is internal to spdk, it makes sense that it will change both in > > functionality and in interface from time to time especially in NVMeF where the > > specifications are so young, which is the reason that we would like to have an > > API which is more stable over versions. > > As a storage appliance, the requirements are fairly stable, and at least in > > the case of SCSI, a thin "glue API" had been enough to get the core appliance > > work with multiple frontend targets (e.g. both SCSI and internal replication). > > I believe that the same method can be applied to NVMEF. > > In addition, I think that our use case is not unique to the way we are > > integrating spdk, which means that the work we put in can benefit other users, > > and hopefully via their experience, the code can become more robust and > > useful. > > Thank you for this - now everything is much clearer. I totally agree with what > you're saying about the API being unstable and internal. That makes it quite > difficult to write a bdev module that continues to function across releases. > However, I don't think trading one API (the bdev module API) for another (the > proposed bdev user API) fixes anything at a fundamental level. This is because > the API changes we've made to the bdev module API are necessary to correctly > implement some types of bdev modules (although probably not yours currently). > Over time, you'll just end up propagating all of those changes into the newly > proposed bdev user API in order to make it suitable for general use and we won't > be in any better of a spot than we are today. > > I do think this is a problem that needs to be solved as soon as possible though. > The solution, in my mind, is to start the work of defining a public bdev module > API in include/spdk/bdev_module.h. It will take some work to define the API in a > way that as many of the implementation details are hidden as possible, but it is > worth the effort. Most of the module API currently resides in > include/spdk_internal/bdev.h, but there is a lot of stuff in there that doesn't > need to become public. > > Once we have a well defined public interface, we can take steps to avoid > breaking it going forward. I'm giving a talk on API stability at the SPDK summit > next week that will touch on some of the strategies we plan to employ to keep > APIs stable over time. > > I'll wait for more feedback to see if we can build some consensus about the best > path forward here. I'm interested to see if there are any other perspectives > that may view this a bit differently. > > > > > Shahar > > From: SPDK on behalf of Walker, Benjamin > .walker(a)intel.com> > > Sent: Tuesday, May 8, 2018 8:30:20 PM > > To: spdk(a)lists.01.org > > Subject: Re: [SPDK] SPDK + user space appliance > > > > On Tue, 2018-05-08 at 07:36 +0000, Shahar Salzman wrote: > > > Hi Jim and Ben, > > > > > > For the threading issue, I agree that there is something not very clean in > > the > > > interface, as there is an assumption on how the user implements it. As I did > > > in the bdev_user_example, we also use a ring in order to place all the > > > incoming IO without delaying the reactor, and then use multiple pollers to > > > actually handle the IO (deduplication, compression, HA etc.). This is why > > > there are 2 distinct interfaces - submit_io callback, and > > > the bdev_user_submit_completion interface which (normally) is called on > > > another thread (not the original poller), and passed back to the reactor via > > > the completion queue on the bdev_user_io_channel, and the registered poller > > > thread which takes from the user completion queue. > > > Do you think that a cleaner interface would be modifying the submit_io > > > callback to a poll_io interface which checks a bdev_user internal ring for > > IO? > > > Or do you think that the current interface is OK provided good > > documentation? > > > > > > Regarding the spdk_call_unaffinitized, I am currently using spdk_event_call > > in > > > order to register my volumes, I don't really like this since it forces me to > > > (eventually) add another async callback in my app to verify that device > > > registration was successful (and this just adds more conditions, futures > > etc. > > > in the application). Is there a way to call spdk interfaces directly with a > > > "non-spdk" thread (i.e. TLS is not initialized)? > > > > I'm not so much concerned yet with the interface you've defined, but rather > > understanding the whole approach at a high level. The SPDK bdev layer is > > designed for custom bdev modules to be added, so my primary question is why > > write a generic bdev_user module as opposed to writing a "your custom storage > > backend" module? I think this is the key piece, and understanding the process > > you went through as you designed this will probably yield a whole bunch of > > good > > improvements to the current bdev module system. > > > > Thanks, > > Ben > > > > > > > > > > Hope this answers the questions, > > > Shahar > > > > > > > > > From: SPDK on behalf of Harris, James R > > > > > > Sent: Monday, May 7, 2018 9:18:20 PM > > > To: Storage Performance Development Kit > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > There are also calls such as spdk_call_unaffinitized() and > > > spdk_unaffinitize_thread() which have been added to enable cases where a > > bdev > > > module may need to spawn non-polling threads and don’t want those threads to > > > inherit the affinity of the calling thread. The SPDK rbd module currently > > > uses these (see git commit fa5206c4) since rbd_open is a blocking call. > > (Note > > > that librbd does now support rbd_aio_open which is better suited for SPDK.) > > > > > > -Jim > > > > > > > > > On 5/7/18, 11:02 AM, "SPDK on behalf of Walker, Benjamin" > s. > > > 01.org on behalf of benjamin.walker(a)intel.com> wrote: > > > > > > Hi Shahar, > > > > > > Thank you for submitting the patch. I've looked through it in detail and > > I > > > think > > > I understand the purpose of this code, but I'm going to explain it back > > to > > > you > > > so you can correct me where I'm wrong. > > > > > > I think this code solves two distinct problems: > > > > > > 1) You need to forward I/O out of the bdev layer to some custom backend, > > > and you > > > want the code that does that to live outside of the SPDK repository. > > > > > > 2) Your custom back-end library isn't suitable for use in a run-to- > > > completion > > > model. By that I mean that you can't just call your library directly on > > > the > > > thread that originally receives the spdk_bdev_io request because your > > > library > > > either blocks or generally takes too long to return from the submission > > > call > > > (maybe it is doing inline compression or something). Instead, you need > > to > > > shuttle those requests off to separate threads for handling. > > > > > > As far as point #1, today the SPDK build system does not nicely > > > accommodate bdev > > > modules whose code lives outside of SPDK. SPDK expects them to be in > > > lib/bdev/. However, that's a fairly straightforward change > > to > > > the > > > build system and it's one we've been intending to make for some time. > > > > > > For point #2, this is likely the case for a large number of storage > > back- > > > ends, > > > but I think the proper way to solve it is probably back-end specific and > > > not > > > general purpose. As a counter-point, SPDK already integrates with a > > number > > > of > > > third-party storage back-ends today (Ceph RBD, libiscsi, libaio, etc.) > > and > > > none > > > of those ended up needing to pass messages to other threads. They all > > > support > > > asynchronous operations, though. I could imagine writing a bdev module > > > that > > > ultimately makes POSIX preadv calls, for instance. That would need to be > > > implemented with a thread pool and each bdev_io gets funneled off to a > > > thread in > > > the pool to perform the blocking operation. > > > > > > Ok - I explained what I think I'm understanding. Now tell me where I > > went > > > wrong > > > :) > > > > > > Thanks, > > > Ben > > > > > > On Sun, 2018-05-06 at 10:32 +0000, Shahar Salzman wrote: > > > > Hi, > > > > > > > > I pushed the code for review, thanks Daniel for the help. > > > > > > > > In a nutshell: > > > > - bdev_user - an API for a user appliance to use spdk as an > > iSCSI/NVMeF > > > target > > > > - bdev_user_example - reference application > > > > - The API relies on rings in order to submit/complete IOs > > > > - User appliance registers callbacks for submit_io (should we have > > > > read/write/other instead?) > > > > - User appliance registers its devices so that they may be added to an > > > > existing namespace (I am using RPC to do the management) > > > > > > > > Thanks, > > > > Shahar > > > > > > > > > > > > From: SPDK on behalf of Verkamp, Daniel > > > > > > rkamp(a)intel.com> > > > > Sent: Thursday, May 3, 2018 8:50 PM > > > > To: Storage Performance Development Kit > > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > > > Hi Shahar, > > > > > > > > The target branch for the push should be ‘refs/for/master’, not > > ‘master’ > > > – if > > > > you configured a remote as specified in http://www.spdk.io/development > SPDK Development > www.spdk.io > The Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into userspace and operating in a polled mode instead of relying on interrupts, which avoids kernel context switches and ... > > > > / > > > it > > > > should look like: > > > > > > > > [remote "review"] > > > > url = https://review.gerrithub.io/spdk/spdk > > > > push = HEAD:refs/for/master > > > > > > > > From: SPDK [mailto:spdk-bounces(a)lists.01.org ] On Behalf Of Shahar > > > Salzman > > > > Sent: Thursday, May 3, 2018 1:00 AM > > > > To: Storage Performance Development Kit > > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > > > Hi Ben, > > > > > > > > I have the code ready for review (spdk/master on dpdk/18.02), but I do > > > not > > > > have push rights for gerrithub: > > > > shahar.salzman(a)shahars-vm:~/Kaminario/git/spdk$ git push spdk-review > > > > HEAD:master > > > > Password for 'https://ShaharSalzman-K(a)review.gerrithub.io ': > > > > Counting objects: 109, done. > > > > Compressing objects: 100% (22/22), done. > > > > Writing objects: 100% (22/22), 8.70 KiB | 0 bytes/s, done. > > > > Total 22 (delta 14), reused 0 (delta 0) > > > > remote: Resolving deltas: 100% (14/14) > > > > remote: Branch refs/heads/master: > > > > remote: You are not allowed to perform this operation. > > > > remote: To push into this reference you need 'Push' rights. > > > > remote: User: ShaharSalzman-K > > > > remote: Please read the documentation and contact an administrator > > > > remote: if you feel the configuration is incorrect > > > > remote: Processing changes: refs: 1, done > > > > To https://ShaharSalzman-K(a)review.gerrithub.io/a/spdk/spdk > > > > ! [remote rejected] HEAD -> master (prohibited by Gerrit: ref update > > > access > > > > denied) > > > > error: failed to push some refs to ' https://clicktime.symantec.com/a/1/VcP2v4HJWwYtLlvp3_LcgSsqhmYOOUAZscJQHE8uNvI=?d=ESGMfx885Mon-FYabmhD7ls3wtWmBZW9kWgryeq1SISrgpYjCOb4IS6lwYKSCrvtHc7CEvBRgFQdII_hRkKdEHtIPw-_l-2vbpSf4mgn8dWJRwfogT1dkkd4aplrKU00CzI0rDhCgZT67GHdJ7NowVIQ_12RIr45uXe3dXDgV1slShko1C3-ayHaghRMgJQoGQz5ewaurDC42wc7_w1BEs8oj1QoSKsDNxm9-hFs4L9a-XshONMr1SxTXIt9EIfUvWoaFlCtJeJikXq2XnYc-tunDEHEykqgMCz_AWlpwA4UE5RVCmkekk-Fuxjm4UYnc6ggHScc1xqmGn-WwsISBpjmD_tGzsoKqRuVhiV01CId-_wCmWsv-uXcoIhW0e_JPIuV6LLTZhTPZ8JMljeezHgswSjkLaVkX1U3lDxg25b69Q%3D%3D&u=https%3A%2F%2FShaharSalzman-K%40review.ger > > ri > > > thub.i > > > > o/a/spdk/spdk' > > > > > > > > Am I doing something incorrect, or is this just a permission issue? > > > > > > > > Thanks, > > > > Shahar > > > > From: SPDK on behalf of Shahar Salzman > > > > > > zman(a)kaminario.com> > > > > Sent: Wednesday, April 25, 2018 9:02:38 AM > > > > To: Storage Performance Development Kit > > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > > > Hi Ben, > > > > > > > > The code is currently working on v17.07, we are planning on bumping > > the > > > > version to one of the latest stable versions (18.01?) + master. > > > > It will take me (hopefully) a few days to update the code and have our > > > > internal CI start running on this version, not sure it would be > > useful, > > > but I > > > > can get our working 17.07 code (+ reference application) for review > > much > > > > faster. > > > > What is the best course of action? > > > > > > > > Shahar > > > > From: SPDK on behalf of Walker, Benjamin > > > > > > .walker(a)intel.com> > > > > Sent: Tuesday, April 24, 2018 7:19:12 PM > > > > To: Storage Performance Development Kit > > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > > > Hi Shahar, > > > > > > > > Would you be willing to submit your bdev module as a patch on > > GerritHub? > > > That > > > > way everyone can take a look and provide feedback. If you don’t want > > it > > > to run > > > > the tests, you can put [RFC] and the beginning of the commit message. > > > > > > > > Thanks, > > > > Ben > > > > > > > > From: SPDK [mailto:spdk-bounces(a)lists.01.org ] On Behalf Of Shahar > > > Salzman > > > > Sent: Monday, April 23, 2018 8:45 AM > > > > To: spdk(a)lists.01.org > > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > > > Hi Ben, > > > > > > > > Bumping this thread since I've been having some new thoughts on the > > > issue now > > > > that we are starting integration with newer spdk versions. > > > > Unfortunately the merge isn't as smooth as I'd like it to be since the > > > bdev > > > > module is pretty tightly integrated into spdk, perhaps we made some > > > false > > > > assumptions writing the module, but it seems some of the newer spdk > > > features > > > > are complicating the integration. > > > > My question is, if this passthrough module is useful, wouldn't it be > > > better to > > > > maintain it as part of spdk so that we can catch issues as soon as > > they > > > show > > > > up? > > > > We would be happy to help with maintaining this module, the module > > with > > > is > > > > currently part of our CI with our "frozen" spdk version, but once > > > integrated > > > > into the newer version we choose, I'll add it to the CI our CI as > > well. > > > > > > > > Shahar > > > > From: SPDK on behalf of Walker, Benjamin > > > > > > .walker(a)intel.com> > > > > Sent: Friday, February 2, 2018 11:43:58 PM > > > > To: spdk(a)lists.01.org > > > > Subject: Re: [SPDK] SPDK + user space appliance > > > > > > > > On Thu, 2018-02-01 at 08:29 +0000, Shahar Salzman wrote: > > > > > Hi Ben, > > > > > > > > > > Would you also like to take a look at the bdev_user module? > > > > > It still needs some patching (as some of the stuff is still hard > > > coded), but > > > > I > > > > > think we can get most of it cleaned up in a couple of days. > > > > > > > > > > In any case, is it the intention that the user write his own bdev > > > module, or > > > > > would this user appliance glue be a useful generic module? > > > > > > > > For existing storage stacks that serve block I/O, like the internals > > of > > > a SAN, > > > > the idea is that you write your own bdev module to forward I/O coming > > > out of > > > > the > > > > SPDK bdev layer. Then you can use the SPDK iSCSI/NVMe-oF/vhost targets > > > mostly > > > > as-is. > > > > > > > > In some cases, the actual iSCSI/NVMe-oF/vhost target applications > > won't > > > > integrate nicely directly into an existing storage application because > > > they > > > > spawn their own threads and allocate their own memory. To support > > that, > > > the > > > > libraries may be consumed directly instead of the applications > > > (lib/iscsi, > > > > lib/scsi, lib/nvmf, etc.). The libraries don't spawn any of their own > > > threads, > > > > but instead rely on SPDK's abstractions in include/spdk/io_channel.h. > > > See > > > > > > > > http://www.spdk.io/doc/concurrency.html > > > > > > > > We don't currently have a way to write a custom bdev module that > > resides > > > > outside > > > > of the SPDK source tree, but it's very possible to add support for > > that. > > > But > > > > beyond that inconvenience (just drop your module in lib/bdev for now), > > > writing > > > > a > > > > bdev module is the recommended way of interacting with the bottom end > > of > > > the > > > > SPDK bdev layer. I think that's what you really want to be doing in > > your > > > code, > > > > from what I can tell. > > > > > > > > I hope that helps! > > > > _______________________________________________ > > > > SPDK mailing list > > > > SPDK(a)lists.01.org > > > > https://lists.01.org/mailman/listinfo/spdk > > > > > > > > _______________________________________________ > > > > SPDK mailing list > > > > SPDK(a)lists.01.org > > > > https://lists.01.org/mailman/listinfo/spdk > > > _______________________________________________ > > > SPDK mailing list > > > SPDK(a)lists.01.org > > > https://lists.01.org/mailman/listinfo/spdk > > > > > > > > > _______________________________________________ > > > SPDK mailing list > > > SPDK(a)lists.01.org > > > https://lists.01.org/mailman/listinfo/spdk > > > _______________________________________________ > > > SPDK mailing list > > > SPDK(a)lists.01.org > > > https://lists.01.org/mailman/listinfo/spdk > > _______________________________________________ > > SPDK mailing list > > SPDK(a)lists.01.org > > https://lists.01.org/mailman/listinfo/spdk > > _______________________________________________ > > SPDK mailing list > > SPDK(a)lists.01.org > > https://lists.01.org/mailman/listinfo/spdk > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk