From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============2307146270999611955==" MIME-Version: 1.0 From: Walker, Benjamin Subject: Re: [SPDK] SPDK + user space appliance Date: Tue, 08 May 2018 17:30:20 +0000 Message-ID: <1525800619.2784.26.camel@intel.com> In-Reply-To: AM3PR04MB370C7D78E1593651ACAB0B4899A0@AM3PR04MB370.eurprd04.prod.outlook.com List-ID: To: spdk@lists.01.org --===============2307146270999611955== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable On Tue, 2018-05-08 at 07:36 +0000, Shahar Salzman wrote: > Hi Jim and Ben, > = > For the threading issue, I agree that there is something not very clean i= n the > interface, as there is an assumption on how the user implements it. As I = did > in the bdev_user_example, we also use a ring in order to place all the > incoming IO without delaying the reactor, and then use multiple pollers to > actually handle the IO (deduplication, compression, HA etc.). This is why > there are 2 distinct interfaces - submit_io callback, and > the bdev_user_submit_completion interface which (normally) is called on > another thread (not the original poller), and passed back to the reactor = via > the completion queue on the bdev_user_io_channel, and the registered poll= er > thread which takes from the user completion queue. > Do you think that a cleaner interface would be modifying the submit_io > callback to a poll_io interface which checks a bdev_user internal ring fo= r IO? > Or do you think that the current interface is OK provided good documentat= ion? > = > Regarding the spdk_call_unaffinitized, I am currently using spdk_event_ca= ll in > order to register my volumes, I don't really like this since it forces me= to > (eventually) add another async callback in my app to verify that device > registration was successful (and this just adds more conditions, futures = etc. > in the application). Is there a way to call spdk interfaces directly with= a > "non-spdk" thread (i.e. TLS is not initialized)? I'm not so much concerned yet with the interface you've defined, but rather understanding the whole approach at a high level. The SPDK bdev layer is designed for custom bdev modules to be added, so my primary question is why write a generic bdev_user module as opposed to writing a "your custom stora= ge backend" module? I think this is the key piece, and understanding the proce= ss you went through as you designed this will probably yield a whole bunch of = good improvements to the current bdev module system. Thanks, Ben > = > Hope this answers the questions, > Shahar > From: SPDK on behalf of Harris, James R > > Sent: Monday, May 7, 2018 9:18:20 PM > To: Storage Performance Development Kit > Subject: Re: [SPDK] SPDK + user space appliance > = > There are also calls such as spdk_call_unaffinitized() and > spdk_unaffinitize_thread() which have been added to enable cases where a = bdev > module may need to spawn non-polling threads and don=E2=80=99t want those= threads to > inherit the affinity of the calling thread. The SPDK rbd module currently > uses these (see git commit fa5206c4) since rbd_open is a blocking call. = (Note > that librbd does now support rbd_aio_open which is better suited for SPDK= .) > = > -Jim > = > = > On 5/7/18, 11:02 AM, "SPDK on behalf of Walker, Benjamin" 01.org on behalf of benjamin.walker(a)intel.com> wrote: > = > Hi Shahar, > = > Thank you for submitting the patch. I've looked through it in detail = and I > think > I understand the purpose of this code, but I'm going to explain it ba= ck to > you > so you can correct me where I'm wrong. > = > I think this code solves two distinct problems: > = > 1) You need to forward I/O out of the bdev layer to some custom backe= nd, > and you > want the code that does that to live outside of the SPDK repository. > = > 2) Your custom back-end library isn't suitable for use in a run-to- > completion > model. By that I mean that you can't just call your library directly = on > the > thread that originally receives the spdk_bdev_io request because your > library > either blocks or generally takes too long to return from the submissi= on > call > (maybe it is doing inline compression or something). Instead, you nee= d to > shuttle those requests off to separate threads for handling. > = > As far as point #1, today the SPDK build system does not nicely > accommodate bdev > modules whose code lives outside of SPDK. SPDK expects them to be in > lib/bdev/. However, that's a fairly straightforward chan= ge to > the > build system and it's one we've been intending to make for some time. > = > For point #2, this is likely the case for a large number of storage b= ack- > ends, > but I think the proper way to solve it is probably back-end specific = and > not > general purpose. As a counter-point, SPDK already integrates with a n= umber > of > third-party storage back-ends today (Ceph RBD, libiscsi, libaio, etc.= ) and > none > of those ended up needing to pass messages to other threads. They all > support > asynchronous operations, though. I could imagine writing a bdev module > that > ultimately makes POSIX preadv calls, for instance. That would need to= be > implemented with a thread pool and each bdev_io gets funneled off to a > thread in > the pool to perform the blocking operation. > = > Ok - I explained what I think I'm understanding. Now tell me where I = went > wrong > :) > = > Thanks, > Ben > = > On Sun, 2018-05-06 at 10:32 +0000, Shahar Salzman wrote: > > Hi, > > = > > I pushed the code for review, thanks Daniel for the help. > > = > > In a nutshell: > > - bdev_user - an API for a user appliance to use spdk as an iSCSI/N= VMeF > target > > - bdev_user_example - reference application > > - The API relies on rings in order to submit/complete IOs > > - User appliance registers callbacks for submit_io (should we have > > read/write/other instead?) > > - User appliance registers its devices so that they may be added to= an > > existing namespace (I am using RPC to do the management) > > = > > Thanks, > > Shahar > > = > > = > > From: SPDK on behalf of Verkamp, Dani= el > > rkamp(a)intel.com> > > Sent: Thursday, May 3, 2018 8:50 PM > > To: Storage Performance Development Kit > > Subject: Re: [SPDK] SPDK + user space appliance > > = > > Hi Shahar, > > = > > The target branch for the push should be =E2=80=98refs/for/master= =E2=80=99, not =E2=80=98master=E2=80=99 > =E2=80=93 if > > you configured a remote as specified in http://www.spdk.io/developm= ent/ > it > > should look like: > > = > > [remote "review"] > > url =3D https://review.gerrithub.io/spdk/spdk > > push =3D HEAD:refs/for/master > > = > > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar > Salzman > > Sent: Thursday, May 3, 2018 1:00 AM > > To: Storage Performance Development Kit > > Subject: Re: [SPDK] SPDK + user space appliance > > = > > Hi Ben, > > = > > I have the code ready for review (spdk/master on dpdk/18.02), but I= do > not > > have push rights for gerrithub: > > shahar.salzman(a)shahars-vm:~/Kaminario/git/spdk$ git push spdk-rev= iew > > HEAD:master > > Password for 'https://ShaharSalzman-K(a)review.gerrithub.io': = > > Counting objects: 109, done. > > Compressing objects: 100% (22/22), done. > > Writing objects: 100% (22/22), 8.70 KiB | 0 bytes/s, done. > > Total 22 (delta 14), reused 0 (delta 0) > > remote: Resolving deltas: 100% (14/14) > > remote: Branch refs/heads/master: > > remote: You are not allowed to perform this operation. > > remote: To push into this reference you need 'Push' rights. > > remote: User: ShaharSalzman-K > > remote: Please read the documentation and contact an administrator > > remote: if you feel the configuration is incorrect > > remote: Processing changes: refs: 1, done = > > To https://ShaharSalzman-K(a)review.gerrithub.io/a/spdk/spdk > > ! [remote rejected] HEAD -> master (prohibited by Gerrit: ref upda= te > access > > denied) > > error: failed to push some refs to 'https://ShaharSalzman-K(a)revie= w.gerri > thub.i > > o/a/spdk/spdk' > > = > > Am I doing something incorrect, or is this just a permission issue? > > = > > Thanks, > > Shahar > > From: SPDK on behalf of Shahar Salzman > > zman(a)kaminario.com> > > Sent: Wednesday, April 25, 2018 9:02:38 AM > > To: Storage Performance Development Kit > > Subject: Re: [SPDK] SPDK + user space appliance > > = > > Hi Ben, > > = > > The code is currently working on v17.07, we are planning on bumping= the > > version to one of the latest stable versions (18.01?) + master. > > It will take me (hopefully) a few days to update the code and have = our > > internal CI start running on this version, not sure it would be use= ful, > but I > > can get our working 17.07 code (+ reference application) for review= much > > faster. > > What is the best course of action? > > = > > Shahar > > From: SPDK on behalf of Walker, Benja= min > > .walker(a)intel.com> > > Sent: Tuesday, April 24, 2018 7:19:12 PM > > To: Storage Performance Development Kit > > Subject: Re: [SPDK] SPDK + user space appliance > > = > > Hi Shahar, > > = > > Would you be willing to submit your bdev module as a patch on Gerri= tHub? > That > > way everyone can take a look and provide feedback. If you don=E2=80= =99t want it > to run > > the tests, you can put [RFC] and the beginning of the commit messag= e. > > = > > Thanks, > > Ben > > = > > From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Shahar > Salzman > > Sent: Monday, April 23, 2018 8:45 AM > > To: spdk(a)lists.01.org > > Subject: Re: [SPDK] SPDK + user space appliance > > = > > Hi Ben, > > = > > Bumping this thread since I've been having some new thoughts on the > issue now > > that we are starting integration with newer spdk versions. > > Unfortunately the merge isn't as smooth as I'd like it to be since = the > bdev > > module is pretty tightly integrated into spdk, perhaps we made some > false > > assumptions writing the module, but it seems some of the newer spdk > features > > are complicating the integration. > > My question is, if this passthrough module is useful, wouldn't it be > better to > > maintain it as part of spdk so that we can catch issues as soon as = they > show > > up? > > We would be happy to help with maintaining this module, the module = with > is > > currently part of our CI with our "frozen" spdk version, but once > integrated > > into the newer version we choose, I'll add it to the CI our CI as w= ell. > > = > > Shahar > > From: SPDK on behalf of Walker, Benja= min > > .walker(a)intel.com> > > Sent: Friday, February 2, 2018 11:43:58 PM > > To: spdk(a)lists.01.org > > Subject: Re: [SPDK] SPDK + user space appliance > > = > > On Thu, 2018-02-01 at 08:29 +0000, Shahar Salzman wrote: > > > Hi Ben, > > > = > > > Would you also like to take a look at the bdev_user module? > > > It still needs some patching (as some of the stuff is still hard > coded), but > > I > > > think we can get most of it cleaned up in a couple of days. > > > = > > > In any case, is it the intention that the user write his own bdev > module, or > > > would this user appliance glue be a useful generic module? > > = > > For existing storage stacks that serve block I/O, like the internal= s of > a SAN, > > the idea is that you write your own bdev module to forward I/O comi= ng > out of > > the > > SPDK bdev layer. Then you can use the SPDK iSCSI/NVMe-oF/vhost targ= ets > mostly > > as-is. > > = > > In some cases, the actual iSCSI/NVMe-oF/vhost target applications w= on't > > integrate nicely directly into an existing storage application beca= use > they > > spawn their own threads and allocate their own memory. To support t= hat, > the > > libraries may be consumed directly instead of the applications > (lib/iscsi, > > lib/scsi, lib/nvmf, etc.). The libraries don't spawn any of their o= wn > threads, > > but instead rely on SPDK's abstractions in include/spdk/io_channel.= h. > See > > = > > http://www.spdk.io/doc/concurrency.html > > = > > We don't currently have a way to write a custom bdev module that re= sides > > outside > > of the SPDK source tree, but it's very possible to add support for = that. > But > > beyond that inconvenience (just drop your module in lib/bdev for no= w), > writing > > a > > bdev module is the recommended way of interacting with the bottom e= nd of > the > > SPDK bdev layer. I think that's what you really want to be doing in= your > code, > > from what I can tell. > > = > > I hope that helps! > > _______________________________________________ > > SPDK mailing list > > SPDK(a)lists.01.org > > https://lists.01.org/mailman/listinfo/spdk > > = > > _______________________________________________ > > SPDK mailing list > > SPDK(a)lists.01.org > > https://lists.01.org/mailman/listinfo/spdk > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > = > = > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk > _______________________________________________ > SPDK mailing list > SPDK(a)lists.01.org > https://lists.01.org/mailman/listinfo/spdk --===============2307146270999611955==--