* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 15:02 Luse, Paul E
From: Luse, Paul E @ 2018-02-15 15:02 UTC
  To: spdk


Hi Stephen,

Yeah, awesome job working on these.  Some replies from me below; others I'm sure will have thoughts as well.  We're also starting weekly community calls the week after next, so you can always put some of these up on the board (details coming soon) and discuss them there....

Thx
Paul

-----Original Message-----
From: Stephen Bates [mailto:sbates(a)raithlin.com] 
Sent: Thursday, February 15, 2018 7:33 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Cc: Harris, James R <james.r.harris(a)intel.com>; Verkamp, Daniel <daniel.verkamp(a)intel.com>; Stojaczyk, DariuszX <dariuszx.stojaczyk(a)intel.com>; Chang, Cunyin <cunyin.chang(a)intel.com>; Luse, Paul E <paul.e.luse(a)intel.com>
Subject: SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!

Hi SPDK Team

I wanted to start by thanking everyone for the great feedback on the first set of CMB WDS/RDS enablement patches, which went into master over the past few days (e.g. [1]). There have already been some suspected bug sightings, so with luck those patches will mature quickly ;-).

Now I wanted to pick the community's brains on the best way to approach a few topics:

1. Documentation. I would like to update the API documentation (which I believe is auto-generated) as well as add a new file in docs/ discussing some of the issues setting up Peer-2-Peer DMAs (which cmb_copy does). Any tips for how best to do this?

All docs are done via patches.  There are the API docs, for which you can easily see examples in other code comments; just submit a patch.  The http://www.spdk.io/doc/ pages are also done via public patches; here's an example of one of those: https://review.gerrithub.io/#/c/384118/  You can also do a blog post on the website via a patch, though I don't think we document how to do that anywhere; you set everything up the same as for the regular repo, but the git info is:

[remote "origin"]
        url = https://review.gerrithub.io/spdk/spdk.github.io
        fetch = +refs/heads/*:refs/remotes/origin/*
[remote "review"]
  url = https://review.gerrithub.io/spdk/spdk.github.io
  push = HEAD:refs/for/master

If you want to do one, let me or anyone else know and we can provide more info.
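
For reference, with those remotes in place the website change goes through the usual Gerrit push-for-review flow; a rough sketch (branch and file names below are placeholders, not SPDK conventions):

git clone https://review.gerrithub.io/spdk/spdk.github.io
cd spdk.github.io
# Gerrit wants a Change-Id in the commit message; install its commit-msg hook
# (standard Gerrit hook location).
curl -Lo .git/hooks/commit-msg https://review.gerrithub.io/tools/hooks/commit-msg
chmod +x .git/hooks/commit-msg

git checkout -b cmb-p2p-post                  # placeholder branch name
git add _posts/2018-02-cmb-p2p.md             # placeholder post file
git commit -s -m "blog: add CMB peer-to-peer post"

# The "review" remote's push refspec (HEAD:refs/for/master) turns this push
# into a Gerrit review rather than a direct push to master.
git push review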

2. CI Testing. Upstream QEMU has support in its NVMe model for SSDs with WDS/RDS CMBs [2] (I should know as I added that support ;-)). Can we discuss adding this to the CI pool so we can do some form of emulated P2P testing? In addition is there interest in real HW testing? If so we could discuss adding some of our HW to the pool (but as a lowly startup I think donating HW is beyond our budget right now).

The community CI pool isn't really open to adding new HW due to limited resources.  You are, however, welcome to set up your own CI system at your site and tie it into the community GerritHub so that all patches would also run on your CI; reports would be visible on the main page but would not 'count' as a vote for merging.

Adding some sort of nightly test with VMs is very doable.  There is a lot of restructuring going on with the tests right now, but you can propose a patch at any time.  All of the tests are public and in the same repo.  If you're up for doing this, go for it.  Any of us can provide some starting points/guidelines/tips if so.  There are some docs written up but not yet posted (not final) that are part of the test restructuring I just mentioned. I can share some of those on the distribution list early too if you're interested.
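
As a concrete starting point for such a VM, the QEMU NVMe model referenced in [2] exposes the CMB through its cmb_size_mb device property, so the nightly test could boot a guest with something along these lines (disk images, memory size and the 64MB CMB value are arbitrary examples, not settings from the SPDK test pool):

# Boot a KVM guest whose emulated NVMe controller advertises a 64MB CMB.
# guest.img and nvme.img are placeholder disk images.
qemu-system-x86_64 \
    -machine q35,accel=kvm -cpu host -smp 4 -m 4G \
    -drive file=guest.img,if=virtio,format=qcow2 \
    -drive file=nvme.img,if=none,format=raw,id=nvm0 \
    -device nvme,drive=nvm0,serial=spdk-cmb-0,cmb_size_mb=64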

3. VFIO Support. Right now I have only tested with UIO. VFIO adds some interesting issues around BAR address translations, PCI ACS and PCI ATS. 

4. Fabrics Support. An obvious extension of this work is to allow other devices (aside from NVMe SSDs) to initiate DMAs to the NVMe CMBs. The prime candidate for that is an RDMA-capable NIC, which ties superbly well into NVMe over Fabrics. I would like to start a discussion on how best to approach this.

Is Trello the right place to enter and discuss these topics? Or is it OK to hash them out on the mailing list? Or does the community have a better way of discussing these items?

Trello is good, this list is good.  IRC is GREAT (I don't think I've seen you there yet): freenode, #spdk.  And then there are the weekly con calls I mentioned that will be starting soon.

Cheers
 
Stephen

[1] https://github.com/spdk/spdk/commit/1f9da54e9cca75c1a049844b36319a52fdbacbd6
[2] https://github.com/qemu/qemu/blob/master/hw/block/nvme.c (see cmb_size_mb)

 



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-26 19:31 Stephen Bates
From: Stephen Bates @ 2018-02-26 19:31 UTC
  To: spdk


> Adding some sort of nightly test with VMs is very doable.  There is a lot of restructuring going on with the tests right now, but
> you can propose a patch at any time.  All of the tests are public and in the same repo.  If you're up for doing this, go for it.
> Any of us can provide some starting points/guidelines/tips if so.  There are some docs written up but not yet posted (not final)
> that are part of the test restructuring I just mentioned. I can share some of those on the distribution list early too if you're interested.

Paul

Some more details on this would be appreciated. I am working on getting a VM image up on my system using libvirt, QEMU and the CMB-enabled NVMe model that QEMU supports. More details on how best to tie this into your SPDK CI pool would also be welcome, and a draft of the docs would be useful even if they are not final.
    
 Stephen
    
    



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-21 20:58 Stephen Bates
From: Stephen Bates @ 2018-02-21 20:58 UTC
  To: spdk


>    Yep - what’s your Trello handle?
    
stephenbates19

Who knew there were 18 other Stephen Bateses ;-)....
        
    
    



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-21 20:43 Harris, James R
From: Harris, James R @ 2018-02-21 20:43 UTC
  To: spdk


Yep - what’s your Trello handle?

On 2/21/18, 1:39 PM, "Stephen  Bates" <sbates(a)raithlin.com> wrote:

    Hi All
    
    I am trying to add some CMB/PMR-related Trello board items so we can track some of the issues I mentioned last week. Do I need to be added as an SPDK member in order to do this?
    
    Cheers
    
    Stephen
    
    



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-21 20:39 Stephen Bates
From: Stephen Bates @ 2018-02-21 20:39 UTC
  To: spdk


Hi All

I am trying to add some CMB/PMR-related Trello board items so we can track some of the issues I mentioned last week. Do I need to be added as an SPDK member in order to do this?

Cheers

Stephen



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 23:18 Stephen Bates
From: Stephen Bates @ 2018-02-15 23:18 UTC
  To: spdk


    
> We are already using emulated QEMU NVMe devices in the test pool - so in lieu of adding new HW, we could 
> get at least some level of CMB WDS/RDS testing there.  You could look at test/lib/nvme/nvme.sh initially for 
> where to plumb something in.  Of course, whatever tests get added need to know the difference between 
> failing due to lack of CMB v. a real CMB bug.

OK I will go take a look. This could work well for us.
    
>  Good point.  Testing this will be predicated on getting real HW.  For now, we should probably at least 
> have a warning message emitted when we find a CMB-enabled SSD with vfio enabled.

Well, I think we can test this using the vIOMMU in QEMU without needing real hardware, but it might be tricky. I will look at putting an error path in place for now. Does anyone know if SPDK has a helper function that can tell whether a PCIe device is under VFIO or UIO control?
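
For reference, the binding is at least visible from sysfs, so a check can be scripted outside of SPDK while a proper helper is figured out; a minimal sketch (the default BDF is just an example):

#!/usr/bin/env bash
# Report whether a PCIe device is bound to vfio-pci, a UIO driver, or nothing.
bdf="${1:-0000:01:00.0}"          # example BDF; pass the device you care about
link="/sys/bus/pci/devices/${bdf}/driver"

if [ ! -e "$link" ]; then
    echo "$bdf is not bound to any driver"
    exit 0
fi

driver=$(basename "$(readlink "$link")")
case "$driver" in
    vfio-pci)                 echo "$bdf is under VFIO control" ;;
    uio_pci_generic|igb_uio)  echo "$bdf is under UIO control" ;;
    *)                        echo "$bdf is bound to $driver" ;;
esac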
    
> 4. Fabrics Support. 
> Step 1 would just be testing this I/O path to confirm it works. You’ve already added the spdk_mem_register() calls which should register the CMB region with each RDMA NIC.  So first make sure that works.    

Agreed, though I am very confident this works, as our kernel patches do something identical to what SPDK would do.

> rxe might be OK to start but really you’ll want a real RDMA NIC.  

I don't think rxe would work the way we want since its "DMA engine" is actually a memcpy() operation.

> Then you could read to a CMB buffer and write  to a remote NVMe namespace using the SPDK NVMe-oF driver.  
> Then read it back into a different CMB buffer, etc.

Yup.
    
> Step 2 would be a lot more involved.  In an ideal world, there’s enough CMB space to replace all of the existing 
> host memory buffer pools used by the NVMe-oF target.  If not - well, that’s where a lot more work will be needed.  :-)

Agreed. In the kernel patches we fall back to using host memory when we run out of CMB space. We would need to do something similar here.

Thanks Jim and Paul. Lots of good input here and enough to keep us busy for a while ;-).

Stephen
    
 



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 22:32 Stephen Bates
From: Stephen Bates @ 2018-02-15 22:32 UTC
  To: spdk


> I guess I’m not 100% sold that we need a full-blown allocator.  

OK. I will poke around and see what I think makes sense here.
    
> I’m curious – how much CMB does one of your NoLoad cards have?
    
We currently ship two types of Eval Kits. In both cases the default is 512MB (DRAM-backed), but that can go as high as about 12GB on one of the eval kits. I did once expose a 6EB CMB to see what would happen (obviously this was not DRAM-backed). Needless to say, the BIOS was very unhappy....

TBH, even 512MB is probably big for a CMB staging buffer, but we can also support NVMe PMRs, where these sizes make a lot more sense since they are for more permanent storage....

Cheers

Stephen    
    
    



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 21:15 Harris, James R
From: Harris, James R @ 2018-02-15 21:15 UTC
  To: spdk




On 2/15/18, 2:02 PM, "Stephen  Bates" <sbates(a)raithlin.com> wrote:

    Jim
    
    > We also need a better CMB allocation scheme.  
    
    Ha! I knew there was a feature I was forgetting. This is it. Yes for sure, thanks for reminding me. I have added it to my list so I don't forget next time. 
    
    Do we have anything like this in SPDK already, or does anyone have any pointers to a licence-compatible open-source allocator that might suit our needs?


There’s really nothing like this today in SPDK.  I believe DPDK provides a way to use their allocators with “user-provided” memory – i.e. not from the hugepages allocated in host memory by DPDK.  But I haven’t looked at exactly how that works.

I guess I’m not 100% sold that we need a full-blown allocator.  Maybe what we have currently is enough – the user can allocate the memory but it’s up to the user how it is managed.  Meaning that the free_cmb_io_buffer routine effectively goes away.  And just be explicit in the API – i.e. here’s the call you make to get the CMB buffer and its size – what you choose to do with it is your business.  We could always provide some kind of allocator later – but that can be done outside of the SPDK NVMe CMB logic itself.

I’m curious – how much CMB does one of your NoLoad cards have?

-Jim




* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 21:02 Stephen Bates
From: Stephen Bates @ 2018-02-15 21:02 UTC
  To: spdk


Jim

> We also need a better CMB allocation scheme.  

Ha! I knew there was a feature I was forgetting. This is it. Yes for sure, thanks for reminding me. I have added it to my list so I don't forget next time. 

Do we have anything like this in SPDK already, or does anyone have any pointers to a licence-compatible open-source allocator that might suit our needs?
    
Stephen    
    



* Re: [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 15:32 Harris, James R
From: Harris, James R @ 2018-02-15 15:32 UTC
  To: spdk



> On Feb 15, 2018, at 8:02 AM, Luse, Paul E <paul.e.luse(a)intel.com> wrote:
> 
> Hi Stephen,
> 
> Yeah, awesome job working on these.  Some replies from me below; others I'm sure will have thoughts as well.  We're also starting weekly community calls the week after next, so you can always put some of these up on the board (details coming soon) and discuss them there....
> 
> Thx
> Paul
> 
> -----Original Message-----
> From: Stephen Bates [mailto:sbates(a)raithlin.com] 
> Sent: Thursday, February 15, 2018 7:33 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Cc: Harris, James R <james.r.harris(a)intel.com>; Verkamp, Daniel <daniel.verkamp(a)intel.com>; Stojaczyk, DariuszX <dariuszx.stojaczyk(a)intel.com>; Chang, Cunyin <cunyin.chang(a)intel.com>; Luse, Paul E <paul.e.luse(a)intel.com>
> Subject: SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
> 
> Hi SPDK Team
> 
> I wanted to start by thanking everyone for the great feedback on the first set of CMB WDS/RDS enablement patches, which went into master over the past few days (e.g. [1]). There have already been some suspected bug sightings, so with luck those patches will mature quickly ;-).

Thanks for your work on this!

> 
> Now I wanted to pick the community's brains on the best way to approach a few topics:
> 
> 1. Documentation. I would like to update the API documentation (which I believe is auto-generated) as well as add a new file in docs/ discussing some of the issues setting up Peer-2-Peer DMAs (which cmb_copy does). Any tips for how best to do this?
> 
> All docs are done via patches.  There are the API docs, for which you can easily see examples in other code comments; just submit a patch.  The http://www.spdk.io/doc/ pages are also done via public patches; here's an example of one of those: https://review.gerrithub.io/#/c/384118/  You can also do a blog post on the website via a patch, though I don't think we document how to do that anywhere; you set everything up the same as for the regular repo, but the git info is:
> 
> [remote "origin"]
>         url = https://review.gerrithub.io/spdk/spdk.github.io
>         fetch = +refs/heads/*:refs/remotes/origin/*
> [remote "review"]
>         url = https://review.gerrithub.io/spdk/spdk.github.io
>         push = HEAD:refs/for/master
> 
> If you want to do one, let me or anyone else know and we can provide more info.


> 
> 2. CI Testing. Upstream QEMU has support in its NVMe model for SSDs with WDS/RDS CMBs [2] (I should know as I added that support ;-)). Can we discuss adding this to the CI pool so we can do some form of emulated P2P testing? In addition is there interest in real HW testing? If so we could discuss adding some of our HW to the pool (but as a lowly startup I think donating HW is beyond our budget right now).
> 
> The community CI pool isn't really open to adding new HW due to limited resources.  You are, however, welcome to set up your own CI system at your site and tie it into the community GerritHub so that all patches would also run on your CI; reports would be visible on the main page but would not 'count' as a vote for merging.

We are already using emulated QEMU NVMe devices in the test pool - so in lieu of adding new HW, we could get at least some level of CMB WDS/RDS testing there.  You could look at test/lib/nvme/nvme.sh initially for where to plumb something in.  Of course, whatever tests get added need to know the difference between failing due to lack of CMB v. a real CMB bug.
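
For what it's worth, one way to keep "no CMB present" and "real CMB bug" apart is to have whatever launches QEMU (and therefore knows whether cmb_size_mb was set) export a flag that the test consults; a rough sketch, where SPDK_TEST_NVME_CMB is a made-up variable name and the cmb_copy arguments are left to that example's usage output:

# Skip rather than fail when the VM was started without a CMB-capable controller.
if [ "${SPDK_TEST_NVME_CMB:-0}" -ne 1 ]; then
    echo "SKIP: no CMB-enabled NVMe device configured for this VM"
    exit 0
fi

# Exercise the peer-to-peer path with the cmb_copy example; any failure here
# should be treated as a genuine CMB bug.
./examples/nvme/cmb_copy/cmb_copy "$@"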

> 
> Adding some sort of nightly test with VMs is very doable.  There is a lot of restructuring going on with the tests right now, but you can propose a patch at any time.  All of the tests are public and in the same repo.  If you're up for doing this, go for it.  Any of us can provide some starting points/guidelines/tips if so.  There are some docs written up but not yet posted (not final) that are part of the test restructuring I just mentioned. I can share some of those on the distribution list early too if you're interested.
> 
> 3. VFIO Support. Right now I have only tested with UIO. VFIO adds some interesting issues around BAR address translations, PCI ACS and PCI ATS. 

Good point.  Testing this will be predicated on getting real HW.  For now, we should probably at least have a warning message emitted when we find a CMB-enabled SSD with vfio enabled.

> 
> 4. Fabrics Support. An obvious extension of this work is to allow other devices (aside from NVMe SSDs) to initiate DMAs to the NVMe CMBs. The prime candidate for that is an RDMA-capable NIC, which ties superbly well into NVMe over Fabrics. I would like to start a discussion on how best to approach this.

Step 1 would just be testing this I/O path to confirm it works.  You’ve already added the spdk_mem_register() calls which should register the CMB region with each RDMA NIC.  So first make sure that works.  rxe might be OK to start but really you’ll want a real RDMA NIC.  Then you could read to a CMB buffer and write to a remote NVMe namespace using the SPDK NVMe-oF driver.  Then read it back into a different CMB buffer, etc.
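
For the rxe starting point, a soft-RoCE device can be stood up on an ordinary netdev in a few commands; a sketch (eth0 is an assumed interface name, and bear in mind that rxe moves data with a memcpy, so this only checks the I/O path rather than true peer-to-peer DMA):

sudo modprobe rdma_rxe        # soft-RoCE kernel module
sudo rxe_cfg start            # rxe_cfg ships with the librxe/rdma-core tools
sudo rxe_cfg add eth0         # attach soft-RoCE to the chosen netdev
ibv_devices                   # the new rxe device should now be listed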

Step 2 would be a lot more involved.  In an ideal world, there’s enough CMB space to replace all of the existing host memory buffer pools used by the NVMe-oF target.  If not - well, that’s where a lot more work will be needed.  :-)

> 
> Is Trello the right place to enter and discuss these topics? Or is it OK to hash them out on the mailing list? Or does the community have a better way of discussing these items?
> 
> Trello is good, this list is good.  IRC is GREAT (I don't think I've seen you there yet): freenode, #spdk.  And then there are the weekly con calls I mentioned that will be starting soon.

Yep - everything Paul said.

We also need a better CMB allocation scheme.  What’s there currently was just to get CMB working at some level but isn’t really functional (i.e. the free routine is a nop).  A full-blown allocator is probably overkill at best - these regions are somewhat limited in size, so fragmentation can be a problem.  There’s also no synchronization currently in nvme_pci_ctrlr_alloc_cmb() to protect concurrent allocations on multiple threads.  Until that is ready, we will need to consider the CMB functionality as experimental and make sure the docs reflect that.

> 
> Cheers
> 
> Stephen
> 
> [1] https://github.com/spdk/spdk/commit/1f9da54e9cca75c1a049844b36319a52fdbacbd6
> [2] https://github.com/qemu/qemu/blob/master/hw/block/nvme.c (see cmb_size_mb)
> 
> 
> 



* [SPDK] SPDK NVMe CMB WDS/RDS Support: Thanks and next steps!
@ 2018-02-15 14:32 Stephen Bates
From: Stephen Bates @ 2018-02-15 14:32 UTC
  To: spdk


Hi SPDK Team

I wanted to start by thanking everyone for the great feedback on the first set of CMB WDS/RDS enablement patches, which went into master over the past few days (e.g. [1]). There have already been some suspected bug sightings, so with luck those patches will mature quickly ;-).

Now I wanted to pick the community's brains on the best way to approach a few topics:

1. Documentation. I would like to update the API documentation (which I believe is auto-generated) as well as add a new file in docs/ discussing some of the issues setting up Peer-2-Peer DMAs (which cmb_copy does). Any tips for how best to do this?

2. CI Testing. Upstream QEMU has support in its NVMe model for SSDs with WDS/RDS CMBs [2] (I should know as I added that support ;-)). Can we discuss adding this to the CI pool so we can do some form of emulated P2P testing? In addition is there interest in real HW testing? If so we could discuss adding some of our HW to the pool (but as a lowly startup I think donating HW is beyond our budget right now).

3. VFIO Support. Right now I have only tested with UIO. VFIO adds some interesting issues around BAR address translations, PCI ACS and PCI ATS. 

4. Fabrics Support. An obvious extension of this work is to allow other devices (aside from NVMe SSDs) to initiate DMAs to the NVMe CMBs. The prime candidate for that is an RDMA-capable NIC, which ties superbly well into NVMe over Fabrics. I would like to start a discussion on how best to approach this.

Is Trello the right place to enter and discuss these topics? Or is it OK to hash them out on the mailing list? Or does the community have a better way of discussing these items?

Cheers
 
Stephen

[1] https://github.com/spdk/spdk/commit/1f9da54e9cca75c1a049844b36319a52fdbacbd6
[2] https://github.com/qemu/qemu/blob/master/hw/block/nvme.c (see cmb_size_mb)

 


