* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-12 21:34 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2020-06-12 21:34 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 6390 bytes --]

Thanks Shuhei! After talking to Jim some more, I think for this release we’ll get the capability into the acceleration engine, but if/how we use it through the rest of the SPDK stack is likely going to require enough experimentation that the next release is a better target.  I’ll let you know when the capabilities are in there, as I’m totally counting on your help to make use of it when the time comes ☺

Thx
Paul

From: 松本周平 <shuheimatsumoto(a)gmail.com>
Sent: Friday, June 12, 2020 6:45 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Re: DIF/DIX acceleration in SPDK

Hi Paul,

The current NVMe-oF and iSCSI targets use a special SGL when reading from or writing to the network.
When reading from the network, the special SGL leaves a metadata space after each block, and the DIF is then computed and filled into those metadata spaces.
When writing to the network, the DIF in each metadata space is computed and checked, and the special SGL then skips the DIF space of each block.

"Special SGL" means that each SGL entry maps a single block and there is a metadata space between two consecutive SGL entries.
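
As a rough illustration of the layout described above, here is a minimal Python sketch (not SPDK code) that builds one SGL entry per data block and leaves a gap for the metadata after each block. The function names, the 512-byte block size and the 8-byte metadata size are assumptions chosen for the example.

```python
def build_special_sgl(num_blocks, block_size=512, md_size=8):
    """Return (offset, length) SGL entries, one per data block.

    Hypothetical helper: the buffer is assumed to hold num_blocks extended
    blocks laid out as [data][metadata][data][metadata]...
    Each entry covers only the data part, so a network read or write through
    this SGL skips the metadata space that follows every block.
    """
    stride = block_size + md_size
    return [(i * stride, block_size) for i in range(num_blocks)]


def metadata_offsets(num_blocks, block_size=512, md_size=8):
    """Offsets of the metadata spaces left between consecutive SGL entries."""
    stride = block_size + md_size
    return [i * stride + block_size for i in range(num_blocks)]
```

For three 512-byte blocks with 8 bytes of metadata, the entries start at offsets 0, 520 and 1040, and the DIF for each block is filled in (or checked) at offsets 512, 1032 and 1552.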

The main reason we used the special SGL was to avoid copying data between two buffers.

So DIF check and DIF update would be the most helpful to start with.

The lack of SGL support is acceptable as a first step.
The iSCSI target uses only a single contiguous buffer.
The NVMe-oF target can also be limited to a single contiguous buffer by adjusting the maximum I/O size.


Another thing I want to share is that the performance of the SPDK sock layer has improved greatly recently, and this may conflict with the current DIF implementation, especially with respect to performance.

So in the future, using DIF insert and DIF strip may be better for the NVMe-TCP and iSCSI targets.
When the NVMe-TCP or iSCSI target uses DIF insert and DIF strip, it prepares two buffers for each read or write and copies between them while inserting or stripping the DIF.

So DIF check, update, insert, and strip will all be usable in one way or another.

One difficulty in emulating SGL with DSA is handling block boundaries.
To compute the DIF for a single block that spans multiple SGL entries, it is necessary to compute the CRC for the first partial block, then use that CRC as the seed value for the second partial block, and so on.
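
The seeding described above can be sketched in Python as follows. This is an illustration only, assuming the 16-bit T10-DIF guard CRC (polynomial 0x8BB7, initial value 0, no reflection); `crc16_t10dif` and `guard_over_sgl` are hypothetical names, not SPDK or DSA APIs.

```python
def crc16_t10dif(data, seed=0):
    """Bitwise CRC-16/T10-DIF over data, continuing from seed."""
    crc = seed
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def guard_over_sgl(pieces):
    """Guard CRC of one block whose data is split across several SGL entries."""
    crc = 0
    for piece in pieces:
        # The CRC of the previous partial block is the seed for the next one.
        crc = crc16_t10dif(piece, seed=crc)
    return crc
```

Because this CRC has a zero initial value and no final XOR, the intermediate CRC can be passed straight through as the seed; a CRC variant with a final XOR would need that undone before chaining.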

I don't think this feedback is complete, so I'm happy to take any further questions or feedback.

Thanks,
Shuhei

On Fri, Jun 12, 2020 at 12:46 AM Luse, Paul E <paul.e.luse(a)intel.com> wrote:
Hi Everyone,

This is primarily for Shuhei but please feel free, anyone, to respond ☺

Adding support for Intel’s next generation offload engine is going well (Note, the feature is not available in HW yet, I’m using a simulator to do dev/test). Currently support exists, or is about to land on master, for:

Copy, fill, dual-cast, CRC32C, compare and the ability to submit batches of commands.

Currently these are only being used by a new tool in /examples/accel/perf, but once they all land and I’ve added some more tests, we’ll start using them in SPDK modules - the most notable uses will be for CRC32C (iSCSI) and DIF/DIX throughout the stack.  There will be other uses (compare, fill, copy, etc.) as well but those are the big ones.

I’ve just now started looking at DIF/DIX and have determined that using these within SPDK won’t be quite as straightforward as some of the others. I’ll explain what I’m thinking after briefly summarizing the DSA DIF/DIX functions (more detail is available in the public spec at https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html).

Note: there is no SGL support in any of these, all are single src and/or dst:


  *   DIF Check: The DIF Check operation computes the Data Integrity Field (DIF) on the source data and compares the computed DIF to the DIF contained in the source data.
  *   DIF Insert: The DIF Insert operation copies memory from the Source Address to the Destination Address, while computing the Data Integrity Field (DIF) on the source data and inserting the DIF into the output data.
  *   DIF Strip: The DIF Strip operation copies memory from the Source Address to the Destination Address, removing the Data Integrity Field (DIF). It optionally computes the DIF on the source data and compares the computed DIF to the DIF contained in the source data.
  *   DIF Update: The DIF Update operation copies memory from the Source Address to the Destination Address. It optionally computes the Data Integrity Field (DIF) on the source data and compares the computed DIF to the DIF contained in the data. It simultaneously computes the DIF on the source data using Destination DIF fields in the descriptor and inserts the computed DIF into the output data.
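
To make the four operations concrete, here is a small software sketch in Python of what each one does to a contiguous buffer. It is not the DSA descriptor format or SPDK's implementation; it assumes 512-byte blocks and an 8-byte T10-style DIF tuple (2-byte CRC16 guard, 2-byte application tag, 4-byte reference tag that increments per block), and all function names are hypothetical.

```python
import struct

BLOCK = 512  # data bytes per block (assumed)
DIF = 8      # guard (2) + application tag (2) + reference tag (4), big-endian


def crc16_t10dif(data, seed=0):
    """Bitwise CRC-16/T10-DIF (polynomial 0x8BB7, init 0, no reflection)."""
    crc = seed
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def _tuple(block, app_tag, ref_tag):
    """Build the 8-byte DIF tuple for one data block."""
    return struct.pack(">HHI", crc16_t10dif(block), app_tag, ref_tag & 0xFFFFFFFF)


def dif_insert(src, app_tag=0, ref_tag=0):
    """Copy src to a new buffer, appending a computed DIF after each block."""
    assert len(src) % BLOCK == 0
    out = bytearray()
    for n in range(len(src) // BLOCK):
        block = src[n * BLOCK:(n + 1) * BLOCK]
        out += block + _tuple(block, app_tag, ref_tag + n)
    return bytes(out)


def dif_check(src, app_tag=0, ref_tag=0):
    """Recompute the DIF of each block and compare with the stored one (no copy)."""
    assert len(src) % (BLOCK + DIF) == 0
    for n in range(len(src) // (BLOCK + DIF)):
        start = n * (BLOCK + DIF)
        block = src[start:start + BLOCK]
        stored = src[start + BLOCK:start + BLOCK + DIF]
        if stored != _tuple(block, app_tag, ref_tag + n):
            return False
    return True


def dif_strip(src, check=True, app_tag=0, ref_tag=0):
    """Copy src to a new buffer without the DIF, optionally checking it first."""
    if check and not dif_check(src, app_tag, ref_tag):
        raise ValueError("DIF mismatch")
    step = BLOCK + DIF
    return b"".join(src[i:i + BLOCK] for i in range(0, len(src), step))


def dif_update(src, src_ref_tag, dst_ref_tag, check=True, app_tag=0):
    """Copy src, optionally check the source DIF, then write a DIF built
    from the destination fields."""
    data = dif_strip(src, check=check, app_tag=app_tag, ref_tag=src_ref_tag)
    return dif_insert(data, app_tag=app_tag, ref_tag=dst_ref_tag)
```

Note that only `dif_check` works in place; the other three produce a second buffer, which mirrors the "all are single src and/or dst" restriction noted above.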

Upon initial review of the relatively complex implementation of DIF/DIX we have in SPDK, I have the following observations that I’m hoping to get some feedback on:


  *   It looks like we require SGL in most if not all cases. I can go through them one by one but wanted to get an initial feel, mainly from Shuhei, on how the lack of SGL support impacts our ability to use DIF/DIX offload w/DSA before I start adding support ☺
  *   With the exception of DIF Check, all of the DSA functions include a copy (I can only assume they figured on a use case where they are moving data from a host buffer into a different memory subsystem in prep for DMA’ing to disk).  It looks like most if not all of our calculations are done on fixed buffers. I see a few copy functions in dif.c but I don’t see them used anywhere.

I’m almost thinking the DSA functions are too “simple” for our current implementation but wonder if there’s some refactoring we can do to make use of them. I don’t know if the DSA CRC32C engine calculates the same exact CRC as the DIF/DIX functions, but if so (I can verify), at a minimum we could maybe just accelerate the CRCs called from funcs within dif.c.

Thoughts? We can chat in a community meeting soon too but email might be easier to get us all on the same page first.

Thanks!!
Paul

_______________________________________________
SPDK mailing list -- spdk(a)lists.01.org
To unsubscribe send an email to spdk-leave(a)lists.01.org


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-14 19:27 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2020-06-14 19:27 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 5202 bytes --]

Thanks for the inputs Sasha and, yeah, I agree the spec isn't super clear wrt what it's actually doing. Easy enough to find out though, I'll keep everyone posted. 

Thx
Paul
-----Original Message-----
From: Sasha Kotchubievsky <sashakot(a)dev.mellanox.co.il> 
Sent: Sunday, June 14, 2020 9:39 AM
To: spdk(a)lists.01.org
Subject: [SPDK] Re: DIF/DIX acceleration in SPDK

Hi Paul,

From the DSA spec, I don't quite understand exactly which DIF it supports. NVMe and SCSI support DIF with a CRC16 guard, not CRC32C (which is commonly used on Intel platforms).
In NVMe-oF we have an "insert & strip" mode in the TCP and RDMA transports. I would say DSA looks applicable to the TCP case: when data comes into the target, we already have an extra copy of the data, and at that stage a copy done by DSA can insert or strip the DIF. In the case of the RDMA transport, using DSA will add an extra copy, and I'm not sure that will be better than the existing CRC calculation.

Best regards
Sasha


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-14 16:53 Sasha Kotchubievsky
  0 siblings, 0 replies; 10+ messages in thread
From: Sasha Kotchubievsky @ 2020-06-14 16:53 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 7397 bytes --]

Hi Ben,

Do you plan to replace the DPDK calls in the "reduce" block with the "accel" framework?

Having a plug-in system for HW offloads is a great idea.

Best regards
Sasha

On 11-Jun-20 11:00 PM, Walker, Benjamin wrote:
>
>> -----Original Message-----
>> From: Luse, Paul E <paul.e.luse(a)intel.com>
>> Sent: Thursday, June 11, 2020 11:22 AM
>> To: Storage Performance Development Kit <spdk(a)lists.01.org>
>> Subject: [SPDK] Re: DIF/DIX acceleration in SPDK
>>
>> That’s a great question. I can only comment on what’s been made public and
>> there’s no mention of those things in the announcement here
>> https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
>>
>> As I’m sure you’re aware, we already support compression/crypto via Intel QAT
>> https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html
> I think there are some interesting open questions in this area on how we model the software framework, even if DSA never supports crypto/compression. Right now you've coded up the 'accel' framework library and it can handle a bunch of these types of storage offloads by redirecting to either DSA, CBDMA, or ISA-L (CPU) as needed. You've even made it a plug-in system, so other vendors can add their offload hardware into it. That's a great starting point.
>
> Simultaneously, parts of SPDK leverage compression and crypto offload by directly calling into the DPDK framework. That code has a similar concept - plugin drivers for various pieces of hardware that can do crypto or compression type things, with ISA-L as the fallback.
>
> What isn't clear to me, as of right now, is if we should continue to model these two things as separate components, or if we should try to unite them into a single accel framework. Under the hood, we're going to delegate crypto and compression offloads to DPDK because that's where the drivers live, and the more storage-specific offloads that the 'accel' framework currently does will stay in SPDK. But we could build a more unified abstraction layer for SPDK users on top at least just for convenience. I don't know how much demand there would be for people to plug their own hardware into a framework like that, or from people who want to consume an API like that.
>
>> Thx
>> Paul
>>
>> From: Andrey Kuzmin <andrey.v.kuzmin(a)gmail.com>
>> Sent: Thursday, June 11, 2020 11:14 AM
>> To: Storage Performance Development Kit <spdk(a)lists.01.org>
>> Subject: [SPDK] Re: DIF/DIX acceleration in SPDK
>>
>> Hi Paul,
>>
>> does the offload engine support (or plan to support in the future) more complex
>> compute-intensive storage tasks such as compression/decompression, crypto
>> (encryption/strong hashing) etc.?
>> Thanks,
>> Andrey
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-14 16:38 Sasha Kotchubievsky
  0 siblings, 0 replies; 10+ messages in thread
From: Sasha Kotchubievsky @ 2020-06-14 16:38 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 4712 bytes --]

Hi Paul,

From the DSA spec, I don't quite understand exactly which DIF it supports.
NVMe and SCSI support DIF with a CRC16 guard, not CRC32C (which is
commonly used on Intel platforms).
In NVMe-oF we have an "insert & strip" mode in the TCP and RDMA
transports. I would say DSA looks applicable to the TCP case: when data
comes into the target, we already have an extra copy of the data, and at
that stage a copy done by DSA can insert or strip the DIF. In the case of
the RDMA transport, using DSA will add an extra copy, and I'm not sure
that will be better than the existing CRC calculation.
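
To illustrate the distinction, the sketch below (Python, bitwise and unoptimized) implements both checksums with their standard published parameters: CRC-16/T10-DIF (polynomial 0x8BB7), which is the 16-bit guard referred to above, and CRC-32C (Castagnoli, reflected polynomial 0x82F63B78). Which of these the DSA DIF operations use is exactly the open question here; the code only shows that the two are different functions producing different-width results.

```python
def crc16_t10dif(data, seed=0):
    """CRC-16/T10-DIF: poly 0x8BB7, init 0, MSB-first, no final XOR."""
    crc = seed
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def crc32c(data, crc=0):
    """CRC-32C (Castagnoli): reflected poly 0x82F63B78, init and final XOR 0xFFFFFFFF."""
    crc ^= 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF
```

With the conventional check input `b"123456789"`, the first returns 0xD0DB and the second 0xE3069283, so a CRC32C engine cannot simply stand in for the 16-bit guard calculation.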

Best regards
Sasha



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-12 13:45 
  0 siblings, 0 replies; 10+ messages in thread
From:  @ 2020-06-12 13:45 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 5942 bytes --]

Hi Paul,

The current NVMe-oF and iSCSI targets use a special SGL when reading from
or writing to the network.
When reading from the network, the special SGL leaves a metadata space
after each block, and the DIF is then computed and filled into those
metadata spaces.
When writing to the network, the DIF in each metadata space is computed and
checked, and the special SGL then skips the DIF space of each block.

"Special SGL" means that each SGL entry maps a single block and there is a
metadata space between two consecutive SGL entries.

The main reason we used the special SGL was to avoid copying data between
two buffers.

So DIF check and DIF update would be the most helpful to start with.

The lack of SGL support is acceptable as a first step.
The iSCSI target uses only a single contiguous buffer.
The NVMe-oF target can also be limited to a single contiguous buffer by
adjusting the maximum I/O size.


Another thing I want to share is that the performance of the SPDK sock
layer has improved greatly recently, and this may conflict with the current
DIF implementation, especially with respect to performance.

So in the future, using DIF insert and DIF strip may be better for the
NVMe-TCP and iSCSI targets.
When the NVMe-TCP or iSCSI target uses DIF insert and DIF strip, it
prepares two buffers for each read or write and copies between them while
inserting or stripping the DIF.

So DIF check, update, insert, and strip will all be usable in one way or
another.

One difficulty in emulating SGL with DSA is handling block boundaries.
To compute the DIF for a single block that spans multiple SGL entries, it
is necessary to compute the CRC for the first partial block, then use that
CRC as the seed value for the second partial block, and so on.

I don't think this feedback is complete, so I'm happy to take any further
questions or feedback.

Thanks,
Shuhei



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-11 20:20 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2020-06-11 20:20 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 7948 bytes --]

Thanks Ben, yeah I've thought about that as well and do think it makes sense to have one single SPDK accel FW API; it would be great to totally hide whether the back end is an accel module or a DPDK framework interface.  There's still a lot to do on the accel framework as is, though, in terms of finishing DSA and then bringing the other modules up to the same level of functionality. I've been keeping the software module up to date as that was super simple, but I still have some others on the IOAT side to do.

Any thoughts on the DIF/DIX considerations below?

Thx
Paul

-----Original Message-----
From: Walker, Benjamin <benjamin.walker(a)intel.com> 
Sent: Thursday, June 11, 2020 1:01 PM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Re: DIF/DIX acceleration in SPDK



> -----Original Message-----
> From: Luse, Paul E <paul.e.luse(a)intel.com>
> Sent: Thursday, June 11, 2020 11:22 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: [SPDK] Re: DIF/DIX acceleration in SPDK
> 
> That’s a great question. I can only comment on what’s been made public 
> and there’s no mention of those things in the announcement here 
> https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
> 
> As I’m sure you’re aware, we already support compression/crypto via 
> Intel QAT
> https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html

I think there are some interesting open questions in this area on how we model the software framework, even if DSA never supports crypto/compression. Right now you've coded up the 'accel' framework library and it can handle a bunch of these types of storage offloads by redirecting to either DSA, CBDMA, or ISA-L (CPU) as needed. You've even made it a plug-in system, so other vendors can add their offload hardware into it. That's a great starting point.

Simultaneously, parts of SPDK leverage compression and crypto offload by directly calling into the DPDK framework. That code has a similar concept - plugin drivers for various pieces of hardware that can do crypto or compression type things, with ISA-L as the fallback.

What isn't clear to me, as of right now, is if we should continue to model these two things as separate components, or if we should try to unite them into a single accel framework. Under the hood, we're going to delegate crypto and compression offloads to DPDK because that's where the drivers live, and the more storage-specific offloads that the 'accel' framework currently does will stay in SPDK. But we could build a more unified abstraction layer for SPDK users on top at least just for convenience. I don't know how much demand there would be for people to plug their own hardware into a framework like that, or from people who want to consume an API like that.

> 
> Thx
> Paul
> 
> From: Andrey Kuzmin <andrey.v.kuzmin(a)gmail.com>
> Sent: Thursday, June 11, 2020 11:14 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: [SPDK] Re: DIF/DIX acceleration in SPDK
> 
> Hi Paul,
> 
> does the offload engine support (or plan to support in the future) 
> more complex compute-intensive storage tasks such as 
> compression/decompression, crypto (encryption/strong hashing) etc.?
> Thanks,
> Andrey
> 
> On Thu, Jun 11, 2020, 18:46 Luse, Paul E 
> <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>> wrote:
> Hi Everyone,
> 
> This is primarily for Shuhei but please feel free, anyone, to respond 
> ☺
> 
> Adding support for Intel’s next generation offload engine is going 
> well (Note, the feature is not available in HW yet, I’m using a simulator to do dev/test).
> Currently support exists, or is about to land on master, for:
> 
> Copy, fill, dual-cast, CRC32C, compare and the ability to submit 
> batches of commands.
> 
> Currently these are only being used by a new tool in 
> /examples/accel/perf but once they all land and I’ve added some more 
> tests, we’ll start using them in SPDK modules – the most notable uses 
> will be for CRC32C (iSCSI) and DIF/DIX throughout the stack.  There 
> will be other uses (compare, fill, copy, etc) as well but those are the big ones.
> 
> I’ve just now started looking at DIF/DIX and have determined that 
> using these within SPDK won’t be quite as straightforward as some of 
> the others. I’ll explain what I’m thinking after briefly summarizing the 
> DSA DIF/DIX functions (more detail is available in the public spec at
> https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html)
> 
> Note: there is no SGL support in any of these, all are single src and/or dst:
> 
> 
>   *   DIF Check: The DIF Check operation computes the Data Integrity Field (DIF)
> on the source data and compares the computed DIF to the DIF contained 
> in the source data.
>   *   DIF Insert: The DIF Insert operation copies memory from the Source Address
> to the Destination Address, while computing the Data Integrity Field 
> (DIF) on the source data and inserting the DIF into the output data.
>   *   DIF Strip: The DIF Strip operation copies memory from the Source Address to
> the Destination Address, removing the Data Integrity Field (DIF). It 
> optionally computes the DIF on the source data and compares the 
> computed DIF to the DIF contained in the source data.
>   *   DIF Update: The DIF Update operation copies memory from the Source
> Address to the Destination Address. It optionally computes the Data 
> Integrity Field (DIF) on the source data and compares the computed DIF 
> to the DIF contained in the data. It simultaneously computes the DIF 
> on the source data using Destination DIF fields in the descriptor and 
> inserts the computed DIF into the output data.
> 
> Upon initial review of the relatively complex implementation of 
> DIF/DIX we have in SPDK I have the following observations that I’m 
> hoping to get some feedback
> on:
> 
> 
>   *   It looks like we require SGL in most if not all cases. I can go through them
> one by one but wanted to get an initial feel mainly from Shuhei on how 
> lack of SGL support impacts our ability to use DIF/DIX offload w/DSA 
> before I start adding support ☺
>   *   With the exception of DIF Check, all of the DSA functions include a copy (I
> can only assume they figured a use case where they are moving data 
> from a host buffer into a different memory subsystem in prep for 
> DMA’ing to disk).  It looks like most if not all of our calculations 
> are done on fixed buffers. I see a few copy functions in dif.c but I 
> don’t see them used anywhere.
> 
> I’m almost thinking the DSA functions are too “simple” for our current 
> implementation but wonder if there’s some refactoring we can do to 
> make use of them. I don’t know if the DSA CRC32C engine calculates the 
> same exact CRC as the DIF/DIX functions but if so (I can verify) at a 
> minimum maybe we could just accelerate the CRCs called from funcs within 
> dif.c
> 
> Thoughts? We can chat in a community meeting soon too but email might 
> be easier to get us all on the same page first.
> 
> Thanks!!
> Paul
> 
_______________________________________________
SPDK mailing list -- spdk(a)lists.01.org
To unsubscribe send an email to spdk-leave(a)lists.01.org

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-11 20:00 Walker, Benjamin
  0 siblings, 0 replies; 10+ messages in thread
From: Walker, Benjamin @ 2020-06-11 20:00 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 6923 bytes --]



> -----Original Message-----
> From: Luse, Paul E <paul.e.luse(a)intel.com>
> Sent: Thursday, June 11, 2020 11:22 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: [SPDK] Re: DIF/DIX acceleration in SPDK
> 
> That’s a great question. I can only comment on what’s been made public and
> there’s no mention of those things in the announcement here
> https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator
> 
> As I’m sure you’re aware, we already support compression/crypto via Intel QAT
> https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html

I think there are some interesting open questions in this area on how we model the software framework, even if DSA never supports crypto/compression. Right now you've coded up the 'accel' framework library and it can handle a bunch of these types of storage offloads by redirecting to either DSA, CBDMA, or ISA-L (CPU) as needed. You've even made it a plug-in system, so other vendors can add their offload hardware into it. That's a great starting point.

Simultaneously, parts of SPDK leverage compression and crypto offload by directly calling into the DPDK framework. That code has a similar concept - plugin drivers for various pieces of hardware that can do crypto or compression type things, with ISA-L as the fallback.

What isn't clear to me, as of right now, is if we should continue to model these two things as separate components, or if we should try to unite them into a single accel framework. Under the hood, we're going to delegate crypto and compression offloads to DPDK because that's where the drivers live, and the more storage-specific offloads that the 'accel' framework currently does will stay in SPDK. But we could build a more unified abstraction layer for SPDK users on top at least just for convenience. I don't know how much demand there would be for people to plug their own hardware into a framework like that, or from people who want to consume an API like that.
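
To make the plug-in idea above concrete, here is a minimal sketch of a unified dispatch layer that prefers a hardware backend and falls back to the CPU. Every name in it (demo_op, demo_module, demo_pick_module, the "demo_hw" backend) is hypothetical and made up for illustration; it is not the actual SPDK accel or DPDK API, just one way such a layer could be shaped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical operation codes for illustration; not SPDK's real opcodes. */
enum demo_op { DEMO_OP_COPY, DEMO_OP_CRC32C, DEMO_OP_COMPRESS };

/* One backend: a hardware engine or the CPU fallback. */
struct demo_module {
    const char *name;
    int (*supports)(enum demo_op op);   /* nonzero if the backend handles op */
};

/* Pretend hardware engine: storage offloads only, no compression. */
static int demo_hw_supports(enum demo_op op)
{
    return op == DEMO_OP_COPY || op == DEMO_OP_CRC32C;
}

/* CPU fallback: claims support for everything. */
static int demo_sw_supports(enum demo_op op)
{
    (void)op;
    return 1;
}

/* Ordered by preference; the software module is always last. */
static const struct demo_module demo_modules[] = {
    { "demo_hw",  demo_hw_supports },
    { "software", demo_sw_supports },
};

/* Return the first registered backend that claims support for op. */
static const struct demo_module *demo_pick_module(enum demo_op op)
{
    size_t n = sizeof(demo_modules) / sizeof(demo_modules[0]);

    for (size_t i = 0; i < n; i++) {
        if (demo_modules[i].supports(op)) {
            return &demo_modules[i];
        }
    }
    return NULL;
}
```

With this table, demo_pick_module(DEMO_OP_COPY) selects "demo_hw" while demo_pick_module(DEMO_OP_COMPRESS) falls through to "software", so the caller never needs to know which backend ran.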

> 
> Thx
> Paul
> 
> From: Andrey Kuzmin <andrey.v.kuzmin(a)gmail.com>
> Sent: Thursday, June 11, 2020 11:14 AM
> To: Storage Performance Development Kit <spdk(a)lists.01.org>
> Subject: [SPDK] Re: DIF/DIX acceleration in SPDK
> 
> Hi Paul,
> 
> does the offload engine support (or plan to support in the future) more complex
> compute-intensive storage tasks such as compression/decompression, crypto
> (encryption/strong hashing) etc.?
> Thanks,
> Andrey
> 
> On Thu, Jun 11, 2020, 18:46 Luse, Paul E
> <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>> wrote:
> Hi Everyone,
> 
> This is primarily for Shuhei but please feel free, anyone, to respond ☺
> 
> Adding support for Intel’s next generation offload engine is going well (Note,
> the feature is not available in HW yet, I’m using a simulator to do dev/test).
> Currently support exists, or is about to land on master, for:
> 
> Copy, fill, dual-cast, CRC32C, compare and the ability to submit batches of
> commands.
> 
> Currently these are only being used by a new tool in /examples/accel/perf but
> once they all land and I’ve added some more tests, we’ll start using them in
> SPDK modules – the most notable uses will be for CRC32C (iSCSI) and DIF/DIX
> throughout the stack.  There will be other uses (compare, fill, copy, etc) as well
> but those are the big ones.
> 
> I’ve just now started looking at DIF/DIX and have determined that using these
> within SPDK won’t be quite as straightforward as some of the others. I’ll explain
> what I’m thinking after briefly summarizing the DSA DIF/DIX functions (more detail
> is available in the public spec at
> https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html)
> 
> Note: there is no SGL support in any of these, all are single src and/or dst:
> 
> 
>   *   DIF Check: The DIF Check operation computes the Data Integrity Field (DIF)
> on the source data and compares the computed DIF to the DIF contained in the
> source data.
>   *   DIF Insert: The DIF Insert operation copies memory from the Source Address
> to the Destination Address, while computing the Data Integrity Field (DIF) on the
> source data and inserting the DIF into the output data.
>   *   DIF Strip: The DIF Strip operation copies memory from the Source Address to
> the Destination Address, removing the Data Integrity Field (DIF). It optionally
> computes the DIF on the source data and compares the computed DIF to the DIF
> contained in the source data.
>   *   DIF Update: The DIF Update operation copies memory from the Source
> Address to the Destination Address. It optionally computes the Data Integrity
> Field (DIF) on the source data and compares the computed DIF to the DIF
> contained in the data. It simultaneously computes the DIF on the source data
> using Destination DIF fields in the descriptor and inserts the computed DIF into
> the output data.
> 
> Upon initial review of the relatively complex implementation of DIF/DIX we have
> in SPDK I have the following observations that I’m hoping to get some feedback
> on:
> 
> 
>   *   It looks like we require SGL in most if not all cases. I can go through them
> one by one but wanted to get an initial feel mainly from Shuhei on how lack of
> SGL support impacts our ability to use DIF/DIX offload w/DSA before I start
> adding support ☺
>   *   With the exception of DIF Check, all of the DSA functions include a copy (I
> can only assume they figured a use case where they are moving data from a
> host buffer into a different memory subsystem in prep for DMA’ing to disk).  It
> looks like most if not all of our calculations are done on fixed buffers. I see a
> few copy functions in dif.c but I don’t see them used anywhere.
> 
> I’m almost thinking the DSA functions are too “simple” for our current
> implementation but wonder if there’s some refactoring we can do to make use
> of them. I don’t know if the DSA CRC32C engine calculates the same exact CRC
> as the DIF/DIX functions but if so (I can verify), at a minimum maybe we could
> just accelerate the CRCs called from funcs within dif.c
> 
> Thoughts? We can chat in a community meeting soon too but email might be
> easier to get us all on the same page first.
> 
> Thanks!!
> Paul
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-11 18:21 Luse, Paul E
  0 siblings, 0 replies; 10+ messages in thread
From: Luse, Paul E @ 2020-06-11 18:21 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 4864 bytes --]

That’s a great question. I can only comment on what’s been made public and there’s no mention of those things in the announcement here https://01.org/blogs/2019/introducing-intel-data-streaming-accelerator

As I’m sure you’re aware, we already support compression/crypto via Intel QAT https://www.intel.com/content/www/us/en/architecture-and-technology/intel-quick-assist-technology-overview.html

Thx
Paul

From: Andrey Kuzmin <andrey.v.kuzmin(a)gmail.com>
Sent: Thursday, June 11, 2020 11:14 AM
To: Storage Performance Development Kit <spdk(a)lists.01.org>
Subject: [SPDK] Re: DIF/DIX acceleration in SPDK

Hi Paul,

does the offload engine support (or plan to support in the future) more complex compute-intensive storage tasks such as compression/decompression, crypto (encryption/strong hashing) etc.?
Thanks,
Andrey

On Thu, Jun 11, 2020, 18:46 Luse, Paul E <paul.e.luse(a)intel.com<mailto:paul.e.luse(a)intel.com>> wrote:
Hi Everyone,

This is primarily for Shuhei but please feel free, anyone, to respond ☺

Adding support for Intel’s next generation offload engine is going well (Note, the feature is not available in HW yet, I’m using a simulator to do dev/test). Currently support exists, or is about to land on master, for:

Copy, fill, dual-cast, CRC32C, compare and the ability to submit batches of commands.

Currently these are only being used by a new tool in /examples/accel/perf but once they all land and I’ve added some more tests, we’ll start using them in SPDK modules – the most notable uses will be for CRC32C (iSCSI) and DIF/DIX throughout the stack.  There will be other uses (compare, fill, copy, etc.) as well but those are the big ones.

I’ve just now started looking at DIF/DIX and have determined that using these within SPDK won’t be quite as straightforward as some of the others. I’ll explain what I’m thinking after briefly summarizing the DSA DIF/DIX functions (more detail is available in the public spec at https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html)

Note: there is no SGL support in any of these, all are single src and/or dst:


  *   DIF Check: The DIF Check operation computes the Data Integrity Field (DIF) on the source data and compares the computed DIF to the DIF contained in the source data.
  *   DIF Insert: The DIF Insert operation copies memory from the Source Address to the Destination Address, while computing the Data Integrity Field (DIF) on the source data and inserting the DIF into the output data.
  *   DIF Strip: The DIF Strip operation copies memory from the Source Address to the Destination Address, removing the Data Integrity Field (DIF). It optionally computes the DIF on the source data and compares the computed DIF to the DIF contained in the source data.
  *   DIF Update: The DIF Update operation copies memory from the Source Address to the Destination Address. It optionally computes the Data Integrity Field (DIF) on the source data and compares the computed DIF to the DIF contained in the data. It simultaneously computes the DIF on the source data using Destination DIF fields in the descriptor and inserts the computed DIF into the output data.
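
As a rough software picture of what Insert, Check and Strip do on single contiguous buffers, here is a minimal sketch. It assumes 512-byte data blocks followed by the standard 8-byte T10 protection information (16-bit guard CRC, 16-bit application tag, 32-bit reference tag, all big-endian) with the reference tag incrementing per block. The demo_* names are made up for illustration and are neither the DSA descriptor format nor the SPDK dif.c API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLOCK 512u   /* data bytes per block (assumed) */
#define DEMO_PI    8u     /* guard(2) + app tag(2) + ref tag(4), big-endian */

/* CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, no reflection. */
static uint16_t demo_crc16_t10dif(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x8BB7u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Build the 8-byte protection information for one data block. */
static void demo_build_pi(uint8_t pi[DEMO_PI], const uint8_t *blk,
                          uint16_t app_tag, uint32_t ref)
{
    uint16_t guard = demo_crc16_t10dif(blk, DEMO_BLOCK);

    pi[0] = (uint8_t)(guard >> 8);   pi[1] = (uint8_t)guard;
    pi[2] = (uint8_t)(app_tag >> 8); pi[3] = (uint8_t)app_tag;
    pi[4] = (uint8_t)(ref >> 24);    pi[5] = (uint8_t)(ref >> 16);
    pi[6] = (uint8_t)(ref >> 8);     pi[7] = (uint8_t)ref;
}

/* DIF Insert: copy each data block from src to dst and append its PI. */
static void demo_dif_insert(uint8_t *dst, const uint8_t *src, size_t blocks,
                            uint16_t app_tag, uint32_t ref_tag)
{
    for (size_t b = 0; b < blocks; b++) {
        uint8_t *out = dst + b * (DEMO_BLOCK + DEMO_PI);

        memcpy(out, src + b * DEMO_BLOCK, DEMO_BLOCK);
        demo_build_pi(out + DEMO_BLOCK, out, app_tag, ref_tag + (uint32_t)b);
    }
}

/* DIF Check: recompute the PI in place; 0 if every block matches, -1 otherwise. */
static int demo_dif_check(const uint8_t *buf, size_t blocks,
                          uint16_t app_tag, uint32_t ref_tag)
{
    for (size_t b = 0; b < blocks; b++) {
        const uint8_t *blk = buf + b * (DEMO_BLOCK + DEMO_PI);
        uint8_t want[DEMO_PI];

        demo_build_pi(want, blk, app_tag, ref_tag + (uint32_t)b);
        if (memcmp(blk + DEMO_BLOCK, want, DEMO_PI) != 0) {
            return -1;
        }
    }
    return 0;
}

/* DIF Strip: copy only the data blocks from src to dst, dropping the PI. */
static void demo_dif_strip(uint8_t *dst, const uint8_t *src, size_t blocks)
{
    for (size_t b = 0; b < blocks; b++) {
        memcpy(dst + b * DEMO_BLOCK, src + b * (DEMO_BLOCK + DEMO_PI), DEMO_BLOCK);
    }
}
```

Note that only the check works in place; insert and strip change the buffer layout (512 vs. 520 bytes per block here), which is why those operations naturally involve a second buffer.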

Upon initial review of the relatively complex implementation of DIF/DIX we have in SPDK I have the following observations that I’m hoping to get some feedback on:


  *   It looks like we require SGL in most if not all cases. I can go through them one by one but wanted to get an initial feel mainly from Shuhei on how lack of SGL support impacts our ability to use DIF/DIX offload w/DSA before I start adding support ☺
  *   With the exception of DIF Check, all of the DSA functions include a copy (I can only assume they figured a use case where they are moving data from a host buffer into a different memory subsystem in prep for DMA’ing to disk).  It looks like most if not all of our calculations are done on fixed buffers. I see a few copy functions in dif.c but I don’t see them used anywhere.

I’m almost thinking the DSA functions are too “simple” for our current implementation but wonder if there’s some refactoring we can do to make use of them. I don’t know if the DSA CRC32C engine calculates the same exact CRC as the DIF/DIX functions but if so (I can verify), at a minimum maybe we could just accelerate the CRCs called from funcs within dif.c.
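
One way to frame the CRC question is with a small comparison sketch. It assumes the DIF guard tag is the standard T10 16-bit CRC (polynomial 0x8BB7) and that the engine's CRC32C is the usual Castagnoli CRC used for iSCSI digests; whether any given DIF/DIX code path needs one or the other would still have to be verified against the code. Both routines are plain bitwise reference implementations with hypothetical names (cmp_*), not SPDK or DSA functions.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC32C (Castagnoli): reflected polynomial 0x82F63B78, init and final XOR 0xFFFFFFFF. */
static uint32_t cmp_crc32c(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0x82F63B78u : (crc >> 1);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* CRC-16/T10-DIF: polynomial 0x8BB7, init 0, no reflection, no final XOR. */
static uint16_t cmp_crc16_t10dif(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x8BB7u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}
```

For the ASCII string "123456789" these return 0xE3069283 and 0xD0DB respectively (the published check values for the two algorithms), so under these assumptions a CRC32C engine could not stand in for the 16-bit guard computation, though it could still serve paths that want a CRC32C over the data.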

Thoughts? We can chat in a community meeting soon too but email might be easier to get us all on the same page first.

Thanks!!
Paul

_______________________________________________
SPDK mailing list -- spdk(a)lists.01.org<mailto:spdk(a)lists.01.org>
To unsubscribe send an email to spdk-leave(a)lists.01.org<mailto:spdk-leave(a)lists.01.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-11 18:13 Andrey Kuzmin
  0 siblings, 0 replies; 10+ messages in thread
From: Andrey Kuzmin @ 2020-06-11 18:13 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 4408 bytes --]

Hi Paul,

does the offload engine support (or plan to support in the future) more
complex compute-intensive storage tasks such as compression/decompression,
crypto (encryption/strong hashing) etc.?

Thanks,
Andrey

On Thu, Jun 11, 2020, 18:46 Luse, Paul E <paul.e.luse(a)intel.com> wrote:

> Hi Everyone,
>
>
>
> This is primarily for Shuhei but please feel free, anyone, to respond ☺
>
>
>
> Adding support for Intel’s next generation offload engine is going well
> (Note, the feature is not available in HW yet, I’m using a simulator to do
> dev/test). Currently support exists, or is about to land on master, for:
>
>
>
> Copy, fill, dual-cast, CRC32C, compare and the ability to submit batches
> of commands.
>
>
>
> Currently these are only being used by a new tool in /examples/accel/perf
> but once they all land and I’ve added some more tests, we’ll start using
> them in SPDK modules – the most notable uses will be for CRC32C (iSCSI) and
> DIF/DIX throughout the stack.  There will be other uses (compare, fill,
> copy, etc) as well but those are the big ones.
>
>
>
> I’ve just now started looking at DIF/DIX and have determined that using
> these within SPDK won’t be quite as straightforward as some of the others.
> I’ll explain what I’m thinking after briefly summarizing the DSA DIF/DIX
> functions (more detail is available in the public spec at
> https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html
> )
>
>
>
> Note: there is no SGL support in any of these, all are single src and/or
> dst:
>
>
>
>    - DIF Check: The DIF Check operation computes the Data Integrity Field
>    (DIF) on the source data and compares the computed DIF to the DIF contained
>    in the source data.
>    - DIF Insert: The DIF Insert operation copies memory from the Source
>    Address to the Destination Address, while computing the Data Integrity
>    Field (DIF) on the source data and inserting the DIF into the output data.
>    - DIF Strip: The DIF Strip operation copies memory from the Source
>    Address to the Destination Address, removing the Data Integrity Field
>    (DIF). It optionally computes the DIF on the source data and compares the
>    computed DIF to the DIF contained in the source data.
>    - DIF Update: The DIF Update operation copies memory from the Source
>    Address to the Destination Address. It optionally computes the Data
>    Integrity Field (DIF) on the source data and compares the computed DIF to
>    the DIF contained in the data. It simultaneously computes the DIF on the
>    source data using Destination DIF fields in the descriptor and inserts the
>    computed DIF into the output data.
>
>
>
> Upon initial review of the relatively complex implementation of DIF/DIX we
> have in SPDK I have the following observations that I’m hoping to get some
> feedback on:
>
>
>
>    - It looks like we require SGL in most if not all cases. I can go
>    through them one by one but wanted to get an initial feel mainly from
>    Shuhei on how lack of SGL support impacts our ability to use DIF/DIX
>    offload w/DSA before I start adding support ☺
>    - With the exception of DIF Check, all of the DSA functions include a
>    copy (I can only assume they figured a use case where they are moving data
>    from a host buffer into a different memory subsystem in prep for DMA’ing to
>    disk).  It looks like most if not all of our calculations are done on fixed
>    buffers. I see a few copy functions in dif.c but I don’t see them used
>    anywhere
>
>
>
> I’m almost thinking the DSA functions are too “simple” for our current
> implementation but wonder if there’s some refactoring we can do to make use
> of them. I don’t know if the DSA CRC32C engine calculates the same exact
> CRC as the DIF/DIX functions but if so (I can verify), at a minimum maybe we
> could just accelerate the CRCs called from funcs within dif.c
>
>
>
> Thoughts? We can chat in a community meeting soon too but email might be
> easier to get us all on the same page first.
>
>
>
> Thanks!!
>
> Paul
>
>
> _______________________________________________
> SPDK mailing list -- spdk(a)lists.01.org
> To unsubscribe send an email to spdk-leave(a)lists.01.org
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [SPDK] Re: DIF/DIX acceleration in SPDK
@ 2020-06-11 18:11 Andrey Kuzmin
  0 siblings, 0 replies; 10+ messages in thread
From: Andrey Kuzmin @ 2020-06-11 18:11 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 4408 bytes --]

Hi Paul,

does the offload engine support (or plan to support in the future) more
complex compute-intensive storage tasks such as compression/decompression,
crypto (encryption/strong hashing) etc.?

Thanks,
Andrey

On Thu, Jun 11, 2020, 18:46 Luse, Paul E <paul.e.luse(a)intel.com> wrote:

> Hi Everyone,
>
>
>
> This is primarily for Shuhei but please feel free, anyone, to respond ☺
>
>
>
> Adding support for Intel’s next generation offload engine is going well
> (Note, the feature is not available in HW yet, I’m using a simulator to do
> dev/test). Currently support exists, or is about to land on master, for:
>
>
>
> Copy, fill, dual-cast, CRC32C, compare and the ability to submit batches
> of commands.
>
>
>
> Currently these are only being used by a new tool in /examples/accel/perf
> but once they all land and I’ve added some more tests, we’ll start using
> them in SPDK modules – the most notable uses will be for CRC32C (iSCSI) and
> DIF/DIX throughout the stack.  There will be other uses (compare, fill,
> copy, etc) as well but those are the big ones.
>
>
>
> I’ve just now started looking at DIF/DIX and have determined that using
> these within SPDK won’t be quite as straightforward as some of the others.
> I’ll explain what I’m thinking after briefly summarizing the DSA DIF/DIX
> functions (more detail is available in the public spec at
> https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html
> )
>
>
>
> Note: there is no SGL support in any of these, all are single src and/or
> dst:
>
>
>
>    - DIF Check: The DIF Check operation computes the Data Integrity Field
>    (DIF) on the source data and compares the computed DIF to the DIF contained
>    in the source data.
>    - DIF Insert: The DIF Insert operation copies memory from the Source
>    Address to the Destination Address, while computing the Data Integrity
>    Field (DIF) on the source data and inserting the DIF into the output data.
>    - DIF Strip: The DIF Strip operation copies memory from the Source
>    Address to the Destination Address, removing the Data Integrity Field
>    (DIF). It optionally computes the DIF on the source data and compares the
>    computed DIF to the DIF contained in the source data.
>    - DIF Update: The DIF Update operation copies memory from the Source
>    Address to the Destination Address. It optionally computes the Data
>    Integrity Field (DIF) on the source data and compares the computed DIF to
>    the DIF contained in the data. It simultaneously computes the DIF on the
>    source data using Destination DIF fields in the descriptor and inserts the
>    computed DIF into the output data.
>
>
>
> Upon initial review of the relatively complex implementation of DIF/DIX we
> have in SPDK I have the following observations that I’m hoping to get some
> feedback on:
>
>
>
>    - It looks like we require SGL in most if not all cases. I can go
>    through them one by one but wanted to get an initial feel mainly from
>    Shuhei on how lack of SGL support impacts our ability to use DIF/DIX
>    offload w/DSA before I start adding support ☺
>    - With the exception of DIF Check, all of the DSA functions include a
>    copy (I can only assume they figured a use case where they are moving data
>    from a host buffer into a different memory subsystem in prep for DMA’ing to
>    disk).  It looks like most if not all of our calculations are done on fixed
>    buffers. I see a few copy functions in dif.c but I don’t see them used
>    anywhere
>
>
>
> I’m almost thinking the DSA functions are too “simple” for our current
> implementation but wonder if there’s some refactoring we can do to make use
> of them. I don’t know if the DSA CRC32C engine calculates the same exact
> CRC as the DIF/DIX functions but if so (I can verify), at a minimum maybe we
> could just accelerate the CRCs called from funcs within dif.c
>
>
>
> Thoughts? We can chat in a community meeting soon too but email might be
> easier to get us all on the same page first.
>
>
>
> Thanks!!
>
> Paul
>
>
> _______________________________________________
> SPDK mailing list -- spdk(a)lists.01.org
> To unsubscribe send an email to spdk-leave(a)lists.01.org
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-06-14 19:27 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-12 21:34 [SPDK] Re: DIF/DIX acceleration in SPDK Luse, Paul E
  -- strict thread matches above, loose matches on Subject: below --
2020-06-14 19:27 Luse, Paul E
2020-06-14 16:53 Sasha Kotchubievsky
2020-06-14 16:38 Sasha Kotchubievsky
2020-06-12 13:45 松本周平
2020-06-11 20:20 Luse, Paul E
2020-06-11 20:00 Walker, Benjamin
2020-06-11 18:21 Luse, Paul E
2020-06-11 18:13 Andrey Kuzmin
2020-06-11 18:11 Andrey Kuzmin