All of lore.kernel.org
 help / color / mirror / Atom feed
* [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
@ 2021-11-10 17:43 Itai Handler
  2021-11-11 13:07 ` Milan Broz
  0 siblings, 1 reply; 8+ messages in thread
From: Itai Handler @ 2021-11-10 17:43 UTC (permalink / raw)
  To: dm-devel; +Cc: snitzer, agk

Maximum sector size of dm-crypt is currently limited to 4096 bytes.

On systems where PAGE_SIZE is larger than 4096 bytes, using larger
sectors can be beneficial for performance reasons.

This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
and in addition it changes the type of sector_size in
struct crypt_config from 'unsigned short int' to 'unsigned int', in
order to be able to represent larger values.

On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
x2 performance improvement in sequential read throughput benchmark
while using only about half of the CPU usage, after simply increasing
sector size from 4096 to 65536 bytes.
I used ext4 filesystem for that benchmark, which supports 64KiB
sectors.

Note: A small change should be made in cryptsetup in order to add
support for sectors larger than 4096 bytes.

Signed-off-by: Itai Handler <itai.handler@gmail.com>
---
 drivers/md/dm-crypt.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 916b7da16de2..78c239443bd5 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -168,7 +168,7 @@ struct crypt_config {
        } iv_gen_private;
        u64 iv_offset;
        unsigned int iv_size;
-       unsigned short int sector_size;
+       unsigned int sector_size;
        unsigned char sector_shift;

        union {
@@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
*ti, unsigned int argc, char **ar
                        cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
                        if (!cc->cipher_auth)
                                return -ENOMEM;
-               } else if (sscanf(opt_string, "sector_size:%hu%c",
&cc->sector_size, &dummy) == 1) {
+               } else if (sscanf(opt_string, "sector_size:%u%c",
&cc->sector_size, &dummy) == 1) {
                        if (cc->sector_size < (1 << SECTOR_SHIFT) ||
-                           cc->sector_size > 4096 ||
+                           cc->sector_size > PAGE_SIZE ||
                            (cc->sector_size & (cc->sector_size - 1))) {
                                ti->error = "Invalid feature value for
sector_size";
                                return -EINVAL;
-- 
2.17.1

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-10 17:43 [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE Itai Handler
@ 2021-11-11 13:07 ` Milan Broz
  2021-11-12 15:32   ` Mike Snitzer
  2021-11-14 11:56   ` Itai Handler
  0 siblings, 2 replies; 8+ messages in thread
From: Milan Broz @ 2021-11-11 13:07 UTC (permalink / raw)
  To: Itai Handler, dm-devel; +Cc: Mikulas Patocka, agk, snitzer

On 10/11/2021 18:43, Itai Handler wrote:
> Maximum sector size of dm-crypt is currently limited to 4096 bytes.
> 
> On systems where PAGE_SIZE is larger than 4096 bytes, using larger
> sectors can be beneficial for performance reasons.

The limit to 4096 was set because this is the smallest possible
page size that all platform supports.

If you allow a higher size here, the device cannot be activated on a platform
with the smaller page size. (Encrypted sector size becomes
atomic sector size for all upper layers - as you mention below, not
all fs support bigger sectors.)

For LUKS, this is not acceptable - the format is portable by definition.

For specific dm-crypt device, I am not sure. I would better kept
the 4096 page size limit here.

It also depends on crypto API driver here (performance is usually optimized to 4k).
What cipher and encryption mode did you use for test?

How the number looks for random access? Linear test is usually misleading.
I expect there will be big performance problem if you write small data chunks,
writes and encryption will be amplified to full big sectors here...)

(Technical detail: such pat MUST increase dm-crypt minor version.)

Milan

> 
> This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
> and in addition it changes the type of sector_size in
> struct crypt_config from 'unsigned short int' to 'unsigned int', in
> order to be able to represent larger values.
> 
> On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
> x2 performance improvement in sequential read throughput benchmark
> while using only about half of the CPU usage, after simply increasing
> sector size from 4096 to 65536 bytes.
> I used ext4 filesystem for that benchmark, which supports 64KiB
> sectors.
> 
> Note: A small change should be made in cryptsetup in order to add
> support for sectors larger than 4096 bytes.
> 
> Signed-off-by: Itai Handler <itai.handler@gmail.com>
> ---
>   drivers/md/dm-crypt.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 916b7da16de2..78c239443bd5 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -168,7 +168,7 @@ struct crypt_config {
>          } iv_gen_private;
>          u64 iv_offset;
>          unsigned int iv_size;
> -       unsigned short int sector_size;
> +       unsigned int sector_size;
>          unsigned char sector_shift;
> 
>          union {
> @@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
> *ti, unsigned int argc, char **ar
>                          cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
>                          if (!cc->cipher_auth)
>                                  return -ENOMEM;
> -               } else if (sscanf(opt_string, "sector_size:%hu%c",
> &cc->sector_size, &dummy) == 1) {
> +               } else if (sscanf(opt_string, "sector_size:%u%c",
> &cc->sector_size, &dummy) == 1) {
>                          if (cc->sector_size < (1 << SECTOR_SHIFT) ||
> -                           cc->sector_size > 4096 ||
> +                           cc->sector_size > PAGE_SIZE ||
>                              (cc->sector_size & (cc->sector_size - 1))) {
>                                  ti->error = "Invalid feature value for
> sector_size";
>                                  return -EINVAL;
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-11 13:07 ` Milan Broz
@ 2021-11-12 15:32   ` Mike Snitzer
  2021-11-14 11:56   ` Itai Handler
  1 sibling, 0 replies; 8+ messages in thread
From: Mike Snitzer @ 2021-11-12 15:32 UTC (permalink / raw)
  To: Milan Broz, Itai Handler; +Cc: dm-devel, Mikulas Patocka, agk

On Thu, Nov 11 2021 at  8:07P -0500,
Milan Broz <gmazyland@gmail.com> wrote:

> On 10/11/2021 18:43, Itai Handler wrote:
> > Maximum sector size of dm-crypt is currently limited to 4096 bytes.
> > 
> > On systems where PAGE_SIZE is larger than 4096 bytes, using larger
> > sectors can be beneficial for performance reasons.
> 
> The limit to 4096 was set because this is the smallest possible
> page size that all platform supports.
> 
> If you allow a higher size here, the device cannot be activated on a platform
> with the smaller page size. (Encrypted sector size becomes
> atomic sector size for all upper layers - as you mention below, not
> all fs support bigger sectors.)
> 
> For LUKS, this is not acceptable - the format is portable by definition.
> 
> For specific dm-crypt device, I am not sure. I would better kept
> the 4096 page size limit here.
> 
> It also depends on crypto API driver here (performance is usually optimized to 4k).
> What cipher and encryption mode did you use for test?
> 
> How the number looks for random access? Linear test is usually misleading.
> I expect there will be big performance problem if you write small data chunks,
> writes and encryption will be amplified to full big sectors here...)
> 
> (Technical detail: such patch MUST increase dm-crypt minor version.)
> 
> Milan

Thanks for your insight Milan, much appreciated.

> > 
> > This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
> > and in addition it changes the type of sector_size in
> > struct crypt_config from 'unsigned short int' to 'unsigned int', in
> > order to be able to represent larger values.
> > 
> > On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
> > x2 performance improvement in sequential read throughput benchmark
> > while using only about half of the CPU usage, after simply increasing
> > sector size from 4096 to 65536 bytes.
> > I used ext4 filesystem for that benchmark, which supports 64KiB
> > sectors.
> > 
> > Note: A small change should be made in cryptsetup in order to add
> > support for sectors larger than 4096 bytes.
> > 
> > Signed-off-by: Itai Handler <itai.handler@gmail.com>

Your patch is way too raw.  It doesn't come with any understanding of
all the nuanced reasons for 4096 that Milan raised above, and so I've
dropped your patch from further consideration.

Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-11 13:07 ` Milan Broz
  2021-11-12 15:32   ` Mike Snitzer
@ 2021-11-14 11:56   ` Itai Handler
  2021-11-25 16:28     ` Itai Handler
  1 sibling, 1 reply; 8+ messages in thread
From: Itai Handler @ 2021-11-14 11:56 UTC (permalink / raw)
  To: Milan Broz, dm-devel; +Cc: Mikulas Patocka, agk, snitzer

On 11/11/2021 15:07, Milan Broz wrote:
> On 10/11/2021 18:43, Itai Handler wrote:
>> Maximum sector size of dm-crypt is currently limited to 4096 bytes.
>>
>> On systems where PAGE_SIZE is larger than 4096 bytes, using larger
>> sectors can be beneficial for performance reasons.
>
> The limit to 4096 was set because this is the smallest possible
> page size that all platform supports.
>
> If you allow a higher size here, the device cannot be activated on a 
> platform
> with the smaller page size. (Encrypted sector size becomes
> atomic sector size for all upper layers - as you mention below, not
> all fs support bigger sectors.)
>
> For LUKS, this is not acceptable - the format is portable by definition.
>
> For specific dm-crypt device, I am not sure. I would better kept
> the 4096 page size limit here.

I considered only plain dm-crypt since I am unfamiliar with LUKS.
Does LUKS assume that dm-crypt sector size is limited to 4K?
If so, maybe I'll be able to also patch LUKS regarding this issue.

>
> It also depends on crypto API driver here (performance is usually 
> optimized to 4k).
> What cipher and encryption mode did you use for test?

The cipher I used for the test is not publicly available but I can say 
that it's performance
is not optimized to 4k blocks.
I believe that this results from the high overhead of setting up DMA 
transfers. (my
cipher uses DMA to transfer data between memory and programmable logic).
There are many additional ciphers that use DMA in the tree, but I cannot 
run any
benchmark with them at the moment.

I have performed some additional benchmarks using the ARM Cryptographic 
Extensions
CPU ciphers and saw that increasing block size beyond 4K does increase 
performance,
albeit the performance improvement isn't as large as I've seen when 
using my cipher.

Following are "tcrypt mode=600 sec=5 num_mb=512" results for xts-aes-ce 
decryption
(ARM CPU Cryptographic Extensions cipher):
   testing speed of multibuffer xts(aes) (xts-aes-ce) decryption
   ...
   trcypt: test 5 (256 bit key, 4096 byte blocks): 801792 operations in 
5 seconds (3284140032 bytes)
   ...
   trcypt: test 9 (256 bit key, 65536 byte blocks): 63488 operations in 
5 seconds (4160749568 bytes)

That translates to:
   657 MB/s for 4K byte blocks.
   832 MB/s for 64K blocks.

That means that there is about 27 percents improvement when 
transitioning to 64K sectors,
for the cipher alone (only tcrypt benchmark).

This benchmark had been performed on an ARM Cortex A53 CPU.
(Note that in all of my benchmarks PAGE_SIZE=64K).

> How the number looks for random access? Linear test is usually 
> misleading.
> I expect there will be big performance problem if you write small data 
> chunks,
> writes and encryption will be amplified to full big sectors here...)
I understand your concern.
However my patch does not force anyone to use large sectors - it only 
opens up this
possibility for those interested in that option.
This is similarly to the option to format an ext4 filesystem with 64K 
sectors.
By the way: when you do that, you get a warning saying that the filesystem
will not be usable on most systems.

Sometime users need to store mostly large files on a filesystem, for 
example for
backup or for video files.
I think that in these cases random access time is not so important.
Some users may also be able to reserve a dedicated partition for storing 
such
large files.

>
> (Technical detail: such pat MUST increase dm-crypt minor version.)
Thanks for pointing that out. I believe that I can prepare a v2 patch 
that will
address that issue.
>
> Milan
>
>>
>> This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
>> and in addition it changes the type of sector_size in
>> struct crypt_config from 'unsigned short int' to 'unsigned int', in
>> order to be able to represent larger values.
>>
>> On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
>> x2 performance improvement in sequential read throughput benchmark
>> while using only about half of the CPU usage, after simply increasing
>> sector size from 4096 to 65536 bytes.
>> I used ext4 filesystem for that benchmark, which supports 64KiB
>> sectors.
>>
>> Note: A small change should be made in cryptsetup in order to add
>> support for sectors larger than 4096 bytes.
>>
>> Signed-off-by: Itai Handler <itai.handler@gmail.com>
>> ---
>>   drivers/md/dm-crypt.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>> index 916b7da16de2..78c239443bd5 100644
>> --- a/drivers/md/dm-crypt.c
>> +++ b/drivers/md/dm-crypt.c
>> @@ -168,7 +168,7 @@ struct crypt_config {
>>          } iv_gen_private;
>>          u64 iv_offset;
>>          unsigned int iv_size;
>> -       unsigned short int sector_size;
>> +       unsigned int sector_size;
>>          unsigned char sector_shift;
>>
>>          union {
>> @@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
>> *ti, unsigned int argc, char **ar
>>                          cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
>>                          if (!cc->cipher_auth)
>>                                  return -ENOMEM;
>> -               } else if (sscanf(opt_string, "sector_size:%hu%c",
>> &cc->sector_size, &dummy) == 1) {
>> +               } else if (sscanf(opt_string, "sector_size:%u%c",
>> &cc->sector_size, &dummy) == 1) {
>>                          if (cc->sector_size < (1 << SECTOR_SHIFT) ||
>> -                           cc->sector_size > 4096 ||
>> +                           cc->sector_size > PAGE_SIZE ||
>>                              (cc->sector_size & (cc->sector_size - 
>> 1))) {
>>                                  ti->error = "Invalid feature value for
>> sector_size";
>>                                  return -EINVAL;
>>
>
I appreciate your valuable comments.

Itai

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-14 11:56   ` Itai Handler
@ 2021-11-25 16:28     ` Itai Handler
  2021-11-27 11:19       ` Milan Broz
  0 siblings, 1 reply; 8+ messages in thread
From: Itai Handler @ 2021-11-25 16:28 UTC (permalink / raw)
  To: Milan Broz, dm-devel; +Cc: Mikulas Patocka, agk, snitzer

On 14/11/2021 13:56, Itai Handler wrote:
> On 11/11/2021 15:07, Milan Broz wrote:
>> On 10/11/2021 18:43, Itai Handler wrote:
>>> Maximum sector size of dm-crypt is currently limited to 4096 bytes.
>>>
>>> On systems where PAGE_SIZE is larger than 4096 bytes, using larger
>>> sectors can be beneficial for performance reasons.
>>
>> The limit to 4096 was set because this is the smallest possible
>> page size that all platform supports.
>>
>> If you allow a higher size here, the device cannot be activated on a 
>> platform
>> with the smaller page size. (Encrypted sector size becomes
>> atomic sector size for all upper layers - as you mention below, not
>> all fs support bigger sectors.)
>>
>> For LUKS, this is not acceptable - the format is portable by definition.
>>
>> For specific dm-crypt device, I am not sure. I would better kept
>> the 4096 page size limit here.
>
> I considered only plain dm-crypt since I am unfamiliar with LUKS.
> Does LUKS assume that dm-crypt sector size is limited to 4K?
> If so, maybe I'll be able to also patch LUKS regarding this issue.
>
>>
>> It also depends on crypto API driver here (performance is usually 
>> optimized to 4k).
>> What cipher and encryption mode did you use for test?
>
> The cipher I used for the test is not publicly available but I can say 
> that it's performance
> is not optimized to 4k blocks.
> I believe that this results from the high overhead of setting up DMA 
> transfers. (my
> cipher uses DMA to transfer data between memory and programmable logic).
> There are many additional ciphers that use DMA in the tree, but I 
> cannot run any
> benchmark with them at the moment.
>
> I have performed some additional benchmarks using the ARM 
> Cryptographic Extensions
> CPU ciphers and saw that increasing block size beyond 4K does increase 
> performance,
> albeit the performance improvement isn't as large as I've seen when 
> using my cipher.
>
> Following are "tcrypt mode=600 sec=5 num_mb=512" results for 
> xts-aes-ce decryption
> (ARM CPU Cryptographic Extensions cipher):
>   testing speed of multibuffer xts(aes) (xts-aes-ce) decryption
>   ...
>   trcypt: test 5 (256 bit key, 4096 byte blocks): 801792 operations in 
> 5 seconds (3284140032 bytes)
>   ...
>   trcypt: test 9 (256 bit key, 65536 byte blocks): 63488 operations in 
> 5 seconds (4160749568 bytes)
>
> That translates to:
>   657 MB/s for 4K byte blocks.
>   832 MB/s for 64K blocks.
>
> That means that there is about 27 percents improvement when 
> transitioning to 64K sectors,
> for the cipher alone (only tcrypt benchmark).
>
> This benchmark had been performed on an ARM Cortex A53 CPU.
> (Note that in all of my benchmarks PAGE_SIZE=64K).
>
>> How the number looks for random access? Linear test is usually 
>> misleading.
>> I expect there will be big performance problem if you write small 
>> data chunks,
>> writes and encryption will be amplified to full big sectors here...)
> I understand your concern.
> However my patch does not force anyone to use large sectors - it only 
> opens up this
> possibility for those interested in that option.
> This is similarly to the option to format an ext4 filesystem with 64K 
> sectors.
> By the way: when you do that, you get a warning saying that the 
> filesystem
> will not be usable on most systems.
>
> Sometime users need to store mostly large files on a filesystem, for 
> example for
> backup or for video files.
> I think that in these cases random access time is not so important.
> Some users may also be able to reserve a dedicated partition for 
> storing such
> large files.
>
>>
>> (Technical detail: such pat MUST increase dm-crypt minor version.)
> Thanks for pointing that out. I believe that I can prepare a v2 patch 
> that will
> address that issue.
>>
>> Milan
>>
>>>
>>> This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
>>> and in addition it changes the type of sector_size in
>>> struct crypt_config from 'unsigned short int' to 'unsigned int', in
>>> order to be able to represent larger values.
>>>
>>> On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
>>> x2 performance improvement in sequential read throughput benchmark
>>> while using only about half of the CPU usage, after simply increasing
>>> sector size from 4096 to 65536 bytes.
>>> I used ext4 filesystem for that benchmark, which supports 64KiB
>>> sectors.
>>>
>>> Note: A small change should be made in cryptsetup in order to add
>>> support for sectors larger than 4096 bytes.
>>>
>>> Signed-off-by: Itai Handler <itai.handler@gmail.com>
>>> ---
>>>   drivers/md/dm-crypt.c | 6 +++---
>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>>> index 916b7da16de2..78c239443bd5 100644
>>> --- a/drivers/md/dm-crypt.c
>>> +++ b/drivers/md/dm-crypt.c
>>> @@ -168,7 +168,7 @@ struct crypt_config {
>>>          } iv_gen_private;
>>>          u64 iv_offset;
>>>          unsigned int iv_size;
>>> -       unsigned short int sector_size;
>>> +       unsigned int sector_size;
>>>          unsigned char sector_shift;
>>>
>>>          union {
>>> @@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
>>> *ti, unsigned int argc, char **ar
>>>                          cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
>>>                          if (!cc->cipher_auth)
>>>                                  return -ENOMEM;
>>> -               } else if (sscanf(opt_string, "sector_size:%hu%c",
>>> &cc->sector_size, &dummy) == 1) {
>>> +               } else if (sscanf(opt_string, "sector_size:%u%c",
>>> &cc->sector_size, &dummy) == 1) {
>>>                          if (cc->sector_size < (1 << SECTOR_SHIFT) ||
>>> -                           cc->sector_size > 4096 ||
>>> +                           cc->sector_size > PAGE_SIZE ||
>>>                              (cc->sector_size & (cc->sector_size - 
>>> 1))) {
>>>                                  ti->error = "Invalid feature value for
>>> sector_size";
>>>                                  return -EINVAL;
>>>
>>
> I appreciate your valuable comments.
>
> Itai
>
Milan, can you comment on the above?

Itai

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-25 16:28     ` Itai Handler
@ 2021-11-27 11:19       ` Milan Broz
  2021-11-28 15:33         ` Itai Handler
  0 siblings, 1 reply; 8+ messages in thread
From: Milan Broz @ 2021-11-27 11:19 UTC (permalink / raw)
  To: Itai Handler, dm-devel; +Cc: Mikulas Patocka, agk, snitzer

On 11/25/21 17:28, Itai Handler wrote:
> On 14/11/2021 13:56, Itai Handler wrote:
>> On 11/11/2021 15:07, Milan Broz wrote:
>>> On 10/11/2021 18:43, Itai Handler wrote:
>>>> Maximum sector size of dm-crypt is currently limited to 4096 bytes.
>>>>
>>>> On systems where PAGE_SIZE is larger than 4096 bytes, using larger
>>>> sectors can be beneficial for performance reasons.
>>>
>>> The limit to 4096 was set because this is the smallest possible
>>> page size that all platform supports.
>>>
>>> If you allow a higher size here, the device cannot be activated on a
>>> platform
>>> with the smaller page size. (Encrypted sector size becomes
>>> atomic sector size for all upper layers - as you mention below, not
>>> all fs support bigger sectors.)
>>>
>>> For LUKS, this is not acceptable - the format is portable by definition.
>>>
>>> For specific dm-crypt device, I am not sure. I would better kept
>>> the 4096 page size limit here.
>>
>> I considered only plain dm-crypt since I am unfamiliar with LUKS.
>> Does LUKS assume that dm-crypt sector size is limited to 4K?
>> If so, maybe I'll be able to also patch LUKS regarding this issue.
>>
>>>
>>> It also depends on crypto API driver here (performance is usually
>>> optimized to 4k).
>>> What cipher and encryption mode did you use for test?
>>
>> The cipher I used for the test is not publicly available but I can say
>> that it's performance
>> is not optimized to 4k blocks.
>> I believe that this results from the high overhead of setting up DMA
>> transfers. (my
>> cipher uses DMA to transfer data between memory and programmable logic).
>> There are many additional ciphers that use DMA in the tree, but I
>> cannot run any
>> benchmark with them at the moment.
>>
>> I have performed some additional benchmarks using the ARM
>> Cryptographic Extensions
>> CPU ciphers and saw that increasing block size beyond 4K does increase
>> performance,
>> albeit the performance improvement isn't as large as I've seen when
>> using my cipher.
>>
>> Following are "tcrypt mode=600 sec=5 num_mb=512" results for
>> xts-aes-ce decryption
>> (ARM CPU Cryptographic Extensions cipher):
>>    testing speed of multibuffer xts(aes) (xts-aes-ce) decryption
>>    ...
>>    trcypt: test 5 (256 bit key, 4096 byte blocks): 801792 operations in
>> 5 seconds (3284140032 bytes)
>>    ...
>>    trcypt: test 9 (256 bit key, 65536 byte blocks): 63488 operations in
>> 5 seconds (4160749568 bytes)
>>
>> That translates to:
>>    657 MB/s for 4K byte blocks.
>>    832 MB/s for 64K blocks.
>>
>> That means that there is about 27 percents improvement when
>> transitioning to 64K sectors,
>> for the cipher alone (only tcrypt benchmark).
>>
>> This benchmark had been performed on an ARM Cortex A53 CPU.
>> (Note that in all of my benchmarks PAGE_SIZE=64K).
>>
>>> How the number looks for random access? Linear test is usually
>>> misleading.
>>> I expect there will be big performance problem if you write small
>>> data chunks,
>>> writes and encryption will be amplified to full big sectors here...)
>> I understand your concern.
>> However my patch does not force anyone to use large sectors - it only
>> opens up this
>> possibility for those interested in that option.
>> This is similarly to the option to format an ext4 filesystem with 64K
>> sectors.
>> By the way: when you do that, you get a warning saying that the
>> filesystem
>> will not be usable on most systems.
>>
>> Sometime users need to store mostly large files on a filesystem, for
>> example for
>> backup or for video files.
>> I think that in these cases random access time is not so important.
>> Some users may also be able to reserve a dedicated partition for
>> storing such
>> large files.
>>
>>>
>>> (Technical detail: such pat MUST increase dm-crypt minor version.)
>> Thanks for pointing that out. I believe that I can prepare a v2 patch
>> that will
>> address that issue.
>>>
>>> Milan
>>>
>>>>
>>>> This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
>>>> and in addition it changes the type of sector_size in
>>>> struct crypt_config from 'unsigned short int' to 'unsigned int', in
>>>> order to be able to represent larger values.
>>>>
>>>> On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
>>>> x2 performance improvement in sequential read throughput benchmark
>>>> while using only about half of the CPU usage, after simply increasing
>>>> sector size from 4096 to 65536 bytes.
>>>> I used ext4 filesystem for that benchmark, which supports 64KiB
>>>> sectors.
>>>>
>>>> Note: A small change should be made in cryptsetup in order to add
>>>> support for sectors larger than 4096 bytes.
>>>>
>>>> Signed-off-by: Itai Handler <itai.handler@gmail.com>
>>>> ---
>>>>    drivers/md/dm-crypt.c | 6 +++---
>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>>>> index 916b7da16de2..78c239443bd5 100644
>>>> --- a/drivers/md/dm-crypt.c
>>>> +++ b/drivers/md/dm-crypt.c
>>>> @@ -168,7 +168,7 @@ struct crypt_config {
>>>>           } iv_gen_private;
>>>>           u64 iv_offset;
>>>>           unsigned int iv_size;
>>>> -       unsigned short int sector_size;
>>>> +       unsigned int sector_size;
>>>>           unsigned char sector_shift;
>>>>
>>>>           union {
>>>> @@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
>>>> *ti, unsigned int argc, char **ar
>>>>                           cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
>>>>                           if (!cc->cipher_auth)
>>>>                                   return -ENOMEM;
>>>> -               } else if (sscanf(opt_string, "sector_size:%hu%c",
>>>> &cc->sector_size, &dummy) == 1) {
>>>> +               } else if (sscanf(opt_string, "sector_size:%u%c",
>>>> &cc->sector_size, &dummy) == 1) {
>>>>                           if (cc->sector_size < (1 << SECTOR_SHIFT) ||
>>>> -                           cc->sector_size > 4096 ||
>>>> +                           cc->sector_size > PAGE_SIZE ||
>>>>                               (cc->sector_size & (cc->sector_size -
>>>> 1))) {
>>>>                                   ti->error = "Invalid feature value for
>>>> sector_size";
>>>>                                   return -EINVAL;
>>>>
>>>
>> I appreciate your valuable comments.
>>
>> Itai
>>
> Milan, can you comment on the above?

Hi,

well, if you want my opinion, let's summarize it:

- you tested it on a driver that is proprietary and is not in
the mainline kernel

- provided numbers are with tcrypt - I think this is
an internal crypto API test; it says nothing about performance
with your dm-crypt patch above it, right?

- no numbers for random access/small file access here (note,
if you need to reencrypt 64k sector vs. 4k of mostly unused data,
it WILL have a huge performance impact).

IMO, optimizing for big linear access is better on the fs layer
(and even years ago, I noticed some vendor's patches for setting
large fs sectors in video recording equipment).

The same applies to encryption - I think FDE is not the best layer
here if you need to work with large sectors (encryption of not-used
sectors is just huge waste of resources - fs encrypts only used space,
I hope... :).

I said that LUKS must remain multiplatform. Adding larger than
4k sectors is not an option; I will undoubtedly reject that patch
in cryptsetup.
So this patch can be used only for plain dm-crypt mapping,
where you need to handle your key management (that could be ok, though).

Sorry, but without any performance numbers that prove this really
helps and does not create huge performance problems...

NACK from me for this patch (but it is up to Mike to decide here).

Milan

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-27 11:19       ` Milan Broz
@ 2021-11-28 15:33         ` Itai Handler
  2022-03-13 14:53           ` Itai Handler
  0 siblings, 1 reply; 8+ messages in thread
From: Itai Handler @ 2021-11-28 15:33 UTC (permalink / raw)
  To: Milan Broz, dm-devel; +Cc: Mikulas Patocka, agk, snitzer

On 27/11/2021 13:19, Milan Broz wrote:
> On 11/25/21 17:28, Itai Handler wrote:
>> On 14/11/2021 13:56, Itai Handler wrote:
>>> On 11/11/2021 15:07, Milan Broz wrote:
>>>> On 10/11/2021 18:43, Itai Handler wrote:
>>>>> Maximum sector size of dm-crypt is currently limited to 4096 bytes.
>>>>>
>>>>> On systems where PAGE_SIZE is larger than 4096 bytes, using larger
>>>>> sectors can be beneficial for performance reasons.
>>>>
>>>> The limit to 4096 was set because this is the smallest possible
>>>> page size that all platform supports.
>>>>
>>>> If you allow a higher size here, the device cannot be activated on a
>>>> platform
>>>> with the smaller page size. (Encrypted sector size becomes
>>>> atomic sector size for all upper layers - as you mention below, not
>>>> all fs support bigger sectors.)
>>>>
>>>> For LUKS, this is not acceptable - the format is portable by definition.
>>>>
>>>> For specific dm-crypt device, I am not sure. I would better kept
>>>> the 4096 page size limit here.
>>>
>>> I considered only plain dm-crypt since I am unfamiliar with LUKS.
>>> Does LUKS assume that dm-crypt sector size is limited to 4K?
>>> If so, maybe I'll be able to also patch LUKS regarding this issue.
>>>
>>>>
>>>> It also depends on crypto API driver here (performance is usually
>>>> optimized to 4k).
>>>> What cipher and encryption mode did you use for test?
>>>
>>> The cipher I used for the test is not publicly available but I can say
>>> that it's performance
>>> is not optimized to 4k blocks.
>>> I believe that this results from the high overhead of setting up DMA
>>> transfers. (my
>>> cipher uses DMA to transfer data between memory and programmable logic).
>>> There are many additional ciphers that use DMA in the tree, but I
>>> cannot run any
>>> benchmark with them at the moment.
>>>
>>> I have performed some additional benchmarks using the ARM
>>> Cryptographic Extensions
>>> CPU ciphers and saw that increasing block size beyond 4K does increase
>>> performance,
>>> albeit the performance improvement isn't as large as I've seen when
>>> using my cipher.
>>>
>>> Following are "tcrypt mode=600 sec=5 num_mb=512" results for
>>> xts-aes-ce decryption
>>> (ARM CPU Cryptographic Extensions cipher):
>>>    testing speed of multibuffer xts(aes) (xts-aes-ce) decryption
>>>    ...
>>>    trcypt: test 5 (256 bit key, 4096 byte blocks): 801792 operations in
>>> 5 seconds (3284140032 bytes)
>>>    ...
>>>    trcypt: test 9 (256 bit key, 65536 byte blocks): 63488 operations in
>>> 5 seconds (4160749568 bytes)
>>>
>>> That translates to:
>>>    657 MB/s for 4K byte blocks.
>>>    832 MB/s for 64K blocks.
>>>
>>> That means that there is about 27 percents improvement when
>>> transitioning to 64K sectors,
>>> for the cipher alone (only tcrypt benchmark).
>>>
>>> This benchmark had been performed on an ARM Cortex A53 CPU.
>>> (Note that in all of my benchmarks PAGE_SIZE=64K).
>>>
>>>> How the number looks for random access? Linear test is usually
>>>> misleading.
>>>> I expect there will be big performance problem if you write small
>>>> data chunks,
>>>> writes and encryption will be amplified to full big sectors here...)
>>> I understand your concern.
>>> However my patch does not force anyone to use large sectors - it only
>>> opens up this
>>> possibility for those interested in that option.
>>> This is similarly to the option to format an ext4 filesystem with 64K
>>> sectors.
>>> By the way: when you do that, you get a warning saying that the
>>> filesystem
>>> will not be usable on most systems.
>>>
>>> Sometime users need to store mostly large files on a filesystem, for
>>> example for
>>> backup or for video files.
>>> I think that in these cases random access time is not so important.
>>> Some users may also be able to reserve a dedicated partition for
>>> storing such
>>> large files.
>>>
>>>>
>>>> (Technical detail: such pat MUST increase dm-crypt minor version.)
>>> Thanks for pointing that out. I believe that I can prepare a v2 patch
>>> that will
>>> address that issue.
>>>>
>>>> Milan
>>>>
>>>>>
>>>>> This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
>>>>> and in addition it changes the type of sector_size in
>>>>> struct crypt_config from 'unsigned short int' to 'unsigned int', in
>>>>> order to be able to represent larger values.
>>>>>
>>>>> On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
>>>>> x2 performance improvement in sequential read throughput benchmark
>>>>> while using only about half of the CPU usage, after simply increasing
>>>>> sector size from 4096 to 65536 bytes.
>>>>> I used ext4 filesystem for that benchmark, which supports 64KiB
>>>>> sectors.
>>>>>
>>>>> Note: A small change should be made in cryptsetup in order to add
>>>>> support for sectors larger than 4096 bytes.
>>>>>
>>>>> Signed-off-by: Itai Handler <itai.handler@gmail.com>
>>>>> ---
>>>>>    drivers/md/dm-crypt.c | 6 +++---
>>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>>>>> index 916b7da16de2..78c239443bd5 100644
>>>>> --- a/drivers/md/dm-crypt.c
>>>>> +++ b/drivers/md/dm-crypt.c
>>>>> @@ -168,7 +168,7 @@ struct crypt_config {
>>>>>           } iv_gen_private;
>>>>>           u64 iv_offset;
>>>>>           unsigned int iv_size;
>>>>> -       unsigned short int sector_size;
>>>>> +       unsigned int sector_size;
>>>>>           unsigned char sector_shift;
>>>>>
>>>>>           union {
>>>>> @@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
>>>>> *ti, unsigned int argc, char **ar
>>>>>                           cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
>>>>>                           if (!cc->cipher_auth)
>>>>>                                   return -ENOMEM;
>>>>> -               } else if (sscanf(opt_string, "sector_size:%hu%c",
>>>>> &cc->sector_size, &dummy) == 1) {
>>>>> +               } else if (sscanf(opt_string, "sector_size:%u%c",
>>>>> &cc->sector_size, &dummy) == 1) {
>>>>>                           if (cc->sector_size < (1 << SECTOR_SHIFT) ||
>>>>> -                           cc->sector_size > 4096 ||
>>>>> +                           cc->sector_size > PAGE_SIZE ||
>>>>>                               (cc->sector_size & (cc->sector_size -
>>>>> 1))) {
>>>>>                                   ti->error = "Invalid feature value for
>>>>> sector_size";
>>>>>                                   return -EINVAL;
>>>>>
>>>>
>>> I appreciate your valuable comments.
>>>
>>> Itai
>>>
>> Milan, can you comment on the above?
>
> Hi,
>
> well, if you want my opinion, let's summarize it:
>
> - you tested it on a driver that is proprietary and is not in
> the mainline kernel
You are correct here. I also suspect that some drivers in the mainline may
benefit as well, but I didn't have the opportunity to verify that yet.
>
> - provided numbers are with tcrypt - I think this is
> an internal crypto API test; it says nothing about performance
> with your dm-crypt patch above it, right?
You are correct. Please see additional benchmark results below.
>
> - no numbers for random access/small file access here (note,
> if you need to reencrypt 64k sector vs. 4k of mostly unused data,
> it WILL have a huge performance impact).
>
> IMO, optimizing for big linear access is better on the fs layer
> (and even years ago, I noticed some vendor's patches for setting
> large fs sectors in video recording equipment).
Interesting. Can you point me to some of those patches?
>
> The same applies to encryption - I think FDE is not the best layer
> here if you need to work with large sectors (encryption of not-used
> sectors is just huge waste of resources - fs encrypts only used space,
> I hope... :).
>
> I said that LUKS must remain multiplatform. Adding larger than
> 4k sectors is not an option; I will undoubtedly reject that patch
> in cryptsetup.
> So this patch can be used only for plain dm-crypt mapping,
> where you need to handle your key management (that could be ok, though).
In the future I plan to submit a cryptsetup patch that will allow
larger sectors only for the plain dm-crypt mapping. Such a change will
not affectthe portability of LUKS.
>
> Sorry, but without any performance numbers that prove this really
> helps and does not create huge performance problems...

Following are some benchmark results using the ARM Cryptographic
Extensions CPU cipher "xts(aes)", on an ARM Cortex A53 CPU,
and a 2 TB Samsung EVO 970 NVME SSD:

Random read, dm crypt sector size 4K: 966 MB/s
Random read, dm crypt sector size 64K: 1054 MB/s (9% improvement)

Random write, sector size 4K: 841 MB/s
Random write, sector size 64K: 983 MB/s (16% improvement)

Following are cryptsetup and fio parameters used for this test:
cryptsetup open --type plain --sector-size $SECTOR_SIZE \
    -c 'capi:xts(aes)-plain64'
    /dev/nvme0n1p1 dmcrypt0

(SECTOR_SIZE is set to 4096 or 65536).

fio --ioengine pvsync --filename=/dev/mapper/dmcrypt0 \
    --readwrite=$READWRITE --bs=65536 --blockalign=65536 \
    --iomem_align=65536 --time_based --group_reporting \
    --runtime=50 --direct=1 --num_jobs=32 --name=plain

(READWRITE is set to randread or randwrite)

Note that I suspect that with ciphers that use DMA we will be able to see
much larger improvement (similarly to what I have reported in my initial
post), however I don't have suitable hardware for testing that at the
moment.
It would be nice if someone who has such hardware could test that or
point me to publicly available hardware for buying for such an experiment.

Note that this benchmark randomly reads/writes 64K files to the SSD.
For smaller than 64KB files, the performance is indeed expected to drop
significantly, however:
1. Today users can format (unencrypted) ext4 filesystems to 64K sectors.
If they store small files on such filesystems, they probably already
suffer from performance problems.
2. My patch allows to use all sector sizes between 4K and 64K
(i.e. 4K, 8K, 16K, 32K and 64K). When we use 8K sectors, the performance
drop for small files isn't expected to be so high and still I can show in
that case huge performance improvement, at least when using my proprietary
cipher (I may be able to show improvement with other hardware when I
get my hands on such hardware).
3. Users might be able to partition their disk to several partitions, with
the 64K bytes sectors partitions reserved for larger files.

>
> NACK from me for this patch (but it is up to Mike to decide here).
>
> Milan

Thanks.
Please note that I'll be off-work from now until next Tuesday.

Itai

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE
  2021-11-28 15:33         ` Itai Handler
@ 2022-03-13 14:53           ` Itai Handler
  0 siblings, 0 replies; 8+ messages in thread
From: Itai Handler @ 2022-03-13 14:53 UTC (permalink / raw)
  To: Milan Broz, dm-devel; +Cc: Mikulas Patocka, agk, snitzer


Hi Milan,

On 28/11/2021 17:33, Itai Handler wrote:
> On 27/11/2021 13:19, Milan Broz wrote:
>> On 11/25/21 17:28, Itai Handler wrote:
>>> On 14/11/2021 13:56, Itai Handler wrote:
>>>> On 11/11/2021 15:07, Milan Broz wrote:
>>>>> On 10/11/2021 18:43, Itai Handler wrote:
>>>>>> Maximum sector size of dm-crypt is currently limited to 4096 bytes.
>>>>>>
>>>>>> On systems where PAGE_SIZE is larger than 4096 bytes, using larger
>>>>>> sectors can be beneficial for performance reasons.
>>>>> The limit to 4096 was set because this is the smallest possible
>>>>> page size that all platform supports.
>>>>>
>>>>> If you allow a higher size here, the device cannot be activated on a
>>>>> platform
>>>>> with the smaller page size. (Encrypted sector size becomes
>>>>> atomic sector size for all upper layers - as you mention below, not
>>>>> all fs support bigger sectors.)
>>>>>
>>>>> For LUKS, this is not acceptable - the format is portable by definition.
>>>>>
>>>>> For specific dm-crypt device, I am not sure. I would better kept
>>>>> the 4096 page size limit here.
>>>> I considered only plain dm-crypt since I am unfamiliar with LUKS.
>>>> Does LUKS assume that dm-crypt sector size is limited to 4K?
>>>> If so, maybe I'll be able to also patch LUKS regarding this issue.
>>>>
>>>>> It also depends on crypto API driver here (performance is usually
>>>>> optimized to 4k).
>>>>> What cipher and encryption mode did you use for test?
>>>> The cipher I used for the test is not publicly available but I can say
>>>> that it's performance
>>>> is not optimized to 4k blocks.
>>>> I believe that this results from the high overhead of setting up DMA
>>>> transfers. (my
>>>> cipher uses DMA to transfer data between memory and programmable logic).
>>>> There are many additional ciphers that use DMA in the tree, but I
>>>> cannot run any
>>>> benchmark with them at the moment.
>>>>
>>>> I have performed some additional benchmarks using the ARM
>>>> Cryptographic Extensions
>>>> CPU ciphers and saw that increasing block size beyond 4K does increase
>>>> performance,
>>>> albeit the performance improvement isn't as large as I've seen when
>>>> using my cipher.
>>>>
>>>> Following are "tcrypt mode=600 sec=5 num_mb=512" results for
>>>> xts-aes-ce decryption
>>>> (ARM CPU Cryptographic Extensions cipher):
>>>>    testing speed of multibuffer xts(aes) (xts-aes-ce) decryption
>>>>    ...
>>>>    trcypt: test 5 (256 bit key, 4096 byte blocks): 801792 operations in
>>>> 5 seconds (3284140032 bytes)
>>>>    ...
>>>>    trcypt: test 9 (256 bit key, 65536 byte blocks): 63488 operations in
>>>> 5 seconds (4160749568 bytes)
>>>>
>>>> That translates to:
>>>>    657 MB/s for 4K byte blocks.
>>>>    832 MB/s for 64K blocks.
>>>>
>>>> That means that there is about 27 percents improvement when
>>>> transitioning to 64K sectors,
>>>> for the cipher alone (only tcrypt benchmark).
>>>>
>>>> This benchmark had been performed on an ARM Cortex A53 CPU.
>>>> (Note that in all of my benchmarks PAGE_SIZE=64K).
>>>>
>>>>> How the number looks for random access? Linear test is usually
>>>>> misleading.
>>>>> I expect there will be big performance problem if you write small
>>>>> data chunks,
>>>>> writes and encryption will be amplified to full big sectors here...)
>>>> I understand your concern.
>>>> However my patch does not force anyone to use large sectors - it only
>>>> opens up this
>>>> possibility for those interested in that option.
>>>> This is similarly to the option to format an ext4 filesystem with 64K
>>>> sectors.
>>>> By the way: when you do that, you get a warning saying that the
>>>> filesystem
>>>> will not be usable on most systems.
>>>>
>>>> Sometime users need to store mostly large files on a filesystem, for
>>>> example for
>>>> backup or for video files.
>>>> I think that in these cases random access time is not so important.
>>>> Some users may also be able to reserve a dedicated partition for
>>>> storing such
>>>> large files.
>>>>
>>>>> (Technical detail: such pat MUST increase dm-crypt minor version.)
>>>> Thanks for pointing that out. I believe that I can prepare a v2 patch
>>>> that will
>>>> address that issue.
>>>>> Milan
>>>>>
>>>>>> This patch changes maximum sector size from 4096 bytes to PAGE_SIZE,
>>>>>> and in addition it changes the type of sector_size in
>>>>>> struct crypt_config from 'unsigned short int' to 'unsigned int', in
>>>>>> order to be able to represent larger values.
>>>>>>
>>>>>> On a prototype system which has PAGE_SIZE of 65536 bytes, I saw about
>>>>>> x2 performance improvement in sequential read throughput benchmark
>>>>>> while using only about half of the CPU usage, after simply increasing
>>>>>> sector size from 4096 to 65536 bytes.
>>>>>> I used ext4 filesystem for that benchmark, which supports 64KiB
>>>>>> sectors.
>>>>>>
>>>>>> Note: A small change should be made in cryptsetup in order to add
>>>>>> support for sectors larger than 4096 bytes.
>>>>>>
>>>>>> Signed-off-by: Itai Handler <itai.handler@gmail.com>
>>>>>> ---
>>>>>>    drivers/md/dm-crypt.c | 6 +++---
>>>>>>    1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
>>>>>> index 916b7da16de2..78c239443bd5 100644
>>>>>> --- a/drivers/md/dm-crypt.c
>>>>>> +++ b/drivers/md/dm-crypt.c
>>>>>> @@ -168,7 +168,7 @@ struct crypt_config {
>>>>>>           } iv_gen_private;
>>>>>>           u64 iv_offset;
>>>>>>           unsigned int iv_size;
>>>>>> -       unsigned short int sector_size;
>>>>>> +       unsigned int sector_size;
>>>>>>           unsigned char sector_shift;
>>>>>>
>>>>>>           union {
>>>>>> @@ -3115,9 +3115,9 @@ static int crypt_ctr_optional(struct dm_target
>>>>>> *ti, unsigned int argc, char **ar
>>>>>>                           cc->cipher_auth = kstrdup(sval, GFP_KERNEL);
>>>>>>                           if (!cc->cipher_auth)
>>>>>>                                   return -ENOMEM;
>>>>>> -               } else if (sscanf(opt_string, "sector_size:%hu%c",
>>>>>> &cc->sector_size, &dummy) == 1) {
>>>>>> +               } else if (sscanf(opt_string, "sector_size:%u%c",
>>>>>> &cc->sector_size, &dummy) == 1) {
>>>>>>                           if (cc->sector_size < (1 << SECTOR_SHIFT) ||
>>>>>> -                           cc->sector_size > 4096 ||
>>>>>> +                           cc->sector_size > PAGE_SIZE ||
>>>>>>                               (cc->sector_size & (cc->sector_size -
>>>>>> 1))) {
>>>>>>                                   ti->error = "Invalid feature value for
>>>>>> sector_size";
>>>>>>                                   return -EINVAL;
>>>>>>
>>>> I appreciate your valuable comments.
>>>>
>>>> Itai
>>>>
>>> Milan, can you comment on the above?
>> Hi,
>>
>> well, if you want my opinion, let's summarize it:
>>
>> - you tested it on a driver that is proprietary and is not in
>> the mainline kernel
> You are correct here. I also suspect that some drivers in the mainline may
> benefit as well, but I didn't have the opportunity to verify that yet.
>> - provided numbers are with tcrypt - I think this is
>> an internal crypto API test; it says nothing about performance
>> with your dm-crypt patch above it, right?
> You are correct. Please see additional benchmark results below.
>> - no numbers for random access/small file access here (note,
>> if you need to reencrypt 64k sector vs. 4k of mostly unused data,
>> it WILL have a huge performance impact).
>>
>> IMO, optimizing for big linear access is better on the fs layer
>> (and even years ago, I noticed some vendor's patches for setting
>> large fs sectors in video recording equipment).
> Interesting. Can you point me to some of those patches?
>> The same applies to encryption - I think FDE is not the best layer
>> here if you need to work with large sectors (encryption of not-used
>> sectors is just huge waste of resources - fs encrypts only used space,
>> I hope... :).
>>
>> I said that LUKS must remain multiplatform. Adding larger than
>> 4k sectors is not an option; I will undoubtedly reject that patch
>> in cryptsetup.
>> So this patch can be used only for plain dm-crypt mapping,
>> where you need to handle your key management (that could be ok, though).
> In the future I plan to submit a cryptsetup patch that will allow
> larger sectors only for the plain dm-crypt mapping. Such a change will
> not affectthe portability of LUKS.
>> Sorry, but without any performance numbers that prove this really
>> helps and does not create huge performance problems...
> Following are some benchmark results using the ARM Cryptographic
> Extensions CPU cipher "xts(aes)", on an ARM Cortex A53 CPU,
> and a 2 TB Samsung EVO 970 NVME SSD:
>
> Random read, dm crypt sector size 4K: 966 MB/s
> Random read, dm crypt sector size 64K: 1054 MB/s (9% improvement)
>
> Random write, sector size 4K: 841 MB/s
> Random write, sector size 64K: 983 MB/s (16% improvement)
>
> Following are cryptsetup and fio parameters used for this test:
> cryptsetup open --type plain --sector-size $SECTOR_SIZE \
>     -c 'capi:xts(aes)-plain64'
>     /dev/nvme0n1p1 dmcrypt0
>
> (SECTOR_SIZE is set to 4096 or 65536).
>
> fio --ioengine pvsync --filename=/dev/mapper/dmcrypt0 \
>     --readwrite=$READWRITE --bs=65536 --blockalign=65536 \
>     --iomem_align=65536 --time_based --group_reporting \
>     --runtime=50 --direct=1 --num_jobs=32 --name=plain
>
> (READWRITE is set to randread or randwrite)
>
> Note that I suspect that with ciphers that use DMA we will be able to see
> much larger improvement (similarly to what I have reported in my initial
> post), however I don't have suitable hardware for testing that at the
> moment.
> It would be nice if someone who has such hardware could test that or
> point me to publicly available hardware for buying for such an experiment.
>
> Note that this benchmark randomly reads/writes 64K files to the SSD.
> For smaller than 64KB files, the performance is indeed expected to drop
> significantly, however:
> 1. Today users can format (unencrypted) ext4 filesystems to 64K sectors.
> If they store small files on such filesystems, they probably already
> suffer from performance problems.
> 2. My patch allows to use all sector sizes between 4K and 64K
> (i.e. 4K, 8K, 16K, 32K and 64K). When we use 8K sectors, the performance
> drop for small files isn't expected to be so high and still I can show in
> that case huge performance improvement, at least when using my proprietary
> cipher (I may be able to show improvement with other hardware when I
> get my hands on such hardware).
> 3. Users might be able to partition their disk to several partitions, with
> the 64K bytes sectors partitions reserved for larger files.
>
>> NACK from me for this patch (but it is up to Mike to decide here).
>>
>> Milan
> Thanks.
> Please note that I'll be off-work from now until next Tuesday.
>
> Itai
>
I finally managed to get my hands on an Open-Q 865 SOM Development board
and have just performed a few benchmarks using the qce (qcrypt) crypto
accelerator, which is available also in the mainline kernel.

Following is a summary of these benchmarks:

+------------------+-----------+-----------+-----------+-------------+
| sector_size/test | seq_read  | rand_read | seq_write | rand_write  |
+------------------+-----------+-----------+-----------+-------------+
| 4 KiB  sectors   |  13.3     |  22.8     |  27.2     |  24         |
| 64 KiB sectors   |  581      |  582      |  586      |  588        |
+------------------+-----------+-----------+-----------+-------------+
                               (All numbers in MB/s).

These results show significant improvements:
  x43 sequential read.
  x25 random read.
  x21 sequential write.
  x24 random write.

So it appears to me that there is real benefit for enabling 64K sectors
in dm-crypt at least for the particular use case I showed (qce driver
on arm with 64K PAGE_SIZE, plain dm-crypt mode).

Notes:

1. These benchmarks have been performed using a ramdisk.
2. These benchmarks have been performed on an MSM kernel, which is based on
   linux kernel 4.19.81.
   Kernel sources are available at:
     https://source.codeaurora.org/quic/la/kernel/msm-4.19
   commit id: c815d7543db92377d85d214af0dbf1d11324726e

In all benchmarks, CONFIG_ARM64_64K_PAGES=y, the ramdisk size is 1 GB
(/dev/ram0).

For benchmarking with 4 KiB sectors, I created the dm-crypt device as
follows:

cryptsetup open --type plain --sector-size 4096 \
  -c 'capi:qcom-xts(aes)-plain64' /dev/ram0 dmcrypt0

For 64 KiB sectors, simply replace 4096 with 65536.

Following are the benchmark commands:
fio --filename=/dev/mapper/dmcrypt0 --readwrite=read --bs=1048576 \
  --time_based --group_reporting --runtime=10 --direct=1 --iodepth=4 \
  --ioengine=libaio --name=plain

fio --filename=/dev/mapper/dmcrypt0 --readwrite=randread --bs=1048576 \
  --time_based --group_reporting --runtime=10 --direct=1 --iodepth=4 \
  --ioengine=libaio --name=plain

fio --filename=/dev/mapper/dmcrypt0 --readwrite=write --bs=1048576 \
  --time_based --group_reporting --runtime=10 --direct=1 --iodepth=4 \
  --ioengine=libaio --name=plain

fio --filename=/dev/mapper/dmcrypt0 --readwrite=randwrite --bs=1048576 \
  --time_based --group_reporting --runtime=10 --direct=1 --iodepth=4 \
  --ioengine=libaio --name=plain

- Itai

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-03-13 14:54 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-10 17:43 [dm-devel] [RFC PATCH 1/1] dm crypt: change maximum sector size to PAGE_SIZE Itai Handler
2021-11-11 13:07 ` Milan Broz
2021-11-12 15:32   ` Mike Snitzer
2021-11-14 11:56   ` Itai Handler
2021-11-25 16:28     ` Itai Handler
2021-11-27 11:19       ` Milan Broz
2021-11-28 15:33         ` Itai Handler
2022-03-13 14:53           ` Itai Handler

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.