* [dm-crypt] Encrypting with larger packet size
@ 2013-01-22  4:18 Dinesh Garg
  2013-01-22  5:42 ` Arno Wagner
  0 siblings, 1 reply; 12+ messages in thread
From: Dinesh Garg @ 2013-01-22  4:18 UTC
  To: dm-crypt

Hi,

Currently dm-crypt encrypts the block device in sector-sized units. Is it
possible to change this to use larger sizes such as 8K, 16K, or more? If
that is not possible currently, what would it take to make it possible,
i.e. what are the problems that we would need to fix?

Is it the DM layer that prevents us from using larger sizes, or the file
system, due to its block size?

I have one more question: how does a read operation end up calling the
dm-crypt API?

Thanks,
DG


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-22  4:18 [dm-crypt] Encrypting with larger packet size Dinesh Garg
@ 2013-01-22  5:42 ` Arno Wagner
  2013-01-22  6:04   ` Dinesh Garg
  0 siblings, 1 reply; 12+ messages in thread
From: Arno Wagner @ 2013-01-22  5:42 UTC
  To: dm-crypt

Why would you want to do this? There does not seem to be any
good reason...

Arno

-- 
Arno Wagner,     Dr. sc. techn., Dipl. Inform.,    Email: arno@wagner.name
GnuPG: ID: CB5D9718  FP: 12D6 C03B 1B30 33BB 13CF  B774 E35C 5FA1 CB5D 9718
----
One of the painful things about our time is that those who feel certainty
are stupid, and those with any imagination and understanding are filled
with doubt and indecision. -- Bertrand Russell


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-22  5:42 ` Arno Wagner
@ 2013-01-22  6:04   ` Dinesh Garg
  2013-01-22  7:24     ` Milan Broz
  0 siblings, 1 reply; 12+ messages in thread
From: Dinesh Garg @ 2013-01-22  6:04 UTC
  To: dm-crypt

I want to use a hardware-based crypto engine, which provides better
encryption/decryption throughput when packet sizes are bigger. If we invoke
the hardware crypto engine for every 512 bytes, so much time is spent
setting up the cipher context that throughput becomes very low.

When I ran the IOzone benchmark on a device with and without encryption,
performance was up to 70% slower with encryption. This data is based on a
software crypto engine.

That is why I was thinking that if I could contribute a change that lets
dm-crypt accept larger packet sizes, it would be great for hardware-based
crypto engine solutions.

The HW-based crypto engine outperforms the SW-based one once the packet
size reaches 8K.
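
The per-request overhead argument above can be sketched as a simple cost
model. This is only an illustration: the function name and the values of
setup_us and engine_mb_s are hypothetical and are not measurements from any
engine discussed in this thread.

```python
def effective_throughput(block_bytes, setup_us, engine_mb_s):
    """Effective MB/s when every request pays a fixed setup cost.

    setup_us:    assumed fixed cost per request, in microseconds
    engine_mb_s: assumed raw engine speed, in MB/s (10^6 bytes per second)
    """
    # bytes / (MB/s) yields microseconds, because 1 MB/s is 1 byte per microsecond
    transfer_us = block_bytes / engine_mb_s
    # bytes per microsecond is numerically the same as MB/s
    return block_bytes / (setup_us + transfer_us)


# Hypothetical engine: 50 us setup per request, 100 MB/s raw speed.
for size in (512, 4096, 8192, 65536):
    print(size, round(effective_throughput(size, 50.0, 100.0), 1))
```

With a fixed setup cost, the effective rate approaches the raw engine speed
only as the request size grows, which is the effect described above.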


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-22  6:04   ` Dinesh Garg
@ 2013-01-22  7:24     ` Milan Broz
  2013-01-22  8:00       ` Dinesh Garg
  0 siblings, 1 reply; 12+ messages in thread
From: Milan Broz @ 2013-01-22  7:24 UTC
  To: Dinesh Garg; +Cc: dm-crypt

On 01/22/2013 07:04 AM, Dinesh Garg wrote:
> I want to use hardware based crypto engine which would provide better
> encryption/decryption throughput when packet sizes are bigger. If we
> invoke hardware based crypto engine for every 512 bytes, there is so
> much time spent in setting up cipher context in that throughput
> becomes very less.

Sorry, but fixing hw accelerator issues in sw is not a reason for me.

But anyway, if you search the archive (just one month ago), I replied
here mentioning the real problem with such a change:
http://www.saout.de/pipermail/dm-crypt/2012-December/002917.html

The real problem with allowing larger sector sizes is not the patch
(it is quite simple, in fact) but the incompatibility it causes
(DM operates at the 512-byte sector level, LUKS metadata is not ready
for this, etc.)

I still think that a proper crypto hw accelerator should work even
with these small blocks (using scatter-gather lists to offload multiple
buffers, operating in batch mode, or whatever). If re-setting the IV
for the crypto context eats so much time, something is wrong.
The kernel crypto API already provides an interface for async crypto drivers.

But my reply still applies: it is on the TODO list, but I would like
to see some real-world example where we really need this
(like some on-disk encryption format which requires such operation
and which we would like to support in Linux).

> Thats why I was thinking if I can contribute to dm-crypt where it can
> accept larger packet sizes, it would be great for hardware based
> crypto engine solution.
> 
> HW based crypto engine outperforms the SW based when packet size
> reaches 8K.

For your particular machine. Try something newer, with AES-NI for example.
I have a years-old board where VIA PadLock crypto hw acceleration
outperforms sw as well, with a 512-byte block size.

Milan


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-22  7:24     ` Milan Broz
@ 2013-01-22  8:00       ` Dinesh Garg
  2013-01-22 15:54         ` Milan Broz
  0 siblings, 1 reply; 12+ messages in thread
From: Dinesh Garg @ 2013-01-22  8:00 UTC
  To: Milan Broz; +Cc: dm-crypt

I think a HW accelerator has a problem with smaller packet sizes because:
1. It takes time to switch context from software to hardware (i.e.
interrupt handling).
2. HW accelerators do not allow multiple crypto operations at the same time,
for security reasons.

What do you mean by DM operating at the 512B level? Is it that DM always calls
dm-crypt with 512B of data even if the file system has a 4K block size?

I tried aggregating the scatter-gather list at the crypto driver layer, but it
did not work because DM supposedly operates at 512B. Sending multiple
encryption commands to the HW CE does not help because of the CE setup time etc.

dm-crypt is an excellent block-level encryption solution. If anyone can guide
me, I would like to contribute, as this is a problem faced by HW accelerators
(I read several mails in the archive).

As per my understanding, AES-NI is not supported on ARM-based chipsets.

Thanks,
Dinesh


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-22  8:00       ` Dinesh Garg
@ 2013-01-22 15:54         ` Milan Broz
  2013-01-24 19:44           ` Dinesh Garg
  0 siblings, 1 reply; 12+ messages in thread
From: Milan Broz @ 2013-01-22 15:54 UTC
  To: Dinesh Garg; +Cc: dm-crypt

On 01/22/2013 09:00 AM, Dinesh Garg wrote:
> I think HW accelerator is an issue with smaller packet size because:
> 1. It takes time to switch the context from software to hardware ( i.e. interrupt handling)
> 2. HW accelerators dont allow multiple crypto operations at the same time due to security reasons

I think all these problems are technically solvable, but that is
not a discussion for the dmcrypt list...

> What do you mean by DM operates at 512B level? Is it that DM always
> calls dmcrypt with 512B data even if file system is 4K block size?

Everything in the block layer is calculated in sectors of 512 bytes;
a bigger sector size just requires proper alignment.
(So if we support, say, an 8k block, we must _force_ all layers to use only
8k blocks. Otherwise you will get data corruption.)
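
The alignment requirement above can be sketched as a check on a request
expressed in 512-byte sectors. This is a minimal illustration, not kernel
code; the names SECTOR_SIZE and is_aligned are made up for this example.

```python
SECTOR_SIZE = 512  # the block layer's fixed accounting unit, in bytes


def is_aligned(start_sector, num_sectors, block_size):
    """Return True if a request covers only whole encryption blocks.

    start_sector and num_sectors are counted in 512-byte sectors;
    block_size is the encryption block size in bytes.
    """
    if block_size < SECTOR_SIZE or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two >= 512")
    sectors_per_block = block_size // SECTOR_SIZE
    return (start_sector % sectors_per_block == 0
            and num_sectors % sectors_per_block == 0)


print(is_aligned(16, 16, 8192))  # starts and ends on an 8k boundary
print(is_aligned(8, 16, 8192))   # starts in the middle of an 8k block
```

A request that fails such a check would split an encryption block, which is
why every layer above has to be forced to issue only whole blocks.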

> I tried aggregating scatter gather list at the crypto driver layer
> but it did not work because DM supposedly operates at 512B. Sending
> multiple encryptions commands to HW CE does not help due to CE setup
> time etc.

As I said above, you have to force the DM/block layer to accept only
properly sized requests. Otherwise I do not see a problem in the crypto layer.
(It should be possible to simulate this with the kernel cryptd() driver even
without proper crypto hw; it just needs some hacking.)

> dm-crypt is excellent block driver encryption solution. If anyone can
> guide me, I would like to contribute as this is problem faced by HW
> accelerator( I read several mails in the archive).

Can you please be more specific?

Which HW accelerator are you using, where is its source, and
when will it be part of the upstream kernel (if it is not already)?

But as I said, the patch for dmcrypt is not complicated; the problem
is the incompatibility it can cause later.
So I would really like to see a strong reason before we implement
direct support for larger sectors.

> As per my understanding, AES-NI is not supported on arm based
> chipset.

Sure, that was just an example. But I am sure that even on ARM you can
do hw acceleration which works with 512-byte sectors.

Even if we provide an option to use a bigger block size, it will not
be compatible with LUKS devices and other encryption schemes
(like TrueCrypt), which always use 512-byte blocks.

Milan


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-22 15:54         ` Milan Broz
@ 2013-01-24 19:44           ` Dinesh Garg
  2013-01-25 23:25             ` Arno Wagner
  2013-01-28 12:00             ` [dm-crypt] Encrypting with larger packet size (+some experimental patch) Milan Broz
  0 siblings, 2 replies; 12+ messages in thread
From: Dinesh Garg @ 2013-01-24 19:44 UTC
  To: Milan Broz; +Cc: dm-crypt

> Please can you be more specific?
I was offering my help in terms of code changes, testing, and upstreaming
the patch.

> Which HW accelerator you are using, where is the source of it and
> when it will be part of upstream kernel? (if not already).
I would rather not disclose the HW accelerator details. However, IMO,
anyone who changes GPL-licensed code has to open-source the changes.


* Re: [dm-crypt] Encrypting with larger packet size
  2013-01-24 19:44           ` Dinesh Garg
@ 2013-01-25 23:25             ` Arno Wagner
  2013-01-28 12:00             ` [dm-crypt] Encrypting with larger packet size (+some experimental patch) Milan Broz
  1 sibling, 0 replies; 12+ messages in thread
From: Arno Wagner @ 2013-01-25 23:25 UTC
  To: dm-crypt

On Thu, Jan 24, 2013 at 11:44:34AM -0800, Dinesh Garg wrote:
> >>Please can you be more specific?
> I was offering my help in term of code change and testing and upstreaming
> the patch.

You need to establish a clear and valuable use-case first. I still
do not see it.

> > Which HW accelerator you are using, where is the source of it and
> > when it will be part of upstream kernel? (if not already).
> I would not like to disclose the HW accelerator details. However, IMO,
> anyone who changes GPL license based code, has to opensource the changes.

Well, maybe you have noticed that FOSS projects tend not to
be very friendly to closed hardware, and even less so to
secret hardware. I guess you have to fight through this on
your own. If you actually have links to the developers of
said hardware, you may tell them that by having long key
change times, they messed up.

Arno


* Re: [dm-crypt] Encrypting with larger packet size (+some experimental patch)
  2013-01-24 19:44           ` Dinesh Garg
  2013-01-25 23:25             ` Arno Wagner
@ 2013-01-28 12:00             ` Milan Broz
  2013-03-05 11:53               ` Michael Stapelberg
  1 sibling, 1 reply; 12+ messages in thread
From: Milan Broz @ 2013-01-28 12:00 UTC
  To: Dinesh Garg; +Cc: dm-crypt

On 01/24/2013 08:44 PM, Dinesh Garg wrote:

> >>Please can you be more specific
> I was offering my help in term of code change and testing and upstreaming the patch.

I am more than willing to add any dmcrypt extension, but you can imagine what happens
if the reason is

"add larger sector size because some closed source crypto accelerator performs better
with large sector size and as a bonus make device incompatible for others" ...

I need some real reason before submitting such a patch.

That said... here is a patch which adds a larger block size option. It is not complete
(mainly the IV generators need changes, and I am afraid more changes are needed than
basic io hints), but it should work for basic performance testing
if you want to play with it (you have to use dmsetup).
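
To illustrate why the IV generators are affected, here is a small sketch of a
plain64-style IV (the 64-bit little-endian sector number, zero-padded to the
IV size) together with one possible convention for larger blocks: seeding
each block's IV with the 512-byte sector number at which the block starts.
That convention and the helper names are assumptions made for this example;
they do not describe what the experimental patch does.

```python
import struct


def plain64_iv(sector, iv_size=16):
    """plain64-style IV: 64-bit little-endian sector number, zero-padded."""
    return struct.pack("<Q", sector) + bytes(iv_size - 8)


def iv_sectors(start_sector, length_bytes, block_size):
    """512-byte sector numbers seeding the IV of each encryption block.

    Assumed convention: one IV per encryption block, taken from the
    512-byte sector number where that block begins.
    """
    step = block_size // 512
    blocks = length_bytes // block_size
    return [start_sector + i * step for i in range(blocks)]


# A 16 KiB request at sector 0 with 4096-byte encryption blocks:
for sector in iv_sectors(0, 16384, 4096):
    print(sector, plain64_iv(sector).hex())
```

With 512-byte blocks the same request would use 32 consecutive sector
numbers, so any generator that hardcodes the 512-byte step has to be told
which unit to count in.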

So show your crypto performance numbers now :-)

Milan

-
dm-crypt: optionally support encryption block (sector) size (EXPERIMENTAL)

Add "block_size" optional parameter which specifies encryption
sector size (atomic unit of block device encryption).

Parameter can be in range 512 - 65536 bytes and must be power of two.

NOTE: this device cannot be handled with cryptsetup directly
if this parameter is set.

FIXME: missing fixes for IV generators, which have the 512-byte sector size hardcoded.
IOW the IV is always calculated from the 512-byte sector offset (which is wrong, but
the patch can still be used for experiments).

Test script using dmsetup:

  DEV="/dev/sdb"
  DEV_SIZE=$(blockdev --getsz $DEV)
  KEY="9c1185a5c5e9fc54612808977ee8f548b2258d31ddadef707ba62c166051b9e3"
  BLOCK_SIZE=4096

  dmsetup create test_crypt --table "0 $DEV_SIZE crypt aes-xts-plain64 $KEY 0 $DEV 0 2 block_size $BLOCK_SIZE"
  #dmsetup table --showkeys test_crypt

Signed-off-by: Milan Broz <gmazyland@gmail.com>

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index f7369f9..0f4ecf2 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -143,6 +143,7 @@ struct crypt_config {
 	sector_t iv_offset;
 	unsigned int iv_size;
 
+	unsigned int block_size;
 	/*
 	 * Duplicated per cpu state. Access through
 	 * per_cpu_ptr() only.
@@ -693,20 +694,20 @@ static int crypt_convert_block(struct crypt_config *cc,
 	dmreq->iv_sector = ctx->cc_sector;
 	dmreq->ctx = ctx;
 	sg_init_table(&dmreq->sg_in, 1);
-	sg_set_page(&dmreq->sg_in, bv_in->bv_page, 1 << SECTOR_SHIFT,
+	sg_set_page(&dmreq->sg_in, bv_in->bv_page, cc->block_size,
 		    bv_in->bv_offset + ctx->offset_in);
 
 	sg_init_table(&dmreq->sg_out, 1);
-	sg_set_page(&dmreq->sg_out, bv_out->bv_page, 1 << SECTOR_SHIFT,
+	sg_set_page(&dmreq->sg_out, bv_out->bv_page, cc->block_size,
 		    bv_out->bv_offset + ctx->offset_out);
 
-	ctx->offset_in += 1 << SECTOR_SHIFT;
+	ctx->offset_in += cc->block_size;
 	if (ctx->offset_in >= bv_in->bv_len) {
 		ctx->offset_in = 0;
 		ctx->idx_in++;
 	}
 
-	ctx->offset_out += 1 << SECTOR_SHIFT;
+	ctx->offset_out += cc->block_size;
 	if (ctx->offset_out >= bv_out->bv_len) {
 		ctx->offset_out = 0;
 		ctx->idx_out++;
@@ -719,7 +720,7 @@ static int crypt_convert_block(struct crypt_config *cc,
 	}
 
 	ablkcipher_request_set_crypt(req, &dmreq->sg_in, &dmreq->sg_out,
-				     1 << SECTOR_SHIFT, iv);
+				     cc->block_size, iv);
 
 	if (bio_data_dir(ctx->bio_in) == WRITE)
 		r = crypto_ablkcipher_encrypt(req);
@@ -1563,7 +1564,8 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	char dummy;
 
 	static struct dm_arg _args[] = {
-		{0, 1, "Invalid number of feature args"},
+		{0, 3, "Invalid number of feature args"},
+		{512, 65536, "Invalid encryption sector size"},
 	};
 
 	if (argc < 5) {
@@ -1638,6 +1640,8 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	argv += 5;
 	argc -= 5;
 
+	cc->block_size = (1 << SECTOR_SHIFT);
+
 	/* Optional parameters */
 	if (argc) {
 		as.argc = argc;
@@ -1647,15 +1651,25 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		if (ret)
 			goto bad;
 
-		opt_string = dm_shift_arg(&as);
-
-		if (opt_params == 1 && opt_string &&
-		    !strcasecmp(opt_string, "allow_discards"))
-			ti->num_discard_requests = 1;
-		else if (opt_params) {
-			ret = -EINVAL;
-			ti->error = "Invalid feature arguments";
-			goto bad;
+		while (opt_params && (opt_string = dm_shift_arg(&as))) {
+			opt_params--;
+
+			if (!strcasecmp(opt_string, "allow_discards")) {
+				ti->num_discard_requests = 1;
+			} else if (opt_params && !strcasecmp(opt_string, "block_size")) {
+				opt_params--;
+				ret = dm_read_arg(_args + 1, &as, &cc->block_size, &ti->error);
+				/* value must be power of 2 */
+				if (ret || (1 << ilog2(cc->block_size) != cc->block_size)) {
+					ret = -EINVAL;
+					ti->error = "Invalid feature value for block_size";
+					goto bad;
+				}
+			} else {
+				ret = -EINVAL;
+				ti->error = "Invalid feature arguments";
+				goto bad;
+			}
 		}
 	}
 
@@ -1721,7 +1735,8 @@ static int crypt_status(struct dm_target *ti, status_type_t type,
 			unsigned status_flags, char *result, unsigned maxlen)
 {
 	struct crypt_config *cc = ti->private;
-	unsigned int sz = 0;
+	unsigned int sz = 0, opt_num = 0;
+	bool opt_discards = false, opt_sector = false;
 
 	switch (type) {
 	case STATUSTYPE_INFO:
@@ -1746,8 +1761,24 @@ static int crypt_status(struct dm_target *ti, status_type_t type,
 		DMEMIT(" %llu %s %llu", (unsigned long long)cc->iv_offset,
 				cc->dev->name, (unsigned long long)cc->start);
 
-		if (ti->num_discard_requests)
-			DMEMIT(" 1 allow_discards");
+		if (ti->num_discard_requests) {
+			opt_discards = true;
+			opt_num++;
+		}
+
+		if (cc->block_size != (1 << SECTOR_SHIFT)) {
+			opt_sector = true;
+			opt_num += 2;
+		}
+
+		if (opt_num)
+			DMEMIT(" %d", opt_num);
+
+		if (opt_discards)
+			DMEMIT(" allow_discards");
+
+		if (opt_sector)
+			DMEMIT(" block_size %d", cc->block_size);
 
 		break;
 	}
@@ -1843,9 +1874,21 @@ static int crypt_iterate_devices(struct dm_target *ti,
 	return fn(ti, cc->dev, cc->start, ti->len, data);
 }
 
+static void crypt_io_hints(struct dm_target *ti,
+			    struct queue_limits *limits)
+{
+	struct crypt_config *cc = ti->private;
+
+	if (cc->block_size != (1 << SECTOR_SHIFT)) {
+		limits->logical_block_size = cc->block_size;
+		limits->physical_block_size = cc->block_size;
+		blk_limits_io_min(limits, cc->block_size);
+	}
+}
+
 static struct target_type crypt_target = {
 	.name   = "crypt",
-	.version = {1, 12, 0},
+	.version = {1, 13, 0},
 	.module = THIS_MODULE,
 	.ctr    = crypt_ctr,
 	.dtr    = crypt_dtr,
@@ -1857,6 +1900,7 @@ static struct target_type crypt_target = {
 	.message = crypt_message,
 	.merge  = crypt_merge,
 	.iterate_devices = crypt_iterate_devices,
+	.io_hints = crypt_io_hints,
 };
 
 static int __init dm_crypt_init(void)


* Re: [dm-crypt] Encrypting with larger packet size (+some experimental patch)
  2013-01-28 12:00             ` [dm-crypt] Encrypting with larger packet size (+some experimental patch) Milan Broz
@ 2013-03-05 11:53               ` Michael Stapelberg
  2013-03-10 15:05                 ` Milan Broz
  0 siblings, 1 reply; 12+ messages in thread
From: Michael Stapelberg @ 2013-03-05 11:53 UTC
  To: Milan Broz, dm-crypt

Hi Milan,

On Mon, 28 Jan 2013 13:00:07 +0100
gmazyland at gmail.com (Milan Broz) wrote:
> I need some real reason before submitting such patch.
I am running a QNAP TS-119P II NAS with Debian GNU/Linux. The device
has a CESA crypto engine, which is supported by the in-mainline-kernel
mv_cesa module. Unfortunately, this crypto engine exhibits the same
symptoms as the one the author of this thread was describing:

With small block sizes, the crypto engine is really slow.

I have done a series of tests with a WD WD3000FYYZ SATA hard disk drive
that achieves a raw read/write rate of about 177 MB/s:

dd if=/dev/sda2 of=/dev/null bs=1M count=4096 iflag=direct
4294967296 bytes (4.3 GB) copied, 24.2781 s, 177 MB/s

dd if=/dev/zero of=/dev/sda2 bs=1M count=768 oflag=direct
805306368 bytes (805 MB) copied, 4.5337 s, 178 MB/s

Using mv_cesa with aes-cbc-essiv:sha256 (one of the accelerated
ciphers, as cryptsetup benchmark confirms), I get a rate of about 12.9
MB/s:

dd if=/dev/zero of=/dev/sda2_crypt bs=1M count=4096 oflag=direct
4294967296 bytes (4.3 GB) copied, 332.425 s, 12.9 MB/s

There is a patch by Phil Sutter which makes mv_cesa use the TDMA engine
of that SoC, and it improves performance quite a bit:

dd if=/dev/zero of=/dev/mapper/sda2_crypt bs=1M count=1024 oflag=direct
1073741824 bytes (1.1 GB) copied, 50.5507 s, 21.2 MB/s

This is just to let you know where I am coming from :).

> That said... here is the patch which can use larger block size
> option. It is not complete (mainly IV generators need changes and I
> am afraid there should be morech changes than basic io hints) but it
> should work for basic performance testing if you want to play with it
> (you have to use dmsetup).
> 
> So show your crypto performance numbers now :-)
Thanks for posting this patch. I have set up my device as follows:

echo "0 5858578432 crypt aes-cbc-essiv:sha256 \
0123456789abcdef0123456789abcdef 0 /dev/sda2 0 2 block_size 4096" |
dmsetup create sda2_crypt
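
A table line like the one above can be assembled programmatically. The sketch
below follows the field order of the command above and the range stated in
the patch description (512 to 65536 bytes, power of two); the helper name
crypt_table is hypothetical, and the "block_size" option exists only with the
experimental patch from this thread.

```python
def crypt_table(num_sectors, cipher, key_hex, device, block_size=512):
    """Build a one-line dm-crypt table for `dmsetup create`.

    num_sectors is the device size in 512-byte sectors. The optional
    "block_size" feature is only understood by the experimental patch.
    """
    if not 512 <= block_size <= 65536 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two in 512..65536")
    # start, length, target, cipher, key, iv_offset, device, device offset
    fields = [0, num_sectors, "crypt", cipher, key_hex, 0, device, 0]
    if block_size != 512:
        # two feature arguments follow: the option name and its value
        fields += [2, "block_size", block_size]
    return " ".join(str(field) for field in fields)


print(crypt_table(5858578432, "aes-cbc-essiv:sha256",
                  "0123456789abcdef0123456789abcdef", "/dev/sda2", 4096))
```

Keeping the table on a single line matters when it is piped to dmsetup, which
reads one table entry per input line.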

The encryption rate improves significantly:

dd if=/dev/zero of=/dev/mapper/sda2_crypt bs=1M count=1024 oflag=direct
1073741824 bytes (1.1 GB) copied, 26.4864 s, 40.5 MB/s

I hope you realize that this is a real-world use case for me: I cannot
switch to a full-blown Intel system with AES-NI for space and power
efficiency reasons. Nevertheless, I need to encrypt my hard disk(s).

I would appreciate it very much if you could implement the FIXME and
commit that patch to dmcrypt.

BTW, when using a block_size of more than 4096 (e.g. 8192),
I get a kernel oops:

[   36.601704] device-mapper: ioctl: 4.23.1-ioctl (2012-12-18) initialised: dm-devel@redhat.com
[   36.658525] bio: create slab <bio-1> at 1
[   36.672332] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[   36.680496] pgd = decf8000
[   36.683212] [00000000] *pgd=1eeed831, *pte=00000000, *ppte=00000000
[   36.689537] Internal error: Oops: 17 [#1] ARM
[   36.693911] Modules linked in: sha256_generic dm_crypt dm_mod nfsd auth_rpcgss nfs_acl nfs lockd dns_resolver fscache sunrpc hmac sha1_generic ehci_hcd mv_cesa mv643xx_eth usbcore inet_lro libphy usb_common mv_dma evdev loop gpio_keys autofs4 ext4 jbd2 mbcache sd_mod crc_t10dif sata_mv libata scsi_mod
[   36.720988] CPU: 0    Not tainted  (3.8-trunk-kirkwood #1 Debian 3.8-1~experimental.2)
[   36.728940] PC is at create_empty_buffers+0x24/0x11c
[   36.733921] LR is at create_empty_buffers+0x14/0x11c
[   36.738899] pc : [<c00f44ec>]    lr : [<c00f44dc>]    psr: 80000013
[   36.738899] sp : de8e5d10  ip : c04eb02c  fp : c093d280
[   36.750416] r10: 00100100  r9 : df6370b8  r8 : c093d280
[   36.755658] r7 : dea39c80  r6 : 00000000  r5 : 00000000  r4 : c093d280
[   36.762206] r3 : 00000000  r2 : 00000001  r1 : 00002000  r0 : 00000000
[   36.768756] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[   36.775914] Control: 0005397f  Table: 1ecf8000  DAC: 00000015
[   36.781678] Process blkid (pid: 931, stack limit = 0xde8e41b8)
[   36.787530] Stack: (0xde8e5d10 to 0xde8e6000)
[   36.796608] 5d00:                                     c093d280 df63717c 00000000 c00f4ad8
[   36.806338] 5d20: 00000001 c00f4b14 c0502024 00000040 00000000 c055af38 00000010 000213d0
[   36.816046] 5d40: 00000000 c0501a24 df637180 c00f9850 c093d280 df63717c 00000000 2ba657f0
[   36.825732] 5d60: 000000d0 00200200 00100100 c0092b4c 60000013 00000001 df63717c 00000000
[   36.833978] 5d80: dea39c80 c093d280 00200200 00100100 de8e5da0 c009b434 2ba657ff 2ba657f0
[   36.844821] 5da0: de8e5da0 de8e5da0 91827364 de8e5dac de8e5dac de8e5db4 de8e5db4 00000000
[   36.853045] 5dc0: c057bee0 00000000 00000001 2ba657f0 00000001 df63717c 000001ff dea39c80
[   36.863832] 5de0: 00000000 c009b6c4 00000000 dea39c80 2ba657f0 00000000 00000001 00000000
[   36.873483] 5e00: 00000000 2ba657f0 df63717c dea39c80 ffffffff c0093d08 00000001 00000000
[   36.883145] 5e20: dea39c80 de8e5ec0 00000101 de8e5f00 657f0000 000002ba 00000fff b6f03000
[   36.892789] 5e40: 000005b7 de8e5eb8 def41520 def6a1d8 decfadb8 00000028 00000000 de8e5ef8
[   36.902532] 5e60: df6370b8 dea39cc0 2ba657f1 00000001 b6f031f0 00000000 00000000 00000040
[   36.912233] 5e80: 0096ca78 00000000 c00f99e4 de8e5ec0 fffffdee de8e5f80 00000040 dea39c80
[   36.921889] 5ea0: de8e4000 de8e5ec0 00000000 c00cd8b0 657f0000 000002ba 0096ca78 00000040
[   36.931551] 5ec0: 00000000 00000000 00000000 00000001 ffffffff dea39c80 00000000 00000000
[   36.941233] 5ee0: 00000000 00000000 dee603c0 00000000 00000000 00000000 657f0000 000002ba
[   36.950920] 5f00: 00000000 00000000 00000040 00000000 00000040 00000000 00000000 00000000
[   36.960574] 5f20: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   36.970268] 5f40: 0096ca78 dea39c80 0096ca78 de8e5f80 00000000 00000040 00000040 c00cdfb4
[   36.979934] 5f60: dea39c80 0096ca78 657f0000 000002ba dea39c80 00000000 0096ca78 c00ce0c4
[   36.989635] 5f80: 657f0000 000002ba 00000040 0096ca70 00000040 0096c018 00000003 c000e508
[   36.999297] 5fa0: 00000000 c000e380 0096ca70 00000040 00000003 0096ca78 00000040 00008000
[   37.008957] 5fc0: 0096ca70 00000040 0096c018 00000003 657f0000 b6fc7000 0096ca58 00000000
[   37.018632] 5fe0: 0096c064 bee91370 b6fac760 b6f0320c 60000010 00000003 00000000 00000000
[   37.026882] [<c00f44ec>] (create_empty_buffers+0x24/0x11c) from [<c00f4ad8>] (create_page_buffers+0x34/0x4c)
[   37.036764] [<c00f4ad8>] (create_page_buffers+0x34/0x4c) from [<c00f4b14>] (block_read_full_page+0x24/0x358)
[   37.046646] [<c00f4b14>] (block_read_full_page+0x24/0x358) from [<c009b434>] (__do_page_cache_readahead+0x194/0x1ec)
[   37.057230] [<c009b434>] (__do_page_cache_readahead+0x194/0x1ec) from [<c009b6c4>] (force_page_cache_readahead+0x6c/0xa4)
[   37.068240] [<c009b6c4>] (force_page_cache_readahead+0x6c/0xa4) from [<c0093d08>] (generic_file_aio_read+0x290/0x6dc)
[   37.078906] [<c0093d08>] (generic_file_aio_read+0x290/0x6dc) from [<c00cd8b0>] (do_sync_read+0x90/0xcc)
[   37.088348] [<c00cd8b0>] (do_sync_read+0x90/0xcc) from [<c00cdfb4>] (vfs_read+0xa4/0x17c)
[   37.096569] [<c00cdfb4>] (vfs_read+0xa4/0x17c) from [<c00ce0c4>] (sys_read+0x38/0x64)
[   37.104447] [<c00ce0c4>] (sys_read+0x38/0x64) from [<c000e380>] (ret_fast_syscall+0x0/0x2c)
[   37.112924] Code: e1a05000 e1a03000 ea000000 e1a03001 (e5932000) 
[   37.119055] ---[ end trace c844ca55a4ae58f6 ]---


-- 
Best regards,
Michael


* Re: [dm-crypt] Encrypting with larger packet size (+some experimental patch)
  2013-03-05 11:53               ` Michael Stapelberg
@ 2013-03-10 15:05                 ` Milan Broz
  2013-03-25 19:22                   ` Michael Stapelberg
  0 siblings, 1 reply; 12+ messages in thread
From: Milan Broz @ 2013-03-10 15:05 UTC
  To: Michael Stapelberg; +Cc: dm-crypt

On 5.3.2013 12:53, Michael Stapelberg wrote:

> dd if=/dev/zero of=/dev/mapper/sda2_crypt bs=1M count=1024 oflag=direct
> 1073741824 bytes (1.1 GB) copied, 50.5507 s, 21.2 MB/s
>
> This is just to let you know where I am coming from :).
...

> dd if=/dev/zero of=/dev/mapper/sda2_crypt bs=1M count=1024 oflag=direct
> 1073741824 bytes (1.1 GB) copied, 26.4864 s, 40.5 MB/s

So a 2x performance increase? I would expect even more...
(Maybe compare it with the new cryptsetup benchmark; in fact it uses a much
larger block and it should measure almost the real throughput of the crypto engine.)
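
The ratio can be checked directly from the byte counts and times of the
quoted dd runs. The labels in this sketch are descriptive only; the thread
does not say which other settings differed between the runs.

```python
# (bytes copied, seconds) as reported by dd in the quoted message
runs = {
    "mv_cesa, default sector size": (4294967296, 332.425),
    "with the TDMA patch": (1073741824, 50.5507),
    "with block_size 4096": (1073741824, 26.4864),
}


def mb_per_s(nbytes, seconds):
    """Throughput in MB/s, using dd's decimal megabytes (10^6 bytes)."""
    return nbytes / seconds / 1e6


rates = {name: mb_per_s(*run) for name, run in runs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} MB/s")

speedup = rates["with block_size 4096"] / rates["with the TDMA patch"]
print(f"speedup from block_size 4096: {speedup:.2f}x")
```

Note that dd reports decimal MB/s, so these figures are slightly larger than
the same rates expressed in MiB/s.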

> I would appreciate it very much if you could implement the FIXME and
> commit that patch to dmcrypt.

As I said, it is on my TODO list; kernel support is only one part of the problem.
Anyway, I added http://code.google.com/p/cryptsetup/issues/detail?id=150 to track it.

> BTW, when using a block_size of more than 4096 (e.g. 8192),
> I get a kernel oops:
>
> [   36.601704] device-mapper: ioctl: 4.23.1-ioctl (2012-12-18) initialised: dm-devel@redhat.com
> [   36.658525] bio: create slab <bio-1> at 1
> [   36.672332] Unable to handle kernel NULL pointer dereference at virtual address 00000000

> [   37.026882] [<c00f44ec>] (create_empty_buffers+0x24/0x11c) from [<c00f4ad8>] (create_page_buffers+0x34/0x4c)
> [   37.036764] [<c00f4ad8>] (create_page_buffers+0x34/0x4c) from [<c00f4b14>] (block_read_full_page+0x24/0x358)
> [   37.046646] [<c00f4b14>] (block_read_full_page+0x24/0x358) from [<c009b434>] (__do_page_cache_readahead+0x194/0x1ec)
> [   37.057230] [<c009b434>] (__do_page_cache_readahead+0x194/0x1ec) from [<c009b6c4>] (force_page_cache_readahead+0x6c/0xa4)
> [   37.068240] [<c009b6c4>] (force_page_cache_readahead+0x6c/0xa4) from [<c0093d08>] (generic_file_aio_read+0x290/0x6dc)
> [   37.078906] [<c0093d08>] (generic_file_aio_read+0x290/0x6dc) from [<c00cd8b0>] (do_sync_read+0x90/0xcc)
> [   37.088348] [<c00cd8b0>] (do_sync_read+0x90/0xcc) from [<c00cdfb4>] (vfs_read+0xa4/0x17c)
> [   37.096569] [<c00cdfb4>] (vfs_read+0xa4/0x17c) from [<c00ce0c4>] (sys_read+0x38/0x64)
> [   37.104447] [<c00ce0c4>] (sys_read+0x38/0x64) from [<c000e380>] (ret_fast_syscall+0x0/0x2c)
> [   37.112924] Code: e1a05000 e1a03000 ea000000 e1a03001 (e5932000)
> [   37.119055] ---[ end trace c844ca55a4ae58f6 ]---

This looks like a crash in a different layer; IMHO it should happen even with other devices (try dm-linear).
If it is reproducible, perhaps report it to LKML.

Thanks,
Milan


* Re: [dm-crypt] Encrypting with larger packet size (+some experimental patch)
  2013-03-10 15:05                 ` Milan Broz
@ 2013-03-25 19:22                   ` Michael Stapelberg
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Stapelberg @ 2013-03-25 19:22 UTC (permalink / raw)
  To: Milan Broz; +Cc: dm-crypt

Hi Milan,

Milan Broz <gmazyland@gmail.com> writes:
> So a 2x performance increase? I would expect even more...
> (Maybe compare it with the new cryptsetup benchmark; it uses a much
> larger block, so it should measure almost the real throughput of the
> crypto engine.)
In the cryptsetup benchmark I see much higher numbers, but I never get
close to them in “real” tests:

# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1        44887 iterations per second
PBKDF2-sha256      34492 iterations per second
PBKDF2-sha512       8904 iterations per second
PBKDF2-ripemd160   43115 iterations per second
PBKDF2-whirlpool    6884 iterations per second
#  Algorithm | Key |  Encryption |  Decryption
     aes-cbc   128b   254.0 MiB/s   264.0 MiB/s
 serpent-cbc   128b    17.5 MiB/s    17.8 MiB/s
 twofish-cbc   128b    20.4 MiB/s    20.6 MiB/s
     aes-cbc   256b   245.0 MiB/s   242.0 MiB/s
 serpent-cbc   256b    17.5 MiB/s    18.0 MiB/s
 twofish-cbc   256b    20.4 MiB/s    20.8 MiB/s
     aes-xts   256b    21.6 MiB/s    21.8 MiB/s
 serpent-xts   256b    18.0 MiB/s    18.0 MiB/s
 twofish-xts   256b    21.0 MiB/s    20.6 MiB/s
     aes-xts   512b    17.0 MiB/s    17.1 MiB/s
 serpent-xts   512b    18.0 MiB/s    18.0 MiB/s
 twofish-xts   512b    21.0 MiB/s    20.6 MiB/s
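
When comparing this table with the dd results, note that the benchmark prints MiB/s (2^20 bytes) while dd prints decimal MB/s (805306368 bytes is shown as "805 MB"). A small conversion sketch, with a helper name chosen only for illustration:

```python
MIB = 1024 * 1024  # bytes per MiB

def mib_s_to_mb_s(mib_per_s):
    """Convert MiB/s (binary) to MB/s (decimal, as dd prints it)."""
    return mib_per_s * MIB / 1e6

# Two encryption rows from the benchmark table above.
aes_cbc_128 = mib_s_to_mb_s(254.0)
aes_xts_256 = mib_s_to_mb_s(21.6)

print(f"aes-cbc 128b: {aes_cbc_128:.1f} MB/s")
print(f"aes-xts 256b: {aes_xts_256:.1f} MB/s")
```

The difference between the units is under 5%, so it does not explain the gap between the benchmark and the dd numbers.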


> This looks like a crash in a different layer; IMHO it should happen even
> with other devices (try dm-linear).  If it is reproducible, perhaps
> report it to LKML.

How would I be able to reproduce this with dm-linear? As far as I can
tell, it doesn’t have a block size parameter.

Apparently, using a block size bigger than PAGESIZE is currently not
possible with Linux. Ted Ts'o writes that he is surprised to see the
patchset that allows it being used in the wild:
https://lkml.org/lkml/2012/3/21/493
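
The limit can be expressed as a simple pre-flight check. This is only a sketch: the upper bound (page size) is what this thread reports, while the power-of-two and 512-byte lower bound are my own assumptions, and `block_size_fits_page` is a hypothetical helper:

```python
import mmap

def block_size_fits_page(block_size, page_size=mmap.PAGESIZE):
    """Return True if block_size looks usable given the page-size limit.

    Assumptions (not confirmed by the patch): block_size must be a power
    of two and at least one 512-byte sector. The page-size upper bound is
    the constraint discussed in this thread.
    """
    is_power_of_two = block_size > 0 and (block_size & (block_size - 1)) == 0
    return is_power_of_two and 512 <= block_size <= page_size

for size in (512, 4096, 8192):
    print(size, block_size_fits_page(size))
```

On a machine with 4096-byte pages this accepts 4096 and rejects 8192, the value that triggered the oops.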

I have compared dm-linear with the dm-crypt null cipher:

echo "0 5856626815 linear /dev/sda2 0" | dmsetup create identity
dd if=/dev/zero of=/dev/mapper/identity bs=1M count=768 oflag=direct conv=fsync
768+0 records in
768+0 records out
805306368 bytes (805 MB) copied, 4.69666 s, 171 MB/s

cryptsetup luksFormat -c null /dev/sda2
cryptsetup luksOpen /dev/sda2 sda2_crypt
dd if=/dev/zero of=/dev/mapper/sda2_crypt bs=1M count=768 oflag=direct conv=fsync
768+0 records in
768+0 records out
805306368 bytes (805 MB) copied, 9.39298 s, 85.7 MB/s

This is quite a difference and I wonder what might cause it.
Here is the same test with 4096-byte blocks:

echo "0 5856624640 crypt cipher_null-ecb - 0 /dev/sda2 0 2 block_size 4096" | dmsetup create sda2_crypt
dd if=/dev/zero of=/dev/mapper/sda2_crypt bs=1M count=768 oflag=direct conv=fsync
768+0 records in
768+0 records out
805306368 bytes (805 MB) copied, 6.25626 s, 129 MB/s
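
For readers unfamiliar with the table line above: it follows the usual device-mapper layout of start sector, length in sectors, target name, then target arguments. The sketch below splits it up, assuming the crypt argument order cipher, key, iv_offset, device, offset, followed by an optional parameter count and the parameters. The `block_size 4096` pair is the experimental patch's parameter as used in this thread, and `parse_crypt_table` is a hypothetical helper:

```python
SECTOR_SIZE = 512  # device-mapper tables count in 512-byte sectors

def parse_crypt_table(line):
    """Split a dm-crypt table line into named fields."""
    fields = line.split()
    start, length, target = int(fields[0]), int(fields[1]), fields[2]
    if target != "crypt":
        raise ValueError(f"not a crypt target: {target}")
    cipher, key, iv_offset, device, offset = fields[3:8]
    params = []
    if len(fields) > 8:
        count = int(fields[8])
        params = fields[9:9 + count]
        if len(params) != count:
            raise ValueError("optional parameter count does not match")
    return {
        "start": start, "length": length, "cipher": cipher,
        "key": key,  # "-" stands for an empty key
        "iv_offset": int(iv_offset), "device": device,
        "offset": int(offset), "params": params,
    }

table = parse_crypt_table(
    "0 5856624640 crypt cipher_null-ecb - 0 /dev/sda2 0 2 block_size 4096")

# Sanity check: does the mapping length divide into whole 4096-byte blocks?
block_size = int(table["params"][table["params"].index("block_size") + 1])
sectors_per_block = block_size // SECTOR_SIZE
whole_blocks = table["length"] % sectors_per_block == 0
print(table["cipher"], sectors_per_block, whole_blocks)
```

The length used here divides evenly into 8-sector blocks, whereas the 5856626815 sectors of the dm-linear table would not.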

So, any ideas on how to further debug where the problem is, or how we
can get to approximately 170 MB/s? :-)
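
To put the three runs side by side, here is a rough sketch that recomputes the rates from the dd byte counts and times above and expresses each as a fraction of the dm-linear result. The labels are mine, and I am assuming the luksOpen mapping uses the default 512-byte sectors:

```python
NUM_BYTES = 805306368  # bs=1M count=768, as reported by dd

# Elapsed seconds from the three dd runs above.
runs = {
    "dm-linear": 4.69666,
    "dm-crypt null, 512-byte sectors": 9.39298,
    "dm-crypt null, block_size 4096": 6.25626,
}

# Decimal MB/s, the unit dd prints.
rates = {name: NUM_BYTES / secs / 1e6 for name, secs in runs.items()}
baseline = rates["dm-linear"]
relative = {name: rate / baseline for name, rate in rates.items()}

for name in runs:
    print(f"{name}: {rates[name]:.1f} MB/s ({relative[name]:.0%} of dm-linear)")
```

The null cipher with 512-byte sectors reaches about half of the dm-linear rate, and with 4096-byte blocks about three quarters.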

-- 
Best regards,
Michael

Thread overview: 12+ messages
2013-01-22  4:18 [dm-crypt] Encrypting with larger packet size Dinesh Garg
2013-01-22  5:42 ` Arno Wagner
2013-01-22  6:04   ` Dinesh Garg
2013-01-22  7:24     ` Milan Broz
2013-01-22  8:00       ` Dinesh Garg
2013-01-22 15:54         ` Milan Broz
2013-01-24 19:44           ` Dinesh Garg
2013-01-25 23:25             ` Arno Wagner
2013-01-28 12:00             ` [dm-crypt] Encrypting with larger packet size (+some experimental patch) Milan Broz
2013-03-05 11:53               ` Michael Stapelberg
2013-03-10 15:05                 ` Milan Broz
2013-03-25 19:22                   ` Michael Stapelberg
