* PadLock processing multiple blocks at a time
       [not found]   ` <20041130222442.7b0f4f67.davem@davemloft.net>
@ 2005-01-11 17:03     ` Michal Ludvig
  2005-01-11 17:08       ` [PATCH 1/2] " Michal Ludvig
  2005-01-11 17:08       ` [PATCH 2/2] PadLock processing multiple blocks " Michal Ludvig
From: Michal Ludvig @ 2005-01-11 17:03 UTC (permalink / raw)
  To: David S. Miller; +Cc: jmorris, cryptoapi, linux-kernel

Hi all,

I have some improvements for the VIA PadLock crypto driver.

1. A generic extension to crypto/cipher.c that allows offloading the 
   encryption of a whole buffer in a given mode (CBC, ...) to the 
   algorithm provider (e.g. PadLock). Basically it extends 'struct 
   cipher_alg' by some new fields:

@@ -69,6 +73,18 @@ struct cipher_alg {
                          unsigned int keylen, u32 *flags);
        void (*cia_encrypt)(void *ctx, u8 *dst, const u8 *src);
        void (*cia_decrypt)(void *ctx, u8 *dst, const u8 *src);
+       size_t cia_max_nbytes;
+       size_t cia_req_align;
+       void (*cia_ecb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+                       size_t nbytes, int encdec, int inplace);
+       void (*cia_cbc)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+                       size_t nbytes, int encdec, int inplace);
+       void (*cia_cfb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+                       size_t nbytes, int encdec, int inplace);
+       void (*cia_ofb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+                       size_t nbytes, int encdec, int inplace);
+       void (*cia_ctr)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+                       size_t nbytes, int encdec, int inplace);
 };

  If cia_<mode> is non-NULL, that function is used instead of the 
  software <mode>_process() chaining function (e.g. cbc_process()). 
  In the case of PadLock this can significantly speed up the 
  {en,de}cryption.
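
  To illustrate the dispatch, a minimal sketch only (the real code is 
  in patch 1/2 below; chunk_nbytes stands for the chunk size produced 
  by the scatterwalk, called bsize in the patch):

	cryptblkfn_t *multiblock = CRA_CIPHER(tfm).cia_cbc;

	if (multiblock)
		/* Hand the whole chunk to the provider at once. */
		(*multiblock)(crypto_tfm_ctx(tfm), dst_p, src_p, iv,
			      chunk_nbytes, enc, in_place);
	else
		/* Fall back to per-block software chaining. */
		cbc_process(tfm, dst_p, src_p, cryptofn, enc, iv, in_place);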

2. On top of this I have an extension of the padlock module to support 
   this scheme.
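
   The provider side then just fills in the new fields before 
   registering its algorithm; condensed from the padlock patch:

	/* Condensed from padlock_init_aes() in patch 2/2. */
	aes_alg.cra_u.cipher.cia_max_nbytes = (size_t)-1; /* no length limit */
	aes_alg.cra_u.cipher.cia_req_align  = 16;  /* 16-byte aligned bufs */
	aes_alg.cra_u.cipher.cia_ecb        = aes_padlock_ecb;
	aes_alg.cra_u.cipher.cia_cbc        = aes_padlock_cbc;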

I will send both patches in separate follow-ups.

The speedup gained by this change is quite significant (measured with 
bonnie on ext2 over dm-crypt with aes128):

Test			No encryption	2.6.10-bk1	multiblock
Writing with putc()	10454 (100%)	7479  (72%)	9353  (89%)
Rewriting		16510 (100%)	7628  (46%)	10611 (64%)
Writing intelligently	61128 (100%)	21132 (35%)	48103 (79%)
Reading with getc()	9406  (100%)	6916  (74%)	8801  (94%)
Reading intelligently	35885 (100%)	15271 (43%)	23202 (65%)

Numbers are in kB/s; percentages show throughput relative to the 
plaintext run. As can be seen, the multiblock encryption is 
significantly faster than the already committed single-block-at-a-time 
processing.

More statistics (e.g. a comparison with aes.ko and aes-i586.ko) are 
available at http://www.logix.cz/michal/devel/padlock/bench.xp

Dave, if you're OK with these changes, please merge them.

Michal Ludvig
-- 
* A mouse is a device used to point at the xterm you want to type in.
* Personal homepage - http://www.logix.cz/michal


* [PATCH 1/2] PadLock processing multiple blocks at a time
  2005-01-11 17:03     ` PadLock processing multiple blocks at a time Michal Ludvig
@ 2005-01-11 17:08       ` Michal Ludvig
  2005-01-14 13:10         ` [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers " Michal Ludvig
  2005-01-11 17:08       ` [PATCH 2/2] PadLock processing multiple blocks " Michal Ludvig
From: Michal Ludvig @ 2005-01-11 17:08 UTC (permalink / raw)
  To: David S. Miller; +Cc: jmorris, cryptoapi, linux-kernel

# 
# Extends crypto/cipher.c for offloading whole chaining modes
# to e.g. hardware crypto accelerators.
# 
#	Signed-off-by: Michal Ludvig <mludvig@suse.cz>
# 

Index: linux-2.6.10/crypto/api.c
===================================================================
--- linux-2.6.10.orig/crypto/api.c	2004-12-24 22:35:39.000000000 +0100
+++ linux-2.6.10/crypto/api.c	2005-01-10 16:37:11.943356651 +0100
@@ -217,6 +217,19 @@ int crypto_alg_available(const char *nam
 	return ret;
 }
 
+void *crypto_aligned_kmalloc(size_t size, int mode, size_t alignment, void **index)
+{
+	char *ptr;
+
+	ptr = kmalloc(size + alignment, mode);
+	*index = ptr;
+	if (alignment > 1 && ((long)ptr & (alignment - 1))) {
+		ptr += alignment - ((long)ptr & (alignment - 1));
+	}
+
+	return ptr;
+}
+
 static int __init init_crypto(void)
 {
 	printk(KERN_INFO "Initializing Cryptographic API\n");
@@ -231,3 +244,4 @@ EXPORT_SYMBOL_GPL(crypto_unregister_alg)
 EXPORT_SYMBOL_GPL(crypto_alloc_tfm);
 EXPORT_SYMBOL_GPL(crypto_free_tfm);
 EXPORT_SYMBOL_GPL(crypto_alg_available);
+EXPORT_SYMBOL_GPL(crypto_aligned_kmalloc);
Index: linux-2.6.10/include/linux/crypto.h
===================================================================
--- linux-2.6.10.orig/include/linux/crypto.h	2005-01-07 17:26:42.000000000 +0100
+++ linux-2.6.10/include/linux/crypto.h	2005-01-10 16:37:52.157648454 +0100
@@ -42,6 +42,7 @@
 #define CRYPTO_TFM_MODE_CBC		0x00000002
 #define CRYPTO_TFM_MODE_CFB		0x00000004
 #define CRYPTO_TFM_MODE_CTR		0x00000008
+#define CRYPTO_TFM_MODE_OFB		0x00000010
 
 #define CRYPTO_TFM_REQ_WEAK_KEY		0x00000100
 #define CRYPTO_TFM_RES_WEAK_KEY		0x00100000
@@ -72,6 +73,18 @@ struct cipher_alg {
 	                  unsigned int keylen, u32 *flags);
 	void (*cia_encrypt)(void *ctx, u8 *dst, const u8 *src);
 	void (*cia_decrypt)(void *ctx, u8 *dst, const u8 *src);
+	size_t cia_max_nbytes;
+	size_t cia_req_align;
+	void (*cia_ecb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_cbc)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_cfb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_ofb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_ctr)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
 };
 
 struct digest_alg {
@@ -124,6 +137,11 @@ int crypto_unregister_alg(struct crypto_
 int crypto_alg_available(const char *name, u32 flags);
 
 /*
+ * Helper function.
+ */
+void *crypto_aligned_kmalloc (size_t size, int mode, size_t alignment, void **index);
+
+/*
  * Transforms: user-instantiated objects which encapsulate algorithms
  * and core processing logic.  Managed via crypto_alloc_tfm() and
  * crypto_free_tfm(), as well as the various helpers below.
@@ -258,6 +276,18 @@ static inline unsigned int crypto_tfm_al
 	return tfm->__crt_alg->cra_digest.dia_digestsize;
 }
 
+static inline unsigned int crypto_tfm_alg_max_nbytes(struct crypto_tfm *tfm)
+{
+	BUG_ON(crypto_tfm_alg_type(tfm) != CRYPTO_ALG_TYPE_CIPHER);
+	return tfm->__crt_alg->cra_cipher.cia_max_nbytes;
+}
+
+static inline unsigned int crypto_tfm_alg_req_align(struct crypto_tfm *tfm)
+{
+	BUG_ON(crypto_tfm_alg_type(tfm) != CRYPTO_ALG_TYPE_CIPHER);
+	return tfm->__crt_alg->cra_cipher.cia_req_align;
+}
+
 /*
  * API wrappers.
  */
Index: linux-2.6.10/crypto/cipher.c
===================================================================
--- linux-2.6.10.orig/crypto/cipher.c	2004-12-24 22:34:57.000000000 +0100
+++ linux-2.6.10/crypto/cipher.c	2005-01-10 16:37:11.974350710 +0100
@@ -20,7 +20,31 @@
 #include "internal.h"
 #include "scatterwalk.h"
 
+#define CRA_CIPHER(tfm)	(tfm)->__crt_alg->cra_cipher
+
+#define DEF_TFM_FUNCTION(name,mode,encdec,iv)	\
+static int name(struct crypto_tfm *tfm,		\
+                struct scatterlist *dst,	\
+                struct scatterlist *src,	\
+		unsigned int nbytes)		\
+{						\
+	return crypt(tfm, dst, src, nbytes,	\
+		     mode, encdec, iv);		\
+}
+
+#define DEF_TFM_FUNCTION_IV(name,mode,encdec,iv)	\
+static int name(struct crypto_tfm *tfm,		\
+                struct scatterlist *dst,	\
+                struct scatterlist *src,	\
+		unsigned int nbytes, u8 *iv)	\
+{						\
+	return crypt(tfm, dst, src, nbytes,	\
+		     mode, encdec, iv);		\
+}
+
 typedef void (cryptfn_t)(void *, u8 *, const u8 *);
+typedef void (cryptblkfn_t)(void *, u8 *, const u8 *, u8 *,
+			    size_t, int, int);
 typedef void (procfn_t)(struct crypto_tfm *, u8 *,
                         u8*, cryptfn_t, int enc, void *, int);
 
@@ -38,6 +62,36 @@ static inline void xor_128(u8 *a, const 
 	((u32 *)a)[3] ^= ((u32 *)b)[3];
 }
 
+static void cbc_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
+			cryptfn_t *fn, int enc, void *info, int in_place)
+{
+	u8 *iv = info;
+	
+	/* Null encryption */
+	if (!iv)
+		return;
+		
+	if (enc) {
+		tfm->crt_u.cipher.cit_xor_block(iv, src);
+		(*fn)(crypto_tfm_ctx(tfm), dst, iv);
+		memcpy(iv, dst, crypto_tfm_alg_blocksize(tfm));
+	} else {
+		u8 stack[in_place ? crypto_tfm_alg_blocksize(tfm) : 0];
+		u8 *buf = in_place ? stack : dst;
+
+		(*fn)(crypto_tfm_ctx(tfm), buf, src);
+		tfm->crt_u.cipher.cit_xor_block(buf, iv);
+		memcpy(iv, src, crypto_tfm_alg_blocksize(tfm));
+		if (buf != dst)
+			memcpy(dst, buf, crypto_tfm_alg_blocksize(tfm));
+	}
+}
+
+static void ecb_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
+			cryptfn_t fn, int enc, void *info, int in_place)
+{
+	(*fn)(crypto_tfm_ctx(tfm), dst, src);
+}
 
 /* 
  * Generic encrypt/decrypt wrapper for ciphers, handles operations across
@@ -47,22 +101,101 @@ static inline void xor_128(u8 *a, const 
 static int crypt(struct crypto_tfm *tfm,
 		 struct scatterlist *dst,
 		 struct scatterlist *src,
-                 unsigned int nbytes, cryptfn_t crfn,
-                 procfn_t prfn, int enc, void *info)
+		 unsigned int nbytes, 
+		 int mode, int enc, void *info)
 {
-	struct scatter_walk walk_in, walk_out;
-	const unsigned int bsize = crypto_tfm_alg_blocksize(tfm);
-	u8 tmp_src[bsize];
-	u8 tmp_dst[bsize];
+ 	cryptfn_t *cryptofn = NULL;
+ 	procfn_t *processfn = NULL;
+ 	cryptblkfn_t *cryptomultiblockfn = NULL;
+ 
+ 	struct scatter_walk walk_in, walk_out;
+ 	size_t max_nbytes = crypto_tfm_alg_max_nbytes(tfm);
+ 	size_t bsize = crypto_tfm_alg_blocksize(tfm);
+ 	int req_align = crypto_tfm_alg_req_align(tfm);
+ 	int ret = 0;
+	int gfp;
+ 	void *index_src = NULL, *index_dst = NULL;
+ 	u8 *iv = info;
+ 	u8 *tmp_src, *tmp_dst;
 
 	if (!nbytes)
-		return 0;
+		return ret;
 
 	if (nbytes % bsize) {
 		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_BLOCK_LEN;
-		return -EINVAL;
+		ret = -EINVAL;
+		goto out;
 	}
 
+ 
+ 	switch (mode) {
+ 		case CRYPTO_TFM_MODE_ECB:
+ 			if (CRA_CIPHER(tfm).cia_ecb)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_ecb;
+ 			else {
+ 				cryptofn = (enc == CRYPTO_DIR_ENCRYPT) ?
+						CRA_CIPHER(tfm).cia_encrypt :
+						CRA_CIPHER(tfm).cia_decrypt;
+ 				processfn = ecb_process;
+ 			}
+ 			break;
+ 
+ 		case CRYPTO_TFM_MODE_CBC:
+ 			if (CRA_CIPHER(tfm).cia_cbc)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_cbc;
+ 			else {
+ 				cryptofn = (enc == CRYPTO_DIR_ENCRYPT) ?
+						CRA_CIPHER(tfm).cia_encrypt :
+						CRA_CIPHER(tfm).cia_decrypt;
+ 				processfn = cbc_process;
+ 			}
+ 			break;
+ 
+		/* Until we have the appropriate {ofb,cfb,ctr}_process()
+		   functions, the following cases will return -ENOSYS if
+		   there is no HW support for the mode. */
+ 		case CRYPTO_TFM_MODE_OFB:
+ 			if (CRA_CIPHER(tfm).cia_ofb)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_ofb;
+ 			else
+ 				return -ENOSYS;
+ 			break;
+ 
+ 		case CRYPTO_TFM_MODE_CFB:
+ 			if (CRA_CIPHER(tfm).cia_cfb)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_cfb;
+ 			else
+ 				return -ENOSYS;
+ 			break;
+ 
+ 		case CRYPTO_TFM_MODE_CTR:
+ 			if (CRA_CIPHER(tfm).cia_ctr)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_ctr;
+ 			else
+ 				return -ENOSYS;
+ 			break;
+ 
+ 		default:
+ 			BUG();
+ 	}
+ 
+	if (cryptomultiblockfn)
+		bsize = (max_nbytes > nbytes) ? nbytes : max_nbytes;
+ 
+ 	/* Some hardware crypto engines may require a specific 
+ 	   alignment of the buffers. We will align the buffers
+ 	   already here to avoid their reallocating later. */
+	gfp = in_atomic() ? GFP_ATOMIC : GFP_KERNEL;
+	tmp_src = crypto_aligned_kmalloc(bsize, gfp,
+					 req_align, &index_src);
+	tmp_dst = crypto_aligned_kmalloc(bsize, gfp,
+					 req_align, &index_dst);
+ 
+ 	if (!index_src || !index_dst) {
+		ret = -ENOMEM;
+		goto out;
+  	}
+
 	scatterwalk_start(&walk_in, src);
 	scatterwalk_start(&walk_out, dst);
 
@@ -81,7 +214,13 @@ static int crypt(struct crypto_tfm *tfm,
 
 		scatterwalk_copychunks(src_p, &walk_in, bsize, 0);
 
-		prfn(tfm, dst_p, src_p, crfn, enc, info, in_place);
+ 		if (cryptomultiblockfn)
+ 			(*cryptomultiblockfn)(crypto_tfm_ctx(tfm),
+					      dst_p, src_p, iv,
+					      bsize, enc, in_place);
+ 		else
+ 			(*processfn)(tfm, dst_p, src_p, cryptofn,
+				     enc, info, in_place);
 
 		scatterwalk_done(&walk_in, 0, nbytes);
 
@@ -89,46 +228,23 @@ static int crypt(struct crypto_tfm *tfm,
 		scatterwalk_done(&walk_out, 1, nbytes);
 
 		if (!nbytes)
-			return 0;
+			goto out;
 
 		crypto_yield(tfm);
 	}
-}
-
-static void cbc_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
-			cryptfn_t fn, int enc, void *info, int in_place)
-{
-	u8 *iv = info;
-	
-	/* Null encryption */
-	if (!iv)
-		return;
-		
-	if (enc) {
-		tfm->crt_u.cipher.cit_xor_block(iv, src);
-		fn(crypto_tfm_ctx(tfm), dst, iv);
-		memcpy(iv, dst, crypto_tfm_alg_blocksize(tfm));
-	} else {
-		u8 stack[in_place ? crypto_tfm_alg_blocksize(tfm) : 0];
-		u8 *buf = in_place ? stack : dst;
 
-		fn(crypto_tfm_ctx(tfm), buf, src);
-		tfm->crt_u.cipher.cit_xor_block(buf, iv);
-		memcpy(iv, src, crypto_tfm_alg_blocksize(tfm));
-		if (buf != dst)
-			memcpy(dst, buf, crypto_tfm_alg_blocksize(tfm));
-	}
-}
+out:
+	if (index_src)
+		kfree(index_src);
+	if (index_dst)
+		kfree(index_dst);
 
-static void ecb_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
-			cryptfn_t fn, int enc, void *info, int in_place)
-{
-	fn(crypto_tfm_ctx(tfm), dst, src);
+	return ret;
 }
 
 static int setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen)
 {
-	struct cipher_alg *cia = &tfm->__crt_alg->cra_cipher;
+	struct cipher_alg *cia = &CRA_CIPHER(tfm);
 	
 	if (keylen < cia->cia_min_keysize || keylen > cia->cia_max_keysize) {
 		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
@@ -138,80 +254,28 @@ static int setkey(struct crypto_tfm *tfm
 		                       &tfm->crt_flags);
 }
 
-static int ecb_encrypt(struct crypto_tfm *tfm,
-		       struct scatterlist *dst,
-                       struct scatterlist *src, unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_encrypt,
-	             ecb_process, 1, NULL);
-}
+DEF_TFM_FUNCTION(ecb_encrypt, CRYPTO_TFM_MODE_ECB, CRYPTO_DIR_ENCRYPT, NULL);
+DEF_TFM_FUNCTION(ecb_decrypt, CRYPTO_TFM_MODE_ECB, CRYPTO_DIR_DECRYPT, NULL);
 
-static int ecb_decrypt(struct crypto_tfm *tfm,
-                       struct scatterlist *dst,
-                       struct scatterlist *src,
-		       unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_decrypt,
-	             ecb_process, 1, NULL);
-}
-
-static int cbc_encrypt(struct crypto_tfm *tfm,
-                       struct scatterlist *dst,
-                       struct scatterlist *src,
-		       unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_encrypt,
-	             cbc_process, 1, tfm->crt_cipher.cit_iv);
-}
-
-static int cbc_encrypt_iv(struct crypto_tfm *tfm,
-                          struct scatterlist *dst,
-                          struct scatterlist *src,
-                          unsigned int nbytes, u8 *iv)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_encrypt,
-	             cbc_process, 1, iv);
-}
-
-static int cbc_decrypt(struct crypto_tfm *tfm,
-                       struct scatterlist *dst,
-                       struct scatterlist *src,
-		       unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_decrypt,
-	             cbc_process, 0, tfm->crt_cipher.cit_iv);
-}
-
-static int cbc_decrypt_iv(struct crypto_tfm *tfm,
-                          struct scatterlist *dst,
-                          struct scatterlist *src,
-                          unsigned int nbytes, u8 *iv)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_decrypt,
-	             cbc_process, 0, iv);
-}
-
-static int nocrypt(struct crypto_tfm *tfm,
-                   struct scatterlist *dst,
-                   struct scatterlist *src,
-		   unsigned int nbytes)
-{
-	return -ENOSYS;
-}
-
-static int nocrypt_iv(struct crypto_tfm *tfm,
-                      struct scatterlist *dst,
-                      struct scatterlist *src,
-                      unsigned int nbytes, u8 *iv)
-{
-	return -ENOSYS;
-}
+DEF_TFM_FUNCTION(cbc_encrypt, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cbc_encrypt_iv, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(cbc_decrypt, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cbc_decrypt_iv, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_DECRYPT, iv);
+
+DEF_TFM_FUNCTION(cfb_encrypt, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cfb_encrypt_iv, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(cfb_decrypt, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cfb_decrypt_iv, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_DECRYPT, iv);
+
+DEF_TFM_FUNCTION(ofb_encrypt, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ofb_encrypt_iv, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(ofb_decrypt, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ofb_decrypt_iv, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_DECRYPT, iv);
+
+DEF_TFM_FUNCTION(ctr_encrypt, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ctr_encrypt_iv, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(ctr_decrypt, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ctr_decrypt_iv, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_DECRYPT, iv);
 
 int crypto_init_cipher_flags(struct crypto_tfm *tfm, u32 flags)
 {
@@ -245,17 +309,24 @@ int crypto_init_cipher_ops(struct crypto
 		break;
 		
 	case CRYPTO_TFM_MODE_CFB:
-		ops->cit_encrypt = nocrypt;
-		ops->cit_decrypt = nocrypt;
-		ops->cit_encrypt_iv = nocrypt_iv;
-		ops->cit_decrypt_iv = nocrypt_iv;
+		ops->cit_encrypt = cfb_encrypt;
+		ops->cit_decrypt = cfb_decrypt;
+		ops->cit_encrypt_iv = cfb_encrypt_iv;
+		ops->cit_decrypt_iv = cfb_decrypt_iv;
+		break;
+	
+	case CRYPTO_TFM_MODE_OFB:
+		ops->cit_encrypt = ofb_encrypt;
+		ops->cit_decrypt = ofb_decrypt;
+		ops->cit_encrypt_iv = ofb_encrypt_iv;
+		ops->cit_decrypt_iv = ofb_decrypt_iv;
 		break;
 	
 	case CRYPTO_TFM_MODE_CTR:
-		ops->cit_encrypt = nocrypt;
-		ops->cit_decrypt = nocrypt;
-		ops->cit_encrypt_iv = nocrypt_iv;
-		ops->cit_decrypt_iv = nocrypt_iv;
+		ops->cit_encrypt = ctr_encrypt;
+		ops->cit_decrypt = ctr_decrypt;
+		ops->cit_encrypt_iv = ctr_encrypt_iv;
+		ops->cit_decrypt_iv = ctr_decrypt_iv;
 		break;
 
 	default:


* [PATCH 2/2] PadLock processing multiple blocks at a time
  2005-01-11 17:03     ` PadLock processing multiple blocks at a time Michal Ludvig
  2005-01-11 17:08       ` [PATCH 1/2] " Michal Ludvig
@ 2005-01-11 17:08       ` Michal Ludvig
  2005-01-14  3:05         ` Andrew Morton
  2005-01-14 13:15         ` [PATCH 2/2] CryptoAPI: Update PadLock to process multiple blocks at once Michal Ludvig
From: Michal Ludvig @ 2005-01-11 17:08 UTC (permalink / raw)
  To: David S. Miller; +Cc: jmorris, cryptoapi, linux-kernel

# 
# Update to padlock-aes.c that enables processing of the whole 
# buffer of data at once with the given chaining mode (e.g. CBC).
# 
# Signed-off-by: Michal Ludvig <michal@logix.cz>
# 
Index: linux-2.6.10/drivers/crypto/padlock-aes.c
===================================================================
--- linux-2.6.10.orig/drivers/crypto/padlock-aes.c	2005-01-07 17:26:42.000000000 +0100
+++ linux-2.6.10/drivers/crypto/padlock-aes.c	2005-01-10 17:59:17.000000000 +0100
@@ -369,19 +369,54 @@ aes_set_key(void *ctx_arg, const uint8_t
 
 /* ====== Encryption/decryption routines ====== */
 
-/* This is the real call to PadLock. */
-static inline void
+/* These are the real calls to PadLock. */
+static inline void *
 padlock_xcrypt_ecb(uint8_t *input, uint8_t *output, uint8_t *key,
-		   void *control_word, uint32_t count)
+		   uint8_t *iv, void *control_word, uint32_t count)
 {
 	asm volatile ("pushfl; popfl");		/* enforce key reload. */
 	asm volatile (".byte 0xf3,0x0f,0xa7,0xc8"	/* rep xcryptecb */
 		      : "+S"(input), "+D"(output)
 		      : "d"(control_word), "b"(key), "c"(count));
+	return NULL;
+}
+
+static inline void *
+padlock_xcrypt_cbc(uint8_t *input, uint8_t *output, uint8_t *key,
+		   uint8_t *iv, void *control_word, uint32_t count)
+{
+	asm volatile ("pushfl; popfl");		/* enforce key reload. */
+	asm volatile (".byte 0xf3,0x0f,0xa7,0xd0"	/* rep xcryptcbc */
+		      : "=m"(*output), "+S"(input), "+D"(output), "+a"(iv)
+		      : "d"(control_word), "b"(key), "c"(count));
+	return iv;
+}
+
+static inline void *
+padlock_xcrypt_cfb(uint8_t *input, uint8_t *output, uint8_t *key,
+		   uint8_t *iv, void *control_word, uint32_t count)
+{
+	asm volatile ("pushfl; popfl");		/* enforce key reload. */
+	asm volatile (".byte 0xf3,0x0f,0xa7,0xe0"	/* rep xcryptcfb */
+		      : "=m"(*output), "+S"(input), "+D"(output), "+a"(iv)
+		      : "d"(control_word), "b"(key), "c"(count));
+	return iv;
+}
+
+static inline void *
+padlock_xcrypt_ofb(uint8_t *input, uint8_t *output, uint8_t *key,
+		   uint8_t *iv, void *control_word, uint32_t count)
+{
+	asm volatile ("pushfl; popfl");		/* enforce key reload. */
+	asm volatile (".byte 0xf3,0x0f,0xa7,0xe8"	/* rep xcryptofb */
+		      : "=m"(*output), "+S"(input), "+D"(output), "+a"(iv)
+		      : "d"(control_word), "b"(key), "c"(count));
+	return iv;
 }
 
 static void
-aes_padlock(void *ctx_arg, uint8_t *out_arg, const uint8_t *in_arg, int encdec)
+aes_padlock(void *ctx_arg, uint8_t *out_arg, const uint8_t *in_arg,
+	    uint8_t *iv_arg, size_t nbytes, int encdec, int mode)
 {
 	/* Don't blindly modify this structure - the items must 
 	   fit on 16-Bytes boundaries! */
@@ -419,21 +454,126 @@ aes_padlock(void *ctx_arg, uint8_t *out_
 	else
 		key = ctx->D;
 	
-	memcpy(data->buf, in_arg, AES_BLOCK_SIZE);
-	padlock_xcrypt_ecb(data->buf, data->buf, key, &data->cword, 1);
-	memcpy(out_arg, data->buf, AES_BLOCK_SIZE);
+	if (nbytes == AES_BLOCK_SIZE) {
+		/* Processing one block only => ECB is enough */
+		memcpy(data->buf, in_arg, AES_BLOCK_SIZE);
+		padlock_xcrypt_ecb(data->buf, data->buf, key, NULL,
+				   &data->cword, 1);
+		memcpy(out_arg, data->buf, AES_BLOCK_SIZE);
+	} else {
+		/* Processing multiple blocks at once */
+		uint8_t *in, *out, *iv;
+		int gfp = in_atomic() ? GFP_ATOMIC : GFP_KERNEL;
+		void *index = NULL;
+
+		if (unlikely(((long)in_arg) & 0x0F)) {
+			in = crypto_aligned_kmalloc(nbytes, gfp, 16, &index);
+			memcpy(in, in_arg, nbytes);
+		}
+		else
+			in = (uint8_t*)in_arg;
+
+		if (unlikely(((long)out_arg) & 0x0F)) {
+			if (index)
+				out = in;	/* xcrypt can work "in place" */
+			else
+				out = crypto_aligned_kmalloc(nbytes, gfp, 16,
+							     &index);
+		}
+		else
+			out = out_arg;
+
+		/* Always make a local copy of IV - xcrypt may change it! */
+		iv = data->buf;
+		if (iv_arg)
+			memcpy(iv, iv_arg, AES_BLOCK_SIZE);
+
+		switch (mode) {
+			case CRYPTO_TFM_MODE_ECB:
+				iv = padlock_xcrypt_ecb(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			case CRYPTO_TFM_MODE_CBC:
+				iv = padlock_xcrypt_cbc(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			case CRYPTO_TFM_MODE_CFB:
+				iv = padlock_xcrypt_cfb(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			case CRYPTO_TFM_MODE_OFB:
+				iv = padlock_xcrypt_ofb(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			default:
+				BUG();
+		}
+
+		/* Back up IV */
+		if (iv && iv_arg)
+			memcpy(iv_arg, iv, AES_BLOCK_SIZE);
+
+		/* Copy the 16-Byte aligned output to the caller's buffer. */
+		if (out != out_arg)
+			memcpy(out_arg, out, nbytes);
+
+		if (index)
+			kfree(index);
+	}
+}
+
+static void
+aes_padlock_ecb(void *ctx, uint8_t *dst, const uint8_t *src,
+		uint8_t *iv, size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, NULL, nbytes, encdec,
+		    CRYPTO_TFM_MODE_ECB);
+}
+
+static void
+aes_padlock_cbc(void *ctx, uint8_t *dst, const uint8_t *src, uint8_t *iv,
+		size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, iv, nbytes, encdec,
+		    CRYPTO_TFM_MODE_CBC);
+}
+
+static void
+aes_padlock_cfb(void *ctx, uint8_t *dst, const uint8_t *src, uint8_t *iv,
+		size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, iv, nbytes, encdec,
+		    CRYPTO_TFM_MODE_CFB);
+}
+
+static void
+aes_padlock_ofb(void *ctx, uint8_t *dst, const uint8_t *src, uint8_t *iv,
+		size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, iv, nbytes, encdec,
+		    CRYPTO_TFM_MODE_OFB);
 }
 
 static void
 aes_encrypt(void *ctx_arg, uint8_t *out, const uint8_t *in)
 {
-	aes_padlock(ctx_arg, out, in, CRYPTO_DIR_ENCRYPT);
+	aes_padlock(ctx_arg, out, in, NULL, AES_BLOCK_SIZE,
+		    CRYPTO_DIR_ENCRYPT, CRYPTO_TFM_MODE_ECB);
 }
 
 static void
 aes_decrypt(void *ctx_arg, uint8_t *out, const uint8_t *in)
 {
-	aes_padlock(ctx_arg, out, in, CRYPTO_DIR_DECRYPT);
+	aes_padlock(ctx_arg, out, in, NULL, AES_BLOCK_SIZE,
+		    CRYPTO_DIR_DECRYPT, CRYPTO_TFM_MODE_ECB);
 }
 
 static struct crypto_alg aes_alg = {
@@ -454,9 +594,25 @@ static struct crypto_alg aes_alg = {
 	}
 };
 
+static int disable_multiblock = 0;
+MODULE_PARM(disable_multiblock, "i");
+MODULE_PARM_DESC(disable_multiblock,
+		 "Disable encryption of whole multiblock buffers.");
+
 int __init padlock_init_aes(void)
 {
-	printk(KERN_NOTICE PFX "Using VIA PadLock ACE for AES algorithm.\n");
+	if (!disable_multiblock) {
+		aes_alg.cra_u.cipher.cia_max_nbytes = (size_t)-1;
+		aes_alg.cra_u.cipher.cia_req_align  = 16;
+		aes_alg.cra_u.cipher.cia_ecb        = aes_padlock_ecb;
+		aes_alg.cra_u.cipher.cia_cbc        = aes_padlock_cbc;
+		aes_alg.cra_u.cipher.cia_cfb        = aes_padlock_cfb;
+		aes_alg.cra_u.cipher.cia_ofb        = aes_padlock_ofb;
+	}
+
+	printk(KERN_NOTICE PFX 
+		"Using VIA PadLock ACE for AES algorithm%s.\n", 
+		disable_multiblock ? "" : " (multiblock)");
 
 	gen_tabs();
 	return crypto_register_alg(&aes_alg);


* Re: [PATCH 2/2] PadLock processing multiple blocks at a time
  2005-01-11 17:08       ` [PATCH 2/2] PadLock processing multiple blocks " Michal Ludvig
@ 2005-01-14  3:05         ` Andrew Morton
  2005-01-14 13:15         ` [PATCH 2/2] CryptoAPI: Update PadLock to process multiple blocks at once Michal Ludvig
From: Andrew Morton @ 2005-01-14  3:05 UTC (permalink / raw)
  To: Michal Ludvig; +Cc: davem, jmorris, cryptoapi, linux-kernel

Michal Ludvig <michal@logix.cz> wrote:
>
> # 
> # Update to padlock-aes.c that enables processing of the whole 
> # buffer of data at once with the given chaining mode (e.g. CBC).
> # 

Please don't email different patches under the same Subject:.  Choose a
Subject: which is meaningful for each patch.

This one kills gcc-2.95.x:

drivers/crypto/padlock-aes.c: In function `aes_padlock':
drivers/crypto/padlock-aes.c:391: impossible register constraint in `asm'
drivers/crypto/padlock-aes.c:402: impossible register constraint in `asm'
drivers/crypto/padlock-aes.c:413: impossible register constraint in `asm'
drivers/crypto/padlock-aes.c:391: `asm' needs too many reloads



* [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-11 17:08       ` [PATCH 1/2] " Michal Ludvig
@ 2005-01-14 13:10         ` Michal Ludvig
  2005-01-14 14:20           ` Fruhwirth Clemens
From: Michal Ludvig @ 2005-01-14 13:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David S. Miller, jmorris, cryptoapi, linux-kernel

Hi all,

I'm resending this patch with trailing spaces removed per Andrew's 
comment.

This patch extends crypto/cipher.c for offloading whole chaining modes
to e.g. hardware crypto accelerators. It is much faster to let the 
hardware do all the chaining if it can do so.

Signed-off-by: Michal Ludvig <michal@logix.cz>

---

 crypto/api.c           |   14 ++
 crypto/cipher.c        |  313 ++++++++++++++++++++++++++++++-------------------
 include/linux/crypto.h |   30 ++++
 3 files changed, 236 insertions(+), 121 deletions(-)


Index: linux-2.6.10/crypto/api.c
===================================================================
--- linux-2.6.10.orig/crypto/api.c	2004-12-24 22:35:39.000000000 +0100
+++ linux-2.6.10/crypto/api.c	2005-01-10 16:37:11.943356651 +0100
@@ -217,6 +217,19 @@ int crypto_alg_available(const char *nam
 	return ret;
 }
 
+void *crypto_aligned_kmalloc(size_t size, int mode, size_t alignment, void **index)
+{
+	char *ptr;
+
+	ptr = kmalloc(size + alignment, mode);
+	*index = ptr;
+	if (alignment > 1 && ((long)ptr & (alignment - 1))) {
+		ptr += alignment - ((long)ptr & (alignment - 1));
+	}
+
+	return ptr;
+}
+
 static int __init init_crypto(void)
 {
 	printk(KERN_INFO "Initializing Cryptographic API\n");
@@ -231,3 +244,4 @@ EXPORT_SYMBOL_GPL(crypto_unregister_alg)
 EXPORT_SYMBOL_GPL(crypto_alloc_tfm);
 EXPORT_SYMBOL_GPL(crypto_free_tfm);
 EXPORT_SYMBOL_GPL(crypto_alg_available);
+EXPORT_SYMBOL_GPL(crypto_aligned_kmalloc);
Index: linux-2.6.10/include/linux/crypto.h
===================================================================
--- linux-2.6.10.orig/include/linux/crypto.h	2005-01-07 17:26:42.000000000 +0100
+++ linux-2.6.10/include/linux/crypto.h	2005-01-10 16:37:52.157648454 +0100
@@ -42,6 +42,7 @@
 #define CRYPTO_TFM_MODE_CBC		0x00000002
 #define CRYPTO_TFM_MODE_CFB		0x00000004
 #define CRYPTO_TFM_MODE_CTR		0x00000008
+#define CRYPTO_TFM_MODE_OFB		0x00000010
 
 #define CRYPTO_TFM_REQ_WEAK_KEY		0x00000100
 #define CRYPTO_TFM_RES_WEAK_KEY		0x00100000
@@ -72,6 +73,18 @@ struct cipher_alg {
 	                  unsigned int keylen, u32 *flags);
 	void (*cia_encrypt)(void *ctx, u8 *dst, const u8 *src);
 	void (*cia_decrypt)(void *ctx, u8 *dst, const u8 *src);
+	size_t cia_max_nbytes;
+	size_t cia_req_align;
+	void (*cia_ecb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_cbc)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_cfb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_ofb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
+	void (*cia_ctr)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
+			size_t nbytes, int encdec, int inplace);
 };
 
 struct digest_alg {
@@ -124,6 +137,11 @@ int crypto_unregister_alg(struct crypto_
 int crypto_alg_available(const char *name, u32 flags);
 
 /*
+ * Helper function.
+ */
+void *crypto_aligned_kmalloc (size_t size, int mode, size_t alignment, void **index);
+
+/*
  * Transforms: user-instantiated objects which encapsulate algorithms
  * and core processing logic.  Managed via crypto_alloc_tfm() and
  * crypto_free_tfm(), as well as the various helpers below.
@@ -258,6 +276,18 @@ static inline unsigned int crypto_tfm_al
 	return tfm->__crt_alg->cra_digest.dia_digestsize;
 }
 
+static inline unsigned int crypto_tfm_alg_max_nbytes(struct crypto_tfm *tfm)
+{
+	BUG_ON(crypto_tfm_alg_type(tfm) != CRYPTO_ALG_TYPE_CIPHER);
+	return tfm->__crt_alg->cra_cipher.cia_max_nbytes;
+}
+
+static inline unsigned int crypto_tfm_alg_req_align(struct crypto_tfm *tfm)
+{
+	BUG_ON(crypto_tfm_alg_type(tfm) != CRYPTO_ALG_TYPE_CIPHER);
+	return tfm->__crt_alg->cra_cipher.cia_req_align;
+}
+
 /*
  * API wrappers.
  */
Index: linux-2.6.10/crypto/cipher.c
===================================================================
--- linux-2.6.10.orig/crypto/cipher.c	2004-12-24 22:34:57.000000000 +0100
+++ linux-2.6.10/crypto/cipher.c	2005-01-10 16:37:11.974350710 +0100
@@ -20,7 +20,31 @@
 #include "internal.h"
 #include "scatterwalk.h"
 
+#define CRA_CIPHER(tfm)	(tfm)->__crt_alg->cra_cipher
+
+#define DEF_TFM_FUNCTION(name,mode,encdec,iv)	\
+static int name(struct crypto_tfm *tfm,		\
+                struct scatterlist *dst,	\
+                struct scatterlist *src,	\
+		unsigned int nbytes)		\
+{						\
+	return crypt(tfm, dst, src, nbytes,	\
+		     mode, encdec, iv);		\
+}
+
+#define DEF_TFM_FUNCTION_IV(name,mode,encdec,iv)	\
+static int name(struct crypto_tfm *tfm,		\
+                struct scatterlist *dst,	\
+                struct scatterlist *src,	\
+		unsigned int nbytes, u8 *iv)	\
+{						\
+	return crypt(tfm, dst, src, nbytes,	\
+		     mode, encdec, iv);		\
+}
+
 typedef void (cryptfn_t)(void *, u8 *, const u8 *);
+typedef void (cryptblkfn_t)(void *, u8 *, const u8 *, u8 *,
+			    size_t, int, int);
 typedef void (procfn_t)(struct crypto_tfm *, u8 *,
                         u8*, cryptfn_t, int enc, void *, int);
 
@@ -38,6 +62,36 @@ static inline void xor_128(u8 *a, const 
 	((u32 *)a)[3] ^= ((u32 *)b)[3];
 }
 
+static void cbc_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
+			cryptfn_t *fn, int enc, void *info, int in_place)
+{
+	u8 *iv = info;
+
+	/* Null encryption */
+	if (!iv)
+		return;
+
+	if (enc) {
+		tfm->crt_u.cipher.cit_xor_block(iv, src);
+		(*fn)(crypto_tfm_ctx(tfm), dst, iv);
+		memcpy(iv, dst, crypto_tfm_alg_blocksize(tfm));
+	} else {
+		u8 stack[in_place ? crypto_tfm_alg_blocksize(tfm) : 0];
+		u8 *buf = in_place ? stack : dst;
+
+		(*fn)(crypto_tfm_ctx(tfm), buf, src);
+		tfm->crt_u.cipher.cit_xor_block(buf, iv);
+		memcpy(iv, src, crypto_tfm_alg_blocksize(tfm));
+		if (buf != dst)
+			memcpy(dst, buf, crypto_tfm_alg_blocksize(tfm));
+	}
+}
+
+static void ecb_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
+			cryptfn_t fn, int enc, void *info, int in_place)
+{
+	(*fn)(crypto_tfm_ctx(tfm), dst, src);
+}
 
 /*
  * Generic encrypt/decrypt wrapper for ciphers, handles operations across
@@ -47,22 +101,101 @@ static inline void xor_128(u8 *a, const 
 static int crypt(struct crypto_tfm *tfm,
 		 struct scatterlist *dst,
 		 struct scatterlist *src,
-                 unsigned int nbytes, cryptfn_t crfn,
-                 procfn_t prfn, int enc, void *info)
+		 unsigned int nbytes,
+		 int mode, int enc, void *info)
 {
-	struct scatter_walk walk_in, walk_out;
-	const unsigned int bsize = crypto_tfm_alg_blocksize(tfm);
-	u8 tmp_src[bsize];
-	u8 tmp_dst[bsize];
+ 	cryptfn_t *cryptofn = NULL;
+ 	procfn_t *processfn = NULL;
+ 	cryptblkfn_t *cryptomultiblockfn = NULL;
+
+ 	struct scatter_walk walk_in, walk_out;
+ 	size_t max_nbytes = crypto_tfm_alg_max_nbytes(tfm);
+ 	size_t bsize = crypto_tfm_alg_blocksize(tfm);
+ 	int req_align = crypto_tfm_alg_req_align(tfm);
+ 	int ret = 0;
+	int gfp;
+ 	void *index_src = NULL, *index_dst = NULL;
+ 	u8 *iv = info;
+ 	u8 *tmp_src, *tmp_dst;
 
 	if (!nbytes)
-		return 0;
+		return ret;
 
 	if (nbytes % bsize) {
 		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_BLOCK_LEN;
-		return -EINVAL;
+		ret = -EINVAL;
+		goto out;
 	}
 
+
+ 	switch (mode) {
+ 		case CRYPTO_TFM_MODE_ECB:
+ 			if (CRA_CIPHER(tfm).cia_ecb)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_ecb;
+ 			else {
+ 				cryptofn = (enc == CRYPTO_DIR_ENCRYPT) ?
+						CRA_CIPHER(tfm).cia_encrypt :
+						CRA_CIPHER(tfm).cia_decrypt;
+ 				processfn = ecb_process;
+ 			}
+ 			break;
+
+ 		case CRYPTO_TFM_MODE_CBC:
+ 			if (CRA_CIPHER(tfm).cia_cbc)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_cbc;
+ 			else {
+ 				cryptofn = (enc == CRYPTO_DIR_ENCRYPT) ?
+						CRA_CIPHER(tfm).cia_encrypt :
+						CRA_CIPHER(tfm).cia_decrypt;
+ 				processfn = cbc_process;
+ 			}
+ 			break;
+
+		/* Until we have the appropriate {ofb,cfb,ctr}_process()
+		   functions, the following cases will return -ENOSYS if
+		   there is no HW support for the mode. */
+ 		case CRYPTO_TFM_MODE_OFB:
+ 			if (CRA_CIPHER(tfm).cia_ofb)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_ofb;
+ 			else
+ 				return -ENOSYS;
+ 			break;
+
+ 		case CRYPTO_TFM_MODE_CFB:
+ 			if (CRA_CIPHER(tfm).cia_cfb)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_cfb;
+ 			else
+ 				return -ENOSYS;
+ 			break;
+
+ 		case CRYPTO_TFM_MODE_CTR:
+ 			if (CRA_CIPHER(tfm).cia_ctr)
+ 				cryptomultiblockfn = CRA_CIPHER(tfm).cia_ctr;
+ 			else
+ 				return -ENOSYS;
+ 			break;
+
+ 		default:
+ 			BUG();
+ 	}
+
+	if (cryptomultiblockfn)
+		bsize = (max_nbytes > nbytes) ? nbytes : max_nbytes;
+
+ 	/* Some hardware crypto engines may require a specific
+ 	   alignment of the buffers. We will align the buffers
+ 	   already here to avoid their reallocating later. */
+	gfp = in_atomic() ? GFP_ATOMIC : GFP_KERNEL;
+	tmp_src = crypto_aligned_kmalloc(bsize, gfp,
+					 req_align, &index_src);
+	tmp_dst = crypto_aligned_kmalloc(bsize, gfp,
+					 req_align, &index_dst);
+
+ 	if (!index_src || !index_dst) {
+		ret = -ENOMEM;
+		goto out;
+  	}
+
 	scatterwalk_start(&walk_in, src);
 	scatterwalk_start(&walk_out, dst);
 
@@ -81,7 +214,13 @@ static int crypt(struct crypto_tfm *tfm,
 
 		scatterwalk_copychunks(src_p, &walk_in, bsize, 0);
 
-		prfn(tfm, dst_p, src_p, crfn, enc, info, in_place);
+ 		if (cryptomultiblockfn)
+ 			(*cryptomultiblockfn)(crypto_tfm_ctx(tfm),
+					      dst_p, src_p, iv,
+					      bsize, enc, in_place);
+ 		else
+ 			(*processfn)(tfm, dst_p, src_p, cryptofn,
+				     enc, info, in_place);
 
 		scatterwalk_done(&walk_in, 0, nbytes);
 
@@ -89,46 +228,23 @@ static int crypt(struct crypto_tfm *tfm,
 		scatterwalk_done(&walk_out, 1, nbytes);
 
 		if (!nbytes)
-			return 0;
+			goto out;
 
 		crypto_yield(tfm);
 	}
-}
-
-static void cbc_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
-			cryptfn_t fn, int enc, void *info, int in_place)
-{
-	u8 *iv = info;
-	
-	/* Null encryption */
-	if (!iv)
-		return;
-		
-	if (enc) {
-		tfm->crt_u.cipher.cit_xor_block(iv, src);
-		fn(crypto_tfm_ctx(tfm), dst, iv);
-		memcpy(iv, dst, crypto_tfm_alg_blocksize(tfm));
-	} else {
-		u8 stack[in_place ? crypto_tfm_alg_blocksize(tfm) : 0];
-		u8 *buf = in_place ? stack : dst;
 
-		fn(crypto_tfm_ctx(tfm), buf, src);
-		tfm->crt_u.cipher.cit_xor_block(buf, iv);
-		memcpy(iv, src, crypto_tfm_alg_blocksize(tfm));
-		if (buf != dst)
-			memcpy(dst, buf, crypto_tfm_alg_blocksize(tfm));
-	}
-}
+out:
+	if (index_src)
+		kfree(index_src);
+	if (index_dst)
+		kfree(index_dst);
 
-static void ecb_process(struct crypto_tfm *tfm, u8 *dst, u8 *src,
-			cryptfn_t fn, int enc, void *info, int in_place)
-{
-	fn(crypto_tfm_ctx(tfm), dst, src);
+	return ret;
 }
 
 static int setkey(struct crypto_tfm *tfm, const u8 *key, unsigned int keylen)
 {
-	struct cipher_alg *cia = &tfm->__crt_alg->cra_cipher;
+	struct cipher_alg *cia = &CRA_CIPHER(tfm);
 	
 	if (keylen < cia->cia_min_keysize || keylen > cia->cia_max_keysize) {
 		tfm->crt_flags |= CRYPTO_TFM_RES_BAD_KEY_LEN;
@@ -138,80 +254,28 @@ static int setkey(struct crypto_tfm *tfm
 		                       &tfm->crt_flags);
 }
 
-static int ecb_encrypt(struct crypto_tfm *tfm,
-		       struct scatterlist *dst,
-                       struct scatterlist *src, unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_encrypt,
-	             ecb_process, 1, NULL);
-}
+DEF_TFM_FUNCTION(ecb_encrypt, CRYPTO_TFM_MODE_ECB, CRYPTO_DIR_ENCRYPT, NULL);
+DEF_TFM_FUNCTION(ecb_decrypt, CRYPTO_TFM_MODE_ECB, CRYPTO_DIR_DECRYPT, NULL);
 
-static int ecb_decrypt(struct crypto_tfm *tfm,
-                       struct scatterlist *dst,
-                       struct scatterlist *src,
-		       unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_decrypt,
-	             ecb_process, 1, NULL);
-}
-
-static int cbc_encrypt(struct crypto_tfm *tfm,
-                       struct scatterlist *dst,
-                       struct scatterlist *src,
-		       unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_encrypt,
-	             cbc_process, 1, tfm->crt_cipher.cit_iv);
-}
-
-static int cbc_encrypt_iv(struct crypto_tfm *tfm,
-                          struct scatterlist *dst,
-                          struct scatterlist *src,
-                          unsigned int nbytes, u8 *iv)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_encrypt,
-	             cbc_process, 1, iv);
-}
-
-static int cbc_decrypt(struct crypto_tfm *tfm,
-                       struct scatterlist *dst,
-                       struct scatterlist *src,
-		       unsigned int nbytes)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_decrypt,
-	             cbc_process, 0, tfm->crt_cipher.cit_iv);
-}
-
-static int cbc_decrypt_iv(struct crypto_tfm *tfm,
-                          struct scatterlist *dst,
-                          struct scatterlist *src,
-                          unsigned int nbytes, u8 *iv)
-{
-	return crypt(tfm, dst, src, nbytes,
-	             tfm->__crt_alg->cra_cipher.cia_decrypt,
-	             cbc_process, 0, iv);
-}
-
-static int nocrypt(struct crypto_tfm *tfm,
-                   struct scatterlist *dst,
-                   struct scatterlist *src,
-		   unsigned int nbytes)
-{
-	return -ENOSYS;
-}
-
-static int nocrypt_iv(struct crypto_tfm *tfm,
-                      struct scatterlist *dst,
-                      struct scatterlist *src,
-                      unsigned int nbytes, u8 *iv)
-{
-	return -ENOSYS;
-}
+DEF_TFM_FUNCTION(cbc_encrypt, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cbc_encrypt_iv, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(cbc_decrypt, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cbc_decrypt_iv, CRYPTO_TFM_MODE_CBC, CRYPTO_DIR_DECRYPT, iv);
+
+DEF_TFM_FUNCTION(cfb_encrypt, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cfb_encrypt_iv, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(cfb_decrypt, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(cfb_decrypt_iv, CRYPTO_TFM_MODE_CFB, CRYPTO_DIR_DECRYPT, iv);
+
+DEF_TFM_FUNCTION(ofb_encrypt, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ofb_encrypt_iv, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(ofb_decrypt, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ofb_decrypt_iv, CRYPTO_TFM_MODE_OFB, CRYPTO_DIR_DECRYPT, iv);
+
+DEF_TFM_FUNCTION(ctr_encrypt, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_ENCRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ctr_encrypt_iv, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_ENCRYPT, iv);
+DEF_TFM_FUNCTION(ctr_decrypt, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_DECRYPT, tfm->crt_cipher.cit_iv);
+DEF_TFM_FUNCTION_IV(ctr_decrypt_iv, CRYPTO_TFM_MODE_CTR, CRYPTO_DIR_DECRYPT, iv);
 
 int crypto_init_cipher_flags(struct crypto_tfm *tfm, u32 flags)
 {
@@ -245,17 +309,24 @@ int crypto_init_cipher_ops(struct crypto
 		break;
 		
 	case CRYPTO_TFM_MODE_CFB:
-		ops->cit_encrypt = nocrypt;
-		ops->cit_decrypt = nocrypt;
-		ops->cit_encrypt_iv = nocrypt_iv;
-		ops->cit_decrypt_iv = nocrypt_iv;
+		ops->cit_encrypt = cfb_encrypt;
+		ops->cit_decrypt = cfb_decrypt;
+		ops->cit_encrypt_iv = cfb_encrypt_iv;
+		ops->cit_decrypt_iv = cfb_decrypt_iv;
+		break;
+
+	case CRYPTO_TFM_MODE_OFB:
+		ops->cit_encrypt = ofb_encrypt;
+		ops->cit_decrypt = ofb_decrypt;
+		ops->cit_encrypt_iv = ofb_encrypt_iv;
+		ops->cit_decrypt_iv = ofb_decrypt_iv;
 		break;
 	
 	case CRYPTO_TFM_MODE_CTR:
-		ops->cit_encrypt = nocrypt;
-		ops->cit_decrypt = nocrypt;
-		ops->cit_encrypt_iv = nocrypt_iv;
-		ops->cit_decrypt_iv = nocrypt_iv;
+		ops->cit_encrypt = ctr_encrypt;
+		ops->cit_decrypt = ctr_decrypt;
+		ops->cit_encrypt_iv = ctr_encrypt_iv;
+		ops->cit_decrypt_iv = ctr_decrypt_iv;
 		break;
 
 	default:


* [PATCH 2/2] CryptoAPI: Update PadLock to process multiple blocks at once
  2005-01-11 17:08       ` [PATCH 2/2] PadLock processing multiple blocks " Michal Ludvig
  2005-01-14  3:05         ` Andrew Morton
@ 2005-01-14 13:15         ` Michal Ludvig
From: Michal Ludvig @ 2005-01-14 13:15 UTC (permalink / raw)
  To: Andrew Morton; +Cc: David S. Miller, jmorris, cryptoapi, linux-kernel

Hi all,

Update to padlock-aes.c that enables processing of a whole buffer of 
data at once with the given chaining mode (e.g. CBC). It brings much 
higher speed than the case where the chaining is done in software by 
CryptoAPI.

This is an updated revision of the patch: the "=m"(*output) output 
constraint in the inline asm, which GCC 2.95.x could not handle, is 
gone, so it now compiles even with GCC 2.95.3.

Signed-off-by: Michal Ludvig <michal@logix.cz>

---

 padlock-aes.c |  176 ++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 166 insertions(+), 10 deletions(-)

Index: linux-2.6.10/drivers/crypto/padlock-aes.c
===================================================================
--- linux-2.6.10.orig/drivers/crypto/padlock-aes.c	2005-01-11 14:01:05.000000000 +0100
+++ linux-2.6.10/drivers/crypto/padlock-aes.c	2005-01-11 23:40:26.000000000 +0100
@@ -369,19 +369,54 @@ aes_set_key(void *ctx_arg, const uint8_t
 
 /* ====== Encryption/decryption routines ====== */
 
-/* This is the real call to PadLock. */
-static inline void
+/* These are the real calls to PadLock. */
+static inline void *
 padlock_xcrypt_ecb(uint8_t *input, uint8_t *output, uint8_t *key,
-		   void *control_word, uint32_t count)
+		   uint8_t *iv, void *control_word, uint32_t count)
 {
 	asm volatile ("pushfl; popfl");		/* enforce key reload. */
 	asm volatile (".byte 0xf3,0x0f,0xa7,0xc8"	/* rep xcryptecb */
 		      : "+S"(input), "+D"(output)
 		      : "d"(control_word), "b"(key), "c"(count));
+	return NULL;
+}
+
+static inline void *
+padlock_xcrypt_cbc(uint8_t *input, uint8_t *output, uint8_t *key,
+		   uint8_t *iv, void *control_word, uint32_t count)
+{
+	asm volatile ("pushfl; popfl");		/* enforce key reload. */
+	asm volatile (".byte 0xf3,0x0f,0xa7,0xd0"	/* rep xcryptcbc */
+		      : "+S"(input), "+D"(output), "+a"(iv)
+		      : "d"(control_word), "b"(key), "c"(count));
+	return iv;
+}
+
+static inline void *
+padlock_xcrypt_cfb(uint8_t *input, uint8_t *output, uint8_t *key,
+		   uint8_t *iv, void *control_word, uint32_t count)
+{
+	asm volatile ("pushfl; popfl");		/* enforce key reload. */
+	asm volatile (".byte 0xf3,0x0f,0xa7,0xe0"	/* rep xcryptcfb */
+		      : "+S"(input), "+D"(output), "+a"(iv)
+		      : "d"(control_word), "b"(key), "c"(count));
+	return iv;
+}
+
+static inline void *
+padlock_xcrypt_ofb(uint8_t *input, uint8_t *output, uint8_t *key,
+		   uint8_t *iv, void *control_word, uint32_t count)
+{
+	asm volatile ("pushfl; popfl");		/* enforce key reload. */
+	asm volatile (".byte 0xf3,0x0f,0xa7,0xe8"	/* rep xcryptofb */
+		      : "+S"(input), "+D"(output), "+a"(iv)
+		      : "d"(control_word), "b"(key), "c"(count));
+	return iv;
 }
 
 static void
-aes_padlock(void *ctx_arg, uint8_t *out_arg, const uint8_t *in_arg, int encdec)
+aes_padlock(void *ctx_arg, uint8_t *out_arg, const uint8_t *in_arg,
+	    uint8_t *iv_arg, size_t nbytes, int encdec, int mode)
 {
 	/* Don't blindly modify this structure - the items must 
 	   fit on 16-Bytes boundaries! */
@@ -419,21 +454,126 @@ aes_padlock(void *ctx_arg, uint8_t *out_
 	else
 		key = ctx->D;
 	
-	memcpy(data->buf, in_arg, AES_BLOCK_SIZE);
-	padlock_xcrypt_ecb(data->buf, data->buf, key, &data->cword, 1);
-	memcpy(out_arg, data->buf, AES_BLOCK_SIZE);
+	if (nbytes == AES_BLOCK_SIZE) {
+		/* Processing one block only => ECB is enough */
+		memcpy(data->buf, in_arg, AES_BLOCK_SIZE);
+		padlock_xcrypt_ecb(data->buf, data->buf, key, NULL,
+				   &data->cword, 1);
+		memcpy(out_arg, data->buf, AES_BLOCK_SIZE);
+	} else {
+		/* Processing multiple blocks at once */
+		uint8_t *in, *out, *iv;
+		int gfp = in_atomic() ? GFP_ATOMIC : GFP_KERNEL;
+		void *index = NULL;
+
+		if (unlikely(((long)in_arg) & 0x0F)) {
+			in = crypto_aligned_kmalloc(nbytes, gfp, 16, &index);
+			memcpy(in, in_arg, nbytes);
+		}
+		else
+			in = (uint8_t*)in_arg;
+
+		if (unlikely(((long)out_arg) & 0x0F)) {
+			if (index)
+				out = in;	/* xcrypt can work "in place" */
+			else
+				out = crypto_aligned_kmalloc(nbytes, gfp, 16,
+							     &index);
+		}
+		else
+			out = out_arg;
+
+		/* Always make a local copy of IV - xcrypt may change it! */
+		iv = data->buf;
+		if (iv_arg)
+			memcpy(iv, iv_arg, AES_BLOCK_SIZE);
+
+		switch (mode) {
+			case CRYPTO_TFM_MODE_ECB:
+				iv = padlock_xcrypt_ecb(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			case CRYPTO_TFM_MODE_CBC:
+				iv = padlock_xcrypt_cbc(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			case CRYPTO_TFM_MODE_CFB:
+				iv = padlock_xcrypt_cfb(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			case CRYPTO_TFM_MODE_OFB:
+				iv = padlock_xcrypt_ofb(in, out, key, iv,
+							&data->cword,
+							nbytes/AES_BLOCK_SIZE);
+				break;
+
+			default:
+				BUG();
+		}
+
+		/* Back up IV */
+		if (iv && iv_arg)
+			memcpy(iv_arg, iv, AES_BLOCK_SIZE);
+
+		/* Copy the 16-Byte aligned output to the caller's buffer. */
+		if (out != out_arg)
+			memcpy(out_arg, out, nbytes);
+
+		if (index)
+			kfree(index);
+	}
+}
+
+static void
+aes_padlock_ecb(void *ctx, uint8_t *dst, const uint8_t *src,
+		uint8_t *iv, size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, NULL, nbytes, encdec,
+		    CRYPTO_TFM_MODE_ECB);
+}
+
+static void
+aes_padlock_cbc(void *ctx, uint8_t *dst, const uint8_t *src, uint8_t *iv,
+		size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, iv, nbytes, encdec,
+		    CRYPTO_TFM_MODE_CBC);
+}
+
+static void
+aes_padlock_cfb(void *ctx, uint8_t *dst, const uint8_t *src, uint8_t *iv,
+		size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, iv, nbytes, encdec,
+		    CRYPTO_TFM_MODE_CFB);
+}
+
+static void
+aes_padlock_ofb(void *ctx, uint8_t *dst, const uint8_t *src, uint8_t *iv,
+		size_t nbytes, int encdec, int inplace)
+{
+	aes_padlock(ctx, dst, src, iv, nbytes, encdec,
+		    CRYPTO_TFM_MODE_OFB);
 }
 
 static void
 aes_encrypt(void *ctx_arg, uint8_t *out, const uint8_t *in)
 {
-	aes_padlock(ctx_arg, out, in, CRYPTO_DIR_ENCRYPT);
+	aes_padlock(ctx_arg, out, in, NULL, AES_BLOCK_SIZE,
+		    CRYPTO_DIR_ENCRYPT, CRYPTO_TFM_MODE_ECB);
 }
 
 static void
 aes_decrypt(void *ctx_arg, uint8_t *out, const uint8_t *in)
 {
-	aes_padlock(ctx_arg, out, in, CRYPTO_DIR_DECRYPT);
+	aes_padlock(ctx_arg, out, in, NULL, AES_BLOCK_SIZE,
+		    CRYPTO_DIR_DECRYPT, CRYPTO_TFM_MODE_ECB);
 }
 
 static struct crypto_alg aes_alg = {
@@ -454,9 +594,25 @@ static struct crypto_alg aes_alg = {
 	}
 };
 
+static int disable_multiblock = 0;
+MODULE_PARM(disable_multiblock, "i");
+MODULE_PARM_DESC(disable_multiblock,
+		 "Disable encryption of whole multiblock buffers.");
+
 int __init padlock_init_aes(void)
 {
-	printk(KERN_NOTICE PFX "Using VIA PadLock ACE for AES algorithm.\n");
+	if (!disable_multiblock) {
+		aes_alg.cra_u.cipher.cia_max_nbytes = (size_t)-1;
+		aes_alg.cra_u.cipher.cia_req_align  = 16;
+		aes_alg.cra_u.cipher.cia_ecb        = aes_padlock_ecb;
+		aes_alg.cra_u.cipher.cia_cbc        = aes_padlock_cbc;
+		aes_alg.cra_u.cipher.cia_cfb        = aes_padlock_cfb;
+		aes_alg.cra_u.cipher.cia_ofb        = aes_padlock_ofb;
+	}
+
+	printk(KERN_NOTICE PFX
+		"Using VIA PadLock ACE for AES algorithm%s.\n",
+		disable_multiblock ? "" : " (multiblock)");
 
 	gen_tabs();
 	return crypto_register_alg(&aes_alg);


* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-14 13:10         ` [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers " Michal Ludvig
@ 2005-01-14 14:20           ` Fruhwirth Clemens
  2005-01-14 16:40             ` Michal Ludvig
From: Fruhwirth Clemens @ 2005-01-14 14:20 UTC (permalink / raw)
  To: Michal Ludvig
  Cc: Andrew Morton, James Morris, cryptoapi, David S. Miller, linux-kernel

On Fri, 2005-01-14 at 14:10 +0100, Michal Ludvig wrote:

> This patch extends crypto/cipher.c for offloading whole chaining modes
> to e.g. hardware crypto accelerators. It is much faster to let the 
> hardware do all the chaining if it can do so.

Is there any connection to Evgeniy Polyakov's acrypto work? It appears
that there are two projects with one objective. It would be nice to see
both parties pulling in the same direction.

> +	void (*cia_ecb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> +			size_t nbytes, int encdec, int inplace);
> +	void (*cia_cbc)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> +			size_t nbytes, int encdec, int inplace);
> +	void (*cia_cfb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> +			size_t nbytes, int encdec, int inplace);
> +	void (*cia_ofb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> +			size_t nbytes, int encdec, int inplace);
> +	void (*cia_ctr)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> +			size_t nbytes, int encdec, int inplace);

What's the use of adding mode-specific functions to the tfm struct? And
why do they all have the same function type? For instance, the "iv" or
"inplace" argument is meaningless for ECB.

Have a look at
http://clemens.endorphin.org/patches/lrw/2-tweakable-cipher-interface.diff

This patch takes the following approach to handle the 
cipher mode/interface issue:

Every mode is associated with one or more interfaces: cit_encrypt,
cit_encrypt_iv, or cit_encrypt_tweaks. How these interfaces are
associated with cipher modes is handled in crypto_init_cipher_flags.

Except for CBC, every mode associates with just one interface. In CBC,
the CryptoAPI caller can use the IV interface to supply an IV, or use
the current tfm's IV by using cit_encrypt instead of cit_encrypt_iv.

I don't see a gain in throwing dozens of pointers into the tfm, as a
tfm is always assigned a single mode.
 
>  /*
>   * Generic encrypt/decrypt wrapper for ciphers, handles operations across
> @@ -47,22 +101,101 @@ static inline void xor_128(u8 *a, const 
>  static int crypt(struct crypto_tfm *tfm,
>  		 struct scatterlist *dst,
>  		 struct scatterlist *src,
> -                 unsigned int nbytes, cryptfn_t crfn,
> -                 procfn_t prfn, int enc, void *info)

Your patch heavily interferes with my cleanup patch for crypt(..). To
put it briefly, I consider crypt(..) a mess. The function definition of
crypt() and the procfn_t functions are just a patchwork of stuff, added
as needed.

I've rewritten a generic scatterwalker, a generic replacement for
crypt(), that can apply any processing function with an arbitrary
argument list to the data associated with a set of scatterlists. I
think this function shouldn't be in crypto/ but in some more generic
location, as I think it could be useful for many more things.

http://clemens.endorphin.org/patches/lrw/1-generic-scatterwalker.diff
is the generic scatterwalk patch. 

int scatterwalk_walker_generic(void (*function)(void *priv, int length,
void **buflist), void *priv, int steps, int nsl, ...)

"function" is applied to the scatterlist data. 
"priv" is a private data structure for bookkeeping. It's supplied to the
function as its first parameter.
"steps" is the number of times function is called.
"nsl" is the number of scatterlists following.

After "nsl", the scatterlists follow in a tuple of data:
<struct scatterlist *, int steplength, int ioflag>

ECB, for example (with the private struct spelled out):

struct ecb_process_priv {
	struct crypto_tfm *tfm;
	void (*crfn)(void *ctx, u8 *dst, const u8 *src);
};
	...
struct ecb_process_priv priv = {
	.tfm = tfm,
	.crfn = tfm->__crt_alg->cra_cipher.cia_decrypt,
};
int bsize = crypto_tfm_alg_blocksize(tfm);
scatterwalk_walker_generic(ecb_process_gw,	// processing function
	&priv,		// private data
	nbytes/bsize,	// number of steps
	2,		// number of scatterlists
	dst, bsize, 1,	// first, ioflag set to output
	src, bsize, 0);	// second, ioflag set to input

	...
static void ecb_process_gw(void *_priv, int nsg, void **buf)
{
	struct ecb_process_priv *priv = _priv;
	char *dst = buf[0];	// pointer to correctly kmapped and copied dst
	char *src = buf[1];	// pointer to correctly kmapped and copied src
	priv->crfn(crypto_tfm_ctx(priv->tfm), dst, src);
}
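
A chained mode fits the same walker by keeping its IV in the private
struct. A hedged sketch for CBC encryption (the type and field names
are mine, and xor_128 assumes a 128-bit block cipher):

struct cbc_process_priv {
	struct crypto_tfm *tfm;
	void (*crfn)(void *ctx, u8 *dst, const u8 *src);
	u8 *iv;			/* carried along, updated per block */
};

static void cbc_encrypt_gw(void *_priv, int nsg, void **buf)
{
	struct cbc_process_priv *priv = _priv;
	u8 *dst = buf[0];
	u8 *src = buf[1];

	xor_128(priv->iv, src);		/* iv ^= plaintext */
	priv->crfn(crypto_tfm_ctx(priv->tfm), dst, priv->iv);
	memcpy(priv->iv, dst,		/* ciphertext becomes next IV */
	       crypto_tfm_alg_blocksize(priv->tfm));
}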

Well, I recognize that I'm somewhat off-topic now. But it demonstrates
clearly why we should get rid of crypt(..) and replace it with
something more generic.

-- 
Fruhwirth Clemens <clemens@endorphin.org>  http://clemens.endorphin.org



* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-14 14:20           ` Fruhwirth Clemens
@ 2005-01-14 16:40             ` Michal Ludvig
  2005-01-15 12:45               ` Fruhwirth Clemens
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Ludvig @ 2005-01-14 16:40 UTC (permalink / raw)
  To: Fruhwirth Clemens
  Cc: Andrew Morton, James Morris, cryptoapi, David S. Miller, linux-kernel

On Fri, 14 Jan 2005, Fruhwirth Clemens wrote:

> On Fri, 2005-01-14 at 14:10 +0100, Michal Ludvig wrote:
> 
> > This patch extends crypto/cipher.c to offload whole chaining modes
> > to e.g. hardware crypto accelerators. It is much faster to let the
> > hardware do all the chaining if it can do so.
> 
> Is there any connection to Evgeniy Polyakov's acrypto work? It appears
> that there are two projects with one objective. It would be nice to see
> both parties pulling in the same direction.

These projects do not compete at all. Evgeniy's work is a complete
replacement for the current CryptoAPI and brings asynchronous operation
in the first place. My patches are simple and straightforward
extensions to the current CryptoAPI that enable offloading the chaining
to hardware where possible.

> > +	void (*cia_ecb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > +			size_t nbytes, int encdec, int inplace);
> > +	void (*cia_cbc)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > +			size_t nbytes, int encdec, int inplace);
> > +	void (*cia_cfb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > +			size_t nbytes, int encdec, int inplace);
> > +	void (*cia_ofb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > +			size_t nbytes, int encdec, int inplace);
> > +	void (*cia_ctr)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > +			size_t nbytes, int encdec, int inplace);
> 
> What's the use of adding mode-specific functions to the tfm struct? And
> why do they all have the same function type? For instance, the "iv" or
> "inplace" argument is meaningless for ECB.

The prototypes must be the same in my implementation, because crypt()
takes only a pointer to the appropriate mode function and then calls it
as "(*func)(arg, arg, ...)".

BTW these functions are not added to "struct crypto_tfm" but to "struct
crypto_alg", which describes what a particular module supports (along
with the block size, algorithm name, etc.). In this case it can say that
e.g. padlock.ko supports encryption in CBC mode in addition to common
single-block processing.
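
For illustration, a minimal sketch of how padlock.ko could fill in the
extended struct cipher_alg; the function names and the concrete
cia_max_nbytes/cia_req_align values are placeholders, not the actual
driver code:

static struct crypto_alg padlock_aes_alg = {
	.cra_name	= "aes",
	.cra_flags	= CRYPTO_ALG_TYPE_CIPHER,
	.cra_blocksize	= 16,			/* AES block size */
	.cra_u		= { .cipher = {
		.cia_min_keysize = 16,
		.cia_max_keysize = 32,
		.cia_setkey	 = aes_set_key,
		.cia_encrypt	 = aes_encrypt,	/* single-block fallback */
		.cia_decrypt	 = aes_decrypt,
		.cia_max_nbytes	 = PAGE_SIZE,	/* largest chunk per call */
		.cia_req_align	 = 16,		/* PadLock buffer alignment */
		.cia_cbc	 = aes_cbc_crypt, /* whole-buffer CBC mode */
	} }
};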

BTW I'll look at the tweakable-API pointers over the weekend...

Michal Ludvig
-- 
* A mouse is a device used to point at the xterm you want to type in.
* Personal homepage - http://www.logix.cz/michal


* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-14 16:40             ` Michal Ludvig
@ 2005-01-15 12:45               ` Fruhwirth Clemens
  2005-01-18 16:49                 ` James Morris
  0 siblings, 1 reply; 13+ messages in thread
From: Fruhwirth Clemens @ 2005-01-15 12:45 UTC (permalink / raw)
  To: Michal Ludvig
  Cc: Andrew Morton, James Morris, cryptoapi, David S. Miller, linux-kernel


On Fri, 2005-01-14 at 17:40 +0100, Michal Ludvig wrote: 
> > Is there any connection to Evgeniy Polyakov's acrypto work? It appears
> > that there are two projects with one objective. It would be nice to see
> > both parties pulling in the same direction.
> 
> These projects do not compete at all. Evgeniy's work is a complete
> replacement for the current CryptoAPI and brings asynchronous operation
> in the first place. My patches are simple and straightforward
> extensions to the current CryptoAPI that enable offloading the chaining
> to hardware where possible.

Fine, I just saw in Evgeniy's reply that he took your padlock
implementation. I thought the two of you had been working on different
implementations.

But actually both aim for the same goal: hardware crypto offloading.
With PadLock the need for an async interface isn't that big, because
it's not really "off-loading": the work is done on the same chip and in
the same thread.

However, developing two different APIs isn't particularly efficient. I
know that at the moment there isn't much choice, as J. Morris hasn't
committed to acrypto in any way. But I think it would be good to replace
the synchronous CryptoAPI implementation altogether, put the missing
internals of CryptoAPI into acrypto, and back the interfaces of
CryptoAPI with small stubs that do something like:

somereturnvalue synchronized_interface(...) {
	acrypto_kick_some_operation(acrypto_struct);
	wait_for_completion(acrypto_struct);
	return fetch_result(acrypto_struct);
}
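
Concretely, with the kernel's completion API that could look like the
following (all acrypto_* names are hypothetical, following the sketch
above):

#include <linux/completion.h>

struct acrypto_request {
	struct completion done;
	int result;
	/* ... mode, buffers, keys ... */
};

static void acrypto_callback(struct acrypto_request *req)
{
	complete(&req->done);		/* called from the async path */
}

static int synchronized_encrypt(struct acrypto_request *req)
{
	init_completion(&req->done);
	acrypto_kick_some_operation(req);	/* queue async operation */
	wait_for_completion(&req->done);	/* block until callback */
	return req->result;
}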

The other way round, an asynchronous interface built on top of a
synchronous one, doesn't seem natural to me.
(That doesn't mean I oppose your patches, merely that we should start to
think in different directions.)

> > > +	void (*cia_ecb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > > +			size_t nbytes, int encdec, int inplace);
> > > +	void (*cia_cbc)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > > +			size_t nbytes, int encdec, int inplace);
> > > +	void (*cia_cfb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > > +			size_t nbytes, int encdec, int inplace);
> > > +	void (*cia_ofb)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > > +			size_t nbytes, int encdec, int inplace);
> > > +	void (*cia_ctr)(void *ctx, u8 *dst, const u8 *src, u8 *iv,
> > > +			size_t nbytes, int encdec, int inplace);
> > 
> > What's the use of adding mode-specific functions to the tfm struct? And
> > why do they all have the same function type? For instance, the "iv" or
> > "inplace" argument is meaningless for ECB.
> 
> The prototypes must be the same in my implementation, because crypt()
> takes only a pointer to the appropriate mode function and then calls it
> as "(*func)(arg, arg, ...)".
> 
> BTW these functions are not added to "struct crypto_tfm" but to "struct
> crypto_alg", which describes what a particular module supports (along
> with the block size, algorithm name, etc.). In this case it can say that
> e.g. padlock.ko supports encryption in CBC mode in addition to common
> single-block processing.

Err, right. I overlooked that it's cia and not cit. However, I don't
like the idea of extending the struct whenever there is a new cipher
mode. I think the API should not have to be extended for every addition,
but should be designed for such extension right from the start.

What about a "selector" function, which returns the appropriate
encryption function for a mode?

typedef void (procfn_t)(struct crypto_tfm *, u8 *,
                        u8 *, cryptfn_t, int enc, void *, int);

put
	procfn_t *(*cia_modesel)(u32 mode, int encdec, int iface);
into struct crypto_alg;

then in crypto_init_cipher_ops, instead of

	switch (tfm->crt_cipher.cit_mode) {
	...
	case CRYPTO_TFM_MODE_CFB:
		ops->cit_encrypt = cfb_encrypt;
		ops->cit_decrypt = cfb_decrypt;
	...
	}

we do

	struct cipher_alg *cia = &tfm->__crt_alg->cra_cipher;
	u32 mode = tfm->crt_cipher.cit_mode;

	switch (mode) {
	...
	case CRYPTO_TFM_MODE_CFB:
		ops->cit_encrypt    = cia->cia_modesel(mode, 0, IFACE_ECB);
		ops->cit_decrypt    = cia->cia_modesel(mode, 1, IFACE_ECB);
		ops->cit_encrypt_iv = cia->cia_modesel(mode, 0, IFACE_IV);
		ops->cit_decrypt_iv = cia->cia_modesel(mode, 1, IFACE_IV);
	...

Alternatively, we could also add a lookup table. But I like this better,
since it is much easier for people to read, and tfms aren't allocated
that often.

We could probably add a wrapper for cia_modesel that falls back to the
old behaviour when cia_modesel is NULL. This way we don't have to patch
all algorithm implementations to include cia_modesel.
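
A minimal sketch of such a wrapper, assuming the cia_modesel field from
above; generic_modesel() is a hypothetical stand-in for the old
switch-based selection:

static procfn_t *modesel(struct crypto_tfm *tfm, u32 mode, int encdec,
			 int iface)
{
	struct cipher_alg *cia = &tfm->__crt_alg->cra_cipher;

	if (cia->cia_modesel)
		return cia->cia_modesel(mode, encdec, iface);
	return generic_modesel(mode, encdec, iface);	/* old behaviour */
}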

How do you like that idea?

-- 
Fruhwirth Clemens <clemens@endorphin.org>  http://clemens.endorphin.org



* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-15 12:45               ` Fruhwirth Clemens
@ 2005-01-18 16:49                 ` James Morris
  2005-01-20  3:30                   ` David McCullough
  0 siblings, 1 reply; 13+ messages in thread
From: James Morris @ 2005-01-18 16:49 UTC (permalink / raw)
  To: Fruhwirth Clemens
  Cc: Michal Ludvig, Andrew Morton, cryptoapi, David S. Miller, linux-kernel

On Sat, 15 Jan 2005, Fruhwirth Clemens wrote:

> However, developing two different APIs isn't particularly efficient. I
> know that at the moment there isn't much choice, as J. Morris hasn't
> committed to acrypto in any way.

There is also the OCF port (OpenBSD crypto framework) to consider, if
permission to dual-license from the original authors can be obtained.


- James
-- 
James Morris
<jmorris@redhat.com>




* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-18 16:49                 ` James Morris
@ 2005-01-20  3:30                   ` David McCullough
  2005-01-20 13:47                     ` James Morris
  0 siblings, 1 reply; 13+ messages in thread
From: David McCullough @ 2005-01-20  3:30 UTC (permalink / raw)
  To: James Morris
  Cc: Fruhwirth Clemens, Andrew Morton, linux-kernel, cryptoapi,
	Michal Ludvig, David S. Miller


Jivin James Morris lays it down ...
> On Sat, 15 Jan 2005, Fruhwirth Clemens wrote:
> 
> > However, developing two different APIs isn't particularly efficient. I
> > know that at the moment there isn't much choice, as J. Morris hasn't
> > committed to acrypto in any way.
> 
> There is also the OCF port (OpenBSD crypto framework) to consider, if
> permission to dual-license from the original authors can be obtained.

For anyone looking for the OCF port for Linux, you can find the latest
release here:

	http://lists.logix.cz/pipermail/cryptoapi/2004/000261.html

One of the drivers uses the existing kernel crypto API to implement
a software crypto engine for OCF.

As for permission to use a dual license, I will gladly approach the
authors if others feel it is important to find out whether it is
possible at this point.

Cheers,
Davidm

-- 
David McCullough, davidm@snapgear.com  Ph:+61 7 34352815 http://www.SnapGear.com
Custom Embedded Solutions + Security   Fx:+61 7 38913630 http://www.uCdot.org


* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-20  3:30                   ` David McCullough
@ 2005-01-20 13:47                     ` James Morris
  2005-03-03 10:50                       ` David McCullough
  0 siblings, 1 reply; 13+ messages in thread
From: James Morris @ 2005-01-20 13:47 UTC (permalink / raw)
  To: David McCullough
  Cc: Fruhwirth Clemens, Andrew Morton, linux-kernel, cryptoapi,
	Michal Ludvig, David S. Miller

On Thu, 20 Jan 2005, David McCullough wrote:

> As for permission to use a dual license, I will gladly approach the
> authors if others feel it is important to find out whether it is
> possible at this point.

Please do so.  It would be useful to have the option of using an already
developed, debugged and analyzed framework.


- James
-- 
James Morris
<jmorris@redhat.com>




* Re: [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers at a time
  2005-01-20 13:47                     ` James Morris
@ 2005-03-03 10:50                       ` David McCullough
  0 siblings, 0 replies; 13+ messages in thread
From: David McCullough @ 2005-03-03 10:50 UTC (permalink / raw)
  To: James Morris
  Cc: Fruhwirth Clemens, Andrew Morton, linux-kernel, cryptoapi,
	Michal Ludvig, David S. Miller


Jivin James Morris lays it down ...
> On Thu, 20 Jan 2005, David McCullough wrote:
> 
> > As for permission to use a dual license, I will gladly approach the
> > authors if others feel it is important to find out whether it is
> > possible at this point.
> 
> Please do so.  It would be useful to have the option of using an already
> developed, debugged and analyzed framework.

Ok, I finally managed to get responses from all the individual
contributors, though none of the corporations contacted have responded.

While a good number of those contacted were happy to dual-license, most
are concerned that changes made under the GPL will not be available for
use in BSD. A couple were a definite no.

I have had offers to rewrite any portions that cannot be dual-licensed,
but I think that is overkill for now unless there is significant
interest in taking that path.

Fortunately we have been able to obtain some funding to complete a large
amount of work on the project, so it should make some nice progress in
the next couple of weeks as that ramps up :-)

Cheers,
Davidm

-- 
David McCullough, davidm@snapgear.com  Ph:+61 7 34352815 http://www.SnapGear.com
Custom Embedded Solutions + Security   Fx:+61 7 38913630 http://www.uCdot.org


Thread overview: 13+ messages
     [not found] <Xine.LNX.4.44.0411301009560.11945-100000@thoron.boston.redhat.com>
     [not found] ` <Pine.LNX.4.61.0411301722270.4409@maxipes.logix.cz>
     [not found]   ` <20041130222442.7b0f4f67.davem@davemloft.net>
2005-01-11 17:03     ` PadLock processing multiple blocks at a time Michal Ludvig
2005-01-11 17:08       ` [PATCH 1/2] " Michal Ludvig
2005-01-14 13:10         ` [PATCH 1/2] CryptoAPI: prepare for processing multiple buffers " Michal Ludvig
2005-01-14 14:20           ` Fruhwirth Clemens
2005-01-14 16:40             ` Michal Ludvig
2005-01-15 12:45               ` Fruhwirth Clemens
2005-01-18 16:49                 ` James Morris
2005-01-20  3:30                   ` David McCullough
2005-01-20 13:47                     ` James Morris
2005-03-03 10:50                       ` David McCullough
2005-01-11 17:08       ` [PATCH 2/2] PadLock processing multiple blocks " Michal Ludvig
2005-01-14  3:05         ` Andrew Morton
2005-01-14 13:15         ` [PATCH 2/2] CryptoAPI: Update PadLock to process multiple blocks at once Michal Ludvig
