All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] mac80211: aes_ccm: cache AEAD request allocations per CPU
@ 2016-10-18 14:08 ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:08 UTC (permalink / raw)
  To: linux-wireless, johannes, netdev; +Cc: herbert, j, luto, Ard Biesheuvel

This RFC implements per CPU caching of AEAD request structures, which allows
us to get rid of the per-packet kzalloc/kzfree calls we were forced to
introduce to deal with SG API violations, both in the mac80211 and in the
core crypto API code.

Since mac80211 only executes the AEAD transforms in softirq context, only one
AEAD request can be in flight at the same time on any given CPU, and so, instead
of free the request, we can stash its address in a per CPU variable, and reuse
it for the next packet.

This RFC only addressess CCMP, but GCM and GMAC could be fixed in the same way
(and CMAC did not suffer from the API violation issue in the first place)

Ard Biesheuvel (2):
  mac80211: aes_ccm: prepare key struct for storing context data
  mac80211: aes_ccm: cache AEAD request structures per CPU

 net/mac80211/aes_ccm.c | 80 +++++++++++++-------
 net/mac80211/aes_ccm.h | 16 ++--
 net/mac80211/key.c     | 16 ++--
 net/mac80211/key.h     |  3 +-
 net/mac80211/wpa.c     |  4 +-
 5 files changed, 71 insertions(+), 48 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 0/2] mac80211: aes_ccm: cache AEAD request allocations per CPU
@ 2016-10-18 14:08 ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:08 UTC (permalink / raw)
  To: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	johannes-cdvu00un1VgdHxzADdlk8Q, netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q, j,
	luto-kltTT9wpgjJwATOyAt5JVQ, Ard Biesheuvel

This RFC implements per CPU caching of AEAD request structures, which allows
us to get rid of the per-packet kzalloc/kzfree calls we were forced to
introduce to deal with SG API violations, both in the mac80211 and in the
core crypto API code.

Since mac80211 only executes the AEAD transforms in softirq context, only one
AEAD request can be in flight at the same time on any given CPU, and so, instead
of free the request, we can stash its address in a per CPU variable, and reuse
it for the next packet.

This RFC only addressess CCMP, but GCM and GMAC could be fixed in the same way
(and CMAC did not suffer from the API violation issue in the first place)

Ard Biesheuvel (2):
  mac80211: aes_ccm: prepare key struct for storing context data
  mac80211: aes_ccm: cache AEAD request structures per CPU

 net/mac80211/aes_ccm.c | 80 +++++++++++++-------
 net/mac80211/aes_ccm.h | 16 ++--
 net/mac80211/key.c     | 16 ++--
 net/mac80211/key.h     |  3 +-
 net/mac80211/wpa.c     |  4 +-
 5 files changed, 71 insertions(+), 48 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [RFC PATCH 1/2] mac80211: aes_ccm: prepare key struct for storing context data
  2016-10-18 14:08 ` Ard Biesheuvel
  (?)
@ 2016-10-18 14:08 ` Ard Biesheuvel
  -1 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:08 UTC (permalink / raw)
  To: linux-wireless, johannes, netdev; +Cc: herbert, j, luto, Ard Biesheuvel

As a prepatory change to allow per CPU caching of request structures,
refactor the key allocation routine so we can access per key data
beyond the core AEAD transform easily.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 net/mac80211/aes_ccm.c | 35 +++++++++++---------
 net/mac80211/aes_ccm.h | 16 ++++-----
 net/mac80211/key.c     | 16 ++++-----
 net/mac80211/key.h     |  2 +-
 net/mac80211/wpa.c     |  4 +--
 5 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/net/mac80211/aes_ccm.c b/net/mac80211/aes_ccm.c
index a4e0d59a40dd..58e0338a2c34 100644
--- a/net/mac80211/aes_ccm.c
+++ b/net/mac80211/aes_ccm.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/kernel.h>
+#include <linux/percpu.h>
 #include <linux/types.h>
 #include <linux/err.h>
 #include <crypto/aead.h>
@@ -18,13 +19,13 @@
 #include "key.h"
 #include "aes_ccm.h"
 
-int ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
-			      u8 *data, size_t data_len, u8 *mic,
+int ieee80211_aes_ccm_encrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
+			      u8 *aad, u8 *data, size_t data_len, u8 *mic,
 			      size_t mic_len)
 {
 	struct scatterlist sg[3];
 	struct aead_request *aead_req;
-	int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm);
+	int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(ccmp->tfm);
 	u8 *__aad;
 
 	aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
@@ -39,7 +40,7 @@ int ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
 	sg_set_buf(&sg[1], data, data_len);
 	sg_set_buf(&sg[2], mic, mic_len);
 
-	aead_request_set_tfm(aead_req, tfm);
+	aead_request_set_tfm(aead_req, ccmp->tfm);
 	aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
 	aead_request_set_ad(aead_req, sg[0].length);
 
@@ -49,13 +50,13 @@ int ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
 	return 0;
 }
 
-int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
-			      u8 *data, size_t data_len, u8 *mic,
+int ieee80211_aes_ccm_decrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
+			      u8 *aad, u8 *data, size_t data_len, u8 *mic,
 			      size_t mic_len)
 {
 	struct scatterlist sg[3];
 	struct aead_request *aead_req;
-	int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(tfm);
+	int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(ccmp->tfm);
 	u8 *__aad;
 	int err;
 
@@ -74,7 +75,7 @@ int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
 	sg_set_buf(&sg[1], data, data_len);
 	sg_set_buf(&sg[2], mic, mic_len);
 
-	aead_request_set_tfm(aead_req, tfm);
+	aead_request_set_tfm(aead_req, ccmp->tfm);
 	aead_request_set_crypt(aead_req, sg, sg, data_len + mic_len, b_0);
 	aead_request_set_ad(aead_req, sg[0].length);
 
@@ -84,16 +85,17 @@ int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
 	return err;
 }
 
-struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[],
-						    size_t key_len,
-						    size_t mic_len)
+int ieee80211_aes_key_setup_encrypt(struct ieee80211_ccmp_aead *ccmp,
+				    const u8 key[],
+				    size_t key_len,
+				    size_t mic_len)
 {
 	struct crypto_aead *tfm;
 	int err;
 
 	tfm = crypto_alloc_aead("ccm(aes)", 0, CRYPTO_ALG_ASYNC);
 	if (IS_ERR(tfm))
-		return tfm;
+		return PTR_ERR(tfm);
 
 	err = crypto_aead_setkey(tfm, key, key_len);
 	if (err)
@@ -102,14 +104,15 @@ struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[],
 	if (err)
 		goto free_aead;
 
-	return tfm;
+	ccmp->tfm = tfm;
+	return 0;
 
 free_aead:
 	crypto_free_aead(tfm);
-	return ERR_PTR(err);
+	return err;
 }
 
-void ieee80211_aes_key_free(struct crypto_aead *tfm)
+void ieee80211_aes_key_free(struct ieee80211_ccmp_aead *ccmp)
 {
-	crypto_free_aead(tfm);
+	crypto_free_aead(ccmp->tfm);
 }
diff --git a/net/mac80211/aes_ccm.h b/net/mac80211/aes_ccm.h
index fcd3254c5cf0..82e91c6ec41f 100644
--- a/net/mac80211/aes_ccm.h
+++ b/net/mac80211/aes_ccm.h
@@ -14,15 +14,15 @@
 
 #define CCM_AAD_LEN	32
 
-struct crypto_aead *ieee80211_aes_key_setup_encrypt(const u8 key[],
-						    size_t key_len,
-						    size_t mic_len);
-int ieee80211_aes_ccm_encrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
-			      u8 *data, size_t data_len, u8 *mic,
+int ieee80211_aes_key_setup_encrypt(struct ieee80211_ccmp_aead *ccmp,
+				    const u8 key[], size_t key_len,
+				    size_t mic_len);
+int ieee80211_aes_ccm_encrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
+			      u8 *aad, u8 *data, size_t data_len, u8 *mic,
 			      size_t mic_len);
-int ieee80211_aes_ccm_decrypt(struct crypto_aead *tfm, u8 *b_0, u8 *aad,
-			      u8 *data, size_t data_len, u8 *mic,
+int ieee80211_aes_ccm_decrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
+			      u8 *aad, u8 *data, size_t data_len, u8 *mic,
 			      size_t mic_len);
-void ieee80211_aes_key_free(struct crypto_aead *tfm);
+void ieee80211_aes_key_free(struct ieee80211_ccmp_aead *ccmp);
 
 #endif /* AES_CCM_H */
diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index edd6f2945f69..6d1a066e3c4e 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -431,10 +431,9 @@ ieee80211_key_alloc(u32 cipher, int idx, size_t key_len,
 		 * Initialize AES key state here as an optimization so that
 		 * it does not need to be initialized for every packet.
 		 */
-		key->u.ccmp.tfm = ieee80211_aes_key_setup_encrypt(
-			key_data, key_len, IEEE80211_CCMP_MIC_LEN);
-		if (IS_ERR(key->u.ccmp.tfm)) {
-			err = PTR_ERR(key->u.ccmp.tfm);
+		err = ieee80211_aes_key_setup_encrypt(&key->u.ccmp, key_data,
+					key_len, IEEE80211_CCMP_MIC_LEN);
+		if (err) {
 			kfree(key);
 			return ERR_PTR(err);
 		}
@@ -449,10 +448,9 @@ ieee80211_key_alloc(u32 cipher, int idx, size_t key_len,
 		/* Initialize AES key state here as an optimization so that
 		 * it does not need to be initialized for every packet.
 		 */
-		key->u.ccmp.tfm = ieee80211_aes_key_setup_encrypt(
-			key_data, key_len, IEEE80211_CCMP_256_MIC_LEN);
-		if (IS_ERR(key->u.ccmp.tfm)) {
-			err = PTR_ERR(key->u.ccmp.tfm);
+		err = ieee80211_aes_key_setup_encrypt(&key->u.ccmp, key_data,
+					key_len, IEEE80211_CCMP_256_MIC_LEN);
+		if (err) {
 			kfree(key);
 			return ERR_PTR(err);
 		}
@@ -545,7 +543,7 @@ static void ieee80211_key_free_common(struct ieee80211_key *key)
 	switch (key->conf.cipher) {
 	case WLAN_CIPHER_SUITE_CCMP:
 	case WLAN_CIPHER_SUITE_CCMP_256:
-		ieee80211_aes_key_free(key->u.ccmp.tfm);
+		ieee80211_aes_key_free(&key->u.ccmp);
 		break;
 	case WLAN_CIPHER_SUITE_AES_CMAC:
 	case WLAN_CIPHER_SUITE_BIP_CMAC_256:
diff --git a/net/mac80211/key.h b/net/mac80211/key.h
index 4aa20cef0859..1ec7a737ab79 100644
--- a/net/mac80211/key.h
+++ b/net/mac80211/key.h
@@ -80,7 +80,7 @@ struct ieee80211_key {
 			/* number of mic failures */
 			u32 mic_failures;
 		} tkip;
-		struct {
+		struct ieee80211_ccmp_aead {
 			/*
 			 * Last received packet number. The first
 			 * IEEE80211_NUM_TIDS counters are used with Data
diff --git a/net/mac80211/wpa.c b/net/mac80211/wpa.c
index 14b28998c571..694af013494f 100644
--- a/net/mac80211/wpa.c
+++ b/net/mac80211/wpa.c
@@ -461,7 +461,7 @@ static int ccmp_encrypt_skb(struct ieee80211_tx_data *tx, struct sk_buff *skb,
 
 	pos += IEEE80211_CCMP_HDR_LEN;
 	ccmp_special_blocks(skb, pn, b_0, aad);
-	return ieee80211_aes_ccm_encrypt(key->u.ccmp.tfm, b_0, aad, pos, len,
+	return ieee80211_aes_ccm_encrypt(&key->u.ccmp, b_0, aad, pos, len,
 					 skb_put(skb, mic_len), mic_len);
 }
 
@@ -538,7 +538,7 @@ ieee80211_crypto_ccmp_decrypt(struct ieee80211_rx_data *rx,
 			ccmp_special_blocks(skb, pn, b_0, aad);
 
 			if (ieee80211_aes_ccm_decrypt(
-				    key->u.ccmp.tfm, b_0, aad,
+				    &key->u.ccmp, b_0, aad,
 				    skb->data + hdrlen + IEEE80211_CCMP_HDR_LEN,
 				    data_len,
 				    skb->data + skb->len - mic_len, mic_len))
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
  2016-10-18 14:08 ` Ard Biesheuvel
  (?)
  (?)
@ 2016-10-18 14:08 ` Ard Biesheuvel
  2016-10-18 14:16     ` Johannes Berg
  -1 siblings, 1 reply; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:08 UTC (permalink / raw)
  To: linux-wireless, johannes, netdev; +Cc: herbert, j, luto, Ard Biesheuvel

Now that we can no longer invoke AEAD transforms with the aead_request
structure allocated on the stack, we perform a kmalloc/kfree for every
packet, which is expensive.

Since the CCMP routines execute in softirq context, we know there can
never be more than one request in flight on each CPU, and so we can
simply keep a cached aead_request structure per CPU, and deallocate all
of them when deallocating the AEAD transform.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 net/mac80211/aes_ccm.c | 49 ++++++++++++++------
 net/mac80211/key.h     |  1 +
 2 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/net/mac80211/aes_ccm.c b/net/mac80211/aes_ccm.c
index 58e0338a2c34..2cb5ee4868ea 100644
--- a/net/mac80211/aes_ccm.c
+++ b/net/mac80211/aes_ccm.c
@@ -28,9 +28,14 @@ int ieee80211_aes_ccm_encrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
 	int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(ccmp->tfm);
 	u8 *__aad;
 
-	aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
-	if (!aead_req)
-		return -ENOMEM;
+	aead_req = *this_cpu_ptr(ccmp->reqs);
+	if (!aead_req) {
+		aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
+		if (!aead_req)
+			return -ENOMEM;
+		*this_cpu_ptr(ccmp->reqs) = aead_req;
+		aead_request_set_tfm(aead_req, ccmp->tfm);
+	}
 
 	__aad = (u8 *)aead_req + reqsize;
 	memcpy(__aad, aad, CCM_AAD_LEN);
@@ -40,12 +45,10 @@ int ieee80211_aes_ccm_encrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
 	sg_set_buf(&sg[1], data, data_len);
 	sg_set_buf(&sg[2], mic, mic_len);
 
-	aead_request_set_tfm(aead_req, ccmp->tfm);
 	aead_request_set_crypt(aead_req, sg, sg, data_len, b_0);
 	aead_request_set_ad(aead_req, sg[0].length);
 
 	crypto_aead_encrypt(aead_req);
-	kzfree(aead_req);
 
 	return 0;
 }
@@ -58,14 +61,18 @@ int ieee80211_aes_ccm_decrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
 	struct aead_request *aead_req;
 	int reqsize = sizeof(*aead_req) + crypto_aead_reqsize(ccmp->tfm);
 	u8 *__aad;
-	int err;
 
 	if (data_len == 0)
 		return -EINVAL;
 
-	aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
-	if (!aead_req)
-		return -ENOMEM;
+	aead_req = *this_cpu_ptr(ccmp->reqs);
+	if (!aead_req) {
+		aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
+		if (!aead_req)
+			return -ENOMEM;
+		*this_cpu_ptr(ccmp->reqs) = aead_req;
+		aead_request_set_tfm(aead_req, ccmp->tfm);
+	}
 
 	__aad = (u8 *)aead_req + reqsize;
 	memcpy(__aad, aad, CCM_AAD_LEN);
@@ -75,14 +82,10 @@ int ieee80211_aes_ccm_decrypt(struct ieee80211_ccmp_aead *ccmp, u8 *b_0,
 	sg_set_buf(&sg[1], data, data_len);
 	sg_set_buf(&sg[2], mic, mic_len);
 
-	aead_request_set_tfm(aead_req, ccmp->tfm);
 	aead_request_set_crypt(aead_req, sg, sg, data_len + mic_len, b_0);
 	aead_request_set_ad(aead_req, sg[0].length);
 
-	err = crypto_aead_decrypt(aead_req);
-	kzfree(aead_req);
-
-	return err;
+	return crypto_aead_decrypt(aead_req);
 }
 
 int ieee80211_aes_key_setup_encrypt(struct ieee80211_ccmp_aead *ccmp,
@@ -91,6 +94,7 @@ int ieee80211_aes_key_setup_encrypt(struct ieee80211_ccmp_aead *ccmp,
 				    size_t mic_len)
 {
 	struct crypto_aead *tfm;
+	struct aead_request **r;
 	int err;
 
 	tfm = crypto_alloc_aead("ccm(aes)", 0, CRYPTO_ALG_ASYNC);
@@ -104,6 +108,14 @@ int ieee80211_aes_key_setup_encrypt(struct ieee80211_ccmp_aead *ccmp,
 	if (err)
 		goto free_aead;
 
+	/* allow a struct aead_request to be cached per cpu */
+	r = alloc_percpu(struct aead_request *);
+	if (!r) {
+		err = -ENOMEM;
+		goto free_aead;
+	}
+
+	ccmp->reqs = r;
 	ccmp->tfm = tfm;
 	return 0;
 
@@ -114,5 +126,14 @@ int ieee80211_aes_key_setup_encrypt(struct ieee80211_ccmp_aead *ccmp,
 
 void ieee80211_aes_key_free(struct ieee80211_ccmp_aead *ccmp)
 {
+	struct aead_request *req;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		req = *per_cpu_ptr(ccmp->reqs, cpu);
+		if (req)
+			kzfree(req);
+	}
+	free_percpu(ccmp->reqs);
 	crypto_free_aead(ccmp->tfm);
 }
diff --git a/net/mac80211/key.h b/net/mac80211/key.h
index 1ec7a737ab79..f99ec46dc08f 100644
--- a/net/mac80211/key.h
+++ b/net/mac80211/key.h
@@ -89,6 +89,7 @@ struct ieee80211_key {
 			 */
 			u8 rx_pn[IEEE80211_NUM_TIDS + 1][IEEE80211_CCMP_PN_LEN];
 			struct crypto_aead *tfm;
+			struct aead_request * __percpu *reqs;
 			u32 replays; /* dot11RSNAStatsCCMPReplays */
 		} ccmp;
 		struct {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
@ 2016-10-18 14:16     ` Johannes Berg
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2016-10-18 14:16 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-wireless, netdev; +Cc: herbert, j, luto

On Tue, 2016-10-18 at 15:08 +0100, Ard Biesheuvel wrote:
> 
> +	aead_req = *this_cpu_ptr(ccmp->reqs);
> +	if (!aead_req) {
> +		aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
> +		if (!aead_req)
> +			return -ENOMEM;
> +		*this_cpu_ptr(ccmp->reqs) = aead_req;
> +		aead_request_set_tfm(aead_req, ccmp->tfm);
> +	}

Hmm. Is it really worth having a per-CPU variable for each possible
key? You could have a large number of those (typically three when
you're a client on an AP, and 1 + 1 for each client when you're the
AP).

Would it be so bad to have to set the TFM every time (if that's even
possible), and just have a single per-CPU cache?

johannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
@ 2016-10-18 14:16     ` Johannes Berg
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Berg @ 2016-10-18 14:16 UTC (permalink / raw)
  To: Ard Biesheuvel, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: herbert-lOAM2aK0SrRLBo1qDEOMRrpzq4S04n8Q, j, luto-kltTT9wpgjJwATOyAt5JVQ

On Tue, 2016-10-18 at 15:08 +0100, Ard Biesheuvel wrote:
> 
> +	aead_req = *this_cpu_ptr(ccmp->reqs);
> +	if (!aead_req) {
> +		aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
> +		if (!aead_req)
> +			return -ENOMEM;
> +		*this_cpu_ptr(ccmp->reqs) = aead_req;
> +		aead_request_set_tfm(aead_req, ccmp->tfm);
> +	}

Hmm. Is it really worth having a per-CPU variable for each possible
key? You could have a large number of those (typically three when
you're a client on an AP, and 1 + 1 for each client when you're the
AP).

Would it be so bad to have to set the TFM every time (if that's even
possible), and just have a single per-CPU cache?

johannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
@ 2016-10-18 14:18       ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:18 UTC (permalink / raw)
  To: Johannes Berg
  Cc: <linux-wireless@vger.kernel.org>,
	<netdev@vger.kernel.org>,
	Herbert Xu, Jouni Malinen, Andy Lutomirski

On 18 October 2016 at 15:16, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Tue, 2016-10-18 at 15:08 +0100, Ard Biesheuvel wrote:
>>
>> +     aead_req = *this_cpu_ptr(ccmp->reqs);
>> +     if (!aead_req) {
>> +             aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
>> +             if (!aead_req)
>> +                     return -ENOMEM;
>> +             *this_cpu_ptr(ccmp->reqs) = aead_req;
>> +             aead_request_set_tfm(aead_req, ccmp->tfm);
>> +     }
>
> Hmm. Is it really worth having a per-CPU variable for each possible
> key? You could have a large number of those (typically three when
> you're a client on an AP, and 1 + 1 for each client when you're the
> AP).
>
> Would it be so bad to have to set the TFM every time (if that's even
> possible), and just have a single per-CPU cache?
>

That would be preferred, yes. The only snag here is that
crypto_alloc_aead() is not guaranteed to return the same algo every
time, which means the request size is not guaranteed to be the same
either. This is a rare corner case, of course, but it needs to be
dealt with regardless

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
@ 2016-10-18 14:18       ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:18 UTC (permalink / raw)
  To: Johannes Berg
  Cc: <linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Herbert Xu, Jouni Malinen, Andy Lutomirski

On 18 October 2016 at 15:16, Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> wrote:
> On Tue, 2016-10-18 at 15:08 +0100, Ard Biesheuvel wrote:
>>
>> +     aead_req = *this_cpu_ptr(ccmp->reqs);
>> +     if (!aead_req) {
>> +             aead_req = kzalloc(reqsize + CCM_AAD_LEN, GFP_ATOMIC);
>> +             if (!aead_req)
>> +                     return -ENOMEM;
>> +             *this_cpu_ptr(ccmp->reqs) = aead_req;
>> +             aead_request_set_tfm(aead_req, ccmp->tfm);
>> +     }
>
> Hmm. Is it really worth having a per-CPU variable for each possible
> key? You could have a large number of those (typically three when
> you're a client on an AP, and 1 + 1 for each client when you're the
> AP).
>
> Would it be so bad to have to set the TFM every time (if that's even
> possible), and just have a single per-CPU cache?
>

That would be preferred, yes. The only snag here is that
crypto_alloc_aead() is not guaranteed to return the same algo every
time, which means the request size is not guaranteed to be the same
either. This is a rare corner case, of course, but it needs to be
dealt with regardless

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
  2016-10-18 14:18       ` Ard Biesheuvel
  (?)
@ 2016-10-18 14:24       ` Johannes Berg
  2016-10-18 14:30           ` Ard Biesheuvel
  -1 siblings, 1 reply; 11+ messages in thread
From: Johannes Berg @ 2016-10-18 14:24 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: <linux-wireless@vger.kernel.org>,
	<netdev@vger.kernel.org>,
	Herbert Xu, Jouni Malinen, Andy Lutomirski

On Tue, 2016-10-18 at 15:18 +0100, Ard Biesheuvel wrote:
> 
> > Hmm. Is it really worth having a per-CPU variable for each possible
> > key? You could have a large number of those (typically three when
> > you're a client on an AP, and 1 + 1 for each client when you're the
> > AP).

2 + 1 for each client, actually, since you have 2 GTKs present in the
"steady state"; not a big difference though.

> > Would it be so bad to have to set the TFM every time (if that's
> > even possible), and just have a single per-CPU cache?

> That would be preferred, yes. The only snag here is that
> crypto_alloc_aead() is not guaranteed to return the same algo every
> time, which means the request size is not guaranteed to be the same
> either. This is a rare corner case, of course, but it needs to be
> dealt with regardless

Ah, good point. Well I guess you could allocate a bigger one it if it's
too small, but then we'd have to recalculate the size all the time
(which we already did anyway, but saving something else would be good).
Then we'd be close to just having a per-CPU memory block cache though.

johannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
@ 2016-10-18 14:30           ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:30 UTC (permalink / raw)
  To: Johannes Berg
  Cc: <linux-wireless@vger.kernel.org>,
	<netdev@vger.kernel.org>,
	Herbert Xu, Jouni Malinen, Andy Lutomirski

On 18 October 2016 at 15:24, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Tue, 2016-10-18 at 15:18 +0100, Ard Biesheuvel wrote:
>>
>> > Hmm. Is it really worth having a per-CPU variable for each possible
>> > key? You could have a large number of those (typically three when
>> > you're a client on an AP, and 1 + 1 for each client when you're the
>> > AP).
>
> 2 + 1 for each client, actually, since you have 2 GTKs present in the
> "steady state"; not a big difference though.
>
>> > Would it be so bad to have to set the TFM every time (if that's
>> > even possible), and just have a single per-CPU cache?
>
>> That would be preferred, yes. The only snag here is that
>> crypto_alloc_aead() is not guaranteed to return the same algo every
>> time, which means the request size is not guaranteed to be the same
>> either. This is a rare corner case, of course, but it needs to be
>> dealt with regardless
>
> Ah, good point. Well I guess you could allocate a bigger one it if it's
> too small, but then we'd have to recalculate the size all the time
> (which we already did anyway, but saving something else would be good).
> Then we'd be close to just having a per-CPU memory block cache though.
>

Well, ideally we'd allocate the ccm(aes) crypto_alg a single time and
'spawn' the transforms for each key. This is how the crypto API
implements templates internally, but I don't think this functionality
is publicly accessible. Herbert?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU
@ 2016-10-18 14:30           ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2016-10-18 14:30 UTC (permalink / raw)
  To: Johannes Berg
  Cc: <linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Herbert Xu, Jouni Malinen, Andy Lutomirski

On 18 October 2016 at 15:24, Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org> wrote:
> On Tue, 2016-10-18 at 15:18 +0100, Ard Biesheuvel wrote:
>>
>> > Hmm. Is it really worth having a per-CPU variable for each possible
>> > key? You could have a large number of those (typically three when
>> > you're a client on an AP, and 1 + 1 for each client when you're the
>> > AP).
>
> 2 + 1 for each client, actually, since you have 2 GTKs present in the
> "steady state"; not a big difference though.
>
>> > Would it be so bad to have to set the TFM every time (if that's
>> > even possible), and just have a single per-CPU cache?
>
>> That would be preferred, yes. The only snag here is that
>> crypto_alloc_aead() is not guaranteed to return the same algo every
>> time, which means the request size is not guaranteed to be the same
>> either. This is a rare corner case, of course, but it needs to be
>> dealt with regardless
>
> Ah, good point. Well I guess you could allocate a bigger one it if it's
> too small, but then we'd have to recalculate the size all the time
> (which we already did anyway, but saving something else would be good).
> Then we'd be close to just having a per-CPU memory block cache though.
>

Well, ideally we'd allocate the ccm(aes) crypto_alg a single time and
'spawn' the transforms for each key. This is how the crypto API
implements templates internally, but I don't think this functionality
is publicly accessible. Herbert?

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-10-18 14:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-18 14:08 [RFC PATCH 0/2] mac80211: aes_ccm: cache AEAD request allocations per CPU Ard Biesheuvel
2016-10-18 14:08 ` Ard Biesheuvel
2016-10-18 14:08 ` [RFC PATCH 1/2] mac80211: aes_ccm: prepare key struct for storing context data Ard Biesheuvel
2016-10-18 14:08 ` [RFC PATCH 2/2] mac80211: aes_ccm: cache AEAD request structures per CPU Ard Biesheuvel
2016-10-18 14:16   ` Johannes Berg
2016-10-18 14:16     ` Johannes Berg
2016-10-18 14:18     ` Ard Biesheuvel
2016-10-18 14:18       ` Ard Biesheuvel
2016-10-18 14:24       ` Johannes Berg
2016-10-18 14:30         ` Ard Biesheuvel
2016-10-18 14:30           ` Ard Biesheuvel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.