All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] Fix PTK rekey freezes and cleartext leaks
@ 2018-07-31 20:10 Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 1/3] nl80211: Add ATOMIC_KEY_REPLACE API Alexander Wetzel
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Alexander Wetzel @ 2018-07-31 20:10 UTC (permalink / raw)
  To: johannes; +Cc: linux-wireless, greearb, s.gottschall, denkenz, Alexander Wetzel

I do not see an acceptable way from the kernel to trigger a fast
reconnect. wpa_supplicant e.g. has some special code handling errors
during a 4-way-handshake differently. I assume we would have to add
code detecting if we are running as AP, Station Infrastructure, Ad-Hoc
or Mesh and then find and spoof an error which allows the userspace to
reconnect fast. For multiple different userspace implementations and
the different versions of them.
And we'll have that mess in the kernel for basically forever...
Seems to be a clear no go, correct?

So this patch (series) is giving up on a quick fix and will allow broken
rekeys to continue. It is instead providing an updated API for both the
userspace and the drivers to also do it correctly. Users running a
userspace not aware of the new API will get some improvements but may
still leak cleartext packets and have connection freezes till we get an
updated userspace rolled out. On the plus side users running updated
userspace won't be able to rekey connections any longer, avoiding the
issues even with unpatched kernels.

Since the new approach now also touches nl80211 I've now broken it down
into a small patch series.
The first two patches are just laying some ground work with all the
magic happening in the last patch, which also best describes what we
want to achieve in its commit message. The commit messages of the first
two are mainly outlining how the new APIs should be used.

Here a quick overview of the patches in the series:
1) nl80211: Add ATOMIC_KEY_REPLACE API
   This adds support for NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE
   to nl80211. We expect the userspace (hostap, wpa_supplicant, iwd ...)
   to check this flag and only rekey a connection when this flag is
   present. When the flag is not set a rekey has to be "emulated" by a
   full de-association and a reconnect if it can't be avoided by the
   userspace.

2) mac80211: Define new driver callback replace_key
   This is just adding the new driver callback replace_key in
   mac80211 the last patch of the set can then use when needed.
   It's needed to provide a clear demarcation line between old
   packets which absolutely must not be send out with new key and new
   ones which should use the new key.
   By giving the driver the option to directly replace a running key
   with a updated one many drivers should be able to implement an atomic
   switch which is either using the old or the new key, avoiding the
   headach how to prevent some packets to be send out unencrypted.

3) mac80211: Fix PTK rekey freezes and cleartext leaks
   This finally changes how mac80211 handles the rekey and starts using
   the replace_key and sets NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE for
   drivers using mac80211 for it. GTK rekey is basically unchanged and
   only PTK rekey fixed to allow replacing the key with it while receiving
   and sending packets with HW encryption offloading and the races this
   causes to be factored in.
   When the driver is not supporting replace_key the code still fix the
   cleartext packet leak in mac80211 and stops RX and TX aggregation to
   prevent those to freeze the connection but besides that falls back to
   the old - know to be broken - sequence for backward compatibility. In
   that case mac80211 will print our a error message, alerting users
   that they still could leak clear text packets and may experience
   connection freezes till  they do not either disable rekey or upgrade
   the userspace. (There is currently no updated userspace available.)

Version history:
V4 Fix PTK rekey freezes and cleartext leaks
    - Switched over to a small patch series.
    - Allows insecure rekeys again for compatibility
    - Allows the userspace to check if rekeys are safe by extending
      nl80211.

V3 mac80211: Fix PTK rekey freezes and cleartext leaks
    Updates the mac80211 API. When the driver is implementing the new
    callback replace_key mac80211 allows PTK rekeys. If not it returns
    an error on key install to the requester.

V2 mac80211: Fix wlan freezes under load at rekey
    This fixes the issue in mac80211 only without API changes on a
    best-can-do approach. Drivers still can freeze the connection and if
    so have to be fixed.

V1 mac80211: Fix wlan freezes under load at rekey
    Very simple approach, only fixing the freezes and using a not
    guaranteed to be working fallback to SW encryption.

Alexander Wetzel (3):
  nl80211: Add ATOMIC_KEY_REPLACE API
  mac80211: Define new driver callback replace_key
  mac80211: Fix PTK rekey freezes and cleartext leaks

 include/net/mac80211.h       |  15 +++++
 include/uapi/linux/nl80211.h |   6 ++
 net/mac80211/driver-ops.h    |  20 ++++++
 net/mac80211/key.c           | 123 ++++++++++++++++++++++++++++++-----
 net/mac80211/main.c          |   5 ++
 net/mac80211/trace.h         |  39 +++++++++++
 net/mac80211/tx.c            |   4 ++
 7 files changed, 195 insertions(+), 17 deletions(-)

-- 
2.18.0

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 1/3] nl80211: Add ATOMIC_KEY_REPLACE API
  2018-07-31 20:10 [PATCH v4 0/3] Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
@ 2018-07-31 20:10 ` Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 2/3] mac80211: Define new driver callback replace_key Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
  2 siblings, 0 replies; 10+ messages in thread
From: Alexander Wetzel @ 2018-07-31 20:10 UTC (permalink / raw)
  To: johannes; +Cc: linux-wireless, greearb, s.gottschall, denkenz, Alexander Wetzel

Drivers able to replace a in-use should set
NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE to allow the userspace (e.g.
hostapd or wpa_supplicant) to rekey the PTK without a full
disassociation. The userspace must detect a PTK rekey attempt and only
go ahead with the rekey if the driver has set this flag. When the driver
is not supporting the feature the userspace has to perform
a re-association.

Ignoring this flag and continuing to rekey the connection can still
work but has to be considered insecure and broken. It can leak cleartext
packets or freeze the connection and is only supported to allow the
userspace to be updated.

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
---
 include/uapi/linux/nl80211.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/uapi/linux/nl80211.h b/include/uapi/linux/nl80211.h
index 7acc16f34942..b41b9ade0449 100644
--- a/include/uapi/linux/nl80211.h
+++ b/include/uapi/linux/nl80211.h
@@ -5224,6 +5224,11 @@ enum nl80211_feature_flags {
  *	except for supported rates from the probe request content if requested
  *	by the %NL80211_SCAN_FLAG_MIN_PREQ_CONTENT flag.
  *
+ * @NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE: Driver/device confirm that they are
+ *      able to rekey an in-use key correctly. Userspace must not rekey PTK keys
+ *      if this flag is not set. Ignoring this can leak clear text packets and/or
+ *      freeze the connection.
+ *
  * @NUM_NL80211_EXT_FEATURES: number of extended features.
  * @MAX_NL80211_EXT_FEATURES: highest extended feature index.
  */
@@ -5259,6 +5264,7 @@ enum nl80211_ext_feature_index {
 	NL80211_EXT_FEATURE_TXQS,
 	NL80211_EXT_FEATURE_SCAN_RANDOM_SN,
 	NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT,
+	NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE,
 
 	/* add new features before the definition below */
 	NUM_NL80211_EXT_FEATURES,
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 2/3] mac80211: Define new driver callback replace_key
  2018-07-31 20:10 [PATCH v4 0/3] Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 1/3] nl80211: Add ATOMIC_KEY_REPLACE API Alexander Wetzel
@ 2018-07-31 20:10 ` Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
  2 siblings, 0 replies; 10+ messages in thread
From: Alexander Wetzel @ 2018-07-31 20:10 UTC (permalink / raw)
  To: johannes; +Cc: linux-wireless, greearb, s.gottschall, denkenz, Alexander Wetzel

Define the new driver callback replace_key in mac80211 for future use.
Drivers able to replace a in-use key should implement this new callback
to allow mac80211 drivers to securely use PTK rekeying.

On return of the function drivers must guarantee they:
 - Did not send out any frames for the key unencrypted during the
   replace,
 - will not send out packets queued to them prior to the call encrypted
   with the new key
 - and will no longer hand over any which were encrypted with the old
   key to mac80211 when not handling IV/ICV in the driver.

Packets handed over to the driver after the callback returned are
expected to be send out encrypted with the new key and pending
retransmissions must either be dropped or continue to use the old key.
Mac80211 will not hand over outgoing packets for the key being replaced
while the callback is running.

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
---
 include/net/mac80211.h    | 15 +++++++++++++++
 net/mac80211/driver-ops.h | 20 ++++++++++++++++++++
 net/mac80211/main.c       |  5 +++++
 net/mac80211/trace.h      | 39 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 5790f55c241d..166a47f1886e 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -3137,6 +3137,17 @@ enum ieee80211_reconfig_type {
  *	Returns a negative error code if the key can't be added.
  *	The callback can sleep.
  *
+ * @replace_key: Replace an exiting in use key with a new one while guranteeing
+ * 	to not leak clear text packets. Implementing this callback will enable
+ * 	mac80211 to anounce NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE.
+ * 	Packets already queued must not be send out encrypted with the new key
+ * 	and packets decoded with the old key must not be handed over to mac80211
+ * 	when the driver is not checking IV/ICV itself once the callback has been
+ * 	completed.
+ * 	Mac80211 will log an error when asked to use replace a PTK key
+ * 	without replace_key but will still perform the then potentially
+ * 	insecure action via set_key for backward combatibility for now.
+ *
  * @update_tkip_key: See the section "Hardware crypto acceleration"
  * 	This callback will be called in the context of Rx. Called for drivers
  * 	which set IEEE80211_KEY_FLAG_TKIP_REQ_RX_P1_KEY.
@@ -3585,6 +3596,10 @@ struct ieee80211_ops {
 	int (*set_key)(struct ieee80211_hw *hw, enum set_key_cmd cmd,
 		       struct ieee80211_vif *vif, struct ieee80211_sta *sta,
 		       struct ieee80211_key_conf *key);
+	int (*replace_key)(struct ieee80211_hw *hw,
+		       struct ieee80211_vif *vif, struct ieee80211_sta *sta,
+		       struct ieee80211_key_conf *old,
+		       struct ieee80211_key_conf *new);
 	void (*update_tkip_key)(struct ieee80211_hw *hw,
 				struct ieee80211_vif *vif,
 				struct ieee80211_key_conf *conf,
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h
index 8f6998091d26..ebd7f1463336 100644
--- a/net/mac80211/driver-ops.h
+++ b/net/mac80211/driver-ops.h
@@ -255,6 +255,26 @@ static inline int drv_set_key(struct ieee80211_local *local,
 	return ret;
 }
 
+static inline int drv_replace_key(struct ieee80211_local *local,
+			      struct ieee80211_sub_if_data *sdata,
+			      struct ieee80211_sta *sta,
+			      struct ieee80211_key_conf *old_key,
+			      struct ieee80211_key_conf *new_key)
+{
+	int ret;
+
+	might_sleep();
+
+	sdata = get_bss_sdata(sdata);
+	if (!check_sdata_in_driver(sdata))
+		return -EIO;
+
+	trace_drv_replace_key(local, sdata, sta, old_key, new_key);
+	ret = local->ops->replace_key(&local->hw, &sdata->vif, sta, old_key, new_key);
+	trace_drv_return_int(local, ret);
+	return ret;
+}
+
 static inline void drv_update_tkip_key(struct ieee80211_local *local,
 				       struct ieee80211_sub_if_data *sdata,
 				       struct ieee80211_key_conf *conf,
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 4fb2709cb527..84cc8005c19a 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -572,9 +572,14 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 				      NL80211_EXT_FEATURE_SCAN_MIN_PREQ_CONTENT);
 	}
 
+	if (ops->replace_key)
+		wiphy_ext_feature_set(wiphy,
+				      NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE);
+
 	if (!ops->set_key)
 		wiphy->flags |= WIPHY_FLAG_IBSS_RSN;
 
+
 	if (ops->wake_tx_queue)
 		wiphy_ext_feature_set(wiphy, NL80211_EXT_FEATURE_TXQS);
 
diff --git a/net/mac80211/trace.h b/net/mac80211/trace.h
index 0ab69a1964f8..f93e00f1ae4d 100644
--- a/net/mac80211/trace.h
+++ b/net/mac80211/trace.h
@@ -603,6 +603,45 @@ TRACE_EVENT(drv_set_key,
 	)
 );
 
+TRACE_EVENT(drv_replace_key,
+	TP_PROTO(struct ieee80211_local *local,
+		 struct ieee80211_sub_if_data *sdata,
+		 struct ieee80211_sta *sta,
+		 struct ieee80211_key_conf *old_key,
+		 struct ieee80211_key_conf *new_key),
+
+	TP_ARGS(local, sdata, sta, old_key, new_key),
+
+	TP_STRUCT__entry(
+		LOCAL_ENTRY
+		VIF_ENTRY
+		STA_ENTRY
+		KEY_ENTRY
+		__field(u32, cipher2)
+		__field(u8, hw_key_idx2)
+		__field(u8, flags2)
+		__field(s8, keyidx2)
+	),
+
+	TP_fast_assign(
+		LOCAL_ASSIGN;
+		VIF_ASSIGN;
+		STA_ASSIGN;
+		KEY_ASSIGN(old_key);
+		__entry->cipher2 = new_key->cipher;
+		__entry->flags2 = new_key->flags;
+		__entry->keyidx2 = new_key->keyidx;
+		__entry->hw_key_idx2 = new_key->hw_key_idx;
+	),
+
+	TP_printk(
+		LOCAL_PR_FMT  VIF_PR_FMT  STA_PR_FMT KEY_PR_FMT
+		" cipher2:0x%x, flags2=%#x, keyidx2=%d, hw_key_idx2=%d",
+		LOCAL_PR_ARG, VIF_PR_ARG, STA_PR_ARG, KEY_PR_ARG,
+		__entry->cipher2, __entry->flags2, __entry->keyidx2, __entry->hw_key_idx2
+	)
+);
+
 TRACE_EVENT(drv_update_tkip_key,
 	TP_PROTO(struct ieee80211_local *local,
 		 struct ieee80211_sub_if_data *sdata,
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-07-31 20:10 [PATCH v4 0/3] Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 1/3] nl80211: Add ATOMIC_KEY_REPLACE API Alexander Wetzel
  2018-07-31 20:10 ` [PATCH v4 2/3] mac80211: Define new driver callback replace_key Alexander Wetzel
@ 2018-07-31 20:10 ` Alexander Wetzel
  2018-07-31 21:36   ` Ben Greear
  2 siblings, 1 reply; 10+ messages in thread
From: Alexander Wetzel @ 2018-07-31 20:10 UTC (permalink / raw)
  To: johannes; +Cc: linux-wireless, greearb, s.gottschall, denkenz, Alexander Wetzel

Rekeying a pairwise key with encryption offload and only keyid 0 had
multiple races. Two of them could freeze the wlan connection till
rekeyed again and the third could send out packets in clear which should
have been encrypted.

 1) Freeze caused by incoming packets:
    If the local STA installed the key prior to the remote STA we still
    had the old key active in the hardware while mac80211 was already
    operating on the new key.
    The card could still hand over packets decoded with the old key to
    mac80211, bumping the new PN (IV) value to an incorrect high number
    and tricking the local replay detection to later drop all packets
    really sent with the new key.

 2) Freeze caused by outgoing packets:
    If mac80211 was providing the PN (IV) and handed over a cleartext
    packets for encryption to the hardware prior to a key change the
    driver/card could have processed the queued packets after switching
    to the new key.
    This immediately bumped the PN (IV) value on the remote STA to
    an incorrect high number, which also froze the connection.

 3) Clear text leak:
    Removing encryption offload from the card cleared the encryption
    offload flag only after the card had removed the key. Packets handed
    over between that were send out unencrypted.

To prevent those issues we now stop queuing packets to the driver while
replacing the key, replace the key first in the HW and stop/block new
aggregation sessions during the rekey.

This will only work correctly with drivers implementing replace_key and
a userspace honoring NL80211_EXT_FEATURE_ATOMIC_KEY_REPLACE.
If the driver is not implementing replace_key all three issues can still
occur and the userspace must not rekey a running connection.

With drivers implementing replace_key this fixes the (for rekeys)
invalid unicast key install procedure from:
 - atomic switch over to the new key in mac80211 (TX still active!)
 - remove the old key in the HW (stops encryption offloading, switch
   to SW encryption and can leak clear text packets while doing that)
 - delete the inactive old key in mac80211
 - add new key in the HW for encryption offloading (ending software
   encryption)
to:
 - mark the old key as tainted to drop TX packets with the outgoing key
 - replace the key in HW with the new one using the new driver callback
   "replace_key" (driver still must guarantee it is not leaking
   cleartext itself)
 - atomic switch over to the new key in mac80211 (allow TX again)
 - delete the inactive old key in mac80211

For drivers not implementing the new callback "replace_key" it's
unknown if the driver can replace the key without leaking cleartext
packets. Mac80211 will therefore log an error message when trying to
update the PTK key but still replace the key as instructed. Refusing
cooperation is possible but considered to be the greater evil due to
user visible changes. The mac80211 cleartext packet leak is still fixed,
but potential cleartext packet leaks and the freezes - caused by the
undefined driver behavior - can't be ruled out.

The logic behind the updated rekey sequence:
With the new sequence the HW will be unable to decode packets encrypted
with the old key prior to switching to the new key in mac80211.
Locking down TX during the rekey makes sure that there are no outgoing
packets while the driver and card are switching to the new key.

The driver is allowed to hand over packets decrypted with either the new
or the old key till "replace_key" returns. But all packets queued prior
to calling the callback must be either still be send out encrypted with
the old key or be dropped.

A RX aggregation session started prior to the rekey and completed after
can still dump frames received with the old key at mac80211 after we
switched over to the new key. This is avoided by stopping all RX and
aggregation sessions when we replace a PTK key and are using key
offloading.

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
---
 net/mac80211/key.c | 123 ++++++++++++++++++++++++++++++++++++++-------
 net/mac80211/tx.c  |   4 ++
 2 files changed, 110 insertions(+), 17 deletions(-)

diff --git a/net/mac80211/key.c b/net/mac80211/key.c
index c054ac85793c..5a5f9e0d276e 100644
--- a/net/mac80211/key.c
+++ b/net/mac80211/key.c
@@ -248,6 +248,7 @@ static void ieee80211_key_disable_hw_accel(struct ieee80211_key *key)
 	      (key->conf.flags & IEEE80211_KEY_FLAG_RESERVE_TAILROOM)))
 		increment_tailroom_need_count(sdata);
 
+	key->flags &= ~KEY_FLAG_UPLOADED_TO_HARDWARE;
 	ret = drv_set_key(key->local, DISABLE_KEY, sdata,
 			  sta ? &sta->sta : NULL, &key->conf);
 
@@ -257,7 +258,63 @@ static void ieee80211_key_disable_hw_accel(struct ieee80211_key *key)
 			  key->conf.keyidx,
 			  sta ? sta->sta.addr : bcast_addr, ret);
 
-	key->flags &= ~KEY_FLAG_UPLOADED_TO_HARDWARE;
+}
+
+static int ieee80211_hw_ptk_replace(struct ieee80211_key *old_key,
+				    struct ieee80211_key *new_key,
+				    bool canreplace)
+{
+	struct ieee80211_sub_if_data *sdata;
+	struct ieee80211_local *local;
+	struct sta_info *sta;
+	int ret;
+
+	if (!old_key || !new_key || !old_key->sta)
+		return -EINVAL;
+
+	/* Running on software encryption */
+	if (!(old_key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE))
+		return 0;
+
+	assert_key_lock(old_key->local);
+
+	sta = old_key->sta;
+	local = old_key->local;
+	sdata = old_key->sdata;
+
+	/* Stop TX till we are on the new key */
+	old_key->flags |= KEY_FLAG_TAINTED;
+	ieee80211_clear_fast_xmit(sta);
+
+	if (ieee80211_hw_check(&local->hw, AMPDU_AGGREGATION)) {
+		set_sta_flag(sta, WLAN_STA_BLOCK_BA);
+		ieee80211_sta_tear_down_BA_sessions(sta, AGG_STOP_LOCAL_REQUEST);
+	}
+
+	if (canreplace) {
+		ret = drv_replace_key(old_key->local, sdata,
+				      &sta->sta, &old_key->conf,
+				      &new_key->conf);
+		if (!ret)
+			new_key->flags |= KEY_FLAG_UPLOADED_TO_HARDWARE;
+		else
+			sdata_err(sdata,
+				  "failed to replace key (%d) with key " \
+				  "(%d) for STA, %pM in hardware (%d)\n",
+				  old_key->conf.keyidx,
+				  new_key->conf.keyidx,
+				  sta ? sta->sta.addr : bcast_addr, ret);
+
+		old_key->flags &= ~KEY_FLAG_UPLOADED_TO_HARDWARE;
+	} else {
+		sdata_err(sdata,
+			  "PTK rekey and old userspace software used. Some "
+			  "packets to STA %pM may be send out without "
+			  "encryption and the connection may also freeze!",
+			  sta->sta.addr);
+		ret = 0;
+	}
+	return (ret);
 }
 
 static void __ieee80211_set_default_key(struct ieee80211_sub_if_data *sdata,
@@ -316,38 +373,74 @@ void ieee80211_set_default_mgmt_key(struct ieee80211_sub_if_data *sdata,
 }
 
 
-static void ieee80211_key_replace(struct ieee80211_sub_if_data *sdata,
+static int ieee80211_key_replace(struct ieee80211_sub_if_data *sdata,
 				  struct sta_info *sta,
 				  bool pairwise,
 				  struct ieee80211_key *old,
 				  struct ieee80211_key *new)
 {
 	int idx;
+	int ret;
 	bool defunikey, defmultikey, defmgmtkey;
+	bool canreplace = false;
 
 	/* caller must provide at least one old/new */
 	if (WARN_ON(!new && !old))
-		return;
+		return 0;
+
+	/* Drivers can provide and signal support for replacing in-use
+	 * keys by implementing the callback replace_key. If possible
+	 * use that for PTK key replaces. If not we stick to an improved
+	 * procedure with_set_key which may still leak clear text packets
+	 * and freeze RX and/or TX for compatibility.
+	 */
+	if (new && new->local->ops->replace_key)
+		canreplace = true;
 
 	if (new)
 		list_add_tail_rcu(&new->list, &sdata->key_list);
 
 	WARN_ON(new && old && new->conf.keyidx != old->conf.keyidx);
 
-	if (old)
+	if (old) {
 		idx = old->conf.keyidx;
-	else
+
+		if(new && sta && pairwise) {
+			ret = ieee80211_hw_ptk_replace(old, new, canreplace);
+			if (ret)
+				return ret;
+		}
+	} else {
 		idx = new->conf.keyidx;
+	}
+
+	if (new && !new->local->wowlan &&
+	    !(sta && old && canreplace)) {
+		/* Only safe to use for GTK keys!
+		 * Doing this for an in use PTK key can trick our replay
+		 * detection to kill RX and potentially also the TX of the
+		 * remote station. But it's still allowed for PTKs when the
+		 * driver is not implementing replace_key for backward
+		 * combatibility for now.
+		 */
+		ret = ieee80211_key_enable_hw_accel(new);
+		if (ret)
+			return ret;
+	}
 
 	if (sta) {
 		if (pairwise) {
 			rcu_assign_pointer(sta->ptk[idx], new);
 			sta->ptk_idx = idx;
-			ieee80211_check_fast_xmit(sta);
+			if (new) {
+				clear_sta_flag(sta, WLAN_STA_BLOCK_BA);
+				ieee80211_check_fast_xmit(sta);
+			}
 		} else {
 			rcu_assign_pointer(sta->gtk[idx], new);
 		}
-		ieee80211_check_fast_rx(sta);
+		if (new)
+			ieee80211_check_fast_rx(sta);
 	} else {
 		defunikey = old &&
 			old == key_mtx_dereference(sdata->local,
@@ -380,6 +473,7 @@ static void ieee80211_key_replace(struct ieee80211_sub_if_data *sdata,
 
 	if (old)
 		list_del_rcu(&old->list);
+	return 0;
 }
 
 struct ieee80211_key *
@@ -654,7 +748,6 @@ int ieee80211_key_link(struct ieee80211_key *key,
 		       struct ieee80211_sub_if_data *sdata,
 		       struct sta_info *sta)
 {
-	struct ieee80211_local *local = sdata->local;
 	struct ieee80211_key *old_key;
 	int idx = key->conf.keyidx;
 	bool pairwise = key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE;
@@ -691,17 +784,13 @@ int ieee80211_key_link(struct ieee80211_key *key,
 
 	increment_tailroom_need_count(sdata);
 
-	ieee80211_key_replace(sdata, sta, pairwise, old_key, key);
-	ieee80211_key_destroy(old_key, delay_tailroom);
-
-	ieee80211_debugfs_key_add(key);
+	ret = ieee80211_key_replace(sdata, sta, pairwise, old_key, key);
 
-	if (!local->wowlan) {
-		ret = ieee80211_key_enable_hw_accel(key);
-		if (ret)
-			ieee80211_key_free(key, delay_tailroom);
+	if (!ret) {
+		ieee80211_debugfs_key_add(key);
+		ieee80211_key_destroy(old_key, true);
 	} else {
-		ret = 0;
+		ieee80211_key_free(key, true);
 	}
 
  out:
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index cd332e3e1134..13228693324c 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -2951,6 +2951,10 @@ void ieee80211_check_fast_xmit(struct sta_info *sta)
 		if (!(build.key->flags & KEY_FLAG_UPLOADED_TO_HARDWARE))
 			goto out;
 
+		/* Key is beeing removed */
+		if (build.key->flags & KEY_FLAG_TAINTED)
+			goto out;
+
 		switch (build.key->conf.cipher) {
 		case WLAN_CIPHER_SUITE_CCMP:
 		case WLAN_CIPHER_SUITE_CCMP_256:
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-07-31 20:10 ` [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
@ 2018-07-31 21:36   ` Ben Greear
  2018-08-01 17:01     ` Alexander Wetzel
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Greear @ 2018-07-31 21:36 UTC (permalink / raw)
  To: Alexander Wetzel, johannes; +Cc: linux-wireless, s.gottschall, denkenz

On 07/31/2018 01:10 PM, Alexander Wetzel wrote:
> Rekeying a pairwise key with encryption offload and only keyid 0 had
> multiple races. Two of them could freeze the wlan connection till
> rekeyed again and the third could send out packets in clear which should
> have been encrypted.
>
>  1) Freeze caused by incoming packets:
>     If the local STA installed the key prior to the remote STA we still
>     had the old key active in the hardware while mac80211 was already
>     operating on the new key.
>     The card could still hand over packets decoded with the old key to
>     mac80211, bumping the new PN (IV) value to an incorrect high number
>     and tricking the local replay detection to later drop all packets
>     really sent with the new key.
>
>  2) Freeze caused by outgoing packets:
>     If mac80211 was providing the PN (IV) and handed over a cleartext
>     packets for encryption to the hardware prior to a key change the
>     driver/card could have processed the queued packets after switching
>     to the new key.
>     This immediately bumped the PN (IV) value on the remote STA to
>     an incorrect high number, which also froze the connection.
>
>  3) Clear text leak:
>     Removing encryption offload from the card cleared the encryption
>     offload flag only after the card had removed the key. Packets handed
>     over between that were send out unencrypted.
>
> To prevent those issues we now stop queuing packets to the driver while
> replacing the key, replace the key first in the HW and stop/block new
> aggregation sessions during the rekey.

I guess the replace_key logic will have to flush the tx hardware/firmware
queues and any other packets currently owned by the driver before it can
replace the key?


..

> +		/* Key is beeing removed */

Looks like 'being' is mis-spelled?


Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-07-31 21:36   ` Ben Greear
@ 2018-08-01 17:01     ` Alexander Wetzel
  2018-08-01 17:21       ` Ben Greear
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Wetzel @ 2018-08-01 17:01 UTC (permalink / raw)
  To: Ben Greear, johannes; +Cc: linux-wireless, s.gottschall, denkenz

Hallo,

>> Rekeying a pairwise key with encryption offload and only keyid 0 had
>> multiple races. Two of them could freeze the wlan connection till
>> rekeyed again and the third could send out packets in clear which should
>> have been encrypted.
>>
>>  1) Freeze caused by incoming packets:
>>     If the local STA installed the key prior to the remote STA we still
>>     had the old key active in the hardware while mac80211 was already
>>     operating on the new key.
>>     The card could still hand over packets decoded with the old key to
>>     mac80211, bumping the new PN (IV) value to an incorrect high number
>>     and tricking the local replay detection to later drop all packets
>>     really sent with the new key.
>>
>>  2) Freeze caused by outgoing packets:
>>     If mac80211 was providing the PN (IV) and handed over a cleartext
>>     packets for encryption to the hardware prior to a key change the
>>     driver/card could have processed the queued packets after switching
>>     to the new key.
>>     This immediately bumped the PN (IV) value on the remote STA to
>>     an incorrect high number, which also froze the connection.
>>
>>  3) Clear text leak:
>>     Removing encryption offload from the card cleared the encryption
>>     offload flag only after the card had removed the key. Packets handed
>>     over between that were send out unencrypted.
>>
>> To prevent those issues we now stop queuing packets to the driver while
>> replacing the key, replace the key first in the HW and stop/block new
>> aggregation sessions during the rekey.
> 
> I guess the replace_key logic will have to flush the tx hardware/firmware
> queues and any other packets currently owned by the driver before it can
> replace the key?

That is one of the options, but looks like we can also avoid it with
some efforts. Question will be what's the best approach per driver and
how complex we want the code to become...
My current ath9k fix depends on flush - since it's simple - but it
definitely looks like it can be done without. (It looks like the
userspace updates are more urgent - assuming we can even agree on this
overall solution - and I've not done anything on it for weeks.)

There are two ways to establish the boundary we need here:
1) Make sure all old packets have been send/dropped or 2) continue to
use the old key for them.

Iwlwifi e.g. is handing over the key with the packet to the HW and the
driver simply uses the key it was told per packet. Implementing
replace_key is trivial, we can just call the existing set_key function
for deletion and addition from replace_key if we want. (I've tested that
quite extensive and it works flawless with my dvm card.)

Drivers uploading the keys to the HW and then just tell the HW to use
key ID X for encryption - like ath9, the only other driver I drilled
deeper so far - are more tricky. But replace_key is still allowed to
change the HW key ID for the new key if desired by the driver.

The driver can therefore only "deprecate" a key but keep it programmed
in HW, assign and program a new key and return to mac80211.
Any queued packets will still lookup and use the old key while new
packets will tell the HW to use the new one. Problem solved, if we
ignore the headache when and how to really free the old key id and
remove it from the HW. In the worst case we wait the max time any packet
can be stuck in the queue. Of course this may cause other issues in at
least some special cases. The one I can think of would be:

A busy AP with many clients connected and rekey enabled gets rebooted
during work hours. All clients reconnect at basically the same time.
Thus all of those will also rekey at nearly the same time and therefore
all need two PTK key slots in the HW for some seconds, potentially
exceeding the available ones. So you may get some strange effects after
the rekey interval expired, long after the reboot...

> ..
> 
>> +        /* Key is beeing removed */
> 
> Looks like 'being' is mis-spelled?

Thanks, I've fixed that in my local git tree. Guess I'll wait a few more
days till sending and hopefully accumulating more feedback in it.

Regards,

Alexander

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-08-01 17:01     ` Alexander Wetzel
@ 2018-08-01 17:21       ` Ben Greear
  2018-08-01 19:37         ` Alexander Wetzel
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Greear @ 2018-08-01 17:21 UTC (permalink / raw)
  To: Alexander Wetzel, johannes; +Cc: linux-wireless, s.gottschall, denkenz

On 08/01/2018 10:01 AM, Alexander Wetzel wrote:
> Hallo,
>
>>> Rekeying a pairwise key with encryption offload and only keyid 0 had
>>> multiple races. Two of them could freeze the wlan connection till
>>> rekeyed again and the third could send out packets in clear which should
>>> have been encrypted.
>>>
>>>  1) Freeze caused by incoming packets:
>>>     If the local STA installed the key prior to the remote STA we still
>>>     had the old key active in the hardware while mac80211 was already
>>>     operating on the new key.
>>>     The card could still hand over packets decoded with the old key to
>>>     mac80211, bumping the new PN (IV) value to an incorrect high number
>>>     and tricking the local replay detection to later drop all packets
>>>     really sent with the new key.
>>>
>>>  2) Freeze caused by outgoing packets:
>>>     If mac80211 was providing the PN (IV) and handed over a cleartext
>>>     packets for encryption to the hardware prior to a key change the
>>>     driver/card could have processed the queued packets after switching
>>>     to the new key.
>>>     This immediately bumped the PN (IV) value on the remote STA to
>>>     an incorrect high number, which also froze the connection.
>>>
>>>  3) Clear text leak:
>>>     Removing encryption offload from the card cleared the encryption
>>>     offload flag only after the card had removed the key. Packets handed
>>>     over between that were send out unencrypted.
>>>
>>> To prevent those issues we now stop queuing packets to the driver while
>>> replacing the key, replace the key first in the HW and stop/block new
>>> aggregation sessions during the rekey.
>>
>> I guess the replace_key logic will have to flush the tx hardware/firmware
>> queues and any other packets currently owned by the driver before it can
>> replace the key?
>
> That is one of the options, but looks like we can also avoid it with
> some efforts. Question will be what's the best approach per driver and
> how complex we want the code to become...
> My current ath9k fix depends on flush - since it's simple - but it
> definitely looks like it can be done without. (It looks like the
> userspace updates are more urgent - assuming we can even agree on this
> overall solution - and I've not done anything on it for weeks.)
>
> There are two ways to establish the boundary we need here:
> 1) Make sure all old packets have been send/dropped or 2) continue to
> use the old key for them.
>
> Iwlwifi e.g. is handing over the key with the packet to the HW and the
> driver simply uses the key it was told per packet. Implementing
> replace_key is trivial, we can just call the existing set_key function
> for deletion and addition from replace_key if we want. (I've tested that
> quite extensive and it works flawless with my dvm card.)
 >
> Drivers uploading the keys to the HW and then just tell the HW to use
> key ID X for encryption - like ath9, the only other driver I drilled
> deeper so far - are more tricky. But replace_key is still allowed to
> change the HW key ID for the new key if desired by the driver.
>
> The driver can therefore only "deprecate" a key but keep it programmed
> in HW, assign and program a new key and return to mac80211.
> Any queued packets will still lookup and use the old key while new
> packets will tell the HW to use the new one. Problem solved, if we
> ignore the headache when and how to really free the old key id and
> remove it from the HW. In the worst case we wait the max time any packet
> can be stuck in the queue. Of course this may cause other issues in at
> least some special cases. The one I can think of would be:
>
> A busy AP with many clients connected and rekey enabled gets rebooted
> during work hours. All clients reconnect at basically the same time.
> Thus all of those will also rekey at nearly the same time and therefore
> all need two PTK key slots in the HW for some seconds, potentially
> exceeding the available ones. So you may get some strange effects after
> the rekey interval expired, long after the reboot...

The ath10k firmware already sort of handles stale keys as far as I can tell,
and early on in my testing, which often has this issue of all
stations rekeying at once, I was running out of firmware key objects.  I
increased the multiplier in the driver and made the firmware smarter about
recycling key objects, and that fixed the problem for me.

So, maybe this would just work with ath10k, but if there were any issues,
probably you'd have to fall back to flushing everything.  And, that might
be a way to work somewhat generically with drivers that don't have special
replace-key logic?

Someone could also tweak supplicant to somewhat randomize the rekey timers,
but that would take a while to propagate to all of the station devices out
there.

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-08-01 17:21       ` Ben Greear
@ 2018-08-01 19:37         ` Alexander Wetzel
  2018-08-01 19:47           ` Ben Greear
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Wetzel @ 2018-08-01 19:37 UTC (permalink / raw)
  To: Ben Greear, johannes; +Cc: linux-wireless, s.gottschall, denkenz

Hello,

>>
>>>> Rekeying a pairwise key with encryption offload and only keyid 0 had
>>>> multiple races. Two of them could freeze the wlan connection till
>>>> rekeyed again and the third could send out packets in clear which
>>>> should
>>>> have been encrypted.
>>>>
>>>>  1) Freeze caused by incoming packets:
>>>>     If the local STA installed the key prior to the remote STA we still
>>>>     had the old key active in the hardware while mac80211 was already
>>>>     operating on the new key.
>>>>     The card could still hand over packets decoded with the old key to
>>>>     mac80211, bumping the new PN (IV) value to an incorrect high number
>>>>     and tricking the local replay detection to later drop all packets
>>>>     really sent with the new key.
>>>>
>>>>  2) Freeze caused by outgoing packets:
>>>>     If mac80211 was providing the PN (IV) and handed over a cleartext
>>>>     packets for encryption to the hardware prior to a key change the
>>>>     driver/card could have processed the queued packets after switching
>>>>     to the new key.
>>>>     This immediately bumped the PN (IV) value on the remote STA to
>>>>     an incorrect high number, which also froze the connection.
>>>>
>>>>  3) Clear text leak:
>>>>     Removing encryption offload from the card cleared the encryption
>>>>     offload flag only after the card had removed the key. Packets
>>>> handed
>>>>     over between that were send out unencrypted.
>>>>
>>>> To prevent those issues we now stop queuing packets to the driver while
>>>> replacing the key, replace the key first in the HW and stop/block new
>>>> aggregation sessions during the rekey.
>>>
>>> I guess the replace_key logic will have to flush the tx
>>> hardware/firmware
>>> queues and any other packets currently owned by the driver before it can
>>> replace the key?
>>
>> That is one of the options, but looks like we can also avoid it with
>> some efforts. Question will be what's the best approach per driver and
>> how complex we want the code to become...
>> My current ath9k fix depends on flush - since it's simple - but it
>> definitely looks like it can be done without. (It looks like the
>> userspace updates are more urgent - assuming we can even agree on this
>> overall solution - and I've not done anything on it for weeks.)
>>
>> There are two ways to establish the boundary we need here:
>> 1) Make sure all old packets have been send/dropped or 2) continue to
>> use the old key for them.
>>
>> Iwlwifi e.g. is handing over the key with the packet to the HW and the
>> driver simply uses the key it was told per packet. Implementing
>> replace_key is trivial, we can just call the existing set_key function
>> for deletion and addition from replace_key if we want. (I've tested that
>> quite extensive and it works flawless with my dvm card.)
>>
>> Drivers uploading the keys to the HW and then just tell the HW to use
>> key ID X for encryption - like ath9, the only other driver I drilled
>> deeper so far - are more tricky. But replace_key is still allowed to
>> change the HW key ID for the new key if desired by the driver.
>>
>> The driver can therefore only "deprecate" a key but keep it programmed
>> in HW, assign and program a new key and return to mac80211.
>> Any queued packets will still lookup and use the old key while new
>> packets will tell the HW to use the new one. Problem solved, if we
>> ignore the headache when and how to really free the old key id and
>> remove it from the HW. In the worst case we wait the max time any packet
>> can be stuck in the queue. Of course this may cause other issues in at
>> least some special cases. The one I can think of would be:
>>
>> A busy AP with many clients connected and rekey enabled gets rebooted
>> during work hours. All clients reconnect at basically the same time.
>> Thus all of those will also rekey at nearly the same time and therefore
>> all need two PTK key slots in the HW for some seconds, potentially
>> exceeding the available ones. So you may get some strange effects after
>> the rekey interval expired, long after the reboot...
> 
> The ath10k firmware already sort of handles stale keys as far as I can
> tell,
> and early on in my testing, which often has this issue of all
> stations rekeying at once, I was running out of firmware key objects.  I
> increased the multiplier in the driver and made the firmware smarter about
> recycling key objects, and that fixed the problem for me.
> 
> So, maybe this would just work with ath10k, but if there were any issues,
> probably you'd have to fall back to flushing everything.  And, that might
> be a way to work somewhat generically with drivers that don't have special
> replace-key logic?

The V2 version of the patch called ieee80211_flush_queues
unconditionally as kind of safety net to increase the odds of the driver
to not send out a wrong packet. Argument against that was, that not all
drivers are implementing flush and even when they are we have no
guarantee that the packet was really flushed from the card and not only
the driver. The current patch is now expecting the userspace to never
rekey a connection when the driver is not supporting. If it does anyhow
we are back in uncharted area, only slightly better than current stable
kernels. Of course we can handle that like in the V2 version of the
patch. I kind of like the idea...Is that something I shall add in the
next version?

> 
> Someone could also tweak supplicant to somewhat randomize the rekey timers,
> but that would take a while to propagate to all of the station devices out
> there.

It may not even be a realistic problem. Just trying to think of worst
cases:-) The reconnections will for sure be spread a few ms, at least.
What is the longest time we have to keep the old key programmed in HW?

Alexander

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-08-01 19:37         ` Alexander Wetzel
@ 2018-08-01 19:47           ` Ben Greear
  2018-08-02 19:33             ` Alexander Wetzel
  0 siblings, 1 reply; 10+ messages in thread
From: Ben Greear @ 2018-08-01 19:47 UTC (permalink / raw)
  To: Alexander Wetzel, johannes; +Cc: linux-wireless, s.gottschall, denkenz

On 08/01/2018 12:37 PM, Alexander Wetzel wrote:
> Hello,
>
>>>
>>>>> Rekeying a pairwise key with encryption offload and only keyid 0 had
>>>>> multiple races. Two of them could freeze the wlan connection till
>>>>> rekeyed again and the third could send out packets in clear which
>>>>> should
>>>>> have been encrypted.
>>>>>
>>>>>  1) Freeze caused by incoming packets:
>>>>>     If the local STA installed the key prior to the remote STA we still
>>>>>     had the old key active in the hardware while mac80211 was already
>>>>>     operating on the new key.
>>>>>     The card could still hand over packets decoded with the old key to
>>>>>     mac80211, bumping the new PN (IV) value to an incorrect high number
>>>>>     and tricking the local replay detection to later drop all packets
>>>>>     really sent with the new key.
>>>>>
>>>>>  2) Freeze caused by outgoing packets:
>>>>>     If mac80211 was providing the PN (IV) and handed over a cleartext
>>>>>     packets for encryption to the hardware prior to a key change the
>>>>>     driver/card could have processed the queued packets after switching
>>>>>     to the new key.
>>>>>     This immediately bumped the PN (IV) value on the remote STA to
>>>>>     an incorrect high number, which also froze the connection.
>>>>>
>>>>>  3) Clear text leak:
>>>>>     Removing encryption offload from the card cleared the encryption
>>>>>     offload flag only after the card had removed the key. Packets
>>>>> handed
>>>>>     over between that were send out unencrypted.
>>>>>
>>>>> To prevent those issues we now stop queuing packets to the driver while
>>>>> replacing the key, replace the key first in the HW and stop/block new
>>>>> aggregation sessions during the rekey.
>>>>
>>>> I guess the replace_key logic will have to flush the tx
>>>> hardware/firmware
>>>> queues and any other packets currently owned by the driver before it can
>>>> replace the key?
>>>
>>> That is one of the options, but looks like we can also avoid it with
>>> some efforts. Question will be what's the best approach per driver and
>>> how complex we want the code to become...
>>> My current ath9k fix depends on flush - since it's simple - but it
>>> definitely looks like it can be done without. (It looks like the
>>> userspace updates are more urgent - assuming we can even agree on this
>>> overall solution - and I've not done anything on it for weeks.)
>>>
>>> There are two ways to establish the boundary we need here:
>>> 1) Make sure all old packets have been send/dropped or 2) continue to
>>> use the old key for them.
>>>
>>> Iwlwifi e.g. is handing over the key with the packet to the HW and the
>>> driver simply uses the key it was told per packet. Implementing
>>> replace_key is trivial, we can just call the existing set_key function
>>> for deletion and addition from replace_key if we want. (I've tested that
>>> quite extensive and it works flawless with my dvm card.)
>>>
>>> Drivers uploading the keys to the HW and then just tell the HW to use
>>> key ID X for encryption - like ath9, the only other driver I drilled
>>> deeper so far - are more tricky. But replace_key is still allowed to
>>> change the HW key ID for the new key if desired by the driver.
>>>
>>> The driver can therefore only "deprecate" a key but keep it programmed
>>> in HW, assign and program a new key and return to mac80211.
>>> Any queued packets will still lookup and use the old key while new
>>> packets will tell the HW to use the new one. Problem solved, if we
>>> ignore the headache when and how to really free the old key id and
>>> remove it from the HW. In the worst case we wait the max time any packet
>>> can be stuck in the queue. Of course this may cause other issues in at
>>> least some special cases. The one I can think of would be:
>>>
>>> A busy AP with many clients connected and rekey enabled gets rebooted
>>> during work hours. All clients reconnect at basically the same time.
>>> Thus all of those will also rekey at nearly the same time and therefore
>>> all need two PTK key slots in the HW for some seconds, potentially
>>> exceeding the available ones. So you may get some strange effects after
>>> the rekey interval expired, long after the reboot...
>>
>> The ath10k firmware already sort of handles stale keys as far as I can
>> tell,
>> and early on in my testing, which often has this issue of all
>> stations rekeying at once, I was running out of firmware key objects.  I
>> increased the multiplier in the driver and made the firmware smarter about
>> recycling key objects, and that fixed the problem for me.
>>
>> So, maybe this would just work with ath10k, but if there were any issues,
>> probably you'd have to fall back to flushing everything.  And, that might
>> be a way to work somewhat generically with drivers that don't have special
>> replace-key logic?
>
> The V2 version of the patch called ieee80211_flush_queues
> unconditionally as kind of safety net to increase the odds of the driver
> to not send out a wrong packet. Argument against that was, that not all
> drivers are implementing flush and even when they are we have no
> guarantee that the packet was really flushed from the card and not only
> the driver. The current patch is now expecting the userspace to never
> rekey a connection when the driver is not supporting. If it does anyhow
> we are back in uncharted area, only slightly better than current stable
> kernels. Of course we can handle that like in the V2 version of the
> patch. I kind of like the idea...Is that something I shall add in the
> next version?

I think if mac80211 will stop sending frames, then a driver *should* be able
to implement flush one way or another.  But, having the driver itself
try to flush in a key_replace() method while the rest of the system might
still be sending it frames is probably going to be more complicated.

You could handle it on a per-driver basis by having the driver somehow notify
your logic whether a flush is desired or not?

>> Someone could also tweak supplicant to somewhat randomize the rekey timers,
>> but that would take a while to propagate to all of the station devices out
>> there.
>
> It may not even be a realistic problem. Just trying to think of worst
> cases:-) The reconnections will for sure be spread a few ms, at least.
> What is the longest time we have to keep the old key programmed in HW?

If you are trying to transmit a frame with the wrong key, probably it will
be retransmitted a lot since the receiver should dispose of it, so I guess
a few seconds max?

ath10k firmware will only dispose of old keys once all packets referencing
it are removed.

Either way, this is a fixable problem, and a flush should certainly fix it
if there is no better way.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks
  2018-08-01 19:47           ` Ben Greear
@ 2018-08-02 19:33             ` Alexander Wetzel
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Wetzel @ 2018-08-02 19:33 UTC (permalink / raw)
  To: Ben Greear, johannes; +Cc: linux-wireless, s.gottschall, denkenz

>>>>>> Rekeying a pairwise key with encryption offload and only keyid 0 had
>>>>>> multiple races. Two of them could freeze the wlan connection till
>>>>>> rekeyed again and the third could send out packets in clear which
>>>>>> should
>>>>>> have been encrypted.
>>>>>>
>>>>>>  1) Freeze caused by incoming packets:
>>>>>>     If the local STA installed the key prior to the remote STA we
>>>>>> still
>>>>>>     had the old key active in the hardware while mac80211 was already
>>>>>>     operating on the new key.
>>>>>>     The card could still hand over packets decoded with the old
>>>>>> key to
>>>>>>     mac80211, bumping the new PN (IV) value to an incorrect high
>>>>>> number
>>>>>>     and tricking the local replay detection to later drop all packets
>>>>>>     really sent with the new key.
>>>>>>
>>>>>>  2) Freeze caused by outgoing packets:
>>>>>>     If mac80211 was providing the PN (IV) and handed over a cleartext
>>>>>>     packets for encryption to the hardware prior to a key change the
>>>>>>     driver/card could have processed the queued packets after
>>>>>> switching
>>>>>>     to the new key.
>>>>>>     This immediately bumped the PN (IV) value on the remote STA to
>>>>>>     an incorrect high number, which also froze the connection.
>>>>>>
>>>>>>  3) Clear text leak:
>>>>>>     Removing encryption offload from the card cleared the encryption
>>>>>>     offload flag only after the card had removed the key. Packets
>>>>>> handed
>>>>>>     over between that were send out unencrypted.
>>>>>>
>>>>>> To prevent those issues we now stop queuing packets to the driver
>>>>>> while
>>>>>> replacing the key, replace the key first in the HW and stop/block new
>>>>>> aggregation sessions during the rekey.
>>>>>
>>>>> I guess the replace_key logic will have to flush the tx
>>>>> hardware/firmware
>>>>> queues and any other packets currently owned by the driver before
>>>>> it can
>>>>> replace the key?
>>>>
>>>> That is one of the options, but looks like we can also avoid it with
>>>> some efforts. Question will be what's the best approach per driver and
>>>> how complex we want the code to become...
>>>> My current ath9k fix depends on flush - since it's simple - but it
>>>> definitely looks like it can be done without. (It looks like the
>>>> userspace updates are more urgent - assuming we can even agree on this
>>>> overall solution - and I've not done anything on it for weeks.)
>>>>
>>>> There are two ways to establish the boundary we need here:
>>>> 1) Make sure all old packets have been send/dropped or 2) continue to
>>>> use the old key for them.
>>>>
>>>> Iwlwifi e.g. is handing over the key with the packet to the HW and the
>>>> driver simply uses the key it was told per packet. Implementing
>>>> replace_key is trivial, we can just call the existing set_key function
>>>> for deletion and addition from replace_key if we want. (I've tested
>>>> that
>>>> quite extensive and it works flawless with my dvm card.)
>>>>
>>>> Drivers uploading the keys to the HW and then just tell the HW to use
>>>> key ID X for encryption - like ath9, the only other driver I drilled
>>>> deeper so far - are more tricky. But replace_key is still allowed to
>>>> change the HW key ID for the new key if desired by the driver.
>>>>
>>>> The driver can therefore only "deprecate" a key but keep it programmed
>>>> in HW, assign and program a new key and return to mac80211.
>>>> Any queued packets will still lookup and use the old key while new
>>>> packets will tell the HW to use the new one. Problem solved, if we
>>>> ignore the headache when and how to really free the old key id and
>>>> remove it from the HW. In the worst case we wait the max time any
>>>> packet
>>>> can be stuck in the queue. Of course this may cause other issues in at
>>>> least some special cases. The one I can think of would be:
>>>>
>>>> A busy AP with many clients connected and rekey enabled gets rebooted
>>>> during work hours. All clients reconnect at basically the same time.
>>>> Thus all of those will also rekey at nearly the same time and therefore
>>>> all need two PTK key slots in the HW for some seconds, potentially
>>>> exceeding the available ones. So you may get some strange effects after
>>>> the rekey interval expired, long after the reboot...
>>>
>>> The ath10k firmware already sort of handles stale keys as far as I can
>>> tell,
>>> and early on in my testing, which often has this issue of all
>>> stations rekeying at once, I was running out of firmware key objects.  I
>>> increased the multiplier in the driver and made the firmware smarter
>>> about
>>> recycling key objects, and that fixed the problem for me.
>>>
>>> So, maybe this would just work with ath10k, but if there were any
>>> issues,
>>> probably you'd have to fall back to flushing everything.  And, that
>>> might
>>> be a way to work somewhat generically with drivers that don't have
>>> special
>>> replace-key logic?
>>
>> The V2 version of the patch called ieee80211_flush_queues
>> unconditionally as kind of safety net to increase the odds of the driver
>> to not send out a wrong packet. Argument against that was, that not all
>> drivers are implementing flush and even when they are we have no
>> guarantee that the packet was really flushed from the card and not only
>> the driver. The current patch is now expecting the userspace to never
>> rekey a connection when the driver is not supporting. If it does anyhow
>> we are back in uncharted area, only slightly better than current stable
>> kernels. Of course we can handle that like in the V2 version of the
>> patch. I kind of like the idea...Is that something I shall add in the
>> next version?
> 
> I think if mac80211 will stop sending frames, then a driver *should* be
> able
> to implement flush one way or another.  But, having the driver itself
> try to flush in a key_replace() method while the rest of the system might
> still be sending it frames is probably going to be more complicated.
> 
> You could handle it on a per-driver basis by having the driver somehow
> notify
> your logic whether a flush is desired or not?

Mac80211 will only drop frames for the specific key, traffic for other
STA or even unencrypted packets to the STA we are rekeying will still be
queued. (When mac80211 select the outgoing key it will drop the packet.)

When the driver wants to flush the queues we do not need to change the
API: The driver can simply call ieee80211_stop_queues, the driver flush
function and then ieee80211_wake_queues.

> 
>>> Someone could also tweak supplicant to somewhat randomize the rekey
>>> timers,
>>> but that would take a while to propagate to all of the station
>>> devices out
>>> there.
>>
>> It may not even be a realistic problem. Just trying to think of worst
>> cases:-) The reconnections will for sure be spread a few ms, at least.
>> What is the longest time we have to keep the old key programmed in HW?
> 
> If you are trying to transmit a frame with the wrong key, probably it will
> be retransmitted a lot since the receiver should dispose of it, so I guess
> a few seconds max?

For my understanding frames ack will work regardless if we can decrypt
the frame or not. I also never have seen excessive retransmits when
capturing the rekeys so I'm pretty sure here.

> 
> ath10k firmware will only dispose of old keys once all packets referencing
> it are removed.

That's sounds perfect. In fact I think it should even work correctly
without the new API with the V2 version of the patch and migrating to
replace_key will be simply chaining the two normal set_key calls in it
to signal compliance. I may give that a shot with my new ath10k AP at
the weekend. I wanted to check on ath10k for weeks anyhow...

> 
> Either way, this is a fixable problem, and a flush should certainly fix it
> if there is no better way.

Alexander

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-08-02 21:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-31 20:10 [PATCH v4 0/3] Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
2018-07-31 20:10 ` [PATCH v4 1/3] nl80211: Add ATOMIC_KEY_REPLACE API Alexander Wetzel
2018-07-31 20:10 ` [PATCH v4 2/3] mac80211: Define new driver callback replace_key Alexander Wetzel
2018-07-31 20:10 ` [PATCH v4 3/3] mac80211: Fix PTK rekey freezes and cleartext leaks Alexander Wetzel
2018-07-31 21:36   ` Ben Greear
2018-08-01 17:01     ` Alexander Wetzel
2018-08-01 17:21       ` Ben Greear
2018-08-01 19:37         ` Alexander Wetzel
2018-08-01 19:47           ` Ben Greear
2018-08-02 19:33             ` Alexander Wetzel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.