linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k
@ 2019-09-23  7:19 Yibo Zhao
  2019-09-23  7:19 ` [PATCH V3 1/4] mac80211: Switch to a virtual time-based airtime scheduler Yibo Zhao
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Yibo Zhao @ 2019-09-23  7:19 UTC (permalink / raw)
  To: linux-wireless; +Cc: ath10k, Yibo Zhao

This series fix some issues when enabling virtual time-based airtime scheduler on ath10k.

Changes since v2:
  Changes station airtime weight to be per-AC based to avoid sync issue
  Remove Co-developed-by and Toke's sign-off as Toke suggested

Changes since v1:
  Modify the author of Co-developed-by as Johannes suggested

Toke Høiland-Jørgensen (1):
  mac80211: Switch to a virtual time-based airtime scheduler

Yibo Zhao (3):
  mac80211: defer txqs removal from rbtree
  mac80211: fix low throughput in multi-clients situation
  mac80211: Sync airtime weight sum with per AC synced sta airtime
    weight together

 include/net/mac80211.h     |  16 ++-
 net/mac80211/cfg.c         |  29 ++++-
 net/mac80211/debugfs.c     |  48 +++++++-
 net/mac80211/debugfs_sta.c |  18 +--
 net/mac80211/ieee80211_i.h |  17 ++-
 net/mac80211/main.c        |   8 +-
 net/mac80211/sta_info.c    |  25 +++-
 net/mac80211/sta_info.h    |   8 +-
 net/mac80211/tx.c          | 282 +++++++++++++++++++++++++++++++++------------
 9 files changed, 347 insertions(+), 104 deletions(-)

-- 
1.9.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V3 1/4] mac80211: Switch to a virtual time-based airtime scheduler
  2019-09-23  7:19 [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Yibo Zhao
@ 2019-09-23  7:19 ` Yibo Zhao
  2019-09-23  7:19 ` [PATCH V3 2/4] mac80211: defer txqs removal from rbtree Yibo Zhao
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Yibo Zhao @ 2019-09-23  7:19 UTC (permalink / raw)
  To: linux-wireless; +Cc: ath10k, Toke Høiland-Jørgensen

From: Toke Høiland-Jørgensen <toke@redhat.com>

This switches the airtime scheduler in mac80211 to use a virtual time-based
scheduler instead of the round-robin scheduler used before. This has a
couple of advantages:

- No need to sync up the round-robin scheduler in firmware/hardware with
  the round-robin airtime scheduler.

- If several stations are eligible for transmission we can schedule both of
  them; no need to hard-block the scheduling rotation until the head of the
  queue has used up its quantum.

- The check of whether a station is eligible for transmission becomes
  simpler (in ieee80211_txq_may_transmit()).

The drawback is that scheduling becomes slightly more expensive, as we need
to maintain an rbtree of TXQs sorted by virtual time. This means that
ieee80211_register_airtime() becomes O(logN) in the number of currently
scheduled TXQs. However, hopefully this number rarely grows too big (it's
only TXQs currently backlogged, not all associated stations), so it
shouldn't be too big of an issue.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 net/mac80211/debugfs.c     |  48 +++++++++-
 net/mac80211/debugfs_sta.c |  16 ++--
 net/mac80211/ieee80211_i.h |  14 ++-
 net/mac80211/main.c        |   2 +-
 net/mac80211/sta_info.c    |  19 +++-
 net/mac80211/sta_info.h    |   3 +-
 net/mac80211/tx.c          | 217 +++++++++++++++++++++++++++++----------------
 7 files changed, 223 insertions(+), 96 deletions(-)

diff --git a/net/mac80211/debugfs.c b/net/mac80211/debugfs.c
index 2d43bc1..4847168 100644
--- a/net/mac80211/debugfs.c
+++ b/net/mac80211/debugfs.c
@@ -150,6 +150,46 @@ static ssize_t aqm_write(struct file *file,
 	.llseek = default_llseek,
 };
 
+static ssize_t airtime_read(struct file *file,
+			    char __user *user_buf,
+			    size_t count,
+			    loff_t *ppos)
+{
+	struct ieee80211_local *local = file->private_data;
+	char buf[200];
+	u64 v_t[IEEE80211_NUM_ACS];
+	u64 wt[IEEE80211_NUM_ACS];
+	int len = 0, ac;
+
+	for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
+		spin_lock_bh(&local->active_txq_lock[ac]);
+		v_t[ac] = local->airtime_v_t[ac];
+		wt[ac] = local->airtime_weight_sum[ac];
+		spin_unlock_bh(&local->active_txq_lock[ac]);
+	}
+	len = scnprintf(buf, sizeof(buf),
+			"\tVO         VI         BE         BK\n"
+			"Virt-t\t%-10llu %-10llu %-10llu %-10llu\n"
+			"Weight\t%-10llu %-10llu %-10llu %-10llu\n",
+			v_t[0],
+			v_t[1],
+			v_t[2],
+			v_t[3],
+			wt[0],
+			wt[1],
+			wt[2],
+			wt[3]);
+
+	return simple_read_from_buffer(user_buf, count, ppos,
+				       buf, len);
+}
+
+static const struct file_operations airtime_ops = {
+	.read = airtime_read,
+	.open = simple_open,
+	.llseek = default_llseek,
+};
+
 #ifdef CONFIG_PM
 static ssize_t reset_write(struct file *file, const char __user *user_buf,
 			   size_t count, loff_t *ppos)
@@ -386,8 +426,12 @@ void debugfs_hw_add(struct ieee80211_local *local)
 	if (local->ops->wake_tx_queue)
 		DEBUGFS_ADD_MODE(aqm, 0600);
 
-	debugfs_create_u16("airtime_flags", 0600,
-			   phyd, &local->airtime_flags);
+	if (wiphy_ext_feature_isset(local->hw.wiphy,
+				    NL80211_EXT_FEATURE_AIRTIME_FAIRNESS)) {
+		DEBUGFS_ADD_MODE(airtime, 0600);
+		debugfs_create_u16("airtime_flags", 0600,
+				   phyd, &local->airtime_flags);
+	}
 
 	statsd = debugfs_create_dir("statistics", phyd);
 
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index 3aa618d..80028da 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -203,7 +203,7 @@ static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
 	size_t bufsz = 200;
 	char *buf = kzalloc(bufsz, GFP_KERNEL), *p = buf;
 	u64 rx_airtime = 0, tx_airtime = 0;
-	s64 deficit[IEEE80211_NUM_ACS];
+	u64 v_t[IEEE80211_NUM_ACS];
 	ssize_t rv;
 	int ac;
 
@@ -214,20 +214,20 @@ static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
 		spin_lock_bh(&local->active_txq_lock[ac]);
 		rx_airtime += sta->airtime[ac].rx_airtime;
 		tx_airtime += sta->airtime[ac].tx_airtime;
-		deficit[ac] = sta->airtime[ac].deficit;
+		v_t[ac] = sta->airtime[ac].v_t;
 		spin_unlock_bh(&local->active_txq_lock[ac]);
 	}
 
 	p += scnprintf(p, bufsz + buf - p,
 		"RX: %llu us\nTX: %llu us\nWeight: %u\n"
-		"Deficit: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
+		"Virt-T: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
 		rx_airtime,
 		tx_airtime,
 		sta->airtime_weight,
-		deficit[0],
-		deficit[1],
-		deficit[2],
-		deficit[3]);
+		v_t[0],
+		v_t[1],
+		v_t[2],
+		v_t[3]);
 
 	rv = simple_read_from_buffer(userbuf, count, ppos, buf, p - buf);
 	kfree(buf);
@@ -245,7 +245,7 @@ static ssize_t sta_airtime_write(struct file *file, const char __user *userbuf,
 		spin_lock_bh(&local->active_txq_lock[ac]);
 		sta->airtime[ac].rx_airtime = 0;
 		sta->airtime[ac].tx_airtime = 0;
-		sta->airtime[ac].deficit = sta->airtime_weight;
+		sta->airtime[ac].v_t = 0;
 		spin_unlock_bh(&local->active_txq_lock[ac]);
 	}
 
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index e170f98..a4556f9 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -846,8 +846,7 @@ struct txq_info {
 	struct codel_vars def_cvars;
 	struct codel_stats cstats;
 	struct sk_buff_head frags;
-	struct list_head schedule_order;
-	u16 schedule_round;
+	struct rb_node schedule_order;
 	unsigned long flags;
 
 	/* keep last! */
@@ -1141,8 +1140,10 @@ struct ieee80211_local {
 
 	/* protects active_txqs and txqi->schedule_order */
 	spinlock_t active_txq_lock[IEEE80211_NUM_ACS];
-	struct list_head active_txqs[IEEE80211_NUM_ACS];
-	u16 schedule_round[IEEE80211_NUM_ACS];
+	struct rb_root_cached active_txqs[IEEE80211_NUM_ACS];
+	struct rb_node *schedule_pos[IEEE80211_NUM_ACS];
+	u64 airtime_v_t[IEEE80211_NUM_ACS];
+	u64 airtime_weight_sum[IEEE80211_NUM_ACS];
 
 	u16 airtime_flags;
 
@@ -1779,6 +1780,11 @@ int ieee80211_tx_control_port(struct wiphy *wiphy, struct net_device *dev,
 			      const u8 *buf, size_t len,
 			      const u8 *dest, __be16 proto, bool unencrypted);
 
+void ieee80211_resort_txq(struct ieee80211_hw *hw,
+			  struct ieee80211_txq *txq);
+void ieee80211_unschedule_txq(struct ieee80211_hw *hw,
+			      struct ieee80211_txq *txq);
+
 /* HT */
 void ieee80211_apply_htcap_overrides(struct ieee80211_sub_if_data *sdata,
 				     struct ieee80211_sta_ht_cap *ht_cap);
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 5055aeb..e9ffa8e 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -666,7 +666,7 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 	spin_lock_init(&local->queue_stop_reason_lock);
 
 	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
-		INIT_LIST_HEAD(&local->active_txqs[i]);
+		local->active_txqs[i] = RB_ROOT_CACHED;
 		spin_lock_init(&local->active_txq_lock[i]);
 	}
 	local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 11f0589..9d01fdd 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -389,7 +389,6 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
 	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
 		skb_queue_head_init(&sta->ps_tx_buf[i]);
 		skb_queue_head_init(&sta->tx_filtered[i]);
-		sta->airtime[i].deficit = sta->airtime_weight;
 	}
 
 	for (i = 0; i < IEEE80211_NUM_TIDS; i++)
@@ -1831,18 +1830,32 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
 {
 	struct sta_info *sta = container_of(pubsta, struct sta_info, sta);
 	struct ieee80211_local *local = sta->sdata->local;
+	struct ieee80211_txq *txq = sta->sta.txq[tid];
 	u8 ac = ieee80211_ac_from_tid(tid);
-	u32 airtime = 0;
+	u64 airtime = 0, weight_sum;
+
+	if (!txq)
+		return;
 
 	if (sta->local->airtime_flags & AIRTIME_USE_TX)
 		airtime += tx_airtime;
 	if (sta->local->airtime_flags & AIRTIME_USE_RX)
 		airtime += rx_airtime;
 
+	/* Weights scale so the unit weight is 256 */
+	airtime <<= 8;
+
 	spin_lock_bh(&local->active_txq_lock[ac]);
+
 	sta->airtime[ac].tx_airtime += tx_airtime;
 	sta->airtime[ac].rx_airtime += rx_airtime;
-	sta->airtime[ac].deficit -= airtime;
+
+	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight;
+
+	local->airtime_v_t[ac] += airtime / weight_sum;
+	sta->airtime[ac].v_t += airtime / sta->airtime_weight;
+	ieee80211_resort_txq(&local->hw, txq);
+
 	spin_unlock_bh(&local->active_txq_lock[ac]);
 }
 EXPORT_SYMBOL(ieee80211_sta_register_airtime);
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 71f7e49..5c1cac9 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -130,11 +130,12 @@ enum ieee80211_agg_stop_reason {
 /* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
 #define AIRTIME_USE_TX		BIT(0)
 #define AIRTIME_USE_RX		BIT(1)
+#define AIRTIME_GRACE 500 /* usec of grace period before reset */
 
 struct airtime_info {
 	u64 rx_airtime;
 	u64 tx_airtime;
-	s64 deficit;
+	u64 v_t;
 };
 
 struct sta_info;
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 61c7ea9..d00baaa 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1449,7 +1449,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
 	codel_vars_init(&txqi->def_cvars);
 	codel_stats_init(&txqi->cstats);
 	__skb_queue_head_init(&txqi->frags);
-	INIT_LIST_HEAD(&txqi->schedule_order);
+	RB_CLEAR_NODE(&txqi->schedule_order);
 
 	txqi->txq.vif = &sdata->vif;
 
@@ -1493,9 +1493,7 @@ void ieee80211_txq_purge(struct ieee80211_local *local,
 	ieee80211_purge_tx_queue(&local->hw, &txqi->frags);
 	spin_unlock_bh(&fq->lock);
 
-	spin_lock_bh(&local->active_txq_lock[txqi->txq.ac]);
-	list_del_init(&txqi->schedule_order);
-	spin_unlock_bh(&local->active_txq_lock[txqi->txq.ac]);
+	ieee80211_unschedule_txq(&local->hw, &txqi->txq);
 }
 
 void ieee80211_txq_set_params(struct ieee80211_local *local)
@@ -3640,126 +3638,191 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
 struct ieee80211_txq *ieee80211_next_txq(struct ieee80211_hw *hw, u8 ac)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
+	struct rb_node *node = local->schedule_pos[ac];
 	struct txq_info *txqi = NULL;
+	bool first = false;
 
 	lockdep_assert_held(&local->active_txq_lock[ac]);
 
- begin:
-	txqi = list_first_entry_or_null(&local->active_txqs[ac],
-					struct txq_info,
-					schedule_order);
-	if (!txqi)
+	if (!node) {
+		node = rb_first_cached(&local->active_txqs[ac]);
+		first = true;
+	} else {
+		node = rb_next(node);
+	}
+
+	if (!node)
 		return NULL;
 
+	txqi = container_of(node, struct txq_info, schedule_order);
+
 	if (txqi->txq.sta) {
 		struct sta_info *sta = container_of(txqi->txq.sta,
 						struct sta_info, sta);
 
-		if (sta->airtime[txqi->txq.ac].deficit < 0) {
-			sta->airtime[txqi->txq.ac].deficit +=
-				sta->airtime_weight;
-			list_move_tail(&txqi->schedule_order,
-				       &local->active_txqs[txqi->txq.ac]);
-			goto begin;
+		if (sta->airtime[ac].v_t > local->airtime_v_t[ac]) {
+			if (first)
+				local->airtime_v_t[ac] = sta->airtime[ac].v_t;
+			else
+				return NULL;
 		}
 	}
 
 
-	if (txqi->schedule_round == local->schedule_round[ac])
-		return NULL;
-
-	list_del_init(&txqi->schedule_order);
-	txqi->schedule_round = local->schedule_round[ac];
+	local->schedule_pos[ac] = node;
 	return &txqi->txq;
 }
 EXPORT_SYMBOL(ieee80211_next_txq);
 
-void ieee80211_return_txq(struct ieee80211_hw *hw,
+static void __ieee80211_insert_txq(struct rb_root_cached *root,
+				   struct txq_info *txqi, u8 ac)
+{
+	struct rb_node **new = &root->rb_root.rb_node;
+	struct rb_node *parent = NULL;
+	struct txq_info *__txqi;
+	bool leftmost = true;
+
+	while (*new) {
+		parent = *new;
+		__txqi = rb_entry(parent, struct txq_info, schedule_order);
+
+		if (!txqi->txq.sta) {
+			/* new txqi has no sta - insert to the left */
+			new = &parent->rb_left;
+		} else if (!__txqi->txq.sta) {
+			/* existing txqi has no sta - insert to the right */
+			new = &parent->rb_right;
+			leftmost = false;
+		} else {
+			struct sta_info *old_sta = container_of(__txqi->txq.sta,
+								struct sta_info,
+								sta);
+			struct sta_info *new_sta = container_of(txqi->txq.sta,
+								struct sta_info,
+								sta);
+
+			if (new_sta->airtime[ac].v_t <= old_sta->airtime[ac].v_t) {
+				new = &parent->rb_left;
+			} else {
+				new = &parent->rb_right;
+				leftmost = false;
+			}
+		}
+	}
+
+	rb_link_node(&txqi->schedule_order, parent, new);
+	rb_insert_color_cached(&txqi->schedule_order, root, leftmost);
+}
+
+void ieee80211_schedule_txq(struct ieee80211_hw *hw,
+			    struct ieee80211_txq *txq)
+	__acquires(txq_lock) __releases(txq_lock)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = to_txq_info(txq);
+	u8 ac = txq->ac;
+
+	spin_lock_bh(&local->active_txq_lock[ac]);
+
+	if (!RB_EMPTY_NODE(&txqi->schedule_order))
+		goto out;
+
+	if (txq->sta) {
+		struct sta_info *sta = container_of(txq->sta,
+						    struct sta_info, sta);
+
+		local->airtime_weight_sum[ac] += sta->airtime_weight;
+		if (local->airtime_v_t[ac] > AIRTIME_GRACE)
+			sta->airtime[ac].v_t = max(local->airtime_v_t[ac] - AIRTIME_GRACE,
+						   sta->airtime[ac].v_t);
+	}
+
+	__ieee80211_insert_txq(&local->active_txqs[ac], txqi, ac);
+
+ out:
+	spin_unlock_bh(&local->active_txq_lock[ac]);
+}
+EXPORT_SYMBOL(ieee80211_schedule_txq);
+
+void ieee80211_resort_txq(struct ieee80211_hw *hw,
 			  struct ieee80211_txq *txq)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
 	struct txq_info *txqi = to_txq_info(txq);
+	u8 ac = txq->ac;
+
+	if (!RB_EMPTY_NODE(&txqi->schedule_order)) {
+		rb_erase_cached(&txqi->schedule_order,
+				&local->active_txqs[ac]);
+		RB_CLEAR_NODE(&txqi->schedule_order);
+		__ieee80211_insert_txq(&local->active_txqs[ac], txqi, ac);
+	}
+}
+
+static void __ieee80211_unschedule_txq(struct ieee80211_hw *hw,
+				       struct ieee80211_txq *txq)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = to_txq_info(txq);
+	u8 ac = txq->ac;
 
 	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
 
-	if (list_empty(&txqi->schedule_order) &&
-	    (!skb_queue_empty(&txqi->frags) || txqi->tin.backlog_packets)) {
-		/* If airtime accounting is active, always enqueue STAs at the
-		 * head of the list to ensure that they only get moved to the
-		 * back by the airtime DRR scheduler once they have a negative
-		 * deficit. A station that already has a negative deficit will
-		 * get immediately moved to the back of the list on the next
-		 * call to ieee80211_next_txq().
-		 */
-		if (txqi->txq.sta &&
-		    wiphy_ext_feature_isset(local->hw.wiphy,
-					    NL80211_EXT_FEATURE_AIRTIME_FAIRNESS))
-			list_add(&txqi->schedule_order,
-				 &local->active_txqs[txq->ac]);
-		else
-			list_add_tail(&txqi->schedule_order,
-				      &local->active_txqs[txq->ac]);
+	if (RB_EMPTY_NODE(&txqi->schedule_order))
+		return;
+
+	if (txq->sta) {
+		struct sta_info *sta = container_of(txq->sta,
+						    struct sta_info, sta);
+
+		local->airtime_weight_sum[ac] -= sta->airtime_weight;
 	}
+
+	rb_erase_cached(&txqi->schedule_order,
+			&local->active_txqs[txq->ac]);
+	RB_CLEAR_NODE(&txqi->schedule_order);
 }
-EXPORT_SYMBOL(ieee80211_return_txq);
 
-void ieee80211_schedule_txq(struct ieee80211_hw *hw,
-			    struct ieee80211_txq *txq)
+void ieee80211_unschedule_txq(struct ieee80211_hw *hw,
+			      struct ieee80211_txq *txq)
 	__acquires(txq_lock) __releases(txq_lock)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
 
 	spin_lock_bh(&local->active_txq_lock[txq->ac]);
-	ieee80211_return_txq(hw, txq);
+	__ieee80211_unschedule_txq(hw, txq);
 	spin_unlock_bh(&local->active_txq_lock[txq->ac]);
 }
-EXPORT_SYMBOL(ieee80211_schedule_txq);
+
+void ieee80211_return_txq(struct ieee80211_hw *hw,
+			  struct ieee80211_txq *txq)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = to_txq_info(txq);
+
+	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
+
+	if (!RB_EMPTY_NODE(&txqi->schedule_order) &&
+	    (skb_queue_empty(&txqi->frags) && !txqi->tin.backlog_packets))
+		__ieee80211_unschedule_txq(hw, txq);
+}
+EXPORT_SYMBOL(ieee80211_return_txq);
 
 bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
 				struct ieee80211_txq *txq)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
-	struct txq_info *iter, *tmp, *txqi = to_txq_info(txq);
+	struct txq_info *txqi = to_txq_info(txq);
 	struct sta_info *sta;
 	u8 ac = txq->ac;
 
 	lockdep_assert_held(&local->active_txq_lock[ac]);
 
 	if (!txqi->txq.sta)
-		goto out;
-
-	if (list_empty(&txqi->schedule_order))
-		goto out;
-
-	list_for_each_entry_safe(iter, tmp, &local->active_txqs[ac],
-				 schedule_order) {
-		if (iter == txqi)
-			break;
-
-		if (!iter->txq.sta) {
-			list_move_tail(&iter->schedule_order,
-				       &local->active_txqs[ac]);
-			continue;
-		}
-		sta = container_of(iter->txq.sta, struct sta_info, sta);
-		if (sta->airtime[ac].deficit < 0)
-			sta->airtime[ac].deficit += sta->airtime_weight;
-		list_move_tail(&iter->schedule_order, &local->active_txqs[ac]);
-	}
+		return true;
 
 	sta = container_of(txqi->txq.sta, struct sta_info, sta);
-	if (sta->airtime[ac].deficit >= 0)
-		goto out;
-
-	sta->airtime[ac].deficit += sta->airtime_weight;
-	list_move_tail(&txqi->schedule_order, &local->active_txqs[ac]);
-
-	return false;
-out:
-	if (!list_empty(&txqi->schedule_order))
-		list_del_init(&txqi->schedule_order);
-
-	return true;
+	return (sta->airtime[ac].v_t <= local->airtime_v_t[ac]);
 }
 EXPORT_SYMBOL(ieee80211_txq_may_transmit);
 
@@ -3769,7 +3832,6 @@ void ieee80211_txq_schedule_start(struct ieee80211_hw *hw, u8 ac)
 	struct ieee80211_local *local = hw_to_local(hw);
 
 	spin_lock_bh(&local->active_txq_lock[ac]);
-	local->schedule_round[ac]++;
 }
 EXPORT_SYMBOL(ieee80211_txq_schedule_start);
 
@@ -3778,6 +3840,7 @@ void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
 
+	local->schedule_pos[ac] = NULL;
 	spin_unlock_bh(&local->active_txq_lock[ac]);
 }
 EXPORT_SYMBOL(ieee80211_txq_schedule_end);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 2/4] mac80211: defer txqs removal from rbtree
  2019-09-23  7:19 [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Yibo Zhao
  2019-09-23  7:19 ` [PATCH V3 1/4] mac80211: Switch to a virtual time-based airtime scheduler Yibo Zhao
@ 2019-09-23  7:19 ` Yibo Zhao
  2019-09-23 10:56   ` Toke Høiland-Jørgensen
  2019-09-23  7:20 ` [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation Yibo Zhao
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 17+ messages in thread
From: Yibo Zhao @ 2019-09-23  7:19 UTC (permalink / raw)
  To: linux-wireless; +Cc: ath10k, Yibo Zhao

In a loop txqs dequeue scenario, if the first txq in the rbtree gets
removed from rbtree immediately in the ieee80211_return_txq(), the
loop will break soon in the ieee80211_next_txq() due to schedule_pos
not leading to the second txq in the rbtree. Thus, defering the
removal right before the end of this schedule round.

Signed-off-by: Yibo Zhao <yiboz@codeaurora.org>
---
 include/net/mac80211.h     | 16 ++++++++++--
 net/mac80211/ieee80211_i.h |  3 +++
 net/mac80211/main.c        |  6 +++++
 net/mac80211/tx.c          | 63 +++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index ac2ed8e..ba5a345 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -925,6 +925,8 @@ struct ieee80211_tx_rate {
 
 #define IEEE80211_MAX_TX_RETRY		31
 
+#define IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS 100
+
 static inline void ieee80211_rate_set_vht(struct ieee80211_tx_rate *rate,
 					  u8 mcs, u8 nss)
 {
@@ -6232,7 +6234,8 @@ struct sk_buff *ieee80211_tx_dequeue(struct ieee80211_hw *hw,
  * @ac: AC number to return packets from.
  *
  * Should only be called between calls to ieee80211_txq_schedule_start()
- * and ieee80211_txq_schedule_end().
+ * and ieee80211_txq_schedule_end(). If the txq is empty, it will be added
+ * to a remove list and get removed later.
  * Returns the next txq if successful, %NULL if no queue is eligible. If a txq
  * is returned, it should be returned with ieee80211_return_txq() after the
  * driver has finished scheduling it.
@@ -6268,7 +6271,8 @@ void ieee80211_txq_schedule_start(struct ieee80211_hw *hw, u8 ac)
  * @hw: pointer as obtained from ieee80211_alloc_hw()
  * @ac: AC number to acquire locks for
  *
- * Release locks previously acquired by ieee80211_txq_schedule_end().
+ * Release locks previously acquired by ieee80211_txq_schedule_end(). Check
+ * and remove the empty txq from rb-tree.
  */
 void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
 	__releases(txq_lock);
@@ -6287,6 +6291,14 @@ void ieee80211_schedule_txq(struct ieee80211_hw *hw, struct ieee80211_txq *txq)
 	__acquires(txq_lock) __releases(txq_lock);
 
 /**
+ * ieee80211_txqs_check - Check txqs waiting for removal
+ *
+ * @tmr: pointer as obtained from local
+ *
+ */
+void ieee80211_txqs_check(struct timer_list *tmr);
+
+/**
  * ieee80211_txq_may_transmit - check whether TXQ is allowed to transmit
  *
  * This function is used to check whether given txq is allowed to transmit by
diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
index a4556f9..49aa143e 100644
--- a/net/mac80211/ieee80211_i.h
+++ b/net/mac80211/ieee80211_i.h
@@ -847,6 +847,7 @@ struct txq_info {
 	struct codel_stats cstats;
 	struct sk_buff_head frags;
 	struct rb_node schedule_order;
+	struct list_head candidate;
 	unsigned long flags;
 
 	/* keep last! */
@@ -1145,6 +1146,8 @@ struct ieee80211_local {
 	u64 airtime_v_t[IEEE80211_NUM_ACS];
 	u64 airtime_weight_sum[IEEE80211_NUM_ACS];
 
+	struct list_head remove_list[IEEE80211_NUM_ACS];
+	struct timer_list remove_timer;
 	u16 airtime_flags;
 
 	const struct ieee80211_ops *ops;
diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index e9ffa8e..78fe24a 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -667,10 +667,15 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 
 	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
 		local->active_txqs[i] = RB_ROOT_CACHED;
+		INIT_LIST_HEAD(&local->remove_list[i]);
 		spin_lock_init(&local->active_txq_lock[i]);
 	}
 	local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
 
+	timer_setup(&local->remove_timer, ieee80211_txqs_check, 0);
+	mod_timer(&local->remove_timer,
+		  jiffies + msecs_to_jiffies(IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS));
+
 	INIT_LIST_HEAD(&local->chanctx_list);
 	mutex_init(&local->chanctx_mtx);
 
@@ -1305,6 +1310,7 @@ void ieee80211_unregister_hw(struct ieee80211_hw *hw)
 	tasklet_kill(&local->tx_pending_tasklet);
 	tasklet_kill(&local->tasklet);
 
+	del_timer_sync(&local->remove_timer);
 #ifdef CONFIG_INET
 	unregister_inetaddr_notifier(&local->ifa_notifier);
 #endif
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index d00baaa..42ca010 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -1450,6 +1450,7 @@ void ieee80211_txq_init(struct ieee80211_sub_if_data *sdata,
 	codel_stats_init(&txqi->cstats);
 	__skb_queue_head_init(&txqi->frags);
 	RB_CLEAR_NODE(&txqi->schedule_order);
+	INIT_LIST_HEAD(&txqi->candidate);
 
 	txqi->txq.vif = &sdata->vif;
 
@@ -3724,6 +3725,9 @@ void ieee80211_schedule_txq(struct ieee80211_hw *hw,
 
 	spin_lock_bh(&local->active_txq_lock[ac]);
 
+	if (!list_empty(&txqi->candidate))
+		list_del_init(&txqi->candidate);
+
 	if (!RB_EMPTY_NODE(&txqi->schedule_order))
 		goto out;
 
@@ -3783,6 +3787,20 @@ static void __ieee80211_unschedule_txq(struct ieee80211_hw *hw,
 	RB_CLEAR_NODE(&txqi->schedule_order);
 }
 
+void ieee80211_remove_txq(struct ieee80211_hw *hw,
+			  struct ieee80211_txq *txq)
+{
+	struct ieee80211_local *local = hw_to_local(hw);
+	struct txq_info *txqi = to_txq_info(txq);
+
+	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
+
+	if (!RB_EMPTY_NODE(&txqi->schedule_order)) {
+		__ieee80211_unschedule_txq(hw, txq);
+		list_del_init(&txqi->candidate);
+	}
+}
+
 void ieee80211_unschedule_txq(struct ieee80211_hw *hw,
 			      struct ieee80211_txq *txq)
 	__acquires(txq_lock) __releases(txq_lock)
@@ -3790,7 +3808,7 @@ void ieee80211_unschedule_txq(struct ieee80211_hw *hw,
 	struct ieee80211_local *local = hw_to_local(hw);
 
 	spin_lock_bh(&local->active_txq_lock[txq->ac]);
-	__ieee80211_unschedule_txq(hw, txq);
+	ieee80211_remove_txq(hw, txq);
 	spin_unlock_bh(&local->active_txq_lock[txq->ac]);
 }
 
@@ -3803,11 +3821,48 @@ void ieee80211_return_txq(struct ieee80211_hw *hw,
 	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
 
 	if (!RB_EMPTY_NODE(&txqi->schedule_order) &&
-	    (skb_queue_empty(&txqi->frags) && !txqi->tin.backlog_packets))
-		__ieee80211_unschedule_txq(hw, txq);
+		!txq_has_queue(&txqi->txq) &&
+		list_empty(&txqi->candidate))
+		list_add_tail(&txqi->candidate, &local->remove_list[txq->ac]);
+
 }
 EXPORT_SYMBOL(ieee80211_return_txq);
 
+void __ieee80211_check_txqs(struct ieee80211_local *local, int ac)
+{
+	struct txq_info *iter, *tmp;
+	struct sta_info *sta;
+
+	lockdep_assert_held(&local->active_txq_lock[ac]);
+
+	list_for_each_entry_safe(iter, tmp, &local->remove_list[ac],
+				 candidate) {
+		sta = container_of(iter->txq.sta, struct sta_info, sta);
+
+		if (txq_has_queue(&iter->txq))
+			list_del_init(&iter->candidate);
+		else
+			ieee80211_remove_txq(&local->hw, &iter->txq);
+	}
+}
+
+void ieee80211_txqs_check(struct timer_list *t)
+{
+	struct ieee80211_local *local = from_timer(local, t, remove_timer);
+	struct txq_info *iter, *tmp;
+	struct sta_info *sta;
+	int ac;
+
+	for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
+		spin_lock_bh(&local->active_txq_lock[ac]);
+		__ieee80211_check_txqs(local, ac);
+		spin_unlock_bh(&local->active_txq_lock[ac]);
+	}
+
+	mod_timer(&local->remove_timer,
+		  jiffies + msecs_to_jiffies(IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS));
+}
+
 bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
 				struct ieee80211_txq *txq)
 {
@@ -3841,6 +3896,8 @@ void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
 	struct ieee80211_local *local = hw_to_local(hw);
 
 	local->schedule_pos[ac] = NULL;
+	__ieee80211_check_txqs(local, ac);
+
 	spin_unlock_bh(&local->active_txq_lock[ac]);
 }
 EXPORT_SYMBOL(ieee80211_txq_schedule_end);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation
  2019-09-23  7:19 [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Yibo Zhao
  2019-09-23  7:19 ` [PATCH V3 1/4] mac80211: Switch to a virtual time-based airtime scheduler Yibo Zhao
  2019-09-23  7:19 ` [PATCH V3 2/4] mac80211: defer txqs removal from rbtree Yibo Zhao
@ 2019-09-23  7:20 ` Yibo Zhao
  2019-09-23 10:55   ` Toke Høiland-Jørgensen
  2019-09-23  7:20 ` [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together Yibo Zhao
  2019-10-01 10:19 ` [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Johannes Berg
  4 siblings, 1 reply; 17+ messages in thread
From: Yibo Zhao @ 2019-09-23  7:20 UTC (permalink / raw)
  To: linux-wireless; +Cc: ath10k, Yibo Zhao

Not long after the start of multi-clients test, not a single station is
an eligible candidate for transmission since global virtual time(g_vt) is
smaller than the virtual airtime(s_vt) of all the stations. As a result,
the Tx has been blocked and throughput is quite low.

This may mainly due to sync mechanism and accumulative deviation from the
devision calculation of g_vt.

For example:
Suppose we have 50 clients in first round.
Round 1:
STA	weight	Tx_time_round  wt_sum	s_vt	g_vt  valid_for_next_Tx
1	256	2048		12800	2048	2000	N
2	256	2048			2048		N
.	.	.			.		.
.	.	.			.		.
.	.	.			.		.
50	256	2048			2048		N

After this round, all the stations are not valid for next transmission due to
accumulative deviation.

And if we add a new #51,
STA	weight	Tx_time_round  wt_sum	s_vt	g_vt  valid_for_next_Tx
1	256	2048		13056	2048	2020	N
2	256	2048			2048		N
.	.	.			.
.	.	.			.
.	.	.			.
50	256	2048			2048		N
51	256	1024			2524		N

Sync is done by:
max(g_vt of last round - grace period, s_vt)
and s_vt of #51 = max(2000 - 500, 0) + 1024 = 2524, and it is more than the final
g_vt of this round.

After this round, no more station is valid for transmission.

The real situation can be more complicate, above is one of the extremely case.

To avoid this situation to occur, the new proposal is:

- Increase the airtime grace period a little more to reduce the
  unexpected sync

- If global virtual time is less than the virtual airtime of any station,
  sync it to the airtime of first station in the red-black tree

- Round the division result

Signed-off-by: Yibo Zhao <yiboz@codeaurora.org>
---
 net/mac80211/sta_info.c |  3 ++-
 net/mac80211/sta_info.h |  2 +-
 net/mac80211/tx.c       | 16 +++++++++++++++-
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index 9d01fdd..feac975 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -1852,7 +1852,8 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
 
 	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight;
 
-	local->airtime_v_t[ac] += airtime / weight_sum;
+	/* Round the calculation of global vt */
+	local->airtime_v_t[ac] += (airtime + (weight_sum >> 1)) / weight_sum;
 	sta->airtime[ac].v_t += airtime / sta->airtime_weight;
 	ieee80211_resort_txq(&local->hw, txq);
 
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 5c1cac9..5055f94 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -130,7 +130,7 @@ enum ieee80211_agg_stop_reason {
 /* Debugfs flags to enable/disable use of RX/TX airtime in scheduler */
 #define AIRTIME_USE_TX		BIT(0)
 #define AIRTIME_USE_RX		BIT(1)
-#define AIRTIME_GRACE 500 /* usec of grace period before reset */
+#define AIRTIME_GRACE 2000 /* usec of grace period before reset */
 
 struct airtime_info {
 	u64 rx_airtime;
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 42ca010..60cf569 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3867,15 +3867,29 @@ bool ieee80211_txq_may_transmit(struct ieee80211_hw *hw,
 				struct ieee80211_txq *txq)
 {
 	struct ieee80211_local *local = hw_to_local(hw);
-	struct txq_info *txqi = to_txq_info(txq);
+	struct txq_info *first_txqi, *txqi = to_txq_info(txq);
+	struct rb_node *node = NULL;
 	struct sta_info *sta;
 	u8 ac = txq->ac;
+	first_txqi = NULL;
 
 	lockdep_assert_held(&local->active_txq_lock[ac]);
 
 	if (!txqi->txq.sta)
 		return true;
 
+	node = rb_first_cached(&local->active_txqs[ac]);
+	if (node) {
+		first_txqi = container_of(node, struct txq_info,
+					  schedule_order);
+		if (first_txqi->txq.sta) {
+			sta = container_of(first_txqi->txq.sta,
+					   struct sta_info, sta);
+			if (local->airtime_v_t[ac] < sta->airtime[ac].v_t)
+				local->airtime_v_t[ac] = sta->airtime[ac].v_t;
+		}
+	}
+
 	sta = container_of(txqi->txq.sta, struct sta_info, sta);
 	return (sta->airtime[ac].v_t <= local->airtime_v_t[ac]);
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together
  2019-09-23  7:19 [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Yibo Zhao
                   ` (2 preceding siblings ...)
  2019-09-23  7:20 ` [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation Yibo Zhao
@ 2019-09-23  7:20 ` Yibo Zhao
  2019-09-23 11:00   ` Toke Høiland-Jørgensen
  2019-10-01 10:19 ` [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Johannes Berg
  4 siblings, 1 reply; 17+ messages in thread
From: Yibo Zhao @ 2019-09-23  7:20 UTC (permalink / raw)
  To: linux-wireless; +Cc: ath10k, Yibo Zhao

Global airtime weight sum is updated only when txq is added/removed
from rbtree. If upper layer configures sta weight during high load,
airtime weight sum will not be updated since txq is most likely on the
tree. It could a little late for upper layer to reconfigure sta weight
when txq is already in the rbtree. And thus, incorrect airtime weight sum
will lead to incorrect global virtual time calculation as well as overflow
of airtime weight sum during txq removed.

Hence, need to update airtime weight sum upon receiving event for
configuring sta weight once sta's txq is on the rbtree.

Besides, if airtime weight sum of ACs and sta weight is synced under the
same per AC lock protection, there can be a very short window causing
incorrct airtime weight sum calculation as below:

    active_txq_lock_VO                          .
    VO weight sum is syncd			.
    sta airtime weight sum is synced		.
    active_txq_unlock_VO			.
    .						.
    active_txq_lock_VI    			.
    VI weight sum is syncd			.
    sta airtime weight sum		active_txq_lock_BE
    active_txq_unlock_VI	      Remove txq and thus sum
    .				      is calculated with synced
    .				      sta airtime weight
    .					active_txq_unlock_BE

So introduce a per ac synced station airtime weight synced with per
AC synced weight sum together. And the per-AC station airtime weight
is used to calculate weight sum.

Signed-off-by: Yibo Zhao <yiboz@codeaurora.org>
---
 net/mac80211/cfg.c         | 29 ++++++++++++++++++++++++++---
 net/mac80211/debugfs_sta.c |  2 +-
 net/mac80211/sta_info.c    |  9 ++++-----
 net/mac80211/sta_info.h    |  5 +++--
 net/mac80211/tx.c          |  4 ++--
 5 files changed, 36 insertions(+), 13 deletions(-)

diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index d65aa01..c8a0683 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -1284,7 +1284,8 @@ static int sta_apply_parameters(struct ieee80211_local *local,
 	int ret = 0;
 	struct ieee80211_supported_band *sband;
 	struct ieee80211_sub_if_data *sdata = sta->sdata;
-	u32 mask, set;
+	u32 mask, set, tid, ac, pre_weight;
+	struct txq_info *txqi;
 
 	sband = ieee80211_get_sband(sdata);
 	if (!sband)
@@ -1452,8 +1453,30 @@ static int sta_apply_parameters(struct ieee80211_local *local,
 	if (ieee80211_vif_is_mesh(&sdata->vif))
 		sta_apply_mesh_params(local, sta, params);
 
-	if (params->airtime_weight)
-		sta->airtime_weight = params->airtime_weight;
+	if (params->airtime_weight &&
+	    params->airtime_weight != sta->airtime_weight) {
+		for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
+			spin_lock_bh(&local->active_txq_lock[ac]);
+			for (tid = 0; tid < IEEE80211_NUM_TIDS + 1; tid++) {
+				if (!sta->sta.txq[tid] ||
+				    ac != ieee80211_ac_from_tid(tid))
+					continue;
+
+				pre_weight = sta->airtime_weight[ac];
+				sta->airtime_weight[ac] =
+						params->airtime_weight;
+
+				txqi = to_txq_info(sta->sta.txq[tid]);
+				if (RB_EMPTY_NODE(&txqi->schedule_order))
+					continue;
+
+				local->airtime_weight_sum[ac] = local->airtime_weight_sum[ac] +
+								params->airtime_weight -
+								pre_weight;
+			}
+			spin_unlock_bh(&local->active_txq_lock[ac]);
+		}
+	}
 
 	/* set the STA state after all sta info from usermode has been set */
 	if (test_sta_flag(sta, WLAN_STA_TDLS_PEER) ||
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
index 80028da..43a7e6a 100644
--- a/net/mac80211/debugfs_sta.c
+++ b/net/mac80211/debugfs_sta.c
@@ -223,7 +223,7 @@ static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
 		"Virt-T: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
 		rx_airtime,
 		tx_airtime,
-		sta->airtime_weight,
+		sta->airtime_weight[0],
 		v_t[0],
 		v_t[1],
 		v_t[2],
diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
index feac975..e599cf1 100644
--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -384,11 +384,10 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
 	if (sta_prepare_rate_control(local, sta, gfp))
 		goto free_txq;
 
-	sta->airtime_weight = IEEE80211_DEFAULT_AIRTIME_WEIGHT;
-
 	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
 		skb_queue_head_init(&sta->ps_tx_buf[i]);
 		skb_queue_head_init(&sta->tx_filtered[i]);
+		sta->airtime_weight[i] = IEEE80211_DEFAULT_AIRTIME_WEIGHT;
 	}
 
 	for (i = 0; i < IEEE80211_NUM_TIDS; i++)
@@ -1850,11 +1849,11 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
 	sta->airtime[ac].tx_airtime += tx_airtime;
 	sta->airtime[ac].rx_airtime += rx_airtime;
 
-	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight;
+	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight[ac];
 
 	/* Round the calculation of global vt */
 	local->airtime_v_t[ac] += (airtime + (weight_sum >> 1)) / weight_sum;
-	sta->airtime[ac].v_t += airtime / sta->airtime_weight;
+	sta->airtime[ac].v_t += airtime / sta->airtime_weight[ac];
 	ieee80211_resort_txq(&local->hw, txq);
 
 	spin_unlock_bh(&local->active_txq_lock[ac]);
@@ -2236,7 +2235,7 @@ void sta_set_sinfo(struct sta_info *sta, struct station_info *sinfo,
 	}
 
 	if (!(sinfo->filled & BIT_ULL(NL80211_STA_INFO_AIRTIME_WEIGHT))) {
-		sinfo->airtime_weight = sta->airtime_weight;
+		sinfo->airtime_weight = sta->airtime_weight[0];
 		sinfo->filled |= BIT_ULL(NL80211_STA_INFO_AIRTIME_WEIGHT);
 	}
 
diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
index 5055f94..2697343 100644
--- a/net/mac80211/sta_info.h
+++ b/net/mac80211/sta_info.h
@@ -476,7 +476,8 @@ struct ieee80211_sta_rx_stats {
  * @tid_seq: per-TID sequence numbers for sending to this STA
  * @airtime: per-AC struct airtime_info describing airtime statistics for this
  *	station
- * @airtime_weight: station weight for airtime fairness calculation purposes
+ * @airtime_weight: station per-AC weight for airtime fairness calculation
+ * purposes
  * @ampdu_mlme: A-MPDU state machine state
  * @mesh: mesh STA information
  * @debugfs_dir: debug filesystem directory dentry
@@ -602,7 +603,7 @@ struct sta_info {
 	u16 tid_seq[IEEE80211_QOS_CTL_TID_MASK + 1];
 
 	struct airtime_info airtime[IEEE80211_NUM_ACS];
-	u16 airtime_weight;
+	u16 airtime_weight[IEEE80211_NUM_ACS];
 
 	/*
 	 * Aggregation information, locked with lock.
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 60cf569..286cf14 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -3735,7 +3735,7 @@ void ieee80211_schedule_txq(struct ieee80211_hw *hw,
 		struct sta_info *sta = container_of(txq->sta,
 						    struct sta_info, sta);
 
-		local->airtime_weight_sum[ac] += sta->airtime_weight;
+		local->airtime_weight_sum[ac] += sta->airtime_weight[ac];
 		if (local->airtime_v_t[ac] > AIRTIME_GRACE)
 			sta->airtime[ac].v_t = max(local->airtime_v_t[ac] - AIRTIME_GRACE,
 						   sta->airtime[ac].v_t);
@@ -3779,7 +3779,7 @@ static void __ieee80211_unschedule_txq(struct ieee80211_hw *hw,
 		struct sta_info *sta = container_of(txq->sta,
 						    struct sta_info, sta);
 
-		local->airtime_weight_sum[ac] -= sta->airtime_weight;
+		local->airtime_weight_sum[ac] -= sta->airtime_weight[ac];
 	}
 
 	rb_erase_cached(&txqi->schedule_order,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation
  2019-09-23  7:20 ` [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation Yibo Zhao
@ 2019-09-23 10:55   ` Toke Høiland-Jørgensen
  2019-09-24  8:22     ` Yibo Zhao
  0 siblings, 1 reply; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-09-23 10:55 UTC (permalink / raw)
  To: Yibo Zhao, linux-wireless; +Cc: ath10k, Yibo Zhao

Yibo Zhao <yiboz@codeaurora.org> writes:

> Not long after the start of multi-clients test, not a single station is
> an eligible candidate for transmission since global virtual time(g_vt) is
> smaller than the virtual airtime(s_vt) of all the stations. As a result,
> the Tx has been blocked and throughput is quite low.
>
> This may mainly due to sync mechanism and accumulative deviation from the
> devision calculation of g_vt.
>
> For example:
> Suppose we have 50 clients in first round.
> Round 1:
> STA	weight	Tx_time_round  wt_sum	s_vt	g_vt  valid_for_next_Tx
> 1	256	2048		12800	2048	2000	N
> 2	256	2048			2048		N
> .	.	.			.		.
> .	.	.			.		.
> .	.	.			.		.
> 50	256	2048			2048		N
>
> After this round, all the stations are not valid for next transmission due to
> accumulative deviation.
>
> And if we add a new #51,
> STA	weight	Tx_time_round  wt_sum	s_vt	g_vt  valid_for_next_Tx
> 1	256	2048		13056	2048	2020	N
> 2	256	2048			2048		N
> .	.	.			.
> .	.	.			.
> .	.	.			.
> 50	256	2048			2048		N
> 51	256	1024			2524		N

That's better :)

> Sync is done by:
> max(g_vt of last round - grace period, s_vt)
> and s_vt of #51 = max(2000 - 500, 0) + 1024 = 2524, and it is more than the final
> g_vt of this round.
>
> After this round, no more station is valid for transmission.
>
> The real situation can be more complicate, above is one of the extremely case.
>
> To avoid this situation to occur, the new proposal is:
>
> - Increase the airtime grace period a little more to reduce the
>   unexpected sync
>
> - If global virtual time is less than the virtual airtime of any station,
>   sync it to the airtime of first station in the red-black tree
>
> - Round the division result

I can see why we need the second part (basically, this happens because I
forgot to add a check for "no eligible stations" in may_transmit(), like
the one in next_txq()). And rounding up the division result doesn't
hurt, I guess. But why does it help to change the grace period if we're
doing all the other stuff?

-Toke


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/4] mac80211: defer txqs removal from rbtree
  2019-09-23  7:19 ` [PATCH V3 2/4] mac80211: defer txqs removal from rbtree Yibo Zhao
@ 2019-09-23 10:56   ` Toke Høiland-Jørgensen
  2019-09-24  2:55     ` Yibo Zhao
  0 siblings, 1 reply; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-09-23 10:56 UTC (permalink / raw)
  To: Yibo Zhao, linux-wireless; +Cc: ath10k, Yibo Zhao

Yibo Zhao <yiboz@codeaurora.org> writes:

> In a loop txqs dequeue scenario, if the first txq in the rbtree gets
> removed from rbtree immediately in the ieee80211_return_txq(), the
> loop will break soon in the ieee80211_next_txq() due to schedule_pos
> not leading to the second txq in the rbtree. Thus, defering the
> removal right before the end of this schedule round.

Didn't we agree that we could fix this by making __unschedule_txq()
aware of schedule_pos instead of this deferred removal mechanism?

-Toke


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together
  2019-09-23  7:20 ` [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together Yibo Zhao
@ 2019-09-23 11:00   ` Toke Høiland-Jørgensen
  2019-09-24  3:19     ` Yibo Zhao
  0 siblings, 1 reply; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-09-23 11:00 UTC (permalink / raw)
  To: Yibo Zhao, linux-wireless; +Cc: ath10k, Yibo Zhao

Yibo Zhao <yiboz@codeaurora.org> writes:

> Global airtime weight sum is updated only when txq is added/removed
> from rbtree. If upper layer configures sta weight during high load,
> airtime weight sum will not be updated since txq is most likely on the
> tree. It could a little late for upper layer to reconfigure sta weight
> when txq is already in the rbtree. And thus, incorrect airtime weight sum
> will lead to incorrect global virtual time calculation as well as overflow
> of airtime weight sum during txq removed.
>
> Hence, need to update airtime weight sum upon receiving event for
> configuring sta weight once sta's txq is on the rbtree.
>
> Besides, if airtime weight sum of ACs and sta weight is synced under the
> same per AC lock protection, there can be a very short window causing
> incorrct airtime weight sum calculation as below:
>
>     active_txq_lock_VO                          .
>     VO weight sum is syncd			.
>     sta airtime weight sum is synced		.
>     active_txq_unlock_VO			.
>     .						.
>     active_txq_lock_VI    			.
>     VI weight sum is syncd			.
>     sta airtime weight sum		active_txq_lock_BE
>     active_txq_unlock_VI	      Remove txq and thus sum
>     .				      is calculated with synced
>     .				      sta airtime weight
>     .					active_txq_unlock_BE
>
> So introduce a per ac synced station airtime weight synced with per
> AC synced weight sum together. And the per-AC station airtime weight
> is used to calculate weight sum.
>
> Signed-off-by: Yibo Zhao <yiboz@codeaurora.org>
> ---
>  net/mac80211/cfg.c         | 29 ++++++++++++++++++++++++++---
>  net/mac80211/debugfs_sta.c |  2 +-
>  net/mac80211/sta_info.c    |  9 ++++-----
>  net/mac80211/sta_info.h    |  5 +++--
>  net/mac80211/tx.c          |  4 ++--
>  5 files changed, 36 insertions(+), 13 deletions(-)
>
> diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
> index d65aa01..c8a0683 100644
> --- a/net/mac80211/cfg.c
> +++ b/net/mac80211/cfg.c
> @@ -1284,7 +1284,8 @@ static int sta_apply_parameters(struct ieee80211_local *local,
>  	int ret = 0;
>  	struct ieee80211_supported_band *sband;
>  	struct ieee80211_sub_if_data *sdata = sta->sdata;
> -	u32 mask, set;
> +	u32 mask, set, tid, ac, pre_weight;
> +	struct txq_info *txqi;
>  
>  	sband = ieee80211_get_sband(sdata);
>  	if (!sband)
> @@ -1452,8 +1453,30 @@ static int sta_apply_parameters(struct ieee80211_local *local,
>  	if (ieee80211_vif_is_mesh(&sdata->vif))
>  		sta_apply_mesh_params(local, sta, params);
>  
> -	if (params->airtime_weight)
> -		sta->airtime_weight = params->airtime_weight;
> +	if (params->airtime_weight &&
> +	    params->airtime_weight != sta->airtime_weight) {

This check doesn't work I think? You're not using the array-based
sta->airtime_weight[], and there are locking issues by just checking
like this; so maybe just keep the if() on params->airtime_weight, and do
the checking inside the loop while holding the lock?

Or could we just turn the weights into atomics to avoid the locking
entirely?

> +		for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
> +			spin_lock_bh(&local->active_txq_lock[ac]);
> +			for (tid = 0; tid < IEEE80211_NUM_TIDS + 1; tid++) {
> +				if (!sta->sta.txq[tid] ||
> +				    ac != ieee80211_ac_from_tid(tid))
> +					continue;
> +
> +				pre_weight = sta->airtime_weight[ac];
> +				sta->airtime_weight[ac] =
> +						params->airtime_weight;
> +
> +				txqi = to_txq_info(sta->sta.txq[tid]);
> +				if (RB_EMPTY_NODE(&txqi->schedule_order))
> +					continue;
> +
> +				local->airtime_weight_sum[ac] = local->airtime_weight_sum[ac] +
> +								params->airtime_weight -
> +								pre_weight;
> +			}
> +			spin_unlock_bh(&local->active_txq_lock[ac]);
> +		}
> +	}
>  
>  	/* set the STA state after all sta info from usermode has been set */
>  	if (test_sta_flag(sta, WLAN_STA_TDLS_PEER) ||
> diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c
> index 80028da..43a7e6a 100644
> --- a/net/mac80211/debugfs_sta.c
> +++ b/net/mac80211/debugfs_sta.c
> @@ -223,7 +223,7 @@ static ssize_t sta_airtime_read(struct file *file, char __user *userbuf,
>  		"Virt-T: VO: %lld us VI: %lld us BE: %lld us BK: %lld us\n",
>  		rx_airtime,
>  		tx_airtime,
> -		sta->airtime_weight,
> +		sta->airtime_weight[0],
>  		v_t[0],
>  		v_t[1],
>  		v_t[2],
> diff --git a/net/mac80211/sta_info.c b/net/mac80211/sta_info.c
> index feac975..e599cf1 100644
> --- a/net/mac80211/sta_info.c
> +++ b/net/mac80211/sta_info.c
> @@ -384,11 +384,10 @@ struct sta_info *sta_info_alloc(struct ieee80211_sub_if_data *sdata,
>  	if (sta_prepare_rate_control(local, sta, gfp))
>  		goto free_txq;
>  
> -	sta->airtime_weight = IEEE80211_DEFAULT_AIRTIME_WEIGHT;
> -
>  	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
>  		skb_queue_head_init(&sta->ps_tx_buf[i]);
>  		skb_queue_head_init(&sta->tx_filtered[i]);
> +		sta->airtime_weight[i] = IEEE80211_DEFAULT_AIRTIME_WEIGHT;
>  	}
>  
>  	for (i = 0; i < IEEE80211_NUM_TIDS; i++)
> @@ -1850,11 +1849,11 @@ void ieee80211_sta_register_airtime(struct ieee80211_sta *pubsta, u8 tid,
>  	sta->airtime[ac].tx_airtime += tx_airtime;
>  	sta->airtime[ac].rx_airtime += rx_airtime;
>  
> -	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight;
> +	weight_sum = local->airtime_weight_sum[ac] ?: sta->airtime_weight[ac];
>  
>  	/* Round the calculation of global vt */
>  	local->airtime_v_t[ac] += (airtime + (weight_sum >> 1)) / weight_sum;
> -	sta->airtime[ac].v_t += airtime / sta->airtime_weight;
> +	sta->airtime[ac].v_t += airtime / sta->airtime_weight[ac];
>  	ieee80211_resort_txq(&local->hw, txq);
>  
>  	spin_unlock_bh(&local->active_txq_lock[ac]);
> @@ -2236,7 +2235,7 @@ void sta_set_sinfo(struct sta_info *sta, struct station_info *sinfo,
>  	}
>  
>  	if (!(sinfo->filled & BIT_ULL(NL80211_STA_INFO_AIRTIME_WEIGHT))) {
> -		sinfo->airtime_weight = sta->airtime_weight;
> +		sinfo->airtime_weight = sta->airtime_weight[0];
>  		sinfo->filled |= BIT_ULL(NL80211_STA_INFO_AIRTIME_WEIGHT);
>  	}
>  
> diff --git a/net/mac80211/sta_info.h b/net/mac80211/sta_info.h
> index 5055f94..2697343 100644
> --- a/net/mac80211/sta_info.h
> +++ b/net/mac80211/sta_info.h
> @@ -476,7 +476,8 @@ struct ieee80211_sta_rx_stats {
>   * @tid_seq: per-TID sequence numbers for sending to this STA
>   * @airtime: per-AC struct airtime_info describing airtime statistics for this
>   *	station
> - * @airtime_weight: station weight for airtime fairness calculation purposes
> + * @airtime_weight: station per-AC weight for airtime fairness calculation
> + * purposes
>   * @ampdu_mlme: A-MPDU state machine state
>   * @mesh: mesh STA information
>   * @debugfs_dir: debug filesystem directory dentry
> @@ -602,7 +603,7 @@ struct sta_info {
>  	u16 tid_seq[IEEE80211_QOS_CTL_TID_MASK + 1];
>  
>  	struct airtime_info airtime[IEEE80211_NUM_ACS];
> -	u16 airtime_weight;
> +	u16 airtime_weight[IEEE80211_NUM_ACS];
>  
>  	/*
>  	 * Aggregation information, locked with lock.
> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index 60cf569..286cf14 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> @@ -3735,7 +3735,7 @@ void ieee80211_schedule_txq(struct ieee80211_hw *hw,
>  		struct sta_info *sta = container_of(txq->sta,
>  						    struct sta_info, sta);
>  
> -		local->airtime_weight_sum[ac] += sta->airtime_weight;
> +		local->airtime_weight_sum[ac] += sta->airtime_weight[ac];
>  		if (local->airtime_v_t[ac] > AIRTIME_GRACE)
>  			sta->airtime[ac].v_t = max(local->airtime_v_t[ac] - AIRTIME_GRACE,
>  						   sta->airtime[ac].v_t);
> @@ -3779,7 +3779,7 @@ static void __ieee80211_unschedule_txq(struct ieee80211_hw *hw,
>  		struct sta_info *sta = container_of(txq->sta,
>  						    struct sta_info, sta);
>  
> -		local->airtime_weight_sum[ac] -= sta->airtime_weight;
> +		local->airtime_weight_sum[ac] -= sta->airtime_weight[ac];
>  	}
>  
>  	rb_erase_cached(&txqi->schedule_order,
> -- 
> 1.9.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/4] mac80211: defer txqs removal from rbtree
  2019-09-23 10:56   ` Toke Høiland-Jørgensen
@ 2019-09-24  2:55     ` Yibo Zhao
  0 siblings, 0 replies; 17+ messages in thread
From: Yibo Zhao @ 2019-09-24  2:55 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen; +Cc: linux-wireless, ath10k

On 2019-09-23 18:56, Toke Høiland-Jørgensen wrote:
> Yibo Zhao <yiboz@codeaurora.org> writes:
> 
>> In a loop txqs dequeue scenario, if the first txq in the rbtree gets
>> removed from rbtree immediately in the ieee80211_return_txq(), the
>> loop will break soon in the ieee80211_next_txq() due to schedule_pos
>> not leading to the second txq in the rbtree. Thus, defering the
>> removal right before the end of this schedule round.
> 
> Didn't we agree that we could fix this by making __unschedule_txq()
> aware of schedule_pos instead of this deferred removal mechanism?

Yes, V3 is actually used to update the commit log of [PATCH V3 3/4] so 
that we can discuss in parallel with others, and [PATCH V3 4/4] for 
discussion. Please ignore [PATCH V3 2/4]. we can keep our discussion in 
V2 until we conclude the final one and then a new version will be sent 
out along with [mac80211: Switch to a virtual time-based airtime 
scheduler] which includes the reducing lock fix.

Sorry for the confusion, ;).

> 
> -Toke

-- 
Yibo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together
  2019-09-23 11:00   ` Toke Høiland-Jørgensen
@ 2019-09-24  3:19     ` Yibo Zhao
  2019-09-24  7:27       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 17+ messages in thread
From: Yibo Zhao @ 2019-09-24  3:19 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-wireless, ath10k, linux-wireless-owner

On 2019-09-23 19:00, Toke Høiland-Jørgensen wrote:

>> -	if (params->airtime_weight)
>> -		sta->airtime_weight = params->airtime_weight;
>> +	if (params->airtime_weight &&
>> +	    params->airtime_weight != sta->airtime_weight) {
> 
> This check doesn't work I think? You're not using the array-based
> sta->airtime_weight[], and there are locking issues by just checking
> like this; so maybe just keep the if() on params->airtime_weight, and 
> do
> the checking inside the loop while holding the lock?

It should be array-based sta->airtime_weight[] and I am missing that 
part during the porting. But you are right about we should check it 
inside the loop with the lock.

> 
> Or could we just turn the weights into atomics to avoid the locking
> entirely?

Actually, we still need the active txq locking to make sure the txq is 
on the rbtree. Otherwise, no need to change airtime weight sum.

> 
>> +		for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
>> +			spin_lock_bh(&local->active_txq_lock[ac]);
>> +			for (tid = 0; tid < IEEE80211_NUM_TIDS + 1; tid++) {
>> +				if (!sta->sta.txq[tid] ||
>> +				    ac != ieee80211_ac_from_tid(tid))
>> +					continue;
>> +
>> +				pre_weight = sta->airtime_weight[ac];
>> +				sta->airtime_weight[ac] =
>> +						params->airtime_weight;
>> +
>> +				txqi = to_txq_info(sta->sta.txq[tid]);
>> +				if (RB_EMPTY_NODE(&txqi->schedule_order))
>> +					continue;
>> +
>> +				local->airtime_weight_sum[ac] = local->airtime_weight_sum[ac] +
>> +								params->airtime_weight -
>> +								pre_weight;
>> +			}
>> +			spin_unlock_bh(&local->active_txq_lock[ac]);
>> +		}
>> +	}
>> 


-- 
Yibo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together
  2019-09-24  3:19     ` Yibo Zhao
@ 2019-09-24  7:27       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-09-24  7:27 UTC (permalink / raw)
  To: Yibo Zhao; +Cc: linux-wireless, ath10k, linux-wireless-owner

Yibo Zhao <yiboz@codeaurora.org> writes:

> On 2019-09-23 19:00, Toke Høiland-Jørgensen wrote:
>
>>> -	if (params->airtime_weight)
>>> -		sta->airtime_weight = params->airtime_weight;
>>> +	if (params->airtime_weight &&
>>> +	    params->airtime_weight != sta->airtime_weight) {
>> 
>> This check doesn't work I think? You're not using the array-based
>> sta->airtime_weight[], and there are locking issues by just checking
>> like this; so maybe just keep the if() on params->airtime_weight, and 
>> do
>> the checking inside the loop while holding the lock?
>
> It should be array-based sta->airtime_weight[] and I am missing that 
> part during the porting. But you are right about we should check it 
> inside the loop with the lock.
>
>> 
>> Or could we just turn the weights into atomics to avoid the locking
>> entirely?
>
> Actually, we still need the active txq locking to make sure the txq is 
> on the rbtree. Otherwise, no need to change airtime weight sum.

True. Just moving the check inside the locking will be the right thing
to do, then.

-Toke


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation
  2019-09-23 10:55   ` Toke Høiland-Jørgensen
@ 2019-09-24  8:22     ` Yibo Zhao
  2019-09-24  8:48       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 17+ messages in thread
From: Yibo Zhao @ 2019-09-24  8:22 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-wireless, ath10k, linux-wireless-owner

On 2019-09-23 18:55, Toke Høiland-Jørgensen wrote:
> Yibo Zhao <yiboz@codeaurora.org> writes:
> 
>> Not long after the start of multi-clients test, not a single station 
>> is
>> an eligible candidate for transmission since global virtual time(g_vt) 
>> is
>> smaller than the virtual airtime(s_vt) of all the stations. As a 
>> result,
>> the Tx has been blocked and throughput is quite low.
>> 
>> This may mainly due to sync mechanism and accumulative deviation from 
>> the
>> devision calculation of g_vt.
>> 
>> For example:
>> Suppose we have 50 clients in first round.
>> Round 1:
>> STA	weight	Tx_time_round  wt_sum	s_vt	g_vt  valid_for_next_Tx
>> 1	256	2048		12800	2048	2000	N
>> 2	256	2048			2048		N
>> .	.	.			.		.
>> .	.	.			.		.
>> .	.	.			.		.
>> 50	256	2048			2048		N
>> 
>> After this round, all the stations are not valid for next transmission 
>> due to
>> accumulative deviation.
>> 
>> And if we add a new #51,
>> STA	weight	Tx_time_round  wt_sum	s_vt	g_vt  valid_for_next_Tx
>> 1	256	2048		13056	2048	2020	N
>> 2	256	2048			2048		N
>> .	.	.			.
>> .	.	.			.
>> .	.	.			.
>> 50	256	2048			2048		N
>> 51	256	1024			2524		N
> 
> That's better :)
> 
>> Sync is done by:
>> max(g_vt of last round - grace period, s_vt)
>> and s_vt of #51 = max(2000 - 500, 0) + 1024 = 2524, and it is more 
>> than the final
>> g_vt of this round.
>> 
>> After this round, no more station is valid for transmission.
>> 
>> The real situation can be more complicate, above is one of the 
>> extremely case.
>> 
>> To avoid this situation to occur, the new proposal is:
>> 
>> - Increase the airtime grace period a little more to reduce the
>>   unexpected sync
>> 
>> - If global virtual time is less than the virtual airtime of any 
>> station,
>>   sync it to the airtime of first station in the red-black tree
>> 
>> - Round the division result
> 
> I can see why we need the second part (basically, this happens because 
> I
> forgot to add a check for "no eligible stations" in may_transmit(), 
> like
> the one in next_txq()). And rounding up the division result doesn't
> hurt, I guess. But why does it help to change the grace period if we're
> doing all the other stuff?
In multi-clients case, it is possible a TXQ sometimes gets drained due 
to FW has deep queue and few packets in TXQ at that time. So the TXQ is 
removed from the rbtree after dequeuing. When it is about to added back 
very soon after the removal, the g_vt might have gone a little far away 
from sta vt where sync is needed. With this sync, the station is forced 
to catch up with the g_vt, however, its chance for transmission has been 
reduced. I think 500us is quite a short period in multi-clients case.
> 
> -Toke

-- 
Yibo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation
  2019-09-24  8:22     ` Yibo Zhao
@ 2019-09-24  8:48       ` Toke Høiland-Jørgensen
  2019-09-24  8:58         ` Yibo Zhao
  0 siblings, 1 reply; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-09-24  8:48 UTC (permalink / raw)
  To: Yibo Zhao; +Cc: linux-wireless, ath10k, linux-wireless-owner

Yibo Zhao <yiboz@codeaurora.org> writes:

>> I can see why we need the second part (basically, this happens because 
>> I
>> forgot to add a check for "no eligible stations" in may_transmit(), 
>> like
>> the one in next_txq()). And rounding up the division result doesn't
>> hurt, I guess. But why does it help to change the grace period if we're
>> doing all the other stuff?
> In multi-clients case, it is possible a TXQ sometimes gets drained due 
> to FW has deep queue and few packets in TXQ at that time. So the TXQ is 
> removed from the rbtree after dequeuing. When it is about to added back 
> very soon after the removal, the g_vt might have gone a little far away 
> from sta vt where sync is needed. With this sync, the station is forced 
> to catch up with the g_vt, however, its chance for transmission has been 
> reduced. I think 500us is quite a short period in multi-clients case.

That's a good point, actually: Having the grace period be too small will
allow stations that leave and re-enter the queue to "skip ahead" and use
more than its share. However, I think it's a separate issue from what
this patch is about; so how about I just increase the grace period in
the next version of the base patch?

-Toke


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation
  2019-09-24  8:48       ` Toke Høiland-Jørgensen
@ 2019-09-24  8:58         ` Yibo Zhao
  0 siblings, 0 replies; 17+ messages in thread
From: Yibo Zhao @ 2019-09-24  8:58 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: linux-wireless, ath10k, linux-wireless-owner

On 2019-09-24 16:48, Toke Høiland-Jørgensen wrote:
> Yibo Zhao <yiboz@codeaurora.org> writes:
> 
>>> I can see why we need the second part (basically, this happens 
>>> because
>>> I
>>> forgot to add a check for "no eligible stations" in may_transmit(),
>>> like
>>> the one in next_txq()). And rounding up the division result doesn't
>>> hurt, I guess. But why does it help to change the grace period if 
>>> we're
>>> doing all the other stuff?
>> In multi-clients case, it is possible a TXQ sometimes gets drained due
>> to FW has deep queue and few packets in TXQ at that time. So the TXQ 
>> is
>> removed from the rbtree after dequeuing. When it is about to added 
>> back
>> very soon after the removal, the g_vt might have gone a little far 
>> away
>> from sta vt where sync is needed. With this sync, the station is 
>> forced
>> to catch up with the g_vt, however, its chance for transmission has 
>> been
>> reduced. I think 500us is quite a short period in multi-clients case.
> 
> That's a good point, actually: Having the grace period be too small 
> will
> allow stations that leave and re-enter the queue to "skip ahead" and 
> use
> more than its share. However, I think it's a separate issue from what
> this patch is about; so how about I just increase the grace period in
> the next version of the base patch?

Sure, no problem. :)
> 
> -Toke

-- 
Yibo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k
  2019-09-23  7:19 [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Yibo Zhao
                   ` (3 preceding siblings ...)
  2019-09-23  7:20 ` [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together Yibo Zhao
@ 2019-10-01 10:19 ` Johannes Berg
  2019-10-01 10:59   ` Yibo Zhao
  4 siblings, 1 reply; 17+ messages in thread
From: Johannes Berg @ 2019-10-01 10:19 UTC (permalink / raw)
  To: Yibo Zhao, linux-wireless; +Cc: ath10k

On Mon, 2019-09-23 at 15:19 +0800, Yibo Zhao wrote:
> This series fix some issues when enabling virtual time-based airtime scheduler on ath10k.
> 
Given the lengthy discussion here and also over in the related thread
about AQL, I'm assuming you're going to repost this eventually.

johannes


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k
  2019-10-01 10:19 ` [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Johannes Berg
@ 2019-10-01 10:59   ` Yibo Zhao
  2019-10-01 11:05     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 17+ messages in thread
From: Yibo Zhao @ 2019-10-01 10:59 UTC (permalink / raw)
  To: Johannes Berg; +Cc: linux-wireless, ath10k, linux-wireless-owner

On 2019-10-01 18:19, Johannes Berg wrote:
> On Mon, 2019-09-23 at 15:19 +0800, Yibo Zhao wrote:
>> This series fix some issues when enabling virtual time-based airtime 
>> scheduler on ath10k.
>> 
> Given the lengthy discussion here and also over in the related thread
> about AQL, I'm assuming you're going to repost this eventually.

Yes, will repost once Toke have updated "mac80211: Switch to a virtual 
time-based airtime scheduler" with his new ideas.
> 
> johannes

-- 
Yibo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k
  2019-10-01 10:59   ` Yibo Zhao
@ 2019-10-01 11:05     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-10-01 11:05 UTC (permalink / raw)
  To: Yibo Zhao, Johannes Berg; +Cc: linux-wireless, ath10k, linux-wireless-owner

Yibo Zhao <yiboz@codeaurora.org> writes:

> On 2019-10-01 18:19, Johannes Berg wrote:
>> On Mon, 2019-09-23 at 15:19 +0800, Yibo Zhao wrote:
>>> This series fix some issues when enabling virtual time-based airtime 
>>> scheduler on ath10k.
>>> 
>> Given the lengthy discussion here and also over in the related thread
>> about AQL, I'm assuming you're going to repost this eventually.
>
> Yes, will repost once Toke have updated "mac80211: Switch to a virtual 
> time-based airtime scheduler" with his new ideas.

Which in turn is waiting for the AQL stuff... :)

-Toke


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-10-01 11:05 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-23  7:19 [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Yibo Zhao
2019-09-23  7:19 ` [PATCH V3 1/4] mac80211: Switch to a virtual time-based airtime scheduler Yibo Zhao
2019-09-23  7:19 ` [PATCH V3 2/4] mac80211: defer txqs removal from rbtree Yibo Zhao
2019-09-23 10:56   ` Toke Høiland-Jørgensen
2019-09-24  2:55     ` Yibo Zhao
2019-09-23  7:20 ` [PATCH V3 3/4] mac80211: fix low throughput in multi-clients situation Yibo Zhao
2019-09-23 10:55   ` Toke Høiland-Jørgensen
2019-09-24  8:22     ` Yibo Zhao
2019-09-24  8:48       ` Toke Høiland-Jørgensen
2019-09-24  8:58         ` Yibo Zhao
2019-09-23  7:20 ` [PATCH V3 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together Yibo Zhao
2019-09-23 11:00   ` Toke Høiland-Jørgensen
2019-09-24  3:19     ` Yibo Zhao
2019-09-24  7:27       ` Toke Høiland-Jørgensen
2019-10-01 10:19 ` [PATCH V3 0/4] Enable virtual time-based airtime scheduler support on ath10k Johannes Berg
2019-10-01 10:59   ` Yibo Zhao
2019-10-01 11:05     ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).