All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yibo Zhao <yiboz@codeaurora.org>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: ath10k@lists.infradead.org, linux-wireless@vger.kernel.org,
	linux-wireless-owner@vger.kernel.org
Subject: Re: [PATCH 2/4] mac80211: defer txqs removal from rbtree
Date: Fri, 20 Sep 2019 16:29:44 +0800	[thread overview]
Message-ID: <1b4ab006d9b5c88035845aaac193ef48@codeaurora.org> (raw)
In-Reply-To: <87tv98fu6l.fsf@toke.dk>

On 2019-09-19 18:37, Toke Høiland-Jørgensen wrote:
> Yibo Zhao <yiboz@codeaurora.org> writes:
> 
>> On 2019-09-18 19:23, Toke Høiland-Jørgensen wrote:
>>> Yibo Zhao <yiboz@codeaurora.org> writes:
>>> 
>>>> On 2019-09-18 05:10, Toke Høiland-Jørgensen wrote:
>>>>> Yibo Zhao <yiboz@codeaurora.org> writes:
>>>>> 
>>>>>> In a loop txqs dequeue scenario, if the first txq in the rbtree 
>>>>>> gets
>>>>>> removed from rbtree immediately in the ieee80211_return_txq(), the
>>>>>> loop will break soon in the ieee80211_next_txq() due to 
>>>>>> schedule_pos
>>>>>> not leading to the second txq in the rbtree. Thus, defering the
>>>>>> removal right before the end of this schedule round.
>>>>>> 
>>>>>> Co-developed-by: Yibo Zhao <yiboz@codeaurora.org>
>>>>>> Signed-off-by: Yibo Zhao <yiboz@codeaurora.org>
>>>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>>>>> 
>>>>> I didn't write this patch, so please don't use my sign-off. I'll 
>>>>> add
>>>>> ack or review tags as appropriate in reply; but a few comments 
>>>>> first:
>>>>> 
>>>>>> ---
>>>>>>  include/net/mac80211.h     | 16 ++++++++++--
>>>>>>  net/mac80211/ieee80211_i.h |  3 +++
>>>>>>  net/mac80211/main.c        |  6 +++++
>>>>>>  net/mac80211/tx.c          | 63
>>>>>> +++++++++++++++++++++++++++++++++++++++++++---
>>>>>>  4 files changed, 83 insertions(+), 5 deletions(-)
>>>>>> 
>>>>>> diff --git a/include/net/mac80211.h b/include/net/mac80211.h
>>>>>> index ac2ed8e..ba5a345 100644
>>>>>> --- a/include/net/mac80211.h
>>>>>> +++ b/include/net/mac80211.h
>>>>>> @@ -925,6 +925,8 @@ struct ieee80211_tx_rate {
>>>>>> 
>>>>>>  #define IEEE80211_MAX_TX_RETRY		31
>>>>>> 
>>>>>> +#define IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS 100
>>>>>> +
>>>>>>  static inline void ieee80211_rate_set_vht(struct 
>>>>>> ieee80211_tx_rate
>>>>>> *rate,
>>>>>>  					  u8 mcs, u8 nss)
>>>>>>  {
>>>>>> @@ -6232,7 +6234,8 @@ struct sk_buff *ieee80211_tx_dequeue(struct
>>>>>> ieee80211_hw *hw,
>>>>>>   * @ac: AC number to return packets from.
>>>>>>   *
>>>>>>   * Should only be called between calls to
>>>>>> ieee80211_txq_schedule_start()
>>>>>> - * and ieee80211_txq_schedule_end().
>>>>>> + * and ieee80211_txq_schedule_end(). If the txq is empty, it will
>>>>>> be
>>>>>> added
>>>>>> + * to a remove list and get removed later.
>>>>>>   * Returns the next txq if successful, %NULL if no queue is
>>>>>> eligible.
>>>>>> If a txq
>>>>>>   * is returned, it should be returned with ieee80211_return_txq()
>>>>>> after the
>>>>>>   * driver has finished scheduling it.
>>>>>> @@ -6268,7 +6271,8 @@ void ieee80211_txq_schedule_start(struct
>>>>>> ieee80211_hw *hw, u8 ac)
>>>>>>   * @hw: pointer as obtained from ieee80211_alloc_hw()
>>>>>>   * @ac: AC number to acquire locks for
>>>>>>   *
>>>>>> - * Release locks previously acquired by
>>>>>> ieee80211_txq_schedule_end().
>>>>>> + * Release locks previously acquired by
>>>>>> ieee80211_txq_schedule_end().
>>>>>> Check
>>>>>> + * and remove the empty txq from rb-tree.
>>>>>>   */
>>>>>>  void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
>>>>>>  	__releases(txq_lock);
>>>>>> @@ -6287,6 +6291,14 @@ void ieee80211_schedule_txq(struct
>>>>>> ieee80211_hw
>>>>>> *hw, struct ieee80211_txq *txq)
>>>>>>  	__acquires(txq_lock) __releases(txq_lock);
>>>>>> 
>>>>>>  /**
>>>>>> + * ieee80211_txqs_check - Check txqs waiting for removal
>>>>>> + *
>>>>>> + * @tmr: pointer as obtained from local
>>>>>> + *
>>>>>> + */
>>>>>> +void ieee80211_txqs_check(struct timer_list *tmr);
>>>>>> +
>>>>>> +/**
>>>>>>   * ieee80211_txq_may_transmit - check whether TXQ is allowed to
>>>>>> transmit
>>>>>>   *
>>>>>>   * This function is used to check whether given txq is allowed to
>>>>>> transmit by
>>>>>> diff --git a/net/mac80211/ieee80211_i.h 
>>>>>> b/net/mac80211/ieee80211_i.h
>>>>>> index a4556f9..49aa143e 100644
>>>>>> --- a/net/mac80211/ieee80211_i.h
>>>>>> +++ b/net/mac80211/ieee80211_i.h
>>>>>> @@ -847,6 +847,7 @@ struct txq_info {
>>>>>>  	struct codel_stats cstats;
>>>>>>  	struct sk_buff_head frags;
>>>>>>  	struct rb_node schedule_order;
>>>>>> +	struct list_head candidate;
>>>>>>  	unsigned long flags;
>>>>>> 
>>>>>>  	/* keep last! */
>>>>>> @@ -1145,6 +1146,8 @@ struct ieee80211_local {
>>>>>>  	u64 airtime_v_t[IEEE80211_NUM_ACS];
>>>>>>  	u64 airtime_weight_sum[IEEE80211_NUM_ACS];
>>>>>> 
>>>>>> +	struct list_head remove_list[IEEE80211_NUM_ACS];
>>>>>> +	struct timer_list remove_timer;
>>>>>>  	u16 airtime_flags;
>>>>>> 
>>>>>>  	const struct ieee80211_ops *ops;
>>>>>> diff --git a/net/mac80211/main.c b/net/mac80211/main.c
>>>>>> index e9ffa8e..78fe24a 100644
>>>>>> --- a/net/mac80211/main.c
>>>>>> +++ b/net/mac80211/main.c
>>>>>> @@ -667,10 +667,15 @@ struct ieee80211_hw
>>>>>> *ieee80211_alloc_hw_nm(size_t priv_data_len,
>>>>>> 
>>>>>>  	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
>>>>>>  		local->active_txqs[i] = RB_ROOT_CACHED;
>>>>>> +		INIT_LIST_HEAD(&local->remove_list[i]);
>>>>>>  		spin_lock_init(&local->active_txq_lock[i]);
>>>>>>  	}
>>>>>>  	local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
>>>>>> 
>>>>>> +	timer_setup(&local->remove_timer, ieee80211_txqs_check, 0);
>>>>>> +	mod_timer(&local->remove_timer,
>>>>>> +		  jiffies +
>>>>>> msecs_to_jiffies(IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS));
>>>>>> +
>>>>>>  	INIT_LIST_HEAD(&local->chanctx_list);
>>>>>>  	mutex_init(&local->chanctx_mtx);
>>>>>> 
>>>>>> @@ -1305,6 +1310,7 @@ void ieee80211_unregister_hw(struct
>>>>>> ieee80211_hw
>>>>>> *hw)
>>>>>>  	tasklet_kill(&local->tx_pending_tasklet);
>>>>>>  	tasklet_kill(&local->tasklet);
>>>>>> 
>>>>>> +	del_timer_sync(&local->remove_timer);
>>>>>>  #ifdef CONFIG_INET
>>>>>>  	unregister_inetaddr_notifier(&local->ifa_notifier);
>>>>>>  #endif
>>>>>> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
>>>>>> index d00baaa..42ca010 100644
>>>>>> --- a/net/mac80211/tx.c
>>>>>> +++ b/net/mac80211/tx.c
>>>>>> @@ -1450,6 +1450,7 @@ void ieee80211_txq_init(struct
>>>>>> ieee80211_sub_if_data *sdata,
>>>>>>  	codel_stats_init(&txqi->cstats);
>>>>>>  	__skb_queue_head_init(&txqi->frags);
>>>>>>  	RB_CLEAR_NODE(&txqi->schedule_order);
>>>>>> +	INIT_LIST_HEAD(&txqi->candidate);
>>>>>> 
>>>>>>  	txqi->txq.vif = &sdata->vif;
>>>>>> 
>>>>>> @@ -3724,6 +3725,9 @@ void ieee80211_schedule_txq(struct
>>>>>> ieee80211_hw
>>>>>> *hw,
>>>>>> 
>>>>>>  	spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>> 
>>>>>> +	if (!list_empty(&txqi->candidate))
>>>>>> +		list_del_init(&txqi->candidate);
>>>>>> +
>>>>>>  	if (!RB_EMPTY_NODE(&txqi->schedule_order))
>>>>>>  		goto out;
>>>>>> 
>>>>>> @@ -3783,6 +3787,20 @@ static void 
>>>>>> __ieee80211_unschedule_txq(struct
>>>>>> ieee80211_hw *hw,
>>>>>>  	RB_CLEAR_NODE(&txqi->schedule_order);
>>>>>>  }
>>>>>> 
>>>>>> +void ieee80211_remove_txq(struct ieee80211_hw *hw,
>>>>>> +			  struct ieee80211_txq *txq)
>>>>>> +{
>>>>>> +	struct ieee80211_local *local = hw_to_local(hw);
>>>>>> +	struct txq_info *txqi = to_txq_info(txq);
>>>>>> +
>>>>>> +	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
>>>>>> +
>>>>>> +	if (!RB_EMPTY_NODE(&txqi->schedule_order)) {
>>>>>> +		__ieee80211_unschedule_txq(hw, txq);
>>>>>> +		list_del_init(&txqi->candidate);
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>>  void ieee80211_unschedule_txq(struct ieee80211_hw *hw,
>>>>>>  			      struct ieee80211_txq *txq)
>>>>>>  	__acquires(txq_lock) __releases(txq_lock)
>>>>>> @@ -3790,7 +3808,7 @@ void ieee80211_unschedule_txq(struct
>>>>>> ieee80211_hw *hw,
>>>>>>  	struct ieee80211_local *local = hw_to_local(hw);
>>>>>> 
>>>>>>  	spin_lock_bh(&local->active_txq_lock[txq->ac]);
>>>>>> -	__ieee80211_unschedule_txq(hw, txq);
>>>>>> +	ieee80211_remove_txq(hw, txq);
>>>>>>  	spin_unlock_bh(&local->active_txq_lock[txq->ac]);
>>>>>>  }
>>>>>> 
>>>>>> @@ -3803,11 +3821,48 @@ void ieee80211_return_txq(struct
>>>>>> ieee80211_hw
>>>>>> *hw,
>>>>>>  	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
>>>>>> 
>>>>>>  	if (!RB_EMPTY_NODE(&txqi->schedule_order) &&
>>>>>> -	    (skb_queue_empty(&txqi->frags) && 
>>>>>> !txqi->tin.backlog_packets))
>>>>>> -		__ieee80211_unschedule_txq(hw, txq);
>>>>>> +		!txq_has_queue(&txqi->txq) &&
>>>>>> +		list_empty(&txqi->candidate))
>>>>>> +		list_add_tail(&txqi->candidate, &local->remove_list[txq->ac]);
>>>>>> +
>>>>>>  }
>>>>>>  EXPORT_SYMBOL(ieee80211_return_txq);
>>>>>> 
>>>>>> +void __ieee80211_check_txqs(struct ieee80211_local *local, int 
>>>>>> ac)
>>>>>> +{
>>>>>> +	struct txq_info *iter, *tmp;
>>>>>> +	struct sta_info *sta;
>>>>>> +
>>>>>> +	lockdep_assert_held(&local->active_txq_lock[ac]);
>>>>>> +
>>>>>> +	list_for_each_entry_safe(iter, tmp, &local->remove_list[ac],
>>>>>> +				 candidate) {
>>>>>> +		sta = container_of(iter->txq.sta, struct sta_info, sta);
>>>>>> +
>>>>>> +		if (txq_has_queue(&iter->txq))
>>>>>> +			list_del_init(&iter->candidate);
>>>>>> +		else
>>>>>> +			ieee80211_remove_txq(&local->hw, &iter->txq);
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>> +void ieee80211_txqs_check(struct timer_list *t)
>>>>>> +{
>>>>>> +	struct ieee80211_local *local = from_timer(local, t,
>>>>>> remove_timer);
>>>>>> +	struct txq_info *iter, *tmp;
>>>>>> +	struct sta_info *sta;
>>>>>> +	int ac;
>>>>>> +
>>>>>> +	for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
>>>>>> +		spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>> +		__ieee80211_check_txqs(local, ac);
>>>>>> +		spin_unlock_bh(&local->active_txq_lock[ac]);
>>>>>> +	}
>>>>>> +
>>>>>> +	mod_timer(&local->remove_timer,
>>>>>> +		  jiffies +
>>>>>> msecs_to_jiffies(IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS));
>>>>>> +}
>>>>> 
>>>>> I'll ask the same as I did last time (where you told me to hold off
>>>>> until this round):
>>>>> 
>>>>> Why do you need the timer and the periodic check? If TXQs are added
>>>>> to
>>>>> the remove list during the scheduling run, and
>>>>> __ieee80211_check_txqs()
>>>>> is run from schedule_end(), isn't that sufficient to clear the 
>>>>> list?
>>>> Is it possible that a txq is not added to the remove list but then
>>>> packets in it are dropped by fq_codel algo? Like the station
>>>> disconnects
>>>> without any notification.
>>> 
>>> Well as long as all the other cleanup paths call directly into
>>> __unschedule_txq(), that should remove stations from the scheduler 
>>> when
>>> they disconnect etc.
>> Yes, the disconnect scenario is a bad example. My concern is, say, we
>> have 10 stations and only one of them is assigned a very small weight
>> compared with that of others. Suppose, after its chance of Tx, it is
>> most likely to be placed in the rightmost(still has some packets in 
>> the
>> txq) and no more incoming data for it. The remaining packets in txq 
>> will
>> be dropped due to timeout algo in codel(correct me if I am wrong) but
>> this empty txq will stay on the rbtree until other txqs get drained or
>> global vt catch up with its vt. The staying time could be long if 
>> weight
>> is extremely small. Then do we need timer to check or any other better
>> solution?
> 
> Ah, I see what you mean. No, I don't think this will be a problem; the
> scenario you're describing would play out like this:
> 
> 1. Station ends transmitting, still has a single packet queued, gets
>    moved to the end of the rbtree (and stays there for a while).
> 
> 2. When we finally get to the point where this station gets another
>    chance to transmit, the CoDel drop timer triggers and the last 
> packet
>    is dropped[0]. This means that the queue will just be empty
>    (and ieee80211_tx_dequeue() will return NULL).
> 
> 3. Because the queue is empty, ieee80211_return_txq() will not put it
>    back on the rbtree.
> 
> Crucially, in 2. the CoDel algorithm doesn't kick in until the point of
> packet dequeue. But even if an empty queue stays on the rbtree for a
> while, there is no harm in that: eventually it will get its turn, it
> will turn out to be empty, and just be skipped over.
Then that will be fine. Thanks for the explanation of the dropping part 
in CoDel algorithm.
> 
> The issue we need to be concerned about is the opposite: If we have a
> queue that *does* have packets queued, but which is *not* scheduled for
> transmission, that will stall TX.
Is it by design since its vt is more than global vt, right? The lattency 
may somehow get impacted though.
> 
> [0] CoDel in most cases only drops a single packet at a time, so it 
> will
> not clear out an entire queue with multiple packets in one go. But you
> are right that it could conceivably drop the last packet in a queue.
> 
>>> We only need to defer removal inside a single "scheduling round" 
>>> (i.e.,
>>> between a pair of ieee80211_txq_schedule_start/end. So if we just 
>>> walk
>>> the remove list in schedule_end() we should be enough, no?
>>> 
>>> Hmm, or maybe a simpler way to fix the original issue is just to have
>>> unschedule_txq() update the schedule_pos() pointer?
>>> 
>>> I.e., unschedule_txq checks if the txq being removed is currently 
>>> being
>>> pointed to by schedule_pos[ac], and if it is, it updates schedule_pos
>>> to
>>> be the rb_next of the current value?
>> Actually, if schedule_pos is updated to rb_next of the current value,
>> then in the next_txq() where we are going to use rb_next again and
>> finally pick the next node of the node we really want. Is it fine to
>> update schedule_pos to NULL?
> 
> Hmm, yeah, good point.
> 
> If we do end up setting schedule_pos to NULL in the middle of a
> scheduling round, that will make next_txq() "start over", and do 
> another
> loop through the whole thing. I guess we may be able hit a case where
> things can oscillate back and forth between addition and removal
> resulting in an infinite loop? Not sure, but at least I can't seem to
> convince myself that this can't happen.

As the loop of next_txq under lock protection as below,

txq_schedule_start();
while(txq=next_txq()){
...
return_txq(txq);
}
txq_schedule_end();

I do not see any chance of addition, no?
In ath10k, we will usually push packets of first txq as many as we can 
until it is drained and then move to the next one. So if a txq gets 
removed in the return_txq, it should always be the leftmost. And during 
this period, neither vt of any station or global vt can be updated due 
to lock protection.

> 
> But in that case, we could fix it by just conditionally assigning 
> either
> rb_next or rb_prev to the schedule_pos in unschedule_txq()? I.e.,
> something like:
> 
> local->schedule_pos[ac] = rb_next(node) ?: rb_prev(node);
I am not sure I am getting your point. Still in next_txq, 
schedule_pos[ac] will lead us to the next node of the one we want.
> 
> -Toke

-- 
Yibo

WARNING: multiple messages have this Message-ID (diff)
From: Yibo Zhao <yiboz@codeaurora.org>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: linux-wireless-owner@vger.kernel.org,
	linux-wireless@vger.kernel.org, ath10k@lists.infradead.org
Subject: Re: [PATCH 2/4] mac80211: defer txqs removal from rbtree
Date: Fri, 20 Sep 2019 16:29:44 +0800	[thread overview]
Message-ID: <1b4ab006d9b5c88035845aaac193ef48@codeaurora.org> (raw)
In-Reply-To: <87tv98fu6l.fsf@toke.dk>

On 2019-09-19 18:37, Toke Høiland-Jørgensen wrote:
> Yibo Zhao <yiboz@codeaurora.org> writes:
> 
>> On 2019-09-18 19:23, Toke Høiland-Jørgensen wrote:
>>> Yibo Zhao <yiboz@codeaurora.org> writes:
>>> 
>>>> On 2019-09-18 05:10, Toke Høiland-Jørgensen wrote:
>>>>> Yibo Zhao <yiboz@codeaurora.org> writes:
>>>>> 
>>>>>> In a loop txqs dequeue scenario, if the first txq in the rbtree 
>>>>>> gets
>>>>>> removed from rbtree immediately in the ieee80211_return_txq(), the
>>>>>> loop will break soon in the ieee80211_next_txq() due to 
>>>>>> schedule_pos
>>>>>> not leading to the second txq in the rbtree. Thus, defering the
>>>>>> removal right before the end of this schedule round.
>>>>>> 
>>>>>> Co-developed-by: Yibo Zhao <yiboz@codeaurora.org>
>>>>>> Signed-off-by: Yibo Zhao <yiboz@codeaurora.org>
>>>>>> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
>>>>> 
>>>>> I didn't write this patch, so please don't use my sign-off. I'll 
>>>>> add
>>>>> ack or review tags as appropriate in reply; but a few comments 
>>>>> first:
>>>>> 
>>>>>> ---
>>>>>>  include/net/mac80211.h     | 16 ++++++++++--
>>>>>>  net/mac80211/ieee80211_i.h |  3 +++
>>>>>>  net/mac80211/main.c        |  6 +++++
>>>>>>  net/mac80211/tx.c          | 63
>>>>>> +++++++++++++++++++++++++++++++++++++++++++---
>>>>>>  4 files changed, 83 insertions(+), 5 deletions(-)
>>>>>> 
>>>>>> diff --git a/include/net/mac80211.h b/include/net/mac80211.h
>>>>>> index ac2ed8e..ba5a345 100644
>>>>>> --- a/include/net/mac80211.h
>>>>>> +++ b/include/net/mac80211.h
>>>>>> @@ -925,6 +925,8 @@ struct ieee80211_tx_rate {
>>>>>> 
>>>>>>  #define IEEE80211_MAX_TX_RETRY		31
>>>>>> 
>>>>>> +#define IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS 100
>>>>>> +
>>>>>>  static inline void ieee80211_rate_set_vht(struct 
>>>>>> ieee80211_tx_rate
>>>>>> *rate,
>>>>>>  					  u8 mcs, u8 nss)
>>>>>>  {
>>>>>> @@ -6232,7 +6234,8 @@ struct sk_buff *ieee80211_tx_dequeue(struct
>>>>>> ieee80211_hw *hw,
>>>>>>   * @ac: AC number to return packets from.
>>>>>>   *
>>>>>>   * Should only be called between calls to
>>>>>> ieee80211_txq_schedule_start()
>>>>>> - * and ieee80211_txq_schedule_end().
>>>>>> + * and ieee80211_txq_schedule_end(). If the txq is empty, it will
>>>>>> be
>>>>>> added
>>>>>> + * to a remove list and get removed later.
>>>>>>   * Returns the next txq if successful, %NULL if no queue is
>>>>>> eligible.
>>>>>> If a txq
>>>>>>   * is returned, it should be returned with ieee80211_return_txq()
>>>>>> after the
>>>>>>   * driver has finished scheduling it.
>>>>>> @@ -6268,7 +6271,8 @@ void ieee80211_txq_schedule_start(struct
>>>>>> ieee80211_hw *hw, u8 ac)
>>>>>>   * @hw: pointer as obtained from ieee80211_alloc_hw()
>>>>>>   * @ac: AC number to acquire locks for
>>>>>>   *
>>>>>> - * Release locks previously acquired by
>>>>>> ieee80211_txq_schedule_end().
>>>>>> + * Release locks previously acquired by
>>>>>> ieee80211_txq_schedule_end().
>>>>>> Check
>>>>>> + * and remove the empty txq from rb-tree.
>>>>>>   */
>>>>>>  void ieee80211_txq_schedule_end(struct ieee80211_hw *hw, u8 ac)
>>>>>>  	__releases(txq_lock);
>>>>>> @@ -6287,6 +6291,14 @@ void ieee80211_schedule_txq(struct
>>>>>> ieee80211_hw
>>>>>> *hw, struct ieee80211_txq *txq)
>>>>>>  	__acquires(txq_lock) __releases(txq_lock);
>>>>>> 
>>>>>>  /**
>>>>>> + * ieee80211_txqs_check - Check txqs waiting for removal
>>>>>> + *
>>>>>> + * @tmr: pointer as obtained from local
>>>>>> + *
>>>>>> + */
>>>>>> +void ieee80211_txqs_check(struct timer_list *tmr);
>>>>>> +
>>>>>> +/**
>>>>>>   * ieee80211_txq_may_transmit - check whether TXQ is allowed to
>>>>>> transmit
>>>>>>   *
>>>>>>   * This function is used to check whether given txq is allowed to
>>>>>> transmit by
>>>>>> diff --git a/net/mac80211/ieee80211_i.h 
>>>>>> b/net/mac80211/ieee80211_i.h
>>>>>> index a4556f9..49aa143e 100644
>>>>>> --- a/net/mac80211/ieee80211_i.h
>>>>>> +++ b/net/mac80211/ieee80211_i.h
>>>>>> @@ -847,6 +847,7 @@ struct txq_info {
>>>>>>  	struct codel_stats cstats;
>>>>>>  	struct sk_buff_head frags;
>>>>>>  	struct rb_node schedule_order;
>>>>>> +	struct list_head candidate;
>>>>>>  	unsigned long flags;
>>>>>> 
>>>>>>  	/* keep last! */
>>>>>> @@ -1145,6 +1146,8 @@ struct ieee80211_local {
>>>>>>  	u64 airtime_v_t[IEEE80211_NUM_ACS];
>>>>>>  	u64 airtime_weight_sum[IEEE80211_NUM_ACS];
>>>>>> 
>>>>>> +	struct list_head remove_list[IEEE80211_NUM_ACS];
>>>>>> +	struct timer_list remove_timer;
>>>>>>  	u16 airtime_flags;
>>>>>> 
>>>>>>  	const struct ieee80211_ops *ops;
>>>>>> diff --git a/net/mac80211/main.c b/net/mac80211/main.c
>>>>>> index e9ffa8e..78fe24a 100644
>>>>>> --- a/net/mac80211/main.c
>>>>>> +++ b/net/mac80211/main.c
>>>>>> @@ -667,10 +667,15 @@ struct ieee80211_hw
>>>>>> *ieee80211_alloc_hw_nm(size_t priv_data_len,
>>>>>> 
>>>>>>  	for (i = 0; i < IEEE80211_NUM_ACS; i++) {
>>>>>>  		local->active_txqs[i] = RB_ROOT_CACHED;
>>>>>> +		INIT_LIST_HEAD(&local->remove_list[i]);
>>>>>>  		spin_lock_init(&local->active_txq_lock[i]);
>>>>>>  	}
>>>>>>  	local->airtime_flags = AIRTIME_USE_TX | AIRTIME_USE_RX;
>>>>>> 
>>>>>> +	timer_setup(&local->remove_timer, ieee80211_txqs_check, 0);
>>>>>> +	mod_timer(&local->remove_timer,
>>>>>> +		  jiffies +
>>>>>> msecs_to_jiffies(IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS));
>>>>>> +
>>>>>>  	INIT_LIST_HEAD(&local->chanctx_list);
>>>>>>  	mutex_init(&local->chanctx_mtx);
>>>>>> 
>>>>>> @@ -1305,6 +1310,7 @@ void ieee80211_unregister_hw(struct
>>>>>> ieee80211_hw
>>>>>> *hw)
>>>>>>  	tasklet_kill(&local->tx_pending_tasklet);
>>>>>>  	tasklet_kill(&local->tasklet);
>>>>>> 
>>>>>> +	del_timer_sync(&local->remove_timer);
>>>>>>  #ifdef CONFIG_INET
>>>>>>  	unregister_inetaddr_notifier(&local->ifa_notifier);
>>>>>>  #endif
>>>>>> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
>>>>>> index d00baaa..42ca010 100644
>>>>>> --- a/net/mac80211/tx.c
>>>>>> +++ b/net/mac80211/tx.c
>>>>>> @@ -1450,6 +1450,7 @@ void ieee80211_txq_init(struct
>>>>>> ieee80211_sub_if_data *sdata,
>>>>>>  	codel_stats_init(&txqi->cstats);
>>>>>>  	__skb_queue_head_init(&txqi->frags);
>>>>>>  	RB_CLEAR_NODE(&txqi->schedule_order);
>>>>>> +	INIT_LIST_HEAD(&txqi->candidate);
>>>>>> 
>>>>>>  	txqi->txq.vif = &sdata->vif;
>>>>>> 
>>>>>> @@ -3724,6 +3725,9 @@ void ieee80211_schedule_txq(struct
>>>>>> ieee80211_hw
>>>>>> *hw,
>>>>>> 
>>>>>>  	spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>> 
>>>>>> +	if (!list_empty(&txqi->candidate))
>>>>>> +		list_del_init(&txqi->candidate);
>>>>>> +
>>>>>>  	if (!RB_EMPTY_NODE(&txqi->schedule_order))
>>>>>>  		goto out;
>>>>>> 
>>>>>> @@ -3783,6 +3787,20 @@ static void 
>>>>>> __ieee80211_unschedule_txq(struct
>>>>>> ieee80211_hw *hw,
>>>>>>  	RB_CLEAR_NODE(&txqi->schedule_order);
>>>>>>  }
>>>>>> 
>>>>>> +void ieee80211_remove_txq(struct ieee80211_hw *hw,
>>>>>> +			  struct ieee80211_txq *txq)
>>>>>> +{
>>>>>> +	struct ieee80211_local *local = hw_to_local(hw);
>>>>>> +	struct txq_info *txqi = to_txq_info(txq);
>>>>>> +
>>>>>> +	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
>>>>>> +
>>>>>> +	if (!RB_EMPTY_NODE(&txqi->schedule_order)) {
>>>>>> +		__ieee80211_unschedule_txq(hw, txq);
>>>>>> +		list_del_init(&txqi->candidate);
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>>  void ieee80211_unschedule_txq(struct ieee80211_hw *hw,
>>>>>>  			      struct ieee80211_txq *txq)
>>>>>>  	__acquires(txq_lock) __releases(txq_lock)
>>>>>> @@ -3790,7 +3808,7 @@ void ieee80211_unschedule_txq(struct
>>>>>> ieee80211_hw *hw,
>>>>>>  	struct ieee80211_local *local = hw_to_local(hw);
>>>>>> 
>>>>>>  	spin_lock_bh(&local->active_txq_lock[txq->ac]);
>>>>>> -	__ieee80211_unschedule_txq(hw, txq);
>>>>>> +	ieee80211_remove_txq(hw, txq);
>>>>>>  	spin_unlock_bh(&local->active_txq_lock[txq->ac]);
>>>>>>  }
>>>>>> 
>>>>>> @@ -3803,11 +3821,48 @@ void ieee80211_return_txq(struct
>>>>>> ieee80211_hw
>>>>>> *hw,
>>>>>>  	lockdep_assert_held(&local->active_txq_lock[txq->ac]);
>>>>>> 
>>>>>>  	if (!RB_EMPTY_NODE(&txqi->schedule_order) &&
>>>>>> -	    (skb_queue_empty(&txqi->frags) && 
>>>>>> !txqi->tin.backlog_packets))
>>>>>> -		__ieee80211_unschedule_txq(hw, txq);
>>>>>> +		!txq_has_queue(&txqi->txq) &&
>>>>>> +		list_empty(&txqi->candidate))
>>>>>> +		list_add_tail(&txqi->candidate, &local->remove_list[txq->ac]);
>>>>>> +
>>>>>>  }
>>>>>>  EXPORT_SYMBOL(ieee80211_return_txq);
>>>>>> 
>>>>>> +void __ieee80211_check_txqs(struct ieee80211_local *local, int 
>>>>>> ac)
>>>>>> +{
>>>>>> +	struct txq_info *iter, *tmp;
>>>>>> +	struct sta_info *sta;
>>>>>> +
>>>>>> +	lockdep_assert_held(&local->active_txq_lock[ac]);
>>>>>> +
>>>>>> +	list_for_each_entry_safe(iter, tmp, &local->remove_list[ac],
>>>>>> +				 candidate) {
>>>>>> +		sta = container_of(iter->txq.sta, struct sta_info, sta);
>>>>>> +
>>>>>> +		if (txq_has_queue(&iter->txq))
>>>>>> +			list_del_init(&iter->candidate);
>>>>>> +		else
>>>>>> +			ieee80211_remove_txq(&local->hw, &iter->txq);
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>> +void ieee80211_txqs_check(struct timer_list *t)
>>>>>> +{
>>>>>> +	struct ieee80211_local *local = from_timer(local, t,
>>>>>> remove_timer);
>>>>>> +	struct txq_info *iter, *tmp;
>>>>>> +	struct sta_info *sta;
>>>>>> +	int ac;
>>>>>> +
>>>>>> +	for (ac = 0; ac < IEEE80211_NUM_ACS; ac++) {
>>>>>> +		spin_lock_bh(&local->active_txq_lock[ac]);
>>>>>> +		__ieee80211_check_txqs(local, ac);
>>>>>> +		spin_unlock_bh(&local->active_txq_lock[ac]);
>>>>>> +	}
>>>>>> +
>>>>>> +	mod_timer(&local->remove_timer,
>>>>>> +		  jiffies +
>>>>>> msecs_to_jiffies(IEEE80211_AIRTIME_TXQ_RM_CHK_INTV_IN_MS));
>>>>>> +}
>>>>> 
>>>>> I'll ask the same as I did last time (where you told me to hold off
>>>>> until this round):
>>>>> 
>>>>> Why do you need the timer and the periodic check? If TXQs are added
>>>>> to
>>>>> the remove list during the scheduling run, and
>>>>> __ieee80211_check_txqs()
>>>>> is run from schedule_end(), isn't that sufficient to clear the 
>>>>> list?
>>>> Is it possible that a txq is not added to the remove list but then
>>>> packets in it are dropped by fq_codel algo? Like the station
>>>> disconnects
>>>> without any notification.
>>> 
>>> Well as long as all the other cleanup paths call directly into
>>> __unschedule_txq(), that should remove stations from the scheduler 
>>> when
>>> they disconnect etc.
>> Yes, the disconnect scenario is a bad example. My concern is, say, we
>> have 10 stations and only one of them is assigned a very small weight
>> compared with that of others. Suppose, after its chance of Tx, it is
>> most likely to be placed in the rightmost(still has some packets in 
>> the
>> txq) and no more incoming data for it. The remaining packets in txq 
>> will
>> be dropped due to timeout algo in codel(correct me if I am wrong) but
>> this empty txq will stay on the rbtree until other txqs get drained or
>> global vt catch up with its vt. The staying time could be long if 
>> weight
>> is extremely small. Then do we need timer to check or any other better
>> solution?
> 
> Ah, I see what you mean. No, I don't think this will be a problem; the
> scenario you're describing would play out like this:
> 
> 1. Station ends transmitting, still has a single packet queued, gets
>    moved to the end of the rbtree (and stays there for a while).
> 
> 2. When we finally get to the point where this station gets another
>    chance to transmit, the CoDel drop timer triggers and the last 
> packet
>    is dropped[0]. This means that the queue will just be empty
>    (and ieee80211_tx_dequeue() will return NULL).
> 
> 3. Because the queue is empty, ieee80211_return_txq() will not put it
>    back on the rbtree.
> 
> Crucially, in 2. the CoDel algorithm doesn't kick in until the point of
> packet dequeue. But even if an empty queue stays on the rbtree for a
> while, there is no harm in that: eventually it will get its turn, it
> will turn out to be empty, and just be skipped over.
Then that will be fine. Thanks for the explanation of the dropping part 
in CoDel algorithm.
> 
> The issue we need to be concerned about is the opposite: If we have a
> queue that *does* have packets queued, but which is *not* scheduled for
> transmission, that will stall TX.
Is it by design since its vt is more than global vt, right? The lattency 
may somehow get impacted though.
> 
> [0] CoDel in most cases only drops a single packet at a time, so it 
> will
> not clear out an entire queue with multiple packets in one go. But you
> are right that it could conceivably drop the last packet in a queue.
> 
>>> We only need to defer removal inside a single "scheduling round" 
>>> (i.e.,
>>> between a pair of ieee80211_txq_schedule_start/end. So if we just 
>>> walk
>>> the remove list in schedule_end() we should be enough, no?
>>> 
>>> Hmm, or maybe a simpler way to fix the original issue is just to have
>>> unschedule_txq() update the schedule_pos() pointer?
>>> 
>>> I.e., unschedule_txq checks if the txq being removed is currently 
>>> being
>>> pointed to by schedule_pos[ac], and if it is, it updates schedule_pos
>>> to
>>> be the rb_next of the current value?
>> Actually, if schedule_pos is updated to rb_next of the current value,
>> then in the next_txq() where we are going to use rb_next again and
>> finally pick the next node of the node we really want. Is it fine to
>> update schedule_pos to NULL?
> 
> Hmm, yeah, good point.
> 
> If we do end up setting schedule_pos to NULL in the middle of a
> scheduling round, that will make next_txq() "start over", and do 
> another
> loop through the whole thing. I guess we may be able hit a case where
> things can oscillate back and forth between addition and removal
> resulting in an infinite loop? Not sure, but at least I can't seem to
> convince myself that this can't happen.

As the loop of next_txq under lock protection as below,

txq_schedule_start();
while(txq=next_txq()){
...
return_txq(txq);
}
txq_schedule_end();

I do not see any chance of addition, no?
In ath10k, we will usually push packets of first txq as many as we can 
until it is drained and then move to the next one. So if a txq gets 
removed in the return_txq, it should always be the leftmost. And during 
this period, neither vt of any station or global vt can be updated due 
to lock protection.

> 
> But in that case, we could fix it by just conditionally assigning 
> either
> rb_next or rb_prev to the schedule_pos in unschedule_txq()? I.e.,
> something like:
> 
> local->schedule_pos[ac] = rb_next(node) ?: rb_prev(node);
I am not sure I am getting your point. Still in next_txq, 
schedule_pos[ac] will lead us to the next node of the one we want.
> 
> -Toke

-- 
Yibo

_______________________________________________
ath10k mailing list
ath10k@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/ath10k

  reply	other threads:[~2019-09-20  8:29 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-16 13:09 [PATCH 1/4] mac80211: Switch to a virtual time-based airtime scheduler Yibo Zhao
2019-09-16 13:09 ` Yibo Zhao
2019-09-16 13:09 ` [PATCH 2/4] mac80211: defer txqs removal from rbtree Yibo Zhao
2019-09-16 13:09   ` Yibo Zhao
2019-09-17 21:10   ` Toke Høiland-Jørgensen
2019-09-17 21:10     ` Toke Høiland-Jørgensen
2019-09-18 10:27     ` Yibo Zhao
2019-09-18 10:27       ` Yibo Zhao
2019-09-18 11:23       ` Toke Høiland-Jørgensen
2019-09-18 11:23         ` Toke Høiland-Jørgensen
2019-09-19  9:56         ` Yibo Zhao
2019-09-19  9:56           ` Yibo Zhao
2019-09-19 10:37           ` Toke Høiland-Jørgensen
2019-09-19 10:37             ` Toke Høiland-Jørgensen
2019-09-20  8:29             ` Yibo Zhao [this message]
2019-09-20  8:29               ` Yibo Zhao
2019-09-20  9:15               ` Toke Høiland-Jørgensen
2019-09-20  9:15                 ` Toke Høiland-Jørgensen
2019-09-21 10:49                 ` Yibo Zhao
2019-09-21 10:49                   ` Yibo Zhao
2019-09-21 11:27                   ` Toke Høiland-Jørgensen
2019-09-21 11:27                     ` Toke Høiland-Jørgensen
2019-09-21 11:53                     ` Yibo Zhao
2019-09-21 11:53                       ` Yibo Zhao
2019-09-21 12:22                     ` Yibo Zhao
2019-09-21 12:22                       ` Yibo Zhao
2019-09-21 13:02                       ` Toke Høiland-Jørgensen
2019-09-21 13:02                         ` Toke Høiland-Jørgensen
2019-09-21 13:24                         ` Yibo Zhao
2019-09-21 13:24                           ` Yibo Zhao
2019-09-21 14:00                           ` Toke Høiland-Jørgensen
2019-09-21 14:00                             ` Toke Høiland-Jørgensen
2019-09-22  5:19                             ` Yibo Zhao
2019-09-22  5:19                               ` Yibo Zhao
2019-09-23 10:47                               ` Toke Høiland-Jørgensen
2019-09-23 10:47                                 ` Toke Høiland-Jørgensen
2019-09-23 11:42                                 ` Kalle Valo
2019-09-23 11:42                                   ` Kalle Valo
2019-09-23 16:39                                   ` Toke Høiland-Jørgensen
2019-09-23 16:39                                     ` Toke Høiland-Jørgensen
2019-09-24  5:27                                     ` Kalle Valo
2019-09-24  5:27                                       ` Kalle Valo
2019-09-24  7:23                                       ` Toke Høiland-Jørgensen
2019-09-24  7:23                                         ` Toke Høiland-Jørgensen
2019-09-24  2:45                                 ` Yibo Zhao
2019-09-24  2:45                                   ` Yibo Zhao
2019-09-24  7:26                                   ` Toke Høiland-Jørgensen
2019-09-24  7:26                                     ` Toke Høiland-Jørgensen
2019-09-24  8:31                                     ` Yibo Zhao
2019-09-24  8:31                                       ` Yibo Zhao
2019-09-24  8:44                                       ` Toke Høiland-Jørgensen
2019-09-24  8:44                                         ` Toke Høiland-Jørgensen
2019-09-16 13:09 ` [PATCH 3/4] mac80211: fix low throughput in push pull mode Yibo Zhao
2019-09-16 13:09   ` Yibo Zhao
2019-09-16 15:27   ` Johannes Berg
2019-09-16 15:27     ` Johannes Berg
2019-09-17  6:36     ` Yibo Zhao
2019-09-17  6:36       ` Yibo Zhao
2019-09-17  6:55       ` Johannes Berg
2019-09-17  6:55         ` Johannes Berg
2019-09-17 21:12       ` Toke Høiland-Jørgensen
2019-09-17 21:12         ` Toke Høiland-Jørgensen
2019-09-18 10:02         ` Yibo Zhao
2019-09-18 10:02           ` Yibo Zhao
2019-09-18 10:16           ` Toke Høiland-Jørgensen
2019-09-18 10:16             ` Toke Høiland-Jørgensen
2019-09-18 10:18             ` Yibo Zhao
2019-09-18 10:18               ` Yibo Zhao
2019-09-16 13:09 ` [PATCH 4/4] mac80211: Sync airtime weight sum with per AC synced sta airtime weight together Yibo Zhao
2019-09-16 13:09   ` Yibo Zhao
2019-09-17 21:24   ` Toke Høiland-Jørgensen
2019-09-17 21:24     ` Toke Høiland-Jørgensen
2019-09-18 10:16     ` Yibo Zhao
2019-09-18 10:16       ` Yibo Zhao
2019-09-16 14:51 ` [PATCH 1/4] mac80211: Switch to a virtual time-based airtime scheduler Toke Høiland-Jørgensen
2019-09-16 14:51   ` Toke Høiland-Jørgensen
2019-09-17 21:31 ` Toke Høiland-Jørgensen
2019-09-17 21:31   ` Toke Høiland-Jørgensen
2019-09-20  8:37   ` Yibo Zhao
2019-09-20  8:37     ` Yibo Zhao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1b4ab006d9b5c88035845aaac193ef48@codeaurora.org \
    --to=yiboz@codeaurora.org \
    --cc=ath10k@lists.infradead.org \
    --cc=linux-wireless-owner@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.