* [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
@ 2020-09-08 11:02 Yunsheng Lin
2020-09-10 19:39 ` David Miller
2020-09-10 20:07 ` Cong Wang
0 siblings, 2 replies; 26+ messages in thread
From: Yunsheng Lin @ 2020-09-08 11:02 UTC (permalink / raw)
To: jhs, xiyou.wangcong, jiri, davem, kuba
Cc: netdev, linux-kernel, linuxarm, john.fastabend, eric.dumazet
Currently there is concurrent reset and enqueue operation for the
same lockless qdisc when there is no lock to synchronize the
q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
qdisc_deactivate() called by dev_deactivate_queue(), which may cause
out-of-bounds access for priv->ring[] in hns3 driver if user has
requested a smaller queue num when __dev_xmit_skb() still enqueue a
skb with a larger queue_mapping after the corresponding qdisc is
reset, and call hns3_nic_net_xmit() with that skb later.
Reused the existing synchronize_net() in dev_deactivate_many() to
make sure skb with larger queue_mapping enqueued to old qdisc(which
is saved in dev_queue->qdisc_sleeping) will always be reset when
dev_reset_queue() is called.
Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
---
ChangeLog V2:
Reuse existing synchronize_net().
---
net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
1 file changed, 33 insertions(+), 15 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 265a61d..54c4172 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
static void qdisc_deactivate(struct Qdisc *qdisc)
{
- bool nolock = qdisc->flags & TCQ_F_NOLOCK;
-
if (qdisc->flags & TCQ_F_BUILTIN)
return;
- if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
- return;
-
- if (nolock)
- spin_lock_bh(&qdisc->seqlock);
- spin_lock_bh(qdisc_lock(qdisc));
set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
-
- qdisc_reset(qdisc);
-
- spin_unlock_bh(qdisc_lock(qdisc));
- if (nolock)
- spin_unlock_bh(&qdisc->seqlock);
}
static void dev_deactivate_queue(struct net_device *dev,
@@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
}
}
+static void dev_reset_queue(struct net_device *dev,
+ struct netdev_queue *dev_queue,
+ void *_unused)
+{
+ struct Qdisc *qdisc;
+ bool nolock;
+
+ qdisc = dev_queue->qdisc_sleeping;
+ if (!qdisc)
+ return;
+
+ nolock = qdisc->flags & TCQ_F_NOLOCK;
+
+ if (nolock)
+ spin_lock_bh(&qdisc->seqlock);
+ spin_lock_bh(qdisc_lock(qdisc));
+
+ qdisc_reset(qdisc);
+
+ spin_unlock_bh(qdisc_lock(qdisc));
+ if (nolock)
+ spin_unlock_bh(&qdisc->seqlock);
+}
+
static bool some_qdisc_is_busy(struct net_device *dev)
{
unsigned int i;
@@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
dev_watchdog_down(dev);
}
- /* Wait for outstanding qdisc-less dev_queue_xmit calls.
+ /* Wait for outstanding qdisc-less dev_queue_xmit calls or
+ * outstanding qdisc enqueuing calls.
* This is avoided if all devices are in dismantle phase :
* Caller will call synchronize_net() for us
*/
synchronize_net();
+ list_for_each_entry(dev, head, close_list) {
+ netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
+
+ if (dev_ingress_queue(dev))
+ dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
+ }
+
/* Wait for outstanding qdisc_run calls. */
list_for_each_entry(dev, head, close_list) {
while (some_qdisc_is_busy(dev)) {
--
2.8.1
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-08 11:02 [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc Yunsheng Lin
@ 2020-09-10 19:39 ` David Miller
2020-09-10 20:07 ` Cong Wang
1 sibling, 0 replies; 26+ messages in thread
From: David Miller @ 2020-09-10 19:39 UTC (permalink / raw)
To: linyunsheng
Cc: jhs, xiyou.wangcong, jiri, kuba, netdev, linux-kernel, linuxarm,
john.fastabend, eric.dumazet
From: Yunsheng Lin <linyunsheng@huawei.com>
Date: Tue, 8 Sep 2020 19:02:34 +0800
> Currently there is concurrent reset and enqueue operation for the
> same lockless qdisc when there is no lock to synchronize the
> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> out-of-bounds access for priv->ring[] in hns3 driver if user has
> requested a smaller queue num when __dev_xmit_skb() still enqueue a
> skb with a larger queue_mapping after the corresponding qdisc is
> reset, and call hns3_nic_net_xmit() with that skb later.
>
> Reused the existing synchronize_net() in dev_deactivate_many() to
> make sure skb with larger queue_mapping enqueued to old qdisc(which
> is saved in dev_queue->qdisc_sleeping) will always be reset when
> dev_reset_queue() is called.
>
> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> ---
> ChangeLog V2:
> Reuse existing synchronize_net().
Applied and queued up for -stable, thank you.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-08 11:02 [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc Yunsheng Lin
2020-09-10 19:39 ` David Miller
@ 2020-09-10 20:07 ` Cong Wang
2020-09-11 8:13 ` Yunsheng Lin
1 sibling, 1 reply; 26+ messages in thread
From: Cong Wang @ 2020-09-10 20:07 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> Currently there is concurrent reset and enqueue operation for the
> same lockless qdisc when there is no lock to synchronize the
> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> out-of-bounds access for priv->ring[] in hns3 driver if user has
> requested a smaller queue num when __dev_xmit_skb() still enqueue a
> skb with a larger queue_mapping after the corresponding qdisc is
> reset, and call hns3_nic_net_xmit() with that skb later.
>
> Reused the existing synchronize_net() in dev_deactivate_many() to
> make sure skb with larger queue_mapping enqueued to old qdisc(which
> is saved in dev_queue->qdisc_sleeping) will always be reset when
> dev_reset_queue() is called.
>
> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> ---
> ChangeLog V2:
> Reuse existing synchronize_net().
> ---
> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
> 1 file changed, 33 insertions(+), 15 deletions(-)
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 265a61d..54c4172 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
>
> static void qdisc_deactivate(struct Qdisc *qdisc)
> {
> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
> -
> if (qdisc->flags & TCQ_F_BUILTIN)
> return;
> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
> - return;
> -
> - if (nolock)
> - spin_lock_bh(&qdisc->seqlock);
> - spin_lock_bh(qdisc_lock(qdisc));
>
> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
> -
> - qdisc_reset(qdisc);
> -
> - spin_unlock_bh(qdisc_lock(qdisc));
> - if (nolock)
> - spin_unlock_bh(&qdisc->seqlock);
> }
>
> static void dev_deactivate_queue(struct net_device *dev,
> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
> }
> }
>
> +static void dev_reset_queue(struct net_device *dev,
> + struct netdev_queue *dev_queue,
> + void *_unused)
> +{
> + struct Qdisc *qdisc;
> + bool nolock;
> +
> + qdisc = dev_queue->qdisc_sleeping;
> + if (!qdisc)
> + return;
> +
> + nolock = qdisc->flags & TCQ_F_NOLOCK;
> +
> + if (nolock)
> + spin_lock_bh(&qdisc->seqlock);
> + spin_lock_bh(qdisc_lock(qdisc));
I think you do not need this lock for lockless one.
> +
> + qdisc_reset(qdisc);
> +
> + spin_unlock_bh(qdisc_lock(qdisc));
> + if (nolock)
> + spin_unlock_bh(&qdisc->seqlock);
> +}
> +
> static bool some_qdisc_is_busy(struct net_device *dev)
> {
> unsigned int i;
> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
> dev_watchdog_down(dev);
> }
>
> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
> + * outstanding qdisc enqueuing calls.
> * This is avoided if all devices are in dismantle phase :
> * Caller will call synchronize_net() for us
> */
> synchronize_net();
>
> + list_for_each_entry(dev, head, close_list) {
> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
> +
> + if (dev_ingress_queue(dev))
> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
> + }
> +
> /* Wait for outstanding qdisc_run calls. */
> list_for_each_entry(dev, head, close_list) {
> while (some_qdisc_is_busy(dev)) {
Do you want to reset before waiting for TX action?
I think it is safer to do it after, at least prior to commit 759ae57f1b
we did after.
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-10 20:07 ` Cong Wang
@ 2020-09-11 8:13 ` Yunsheng Lin
2020-09-11 8:25 ` Yunsheng Lin
2020-09-17 19:26 ` Cong Wang
0 siblings, 2 replies; 26+ messages in thread
From: Yunsheng Lin @ 2020-09-11 8:13 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/9/11 4:07, Cong Wang wrote:
> On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> Currently there is concurrent reset and enqueue operation for the
>> same lockless qdisc when there is no lock to synchronize the
>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
>> out-of-bounds access for priv->ring[] in hns3 driver if user has
>> requested a smaller queue num when __dev_xmit_skb() still enqueue a
>> skb with a larger queue_mapping after the corresponding qdisc is
>> reset, and call hns3_nic_net_xmit() with that skb later.
>>
>> Reused the existing synchronize_net() in dev_deactivate_many() to
>> make sure skb with larger queue_mapping enqueued to old qdisc(which
>> is saved in dev_queue->qdisc_sleeping) will always be reset when
>> dev_reset_queue() is called.
>>
>> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>> ---
>> ChangeLog V2:
>> Reuse existing synchronize_net().
>> ---
>> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
>> 1 file changed, 33 insertions(+), 15 deletions(-)
>>
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 265a61d..54c4172 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
>>
>> static void qdisc_deactivate(struct Qdisc *qdisc)
>> {
>> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
>> -
>> if (qdisc->flags & TCQ_F_BUILTIN)
>> return;
>> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
>> - return;
>> -
>> - if (nolock)
>> - spin_lock_bh(&qdisc->seqlock);
>> - spin_lock_bh(qdisc_lock(qdisc));
>>
>> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
>> -
>> - qdisc_reset(qdisc);
>> -
>> - spin_unlock_bh(qdisc_lock(qdisc));
>> - if (nolock)
>> - spin_unlock_bh(&qdisc->seqlock);
>> }
>>
>> static void dev_deactivate_queue(struct net_device *dev,
>> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
>> }
>> }
>>
>> +static void dev_reset_queue(struct net_device *dev,
>> + struct netdev_queue *dev_queue,
>> + void *_unused)
>> +{
>> + struct Qdisc *qdisc;
>> + bool nolock;
>> +
>> + qdisc = dev_queue->qdisc_sleeping;
>> + if (!qdisc)
>> + return;
>> +
>> + nolock = qdisc->flags & TCQ_F_NOLOCK;
>> +
>> + if (nolock)
>> + spin_lock_bh(&qdisc->seqlock);
>> + spin_lock_bh(qdisc_lock(qdisc));
>
>
> I think you do not need this lock for lockless one.
It seems so.
Maybe another patch to remove qdisc_lock(qdisc) for lockless
qdisc?
>
>> +
>> + qdisc_reset(qdisc);
>> +
>> + spin_unlock_bh(qdisc_lock(qdisc));
>> + if (nolock)
>> + spin_unlock_bh(&qdisc->seqlock);
>> +}
>> +
>> static bool some_qdisc_is_busy(struct net_device *dev)
>> {
>> unsigned int i;
>> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
>> dev_watchdog_down(dev);
>> }
>>
>> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
>> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
>> + * outstanding qdisc enqueuing calls.
>> * This is avoided if all devices are in dismantle phase :
>> * Caller will call synchronize_net() for us
>> */
>> synchronize_net();
>>
>> + list_for_each_entry(dev, head, close_list) {
>> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
>> +
>> + if (dev_ingress_queue(dev))
>> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
>> + }
>> +
>> /* Wait for outstanding qdisc_run calls. */
>> list_for_each_entry(dev, head, close_list) {
>> while (some_qdisc_is_busy(dev)) {
>
> Do you want to reset before waiting for TX action?
>
> I think it is safer to do it after, at least prior to commit 759ae57f1b
> we did after.
The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
should be enought to ensure there is no skb enqueued to the old qdisc that is saved
in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
enqueued to it will be dropped and freed, right?
If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
can move it after some_qdisc_is_busy() checking.
Also, it seems the __QDISC_STATE_DEACTIVATED checking in qdisc_run() is unnecessary
after this patch, because after synchronize_net() qdisc_run() will now see the old
qdisc.
static inline void qdisc_run(struct Qdisc *q)
{
if (qdisc_run_begin(q)) {
/* NOLOCK qdisc must check 'state' under the qdisc seqlock
* to avoid racing with dev_qdisc_reset()
*/
if (!(q->flags & TCQ_F_NOLOCK) ||
likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
__qdisc_run(q);
qdisc_run_end(q);
}
}
>
> Thanks.
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-11 8:13 ` Yunsheng Lin
@ 2020-09-11 8:25 ` Yunsheng Lin
2020-09-17 19:26 ` Cong Wang
1 sibling, 0 replies; 26+ messages in thread
From: Yunsheng Lin @ 2020-09-11 8:25 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/9/11 16:13, Yunsheng Lin wrote:
> On 2020/9/11 4:07, Cong Wang wrote:
>> On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>
>>> Currently there is concurrent reset and enqueue operation for the
>>> same lockless qdisc when there is no lock to synchronize the
>>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
>>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
>>> out-of-bounds access for priv->ring[] in hns3 driver if user has
>>> requested a smaller queue num when __dev_xmit_skb() still enqueue a
>>> skb with a larger queue_mapping after the corresponding qdisc is
>>> reset, and call hns3_nic_net_xmit() with that skb later.
>>>
>>> Reused the existing synchronize_net() in dev_deactivate_many() to
>>> make sure skb with larger queue_mapping enqueued to old qdisc(which
>>> is saved in dev_queue->qdisc_sleeping) will always be reset when
>>> dev_reset_queue() is called.
>>>
>>> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>>> ---
>>> ChangeLog V2:
>>> Reuse existing synchronize_net().
>>> ---
>>> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
>>> 1 file changed, 33 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>>> index 265a61d..54c4172 100644
>>> --- a/net/sched/sch_generic.c
>>> +++ b/net/sched/sch_generic.c
>>> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
>>>
>>> static void qdisc_deactivate(struct Qdisc *qdisc)
>>> {
>>> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
>>> -
>>> if (qdisc->flags & TCQ_F_BUILTIN)
>>> return;
>>> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
>>> - return;
>>> -
>>> - if (nolock)
>>> - spin_lock_bh(&qdisc->seqlock);
>>> - spin_lock_bh(qdisc_lock(qdisc));
>>>
>>> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
>>> -
>>> - qdisc_reset(qdisc);
>>> -
>>> - spin_unlock_bh(qdisc_lock(qdisc));
>>> - if (nolock)
>>> - spin_unlock_bh(&qdisc->seqlock);
>>> }
>>>
>>> static void dev_deactivate_queue(struct net_device *dev,
>>> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
>>> }
>>> }
>>>
>>> +static void dev_reset_queue(struct net_device *dev,
>>> + struct netdev_queue *dev_queue,
>>> + void *_unused)
>>> +{
>>> + struct Qdisc *qdisc;
>>> + bool nolock;
>>> +
>>> + qdisc = dev_queue->qdisc_sleeping;
>>> + if (!qdisc)
>>> + return;
>>> +
>>> + nolock = qdisc->flags & TCQ_F_NOLOCK;
>>> +
>>> + if (nolock)
>>> + spin_lock_bh(&qdisc->seqlock);
>>> + spin_lock_bh(qdisc_lock(qdisc));
>>
>>
>> I think you do not need this lock for lockless one.
>
> It seems so.
> Maybe another patch to remove qdisc_lock(qdisc) for lockless
> qdisc?
>
>
>>
>>> +
>>> + qdisc_reset(qdisc);
>>> +
>>> + spin_unlock_bh(qdisc_lock(qdisc));
>>> + if (nolock)
>>> + spin_unlock_bh(&qdisc->seqlock);
>>> +}
>>> +
>>> static bool some_qdisc_is_busy(struct net_device *dev)
>>> {
>>> unsigned int i;
>>> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
>>> dev_watchdog_down(dev);
>>> }
>>>
>>> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
>>> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
>>> + * outstanding qdisc enqueuing calls.
>>> * This is avoided if all devices are in dismantle phase :
>>> * Caller will call synchronize_net() for us
>>> */
>>> synchronize_net();
>>>
>>> + list_for_each_entry(dev, head, close_list) {
>>> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
>>> +
>>> + if (dev_ingress_queue(dev))
>>> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
>>> + }
>>> +
>>> /* Wait for outstanding qdisc_run calls. */
>>> list_for_each_entry(dev, head, close_list) {
>>> while (some_qdisc_is_busy(dev)) {
>>
>> Do you want to reset before waiting for TX action?
>>
>> I think it is safer to do it after, at least prior to commit 759ae57f1b
>> we did after.
>
> The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
> should be enought to ensure there is no skb enqueued to the old qdisc that is saved
> in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
> after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
> enqueued to it will be dropped and freed, right?
>
> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
> can move it after some_qdisc_is_busy() checking.
>
> Also, it seems the __QDISC_STATE_DEACTIVATED checking in qdisc_run() is unnecessary
> after this patch, because after synchronize_net() qdisc_run() will now see the old
> qdisc.
now -> not
sorry for the typo.
>
> static inline void qdisc_run(struct Qdisc *q)
> {
> if (qdisc_run_begin(q)) {
> /* NOLOCK qdisc must check 'state' under the qdisc seqlock
> * to avoid racing with dev_qdisc_reset()
> */
> if (!(q->flags & TCQ_F_NOLOCK) ||
> likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
> __qdisc_run(q);
> qdisc_run_end(q);
> }
> }
>
>>
>> Thanks.
>> .
>>
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-11 8:13 ` Yunsheng Lin
2020-09-11 8:25 ` Yunsheng Lin
@ 2020-09-17 19:26 ` Cong Wang
[not found] ` <CAP12E-+3DY-dgzVercKc-NYGPExWO1NjTOr1Gf3tPLKvp6O6+g@mail.gmail.com>
` (2 more replies)
1 sibling, 3 replies; 26+ messages in thread
From: Cong Wang @ 2020-09-17 19:26 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On Fri, Sep 11, 2020 at 1:13 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2020/9/11 4:07, Cong Wang wrote:
> > On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >>
> >> Currently there is concurrent reset and enqueue operation for the
> >> same lockless qdisc when there is no lock to synchronize the
> >> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> >> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> >> out-of-bounds access for priv->ring[] in hns3 driver if user has
> >> requested a smaller queue num when __dev_xmit_skb() still enqueue a
> >> skb with a larger queue_mapping after the corresponding qdisc is
> >> reset, and call hns3_nic_net_xmit() with that skb later.
> >>
> >> Reused the existing synchronize_net() in dev_deactivate_many() to
> >> make sure skb with larger queue_mapping enqueued to old qdisc(which
> >> is saved in dev_queue->qdisc_sleeping) will always be reset when
> >> dev_reset_queue() is called.
> >>
> >> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> >> ---
> >> ChangeLog V2:
> >> Reuse existing synchronize_net().
> >> ---
> >> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
> >> 1 file changed, 33 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> >> index 265a61d..54c4172 100644
> >> --- a/net/sched/sch_generic.c
> >> +++ b/net/sched/sch_generic.c
> >> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
> >>
> >> static void qdisc_deactivate(struct Qdisc *qdisc)
> >> {
> >> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
> >> -
> >> if (qdisc->flags & TCQ_F_BUILTIN)
> >> return;
> >> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
> >> - return;
> >> -
> >> - if (nolock)
> >> - spin_lock_bh(&qdisc->seqlock);
> >> - spin_lock_bh(qdisc_lock(qdisc));
> >>
> >> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
> >> -
> >> - qdisc_reset(qdisc);
> >> -
> >> - spin_unlock_bh(qdisc_lock(qdisc));
> >> - if (nolock)
> >> - spin_unlock_bh(&qdisc->seqlock);
> >> }
> >>
> >> static void dev_deactivate_queue(struct net_device *dev,
> >> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
> >> }
> >> }
> >>
> >> +static void dev_reset_queue(struct net_device *dev,
> >> + struct netdev_queue *dev_queue,
> >> + void *_unused)
> >> +{
> >> + struct Qdisc *qdisc;
> >> + bool nolock;
> >> +
> >> + qdisc = dev_queue->qdisc_sleeping;
> >> + if (!qdisc)
> >> + return;
> >> +
> >> + nolock = qdisc->flags & TCQ_F_NOLOCK;
> >> +
> >> + if (nolock)
> >> + spin_lock_bh(&qdisc->seqlock);
> >> + spin_lock_bh(qdisc_lock(qdisc));
> >
> >
> > I think you do not need this lock for lockless one.
>
> It seems so.
> Maybe another patch to remove qdisc_lock(qdisc) for lockless
> qdisc?
Yeah, but not sure if we still want this lockless qdisc any more,
it brings more troubles than gains.
>
>
> >
> >> +
> >> + qdisc_reset(qdisc);
> >> +
> >> + spin_unlock_bh(qdisc_lock(qdisc));
> >> + if (nolock)
> >> + spin_unlock_bh(&qdisc->seqlock);
> >> +}
> >> +
> >> static bool some_qdisc_is_busy(struct net_device *dev)
> >> {
> >> unsigned int i;
> >> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
> >> dev_watchdog_down(dev);
> >> }
> >>
> >> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
> >> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
> >> + * outstanding qdisc enqueuing calls.
> >> * This is avoided if all devices are in dismantle phase :
> >> * Caller will call synchronize_net() for us
> >> */
> >> synchronize_net();
> >>
> >> + list_for_each_entry(dev, head, close_list) {
> >> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
> >> +
> >> + if (dev_ingress_queue(dev))
> >> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
> >> + }
> >> +
> >> /* Wait for outstanding qdisc_run calls. */
> >> list_for_each_entry(dev, head, close_list) {
> >> while (some_qdisc_is_busy(dev)) {
> >
> > Do you want to reset before waiting for TX action?
> >
> > I think it is safer to do it after, at least prior to commit 759ae57f1b
> > we did after.
>
> The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
> should be enought to ensure there is no skb enqueued to the old qdisc that is saved
> in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
> after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
> enqueued to it will be dropped and freed, right?
Hmm? In net_tx_action(), we do not hold RCU read lock, and we do not
reference qdisc via txq->qdisc but via sd->output_queue.
>
> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
> can move it after some_qdisc_is_busy() checking.
I am not suggesting to do an additional reset, I am suggesting to move
your reset after the busy waiting.
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
[not found] ` <CAP12E-+3DY-dgzVercKc-NYGPExWO1NjTOr1Gf3tPLKvp6O6+g@mail.gmail.com>
@ 2020-10-28 15:37 ` Pai, Vishwanath
2020-10-28 17:47 ` Cong Wang
0 siblings, 1 reply; 26+ messages in thread
From: Pai, Vishwanath @ 2020-10-28 15:37 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet, Cong Wang
On 10/28/20, 10:51 AM, "Vishwanath Pai" <pai.vishwain.kernel@gmail.com> wrote:
On Thu, Sep 17, 2020 at 4:26 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Fri, Sep 11, 2020 at 1:13 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >
> > On 2020/9/11 4:07, Cong Wang wrote:
> > > On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> > >>
> > >> Currently there is concurrent reset and enqueue operation for the
> > >> same lockless qdisc when there is no lock to synchronize the
> > >> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> > >> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> > >> out-of-bounds access for priv->ring[] in hns3 driver if user has
> > >> requested a smaller queue num when __dev_xmit_skb() still enqueue a
> > >> skb with a larger queue_mapping after the corresponding qdisc is
> > >> reset, and call hns3_nic_net_xmit() with that skb later.
> > >>
> > >> Reused the existing synchronize_net() in dev_deactivate_many() to
> > >> make sure skb with larger queue_mapping enqueued to old qdisc(which
> > >> is saved in dev_queue->qdisc_sleeping) will always be reset when
> > >> dev_reset_queue() is called.
> > >>
> > >> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
> > >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> > >> ---
> > >> ChangeLog V2:
> > >> Reuse existing synchronize_net().
> > >> ---
> > >> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
> > >> 1 file changed, 33 insertions(+), 15 deletions(-)
> > >>
> > >> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > >> index 265a61d..54c4172 100644
> > >> --- a/net/sched/sch_generic.c
> > >> +++ b/net/sched/sch_generic.c
> > >> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
> > >>
> > >> static void qdisc_deactivate(struct Qdisc *qdisc)
> > >> {
> > >> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
> > >> -
> > >> if (qdisc->flags & TCQ_F_BUILTIN)
> > >> return;
> > >> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
> > >> - return;
> > >> -
> > >> - if (nolock)
> > >> - spin_lock_bh(&qdisc->seqlock);
> > >> - spin_lock_bh(qdisc_lock(qdisc));
> > >>
> > >> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
> > >> -
> > >> - qdisc_reset(qdisc);
> > >> -
> > >> - spin_unlock_bh(qdisc_lock(qdisc));
> > >> - if (nolock)
> > >> - spin_unlock_bh(&qdisc->seqlock);
> > >> }
> > >>
> > >> static void dev_deactivate_queue(struct net_device *dev,
> > >> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
> > >> }
> > >> }
> > >>
> > >> +static void dev_reset_queue(struct net_device *dev,
> > >> + struct netdev_queue *dev_queue,
> > >> + void *_unused)
> > >> +{
> > >> + struct Qdisc *qdisc;
> > >> + bool nolock;
> > >> +
> > >> + qdisc = dev_queue->qdisc_sleeping;
> > >> + if (!qdisc)
> > >> + return;
> > >> +
> > >> + nolock = qdisc->flags & TCQ_F_NOLOCK;
> > >> +
> > >> + if (nolock)
> > >> + spin_lock_bh(&qdisc->seqlock);
> > >> + spin_lock_bh(qdisc_lock(qdisc));
> > >
> > >
> > > I think you do not need this lock for lockless one.
> >
> > It seems so.
> > Maybe another patch to remove qdisc_lock(qdisc) for lockless
> > qdisc?
>
> Yeah, but not sure if we still want this lockless qdisc any more,
> it brings more troubles than gains.
>
> >
> >
> > >
> > >> +
> > >> + qdisc_reset(qdisc);
> > >> +
> > >> + spin_unlock_bh(qdisc_lock(qdisc));
> > >> + if (nolock)
> > >> + spin_unlock_bh(&qdisc->seqlock);
> > >> +}
> > >> +
> > >> static bool some_qdisc_is_busy(struct net_device *dev)
> > >> {
> > >> unsigned int i;
> > >> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
> > >> dev_watchdog_down(dev);
> > >> }
> > >>
> > >> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
> > >> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
> > >> + * outstanding qdisc enqueuing calls.
> > >> * This is avoided if all devices are in dismantle phase :
> > >> * Caller will call synchronize_net() for us
> > >> */
> > >> synchronize_net();
> > >>
> > >> + list_for_each_entry(dev, head, close_list) {
> > >> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
> > >> +
> > >> + if (dev_ingress_queue(dev))
> > >> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
> > >> + }
> > >> +
> > >> /* Wait for outstanding qdisc_run calls. */
> > >> list_for_each_entry(dev, head, close_list) {
> > >> while (some_qdisc_is_busy(dev)) {
> > >
> > > Do you want to reset before waiting for TX action?
> > >
> > > I think it is safer to do it after, at least prior to commit 759ae57f1b
> > > we did after.
> >
> > The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
> > should be enought to ensure there is no skb enqueued to the old qdisc that is saved
> > in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
> > after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
> > enqueued to it will be dropped and freed, right?
>
> Hmm? In net_tx_action(), we do not hold RCU read lock, and we do not
> reference qdisc via txq->qdisc but via sd->output_queue.
>
>
> >
> > If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
> > can move it after some_qdisc_is_busy() checking.
>
> I am not suggesting to do an additional reset, I am suggesting to move
> your reset after the busy waiting.
>
> Thanks.
Hi,
We noticed some problems when testing the latest 5.4 LTS kernel and traced it
back to this commit using git bisect. When running our tests the machine stops
responding to all traffic and the only way to recover is a reboot. I do not see
a stack trace on the console.
This can be reproduced using the packetdrill test below, it should be run a
few times or in a loop. You should hit this issue within a few tries but
sometimes might take up to 15-20 tries.
*** TEST BEGIN ***
0 `echo 4 > /proc/sys/net/ipv4/tcp_min_tso_segs`
0.400 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.400 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
// set maxseg to 1000 to work with both ipv4 and ipv6
0.500 setsockopt(3, SOL_TCP, TCP_MAXSEG, [1000], 4) = 0
0.500 bind(3, ..., ...) = 0
0.500 listen(3, 1) = 0
// Establish connection
0.600 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 5>
0.600 > S. 0:0(0) ack 1 <...>
0.800 < . 1:1(0) ack 1 win 320
0.800 accept(3, ..., ...) = 4
// Send 4 data segments.
+0 write(4, ..., 4000) = 4000
+0 > P. 1:4001(4000) ack 1
// Receive a SACK
+.1 < . 1:1(0) ack 1 win 320 <sack 1001:2001,nop,nop>
+.3 %{ print "TCP CA state: ",tcpi_ca_state }%
*** TEST END ***
I can reproduce the issue easily on v5.4.68, and after reverting this commit it
does not happen anymore.
Thanks,
Vishwanath
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-17 19:26 ` Cong Wang
[not found] ` <CAP12E-+3DY-dgzVercKc-NYGPExWO1NjTOr1Gf3tPLKvp6O6+g@mail.gmail.com>
@ 2020-10-28 17:46 ` Vishwanath Pai
2020-10-29 2:52 ` Yunsheng Lin
2 siblings, 0 replies; 26+ messages in thread
From: Vishwanath Pai @ 2020-10-28 17:46 UTC (permalink / raw)
To: Cong Wang, Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet, Hunt, Joshua
On 9/17/20 3:26 PM, Cong Wang wrote:
> On Fri, Sep 11, 2020 at 1:13 AM Yunsheng Lin <linyunsheng@huawei.com>
wrote:
>>
>> On 2020/9/11 4:07, Cong Wang wrote:
>>> On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin
<linyunsheng@huawei.com> wrote:
>>>>
>>>> Currently there is concurrent reset and enqueue operation for the
>>>> same lockless qdisc when there is no lock to synchronize the
>>>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
>>>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
>>>> out-of-bounds access for priv->ring[] in hns3 driver if user has
>>>> requested a smaller queue num when __dev_xmit_skb() still enqueue a
>>>> skb with a larger queue_mapping after the corresponding qdisc is
>>>> reset, and call hns3_nic_net_xmit() with that skb later.
>>>>
>>>> Reused the existing synchronize_net() in dev_deactivate_many() to
>>>> make sure skb with larger queue_mapping enqueued to old qdisc(which
>>>> is saved in dev_queue->qdisc_sleeping) will always be reset when
>>>> dev_reset_queue() is called.
>>>>
>>>> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
>>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>>>> ---
>>>> ChangeLog V2:
>>>> Reuse existing synchronize_net().
>>>> ---
>>>> net/sched/sch_generic.c | 48
+++++++++++++++++++++++++++++++++---------------
>>>> 1 file changed, 33 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>>>> index 265a61d..54c4172 100644
>>>> --- a/net/sched/sch_generic.c
>>>> +++ b/net/sched/sch_generic.c
>>>> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
>>>>
>>>> static void qdisc_deactivate(struct Qdisc *qdisc)
>>>> {
>>>> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
>>>> -
>>>> if (qdisc->flags & TCQ_F_BUILTIN)
>>>> return;
>>>> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
>>>> - return;
>>>> -
>>>> - if (nolock)
>>>> - spin_lock_bh(&qdisc->seqlock);
>>>> - spin_lock_bh(qdisc_lock(qdisc));
>>>>
>>>> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
>>>> -
>>>> - qdisc_reset(qdisc);
>>>> -
>>>> - spin_unlock_bh(qdisc_lock(qdisc));
>>>> - if (nolock)
>>>> - spin_unlock_bh(&qdisc->seqlock);
>>>> }
>>>>
>>>> static void dev_deactivate_queue(struct net_device *dev,
>>>> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct
net_device *dev,
>>>> }
>>>> }
>>>>
>>>> +static void dev_reset_queue(struct net_device *dev,
>>>> + struct netdev_queue *dev_queue,
>>>> + void *_unused)
>>>> +{
>>>> + struct Qdisc *qdisc;
>>>> + bool nolock;
>>>> +
>>>> + qdisc = dev_queue->qdisc_sleeping;
>>>> + if (!qdisc)
>>>> + return;
>>>> +
>>>> + nolock = qdisc->flags & TCQ_F_NOLOCK;
>>>> +
>>>> + if (nolock)
>>>> + spin_lock_bh(&qdisc->seqlock);
>>>> + spin_lock_bh(qdisc_lock(qdisc));
>>>
>>>
>>> I think you do not need this lock for lockless one.
>>
>> It seems so.
>> Maybe another patch to remove qdisc_lock(qdisc) for lockless
>> qdisc?
>
> Yeah, but not sure if we still want this lockless qdisc any more,
> it brings more troubles than gains.
>
>>
>>
>>>
>>>> +
>>>> + qdisc_reset(qdisc);
>>>> +
>>>> + spin_unlock_bh(qdisc_lock(qdisc));
>>>> + if (nolock)
>>>> + spin_unlock_bh(&qdisc->seqlock);
>>>> +}
>>>> +
>>>> static bool some_qdisc_is_busy(struct net_device *dev)
>>>> {
>>>> unsigned int i;
>>>> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head
*head)
>>>> dev_watchdog_down(dev);
>>>> }
>>>>
>>>> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
>>>> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
>>>> + * outstanding qdisc enqueuing calls.
>>>> * This is avoided if all devices are in dismantle phase :
>>>> * Caller will call synchronize_net() for us
>>>> */
>>>> synchronize_net();
>>>>
>>>> + list_for_each_entry(dev, head, close_list) {
>>>> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
>>>> +
>>>> + if (dev_ingress_queue(dev))
>>>> + dev_reset_queue(dev,
dev_ingress_queue(dev), NULL);
>>>> + }
>>>> +
>>>> /* Wait for outstanding qdisc_run calls. */
>>>> list_for_each_entry(dev, head, close_list) {
>>>> while (some_qdisc_is_busy(dev)) {
>>>
>>> Do you want to reset before waiting for TX action?
>>>
>>> I think it is safer to do it after, at least prior to commit 759ae57f1b
>>> we did after.
>>
>> The reference to the txq->qdisc is always protected by RCU, so the
synchronize_net()
>> should be enought to ensure there is no skb enqueued to the old
qdisc that is saved
>> in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only
see the new qdisc
>> after synchronize_net(), which is noop_qdisc, and noop_qdisc will
make sure any skb
>> enqueued to it will be dropped and freed, right?
>
> Hmm? In net_tx_action(), we do not hold RCU read lock, and we do not
> reference qdisc via txq->qdisc but via sd->output_queue.
>
>
>>
>> If we do any additional reset that is not related to qdisc in
dev_reset_queue(), we
>> can move it after some_qdisc_is_busy() checking.
>
> I am not suggesting to do an additional reset, I am suggesting to move
> your reset after the busy waiting.
>
> Thanks.
Re-sending this, looks like my previous email did not get delivered
somehow.
We noticed some problems when testing the latest 5.4 LTS kernel and
traced it back to this commit using git bisect. When running our tests
the machine stops responding to all traffic and the only way to recover
is a reboot. I do not see a stack trace on the console.
This can be reproduced using the packetdrill test below, it should be
run a few times or in a loop. You should hit this issue within a few
tries but sometimes might take up to 15-20 tries.
*** TEST BEGIN ***
0 `echo 4 > /proc/sys/net/ipv4/tcp_min_tso_segs`
0.400 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.400 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
// set maxseg to 1000 to work with both ipv4 and ipv6
0.500 setsockopt(3, SOL_TCP, TCP_MAXSEG, [1000], 4) = 0
0.500 bind(3, ..., ...) = 0
0.500 listen(3, 1) = 0
// Establish connection
0.600 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 5>
0.600 > S. 0:0(0) ack 1 <...>
0.800 < . 1:1(0) ack 1 win 320
0.800 accept(3, ..., ...) = 4
// Send 4 data segments.
+0 write(4, ..., 4000) = 4000
+0 > P. 1:4001(4000) ack 1
// Receive a SACK
+.1 < . 1:1(0) ack 1 win 320 <sack 1001:2001,nop,nop>
+.3 %{ print "TCP CA state: ",tcpi_ca_state }%
*** TEST END ***
I can reproduce the issue easily on v5.4.68, and after reverting this
commit it
does not happen anymore.
Thanks,
Vishwanath
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-28 15:37 ` Pai, Vishwanath
@ 2020-10-28 17:47 ` Cong Wang
2020-10-28 20:04 ` Vishwanath Pai
0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2020-10-28 17:47 UTC (permalink / raw)
To: Pai, Vishwanath
Cc: Yunsheng Lin, Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko,
David Miller, Jakub Kicinski, Linux Kernel Network Developers,
LKML, linuxarm, John Fastabend, Eric Dumazet
On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com> wrote:
> Hi,
>
> We noticed some problems when testing the latest 5.4 LTS kernel and traced it
> back to this commit using git bisect. When running our tests the machine stops
> responding to all traffic and the only way to recover is a reboot. I do not see
> a stack trace on the console.
Do you mean the machine is still running fine just the network is down?
If so, can you dump your tc config with stats when the problem is happening?
(You can use `tc -s -d qd show ...`.)
>
> This can be reproduced using the packetdrill test below, it should be run a
> few times or in a loop. You should hit this issue within a few tries but
> sometimes might take up to 15-20 tries.
...
> I can reproduce the issue easily on v5.4.68, and after reverting this commit it
> does not happen anymore.
This is odd. The patch in this thread touches netdev reset path, if packetdrill
is the only thing you use to trigger the bug (that is netdev is always active),
I can not connect them.
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-28 17:47 ` Cong Wang
@ 2020-10-28 20:04 ` Vishwanath Pai
2020-10-29 2:37 ` Yunsheng Lin
0 siblings, 1 reply; 26+ messages in thread
From: Vishwanath Pai @ 2020-10-28 20:04 UTC (permalink / raw)
To: Cong Wang
Cc: Yunsheng Lin, Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko,
David Miller, Jakub Kicinski, Linux Kernel Network Developers,
LKML, linuxarm, John Fastabend, Eric Dumazet
On 10/28/20 1:47 PM, Cong Wang wrote:
> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com> wrote:
>> Hi,
>>
>> We noticed some problems when testing the latest 5.4 LTS kernel and
traced it
>> back to this commit using git bisect. When running our tests the
machine stops
>> responding to all traffic and the only way to recover is a reboot. I
do not see
>> a stack trace on the console.
>
> Do you mean the machine is still running fine just the network is down?
>
> If so, can you dump your tc config with stats when the problem is
happening?
> (You can use `tc -s -d qd show ...`.)
>
>>
>> This can be reproduced using the packetdrill test below, it should
be run a
>> few times or in a loop. You should hit this issue within a few tries but
>> sometimes might take up to 15-20 tries.
> ...
>> I can reproduce the issue easily on v5.4.68, and after reverting
this commit it
>> does not happen anymore.
>
> This is odd. The patch in this thread touches netdev reset path, if
packetdrill
> is the only thing you use to trigger the bug (that is netdev is
always active),
> I can not connect them.
>
> Thanks.
Hi Cong,
> Do you mean the machine is still running fine just the network is down?
I was able to access the machine via serial console, it looks like it is
up and running, just that networking is down.
> If so, can you dump your tc config with stats when the problem is
happening?
> (You can use `tc -s -d qd show ...`.)
If I try running tc when the machine is in this state the command never
returns. It doesn't print anything but doesn't exit either.
> This is odd. The patch in this thread touches netdev reset path, if
packetdrill
> is the only thing you use to trigger the bug (that is netdev is
always active),
> I can not connect them.
I think packetdrill creates a tun0 interface when it starts the
test and tears it down at the end, so it might be hitting this code path
during teardown.
P.S: My mail server is having connectivity issues with vger.kernel.org
so messages aren't getting delivered to netdev. It'll hopefully get
resolved soon.
Thanks,
Vishwanath
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-28 20:04 ` Vishwanath Pai
@ 2020-10-29 2:37 ` Yunsheng Lin
2020-10-29 4:50 ` Vishwanath Pai
0 siblings, 1 reply; 26+ messages in thread
From: Yunsheng Lin @ 2020-10-29 2:37 UTC (permalink / raw)
To: Vishwanath Pai, Cong Wang
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet
On 2020/10/29 4:04, Vishwanath Pai wrote:
> On 10/28/20 1:47 PM, Cong Wang wrote:
>> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com> wrote:
>>> Hi,
>>>
>>> We noticed some problems when testing the latest 5.4 LTS kernel and traced it
>>> back to this commit using git bisect. When running our tests the machine stops
>>> responding to all traffic and the only way to recover is a reboot. I do not see
>>> a stack trace on the console.
>>
>> Do you mean the machine is still running fine just the network is down?
>>
>> If so, can you dump your tc config with stats when the problem is happening?
>> (You can use `tc -s -d qd show ...`.)
>>
>>>
>>> This can be reproduced using the packetdrill test below, it should be run a
>>> few times or in a loop. You should hit this issue within a few tries but
>>> sometimes might take up to 15-20 tries.
>> ...
>>> I can reproduce the issue easily on v5.4.68, and after reverting this commit it
>>> does not happen anymore.
>>
>> This is odd. The patch in this thread touches netdev reset path, if packetdrill
>> is the only thing you use to trigger the bug (that is netdev is always active),
>> I can not connect them.
>>
>> Thanks.
>
> Hi Cong,
>
>> Do you mean the machine is still running fine just the network is down?
>
> I was able to access the machine via serial console, it looks like it is
> up and running, just that networking is down.
>
>> If so, can you dump your tc config with stats when the problem is happening?
>> (You can use `tc -s -d qd show ...`.)
>
> If I try running tc when the machine is in this state the command never
> returns. It doesn't print anything but doesn't exit either.
>
>> This is odd. The patch in this thread touches netdev reset path, if packetdrill
>> is the only thing you use to trigger the bug (that is netdev is always active),
>> I can not connect them.
>
> I think packetdrill creates a tun0 interface when it starts the
> test and tears it down at the end, so it might be hitting this code path
> during teardown.
Hi, Is there any preparation setup before running the above packetdrill test
case, I run the above test case in 5.9-rc4 with this patch applied without any
preparation setup, did not reproduce it.
By the way, I am newbie to packetdrill:), it would be good to provide the
detail setup to reproduce it,thanks.
>
> P.S: My mail server is having connectivity issues with vger.kernel.org
> so messages aren't getting delivered to netdev. It'll hopefully get
> resolved soon.
>
> Thanks,
> Vishwanath
>
>
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-09-17 19:26 ` Cong Wang
[not found] ` <CAP12E-+3DY-dgzVercKc-NYGPExWO1NjTOr1Gf3tPLKvp6O6+g@mail.gmail.com>
2020-10-28 17:46 ` Vishwanath Pai
@ 2020-10-29 2:52 ` Yunsheng Lin
2020-10-29 19:05 ` Cong Wang
2 siblings, 1 reply; 26+ messages in thread
From: Yunsheng Lin @ 2020-10-29 2:52 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/9/18 3:26, Cong Wang wrote:
> On Fri, Sep 11, 2020 at 1:13 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> On 2020/9/11 4:07, Cong Wang wrote:
>>> On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>>
>>>> Currently there is concurrent reset and enqueue operation for the
>>>> same lockless qdisc when there is no lock to synchronize the
>>>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
>>>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
>>>> out-of-bounds access for priv->ring[] in hns3 driver if user has
>>>> requested a smaller queue num when __dev_xmit_skb() still enqueue a
>>>> skb with a larger queue_mapping after the corresponding qdisc is
>>>> reset, and call hns3_nic_net_xmit() with that skb later.
>>>>
>>>> Reused the existing synchronize_net() in dev_deactivate_many() to
>>>> make sure skb with larger queue_mapping enqueued to old qdisc(which
>>>> is saved in dev_queue->qdisc_sleeping) will always be reset when
>>>> dev_reset_queue() is called.
>>>>
>>>> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
>>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>>>> ---
>>>> ChangeLog V2:
>>>> Reuse existing synchronize_net().
>>>> ---
>>>> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
>>>> 1 file changed, 33 insertions(+), 15 deletions(-)
>>>>
>>>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>>>> index 265a61d..54c4172 100644
>>>> --- a/net/sched/sch_generic.c
>>>> +++ b/net/sched/sch_generic.c
>>>> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
>>>>
>>>> static void qdisc_deactivate(struct Qdisc *qdisc)
>>>> {
>>>> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
>>>> -
>>>> if (qdisc->flags & TCQ_F_BUILTIN)
>>>> return;
>>>> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
>>>> - return;
>>>> -
>>>> - if (nolock)
>>>> - spin_lock_bh(&qdisc->seqlock);
>>>> - spin_lock_bh(qdisc_lock(qdisc));
>>>>
>>>> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
>>>> -
>>>> - qdisc_reset(qdisc);
>>>> -
>>>> - spin_unlock_bh(qdisc_lock(qdisc));
>>>> - if (nolock)
>>>> - spin_unlock_bh(&qdisc->seqlock);
>>>> }
>>>>
>>>> static void dev_deactivate_queue(struct net_device *dev,
>>>> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
>>>> }
>>>> }
>>>>
>>>> +static void dev_reset_queue(struct net_device *dev,
>>>> + struct netdev_queue *dev_queue,
>>>> + void *_unused)
>>>> +{
>>>> + struct Qdisc *qdisc;
>>>> + bool nolock;
>>>> +
>>>> + qdisc = dev_queue->qdisc_sleeping;
>>>> + if (!qdisc)
>>>> + return;
>>>> +
>>>> + nolock = qdisc->flags & TCQ_F_NOLOCK;
>>>> +
>>>> + if (nolock)
>>>> + spin_lock_bh(&qdisc->seqlock);
>>>> + spin_lock_bh(qdisc_lock(qdisc));
>>>
>>>
>>> I think you do not need this lock for lockless one.
>>
>> It seems so.
>> Maybe another patch to remove qdisc_lock(qdisc) for lockless
>> qdisc?
>
> Yeah, but not sure if we still want this lockless qdisc any more,
> it brings more troubles than gains.
>
>>
>>
>>>
>>>> +
>>>> + qdisc_reset(qdisc);
>>>> +
>>>> + spin_unlock_bh(qdisc_lock(qdisc));
>>>> + if (nolock)
>>>> + spin_unlock_bh(&qdisc->seqlock);
>>>> +}
>>>> +
>>>> static bool some_qdisc_is_busy(struct net_device *dev)
>>>> {
>>>> unsigned int i;
>>>> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
>>>> dev_watchdog_down(dev);
>>>> }
>>>>
>>>> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
>>>> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
>>>> + * outstanding qdisc enqueuing calls.
>>>> * This is avoided if all devices are in dismantle phase :
>>>> * Caller will call synchronize_net() for us
>>>> */
>>>> synchronize_net();
>>>>
>>>> + list_for_each_entry(dev, head, close_list) {
>>>> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
>>>> +
>>>> + if (dev_ingress_queue(dev))
>>>> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
>>>> + }
>>>> +
>>>> /* Wait for outstanding qdisc_run calls. */
>>>> list_for_each_entry(dev, head, close_list) {
>>>> while (some_qdisc_is_busy(dev)) {
>>>
>>> Do you want to reset before waiting for TX action?
>>>
>>> I think it is safer to do it after, at least prior to commit 759ae57f1b
>>> we did after.
>>
>> The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
>> should be enought to ensure there is no skb enqueued to the old qdisc that is saved
>> in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
>> after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
>> enqueued to it will be dropped and freed, right?
>
> Hmm? In net_tx_action(), we do not hold RCU read lock, and we do not
> reference qdisc via txq->qdisc but via sd->output_queue.
Sorry for the delay reply, I seems to miss this.
I assumed synchronize_net() also wait for outstanding softirq to finish, right?
>
>
>>
>> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
>> can move it after some_qdisc_is_busy() checking.
>
> I am not suggesting to do an additional reset, I am suggesting to move
> your reset after the busy waiting.
There maybe a deadlock here if we reset the qdisc after the some_qdisc_is_busy() checking,
because some_qdisc_is_busy() may require the qdisc reset to clear the skb, so that
some_qdisc_is_busy() can return false. I am not sure this is really a problem, but
sch_direct_xmit() may requeue the skb when dev_hard_start_xmit return TX_BUSY.
>
> Thanks.
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-29 2:37 ` Yunsheng Lin
@ 2020-10-29 4:50 ` Vishwanath Pai
2020-10-29 10:24 ` Yunsheng Lin
0 siblings, 1 reply; 26+ messages in thread
From: Vishwanath Pai @ 2020-10-29 4:50 UTC (permalink / raw)
To: Yunsheng Lin, Cong Wang
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet
On 10/28/20 10:37 PM, Yunsheng Lin wrote:
> On 2020/10/29 4:04, Vishwanath Pai wrote:
>> On 10/28/20 1:47 PM, Cong Wang wrote:
>>> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com>
wrote:
>>>> Hi,
>>>>
>>>> We noticed some problems when testing the latest 5.4 LTS kernel
and traced it
>>>> back to this commit using git bisect. When running our tests the
machine stops
>>>> responding to all traffic and the only way to recover is a reboot.
I do not see
>>>> a stack trace on the console.
>>>
>>> Do you mean the machine is still running fine just the network is down?
>>>
>>> If so, can you dump your tc config with stats when the problem is
happening?
>>> (You can use `tc -s -d qd show ...`.)
>>>
>>>>
>>>> This can be reproduced using the packetdrill test below, it should
be run a
>>>> few times or in a loop. You should hit this issue within a few
tries but
>>>> sometimes might take up to 15-20 tries.
>>> ...
>>>> I can reproduce the issue easily on v5.4.68, and after reverting
this commit it
>>>> does not happen anymore.
>>>
>>> This is odd. The patch in this thread touches netdev reset path, if
packetdrill
>>> is the only thing you use to trigger the bug (that is netdev is
always active),
>>> I can not connect them.
>>>
>>> Thanks.
>>
>> Hi Cong,
>>
>>> Do you mean the machine is still running fine just the network is down?
>>
>> I was able to access the machine via serial console, it looks like it is
>> up and running, just that networking is down.
>>
>>> If so, can you dump your tc config with stats when the problem is
happening?
>>> (You can use `tc -s -d qd show ...`.)
>>
>> If I try running tc when the machine is in this state the command never
>> returns. It doesn't print anything but doesn't exit either.
>>
>>> This is odd. The patch in this thread touches netdev reset path, if
packetdrill
>>> is the only thing you use to trigger the bug (that is netdev is
always active),
>>> I can not connect them.
>>
>> I think packetdrill creates a tun0 interface when it starts the
>> test and tears it down at the end, so it might be hitting this code path
>> during teardown.
>
> Hi, Is there any preparation setup before running the above
packetdrill test
> case, I run the above test case in 5.9-rc4 with this patch applied
without any
> preparation setup, did not reproduce it.
>
> By the way, I am newbie to packetdrill:), it would be good to provide the
> detail setup to reproduce it,thanks.
>
>>
>> P.S: My mail server is having connectivity issues with vger.kernel.org
>> so messages aren't getting delivered to netdev. It'll hopefully get
>> resolved soon.
>>
>> Thanks,
>> Vishwanath
>>
>>
>> .
>>
I can't reproduce it on v5.9-rc4 either, it is probably an issue only on
5.4 then (and maybe older LTS versions). Can you give it a try on
5.4.68?
For running packetdrill, download the latest version from their github
repo, then run it in a loop without any special arguments. This is what
I do to reproduce it:
while true; do ./packetdrill <test-file>; done
I don't think any other setup is necessary.
-Vishwanath
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-29 4:50 ` Vishwanath Pai
@ 2020-10-29 10:24 ` Yunsheng Lin
2020-10-29 17:20 ` Vishwanath Pai
0 siblings, 1 reply; 26+ messages in thread
From: Yunsheng Lin @ 2020-10-29 10:24 UTC (permalink / raw)
To: Vishwanath Pai, Cong Wang
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet
On 2020/10/29 12:50, Vishwanath Pai wrote:
> On 10/28/20 10:37 PM, Yunsheng Lin wrote:
>> On 2020/10/29 4:04, Vishwanath Pai wrote:
>>> On 10/28/20 1:47 PM, Cong Wang wrote:
>>>> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com> wrote:
>>>>> Hi,
>>>>>
>>>>> We noticed some problems when testing the latest 5.4 LTS kernel and traced it
>>>>> back to this commit using git bisect. When running our tests the machine stops
>>>>> responding to all traffic and the only way to recover is a reboot. I do not see
>>>>> a stack trace on the console.
>>>>
>>>> Do you mean the machine is still running fine just the network is down?
>>>>
>>>> If so, can you dump your tc config with stats when the problem is happening?
>>>> (You can use `tc -s -d qd show ...`.)
>>>>
>>>>>
>>>>> This can be reproduced using the packetdrill test below, it should be run a
>>>>> few times or in a loop. You should hit this issue within a few tries but
>>>>> sometimes might take up to 15-20 tries.
>>>> ...
>>>>> I can reproduce the issue easily on v5.4.68, and after reverting this commit it
>>>>> does not happen anymore.
>>>>
>>>> This is odd. The patch in this thread touches netdev reset path, if packetdrill
>>>> is the only thing you use to trigger the bug (that is netdev is always active),
>>>> I can not connect them.
>>>>
>>>> Thanks.
>>>
>>> Hi Cong,
>>>
>>>> Do you mean the machine is still running fine just the network is down?
>>>
>>> I was able to access the machine via serial console, it looks like it is
>>> up and running, just that networking is down.
>>>
>>>> If so, can you dump your tc config with stats when the problem is happening?
>>>> (You can use `tc -s -d qd show ...`.)
>>>
>>> If I try running tc when the machine is in this state the command never
>>> returns. It doesn't print anything but doesn't exit either.
>>>
>>>> This is odd. The patch in this thread touches netdev reset path, if packetdrill
>>>> is the only thing you use to trigger the bug (that is netdev is always active),
>>>> I can not connect them.
>>>
>>> I think packetdrill creates a tun0 interface when it starts the
>>> test and tears it down at the end, so it might be hitting this code path
>>> during teardown.
>>
>> Hi, Is there any preparation setup before running the above packetdrill test
>> case, I run the above test case in 5.9-rc4 with this patch applied without any
>> preparation setup, did not reproduce it.
>>
>> By the way, I am newbie to packetdrill:), it would be good to provide the
>> detail setup to reproduce it,thanks.
>>
>>>
>>> P.S: My mail server is having connectivity issues with vger.kernel.org
>>> so messages aren't getting delivered to netdev. It'll hopefully get
>>> resolved soon.
>>>
>>> Thanks,
>>> Vishwanath
>>>
>>>
>>> .
>>>
>
> I can't reproduce it on v5.9-rc4 either, it is probably an issue only on
> 5.4 then (and maybe older LTS versions). Can you give it a try on
> 5.4.68?
>
> For running packetdrill, download the latest version from their github
> repo, then run it in a loop without any special arguments. This is what
> I do to reproduce it:
>
> while true; do ./packetdrill <test-file>; done
>
> I don't think any other setup is necessary.
Hi, run the above test for above an hour using 5.4.68, still did not
reproduce it, as below:
root@(none)$ cd /home/root/
root@(none)$ ls
creat_vlan.sh packetdrill test.pd
root@(none)$ cat test.pd
0 `echo 4 > /proc/sys/net/ipv4/tcp_min_tso_segs`
0.400 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.400 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
// set maxseg to 1000 to work with both ipv4 and ipv6
0.500 setsockopt(3, SOL_TCP, TCP_MAXSEG, [1000], 4) = 0
0.500 bind(3, ..., ...) = 0
0.500 listen(3, 1) = 0
// Establish connection
0.600 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 5>
0.600 > S. 0:0(0) ack 1 <...>
0.800 < . 1:1(0) ack 1 win 320
0.800 accept(3, ..., ...) = 4
// Send 4 data segments.
+0 write(4, ..., 4000) = 4000
+0 > P. 1:4001(4000) ack 1
// Receive a SACK
+.1 < . 1:1(0) ack 1 win 320 <sack 1001:2001,nop,nop>
+.3 %{ print "TCP CA state: ",tcpi_ca_state }%
root@(none)$ cat creat_vlan.sh
#!/bin/sh
for((i=0; i<10000; i++))
do
./packetdrill test.pd
done
root@(none)$ ./creat_vlan.sh
TCP CA state: 3
^C
root@(none)$ ifconfig
eth0 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ed
inet addr:192.168.1.93 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3570 errors:0 dropped:0 overruns:0 frame:0
TX packets:3190 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1076349 (1.0 MiB) TX bytes:414874 (405.1 KiB)
eth2 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ec
inet addr:192.168.100.1 Bcast:192.168.100.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:81848576 errors:0 dropped:0 overruns:0 frame:78
TX packets:72497816 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:2044282289568 (1.8 TiB) TX bytes:2457441698852 (2.2 TiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:1 errors:0 dropped:0 overruns:0 frame:0
TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:68 (68.0 B) TX bytes:68 (68.0 B)
root@(none)$ ./creat_vlan.sh
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
TCP CA state: 3
^C
root@(none)$ cat /proc/cmdline
BOOT_IMAGE=/linyunsheng/Image.5.0 rdinit=/init console=ttyAMA0,115200 earlycon=pl011,mmio32,0x94080000 iommu.strict=1
root@(none)$ cat /proc/version
Linux version 5.4.68 (linyunsheng@ubuntu) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12)) #1 SMP PREEMPT Thu Oct 29 16:59:37 CST 2020
root@(none)$
>
> -Vishwanath
>
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-29 10:24 ` Yunsheng Lin
@ 2020-10-29 17:20 ` Vishwanath Pai
2020-11-02 9:08 ` Yunsheng Lin
0 siblings, 1 reply; 26+ messages in thread
From: Vishwanath Pai @ 2020-10-29 17:20 UTC (permalink / raw)
To: Yunsheng Lin, Cong Wang
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet
On 10/29/20 6:24 AM, Yunsheng Lin wrote:
> On 2020/10/29 12:50, Vishwanath Pai wrote:
>> On 10/28/20 10:37 PM, Yunsheng Lin wrote:
>>> On 2020/10/29 4:04, Vishwanath Pai wrote:
>>>> On 10/28/20 1:47 PM, Cong Wang wrote:
>>>>> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com>
wrote:
>>>>>> Hi,
>>>>>>
>>>>>> We noticed some problems when testing the latest 5.4 LTS kernel
and traced it
>>>>>> back to this commit using git bisect. When running our tests the
machine stops
>>>>>> responding to all traffic and the only way to recover is a
reboot. I do not see
>>>>>> a stack trace on the console.
>>>>>
>>>>> Do you mean the machine is still running fine just the network is
down?
>>>>>
>>>>> If so, can you dump your tc config with stats when the problem is
happening?
>>>>> (You can use `tc -s -d qd show ...`.)
>>>>>
>>>>>>
>>>>>> This can be reproduced using the packetdrill test below, it
should be run a
>>>>>> few times or in a loop. You should hit this issue within a few
tries but
>>>>>> sometimes might take up to 15-20 tries.
>>>>> ...
>>>>>> I can reproduce the issue easily on v5.4.68, and after reverting
this commit it
>>>>>> does not happen anymore.
>>>>>
>>>>> This is odd. The patch in this thread touches netdev reset path,
if packetdrill
>>>>> is the only thing you use to trigger the bug (that is netdev is
always active),
>>>>> I can not connect them.
>>>>>
>>>>> Thanks.
>>>>
>>>> Hi Cong,
>>>>
>>>>> Do you mean the machine is still running fine just the network is
down?
>>>>
>>>> I was able to access the machine via serial console, it looks like
it is
>>>> up and running, just that networking is down.
>>>>
>>>>> If so, can you dump your tc config with stats when the problem is
happening?
>>>>> (You can use `tc -s -d qd show ...`.)
>>>>
>>>> If I try running tc when the machine is in this state the command
never
>>>> returns. It doesn't print anything but doesn't exit either.
>>>>
>>>>> This is odd. The patch in this thread touches netdev reset path,
if packetdrill
>>>>> is the only thing you use to trigger the bug (that is netdev is
always active),
>>>>> I can not connect them.
>>>>
>>>> I think packetdrill creates a tun0 interface when it starts the
>>>> test and tears it down at the end, so it might be hitting this
code path
>>>> during teardown.
>>>
>>> Hi, Is there any preparation setup before running the above
packetdrill test
>>> case, I run the above test case in 5.9-rc4 with this patch applied
without any
>>> preparation setup, did not reproduce it.
>>>
>>> By the way, I am newbie to packetdrill:), it would be good to
provide the
>>> detail setup to reproduce it,thanks.
>>>
>>>>
>>>> P.S: My mail server is having connectivity issues with vger.kernel.org
>>>> so messages aren't getting delivered to netdev. It'll hopefully get
>>>> resolved soon.
>>>>
>>>> Thanks,
>>>> Vishwanath
>>>>
>>>>
>>>> .
>>>>
>>
>> I can't reproduce it on v5.9-rc4 either, it is probably an issue only on
>> 5.4 then (and maybe older LTS versions). Can you give it a try on
>> 5.4.68?
>>
>> For running packetdrill, download the latest version from their github
>> repo, then run it in a loop without any special arguments. This is what
>> I do to reproduce it:
>>
>> while true; do ./packetdrill <test-file>; done
>>
>> I don't think any other setup is necessary.
>
> Hi, run the above test for above an hour using 5.4.68, still did not
> reproduce it, as below:
>
>
> root@(none)$ cd /home/root/
> root@(none)$ ls
> creat_vlan.sh packetdrill test.pd
> root@(none)$ cat test.pd
> 0 `echo 4 > /proc/sys/net/ipv4/tcp_min_tso_segs`
>
> 0.400 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
> 0.400 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>
> // set maxseg to 1000 to work with both ipv4 and ipv6
> 0.500 setsockopt(3, SOL_TCP, TCP_MAXSEG, [1000], 4) = 0
> 0.500 bind(3, ..., ...) = 0
> 0.500 listen(3, 1) = 0
>
> // Establish connection
> 0.600 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 5>
> 0.600 > S. 0:0(0) ack 1 <...>
>
> 0.800 < . 1:1(0) ack 1 win 320
> 0.800 accept(3, ..., ...) = 4
>
> // Send 4 data segments.
> +0 write(4, ..., 4000) = 4000
> +0 > P. 1:4001(4000) ack 1
>
> // Receive a SACK
> +.1 < . 1:1(0) ack 1 win 320 <sack 1001:2001,nop,nop>
>
> +.3 %{ print "TCP CA state: ",tcpi_ca_state }%
> root@(none)$ cat creat_vlan.sh
> #!/bin/sh
>
> for((i=0; i<10000; i++))
> do
> ./packetdrill test.pd
> done
> root@(none)$ ./creat_vlan.sh
> TCP CA state: 3
> ^C
> root@(none)$ ifconfig
> eth0 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ed
> inet addr:192.168.1.93 Bcast:192.168.1.255 Mask:255.255.255.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:3570 errors:0 dropped:0 overruns:0 frame:0
> TX packets:3190 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:1076349 (1.0 MiB) TX bytes:414874 (405.1 KiB)
>
> eth2 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ec
> inet addr:192.168.100.1 Bcast:192.168.100.255
Mask:255.255.255.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:81848576 errors:0 dropped:0 overruns:0 frame:78
> TX packets:72497816 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:2044282289568 (1.8 TiB) TX bytes:2457441698852
(2.2 TiB)
>
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> UP LOOPBACK RUNNING MTU:65536 Metric:1
> RX packets:1 errors:0 dropped:0 overruns:0 frame:0
> TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:68 (68.0 B) TX bytes:68 (68.0 B)
>
> root@(none)$ ./creat_vlan.sh
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> TCP CA state: 3
> ^C
> root@(none)$ cat /proc/cmdline
> BOOT_IMAGE=/linyunsheng/Image.5.0 rdinit=/init console=ttyAMA0,115200
earlycon=pl011,mmio32,0x94080000 iommu.strict=1
> root@(none)$ cat /proc/version
> Linux version 5.4.68 (linyunsheng@ubuntu) (gcc version 5.4.0 20160609
(Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12)) #1 SMP PREEMPT Thu Oct 29
16:59:37 CST 2020
> root@(none)$
>
>
>
>>
>> -Vishwanath
>>
>> .
>>
I couldn't get it to reproduce on a ubuntu VM, maybe something is
different with the way we setup our machines. We do have some scripts in
/etc/network/{if-up.d,if-post-down.d} etc, or probably something else.
I'll let you know when I can reliably reproduce it on the VM.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-29 2:52 ` Yunsheng Lin
@ 2020-10-29 19:05 ` Cong Wang
2020-10-30 7:37 ` Yunsheng Lin
0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2020-10-29 19:05 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On Wed, Oct 28, 2020 at 7:54 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2020/9/18 3:26, Cong Wang wrote:
> > On Fri, Sep 11, 2020 at 1:13 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >>
> >> On 2020/9/11 4:07, Cong Wang wrote:
> >>> On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >>>>
> >>>> Currently there is concurrent reset and enqueue operation for the
> >>>> same lockless qdisc when there is no lock to synchronize the
> >>>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
> >>>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
> >>>> out-of-bounds access for priv->ring[] in hns3 driver if user has
> >>>> requested a smaller queue num when __dev_xmit_skb() still enqueue a
> >>>> skb with a larger queue_mapping after the corresponding qdisc is
> >>>> reset, and call hns3_nic_net_xmit() with that skb later.
> >>>>
> >>>> Reused the existing synchronize_net() in dev_deactivate_many() to
> >>>> make sure skb with larger queue_mapping enqueued to old qdisc(which
> >>>> is saved in dev_queue->qdisc_sleeping) will always be reset when
> >>>> dev_reset_queue() is called.
> >>>>
> >>>> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
> >>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> >>>> ---
> >>>> ChangeLog V2:
> >>>> Reuse existing synchronize_net().
> >>>> ---
> >>>> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
> >>>> 1 file changed, 33 insertions(+), 15 deletions(-)
> >>>>
> >>>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> >>>> index 265a61d..54c4172 100644
> >>>> --- a/net/sched/sch_generic.c
> >>>> +++ b/net/sched/sch_generic.c
> >>>> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
> >>>>
> >>>> static void qdisc_deactivate(struct Qdisc *qdisc)
> >>>> {
> >>>> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
> >>>> -
> >>>> if (qdisc->flags & TCQ_F_BUILTIN)
> >>>> return;
> >>>> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
> >>>> - return;
> >>>> -
> >>>> - if (nolock)
> >>>> - spin_lock_bh(&qdisc->seqlock);
> >>>> - spin_lock_bh(qdisc_lock(qdisc));
> >>>>
> >>>> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
> >>>> -
> >>>> - qdisc_reset(qdisc);
> >>>> -
> >>>> - spin_unlock_bh(qdisc_lock(qdisc));
> >>>> - if (nolock)
> >>>> - spin_unlock_bh(&qdisc->seqlock);
> >>>> }
> >>>>
> >>>> static void dev_deactivate_queue(struct net_device *dev,
> >>>> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
> >>>> }
> >>>> }
> >>>>
> >>>> +static void dev_reset_queue(struct net_device *dev,
> >>>> + struct netdev_queue *dev_queue,
> >>>> + void *_unused)
> >>>> +{
> >>>> + struct Qdisc *qdisc;
> >>>> + bool nolock;
> >>>> +
> >>>> + qdisc = dev_queue->qdisc_sleeping;
> >>>> + if (!qdisc)
> >>>> + return;
> >>>> +
> >>>> + nolock = qdisc->flags & TCQ_F_NOLOCK;
> >>>> +
> >>>> + if (nolock)
> >>>> + spin_lock_bh(&qdisc->seqlock);
> >>>> + spin_lock_bh(qdisc_lock(qdisc));
> >>>
> >>>
> >>> I think you do not need this lock for lockless one.
> >>
> >> It seems so.
> >> Maybe another patch to remove qdisc_lock(qdisc) for lockless
> >> qdisc?
> >
> > Yeah, but not sure if we still want this lockless qdisc any more,
> > it brings more troubles than gains.
> >
> >>
> >>
> >>>
> >>>> +
> >>>> + qdisc_reset(qdisc);
> >>>> +
> >>>> + spin_unlock_bh(qdisc_lock(qdisc));
> >>>> + if (nolock)
> >>>> + spin_unlock_bh(&qdisc->seqlock);
> >>>> +}
> >>>> +
> >>>> static bool some_qdisc_is_busy(struct net_device *dev)
> >>>> {
> >>>> unsigned int i;
> >>>> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
> >>>> dev_watchdog_down(dev);
> >>>> }
> >>>>
> >>>> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
> >>>> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
> >>>> + * outstanding qdisc enqueuing calls.
> >>>> * This is avoided if all devices are in dismantle phase :
> >>>> * Caller will call synchronize_net() for us
> >>>> */
> >>>> synchronize_net();
> >>>>
> >>>> + list_for_each_entry(dev, head, close_list) {
> >>>> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
> >>>> +
> >>>> + if (dev_ingress_queue(dev))
> >>>> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
> >>>> + }
> >>>> +
> >>>> /* Wait for outstanding qdisc_run calls. */
> >>>> list_for_each_entry(dev, head, close_list) {
> >>>> while (some_qdisc_is_busy(dev)) {
> >>>
> >>> Do you want to reset before waiting for TX action?
> >>>
> >>> I think it is safer to do it after, at least prior to commit 759ae57f1b
> >>> we did after.
> >>
> >> The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
> >> should be enought to ensure there is no skb enqueued to the old qdisc that is saved
> >> in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
> >> after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
> >> enqueued to it will be dropped and freed, right?
> >
> > Hmm? In net_tx_action(), we do not hold RCU read lock, and we do not
> > reference qdisc via txq->qdisc but via sd->output_queue.
>
> Sorry for the delay reply, I seems to miss this.
>
> I assumed synchronize_net() also wait for outstanding softirq to finish, right?
I do not see how and why it should. synchronize_net() is merely an optimized
version of synchronize_rcu(), it should wait for RCU readers, softirqs are not
necessarily RCU readers, net_tx_action() does not take RCU read lock either.
>
> >
> >
> >>
> >> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
> >> can move it after some_qdisc_is_busy() checking.
> >
> > I am not suggesting to do an additional reset, I am suggesting to move
> > your reset after the busy waiting.
>
> There maybe a deadlock here if we reset the qdisc after the some_qdisc_is_busy() checking,
> because some_qdisc_is_busy() may require the qdisc reset to clear the skb, so that
some_qdisc_is_busy() checks the status of qdisc, not the skb queue.
> some_qdisc_is_busy() can return false. I am not sure this is really a problem, but
> sch_direct_xmit() may requeue the skb when dev_hard_start_xmit return TX_BUSY.
Sounds like another reason we should move the reset as late as possible?
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-29 19:05 ` Cong Wang
@ 2020-10-30 7:37 ` Yunsheng Lin
2020-11-02 16:55 ` Cong Wang
0 siblings, 1 reply; 26+ messages in thread
From: Yunsheng Lin @ 2020-10-30 7:37 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/10/30 3:05, Cong Wang wrote:
> On Wed, Oct 28, 2020 at 7:54 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> On 2020/9/18 3:26, Cong Wang wrote:
>>> On Fri, Sep 11, 2020 at 1:13 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>>
>>>> On 2020/9/11 4:07, Cong Wang wrote:
>>>>> On Tue, Sep 8, 2020 at 4:06 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>>>>
>>>>>> Currently there is concurrent reset and enqueue operation for the
>>>>>> same lockless qdisc when there is no lock to synchronize the
>>>>>> q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
>>>>>> qdisc_deactivate() called by dev_deactivate_queue(), which may cause
>>>>>> out-of-bounds access for priv->ring[] in hns3 driver if user has
>>>>>> requested a smaller queue num when __dev_xmit_skb() still enqueue a
>>>>>> skb with a larger queue_mapping after the corresponding qdisc is
>>>>>> reset, and call hns3_nic_net_xmit() with that skb later.
>>>>>>
>>>>>> Reused the existing synchronize_net() in dev_deactivate_many() to
>>>>>> make sure skb with larger queue_mapping enqueued to old qdisc(which
>>>>>> is saved in dev_queue->qdisc_sleeping) will always be reset when
>>>>>> dev_reset_queue() is called.
>>>>>>
>>>>>> Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
>>>>>> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
>>>>>> ---
>>>>>> ChangeLog V2:
>>>>>> Reuse existing synchronize_net().
>>>>>> ---
>>>>>> net/sched/sch_generic.c | 48 +++++++++++++++++++++++++++++++++---------------
>>>>>> 1 file changed, 33 insertions(+), 15 deletions(-)
>>>>>>
>>>>>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>>>>>> index 265a61d..54c4172 100644
>>>>>> --- a/net/sched/sch_generic.c
>>>>>> +++ b/net/sched/sch_generic.c
>>>>>> @@ -1131,24 +1131,10 @@ EXPORT_SYMBOL(dev_activate);
>>>>>>
>>>>>> static void qdisc_deactivate(struct Qdisc *qdisc)
>>>>>> {
>>>>>> - bool nolock = qdisc->flags & TCQ_F_NOLOCK;
>>>>>> -
>>>>>> if (qdisc->flags & TCQ_F_BUILTIN)
>>>>>> return;
>>>>>> - if (test_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state))
>>>>>> - return;
>>>>>> -
>>>>>> - if (nolock)
>>>>>> - spin_lock_bh(&qdisc->seqlock);
>>>>>> - spin_lock_bh(qdisc_lock(qdisc));
>>>>>>
>>>>>> set_bit(__QDISC_STATE_DEACTIVATED, &qdisc->state);
>>>>>> -
>>>>>> - qdisc_reset(qdisc);
>>>>>> -
>>>>>> - spin_unlock_bh(qdisc_lock(qdisc));
>>>>>> - if (nolock)
>>>>>> - spin_unlock_bh(&qdisc->seqlock);
>>>>>> }
>>>>>>
>>>>>> static void dev_deactivate_queue(struct net_device *dev,
>>>>>> @@ -1165,6 +1151,30 @@ static void dev_deactivate_queue(struct net_device *dev,
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> +static void dev_reset_queue(struct net_device *dev,
>>>>>> + struct netdev_queue *dev_queue,
>>>>>> + void *_unused)
>>>>>> +{
>>>>>> + struct Qdisc *qdisc;
>>>>>> + bool nolock;
>>>>>> +
>>>>>> + qdisc = dev_queue->qdisc_sleeping;
>>>>>> + if (!qdisc)
>>>>>> + return;
>>>>>> +
>>>>>> + nolock = qdisc->flags & TCQ_F_NOLOCK;
>>>>>> +
>>>>>> + if (nolock)
>>>>>> + spin_lock_bh(&qdisc->seqlock);
>>>>>> + spin_lock_bh(qdisc_lock(qdisc));
>>>>>
>>>>>
>>>>> I think you do not need this lock for lockless one.
>>>>
>>>> It seems so.
>>>> Maybe another patch to remove qdisc_lock(qdisc) for lockless
>>>> qdisc?
>>>
>>> Yeah, but not sure if we still want this lockless qdisc any more,
>>> it brings more troubles than gains.
>>>
>>>>
>>>>
>>>>>
>>>>>> +
>>>>>> + qdisc_reset(qdisc);
>>>>>> +
>>>>>> + spin_unlock_bh(qdisc_lock(qdisc));
>>>>>> + if (nolock)
>>>>>> + spin_unlock_bh(&qdisc->seqlock);
>>>>>> +}
>>>>>> +
>>>>>> static bool some_qdisc_is_busy(struct net_device *dev)
>>>>>> {
>>>>>> unsigned int i;
>>>>>> @@ -1213,12 +1223,20 @@ void dev_deactivate_many(struct list_head *head)
>>>>>> dev_watchdog_down(dev);
>>>>>> }
>>>>>>
>>>>>> - /* Wait for outstanding qdisc-less dev_queue_xmit calls.
>>>>>> + /* Wait for outstanding qdisc-less dev_queue_xmit calls or
>>>>>> + * outstanding qdisc enqueuing calls.
>>>>>> * This is avoided if all devices are in dismantle phase :
>>>>>> * Caller will call synchronize_net() for us
>>>>>> */
>>>>>> synchronize_net();
>>>>>>
>>>>>> + list_for_each_entry(dev, head, close_list) {
>>>>>> + netdev_for_each_tx_queue(dev, dev_reset_queue, NULL);
>>>>>> +
>>>>>> + if (dev_ingress_queue(dev))
>>>>>> + dev_reset_queue(dev, dev_ingress_queue(dev), NULL);
>>>>>> + }
>>>>>> +
>>>>>> /* Wait for outstanding qdisc_run calls. */
>>>>>> list_for_each_entry(dev, head, close_list) {
>>>>>> while (some_qdisc_is_busy(dev)) {
>>>>>
>>>>> Do you want to reset before waiting for TX action?
>>>>>
>>>>> I think it is safer to do it after, at least prior to commit 759ae57f1b
>>>>> we did after.
>>>>
>>>> The reference to the txq->qdisc is always protected by RCU, so the synchronize_net()
>>>> should be enought to ensure there is no skb enqueued to the old qdisc that is saved
>>>> in the dev_queue->qdisc_sleeping, because __dev_queue_xmit can only see the new qdisc
>>>> after synchronize_net(), which is noop_qdisc, and noop_qdisc will make sure any skb
>>>> enqueued to it will be dropped and freed, right?
>>>
>>> Hmm? In net_tx_action(), we do not hold RCU read lock, and we do not
>>> reference qdisc via txq->qdisc but via sd->output_queue.
>>
>> Sorry for the delay reply, I seems to miss this.
>>
>> I assumed synchronize_net() also wait for outstanding softirq to finish, right?
>
> I do not see how and why it should. synchronize_net() is merely an optimized
> version of synchronize_rcu(), it should wait for RCU readers, softirqs are not
> necessarily RCU readers, net_tx_action() does not take RCU read lock either.
Ok, make sense.
Taking RCU read lock in net_tx_action() does not seems to solve the problem,
what about the time window between __netif_reschedule() and net_tx_action()?
It seems we need to re-dereference the qdisc whenever RCU read lock is released
and qdisc is still in sd->output_queue or wait for the sd->output_queue to drain?
>
>>
>>>
>>>
>>>>
>>>> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
>>>> can move it after some_qdisc_is_busy() checking.
>>>
>>> I am not suggesting to do an additional reset, I am suggesting to move
>>> your reset after the busy waiting.
>>
>> There maybe a deadlock here if we reset the qdisc after the some_qdisc_is_busy() checking,
>> because some_qdisc_is_busy() may require the qdisc reset to clear the skb, so that
>
> some_qdisc_is_busy() checks the status of qdisc, not the skb queue.
Is there any reason why we do not check the skb queue in the dqisc?
It seems there may be skb left when netdev is deactivated, maybe at least warn
about that when there is still skb left when netdev is deactivated?
Is that why we call qdisc_reset() to clear the leftover skb in qdisc_destroy()?
>
>
>> some_qdisc_is_busy() can return false. I am not sure this is really a problem, but
>> sch_direct_xmit() may requeue the skb when dev_hard_start_xmit return TX_BUSY.
>
> Sounds like another reason we should move the reset as late as possible?
Why?
There current netdev down order is mainly below:
netif_tx_stop_all_queues()
dev_deactivate_queue()
synchronize_net()
dev_reset_queue()
some_qdisc_is_busy()
You suggest to change it to below order, right?
netif_tx_stop_all_queues()
dev_deactivate_queue()
synchronize_net()
some_qdisc_is_busy()
dev_reset_queue()
What is the semantics of some_qdisc_is_busy()?
From my understanding, we can do anything about the old qdisc (including
destorying the old qdisc) after some_qdisc_is_busy() return false.
>
> Thanks.
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-29 17:20 ` Vishwanath Pai
@ 2020-11-02 9:08 ` Yunsheng Lin
2020-11-02 18:23 ` Vishwanath Pai
0 siblings, 1 reply; 26+ messages in thread
From: Yunsheng Lin @ 2020-11-02 9:08 UTC (permalink / raw)
To: Vishwanath Pai, Cong Wang
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet, dsahern, Joakim Tjernlund
On 2020/10/30 1:20, Vishwanath Pai wrote:
> On 10/29/20 6:24 AM, Yunsheng Lin wrote:
>> On 2020/10/29 12:50, Vishwanath Pai wrote:
>>> On 10/28/20 10:37 PM, Yunsheng Lin wrote:
>>>> On 2020/10/29 4:04, Vishwanath Pai wrote:
>>>>> On 10/28/20 1:47 PM, Cong Wang wrote:
>>>>>> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath <vpai@akamai.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> We noticed some problems when testing the latest 5.4 LTS kernel and traced it
>>>>>>> back to this commit using git bisect. When running our tests the machine stops
>>>>>>> responding to all traffic and the only way to recover is a reboot. I do not see
>>>>>>> a stack trace on the console.
>>>>>>
>>>>>> Do you mean the machine is still running fine just the network is down?
>>>>>>
>>>>>> If so, can you dump your tc config with stats when the problem is happening?
>>>>>> (You can use `tc -s -d qd show ...`.)
>>>>>>
>>>>>>>
>>>>>>> This can be reproduced using the packetdrill test below, it should be run a
>>>>>>> few times or in a loop. You should hit this issue within a few tries but
>>>>>>> sometimes might take up to 15-20 tries.
>>>>>> ...
>>>>>>> I can reproduce the issue easily on v5.4.68, and after reverting this commit it
>>>>>>> does not happen anymore.
>>>>>>
>>>>>> This is odd. The patch in this thread touches netdev reset path, if packetdrill
>>>>>> is the only thing you use to trigger the bug (that is netdev is always active),
>>>>>> I can not connect them.
>>>>>>
>>>>>> Thanks.
>>>>>
>>>>> Hi Cong,
>>>>>
>>>>>> Do you mean the machine is still running fine just the network is down?
>>>>>
>>>>> I was able to access the machine via serial console, it looks like it is
>>>>> up and running, just that networking is down.
>>>>>
>>>>>> If so, can you dump your tc config with stats when the problem is happening?
>>>>>> (You can use `tc -s -d qd show ...`.)
>>>>>
>>>>> If I try running tc when the machine is in this state the command never
>>>>> returns. It doesn't print anything but doesn't exit either.
>>>>>
>>>>>> This is odd. The patch in this thread touches netdev reset path, if packetdrill
>>>>>> is the only thing you use to trigger the bug (that is netdev is always active),
>>>>>> I can not connect them.
>>>>>
>>>>> I think packetdrill creates a tun0 interface when it starts the
>>>>> test and tears it down at the end, so it might be hitting this code path
>>>>> during teardown.
>>>>
>>>> Hi, Is there any preparation setup before running the above packetdrill test
>>>> case, I run the above test case in 5.9-rc4 with this patch applied without any
>>>> preparation setup, did not reproduce it.
>>>>
>>>> By the way, I am newbie to packetdrill:), it would be good to provide the
>>>> detail setup to reproduce it,thanks.
>>>>
>>>>>
>>>>> P.S: My mail server is having connectivity issues with vger.kernel.org
>>>>> so messages aren't getting delivered to netdev. It'll hopefully get
>>>>> resolved soon.
>>>>>
>>>>> Thanks,
>>>>> Vishwanath
>>>>>
>>>>>
>>>>> .
>>>>>
>>>
>>> I can't reproduce it on v5.9-rc4 either, it is probably an issue only on
>>> 5.4 then (and maybe older LTS versions). Can you give it a try on
>>> 5.4.68?
>>>
>>> For running packetdrill, download the latest version from their github
>>> repo, then run it in a loop without any special arguments. This is what
>>> I do to reproduce it:
>>>
>>> while true; do ./packetdrill <test-file>; done
>>>
>>> I don't think any other setup is necessary.
>>
>> Hi, run the above test for above an hour using 5.4.68, still did not
>> reproduce it, as below:
>>
>>
>> root@(none)$ cd /home/root/
>> root@(none)$ ls
>> creat_vlan.sh packetdrill test.pd
>> root@(none)$ cat test.pd
>> 0 `echo 4 > /proc/sys/net/ipv4/tcp_min_tso_segs`
>>
>> 0.400 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
>> 0.400 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>
>> // set maxseg to 1000 to work with both ipv4 and ipv6
>> 0.500 setsockopt(3, SOL_TCP, TCP_MAXSEG, [1000], 4) = 0
>> 0.500 bind(3, ..., ...) = 0
>> 0.500 listen(3, 1) = 0
>>
>> // Establish connection
>> 0.600 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 5>
>> 0.600 > S. 0:0(0) ack 1 <...>
>>
>> 0.800 < . 1:1(0) ack 1 win 320
>> 0.800 accept(3, ..., ...) = 4
>>
>> // Send 4 data segments.
>> +0 write(4, ..., 4000) = 4000
>> +0 > P. 1:4001(4000) ack 1
>>
>> // Receive a SACK
>> +.1 < . 1:1(0) ack 1 win 320 <sack 1001:2001,nop,nop>
>>
>> +.3 %{ print "TCP CA state: ",tcpi_ca_state }%
>> root@(none)$ cat creat_vlan.sh
>> #!/bin/sh
>>
>> for((i=0; i<10000; i++))
>> do
>> ./packetdrill test.pd
>> done
>> root@(none)$ ./creat_vlan.sh
>> TCP CA state: 3
>> ^C
>> root@(none)$ ifconfig
>> eth0 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ed
>> inet addr:192.168.1.93 Bcast:192.168.1.255 Mask:255.255.255.0
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> RX packets:3570 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:3190 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:1076349 (1.0 MiB) TX bytes:414874 (405.1 KiB)
>>
>> eth2 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ec
>> inet addr:192.168.100.1 Bcast:192.168.100.255 Mask:255.255.255.0
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> RX packets:81848576 errors:0 dropped:0 overruns:0 frame:78
>> TX packets:72497816 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:2044282289568 (1.8 TiB) TX bytes:2457441698852 (2.2 TiB)
>>
>> lo Link encap:Local Loopback
>> inet addr:127.0.0.1 Mask:255.0.0.0
>> UP LOOPBACK RUNNING MTU:65536 Metric:1
>> RX packets:1 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:68 (68.0 B) TX bytes:68 (68.0 B)
>>
>> root@(none)$ ./creat_vlan.sh
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> TCP CA state: 3
>> ^C
>> root@(none)$ cat /proc/cmdline
>> BOOT_IMAGE=/linyunsheng/Image.5.0 rdinit=/init console=ttyAMA0,115200 earlycon=pl011,mmio32,0x94080000 iommu.strict=1
>> root@(none)$ cat /proc/version
>> Linux version 5.4.68 (linyunsheng@ubuntu) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12)) #1 SMP PREEMPT Thu Oct 29 16:59:37 CST 2020
>> root@(none)$
>>
>>
>>
>>>
>>> -Vishwanath
>>>
>>> .
>>>
> I couldn't get it to reproduce on a ubuntu VM, maybe something is
> different with the way we setup our machines. We do have some scripts in
> /etc/network/{if-up.d,if-post-down.d} etc, or probably something else.
> I'll let you know when I can reliably reproduce it on the VM.
Hi, Vishwanath
Please see if the patch in the below link fix your problem, thanks.
https://www.spinics.net/lists/netdev/msg695908.html
>
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-10-30 7:37 ` Yunsheng Lin
@ 2020-11-02 16:55 ` Cong Wang
2020-11-03 7:24 ` Yunsheng Lin
0 siblings, 1 reply; 26+ messages in thread
From: Cong Wang @ 2020-11-02 16:55 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On Fri, Oct 30, 2020 at 12:38 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2020/10/30 3:05, Cong Wang wrote:
> >
> > I do not see how and why it should. synchronize_net() is merely an optimized
> > version of synchronize_rcu(), it should wait for RCU readers, softirqs are not
> > necessarily RCU readers, net_tx_action() does not take RCU read lock either.
>
> Ok, make sense.
>
> Taking RCU read lock in net_tx_action() does not seems to solve the problem,
> what about the time window between __netif_reschedule() and net_tx_action()?
>
> It seems we need to re-dereference the qdisc whenever RCU read lock is released
> and qdisc is still in sd->output_queue or wait for the sd->output_queue to drain?
Not suggesting you to take RCU read lock. We already wait for TX action with
a loop of sleep. To me, the only thing missing is just moving the
reset after that
wait.
> >>>> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
> >>>> can move it after some_qdisc_is_busy() checking.
> >>>
> >>> I am not suggesting to do an additional reset, I am suggesting to move
> >>> your reset after the busy waiting.
> >>
> >> There maybe a deadlock here if we reset the qdisc after the some_qdisc_is_busy() checking,
> >> because some_qdisc_is_busy() may require the qdisc reset to clear the skb, so that
> >
> > some_qdisc_is_busy() checks the status of qdisc, not the skb queue.
>
> Is there any reason why we do not check the skb queue in the dqisc?
> It seems there may be skb left when netdev is deactivated, maybe at least warn
> about that when there is still skb left when netdev is deactivated?
> Is that why we call qdisc_reset() to clear the leftover skb in qdisc_destroy()?
>
> >
> >
> >> some_qdisc_is_busy() can return false. I am not sure this is really a problem, but
> >> sch_direct_xmit() may requeue the skb when dev_hard_start_xmit return TX_BUSY.
> >
> > Sounds like another reason we should move the reset as late as possible?
>
> Why?
You said "sch_direct_xmit() may requeue the skb", I agree. I assume you mean
net_tx_action() calls sch_direct_xmit() which does the requeue then races with
reset. No?
>
> There current netdev down order is mainly below:
>
> netif_tx_stop_all_queues()
>
> dev_deactivate_queue()
>
> synchronize_net()
>
> dev_reset_queue()
>
> some_qdisc_is_busy()
>
>
> You suggest to change it to below order, right?
>
> netif_tx_stop_all_queues()
>
> dev_deactivate_queue()
>
> synchronize_net()
>
> some_qdisc_is_busy()
>
> dev_reset_queue()
Yes.
>
>
> What is the semantics of some_qdisc_is_busy()?
Waiting for flying TX action.
> From my understanding, we can do anything about the old qdisc (including
> destorying the old qdisc) after some_qdisc_is_busy() return false.
But the current code does the reset _before_ some_qdisc_is_busy(). ;)
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-11-02 9:08 ` Yunsheng Lin
@ 2020-11-02 18:23 ` Vishwanath Pai
0 siblings, 0 replies; 26+ messages in thread
From: Vishwanath Pai @ 2020-11-02 18:23 UTC (permalink / raw)
To: Yunsheng Lin, Cong Wang
Cc: Hunt, Joshua, Jamal Hadi Salim, Jiri Pirko, David Miller,
Jakub Kicinski, Linux Kernel Network Developers, LKML, linuxarm,
John Fastabend, Eric Dumazet, dsahern, Joakim Tjernlund
On 11/2/20 4:08 AM, Yunsheng Lin wrote:
> On 2020/10/30 1:20, Vishwanath Pai wrote:
>> On 10/29/20 6:24 AM, Yunsheng Lin wrote:
>>> On 2020/10/29 12:50, Vishwanath Pai wrote:
>>>> On 10/28/20 10:37 PM, Yunsheng Lin wrote:
>>>>> On 2020/10/29 4:04, Vishwanath Pai wrote:
>>>>>> On 10/28/20 1:47 PM, Cong Wang wrote:
>>>>>>> On Wed, Oct 28, 2020 at 8:37 AM Pai, Vishwanath
<vpai@akamai.com> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We noticed some problems when testing the latest 5.4 LTS
kernel and traced it
>>>>>>>> back to this commit using git bisect. When running our tests
the machine stops
>>>>>>>> responding to all traffic and the only way to recover is a
reboot. I do not see
>>>>>>>> a stack trace on the console.
>>>>>>>
>>>>>>> Do you mean the machine is still running fine just the network
is down?
>>>>>>>
>>>>>>> If so, can you dump your tc config with stats when the problem
is happening?
>>>>>>> (You can use `tc -s -d qd show ...`.)
>>>>>>>
>>>>>>>>
>>>>>>>> This can be reproduced using the packetdrill test below, it
should be run a
>>>>>>>> few times or in a loop. You should hit this issue within a few
tries but
>>>>>>>> sometimes might take up to 15-20 tries.
>>>>>>> ...
>>>>>>>> I can reproduce the issue easily on v5.4.68, and after
reverting this commit it
>>>>>>>> does not happen anymore.
>>>>>>>
>>>>>>> This is odd. The patch in this thread touches netdev reset
path, if packetdrill
>>>>>>> is the only thing you use to trigger the bug (that is netdev is
always active),
>>>>>>> I can not connect them.
>>>>>>>
>>>>>>> Thanks.
>>>>>>
>>>>>> Hi Cong,
>>>>>>
>>>>>>> Do you mean the machine is still running fine just the network
is down?
>>>>>>
>>>>>> I was able to access the machine via serial console, it looks
like it is
>>>>>> up and running, just that networking is down.
>>>>>>
>>>>>>> If so, can you dump your tc config with stats when the problem
is happening?
>>>>>>> (You can use `tc -s -d qd show ...`.)
>>>>>>
>>>>>> If I try running tc when the machine is in this state the
command never
>>>>>> returns. It doesn't print anything but doesn't exit either.
>>>>>>
>>>>>>> This is odd. The patch in this thread touches netdev reset
path, if packetdrill
>>>>>>> is the only thing you use to trigger the bug (that is netdev is
always active),
>>>>>>> I can not connect them.
>>>>>>
>>>>>> I think packetdrill creates a tun0 interface when it starts the
>>>>>> test and tears it down at the end, so it might be hitting this
code path
>>>>>> during teardown.
>>>>>
>>>>> Hi, Is there any preparation setup before running the above
packetdrill test
>>>>> case, I run the above test case in 5.9-rc4 with this patch
applied without any
>>>>> preparation setup, did not reproduce it.
>>>>>
>>>>> By the way, I am newbie to packetdrill:), it would be good to
provide the
>>>>> detail setup to reproduce it,thanks.
>>>>>
>>>>>>
>>>>>> P.S: My mail server is having connectivity issues with
vger.kernel.org
>>>>>> so messages aren't getting delivered to netdev. It'll hopefully get
>>>>>> resolved soon.
>>>>>>
>>>>>> Thanks,
>>>>>> Vishwanath
>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>
>>>> I can't reproduce it on v5.9-rc4 either, it is probably an issue
only on
>>>> 5.4 then (and maybe older LTS versions). Can you give it a try on
>>>> 5.4.68?
>>>>
>>>> For running packetdrill, download the latest version from their github
>>>> repo, then run it in a loop without any special arguments. This is
what
>>>> I do to reproduce it:
>>>>
>>>> while true; do ./packetdrill <test-file>; done
>>>>
>>>> I don't think any other setup is necessary.
>>>
>>> Hi, run the above test for above an hour using 5.4.68, still did not
>>> reproduce it, as below:
>>>
>>>
>>> root@(none)$ cd /home/root/
>>> root@(none)$ ls
>>> creat_vlan.sh packetdrill test.pd
>>> root@(none)$ cat test.pd
>>> 0 `echo 4 > /proc/sys/net/ipv4/tcp_min_tso_segs`
>>>
>>> 0.400 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
>>> 0.400 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>
>>> // set maxseg to 1000 to work with both ipv4 and ipv6
>>> 0.500 setsockopt(3, SOL_TCP, TCP_MAXSEG, [1000], 4) = 0
>>> 0.500 bind(3, ..., ...) = 0
>>> 0.500 listen(3, 1) = 0
>>>
>>> // Establish connection
>>> 0.600 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 5>
>>> 0.600 > S. 0:0(0) ack 1 <...>
>>>
>>> 0.800 < . 1:1(0) ack 1 win 320
>>> 0.800 accept(3, ..., ...) = 4
>>>
>>> // Send 4 data segments.
>>> +0 write(4, ..., 4000) = 4000
>>> +0 > P. 1:4001(4000) ack 1
>>>
>>> // Receive a SACK
>>> +.1 < . 1:1(0) ack 1 win 320 <sack 1001:2001,nop,nop>
>>>
>>> +.3 %{ print "TCP CA state: ",tcpi_ca_state }%
>>> root@(none)$ cat creat_vlan.sh
>>> #!/bin/sh
>>>
>>> for((i=0; i<10000; i++))
>>> do
>>> ./packetdrill test.pd
>>> done
>>> root@(none)$ ./creat_vlan.sh
>>> TCP CA state: 3
>>> ^C
>>> root@(none)$ ifconfig
>>> eth0 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ed
>>> inet addr:192.168.1.93 Bcast:192.168.1.255
Mask:255.255.255.0
>>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>>> RX packets:3570 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:3190 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:1000
>>> RX bytes:1076349 (1.0 MiB) TX bytes:414874 (405.1 KiB)
>>>
>>> eth2 Link encap:Ethernet HWaddr 5c:e8:83:0d:f7:ec
>>> inet addr:192.168.100.1 Bcast:192.168.100.255
Mask:255.255.255.0
>>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>>> RX packets:81848576 errors:0 dropped:0 overruns:0 frame:78
>>> TX packets:72497816 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:1000
>>> RX bytes:2044282289568 (1.8 TiB) TX bytes:2457441698852
(2.2 TiB)
>>>
>>> lo Link encap:Local Loopback
>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>> UP LOOPBACK RUNNING MTU:65536 Metric:1
>>> RX packets:1 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
>>> collisions:0 txqueuelen:1000
>>> RX bytes:68 (68.0 B) TX bytes:68 (68.0 B)
>>>
>>> root@(none)$ ./creat_vlan.sh
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> TCP CA state: 3
>>> ^C
>>> root@(none)$ cat /proc/cmdline
>>> BOOT_IMAGE=/linyunsheng/Image.5.0 rdinit=/init
console=ttyAMA0,115200 earlycon=pl011,mmio32,0x94080000 iommu.strict=1
>>> root@(none)$ cat /proc/version
>>> Linux version 5.4.68 (linyunsheng@ubuntu) (gcc version 5.4.0
20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.12)) #1 SMP PREEMPT Thu Oct
29 16:59:37 CST 2020
>>> root@(none)$
>>>
>>>
>>>
>>>>
>>>> -Vishwanath
>>>>
>>>> .
>>>>
>> I couldn't get it to reproduce on a ubuntu VM, maybe something is
>> different with the way we setup our machines. We do have some scripts in
>> /etc/network/{if-up.d,if-post-down.d} etc, or probably something else.
>> I'll let you know when I can reliably reproduce it on the VM.
>
> Hi, Vishwanath
> Please see if the patch in the below link fix your problem, thanks.
>
https://urldefense.com/v3/__https://www.spinics.net/lists/netdev/msg695908.html__;!!GjvTz_vk!Gc1MbfkLP7lWuWFxK2Bauofrqq53Tt79VJaozx08T8cmH0dRxAXuilS7xFt0$
>
>>
>> .
>>
I tried out the patch on top of 5.4.71 and it fixes the issue for me
too. Thanks for the quick fix.
- Vishwanath
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-11-02 16:55 ` Cong Wang
@ 2020-11-03 7:24 ` Yunsheng Lin
2020-11-05 6:04 ` Cong Wang
0 siblings, 1 reply; 26+ messages in thread
From: Yunsheng Lin @ 2020-11-03 7:24 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/11/3 0:55, Cong Wang wrote:
> On Fri, Oct 30, 2020 at 12:38 AM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>
>> On 2020/10/30 3:05, Cong Wang wrote:
>>>
>>> I do not see how and why it should. synchronize_net() is merely an optimized
>>> version of synchronize_rcu(), it should wait for RCU readers, softirqs are not
>>> necessarily RCU readers, net_tx_action() does not take RCU read lock either.
>>
>> Ok, make sense.
>>
>> Taking RCU read lock in net_tx_action() does not seems to solve the problem,
>> what about the time window between __netif_reschedule() and net_tx_action()?
>>
>> It seems we need to re-dereference the qdisc whenever RCU read lock is released
>> and qdisc is still in sd->output_queue or wait for the sd->output_queue to drain?
>
> Not suggesting you to take RCU read lock. We already wait for TX action with
> a loop of sleep. To me, the only thing missing is just moving the
> reset after that
> wait.
__QDISC_STATE_SCHED is cleared before calling qdisc_run() in net_tx_action(),
some_qdisc_is_busy does not seem to wait fully for TX action, at least
qdisc is still being accessed even if __QDISC_STATE_DEACTIVATED is set.
>
>
>>>>>> If we do any additional reset that is not related to qdisc in dev_reset_queue(), we
>>>>>> can move it after some_qdisc_is_busy() checking.
>>>>>
>>>>> I am not suggesting to do an additional reset, I am suggesting to move
>>>>> your reset after the busy waiting.
>>>>
>>>> There maybe a deadlock here if we reset the qdisc after the some_qdisc_is_busy() checking,
>>>> because some_qdisc_is_busy() may require the qdisc reset to clear the skb, so that
>>>
>>> some_qdisc_is_busy() checks the status of qdisc, not the skb queue.
>>
>> Is there any reason why we do not check the skb queue in the dqisc?
>> It seems there may be skb left when netdev is deactivated, maybe at least warn
>> about that when there is still skb left when netdev is deactivated?
>> Is that why we call qdisc_reset() to clear the leftover skb in qdisc_destroy()?
>>
>>>
>>>
>>>> some_qdisc_is_busy() can return false. I am not sure this is really a problem, but
>>>> sch_direct_xmit() may requeue the skb when dev_hard_start_xmit return TX_BUSY.
>>>
>>> Sounds like another reason we should move the reset as late as possible?
>>
>> Why?
>
> You said "sch_direct_xmit() may requeue the skb", I agree. I assume you mean
> net_tx_action() calls sch_direct_xmit() which does the requeue then races with
> reset. No?
>
Look at current code again, I think there is no race between sch_direct_xmit()
in net_tx_action() and dev_reset_queue() in dev_deactivate_many(), because
qdisc_lock(qdisc) or qdisc->seqlock has been taken when calling sch_direct_xmit()
or dev_reset_queue().
>
>>
>> There current netdev down order is mainly below:
>>
>> netif_tx_stop_all_queues()
>>
>> dev_deactivate_queue()
>>
>> synchronize_net()
>>
>> dev_reset_queue()
>>
>> some_qdisc_is_busy()
>>
>>
>> You suggest to change it to below order, right?
>>
>> netif_tx_stop_all_queues()
>>
>> dev_deactivate_queue()
>>
>> synchronize_net()
>>
>> some_qdisc_is_busy()
>>
>> dev_reset_queue()
>
> Yes.
>
>>
>>
>> What is the semantics of some_qdisc_is_busy()?
>
> Waiting for flying TX action.
It wait for __QDISC_STATE_SCHED to clear and qdisc running to finish, but
there is still time window between __QDISC_STATE_SCHED clearing and qdisc
running, right?
>
>> From my understanding, we can do anything about the old qdisc (including
>> destorying the old qdisc) after some_qdisc_is_busy() return false.
>
> But the current code does the reset _before_ some_qdisc_is_busy(). ;)
If lock is taken when doing reset, it does not matter if the reset is
before some_qdisc_is_busy(), right?
>
> Thanks.
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-11-03 7:24 ` Yunsheng Lin
@ 2020-11-05 6:04 ` Cong Wang
2020-11-05 6:16 ` Cong Wang
2020-11-05 6:22 ` Yunsheng Lin
0 siblings, 2 replies; 26+ messages in thread
From: Cong Wang @ 2020-11-05 6:04 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On Mon, Nov 2, 2020 at 11:24 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >> From my understanding, we can do anything about the old qdisc (including
> >> destorying the old qdisc) after some_qdisc_is_busy() return false.
> >
> > But the current code does the reset _before_ some_qdisc_is_busy(). ;)
>
> If lock is taken when doing reset, it does not matter if the reset is
> before some_qdisc_is_busy(), right?
Why not? How about the following scenario?
CPU0: CPU1:
dev_reset_queue()
net_tx_action()
-> sch_direct_xmit()
-> dev_requeue_skb()
some_qdisc_is_busy()
// waiting for TX action on CPU1
// now some packets are requeued
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-11-05 6:04 ` Cong Wang
@ 2020-11-05 6:16 ` Cong Wang
2020-11-05 6:32 ` Yunsheng Lin
2020-11-05 6:22 ` Yunsheng Lin
1 sibling, 1 reply; 26+ messages in thread
From: Cong Wang @ 2020-11-05 6:16 UTC (permalink / raw)
To: Yunsheng Lin
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On Wed, Nov 4, 2020 at 10:04 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Mon, Nov 2, 2020 at 11:24 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
> > >> From my understanding, we can do anything about the old qdisc (including
> > >> destorying the old qdisc) after some_qdisc_is_busy() return false.
> > >
> > > But the current code does the reset _before_ some_qdisc_is_busy(). ;)
> >
> > If lock is taken when doing reset, it does not matter if the reset is
> > before some_qdisc_is_busy(), right?
>
> Why not? How about the following scenario?
>
> CPU0: CPU1:
> dev_reset_queue()
> net_tx_action()
> -> sch_direct_xmit()
> -> dev_requeue_skb()
> some_qdisc_is_busy()
> // waiting for TX action on CPU1
> // now some packets are requeued
Never mind, the skb_bad_txq is also cleared by dev_reset_queue().
TX action after resetting should get NULL.
Thanks.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-11-05 6:04 ` Cong Wang
2020-11-05 6:16 ` Cong Wang
@ 2020-11-05 6:22 ` Yunsheng Lin
1 sibling, 0 replies; 26+ messages in thread
From: Yunsheng Lin @ 2020-11-05 6:22 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/11/5 14:04, Cong Wang wrote:
> On Mon, Nov 2, 2020 at 11:24 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>> From my understanding, we can do anything about the old qdisc (including
>>>> destorying the old qdisc) after some_qdisc_is_busy() return false.
>>>
>>> But the current code does the reset _before_ some_qdisc_is_busy(). ;)
>>
>> If lock is taken when doing reset, it does not matter if the reset is
>> before some_qdisc_is_busy(), right?
>
> Why not? How about the following scenario?
>
> CPU0: CPU1:
> dev_reset_queue()
There is no skb in the dqisc now.
> net_tx_action()
> -> sch_direct_xmit()
So when CPU1 calls net_tx_action(), there is no skb to send and
no skb to be requeued too.
> -> dev_requeue_skb()
> some_qdisc_is_busy()
> // waiting for TX action on CPU1
> // now some packets are requeued
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
2020-11-05 6:16 ` Cong Wang
@ 2020-11-05 6:32 ` Yunsheng Lin
0 siblings, 0 replies; 26+ messages in thread
From: Yunsheng Lin @ 2020-11-05 6:32 UTC (permalink / raw)
To: Cong Wang
Cc: Jamal Hadi Salim, Jiri Pirko, David Miller, Jakub Kicinski,
Linux Kernel Network Developers, LKML, linuxarm, John Fastabend,
Eric Dumazet
On 2020/11/5 14:16, Cong Wang wrote:
> On Wed, Nov 4, 2020 at 10:04 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>
>> On Mon, Nov 2, 2020 at 11:24 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>>> From my understanding, we can do anything about the old qdisc (including
>>>>> destorying the old qdisc) after some_qdisc_is_busy() return false.
>>>>
>>>> But the current code does the reset _before_ some_qdisc_is_busy(). ;)
>>>
>>> If lock is taken when doing reset, it does not matter if the reset is
>>> before some_qdisc_is_busy(), right?
>>
>> Why not? How about the following scenario?
>>
>> CPU0: CPU1:
>> dev_reset_queue()
>> net_tx_action()
>> -> sch_direct_xmit()
>> -> dev_requeue_skb()
>> some_qdisc_is_busy()
>> // waiting for TX action on CPU1
>> // now some packets are requeued
>
> Never mind, the skb_bad_txq is also cleared by dev_reset_queue().
> TX action after resetting should get NULL.
Yes, maybe it is safe to remove qdisc_reset() now? or issue a warning
if there are still skb in the qdisc.
>
> Thanks.
> .
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc
@ 2020-09-09 4:07 kernel test robot
0 siblings, 0 replies; 26+ messages in thread
From: kernel test robot @ 2020-09-09 4:07 UTC (permalink / raw)
To: kbuild
[-- Attachment #1: Type: text/plain, Size: 4359 bytes --]
CC: kbuild-all(a)lists.01.org
In-Reply-To: <1599562954-87257-1-git-send-email-linyunsheng@huawei.com>
References: <1599562954-87257-1-git-send-email-linyunsheng@huawei.com>
TO: Yunsheng Lin <linyunsheng@huawei.com>
TO: jhs(a)mojatatu.com
TO: xiyou.wangcong(a)gmail.com
TO: jiri(a)resnulli.us
TO: davem(a)davemloft.net
TO: kuba(a)kernel.org
CC: netdev(a)vger.kernel.org
CC: linux-kernel(a)vger.kernel.org
CC: linuxarm(a)huawei.com
CC: john.fastabend(a)gmail.com
CC: eric.dumazet(a)gmail.com
Hi Yunsheng,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on net/master]
url: https://github.com/0day-ci/linux/commits/Yunsheng-Lin/net-sch_generic-aviod-concurrent-reset-and-enqueue-op-for-lockless-qdisc/20200908-190838
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 19162fd4063a3211843b997a454b505edb81d5ce
:::::: branch date: 17 hours ago
:::::: commit date: 17 hours ago
config: powerpc64-randconfig-s032-20200909 (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.2-191-g10164920-dirty
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=powerpc64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
sparse warnings: (new ones prefixed by >>)
net/sched/sch_generic.c:189:50: sparse: sparse: context imbalance in 'try_bulk_dequeue_skb_slow' - different lock contexts for basic block
net/sched/sch_generic.c:248:17: sparse: sparse: context imbalance in 'dequeue_skb' - different lock contexts for basic block
net/sched/sch_generic.c:294:28: sparse: sparse: context imbalance in 'sch_direct_xmit' - unexpected unlock
>> net/sched/sch_generic.c:1154:13: sparse: sparse: context imbalance in 'dev_reset_queue' - different lock contexts for basic block
# https://github.com/0day-ci/linux/commit/ea5743c9fd781a0a0ed7bb09e9953550afa92390
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Yunsheng-Lin/net-sch_generic-aviod-concurrent-reset-and-enqueue-op-for-lockless-qdisc/20200908-190838
git checkout ea5743c9fd781a0a0ed7bb09e9953550afa92390
vim +/dev_reset_queue +1154 net/sched/sch_generic.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 1153
ea5743c9fd781a Yunsheng Lin 2020-09-08 @1154 static void dev_reset_queue(struct net_device *dev,
ea5743c9fd781a Yunsheng Lin 2020-09-08 1155 struct netdev_queue *dev_queue,
ea5743c9fd781a Yunsheng Lin 2020-09-08 1156 void *_unused)
ea5743c9fd781a Yunsheng Lin 2020-09-08 1157 {
ea5743c9fd781a Yunsheng Lin 2020-09-08 1158 struct Qdisc *qdisc;
ea5743c9fd781a Yunsheng Lin 2020-09-08 1159 bool nolock;
ea5743c9fd781a Yunsheng Lin 2020-09-08 1160
ea5743c9fd781a Yunsheng Lin 2020-09-08 1161 qdisc = dev_queue->qdisc_sleeping;
ea5743c9fd781a Yunsheng Lin 2020-09-08 1162 if (!qdisc)
ea5743c9fd781a Yunsheng Lin 2020-09-08 1163 return;
ea5743c9fd781a Yunsheng Lin 2020-09-08 1164
ea5743c9fd781a Yunsheng Lin 2020-09-08 1165 nolock = qdisc->flags & TCQ_F_NOLOCK;
ea5743c9fd781a Yunsheng Lin 2020-09-08 1166
ea5743c9fd781a Yunsheng Lin 2020-09-08 1167 if (nolock)
ea5743c9fd781a Yunsheng Lin 2020-09-08 1168 spin_lock_bh(&qdisc->seqlock);
ea5743c9fd781a Yunsheng Lin 2020-09-08 1169 spin_lock_bh(qdisc_lock(qdisc));
ea5743c9fd781a Yunsheng Lin 2020-09-08 1170
ea5743c9fd781a Yunsheng Lin 2020-09-08 1171 qdisc_reset(qdisc);
ea5743c9fd781a Yunsheng Lin 2020-09-08 1172
ea5743c9fd781a Yunsheng Lin 2020-09-08 1173 spin_unlock_bh(qdisc_lock(qdisc));
ea5743c9fd781a Yunsheng Lin 2020-09-08 1174 if (nolock)
ea5743c9fd781a Yunsheng Lin 2020-09-08 1175 spin_unlock_bh(&qdisc->seqlock);
ea5743c9fd781a Yunsheng Lin 2020-09-08 1176 }
ea5743c9fd781a Yunsheng Lin 2020-09-08 1177
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 31514 bytes --]
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2020-11-05 6:33 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-08 11:02 [PATCH v2 net] net: sch_generic: aviod concurrent reset and enqueue op for lockless qdisc Yunsheng Lin
2020-09-10 19:39 ` David Miller
2020-09-10 20:07 ` Cong Wang
2020-09-11 8:13 ` Yunsheng Lin
2020-09-11 8:25 ` Yunsheng Lin
2020-09-17 19:26 ` Cong Wang
[not found] ` <CAP12E-+3DY-dgzVercKc-NYGPExWO1NjTOr1Gf3tPLKvp6O6+g@mail.gmail.com>
2020-10-28 15:37 ` Pai, Vishwanath
2020-10-28 17:47 ` Cong Wang
2020-10-28 20:04 ` Vishwanath Pai
2020-10-29 2:37 ` Yunsheng Lin
2020-10-29 4:50 ` Vishwanath Pai
2020-10-29 10:24 ` Yunsheng Lin
2020-10-29 17:20 ` Vishwanath Pai
2020-11-02 9:08 ` Yunsheng Lin
2020-11-02 18:23 ` Vishwanath Pai
2020-10-28 17:46 ` Vishwanath Pai
2020-10-29 2:52 ` Yunsheng Lin
2020-10-29 19:05 ` Cong Wang
2020-10-30 7:37 ` Yunsheng Lin
2020-11-02 16:55 ` Cong Wang
2020-11-03 7:24 ` Yunsheng Lin
2020-11-05 6:04 ` Cong Wang
2020-11-05 6:16 ` Cong Wang
2020-11-05 6:32 ` Yunsheng Lin
2020-11-05 6:22 ` Yunsheng Lin
2020-09-09 4:07 kernel test robot
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.