* [PATCH] net/sched: Set the flushing flags to false to prevent an infinite loop
@ 2023-06-06 14:45 renmingshuai
2023-06-06 16:59 ` Pedro Tammela
0 siblings, 1 reply; 13+ messages in thread
From: renmingshuai @ 2023-06-06 14:45 UTC (permalink / raw)
To: netdev, linux-kernel, jhs, xiyou.wangcong, jiri, davem, edumazet,
kuba, pabeni
Cc: liaichun, caowangbao, yanan, liubo335
When a new chain is added by using tc, one soft lockup alarm will be
generated after delete the prio 0 filter of the chain. To reproduce
the problem, perform the following steps:
(1) tc qdisc add dev eth0 root handle 1: htb default 1
(2) tc chain add dev eth0
(3) tc filter del dev eth0 chain 0 parent 1: prio 0
(4) tc filter add dev eth0 chain 0 parent 1:
The refcnt of the chain added by step 2 is equal to 1. After step 3,
the flushing flag of the chain is set to true in the tcf_chain_flush()
called by tc_del_tfilter() because the prio is 0. In this case, if
we add a new filter to this chain, it will never succeed and try again
and again because the refresh flash is always true and refcnt is 1.
A soft lock alarm is generated 20 seconds later.
The stack is show as below:
Kernel panic - not syncing: softlockup: hung tasks
CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Call Trace:
<IRQ>
dump_stack+0x57/0x6e
panic+0x196/0x3ec
watchdog_timer_fn.cold+0x16/0x5c
__run_hrtimer+0x5e/0x190
__hrtimer_run_queues+0x8a/0xe0
hrtimer_interrupt+0x110/0x2c0
? irqtime_account_irq+0x49/0xf0
__sysvec_apic_timer_interrupt+0x5f/0xe0
asm_call_irq_on_stack+0x12/0x20
</IRQ>
sysvec_apic_timer_interrupt+0x72/0x80
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:mutex_lock+0x24/0x70
RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
__tcf_chain_put+0x27/0x200
tc_new_tfilter+0x5e8/0x810
? tc_setup_cb_add+0x210/0x210
rtnetlink_rcv_msg+0x2e3/0x380
? rtnl_calcit.isra.0+0x120/0x120
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x12d/0x1d0
netlink_sendmsg+0x286/0x490
sock_sendmsg+0x62/0x70
____sys_sendmsg+0x24c/0x2c0
? import_iovec+0x17/0x20
? sendmsg_copy_msghdr+0x80/0xa0
___sys_sendmsg+0x75/0xc0
? do_fault_around+0x118/0x160
? do_read_fault+0x68/0xf0
? __handle_mm_fault+0x3f9/0x6f0
__sys_sendmsg+0x59/0xa0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f96705b8247
RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
To avoid this case, set chain->flushing to be false if the chain->refcnt
is 1 after flushing the chain when prio is 0.
Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Signed-off-by: Ren Mingshuai <renmingshuai@huawei.com>
---
net/sched/cls_api.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2621550bfddc..68be55d75831 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
tfilter_notify_chain(net, skb, block, q, parent, n,
chain, RTM_DELTFILTER, extack);
tcf_chain_flush(chain, rtnl_held);
+ /* Set the flushing flags to false to prevent an infinite loop
+ * when a new filter is added.
+ */
+ mutex_lock(&chain->filter_chain_lock);
+ if (chain->refcnt == 1)
+ chain->flushing = false;
+ mutex_unlock(&chain->filter_chain_lock);
err = 0;
goto errout;
}
--
2.27.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] net/sched: Set the flushing flags to false to prevent an infinite loop
2023-06-06 14:45 [PATCH] net/sched: Set the flushing flags to false to prevent an infinite loop renmingshuai
@ 2023-06-06 16:59 ` Pedro Tammela
2023-06-07 4:19 ` renmingshuai
0 siblings, 1 reply; 13+ messages in thread
From: Pedro Tammela @ 2023-06-06 16:59 UTC (permalink / raw)
To: renmingshuai, netdev, linux-kernel, jhs, xiyou.wangcong, jiri,
davem, edumazet, kuba, pabeni
Cc: liaichun, caowangbao, yanan, liubo335
On 06/06/2023 11:45, renmingshuai wrote:
> When a new chain is added by using tc, one soft lockup alarm will be
> generated after delete the prio 0 filter of the chain. To reproduce
> the problem, perform the following steps:
> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> (2) tc chain add dev eth0
> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> (4) tc filter add dev eth0 chain 0 parent 1:
This seems like it could be added to tdc or 3 and 4 must be run in parallel?
>
>
> The refcnt of the chain added by step 2 is equal to 1. After step 3,
> the flushing flag of the chain is set to true in the tcf_chain_flush()
> called by tc_del_tfilter() because the prio is 0. In this case, if
> we add a new filter to this chain, it will never succeed and try again
> and again because the refresh flash is always true and refcnt is 1.
> A soft lock alarm is generated 20 seconds later.
> The stack is show as below:
>
> Kernel panic - not syncing: softlockup: hung tasks
> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
> Call Trace:
> <IRQ>
> dump_stack+0x57/0x6e
> panic+0x196/0x3ec
> watchdog_timer_fn.cold+0x16/0x5c
> __run_hrtimer+0x5e/0x190
> __hrtimer_run_queues+0x8a/0xe0
> hrtimer_interrupt+0x110/0x2c0
> ? irqtime_account_irq+0x49/0xf0
> __sysvec_apic_timer_interrupt+0x5f/0xe0
> asm_call_irq_on_stack+0x12/0x20
> </IRQ>
> sysvec_apic_timer_interrupt+0x72/0x80
> asm_sysvec_apic_timer_interrupt+0x12/0x20
> RIP: 0010:mutex_lock+0x24/0x70
> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
> __tcf_chain_put+0x27/0x200
> tc_new_tfilter+0x5e8/0x810
> ? tc_setup_cb_add+0x210/0x210
> rtnetlink_rcv_msg+0x2e3/0x380
> ? rtnl_calcit.isra.0+0x120/0x120
> netlink_rcv_skb+0x50/0x100
> netlink_unicast+0x12d/0x1d0
> netlink_sendmsg+0x286/0x490
> sock_sendmsg+0x62/0x70
> ____sys_sendmsg+0x24c/0x2c0
> ? import_iovec+0x17/0x20
> ? sendmsg_copy_msghdr+0x80/0xa0
> ___sys_sendmsg+0x75/0xc0
> ? do_fault_around+0x118/0x160
> ? do_read_fault+0x68/0xf0
> ? __handle_mm_fault+0x3f9/0x6f0
> __sys_sendmsg+0x59/0xa0
> do_syscall_64+0x33/0x40
> entry_SYSCALL_64_after_hwframe+0x61/0xc6
> RIP: 0033:0x7f96705b8247
> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>
> To avoid this case, set chain->flushing to be false if the chain->refcnt
> is 1 after flushing the chain when prio is 0.
>
> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> Signed-off-by: Ren Mingshuai <renmingshuai@huawei.com>
> ---
> net/sched/cls_api.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 2621550bfddc..68be55d75831 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
> tfilter_notify_chain(net, skb, block, q, parent, n,
> chain, RTM_DELTFILTER, extack);
> tcf_chain_flush(chain, rtnl_held);
> + /* Set the flushing flags to false to prevent an infinite loop
> + * when a new filter is added.
> + */
> + mutex_lock(&chain->filter_chain_lock);
> + if (chain->refcnt == 1)
> + chain->flushing = false;
> + mutex_unlock(&chain->filter_chain_lock);
> err = 0;
> goto errout;
> }
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] net/sched: Set the flushing flags to false to prevent an infinite loop
2023-06-06 16:59 ` Pedro Tammela
@ 2023-06-07 4:19 ` renmingshuai
2023-06-07 15:02 ` Pedro Tammela
0 siblings, 1 reply; 13+ messages in thread
From: renmingshuai @ 2023-06-07 4:19 UTC (permalink / raw)
To: pctammela
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, renmingshuai,
xiyou.wangcong, yanan
>On 06/06/2023 11:45, renmingshuai wrote:
>> When a new chain is added by using tc, one soft lockup alarm will be
>> generated after delete the prio 0 filter of the chain. To reproduce
>> the problem, perform the following steps:
>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>> (2) tc chain add dev eth0
>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>> (4) tc filter add dev eth0 chain 0 parent 1:
>
>This seems like it could be added to tdc or 3 and 4 must be run in parallel?
3 and 4 do not need to be run inparallel. When a new chain is added by the
way as step 1 and the step 3 is completed, this problem always occurs
whenever step 4 is run.
>>
>>
>> The refcnt of the chain added by step 2 is equal to 1. After step 3,
>> the flushing flag of the chain is set to true in the tcf_chain_flush()
>> called by tc_del_tfilter() because the prio is 0. In this case, if
>> we add a new filter to this chain, it will never succeed and try again
>> and again because the refresh flash is always true and refcnt is 1.
>> A soft lock alarm is generated 20 seconds later.
>> The stack is show as below:
>>
>> Kernel panic - not syncing: softlockup: hung tasks
>> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>> Call Trace:
>> <IRQ>
>> dump_stack+0x57/0x6e
>> panic+0x196/0x3ec
>> watchdog_timer_fn.cold+0x16/0x5c
>> __run_hrtimer+0x5e/0x190
>> __hrtimer_run_queues+0x8a/0xe0
>> hrtimer_interrupt+0x110/0x2c0
>> ? irqtime_account_irq+0x49/0xf0
>> __sysvec_apic_timer_interrupt+0x5f/0xe0
>> asm_call_irq_on_stack+0x12/0x20
>> </IRQ>
>> sysvec_apic_timer_interrupt+0x72/0x80
>> asm_sysvec_apic_timer_interrupt+0x12/0x20
>> RIP: 0010:mutex_lock+0x24/0x70
>> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
>> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
>> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
>> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
>> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
>> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
>> __tcf_chain_put+0x27/0x200
>> tc_new_tfilter+0x5e8/0x810
>> ? tc_setup_cb_add+0x210/0x210
>> rtnetlink_rcv_msg+0x2e3/0x380
>> ? rtnl_calcit.isra.0+0x120/0x120
>> netlink_rcv_skb+0x50/0x100
>> netlink_unicast+0x12d/0x1d0
>> netlink_sendmsg+0x286/0x490
>> sock_sendmsg+0x62/0x70
>> ____sys_sendmsg+0x24c/0x2c0
>> ? import_iovec+0x17/0x20
>> ? sendmsg_copy_msghdr+0x80/0xa0
>> ___sys_sendmsg+0x75/0xc0
>> ? do_fault_around+0x118/0x160
>> ? do_read_fault+0x68/0xf0
>> ? __handle_mm_fault+0x3f9/0x6f0
>> __sys_sendmsg+0x59/0xa0
>> do_syscall_64+0x33/0x40
>> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>> RIP: 0033:0x7f96705b8247
>> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
>> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
>> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
>> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>>
>> To avoid this case, set chain->flushing to be false if the chain->refcnt
>> is 1 after flushing the chain when prio is 0.
>>
>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
>> Signed-off-by: Ren Mingshuai <renmingshuai@huawei.com>
>> ---
>> net/sched/cls_api.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 2621550bfddc..68be55d75831 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
>> tfilter_notify_chain(net, skb, block, q, parent, n,
>> chain, RTM_DELTFILTER, extack);
>> tcf_chain_flush(chain, rtnl_held);
>> + /* Set the flushing flags to false to prevent an infinite loop
>> + * when a new filter is added.
>> + */
>> + mutex_lock(&chain->filter_chain_lock);
>> + if (chain->refcnt == 1)
>> + chain->flushing = false;
>> + mutex_unlock(&chain->filter_chain_lock);
>> err = 0;
>> goto errout;
>> }
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] net/sched: Set the flushing flags to false to prevent an infinite loop
2023-06-07 4:19 ` renmingshuai
@ 2023-06-07 15:02 ` Pedro Tammela
2023-06-08 12:32 ` [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc renmingshuai
0 siblings, 1 reply; 13+ messages in thread
From: Pedro Tammela @ 2023-06-07 15:02 UTC (permalink / raw)
To: renmingshuai
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, xiyou.wangcong, yanan
On 07/06/2023 01:19, renmingshuai wrote:
>> On 06/06/2023 11:45, renmingshuai wrote:
>>> When a new chain is added by using tc, one soft lockup alarm will be
>>> generated after delete the prio 0 filter of the chain. To reproduce
>>> the problem, perform the following steps:
>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>> (2) tc chain add dev eth0
>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>
>> This seems like it could be added to tdc or 3 and 4 must be run in parallel?
> 3 and 4 do not need to be run inparallel. When a new chain is added by the
> way as step 1 and the step 3 is completed, this problem always occurs
> whenever step 4 is run.
Got it,
The test still hangs with the provided patch.
+ tc qdisc add dev lo root handle 1: htb default 1
+ tc chain add dev lo
+ tc filter del dev lo chain 0 parent 1: prio 0
[ 68.790030][ T6704] [+]
[ 68.790060][ T6704] chain refcnt 2
[ 68.790951][ T6704] [-]
+ tc filter add dev lo chain 0 parent 1:
<hangs>
Also please add this test to tdc, it should be straightforward.
>>>
>>>
>>> The refcnt of the chain added by step 2 is equal to 1. After step 3,
>>> the flushing flag of the chain is set to true in the tcf_chain_flush()
>>> called by tc_del_tfilter() because the prio is 0. In this case, if
>>> we add a new filter to this chain, it will never succeed and try again
>>> and again because the refresh flash is always true and refcnt is 1.
>>> A soft lock alarm is generated 20 seconds later.
>>> The stack is show as below:
>>>
>>> Kernel panic - not syncing: softlockup: hung tasks
>>> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>>> Call Trace:
>>> <IRQ>
>>> dump_stack+0x57/0x6e
>>> panic+0x196/0x3ec
>>> watchdog_timer_fn.cold+0x16/0x5c
>>> __run_hrtimer+0x5e/0x190
>>> __hrtimer_run_queues+0x8a/0xe0
>>> hrtimer_interrupt+0x110/0x2c0
>>> ? irqtime_account_irq+0x49/0xf0
>>> __sysvec_apic_timer_interrupt+0x5f/0xe0
>>> asm_call_irq_on_stack+0x12/0x20
>>> </IRQ>
>>> sysvec_apic_timer_interrupt+0x72/0x80
>>> asm_sysvec_apic_timer_interrupt+0x12/0x20
>>> RIP: 0010:mutex_lock+0x24/0x70
>>> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
>>> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
>>> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
>>> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
>>> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
>>> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
>>> __tcf_chain_put+0x27/0x200
>>> tc_new_tfilter+0x5e8/0x810
>>> ? tc_setup_cb_add+0x210/0x210
>>> rtnetlink_rcv_msg+0x2e3/0x380
>>> ? rtnl_calcit.isra.0+0x120/0x120
>>> netlink_rcv_skb+0x50/0x100
>>> netlink_unicast+0x12d/0x1d0
>>> netlink_sendmsg+0x286/0x490
>>> sock_sendmsg+0x62/0x70
>>> ____sys_sendmsg+0x24c/0x2c0
>>> ? import_iovec+0x17/0x20
>>> ? sendmsg_copy_msghdr+0x80/0xa0
>>> ___sys_sendmsg+0x75/0xc0
>>> ? do_fault_around+0x118/0x160
>>> ? do_read_fault+0x68/0xf0
>>> ? __handle_mm_fault+0x3f9/0x6f0
>>> __sys_sendmsg+0x59/0xa0
>>> do_syscall_64+0x33/0x40
>>> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>> RIP: 0033:0x7f96705b8247
>>> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
>>> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
>>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
>>> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
>>> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>>>
>>> To avoid this case, set chain->flushing to be false if the chain->refcnt
>>> is 1 after flushing the chain when prio is 0.
>>>
>>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
>>> Signed-off-by: Ren Mingshuai <renmingshuai@huawei.com>
>>> ---
>>> net/sched/cls_api.c | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 2621550bfddc..68be55d75831 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
>>> tfilter_notify_chain(net, skb, block, q, parent, n,
>>> chain, RTM_DELTFILTER, extack);
>>> tcf_chain_flush(chain, rtnl_held);
>>> + /* Set the flushing flags to false to prevent an infinite loop
>>> + * when a new filter is added.
>>> + */
>>> + mutex_lock(&chain->filter_chain_lock);
>>> + if (chain->refcnt == 1)
>>> + chain->flushing = false;
>>> + mutex_unlock(&chain->filter_chain_lock);
>>> err = 0;
>>> goto errout;
>>> }
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-07 15:02 ` Pedro Tammela
@ 2023-06-08 12:32 ` renmingshuai
2023-06-08 16:08 ` Pedro Tammela
0 siblings, 1 reply; 13+ messages in thread
From: renmingshuai @ 2023-06-08 12:32 UTC (permalink / raw)
To: pctammela
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, renmingshuai,
xiyou.wangcong, yanan
>On 07/06/2023 01:19, renmingshuai wrote:
>>> On 06/06/2023 11:45, renmingshuai wrote:
>>>> When a new chain is added by using tc, one soft lockup alarm will
>>>> be
>>>> generated after delete the prio 0 filter of the chain. To
>>>> reproduce
>>>> the problem, perform the following steps:
>>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>>> (2) tc chain add dev eth0
>>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>>
>>> This seems like it could be added to tdc or 3 and 4 must be run in
>>> parallel?
>> 3 and 4 do not need to be run inparallel. When a new chain is added
>> by the
>> way as step 1 and the step 3 is completed, this problem always
>> occurs
>> whenever step 4 is run.
>
>Got it,
>The test still hangs with the provided patch.
>
>+ tc qdisc add dev lo root handle 1: htb default 1
>+ tc chain add dev lo
>+ tc filter del dev lo chain 0 parent 1: prio 0
>[ 68.790030][ T6704] [+]
>[ 68.790060][ T6704] chain refcnt 2
>[ 68.790951][ T6704] [-]
>+ tc filter add dev lo chain 0 parent 1:
><hangs>
>
>Also please add this test to tdc, it should be straightforward.
>
Sorry for not testing before. I forgot that the chain->refcnt was
increased by 1 when tcf_chain_get() is called in tc_del_tfilter().
The value of chain->refcnt is 2 after chain flush. The test
result is as follows:
[root@localhost ~]# tc qdisc add dev eth2 root handle 1: htb default 1
[root@localhost ~]# tc chain add dev eth2
[root@localhost ~]# tc filter del dev eth2 chain 0 parent 1: prio 0
[root@localhost ~]# tc filter add dev eth2 chain 0 parent 1:
Error: Filter kind and protocol must be specified.
We have an error talking to the kernel
And I have add this test to tdc:
[root@localhost tc-testing]# ./tdc.py -f tc-tests/filters/tests.json
ok 7 c2b4 - Adding a new fiter after deleting a filter in a chain does
not cause an infinite loop
Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Signed-off-by: renmingshuai <renmingshuai@huawei.com>
---
net/sched/cls_api.c | 7 +++++
.../tc-testing/tc-tests/filters/tests.json | 26 ++++++++++++++++---
2 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2621550bfddc..3ea054e03fbf 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
tfilter_notify_chain(net, skb, block, q, parent, n,
chain, RTM_DELTFILTER, extack);
tcf_chain_flush(chain, rtnl_held);
+ /* Set the flushing flags to false to prevent an infinite loop
+ * when a new filter is added.
+ */
+ mutex_lock(&chain->filter_chain_lock);
+ if (chain->refcnt == 2)
+ chain->flushing = false;
+ mutex_unlock(&chain->filter_chain_lock);
err = 0;
goto errout;
}
diff --git a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
index 361235ad574b..f165ca091109 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
@@ -125,5 +125,25 @@
"teardown": [
"$TC qdisc del dev $DEV2 ingress"
]
+ },
+ {
+ "id": "c2b4",
+ "name": "Adding a new fiter after deleting a filter in a chain does not cause an infinite loop",
+ "category": [
+ "filter",
+ "prio"
+ ],
+ "setup": [
+ "$TC qdisc add dev $DEV1 root handle 1: htb default 1",
+ "$TC chain add dev $DEV1"
+ ],
+ "cmdUnderTest": "$TC filter del dev $DEV1 chain 0 parent 1: prio 0",
+ "expExitCode": "0",
+ "verifyCmd": "$TC filter add dev $DEV1 chain 0 parent 1:",
+ "matchPattern": "Error: Filter kind and protocol must be specified.",
+ "matchCount": "1",
+ "teardown": [
+ "$TC qdisc del dev $DEV1 root handle 1: htb default 1"
+ ]
}
]
--
2.27.0
>>>>
>>>>
>>>> The refcnt of the chain added by step 2 is equal to 1. After step
>>>> 3,
>>>> the flushing flag of the chain is set to true in the
>>>> tcf_chain_flush()
>>>> called by tc_del_tfilter() because the prio is 0. In this case,
>>>> if
>>>> we add a new filter to this chain, it will never succeed and try
>>>> again
>>>> and again because the refresh flash is always true and refcnt is
>>>> 1.
>>>> A soft lock alarm is generated 20 seconds later.
>>>> The stack is show as below:
>>>>
>>>> Kernel panic - not syncing: softlockup: hung tasks
>>>> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>>>> Call Trace:
>>>> <IRQ>
>>>> dump_stack+0x57/0x6e
>>>> panic+0x196/0x3ec
>>>> watchdog_timer_fn.cold+0x16/0x5c
>>>> __run_hrtimer+0x5e/0x190
>>>> __hrtimer_run_queues+0x8a/0xe0
>>>> hrtimer_interrupt+0x110/0x2c0
>>>> ? irqtime_account_irq+0x49/0xf0
>>>> __sysvec_apic_timer_interrupt+0x5f/0xe0
>>>> asm_call_irq_on_stack+0x12/0x20
>>>> </IRQ>
>>>> sysvec_apic_timer_interrupt+0x72/0x80
>>>> asm_sysvec_apic_timer_interrupt+0x12/0x20
>>>> RIP: 0010:mutex_lock+0x24/0x70
>>>> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
>>>> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
>>>> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
>>>> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
>>>> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
>>>> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
>>>> __tcf_chain_put+0x27/0x200
>>>> tc_new_tfilter+0x5e8/0x810
>>>> ? tc_setup_cb_add+0x210/0x210
>>>> rtnetlink_rcv_msg+0x2e3/0x380
>>>> ? rtnl_calcit.isra.0+0x120/0x120
>>>> netlink_rcv_skb+0x50/0x100
>>>> netlink_unicast+0x12d/0x1d0
>>>> netlink_sendmsg+0x286/0x490
>>>> sock_sendmsg+0x62/0x70
>>>> ____sys_sendmsg+0x24c/0x2c0
>>>> ? import_iovec+0x17/0x20
>>>> ? sendmsg_copy_msghdr+0x80/0xa0
>>>> ___sys_sendmsg+0x75/0xc0
>>>> ? do_fault_around+0x118/0x160
>>>> ? do_read_fault+0x68/0xf0
>>>> ? __handle_mm_fault+0x3f9/0x6f0
>>>> __sys_sendmsg+0x59/0xa0
>>>> do_syscall_64+0x33/0x40
>>>> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>>> RIP: 0033:0x7f96705b8247
>>>> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX:
>>>> 000000000000002e
>>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
>>>> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
>>>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
>>>> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
>>>> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>>>>
>>>> To avoid this case, set chain->flushing to be false if the
>>>> chain->refcnt
>>>> is 1 after flushing the chain when prio is 0.
>>>>
>>>> Fixes: 726d061286ce ("net: sched: prevent insertion of new
>>>> classifiers during chain flush")
>>>> Signed-off-by: Ren Mingshuai <renmingshuai@huawei.com>
>>>> ---
>>>> net/sched/cls_api.c | 7 +++++++
>>>> 1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>> index 2621550bfddc..68be55d75831 100644
>>>> --- a/net/sched/cls_api.c
>>>> +++ b/net/sched/cls_api.c
>>>> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff
>>>> *skb, struct nlmsghdr *n,
>>>> tfilter_notify_chain(net, skb, block, q, parent,
>>>> n,
>>>> chain, RTM_DELTFILTER,
>>>> extack);
>>>> tcf_chain_flush(chain, rtnl_held);
>>>> + /* Set the flushing flags to false to prevent an
>>>> infinite loop
>>>> + * when a new filter is added.
>>>> + */
>>>> + mutex_lock(&chain->filter_chain_lock);
>>>> + if (chain->refcnt == 1)
>>>> + chain->flushing = false;
>>>> + mutex_unlock(&chain->filter_chain_lock);
>>>> err = 0;
>>>> goto errout;
>>>> }
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-08 12:32 ` [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc renmingshuai
@ 2023-06-08 16:08 ` Pedro Tammela
2023-06-09 3:31 ` [PATCH v3] " renmingshuai
0 siblings, 1 reply; 13+ messages in thread
From: Pedro Tammela @ 2023-06-08 16:08 UTC (permalink / raw)
To: renmingshuai
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, xiyou.wangcong, yanan
On 08/06/2023 09:32, renmingshuai wrote:
>> On 07/06/2023 01:19, renmingshuai wrote:
>>>> On 06/06/2023 11:45, renmingshuai wrote:
>>>>> When a new chain is added by using tc, one soft lockup alarm will
>>>>> be
>>>>> generated after delete the prio 0 filter of the chain. To
>>>>> reproduce
>>>>> the problem, perform the following steps:
>>>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>>>> (2) tc chain add dev eth0
>>>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>>>
>>>> This seems like it could be added to tdc or 3 and 4 must be run in
>>>> parallel?
>>> 3 and 4 do not need to be run inparallel. When a new chain is added
>>> by the
>>> way as step 1 and the step 3 is completed, this problem always
>>> occurs
>>> whenever step 4 is run.
>>
>> Got it,
>> The test still hangs with the provided patch.
>>
>> + tc qdisc add dev lo root handle 1: htb default 1
>> + tc chain add dev lo
>> + tc filter del dev lo chain 0 parent 1: prio 0
>> [ 68.790030][ T6704] [+]
>> [ 68.790060][ T6704] chain refcnt 2
>> [ 68.790951][ T6704] [-]
>> + tc filter add dev lo chain 0 parent 1:
>> <hangs>
>>
>> Also please add this test to tdc, it should be straightforward.
>>
> Sorry for not testing before. I forgot that the chain->refcnt was
> increased by 1 when tcf_chain_get() is called in tc_del_tfilter().
> The value of chain->refcnt is 2 after chain flush. The test
> result is as follows:
> [root@localhost ~]# tc qdisc add dev eth2 root handle 1: htb default 1
> [root@localhost ~]# tc chain add dev eth2
> [root@localhost ~]# tc filter del dev eth2 chain 0 parent 1: prio 0
> [root@localhost ~]# tc filter add dev eth2 chain 0 parent 1:
> Error: Filter kind and protocol must be specified.
> We have an error talking to the kernel
>
> And I have add this test to tdc:
> [root@localhost tc-testing]# ./tdc.py -f tc-tests/filters/tests.json
> ok 7 c2b4 - Adding a new fiter after deleting a filter in a chain does
> not cause an infinite loop
>
> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> Signed-off-by: renmingshuai <renmingshuai@huawei.com>
Please respin with the following applied:
diff --git
a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
index c759c3db9a37..361235ad574b 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
@@ -125,25 +125,5 @@
"teardown": [
"$TC qdisc del dev $DEV2 ingress"
]
- },
- {
- "id": "c2b4",
- "name": "Adding a new fiter after deleting a filter in a chain
does not cause an infinite loop",
- "category": [
- "filter",
- "prio"
- ],
- "setup": [
- "$TC qdisc add dev $DEV1 root handle 1: htb default 1",
- "$TC chain add dev $DEV1"
- ],
- "cmdUnderTest": "$TC filter del dev $DEV1 chain 0 parent 1:
prio 0",
- "expExitCode": "0",
- "verifyCmd": "$TC filter add dev $DEV1 chain 0 parent 1:",
- "matchPattern": "Error: Filter kind and protocol must be
specified.",
- "matchCount": "1",
- "teardown": [
- "$TC qdisc del dev $DEV1 root handle 1: htb default 1"
- ]
}
]
diff --git
a/tools/testing/selftests/tc-testing/tc-tests/infra/filters.json
b/tools/testing/selftests/tc-testing/tc-tests/infra/filters.
json
new file mode 100644
index 000000000000..55d6f209c388
--- /dev/null
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/filters.json
@@ -0,0 +1,24 @@
+[
+ {
+ "id": "c2b4",
+ "name": "Adding a new filter after flushing empty chain doesnt
cause an infinite loop",
+ "category": [
+ "filter",
+ "chain"
+ ],
+ "setup": [
+ "$IP link add dev $DUMMY type dummy || /bin/true",
+ "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
+ "$TC chain add dev $DUMMY",
+ "$TC filter del dev $DUMMY chain 0 parent 1: prio 0"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY chain 0 parent 1:",
+ "expExitCode": "2",
+ "verifyCmd": "$TC chain ls dev $DUMMY",
+ "matchPattern": "chain parent 1: chain 0",
+ "matchCount": "1",
+ "teardown": [
+ "$TC qdisc del dev $DUMMY root handle 1: htb default 1"
+ ]
+ }
+]
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-08 16:08 ` Pedro Tammela
@ 2023-06-09 3:31 ` renmingshuai
2023-06-09 8:24 ` Simon Horman
0 siblings, 1 reply; 13+ messages in thread
From: renmingshuai @ 2023-06-09 3:31 UTC (permalink / raw)
To: pctammela
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, renmingshuai,
xiyou.wangcong, yanan
>On 08/06/2023 09:32, renmingshuai wrote:
>>> On 07/06/2023 01:19, renmingshuai wrote:
>>>>> On 06/06/2023 11:45, renmingshuai wrote:
>>>>>> When a new chain is added by using tc, one soft lockup alarm will
>>>>>> be
>>>>>> generated after delete the prio 0 filter of the chain. To
>>>>>> reproduce
>>>>>> the problem, perform the following steps:
>>>>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>>>>> (2) tc chain add dev eth0
>>>>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>>>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>>>>
>>>>> This seems like it could be added to tdc or 3 and 4 must be run in
>>>>> parallel?
>>>> 3 and 4 do not need to be run inparallel. When a new chain is added
>>>> by the
>>>> way as step 1 and the step 3 is completed, this problem always
>>>> occurs
>>>> whenever step 4 is run.
>>>
>>> Got it,
>>> The test still hangs with the provided patch.
>>>
>>> + tc qdisc add dev lo root handle 1: htb default 1
>>> + tc chain add dev lo
>>> + tc filter del dev lo chain 0 parent 1: prio 0
>>> [ 68.790030][ T6704] [+]
>>> [ 68.790060][ T6704] chain refcnt 2
>>> [ 68.790951][ T6704] [-]
>>> + tc filter add dev lo chain 0 parent 1:
>>> <hangs>
>>>
>>> Also please add this test to tdc, it should be straightforward.
>>>
>> Sorry for not testing before. I forgot that the chain->refcnt was
>> increased by 1 when tcf_chain_get() is called in tc_del_tfilter().
>> The value of chain->refcnt is 2 after chain flush. The test
>> result is as follows:
>> [root@localhost ~]# tc qdisc add dev eth2 root handle 1: htb default 1
>> [root@localhost ~]# tc chain add dev eth2
>> [root@localhost ~]# tc filter del dev eth2 chain 0 parent 1: prio 0
>> [root@localhost ~]# tc filter add dev eth2 chain 0 parent 1:
>> Error: Filter kind and protocol must be specified.
>> We have an error talking to the kernel
>>
>> And I have add this test to tdc:
>> [root@localhost tc-testing]# ./tdc.py -f tc-tests/filters/tests.json
>> ok 7 c2b4 - Adding a new fiter after deleting a filter in a chain does
>> not cause an infinite loop
>>
>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
>> Signed-off-by: renmingshuai <renmingshuai@huawei.com>
>
>Please respin with the following applied:
>
>diff --git
>a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
>b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
>index c759c3db9a37..361235ad574b 100644
>--- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
>+++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
>@@ -125,25 +125,5 @@
> "teardown": [
> "$TC qdisc del dev $DEV2 ingress"
> ]
>- },
>- {
>- "id": "c2b4",
>- "name": "Adding a new fiter after deleting a filter in a chain
>does not cause an infinite loop",
>- "category": [
>- "filter",
>- "prio"
>- ],
>- "setup": [
>- "$TC qdisc add dev $DEV1 root handle 1: htb default 1",
>- "$TC chain add dev $DEV1"
>- ],
>- "cmdUnderTest": "$TC filter del dev $DEV1 chain 0 parent 1:
>prio 0",
>- "expExitCode": "0",
>- "verifyCmd": "$TC filter add dev $DEV1 chain 0 parent 1:",
>- "matchPattern": "Error: Filter kind and protocol must be
>specified.",
>- "matchCount": "1",
>- "teardown": [
>- "$TC qdisc del dev $DEV1 root handle 1: htb default 1"
>- ]
> }
> ]
>diff --git
>a/tools/testing/selftests/tc-testing/tc-tests/infra/filters.json
>b/tools/testing/selftests/tc-testing/tc-tests/infra/filters.
>json
>new file mode 100644
>index 000000000000..55d6f209c388
>--- /dev/null
>+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/filters.json
>@@ -0,0 +1,24 @@
>+[
>+ {
>+ "id": "c2b4",
>+ "name": "Adding a new filter after flushing empty chain doesnt
>cause an infinite loop",
>+ "category": [
>+ "filter",
>+ "chain"
>+ ],
>+ "setup": [
>+ "$IP link add dev $DUMMY type dummy || /bin/true",
>+ "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
>+ "$TC chain add dev $DUMMY",
>+ "$TC filter del dev $DUMMY chain 0 parent 1: prio 0"
>+ ],
>+ "cmdUnderTest": "$TC filter add dev $DUMMY chain 0 parent 1:",
>+ "expExitCode": "2",
>+ "verifyCmd": "$TC chain ls dev $DUMMY",
>+ "matchPattern": "chain parent 1: chain 0",
>+ "matchCount": "1",
>+ "teardown": [
>+ "$TC qdisc del dev $DUMMY root handle 1: htb default 1"
>+ ]
>+ }
>+]
Ok. The new test is passed.
[root@localhost tc-testing]# ./tdc.py -f tc-tests/infra/filter.json
Test c2b4: Adding a new filter after flushing empty chain doesn't cause an infinite loop
All test results:
1..1
ok 1 c2b4 - Adding a new filter after flushing empty chain doesn't cause an infinite loop
Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Signed-off-by: renmingshuai <renmingshuai@huawei.com>
---
net/sched/cls_api.c | 7 ++++++
.../tc-testing/tc-tests/infra/filter.json | 25 +++++++++++++++++++
2 files changed, 32 insertions(+)
create mode 100644 tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2621550bfddc..3ea054e03fbf 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
tfilter_notify_chain(net, skb, block, q, parent, n,
chain, RTM_DELTFILTER, extack);
tcf_chain_flush(chain, rtnl_held);
+ /* Set the flushing flags to false to prevent an infinite loop
+ * when a new filter is added.
+ */
+ mutex_lock(&chain->filter_chain_lock);
+ if (chain->refcnt == 2)
+ chain->flushing = false;
+ mutex_unlock(&chain->filter_chain_lock);
err = 0;
goto errout;
}
diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json b/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
new file mode 100644
index 000000000000..db3b42aaa4fa
--- /dev/null
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
@@ -0,0 +1,25 @@
+[
+ {
+ "id": "c2b4",
+ "name": "Adding a new filter after flushing empty chain doesn't cause an infinite loop",
+ "category": [
+ "filter",
+ "chain"
+ ],
+ "setup": [
+ "$IP link add dev $DUMMY type dummy || /bin/true",
+ "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
+ "$TC chain add dev $DUMMY",
+ "$TC filter del dev $DUMMY chain 0 parent 1: prio 0"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY chain 0 parent 1:",
+ "expExitCode": "2",
+ "verifyCmd": "$TC chain ls dev $DUMMY",
+ "matchPattern": "chain parent 1: chain 0",
+ "matchCount": "1",
+ "teardown": [
+ "$TC qdisc del dev $DUMMY root handle 1: htb default 1",
+ "$IP link del dev $DUMMY type dummy"
+ ]
+ }
+]
--
2.27.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v3] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-09 3:31 ` [PATCH v3] " renmingshuai
@ 2023-06-09 8:24 ` Simon Horman
0 siblings, 0 replies; 13+ messages in thread
From: Simon Horman @ 2023-06-09 8:24 UTC (permalink / raw)
To: renmingshuai
Cc: pctammela, caowangbao, davem, edumazet, jhs, jiri, kuba,
liaichun, linux-kernel, liubo335, netdev, pabeni, xiyou.wangcong,
yanan
On Fri, Jun 09, 2023 at 11:31:15AM +0800, renmingshuai wrote:
> >On 08/06/2023 09:32, renmingshuai wrote:
> >>> On 07/06/2023 01:19, renmingshuai wrote:
> >>>>> On 06/06/2023 11:45, renmingshuai wrote:
> >>>>>> When a new chain is added by using tc, one soft lockup alarm will
> >>>>>> be
> >>>>>> generated after delete the prio 0 filter of the chain. To
> >>>>>> reproduce
> >>>>>> the problem, perform the following steps:
> >>>>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> >>>>>> (2) tc chain add dev eth0
> >>>>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> >>>>>> (4) tc filter add dev eth0 chain 0 parent 1:
> >>>>>
> >>>>> This seems like it could be added to tdc or 3 and 4 must be run in
> >>>>> parallel?
> >>>> 3 and 4 do not need to be run inparallel. When a new chain is added
> >>>> by the
> >>>> way as step 1 and the step 3 is completed, this problem always
> >>>> occurs
> >>>> whenever step 4 is run.
> >>>
> >>> Got it,
> >>> The test still hangs with the provided patch.
> >>>
> >>> + tc qdisc add dev lo root handle 1: htb default 1
> >>> + tc chain add dev lo
> >>> + tc filter del dev lo chain 0 parent 1: prio 0
> >>> [ 68.790030][ T6704] [+]
> >>> [ 68.790060][ T6704] chain refcnt 2
> >>> [ 68.790951][ T6704] [-]
> >>> + tc filter add dev lo chain 0 parent 1:
> >>> <hangs>
> >>>
> >>> Also please add this test to tdc, it should be straightforward.
> >>>
> >> Sorry for not testing before. I forgot that the chain->refcnt was
> >> increased by 1 when tcf_chain_get() is called in tc_del_tfilter().
> >> The value of chain->refcnt is 2 after chain flush. The test
> >> result is as follows:
> >> [root@localhost ~]# tc qdisc add dev eth2 root handle 1: htb default 1
> >> [root@localhost ~]# tc chain add dev eth2
> >> [root@localhost ~]# tc filter del dev eth2 chain 0 parent 1: prio 0
> >> [root@localhost ~]# tc filter add dev eth2 chain 0 parent 1:
> >> Error: Filter kind and protocol must be specified.
> >> We have an error talking to the kernel
> >>
> >> And I have add this test to tdc:
> >> [root@localhost tc-testing]# ./tdc.py -f tc-tests/filters/tests.json
> >> ok 7 c2b4 - Adding a new fiter after deleting a filter in a chain does
> >> not cause an infinite loop
> >>
> >> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> >> Signed-off-by: renmingshuai <renmingshuai@huawei.com>
> >
> >Please respin with the following applied:
> >
> >diff --git
> >a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> >b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> >index c759c3db9a37..361235ad574b 100644
> >--- a/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> >+++ b/tools/testing/selftests/tc-testing/tc-tests/filters/tests.json
> >@@ -125,25 +125,5 @@
> > "teardown": [
> > "$TC qdisc del dev $DEV2 ingress"
> > ]
> >- },
> >- {
> >- "id": "c2b4",
> >- "name": "Adding a new fiter after deleting a filter in a chain
> >does not cause an infinite loop",
> >- "category": [
> >- "filter",
> >- "prio"
> >- ],
> >- "setup": [
> >- "$TC qdisc add dev $DEV1 root handle 1: htb default 1",
> >- "$TC chain add dev $DEV1"
> >- ],
> >- "cmdUnderTest": "$TC filter del dev $DEV1 chain 0 parent 1:
> >prio 0",
> >- "expExitCode": "0",
> >- "verifyCmd": "$TC filter add dev $DEV1 chain 0 parent 1:",
> >- "matchPattern": "Error: Filter kind and protocol must be
> >specified.",
> >- "matchCount": "1",
> >- "teardown": [
> >- "$TC qdisc del dev $DEV1 root handle 1: htb default 1"
> >- ]
> > }
> > ]
> >diff --git
> >a/tools/testing/selftests/tc-testing/tc-tests/infra/filters.json
> >b/tools/testing/selftests/tc-testing/tc-tests/infra/filters.
> >json
> >new file mode 100644
> >index 000000000000..55d6f209c388
> >--- /dev/null
> >+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/filters.json
> >@@ -0,0 +1,24 @@
> >+[
> >+ {
> >+ "id": "c2b4",
> >+ "name": "Adding a new filter after flushing empty chain doesnt
> >cause an infinite loop",
> >+ "category": [
> >+ "filter",
> >+ "chain"
> >+ ],
> >+ "setup": [
> >+ "$IP link add dev $DUMMY type dummy || /bin/true",
> >+ "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
> >+ "$TC chain add dev $DUMMY",
> >+ "$TC filter del dev $DUMMY chain 0 parent 1: prio 0"
> >+ ],
> >+ "cmdUnderTest": "$TC filter add dev $DUMMY chain 0 parent 1:",
> >+ "expExitCode": "2",
> >+ "verifyCmd": "$TC chain ls dev $DUMMY",
> >+ "matchPattern": "chain parent 1: chain 0",
> >+ "matchCount": "1",
> >+ "teardown": [
> >+ "$TC qdisc del dev $DUMMY root handle 1: htb default 1"
> >+ ]
> >+ }
> >+]
>
> Ok. The new test is passed.
> [root@localhost tc-testing]# ./tdc.py -f tc-tests/infra/filter.json
> Test c2b4: Adding a new filter after flushing empty chain doesn't cause an infinite loop
> All test results:
> 1..1
> ok 1 c2b4 - Adding a new filter after flushing empty chain doesn't cause an infinite loop
>
> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> Signed-off-by: renmingshuai <renmingshuai@huawei.com>
Hi renmingshuai,
the above text, in it's entirety, does not meet the needs
of a patch submission for the Linux kernel. In particular:
* All the context from previous discussion (the bits prefixed by '>')
should be removed.
* The patch description should consist of just that, a description of
the patch. Any relevant test results are also welcome here.
* Information about changes from previous versions should be provided
in the form of a changelog, typically below the scissors (---).
A recent example is here:
- https://lore.kernel.org/all/20230607162353.3631199-1-mtottenh@akamai.com/
[PATCH v3] net/sched: act_pedit: Parse L3 Header for L4 offset
Also, please consider using spaces and capital letters
in your name in the from and signed-off parts of your submission.
e.g.: Mingshuai Ren <...> or Ren Mingshuai <...>
...
--
pw-bot: cr
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
@ 2023-06-09 9:46 renmingshuai
2023-06-09 14:35 ` Vlad Buslov
0 siblings, 1 reply; 13+ messages in thread
From: renmingshuai @ 2023-06-09 9:46 UTC (permalink / raw)
To: netdev, linux-kernel, jhs, xiyou.wangcong, jiri, davem, edumazet,
kuba, pabeni
Cc: liaichun, caowangbao, yanan, liubo335
When a new chain is added by using tc, one soft lockup alarm will be
generated after delete the prio 0 filter of the chain. To reproduce
the problem, perform the following steps:
(1) tc qdisc add dev eth0 root handle 1: htb default 1
(2) tc chain add dev eth0
(3) tc filter del dev eth0 chain 0 parent 1: prio 0
(4) tc filter add dev eth0 chain 0 parent 1:
The refcnt of the chain added by step 2 is equal to 1. After step 3,
the flushing flag of the chain is set to true in the tcf_chain_flush()
called by tc_del_tfilter() because the prio is 0. In this case, if
we add a new filter to this chain, it will never succeed and try again
and again because the refresh flash is always true and refcnt is 1.
A soft lock alarm is generated 20 seconds later.
The stack is show as below:
Kernel panic - not syncing: softlockup: hung tasks
CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Call Trace:
<IRQ>
dump_stack+0x57/0x6e
panic+0x196/0x3ec
watchdog_timer_fn.cold+0x16/0x5c
__run_hrtimer+0x5e/0x190
__hrtimer_run_queues+0x8a/0xe0
hrtimer_interrupt+0x110/0x2c0
? irqtime_account_irq+0x49/0xf0
__sysvec_apic_timer_interrupt+0x5f/0xe0
asm_call_irq_on_stack+0x12/0x20
</IRQ>
sysvec_apic_timer_interrupt+0x72/0x80
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:mutex_lock+0x24/0x70
RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
__tcf_chain_put+0x27/0x200
tc_new_tfilter+0x5e8/0x810
? tc_setup_cb_add+0x210/0x210
rtnetlink_rcv_msg+0x2e3/0x380
? rtnl_calcit.isra.0+0x120/0x120
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x12d/0x1d0
netlink_sendmsg+0x286/0x490
sock_sendmsg+0x62/0x70
____sys_sendmsg+0x24c/0x2c0
? import_iovec+0x17/0x20
? sendmsg_copy_msghdr+0x80/0xa0
___sys_sendmsg+0x75/0xc0
? do_fault_around+0x118/0x160
? do_read_fault+0x68/0xf0
? __handle_mm_fault+0x3f9/0x6f0
__sys_sendmsg+0x59/0xa0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7f96705b8247
RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
To avoid this case, set chain->flushing to be false if the chain->refcnt
is 2 after flushing the chain when prio is 0. I also add one test to tdc.
Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
Reviewed-by: Aichun Li <Liaichun@huawei.com>
---
V1 -> V2:
* Correct the judgment on the value chain->refcnt and add one test to tdc
---
net/sched/cls_api.c | 7 ++++++
.../tc-testing/tc-tests/infra/filter.json | 25 +++++++++++++++++++
2 files changed, 32 insertions(+)
create mode 100644 tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 2621550bfddc..3ea054e03fbf 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
tfilter_notify_chain(net, skb, block, q, parent, n,
chain, RTM_DELTFILTER, extack);
tcf_chain_flush(chain, rtnl_held);
+ /* Set the flushing flags to false to prevent an infinite loop
+ * when a new filter is added.
+ */
+ mutex_lock(&chain->filter_chain_lock);
+ if (chain->refcnt == 2)
+ chain->flushing = false;
+ mutex_unlock(&chain->filter_chain_lock);
err = 0;
goto errout;
}
diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json b/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
new file mode 100644
index 000000000000..db3b42aaa4fa
--- /dev/null
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
@@ -0,0 +1,25 @@
+[
+ {
+ "id": "c2b4",
+ "name": "Adding a new filter after flushing empty chain doesn't cause an infinite loop",
+ "category": [
+ "filter",
+ "chain"
+ ],
+ "setup": [
+ "$IP link add dev $DUMMY type dummy || /bin/true",
+ "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
+ "$TC chain add dev $DUMMY",
+ "$TC filter del dev $DUMMY chain 0 parent 1: prio 0"
+ ],
+ "cmdUnderTest": "$TC filter add dev $DUMMY chain 0 parent 1:",
+ "expExitCode": "2",
+ "verifyCmd": "$TC chain ls dev $DUMMY",
+ "matchPattern": "chain parent 1: chain 0",
+ "matchCount": "1",
+ "teardown": [
+ "$TC qdisc del dev $DUMMY root handle 1: htb default 1",
+ "$IP link del dev $DUMMY type dummy"
+ ]
+ }
+]
--
2.27.0
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-09 9:46 [PATCH v2] " renmingshuai
@ 2023-06-09 14:35 ` Vlad Buslov
2023-06-12 14:29 ` renmingshuai
0 siblings, 1 reply; 13+ messages in thread
From: Vlad Buslov @ 2023-06-09 14:35 UTC (permalink / raw)
To: renmingshuai
Cc: netdev, linux-kernel, jhs, xiyou.wangcong, jiri, davem, edumazet,
kuba, pabeni, liaichun, caowangbao, yanan, liubo335
On Fri 09 Jun 2023 at 17:46, renmingshuai <renmingshuai@huawei.com> wrote:
> When a new chain is added by using tc, one soft lockup alarm will be
> generated after delete the prio 0 filter of the chain. To reproduce
> the problem, perform the following steps:
> (1) tc qdisc add dev eth0 root handle 1: htb default 1
> (2) tc chain add dev eth0
> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
> (4) tc filter add dev eth0 chain 0 parent 1:
>
>
> The refcnt of the chain added by step 2 is equal to 1. After step 3,
> the flushing flag of the chain is set to true in the tcf_chain_flush()
> called by tc_del_tfilter() because the prio is 0. In this case, if
> we add a new filter to this chain, it will never succeed and try again
> and again because the refresh flash is always true and refcnt is 1.
> A soft lock alarm is generated 20 seconds later.
Hi Mingshuai,
Thanks for investigating and reporting this!
> The stack is show as below:
>
> Kernel panic - not syncing: softlockup: hung tasks
> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
> Call Trace:
> <IRQ>
> dump_stack+0x57/0x6e
> panic+0x196/0x3ec
> watchdog_timer_fn.cold+0x16/0x5c
> __run_hrtimer+0x5e/0x190
> __hrtimer_run_queues+0x8a/0xe0
> hrtimer_interrupt+0x110/0x2c0
> ? irqtime_account_irq+0x49/0xf0
> __sysvec_apic_timer_interrupt+0x5f/0xe0
> asm_call_irq_on_stack+0x12/0x20
> </IRQ>
> sysvec_apic_timer_interrupt+0x72/0x80
> asm_sysvec_apic_timer_interrupt+0x12/0x20
> RIP: 0010:mutex_lock+0x24/0x70
> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
> __tcf_chain_put+0x27/0x200
> tc_new_tfilter+0x5e8/0x810
> ? tc_setup_cb_add+0x210/0x210
> rtnetlink_rcv_msg+0x2e3/0x380
> ? rtnl_calcit.isra.0+0x120/0x120
> netlink_rcv_skb+0x50/0x100
> netlink_unicast+0x12d/0x1d0
> netlink_sendmsg+0x286/0x490
> sock_sendmsg+0x62/0x70
> ____sys_sendmsg+0x24c/0x2c0
> ? import_iovec+0x17/0x20
> ? sendmsg_copy_msghdr+0x80/0xa0
> ___sys_sendmsg+0x75/0xc0
> ? do_fault_around+0x118/0x160
> ? do_read_fault+0x68/0xf0
> ? __handle_mm_fault+0x3f9/0x6f0
> __sys_sendmsg+0x59/0xa0
> do_syscall_64+0x33/0x40
> entry_SYSCALL_64_after_hwframe+0x61/0xc6
> RIP: 0033:0x7f96705b8247
> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>
> To avoid this case, set chain->flushing to be false if the chain->refcnt
> is 2 after flushing the chain when prio is 0. I also add one test to tdc.
>
> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
> Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
> Reviewed-by: Aichun Li <Liaichun@huawei.com>
> ---
> V1 -> V2:
> * Correct the judgment on the value chain->refcnt and add one test to tdc
> ---
> net/sched/cls_api.c | 7 ++++++
> .../tc-testing/tc-tests/infra/filter.json | 25 +++++++++++++++++++
> 2 files changed, 32 insertions(+)
> create mode 100644 tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
>
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 2621550bfddc..3ea054e03fbf 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
> tfilter_notify_chain(net, skb, block, q, parent, n,
> chain, RTM_DELTFILTER, extack);
> tcf_chain_flush(chain, rtnl_held);
> + /* Set the flushing flags to false to prevent an infinite loop
> + * when a new filter is added.
> + */
> + mutex_lock(&chain->filter_chain_lock);
> + if (chain->refcnt == 2)
> + chain->flushing = false;
> + mutex_unlock(&chain->filter_chain_lock);
I don't think this check is enough since there can be concurrent filter
ops holding the reference to the chain. This just makes the issue harder
to reproduce.
I'll try to formulate a fix that takes any potential concurrent users
into account and verify it with your test.
> err = 0;
> goto errout;
> }
> diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json b/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
> new file mode 100644
> index 000000000000..db3b42aaa4fa
> --- /dev/null
> +++ b/tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
> @@ -0,0 +1,25 @@
> +[
> + {
> + "id": "c2b4",
> + "name": "Adding a new filter after flushing empty chain doesn't cause an infinite loop",
> + "category": [
> + "filter",
> + "chain"
> + ],
> + "setup": [
> + "$IP link add dev $DUMMY type dummy || /bin/true",
> + "$TC qdisc add dev $DUMMY root handle 1: htb default 1",
> + "$TC chain add dev $DUMMY",
> + "$TC filter del dev $DUMMY chain 0 parent 1: prio 0"
> + ],
> + "cmdUnderTest": "$TC filter add dev $DUMMY chain 0 parent 1:",
> + "expExitCode": "2",
> + "verifyCmd": "$TC chain ls dev $DUMMY",
> + "matchPattern": "chain parent 1: chain 0",
> + "matchCount": "1",
> + "teardown": [
> + "$TC qdisc del dev $DUMMY root handle 1: htb default 1",
> + "$IP link del dev $DUMMY type dummy"
> + ]
> + }
> +]
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-09 14:35 ` Vlad Buslov
@ 2023-06-12 14:29 ` renmingshuai
2023-06-12 14:48 ` Vlad Buslov
0 siblings, 1 reply; 13+ messages in thread
From: renmingshuai @ 2023-06-12 14:29 UTC (permalink / raw)
To: vladbu
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, renmingshuai,
xiyou.wangcong, yanan
>On Fri 09 Jun 2023 at 17:46, renmingshuai <renmingshuai@huawei.com> wrote:
>> When a new chain is added by using tc, one soft lockup alarm will be
>> generated after delete the prio 0 filter of the chain. To reproduce
>> the problem, perform the following steps:
>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>> (2) tc chain add dev eth0
>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>> (4) tc filter add dev eth0 chain 0 parent 1:
>>
>>
>> The refcnt of the chain added by step 2 is equal to 1. After step 3,
>> the flushing flag of the chain is set to true in the tcf_chain_flush()
>> called by tc_del_tfilter() because the prio is 0. In this case, if
>> we add a new filter to this chain, it will never succeed and try again
>> and again because the refresh flash is always true and refcnt is 1.
>> A soft lock alarm is generated 20 seconds later.
>
>Hi Mingshuai,
>
>Thanks for investigating and reporting this!
>
>> The stack is show as below:
>>
>> Kernel panic - not syncing: softlockup: hung tasks
>> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>> Call Trace:
>> <IRQ>
>> dump_stack+0x57/0x6e
>> panic+0x196/0x3ec
>> watchdog_timer_fn.cold+0x16/0x5c
>> __run_hrtimer+0x5e/0x190
>> __hrtimer_run_queues+0x8a/0xe0
>> hrtimer_interrupt+0x110/0x2c0
>> ? irqtime_account_irq+0x49/0xf0
>> __sysvec_apic_timer_interrupt+0x5f/0xe0
>> asm_call_irq_on_stack+0x12/0x20
>> </IRQ>
>> sysvec_apic_timer_interrupt+0x72/0x80
>> asm_sysvec_apic_timer_interrupt+0x12/0x20
>> RIP: 0010:mutex_lock+0x24/0x70
>> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
>> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
>> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
>> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
>> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
>> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
>> __tcf_chain_put+0x27/0x200
>> tc_new_tfilter+0x5e8/0x810
>> ? tc_setup_cb_add+0x210/0x210
>> rtnetlink_rcv_msg+0x2e3/0x380
>> ? rtnl_calcit.isra.0+0x120/0x120
>> netlink_rcv_skb+0x50/0x100
>> netlink_unicast+0x12d/0x1d0
>> netlink_sendmsg+0x286/0x490
>> sock_sendmsg+0x62/0x70
>> ____sys_sendmsg+0x24c/0x2c0
>> ? import_iovec+0x17/0x20
>> ? sendmsg_copy_msghdr+0x80/0xa0
>> ___sys_sendmsg+0x75/0xc0
>> ? do_fault_around+0x118/0x160
>> ? do_read_fault+0x68/0xf0
>> ? __handle_mm_fault+0x3f9/0x6f0
>> __sys_sendmsg+0x59/0xa0
>> do_syscall_64+0x33/0x40
>> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>> RIP: 0033:0x7f96705b8247
>> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
>> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
>> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
>> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>>
>> To avoid this case, set chain->flushing to be false if the chain->refcnt
>> is 2 after flushing the chain when prio is 0. I also add one test to tdc.
>>
>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
>> Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
>> Reviewed-by: Aichun Li <Liaichun@huawei.com>
>> ---
>> V1 -> V2:
>> * Correct the judgment on the value chain->refcnt and add one test to tdc
>> ---
>> net/sched/cls_api.c | 7 ++++++
>> .../tc-testing/tc-tests/infra/filter.json | 25 +++++++++++++++++++
>> 2 files changed, 32 insertions(+)
>> create mode 100644 tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
>>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index 2621550bfddc..3ea054e03fbf 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
>> tfilter_notify_chain(net, skb, block, q, parent, n,
>> chain, RTM_DELTFILTER, extack);
>> tcf_chain_flush(chain, rtnl_held);
>> + /* Set the flushing flags to false to prevent an infinite loop
>> + * when a new filter is added.
>> + */
>> + mutex_lock(&chain->filter_chain_lock);
>> + if (chain->refcnt == 2)
>> + chain->flushing = false;
>> + mutex_unlock(&chain->filter_chain_lock);
>
>I don't think this check is enough since there can be concurrent filter
>ops holding the reference to the chain. This just makes the issue harder
>to reproduce.
>
>I'll try to formulate a fix that takes any potential concurrent users
>into account and verify it with your test.
Thanks for your reply.
I didn't find any concurrent scenarios that would "escape" this check.
That's my understanding.
During the flush operation, after filter_chain_lock is obtained, no new chain reference
could be added. After unlock, chain->flushing is set to true. The chain->refcnt and
chain->action_refcnt are always different. Therefore, chain->flushing will be never set to false.
Then there seems to be no concurrency problem until flushing is set to true.
would you mind tell me the concurrent scenario you're talking about?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-12 14:29 ` renmingshuai
@ 2023-06-12 14:48 ` Vlad Buslov
2023-06-13 7:02 ` renmingshuai
0 siblings, 1 reply; 13+ messages in thread
From: Vlad Buslov @ 2023-06-12 14:48 UTC (permalink / raw)
To: renmingshuai
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, xiyou.wangcong, yanan
On Mon 12 Jun 2023 at 22:29, renmingshuai <renmingshuai@huawei.com> wrote:
>>On Fri 09 Jun 2023 at 17:46, renmingshuai <renmingshuai@huawei.com> wrote:
>>> When a new chain is added by using tc, one soft lockup alarm will be
>>> generated after delete the prio 0 filter of the chain. To reproduce
>>> the problem, perform the following steps:
>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>> (2) tc chain add dev eth0
>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>>
>>>
>>> The refcnt of the chain added by step 2 is equal to 1. After step 3,
>>> the flushing flag of the chain is set to true in the tcf_chain_flush()
>>> called by tc_del_tfilter() because the prio is 0. In this case, if
>>> we add a new filter to this chain, it will never succeed and try again
>>> and again because the refresh flash is always true and refcnt is 1.
>>> A soft lock alarm is generated 20 seconds later.
>>
>>Hi Mingshuai,
>>
>>Thanks for investigating and reporting this!
>>
>>> The stack is show as below:
>>>
>>> Kernel panic - not syncing: softlockup: hung tasks
>>> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>>> Call Trace:
>>> <IRQ>
>>> dump_stack+0x57/0x6e
>>> panic+0x196/0x3ec
>>> watchdog_timer_fn.cold+0x16/0x5c
>>> __run_hrtimer+0x5e/0x190
>>> __hrtimer_run_queues+0x8a/0xe0
>>> hrtimer_interrupt+0x110/0x2c0
>>> ? irqtime_account_irq+0x49/0xf0
>>> __sysvec_apic_timer_interrupt+0x5f/0xe0
>>> asm_call_irq_on_stack+0x12/0x20
>>> </IRQ>
>>> sysvec_apic_timer_interrupt+0x72/0x80
>>> asm_sysvec_apic_timer_interrupt+0x12/0x20
>>> RIP: 0010:mutex_lock+0x24/0x70
>>> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
>>> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
>>> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
>>> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
>>> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
>>> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
>>> __tcf_chain_put+0x27/0x200
>>> tc_new_tfilter+0x5e8/0x810
>>> ? tc_setup_cb_add+0x210/0x210
>>> rtnetlink_rcv_msg+0x2e3/0x380
>>> ? rtnl_calcit.isra.0+0x120/0x120
>>> netlink_rcv_skb+0x50/0x100
>>> netlink_unicast+0x12d/0x1d0
>>> netlink_sendmsg+0x286/0x490
>>> sock_sendmsg+0x62/0x70
>>> ____sys_sendmsg+0x24c/0x2c0
>>> ? import_iovec+0x17/0x20
>>> ? sendmsg_copy_msghdr+0x80/0xa0
>>> ___sys_sendmsg+0x75/0xc0
>>> ? do_fault_around+0x118/0x160
>>> ? do_read_fault+0x68/0xf0
>>> ? __handle_mm_fault+0x3f9/0x6f0
>>> __sys_sendmsg+0x59/0xa0
>>> do_syscall_64+0x33/0x40
>>> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>> RIP: 0033:0x7f96705b8247
>>> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
>>> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
>>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
>>> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
>>> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>>>
>>> To avoid this case, set chain->flushing to be false if the chain->refcnt
>>> is 2 after flushing the chain when prio is 0. I also add one test to tdc.
>>>
>>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
>>> Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
>>> Reviewed-by: Aichun Li <Liaichun@huawei.com>
>>> ---
>>> V1 -> V2:
>>> * Correct the judgment on the value chain->refcnt and add one test to tdc
>>> ---
>>> net/sched/cls_api.c | 7 ++++++
>>> .../tc-testing/tc-tests/infra/filter.json | 25 +++++++++++++++++++
>>> 2 files changed, 32 insertions(+)
>>> create mode 100644 tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index 2621550bfddc..3ea054e03fbf 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
>>> tfilter_notify_chain(net, skb, block, q, parent, n,
>>> chain, RTM_DELTFILTER, extack);
>>> tcf_chain_flush(chain, rtnl_held);
>>> + /* Set the flushing flags to false to prevent an infinite loop
>>> + * when a new filter is added.
>>> + */
>>> + mutex_lock(&chain->filter_chain_lock);
>>> + if (chain->refcnt == 2)
>>> + chain->flushing = false;
>>> + mutex_unlock(&chain->filter_chain_lock);
>>
>>I don't think this check is enough since there can be concurrent filter
>>ops holding the reference to the chain. This just makes the issue harder
>>to reproduce.
>>
>>I'll try to formulate a fix that takes any potential concurrent users
>>into account and verify it with your test.
>
> Thanks for your reply.
> I didn't find any concurrent scenarios that would "escape" this check.
> That's my understanding.
> During the flush operation, after filter_chain_lock is obtained, no new chain reference
> could be added. After unlock, chain->flushing is set to true. The chain->refcnt and
> chain->action_refcnt are always different. Therefore, chain->flushing will be never set to false.
I agree with your analysis.
> Then there seems to be no concurrency problem until flushing is set to true.
> would you mind tell me the concurrent scenario you're talking about?
What if, for example, chain->refcnt==3 after flush because there is
another task inserting filters to the same chain concurrently? In such
case for chains explicitly created with chains API, chain->flushing flag
will never be reset afterwards and user will get the same lockup when
trying to install any following filters. Also, refcnt==2 only means that
there is no concurrent users when flushing explicitly created chains
(because chains API holds a "passive" reference to such chains). For
regular chains implicitly created by filters API refcnt==2 here means
that there are active concurrent users.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc
2023-06-12 14:48 ` Vlad Buslov
@ 2023-06-13 7:02 ` renmingshuai
0 siblings, 0 replies; 13+ messages in thread
From: renmingshuai @ 2023-06-13 7:02 UTC (permalink / raw)
To: vladbu
Cc: caowangbao, davem, edumazet, jhs, jiri, kuba, liaichun,
linux-kernel, liubo335, netdev, pabeni, renmingshuai,
xiyou.wangcong, yanan
>On Mon 12 Jun 2023 at 22:29, renmingshuai <renmingshuai@huawei.com> wrote:
>>>On Fri 09 Jun 2023 at 17:46, renmingshuai <renmingshuai@huawei.com> wrote:
>>>> When a new chain is added by using tc, one soft lockup alarm will be
>>>> generated after delete the prio 0 filter of the chain. To reproduce
>>>> the problem, perform the following steps:
>>>> (1) tc qdisc add dev eth0 root handle 1: htb default 1
>>>> (2) tc chain add dev eth0
>>>> (3) tc filter del dev eth0 chain 0 parent 1: prio 0
>>>> (4) tc filter add dev eth0 chain 0 parent 1:
>>>>
>>>>
>>>> The refcnt of the chain added by step 2 is equal to 1. After step 3,
>>>> the flushing flag of the chain is set to true in the tcf_chain_flush()
>>>> called by tc_del_tfilter() because the prio is 0. In this case, if
>>>> we add a new filter to this chain, it will never succeed and try again
>>>> and again because the refresh flash is always true and refcnt is 1.
>>>> A soft lock alarm is generated 20 seconds later.
>>>
>>>Hi Mingshuai,
>>>
>>>Thanks for investigating and reporting this!
>>>
>>>> The stack is show as below:
>>>>
>>>> Kernel panic - not syncing: softlockup: hung tasks
>>>> CPU: 2 PID: 3321861 Comm: tc Kdump: loaded Tainted: G
>>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
>>>> Call Trace:
>>>> <IRQ>
>>>> dump_stack+0x57/0x6e
>>>> panic+0x196/0x3ec
>>>> watchdog_timer_fn.cold+0x16/0x5c
>>>> __run_hrtimer+0x5e/0x190
>>>> __hrtimer_run_queues+0x8a/0xe0
>>>> hrtimer_interrupt+0x110/0x2c0
>>>> ? irqtime_account_irq+0x49/0xf0
>>>> __sysvec_apic_timer_interrupt+0x5f/0xe0
>>>> asm_call_irq_on_stack+0x12/0x20
>>>> </IRQ>
>>>> sysvec_apic_timer_interrupt+0x72/0x80
>>>> asm_sysvec_apic_timer_interrupt+0x12/0x20
>>>> RIP: 0010:mutex_lock+0x24/0x70
>>>> RSP: 0018:ffffa836004ab9a8 EFLAGS: 00000246
>>>> RAX: 0000000000000000 RBX: ffff95bb02d76700 RCX: 0000000000000000
>>>> RDX: ffff95bb27462100 RSI: 0000000000000000 RDI: ffff95ba5b527000
>>>> RBP: ffff95ba5b527000 R08: 0000000000000001 R09: ffffa836004abbb8
>>>> R10: 000000000000000f R11: 0000000000000000 R12: 0000000000000000
>>>> R13: ffff95ba5b527000 R14: ffffa836004abbb8 R15: 0000000000000001
>>>> __tcf_chain_put+0x27/0x200
>>>> tc_new_tfilter+0x5e8/0x810
>>>> ? tc_setup_cb_add+0x210/0x210
>>>> rtnetlink_rcv_msg+0x2e3/0x380
>>>> ? rtnl_calcit.isra.0+0x120/0x120
>>>> netlink_rcv_skb+0x50/0x100
>>>> netlink_unicast+0x12d/0x1d0
>>>> netlink_sendmsg+0x286/0x490
>>>> sock_sendmsg+0x62/0x70
>>>> ____sys_sendmsg+0x24c/0x2c0
>>>> ? import_iovec+0x17/0x20
>>>> ? sendmsg_copy_msghdr+0x80/0xa0
>>>> ___sys_sendmsg+0x75/0xc0
>>>> ? do_fault_around+0x118/0x160
>>>> ? do_read_fault+0x68/0xf0
>>>> ? __handle_mm_fault+0x3f9/0x6f0
>>>> __sys_sendmsg+0x59/0xa0
>>>> do_syscall_64+0x33/0x40
>>>> entry_SYSCALL_64_after_hwframe+0x61/0xc6
>>>> RIP: 0033:0x7f96705b8247
>>>> RSP: 002b:00007ffe552e9dc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
>>>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f96705b8247
>>>> RDX: 0000000000000000 RSI: 00007ffe552e9e40 RDI: 0000000000000003
>>>> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000558113f678b0
>>>> R10: 00007f967069ab00 R11: 0000000000000246 R12: 00000000647ea089
>>>> R13: 00007ffe552e9f30 R14: 0000000000000001 R15: 0000558113175f00
>>>>
>>>> To avoid this case, set chain->flushing to be false if the chain->refcnt
>>>> is 2 after flushing the chain when prio is 0. I also add one test to tdc.
>>>>
>>>> Fixes: 726d061286ce ("net: sched: prevent insertion of new classifiers during chain flush")
>>>> Signed-off-by: Mingshuai Ren <renmingshuai@huawei.com>
>>>> Reviewed-by: Aichun Li <Liaichun@huawei.com>
>>>> ---
>>>> V1 -> V2:
>>>> * Correct the judgment on the value chain->refcnt and add one test to tdc
>>>> ---
>>>> net/sched/cls_api.c | 7 ++++++
>>>> .../tc-testing/tc-tests/infra/filter.json | 25 +++++++++++++++++++
>>>> 2 files changed, 32 insertions(+)
>>>> create mode 100644 tools/testing/selftests/tc-testing/tc-tests/infra/filter.json
>>>>
>>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>>> index 2621550bfddc..3ea054e03fbf 100644
>>>> --- a/net/sched/cls_api.c
>>>> +++ b/net/sched/cls_api.c
>>>> @@ -2442,6 +2442,13 @@ static int tc_del_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
>>>> tfilter_notify_chain(net, skb, block, q, parent, n,
>>>> chain, RTM_DELTFILTER, extack);
>>>> tcf_chain_flush(chain, rtnl_held);
>>>> + /* Set the flushing flags to false to prevent an infinite loop
>>>> + * when a new filter is added.
>>>> + */
>>>> + mutex_lock(&chain->filter_chain_lock);
>>>> + if (chain->refcnt == 2)
>>>> + chain->flushing = false;
>>>> + mutex_unlock(&chain->filter_chain_lock);
>>>
>>>I don't think this check is enough since there can be concurrent filter
>>>ops holding the reference to the chain. This just makes the issue harder
>>>to reproduce.
>>>
>>>I'll try to formulate a fix that takes any potential concurrent users
>>>into account and verify it with your test.
>>
>> Thanks for your reply.
>> I didn't find any concurrent scenarios that would "escape" this check.
>> That's my understanding.
>> During the flush operation, after filter_chain_lock is obtained, no new chain reference
>> could be added. After unlock, chain->flushing is set to true. The chain->refcnt and
>> chain->action_refcnt are always different. Therefore, chain->flushing will be never set to false.
>
>I agree with your analysis.
>
>> Then there seems to be no concurrency problem until flushing is set to true.
>> would you mind tell me the concurrent scenario you're talking about?
>
>What if, for example, chain->refcnt==3 after flush because there is
>another task inserting filters to the same chain concurrently? In such
>case for chains explicitly created with chains API, chain->flushing flag
>will never be reset afterwards and user will get the same lockup when
>trying to install any following filters. Also, refcnt==2 only means that
>there is no concurrent users when flushing explicitly created chains
>(because chains API holds a "passive" reference to such chains). For
>regular chains implicitly created by filters API refcnt==2 here means
>that there are active concurrent users.
Thanks, you are right.
My previous insistence that the chain->refcnt would only be increased if the filter was successfully
inserted into the chain was clearly wrong.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2023-06-13 7:03 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-06 14:45 [PATCH] net/sched: Set the flushing flags to false to prevent an infinite loop renmingshuai
2023-06-06 16:59 ` Pedro Tammela
2023-06-07 4:19 ` renmingshuai
2023-06-07 15:02 ` Pedro Tammela
2023-06-08 12:32 ` [PATCH v2] net/sched: Set the flushing flags to false to prevent an infinite loop and add one test to tdc renmingshuai
2023-06-08 16:08 ` Pedro Tammela
2023-06-09 3:31 ` [PATCH v3] " renmingshuai
2023-06-09 8:24 ` Simon Horman
2023-06-09 9:46 [PATCH v2] " renmingshuai
2023-06-09 14:35 ` Vlad Buslov
2023-06-12 14:29 ` renmingshuai
2023-06-12 14:48 ` Vlad Buslov
2023-06-13 7:02 ` renmingshuai
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).