From: Vlad Buslov <email@example.com> To: Ido Schimmel <firstname.lastname@example.org> Cc: "email@example.com" <firstname.lastname@example.org>, "email@example.com" <firstname.lastname@example.org>, "email@example.com" <firstname.lastname@example.org>, "email@example.com" <firstname.lastname@example.org>, "email@example.com" <firstname.lastname@example.org>, "email@example.com" <firstname.lastname@example.org>, "email@example.com" <firstname.lastname@example.org> Subject: Re: [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Date: Fri, 15 Feb 2019 15:35:10 +0000 Message-ID: <email@example.com> (raw) In-Reply-To: <20190215113041.GA10511@splinter> On Fri 15 Feb 2019 at 11:30, Ido Schimmel <firstname.lastname@example.org> wrote: > On Fri, Feb 15, 2019 at 10:02:11AM +0000, Vlad Buslov wrote: >> >> On Thu 14 Feb 2019 at 18:24, Ido Schimmel <email@example.com> wrote: >> > On Mon, Feb 11, 2019 at 10:55:38AM +0200, Vlad Buslov wrote: >> >> Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain >> >> when accessing filter_chain list, instead of relying on rtnl lock. >> >> Dereference filter_chain with tcf_chain_dereference() lockdep macro to >> >> verify that all users of chain_list have the lock taken. >> >> >> >> Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute >> >> all necessary code while holding chain lock in order to prevent >> >> invalidation of chain_info structure by potential concurrent change. This >> >> also serializes calls to tcf_chain0_head_change(), which allows head change >> >> callbacks to rely on filter_chain_lock for synchronization instead of rtnl >> >> mutex. >> >> >> >> Signed-off-by: Vlad Buslov <firstname.lastname@example.org> >> >> Acked-by: Jiri Pirko <email@example.com> >> > >> > With this sequence  I get the following trace . Bisected it to >> > this patch. Note that second filter is rejected by the device driver >> > (that's the intention). I guess it is not properly removed from the >> > filter chain in the error path? >> > >> > Thanks >> > >> >  >> > # tc qdisc add dev swp3 clsact >> > # tc filter add dev swp3 ingress pref 1 matchall skip_sw \ >> > action mirred egress mirror dev swp7 >> > # tc filter add dev swp3 ingress pref 2 matchall skip_sw \ >> > action mirred egress mirror dev swp7 >> > RTNETLINK answers: File exists >> > We have an error talking to the kernel, -1 >> > # tc qdisc del dev swp3 clsact >> > >> >  >> > [ 70.545131] kasan: GPF could be caused by NULL-ptr deref or user memory access >> > [ 70.553394] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI >> > [ 70.560618] CPU: 2 PID: 2268 Comm: tc Not tainted 5.0.0-rc5-custom-02103-g415d39427317 #1304 >> > [ 70.570065] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016 >> > [ 70.580204] RIP: 0010:mall_reoffload+0x14a/0x760 >> > [ 70.585382] Code: c1 0f 85 ba 05 00 00 31 c0 4d 8d 6c 24 34 b9 06 00 00 00 4c 89 ff f3 48 ab 4c 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 14 02 4c 89 e8 83 >> > e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 bd >> > [ 70.606382] RSP: 0018:ffff888231faefc0 EFLAGS: 00010207 >> > [ 70.612235] RAX: dffffc0000000000 RBX: 1ffff110463f5dfe RCX: 0000000000000000 >> > [ 70.620220] RDX: 0000000000000006 RSI: 1ffff110463f5e01 RDI: ffff888231faf040 >> > [ 70.628206] RBP: ffff8881ef151a00 R08: 0000000000000000 R09: ffffed10463f5dfa >> > [ 70.636192] R10: ffffed10463f5dfa R11: 0000000000000003 R12: 0000000000000000 >> > [ 70.644177] R13: 0000000000000034 R14: 0000000000000000 R15: ffff888231faf010 >> > [ 70.652163] FS: 00007f46b5bf0800(0000) GS:ffff888236c00000(0000) knlGS:0000000000000000 >> > [ 70.661218] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> > [ 70.667649] CR2: 0000000001d590a8 CR3: 0000000231c3c000 CR4: 00000000001006e0 >> > [ 70.675633] Call Trace: >> > [ 70.693046] tcf_block_playback_offloads+0x94/0x230 >> > [ 70.710617] __tcf_block_cb_unregister+0xf7/0x2d0 >> > [ 70.734143] mlxsw_sp_setup_tc+0x20f/0x660 >> > [ 70.738739] tcf_block_offload_unbind+0x22b/0x350 >> > [ 70.748898] __tcf_block_put+0x203/0x630 >> > [ 70.769700] tcf_block_put_ext+0x2f/0x40 >> > [ 70.774098] clsact_destroy+0x7a/0xb0 >> > [ 70.782604] qdisc_destroy+0x11a/0x5c0 >> > [ 70.786810] qdisc_put+0x5a/0x70 >> > [ 70.790435] notify_and_destroy+0x8e/0xa0 >> > [ 70.794933] qdisc_graft+0xbb7/0xef0 >> > [ 70.809009] tc_get_qdisc+0x518/0xa30 >> > [ 70.821530] rtnetlink_rcv_msg+0x397/0xa20 >> > [ 70.835510] netlink_rcv_skb+0x132/0x380 >> > [ 70.848419] netlink_unicast+0x4c0/0x690 >> > [ 70.866019] netlink_sendmsg+0x929/0xe10 >> > [ 70.882134] sock_sendmsg+0xc8/0x110 >> > [ 70.886144] ___sys_sendmsg+0x77a/0x8f0 >> > [ 70.924064] __sys_sendmsg+0xf7/0x250 >> > [ 70.955727] do_syscall_64+0x14d/0x610 >> >> Hi Ido, >> >> Thanks for reporting this. >> >> I looked at the code and problem seems to be matchall classifier >> specific. My implementation of unlocked cls API assumes that concurrent >> insertions are possible and checks for it when deleting "empty" tp. >> Since classifiers don't expose number of elements, the only way to test >> this is to do tp->walk() on them and assume that walk callback is called >> once per filter on every classifier. In your example new tp is created >> for second filter, filter insertion fails, number of elements on newly >> created tp is checked with tp->walk() before deleting it. However, >> matchall classifier always calls the tp->walk() callback once, even when >> it doesn't have a valid filter (in this case with NULL filter pointer). >> >> Possible ways to fix this: >> >> 1) Check for NULL filter pointer in tp->walk() callback and ignore it >> when counting filters on tp - will work but I don't like it because I >> don't think it is ever correct to call tp->walk() callback with NULL >> filter pointer. >> >> 2) Fix matchall to not call tp->walk() callback with NULL filter >> pointers - my preferred simple solution. >> >> 3) Extend tp API to have direct way to count filters by implementing >> tp->count - requires change to every classifier. Or maybe add it as >> optional API that only unlocked classifiers are required to implement >> and just delete rtnl lock dependent tp's without checking for concurrent >> insertions. >> >> What do you think? > > Since the problem is matchall-specific, then it makes sense to fix it in > matchall like you suggested in option #2. > > Can you please use this opportunity and audit other classifiers to > confirm problem is indeed specific to matchall? > Turns out cls_cgroup has the same problem. Another problem that I found in cls_fw and cls_route is that they set arg->stop when empty. Both of them have code unchanged since it was committed initially in 2005 so I assume this convention is no longer relevant because all other classifiers don't do that (they only set arg->stop when arg->fn returns negative value). I sent fixes for all of those cases.
next prev parent reply index Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-02-11 8:55 [PATCH net-next v4 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 01/17] net: sched: protect block state with mutex Vlad Buslov 2019-02-11 14:15 ` Jiri Pirko 2019-02-11 8:55 ` [PATCH net-next v4 02/17] net: sched: protect chain->explicitly_created with block->lock Vlad Buslov 2019-02-11 14:15 ` Jiri Pirko 2019-02-11 8:55 ` [PATCH net-next v4 03/17] net: sched: refactor tc_ctl_chain() to use block->lock Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 04/17] net: sched: protect block->chain0 with block->lock Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 05/17] net: sched: traverse chains in block with tcf_get_next_chain() Vlad Buslov 2019-02-15 22:21 ` Cong Wang 2019-02-18 10:07 ` Vlad Buslov 2019-02-18 18:26 ` Cong Wang 2019-02-19 16:04 ` Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 06/17] net: sched: protect chain template accesses with block lock Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Vlad Buslov 2019-02-14 18:24 ` Ido Schimmel 2019-02-15 10:02 ` Vlad Buslov 2019-02-15 11:30 ` Ido Schimmel 2019-02-15 12:11 ` [PATCH] net: sched: matchall: verify that filter is not NULL in mall_walk() Vlad Buslov 2019-02-15 13:47 ` Ido Schimmel 2019-02-16 0:24 ` Cong Wang 2019-02-18 12:00 ` Vlad Buslov 2019-02-17 21:27 ` David Miller 2019-02-15 12:15 ` [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Vlad Buslov 2019-02-15 15:35 ` Vlad Buslov [this message] 2019-02-19 5:26 ` Cong Wang 2019-02-19 12:31 ` Vlad Buslov 2019-02-20 22:43 ` Cong Wang 2019-02-21 15:49 ` Vlad Buslov 2019-02-19 5:08 ` Cong Wang 2019-02-19 15:20 ` Vlad Buslov 2019-02-20 23:00 ` Cong Wang 2019-02-21 17:11 ` Vlad Buslov 2019-02-15 22:35 ` Cong Wang 2019-02-18 11:06 ` Vlad Buslov 2019-02-18 18:31 ` Cong Wang 2019-02-11 8:55 ` [PATCH net-next v4 08/17] net: sched: introduce reference counting for tcf_proto Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 09/17] net: sched: traverse classifiers in chain with tcf_get_next_proto() Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 10/17] net: sched: refactor tp insert/delete for concurrent execution Vlad Buslov 2019-02-15 23:17 ` Cong Wang 2019-02-18 11:19 ` Vlad Buslov 2019-02-18 19:55 ` Cong Wang 2019-02-19 10:25 ` Vlad Buslov 2019-02-18 19:53 ` Cong Wang 2019-02-11 8:55 ` [PATCH net-next v4 11/17] net: sched: prevent insertion of new classifiers during chain flush Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 12/17] net: sched: track rtnl lock status when validating extensions Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 13/17] net: sched: extend proto ops with 'put' callback Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 14/17] net: sched: extend proto ops to support unlocked classifiers Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 15/17] net: sched: add flags to Qdisc class ops struct Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 16/17] net: sched: refactor tcf_block_find() into standalone functions Vlad Buslov 2019-02-11 8:55 ` [PATCH net-next v4 17/17] net: sched: unlock rules update API Vlad Buslov 2019-02-18 18:56 ` Cong Wang 2019-02-12 18:42 ` [PATCH net-next v4 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock David Miller
Reply instructions: You may reply publically to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Netdev Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \ firstname.lastname@example.org email@example.com public-inbox-index netdev Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.netdev AGPL code for this site: git clone https://public-inbox.org/ public-inbox