Netdev Archive on lore.kernel.org
 help / color / Atom feed
From: Vlad Buslov <vladbu@mellanox.com>
To: Ido Schimmel <idosch@idosch.org>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"jhs@mojatatu.com" <jhs@mojatatu.com>,
	"xiyou.wangcong@gmail.com" <xiyou.wangcong@gmail.com>,
	"jiri@resnulli.us" <jiri@resnulli.us>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"ast@kernel.org" <ast@kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>
Subject: Re: [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex
Date: Fri, 15 Feb 2019 10:02:11 +0000
Message-ID: <vbfh8d5jroi.fsf@mellanox.com> (raw)
In-Reply-To: <20190214182442.GA19269@splinter>


On Thu 14 Feb 2019 at 18:24, Ido Schimmel <idosch@idosch.org> wrote:
> On Mon, Feb 11, 2019 at 10:55:38AM +0200, Vlad Buslov wrote:
>> Extend tcf_chain with new filter_chain_lock mutex. Always lock the chain
>> when accessing filter_chain list, instead of relying on rtnl lock.
>> Dereference filter_chain with tcf_chain_dereference() lockdep macro to
>> verify that all users of chain_list have the lock taken.
>>
>> Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to execute
>> all necessary code while holding chain lock in order to prevent
>> invalidation of chain_info structure by potential concurrent change. This
>> also serializes calls to tcf_chain0_head_change(), which allows head change
>> callbacks to rely on filter_chain_lock for synchronization instead of rtnl
>> mutex.
>>
>> Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
>> Acked-by: Jiri Pirko <jiri@mellanox.com>
>
> With this sequence [1] I get the following trace [2]. Bisected it to
> this patch. Note that second filter is rejected by the device driver
> (that's the intention). I guess it is not properly removed from the
> filter chain in the error path?
>
> Thanks
>
> [1]
> # tc qdisc add dev swp3 clsact
> # tc filter add dev swp3 ingress pref 1 matchall skip_sw \
> 	action mirred egress mirror dev swp7
> # tc filter add dev swp3 ingress pref 2 matchall skip_sw \
> 	action mirred egress mirror dev swp7
> RTNETLINK answers: File exists
> We have an error talking to the kernel, -1
> # tc qdisc del dev swp3 clsact
>
> [2]
> [   70.545131] kasan: GPF could be caused by NULL-ptr deref or user memory access
> [   70.553394] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
> [   70.560618] CPU: 2 PID: 2268 Comm: tc Not tainted 5.0.0-rc5-custom-02103-g415d39427317 #1304
> [   70.570065] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
> [   70.580204] RIP: 0010:mall_reoffload+0x14a/0x760
> [   70.585382] Code: c1 0f 85 ba 05 00 00 31 c0 4d 8d 6c 24 34 b9 06 00 00 00 4c 89 ff f3 48 ab 4c 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <0f> b6 14 02 4c 89 e8 83
> e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 bd
> [   70.606382] RSP: 0018:ffff888231faefc0 EFLAGS: 00010207
> [   70.612235] RAX: dffffc0000000000 RBX: 1ffff110463f5dfe RCX: 0000000000000000
> [   70.620220] RDX: 0000000000000006 RSI: 1ffff110463f5e01 RDI: ffff888231faf040
> [   70.628206] RBP: ffff8881ef151a00 R08: 0000000000000000 R09: ffffed10463f5dfa
> [   70.636192] R10: ffffed10463f5dfa R11: 0000000000000003 R12: 0000000000000000
> [   70.644177] R13: 0000000000000034 R14: 0000000000000000 R15: ffff888231faf010
> [   70.652163] FS:  00007f46b5bf0800(0000) GS:ffff888236c00000(0000) knlGS:0000000000000000
> [   70.661218] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   70.667649] CR2: 0000000001d590a8 CR3: 0000000231c3c000 CR4: 00000000001006e0
> [   70.675633] Call Trace:
> [   70.693046]  tcf_block_playback_offloads+0x94/0x230
> [   70.710617]  __tcf_block_cb_unregister+0xf7/0x2d0
> [   70.734143]  mlxsw_sp_setup_tc+0x20f/0x660
> [   70.738739]  tcf_block_offload_unbind+0x22b/0x350
> [   70.748898]  __tcf_block_put+0x203/0x630
> [   70.769700]  tcf_block_put_ext+0x2f/0x40
> [   70.774098]  clsact_destroy+0x7a/0xb0
> [   70.782604]  qdisc_destroy+0x11a/0x5c0
> [   70.786810]  qdisc_put+0x5a/0x70
> [   70.790435]  notify_and_destroy+0x8e/0xa0
> [   70.794933]  qdisc_graft+0xbb7/0xef0
> [   70.809009]  tc_get_qdisc+0x518/0xa30
> [   70.821530]  rtnetlink_rcv_msg+0x397/0xa20
> [   70.835510]  netlink_rcv_skb+0x132/0x380
> [   70.848419]  netlink_unicast+0x4c0/0x690
> [   70.866019]  netlink_sendmsg+0x929/0xe10
> [   70.882134]  sock_sendmsg+0xc8/0x110
> [   70.886144]  ___sys_sendmsg+0x77a/0x8f0
> [   70.924064]  __sys_sendmsg+0xf7/0x250
> [   70.955727]  do_syscall_64+0x14d/0x610

Hi Ido,

Thanks for reporting this.

I looked at the code and problem seems to be matchall classifier
specific. My implementation of unlocked cls API assumes that concurrent
insertions are possible and checks for it when deleting "empty" tp.
Since classifiers don't expose number of elements, the only way to test
this is to do tp->walk() on them and assume that walk callback is called
once per filter on every classifier. In your example new tp is created
for second filter, filter insertion fails, number of elements on newly
created tp is checked with tp->walk() before deleting it. However,
matchall classifier always calls the tp->walk() callback once, even when
it doesn't have a valid filter (in this case with NULL filter pointer).

Possible ways to fix this:

1) Check for NULL filter pointer in tp->walk() callback and ignore it
when counting filters on tp - will work but I don't like it because I
don't think it is ever correct to call tp->walk() callback with NULL
filter pointer.

2) Fix matchall to not call tp->walk() callback with NULL filter
pointers - my preferred simple solution.

3) Extend tp API to have direct way to count filters by implementing
tp->count - requires change to every classifier. Or maybe add it as
optional API that only unlocked classifiers are required to implement
and just delete rtnl lock dependent tp's without checking for concurrent
insertions.

What do you think?

Regards,
Vlad

  reply index

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-11  8:55 [PATCH net-next v4 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 01/17] net: sched: protect block state with mutex Vlad Buslov
2019-02-11 14:15   ` Jiri Pirko
2019-02-11  8:55 ` [PATCH net-next v4 02/17] net: sched: protect chain->explicitly_created with block->lock Vlad Buslov
2019-02-11 14:15   ` Jiri Pirko
2019-02-11  8:55 ` [PATCH net-next v4 03/17] net: sched: refactor tc_ctl_chain() to use block->lock Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 04/17] net: sched: protect block->chain0 with block->lock Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 05/17] net: sched: traverse chains in block with tcf_get_next_chain() Vlad Buslov
2019-02-15 22:21   ` Cong Wang
2019-02-18 10:07     ` Vlad Buslov
2019-02-18 18:26       ` Cong Wang
2019-02-19 16:04         ` Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 06/17] net: sched: protect chain template accesses with block lock Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Vlad Buslov
2019-02-14 18:24   ` Ido Schimmel
2019-02-15 10:02     ` Vlad Buslov [this message]
2019-02-15 11:30       ` Ido Schimmel
2019-02-15 12:11         ` [PATCH] net: sched: matchall: verify that filter is not NULL in mall_walk() Vlad Buslov
2019-02-15 13:47           ` Ido Schimmel
2019-02-16  0:24           ` Cong Wang
2019-02-18 12:00             ` Vlad Buslov
2019-02-17 21:27           ` David Miller
2019-02-15 12:15         ` [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Vlad Buslov
2019-02-15 15:35         ` Vlad Buslov
2019-02-19  5:26           ` Cong Wang
2019-02-19 12:31             ` Vlad Buslov
2019-02-20 22:43               ` Cong Wang
2019-02-21 15:49                 ` Vlad Buslov
2019-02-19  5:08       ` Cong Wang
2019-02-19 15:20         ` Vlad Buslov
2019-02-20 23:00           ` Cong Wang
2019-02-21 17:11             ` Vlad Buslov
2019-02-15 22:35   ` Cong Wang
2019-02-18 11:06     ` Vlad Buslov
2019-02-18 18:31       ` Cong Wang
2019-02-11  8:55 ` [PATCH net-next v4 08/17] net: sched: introduce reference counting for tcf_proto Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 09/17] net: sched: traverse classifiers in chain with tcf_get_next_proto() Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 10/17] net: sched: refactor tp insert/delete for concurrent execution Vlad Buslov
2019-02-15 23:17   ` Cong Wang
2019-02-18 11:19     ` Vlad Buslov
2019-02-18 19:55       ` Cong Wang
2019-02-19 10:25         ` Vlad Buslov
2019-02-18 19:53   ` Cong Wang
2019-02-11  8:55 ` [PATCH net-next v4 11/17] net: sched: prevent insertion of new classifiers during chain flush Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 12/17] net: sched: track rtnl lock status when validating extensions Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 13/17] net: sched: extend proto ops with 'put' callback Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 14/17] net: sched: extend proto ops to support unlocked classifiers Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 15/17] net: sched: add flags to Qdisc class ops struct Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 16/17] net: sched: refactor tcf_block_find() into standalone functions Vlad Buslov
2019-02-11  8:55 ` [PATCH net-next v4 17/17] net: sched: unlock rules update API Vlad Buslov
2019-02-18 18:56   ` Cong Wang
2019-02-12 18:42 ` [PATCH net-next v4 00/17] Refactor classifier API to work with chain/classifiers without rtnl lock David Miller

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=vbfh8d5jroi.fsf@mellanox.com \
    --to=vladbu@mellanox.com \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=idosch@idosch.org \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org netdev@archiver.kernel.org
	public-inbox-index netdev


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox