From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07230C43381 for ; Fri, 15 Feb 2019 15:35:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B3E2C2192D for ; Fri, 15 Feb 2019 15:35:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=Mellanox.com header.i=@Mellanox.com header.b="nkttSiQB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390123AbfBOPfT (ORCPT ); Fri, 15 Feb 2019 10:35:19 -0500 Received: from mail-eopbgr80077.outbound.protection.outlook.com ([40.107.8.77]:56736 "EHLO EUR04-VI1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S2389964AbfBOPfR (ORCPT ); Fri, 15 Feb 2019 10:35:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=hwlsNlXxeTmy2qR7q0HNoQMStF/LZdEYTXDypfdvLDA=; b=nkttSiQBG1mUY+vY+Uhx7RrxlrnUDYi5EpEmzotW9CZCT92yRHSNJChEfCqxjAOcyXgmjNDaMojYEAw4tC5DtIEqKNQi5r3NXM8NWwjzx6fmABpNS7FHQTWraxxpS4mqsVoyiWwNRHR/rMIMvx3u/ZaBzKVlAte0uh6Npc866Ks= Received: from HE1PR0502MB3641.eurprd05.prod.outlook.com (10.167.127.11) by HE1PR0502MB3674.eurprd05.prod.outlook.com (10.167.127.20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1622.16; Fri, 15 Feb 2019 15:35:10 +0000 Received: from HE1PR0502MB3641.eurprd05.prod.outlook.com ([fe80::4041:bb68:2af3:eab8]) by HE1PR0502MB3641.eurprd05.prod.outlook.com ([fe80::4041:bb68:2af3:eab8%5]) with mapi id 15.20.1622.018; Fri, 15 Feb 2019 15:35:10 +0000 From: Vlad Buslov To: Ido Schimmel CC: "netdev@vger.kernel.org" , "jhs@mojatatu.com" , "xiyou.wangcong@gmail.com" , "jiri@resnulli.us" , "davem@davemloft.net" , "ast@kernel.org" , "daniel@iogearbox.net" Subject: Re: [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Thread-Topic: [PATCH net-next v4 07/17] net: sched: protect filter_chain list with filter_chain_lock mutex Thread-Index: AQHUwefD/Cm+p7M7pEKHPk7WVBdPDaXfoUsAgAEF54CAABjBgIAAQ66A Date: Fri, 15 Feb 2019 15:35:10 +0000 Message-ID: References: <20190211085548.7190-1-vladbu@mellanox.com> <20190211085548.7190-8-vladbu@mellanox.com> <20190214182442.GA19269@splinter> <20190215113041.GA10511@splinter> In-Reply-To: <20190215113041.GA10511@splinter> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-clientproxiedby: LO2P265CA0365.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a3::17) To HE1PR0502MB3641.eurprd05.prod.outlook.com (2603:10a6:7:85::11) authentication-results: spf=none (sender IP is ) smtp.mailfrom=vladbu@mellanox.com; x-ms-exchange-messagesentrepresentingtype: 1 x-originating-ip: [37.142.13.130] x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 44e6b229-82a4-49b3-991c-08d6935b2b77 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: BCL:0;PCL:0;RULEID:(2390118)(7020095)(4652040)(8989299)(4534185)(4627221)(201703031133081)(201702281549075)(8990200)(5600110)(711020)(4605077)(4618075)(2017052603328)(7153060)(7193020);SRVR:HE1PR0502MB3674; x-ms-traffictypediagnostic: HE1PR0502MB3674: x-microsoft-exchange-diagnostics: =?iso-8859-1?Q?1;HE1PR0502MB3674;23:LND84VyUYPcr5MH86mJFjjsEDkia6DXJKgf7p?= =?iso-8859-1?Q?+xKHMHf6Dx8gZzFHUXGxUrF6F/3UYuKg+ooTwUBgqQpRdv8Dn15+ChkojI?= =?iso-8859-1?Q?Yuz5Wty5vKi/HCmQUQJZPxAIQ8VHUoFOBEiwKrlXJA5/Eb5f8BWOkqor4a?= =?iso-8859-1?Q?Sc07L+WIK5xMjJ8pLG9VWe0bLHlcdk5yNDTwWcN+KE4k0uLX1ZDFU87C6f?= =?iso-8859-1?Q?Oh4trcVaxwaHaa9i5xsuNTWFeqB+mD3B6Q6p0eGsyA7yaHiY1VfncgqN2f?= =?iso-8859-1?Q?68LOCfNcMU7gpEa6M5So+4W5VOXA8uhRpr2/QWVx5Yco1CnD8JeGZXOykQ?= =?iso-8859-1?Q?94VdF8A7yjvnBpqDSDWAZ0Uk/jDAxqq5lpZRx270OurerYumrZw6F9Zcb+?= =?iso-8859-1?Q?XYNdAbjF/6zBGsDaMvJfx9giWXJKjkf+3MWL/mEeCHkudYkYdxyCc3PggH?= =?iso-8859-1?Q?iylZOEH2Hz7g53xg8hLylHeHZ7a+ROlr5MyVL3ZQzq6CSlt9rDiCYcukuX?= =?iso-8859-1?Q?bUs1wroe2chI7I25sZSR47bfaNRMvZQ+r8hWV2ACw1HDwzO7gqt5kn/6Sj?= =?iso-8859-1?Q?J/pq0KAtiVco63Efp0MeviJddPwPKP9t1lgP90sOHsQAREcDndwqGQFsdi?= =?iso-8859-1?Q?V1wtzDPsGXpmO96/J45xRQUcLWBkuixdgcCxSQ/XuB3fodyAyE3AsbgynU?= =?iso-8859-1?Q?DQ/zDyvMxmrQm5QHJU5r8qZaksA8CLQvDShCLZ9h/yPczxzsBTDvD77hcq?= =?iso-8859-1?Q?8ReoyQyYyFGzzJbXSI813X9xe5W41cI5hP3H857ER7O0O8zbzWKy2OONIh?= =?iso-8859-1?Q?V9+ASSLt8ZGauSN0078b+hL33U2sKsHsbxKH8OUsuJhoqCSNdOv8Pj/bKQ?= =?iso-8859-1?Q?FslZBNq0CEv+ncf/Mku4b+jITJijA4a9orRw26UiDGdvwEZ8T0b2pXCjH9?= =?iso-8859-1?Q?8vD+RCUwsGIgj4KNy2lMc3M8DHZCneuOlGNmjZ6UvvDvGfVly1FvN5+NTq?= =?iso-8859-1?Q?/jm0TQhvC94Bua9DsK0kpmRSRzkD9l859IaLbju8r+AxDNBl2lV3Rsse37?= =?iso-8859-1?Q?m4zuNVSW5QojkzkS1ls+zYjSZRmsyPqkPNc2ugPd3tVmW4IBJW53oP7Hne?= =?iso-8859-1?Q?4e512XIxxy+YAYjh5hwtSv4TBSrLQuV9rm/7Mj2nQwfilDPePdUgsPZb9m?= =?iso-8859-1?Q?GSnJwlQ6aS4AP4+tGoYH3f5R/1JN6bC3HqLrl2L2ROKm0wGnkf8TxCzrc0?= =?iso-8859-1?Q?3X5GfYFFzDN6NKbul3oNrOSUSefHQhvjrUHlF25ow=3D=3D?= x-microsoft-antispam-prvs: x-forefront-prvs: 09497C15EB x-forefront-antispam-report: SFV:NSPM;SFS:(10009020)(366004)(376002)(346002)(396003)(136003)(39860400002)(199004)(189003)(229853002)(6436002)(68736007)(7736002)(305945005)(6916009)(14444005)(36756003)(71190400001)(71200400001)(93886005)(66066001)(2906002)(97736004)(6116002)(256004)(486006)(8676002)(3846002)(86362001)(52116002)(11346002)(316002)(186003)(26005)(54906003)(8936002)(102836004)(446003)(81156014)(476003)(6506007)(81166006)(2616005)(76176011)(386003)(6246003)(53936002)(4326008)(6486002)(105586002)(106356001)(478600001)(14454004)(99286004)(25786009)(6512007);DIR:OUT;SFP:1101;SCL:1;SRVR:HE1PR0502MB3674;H:HE1PR0502MB3641.eurprd05.prod.outlook.com;FPR:;SPF:None;LANG:en;PTR:InfoNoRecords;A:1;MX:1; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) x-ms-exchange-senderadcheck: 1 x-microsoft-antispam-message-info: lcxlW7B2kfh0lNnHVM3ChydzPSLT4UyrdEuCykOqv0pBApH3UyBEDoiIkuvVMhp6gn6WaIx1BJLbiiWYGmfwMnvD6ME0u8nG86B5+bOh2G3oqkAiXR8onXLxd0C6yNGrl8Ch6zj2AK37AGOVRFr46llsdTJZ7UsZNpHktJuK0Y0SkMyatQOIKkBZpxpOsUvWTqQPbOu4lgnaG27skzySj8ravUlp5q1pm2DIK7cOaVTx5gKTNWadpJtope9IIXfJhgBlW7KAvnY8AcrkHErFc8N5zO5J9L6PBNI+PByqM/7RR8niXRAzSiNGZ7WsqxnTopc0K6CpglODIvbbhko+qEIWM49/S86JK/D5+Ix8S0XbO9hA5EftEmDrM5C3HFcT53dem6+fF7NtP6Y3biDKFslDt9T1bbrgU4eQlpHZN34= Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-Network-Message-Id: 44e6b229-82a4-49b3-991c-08d6935b2b77 X-MS-Exchange-CrossTenant-originalarrivaltime: 15 Feb 2019 15:35:08.9391 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: HE1PR0502MB3674 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Fri 15 Feb 2019 at 11:30, Ido Schimmel wrote: > On Fri, Feb 15, 2019 at 10:02:11AM +0000, Vlad Buslov wrote: >> >> On Thu 14 Feb 2019 at 18:24, Ido Schimmel wrote: >> > On Mon, Feb 11, 2019 at 10:55:38AM +0200, Vlad Buslov wrote: >> >> Extend tcf_chain with new filter_chain_lock mutex. Always lock the ch= ain >> >> when accessing filter_chain list, instead of relying on rtnl lock. >> >> Dereference filter_chain with tcf_chain_dereference() lockdep macro t= o >> >> verify that all users of chain_list have the lock taken. >> >> >> >> Rearrange tp insert/remove code in tc_new_tfilter/tc_del_tfilter to e= xecute >> >> all necessary code while holding chain lock in order to prevent >> >> invalidation of chain_info structure by potential concurrent change. = This >> >> also serializes calls to tcf_chain0_head_change(), which allows head = change >> >> callbacks to rely on filter_chain_lock for synchronization instead of= rtnl >> >> mutex. >> >> >> >> Signed-off-by: Vlad Buslov >> >> Acked-by: Jiri Pirko >> > >> > With this sequence [1] I get the following trace [2]. Bisected it to >> > this patch. Note that second filter is rejected by the device driver >> > (that's the intention). I guess it is not properly removed from the >> > filter chain in the error path? >> > >> > Thanks >> > >> > [1] >> > # tc qdisc add dev swp3 clsact >> > # tc filter add dev swp3 ingress pref 1 matchall skip_sw \ >> > action mirred egress mirror dev swp7 >> > # tc filter add dev swp3 ingress pref 2 matchall skip_sw \ >> > action mirred egress mirror dev swp7 >> > RTNETLINK answers: File exists >> > We have an error talking to the kernel, -1 >> > # tc qdisc del dev swp3 clsact >> > >> > [2] >> > [ 70.545131] kasan: GPF could be caused by NULL-ptr deref or user me= mory access >> > [ 70.553394] general protection fault: 0000 [#1] PREEMPT SMP KASAN P= TI >> > [ 70.560618] CPU: 2 PID: 2268 Comm: tc Not tainted 5.0.0-rc5-custom-= 02103-g415d39427317 #1304 >> > [ 70.570065] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO= /SA001017, BIOS 5.6.5 06/07/2016 >> > [ 70.580204] RIP: 0010:mall_reoffload+0x14a/0x760 >> > [ 70.585382] Code: c1 0f 85 ba 05 00 00 31 c0 4d 8d 6c 24 34 b9 06 0= 0 00 00 4c 89 ff f3 48 ab 4c 89 ea 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 0= 3 <0f> b6 14 02 4c 89 e8 83 >> > e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 bd >> > [ 70.606382] RSP: 0018:ffff888231faefc0 EFLAGS: 00010207 >> > [ 70.612235] RAX: dffffc0000000000 RBX: 1ffff110463f5dfe RCX: 000000= 0000000000 >> > [ 70.620220] RDX: 0000000000000006 RSI: 1ffff110463f5e01 RDI: ffff88= 8231faf040 >> > [ 70.628206] RBP: ffff8881ef151a00 R08: 0000000000000000 R09: ffffed= 10463f5dfa >> > [ 70.636192] R10: ffffed10463f5dfa R11: 0000000000000003 R12: 000000= 0000000000 >> > [ 70.644177] R13: 0000000000000034 R14: 0000000000000000 R15: ffff88= 8231faf010 >> > [ 70.652163] FS: 00007f46b5bf0800(0000) GS:ffff888236c00000(0000) k= nlGS:0000000000000000 >> > [ 70.661218] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> > [ 70.667649] CR2: 0000000001d590a8 CR3: 0000000231c3c000 CR4: 000000= 00001006e0 >> > [ 70.675633] Call Trace: >> > [ 70.693046] tcf_block_playback_offloads+0x94/0x230 >> > [ 70.710617] __tcf_block_cb_unregister+0xf7/0x2d0 >> > [ 70.734143] mlxsw_sp_setup_tc+0x20f/0x660 >> > [ 70.738739] tcf_block_offload_unbind+0x22b/0x350 >> > [ 70.748898] __tcf_block_put+0x203/0x630 >> > [ 70.769700] tcf_block_put_ext+0x2f/0x40 >> > [ 70.774098] clsact_destroy+0x7a/0xb0 >> > [ 70.782604] qdisc_destroy+0x11a/0x5c0 >> > [ 70.786810] qdisc_put+0x5a/0x70 >> > [ 70.790435] notify_and_destroy+0x8e/0xa0 >> > [ 70.794933] qdisc_graft+0xbb7/0xef0 >> > [ 70.809009] tc_get_qdisc+0x518/0xa30 >> > [ 70.821530] rtnetlink_rcv_msg+0x397/0xa20 >> > [ 70.835510] netlink_rcv_skb+0x132/0x380 >> > [ 70.848419] netlink_unicast+0x4c0/0x690 >> > [ 70.866019] netlink_sendmsg+0x929/0xe10 >> > [ 70.882134] sock_sendmsg+0xc8/0x110 >> > [ 70.886144] ___sys_sendmsg+0x77a/0x8f0 >> > [ 70.924064] __sys_sendmsg+0xf7/0x250 >> > [ 70.955727] do_syscall_64+0x14d/0x610 >> >> Hi Ido, >> >> Thanks for reporting this. >> >> I looked at the code and problem seems to be matchall classifier >> specific. My implementation of unlocked cls API assumes that concurrent >> insertions are possible and checks for it when deleting "empty" tp. >> Since classifiers don't expose number of elements, the only way to test >> this is to do tp->walk() on them and assume that walk callback is called >> once per filter on every classifier. In your example new tp is created >> for second filter, filter insertion fails, number of elements on newly >> created tp is checked with tp->walk() before deleting it. However, >> matchall classifier always calls the tp->walk() callback once, even when >> it doesn't have a valid filter (in this case with NULL filter pointer). >> >> Possible ways to fix this: >> >> 1) Check for NULL filter pointer in tp->walk() callback and ignore it >> when counting filters on tp - will work but I don't like it because I >> don't think it is ever correct to call tp->walk() callback with NULL >> filter pointer. >> >> 2) Fix matchall to not call tp->walk() callback with NULL filter >> pointers - my preferred simple solution. >> >> 3) Extend tp API to have direct way to count filters by implementing >> tp->count - requires change to every classifier. Or maybe add it as >> optional API that only unlocked classifiers are required to implement >> and just delete rtnl lock dependent tp's without checking for concurrent >> insertions. >> >> What do you think? > > Since the problem is matchall-specific, then it makes sense to fix it in > matchall like you suggested in option #2. > > Can you please use this opportunity and audit other classifiers to > confirm problem is indeed specific to matchall? > Turns out cls_cgroup has the same problem. Another problem that I found in cls_fw and cls_route is that they set arg->stop when empty. Both of them have code unchanged since it was committed initially in 2005 so I assume this convention is no longer relevant because all other classifiers don't do that (they only set arg->stop when arg->fn returns negative value). I sent fixes for all of those cases.