All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch net 00/16] net_sched: fix races with RCU callbacks
@ 2017-10-27  1:24 Cong Wang
  2017-10-27  1:24 ` [Patch net 01/16] net_sched: introduce a workqueue for RCU callbacks of tc filter Cong Wang
                   ` (17 more replies)
  0 siblings, 18 replies; 36+ messages in thread
From: Cong Wang @ 2017-10-27  1:24 UTC (permalink / raw)
  To: netdev
  Cc: Chris Mi, Cong Wang, Daniel Borkmann, Jiri Pirko, John Fastabend,
	Jamal Hadi Salim, Paul E. McKenney

Recently, the RCU callbacks used in TC filters and TC actions keep
drawing my attention, they introduce at least 4 race condition bugs:

1. A simple one fixed by Daniel:

commit c78e1746d3ad7d548bdf3fe491898cc453911a49
Author: Daniel Borkmann <daniel@iogearbox.net>
Date:   Wed May 20 17:13:33 2015 +0200

    net: sched: fix call_rcu() race on classifier module unloads

2. A very nasty one fixed by me:

commit 1697c4bb5245649a23f06a144cc38c06715e1b65
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Mon Sep 11 16:33:32 2017 -0700

    net_sched: carefully handle tcf_block_put()

3. Two more bugs found by Chris:
https://patchwork.ozlabs.org/patch/826696/
https://patchwork.ozlabs.org/patch/826695/

Usually RCU callbacks are simple, however for TC filters and actions,
they are complex because at least TC actions could be destroyed
together with the TC filter in one callback. And RCU callbacks are
invoked in BH context, without locking they are parallel too. All of
these contribute to the cause of these nasty bugs.

Alternatively, we could also:

a) Introduce a spinlock to serialize these RCU callbacks. But as I
said in commit 1697c4bb5245 ("net_sched: carefully handle
tcf_block_put()"), it is very hard to do because of tcf_chain_dump().
Potentially we need to do a lot of work to make it possible (if not
impossible).

b) Just get rid of these RCU callbacks, because they are not
necessary at all, callers of these call_rcu() are all on slow paths
and holding RTNL lock, so blocking is allowed in their contexts.
However, David and Eric dislike adding synchronize_rcu() here.

As suggested by Paul, we could defer the work to a workqueue and
gain the permission of holding RTNL again without any performance
impact, however, in tcf_block_put() we could have a deadlock when
flushing workqueue while hodling RTNL lock, the trick here is to
defer the work itself in workqueue and make it queued after all
other works so that we keep the same ordering to avoid any
use-after-free. Please see the first patch for details.

Patch 1 introduces the infrastructure, patch 2~12 move each
tc filter to the new tc filter workqueue, patch 13 adds
an assertion to catch potential bugs like this, patch 14
closes another rcu callback race, patch 15 and patch 16 add
new test cases.

Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Chris Mi (2):
  selftests: Introduce a new script to generate tc batch file
  selftests: Introduce a new test case to tc testsuite

Cong Wang (14):
  net_sched: introduce a workqueue for RCU callbacks of tc filter
  net_sched: use tcf_queue_work() in basic filter
  net_sched: use tcf_queue_work() in bpf filter
  net_sched: use tcf_queue_work() in cgroup filter
  net_sched: use tcf_queue_work() in flow filter
  net_sched: use tcf_queue_work() in flower filter
  net_sched: use tcf_queue_work() in fw filter
  net_sched: use tcf_queue_work() in matchall filter
  net_sched: use tcf_queue_work() in u32 filter
  net_sched: use tcf_queue_work() in route filter
  net_sched: use tcf_queue_work() in rsvp filter
  net_sched: use tcf_queue_work() in tcindex filter
  net_sched: add rtnl assertion to tcf_exts_destroy()
  net_sched: fix call_rcu() race on act_sample module removal

 include/net/pkt_cls.h                              |  3 +
 include/net/sch_generic.h                          |  2 +
 net/sched/act_sample.c                             |  1 +
 net/sched/cls_api.c                                | 69 ++++++++++++++++------
 net/sched/cls_basic.c                              | 20 ++++++-
 net/sched/cls_bpf.c                                | 19 +++++-
 net/sched/cls_cgroup.c                             | 22 +++++--
 net/sched/cls_flow.c                               | 19 +++++-
 net/sched/cls_flower.c                             | 19 +++++-
 net/sched/cls_fw.c                                 | 19 +++++-
 net/sched/cls_matchall.c                           | 19 +++++-
 net/sched/cls_route.c                              | 19 +++++-
 net/sched/cls_rsvp.h                               | 19 +++++-
 net/sched/cls_tcindex.c                            | 38 ++++++++++--
 net/sched/cls_u32.c                                | 29 ++++++++-
 .../tc-testing/tc-tests/filters/tests.json         | 23 +++++++-
 tools/testing/selftests/tc-testing/tdc.py          | 20 +++++--
 tools/testing/selftests/tc-testing/tdc_batch.py    | 62 +++++++++++++++++++
 tools/testing/selftests/tc-testing/tdc_config.py   |  2 +
 19 files changed, 367 insertions(+), 57 deletions(-)
 create mode 100755 tools/testing/selftests/tc-testing/tdc_batch.py

-- 
2.13.0

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2017-11-01 16:59 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-27  1:24 [Patch net 00/16] net_sched: fix races with RCU callbacks Cong Wang
2017-10-27  1:24 ` [Patch net 01/16] net_sched: introduce a workqueue for RCU callbacks of tc filter Cong Wang
2017-10-27  4:05   ` Eric Dumazet
2017-10-27  4:28     ` Cong Wang
2017-10-27  4:39       ` Eric Dumazet
2017-10-27 11:55         ` Paul E. McKenney
2017-10-27 15:43           ` Cong Wang
2017-10-27 15:37         ` Cong Wang
2017-10-27 15:52           ` Eric Dumazet
2017-10-27  1:24 ` [Patch net 02/16] net_sched: use tcf_queue_work() in basic filter Cong Wang
2017-10-27  1:24 ` [Patch net 03/16] net_sched: use tcf_queue_work() in bpf filter Cong Wang
2017-10-27  1:24 ` [Patch net 04/16] net_sched: use tcf_queue_work() in cgroup filter Cong Wang
2017-10-27  1:24 ` [Patch net 05/16] net_sched: use tcf_queue_work() in flow filter Cong Wang
2017-10-27  1:24 ` [Patch net 06/16] net_sched: use tcf_queue_work() in flower filter Cong Wang
2017-10-27  1:24 ` [Patch net 07/16] net_sched: use tcf_queue_work() in fw filter Cong Wang
2017-10-27  1:24 ` [Patch net 08/16] net_sched: use tcf_queue_work() in matchall filter Cong Wang
2017-10-27  1:24 ` [Patch net 09/16] net_sched: use tcf_queue_work() in u32 filter Cong Wang
2017-10-27  1:24 ` [Patch net 10/16] net_sched: use tcf_queue_work() in route filter Cong Wang
2017-10-27  1:24 ` [Patch net 11/16] net_sched: use tcf_queue_work() in rsvp filter Cong Wang
2017-10-27  1:24 ` [Patch net 12/16] net_sched: use tcf_queue_work() in tcindex filter Cong Wang
2017-10-27  1:24 ` [Patch net 13/16] net_sched: add rtnl assertion to tcf_exts_destroy() Cong Wang
2017-10-27  1:24 ` [Patch net 14/16] net_sched: fix call_rcu() race on act_sample module removal Cong Wang
2017-10-27  1:24 ` [Patch net 15/16] selftests: Introduce a new script to generate tc batch file Cong Wang
2017-10-27  1:24 ` [Patch net 16/16] selftests: Introduce a new test case to tc testsuite Cong Wang
2017-10-29 14:41 ` [Patch net 00/16] net_sched: fix races with RCU callbacks David Miller
2017-10-30 22:39 ` Lucas Bates
2017-10-30 23:12   ` Cong Wang
2017-10-31  1:46     ` Lucas Bates
     [not found]     ` <CAMDBHYJTMv4MDQENNKjOXaUBHyXicOnkM_j+i7__3d1CAgdVRg@mail.gmail.com>
2017-10-31  5:44       ` Cong Wang
2017-10-31 11:00         ` Jamal Hadi Salim
2017-10-31 18:55           ` Lucas Bates
2017-10-31 19:13             ` Lucas Bates
2017-10-31 22:09               ` Cong Wang
2017-10-31 23:02                 ` Cong Wang
2017-11-01 16:55                   ` Lucas Bates
2017-11-01 16:59                     ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.