netdev.vger.kernel.org archive mirror
* [RFC net-next 0/2] prevent sync issues with hw offload of flower
@ 2019-10-02 23:14 John Hurley
  2019-10-02 23:14 ` [RFC net-next 1/2] net: sched: add tp_op for pre_destroy John Hurley
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: John Hurley @ 2019-10-02 23:14 UTC
  To: vladbu
  Cc: jiri, netdev, simon.horman, jakub.kicinski, oss-drivers, John Hurley

Hi,

Putting this out as an RFC built on net-next. It fixes some issues
discovered in testing when using the TC API of OvS to generate flower
rules and subsequently offloading them to HW. Rules seen contain the same
match fields or may be rule modifications run as a delete plus an add.
We're seeing race conditions whereby the rules present in kernel flower
are out of sync with those offloaded. Note that there are some issues
that will need to be fixed in the RFC before it becomes a patch, such as
potential races between releasing locks and re-taking them. However, I'm
putting this out for comments or potential alternative solutions.

The main cause of the races seems to be in the chain table of cls_api. If
a tcf_proto is destroyed then it is removed from its chain. If a new
filter is then added to the same chain with the same priority and protocol
a new tcf_proto will be created - this may happen before the first is
fully removed and the hw offload message sent to the driver. In cls_flower
this means that the fl_ht_insert_unique() function can pass as its
hashtable is associated with the tcf_proto. We are then in a position
where the 'delete' and the 'add' are in a race to get offloaded. We also
noticed that doing an offload add, then checking if a tcf_proto is
concurrently deleting, then removing the offload if it is, can extend the
out of order messages. Drivers do not expect to get duplicate rules.
However, in the kernel TC datapath they are not duplicates, so we can get out
of sync here.
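
To make the scoping problem above concrete, here is a minimal userspace
sketch. It is not kernel code: key_table, insert_unique() and
driver_sees_duplicate() are hypothetical names, and a small array stands in
for the rhashtable that flower keeps per tcf_proto. It only illustrates that
a uniqueness check scoped to one tcf_proto cannot see a rule that the driver
still holds on behalf of another.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_KEYS 4

/* Hypothetical stand-in for a table of match keys (not the kernel rhashtable). */
struct key_table {
    int keys[MAX_KEYS];
    int count;
};

/* Insert key only if it is not already present; returns false on a duplicate. */
static bool insert_unique(struct key_table *t, int key)
{
    for (int i = 0; i < t->count; i++)
        if (t->keys[i] == key)
            return false;
    assert(t->count < MAX_KEYS);
    t->keys[t->count++] = key;
    return true;
}

/*
 * old_tp is being destroyed but its hardware delete has not been sent yet;
 * new_tp was created on the same chain with the same priority and protocol.
 * Returns true if the driver is handed a rule it already holds.
 */
static bool driver_sees_duplicate(int key)
{
    struct key_table old_tp = {0}, new_tp = {0}, driver = {0};

    insert_unique(&old_tp, key);   /* original rule, software side */
    insert_unique(&driver, key);   /* original rule, offloaded */

    /* The software check passes because new_tp's own table is empty. */
    if (!insert_unique(&new_tp, key))
        return false;

    /* The driver's table is not per-proto, so the same key collides. */
    return !insert_unique(&driver, key);
}
```

Calling driver_sees_duplicate() with any key walks through the add that
software accepts and the driver rejects as a duplicate.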

The RFC fixes this by adding a pre_destroy hook to cls_api that is called
when a tcf_proto is signaled to be destroyed but before it is removed from
its chain (which is essentially the lock for allowing duplicates in
flower). Flower then uses this new hook to send the hw delete messages
from tcf_proto destroys, preventing them racing with duplicate adds. It
also moves the check for 'deleting' to before sending the hw add
message.
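
The ordering change can be sketched as a single-threaded replay of what the
driver observes. This is only an illustration under simplifying assumptions:
hw_state, hw_add(), hw_del() and replay() are hypothetical names, one counter
stands in for the driver's rule table, and the chain is reduced to a single
busy flag. It does not reproduce the locking in cls_api or the actual patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver's view of one rule key. */
struct hw_state {
    int installed;      /* copies of the rule the driver currently holds */
    bool saw_duplicate; /* set if an add arrived while a copy was installed */
};

static void hw_add(struct hw_state *hw)
{
    if (hw->installed > 0)
        hw->saw_duplicate = true;
    hw->installed++;
}

static void hw_del(struct hw_state *hw)
{
    assert(hw->installed > 0);
    hw->installed--;
}

/*
 * Replays "destroy the proto, then add the same rule on a new proto"
 * in the order the driver would observe it.
 */
static bool replay(bool use_pre_destroy)
{
    struct hw_state hw = {0};
    bool chain_slot_busy;

    hw_add(&hw);               /* original rule offloaded */
    chain_slot_busy = true;    /* old proto is linked on the chain */

    if (use_pre_destroy)
        hw_del(&hw);           /* hook runs while the old proto is still linked */

    chain_slot_busy = false;   /* old proto unlinked; a new one may be created */

    if (!chain_slot_busy)
        hw_add(&hw);           /* new proto offloads the same rule */

    if (!use_pre_destroy)
        hw_del(&hw);           /* late delete from the old proto's destroy path */

    return hw.saw_duplicate;
}
```

In this model replay(false) reports a duplicate add reaching the driver,
while replay(true) does not, because the delete is sent before the chain
slot is released.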

John Hurley (2):
  net: sched: add tp_op for pre_destroy
  net: sched: fix tp destroy race conditions in flower

 include/net/sch_generic.h |  3 +++
 net/sched/cls_api.c       | 29 ++++++++++++++++++++++++-
 net/sched/cls_flower.c    | 55 ++++++++++++++++++++++++++---------------------
 3 files changed, 61 insertions(+), 26 deletions(-)

-- 
2.7.4




Thread overview: 11+ messages
2019-10-02 23:14 [RFC net-next 0/2] prevent sync issues with hw offload of flower John Hurley
2019-10-02 23:14 ` [RFC net-next 1/2] net: sched: add tp_op for pre_destroy John Hurley
2019-10-02 23:14 ` [RFC net-next 2/2] net: sched: fix tp destroy race conditions in flower John Hurley
2019-10-03 16:18   ` Vlad Buslov
2019-10-03 16:39     ` John Hurley
2019-10-03 16:26 ` [RFC net-next 0/2] prevent sync issues with hw offload of flower Vlad Buslov
2019-10-03 16:59   ` John Hurley
2019-10-03 17:19     ` Vlad Buslov
2019-10-04 15:39       ` John Hurley
2019-10-04 15:58         ` Vlad Buslov
2019-10-04 16:06           ` John Hurley
