All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/5] devlink: add an explicit locking API
@ 2021-10-30 23:12 Jakub Kicinski
  2021-10-30 23:12 ` [RFC 1/5] devlink: add unlocked APIs Jakub Kicinski
                   ` (6 more replies)
  0 siblings, 7 replies; 23+ messages in thread
From: Jakub Kicinski @ 2021-10-30 23:12 UTC (permalink / raw)
  To: leon, idosch; +Cc: edwin.peer, jiri, netdev, Jakub Kicinski

This implements what I think is the right direction for devlink
API overhaul. It's an early RFC/PoC because the body of code is
rather large so I'd appreciate feedback early... The patches are
very roughly split, the point of the posting is primarily to prove
that from the driver side the conversion is an net improvement
in complexity.

IMO the problems with devlink locking are caused by two things:

 (1) driver has no way to block devlink calls like it does in case
     of netedev / rtnl_lock, note that for devlink each driver has
     it's own lock so contention is not an issue;
     
 (2) sometimes devlink calls into the driver without holding its lock
     - for operations which may need the driver to call devlink (port
     splitting, switch config, reload etc.), the circular dependency
     is there, the driver can't solve this problem.

This set "fixes" the ordering by allowing the driver to participate
in locking. The lock order remains:

  device_lock -> [devlink_mutex] -> devlink instance -> rtnl_lock

but now driver can take devlink lock itself, and _ALL_ devlink ops
are locked.

The expectation is that driver will take the devlink instance lock
on its probe and remove paths, hence protecting all configuration
paths with the devlink instance lock.

This is clearly demonstrated by the netdevsim conversion. All entry
points to driver config are protected by devlink instance lock, so
the driver switches to the "I'm already holding the devlink lock" API
when calling devlink. All driver locks and trickery is removed.

The part which is slightly more challanging is quiescing async entry
points which need to be closed on the devlink reload path (workqueue,
debugfs etc.) and which also take devlink instance lock. For that we
need to be able to take refs on the devlink instance and let them
clean up after themselves rather than waiting synchronously.

That last part is not 100% finished in this patch set - it works but
we need the driver to take devlink_mutex (the global lock) from its
probe/remove path. I hope this is good enough for an RFC, the problem
is easily solved by protecting the devlink XArray with something else
than devlink_mutex and/or not holding devlink_mutex around each op
(so that there is no ordering between the global and instance locks).
Speaking of not holding devlink_mutex around each op this patch set
also opens the path for parallel ops to different devlink instances
which is currently impossible because all user space calls take
devlink_mutex...

The patches are on top of the cleanups I posted earlier.

Jakub Kicinski (5):
  devlink: add unlocked APIs
  devlink: add API for explicit locking
  devlink: allow locking of all ops
  netdevsim: minor code move
  netdevsim: use devlink locking

 drivers/net/netdevsim/bus.c       |  19 -
 drivers/net/netdevsim/dev.c       | 450 ++++++++-------
 drivers/net/netdevsim/fib.c       |  62 +--
 drivers/net/netdevsim/netdevsim.h |   5 -
 include/net/devlink.h             |  88 +++
 net/core/devlink.c                | 875 +++++++++++++++++++++---------
 6 files changed, 996 insertions(+), 503 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-11-03 19:19 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-30 23:12 [RFC 0/5] devlink: add an explicit locking API Jakub Kicinski
2021-10-30 23:12 ` [RFC 1/5] devlink: add unlocked APIs Jakub Kicinski
2021-10-30 23:12 ` [RFC 2/5] devlink: add API for explicit locking Jakub Kicinski
2021-10-30 23:12 ` [RFC 3/5] devlink: allow locking of all ops Jakub Kicinski
2021-10-30 23:12 ` [RFC 4/5] netdevsim: minor code move Jakub Kicinski
2021-10-30 23:12 ` [RFC 5/5] netdevsim: use devlink locking Jakub Kicinski
2021-10-31  7:23 ` [RFC 0/5] devlink: add an explicit locking API Leon Romanovsky
2021-11-01 14:32   ` Jakub Kicinski
2021-11-01 18:36     ` Leon Romanovsky
2021-11-01 21:16       ` Jakub Kicinski
2021-11-02  8:08         ` Leon Romanovsky
2021-11-02 15:14           ` Jakub Kicinski
2021-11-02 18:14             ` Leon Romanovsky
2021-11-03  0:05               ` Jakub Kicinski
2021-11-03  7:23                 ` Leon Romanovsky
2021-11-03 14:12                   ` Jakub Kicinski
2021-11-01 20:04   ` Edwin Peer
2021-11-02  7:44     ` Leon Romanovsky
2021-11-02 15:16       ` Jakub Kicinski
2021-11-02 17:50         ` Leon Romanovsky
2021-11-03  9:03 ` Jiri Pirko
2021-11-03 14:52   ` Jakub Kicinski
2021-11-03 19:19     ` Jiri Pirko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.