Linux-RDMA Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH rdma-next v1 0/4] Let IB core distribute cache update events
@ 2019-12-12 11:30 Leon Romanovsky
  2019-12-12 11:30 ` [PATCH rdma-next v1 1/4] IB/mlx5: Do reverse sequence during device removal Leon Romanovsky
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-12-12 11:30 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Parav Pandit

From: Leon Romanovsky <leonro@mellanox.com>

Changelog:
---
v0->v1:
 - Address Jason's comment to split qp event handler lock with IB device event
 - Added patch that add qp_ prefix to reflect QP event operation

Note: Patch #1 can go to the -rc too, and is sent here because "fix" is
needed only if we are using those cache patches.
-------------------------------------------------------------------------
From Parav,

Currently when low level driver notifies Pkey, GID, port change
events, they are notified to the registered handlers in the order
they are registered.

IB core and other ULPs such as IPoIB are interested in GID, LID,
Pkey change events. Since all GID query done by ULPs is serviced by
IB core, IB core is yet to update the GID cache when IPoIB queries
the GID, resulting into not updating IPoIB address.

Hence, all events which require cache update are handled first by
the IB core. Once cache update work is completed, IB core distributes
the event to subscriber clients.

This is tested with opensm's /etc/rdma/opensm.conf subnet_prefix
configuration update where before the update

$ ip link show dev ib0

ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
    link/infiniband
80:00:01:07:fe:80:00:00:00:00:00:00:24:8a:07:03:00:b3:d1:12 brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff

And after the subnet prefix update:

ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
    link/infiniband
80:00:01:07:fe:80:00:00:00:00:00:02:24:8a:07:03:00:b3:d1:12 brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff

Thanks

Parav Pandit (4):
  IB/mlx5: Do reverse sequence during device removal
  IB/core: Let IB core distribute cache update events
  IB/core: Cut down single member ib_cache structure
  IB/core: Prefix qp to event_handler_lock

 drivers/infiniband/core/cache.c     | 151 +++++++++++++++++-----------
 drivers/infiniband/core/core_priv.h |   1 +
 drivers/infiniband/core/device.c    |  35 ++-----
 drivers/infiniband/core/verbs.c     |  12 +--
 drivers/infiniband/hw/mlx5/main.c   |   2 +
 include/rdma/ib_verbs.h             |  16 +--
 6 files changed, 118 insertions(+), 99 deletions(-)

--
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH rdma-next v1 1/4] IB/mlx5: Do reverse sequence during device removal
  2019-12-12 11:30 [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Leon Romanovsky
@ 2019-12-12 11:30 ` Leon Romanovsky
  2019-12-12 11:30 ` [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events Leon Romanovsky
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-12-12 11:30 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

When IB device profile initialization completes, device is marked as
active.
However, IB device is not marked inactive, during device removal flow.
It should be the mirror of the add flow.

Hence, mark it inactive during remove sequence.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 52bc86ab9490..a9090065a997 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -6921,6 +6921,8 @@ void __mlx5_ib_remove(struct mlx5_ib_dev *dev,
 		      const struct mlx5_ib_profile *profile,
 		      int stage)
 {
+	dev->ib_active = false;
+
 	/* Number of stages to cleanup */
 	while (stage) {
 		stage--;
--
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events
  2019-12-12 11:30 [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Leon Romanovsky
  2019-12-12 11:30 ` [PATCH rdma-next v1 1/4] IB/mlx5: Do reverse sequence during device removal Leon Romanovsky
@ 2019-12-12 11:30 ` Leon Romanovsky
  2020-01-07 21:02   ` Jason Gunthorpe
  2019-12-12 11:30 ` [PATCH rdma-next v1 3/4] IB/core: Cut down single member ib_cache structure Leon Romanovsky
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2019-12-12 11:30 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

Currently when low level driver notifies Pkey, GID, port change events,
they are notified to the registered handlers in the order they are
registered.
IB core and other ULPs such as IPoIB are interested in GID, LID, Pkey
change events.
Since all GID query done by ULPs is serviced by IB core, in below flow
when GID change event occurs, IB core is yet to update the GID cache
when IPoIB queries the GID, resulting into not updating IPoIB address.

mlx5_ib_handle_event()
  ib_dispatch_event()
    ib_cache_event()
       queue_work() -> slow cache update

    [..]
    ipoib_event()
     queue_work()
       [..]
       work handler
         ipoib_ib_dev_flush_light()
           __ipoib_ib_dev_flush()
              ipoib_dev_addr_changed_valid()
                rdma_query_gid() <- Returns old GID, cache not updated.

Hence, all events which require cache update are handled first by the IB
core. Once cache update work is completed, IB core distributes the event
to subscriber clients.

Fixes: f35faa4ba956 ("IB/core: Simplify ib_query_gid to always refer to cache")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cache.c     | 121 +++++++++++++++++-----------
 drivers/infiniband/core/core_priv.h |   1 +
 drivers/infiniband/core/device.c    |  33 +++-----
 include/rdma/ib_verbs.h             |   9 ++-
 4 files changed, 92 insertions(+), 72 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index d535995711c3..e55f345799e4 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -51,9 +51,8 @@ struct ib_pkey_cache {

 struct ib_update_work {
 	struct work_struct work;
-	struct ib_device  *device;
-	u8                 port_num;
-	bool		   enforce_security;
+	struct ib_event event;
+	bool enforce_security;
 };

 union ib_gid zgid;
@@ -130,7 +129,7 @@ static void dispatch_gid_change_event(struct ib_device *ib_dev, u8 port)
 	event.element.port_num	= port;
 	event.event		= IB_EVENT_GID_CHANGE;

-	ib_dispatch_event(&event);
+	ib_dispatch_event_clients(&event);
 }

 static const char * const gid_type_str[] = {
@@ -1381,9 +1380,8 @@ static int config_non_roce_gid_cache(struct ib_device *device,
 	return ret;
 }

-static void ib_cache_update(struct ib_device *device,
-			    u8                port,
-			    bool	      enforce_security)
+static int
+ib_cache_update(struct ib_device *device, u8 port, bool enforce_security)
 {
 	struct ib_port_attr       *tprops = NULL;
 	struct ib_pkey_cache      *pkey_cache = NULL, *old_pkey_cache;
@@ -1391,11 +1389,11 @@ static void ib_cache_update(struct ib_device *device,
 	int                        ret;

 	if (!rdma_is_port_valid(device, port))
-		return;
+		return -EINVAL;

 	tprops = kmalloc(sizeof *tprops, GFP_KERNEL);
 	if (!tprops)
-		return;
+		return -ENOMEM;

 	ret = ib_query_port(device, port, tprops);
 	if (ret) {
@@ -1413,8 +1411,10 @@ static void ib_cache_update(struct ib_device *device,
 	pkey_cache = kmalloc(struct_size(pkey_cache, table,
 					 tprops->pkey_tbl_len),
 			     GFP_KERNEL);
-	if (!pkey_cache)
+	if (!pkey_cache) {
+		ret = -ENOMEM;
 		goto err;
+	}

 	pkey_cache->table_len = tprops->pkey_tbl_len;

@@ -1446,50 +1446,84 @@ static void ib_cache_update(struct ib_device *device,

 	kfree(old_pkey_cache);
 	kfree(tprops);
-	return;
+	return 0;

 err:
 	kfree(pkey_cache);
 	kfree(tprops);
+	return ret;
+}
+
+static void ib_cache_event_task(struct work_struct *_work)
+{
+	struct ib_update_work *work =
+		container_of(_work, struct ib_update_work, work);
+	int ret;
+
+	/* Before distributing the cache update event, first sync
+	 * the cache.
+	 */
+	ret = ib_cache_update(work->event.device, work->event.element.port_num,
+			      work->enforce_security);
+
+	/* GID event is notified already for individual GID entries by
+	 * dispatch_gid_change_event(). Hence, notifiy for rest of the
+	 * events.
+	 */
+	if (!ret && work->event.event != IB_EVENT_GID_CHANGE)
+		ib_dispatch_event_clients(&work->event);
+
+	kfree(work);
 }

-static void ib_cache_task(struct work_struct *_work)
+static void ib_generic_event_task(struct work_struct *_work)
 {
 	struct ib_update_work *work =
 		container_of(_work, struct ib_update_work, work);

-	ib_cache_update(work->device,
-			work->port_num,
-			work->enforce_security);
+	ib_dispatch_event_clients(&work->event);
 	kfree(work);
 }

-static void ib_cache_event(struct ib_event_handler *handler,
-			   struct ib_event *event)
+static bool is_cache_update_event(const struct ib_event *event)
+{
+	return (event->event == IB_EVENT_PORT_ERR    ||
+		event->event == IB_EVENT_PORT_ACTIVE ||
+		event->event == IB_EVENT_LID_CHANGE  ||
+		event->event == IB_EVENT_PKEY_CHANGE ||
+		event->event == IB_EVENT_CLIENT_REREGISTER ||
+		event->event == IB_EVENT_GID_CHANGE);
+}
+
+/**
+ * ib_dispatch_event - Dispatch an asynchronous event
+ * @event:Event to dispatch
+ *
+ * Low-level drivers must call ib_dispatch_event() to dispatch the
+ * event to all registered event handlers when an asynchronous event
+ * occurs.
+ */
+void ib_dispatch_event(const struct ib_event *event)
 {
 	struct ib_update_work *work;

-	if (event->event == IB_EVENT_PORT_ERR    ||
-	    event->event == IB_EVENT_PORT_ACTIVE ||
-	    event->event == IB_EVENT_LID_CHANGE  ||
-	    event->event == IB_EVENT_PKEY_CHANGE ||
-	    event->event == IB_EVENT_CLIENT_REREGISTER ||
-	    event->event == IB_EVENT_GID_CHANGE) {
-		work = kmalloc(sizeof *work, GFP_ATOMIC);
-		if (work) {
-			INIT_WORK(&work->work, ib_cache_task);
-			work->device   = event->device;
-			work->port_num = event->element.port_num;
-			if (event->event == IB_EVENT_PKEY_CHANGE ||
-			    event->event == IB_EVENT_GID_CHANGE)
-				work->enforce_security = true;
-			else
-				work->enforce_security = false;
-
-			queue_work(ib_wq, &work->work);
-		}
-	}
+	work = kzalloc(sizeof(*work), GFP_ATOMIC);
+	if (!work)
+		return;
+
+	if (is_cache_update_event(event))
+		INIT_WORK(&work->work, ib_cache_event_task);
+	else
+		INIT_WORK(&work->work, ib_generic_event_task);
+
+	work->event = *event;
+	if (event->event == IB_EVENT_PKEY_CHANGE ||
+	    event->event == IB_EVENT_GID_CHANGE)
+		work->enforce_security = true;
+
+	queue_work(ib_wq, &work->work);
 }
+EXPORT_SYMBOL(ib_dispatch_event);

 int ib_cache_setup_one(struct ib_device *device)
 {
@@ -1505,9 +1539,6 @@ int ib_cache_setup_one(struct ib_device *device)
 	rdma_for_each_port (device, p)
 		ib_cache_update(device, p, true);

-	INIT_IB_EVENT_HANDLER(&device->cache.event_handler,
-			      device, ib_cache_event);
-	ib_register_event_handler(&device->cache.event_handler);
 	return 0;
 }

@@ -1529,14 +1560,12 @@ void ib_cache_release_one(struct ib_device *device)

 void ib_cache_cleanup_one(struct ib_device *device)
 {
-	/* The cleanup function unregisters the event handler,
-	 * waits for all in-progress workqueue elements and cleans
-	 * up the GID cache. This function should be called after
-	 * the device was removed from the devices list and all
-	 * clients were removed, so the cache exists but is
+	/* The cleanup function waits for all in-progress workqueue
+	 * elements and cleans up the GID cache. This function should be
+	 * called after the device was removed from the devices list and
+	 * all clients were removed, so the cache exists but is
 	 * non-functional and shouldn't be updated anymore.
 	 */
-	ib_unregister_event_handler(&device->cache.event_handler);
 	flush_workqueue(ib_wq);
 	gid_table_cleanup_one(device);

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 3645e092e1c7..d657d90e618b 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -149,6 +149,7 @@ unsigned long roce_gid_type_mask_support(struct ib_device *ib_dev, u8 port);
 int ib_cache_setup_one(struct ib_device *device);
 void ib_cache_cleanup_one(struct ib_device *device);
 void ib_cache_release_one(struct ib_device *device);
+void ib_dispatch_event_clients(struct ib_event *event);

 #ifdef CONFIG_CGROUP_RDMA
 void ib_device_register_rdmacg(struct ib_device *device);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 84dd74fe13b8..c38b2b0b078a 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -588,6 +588,7 @@ struct ib_device *_ib_alloc_device(size_t size)

 	INIT_LIST_HEAD(&device->event_handler_list);
 	spin_lock_init(&device->event_handler_lock);
+	init_rwsem(&device->event_handler_rwsem);
 	mutex_init(&device->unregistration_lock);
 	/*
 	 * client_data needs to be alloc because we don't want our mark to be
@@ -1931,17 +1932,15 @@ EXPORT_SYMBOL(ib_set_client_data);
  *
  * ib_register_event_handler() registers an event handler that will be
  * called back when asynchronous IB events occur (as defined in
- * chapter 11 of the InfiniBand Architecture Specification).  This
- * callback may occur in interrupt context.
+ * chapter 11 of the InfiniBand Architecture Specification). This
+ * callback occurs in workqueue context.
  */
 void ib_register_event_handler(struct ib_event_handler *event_handler)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&event_handler->device->event_handler_lock, flags);
+	down_write(&event_handler->device->event_handler_rwsem);
 	list_add_tail(&event_handler->list,
 		      &event_handler->device->event_handler_list);
-	spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags);
+	up_write(&event_handler->device->event_handler_rwsem);
 }
 EXPORT_SYMBOL(ib_register_event_handler);

@@ -1954,35 +1953,23 @@ EXPORT_SYMBOL(ib_register_event_handler);
  */
 void ib_unregister_event_handler(struct ib_event_handler *event_handler)
 {
-	unsigned long flags;
-
-	spin_lock_irqsave(&event_handler->device->event_handler_lock, flags);
+	down_write(&event_handler->device->event_handler_rwsem);
 	list_del(&event_handler->list);
-	spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags);
+	up_write(&event_handler->device->event_handler_rwsem);
 }
 EXPORT_SYMBOL(ib_unregister_event_handler);

-/**
- * ib_dispatch_event - Dispatch an asynchronous event
- * @event:Event to dispatch
- *
- * Low-level drivers must call ib_dispatch_event() to dispatch the
- * event to all registered event handlers when an asynchronous event
- * occurs.
- */
-void ib_dispatch_event(struct ib_event *event)
+void ib_dispatch_event_clients(struct ib_event *event)
 {
-	unsigned long flags;
 	struct ib_event_handler *handler;

-	spin_lock_irqsave(&event->device->event_handler_lock, flags);
+	down_read(&event->device->event_handler_rwsem);

 	list_for_each_entry(handler, &event->device->event_handler_list, list)
 		handler->handler(handler, event);

-	spin_unlock_irqrestore(&event->device->event_handler_lock, flags);
+	up_read(&event->device->event_handler_rwsem);
 }
-EXPORT_SYMBOL(ib_dispatch_event);

 static int iw_query_port(struct ib_device *device,
 			   u8 port_num,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 5608e14e3aad..cb02d36d41d2 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2149,7 +2149,6 @@ struct ib_port_cache {

 struct ib_cache {
 	rwlock_t                lock;
-	struct ib_event_handler event_handler;
 };

 struct ib_port_immutable {
@@ -2627,7 +2626,11 @@ struct ib_device {
 	struct rcu_head rcu_head;

 	struct list_head              event_handler_list;
-	spinlock_t                    event_handler_lock;
+	/* Protects event_handler_list */
+	struct rw_semaphore event_handler_rwsem;
+
+	/* Protects QP's event_handler calls and open_qp list */
+	spinlock_t event_handler_lock;

 	struct rw_semaphore	      client_data_rwsem;
 	struct xarray                 client_data;
@@ -2942,7 +2945,7 @@ bool ib_modify_qp_is_ok(enum ib_qp_state cur_state, enum ib_qp_state next_state,

 void ib_register_event_handler(struct ib_event_handler *event_handler);
 void ib_unregister_event_handler(struct ib_event_handler *event_handler);
-void ib_dispatch_event(struct ib_event *event);
+void ib_dispatch_event(const struct ib_event *event);

 int ib_query_port(struct ib_device *device,
 		  u8 port_num, struct ib_port_attr *port_attr);
--
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH rdma-next v1 3/4] IB/core: Cut down single member ib_cache structure
  2019-12-12 11:30 [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Leon Romanovsky
  2019-12-12 11:30 ` [PATCH rdma-next v1 1/4] IB/mlx5: Do reverse sequence during device removal Leon Romanovsky
  2019-12-12 11:30 ` [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events Leon Romanovsky
@ 2019-12-12 11:30 ` Leon Romanovsky
  2019-12-12 11:30 ` [PATCH rdma-next v1 4/4] IB/core: Prefix qp to event_handler_lock Leon Romanovsky
  2020-01-08  0:28 ` [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Jason Gunthorpe
  4 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-12-12 11:30 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

Given that ib_cache structure has only single member now, merge the
cache lock directly in the ib_device.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cache.c | 30 +++++++++++++++---------------
 include/rdma/ib_verbs.h         |  7 ++-----
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index e55f345799e4..17bfedd24cc3 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1033,7 +1033,7 @@ int ib_get_cached_pkey(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);

 	cache = device->port_data[port_num].cache.pkey;

@@ -1042,7 +1042,7 @@ int ib_get_cached_pkey(struct ib_device *device,
 	else
 		*pkey = cache->table[index];

-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1057,9 +1057,9 @@ int ib_get_cached_subnet_prefix(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);
 	*sn_pfx = device->port_data[port_num].cache.subnet_prefix;
-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return 0;
 }
@@ -1079,7 +1079,7 @@ int ib_find_cached_pkey(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);

 	cache = device->port_data[port_num].cache.pkey;

@@ -1100,7 +1100,7 @@ int ib_find_cached_pkey(struct ib_device *device,
 		ret = 0;
 	}

-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1119,7 +1119,7 @@ int ib_find_exact_cached_pkey(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);

 	cache = device->port_data[port_num].cache.pkey;

@@ -1132,7 +1132,7 @@ int ib_find_exact_cached_pkey(struct ib_device *device,
 			break;
 		}

-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1148,9 +1148,9 @@ int ib_get_cached_lmc(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);
 	*lmc = device->port_data[port_num].cache.lmc;
-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1166,9 +1166,9 @@ int ib_get_cached_port_state(struct ib_device   *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);
 	*port_state = device->port_data[port_num].cache.port_state;
-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1428,7 +1428,7 @@ ib_cache_update(struct ib_device *device, u8 port, bool enforce_security)
 		}
 	}

-	write_lock_irq(&device->cache.lock);
+	write_lock_irq(&device->cache_lock);

 	old_pkey_cache = device->port_data[port].cache.pkey;

@@ -1437,7 +1437,7 @@ ib_cache_update(struct ib_device *device, u8 port, bool enforce_security)
 	device->port_data[port].cache.port_state = tprops->state;

 	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
-	write_unlock_irq(&device->cache.lock);
+	write_unlock_irq(&device->cache_lock);

 	if (enforce_security)
 		ib_security_cache_change(device,
@@ -1530,7 +1530,7 @@ int ib_cache_setup_one(struct ib_device *device)
 	unsigned int p;
 	int err;

-	rwlock_init(&device->cache.lock);
+	rwlock_init(&device->cache_lock);

 	err = gid_table_setup_one(device);
 	if (err)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cb02d36d41d2..eaf9c948ff9b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2147,10 +2147,6 @@ struct ib_port_cache {
 	enum ib_port_state     port_state;
 };

-struct ib_cache {
-	rwlock_t                lock;
-};
-
 struct ib_port_immutable {
 	int                           pkey_tbl_len;
 	int                           gid_tbl_len;
@@ -2636,7 +2632,8 @@ struct ib_device {
 	struct xarray                 client_data;
 	struct mutex                  unregistration_lock;

-	struct ib_cache               cache;
+	/* Synchronize GID, Pkey cache entries, subnet prefix, LMC */
+	rwlock_t cache_lock;
 	/**
 	 * port_data is indexed by port number
 	 */
--
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH rdma-next v1 4/4] IB/core: Prefix qp to event_handler_lock
  2019-12-12 11:30 [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Leon Romanovsky
                   ` (2 preceding siblings ...)
  2019-12-12 11:30 ` [PATCH rdma-next v1 3/4] IB/core: Cut down single member ib_cache structure Leon Romanovsky
@ 2019-12-12 11:30 ` Leon Romanovsky
  2020-01-08  0:28 ` [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Jason Gunthorpe
  4 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-12-12 11:30 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

Since event_handler_lock is accessed while executing QP's event handler,
prefix it with qp_ to avoid confusion with device's event_handler_list
and event_handler_rwsem.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/device.c |  2 +-
 drivers/infiniband/core/verbs.c  | 12 ++++++------
 include/rdma/ib_verbs.h          |  2 +-
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index c38b2b0b078a..90794238bae8 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -587,7 +587,7 @@ struct ib_device *_ib_alloc_device(size_t size)
 	rdma_init_coredev(&device->coredev, device, &init_net);

 	INIT_LIST_HEAD(&device->event_handler_list);
-	spin_lock_init(&device->event_handler_lock);
+	spin_lock_init(&device->qp_event_handler_lock);
 	init_rwsem(&device->event_handler_rwsem);
 	mutex_init(&device->unregistration_lock);
 	/*
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index dd765e176cdd..4606a5221ba5 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1053,11 +1053,11 @@ static void __ib_shared_qp_event_handler(struct ib_event *event, void *context)
 	struct ib_qp *qp = context;
 	unsigned long flags;

-	spin_lock_irqsave(&qp->device->event_handler_lock, flags);
+	spin_lock_irqsave(&qp->device->qp_event_handler_lock, flags);
 	list_for_each_entry(event->element.qp, &qp->open_list, open_list)
 		if (event->element.qp->event_handler)
 			event->element.qp->event_handler(event, event->element.qp->qp_context);
-	spin_unlock_irqrestore(&qp->device->event_handler_lock, flags);
+	spin_unlock_irqrestore(&qp->device->qp_event_handler_lock, flags);
 }

 static void __ib_insert_xrcd_qp(struct ib_xrcd *xrcd, struct ib_qp *qp)
@@ -1094,9 +1094,9 @@ static struct ib_qp *__ib_open_qp(struct ib_qp *real_qp,
 	qp->qp_num = real_qp->qp_num;
 	qp->qp_type = real_qp->qp_type;

-	spin_lock_irqsave(&real_qp->device->event_handler_lock, flags);
+	spin_lock_irqsave(&real_qp->device->qp_event_handler_lock, flags);
 	list_add(&qp->open_list, &real_qp->open_list);
-	spin_unlock_irqrestore(&real_qp->device->event_handler_lock, flags);
+	spin_unlock_irqrestore(&real_qp->device->qp_event_handler_lock, flags);

 	return qp;
 }
@@ -1824,9 +1824,9 @@ int ib_close_qp(struct ib_qp *qp)
 	if (real_qp == qp)
 		return -EINVAL;

-	spin_lock_irqsave(&real_qp->device->event_handler_lock, flags);
+	spin_lock_irqsave(&real_qp->device->qp_event_handler_lock, flags);
 	list_del(&qp->open_list);
-	spin_unlock_irqrestore(&real_qp->device->event_handler_lock, flags);
+	spin_unlock_irqrestore(&real_qp->device->qp_event_handler_lock, flags);

 	atomic_dec(&real_qp->usecnt);
 	if (qp->qp_sec)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index eaf9c948ff9b..e16a592e4536 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2626,7 +2626,7 @@ struct ib_device {
 	struct rw_semaphore event_handler_rwsem;

 	/* Protects QP's event_handler calls and open_qp list */
-	spinlock_t event_handler_lock;
+	spinlock_t qp_event_handler_lock;

 	struct rw_semaphore	      client_data_rwsem;
 	struct xarray                 client_data;
--
2.20.1


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events
  2019-12-12 11:30 ` [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events Leon Romanovsky
@ 2020-01-07 21:02   ` Jason Gunthorpe
  2020-01-08 11:35     ` Parav Pandit
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2020-01-07 21:02 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Parav Pandit

On Thu, Dec 12, 2019 at 01:30:22PM +0200, Leon Romanovsky wrote:

> @@ -2627,7 +2626,11 @@ struct ib_device {
>  	struct rcu_head rcu_head;
> 
>  	struct list_head              event_handler_list;
> -	spinlock_t                    event_handler_lock;
> +	/* Protects event_handler_list */
> +	struct rw_semaphore event_handler_rwsem;
> +
> +	/* Protects QP's event_handler calls and open_qp list */
> +	spinlock_t event_handler_lock;

This only protects the open_qp list really, the event handler call
doesn't need a spinlock. So lets name it properly. open_list_lock ?

It is sort of weird that we globally serialize all the qp event
handlers? ie that this lock isn't in the ib_qp.

Is this deliberate and relied upon or just something random?

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rdma-next v1 0/4] Let IB core distribute cache update events
  2019-12-12 11:30 [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Leon Romanovsky
                   ` (3 preceding siblings ...)
  2019-12-12 11:30 ` [PATCH rdma-next v1 4/4] IB/core: Prefix qp to event_handler_lock Leon Romanovsky
@ 2020-01-08  0:28 ` Jason Gunthorpe
  2020-01-08 11:42   ` Parav Pandit
  4 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2020-01-08  0:28 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Parav Pandit

On Thu, Dec 12, 2019 at 01:30:20PM +0200, Leon Romanovsky wrote:

> Parav Pandit (4):
>   IB/mlx5: Do reverse sequence during device removal
>   IB/core: Let IB core distribute cache update events
>   IB/core: Cut down single member ib_cache structure
>   IB/core: Prefix qp to event_handler_lock

I used qp_open_list_lock in the last patch, and I'm still interested
if/why globally serializing the qp handlers is required, or if that
could be rw spinlock too.

Otherwise applied to for-next

Thanks,
Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events
  2020-01-07 21:02   ` Jason Gunthorpe
@ 2020-01-08 11:35     ` Parav Pandit
  0 siblings, 0 replies; 9+ messages in thread
From: Parav Pandit @ 2020-01-08 11:35 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list

On 1/8/2020 2:32 AM, Jason Gunthorpe wrote:
> On Thu, Dec 12, 2019 at 01:30:22PM +0200, Leon Romanovsky wrote:
> 
>> @@ -2627,7 +2626,11 @@ struct ib_device {
>>  	struct rcu_head rcu_head;
>>
>>  	struct list_head              event_handler_list;
>> -	spinlock_t                    event_handler_lock;
>> +	/* Protects event_handler_list */
>> +	struct rw_semaphore event_handler_rwsem;
>> +
>> +	/* Protects QP's event_handler calls and open_qp list */
>> +	spinlock_t event_handler_lock;
> 
> This only protects the open_qp list really, the event handler call
> doesn't need a spinlock. So lets name it properly. open_list_lock ?
> 
Yes. it protects open_qp list and event handler is called for each list
item. So it doesn't really need to protect event handler calls.

> It is sort of weird that we globally serialize all the qp event
> handlers? ie that this lock isn't in the ib_qp.
> 
It probably isn't in each ib_qp because ib_qp is in hundred thousands
and xrc qp events are not so frequent event that can get contented.
So I think per device qp list lock seems fine.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH rdma-next v1 0/4] Let IB core distribute cache update events
  2020-01-08  0:28 ` [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Jason Gunthorpe
@ 2020-01-08 11:42   ` Parav Pandit
  0 siblings, 0 replies; 9+ messages in thread
From: Parav Pandit @ 2020-01-08 11:42 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list

On 1/8/2020 5:58 AM, Jason Gunthorpe wrote:
> On Thu, Dec 12, 2019 at 01:30:20PM +0200, Leon Romanovsky wrote:
> 
>> Parav Pandit (4):
>>   IB/mlx5: Do reverse sequence during device removal
>>   IB/core: Let IB core distribute cache update events
>>   IB/core: Cut down single member ib_cache structure
>>   IB/core: Prefix qp to event_handler_lock
> 
> I used qp_open_list_lock in the last patch, and I'm still interested
> if/why globally serializing the qp handlers is required, or if that
> could be rw spinlock too.
> 
My understanding is as in email of patch-2, its open_list_lock.
probably there isn't too much contention, but yes it can be changed to
rw spinlock.

> Otherwise applied to for-next
> 
Thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-12 11:30 [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Leon Romanovsky
2019-12-12 11:30 ` [PATCH rdma-next v1 1/4] IB/mlx5: Do reverse sequence during device removal Leon Romanovsky
2019-12-12 11:30 ` [PATCH rdma-next v1 2/4] IB/core: Let IB core distribute cache update events Leon Romanovsky
2020-01-07 21:02   ` Jason Gunthorpe
2020-01-08 11:35     ` Parav Pandit
2019-12-12 11:30 ` [PATCH rdma-next v1 3/4] IB/core: Cut down single member ib_cache structure Leon Romanovsky
2019-12-12 11:30 ` [PATCH rdma-next v1 4/4] IB/core: Prefix qp to event_handler_lock Leon Romanovsky
2020-01-08  0:28 ` [PATCH rdma-next v1 0/4] Let IB core distribute cache update events Jason Gunthorpe
2020-01-08 11:42   ` Parav Pandit

Linux-RDMA Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rdma/0 linux-rdma/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rdma linux-rdma/ https://lore.kernel.org/linux-rdma \
		linux-rdma@vger.kernel.org
	public-inbox-index linux-rdma

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rdma


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git