* [PATCH rdma-next 0/3] Let IB core distribute cache update events
@ 2019-10-20  6:54 Leon Romanovsky
  2019-10-20  6:54 ` [PATCH rdma-next 1/3] IB/core: " Leon Romanovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-10-20  6:54 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Daniel Jurgens, Parav Pandit

From: Leon Romanovsky <leonro@mellanox.com>

From Parav,

Currently, when a low-level driver generates Pkey, GID, or port change
events, they are delivered to the registered handlers in the order the
handlers were registered.

IB core and other ULPs such as IPoIB are interested in GID, LID, and
Pkey change events. Since all GID queries done by ULPs are serviced by
the IB core, the IB core may not yet have updated the GID cache when
IPoIB queries the GID, so the IPoIB address is not updated. The
detailed flow is shown in patch 1.

Hence, all events that require a cache update are now handled first by
the IB core. Once the cache update work completes, the IB core
distributes the event to the subscribed clients, as in the sketch
below.
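
A simplified sketch of the new dispatch entry point (adapted from
patch 1 below; the comment is explanatory, not part of the patch):

void ib_dispatch_event(struct ib_event *event)
{
	/* Cache update events are queued to a work item; clients are
	 * notified from that work item only after the cache has been
	 * updated. All other events go directly to the handlers.
	 */
	if (ib_is_cache_update_event(event))
		ib_enqueue_cache_update_event(event);
	else
		ib_dispatch_cache_event_clients(event);
}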

This was tested by updating the subnet_prefix configuration in
opensm's /etc/rdma/opensm.conf. Before the update:

$ ip link show dev ib0
ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
    link/infiniband
80:00:01:07:fe:80:00:00:00:00:00:00:24:8a:07:03:00:b3:d1:12 brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff

And after the subnet prefix update:

ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
state UP mode DEFAULT group default qlen 256
    link/infiniband
80:00:01:07:fe:80:00:00:00:00:00:02:24:8a:07:03:00:b3:d1:12 brd
00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff

Patch-1 fixes the race condition between GID users and the GID cache update.
Patch-2 eliminates the single-member ib_cache structure.
Patch-3 simplifies the code to not generate GID change events for an unregistered device.

Parav Pandit (3):
  IB/core: Let IB core distribute cache update events
  IB/core: Cut down single member ib_cache structure
  IB/core: Do not notify GID change event of an unregistered device

 drivers/infiniband/core/cache.c     | 132 ++++++++++++++--------------
 drivers/infiniband/core/core_priv.h |   3 +
 drivers/infiniband/core/device.c    |  26 ++++--
 include/rdma/ib_verbs.h             |   8 +-
 4 files changed, 87 insertions(+), 82 deletions(-)

--
2.20.1



* [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache update events
  2019-10-20  6:54 [PATCH rdma-next 0/3] Let IB core distribute cache update events Leon Romanovsky
@ 2019-10-20  6:54 ` Leon Romanovsky
  2019-10-22 19:55   ` Jason Gunthorpe
  2019-10-20  6:54 ` [PATCH rdma-next 2/3] IB/core: Cut down single member ib_cache structure Leon Romanovsky
  2019-10-20  6:54 ` [PATCH rdma-next 3/3] IB/core: Do not notify GID change event of an unregistered device Leon Romanovsky
  2 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2019-10-20  6:54 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Daniel Jurgens, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

Currently, when a low-level driver generates Pkey, GID, or port change
events, they are delivered to the registered handlers in the order the
handlers were registered.
IB core and other ULPs such as IPoIB are interested in GID, LID, and
Pkey change events.
Since all GID queries done by ULPs are serviced by the IB core, in the
flow below, the IB core has not yet updated the GID cache by the time
IPoIB queries the GID, so the IPoIB address is not updated.

mlx5_ib_handle_event()
  ib_dispatch_event()
    ib_cache_event()
       queue_work() -> slow cache update

    [..]
    ipoib_event()
     queue_work()
       [..]
       work handler
         ipoib_ib_dev_flush_light()
           __ipoib_ib_dev_flush()
              ipoib_dev_addr_changed_valid()
                rdma_query_gid() <- Returns old GID, cache not updated.

Hence, all events that require a cache update are handled first by the
IB core. Once the cache update work completes, the IB core distributes
the event to the subscriber clients.

Fixes: f35faa4ba956 ("IB/core: Simplify ib_query_gid to always refer to cache")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cache.c     | 94 +++++++++++++++--------------
 drivers/infiniband/core/core_priv.h |  3 +
 drivers/infiniband/core/device.c    | 26 +++++---
 include/rdma/ib_verbs.h             |  1 -
 4 files changed, 69 insertions(+), 55 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 00fb3eacda19..46dba17b385d 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -51,9 +51,8 @@ struct ib_pkey_cache {

 struct ib_update_work {
 	struct work_struct work;
-	struct ib_device  *device;
-	u8                 port_num;
-	bool		   enforce_security;
+	struct ib_event event;
+	bool enforce_security;
 };

 union ib_gid zgid;
@@ -130,7 +129,7 @@ static void dispatch_gid_change_event(struct ib_device *ib_dev, u8 port)
 	event.element.port_num	= port;
 	event.event		= IB_EVENT_GID_CHANGE;

-	ib_dispatch_event(&event);
+	ib_dispatch_cache_event_clients(&event);
 }

 static const char * const gid_type_str[] = {
@@ -1387,9 +1386,8 @@ static int config_non_roce_gid_cache(struct ib_device *device,
 	return ret;
 }

-static void ib_cache_update(struct ib_device *device,
-			    u8                port,
-			    bool	      enforce_security)
+static int
+ib_cache_update(struct ib_device *device, u8 port, bool	enforce_security)
 {
 	struct ib_port_attr       *tprops = NULL;
 	struct ib_pkey_cache      *pkey_cache = NULL, *old_pkey_cache;
@@ -1397,11 +1395,11 @@ static void ib_cache_update(struct ib_device *device,
 	int                        ret;

 	if (!rdma_is_port_valid(device, port))
-		return;
+		return -EINVAL;

 	tprops = kmalloc(sizeof *tprops, GFP_KERNEL);
 	if (!tprops)
-		return;
+		return -ENOMEM;

 	ret = ib_query_port(device, port, tprops);
 	if (ret) {
@@ -1419,8 +1417,10 @@ static void ib_cache_update(struct ib_device *device,
 	pkey_cache = kmalloc(struct_size(pkey_cache, table,
 					 tprops->pkey_tbl_len),
 			     GFP_KERNEL);
-	if (!pkey_cache)
+	if (!pkey_cache) {
+		ret = -ENOMEM;
 		goto err;
+	}

 	pkey_cache->table_len = tprops->pkey_tbl_len;

@@ -1452,49 +1452,58 @@ static void ib_cache_update(struct ib_device *device,

 	kfree(old_pkey_cache);
 	kfree(tprops);
-	return;
+	return 0;

 err:
 	kfree(pkey_cache);
 	kfree(tprops);
+	return ret;
 }

 static void ib_cache_task(struct work_struct *_work)
 {
 	struct ib_update_work *work =
 		container_of(_work, struct ib_update_work, work);
+	int ret;
+
+	ret = ib_cache_update(work->event.device, work->event.element.port_num,
+			      work->enforce_security);
+
+	/* GID events are already notified for individual GID entries by
+	 * dispatch_gid_change_event(). Hence, notify only for the rest
+	 * of the events.
+	 */
+	if (!ret && work->event.event != IB_EVENT_GID_CHANGE)
+		ib_dispatch_cache_event_clients(&work->event);

-	ib_cache_update(work->device,
-			work->port_num,
-			work->enforce_security);
 	kfree(work);
 }

-static void ib_cache_event(struct ib_event_handler *handler,
-			   struct ib_event *event)
+bool ib_is_cache_update_event(const struct ib_event *event)
+{
+	return (event->event == IB_EVENT_PORT_ERR    ||
+		event->event == IB_EVENT_PORT_ACTIVE ||
+		event->event == IB_EVENT_LID_CHANGE  ||
+		event->event == IB_EVENT_PKEY_CHANGE ||
+		event->event == IB_EVENT_CLIENT_REREGISTER ||
+		event->event == IB_EVENT_GID_CHANGE);
+}
+
+void ib_enqueue_cache_update_event(const struct ib_event *event)
 {
 	struct ib_update_work *work;

-	if (event->event == IB_EVENT_PORT_ERR    ||
-	    event->event == IB_EVENT_PORT_ACTIVE ||
-	    event->event == IB_EVENT_LID_CHANGE  ||
-	    event->event == IB_EVENT_PKEY_CHANGE ||
-	    event->event == IB_EVENT_CLIENT_REREGISTER ||
-	    event->event == IB_EVENT_GID_CHANGE) {
-		work = kmalloc(sizeof *work, GFP_ATOMIC);
-		if (work) {
-			INIT_WORK(&work->work, ib_cache_task);
-			work->device   = event->device;
-			work->port_num = event->element.port_num;
-			if (event->event == IB_EVENT_PKEY_CHANGE ||
-			    event->event == IB_EVENT_GID_CHANGE)
-				work->enforce_security = true;
-			else
-				work->enforce_security = false;
-
-			queue_work(ib_wq, &work->work);
-		}
-	}
+	work = kzalloc(sizeof(*work), GFP_ATOMIC);
+	if (!work)
+		return;
+
+	INIT_WORK(&work->work, ib_cache_task);
+	work->event = *event;
+	if (event->event == IB_EVENT_PKEY_CHANGE ||
+	    event->event == IB_EVENT_GID_CHANGE)
+		work->enforce_security = true;
+
+	queue_work(ib_wq, &work->work);
 }

 int ib_cache_setup_one(struct ib_device *device)
@@ -1511,9 +1520,6 @@ int ib_cache_setup_one(struct ib_device *device)
 	rdma_for_each_port (device, p)
 		ib_cache_update(device, p, true);

-	INIT_IB_EVENT_HANDLER(&device->cache.event_handler,
-			      device, ib_cache_event);
-	ib_register_event_handler(&device->cache.event_handler);
 	return 0;
 }

@@ -1535,14 +1541,12 @@ void ib_cache_release_one(struct ib_device *device)

 void ib_cache_cleanup_one(struct ib_device *device)
 {
-	/* The cleanup function unregisters the event handler,
-	 * waits for all in-progress workqueue elements and cleans
-	 * up the GID cache. This function should be called after
-	 * the device was removed from the devices list and all
-	 * clients were removed, so the cache exists but is
+	/* The cleanup function waits for all in-progress workqueue
+	 * elements and cleans up the GID cache. This function should be
+	 * called after the device was removed from the devices list and
+	 * all clients were removed, so the cache exists but is
 	 * non-functional and shouldn't be updated anymore.
 	 */
-	ib_unregister_event_handler(&device->cache.event_handler);
 	flush_workqueue(ib_wq);
 	gid_table_cleanup_one(device);

diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 9d07378b5b42..b08018a8cf74 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -149,6 +149,9 @@ unsigned long roce_gid_type_mask_support(struct ib_device *ib_dev, u8 port);
 int ib_cache_setup_one(struct ib_device *device);
 void ib_cache_cleanup_one(struct ib_device *device);
 void ib_cache_release_one(struct ib_device *device);
+bool ib_is_cache_update_event(const struct ib_event *event);
+void ib_enqueue_cache_update_event(const struct ib_event *event);
+void ib_dispatch_cache_event_clients(struct ib_event *event);

 #ifdef CONFIG_CGROUP_RDMA
 void ib_device_register_rdmacg(struct ib_device *device);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 2f89c4d64b73..e9ab1289c224 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -1951,15 +1951,7 @@ void ib_unregister_event_handler(struct ib_event_handler *event_handler)
 }
 EXPORT_SYMBOL(ib_unregister_event_handler);

-/**
- * ib_dispatch_event - Dispatch an asynchronous event
- * @event:Event to dispatch
- *
- * Low-level drivers must call ib_dispatch_event() to dispatch the
- * event to all registered event handlers when an asynchronous event
- * occurs.
- */
-void ib_dispatch_event(struct ib_event *event)
+void ib_dispatch_cache_event_clients(struct ib_event *event)
 {
 	unsigned long flags;
 	struct ib_event_handler *handler;
@@ -1971,6 +1963,22 @@ void ib_dispatch_event(struct ib_event *event)

 	spin_unlock_irqrestore(&event->device->event_handler_lock, flags);
 }
+
+/**
+ * ib_dispatch_event - Dispatch an asynchronous event
+ * @event:Event to dispatch
+ *
+ * Low-level drivers must call ib_dispatch_event() to dispatch the
+ * event to all registered event handlers when an asynchronous event
+ * occurs.
+ */
+void ib_dispatch_event(struct ib_event *event)
+{
+	if (ib_is_cache_update_event(event))
+		ib_enqueue_cache_update_event(event);
+	else
+		ib_dispatch_cache_event_clients(event);
+}
 EXPORT_SYMBOL(ib_dispatch_event);

 static int iw_query_port(struct ib_device *device,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index cca9985b4cbc..1f6d6734f477 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2148,7 +2148,6 @@ struct ib_port_cache {

 struct ib_cache {
 	rwlock_t                lock;
-	struct ib_event_handler event_handler;
 };

 struct ib_port_immutable {
--
2.20.1



* [PATCH rdma-next 2/3] IB/core: Cut down single member ib_cache structure
  2019-10-20  6:54 [PATCH rdma-next 0/3] Let IB core distribute cache update events Leon Romanovsky
  2019-10-20  6:54 ` [PATCH rdma-next 1/3] IB/core: " Leon Romanovsky
@ 2019-10-20  6:54 ` Leon Romanovsky
  2019-10-20  6:54 ` [PATCH rdma-next 3/3] IB/core: Do not notify GID change event of an unregistered device Leon Romanovsky
  2 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-10-20  6:54 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Daniel Jurgens, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

Given that the ib_cache structure now has only a single member, merge
the cache lock directly into ib_device.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cache.c | 30 +++++++++++++++---------------
 include/rdma/ib_verbs.h         |  7 ++-----
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 46dba17b385d..b626ca682004 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1039,7 +1039,7 @@ int ib_get_cached_pkey(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);

 	cache = device->port_data[port_num].cache.pkey;

@@ -1048,7 +1048,7 @@ int ib_get_cached_pkey(struct ib_device *device,
 	else
 		*pkey = cache->table[index];

-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1063,9 +1063,9 @@ int ib_get_cached_subnet_prefix(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);
 	*sn_pfx = device->port_data[port_num].cache.subnet_prefix;
-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return 0;
 }
@@ -1085,7 +1085,7 @@ int ib_find_cached_pkey(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);

 	cache = device->port_data[port_num].cache.pkey;

@@ -1106,7 +1106,7 @@ int ib_find_cached_pkey(struct ib_device *device,
 		ret = 0;
 	}

-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1125,7 +1125,7 @@ int ib_find_exact_cached_pkey(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);

 	cache = device->port_data[port_num].cache.pkey;

@@ -1138,7 +1138,7 @@ int ib_find_exact_cached_pkey(struct ib_device *device,
 			break;
 		}

-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1154,9 +1154,9 @@ int ib_get_cached_lmc(struct ib_device *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);
 	*lmc = device->port_data[port_num].cache.lmc;
-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1172,9 +1172,9 @@ int ib_get_cached_port_state(struct ib_device   *device,
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;

-	read_lock_irqsave(&device->cache.lock, flags);
+	read_lock_irqsave(&device->cache_lock, flags);
 	*port_state = device->port_data[port_num].cache.port_state;
-	read_unlock_irqrestore(&device->cache.lock, flags);
+	read_unlock_irqrestore(&device->cache_lock, flags);

 	return ret;
 }
@@ -1434,7 +1434,7 @@ ib_cache_update(struct ib_device *device, u8 port, bool	enforce_security)
 		}
 	}

-	write_lock_irq(&device->cache.lock);
+	write_lock_irq(&device->cache_lock);

 	old_pkey_cache = device->port_data[port].cache.pkey;

@@ -1443,7 +1443,7 @@ ib_cache_update(struct ib_device *device, u8 port, bool	enforce_security)
 	device->port_data[port].cache.port_state = tprops->state;

 	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
-	write_unlock_irq(&device->cache.lock);
+	write_unlock_irq(&device->cache_lock);

 	if (enforce_security)
 		ib_security_cache_change(device,
@@ -1511,7 +1511,7 @@ int ib_cache_setup_one(struct ib_device *device)
 	unsigned int p;
 	int err;

-	rwlock_init(&device->cache.lock);
+	rwlock_init(&device->cache_lock);

 	err = gid_table_setup_one(device);
 	if (err)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 1f6d6734f477..adff05eade2c 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2146,10 +2146,6 @@ struct ib_port_cache {
 	enum ib_port_state     port_state;
 };

-struct ib_cache {
-	rwlock_t                lock;
-};
-
 struct ib_port_immutable {
 	int                           pkey_tbl_len;
 	int                           gid_tbl_len;
@@ -2609,7 +2605,8 @@ struct ib_device {
 	struct xarray                 client_data;
 	struct mutex                  unregistration_lock;

-	struct ib_cache               cache;
+	/* Synchronize GID, Pkey cache entries, subnet prefix, LMC */
+	rwlock_t cache_lock;
 	/**
 	 * port_data is indexed by port number
 	 */
--
2.20.1



* [PATCH rdma-next 3/3] IB/core: Do not notify GID change event of an unregistered device
  2019-10-20  6:54 [PATCH rdma-next 0/3] Let IB core distribute cache update events Leon Romanovsky
  2019-10-20  6:54 ` [PATCH rdma-next 1/3] IB/core: " Leon Romanovsky
  2019-10-20  6:54 ` [PATCH rdma-next 2/3] IB/core: Cut down single member ib_cache structure Leon Romanovsky
@ 2019-10-20  6:54 ` Leon Romanovsky
  2019-10-22 20:00   ` Jason Gunthorpe
  2 siblings, 1 reply; 9+ messages in thread
From: Leon Romanovsky @ 2019-10-20  6:54 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, RDMA mailing list, Daniel Jurgens, Parav Pandit

From: Parav Pandit <parav@mellanox.com>

When an IB device is undergoing unregistration, the GID cache is
cleaned up after all clients are unregistered, in the below flow.

__ib_unregister_device()
disable_device();
  ib_cache_cleanup_one()
    gid_table_cleanup_one()
      cleanup_gid_table_port()

There is no point in generating a GID change event at this stage, when
there is no active client of the device and the device is in the
unregistered state.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cache.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index b626ca682004..53d8313e8309 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -818,22 +818,16 @@ static void cleanup_gid_table_port(struct ib_device *ib_dev, u8 port,
 				   struct ib_gid_table *table)
 {
 	int i;
-	bool deleted = false;

 	if (!table)
 		return;

 	mutex_lock(&table->lock);
 	for (i = 0; i < table->sz; ++i) {
-		if (is_gid_entry_valid(table->data_vec[i])) {
+		if (is_gid_entry_valid(table->data_vec[i]))
 			del_gid(ib_dev, port, table, i);
-			deleted = true;
-		}
 	}
 	mutex_unlock(&table->lock);
-
-	if (deleted)
-		dispatch_gid_change_event(ib_dev, port);
 }

 void ib_cache_gid_set_default_gid(struct ib_device *ib_dev, u8 port,
--
2.20.1



* Re: [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache update events
  2019-10-20  6:54 ` [PATCH rdma-next 1/3] IB/core: " Leon Romanovsky
@ 2019-10-22 19:55   ` Jason Gunthorpe
  2019-10-23  5:08     ` Parav Pandit
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2019-10-22 19:55 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Daniel Jurgens,
	Parav Pandit

On Sun, Oct 20, 2019 at 09:54:25AM +0300, Leon Romanovsky wrote:
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 2f89c4d64b73..e9ab1289c224 100644
> +++ b/drivers/infiniband/core/device.c
> @@ -1951,15 +1951,7 @@ void ib_unregister_event_handler(struct ib_event_handler *event_handler)
>  }
>  EXPORT_SYMBOL(ib_unregister_event_handler);
> 
> -/**
> - * ib_dispatch_event - Dispatch an asynchronous event
> - * @event:Event to dispatch
> - *
> - * Low-level drivers must call ib_dispatch_event() to dispatch the
> - * event to all registered event handlers when an asynchronous event
> - * occurs.
> - */
> -void ib_dispatch_event(struct ib_event *event)
> +void ib_dispatch_cache_event_clients(struct ib_event *event)
>  {

no kdoc for this?

>  	unsigned long flags;
>  	struct ib_event_handler *handler;
> @@ -1971,6 +1963,22 @@ void ib_dispatch_event(struct ib_event *event)
> 
>  	spin_unlock_irqrestore(&event->device->event_handler_lock, flags);
>  }
> +
> +/**
> + * ib_dispatch_event - Dispatch an asynchronous event
> + * @event:Event to dispatch
> + *
> + * Low-level drivers must call ib_dispatch_event() to dispatch the
> + * event to all registered event handlers when an asynchronous event
> + * occurs.
> + */
> +void ib_dispatch_event(struct ib_event *event)
> +{
> +	if (ib_is_cache_update_event(event))
> +		ib_enqueue_cache_update_event(event);
> +	else
> +		ib_dispatch_cache_event_clients(event);
> +}
>  EXPORT_SYMBOL(ib_dispatch_event);

It seems like there is now some big mess here: many of the users of
events, including the cache, actually do need a blocking context to do
their work, while this function is supposed to be an atomic context
for the driver.

So, after this change, many event types are now guaranteed to be
called from a blocking context in a WQ - but we still go ahead and do
silly things like launch more work to get into blocking contexts
from the other users.

Thus I'm wondering if this wouldn't be better off just always pushing
events into a wq and running the notifier subscriptions sequentially?

Jason


* Re: [PATCH rdma-next 3/3] IB/core: Do not notify GID change event of an unregistered device
  2019-10-20  6:54 ` [PATCH rdma-next 3/3] IB/core: Do not notify GID change event of an unregistered device Leon Romanovsky
@ 2019-10-22 20:00   ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2019-10-22 20:00 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Daniel Jurgens,
	Parav Pandit

On Sun, Oct 20, 2019 at 09:54:27AM +0300, Leon Romanovsky wrote:
> From: Parav Pandit <parav@mellanox.com>
> 
> When IB device is undergoing unregistration, GID cache is cleaned up
> after all clients are unregistered in below flow.
> 
> __ib_unregister_device()
> disable_device();
>   ib_cache_cleanup_one()
>     gid_table_cleanup_one()
>       cleanup_gid_table_port()
> 
> There is no use of generating a GID change event at such stage, where
> there is no active client of the device and device is unregistered
> state.
> 
> Signed-off-by: Parav Pandit <parav@mellanox.com>
> Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/core/cache.c | 8 +-------
>  1 file changed, 1 insertion(+), 7 deletions(-)

Yep, this is a consequence of the recent cleanups in device. Applied
to for-next

Jason


* RE: [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache update events
  2019-10-22 19:55   ` Jason Gunthorpe
@ 2019-10-23  5:08     ` Parav Pandit
  2019-10-23 17:40       ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Parav Pandit @ 2019-10-23  5:08 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, RDMA mailing list, Daniel Jurgens



> -----Original Message-----
> From: Jason Gunthorpe <jgg@mellanox.com>
> Sent: Tuesday, October 22, 2019 2:55 PM
> To: Leon Romanovsky <leon@kernel.org>
> Cc: Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leonro@mellanox.com>; RDMA mailing list <linux-rdma@vger.kernel.org>;
> Daniel Jurgens <danielj@mellanox.com>; Parav Pandit
> <parav@mellanox.com>
> Subject: Re: [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache
> update events
> 
> On Sun, Oct 20, 2019 at 09:54:25AM +0300, Leon Romanovsky wrote:
> > diff --git a/drivers/infiniband/core/device.c
> > b/drivers/infiniband/core/device.c
> > index 2f89c4d64b73..e9ab1289c224 100644
> > +++ b/drivers/infiniband/core/device.c
> > @@ -1951,15 +1951,7 @@ void ib_unregister_event_handler(struct
> > ib_event_handler *event_handler)  }
> > EXPORT_SYMBOL(ib_unregister_event_handler);
> >
> > -/**
> > - * ib_dispatch_event - Dispatch an asynchronous event
> > - * @event:Event to dispatch
> > - *
> > - * Low-level drivers must call ib_dispatch_event() to dispatch the
> > - * event to all registered event handlers when an asynchronous event
> > - * occurs.
> > - */
> > -void ib_dispatch_event(struct ib_event *event)
> > +void ib_dispatch_cache_event_clients(struct ib_event *event)
> >  {
> 
> no kdoc for this?
>
I dropped the kdoc because it is an internal API, not exposed to other kernel modules,
and added it for the external one below.

> >  	unsigned long flags;
> >  	struct ib_event_handler *handler;
> > @@ -1971,6 +1963,22 @@ void ib_dispatch_event(struct ib_event *event)
> >
> >  	spin_unlock_irqrestore(&event->device->event_handler_lock, flags);
> > }
> > +
> > +/**
> > + * ib_dispatch_event - Dispatch an asynchronous event
> > + * @event:Event to dispatch
> > + *
> > + * Low-level drivers must call ib_dispatch_event() to dispatch the
> > + * event to all registered event handlers when an asynchronous event
> > + * occurs.
> > + */
> > +void ib_dispatch_event(struct ib_event *event) {
> > +	if (ib_is_cache_update_event(event))
> > +		ib_enqueue_cache_update_event(event);
> > +	else
> > +		ib_dispatch_cache_event_clients(event);
> > +}
> >  EXPORT_SYMBOL(ib_dispatch_event);
> 
> It seems like there is now some big mess here, many of the users of events,
> including cache, actually do need a blocking context to do their work, while
> this function is supposed to be an atomic context for the driver.
> 
> So, after this change, many event types are now guaranteed to be called
> from a blocking context in a WQ - but we still go ahead and do silly things
> like launch more work to get into blocking contexts from the other users
> 
> Thus I'm wondering if this wouldn't be better off just always pushing events
> into a wq and running the notifier subscriptions sequentially?
>
Are you saying we should drop the else part above and always do ib_enqueue_cache_update_event()?
If so, yes, I think it should be fine.
The only event that I wanted to deliver faster was IB_EVENT_SRQ_LIMIT_REACHED.
However, given that no kernel consumer is interested in it, fast delivery for that event is not very useful.
We can slim down ib_is_cache_update_event(); a rough sketch of that direction is below.
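
A minimal sketch of that direction, assuming the existing work item
machinery stays as-is (untested, for illustration only):

void ib_dispatch_event(struct ib_event *event)
{
	/* All events now take the work queue path; the work handler
	 * updates the cache when needed and then notifies the
	 * registered handlers from a blocking context.
	 */
	ib_enqueue_cache_update_event(event);
}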



* Re: [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache update events
  2019-10-23  5:08     ` Parav Pandit
@ 2019-10-23 17:40       ` Jason Gunthorpe
  2019-10-24 19:58         ` Parav Pandit
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2019-10-23 17:40 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky,
	RDMA mailing list, Daniel Jurgens

On Wed, Oct 23, 2019 at 05:08:56AM +0000, Parav Pandit wrote:
> > >  	unsigned long flags;
> > >  	struct ib_event_handler *handler;
> > > @@ -1971,6 +1963,22 @@ void ib_dispatch_event(struct ib_event *event)
> > >
> > >  	spin_unlock_irqrestore(&event->device->event_handler_lock, flags);
> > > }
> > > +
> > > +/**
> > > + * ib_dispatch_event - Dispatch an asynchronous event
> > > + * @event:Event to dispatch
> > > + *
> > > + * Low-level drivers must call ib_dispatch_event() to dispatch the
> > > + * event to all registered event handlers when an asynchronous event
> > > + * occurs.
> > > + */
> > > +void ib_dispatch_event(struct ib_event *event) {
> > > +	if (ib_is_cache_update_event(event))
> > > +		ib_enqueue_cache_update_event(event);
> > > +	else
> > > +		ib_dispatch_cache_event_clients(event);
> > > +}
> > >  EXPORT_SYMBOL(ib_dispatch_event);
> > 
> > It seems like there is now some big mess here, many of the users of events,
> > including cache, acctually do need a blocking context to do their work, while
> > this function is supposed to be atomic context for the driver.
> > 
> > So, after this change, many event types are now guarenteed to be called
> > from a blocking context in a WQ - but we still go ahead and do silly things
> > like launch more work to get into blocking contexts from the other users
> > 
> > Thus I'm wondering if this wouldn't be better off just always pushing events
> > into a wq and running the notifier subscriptions sequentially?
> >
> Are you saying we should drop the else part above and always do
> ib_enqueue_cache_update_event()?

Yes, but also now saying that all notifier callbacks are called from
work queues and can block for short periods.

This seems like it would simplify many of the users??

And not using ib_enqueue_cache_update_event(), but a simple
blocking_notifier_call_chain() with the cache always at the front;
see the sketch below.
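
Something like this, maybe (a rough sketch only; the event_notifiers
field and the ib_event_work struct are made up for illustration, not
an existing API; needs <linux/notifier.h> and <linux/workqueue.h>):

/* in struct ib_device, replacing the open-coded handler list: */
struct blocking_notifier_head event_notifiers;

/* hypothetical work item carrying a copy of the event */
struct ib_event_work {
	struct work_struct work;
	struct ib_event event;
};

static void ib_event_work_handler(struct work_struct *_work)
{
	struct ib_event_work *work =
		container_of(_work, struct ib_event_work, work);

	/* The cache's notifier block registers with the highest
	 * priority, so it always runs before the other subscribers.
	 */
	blocking_notifier_call_chain(&work->event.device->event_notifiers,
				     work->event.event, &work->event);
	kfree(work);
}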

> Only event that I wanted to deliver faster was
> IB_EVENT_SRQ_LIMIT_REACHED.  

It might make sense to have an atomic event list for such things in
future..

Jason


* RE: [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache update events
  2019-10-23 17:40       ` Jason Gunthorpe
@ 2019-10-24 19:58         ` Parav Pandit
  0 siblings, 0 replies; 9+ messages in thread
From: Parav Pandit @ 2019-10-24 19:58 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky,
	RDMA mailing list, Daniel Jurgens



> -----Original Message-----
> From: Jason Gunthorpe <jgg@mellanox.com>
> Sent: Wednesday, October 23, 2019 12:41 PM
> To: Parav Pandit <parav@mellanox.com>
> Cc: Leon Romanovsky <leon@kernel.org>; Doug Ledford
> <dledford@redhat.com>; Leon Romanovsky <leonro@mellanox.com>; RDMA
> mailing list <linux-rdma@vger.kernel.org>; Daniel Jurgens
> <danielj@mellanox.com>
> Subject: Re: [PATCH rdma-next 1/3] IB/core: Let IB core distribute cache update
> events
> 
> On Wed, Oct 23, 2019 at 05:08:56AM +0000, Parav Pandit wrote:
> > > >  	unsigned long flags;
> > > >  	struct ib_event_handler *handler; @@ -1971,6 +1963,22 @@ void
> > > > ib_dispatch_event(struct ib_event *event)
> > > >
> > > >  	spin_unlock_irqrestore(&event->device->event_handler_lock,
> > > > flags); }
> > > > +
> > > > +/**
> > > > + * ib_dispatch_event - Dispatch an asynchronous event
> > > > + * @event:Event to dispatch
> > > > + *
> > > > + * Low-level drivers must call ib_dispatch_event() to dispatch
> > > > +the
> > > > + * event to all registered event handlers when an asynchronous
> > > > +event
> > > > + * occurs.
> > > > + */
> > > > +void ib_dispatch_event(struct ib_event *event) {
> > > > +	if (ib_is_cache_update_event(event))
> > > > +		ib_enqueue_cache_update_event(event);
> > > > +	else
> > > > +		ib_dispatch_cache_event_clients(event);
> > > > +}
> > > >  EXPORT_SYMBOL(ib_dispatch_event);
> > >
> > > It seems like there is now some big mess here, many of the users of
> > > events, including cache, actually do need a blocking context to do
> > > their work, while this function is supposed to be an atomic context for the
> driver.
> > >
> > > So, after this change, many event types are now guaranteed to be
> > > called from a blocking context in a WQ - but we still go ahead and
> > > do silly things like launch more work to get into blocking contexts
> > > from the other users
> > >
> > > Thus I'm wondering if this wouldn't be better off just always
> > > pushing events into a wq and running the notifier subscriptions
> sequentially?
> > >
> > Are you saying we should drop the else part above and always do
> > ib_enqueue_cache_update_event()?
> 
> Yes, but also now saying that all notifier callbacks are called from work queues
> and can block for short periods.
> 
> This seems it would simplify many of the users??
> 
Yes, however that optimization should be a different series, done per ULP/consumer/event.
Will send v1 through Leon's queue.

> And not using the ib_enqueue_cache_update_event() but a simple
> blocking_notifier_call_chain() with the cache always at the front
> 
> > Only event that I wanted to deliver faster was
> > IB_EVENT_SRQ_LIMIT_REACHED.
> 
> It might make sense to have an atomic event list for such things in future..
> 
Ok.

> Jason


end of thread

Thread overview: 9+ messages
2019-10-20  6:54 [PATCH rdma-next 0/3] Let IB core distribute cache update events Leon Romanovsky
2019-10-20  6:54 ` [PATCH rdma-next 1/3] IB/core: " Leon Romanovsky
2019-10-22 19:55   ` Jason Gunthorpe
2019-10-23  5:08     ` Parav Pandit
2019-10-23 17:40       ` Jason Gunthorpe
2019-10-24 19:58         ` Parav Pandit
2019-10-20  6:54 ` [PATCH rdma-next 2/3] IB/core: Cut down single member ib_cache structure Leon Romanovsky
2019-10-20  6:54 ` [PATCH rdma-next 3/3] IB/core: Do not notify GID change event of an unregistered device Leon Romanovsky
2019-10-22 20:00   ` Jason Gunthorpe
