* [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in
@ 2021-06-16 15:45 Anand Khoje
  2021-06-16 15:45 ` [PATCH v5 for-next 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Anand Khoje @ 2021-06-16 15:45 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

This v5 patch series is used to read the port_attribute subnet_prefix
from a valid cache entry instead of having to call
device->ops.query_gid() in Infiniband link-layer devices. This requires
addition of a flag used to check that the cache entry is initialized and
that a valid value is being read.

1. Removed the port validity check from ib_get_cached_subnet_prefix.
This check was not useful as the port_num is always valid.

2. Shuffled locks pkey_list_lock and netdev_lock in struct ib_port_data.
This was done as output of pahole showed two 4-byte holes in the
structure ib_port_data after pkey_list_lock and netdev_lock. Moving
netdev_lock shaved off 8 bytes from the structure.

3. Added a flag to struct ib_port_data. This is used to validate the
status of cached subnet_prefix. This valid cache entry of subnet_prefix
is used in function __ib_query_port().
This allows the utilization of the cache entry and hence avoids a call
into device->ops.query_gid(). We also ensure that in the event of a
cache update, the value for subnet_prefix gets read from the newly updated
GID cache and not via ib_query_port(), so that we do not end up reading a
stale cache value.

Anand Khoje (3):
  IB/core: Removed port validity check from ib_get_cached_subnet_prefix
  IB/core: Shuffle locks in ib_port_data to save memory
  IB/core: Obtain subnet_prefix from cache in IB devices

 drivers/infiniband/core/cache.c     | 21 +++++++++++++--------
 drivers/infiniband/core/core_priv.h |  2 +-
 drivers/infiniband/core/device.c    | 20 +++++++++++---------
 drivers/infiniband/core/security.c  |  7 ++-----
 include/rdma/ib_cache.h             |  1 -
 include/rdma/ib_verbs.h             |  5 ++++-
 6 files changed, 31 insertions(+), 25 deletions(-)

-- 
1.8.3.1



* [PATCH v5 for-next 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix
  2021-06-16 15:45 [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
@ 2021-06-16 15:45 ` Anand Khoje
  2021-06-16 15:45 ` [PATCH v5 for-next 2/3] IB/core: Shuffle locks in ib_port_data to save memory Anand Khoje
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Anand Khoje @ 2021-06-16 15:45 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

Remove the port validity check from ib_get_cached_subnet_prefix().
The check is not needed because callers always pass a valid port_num.

Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---

v1 -> v2:
    -   Added changes as per Leon's suggestion of removing port
    validity check from ib_get_cached_subnet_prefix().
    -   Split the v1 patch in 3 patches as per Leon's suggestion.
v2 -> v3:
    -   Added some formatting changes per Leon's suggestions
    and removed return from ib_get_cached_subnet_prefix.
v3 -> v4:
    -   Removed a newline in ib_policy_change_task().
v4 -> v5:
    -   No changes.

---
 drivers/infiniband/core/cache.c     |  7 +------
 drivers/infiniband/core/core_priv.h |  2 +-
 drivers/infiniband/core/device.c    | 11 ++---------
 drivers/infiniband/core/security.c  |  7 ++-----
 4 files changed, 6 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index d320459..2325171 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1069,19 +1069,14 @@ int ib_get_cached_pkey(struct ib_device *device,
 }
 EXPORT_SYMBOL(ib_get_cached_pkey);
 
-int ib_get_cached_subnet_prefix(struct ib_device *device, u32 port_num,
+void ib_get_cached_subnet_prefix(struct ib_device *device, u32 port_num,
 				u64 *sn_pfx)
 {
 	unsigned long flags;
 
-	if (!rdma_is_port_valid(device, port_num))
-		return -EINVAL;
-
 	read_lock_irqsave(&device->cache_lock, flags);
 	*sn_pfx = device->port_data[port_num].cache.subnet_prefix;
 	read_unlock_irqrestore(&device->cache_lock, flags);
-
-	return 0;
 }
 EXPORT_SYMBOL(ib_get_cached_subnet_prefix);
 
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 29809dd..0b23f50 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -214,7 +214,7 @@ int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
 			     struct nlmsghdr *nlh,
 			     struct netlink_ext_ack *extack);
 
-int ib_get_cached_subnet_prefix(struct ib_device *device,
+void ib_get_cached_subnet_prefix(struct ib_device *device,
 				u32 port_num,
 				u64 *sn_pfx);
 
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index c660cef..7a617e4 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -886,15 +886,8 @@ static void ib_policy_change_task(struct work_struct *work)
 
 		rdma_for_each_port (dev, i) {
 			u64 sp;
-			int ret = ib_get_cached_subnet_prefix(dev,
-							      i,
-							      &sp);
-
-			WARN_ONCE(ret,
-				  "ib_get_cached_subnet_prefix err: %d, this should never happen here\n",
-				  ret);
-			if (!ret)
-				ib_security_cache_change(dev, i, sp);
+			ib_get_cached_subnet_prefix(dev, i, &sp);
+			ib_security_cache_change(dev, i, sp);
 		}
 	}
 	up_read(&devices_rwsem);
diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index e5a78d1..5433912 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -72,7 +72,7 @@ static int get_pkey_and_subnet_prefix(struct ib_port_pkey *pp,
 	if (ret)
 		return ret;
 
-	ret = ib_get_cached_subnet_prefix(dev, pp->port_num, subnet_prefix);
+	ib_get_cached_subnet_prefix(dev, pp->port_num, subnet_prefix);
 
 	return ret;
 }
@@ -664,10 +664,7 @@ static int ib_security_pkey_access(struct ib_device *dev,
 	if (ret)
 		return ret;
 
-	ret = ib_get_cached_subnet_prefix(dev, port_num, &subnet_prefix);
-
-	if (ret)
-		return ret;
+	ib_get_cached_subnet_prefix(dev, port_num, &subnet_prefix);
 
 	return security_ib_pkey_access(sec, subnet_prefix, pkey);
 }
-- 
1.8.3.1



* [PATCH v5 for-next 2/3] IB/core: Shuffle locks in ib_port_data to save memory
  2021-06-16 15:45 [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
  2021-06-16 15:45 ` [PATCH v5 for-next 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
@ 2021-06-16 15:45 ` Anand Khoje
  2021-06-16 15:45 ` [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
  2021-06-21 23:52 ` [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in Jason Gunthorpe
  3 siblings, 0 replies; 11+ messages in thread
From: Anand Khoje @ 2021-06-16 15:45 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

pahole shows two 4-byte holes in struct ib_port_data after
pkey_list_lock and netdev_lock respectively.

Moving netdev_lock so that it directly follows pkey_list_lock
shaves eight bytes off the struct.
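
The effect can be reproduced outside the kernel with a minimal sketch. Everything
named demo_* below is hypothetical: demo_lock_t stands in for a 4-byte
spinlock_t (non-debug build), and the uint64_t members stand in for the
8-byte-aligned members of struct ib_port_data. It assumes a 64-bit ABI where
uint64_t is 8-byte aligned; it is not the real structure layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte spinlock_t. */
typedef struct { uint32_t raw; } demo_lock_t;
static_assert(sizeof(demo_lock_t) == 4, "lock stand-in must be 4 bytes");

/* Each lock is followed by an 8-byte-aligned member: two 4-byte holes. */
struct demo_port_before {
	uint64_t    immutable[2];
	demo_lock_t pkey_list_lock;	/* 4 bytes, then a 4-byte hole */
	uint64_t    pkey_list[2];
	uint64_t    cache;
	demo_lock_t netdev_lock;	/* 4 bytes, then a 4-byte hole */
	uint64_t    netdev;
};

/* The two locks sit side by side and fill one 8-byte slot: no holes. */
struct demo_port_after {
	uint64_t    immutable[2];
	demo_lock_t pkey_list_lock;
	demo_lock_t netdev_lock;
	uint64_t    pkey_list[2];
	uint64_t    cache;
	uint64_t    netdev;
};

/* Number of bytes recovered by the reordering. */
size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_port_before) - sizeof(struct demo_port_after);
}
```

Under the stated assumptions, demo_bytes_saved() reports the same eight bytes
that pahole shows for the real structure.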

Suggested-by: Haakon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---

v1 -> v2:
    -   Split the v1 patch in 3 patches as per Leon's suggestion.
v2 -> v3:
    -   No changes.
v3 -> v4:
    -   No changes.
v4 -> v5:
    -   No changes.

---
 include/rdma/ib_verbs.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 05dbc21..c96d601 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2174,11 +2174,13 @@ struct ib_port_data {
 	struct ib_port_immutable immutable;
 
 	spinlock_t pkey_list_lock;
+
+	spinlock_t netdev_lock;
+
 	struct list_head pkey_list;
 
 	struct ib_port_cache cache;
 
-	spinlock_t netdev_lock;
 	struct net_device __rcu *netdev;
 	struct hlist_node ndev_hash_link;
 	struct rdma_port_counter port_counter;
-- 
1.8.3.1



* [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-16 15:45 [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
  2021-06-16 15:45 ` [PATCH v5 for-next 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
  2021-06-16 15:45 ` [PATCH v5 for-next 2/3] IB/core: Shuffle locks in ib_port_data to save memory Anand Khoje
@ 2021-06-16 15:45 ` Anand Khoje
  2021-06-17  6:41   ` Leon Romanovsky
  2021-06-21 23:49   ` Jason Gunthorpe
  2021-06-21 23:52 ` [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in Jason Gunthorpe
  3 siblings, 2 replies; 11+ messages in thread
From: Anand Khoje @ 2021-06-16 15:45 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

ib_query_port() calls device->ops.query_port() to get the port
attributes. The method of querying is device driver specific.
The same function calls device->ops.query_gid() to get the GID and
extract the subnet_prefix (gid_prefix).
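
As an illustration of that extraction, here is a minimal userspace sketch. The
names demo_gid, demo_be64_to_cpu() and demo_gid_subnet_prefix() are
hypothetical; they only mirror the idea that the upper 64 bits of a 128-bit
GID hold the subnet prefix in big-endian byte order, which the kernel converts
with be64_to_cpu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of a 128-bit GID. */
union demo_gid {
	uint8_t raw[16];
	struct {
		uint64_t subnet_prefix;	/* stored big-endian */
		uint64_t interface_id;	/* stored big-endian */
	} global;
};

/* Portable big-endian load; plays the role of be64_to_cpu(). */
uint64_t demo_be64_to_cpu(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* The subnet prefix is the first eight bytes of the GID. */
uint64_t demo_gid_subnet_prefix(const union demo_gid *gid)
{
	assert(gid != NULL);
	return demo_be64_to_cpu(gid->raw);
}
```

For a GID whose raw bytes start with fe 80 followed by zeros, the helper
returns the default link-local prefix 0xfe80000000000000.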

The GID and subnet_prefix are stored in a cache. But they do not get
read from the cache if the device is an Infiniband device. The
following change takes advantage of the cached subnet_prefix.
Testing with RDBMS has shown a significant improvement in performance
with this change.

The flag cache_is_initialized is introduced because ib_query_port()
is called early, while reading the port immutable properties, at a
stage when the cache has not yet been built.

In that case, the default GID is still read from the HCA for IB
link-layer devices.

In the situation of an event causing cache update, the subnet_prefix
will get retrieved from newly updated GID cache in ib_cache_update(),
so that we do not end up reading a stale value from cache via
ib_query_port().
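
The control flow described above can be sketched in userspace as follows. All
demo_* names are hypothetical: the query_prefix callback stands in for
device->ops.query_gid(), and the sketch leaves out the cache_lock that the
real code takes around the cached value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_port {
	bool         cache_is_initialized;
	uint64_t     cached_subnet_prefix;
	/* Hypothetical slow path, standing in for ops.query_gid(). */
	int        (*query_prefix)(struct demo_port *port, uint64_t *prefix);
	unsigned int slow_queries;	/* counts trips to the "device" */
};

/* Fill the cache once, then mark it valid. */
int demo_cache_setup(struct demo_port *port)
{
	uint64_t prefix;
	int ret = port->query_prefix(port, &prefix);

	if (ret)
		return ret;
	port->cached_subnet_prefix = prefix;
	port->cache_is_initialized = true;
	return 0;
}

/* Serve from the cache when valid, otherwise fall back to the device. */
int demo_query_port(struct demo_port *port, uint64_t *prefix)
{
	assert(port != NULL && prefix != NULL);
	if (!port->cache_is_initialized)
		return port->query_prefix(port, prefix);

	*prefix = port->cached_subnet_prefix;
	return 0;
}

/* Example "device" query that returns the default link-local prefix. */
int demo_hw_query(struct demo_port *port, uint64_t *prefix)
{
	port->slow_queries++;
	*prefix = 0xfe80000000000000ULL;
	return 0;
}
```

A call to demo_query_port() before demo_cache_setup() goes to the device;
calls made afterwards are served from the cached value without another
device query.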

Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Suggested-by: Aru Kolappan <aru.kolappan@oracle.com>
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
---

v1 -> v2:
    -   Split the v1 patch in 3 patches as per Leon's suggestion.

v2 -> v3:
    -   Added changes as per Mark Zhang's suggestion of clearing
        flags in gid_table_cleanup_one().
v3 -> v4:
    -   Removed the enum ib_port_data_flags and 8 byte flags from
        struct ib_port_data, and the set_bit()/clear_bit() API
        used to update this flag as that was not necessary.
        Done to keep the code simple.
    -   Added code to read subnet_prefix from updated GID cache in the
        event of cache update. Before this change, ib_cache_update()
        read subnet_prefix via ib_query_port(); with this patch, that
        path would have returned a stale cached value of subnet_prefix.
v4 -> v5:
    -   Removed the code to reset cache_is_initialised bit from cleanup
        as per Leon's suggestion.
    -   Removed ib_cache_is_initialised() function.

---
 drivers/infiniband/core/cache.c  | 14 ++++++++++++--
 drivers/infiniband/core/device.c |  9 +++++++++
 include/rdma/ib_verbs.h          |  1 +
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 2325171..88517b5 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1466,6 +1466,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
 	struct ib_port_attr       *tprops = NULL;
 	struct ib_pkey_cache      *pkey_cache = NULL;
 	struct ib_pkey_cache      *old_pkey_cache = NULL;
+	union ib_gid               gid;
 	int                        i;
 	int                        ret;
 
@@ -1523,13 +1524,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
 	device->port_data[port].cache.lmc = tprops->lmc;
 	device->port_data[port].cache.port_state = tprops->state;
 
-	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
+	ret = rdma_query_gid(device, port, 0, &gid);
+	if (ret) {
+		write_unlock_irq(&device->cache_lock);
+		goto err;
+	}
+
+	device->port_data[port].cache.subnet_prefix =
+			be64_to_cpu(gid.global.subnet_prefix);
+
 	write_unlock_irq(&device->cache_lock);
 
 	if (enforce_security)
 		ib_security_cache_change(device,
 					 port,
-					 tprops->subnet_prefix);
+					 be64_to_cpu(gid.global.subnet_prefix));
 
 	kfree(old_pkey_cache);
 	kfree(tprops);
@@ -1629,6 +1638,7 @@ int ib_cache_setup_one(struct ib_device *device)
 		err = ib_cache_update(device, p, true, true, true);
 		if (err)
 			return err;
+		device->port_data[p].cache_is_initialized = 1;
 	}
 
 	return 0;
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 7a617e4..76fbca2 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2057,6 +2057,15 @@ static int __ib_query_port(struct ib_device *device,
 	    IB_LINK_LAYER_INFINIBAND)
 		return 0;
 
+	if (!device->port_data[port_num].cache_is_initialized)
+		goto query_gid_from_device;
+
+	ib_get_cached_subnet_prefix(device, port_num,
+				    &port_attr->subnet_prefix);
+
+	return 0;
+
+query_gid_from_device:
 	err = device->ops.query_gid(device, port_num, 0, &gid);
 	if (err)
 		return err;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c96d601..405f7da 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2177,6 +2177,7 @@ struct ib_port_data {
 
 	spinlock_t netdev_lock;
 
+	u8 cache_is_initialized:1;
 	struct list_head pkey_list;
 
 	struct ib_port_cache cache;
-- 
1.8.3.1



* Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-16 15:45 ` [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
@ 2021-06-17  6:41   ` Leon Romanovsky
  2021-06-21 23:49   ` Jason Gunthorpe
  1 sibling, 0 replies; 11+ messages in thread
From: Leon Romanovsky @ 2021-06-17  6:41 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, jgg, haakon.bugge

On Wed, Jun 16, 2021 at 09:15:09PM +0530, Anand Khoje wrote:
> ib_query_port() calls device->ops.query_port() to get the port
> attributes. The method of querying is device driver specific.
> The same function calls device->ops.query_gid() to get the GID and
> extract the subnet_prefix (gid_prefix).
> 
> The GID and subnet_prefix are stored in a cache. But they do not get
> read from the cache if the device is an Infiniband device. The
> following change takes advantage of the cached subnet_prefix.
> Testing with RDBMS has shown a significant improvement in performance
> with this change.
> 
> The function ib_cache_is_initialised() is introduced because
> ib_query_port() gets called early in the stage when the cache is not
> built while reading port immutable property.
> 
> In that case, the default GID still gets read from HCA for IB link-
> layer devices.
> 
> In the situation of an event causing cache update, the subnet_prefix
> will get retrieved from newly updated GID cache in ib_cache_update(),
> so that we do not end up reading a stale value from cache via
> ib_query_port().
> 
> Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
> Suggested-by: Leon Romanovsky <leonro@nvidia.com>
> Suggested-by: Aru Kolappan <aru.kolappan@oracle.com>
> Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
> Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
> ---
> 
> v1 -> v2:
>     -   Split the v1 patch in 3 patches as per Leon's suggestion.
> 
> v2 -> v3:
>     -   Added changes as per Mark Zhang's suggestion of clearing
>         flags in gid_table_cleanup_one().
> v3 -> v4:
>     -   Removed the enum ib_port_data_flags and 8 byte flags from
>         struct ib_port_data, and the set_bit()/clear_bit() API
>         used to update this flag as that was not necessary.
>         Done to keep the code simple.
>     -   Added code to read subnet_prefix from updated GID cache in the
>         event of cache update. Prior to this change, ib_cache_update
>         was reading the value for subnet_prefix via ib_query_port(),
>         due to this patch, we ended up reading a stale cached value of
>         subnet_prefix.
> v4 -> v5:
>     -   Removed the code to reset cache_is_initialised bit from cleanup
>         as per Leon's suggestion.
>     -   Removed ib_cache_is_initialised() function.
> 
> ---
>  drivers/infiniband/core/cache.c  | 14 ++++++++++++--
>  drivers/infiniband/core/device.c |  9 +++++++++
>  include/rdma/ib_verbs.h          |  1 +
>  3 files changed, 22 insertions(+), 2 deletions(-)
> 

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>


* Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-16 15:45 ` [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
  2021-06-17  6:41   ` Leon Romanovsky
@ 2021-06-21 23:49   ` Jason Gunthorpe
  2021-06-23 13:03     ` Anand Khoje
  1 sibling, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2021-06-21 23:49 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, haakon.bugge, leon

On Wed, Jun 16, 2021 at 09:15:09PM +0530, Anand Khoje wrote:
>  
> @@ -1523,13 +1524,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>  	device->port_data[port].cache.lmc = tprops->lmc;
>  	device->port_data[port].cache.port_state = tprops->state;
>  
> -	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
> +	ret = rdma_query_gid(device, port, 0, &gid);
> +	if (ret) {

This is quite a bit different than just calling ops.query_gid() - why
are you changing it? I'm not sure all the additional tests will pass,
the 0 gid entry is not required to be valid..

> @@ -1629,6 +1638,7 @@ int ib_cache_setup_one(struct ib_device *device)
>  		err = ib_cache_update(device, p, true, true, true);
>  		if (err)
>  			return err;
> +		device->port_data[p].cache_is_initialized = 1;
>  	}

And I would much prefer things be re-organized so the cache can be
valid sooner to adding this variable. What is the earlier call that is
motivating this?

Jason


* Re: [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in
  2021-06-16 15:45 [PATCH v5 for-next 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
                   ` (2 preceding siblings ...)
  2021-06-16 15:45 ` [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
@ 2021-06-21 23:52 ` Jason Gunthorpe
  3 siblings, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2021-06-21 23:52 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, haakon.bugge, leon

On Wed, Jun 16, 2021 at 09:15:06PM +0530, Anand Khoje wrote:
> This v5 patch series is used to read the port_attribute subnet_prefix
> from a valid cache entry instead of having to call
> device->ops.query_gid() in Infiniband link-layer devices. This requires
> addition of a flag used to check that the cache entry is initialized and
> that a valid value is being read.
> 
> 1. Removed the port validity check from ib_get_cached_subnet_prefix.
> This check was not useful as the port_num is always valid.
> 
> 2. Shuffled locks pkey_list_lock and netdev_lock in struct ib_port_data.
> This was done as output of pahole showed two 4-byte holes in the
> structure ib_port_data after pkey_list_lock and netdev_lock. Moving
> netdev_lock shaved off 8 bytes from the structure.
> 
> 3. Added a flag to struct ib_port_data. This is used to validate the
> status of cached subnet_prefix. This valid cache entry of subnet_prefix
> is used in function __ib_query_port().
> This allows the utilization of the cache entry and hence avoids a call
> into device->ops.query_gid(). We also ensure that in the event of a
> cache update, the value for subnet_prefix gets read from the newly updated
> GID cache and not via ib_query_port(), so that we do not end up reading a
> stale cache value.
> 
> Anand Khoje (3):
>   IB/core: Removed port validity check from ib_get_cached_subnet_prefix
>   IB/core: Shuffle locks in ib_port_data to save memory

I took these two, thanks

Jason


* Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-21 23:49   ` Jason Gunthorpe
@ 2021-06-23 13:03     ` Anand Khoje
  2021-06-24 17:54       ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Anand Khoje @ 2021-06-23 13:03 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma, linux-kernel, dledford, haakon.bugge, leon

On 6/22/2021 5:19 AM, Jason Gunthorpe wrote:
> On Wed, Jun 16, 2021 at 09:15:09PM +0530, Anand Khoje wrote:
>>   
>> @@ -1523,13 +1524,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>>   	device->port_data[port].cache.lmc = tprops->lmc;
>>   	device->port_data[port].cache.port_state = tprops->state;
>>   
>> -	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
>> +	ret = rdma_query_gid(device, port, 0, &gid);
>> +	if (ret) {
> 
> This is quite a bit different than just calling ops.query_gid() - why
> are you changing it? I'm not sure all the additional tests will pass,
> the 0 gid entry is not required to be valid..
> 
Hi Jason,

We have opted for rdma_query_gid(), as during ib_cache_update() the code 
calls ops.query_gid() earlier in config_non_roce_gid_cache(), thereby 
updating the value of GID in cache. We utilize this updated value, 
instead of calling ops->query_gid() again.

> I'm not sure all the additional tests will pass,
> the 0 gid entry is not required to be valid..

To get the subnet_prefix, __ib_query_port() does indeed read the GID at
index 0:

https://elixir.bootlin.com/linux/v5.13-rc5/source/drivers/infiniband/core/device.c#L2067

>> @@ -1629,6 +1638,7 @@ int ib_cache_setup_one(struct ib_device *device)
>>   		err = ib_cache_update(device, p, true, true, true);
>>   		if (err)
>>   			return err;
>> +		device->port_data[p].cache_is_initialized = 1;
>>   	}
> 
> And I would much prefer things be re-organized so the cache can be
> valid sooner to adding this variable. What is the earlier call that is
> motivating this?
> 
> Jason
> 

During device load, before the cache has been updated, ib_query_port()
needs a way to tell whether the cache entry is valid or still
uninitialized. We added this variable only to ensure the validity of the
cache.

Thanks,
Anand


* Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-23 13:03     ` Anand Khoje
@ 2021-06-24 17:54       ` Jason Gunthorpe
  2021-06-25  6:03         ` Anand Khoje
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Gunthorpe @ 2021-06-24 17:54 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, haakon.bugge, leon

On Wed, Jun 23, 2021 at 06:33:32PM +0530, Anand Khoje wrote:
> On 6/22/2021 5:19 AM, Jason Gunthorpe wrote:
> > On Wed, Jun 16, 2021 at 09:15:09PM +0530, Anand Khoje wrote:
> > > @@ -1523,13 +1524,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
> > >   	device->port_data[port].cache.lmc = tprops->lmc;
> > >   	device->port_data[port].cache.port_state = tprops->state;
> > > -	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
> > > +	ret = rdma_query_gid(device, port, 0, &gid);
> > > +	if (ret) {
> > 
> > This is quite a bit different than just calling ops.query_gid() - why
> > are you changing it? I'm not sure all the additional tests will pass,
> > the 0 gid entry is not required to be valid..
> > 
> Hi Jason,
> 
> We have opted for rdma_query_gid(), as during ib_cache_update() the code
> calls ops.query_gid() earlier in config_non_roce_gid_cache(), thereby
> updating the value of GID in cache. We utilize this updated value, instead
> of calling ops->query_gid() again.

Uhhhh, so just store the subnet prefix at that point then?

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index c9e9fc81447e89..5c554ebd000e89 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1428,8 +1428,8 @@ int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
 }
 EXPORT_SYMBOL(rdma_read_gid_l2_fields);
 
-static int config_non_roce_gid_cache(struct ib_device *device,
-				     u32 port, int gid_tbl_len)
+static int config_non_roce_gid_cache(struct ib_device *device, u32 port,
+				     struct ib_port_attr *tprops)
 {
 	struct ib_gid_attr gid_attr = {};
 	struct ib_gid_table *table;
@@ -1441,7 +1441,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
 	table = rdma_gid_table(device, port);
 
 	mutex_lock(&table->lock);
-	for (i = 0; i < gid_tbl_len; ++i) {
+	for (i = 0; i < tprops->gid_tbl_len; ++i) {
 		if (!device->ops.query_gid)
 			continue;
 		ret = device->ops.query_gid(device, port, i, &gid_attr.gid);
@@ -1452,6 +1452,8 @@ static int config_non_roce_gid_cache(struct ib_device *device,
 			goto err;
 		}
 		gid_attr.index = i;
+		tprops->subnet_prefix =
+			be64_to_cpu(gid_attr.global.subnet_prefix);
 		add_modify_gid(table, &gid_attr);
 	}
 err:
@@ -1484,7 +1486,7 @@ ib_cache_update(struct ib_device *device, u32 port, bool update_gids,
 
 	if (!rdma_protocol_roce(device, port) && update_gids) {
 		ret = config_non_roce_gid_cache(device, port,
-						tprops->gid_tbl_len);
+						tprops);
 		if (ret)
 			goto err;
 	}

> > And I would much prefer things be re-organized so the cache can be
> > valid sooner to adding this variable. What is the earlier call that is
> > motivating this?
> 
> During device load and when cache is yet to be updated, ib_query_port()
> should have a mechanism to identify if the cache entry is valid or invalid
> (uninitialized), we have added this variable just to ensure the validity of
> cache.

Unless there is an actual user of ib_query_port() before
config_non_roce_gid_cache() that I can't see, don't bother, returning
0 is fine.

Jason


* Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-24 17:54       ` Jason Gunthorpe
@ 2021-06-25  6:03         ` Anand Khoje
  2021-06-25 12:48           ` Jason Gunthorpe
  0 siblings, 1 reply; 11+ messages in thread
From: Anand Khoje @ 2021-06-25  6:03 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma, linux-kernel, dledford, haakon.bugge, leon

On 6/24/2021 11:24 PM, Jason Gunthorpe wrote:
> On Wed, Jun 23, 2021 at 06:33:32PM +0530, Anand Khoje wrote:
>> On 6/22/2021 5:19 AM, Jason Gunthorpe wrote:
>>> On Wed, Jun 16, 2021 at 09:15:09PM +0530, Anand Khoje wrote:
>>>> @@ -1523,13 +1524,21 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>>>>    	device->port_data[port].cache.lmc = tprops->lmc;
>>>>    	device->port_data[port].cache.port_state = tprops->state;
>>>> -	device->port_data[port].cache.subnet_prefix = tprops->subnet_prefix;
>>>> +	ret = rdma_query_gid(device, port, 0, &gid);
>>>> +	if (ret) {
>>>
>>> This is quite a bit different than just calling ops.query_gid() - why
>>> are you changing it? I'm not sure all the additional tests will pass,
>>> the 0 gid entry is not required to be valid..
>>>
>> Hi Jason,
>>
>> We have opted for rdma_query_gid(), as during ib_cache_update() the code
>> calls ops.query_gid() earlier in config_non_roce_gid_cache(), thereby
>> updating the value of GID in cache. We utilize this updated value, instead
>> of calling ops->query_gid() again.
> 
> Uhhhh, so just store the subnet prefix at that point then?
> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index c9e9fc81447e89..5c554ebd000e89 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -1428,8 +1428,8 @@ int rdma_read_gid_l2_fields(const struct ib_gid_attr *attr,
>   }
>   EXPORT_SYMBOL(rdma_read_gid_l2_fields);
>   
> -static int config_non_roce_gid_cache(struct ib_device *device,
> -				     u32 port, int gid_tbl_len)
> +static int config_non_roce_gid_cache(struct ib_device *device, u32 port,
> +				     struct ib_port_attr *tprops)
>   {
>   	struct ib_gid_attr gid_attr = {};
>   	struct ib_gid_table *table;
> @@ -1441,7 +1441,7 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>   	table = rdma_gid_table(device, port);
>   
>   	mutex_lock(&table->lock);
> -	for (i = 0; i < gid_tbl_len; ++i) {
> +	for (i = 0; i < tprops->gid_tbl_len; ++i) {
>   		if (!device->ops.query_gid)
>   			continue;
>   		ret = device->ops.query_gid(device, port, i, &gid_attr.gid);
> @@ -1452,6 +1452,8 @@ static int config_non_roce_gid_cache(struct ib_device *device,
>   			goto err;
>   		}
>   		gid_attr.index = i;
> +		tprops->subnet_prefix =
> +			be64_to_cpu(gid_attr.global.subnet_prefix);
>   		add_modify_gid(table, &gid_attr);
>   	}
>   err:
> @@ -1484,7 +1486,7 @@ ib_cache_update(struct ib_device *device, u32 port, bool update_gids,
>   
>   	if (!rdma_protocol_roce(device, port) && update_gids) {
>   		ret = config_non_roce_gid_cache(device, port,
> -						tprops->gid_tbl_len);
> +						tprops);
>   		if (ret)
>   			goto err;
>   	}
> 

Hi Jason,

Thanks for the response!

If the above change is made, the following scenario could arise:
during a cache_update event, another application or module could call
ib_query_port() and read subnet_prefix while the cache is still being
updated, and so end up reading a stale value of subnet_prefix.

I have a few questions:
- How likely is it that an up and running Infiniband fabric would change 
the subnet_prefix?
- Is it possible that different GIDs in the gid_table will have 
different values of subnet_prefix?

>>> And I would much prefer things be re-organized so the cache can be
>>> valid sooner to adding this variable. What is the earlier call that is
>>> motivating this?
>>
>> During device load and when cache is yet to be updated, ib_query_port()
>> should have a mechanism to identify if the cache entry is valid or invalid
>> (uninitialized), we have added this variable just to ensure the validity of
>> cache.
> 
> Unless there is an actual user of ib_query_port() before
> config_non_roce_gid_cache() that I can't see, don't bother, returning
> 0 is fine.
> 
> Jason
> 

Hm, that makes sense. With the above change we wouldn't need to call
device->ops.query_gid() from __ib_query_port() and could always read
subnet_prefix using ib_get_cached_subnet_prefix(), provided that reading a
stale value during a cache update event is not an issue.

Thanks,
Anand


* Re: [PATCH v5 for-next 3/3] IB/core: Obtain subnet_prefix from cache in IB devices
  2021-06-25  6:03         ` Anand Khoje
@ 2021-06-25 12:48           ` Jason Gunthorpe
  0 siblings, 0 replies; 11+ messages in thread
From: Jason Gunthorpe @ 2021-06-25 12:48 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, haakon.bugge, leon

On Fri, Jun 25, 2021 at 11:33:58AM +0530, Anand Khoje wrote:
> 
> If the above change is to be made, there could arise a scenario in which:
>  In case of a cache_update event, another application/module could try to
> call ib_query_port() and read subnet_prefix while the cache is still getting
> updated and the application/module could end up reading a stale value of
> subnet_prefix.

Applications relying on this data must hook the event and update their
state when the event fires.

So long as ib_query_port returns the correct value in the event
handler it is all OK. This whole thing is racy - the HW can change the
subnet_prefix at any time; this is just shuffling the race around.
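
A minimal userspace sketch of that pattern, with hypothetical demo_* names
throughout: the consumer keeps a private copy of the prefix and refreshes it
only from its event handler. In the kernel, a consumer would typically hook
this up with ib_register_event_handler() and react to events such as
IB_EVENT_GID_CHANGE; none of that plumbing is modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: holds what a port query would return right now. */
struct demo_device {
	uint64_t subnet_prefix;
};

/* Hypothetical consumer with its own copy of the prefix. */
struct demo_consumer {
	struct demo_device *dev;
	uint64_t            known_prefix;
};

/* Stand-in for the port query; returns the current value. */
uint64_t demo_query_prefix(const struct demo_device *dev)
{
	assert(dev != NULL);
	return dev->subnet_prefix;
}

/* Take the initial snapshot when the consumer attaches. */
void demo_consumer_init(struct demo_consumer *c, struct demo_device *dev)
{
	c->dev = dev;
	c->known_prefix = demo_query_prefix(dev);
}

/* Event handler: re-read the prefix when the change event fires. */
void demo_on_gid_change(struct demo_consumer *c)
{
	c->known_prefix = demo_query_prefix(c->dev);
}
```

Between the hardware change and the handler running, the consumer's copy is
stale no matter where the query reads from, which is the race being described.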

> - Is it possible that different GIDs in the gid_table will have different
> values of subnet_prefix?

Valid GIDs should have the same prefix

Jason
