linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] IB/core: Obtaining subnet_prefix from cache in
@ 2021-06-09  5:55 Anand Khoje
  2021-06-09  5:55 ` [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Anand Khoje @ 2021-06-09  5:55 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

This v3 patch series is used to read the port_attribute subnet_prefix
from a valid cache entry instead of having to call
device->ops.query_gid() in Infiniband link-layer devices. This requires
addition of a flag used to check that the cache entry is initialized and
that a valid value is being read.

1. Removed the port validity check from ib_get_cached_subnet_prefix.
This check was not useful as the port_num is always valid.

2. Shuffled locks pkey_list_lock and netdev_lock in struct ib_port_data.
This was done to add the 8 byte field flags used for checking the cache
entry validity. Output of pahole showed two 4-byte holes in the
structure ib_port_data after pkey_list_lock and netdev_lock. Moving
netdev_lock shaved off 8 bytes from the structure, which is used to add
the 8 byte field flags in patch 3.

3. Added flags to struct ib_port_data and enum ib_port_data_flags. These
are used to validate the status of cached subnet_prefix. This valid
cache entry of subnet_prefix is used in function __ib_query_port().
This allows the utilization of the cache entry and hence avoids a call
into device->ops.query_gid().

Anand Khoje (3):
  IB/core: Removed port validity check from ib_get_cached_subnet_prefix
  IB/core: Shuffle locks in ib_port_data to save memory
  IB/core: Obtain subnet_prefix from cache in IB devices.

 drivers/infiniband/core/cache.c     | 13 +++++++------
 drivers/infiniband/core/core_priv.h |  2 +-
 drivers/infiniband/core/device.c    | 22 +++++++++++++---------
 drivers/infiniband/core/security.c  |  7 ++-----
 include/rdma/ib_cache.h             |  6 ++++++
 include/rdma/ib_verbs.h             | 10 +++++++++-
 6 files changed, 38 insertions(+), 22 deletions(-)

-- 
2.27.0



* [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix
  2021-06-09  5:55 [PATCH v3 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
@ 2021-06-09  5:55 ` Anand Khoje
  2021-06-09  8:37   ` Leon Romanovsky
  2021-06-09  5:55 ` [PATCH v3 2/3] IB/core: Shuffle locks in ib_port_data to save memory Anand Khoje
  2021-06-09  5:55 ` [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
  2 siblings, 1 reply; 15+ messages in thread
From: Anand Khoje @ 2021-06-09  5:55 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

Removed the port validity check from ib_get_cached_subnet_prefix(),
as this check is not needed because "port_num" is always valid.

Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>

---

v1 -> v2:
    -	Added changes as per Leon's suggestion of removing port
    validity check from ib_get_cached_subnet_prefix().
    -	Split the v1 patch in 3 patches as per Leon's suggestion.
v2 -> v3:
    -	Added some formatting changes per Leon's suggestions
    and removed return from ib_get_cached_subnet_prefix.

---
 drivers/infiniband/core/cache.c     |  6 +-----
 drivers/infiniband/core/core_priv.h |  2 +-
 drivers/infiniband/core/device.c    | 13 ++++---------
 drivers/infiniband/core/security.c  |  7 ++-----
 4 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index 3b0991fedd81..e957f0c915a3 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1069,19 +1069,15 @@ int ib_get_cached_pkey(struct ib_device *device,
 }
 EXPORT_SYMBOL(ib_get_cached_pkey);
 
-int ib_get_cached_subnet_prefix(struct ib_device *device, u32 port_num,
+void ib_get_cached_subnet_prefix(struct ib_device *device, u32 port_num,
 				u64 *sn_pfx)
 {
 	unsigned long flags;
 
-	if (!rdma_is_port_valid(device, port_num))
-		return -EINVAL;
-
 	read_lock_irqsave(&device->cache_lock, flags);
 	*sn_pfx = device->port_data[port_num].cache.subnet_prefix;
 	read_unlock_irqrestore(&device->cache_lock, flags);
 
-	return 0;
 }
 EXPORT_SYMBOL(ib_get_cached_subnet_prefix);
 
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 29809dd30041..0b23f50fa958 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -214,7 +214,7 @@ int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
 			     struct nlmsghdr *nlh,
 			     struct netlink_ext_ack *extack);
 
-int ib_get_cached_subnet_prefix(struct ib_device *device,
+void ib_get_cached_subnet_prefix(struct ib_device *device,
 				u32 port_num,
 				u64 *sn_pfx);
 
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index c660cef66ac6..595128b26c34 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -886,15 +886,10 @@ static void ib_policy_change_task(struct work_struct *work)
 
 		rdma_for_each_port (dev, i) {
 			u64 sp;
-			int ret = ib_get_cached_subnet_prefix(dev,
-							      i,
-							      &sp);
-
-			WARN_ONCE(ret,
-				  "ib_get_cached_subnet_prefix err: %d, this should never happen here\n",
-				  ret);
-			if (!ret)
-				ib_security_cache_change(dev, i, sp);
+
+			ib_get_cached_subnet_prefix(dev, i, &sp);
+
+			ib_security_cache_change(dev, i, sp);
 		}
 	}
 	up_read(&devices_rwsem);
diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index e5a78d1a63c9..543391273b82 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -72,7 +72,7 @@ static int get_pkey_and_subnet_prefix(struct ib_port_pkey *pp,
 	if (ret)
 		return ret;
 
-	ret = ib_get_cached_subnet_prefix(dev, pp->port_num, subnet_prefix);
+	ib_get_cached_subnet_prefix(dev, pp->port_num, subnet_prefix);
 
 	return ret;
 }
@@ -664,10 +664,7 @@ static int ib_security_pkey_access(struct ib_device *dev,
 	if (ret)
 		return ret;
 
-	ret = ib_get_cached_subnet_prefix(dev, port_num, &subnet_prefix);
-
-	if (ret)
-		return ret;
+	ib_get_cached_subnet_prefix(dev, port_num, &subnet_prefix);
 
 	return security_ib_pkey_access(sec, subnet_prefix, pkey);
 }
-- 
2.27.0



* [PATCH v3 2/3] IB/core: Shuffle locks in ib_port_data to save memory
  2021-06-09  5:55 [PATCH v3 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
  2021-06-09  5:55 ` [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
@ 2021-06-09  5:55 ` Anand Khoje
  2021-06-09  5:55 ` [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
  2 siblings, 0 replies; 15+ messages in thread
From: Anand Khoje @ 2021-06-09  5:55 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

pahole shows two 4-byte holes in struct ib_port_data after
pkey_list_lock and netdev_lock respectively.

Moving netdev_lock so that it directly follows pkey_list_lock
shaves eight bytes off the struct.

Suggested-by: Haakon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

---

v1 -> v2:
    -	Split the v1 patch in 3 patches as per Leon's suggestion.
v2 -> v3:
    -	No changes.

---
 include/rdma/ib_verbs.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 7e2f3699b898..41cbec516424 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2175,11 +2175,13 @@ struct ib_port_data {
 	struct ib_port_immutable immutable;
 
 	spinlock_t pkey_list_lock;
+
+	spinlock_t netdev_lock;
+
 	struct list_head pkey_list;
 
 	struct ib_port_cache cache;
 
-	spinlock_t netdev_lock;
 	struct net_device __rcu *netdev;
 	struct hlist_node ndev_hash_link;
 	struct rdma_port_counter port_counter;
-- 
2.27.0
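
The effect of this reordering can be reproduced in userspace. The sketch
below is not the kernel structure: fake_spinlock_t and both struct layouts
are hypothetical stand-ins, assuming a 4-byte lock (which is what the 4-byte
holes reported by pahole suggest) and an ABI where uint64_t is 8-byte
aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 4-byte spinlock_t. */
typedef struct { uint32_t raw; } fake_spinlock_t;

static_assert(sizeof(fake_spinlock_t) == 4, "sketch assumes a 4-byte lock");

/* Before: each 4-byte lock is followed by an 8-byte-aligned member,
 * so the compiler inserts a 4-byte hole after each lock. */
struct port_data_before {
	fake_spinlock_t pkey_list_lock;	/* 4 bytes, then a 4-byte hole */
	uint64_t pkey_list[2];		/* stands in for struct list_head */
	uint64_t cache;			/* stands in for struct ib_port_cache */
	fake_spinlock_t netdev_lock;	/* 4 bytes, then a 4-byte hole */
	uint64_t netdev;		/* stands in for the netdev pointer */
};

/* After: the two locks sit side by side and fill one 8-byte slot. */
struct port_data_after {
	fake_spinlock_t pkey_list_lock;
	fake_spinlock_t netdev_lock;	/* occupies the former hole */
	uint64_t pkey_list[2];
	uint64_t cache;
	uint64_t netdev;
};

/* Number of bytes recovered by the reordering. */
static size_t bytes_saved(void)
{
	return sizeof(struct port_data_before) -
	       sizeof(struct port_data_after);
}
```

Under the stated assumptions, bytes_saved() reports the eight bytes
mentioned in the commit message; running pahole on the real object file
remains the authoritative check.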



* [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-09  5:55 [PATCH v3 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
  2021-06-09  5:55 ` [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
  2021-06-09  5:55 ` [PATCH v3 2/3] IB/core: Shuffle locks in ib_port_data to save memory Anand Khoje
@ 2021-06-09  5:55 ` Anand Khoje
  2021-06-09  8:36   ` Leon Romanovsky
  2 siblings, 1 reply; 15+ messages in thread
From: Anand Khoje @ 2021-06-09  5:55 UTC (permalink / raw)
  To: linux-rdma, linux-kernel; +Cc: dledford, jgg, haakon.bugge, leon

ib_query_port() calls device->ops.query_port() to get the port
attributes. The method of querying is device driver specific.
The same function calls device->ops.query_gid() to get the GID and
extract the subnet_prefix (gid_prefix).

The GID and subnet_prefix are stored in a cache. But they do not get
read from the cache if the device is an Infiniband device. The
following change takes advantage of the cached subnet_prefix.
Testing with RDBMS has shown a significant improvement in performance
with this change.

The function ib_cache_is_initialised() is introduced because
ib_query_port() is also called early, while the port immutable
properties are being read and before the cache has been built.

In that case, the default GID still gets read from HCA for IB link-
layer devices.

Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>

---

v1 -> v2:
    -	Split the v1 patch in 3 patches as per Leon's suggestion.

v2 -> v3:
    -	Added changes as per Mark Zhang's suggestion of clearing
    	flags in gid_table_cleanup_one().

---
 drivers/infiniband/core/cache.c  | 7 ++++++-
 drivers/infiniband/core/device.c | 9 +++++++++
 include/rdma/ib_cache.h          | 6 ++++++
 include/rdma/ib_verbs.h          | 6 ++++++
 4 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index e957f0c915a3..94a8653a72c5 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -917,9 +917,12 @@ static void gid_table_cleanup_one(struct ib_device *ib_dev)
 {
 	u32 p;
 
-	rdma_for_each_port (ib_dev, p)
+	rdma_for_each_port (ib_dev, p) {
+		clear_bit(IB_PORT_CACHE_INITIALIZED,
+			&ib_dev->port_data[p].flags);
 		cleanup_gid_table_port(ib_dev, p,
 				       ib_dev->port_data[p].cache.gid);
+	}
 }
 
 static int gid_table_setup_one(struct ib_device *ib_dev)
@@ -1623,6 +1626,8 @@ int ib_cache_setup_one(struct ib_device *device)
 		err = ib_cache_update(device, p, true);
 		if (err)
 			return err;
+		set_bit(IB_PORT_CACHE_INITIALIZED,
+			&device->port_data[p].flags);
 	}
 
 	return 0;
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 595128b26c34..e8e7b0a61411 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2059,6 +2059,15 @@ static int __ib_query_port(struct ib_device *device,
 	    IB_LINK_LAYER_INFINIBAND)
 		return 0;
 
+	if (!ib_cache_is_initialised(device, port_num))
+		goto query_gid_from_device;
+
+	ib_get_cached_subnet_prefix(device, port_num,
+				    &port_attr->subnet_prefix);
+
+	return 0;
+
+query_gid_from_device:
 	err = device->ops.query_gid(device, port_num, 0, &gid);
 	if (err)
 		return err;
diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
index 226ae3702d8a..1526fc6637eb 100644
--- a/include/rdma/ib_cache.h
+++ b/include/rdma/ib_cache.h
@@ -114,4 +114,10 @@ ssize_t rdma_query_gid_table(struct ib_device *device,
 			     struct ib_uverbs_gid_entry *entries,
 			     size_t max_entries);
 
+static inline bool ib_cache_is_initialised(struct ib_device *device,
+					  u8 port_num)
+{
+	return test_bit(IB_PORT_CACHE_INITIALIZED,
+			&device->port_data[port_num].flags);
+}
 #endif /* _IB_CACHE_H */
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 41cbec516424..ad2a55e3a2ee 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2169,6 +2169,10 @@ struct ib_port_immutable {
 	u32                           max_mad_size;
 };
 
+enum ib_port_data_flags {
+	IB_PORT_CACHE_INITIALIZED = 1 << 0,
+};
+
 struct ib_port_data {
 	struct ib_device *ib_dev;
 
@@ -2178,6 +2182,8 @@ struct ib_port_data {
 
 	spinlock_t netdev_lock;
 
+	unsigned long flags;
+
 	struct list_head pkey_list;
 
 	struct ib_port_cache cache;
-- 
2.27.0
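
The control flow added to __ib_query_port() can be illustrated with a small
userspace model. Everything below is a hypothetical mock, not kernel code:
struct sketch_port, query_prefix_from_device() (standing in for
device->ops.query_gid()) and the setup/cleanup helpers are assumptions made
for illustration, and the sketch is single-threaded, so it omits the
cache_lock and the atomic bit operations used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_port {
	bool cache_initialized;	 /* set once the cache entry is populated */
	uint64_t cached_prefix;	 /* meaningful only if cache_initialized */
	uint64_t hw_prefix;	 /* value the mock device reports */
	unsigned int hw_queries; /* counts trips down the slow path */
};

/* Slow path: mock of asking the device for the GID prefix. */
static uint64_t query_prefix_from_device(struct sketch_port *p)
{
	assert(p != NULL);
	p->hw_queries++;
	return p->hw_prefix;
}

/* Fill the cache entry first, then mark it valid. */
static void cache_setup(struct sketch_port *p)
{
	p->cached_prefix = query_prefix_from_device(p);
	p->cache_initialized = true;
}

/* Mark the entry invalid before the cache is torn down. */
static void cache_cleanup(struct sketch_port *p)
{
	assert(p != NULL);
	p->cache_initialized = false;
}

/* Fast path when the cache is valid, device query otherwise. */
static uint64_t get_subnet_prefix(struct sketch_port *p)
{
	assert(p != NULL);
	if (p->cache_initialized)
		return p->cached_prefix;
	return query_prefix_from_device(p);
}
```

In this model, a lookup before cache_setup() goes to the mock device, while
lookups after it leave hw_queries unchanged until cache_cleanup() clears the
flag again.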



* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-09  5:55 ` [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
@ 2021-06-09  8:36   ` Leon Romanovsky
  2021-06-09  9:26     ` Anand Khoje
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2021-06-09  8:36 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, jgg, haakon.bugge

On Wed, Jun 09, 2021 at 11:25:34AM +0530, Anand Khoje wrote:
> ib_query_port() calls device->ops.query_port() to get the port
> attributes. The method of querying is device driver specific.
> The same function calls device->ops.query_gid() to get the GID and
> extract the subnet_prefix (gid_prefix).
> 
> The GID and subnet_prefix are stored in a cache. But they do not get
> read from the cache if the device is an Infiniband device. The
> following change takes advantage of the cached subnet_prefix.
> Testing with RDBMS has shown a significant improvement in performance
> with this change.
> 
> The function ib_cache_is_initialised() is introduced because
> ib_query_port() is also called early, while the port immutable
> properties are being read and before the cache has been built.
> 
> In that case, the default GID still gets read from HCA for IB link-
> layer devices.
> 
> Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
> Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
> Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
> 
> ---
> 
> v1 -> v2:
>     -	Split the v1 patch in 3 patches as per Leon's suggestion.
> 
> v2 -> v3:
>     -	Added changes as per Mark Zhang's suggestion of clearing
>     	flags in gid_table_cleanup_one().
> 
> ---
>  drivers/infiniband/core/cache.c  | 7 ++++++-
>  drivers/infiniband/core/device.c | 9 +++++++++
>  include/rdma/ib_cache.h          | 6 ++++++
>  include/rdma/ib_verbs.h          | 6 ++++++
>  4 files changed, 27 insertions(+), 1 deletion(-)

Why did you use the clear_bit/test_bit API? I would expect it for a
bitmap, but for such a simple thing, a plain "u8 is_cached_init : 1;"
will do the same trick.

Thanks

> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index e957f0c915a3..94a8653a72c5 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -917,9 +917,12 @@ static void gid_table_cleanup_one(struct ib_device *ib_dev)
>  {
>  	u32 p;
>  
> -	rdma_for_each_port (ib_dev, p)
> +	rdma_for_each_port (ib_dev, p) {
> +		clear_bit(IB_PORT_CACHE_INITIALIZED,
> +			&ib_dev->port_data[p].flags);
>  		cleanup_gid_table_port(ib_dev, p,
>  				       ib_dev->port_data[p].cache.gid);
> +	}
>  }
>  
>  static int gid_table_setup_one(struct ib_device *ib_dev)
> @@ -1623,6 +1626,8 @@ int ib_cache_setup_one(struct ib_device *device)
>  		err = ib_cache_update(device, p, true);
>  		if (err)
>  			return err;
> +		set_bit(IB_PORT_CACHE_INITIALIZED,
> +			&device->port_data[p].flags);
>  	}
>  
>  	return 0;
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 595128b26c34..e8e7b0a61411 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -2059,6 +2059,15 @@ static int __ib_query_port(struct ib_device *device,
>  	    IB_LINK_LAYER_INFINIBAND)
>  		return 0;
>  
> +	if (!ib_cache_is_initialised(device, port_num))
> +		goto query_gid_from_device;
> +
> +	ib_get_cached_subnet_prefix(device, port_num,
> +				    &port_attr->subnet_prefix);
> +
> +	return 0;
> +
> +query_gid_from_device:
>  	err = device->ops.query_gid(device, port_num, 0, &gid);
>  	if (err)
>  		return err;
> diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
> index 226ae3702d8a..1526fc6637eb 100644
> --- a/include/rdma/ib_cache.h
> +++ b/include/rdma/ib_cache.h
> @@ -114,4 +114,10 @@ ssize_t rdma_query_gid_table(struct ib_device *device,
>  			     struct ib_uverbs_gid_entry *entries,
>  			     size_t max_entries);
>  
> +static inline bool ib_cache_is_initialised(struct ib_device *device,
> +					  u8 port_num)
> +{
> +	return test_bit(IB_PORT_CACHE_INITIALIZED,
> +			&device->port_data[port_num].flags);
> +}
>  #endif /* _IB_CACHE_H */
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 41cbec516424..ad2a55e3a2ee 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -2169,6 +2169,10 @@ struct ib_port_immutable {
>  	u32                           max_mad_size;
>  };
>  
> +enum ib_port_data_flags {
> +	IB_PORT_CACHE_INITIALIZED = 1 << 0,
> +};
> +
>  struct ib_port_data {
>  	struct ib_device *ib_dev;
>  
> @@ -2178,6 +2182,8 @@ struct ib_port_data {
>  
>  	spinlock_t netdev_lock;
>  
> +	unsigned long flags;
> +
>  	struct list_head pkey_list;
>  
>  	struct ib_port_cache cache;
> -- 
> 2.27.0
> 


* Re: [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix
  2021-06-09  5:55 ` [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
@ 2021-06-09  8:37   ` Leon Romanovsky
  0 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2021-06-09  8:37 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, jgg, haakon.bugge

On Wed, Jun 09, 2021 at 11:25:32AM +0530, Anand Khoje wrote:
> Removed the port validity check from ib_get_cached_subnet_prefix(),
> as this check is not needed because "port_num" is always valid.
> 
> Suggested-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
> Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
> 
> ---
> 
> v1 -> v2:
>     -	Added changes as per Leon's suggestion of removing port
>     validity check from ib_get_cached_subnet_prefix().
>     -	Split the v1 patch in 3 patches as per Leon's suggestion.
> v2 -> v3:
>     -	Added some formatting changes per Leon's suggestions
>     and removed return from ib_get_cached_subnet_prefix.
> 
> ---
>  drivers/infiniband/core/cache.c     |  6 +-----
>  drivers/infiniband/core/core_priv.h |  2 +-
>  drivers/infiniband/core/device.c    | 13 ++++---------
>  drivers/infiniband/core/security.c  |  7 ++-----
>  4 files changed, 8 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index 3b0991fedd81..e957f0c915a3 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -1069,19 +1069,15 @@ int ib_get_cached_pkey(struct ib_device *device,
>  }
>  EXPORT_SYMBOL(ib_get_cached_pkey);
>  
> -int ib_get_cached_subnet_prefix(struct ib_device *device, u32 port_num,
> +void ib_get_cached_subnet_prefix(struct ib_device *device, u32 port_num,
>  				u64 *sn_pfx)
>  {
>  	unsigned long flags;
>  
> -	if (!rdma_is_port_valid(device, port_num))
> -		return -EINVAL;
> -
>  	read_lock_irqsave(&device->cache_lock, flags);
>  	*sn_pfx = device->port_data[port_num].cache.subnet_prefix;
>  	read_unlock_irqrestore(&device->cache_lock, flags);
>  
> -	return 0;
>  }
>  EXPORT_SYMBOL(ib_get_cached_subnet_prefix);
>  
> diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
> index 29809dd30041..0b23f50fa958 100644
> --- a/drivers/infiniband/core/core_priv.h
> +++ b/drivers/infiniband/core/core_priv.h
> @@ -214,7 +214,7 @@ int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
>  			     struct nlmsghdr *nlh,
>  			     struct netlink_ext_ack *extack);
>  
> -int ib_get_cached_subnet_prefix(struct ib_device *device,
> +void ib_get_cached_subnet_prefix(struct ib_device *device,
>  				u32 port_num,
>  				u64 *sn_pfx);
>  
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index c660cef66ac6..595128b26c34 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -886,15 +886,10 @@ static void ib_policy_change_task(struct work_struct *work)
>  
>  		rdma_for_each_port (dev, i) {
>  			u64 sp;
> -			int ret = ib_get_cached_subnet_prefix(dev,
> -							      i,
> -							      &sp);
> -
> -			WARN_ONCE(ret,
> -				  "ib_get_cached_subnet_prefix err: %d, this should never happen here\n",
> -				  ret);
> -			if (!ret)
> -				ib_security_cache_change(dev, i, sp);
> +
> +			ib_get_cached_subnet_prefix(dev, i, &sp);
> +
> +			ib_security_cache_change(dev, i, sp);

nitpick, the blank line is not needed.

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>


* RE: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-09  8:36   ` Leon Romanovsky
@ 2021-06-09  9:26     ` Anand Khoje
  2021-06-09 10:40       ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Anand Khoje @ 2021-06-09  9:26 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma, linux-kernel, dledford, jgg, Haakon Bugge

Hi Leon,

The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
Also, using set_bit()/clear_bit() ensures that the operations on this bit are atomic.

Thanks,
Anand
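
One detail worth keeping in mind when extending the enum: the kernel's
set_bit(), clear_bit() and test_bit() take a bit number, not a mask. The
sketch below uses hypothetical, non-atomic userspace stand-ins
(sketch_set_bit() and friends) and hypothetical enum names to show what a
value written as "1 << 0" means when passed as a bit number.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Non-atomic stand-ins that, like the kernel helpers, take a bit NUMBER. */
static void sketch_set_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

static void sketch_clear_bit(unsigned int nr, unsigned long *addr)
{
	addr[nr / SKETCH_BITS_PER_LONG] &= ~(1UL << (nr % SKETCH_BITS_PER_LONG));
}

static bool sketch_test_bit(unsigned int nr, const unsigned long *addr)
{
	return (addr[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Mask-style values: 1 and 2. Used as bit numbers they select bits 1 and 2. */
enum sketch_mask_style {
	SKETCH_MASK_FLAG_A = 1 << 0,
	SKETCH_MASK_FLAG_B = 1 << 1,
};

/* Index-style values: 0 and 1. Used as bit numbers they select bits 0 and 1. */
enum sketch_index_style {
	SKETCH_INDEX_FLAG_A = 0,
	SKETCH_INDEX_FLAG_B = 1,
};
```

Because the patch uses the same constant for set_bit(), clear_bit() and
test_bit(), the flag behaves consistently as posted; the distinction only
starts to matter once further flags are added in the mask style.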

-----Original Message-----
From: Leon Romanovsky <leon@kernel.org> 
Sent: Wednesday, June 9, 2021 2:06 PM
To: Anand Khoje <anand.a.khoje@oracle.com>
Cc: linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org; dledford@redhat.com; jgg@ziepe.ca; Haakon Bugge <haakon.bugge@oracle.com>
Subject: Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.

On Wed, Jun 09, 2021 at 11:25:34AM +0530, Anand Khoje wrote:
> ib_query_port() calls device->ops.query_port() to get the port 
> attributes. The method of querying is device driver specific.
> The same function calls device->ops.query_gid() to get the GID and 
> extract the subnet_prefix (gid_prefix).
> 
> The GID and subnet_prefix are stored in a cache. But they do not get 
> read from the cache if the device is an Infiniband device. The 
> following change takes advantage of the cached subnet_prefix.
> Testing with RDBMS has shown a significant improvement in performance 
> with this change.
> 
> The function ib_cache_is_initialised() is introduced because
> ib_query_port() is also called early, while the port immutable
> properties are being read and before the cache has been built.
> 
> In that case, the default GID still gets read from HCA for IB link- 
> layer devices.
> 
> Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
> Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
> Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
> 
> ---
> 
> v1 -> v2:
>     -	Split the v1 patch in 3 patches as per Leon's suggestion.
> 
> v2 -> v3:
>     -	Added changes as per Mark Zhang's suggestion of clearing
>     	flags in gid_table_cleanup_one().
> 
> ---
>  drivers/infiniband/core/cache.c  | 7 ++++++-
>  drivers/infiniband/core/device.c | 9 +++++++++
>  include/rdma/ib_cache.h          | 6 ++++++
>  include/rdma/ib_verbs.h          | 6 ++++++
>  4 files changed, 27 insertions(+), 1 deletion(-)

Why did you use the clear_bit/test_bit API? I would expect it for a bitmap, but for such a simple thing, a plain "u8 is_cached_init : 1;"
will do the same trick.

Thanks

> 
> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> index e957f0c915a3..94a8653a72c5 100644
> --- a/drivers/infiniband/core/cache.c
> +++ b/drivers/infiniband/core/cache.c
> @@ -917,9 +917,12 @@ static void gid_table_cleanup_one(struct ib_device *ib_dev)
>  {
>  	u32 p;
>  
> -	rdma_for_each_port (ib_dev, p)
> +	rdma_for_each_port (ib_dev, p) {
> +		clear_bit(IB_PORT_CACHE_INITIALIZED,
> +			&ib_dev->port_data[p].flags);
>  		cleanup_gid_table_port(ib_dev, p,
>  				       ib_dev->port_data[p].cache.gid);
> +	}
>  }
>  
>  static int gid_table_setup_one(struct ib_device *ib_dev)
> @@ -1623,6 +1626,8 @@ int ib_cache_setup_one(struct ib_device *device)
>  		err = ib_cache_update(device, p, true);
>  		if (err)
>  			return err;
> +		set_bit(IB_PORT_CACHE_INITIALIZED,
> +			&device->port_data[p].flags);
>  	}
>  
>  	return 0;
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 595128b26c34..e8e7b0a61411 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -2059,6 +2059,15 @@ static int __ib_query_port(struct ib_device *device,
>  	    IB_LINK_LAYER_INFINIBAND)
>  		return 0;
>  
> +	if (!ib_cache_is_initialised(device, port_num))
> +		goto query_gid_from_device;
> +
> +	ib_get_cached_subnet_prefix(device, port_num,
> +				    &port_attr->subnet_prefix);
> +
> +	return 0;
> +
> +query_gid_from_device:
>  	err = device->ops.query_gid(device, port_num, 0, &gid);
>  	if (err)
>  		return err;
> diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
> index 226ae3702d8a..1526fc6637eb 100644
> --- a/include/rdma/ib_cache.h
> +++ b/include/rdma/ib_cache.h
> @@ -114,4 +114,10 @@ ssize_t rdma_query_gid_table(struct ib_device *device,
>  			     struct ib_uverbs_gid_entry *entries,
>  			     size_t max_entries);
>  
> +static inline bool ib_cache_is_initialised(struct ib_device *device,
> +					  u8 port_num)
> +{
> +	return test_bit(IB_PORT_CACHE_INITIALIZED,
> +			&device->port_data[port_num].flags);
> +}
>  #endif /* _IB_CACHE_H */
> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> index 41cbec516424..ad2a55e3a2ee 100644
> --- a/include/rdma/ib_verbs.h
> +++ b/include/rdma/ib_verbs.h
> @@ -2169,6 +2169,10 @@ struct ib_port_immutable {
>  	u32                           max_mad_size;
>  };
>  
> +enum ib_port_data_flags {
> +	IB_PORT_CACHE_INITIALIZED = 1 << 0,
> +};
> +
>  struct ib_port_data {
>  	struct ib_device *ib_dev;
>  
> @@ -2178,6 +2182,8 @@ struct ib_port_data {
>  
>  	spinlock_t netdev_lock;
>  
> +	unsigned long flags;
> +
>  	struct list_head pkey_list;
>  
>  	struct ib_port_cache cache;
> --
> 2.27.0
> 


* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-09  9:26     ` Anand Khoje
@ 2021-06-09 10:40       ` Leon Romanovsky
  2021-06-14  3:32         ` Haakon Bugge
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2021-06-09 10:40 UTC (permalink / raw)
  To: Anand Khoje; +Cc: linux-rdma, linux-kernel, dledford, jgg, Haakon Bugge

On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
> Hi Leon,

Please don't do top-posting.


> 
> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
> Also, using set_bit()/clear_bit() ensures that the operations on this bit are atomic.

Bitfield variables are better suited to this use case.
Let's not overcomplicate the code without a reason.

Thanks
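
The two options under discussion can be put side by side in a userspace
sketch. All names below are hypothetical, and C11 atomics stand in for the
kernel's atomic bit operations. Adjacent bit-fields share one storage unit,
so a store to one of them is generally a read-modify-write of that unit and
relies on writers being serialized; an atomic flags word does not have that
requirement. Whether this matters for the cache flag depends on how it is
written, which the thread does not settle.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Option 1: plain bit-fields packed into a single unsigned int. */
struct sketch_bitfields {
	unsigned int cache_initialized : 1;
	unsigned int other_flag : 1;	/* hypothetical later addition */
};

/* Option 2: bit numbers into one atomically updated word. */
enum {
	SKETCH_CACHE_INITIALIZED = 0,
	SKETCH_OTHER_FLAG = 1,		/* hypothetical later addition */
};

struct sketch_atomic_flags {
	atomic_ulong flags;
};

/* Atomically set bit nr. */
static void sketch_flag_set(struct sketch_atomic_flags *s, unsigned int nr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_or(&s->flags, 1UL << nr);
}

/* Atomically clear bit nr. */
static void sketch_flag_clear(struct sketch_atomic_flags *s, unsigned int nr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_and(&s->flags, ~(1UL << nr));
}

/* Atomically read bit nr. */
static bool sketch_flag_test(struct sketch_atomic_flags *s, unsigned int nr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	return (atomic_load(&s->flags) >> nr) & 1UL;
}
```

The bit-field version is shorter to read and write; the atomic word costs a
few helper calls but keeps concurrent updates of different flags from
interfering with each other.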

> 
> Thanks,
> Anand
> 
> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org> 
> Sent: Wednesday, June 9, 2021 2:06 PM
> To: Anand Khoje <anand.a.khoje@oracle.com>
> Cc: linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org; dledford@redhat.com; jgg@ziepe.ca; Haakon Bugge <haakon.bugge@oracle.com>
> Subject: Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
> 
> On Wed, Jun 09, 2021 at 11:25:34AM +0530, Anand Khoje wrote:
> > ib_query_port() calls device->ops.query_port() to get the port 
> > attributes. The method of querying is device driver specific.
> > The same function calls device->ops.query_gid() to get the GID and 
> > extract the subnet_prefix (gid_prefix).
> > 
> > The GID and subnet_prefix are stored in a cache. But they do not get 
> > read from the cache if the device is an Infiniband device. The 
> > following change takes advantage of the cached subnet_prefix.
> > Testing with RDBMS has shown a significant improvement in performance 
> > with this change.
> > 
> > The function ib_cache_is_initialised() is introduced because
> > ib_query_port() is also called early, while the port immutable
> > properties are being read and before the cache has been built.
> > 
> > In that case, the default GID still gets read from HCA for IB link- 
> > layer devices.
> > 
> > Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
> > Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
> > Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
> > 
> > ---
> > 
> > v1 -> v2:
> >     -	Split the v1 patch in 3 patches as per Leon's suggestion.
> > 
> > v2 -> v3:
> >     -	Added changes as per Mark Zhang's suggestion of clearing
> >     	flags in gid_table_cleanup_one().
> > 
> > ---
> >  drivers/infiniband/core/cache.c  | 7 ++++++-
> >  drivers/infiniband/core/device.c | 9 +++++++++
> >  include/rdma/ib_cache.h          | 6 ++++++
> >  include/rdma/ib_verbs.h          | 6 ++++++
> >  4 files changed, 27 insertions(+), 1 deletion(-)
> 
> Why did you use the clear_bit/test_bit API? I would expect it for a bitmap, but for such a simple thing, a plain "u8 is_cached_init : 1;"
> will do the same trick.
> 
> Thanks
> 
> > 
> > diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
> > index e957f0c915a3..94a8653a72c5 100644
> > --- a/drivers/infiniband/core/cache.c
> > +++ b/drivers/infiniband/core/cache.c
> > @@ -917,9 +917,12 @@ static void gid_table_cleanup_one(struct ib_device *ib_dev)
> >  {
> >  	u32 p;
> >  
> > -	rdma_for_each_port (ib_dev, p)
> > +	rdma_for_each_port (ib_dev, p) {
> > +		clear_bit(IB_PORT_CACHE_INITIALIZED,
> > +			&ib_dev->port_data[p].flags);
> >  		cleanup_gid_table_port(ib_dev, p,
> >  				       ib_dev->port_data[p].cache.gid);
> > +	}
> >  }
> >  
> >  static int gid_table_setup_one(struct ib_device *ib_dev)
> > @@ -1623,6 +1626,8 @@ int ib_cache_setup_one(struct ib_device *device)
> >  		err = ib_cache_update(device, p, true);
> >  		if (err)
> >  			return err;
> > +		set_bit(IB_PORT_CACHE_INITIALIZED,
> > +			&device->port_data[p].flags);
> >  	}
> >  
> >  	return 0;
> > diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> > index 595128b26c34..e8e7b0a61411 100644
> > --- a/drivers/infiniband/core/device.c
> > +++ b/drivers/infiniband/core/device.c
> > @@ -2059,6 +2059,15 @@ static int __ib_query_port(struct ib_device *device,
> >  	    IB_LINK_LAYER_INFINIBAND)
> >  		return 0;
> >  
> > +	if (!ib_cache_is_initialised(device, port_num))
> > +		goto query_gid_from_device;
> > +
> > +	ib_get_cached_subnet_prefix(device, port_num,
> > +				    &port_attr->subnet_prefix);
> > +
> > +	return 0;
> > +
> > +query_gid_from_device:
> >  	err = device->ops.query_gid(device, port_num, 0, &gid);
> >  	if (err)
> >  		return err;
> > diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
> > index 226ae3702d8a..1526fc6637eb 100644
> > --- a/include/rdma/ib_cache.h
> > +++ b/include/rdma/ib_cache.h
> > @@ -114,4 +114,10 @@ ssize_t rdma_query_gid_table(struct ib_device *device,
> >  			     struct ib_uverbs_gid_entry *entries,
> >  			     size_t max_entries);
> >  
> > +static inline bool ib_cache_is_initialised(struct ib_device *device,
> > +					  u8 port_num)
> > +{
> > +	return test_bit(IB_PORT_CACHE_INITIALIZED,
> > +			&device->port_data[port_num].flags);
> > +}
> >  #endif /* _IB_CACHE_H */
> > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> > index 41cbec516424..ad2a55e3a2ee 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -2169,6 +2169,10 @@ struct ib_port_immutable {
> >  	u32                           max_mad_size;
> >  };
> >  
> > +enum ib_port_data_flags {
> > +	IB_PORT_CACHE_INITIALIZED = 1 << 0,
> > +};
> > +
> >  struct ib_port_data {
> >  	struct ib_device *ib_dev;
> >  
> > @@ -2178,6 +2182,8 @@ struct ib_port_data {
> >  
> >  	spinlock_t netdev_lock;
> >  
> > +	unsigned long flags;
> > +
> >  	struct list_head pkey_list;
> >  
> >  	struct ib_port_cache cache;
> > --
> > 2.27.0
> > 

* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-09 10:40       ` Leon Romanovsky
@ 2021-06-14  3:32         ` Haakon Bugge
  2021-06-14  7:25           ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Haakon Bugge @ 2021-06-14  3:32 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg



> On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
>> Hi Leon,
> 
> Please don't do top-posting.
> 
> 
>> 
>> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
>> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
> 
> Bitfield variables suit this use case better.
> Let's not overcomplicate the code without a reason.

The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.

What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.
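
To illustrate the lost-update concern, here is a minimal userspace sketch in C11. It is not kernel code, and it replays the problematic interleaving deterministically on one thread rather than racing real threads. The names racy_rmw_replay(), atomic_replay(), sketch_set_bit() and the SKETCH_* bit numbers are assumptions made for this example only; sketch_set_bit() merely mimics the kernel set_bit(nr, addr) convention of taking a bit number rather than a mask.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical bit numbers standing in for two adjacent one-bit flags. */
enum { SKETCH_TOS_SET = 0, SKETCH_TIMEOUT_SET = 1 };

/*
 * Replay, on a single thread, the interleaving in which two CPUs each do a
 * plain load/modify/store on the byte that holds both flags.
 */
unsigned char racy_rmw_replay(void)
{
	unsigned char shared = 0;
	unsigned char cpu_a = shared;	/* CPU A loads the byte */
	unsigned char cpu_b = shared;	/* CPU B loads the same stale byte */

	cpu_a = (unsigned char)(cpu_a | (1u << SKETCH_TOS_SET));
	cpu_b = (unsigned char)(cpu_b | (1u << SKETCH_TIMEOUT_SET));

	shared = cpu_a;			/* A stores: first flag is visible */
	shared = cpu_b;			/* B stores: A's update is overwritten */
	return shared;
}

/* Atomic OR of one bit, addressed by bit number (not by mask). */
void sketch_set_bit(unsigned int nr, atomic_ulong *word)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_or(word, 1UL << nr);
}

/* The same two updates, each done as one indivisible read-modify-write. */
unsigned long atomic_replay(void)
{
	atomic_ulong flags;

	atomic_init(&flags, 0UL);
	sketch_set_bit(SKETCH_TOS_SET, &flags);
	sketch_set_bit(SKETCH_TIMEOUT_SET, &flags);
	return atomic_load(&flags);
}
```

In this replay, racy_rmw_replay() ends with only the second flag set, while atomic_replay() keeps both. Since the bit helpers take a bit number, an enum intended for them would normally hold values such as 0, 1, 2 rather than shifted masks.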


Thxs, Håkon

> 
> Thanks
> 
>> 
>> Thanks,
>> Anand
>> 
>> -----Original Message-----
>> From: Leon Romanovsky <leon@kernel.org> 
>> Sent: Wednesday, June 9, 2021 2:06 PM
>> To: Anand Khoje <anand.a.khoje@oracle.com>
>> Cc: linux-rdma@vger.kernel.org; linux-kernel@vger.kernel.org; dledford@redhat.com; jgg@ziepe.ca; Haakon Bugge <haakon.bugge@oracle.com>
>> Subject: Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
>> 
>> On Wed, Jun 09, 2021 at 11:25:34AM +0530, Anand Khoje wrote:
>>> ib_query_port() calls device->ops.query_port() to get the port 
>>> attributes. The method of querying is device driver specific.
>>> The same function calls device->ops.query_gid() to get the GID and 
>>> extract the subnet_prefix (gid_prefix).
>>> 
>>> The GID and subnet_prefix are stored in a cache. But they do not get 
>>> read from the cache if the device is an Infiniband device. The 
>>> following change takes advantage of the cached subnet_prefix.
>>> Testing with RDBMS has shown a significant improvement in performance 
>>> with this change.
>>> 
>>> The function ib_cache_is_initialised() is introduced because
>>> ib_query_port() gets called early, while the port immutable
>>> properties are being read and before the cache is built.
>>> 
>>> In that case, the default GID is still read from the HCA for IB
>>> link-layer devices.
>>> 
>>> Fixes: fad61ad ("IB/core: Add subnet prefix to port info")
>>> Signed-off-by: Anand Khoje <anand.a.khoje@oracle.com>
>>> Signed-off-by: Haakon Bugge <haakon.bugge@oracle.com>
>>> 
>>> ---
>>> 
>>> v1 -> v2:
>>>    -	Split the v1 patch in 3 patches as per Leon's suggestion.
>>> 
>>> v2 -> v3:
>>>    -	Added changes as per Mark Zhang's suggestion of clearing
>>>    	flags in gid_table_cleanup_one().
>>> 
>>> ---
>>> drivers/infiniband/core/cache.c  | 7 ++++++-  
>>> drivers/infiniband/core/device.c | 9 +++++++++
>>> include/rdma/ib_cache.h          | 6 ++++++
>>> include/rdma/ib_verbs.h          | 6 ++++++
>>> 4 files changed, 27 insertions(+), 1 deletion(-)
>> 
>> Why did you use the clear_bit/test_bit API? I would expect it for a bitmap, but for such a simple thing, a plain "u8 is_cached_init : 1;"
>> will do the trick.
>> 
>> Thanks
>> 
>>> 
>>> diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
>>> index e957f0c915a3..94a8653a72c5 100644
>>> --- a/drivers/infiniband/core/cache.c
>>> +++ b/drivers/infiniband/core/cache.c
>>> @@ -917,9 +917,12 @@ static void gid_table_cleanup_one(struct ib_device *ib_dev)
>>> {
>>> 	u32 p;
>>> 
>>> -	rdma_for_each_port (ib_dev, p)
>>> +	rdma_for_each_port (ib_dev, p) {
>>> +		clear_bit(IB_PORT_CACHE_INITIALIZED,
>>> +			&ib_dev->port_data[p].flags);
>>> 		cleanup_gid_table_port(ib_dev, p,
>>> 				       ib_dev->port_data[p].cache.gid);
>>> +	}
>>> }
>>> 
>>> static int gid_table_setup_one(struct ib_device *ib_dev)
>>> @@ -1623,6 +1626,8 @@ int ib_cache_setup_one(struct ib_device *device)
>>> 		err = ib_cache_update(device, p, true);
>>> 		if (err)
>>> 			return err;
>>> +		set_bit(IB_PORT_CACHE_INITIALIZED,
>>> +			&device->port_data[p].flags);
>>> 	}
>>> 
>>> 	return 0;
>>> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
>>> index 595128b26c34..e8e7b0a61411 100644
>>> --- a/drivers/infiniband/core/device.c
>>> +++ b/drivers/infiniband/core/device.c
>>> @@ -2059,6 +2059,15 @@ static int __ib_query_port(struct ib_device *device,
>>> 	    IB_LINK_LAYER_INFINIBAND)
>>> 		return 0;
>>> 
>>> +	if (!ib_cache_is_initialised(device, port_num))
>>> +		goto query_gid_from_device;
>>> +
>>> +	ib_get_cached_subnet_prefix(device, port_num,
>>> +				    &port_attr->subnet_prefix);
>>> +
>>> +	return 0;
>>> +
>>> +query_gid_from_device:
>>> 	err = device->ops.query_gid(device, port_num, 0, &gid);
>>> 	if (err)
>>> 		return err;
>>> diff --git a/include/rdma/ib_cache.h b/include/rdma/ib_cache.h
>>> index 226ae3702d8a..1526fc6637eb 100644
>>> --- a/include/rdma/ib_cache.h
>>> +++ b/include/rdma/ib_cache.h
>>> @@ -114,4 +114,10 @@ ssize_t rdma_query_gid_table(struct ib_device *device,
>>> 			     struct ib_uverbs_gid_entry *entries,
>>> 			     size_t max_entries);
>>> 
>>> +static inline bool ib_cache_is_initialised(struct ib_device *device,
>>> +					  u8 port_num)
>>> +{
>>> +	return test_bit(IB_PORT_CACHE_INITIALIZED,
>>> +			&device->port_data[port_num].flags);
>>> +}
>>> #endif /* _IB_CACHE_H */
>>> diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
>>> index 41cbec516424..ad2a55e3a2ee 100644
>>> --- a/include/rdma/ib_verbs.h
>>> +++ b/include/rdma/ib_verbs.h
>>> @@ -2169,6 +2169,10 @@ struct ib_port_immutable {
>>> 	u32                           max_mad_size;
>>> };
>>> 
>>> +enum ib_port_data_flags {
>>> +	IB_PORT_CACHE_INITIALIZED = 1 << 0,
>>> +};
>>> +
>>> struct ib_port_data {
>>> 	struct ib_device *ib_dev;
>>> 
>>> @@ -2178,6 +2182,8 @@ struct ib_port_data {
>>> 
>>> 	spinlock_t netdev_lock;
>>> 
>>> +	unsigned long flags;
>>> +
>>> 	struct list_head pkey_list;
>>> 
>>> 	struct ib_port_cache cache;
>>> --
>>> 2.27.0
>>> 


* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-14  3:32         ` Haakon Bugge
@ 2021-06-14  7:25           ` Leon Romanovsky
  2021-06-14 16:29             ` Haakon Bugge
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2021-06-14  7:25 UTC (permalink / raw)
  To: Haakon Bugge; +Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg

On Mon, Jun 14, 2021 at 03:32:39AM +0000, Haakon Bugge wrote:
> 
> 
> > On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
> > 
> > On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
> >> Hi Leon,
> > 
> > Please don't do top-posting.
> > 
> > 
> >> 
> >> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
> >> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
> > 
> > Bitfield variables suit this use case better.
> > Let's not overcomplicate the code without a reason.
> 
> The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.
> 
> What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.

We are talking about a device initialization flow that shouldn't be
performed in parallel with another initialization of the same device, so the
comparison to rdma-cm is not valid here.
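
For the initialization-only case, the flow in the patch can be sketched roughly as follows. This is a userspace C model under stated assumptions, not the kernel code: struct sketch_port and the sketch_* functions are hypothetical stand-ins for ib_port_data, device->ops.query_gid(), ib_cache_setup_one() and __ib_query_port(), and the flag is a plain one-bit bitfield that is written only during setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-port data discussed in the thread. */
struct sketch_port {
	uint64_t hw_subnet_prefix;	/* what the "HCA" would report */
	uint64_t cached_subnet_prefix;	/* copy kept in the cache */
	unsigned int hw_queries;	/* counts the expensive device calls */
	unsigned int is_cached_init : 1;	/* set once, at cache setup */
};

/* Models the expensive path: ask the device itself. */
uint64_t sketch_query_gid_prefix(struct sketch_port *port)
{
	assert(port != NULL);
	port->hw_queries++;
	return port->hw_subnet_prefix;
}

/* Models cache setup: fill the cache first, then mark it valid. */
void sketch_cache_setup(struct sketch_port *port)
{
	port->cached_subnet_prefix = sketch_query_gid_prefix(port);
	port->is_cached_init = 1;
}

/* Models the query: use the cache when valid, else fall back to the device. */
uint64_t sketch_query_port_prefix(struct sketch_port *port)
{
	assert(port != NULL);
	if (port->is_cached_init)
		return port->cached_subnet_prefix;
	return sketch_query_gid_prefix(port);
}
```

A query made before sketch_cache_setup() goes to the device; after setup, queries are answered from the cache and the device-call counter stops growing. A plain bitfield is only adequate here on the assumption that nothing else modifies the same storage unit concurrently.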

Thanks

* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-14  7:25           ` Leon Romanovsky
@ 2021-06-14 16:29             ` Haakon Bugge
  2021-06-15  5:08               ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Haakon Bugge @ 2021-06-14 16:29 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg



> On 14 Jun 2021, at 09:25, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Mon, Jun 14, 2021 at 03:32:39AM +0000, Haakon Bugge wrote:
>> 
>> 
>>> On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
>>> 
>>> On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
>>>> Hi Leon,
>>> 
>>> Please don't do top-posting.
>>> 
>>> 
>>>> 
>>>> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
>>>> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
>>> 
>>> Bitfield variables suit this use case better.
>>> Let's not overcomplicate the code without a reason.
>> 
>> The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.
>> 
>> What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.
> 
> We are talking about a device initialization flow that shouldn't be
> performed in parallel with another initialization of the same device, so the
> comparison to rdma-cm is not valid here.

I can agree to that. And it is probably not worthwhile to fix the bit-fields in rdma_id_private?


Thxs, Håkon


* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-14 16:29             ` Haakon Bugge
@ 2021-06-15  5:08               ` Leon Romanovsky
  2021-06-15 16:13                 ` Haakon Bugge
  0 siblings, 1 reply; 15+ messages in thread
From: Leon Romanovsky @ 2021-06-15  5:08 UTC (permalink / raw)
  To: Haakon Bugge; +Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg

On Mon, Jun 14, 2021 at 04:29:09PM +0000, Haakon Bugge wrote:
> 
> 
> > On 14 Jun 2021, at 09:25, Leon Romanovsky <leon@kernel.org> wrote:
> > 
> > On Mon, Jun 14, 2021 at 03:32:39AM +0000, Haakon Bugge wrote:
> >> 
> >> 
> >>> On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
> >>> 
> >>> On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
> >>>> Hi Leon,
> >>> 
> >>> Please don't do top-posting.
> >>> 
> >>> 
> >>>> 
> >>>> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
> >>>> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
> >>> 
> >>> Bitfield variables suit this use case better.
> >>> Let's not overcomplicate the code without a reason.
> >> 
> >> The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.
> >> 
> >> What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.
> > 
> > We are talking about a device initialization flow that shouldn't be
> > performed in parallel with another initialization of the same device, so the
> > comparison to rdma-cm is not valid here.
> 
> I can agree to that. And it is probably not worthwhile to fix the bit-fields in rdma_id_private?

Before reading this article [1], I would have said no, we don't need to fix them.
Now, I'm not so sure about that.

"He also notes that even though the design flaws are difficult to exploit
 on their own, they can be combined with the other flaws found to make for
 a much more serious problem."

and 

"In other words, people did notice this vulnerability and a defense was standardized,
 but in practice the defense was never adopted. This is a good example that security
 defenses must be adopted before attacks become practical."

Thanks

[1] https://lwn.net/Articles/856044/ - Holes in WiFi

> 
> 
> Thxs, Håkon
> 

* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-15  5:08               ` Leon Romanovsky
@ 2021-06-15 16:13                 ` Haakon Bugge
  2021-06-16 11:20                   ` Haakon Bugge
  0 siblings, 1 reply; 15+ messages in thread
From: Haakon Bugge @ 2021-06-15 16:13 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg



> On 15 Jun 2021, at 07:08, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Mon, Jun 14, 2021 at 04:29:09PM +0000, Haakon Bugge wrote:
>> 
>> 
>>> On 14 Jun 2021, at 09:25, Leon Romanovsky <leon@kernel.org> wrote:
>>> 
>>> On Mon, Jun 14, 2021 at 03:32:39AM +0000, Haakon Bugge wrote:
>>>> 
>>>> 
>>>>> On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
>>>>> 
>>>>> On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
>>>>>> Hi Leon,
>>>>> 
>>>>> Please don't do top-posting.
>>>>> 
>>>>> 
>>>>>> 
>>>>>> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
>>>>>> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
>>>>> 
>>>>> Bitfield variables suit this use case better.
>>>>> Let's not overcomplicate the code without a reason.
>>>> 
>>>> The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.
>>>> 
>>>> What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.
>>> 
>>> We are talking about a device initialization flow that shouldn't be
>>> performed in parallel with another initialization of the same device, so the
>>> comparison to rdma-cm is not valid here.
>> 
>> I can agree to that. And it is probably not worthwhile to fix the bit-fields in rdma_id_private?
> 
> Before reading this article [1], I would have said no, we don't need to fix them.
> Now, I'm not so sure about that.
> 
> "He also notes that even though the design flaws are difficult to exploit
> on their own, they can be combined with the other flaws found to make for
> a much more serious problem."
> 
> and 
> 
> "In other words, people did notice this vulnerability and a defense was standardized,
> but in practice the defense was never adopted. This is a good example that security
> defenses must be adopted before attacks become practical."

Let me send you a commit tomorrow. The last sentence you quoted above is ambiguous, as far as I can tell, but the intention is clear :-)


Thxs, Håkon


* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-15 16:13                 ` Haakon Bugge
@ 2021-06-16 11:20                   ` Haakon Bugge
  2021-06-16 12:43                     ` Leon Romanovsky
  0 siblings, 1 reply; 15+ messages in thread
From: Haakon Bugge @ 2021-06-16 11:20 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg



> On 15 Jun 2021, at 18:13, Haakon Bugge <haakon.bugge@oracle.com> wrote:
> 
> 
> 
>> On 15 Jun 2021, at 07:08, Leon Romanovsky <leon@kernel.org> wrote:
>> 
>> On Mon, Jun 14, 2021 at 04:29:09PM +0000, Haakon Bugge wrote:
>>> 
>>> 
>>>> On 14 Jun 2021, at 09:25, Leon Romanovsky <leon@kernel.org> wrote:
>>>> 
>>>> On Mon, Jun 14, 2021 at 03:32:39AM +0000, Haakon Bugge wrote:
>>>>> 
>>>>> 
>>>>>> On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
>>>>>> 
>>>>>> On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
>>>>>>> Hi Leon,
>>>>>> 
>>>>>> Please don't do top-posting.
>>>>>> 
>>>>>> 
>>>>>>> 
>>>>>>> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
>>>>>>> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
>>>>>> 
>>>>>> Bitfield variables suit this use case better.
>>>>>> Let's not overcomplicate the code without a reason.
>>>>> 
>>>>> The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.
>>>>> 
>>>>> What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.
>>>> 
>>>> We are talking about a device initialization flow that shouldn't be
>>>> performed in parallel with another initialization of the same device, so the
>>>> comparison to rdma-cm is not valid here.
>>> 
>>> I can agree to that. And it is probably not worthwhile to fix the bit-fields in rdma_id_private?
>> 
>> Before reading this article [1], I would have said no, we don't need to fix them.
>> Now, I'm not so sure about that.
>> 
>> "He also notes that even though the design flaws are difficult to exploit
>> on their own, they can be combined with the other flaws found to make for
>> a much more serious problem."
>> 
>> and 
>> 
>> "In other words, people did notice this vulnerability and a defense was standardized,
>> but in practice the defense was never adopted. This is a good example that security
>> defenses must be adopted before attacks become practical."
> 
> Let me send you a commit tomorrow. The last sentence you quoted above is ambiguous, as far as I can tell, but the intention is clear :-)

Do you prefer for-next or for-rc for this?

Thxs, Håkon


* Re: [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices.
  2021-06-16 11:20                   ` Haakon Bugge
@ 2021-06-16 12:43                     ` Leon Romanovsky
  0 siblings, 0 replies; 15+ messages in thread
From: Leon Romanovsky @ 2021-06-16 12:43 UTC (permalink / raw)
  To: Haakon Bugge; +Cc: Anand Khoje, OFED mailing list, linux-kernel, dledford, jgg

On Wed, Jun 16, 2021 at 11:20:01AM +0000, Haakon Bugge wrote:
> 
> 
> > On 15 Jun 2021, at 18:13, Haakon Bugge <haakon.bugge@oracle.com> wrote:
> > 
> > 
> > 
> >> On 15 Jun 2021, at 07:08, Leon Romanovsky <leon@kernel.org> wrote:
> >> 
> >> On Mon, Jun 14, 2021 at 04:29:09PM +0000, Haakon Bugge wrote:
> >>> 
> >>> 
> >>>> On 14 Jun 2021, at 09:25, Leon Romanovsky <leon@kernel.org> wrote:
> >>>> 
> >>>> On Mon, Jun 14, 2021 at 03:32:39AM +0000, Haakon Bugge wrote:
> >>>>> 
> >>>>> 
> >>>>>> On 9 Jun 2021, at 12:40, Leon Romanovsky <leon@kernel.org> wrote:
> >>>>>> 
> >>>>>> On Wed, Jun 09, 2021 at 09:26:03AM +0000, Anand Khoje wrote:
> >>>>>>> Hi Leon,
> >>>>>> 
> >>>>>> Please don't do top-posting.
> >>>>>> 
> >>>>>> 
> >>>>>>> 
> >>>>>>> The set_bit()/clear_bit() calls and enum ib_port_data_flags have been added as a mechanism that can be used for future enhancements.
> >>>>>>> Also, using set_bit()/clear_bit() ensures that operations on this bit are atomic.
> >>>>>> 
> >>>>>> Bitfield variables suit this use case better.
> >>>>>> Let's not overcomplicate the code without a reason.
> >>>>> 
> >>>>> The problem is always that people tend to build on what's already there. For example, look at the bitfields in rdma_id_private: tos_set, timeout_set, and min_rnr_timer_set.
> >>>>> 
> >>>>> What do you think will happen when, let's say, rdma_set_service_type() and rdma_set_ack_timeout() are called in close proximity in time? There is no locking, so the read-modify-write (RMW) on the shared bitfield storage will intermittently lose one of the updates.
> >>>> 
> >>>> We are talking about a device initialization flow that shouldn't be
> >>>> performed in parallel with another initialization of the same device, so the
> >>>> comparison to rdma-cm is not valid here.
> >>> 
> >>> I can agree to that. And it is probably not worthwhile to fix the bit-fields in rdma_id_private?
> >> 
> >> Before reading this article [1], I would have said no, we don't need to fix them.
> >> Now, I'm not so sure about that.
> >> 
> >> "He also notes that even though the design flaws are difficult to exploit
> >> on their own, they can be combined with the other flaws found to make for
> >> a much more serious problem."
> >> 
> >> and 
> >> 
> >> "In other words, people did notice this vulnerability and a defense was standardized,
> >> but in practice the defense was never adopted. This is a good example that security
> >> defenses must be adopted before attacks become practical."
> > 
> > Let me send you a commit tomorrow. The last sentence you quoted above is ambiguous, as far as I can tell, but the intention is clear :-)
> 
> Do you prefer for-next or for-rc for this?

for-next, please.

Thanks

> 
> Thxs, Håkon
> 

Thread overview: 15+ messages
2021-06-09  5:55 [PATCH v3 0/3] IB/core: Obtaining subnet_prefix from cache in Anand Khoje
2021-06-09  5:55 ` [PATCH v3 1/3] IB/core: Removed port validity check from ib_get_cached_subnet_prefix Anand Khoje
2021-06-09  8:37   ` Leon Romanovsky
2021-06-09  5:55 ` [PATCH v3 2/3] IB/core: Shuffle locks in ib_port_data to save memory Anand Khoje
2021-06-09  5:55 ` [PATCH v3 3/3] IB/core: Obtain subnet_prefix from cache in IB devices Anand Khoje
2021-06-09  8:36   ` Leon Romanovsky
2021-06-09  9:26     ` Anand Khoje
2021-06-09 10:40       ` Leon Romanovsky
2021-06-14  3:32         ` Haakon Bugge
2021-06-14  7:25           ` Leon Romanovsky
2021-06-14 16:29             ` Haakon Bugge
2021-06-15  5:08               ` Leon Romanovsky
2021-06-15 16:13                 ` Haakon Bugge
2021-06-16 11:20                   ` Haakon Bugge
2021-06-16 12:43                     ` Leon Romanovsky
