linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
@ 2019-02-17 17:09 Håkon Bugge
  2019-02-22 16:36 ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Håkon Bugge @ 2019-02-17 17:09 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe, Leon Romanovsky, Parav Pandit, Steve Wise
  Cc: linux-rdma, linux-kernel

During certain workloads, the default CM response timeout is too
short, leading to excessive retries. Hence, make it configurable
through sysctl. While at it, also make number of CM retries
configurable.

The defaults are not changed.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
---
 drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++-----
 1 file changed, 44 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c43512752b8a..ce99e1cd1029 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -43,6 +43,7 @@
 #include <linux/inetdevice.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/sysctl.h>
 #include <net/route.h>
 
 #include <net/net_namespace.h>
@@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
 MODULE_DESCRIPTION("Generic RDMA CM Agent");
 MODULE_LICENSE("Dual BSD/GPL");
 
-#define CMA_CM_RESPONSE_TIMEOUT 20
 #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000
-#define CMA_MAX_CM_RETRIES 15
 #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
 #define CMA_IBOE_PACKET_LIFETIME 18
 #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
 
+#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20
+static int cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
+static int cma_cm_response_timeout_min = 8;
+static int cma_cm_response_timeout_max = 31;
+#undef CMA_DFLT_CM_RESPONSE_TIMEOUT
+
+#define CMA_DFLT_MAX_CM_RETRIES 15
+static int cma_max_cm_retries = CMA_DFLT_MAX_CM_RETRIES;
+static int cma_max_cm_retries_min = 1;
+static int cma_max_cm_retries_max = 100;
+#undef CMA_DFLT_MAX_CM_RETRIES
+
+static struct ctl_table_header *cma_ctl_table_hdr;
+static struct ctl_table cma_ctl_table[] = {
+	{
+		.procname	= "cma_cm_response_timeout",
+		.data		= &cma_cm_response_timeout,
+		.maxlen		= sizeof(cma_cm_response_timeout),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &cma_cm_response_timeout_min,
+		.extra2		= &cma_cm_response_timeout_max,
+	},
+	{
+		.procname	= "cma_max_cm_retries",
+		.data		= &cma_max_cm_retries,
+		.maxlen		= sizeof(cma_max_cm_retries),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &cma_max_cm_retries_min,
+		.extra2		= &cma_max_cm_retries_max,
+	},
+	{ }
+};
+
 static const char * const cma_events[] = {
 	[RDMA_CM_EVENT_ADDR_RESOLVED]	 = "address resolved",
 	[RDMA_CM_EVENT_ADDR_ERROR]	 = "address error",
@@ -3745,8 +3779,8 @@ static int cma_resolve_ib_udp(struct rdma_id_private *id_priv,
 	req.path = id_priv->id.route.path_rec;
 	req.sgid_attr = id_priv->id.route.addr.dev_addr.sgid_attr;
 	req.service_id = rdma_get_service_id(&id_priv->id, cma_dst_addr(id_priv));
-	req.timeout_ms = 1 << (CMA_CM_RESPONSE_TIMEOUT - 8);
-	req.max_cm_retries = CMA_MAX_CM_RETRIES;
+	req.timeout_ms = 1 << (cma_cm_response_timeout - 8);
+	req.max_cm_retries = cma_max_cm_retries;
 
 	ret = ib_send_cm_sidr_req(id_priv->cm_id.ib, &req);
 	if (ret) {
@@ -3816,9 +3850,9 @@ static int cma_connect_ib(struct rdma_id_private *id_priv,
 	req.flow_control = conn_param->flow_control;
 	req.retry_count = min_t(u8, 7, conn_param->retry_count);
 	req.rnr_retry_count = min_t(u8, 7, conn_param->rnr_retry_count);
-	req.remote_cm_response_timeout = CMA_CM_RESPONSE_TIMEOUT;
-	req.local_cm_response_timeout = CMA_CM_RESPONSE_TIMEOUT;
-	req.max_cm_retries = CMA_MAX_CM_RETRIES;
+	req.remote_cm_response_timeout = cma_cm_response_timeout;
+	req.local_cm_response_timeout = cma_cm_response_timeout;
+	req.max_cm_retries = cma_max_cm_retries;
 	req.srq = id_priv->srq ? 1 : 0;
 
 	ret = ib_send_cm_req(id_priv->cm_id.ib, &req);
@@ -4701,6 +4735,9 @@ static int __init cma_init(void)
 		goto err;
 
 	cma_configfs_init();
+	cma_ctl_table_hdr = register_net_sysctl(&init_net, "net/rdma_cm", cma_ctl_table);
+	if (!cma_ctl_table_hdr)
+		pr_warn("rdma_cm: couldn't register sysctl path, using default values\n");
 
 	return 0;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-02-17 17:09 [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable Håkon Bugge
@ 2019-02-22 16:36 ` Jason Gunthorpe
  2019-02-22 17:14   ` Steve Wise
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2019-02-22 16:36 UTC (permalink / raw)
  To: Håkon Bugge
  Cc: Doug Ledford, Leon Romanovsky, Parav Pandit, Steve Wise,
	linux-rdma, linux-kernel

On Sun, Feb 17, 2019 at 06:09:09PM +0100, Håkon Bugge wrote:
> During certain workloads, the default CM response timeout is too
> short, leading to excessive retries. Hence, make it configurable
> through sysctl. While at it, also make number of CM retries
> configurable.
> 
> The defaults are not changed.
> 
> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>  drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++-----
>  1 file changed, 44 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index c43512752b8a..ce99e1cd1029 100644
> +++ b/drivers/infiniband/core/cma.c
> @@ -43,6 +43,7 @@
>  #include <linux/inetdevice.h>
>  #include <linux/slab.h>
>  #include <linux/module.h>
> +#include <linux/sysctl.h>
>  #include <net/route.h>
>  
>  #include <net/net_namespace.h>
> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
>  MODULE_DESCRIPTION("Generic RDMA CM Agent");
>  MODULE_LICENSE("Dual BSD/GPL");
>  
> -#define CMA_CM_RESPONSE_TIMEOUT 20
>  #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000
> -#define CMA_MAX_CM_RETRIES 15
>  #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
>  #define CMA_IBOE_PACKET_LIFETIME 18
>  #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>  
> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20
> +static int cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
> +static int cma_cm_response_timeout_min = 8;
> +static int cma_cm_response_timeout_max = 31;
> +#undef CMA_DFLT_CM_RESPONSE_TIMEOUT
> +
> +#define CMA_DFLT_MAX_CM_RETRIES 15
> +static int cma_max_cm_retries = CMA_DFLT_MAX_CM_RETRIES;
> +static int cma_max_cm_retries_min = 1;
> +static int cma_max_cm_retries_max = 100;
> +#undef CMA_DFLT_MAX_CM_RETRIES
> +
> +static struct ctl_table_header *cma_ctl_table_hdr;
> +static struct ctl_table cma_ctl_table[] = {
> +	{
> +		.procname	= "cma_cm_response_timeout",
> +		.data		= &cma_cm_response_timeout,
> +		.maxlen		= sizeof(cma_cm_response_timeout),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &cma_cm_response_timeout_min,
> +		.extra2		= &cma_cm_response_timeout_max,
> +	},
> +	{
> +		.procname	= "cma_max_cm_retries",
> +		.data		= &cma_max_cm_retries,
> +		.maxlen		= sizeof(cma_max_cm_retries),
> +		.mode		= 0644,
> +		.proc_handler	= proc_dointvec_minmax,
> +		.extra1		= &cma_max_cm_retries_min,
> +		.extra2		= &cma_max_cm_retries_max,
> +	},
> +	{ }
> +};

Is sysctl the right approach here? Should it be rdma tool instead?

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-02-22 16:36 ` Jason Gunthorpe
@ 2019-02-22 17:14   ` Steve Wise
  2019-02-22 17:51     ` Doug Ledford
  0 siblings, 1 reply; 9+ messages in thread
From: Steve Wise @ 2019-02-22 17:14 UTC (permalink / raw)
  To: Jason Gunthorpe, Håkon Bugge
  Cc: Doug Ledford, Leon Romanovsky, Parav Pandit, linux-rdma, linux-kernel


On 2/22/2019 10:36 AM, Jason Gunthorpe wrote:
> On Sun, Feb 17, 2019 at 06:09:09PM +0100, Håkon Bugge wrote:
>> During certain workloads, the default CM response timeout is too
>> short, leading to excessive retries. Hence, make it configurable
>> through sysctl. While at it, also make number of CM retries
>> configurable.
>>
>> The defaults are not changed.
>>
>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>>  drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++-----
>>  1 file changed, 44 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index c43512752b8a..ce99e1cd1029 100644
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -43,6 +43,7 @@
>>  #include <linux/inetdevice.h>
>>  #include <linux/slab.h>
>>  #include <linux/module.h>
>> +#include <linux/sysctl.h>
>>  #include <net/route.h>
>>  
>>  #include <net/net_namespace.h>
>> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
>>  MODULE_DESCRIPTION("Generic RDMA CM Agent");
>>  MODULE_LICENSE("Dual BSD/GPL");
>>  
>> -#define CMA_CM_RESPONSE_TIMEOUT 20
>>  #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000
>> -#define CMA_MAX_CM_RETRIES 15
>>  #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
>>  #define CMA_IBOE_PACKET_LIFETIME 18
>>  #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>>  
>> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20
>> +static int cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
>> +static int cma_cm_response_timeout_min = 8;
>> +static int cma_cm_response_timeout_max = 31;
>> +#undef CMA_DFLT_CM_RESPONSE_TIMEOUT
>> +
>> +#define CMA_DFLT_MAX_CM_RETRIES 15
>> +static int cma_max_cm_retries = CMA_DFLT_MAX_CM_RETRIES;
>> +static int cma_max_cm_retries_min = 1;
>> +static int cma_max_cm_retries_max = 100;
>> +#undef CMA_DFLT_MAX_CM_RETRIES
>> +
>> +static struct ctl_table_header *cma_ctl_table_hdr;
>> +static struct ctl_table cma_ctl_table[] = {
>> +	{
>> +		.procname	= "cma_cm_response_timeout",
>> +		.data		= &cma_cm_response_timeout,
>> +		.maxlen		= sizeof(cma_cm_response_timeout),
>> +		.mode		= 0644,
>> +		.proc_handler	= proc_dointvec_minmax,
>> +		.extra1		= &cma_cm_response_timeout_min,
>> +		.extra2		= &cma_cm_response_timeout_max,
>> +	},
>> +	{
>> +		.procname	= "cma_max_cm_retries",
>> +		.data		= &cma_max_cm_retries,
>> +		.maxlen		= sizeof(cma_max_cm_retries),
>> +		.mode		= 0644,
>> +		.proc_handler	= proc_dointvec_minmax,
>> +		.extra1		= &cma_max_cm_retries_min,
>> +		.extra2		= &cma_max_cm_retries_max,
>> +	},
>> +	{ }
>> +};
> Is sysctl the right approach here? Should it be rdma tool instead?
>
> Jason

There are other rdma sysctls currently:  net.rdma_ucm.max_backlog and
net.iw_cm.default_backlog.  The core network stack seems to use sysctl
and not ip tool to set basically globals.

To use rdma tool, we'd have to have some concept of a "module" object, I
guess.  IE there's dev, link, and resource rdma tool objects currently. 
But these cma timeout settings are really not per dev, link, nor a
resource.   Maybe we have just a "core" object:  rdma core set
cma_max_cm_retries min 8 max 30.




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-02-22 17:14   ` Steve Wise
@ 2019-02-22 17:51     ` Doug Ledford
  2019-02-23  8:49       ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Doug Ledford @ 2019-02-22 17:51 UTC (permalink / raw)
  To: Steve Wise
  Cc: Jason Gunthorpe, Håkon Bugge, Leon Romanovsky, Parav Pandit,
	linux-rdma, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3869 bytes --]



> On Feb 22, 2019, at 12:14 PM, Steve Wise <swise@opengridcomputing.com> wrote:
> 
> 
> On 2/22/2019 10:36 AM, Jason Gunthorpe wrote:
>> On Sun, Feb 17, 2019 at 06:09:09PM +0100, Håkon Bugge wrote:
>>> During certain workloads, the default CM response timeout is too
>>> short, leading to excessive retries. Hence, make it configurable
>>> through sysctl. While at it, also make number of CM retries
>>> configurable.
>>> 
>>> The defaults are not changed.
>>> 
>>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>>> drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++-----
>>> 1 file changed, 44 insertions(+), 7 deletions(-)
>>> 
>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>> index c43512752b8a..ce99e1cd1029 100644
>>> +++ b/drivers/infiniband/core/cma.c
>>> @@ -43,6 +43,7 @@
>>> #include <linux/inetdevice.h>
>>> #include <linux/slab.h>
>>> #include <linux/module.h>
>>> +#include <linux/sysctl.h>
>>> #include <net/route.h>
>>> 
>>> #include <net/net_namespace.h>
>>> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
>>> MODULE_DESCRIPTION("Generic RDMA CM Agent");
>>> MODULE_LICENSE("Dual BSD/GPL");
>>> 
>>> -#define CMA_CM_RESPONSE_TIMEOUT 20
>>> #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000
>>> -#define CMA_MAX_CM_RETRIES 15
>>> #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
>>> #define CMA_IBOE_PACKET_LIFETIME 18
>>> #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>>> 
>>> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20
>>> +static int cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
>>> +static int cma_cm_response_timeout_min = 8;
>>> +static int cma_cm_response_timeout_max = 31;
>>> +#undef CMA_DFLT_CM_RESPONSE_TIMEOUT
>>> +
>>> +#define CMA_DFLT_MAX_CM_RETRIES 15
>>> +static int cma_max_cm_retries = CMA_DFLT_MAX_CM_RETRIES;
>>> +static int cma_max_cm_retries_min = 1;
>>> +static int cma_max_cm_retries_max = 100;
>>> +#undef CMA_DFLT_MAX_CM_RETRIES
>>> +
>>> +static struct ctl_table_header *cma_ctl_table_hdr;
>>> +static struct ctl_table cma_ctl_table[] = {
>>> +	{
>>> +		.procname	= "cma_cm_response_timeout",
>>> +		.data		= &cma_cm_response_timeout,
>>> +		.maxlen		= sizeof(cma_cm_response_timeout),
>>> +		.mode		= 0644,
>>> +		.proc_handler	= proc_dointvec_minmax,
>>> +		.extra1		= &cma_cm_response_timeout_min,
>>> +		.extra2		= &cma_cm_response_timeout_max,
>>> +	},
>>> +	{
>>> +		.procname	= "cma_max_cm_retries",
>>> +		.data		= &cma_max_cm_retries,
>>> +		.maxlen		= sizeof(cma_max_cm_retries),
>>> +		.mode		= 0644,
>>> +		.proc_handler	= proc_dointvec_minmax,
>>> +		.extra1		= &cma_max_cm_retries_min,
>>> +		.extra2		= &cma_max_cm_retries_max,
>>> +	},
>>> +	{ }
>>> +};
>> Is sysctl the right approach here? Should it be rdma tool instead?
>> 
>> Jason
> 
> There are other rdma sysctls currently:  net.rdma_ucm.max_backlog and
> net.iw_cm.default_backlog.  The core network stack seems to use sysctl
> and not ip tool to set basically globals.
> 
> To use rdma tool, we'd have to have some concept of a "module" object, I
> guess.  IE there's dev, link, and resource rdma tool objects currently.
> But these cma timeout settings are really not per dev, link, nor a
> resource.   Maybe we have just a "core" object:  rdma core set
> cma_max_cm_retries min 8 max 30.

I don’t know, I think you make a fairly good argument for leaving it as a sysctl.  We have infrastructure in place for admins to set persistent sysctl settings.  The per device/link settings need something different because link names and such can change.  Since these are globals, I’d leave them where they are.

--
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD








[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-02-22 17:51     ` Doug Ledford
@ 2019-02-23  8:49       ` Leon Romanovsky
  2019-02-25 17:23         ` Håkon Bugge
  2019-03-04  6:27         ` Parav Pandit
  0 siblings, 2 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-02-23  8:49 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Steve Wise, Jason Gunthorpe, Håkon Bugge, Parav Pandit,
	linux-rdma, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4399 bytes --]

On Fri, Feb 22, 2019 at 12:51:55PM -0500, Doug Ledford wrote:
>
>
> > On Feb 22, 2019, at 12:14 PM, Steve Wise <swise@opengridcomputing.com> wrote:
> >
> >
> > On 2/22/2019 10:36 AM, Jason Gunthorpe wrote:
> >> On Sun, Feb 17, 2019 at 06:09:09PM +0100, Håkon Bugge wrote:
> >>> During certain workloads, the default CM response timeout is too
> >>> short, leading to excessive retries. Hence, make it configurable
> >>> through sysctl. While at it, also make number of CM retries
> >>> configurable.
> >>>
> >>> The defaults are not changed.
> >>>
> >>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> >>> drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++-----
> >>> 1 file changed, 44 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> >>> index c43512752b8a..ce99e1cd1029 100644
> >>> +++ b/drivers/infiniband/core/cma.c
> >>> @@ -43,6 +43,7 @@
> >>> #include <linux/inetdevice.h>
> >>> #include <linux/slab.h>
> >>> #include <linux/module.h>
> >>> +#include <linux/sysctl.h>
> >>> #include <net/route.h>
> >>>
> >>> #include <net/net_namespace.h>
> >>> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
> >>> MODULE_DESCRIPTION("Generic RDMA CM Agent");
> >>> MODULE_LICENSE("Dual BSD/GPL");
> >>>
> >>> -#define CMA_CM_RESPONSE_TIMEOUT 20
> >>> #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000
> >>> -#define CMA_MAX_CM_RETRIES 15
> >>> #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
> >>> #define CMA_IBOE_PACKET_LIFETIME 18
> >>> #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
> >>>
> >>> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20
> >>> +static int cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
> >>> +static int cma_cm_response_timeout_min = 8;
> >>> +static int cma_cm_response_timeout_max = 31;
> >>> +#undef CMA_DFLT_CM_RESPONSE_TIMEOUT
> >>> +
> >>> +#define CMA_DFLT_MAX_CM_RETRIES 15
> >>> +static int cma_max_cm_retries = CMA_DFLT_MAX_CM_RETRIES;
> >>> +static int cma_max_cm_retries_min = 1;
> >>> +static int cma_max_cm_retries_max = 100;
> >>> +#undef CMA_DFLT_MAX_CM_RETRIES
> >>> +
> >>> +static struct ctl_table_header *cma_ctl_table_hdr;
> >>> +static struct ctl_table cma_ctl_table[] = {
> >>> +	{
> >>> +		.procname	= "cma_cm_response_timeout",
> >>> +		.data		= &cma_cm_response_timeout,
> >>> +		.maxlen		= sizeof(cma_cm_response_timeout),
> >>> +		.mode		= 0644,
> >>> +		.proc_handler	= proc_dointvec_minmax,
> >>> +		.extra1		= &cma_cm_response_timeout_min,
> >>> +		.extra2		= &cma_cm_response_timeout_max,
> >>> +	},
> >>> +	{
> >>> +		.procname	= "cma_max_cm_retries",
> >>> +		.data		= &cma_max_cm_retries,
> >>> +		.maxlen		= sizeof(cma_max_cm_retries),
> >>> +		.mode		= 0644,
> >>> +		.proc_handler	= proc_dointvec_minmax,
> >>> +		.extra1		= &cma_max_cm_retries_min,
> >>> +		.extra2		= &cma_max_cm_retries_max,
> >>> +	},
> >>> +	{ }
> >>> +};
> >> Is sysctl the right approach here? Should it be rdma tool instead?
> >>
> >> Jason
> >
> > There are other rdma sysctls currently:  net.rdma_ucm.max_backlog and
> > net.iw_cm.default_backlog.  The core network stack seems to use sysctl
> > and not ip tool to set basically globals.
> >
> > To use rdma tool, we'd have to have some concept of a "module" object, I
> > guess.  IE there's dev, link, and resource rdma tool objects currently.
> > But these cma timeout settings are really not per dev, link, nor a
> > resource.   Maybe we have just a "core" object:  rdma core set
> > cma_max_cm_retries min 8 max 30.
>
> I don’t know, I think you make a fairly good argument for leaving it as a sysctl.  We have infrastructure in place for admins to set persistent sysctl settings.  The per device/link settings need something different because link names and such can change.  Since these are globals, I’d leave them where they are.
>

I have patches from Parav which extend rdmatool to set global to whole
stack parameters, something like "rdma system ...", so the option to
set through rdmatool global parameters for modules e.g. "rdma system cma set ..."
exists. But I'm not sure if it is right thing to do.

> --
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
>
>
>
>
>
>
>



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-02-23  8:49       ` Leon Romanovsky
@ 2019-02-25 17:23         ` Håkon Bugge
  2019-03-04  6:27         ` Parav Pandit
  1 sibling, 0 replies; 9+ messages in thread
From: Håkon Bugge @ 2019-02-25 17:23 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Steve Wise, Jason Gunthorpe, Parav Pandit,
	OFED mailing list, linux-kernel



> On 23 Feb 2019, at 09:49, Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Fri, Feb 22, 2019 at 12:51:55PM -0500, Doug Ledford wrote:
>> 
>> 
>>> On Feb 22, 2019, at 12:14 PM, Steve Wise <swise@opengridcomputing.com> wrote:
>>> 
>>> 
>>> On 2/22/2019 10:36 AM, Jason Gunthorpe wrote:
>>>> On Sun, Feb 17, 2019 at 06:09:09PM +0100, Håkon Bugge wrote:
>>>>> During certain workloads, the default CM response timeout is too
>>>>> short, leading to excessive retries. Hence, make it configurable
>>>>> through sysctl. While at it, also make number of CM retries
>>>>> configurable.
>>>>> 
>>>>> The defaults are not changed.
>>>>> 
>>>>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>>>>> drivers/infiniband/core/cma.c | 51 ++++++++++++++++++++++++++++++-----
>>>>> 1 file changed, 44 insertions(+), 7 deletions(-)
>>>>> 
>>>>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>>>>> index c43512752b8a..ce99e1cd1029 100644
>>>>> +++ b/drivers/infiniband/core/cma.c
>>>>> @@ -43,6 +43,7 @@
>>>>> #include <linux/inetdevice.h>
>>>>> #include <linux/slab.h>
>>>>> #include <linux/module.h>
>>>>> +#include <linux/sysctl.h>
>>>>> #include <net/route.h>
>>>>> 
>>>>> #include <net/net_namespace.h>
>>>>> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
>>>>> MODULE_DESCRIPTION("Generic RDMA CM Agent");
>>>>> MODULE_LICENSE("Dual BSD/GPL");
>>>>> 
>>>>> -#define CMA_CM_RESPONSE_TIMEOUT 20
>>>>> #define CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000
>>>>> -#define CMA_MAX_CM_RETRIES 15
>>>>> #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
>>>>> #define CMA_IBOE_PACKET_LIFETIME 18
>>>>> #define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
>>>>> 
>>>>> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20
>>>>> +static int cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
>>>>> +static int cma_cm_response_timeout_min = 8;
>>>>> +static int cma_cm_response_timeout_max = 31;
>>>>> +#undef CMA_DFLT_CM_RESPONSE_TIMEOUT
>>>>> +
>>>>> +#define CMA_DFLT_MAX_CM_RETRIES 15
>>>>> +static int cma_max_cm_retries = CMA_DFLT_MAX_CM_RETRIES;
>>>>> +static int cma_max_cm_retries_min = 1;
>>>>> +static int cma_max_cm_retries_max = 100;
>>>>> +#undef CMA_DFLT_MAX_CM_RETRIES
>>>>> +
>>>>> +static struct ctl_table_header *cma_ctl_table_hdr;
>>>>> +static struct ctl_table cma_ctl_table[] = {
>>>>> +	{
>>>>> +		.procname	= "cma_cm_response_timeout",
>>>>> +		.data		= &cma_cm_response_timeout,
>>>>> +		.maxlen		= sizeof(cma_cm_response_timeout),
>>>>> +		.mode		= 0644,
>>>>> +		.proc_handler	= proc_dointvec_minmax,
>>>>> +		.extra1		= &cma_cm_response_timeout_min,
>>>>> +		.extra2		= &cma_cm_response_timeout_max,
>>>>> +	},
>>>>> +	{
>>>>> +		.procname	= "cma_max_cm_retries",
>>>>> +		.data		= &cma_max_cm_retries,
>>>>> +		.maxlen		= sizeof(cma_max_cm_retries),
>>>>> +		.mode		= 0644,
>>>>> +		.proc_handler	= proc_dointvec_minmax,
>>>>> +		.extra1		= &cma_max_cm_retries_min,
>>>>> +		.extra2		= &cma_max_cm_retries_max,
>>>>> +	},
>>>>> +	{ }
>>>>> +};
>>>> Is sysctl the right approach here? Should it be rdma tool instead?
>>>> 
>>>> Jason
>>> 
>>> There are other rdma sysctls currently:  net.rdma_ucm.max_backlog and
>>> net.iw_cm.default_backlog.  The core network stack seems to use sysctl
>>> and not ip tool to set basically globals.
>>> 
>>> To use rdma tool, we'd have to have some concept of a "module" object, I
>>> guess.  IE there's dev, link, and resource rdma tool objects currently.
>>> But these cma timeout settings are really not per dev, link, nor a
>>> resource.   Maybe we have just a "core" object:  rdma core set
>>> cma_max_cm_retries min 8 max 30.
>> 
>> I don’t know, I think you make a fairly good argument for leaving it as a sysctl.  We have infrastructure in place for admins to set persistent sysctl settings.  The per device/link settings need something different because link names and such can change.  Since these are globals, I’d leave them where they are.
>> 
> 
> I have patches from Parav which extend rdmatool to set global to whole
> stack parameters, something like "rdma system ...", so the option to
> set through rdmatool global parameters for modules e.g. "rdma system cma set ..."
> exists. But I'm not sure if it is right thing to do.

If decided to for the sysctl flavour, I must just say: ahem, ahem, I forgot to put in a unregister_net_sysctl_table() in cma_cleanup(). Will send a v2.


Thxs, Håkon

> 
>> --
>> Doug Ledford <dledford@redhat.com>
>>    GPG KeyID: B826A3330E572FDD
>>    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-02-23  8:49       ` Leon Romanovsky
  2019-02-25 17:23         ` Håkon Bugge
@ 2019-03-04  6:27         ` Parav Pandit
  2019-04-12 19:30           ` Jason Gunthorpe
  1 sibling, 1 reply; 9+ messages in thread
From: Parav Pandit @ 2019-03-04  6:27 UTC (permalink / raw)
  To: Leon Romanovsky, Doug Ledford
  Cc: Steve Wise, Jason Gunthorpe, Håkon Bugge, linux-rdma, linux-kernel



> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: Saturday, February 23, 2019 2:50 AM
> To: Doug Ledford <dledford@redhat.com>
> Cc: Steve Wise <swise@opengridcomputing.com>; Jason Gunthorpe
> <jgg@ziepe.ca>; Håkon Bugge <haakon.bugge@oracle.com>; Parav Pandit
> <parav@mellanox.com>; linux-rdma@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH] RDMA/cma: Make CM response timeout and # CM
> retries configurable
> 
> On Fri, Feb 22, 2019 at 12:51:55PM -0500, Doug Ledford wrote:
> >
> >
> > > On Feb 22, 2019, at 12:14 PM, Steve Wise
> <swise@opengridcomputing.com> wrote:
> > >
> > >
> > > On 2/22/2019 10:36 AM, Jason Gunthorpe wrote:
> > >> On Sun, Feb 17, 2019 at 06:09:09PM +0100, Håkon Bugge wrote:
> > >>> During certain workloads, the default CM response timeout is too
> > >>> short, leading to excessive retries. Hence, make it configurable
> > >>> through sysctl. While at it, also make number of CM retries
> > >>> configurable.
> > >>>
> > >>> The defaults are not changed.
> > >>>
> > >>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
> > >>> drivers/infiniband/core/cma.c | 51
> > >>> ++++++++++++++++++++++++++++++-----
> > >>> 1 file changed, 44 insertions(+), 7 deletions(-)
> > >>>
> > >>> diff --git a/drivers/infiniband/core/cma.c
> > >>> b/drivers/infiniband/core/cma.c index c43512752b8a..ce99e1cd1029
> > >>> 100644
> > >>> +++ b/drivers/infiniband/core/cma.c
> > >>> @@ -43,6 +43,7 @@
> > >>> #include <linux/inetdevice.h>
> > >>> #include <linux/slab.h>
> > >>> #include <linux/module.h>
> > >>> +#include <linux/sysctl.h>
> > >>> #include <net/route.h>
> > >>>
> > >>> #include <net/net_namespace.h>
> > >>> @@ -68,13 +69,46 @@ MODULE_AUTHOR("Sean Hefty");
> > >>> MODULE_DESCRIPTION("Generic RDMA CM Agent");
> MODULE_LICENSE("Dual
> > >>> BSD/GPL");
> > >>>
> > >>> -#define CMA_CM_RESPONSE_TIMEOUT 20 #define
> > >>> CMA_QUERY_CLASSPORT_INFO_TIMEOUT 3000 -#define
> CMA_MAX_CM_RETRIES
> > >>> 15 #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
> #define
> > >>> CMA_IBOE_PACKET_LIFETIME 18 #define
> CMA_PREFERRED_ROCE_GID_TYPE
> > >>> IB_GID_TYPE_ROCE_UDP_ENCAP
> > >>>
> > >>> +#define CMA_DFLT_CM_RESPONSE_TIMEOUT 20 static int
> > >>> +cma_cm_response_timeout = CMA_DFLT_CM_RESPONSE_TIMEOUT;
> static
> > >>> +int cma_cm_response_timeout_min = 8; static int
> > >>> +cma_cm_response_timeout_max = 31; #undef
> > >>> +CMA_DFLT_CM_RESPONSE_TIMEOUT
> > >>> +
> > >>> +#define CMA_DFLT_MAX_CM_RETRIES 15 static int
> cma_max_cm_retries
> > >>> += CMA_DFLT_MAX_CM_RETRIES; static int cma_max_cm_retries_min
> = 1;
> > >>> +static int cma_max_cm_retries_max = 100; #undef
> > >>> +CMA_DFLT_MAX_CM_RETRIES
> > >>> +
> > >>> +static struct ctl_table_header *cma_ctl_table_hdr; static struct
> > >>> +ctl_table cma_ctl_table[] = {
> > >>> +	{
> > >>> +		.procname	= "cma_cm_response_timeout",
> > >>> +		.data		= &cma_cm_response_timeout,
> > >>> +		.maxlen		= sizeof(cma_cm_response_timeout),
> > >>> +		.mode		= 0644,
> > >>> +		.proc_handler	= proc_dointvec_minmax,
> > >>> +		.extra1		= &cma_cm_response_timeout_min,
> > >>> +		.extra2		= &cma_cm_response_timeout_max,
> > >>> +	},
> > >>> +	{
> > >>> +		.procname	= "cma_max_cm_retries",
> > >>> +		.data		= &cma_max_cm_retries,
> > >>> +		.maxlen		= sizeof(cma_max_cm_retries),
> > >>> +		.mode		= 0644,
> > >>> +		.proc_handler	= proc_dointvec_minmax,
> > >>> +		.extra1		= &cma_max_cm_retries_min,
> > >>> +		.extra2		= &cma_max_cm_retries_max,
> > >>> +	},
> > >>> +	{ }
> > >>> +};
> > >> Is sysctl the right approach here? Should it be rdma tool instead?
> > >>
> > >> Jason
> > >
> > > There are other rdma sysctls currently:  net.rdma_ucm.max_backlog
> > > and net.iw_cm.default_backlog.  The core network stack seems to use
> > > sysctl and not ip tool to set basically globals.
> > >
> > > To use rdma tool, we'd have to have some concept of a "module"
> > > object, I guess.  IE there's dev, link, and resource rdma tool objects
> currently.
> > > But these cma timeout settings are really not per dev, link, nor a
> > > resource.   Maybe we have just a "core" object:  rdma core set
> > > cma_max_cm_retries min 8 max 30.
> >
> > I don’t know, I think you make a fairly good argument for leaving it as a
> sysctl.  We have infrastructure in place for admins to set persistent sysctl
> settings.  The per device/link settings need something different because link
> names and such can change.  Since these are globals, I’d leave them where
> they are.
> >
> 
> I have patches from Parav which extend rdmatool to set global to whole
> stack parameters, something like "rdma system ...", so the option to set
> through rdmatool global parameters for modules e.g. "rdma system cma set
> ..."
> exists. But I'm not sure if it is right thing to do.
>

I think we should use rdma_nl_register(RDMA_NL_RDMA_CM, cma_cb_table) which was removed as part of ID stats removal.
Because of below reasons.
1. rdma netlink command auto loads the module
2. we don't need to write any extra code to do register_net_sysctl () in each netns.
Caller's skb's netns will read/write value of response_timeout in 'struct cma_pernet'.
3. last time sysctl added in ipv6 was in 2017 in net/ipv6/addrconf.c, however ipv4 was done in 2018.

Currently rdma_cm/rdma_ucma has configfs, sysctl.
We are adding netlink sys params to ib_core.

We already have 3 clients and infra built using rdma_nl_register() netlink , so hooking up to netlink will provide unified way to set rdma params.
Let's just use netlink for any new params unless it is not doable.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-03-04  6:27         ` Parav Pandit
@ 2019-04-12 19:30           ` Jason Gunthorpe
  2019-04-12 19:56             ` Leon Romanovsky
  0 siblings, 1 reply; 9+ messages in thread
From: Jason Gunthorpe @ 2019-04-12 19:30 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Leon Romanovsky, Doug Ledford, Steve Wise, Håkon Bugge,
	linux-rdma, linux-kernel

On Mon, Mar 04, 2019 at 06:27:26AM +0000, Parav Pandit wrote:

> I think we should use rdma_nl_register(RDMA_NL_RDMA_CM, cma_cb_table) which was removed as part of ID stats removal.
> Because of below reasons.
> 1. rdma netlink command auto loads the module

This is probably the best argument to stay away from sysctl for module
parameters.. It is tricky to make sure the rdma module is loaded
before sysctl runs in boot, and I don't think sysctl autoloads missing
modules, or somehow copes with dynamic module loading, does it?

To that end we should probably have the entire sysctl configurable
available in netlink

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable
  2019-04-12 19:30           ` Jason Gunthorpe
@ 2019-04-12 19:56             ` Leon Romanovsky
  0 siblings, 0 replies; 9+ messages in thread
From: Leon Romanovsky @ 2019-04-12 19:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Parav Pandit, Doug Ledford, Steve Wise, Håkon Bugge,
	linux-rdma, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 841 bytes --]

On Fri, Apr 12, 2019 at 04:30:24PM -0300, Jason Gunthorpe wrote:
> On Mon, Mar 04, 2019 at 06:27:26AM +0000, Parav Pandit wrote:
>
> > I think we should use rdma_nl_register(RDMA_NL_RDMA_CM, cma_cb_table) which was removed as part of ID stats removal.
> > Because of below reasons.
> > 1. rdma netlink command auto loads the module
>
> This is probably the best argument to stay away from sysctl for module
> parameters.. It is tricky to make sure the rdma module is loaded
> before sysctl runs in boot, and I don't think sysctl autoloads missing
> modules, or somehow copes with dynamic module loading, does it?

As far as I remember, sysctl won't be available till module is loaded,
so autoload won't work because path doesn't exist.

>
> To that end we should probably have the entire sysctl configurable
> available in netlink
>
> Jason

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-04-12 19:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-17 17:09 [PATCH] RDMA/cma: Make CM response timeout and # CM retries configurable Håkon Bugge
2019-02-22 16:36 ` Jason Gunthorpe
2019-02-22 17:14   ` Steve Wise
2019-02-22 17:51     ` Doug Ledford
2019-02-23  8:49       ` Leon Romanovsky
2019-02-25 17:23         ` Håkon Bugge
2019-03-04  6:27         ` Parav Pandit
2019-04-12 19:30           ` Jason Gunthorpe
2019-04-12 19:56             ` Leon Romanovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).