* [PATCH 0/3] Number of fixes for rtrs
@ 2020-07-24 11:15 Md Haris Iqbal
  2020-07-24 11:15 ` [PATCH 1/3] RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting Md Haris Iqbal
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Md Haris Iqbal @ 2020-07-24 11:15 UTC (permalink / raw)
  To: danil.kipnis, jinpu.wang, linux-rdma, dledford, jgg, leon, bvanassche
  Cc: Md Haris Iqbal

This patch series fixes a number of issues discovered while testing:

1) RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
2) RDMA/rtrs-srv: only call put_device when it's in sysfs
3) RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting

Regards
Md Haris Iqbal


Jack Wang (2):
  RDMA/rtrs-srv: only call put_device when it's in sysfs
  RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq

Danil Kipnis (1):
  RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting


 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 16 +++++++++++++---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c |  7 +++++--
 2 files changed, 18 insertions(+), 5 deletions(-)

-- 
2.25.1



* [PATCH 1/3] RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
  2020-07-24 11:15 [PATCH 0/3] Number of fixes for rtrs Md Haris Iqbal
@ 2020-07-24 11:15 ` Md Haris Iqbal
  2020-07-24 11:15 ` [PATCH 2/3] RDMA/rtrs-srv: only call put_device when it's in sysfs Md Haris Iqbal
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Md Haris Iqbal @ 2020-07-24 11:15 UTC (permalink / raw)
  To: danil.kipnis, jinpu.wang, linux-rdma, dledford, jgg, leon, bvanassche
  Cc: Md Haris Iqbal

From: Danil Kipnis <danil.kipnis@cloud.ionos.com>

To avoid having all clients start reconnecting at the same time,
schedule the reconnect dwork an additional random [0,8] seconds late.

Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 564388a85603..5b31d3b03737 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -12,6 +12,7 @@
 
 #include <linux/module.h>
 #include <linux/rculist.h>
+#include <linux/random.h>
 
 #include "rtrs-clt.h"
 #include "rtrs-log.h"
@@ -23,6 +24,12 @@
  * leads to "false positives" failed reconnect attempts
  */
 #define RTRS_RECONNECT_BACKOFF 1000
+/*
+ * Wait for additional random time between 0 and 8 seconds
+ * before starting to reconnect to avoid clients reconnecting
+ * all at once in case of a major network outage
+ */
+#define RTRS_RECONNECT_SEED 8
 
 MODULE_DESCRIPTION("RDMA Transport Client");
 MODULE_LICENSE("GPL");
@@ -306,7 +313,8 @@ static void rtrs_rdma_error_recovery(struct rtrs_clt_con *con)
 		 */
 		delay_ms = clt->reconnect_delay_sec * 1000;
 		queue_delayed_work(rtrs_wq, &sess->reconnect_dwork,
-				   msecs_to_jiffies(delay_ms));
+				   msecs_to_jiffies(delay_ms +
+						    prandom_u32() % RTRS_RECONNECT_SEED));
 	} else {
 		/*
 		 * Error can happen just on establishing new connection,
@@ -2503,7 +2511,9 @@ static void rtrs_clt_reconnect_work(struct work_struct *work)
 		sess->stats->reconnects.fail_cnt++;
 		delay_ms = clt->reconnect_delay_sec * 1000;
 		queue_delayed_work(rtrs_wq, &sess->reconnect_dwork,
-				   msecs_to_jiffies(delay_ms));
+				   msecs_to_jiffies(delay_ms +
+						    prandom_u32() %
+						    RTRS_RECONNECT_SEED));
 	}
 }
 
-- 
2.25.1
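
Note on units: in the hunks above the random term is added to delay_ms
before msecs_to_jiffies(), so as posted it contributes 0 to 7
milliseconds rather than the 0 to 8 seconds named in the commit message.
A minimal userspace sketch of the arithmetic the commit message
describes, assuming the jitter is meant in seconds, could look like the
following. RECONNECT_JITTER_MAX_MS and reconnect_delay_ms() are
hypothetical names used only for illustration, and the caller supplies
the random value (in the patch that value comes from prandom_u32()).

```c
#include <stdint.h>

/* Hypothetical upper bound for the jitter, expressed in milliseconds. */
#define RECONNECT_JITTER_MAX_MS 8000u

/*
 * Base delay in milliseconds plus a jitter in
 * [0, RECONNECT_JITTER_MAX_MS). rnd is any 32-bit random value
 * supplied by the caller.
 */
static uint32_t reconnect_delay_ms(uint32_t reconnect_delay_sec, uint32_t rnd)
{
	return reconnect_delay_sec * 1000u + rnd % RECONNECT_JITTER_MAX_MS;
}
```

Keeping both terms in milliseconds before the single conversion to
jiffies avoids mixing units; the modulo keeps the jitter strictly below
the bound, so the largest extra delay here is 7999 ms.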



* [PATCH 2/3] RDMA/rtrs-srv: only call put_device when it's in sysfs
  2020-07-24 11:15 [PATCH 0/3] Number of fixes for rtrs Md Haris Iqbal
  2020-07-24 11:15 ` [PATCH 1/3] RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting Md Haris Iqbal
@ 2020-07-24 11:15 ` Md Haris Iqbal
  2020-07-24 12:28   ` Jason Gunthorpe
  2020-07-24 11:15 ` [PATCH 3/3] RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq Md Haris Iqbal
  2020-07-29 17:28 ` [PATCH 0/3] Number of fixes for rtrs Jason Gunthorpe
  3 siblings, 1 reply; 6+ messages in thread
From: Md Haris Iqbal @ 2020-07-24 11:15 UTC (permalink / raw)
  To: danil.kipnis, jinpu.wang, linux-rdma, dledford, jgg, leon, bvanassche
  Cc: Md Haris Iqbal

From: Jack Wang <jinpu.wang@cloud.ionos.com>

There are error cases where free_srv is called before the device
kobject is initialized. In such cases we shouldn't call put_device,
otherwise a warning will be generated, e.g.:

kobject: '(null)' (000000009f5445ed): is not initialized, yet kobject_put() is being called.

So add a check before calling put_device.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 0d9241f5d9e6..8a55bc559466 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -1373,7 +1373,10 @@ static void free_srv(struct rtrs_srv *srv)
 	mutex_destroy(&srv->paths_mutex);
 	mutex_destroy(&srv->paths_ev_mutex);
 	/* last put to release the srv structure */
-	put_device(&srv->dev);
+	if (srv->dev.kobj.state_in_sysfs)
+		put_device(&srv->dev);
+	else
+		kfree(srv);
 }
 
 static inline struct rtrs_srv *__find_srv_and_get(struct rtrs_srv_ctx *ctx,
-- 
2.25.1



* [PATCH 3/3] RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
  2020-07-24 11:15 [PATCH 0/3] Number of fixes for rtrs Md Haris Iqbal
  2020-07-24 11:15 ` [PATCH 1/3] RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting Md Haris Iqbal
  2020-07-24 11:15 ` [PATCH 2/3] RDMA/rtrs-srv: only call put_device when it's in sysfs Md Haris Iqbal
@ 2020-07-24 11:15 ` Md Haris Iqbal
  2020-07-29 17:28 ` [PATCH 0/3] Number of fixes for rtrs Jason Gunthorpe
  3 siblings, 0 replies; 6+ messages in thread
From: Md Haris Iqbal @ 2020-07-24 11:15 UTC (permalink / raw)
  To: danil.kipnis, jinpu.wang, linux-rdma, dledford, jgg, leon, bvanassche
  Cc: Md Haris Iqbal

From: Jack Wang <jinpu.wang@cloud.ionos.com>

We trigger the following warning from time to time when running
regression tests, e.g.:

rnbd_client L685: </dev/nullb0@bla> Device disconnected.
rnbd_client L1756: Unloading module
------------[ cut here ]-----------
workqueue: WQ_MEM_RECLAIM rtrs_client_wq:rtrs_clt_reconnect_work [rtrs_client] is flushing !WQ_MEM_RECLAIM ib_addr:process_one_req [ib_core]
WARNING: CPU: 2 PID: 18824 at kernel/workqueue.c:2517 check_flush_dependency+0xad/0x130

The root cause is that the workqueue core expects that a WQ_MEM_RECLAIM
workqueue never flushes a !WQ_MEM_RECLAIM workqueue.

In the case above, the ib_addr workqueue is allocated without
WQ_MEM_RECLAIM, but rtrs_wq has WQ_MEM_RECLAIM set.

To avoid the warning, remove the WQ_MEM_RECLAIM flag.

Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
---
 drivers/infiniband/ulp/rtrs/rtrs-clt.c | 2 +-
 drivers/infiniband/ulp/rtrs/rtrs-srv.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/ulp/rtrs/rtrs-clt.c b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
index 5b31d3b03737..776e89231c52 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-clt.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-clt.c
@@ -2982,7 +2982,7 @@ static int __init rtrs_client_init(void)
 		pr_err("Failed to create rtrs-client dev class\n");
 		return PTR_ERR(rtrs_clt_dev_class);
 	}
-	rtrs_wq = alloc_workqueue("rtrs_client_wq", WQ_MEM_RECLAIM, 0);
+	rtrs_wq = alloc_workqueue("rtrs_client_wq", 0, 0);
 	if (!rtrs_wq) {
 		class_destroy(rtrs_clt_dev_class);
 		return -ENOMEM;
diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
index 8a55bc559466..454bb6c343bb 100644
--- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
+++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
@@ -2153,7 +2153,7 @@ static int __init rtrs_server_init(void)
 		err = PTR_ERR(rtrs_dev_class);
 		goto out_chunk_pool;
 	}
-	rtrs_wq = alloc_workqueue("rtrs_server_wq", WQ_MEM_RECLAIM, 0);
+	rtrs_wq = alloc_workqueue("rtrs_server_wq", 0, 0);
 	if (!rtrs_wq) {
 		err = -ENOMEM;
 		goto out_dev_class;
-- 
2.25.1
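
The rule behind the warning can be modelled in a few lines. This is a
simplified userspace sketch, not the kernel's check_flush_dependency():
the real function looks at more state than the two flag words used
here, and MODEL_WQ_MEM_RECLAIM and flush_allowed() are hypothetical
names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for WQ_MEM_RECLAIM. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)

/*
 * Returns false for the combination the warning complains about:
 * work running on a reclaim-capable queue waits for a queue that
 * does not carry the reclaim flag.
 */
static bool flush_allowed(unsigned int flusher_flags, unsigned int target_flags)
{
	bool flusher_reclaim = flusher_flags & MODEL_WQ_MEM_RECLAIM;
	bool target_reclaim = target_flags & MODEL_WQ_MEM_RECLAIM;

	return !(flusher_reclaim && !target_reclaim);
}
```

In this model the reported case is flush_allowed(MODEL_WQ_MEM_RECLAIM, 0),
which is rejected; after the patch the flusher side has no reclaim flag,
so the same flush becomes flush_allowed(0, 0) and passes.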



* Re: [PATCH 2/3] RDMA/rtrs-srv: only call put_device when it's in sysfs
  2020-07-24 11:15 ` [PATCH 2/3] RDMA/rtrs-srv: only call put_device when it's in sysfs Md Haris Iqbal
@ 2020-07-24 12:28   ` Jason Gunthorpe
  0 siblings, 0 replies; 6+ messages in thread
From: Jason Gunthorpe @ 2020-07-24 12:28 UTC (permalink / raw)
  To: Md Haris Iqbal
  Cc: danil.kipnis, jinpu.wang, linux-rdma, dledford, leon, bvanassche

On Fri, Jul 24, 2020 at 04:45:07PM +0530, Md Haris Iqbal wrote:
> From: Jack Wang <jinpu.wang@cloud.ionos.com>
> 
> There are error cases where free_srv is called before the device
> kobject is initialized. In such cases we shouldn't call put_device,
> otherwise a warning will be generated, e.g.:
> 
> kobject: '(null)' (000000009f5445ed): is not initialized, yet kobject_put() is being called.
> 
> So add a check before calling put_device.
> 
> Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
> Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
> Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
> ---
>  drivers/infiniband/ulp/rtrs/rtrs-srv.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv.c b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> index 0d9241f5d9e6..8a55bc559466 100644
> --- a/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv.c
> @@ -1373,7 +1373,10 @@ static void free_srv(struct rtrs_srv *srv)
>  	mutex_destroy(&srv->paths_mutex);
>  	mutex_destroy(&srv->paths_ev_mutex);
>  	/* last put to release the srv structure */
> -	put_device(&srv->dev);
> +	if (srv->dev.kobj.state_in_sysfs)
> +		put_device(&srv->dev);
> +	else
> +		kfree(srv);
>  }

Not like this, call device_initialize() sooner.

Jason
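
One way to read the suggestion above: once device_initialize() has run,
put_device() is the valid way to drop the structure on every path, so
free_srv() would need no state_in_sysfs branch. The following is a
userspace model of that ordering, not rtrs code. struct model_dev,
model_dev_init(), model_dev_put() and the model_srv_* helpers are
hypothetical stand-ins for struct device, device_initialize(),
put_device() and the srv allocation and free paths.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount and a release hook. */
struct model_dev {
	int refcount;
	void (*release)(struct model_dev *dev);
};

struct model_srv {
	struct model_dev dev;	/* first member, so the cast below is valid */
	bool registered;	/* stands in for "visible in sysfs" */
};

static int released_count;

static void model_srv_release(struct model_dev *dev)
{
	free((struct model_srv *)dev);
	released_count++;
}

/* Models device_initialize(): the object is refcounted from here on. */
static void model_dev_init(struct model_dev *dev,
			   void (*release)(struct model_dev *dev))
{
	dev->refcount = 1;
	dev->release = release;
}

/* Models put_device(): the last put invokes the release hook. */
static void model_dev_put(struct model_dev *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}

/* Initialize the embedded device immediately after allocation. */
static struct model_srv *model_srv_alloc(void)
{
	struct model_srv *srv = calloc(1, sizeof(*srv));

	if (!srv)
		return NULL;
	model_dev_init(&srv->dev, model_srv_release);
	return srv;
}

/* A single teardown path, whether or not registration ever happened. */
static void model_srv_free(struct model_srv *srv)
{
	model_dev_put(&srv->dev);
}
```

Because initialization happens at allocation time in this model, an
early error path and a normal teardown both end in the same put, and the
release hook is the only place that frees the memory.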


* Re: [PATCH 0/3] Number of fixes for rtrs
  2020-07-24 11:15 [PATCH 0/3] Number of fixes for rtrs Md Haris Iqbal
                   ` (2 preceding siblings ...)
  2020-07-24 11:15 ` [PATCH 3/3] RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq Md Haris Iqbal
@ 2020-07-29 17:28 ` Jason Gunthorpe
  3 siblings, 0 replies; 6+ messages in thread
From: Jason Gunthorpe @ 2020-07-29 17:28 UTC (permalink / raw)
  To: Md Haris Iqbal
  Cc: danil.kipnis, jinpu.wang, linux-rdma, dledford, leon, bvanassche

On Fri, Jul 24, 2020 at 04:45:05PM +0530, Md Haris Iqbal wrote:
> This patch series fixes a number of issues discovered while testing
> 
> 1) RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
> 2) RDMA/rtrs-srv: only call put_device when it's in sysfs
> 3) RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
> 
> Regards
> Md Haris Iqbal
> 
> 
> Jack Wang (2):
>   RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
> 
> Danil Kipnis (1):
>   RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting

Applied to for-next, thanks

>   RDMA/rtrs-srv: only call put_device when it's in sysfs

Needs more work

Thanks,
Jason

