All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix a deadlock in the ib_srp driver
@ 2022-02-15 18:26 Bart Van Assche
  2022-02-15 18:26 ` [PATCH 1/3] ib_srp: Add more documentation Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 18:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Tetsuo Handa, Bart Van Assche

Hi Jason,

This patch series fixes a deadlock in the ib_srp driver that was discovered
by syzbot. Please consider these patches for kernel v5.18.

Thanks,

Bart.

Bart Van Assche (3):
  ib_srp: Add more documentation
  ib_srp: Protect the target list with a mutex instead of a spinlock
  ib_srp: Fix a deadlock

 drivers/infiniband/ulp/srp/ib_srp.c | 24 +++++++++++-------------
 drivers/infiniband/ulp/srp/ib_srp.h | 13 +++++++++++--
 2 files changed, 22 insertions(+), 15 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] ib_srp: Add more documentation
  2022-02-15 18:26 [PATCH 0/3] Fix a deadlock in the ib_srp driver Bart Van Assche
@ 2022-02-15 18:26 ` Bart Van Assche
  2022-02-15 18:42   ` Leon Romanovsky
  2022-02-15 18:26 ` [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock Bart Van Assche
  2022-02-15 18:26 ` [PATCH 3/3] ib_srp: Fix a deadlock Bart Van Assche
  2 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 18:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Tetsuo Handa, Bart Van Assche

Make it more clear what the different ib_srp data structures represent.

Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/infiniband/ulp/srp/ib_srp.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index abccddeea1e3..55a575e2cace 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -92,6 +92,9 @@ enum srp_iu_type {
 };
 
 /*
+ * RDMA adapter in the initiator system.
+ *
+ * @dev_list: List of RDMA ports associated with this RDMA adapter (srp_host).
  * @mr_page_mask: HCA memory registration page mask.
  * @mr_page_size: HCA memory registration page size.
  * @mr_max_size: Maximum size in bytes of a single FR registration request.
@@ -109,6 +112,12 @@ struct srp_device {
 	bool			use_fast_reg;
 };
 
+/*
+ * One port of an RDMA adapter in the initiator system.
+ *
+ * @target_list: List of connected target ports (struct srp_target_port).
+ * @target_lock: Protects @target_list.
+ */
 struct srp_host {
 	struct srp_device      *srp_dev;
 	u8			port;
@@ -183,7 +192,7 @@ struct srp_rdma_ch {
 };
 
 /**
- * struct srp_target_port
+ * struct srp_target_port - RDMA port in the SRP target system
  * @comp_vector: Completion vector used by the first RDMA channel created for
  *   this target port.
  */

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock
  2022-02-15 18:26 [PATCH 0/3] Fix a deadlock in the ib_srp driver Bart Van Assche
  2022-02-15 18:26 ` [PATCH 1/3] ib_srp: Add more documentation Bart Van Assche
@ 2022-02-15 18:26 ` Bart Van Assche
  2022-02-15 18:44   ` Leon Romanovsky
  2022-02-15 18:26 ` [PATCH 3/3] ib_srp: Fix a deadlock Bart Van Assche
  2 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 18:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Tetsuo Handa, Bart Van Assche

This patch makes the next patch in this series easier to read.

Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 18 +++++++++---------
 drivers/infiniband/ulp/srp/ib_srp.h |  4 ++--
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index e174e853f8a4..2db7429b42e1 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -1062,9 +1062,9 @@ static void srp_remove_target(struct srp_target_port *target)
 	kfree(target->ch);
 	target->ch = NULL;
 
-	spin_lock(&target->srp_host->target_lock);
+	mutex_lock(&target->srp_host->target_mutex);
 	list_del(&target->list);
-	spin_unlock(&target->srp_host->target_lock);
+	mutex_unlock(&target->srp_host->target_mutex);
 
 	scsi_host_put(target->scsi_host);
 }
@@ -3146,9 +3146,9 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
 	rport->lld_data = target;
 	target->rport = rport;
 
-	spin_lock(&host->target_lock);
+	mutex_lock(&host->target_mutex);
 	list_add_tail(&target->list, &host->target_list);
-	spin_unlock(&host->target_lock);
+	mutex_unlock(&host->target_mutex);
 
 	scsi_scan_target(&target->scsi_host->shost_gendev,
 			 0, target->scsi_id, SCAN_WILD_CARD, SCSI_SCAN_INITIAL);
@@ -3203,7 +3203,7 @@ static bool srp_conn_unique(struct srp_host *host,
 
 	ret = true;
 
-	spin_lock(&host->target_lock);
+	mutex_lock(&host->target_mutex);
 	list_for_each_entry(t, &host->target_list, list) {
 		if (t != target &&
 		    target->id_ext == t->id_ext &&
@@ -3213,7 +3213,7 @@ static bool srp_conn_unique(struct srp_host *host,
 			break;
 		}
 	}
-	spin_unlock(&host->target_lock);
+	mutex_unlock(&host->target_mutex);
 
 out:
 	return ret;
@@ -3898,7 +3898,7 @@ static struct srp_host *srp_add_port(struct srp_device *device, u8 port)
 		return NULL;
 
 	INIT_LIST_HEAD(&host->target_list);
-	spin_lock_init(&host->target_lock);
+	mutex_init(&host->target_mutex);
 	init_completion(&host->released);
 	mutex_init(&host->add_target_mutex);
 	host->srp_dev = device;
@@ -4041,10 +4041,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
 		/*
 		 * Remove all target ports.
 		 */
-		spin_lock(&host->target_lock);
+		mutex_lock(&host->target_mutex);
 		list_for_each_entry(target, &host->target_list, list)
 			srp_queue_remove_work(target);
-		spin_unlock(&host->target_lock);
+		mutex_unlock(&host->target_mutex);
 
 		/*
 		 * Wait for tl_err and target port removal tasks.
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 55a575e2cace..2af968277994 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -116,14 +116,14 @@ struct srp_device {
  * One port of an RDMA adapter in the initiator system.
  *
  * @target_list: List of connected target ports (struct srp_target_port).
- * @target_lock: Protects @target_list.
+ * @target_mutex: Protects @target_list.
  */
 struct srp_host {
 	struct srp_device      *srp_dev;
 	u8			port;
 	struct device		dev;
 	struct list_head	target_list;
-	spinlock_t		target_lock;
+	struct mutex		target_mutex;
 	struct completion	released;
 	struct list_head	list;
 	struct mutex		add_target_mutex;

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] ib_srp: Fix a deadlock
  2022-02-15 18:26 [PATCH 0/3] Fix a deadlock in the ib_srp driver Bart Van Assche
  2022-02-15 18:26 ` [PATCH 1/3] ib_srp: Add more documentation Bart Van Assche
  2022-02-15 18:26 ` [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock Bart Van Assche
@ 2022-02-15 18:26 ` Bart Van Assche
  2022-02-15 18:51   ` Leon Romanovsky
  2 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 18:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Tetsuo Handa, Bart Van Assche,
	syzbot+831661966588c802aae9

Wait on tl_err_work instead of flushing system_long_wq since flushing
system_long_wq is deadlock-prone.

Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Fixes: ef6c49d87c34 ("IB/srp: Eliminate state SRP_TARGET_DEAD")
Reported-by: syzbot+831661966588c802aae9@syzkaller.appspotmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2db7429b42e1..8e1561a6d325 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -4044,12 +4044,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
 		mutex_lock(&host->target_mutex);
 		list_for_each_entry(target, &host->target_list, list)
 			srp_queue_remove_work(target);
+		list_for_each_entry(target, &host->target_list, list)
+			flush_work(&target->tl_err_work);
 		mutex_unlock(&host->target_mutex);
 
-		/*
-		 * Wait for tl_err and target port removal tasks.
-		 */
-		flush_workqueue(system_long_wq);
 		flush_workqueue(srp_remove_wq);
 
 		kfree(host);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] ib_srp: Add more documentation
  2022-02-15 18:26 ` [PATCH 1/3] ib_srp: Add more documentation Bart Van Assche
@ 2022-02-15 18:42   ` Leon Romanovsky
  2022-02-15 20:34     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Leon Romanovsky @ 2022-02-15 18:42 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa

On Tue, Feb 15, 2022 at 10:26:48AM -0800, Bart Van Assche wrote:
> Make it more clear what the different ib_srp data structures represent.
> 
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.h | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
> index abccddeea1e3..55a575e2cace 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.h
> +++ b/drivers/infiniband/ulp/srp/ib_srp.h
> @@ -92,6 +92,9 @@ enum srp_iu_type {
>  };
>  
>  /*
> + * RDMA adapter in the initiator system.
> + *
> + * @dev_list: List of RDMA ports associated with this RDMA adapter (srp_host).

Isn't this list of RDMA devices and not ports?

>   * @mr_page_mask: HCA memory registration page mask.
>   * @mr_page_size: HCA memory registration page size.
>   * @mr_max_size: Maximum size in bytes of a single FR registration request.
> @@ -109,6 +112,12 @@ struct srp_device {
>  	bool			use_fast_reg;
>  };
>  
> +/*
> + * One port of an RDMA adapter in the initiator system.
> + *
> + * @target_list: List of connected target ports (struct srp_target_port).
> + * @target_lock: Protects @target_list.
> + */
>  struct srp_host {
>  	struct srp_device      *srp_dev;
>  	u8			port;
> @@ -183,7 +192,7 @@ struct srp_rdma_ch {
>  };
>  
>  /**
> - * struct srp_target_port
> + * struct srp_target_port - RDMA port in the SRP target system
>   * @comp_vector: Completion vector used by the first RDMA channel created for
>   *   this target port.
>   */

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock
  2022-02-15 18:26 ` [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock Bart Van Assche
@ 2022-02-15 18:44   ` Leon Romanovsky
  2022-02-15 20:35     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Leon Romanovsky @ 2022-02-15 18:44 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa

On Tue, Feb 15, 2022 at 10:26:49AM -0800, Bart Van Assche wrote:
> This patch makes the next patch in this series easier to read.

It is not readability change, but move to lock that can sleep (mutex),
which is always good thing.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> 
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c | 18 +++++++++---------
>  drivers/infiniband/ulp/srp/ib_srp.h |  4 ++--
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index e174e853f8a4..2db7429b42e1 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -1062,9 +1062,9 @@ static void srp_remove_target(struct srp_target_port *target)
>  	kfree(target->ch);
>  	target->ch = NULL;
>  
> -	spin_lock(&target->srp_host->target_lock);
> +	mutex_lock(&target->srp_host->target_mutex);
>  	list_del(&target->list);
> -	spin_unlock(&target->srp_host->target_lock);
> +	mutex_unlock(&target->srp_host->target_mutex);
>  
>  	scsi_host_put(target->scsi_host);
>  }
> @@ -3146,9 +3146,9 @@ static int srp_add_target(struct srp_host *host, struct srp_target_port *target)
>  	rport->lld_data = target;
>  	target->rport = rport;
>  
> -	spin_lock(&host->target_lock);
> +	mutex_lock(&host->target_mutex);
>  	list_add_tail(&target->list, &host->target_list);
> -	spin_unlock(&host->target_lock);
> +	mutex_unlock(&host->target_mutex);
>  
>  	scsi_scan_target(&target->scsi_host->shost_gendev,
>  			 0, target->scsi_id, SCAN_WILD_CARD, SCSI_SCAN_INITIAL);
> @@ -3203,7 +3203,7 @@ static bool srp_conn_unique(struct srp_host *host,
>  
>  	ret = true;
>  
> -	spin_lock(&host->target_lock);
> +	mutex_lock(&host->target_mutex);
>  	list_for_each_entry(t, &host->target_list, list) {
>  		if (t != target &&
>  		    target->id_ext == t->id_ext &&
> @@ -3213,7 +3213,7 @@ static bool srp_conn_unique(struct srp_host *host,
>  			break;
>  		}
>  	}
> -	spin_unlock(&host->target_lock);
> +	mutex_unlock(&host->target_mutex);
>  
>  out:
>  	return ret;
> @@ -3898,7 +3898,7 @@ static struct srp_host *srp_add_port(struct srp_device *device, u8 port)
>  		return NULL;
>  
>  	INIT_LIST_HEAD(&host->target_list);
> -	spin_lock_init(&host->target_lock);
> +	mutex_init(&host->target_mutex);
>  	init_completion(&host->released);
>  	mutex_init(&host->add_target_mutex);
>  	host->srp_dev = device;
> @@ -4041,10 +4041,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
>  		/*
>  		 * Remove all target ports.
>  		 */
> -		spin_lock(&host->target_lock);
> +		mutex_lock(&host->target_mutex);
>  		list_for_each_entry(target, &host->target_list, list)
>  			srp_queue_remove_work(target);
> -		spin_unlock(&host->target_lock);
> +		mutex_unlock(&host->target_mutex);
>  
>  		/*
>  		 * Wait for tl_err and target port removal tasks.
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
> index 55a575e2cace..2af968277994 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.h
> +++ b/drivers/infiniband/ulp/srp/ib_srp.h
> @@ -116,14 +116,14 @@ struct srp_device {
>   * One port of an RDMA adapter in the initiator system.
>   *
>   * @target_list: List of connected target ports (struct srp_target_port).
> - * @target_lock: Protects @target_list.
> + * @target_mutex: Protects @target_list.
>   */
>  struct srp_host {
>  	struct srp_device      *srp_dev;
>  	u8			port;
>  	struct device		dev;
>  	struct list_head	target_list;
> -	spinlock_t		target_lock;
> +	struct mutex		target_mutex;
>  	struct completion	released;
>  	struct list_head	list;
>  	struct mutex		add_target_mutex;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ib_srp: Fix a deadlock
  2022-02-15 18:26 ` [PATCH 3/3] ib_srp: Fix a deadlock Bart Van Assche
@ 2022-02-15 18:51   ` Leon Romanovsky
  2022-02-15 20:37     ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Leon Romanovsky @ 2022-02-15 18:51 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa, syzbot+831661966588c802aae9

On Tue, Feb 15, 2022 at 10:26:50AM -0800, Bart Van Assche wrote:
> Wait on tl_err_work instead of flushing system_long_wq since flushing
> system_long_wq is deadlock-prone.
> 
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Fixes: ef6c49d87c34 ("IB/srp: Eliminate state SRP_TARGET_DEAD")
> Reported-by: syzbot+831661966588c802aae9@syzkaller.appspotmail.com
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/infiniband/ulp/srp/ib_srp.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> index 2db7429b42e1..8e1561a6d325 100644
> --- a/drivers/infiniband/ulp/srp/ib_srp.c
> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> @@ -4044,12 +4044,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
>  		mutex_lock(&host->target_mutex);
>  		list_for_each_entry(target, &host->target_list, list)
>  			srp_queue_remove_work(target);
> +		list_for_each_entry(target, &host->target_list, list)
> +			flush_work(&target->tl_err_work);

Sorry for my silly question, but why do you do flush and not cancel
here? You anyway remove SRP device, so the result of flush is not
really important, am I right?

Thanks

>  		mutex_unlock(&host->target_mutex);
>  
> -		/*
> -		 * Wait for tl_err and target port removal tasks.
> -		 */
> -		flush_workqueue(system_long_wq);
>  		flush_workqueue(srp_remove_wq);
>  
>  		kfree(host);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] ib_srp: Add more documentation
  2022-02-15 18:42   ` Leon Romanovsky
@ 2022-02-15 20:34     ` Bart Van Assche
  2022-02-16  7:02       ` Leon Romanovsky
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 20:34 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa

On 2/15/22 10:42, Leon Romanovsky wrote:
> On Tue, Feb 15, 2022 at 10:26:48AM -0800, Bart Van Assche wrote:
>>   /*
>> + * RDMA adapter in the initiator system.
>> + *
>> + * @dev_list: List of RDMA ports associated with this RDMA adapter (srp_host).
> 
> Isn't this list of RDMA devices and not ports?

Please take a look at srp_add_port(). I think that function establishes 
a 1:1 relationship between an RDMA port and struct srp_host.

Thanks,

Bart.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock
  2022-02-15 18:44   ` Leon Romanovsky
@ 2022-02-15 20:35     ` Bart Van Assche
  0 siblings, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 20:35 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa

On 2/15/22 10:44, Leon Romanovsky wrote:
> On Tue, Feb 15, 2022 at 10:26:49AM -0800, Bart Van Assche wrote:
>> This patch makes the next patch in this series easier to read.
> 
> It is not readability change, but move to lock that can sleep (mutex),
> which is always good thing.
> 
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>

Thanks for the review, but please note that I wrote that this patch 
improves readability of the next patch in this series.

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ib_srp: Fix a deadlock
  2022-02-15 18:51   ` Leon Romanovsky
@ 2022-02-15 20:37     ` Bart Van Assche
  2022-02-15 20:41       ` Jason Gunthorpe
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 20:37 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa, syzbot+831661966588c802aae9

On 2/15/22 10:51, Leon Romanovsky wrote:
> On Tue, Feb 15, 2022 at 10:26:50AM -0800, Bart Van Assche wrote:
>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>> index 2db7429b42e1..8e1561a6d325 100644
>> --- a/drivers/infiniband/ulp/srp/ib_srp.c
>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>> @@ -4044,12 +4044,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
>>   		mutex_lock(&host->target_mutex);
>>   		list_for_each_entry(target, &host->target_list, list)
>>   			srp_queue_remove_work(target);
>> +		list_for_each_entry(target, &host->target_list, list)
>> +			flush_work(&target->tl_err_work);
> 
> Sorry for my silly question, but why do you do flush and not cancel
> here? You anyway remove SRP device, so the result of flush is not
> really important, am I right?

That's a great question. It probably doesn't matter much whether 
flush_work() or cancel_work_sync() is called in this context since 
srp_queue_remove_work() indirectly cancels tl_err_work. See also the 
following code in srp_remove_target():
  	cancel_work_sync(&target->tl_err_work);

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ib_srp: Fix a deadlock
  2022-02-15 20:37     ` Bart Van Assche
@ 2022-02-15 20:41       ` Jason Gunthorpe
  2022-02-15 20:50         ` Bart Van Assche
  0 siblings, 1 reply; 13+ messages in thread
From: Jason Gunthorpe @ 2022-02-15 20:41 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Leon Romanovsky, linux-rdma, Tetsuo Handa, syzbot+831661966588c802aae9

On Tue, Feb 15, 2022 at 12:37:43PM -0800, Bart Van Assche wrote:
> On 2/15/22 10:51, Leon Romanovsky wrote:
> > On Tue, Feb 15, 2022 at 10:26:50AM -0800, Bart Van Assche wrote:
> > > diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
> > > index 2db7429b42e1..8e1561a6d325 100644
> > > +++ b/drivers/infiniband/ulp/srp/ib_srp.c
> > > @@ -4044,12 +4044,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
> > >   		mutex_lock(&host->target_mutex);
> > >   		list_for_each_entry(target, &host->target_list, list)
> > >   			srp_queue_remove_work(target);
> > > +		list_for_each_entry(target, &host->target_list, list)
> > > +			flush_work(&target->tl_err_work);
> > 
> > Sorry for my silly question, but why do you do flush and not cancel
> > here? You anyway remove SRP device, so the result of flush is not
> > really important, am I right?
> 
> That's a great question. It probably doesn't matter much whether
> flush_work() or cancel_work_sync() is called in this context since
> srp_queue_remove_work() indirectly cancels tl_err_work. See also the
> following code in srp_remove_target():
>  	cancel_work_sync(&target->tl_err_work);

If it is already canceled then why call flush?

Jason

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ib_srp: Fix a deadlock
  2022-02-15 20:41       ` Jason Gunthorpe
@ 2022-02-15 20:50         ` Bart Van Assche
  0 siblings, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2022-02-15 20:50 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Tetsuo Handa, syzbot+831661966588c802aae9

On 2/15/22 12:41, Jason Gunthorpe wrote:
> On Tue, Feb 15, 2022 at 12:37:43PM -0800, Bart Van Assche wrote:
>> On 2/15/22 10:51, Leon Romanovsky wrote:
>>> On Tue, Feb 15, 2022 at 10:26:50AM -0800, Bart Van Assche wrote:
>>>> diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
>>>> index 2db7429b42e1..8e1561a6d325 100644
>>>> +++ b/drivers/infiniband/ulp/srp/ib_srp.c
>>>> @@ -4044,12 +4044,10 @@ static void srp_remove_one(struct ib_device *device, void *client_data)
>>>>    		mutex_lock(&host->target_mutex);
>>>>    		list_for_each_entry(target, &host->target_list, list)
>>>>    			srp_queue_remove_work(target);
>>>> +		list_for_each_entry(target, &host->target_list, list)
>>>> +			flush_work(&target->tl_err_work);
>>>
>>> Sorry for my silly question, but why do you do flush and not cancel
>>> here? You anyway remove SRP device, so the result of flush is not
>>> really important, am I right?
>>
>> That's a great question. It probably doesn't matter much whether
>> flush_work() or cancel_work_sync() is called in this context since
>> srp_queue_remove_work() indirectly cancels tl_err_work. See also the
>> following code in srp_remove_target():
>>   	cancel_work_sync(&target->tl_err_work);
> 
> If it is already canceled then why call flush?

I'll see whether I can leave the flush_work(&target->tl_err_work) calls out.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] ib_srp: Add more documentation
  2022-02-15 20:34     ` Bart Van Assche
@ 2022-02-16  7:02       ` Leon Romanovsky
  0 siblings, 0 replies; 13+ messages in thread
From: Leon Romanovsky @ 2022-02-16  7:02 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jason Gunthorpe, linux-rdma, Tetsuo Handa

On Tue, Feb 15, 2022 at 12:34:27PM -0800, Bart Van Assche wrote:
> On 2/15/22 10:42, Leon Romanovsky wrote:
> > On Tue, Feb 15, 2022 at 10:26:48AM -0800, Bart Van Assche wrote:
> > >   /*
> > > + * RDMA adapter in the initiator system.
> > > + *
> > > + * @dev_list: List of RDMA ports associated with this RDMA adapter (srp_host).
> > 
> > Isn't this list of RDMA devices and not ports?
> 
> Please take a look at srp_add_port(). I think that function establishes a
> 1:1 relationship between an RDMA port and struct srp_host.

Right, but this list contains devices and not ports.

Anyway, it is not important.

Thanks

> 
> Thanks,
> 
> Bart.
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-02-16  7:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-15 18:26 [PATCH 0/3] Fix a deadlock in the ib_srp driver Bart Van Assche
2022-02-15 18:26 ` [PATCH 1/3] ib_srp: Add more documentation Bart Van Assche
2022-02-15 18:42   ` Leon Romanovsky
2022-02-15 20:34     ` Bart Van Assche
2022-02-16  7:02       ` Leon Romanovsky
2022-02-15 18:26 ` [PATCH 2/3] ib_srp: Protect the target list with a mutex instead of a spinlock Bart Van Assche
2022-02-15 18:44   ` Leon Romanovsky
2022-02-15 20:35     ` Bart Van Assche
2022-02-15 18:26 ` [PATCH 3/3] ib_srp: Fix a deadlock Bart Van Assche
2022-02-15 18:51   ` Leon Romanovsky
2022-02-15 20:37     ` Bart Van Assche
2022-02-15 20:41       ` Jason Gunthorpe
2022-02-15 20:50         ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.