Linux-RDMA Archive on lore.kernel.org
* [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
@ 2019-09-04 21:25 Sagi Grimberg
  2019-09-10 11:18 ` Krishnamraju Eraparaju
  0 siblings, 1 reply; 12+ messages in thread
From: Sagi Grimberg @ 2019-09-04 21:25 UTC (permalink / raw)
  To: linux-rdma; +Cc: Jason Gunthorpe

This may be the final put on a qp and result in freeing
resources, so it should not be done with interrupts disabled.

Doing so produces the following warning:
--
[  317.026048] WARNING: CPU: 1 PID: 443 at kernel/smp.c:425 smp_call_function_many+0xa0/0x260
[  317.026131] Call Trace:
[  317.026159]  ? load_new_mm_cr3+0xe0/0xe0
[  317.026161]  on_each_cpu+0x28/0x50
[  317.026183]  __purge_vmap_area_lazy+0x72/0x150
[  317.026200]  free_vmap_area_noflush+0x7a/0x90
[  317.026202]  remove_vm_area+0x6f/0x80
[  317.026203]  __vunmap+0x71/0x210
[  317.026211]  siw_free_qp+0x8d/0x130 [siw]
[  317.026217]  destroy_cm_id+0xc3/0x200 [iw_cm]
[  317.026222]  rdma_destroy_id+0x224/0x2b0 [rdma_cm]
[  317.026226]  nvme_rdma_reset_ctrl_work+0x2c/0x70 [nvme_rdma]
[  317.026235]  process_one_work+0x1f4/0x3e0
[  317.026249]  worker_thread+0x221/0x3e0
[  317.026252]  ? process_one_work+0x3e0/0x3e0
[  317.026256]  kthread+0x117/0x130
[  317.026264]  ? kthread_create_worker_on_cpu+0x70/0x70
[  317.026275]  ret_from_fork+0x35/0x40
--

Fix this by exchanging the qp pointer early on and safely destroying
it.
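
The "exchange the pointer early, then drop the reference" idea can be
sketched in userspace C. This is only an illustration under stated
assumptions: demo_qp, demo_cm_id, demo_rem_ref() and demo_destroy() are
hypothetical names, the reference count is a plain int, and C11
atomic_exchange() stands in for the kernel's xchg(). It is not the iwcm
code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the iwcm types. */
struct demo_qp {
	int refs;   /* simplified, non-atomic reference count */
	int freed;  /* set when the last reference is dropped */
};

struct demo_cm_id {
	_Atomic(struct demo_qp *) qp;
};

/* Drop one reference; mark the object "freed" on the final put. */
static void demo_rem_ref(struct demo_qp *qp)
{
	if (--qp->refs == 0)
		qp->freed = 1;
}

/*
 * Detach the qp pointer first, then drop the reference outside of any
 * critical section. Returns 1 if a reference was dropped, 0 if no qp
 * was attached.
 */
static int demo_destroy(struct demo_cm_id *id)
{
	struct demo_qp *qp = atomic_exchange(&id->qp, (struct demo_qp *)NULL);

	if (!qp)
		return 0;
	demo_rem_ref(qp);
	return 1;
}
```

Because the exchange hands the pointer to exactly one caller, a second
call to demo_destroy() finds NULL and does nothing.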

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
---
Changes from v2:
- store the qp locally so we don't need to unlock the cm_id_priv lock when
  destroying the qp

Changes from v1:
- don't release the lock before qp pointer is cleared.

 drivers/infiniband/core/iwcm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
index 72141c5b7c95..c64707f68d22 100644
--- a/drivers/infiniband/core/iwcm.c
+++ b/drivers/infiniband/core/iwcm.c
@@ -373,8 +373,10 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
 {
 	struct iwcm_id_private *cm_id_priv;
 	unsigned long flags;
+	struct ib_qp *qp;
 
 	cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
+	qp = xchg(&cm_id_priv->qp, NULL);
 	/*
 	 * Wait if we're currently in a connect or accept downcall. A
 	 * listening endpoint should never block here.
@@ -401,7 +403,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
 		cm_id_priv->state = IW_CM_STATE_DESTROYING;
 		spin_unlock_irqrestore(&cm_id_priv->lock, flags);
 		/* Abrupt close of the connection */
-		(void)iwcm_modify_qp_err(cm_id_priv->qp);
+		(void)iwcm_modify_qp_err(qp);
 		spin_lock_irqsave(&cm_id_priv->lock, flags);
 		break;
 	case IW_CM_STATE_IDLE:
@@ -426,11 +428,9 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
 		BUG();
 		break;
 	}
-	if (cm_id_priv->qp) {
-		cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
-		cm_id_priv->qp = NULL;
-	}
 	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+	if (qp)
+		cm_id_priv->id.device->ops.iw_rem_ref(qp);
 
 	if (cm_id->mapped) {
 		iwpm_remove_mapinfo(&cm_id->local_addr, &cm_id->m_local_addr);
-- 
2.17.1



* Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-04 21:25 [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref Sagi Grimberg
@ 2019-09-10 11:18 ` Krishnamraju Eraparaju
  2019-09-10 16:53   ` Sagi Grimberg
  0 siblings, 1 reply; 12+ messages in thread
From: Krishnamraju Eraparaju @ 2019-09-10 11:18 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-rdma, Jason Gunthorpe

On Wednesday, September 09/04/19, 2019 at 14:25:31 -0700, Sagi Grimberg wrote:
> This may be the final put on a qp and result in freeing
> resourcesand should not be done with interrupts disabled.

Hi Sagi,

A few things to consider in fixing this completely:
  - There are other places where iw_rem_ref() should be called
    after the spinlock critical section, e.g. in cm_close_handler(),
    iw_cm_connect(), ...
  - Any modification to "cm_id_priv" should be done within the spinlock
    critical section; modifying cm_id_priv->qp outside the spinlock, even
    with an atomic xchg(), might be error prone.
  - The structure "siw_base_qp" is freed in siw_destroy_qp(),
    but it should be freed at the end of siw_free_qp().

I am about to finish writing a patch that covers all the above issues.
I will test it and submit it here by EOD.

Regards,
Krishna.
> 
> Produce the following warning:
> --
> [  317.026048] WARNING: CPU: 1 PID: 443 at kernel/smp.c:425 smp_call_function_many+0xa0/0x260
> [  317.026131] Call Trace:
> [  317.026159]  ? load_new_mm_cr3+0xe0/0xe0
> [  317.026161]  on_each_cpu+0x28/0x50
> [  317.026183]  __purge_vmap_area_lazy+0x72/0x150
> [  317.026200]  free_vmap_area_noflush+0x7a/0x90
> [  317.026202]  remove_vm_area+0x6f/0x80
> [  317.026203]  __vunmap+0x71/0x210
> [  317.026211]  siw_free_qp+0x8d/0x130 [siw]
> [  317.026217]  destroy_cm_id+0xc3/0x200 [iw_cm]
> [  317.026222]  rdma_destroy_id+0x224/0x2b0 [rdma_cm]
> [  317.026226]  nvme_rdma_reset_ctrl_work+0x2c/0x70 [nvme_rdma]
> [  317.026235]  process_one_work+0x1f4/0x3e0
> [  317.026249]  worker_thread+0x221/0x3e0
> [  317.026252]  ? process_one_work+0x3e0/0x3e0
> [  317.026256]  kthread+0x117/0x130
> [  317.026264]  ? kthread_create_worker_on_cpu+0x70/0x70
> [  317.026275]  ret_from_fork+0x35/0x40
> --
> 
> Fix this by exchanging the qp pointer early on and safely destroying
> it.
> 
> Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
> ---
> Changes from v2:
> - store the qp locally so we don't need to unlock the cm_id_priv lock when
>   destroying the qp
> 
> Changes from v1:
> - don't release the lock before qp pointer is cleared.
> 
>  drivers/infiniband/core/iwcm.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
> index 72141c5b7c95..c64707f68d22 100644
> --- a/drivers/infiniband/core/iwcm.c
> +++ b/drivers/infiniband/core/iwcm.c
> @@ -373,8 +373,10 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
>  {
>  	struct iwcm_id_private *cm_id_priv;
>  	unsigned long flags;
> +	struct ib_qp *qp;
>  
>  	cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
> +	qp = xchg(&cm_id_priv->qp, NULL);
>  	/*
>  	 * Wait if we're currently in a connect or accept downcall. A
>  	 * listening endpoint should never block here.
> @@ -401,7 +403,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
>  		cm_id_priv->state = IW_CM_STATE_DESTROYING;
>  		spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>  		/* Abrupt close of the connection */
> -		(void)iwcm_modify_qp_err(cm_id_priv->qp);
> +		(void)iwcm_modify_qp_err(qp);
>  		spin_lock_irqsave(&cm_id_priv->lock, flags);
>  		break;
>  	case IW_CM_STATE_IDLE:
> @@ -426,11 +428,9 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
>  		BUG();
>  		break;
>  	}
> -	if (cm_id_priv->qp) {
> -		cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
> -		cm_id_priv->qp = NULL;
> -	}
>  	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
> +	if (qp)
> +		cm_id_priv->id.device->ops.iw_rem_ref(qp);
>  
>  	if (cm_id->mapped) {
>  		iwpm_remove_mapinfo(&cm_id->local_addr, &cm_id->m_local_addr);
> -- 
> 2.17.1
> 


* Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-10 11:18 ` Krishnamraju Eraparaju
@ 2019-09-10 16:53   ` Sagi Grimberg
  2019-09-10 19:21     ` Krishnamraju Eraparaju
  2019-09-11  9:38     ` Bernard Metzler
  0 siblings, 2 replies; 12+ messages in thread
From: Sagi Grimberg @ 2019-09-10 16:53 UTC (permalink / raw)
  To: Krishnamraju Eraparaju; +Cc: linux-rdma, Jason Gunthorpe


>> This may be the final put on a qp and result in freeing
>> resourcesand should not be done with interrupts disabled.
> 
> Hi Sagi,
> 
> Few things to consider in fixing this completely:
>    - there are some other places where iw_rem_ref() should be called
>      after spinlock critical section. eg: in cm_close_handler(),
> iw_cm_connect(),...
>    - Any modifications to "cm_id_priv" should be done with in spinlock
>      critical section, modifying cm_id_priv->qp outside spinlocks, even
> with atomic xchg(), might be error prone.
>    - the structure "siw_base_qp" is getting freed in siw_destroy_qp(),
>      but it should be done at the end of siw_free_qp().

Not sure why you say that, at the end of this function ->qp will be null
anyways...

>    
> I am about to finish writing a patch that cover all the above issues.
> Will test it and submit here by EOD.

Sure, you take it. Just stumbled on it so thought I'd go ahead and send
a patch...


* Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-10 16:53   ` Sagi Grimberg
@ 2019-09-10 19:21     ` Krishnamraju Eraparaju
  2019-09-11  9:38     ` Bernard Metzler
  1 sibling, 0 replies; 12+ messages in thread
From: Krishnamraju Eraparaju @ 2019-09-10 19:21 UTC (permalink / raw)
  To: Sagi Grimberg, Steve Wise, Bernard Metzler; +Cc: linux-rdma, Jason Gunthorpe

Please review the patch below; I will resubmit it as a patch series
after review.
- As the kref release handler (siw_free_qp) uses vfree(), iwcm can't call
  iw_rem_ref() with spinlocks held. Doing so can cause vfree() to sleep
  with irqs disabled.
  Two possible solutions:
  1) With the spinlock acquired, take a copy of "cm_id_priv->qp" and set
     it to NULL. After releasing the lock, use the copied qp pointer for
     rem_ref().
  2) Replace the vmalloc()/vfree() calls that cause the issue with
     kmalloc()/kfree() in the SIW driver.

  Solution 2 may not be ideal, as allocating huge contiguous memory for
  the SQ & RQ doesn't look appropriate.

- The structure "siw_base_qp" is freed in siw_destroy_qp(), but
  if cm_close_handler() holds the last reference, then siw_free_qp(),
  via cm_close_handler(), tries to get the already freed "siw_base_qp"
  from "ib_qp".
  Hence, "siw_base_qp" should be freed at the end of siw_free_qp().
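
Solution 1 (copy the pointer and clear it under the lock, then drop the
reference after unlocking) can be sketched in userspace C as follows.
Everything here is a hypothetical stand-in: demo_qp, demo_cm_id_priv and
the demo_* helpers are invented for illustration, a toy atomic_flag
spinlock replaces spin_lock_irqsave(), and the lock_held field exists only
so the sketch can check that the put happens outside the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real iwcm or siw types. */
struct demo_qp {
	int refs;
	int freed;
};

struct demo_cm_id_priv {
	atomic_flag lock;   /* toy spinlock */
	int lock_held;      /* illustration only, lets the sketch check itself */
	struct demo_qp *qp;
};

static int demo_put_under_lock;  /* counts puts made with the lock held */

static void demo_init(struct demo_cm_id_priv *priv, struct demo_qp *qp)
{
	atomic_flag_clear(&priv->lock);
	priv->lock_held = 0;
	priv->qp = qp;
}

static void demo_lock(struct demo_cm_id_priv *priv)
{
	while (atomic_flag_test_and_set(&priv->lock))
		;  /* spin until the flag was previously clear */
	priv->lock_held = 1;
}

static void demo_unlock(struct demo_cm_id_priv *priv)
{
	priv->lock_held = 0;
	atomic_flag_clear(&priv->lock);
}

/* Drop one reference; the final put may do expensive cleanup. */
static void demo_rem_ref(struct demo_cm_id_priv *priv, struct demo_qp *qp)
{
	if (priv->lock_held)
		demo_put_under_lock++;
	if (--qp->refs == 0)
		qp->freed = 1;
}

static void demo_close(struct demo_cm_id_priv *priv)
{
	struct demo_qp *qp;

	/* State changes stay inside the critical section. */
	demo_lock(priv);
	qp = priv->qp;
	priv->qp = NULL;
	demo_unlock(priv);

	/* The potentially heavy put runs only after the lock is released. */
	if (qp)
		demo_rem_ref(priv, qp);
}
```

The shared field is only ever written with the lock held, which is the
property the review asks for, while the release callback is free to do
work that is not allowed in a critical section.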


diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
index 72141c5b7c95..d5ab69fa598a 100644
--- a/drivers/infiniband/core/iwcm.c
+++ b/drivers/infiniband/core/iwcm.c
@@ -373,6 +373,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
 {
        struct iwcm_id_private *cm_id_priv;
        unsigned long flags;
+       struct ib_qp *qp;

        cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
        /*
@@ -389,6 +390,9 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
        set_bit(IWCM_F_DROP_EVENTS, &cm_id_priv->flags);

        spin_lock_irqsave(&cm_id_priv->lock, flags);
+       qp = cm_id_priv->qp;
+       cm_id_priv->qp = NULL;
+
        switch (cm_id_priv->state) {
        case IW_CM_STATE_LISTEN:
                cm_id_priv->state = IW_CM_STATE_DESTROYING;
@@ -401,7 +405,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
                cm_id_priv->state = IW_CM_STATE_DESTROYING;
                spin_unlock_irqrestore(&cm_id_priv->lock, flags);
                /* Abrupt close of the connection */
-               (void)iwcm_modify_qp_err(cm_id_priv->qp);
+               (void)iwcm_modify_qp_err(qp);
                spin_lock_irqsave(&cm_id_priv->lock, flags);
                break;
        case IW_CM_STATE_IDLE:
@@ -426,11 +430,9 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
                BUG();
                break;
        }
-       if (cm_id_priv->qp) {
-               cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
-               cm_id_priv->qp = NULL;
-       }
        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+       if (qp)
+               cm_id_priv->id.device->ops.iw_rem_ref(qp);

        if (cm_id->mapped) {
                 iwpm_remove_mapinfo(&cm_id->local_addr, &cm_id->m_local_addr);
@@ -671,11 +673,11 @@ int iw_cm_accept(struct iw_cm_id *cm_id,
                BUG_ON(cm_id_priv->state != IW_CM_STATE_CONN_RECV);
                cm_id_priv->state = IW_CM_STATE_IDLE;
                spin_lock_irqsave(&cm_id_priv->lock, flags);
-               if (cm_id_priv->qp) {
-                       cm_id->device->ops.iw_rem_ref(qp);
-                       cm_id_priv->qp = NULL;
-               }
+               qp = cm_id_priv->qp;
+               cm_id_priv->qp = NULL;
                spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+               if (qp)
+                       cm_id->device->ops.iw_rem_ref(qp);
                clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
                wake_up_all(&cm_id_priv->connect_wait);
        }
@@ -730,13 +732,13 @@ int iw_cm_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *iw_param)
                return 0;       /* success */

        spin_lock_irqsave(&cm_id_priv->lock, flags);
-       if (cm_id_priv->qp) {
-               cm_id->device->ops.iw_rem_ref(qp);
-               cm_id_priv->qp = NULL;
-       }
+       qp = cm_id_priv->qp;
+       cm_id_priv->qp = NULL;
        cm_id_priv->state = IW_CM_STATE_IDLE;
 err:
        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+       if (qp)
+               cm_id->device->ops.iw_rem_ref(qp);
        clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
        wake_up_all(&cm_id_priv->connect_wait);
        return ret;
@@ -880,6 +882,7 @@ static int cm_conn_rep_handler(struct iwcm_id_private *cm_id_priv,
 {
        unsigned long flags;
        int ret;
+       struct ib_qp *qp = NULL;

        spin_lock_irqsave(&cm_id_priv->lock, flags);
        /*
@@ -896,11 +899,13 @@ static int cm_conn_rep_handler(struct iwcm_id_private *cm_id_priv,
                cm_id_priv->state = IW_CM_STATE_ESTABLISHED;
        } else {
                /* REJECTED or RESET */
-               cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
+               qp = cm_id_priv->qp;
                cm_id_priv->qp = NULL;
                cm_id_priv->state = IW_CM_STATE_IDLE;
        }
        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
+       if (qp)
+               cm_id_priv->id.device->ops.iw_rem_ref(qp);
        ret = cm_id_priv->id.cm_handler(&cm_id_priv->id, iw_event);

        if (iw_event->private_data_len)
@@ -944,12 +949,12 @@ static int cm_close_handler(struct iwcm_id_private *cm_id_priv,
 {
        unsigned long flags;
        int ret = 0;
+       struct ib_qp *qp;
+
        spin_lock_irqsave(&cm_id_priv->lock, flags);
+       qp = cm_id_priv->qp;
+       cm_id_priv->qp = NULL;

-       if (cm_id_priv->qp) {
-               cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
-               cm_id_priv->qp = NULL;
-       }
        switch (cm_id_priv->state) {
        case IW_CM_STATE_ESTABLISHED:
        case IW_CM_STATE_CLOSING:
@@ -965,6 +970,8 @@ static int cm_close_handler(struct iwcm_id_private *cm_id_priv,
        }
        spin_unlock_irqrestore(&cm_id_priv->lock, flags);

+       if (qp)
+               cm_id_priv->id.device->ops.iw_rem_ref(qp);
        return ret;
 }

diff --git a/drivers/infiniband/sw/siw/siw_qp.c b/drivers/infiniband/sw/siw/siw_qp.c
index 430314c8abd9..cb177688a49f 100644
--- a/drivers/infiniband/sw/siw/siw_qp.c
+++ b/drivers/infiniband/sw/siw/siw_qp.c
@@ -1307,6 +1307,7 @@ void siw_free_qp(struct kref *ref)
         struct siw_qp *found, *qp = container_of(ref, struct siw_qp, ref);
        struct siw_device *sdev = qp->sdev;
        unsigned long flags;
+       struct siw_base_qp *siw_base_qp = to_siw_base_qp(qp->ib_qp);

        if (qp->cep)
                siw_cep_put(qp->cep);
@@ -1327,4 +1328,5 @@ void siw_free_qp(struct kref *ref)
        atomic_dec(&sdev->num_qp);
        siw_dbg_qp(qp, "free QP\n");
        kfree_rcu(qp, rcu);
+       kfree(siw_base_qp);
 }
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index da52c90e06d4..ac08d84d84cb 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -603,7 +603,6 @@ int siw_verbs_modify_qp(struct ib_qp *base_qp, struct ib_qp_attr *attr,
 int siw_destroy_qp(struct ib_qp *base_qp, struct ib_udata *udata)
 {
        struct siw_qp *qp = to_siw_qp(base_qp);
-       struct siw_base_qp *siw_base_qp = to_siw_base_qp(base_qp);
        struct siw_ucontext *uctx =
                rdma_udata_to_drv_context(udata, struct siw_ucontext,
                                          base_ucontext);
@@ -640,7 +639,6 @@ int siw_destroy_qp(struct ib_qp *base_qp, struct ib_udata *udata)
        qp->scq = qp->rcq = NULL;

        siw_qp_put(qp);
-       kfree(siw_base_qp);

        return 0;
 }


On Tuesday, September 09/10/19, 2019 at 22:23:13 +0530, Sagi Grimberg wrote:
> 
> >> This may be the final put on a qp and result in freeing
> >> resourcesand should not be done with interrupts disabled.
> > 
> > Hi Sagi,
> > 
> > Few things to consider in fixing this completely:
> >    - there are some other places where iw_rem_ref() should be called
> >      after spinlock critical section. eg: in cm_close_handler(),
> > iw_cm_connect(),...
> >    - Any modifications to "cm_id_priv" should be done with in spinlock
> >      critical section, modifying cm_id_priv->qp outside spinlocks, even
> > with atomic xchg(), might be error prone.
> >    - the structure "siw_base_qp" is getting freed in siw_destroy_qp(),
> >      but it should be done at the end of siw_free_qp().
> 
> Not sure why you say that, at the end of this function ->qp will be null
> anyways...
 Hope the above description and patch answer this.
> 
> >    
> > I am about to finish writing a patch that cover all the above issues.
> > Will test it and submit here by EOD.
> 
> Sure, you take it. Just stumbled on it so thought I'd go ahead and send
> a patch...


* Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-10 16:53   ` Sagi Grimberg
  2019-09-10 19:21     ` Krishnamraju Eraparaju
@ 2019-09-11  9:38     ` Bernard Metzler
  2019-09-11 14:42       ` Steve Wise
  1 sibling, 1 reply; 12+ messages in thread
From: Bernard Metzler @ 2019-09-11  9:38 UTC (permalink / raw)
  To: Krishnamraju Eraparaju
  Cc: Sagi Grimberg, Steve Wise, linux-rdma, Jason Gunthorpe

-----"Krishnamraju Eraparaju" <krishna2@chelsio.com> wrote: -----

>To: "Sagi Grimberg" <sagi@grimberg.me>, "Steve Wise"
><larrystevenwise@gmail.com>, "Bernard Metzler" <BMT@zurich.ibm.com>
>From: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
>Date: 09/10/2019 09:22PM
>Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>, "Jason
>Gunthorpe" <jgg@ziepe.ca>
>Subject: [EXTERNAL] Re: [PATCH v3] iwcm: don't hold the irq disabled
>lock on iw_rem_ref
>
>Please review the below patch, I will resubmit this in patch-series
>after review.
>- As kput_ref handler(siw_free_qp) uses vfree, iwcm can't call
>  iw_rem_ref() with spinlocks held. Doing so can cause vfree() to
>sleep
>  with irq disabled.
>  Two possible solutions:
>  1)With spinlock acquired, take a copy of "cm_id_priv->qp" and
>update
>    it to NULL. And after releasing lock use the copied qp pointer
>for
>    rem_ref().
>  2)Replacing issue causing vmalloc()/vfree to kmalloc()/kfree in SIW
>    driver, may not be a ideal solution.
>  
>  Solution 2 may not be ideal as allocating huge contigous memory for
>   SQ & RQ doesn't look appropriate.
>  
>- The structure "siw_base_qp" is getting freed in siw_destroy_qp(),
>but
>  if cm_close_handler() holds the last reference, then siw_free_qp(),
>  via cm_close_handler(), tries to get already freed "siw_base_qp"
>from
>  "ib_qp". 
>   Hence, "siw_base_qp" should be freed at the end of siw_free_qp().
>

Regarding the siw driver, I am fine with the proposed
change. Delaying freeing the base_qp is OK. In fact,
I'd expect drivers to pass that responsibility to the
rdma core soon anyway -- like for CQ/SRQ/PD/CTX objects,
which are already allocated and freed up there.
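
The lifetime rule being agreed on here (free the container only in the
final release, not in the destroy call) can be sketched in userspace C.
This is a simplified illustration, not the siw code: the demo_* types and
functions are hypothetical, a plain int replaces the kref, and free()
replaces kfree()/kfree_rcu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for ib_qp, siw_qp and siw_base_qp. */
struct demo_ib_qp {
	int qp_num;
};

struct demo_qp {
	int refs;
	struct demo_ib_qp *ib_qp;   /* points into the owning demo_base_qp */
};

struct demo_base_qp {
	struct demo_ib_qp base;
	struct demo_qp *qp;
};

static int demo_objects_freed;

/* container_of()-style lookup from the embedded member. */
static struct demo_base_qp *demo_to_base(struct demo_ib_qp *ib_qp)
{
	return (struct demo_base_qp *)((char *)ib_qp -
				       offsetof(struct demo_base_qp, base));
}

/* Final release: the only place where the container is freed. */
static void demo_free_qp(struct demo_qp *qp)
{
	/* Resolve the container while qp->ib_qp is still valid. */
	struct demo_base_qp *base = demo_to_base(qp->ib_qp);

	free(qp);
	free(base);
	demo_objects_freed += 2;
}

static void demo_qp_put(struct demo_qp *qp)
{
	if (--qp->refs == 0)
		demo_free_qp(qp);
}

/* Destroy only drops its own reference; it must not free the container. */
static void demo_destroy_qp(struct demo_ib_qp *ib_qp)
{
	demo_qp_put(demo_to_base(ib_qp)->qp);
}

/* Allocate a container plus qp holding 'refs' references. */
static struct demo_base_qp *demo_create_qp(int refs)
{
	struct demo_base_qp *base = malloc(sizeof(*base));
	struct demo_qp *qp = malloc(sizeof(*qp));

	if (!base || !qp) {
		free(base);
		free(qp);
		return NULL;
	}
	base->base.qp_num = 1;
	base->qp = qp;
	qp->refs = refs;
	qp->ib_qp = &base->base;
	return base;
}
```

With two references held (one for the verbs consumer, one for the
connection manager), demo_destroy_qp() leaves both objects alive, and
whichever side drops the last reference can still follow qp->ib_qp
safely.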

The iwcm changes look OK to me as well.

(some comments on re-formatting the changes
inlined below)

Thanks!
Bernard.
>
>diff --git a/drivers/infiniband/core/iwcm.c
>b/drivers/infiniband/core/iwcm.c
>index 72141c5b7c95..d5ab69fa598a 100644
>--- a/drivers/infiniband/core/iwcm.c
>+++ b/drivers/infiniband/core/iwcm.c
>@@ -373,6 +373,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
> {
>        struct iwcm_id_private *cm_id_priv;
>        unsigned long flags;
>+       struct ib_qp *qp;

move *qp declaration up one line - see comment in
siw driver change below.
>
>        cm_id_priv = container_of(cm_id, struct iwcm_id_private, id);
>        /*
>@@ -389,6 +390,9 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
>        set_bit(IWCM_F_DROP_EVENTS, &cm_id_priv->flags);
>
>        spin_lock_irqsave(&cm_id_priv->lock, flags);
>+       qp = cm_id_priv->qp;
>+       cm_id_priv->qp = NULL;
>+
>        switch (cm_id_priv->state) {
>        case IW_CM_STATE_LISTEN:
>                cm_id_priv->state = IW_CM_STATE_DESTROYING;
>@@ -401,7 +405,7 @@ static void destroy_cm_id(struct iw_cm_id *cm_id)
>                cm_id_priv->state = IW_CM_STATE_DESTROYING;
>                spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>                /* Abrupt close of the connection */
>-               (void)iwcm_modify_qp_err(cm_id_priv->qp);
>+               (void)iwcm_modify_qp_err(qp);
>                spin_lock_irqsave(&cm_id_priv->lock, flags);
>                break;
>        case IW_CM_STATE_IDLE:
>@@ -426,11 +430,9 @@ static void destroy_cm_id(struct iw_cm_id
>*cm_id)
>                BUG();
>                break;
>        }
>-       if (cm_id_priv->qp) {
>-
>cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
>-               cm_id_priv->qp = NULL;
>-       }
>        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>+       if (qp)
>+               cm_id_priv->id.device->ops.iw_rem_ref(qp);
>
>        if (cm_id->mapped) {
>                iwpm_remove_mapinfo(&cm_id->local_addr,
>&cm_id->m_local_addr);
>@@ -671,11 +673,11 @@ int iw_cm_accept(struct iw_cm_id *cm_id,
>                BUG_ON(cm_id_priv->state != IW_CM_STATE_CONN_RECV);
>                cm_id_priv->state = IW_CM_STATE_IDLE;
>                spin_lock_irqsave(&cm_id_priv->lock, flags);
>-               if (cm_id_priv->qp) {
>-                       cm_id->device->ops.iw_rem_ref(qp);
>-                       cm_id_priv->qp = NULL;
>-               }
>+               qp = cm_id_priv->qp;
>+               cm_id_priv->qp = NULL;
>                spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>+               if (qp)
>+                       cm_id->device->ops.iw_rem_ref(qp);
>                clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
>                wake_up_all(&cm_id_priv->connect_wait);
>        }
>@@ -730,13 +732,13 @@ int iw_cm_connect(struct iw_cm_id *cm_id,
>struct
>iw_cm_conn_param *iw_param)
>                return 0;       /* success */
>
>        spin_lock_irqsave(&cm_id_priv->lock, flags);
>-       if (cm_id_priv->qp) {
>-               cm_id->device->ops.iw_rem_ref(qp);
>-               cm_id_priv->qp = NULL;
>-       }
>+       qp = cm_id_priv->qp;
>+       cm_id_priv->qp = NULL;
>        cm_id_priv->state = IW_CM_STATE_IDLE;
> err:
>        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>+       if (qp)
>+               cm_id->device->ops.iw_rem_ref(qp);
>        clear_bit(IWCM_F_CONNECT_WAIT, &cm_id_priv->flags);
>        wake_up_all(&cm_id_priv->connect_wait);
>        return ret;
>@@ -880,6 +882,7 @@ static int cm_conn_rep_handler(struct
>iwcm_id_private *cm_id_priv,
> {
>        unsigned long flags;
>        int ret;
>+       struct ib_qp *qp = NULL;
>
>        spin_lock_irqsave(&cm_id_priv->lock, flags);
>        /*
>@@ -896,11 +899,13 @@ static int cm_conn_rep_handler(struct
>iwcm_id_private *cm_id_priv,
>                cm_id_priv->state = IW_CM_STATE_ESTABLISHED;
>        } else {
>                /* REJECTED or RESET */
>-
>cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
>+               qp = cm_id_priv->qp;
>                cm_id_priv->qp = NULL;
>                cm_id_priv->state = IW_CM_STATE_IDLE;
>        }
>        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>+       if (qp)
>+               cm_id_priv->id.device->ops.iw_rem_ref(qp);
>        ret = cm_id_priv->id.cm_handler(&cm_id_priv->id, iw_event);
>
>        if (iw_event->private_data_len)
>@@ -944,12 +949,12 @@ static int cm_close_handler(struct
>iwcm_id_private
>*cm_id_priv,
> {
>        unsigned long flags;
>        int ret = 0;
>+       struct ib_qp *qp;

move *qp declaration up two lines - see comment on siw
driver change below.
>+
>        spin_lock_irqsave(&cm_id_priv->lock, flags);
>+       qp = cm_id_priv->qp;
>+       cm_id_priv->qp = NULL;
>
>-       if (cm_id_priv->qp) {
>-
>cm_id_priv->id.device->ops.iw_rem_ref(cm_id_priv->qp);
>-               cm_id_priv->qp = NULL;
>-       }
>        switch (cm_id_priv->state) {
>        case IW_CM_STATE_ESTABLISHED:
>        case IW_CM_STATE_CLOSING:
>@@ -965,6 +970,8 @@ static int cm_close_handler(struct
>iwcm_id_private
>*cm_id_priv,
>        }
>        spin_unlock_irqrestore(&cm_id_priv->lock, flags);
>
>+       if (qp)
>+               cm_id_priv->id.device->ops.iw_rem_ref(qp);
>        return ret;
> }
>
>diff --git a/drivers/infiniband/sw/siw/siw_qp.c
>b/drivers/infiniband/sw/siw/siw_qp.c
>index 430314c8abd9..cb177688a49f 100644
>--- a/drivers/infiniband/sw/siw/siw_qp.c
>+++ b/drivers/infiniband/sw/siw/siw_qp.c
>@@ -1307,6 +1307,7 @@ void siw_free_qp(struct kref *ref)
>        struct siw_qp *found, *qp = container_of(ref, struct siw_qp,
>ref);
>        struct siw_device *sdev = qp->sdev;
>        unsigned long flags;
>+       struct siw_base_qp *siw_base_qp = to_siw_base_qp(qp->ib_qp);

Please move that two lines up if OK with you.
I always prefer to have structs and its pointers
declared before introducing simple helper variables
like int and long etc. Thanks!


>
>        if (qp->cep)
>                siw_cep_put(qp->cep);
>@@ -1327,4 +1328,5 @@ void siw_free_qp(struct kref *ref)
>        atomic_dec(&sdev->num_qp);
>        siw_dbg_qp(qp, "free QP\n");
>        kfree_rcu(qp, rcu);
>+       kfree(siw_base_qp);
> }
>diff --git a/drivers/infiniband/sw/siw/siw_verbs.c
>b/drivers/infiniband/sw/siw/siw_verbs.c
>index da52c90e06d4..ac08d84d84cb 100644
>--- a/drivers/infiniband/sw/siw/siw_verbs.c
>+++ b/drivers/infiniband/sw/siw/siw_verbs.c
>@@ -603,7 +603,6 @@ int siw_verbs_modify_qp(struct ib_qp *base_qp,
>struct ib_qp_attr *attr,
> int siw_destroy_qp(struct ib_qp *base_qp, struct ib_udata *udata)
> {
>        struct siw_qp *qp = to_siw_qp(base_qp);
>-       struct siw_base_qp *siw_base_qp = to_siw_base_qp(base_qp);
>        struct siw_ucontext *uctx =
>                rdma_udata_to_drv_context(udata, struct siw_ucontext,
>                                          base_ucontext);
>@@ -640,7 +639,6 @@ int siw_destroy_qp(struct ib_qp *base_qp, struct
>ib_udata *udata)
>        qp->scq = qp->rcq = NULL;
>
>        siw_qp_put(qp);
>-       kfree(siw_base_qp);
>
>        return 0;
> }
>
>
>On Tuesday, September 09/10/19, 2019 at 22:23:13 +0530, Sagi Grimberg
>wrote:
>> 
>> >> This may be the final put on a qp and result in freeing
>> >> resourcesand should not be done with interrupts disabled.
>> > 
>> > Hi Sagi,
>> > 
>> > Few things to consider in fixing this completely:
>> >    - there are some other places where iw_rem_ref() should be
>called
>> >      after spinlock critical section. eg: in cm_close_handler(),
>> > iw_cm_connect(),...
>> >    - Any modifications to "cm_id_priv" should be done with in
>spinlock
>> >      critical section, modifying cm_id_priv->qp outside
>spinlocks, even
>> > with atomic xchg(), might be error prone.
>> >    - the structure "siw_base_qp" is getting freed in
>siw_destroy_qp(),
>> >      but it should be done at the end of siw_free_qp().
>> 
>> Not sure why you say that, at the end of this function ->qp will be
>null
>> anyways...
> Hope the above description and patch answers this.
>> 
>> >    
>> > I am about to finish writing a patch that cover all the above
>issues.
>> > Will test it and submit here by EOD.
>> 
>> Sure, you take it. Just stumbled on it so thought I'd go ahead and
>send
>> a patch...
>
>



* Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-11  9:38     ` Bernard Metzler
@ 2019-09-11 14:42       ` Steve Wise
  2019-09-11 15:58         ` Krishnamraju Eraparaju
  0 siblings, 1 reply; 12+ messages in thread
From: Steve Wise @ 2019-09-11 14:42 UTC (permalink / raw)
  To: Bernard Metzler
  Cc: Krishnamraju Eraparaju, Sagi Grimberg, linux-rdma, Jason Gunthorpe

On Wed, Sep 11, 2019 at 4:38 AM Bernard Metzler <BMT@zurich.ibm.com> wrote:
>
> -----"Krishnamraju Eraparaju" <krishna2@chelsio.com> wrote: -----
>
> >To: "Sagi Grimberg" <sagi@grimberg.me>, "Steve Wise"
> ><larrystevenwise@gmail.com>, "Bernard Metzler" <BMT@zurich.ibm.com>
> >From: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
> >Date: 09/10/2019 09:22PM
> >Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>, "Jason
> >Gunthorpe" <jgg@ziepe.ca>
> >Subject: [EXTERNAL] Re: [PATCH v3] iwcm: don't hold the irq disabled
> >lock on iw_rem_ref
> >
> >Please review the below patch, I will resubmit this in patch-series
> >after review.
> >- As kput_ref handler(siw_free_qp) uses vfree, iwcm can't call
> >  iw_rem_ref() with spinlocks held. Doing so can cause vfree() to
> >sleep
> >  with irq disabled.
> >  Two possible solutions:
> >  1)With spinlock acquired, take a copy of "cm_id_priv->qp" and
> >update
> >    it to NULL. And after releasing lock use the copied qp pointer
> >for
> >    rem_ref().
> >  2)Replacing issue causing vmalloc()/vfree to kmalloc()/kfree in SIW
> >    driver, may not be a ideal solution.
> >
> >  Solution 2 may not be ideal as allocating huge contigous memory for
> >   SQ & RQ doesn't look appropriate.
> >
> >- The structure "siw_base_qp" is getting freed in siw_destroy_qp(),
> >but
> >  if cm_close_handler() holds the last reference, then siw_free_qp(),
> >  via cm_close_handler(), tries to get already freed "siw_base_qp"
> >from
> >  "ib_qp".
> >   Hence, "siw_base_qp" should be freed at the end of siw_free_qp().
> >
>
> Regarding the siw driver, I am fine with that proposed
> change. Delaying freeing the base_qp is OK. In fact,
> I'd expect the drivers soon are passing that responsibility
> to the rdma core anyway -- like for CQ/SRQ/PD/CTX objects,
> which are already allocated and freed up there.
>
> The iwcm changes look OK to me as well.
>

Hey Krishna,  Since the iwcm struct/state is still correctly being
manipulated under the lock, I think this patch is correct.  Test
the heck out of it. :)

Steve.


* Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-11 14:42       ` Steve Wise
@ 2019-09-11 15:58         ` Krishnamraju Eraparaju
  2019-09-16 16:28           ` Jason Gunthorpe
  2019-09-17  9:04           ` Bernard Metzler
  0 siblings, 2 replies; 12+ messages in thread
From: Krishnamraju Eraparaju @ 2019-09-11 15:58 UTC (permalink / raw)
  To: Steve Wise, Bernard Metzler; +Cc: Sagi Grimberg, linux-rdma, Jason Gunthorpe

Hi Steve & Bernard,

Thanks for the review comments.
I will do those formatting changes.

Thanks,
Krishna.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-11 15:58         ` Krishnamraju Eraparaju
@ 2019-09-16 16:28           ` Jason Gunthorpe
  2019-09-17  9:04           ` Bernard Metzler
  1 sibling, 0 replies; 12+ messages in thread
From: Jason Gunthorpe @ 2019-09-16 16:28 UTC (permalink / raw)
  To: Krishnamraju Eraparaju
  Cc: Steve Wise, Bernard Metzler, Sagi Grimberg, linux-rdma

On Wed, Sep 11, 2019 at 09:28:16PM +0530, Krishnamraju Eraparaju wrote:
> Hi Steve & Bernard,
> 
> Thanks for the review comments.
> I will do those formating changes.

I don't see anything in patchworks, but the consensus is to drop
Sagi's patch pending this future patch?

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-11 15:58         ` Krishnamraju Eraparaju
  2019-09-16 16:28           ` Jason Gunthorpe
@ 2019-09-17  9:04           ` Bernard Metzler
  2019-09-17 12:47             ` Krishnamraju Eraparaju
                               ` (2 more replies)
  1 sibling, 3 replies; 12+ messages in thread
From: Bernard Metzler @ 2019-09-17  9:04 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Krishnamraju Eraparaju, Steve Wise, Sagi Grimberg, linux-rdma

-----"Jason Gunthorpe" <jgg@ziepe.ca> wrote: -----

>To: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
>From: "Jason Gunthorpe" <jgg@ziepe.ca>
>Date: 09/16/2019 06:28PM
>Cc: "Steve Wise" <larrystevenwise@gmail.com>, "Bernard Metzler"
><BMT@zurich.ibm.com>, "Sagi Grimberg" <sagi@grimberg.me>,
>"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
>Subject: [EXTERNAL] Re: Re: [PATCH v3] iwcm: don't hold the irq
>disabled lock on iw_rem_ref
>
>On Wed, Sep 11, 2019 at 09:28:16PM +0530, Krishnamraju Eraparaju
>wrote:
>> Hi Steve & Bernard,
>> 
>> Thanks for the review comments.
>> I will do those formating changes.
>
>I don't see anything in patchworks, but the consensus is to drop
>Sagi's patch pending this future patch?
>
>Jason
>
This is my impression as well. But consensus should be
explicit...Sagi, what do you think?

Best regards,
Bernard.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-17  9:04           ` Bernard Metzler
@ 2019-09-17 12:47             ` Krishnamraju Eraparaju
  2019-09-17 13:40             ` Bernard Metzler
  2019-09-17 17:20             ` Sagi Grimberg
  2 siblings, 0 replies; 12+ messages in thread
From: Krishnamraju Eraparaju @ 2019-09-17 12:47 UTC (permalink / raw)
  To: Bernard Metzler, Steve Wise
  Cc: Jason Gunthorpe, Sagi Grimberg, linux-rdma, Nirranjan Kirubaharan

On Tuesday, September 09/17/19, 2019 at 14:34:24 +0530, Bernard Metzler wrote:
> -----"Jason Gunthorpe" <jgg@ziepe.ca> wrote: -----
> 
> >To: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
> >From: "Jason Gunthorpe" <jgg@ziepe.ca>
> >Date: 09/16/2019 06:28PM
> >Cc: "Steve Wise" <larrystevenwise@gmail.com>, "Bernard Metzler"
> ><BMT@zurich.ibm.com>, "Sagi Grimberg" <sagi@grimberg.me>,
> >"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
> >Subject: [EXTERNAL] Re: Re: [PATCH v3] iwcm: don't hold the irq
> >disabled lock on iw_rem_ref
> >
> >On Wed, Sep 11, 2019 at 09:28:16PM +0530, Krishnamraju Eraparaju
> >wrote:
> >> Hi Steve & Bernard,
> >> 
> >> Thanks for the review comments.
> >> I will do those formating changes.
> >
> >I don't see anything in patchworks, but the consensus is to drop
> >Sagi's patch pending this future patch?
> >
> >Jason
> >
> This is my impression as well. But consensus should be
> explicit...Sagi, what do you think?
> 
> Best regards,
> Bernard.
> 
While testing iSER (with my proposed patch applied), I see the Chelsio
iWARP driver hitting the deadlock shown below. This is due to the
iw_rem_ref reordering changes in IWCM.

Bernard, how about replacing vmalloc()/vfree() with kmalloc()/kfree(),
so that SIW qp resources can be freed with spinlocks held?
That would fix the original vfree issue less invasively.

Steve, any suggestions?


[ 1230.161871] INFO: task kworker/u12:0:11291 blocked for more than 122 seconds.
[ 1230.162147]       Not tainted 5.3.0-rc5+ #19
[ 1230.162417] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1230.162911] kworker/u12:0   D13000 11291      2 0x80004080
[ 1230.163186] Workqueue: iw_cm_wq cm_work_handler
[ 1230.163456] Call Trace:
[ 1230.163718]  ? __schedule+0x297/0x510
[ 1230.163986]  schedule+0x2e/0x90
[ 1230.164253]  schedule_timeout+0x1c0/0x280
[ 1230.164520]  ? xas_store+0x23e/0x500
[ 1230.164789]  wait_for_completion+0xa2/0x110
[ 1230.165067]  ? wake_up_q+0x70/0x70
[ 1230.165336]  c4iw_destroy_qp+0x141/0x260 [iw_cxgb4]
[ 1230.165611]  ? xas_store+0x23e/0x500
[ 1230.165893]  ? _cond_resched+0x10/0x20
[ 1230.166160]  ? wait_for_completion+0x2e/0x110
[ 1230.166432]  ib_destroy_qp_user+0x142/0x230
[ 1230.166699]  rdma_destroy_qp+0x1f/0x40
[ 1230.166966]  iser_free_ib_conn_res+0x52/0x190 [ib_iser]
[ 1230.167241]  iser_cleanup_handler.isra.15+0x32/0x60 [ib_iser]
[ 1230.167510]  iser_cma_handler+0x23b/0x730 [ib_iser]
[ 1230.167776]  cma_iw_handler+0x154/0x1e0
[ 1230.168037]  cm_work_handler+0xb4c/0xd60
[ 1230.168302]  process_one_work+0x155/0x380
[ 1230.168564]  worker_thread+0x41/0x3b0
[ 1230.168827]  kthread+0xf3/0x130
[ 1230.169086]  ? process_one_work+0x380/0x380
[ 1230.169350]  ? kthread_bind+0x10/0x10
[ 1230.169615]  ret_from_fork+0x35/0x40
[ 1230.169885] NMI backtrace for cpu 3


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Re: Re: Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-17  9:04           ` Bernard Metzler
  2019-09-17 12:47             ` Krishnamraju Eraparaju
@ 2019-09-17 13:40             ` Bernard Metzler
  2019-09-17 17:20             ` Sagi Grimberg
  2 siblings, 0 replies; 12+ messages in thread
From: Bernard Metzler @ 2019-09-17 13:40 UTC (permalink / raw)
  To: Krishnamraju Eraparaju
  Cc: Steve Wise, Jason Gunthorpe, Sagi Grimberg, linux-rdma,
	Nirranjan Kirubaharan

-----"Krishnamraju Eraparaju" <krishna2@chelsio.com> wrote: -----

>To: "Bernard Metzler" <BMT@zurich.ibm.com>, "Steve Wise"
><larrystevenwise@gmail.com>
>From: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
>Date: 09/17/2019 02:48PM
>Cc: "Jason Gunthorpe" <jgg@ziepe.ca>, "Sagi Grimberg"
><sagi@grimberg.me>, "linux-rdma@vger.kernel.org"
><linux-rdma@vger.kernel.org>, "Nirranjan Kirubaharan"
><nirranjan@chelsio.com>
>Subject: [EXTERNAL] Re: Re: Re: [PATCH v3] iwcm: don't hold the irq
>disabled lock on iw_rem_ref
>
>On Tuesday, September 09/17/19, 2019 at 14:34:24 +0530, Bernard
>Metzler wrote:
>> -----"Jason Gunthorpe" <jgg@ziepe.ca> wrote: -----
>> 
>> >To: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
>> >From: "Jason Gunthorpe" <jgg@ziepe.ca>
>> >Date: 09/16/2019 06:28PM
>> >Cc: "Steve Wise" <larrystevenwise@gmail.com>, "Bernard Metzler"
>> ><BMT@zurich.ibm.com>, "Sagi Grimberg" <sagi@grimberg.me>,
>> >"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
>> >Subject: [EXTERNAL] Re: Re: [PATCH v3] iwcm: don't hold the irq
>> >disabled lock on iw_rem_ref
>> >
>> >On Wed, Sep 11, 2019 at 09:28:16PM +0530, Krishnamraju Eraparaju
>> >wrote:
>> >> Hi Steve & Bernard,
>> >> 
>> >> Thanks for the review comments.
>> >> I will do those formating changes.
>> >
>> >I don't see anything in patchworks, but the consensus is to drop
>> >Sagi's patch pending this future patch?
>> >
>> >Jason
>> >
>> This is my impression as well. But consensus should be
>> explicit...Sagi, what do you think?
>> 
>> Best regards,
>> Bernard.
>> 
>While testing iSER(with my proposed patch applied) I see Chelsio
>iwarp
>driver is hitting the below deadlock issue. This is due to iw_rem_ref
>reordering changes in IWCM.
>
>Bernard, how about replacing vmalloc/vfree with kmalloc/kfree,
>such that freeing of SIW qp resources can be done with spinlocks
>held?
>to fix the orginal vfree issue less invasively..

Well, I'd really like to avoid kmalloc on potentially
large data structures when there is no need to have
physically contiguous memory.

I could of course move that vfree out to a worker.
Simple, but not really nice though.
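
For illustration only, here is a minimal userspace sketch of what
"moving the free out to a worker" could look like. It is not siw code:
the demo_* names are hypothetical, free() stands in for vfree(), and a
plain function call stands in for a kernel work item. The context that
must not sleep only links a node onto a pending list; the worker does
the actual freeing later.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping node; in a driver it would be embedded in the QP. */
struct demo_deferred {
	struct demo_deferred *next;
	void *mem;
};

/* Pending list; a real implementation would need its own locking. */
static struct demo_deferred *demo_pending;

/* Called from the context that must not sleep: only links the node. */
static void demo_defer_free(struct demo_deferred *d, void *mem)
{
	assert(d != NULL);
	d->mem = mem;
	d->next = demo_pending;
	demo_pending = d;
}

/* Runs later in worker context, where a sleeping free is allowed. */
static int demo_worker_drain(void)
{
	int n = 0;

	while (demo_pending) {
		struct demo_deferred *d = demo_pending;

		demo_pending = d->next;
		free(d->mem);	/* stands in for vfree() */
		d->mem = NULL;
		n++;
	}
	return n;	/* number of buffers released */
}
```

The cost of this approach is the extra node, list and worker per
deferred free, which is the "not really nice" part.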

So it seems restructuring the Chelsio driver would not be
a good option?

Thanks
Bernard.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref
  2019-09-17  9:04           ` Bernard Metzler
  2019-09-17 12:47             ` Krishnamraju Eraparaju
  2019-09-17 13:40             ` Bernard Metzler
@ 2019-09-17 17:20             ` Sagi Grimberg
  2 siblings, 0 replies; 12+ messages in thread
From: Sagi Grimberg @ 2019-09-17 17:20 UTC (permalink / raw)
  To: Bernard Metzler, Jason Gunthorpe
  Cc: Krishnamraju Eraparaju, Steve Wise, linux-rdma


>> To: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
>> From: "Jason Gunthorpe" <jgg@ziepe.ca>
>> Date: 09/16/2019 06:28PM
>> Cc: "Steve Wise" <larrystevenwise@gmail.com>, "Bernard Metzler"
>> <BMT@zurich.ibm.com>, "Sagi Grimberg" <sagi@grimberg.me>,
>> "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
>> Subject: [EXTERNAL] Re: Re: [PATCH v3] iwcm: don't hold the irq
>> disabled lock on iw_rem_ref
>>
>> On Wed, Sep 11, 2019 at 09:28:16PM +0530, Krishnamraju Eraparaju
>> wrote:
>>> Hi Steve & Bernard,
>>>
>>> Thanks for the review comments.
>>> I will do those formating changes.
>>
>> I don't see anything in patchworks, but the consensus is to drop
>> Sagi's patch pending this future patch?
>>
>> Jason
>>
> This is my impression as well. But consensus should be
> explicit...Sagi, what do you think?

I don't really care, but given that the changes from Krishnamraju cause
other problems, I'd ask whether my version also breaks his test.

In general, I do not think that calling resource free routines (either
explicitly or implicitly via ref dec) under a spinlock is a robust
design.

I would first make it clear and documented what cm_id_priv->lock is
protecting. In my mind, it should protect *its own* mutations of
cm_id_priv and by design leave all the ops calls outside the lock.
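
To make that concrete, here is a rough userspace sketch of the pattern,
not the actual iwcm code: the demo_* types are hypothetical stand-ins
and a pthread mutex replaces the irq-disabling spinlock. The lock covers
only the pointer exchange, and the reference drop (which may free
resources and sleep) happens after the lock is released.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the iwcm structures, not the kernel definitions. */
struct demo_qp {
	int refcount;
	int freed;	/* set by the release path so the effect is observable */
};

struct demo_cm_id_priv {
	pthread_mutex_t lock;	/* protects only the fields of this struct */
	struct demo_qp *qp;
};

/* Stand-in for the provider's rem_ref op; in a real driver it may sleep. */
static void demo_rem_ref(struct demo_qp *qp)
{
	assert(qp->refcount > 0);
	if (--qp->refcount == 0)
		qp->freed = 1;
}

static void demo_destroy_cm_id(struct demo_cm_id_priv *priv)
{
	struct demo_qp *qp;

	pthread_mutex_lock(&priv->lock);
	qp = priv->qp;		/* take the pointer ... */
	priv->qp = NULL;	/* ... and clear it while still locked */
	pthread_mutex_unlock(&priv->lock);

	if (qp)
		demo_rem_ref(qp);	/* ops call happens outside the lock */
}
```

A second call finds the pointer already cleared and does nothing, so
the reference is dropped exactly once.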

I don't understand what is causing the Chelsio issue observed, but
it looks like c4iw_destroy_qp blocks on a completion that depends on a
refcount that is taken also by iwcm, which means that I cannot call
ib_destroy_qp if the cm is not destroyed as well?

If that is mandatory, I'd say that instead of blocking on this
completion, we can simply convert c4iw_qp_rem_ref to use a kref,
which is not order dependent.
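
As a sketch of what "not order dependent" means here, the following
userspace model mimics the shape of the kernel's kref (init/get/put with
a release callback) using C11 atomics. The demo_* names are hypothetical
and this is not the cxgb4 code: whichever holder drops the last
reference runs the release, so nobody has to wait on a completion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model of a kref-style counter; demo_* names are hypothetical. */
struct demo_kref {
	atomic_int count;
};

struct demo_c4iw_qp {
	struct demo_kref ref;
	int released;	/* set by the release callback */
};

static void demo_kref_init(struct demo_kref *k)
{
	atomic_init(&k->count, 1);
}

static void demo_kref_get(struct demo_kref *k)
{
	atomic_fetch_add(&k->count, 1);
}

/* Returns 1 if this put dropped the last reference and ran release. */
static int demo_kref_put(struct demo_kref *k,
			 void (*release)(struct demo_kref *))
{
	assert(atomic_load(&k->count) > 0);
	if (atomic_fetch_sub(&k->count, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

/* Recover the containing QP from the embedded counter and mark it released. */
static void demo_qp_release(struct demo_kref *k)
{
	struct demo_c4iw_qp *qp = (struct demo_c4iw_qp *)
		((char *)k - offsetof(struct demo_c4iw_qp, ref));

	qp->released = 1;
}
```

With this shape, the destroy path and the cm path can drop their
references in either order; only the final put triggers the release.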

^ permalink raw reply	[flat|nested] 12+ messages in thread

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-04 21:25 [PATCH v3] iwcm: don't hold the irq disabled lock on iw_rem_ref Sagi Grimberg
2019-09-10 11:18 ` Krishnamraju Eraparaju
2019-09-10 16:53   ` Sagi Grimberg
2019-09-10 19:21     ` Krishnamraju Eraparaju
2019-09-11  9:38     ` Bernard Metzler
2019-09-11 14:42       ` Steve Wise
2019-09-11 15:58         ` Krishnamraju Eraparaju
2019-09-16 16:28           ` Jason Gunthorpe
2019-09-17  9:04           ` Bernard Metzler
2019-09-17 12:47             ` Krishnamraju Eraparaju
2019-09-17 13:40             ` Bernard Metzler
2019-09-17 17:20             ` Sagi Grimberg
