From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
To: Jason Gunthorpe <jgg@nvidia.com>,
	Danil Kipnis <danil.kipnis@cloud.ionos.com>,
	Doug Ledford <dledford@redhat.com>,
	Christoph Hellwig <hch@lst.de>,
	Jack Wang <jinpu.wang@cloud.ionos.com>,
	Keith Busch <kbusch@kernel.org>,
	linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
	Max Gurtovoy <mgurtovoy@nvidia.com>,
	netdev@vger.kernel.org, rds-devel@oss.oracle.com,
	Sagi Grimberg <sagi@grimberg.me>,
	Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Subject: Re: [PATCH] RDMA: Add rdma_connect_locked()
Date: Tue, 27 Oct 2020 13:05:00 +0100
Message-ID: <11bb18bd-a26a-d0e2-9ff6-6d7e2bf3fb86@cloud.ionos.com>
In-Reply-To: <0-v1-75e124dbad74+b05-rdma_connect_locking_jgg@nvidia.com>



On 10/26/20 15:25, Jason Gunthorpe wrote:
> There are two flows for handling RDMA_CM_EVENT_ROUTE_RESOLVED, either the
> handler triggers a completion and another thread does rdma_connect() or
> the handler directly calls rdma_connect().
> 
> In all cases rdma_connect() needs to hold the handler_mutex, but when
> handlers are invoked this is already held by the core code. This causes
> ULPs using the 2nd method to deadlock.
> 
> Provide a rdma_connect_locked() and have all ULPs call it from their
> handlers.
> 
> Reported-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
> Fixes: 2a7cec538169 ("RDMA/cma: Fix locking for the RDMA_CM_CONNECT state")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>   drivers/infiniband/core/cma.c            | 39 +++++++++++++++++++++---
>   drivers/infiniband/ulp/iser/iser_verbs.c |  2 +-
>   drivers/infiniband/ulp/rtrs/rtrs-clt.c   |  4 +--
>   drivers/nvme/host/rdma.c                 | 10 +++---
>   include/rdma/rdma_cm.h                   | 13 +-------
>   net/rds/ib_cm.c                          |  5 +--
>   6 files changed, 47 insertions(+), 26 deletions(-)
> 
> It seems people are not testing these four ULPs against rdma-next. Here is a
> quick fix for the issue:
> 
> https://lore.kernel.org/r/3b1f7767-98e2-93e0-b718-16d1c5346140@cloud.ionos.com

With this patch applied, I no longer see the previous call trace.

Tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>


Thanks,
Guoqing
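
For readers following the thread, the locking problem described in the quoted
patch can be modelled in userspace. The sketch below is only an illustration
under stated assumptions: `struct fake_cm_id`, `fake_connect()`,
`fake_connect_locked()`, `fake_dispatch()` and `handler_probe_unlocked()` are
hypothetical stand-ins built on a pthread mutex, not the kernel's
`struct rdma_cm_id` or the actual `rdma_connect()` implementation. It shows why
a handler invoked with `handler_mutex` held cannot call a function that takes
the same mutex again, and how a `_locked` variant avoids that.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CM ID; not the kernel's struct rdma_cm_id. */
struct fake_cm_id {
	pthread_mutex_t handler_mutex;
	bool connected;
};

static void fake_cm_id_init(struct fake_cm_id *id)
{
	pthread_mutex_init(&id->handler_mutex, NULL);
	id->connected = false;
}

/* Variant for event handlers: the caller must already hold handler_mutex. */
static int fake_connect_locked(struct fake_cm_id *id)
{
	id->connected = true;
	return 0;
}

/* Variant for other threads: takes handler_mutex, then delegates. */
static int fake_connect(struct fake_cm_id *id)
{
	int ret;

	pthread_mutex_lock(&id->handler_mutex);
	ret = fake_connect_locked(id);
	pthread_mutex_unlock(&id->handler_mutex);
	return ret;
}

/* Models the core code: the handler runs with handler_mutex held. */
static int fake_dispatch(struct fake_cm_id *id,
			 int (*handler)(struct fake_cm_id *))
{
	int ret;

	pthread_mutex_lock(&id->handler_mutex);
	ret = handler(id);
	pthread_mutex_unlock(&id->handler_mutex);
	return ret;
}

/*
 * A handler that wants the locking variant. Calling fake_connect() here
 * would block forever, so the sketch probes with trylock and reports
 * -EBUSY instead of actually deadlocking.
 */
static int handler_probe_unlocked(struct fake_cm_id *id)
{
	int err = pthread_mutex_trylock(&id->handler_mutex);

	if (err)
		return -err;	/* mutex is held by the dispatcher */
	pthread_mutex_unlock(&id->handler_mutex);
	return fake_connect(id);
}
```

Passing `fake_connect_locked` to `fake_dispatch()` models a ULP handler after
the patch, while `fake_connect()` remains usable from a thread that does not
hold the mutex, which corresponds to the first flow in the commit message.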
