From: liruozhu <liruozhu@huawei.com>
To: Max Gurtovoy <mgurtovoy@nvidia.com>,
Christoph Hellwig <hch@infradead.org>
Cc: <linux-nvme@lists.infradead.org>, <sagi@grimberg.me>
Subject: Re: [PATCH 1/1] nvme-rdma: destroy cm id before destroy qp to avoid use after free
Date: Fri, 10 Sep 2021 10:50:30 +0800 [thread overview]
Message-ID: <3f68da69-6e34-7bfd-438e-96f8f17525ac@huawei.com> (raw)
In-Reply-To: <90356fc2-837c-0ba2-5fcf-d9e9dcc06dba@nvidia.com>
Hi Max,
Yes, I did encounter this problem; I posted the detailed log in the cover letter of this patch.
I tested it with a Huawei storage array, killing the nvme target processes (we use a self-developed user-mode nvme target)
while repeatedly injecting PCIe AER errors on the array's HBA card.
I'm not sure whether there is an easy way to reproduce it on a standard server with the kernel target,
but I still think the host side should avoid this possibility of crashing.
Thanks.
On 2021/9/10 1:32, Max Gurtovoy wrote:
>
> On 9/9/2021 5:05 PM, Christoph Hellwig wrote:
>> This looks reasonable to me. Sagi, Max: any comments?
>>
>> On Mon, Sep 06, 2021 at 11:51:34AM +0800, Ruozhu Li wrote:
>>> We should always destroy the cm_id before destroying the qp, to avoid getting a cma
>>> event after the qp was destroyed, which may lead to a use after free.
>
> Do you really encounter use-after-free ?
>
> I think the code looks correct but from experience with RDMA-CM I
> would like to know which tests did you run with this code ?
>
> interesting tests are: port toggling with/without IO, unload
> mlx5/other_rdma_hw drivers during connection establishment or during
> IO, reboot target, reboot target during connection establishment,
> reboot target during IO.
>
> Thanks.
>
>
>>> In the RDMA connection establishment error flow, don't destroy the qp in the cm
>>> event handler. Just report cm_error to the upper level; the qp will be destroyed
>>> in nvme_rdma_alloc_queue() after the cm id is destroyed.
>>>
>>> Signed-off-by: Ruozhu Li <liruozhu@huawei.com>
>>> ---
>>> drivers/nvme/host/rdma.c | 16 +++-------------
>>> 1 file changed, 3 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>>> index a68704e39084..042c594bc57e 100644
>>> --- a/drivers/nvme/host/rdma.c
>>> +++ b/drivers/nvme/host/rdma.c
>>> @@ -656,8 +656,8 @@ static void nvme_rdma_free_queue(struct
>>> nvme_rdma_queue *queue)
>>> if (!test_and_clear_bit(NVME_RDMA_Q_ALLOCATED, &queue->flags))
>>> return;
>>> - nvme_rdma_destroy_queue_ib(queue);
>>> rdma_destroy_id(queue->cm_id);
>>> + nvme_rdma_destroy_queue_ib(queue);
>>> mutex_destroy(&queue->queue_lock);
>>> }
>>> @@ -1815,14 +1815,10 @@ static int
>>> nvme_rdma_conn_established(struct nvme_rdma_queue *queue)
>>> for (i = 0; i < queue->queue_size; i++) {
>>> ret = nvme_rdma_post_recv(queue, &queue->rsp_ring[i]);
>>> if (ret)
>>> - goto out_destroy_queue_ib;
>>> + return ret;
>>> }
>>> return 0;
>>> -
>>> -out_destroy_queue_ib:
>>> - nvme_rdma_destroy_queue_ib(queue);
>>> - return ret;
>>> }
>>> static int nvme_rdma_conn_rejected(struct nvme_rdma_queue *queue,
>>> @@ -1916,14 +1912,10 @@ static int nvme_rdma_route_resolved(struct
>>> nvme_rdma_queue *queue)
>>> if (ret) {
>>> dev_err(ctrl->ctrl.device,
>>> "rdma_connect_locked failed (%d).\n", ret);
>>> - goto out_destroy_queue_ib;
>>> + return ret;
>>> }
>>> return 0;
>>> -
>>> -out_destroy_queue_ib:
>>> - nvme_rdma_destroy_queue_ib(queue);
>>> - return ret;
>>> }
>>> static int nvme_rdma_cm_handler(struct rdma_cm_id *cm_id,
>>> @@ -1954,8 +1946,6 @@ static int nvme_rdma_cm_handler(struct
>>> rdma_cm_id *cm_id,
>>> case RDMA_CM_EVENT_ROUTE_ERROR:
>>> case RDMA_CM_EVENT_CONNECT_ERROR:
>>> case RDMA_CM_EVENT_UNREACHABLE:
>>> - nvme_rdma_destroy_queue_ib(queue);
>>> - fallthrough;
>>> case RDMA_CM_EVENT_ADDR_ERROR:
>>> dev_dbg(queue->ctrl->ctrl.device,
>>> "CM error event %d\n", ev->event);
>>> --
>>> 2.16.4
>>>
>>>
>>> _______________________________________________
>>> Linux-nvme mailing list
>>> Linux-nvme@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/linux-nvme
>> ---end quoted text---
Thread overview: 9+ messages
2021-09-06 3:51 [PATCH 0/1] fix a panic cause by use qp after free Ruozhu Li
2021-09-06 3:51 ` [PATCH 1/1] nvme-rdma: destroy cm id before destroy qp to avoid use " Ruozhu Li
2021-09-09 14:05 ` Christoph Hellwig
2021-09-09 17:32 ` Max Gurtovoy
2021-09-10 2:50 ` liruozhu [this message]
2021-09-10 7:42 ` liruozhu
2021-09-13 1:49 ` liruozhu
2021-09-13 14:16 ` Max Gurtovoy
2021-09-13 15:05 ` Christoph Hellwig