From: Haakon Bugge <haakon.bugge@oracle.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
Dmitry Vyukov <dvyukov@google.com>,
OFED mailing list <linux-rdma@vger.kernel.org>,
"syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com"
<syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com>
Subject: Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
Date: Wed, 22 Sep 2021 09:38:40 +0000 [thread overview]
Message-ID: <27C1E678-2EC4-4916-9720-00C1B69EA5AA@oracle.com> (raw)
In-Reply-To: <YUri44sX8Lp3muc4@unreal>
> On 22 Sep 2021, at 10:01, Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Sep 16, 2021 at 03:34:46PM -0300, Jason Gunthorpe wrote:
>> The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
>> on the same id_priv. While this cannot happen without going through the
>> work, it violates the invariant that the same address resolution
>> background request cannot be active twice.
>>
>> CPU 1 CPU 2
>>
>> rdma_resolve_addr():
>> RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
>> rdma_resolve_ip(addr_handler) #1
>>
>> process_one_req(): for #1
>> addr_handler():
>> RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
>> mutex_unlock(&id_priv->handler_mutex);
>> [.. handler still running ..]
>>
>> rdma_resolve_addr():
>> RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
>> rdma_resolve_ip(addr_handler)
>> !! two requests are now on the req_list
>>
>> rdma_destroy_id():
>> destroy_id_handler_unlock():
>> _destroy_id():
>> cma_cancel_operation():
>> rdma_addr_cancel()
>>
>> // process_one_req() self removes it
>> spin_lock_bh(&lock);
>> cancel_delayed_work(&req->work);
>> if (!list_empty(&req->list)) == true
>>
>> ! rdma_addr_cancel() returns after process_on_req #1 is done
>>
>> kfree(id_priv)
>>
>> process_one_req(): for #2
>> addr_handler():
>> mutex_lock(&id_priv->handler_mutex);
>> !! Use after free on id_priv
>>
>> rdma_addr_cancel() expects there to be one req on the list and only
>> cancels the first one. The self-removal behavior of the work only happens
>> after the handler has returned. This yields a situations where the
>> req_list can have two reqs for the same "handle" but rdma_addr_cancel()
>> only cancels the first one.
>>
>> The second req remains active beyond rdma_destroy_id() and will
>> use-after-free id_priv once it inevitably triggers.
>>
>> Fix this by remembering if the id_priv has called rdma_resolve_ip() and
>> always cancel before calling it again. This ensures the req_list never
>> gets more than one item in it and doesn't cost anything in the normal flow
>> that never uses this strange error path.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")
>> Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> ---
>> drivers/infiniband/core/cma.c | 17 +++++++++++++++++
>> drivers/infiniband/core/cma_priv.h | 1 +
>> 2 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index c40791baced588..751cf5ea25f296 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -1776,6 +1776,14 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv,
>> {
>> switch (state) {
>> case RDMA_CM_ADDR_QUERY:
>> + /*
>> + * We can avoid doing the rdma_addr_cancel() based on state,
>> + * only RDMA_CM_ADDR_QUERY has a work that could still execute.
>> + * Notice that the addr_handler work could still be exiting
>> + * outside this state, however due to the interaction with the
>> + * handler_mutex the work is guaranteed not to touch id_priv
>> + * during exit.
>> + */
>> rdma_addr_cancel(&id_priv->id.route.addr.dev_addr);
>> break;
>> case RDMA_CM_ROUTE_QUERY:
>> @@ -3413,6 +3421,15 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
>> if (dst_addr->sa_family == AF_IB) {
>> ret = cma_resolve_ib_addr(id_priv);
>> } else {
>> + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
>> + * rdma_resolve_ip() is called, eg through the error
>> + * path in addr_handler. If this happens the existing
>> + * request must be canceled before issuing a new one.
>> + */
>> + if (id_priv->used_resolve_ip)
>> + rdma_addr_cancel(&id->route.addr.dev_addr);
>> + else
>> + id_priv->used_resolve_ip = 1;
>
> Why don't you never clear this field? If you assume that this is one lifetime
> event, can you please add a comment with an explanation "why"?
Adding to that, don't you need {READ,WRITE}_ONCE when accessing used_resolve_ip? Or will the write to it obtain global visibility because mutex_unlock(&ctx->mutex) is executed before any other context can read it?
Thxs, Håkon
>
> Thanks
>
>> ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr,
>> &id->route.addr.dev_addr,
>> timeout_ms, addr_handler,
>> diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
>> index 5c463da9984536..f92f101ea9818f 100644
>> --- a/drivers/infiniband/core/cma_priv.h
>> +++ b/drivers/infiniband/core/cma_priv.h
>> @@ -91,6 +91,7 @@ struct rdma_id_private {
>> u8 afonly;
>> u8 timeout;
>> u8 min_rnr_timer;
>> + u8 used_resolve_ip;
>> enum ib_gid_type gid_type;
>>
>> /*
>>
>> base-commit: ad17bbef3dd573da937816edc0ab84fed6a17fa6
>> --
>> 2.33.0
next prev parent reply other threads:[~2021-09-22 9:39 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-16 18:34 [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests Jason Gunthorpe
2021-09-22 8:01 ` Leon Romanovsky
2021-09-22 9:38 ` Haakon Bugge [this message]
2021-09-22 14:44 ` Jason Gunthorpe
2021-09-22 14:41 ` Jason Gunthorpe
2021-09-23 5:49 ` Leon Romanovsky
2021-09-23 11:45 ` Jason Gunthorpe
2021-09-23 18:15 ` Leon Romanovsky
2021-09-23 20:03 ` Jason Gunthorpe
2021-09-23 23:17 ` Leon Romanovsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=27C1E678-2EC4-4916-9720-00C1B69EA5AA@oracle.com \
--to=haakon.bugge@oracle.com \
--cc=dvyukov@google.com \
--cc=jgg@nvidia.com \
--cc=leon@kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).