* [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
@ 2021-09-16 18:34 Jason Gunthorpe
2021-09-22 8:01 ` Leon Romanovsky
0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2021-09-16 18:34 UTC (permalink / raw)
To: Dmitry Vyukov, linux-rdma; +Cc: syzbot+dc3dfba010d7671e05f5
The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
on the same id_priv. While this cannot happen without going through the
work, it violates the invariant that the same address resolution
background request cannot be active twice.

       CPU 1                                  CPU 2

rdma_resolve_addr():
  RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)  #1

                         process_one_req(): for #1
                          addr_handler():
                            RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
                            mutex_unlock(&id_priv->handler_mutex);
                            [.. handler still running ..]

rdma_resolve_addr():
  RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
  rdma_resolve_ip(addr_handler)
    !! two requests are now on the req_list

rdma_destroy_id():
 destroy_id_handler_unlock():
  _destroy_id():
   cma_cancel_operation():
    rdma_addr_cancel()

                          // process_one_req() self removes it
                          spin_lock_bh(&lock);
                           cancel_delayed_work(&req->work);
                           if (!list_empty(&req->list)) == true

      ! rdma_addr_cancel() returns after process_one_req #1 is done

   kfree(id_priv)

                         process_one_req(): for #2
                          addr_handler():
                            mutex_lock(&id_priv->handler_mutex);
                            !! Use after free on id_priv

rdma_addr_cancel() expects there to be one req on the list and only
cancels the first one. The self-removal behavior of the work only happens
after the handler has returned. This yields a situation where the
req_list can have two reqs for the same "handle" but rdma_addr_cancel()
only cancels the first one.
The second req remains active beyond rdma_destroy_id() and will
use-after-free id_priv once it inevitably triggers.
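
The failure mode described above can be illustrated with a small user-space model. This is a sketch only, not kernel code: struct req, queue_req(), cancel_first() and leftover_after_double_submit() are hypothetical names, and the fixed-size array merely stands in for the kernel's req_list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued address request; not the kernel type. */
struct req {
	const void *handle;	/* identifies the owner, like the dev_addr pointer */
	int active;		/* 1 while the request is still queued */
};

#define MAX_REQS 4
static struct req req_list[MAX_REQS];
static size_t nr_reqs;

static void queue_req(const void *handle)
{
	assert(nr_reqs < MAX_REQS);
	req_list[nr_reqs].handle = handle;
	req_list[nr_reqs].active = 1;
	nr_reqs++;
}

/* Cancels only the first matching request, as the commit message describes. */
static void cancel_first(const void *handle)
{
	for (size_t i = 0; i < nr_reqs; i++) {
		if (req_list[i].active && req_list[i].handle == handle) {
			req_list[i].active = 0;
			return;
		}
	}
}

static size_t count_active(const void *handle)
{
	size_t n = 0;

	for (size_t i = 0; i < nr_reqs; i++)
		if (req_list[i].active && req_list[i].handle == handle)
			n++;
	return n;
}

/* Returns how many requests survive a single cancel after a double submit. */
size_t leftover_after_double_submit(void)
{
	static int owner;	/* plays the role of id_priv's dev_addr */

	nr_reqs = 0;
	queue_req(&owner);	/* first resolve request */
	queue_req(&owner);	/* second one, issued while #1 is still listed */
	cancel_first(&owner);	/* the destroy path cancels exactly once */
	return count_active(&owner);
}
```

In this model one request is still active after the cancel, which corresponds to the req that later touches the freed id_priv.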
Fix this by remembering if the id_priv has called rdma_resolve_ip() and
always cancel before calling it again. This ensures the req_list never
gets more than one item in it and doesn't cost anything in the normal flow
that never uses this strange error path.
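
As a rough illustration of why the flag keeps the list at one entry, here is a user-space model of the guard. It is a sketch under stated assumptions: struct model_id, model_cancel(), model_resolve_ip(), model_resolve_addr() and peak_pending_after() are hypothetical, and a counter replaces the real request list.

```c
#include <assert.h>

/* Hypothetical model: 'pending' counts queued requests for one id. */
struct model_id {
	unsigned char used_resolve_ip;	/* mirrors the u8 field added by the patch */
	int pending;			/* requests currently outstanding */
	int max_pending;		/* high-water mark, to check the invariant */
};

/* Model of a cancel that removes one outstanding request, if there is one. */
static void model_cancel(struct model_id *id)
{
	if (id->pending > 0)
		id->pending--;
}

/* Model of issuing a new background request. */
static void model_resolve_ip(struct model_id *id)
{
	id->pending++;
	if (id->pending > id->max_pending)
		id->max_pending = id->pending;
}

/* Same shape as the hunk in rdma_resolve_addr(): cancel before re-issuing. */
static void model_resolve_addr(struct model_id *id)
{
	if (id->used_resolve_ip)
		model_cancel(id);
	else
		id->used_resolve_ip = 1;
	model_resolve_ip(id);
}

/* Issues n back-to-back resolves with no completion and reports the peak. */
int peak_pending_after(int n)
{
	struct model_id id = {0};

	for (int i = 0; i < n; i++)
		model_resolve_addr(&id);
	return id.max_pending;
}
```

The first call only sets the flag, so the common single-resolve flow never pays for a cancel; every later call cancels first, so the peak stays at one however often the state machine loops.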
Cc: stable@vger.kernel.org
Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")
Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/infiniband/core/cma.c | 17 +++++++++++++++++
drivers/infiniband/core/cma_priv.h | 1 +
2 files changed, 18 insertions(+)
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index c40791baced588..751cf5ea25f296 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -1776,6 +1776,14 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv,
 {
 	switch (state) {
 	case RDMA_CM_ADDR_QUERY:
+		/*
+		 * We can avoid doing the rdma_addr_cancel() based on state,
+		 * only RDMA_CM_ADDR_QUERY has a work that could still execute.
+		 * Notice that the addr_handler work could still be exiting
+		 * outside this state, however due to the interaction with the
+		 * handler_mutex the work is guaranteed not to touch id_priv
+		 * during exit.
+		 */
 		rdma_addr_cancel(&id_priv->id.route.addr.dev_addr);
 		break;
 	case RDMA_CM_ROUTE_QUERY:
@@ -3413,6 +3421,15 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
 	if (dst_addr->sa_family == AF_IB) {
 		ret = cma_resolve_ib_addr(id_priv);
 	} else {
+		/* The FSM can return back to RDMA_CM_ADDR_BOUND after
+		 * rdma_resolve_ip() is called, eg through the error
+		 * path in addr_handler. If this happens the existing
+		 * request must be canceled before issuing a new one.
+		 */
+		if (id_priv->used_resolve_ip)
+			rdma_addr_cancel(&id->route.addr.dev_addr);
+		else
+			id_priv->used_resolve_ip = 1;
 		ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr,
 				      &id->route.addr.dev_addr,
 				      timeout_ms, addr_handler,
diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
index 5c463da9984536..f92f101ea9818f 100644
--- a/drivers/infiniband/core/cma_priv.h
+++ b/drivers/infiniband/core/cma_priv.h
@@ -91,6 +91,7 @@ struct rdma_id_private {
 	u8			afonly;
 	u8			timeout;
 	u8			min_rnr_timer;
+	u8 used_resolve_ip;
 	enum ib_gid_type	gid_type;
 
 	/*
base-commit: ad17bbef3dd573da937816edc0ab84fed6a17fa6
--
2.33.0
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-16 18:34 [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests Jason Gunthorpe
@ 2021-09-22 8:01 ` Leon Romanovsky
2021-09-22 9:38 ` Haakon Bugge
2021-09-22 14:41 ` Jason Gunthorpe
0 siblings, 2 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-09-22 8:01 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Thu, Sep 16, 2021 at 03:34:46PM -0300, Jason Gunthorpe wrote:
> The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
> on the same id_priv. While this cannot happen without going through the
> work, it violates the invariant that the same address resolution
> background request cannot be active twice.
>
> CPU 1 CPU 2
>
> rdma_resolve_addr():
> RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
> rdma_resolve_ip(addr_handler) #1
>
> process_one_req(): for #1
> addr_handler():
> RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
> mutex_unlock(&id_priv->handler_mutex);
> [.. handler still running ..]
>
> rdma_resolve_addr():
> RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
> rdma_resolve_ip(addr_handler)
> !! two requests are now on the req_list
>
> rdma_destroy_id():
> destroy_id_handler_unlock():
> _destroy_id():
> cma_cancel_operation():
> rdma_addr_cancel()
>
> // process_one_req() self removes it
> spin_lock_bh(&lock);
> cancel_delayed_work(&req->work);
> if (!list_empty(&req->list)) == true
>
> ! rdma_addr_cancel() returns after process_on_req #1 is done
>
> kfree(id_priv)
>
> process_one_req(): for #2
> addr_handler():
> mutex_lock(&id_priv->handler_mutex);
> !! Use after free on id_priv
>
> rdma_addr_cancel() expects there to be one req on the list and only
> cancels the first one. The self-removal behavior of the work only happens
> after the handler has returned. This yields a situations where the
> req_list can have two reqs for the same "handle" but rdma_addr_cancel()
> only cancels the first one.
>
> The second req remains active beyond rdma_destroy_id() and will
> use-after-free id_priv once it inevitably triggers.
>
> Fix this by remembering if the id_priv has called rdma_resolve_ip() and
> always cancel before calling it again. This ensures the req_list never
> gets more than one item in it and doesn't cost anything in the normal flow
> that never uses this strange error path.
>
> Cc: stable@vger.kernel.org
> Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")
> Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/infiniband/core/cma.c | 17 +++++++++++++++++
> drivers/infiniband/core/cma_priv.h | 1 +
> 2 files changed, 18 insertions(+)
>
> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
> index c40791baced588..751cf5ea25f296 100644
> --- a/drivers/infiniband/core/cma.c
> +++ b/drivers/infiniband/core/cma.c
> @@ -1776,6 +1776,14 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv,
> {
> switch (state) {
> case RDMA_CM_ADDR_QUERY:
> + /*
> + * We can avoid doing the rdma_addr_cancel() based on state,
> + * only RDMA_CM_ADDR_QUERY has a work that could still execute.
> + * Notice that the addr_handler work could still be exiting
> + * outside this state, however due to the interaction with the
> + * handler_mutex the work is guaranteed not to touch id_priv
> + * during exit.
> + */
> rdma_addr_cancel(&id_priv->id.route.addr.dev_addr);
> break;
> case RDMA_CM_ROUTE_QUERY:
> @@ -3413,6 +3421,15 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
> if (dst_addr->sa_family == AF_IB) {
> ret = cma_resolve_ib_addr(id_priv);
> } else {
> + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> + * rdma_resolve_ip() is called, eg through the error
> + * path in addr_handler. If this happens the existing
> + * request must be canceled before issuing a new one.
> + */
> + if (id_priv->used_resolve_ip)
> + rdma_addr_cancel(&id->route.addr.dev_addr);
> + else
> + id_priv->used_resolve_ip = 1;
Why do you never clear this field? If you assume that this is a
once-in-a-lifetime event, can you please add a comment explaining why?
Thanks
> ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr,
> &id->route.addr.dev_addr,
> timeout_ms, addr_handler,
> diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
> index 5c463da9984536..f92f101ea9818f 100644
> --- a/drivers/infiniband/core/cma_priv.h
> +++ b/drivers/infiniband/core/cma_priv.h
> @@ -91,6 +91,7 @@ struct rdma_id_private {
> u8 afonly;
> u8 timeout;
> u8 min_rnr_timer;
> + u8 used_resolve_ip;
> enum ib_gid_type gid_type;
>
> /*
>
> base-commit: ad17bbef3dd573da937816edc0ab84fed6a17fa6
> --
> 2.33.0
>
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-22 8:01 ` Leon Romanovsky
@ 2021-09-22 9:38 ` Haakon Bugge
2021-09-22 14:44 ` Jason Gunthorpe
2021-09-22 14:41 ` Jason Gunthorpe
1 sibling, 1 reply; 10+ messages in thread
From: Haakon Bugge @ 2021-09-22 9:38 UTC (permalink / raw)
To: Leon Romanovsky
Cc: Jason Gunthorpe, Dmitry Vyukov, OFED mailing list,
syzbot+dc3dfba010d7671e05f5
> On 22 Sep 2021, at 10:01, Leon Romanovsky <leon@kernel.org> wrote:
>
> On Thu, Sep 16, 2021 at 03:34:46PM -0300, Jason Gunthorpe wrote:
>> The FSM can run in a circle allowing rdma_resolve_ip() to be called twice
>> on the same id_priv. While this cannot happen without going through the
>> work, it violates the invariant that the same address resolution
>> background request cannot be active twice.
>>
>> CPU 1 CPU 2
>>
>> rdma_resolve_addr():
>> RDMA_CM_IDLE -> RDMA_CM_ADDR_QUERY
>> rdma_resolve_ip(addr_handler) #1
>>
>> process_one_req(): for #1
>> addr_handler():
>> RDMA_CM_ADDR_QUERY -> RDMA_CM_ADDR_BOUND
>> mutex_unlock(&id_priv->handler_mutex);
>> [.. handler still running ..]
>>
>> rdma_resolve_addr():
>> RDMA_CM_ADDR_BOUND -> RDMA_CM_ADDR_QUERY
>> rdma_resolve_ip(addr_handler)
>> !! two requests are now on the req_list
>>
>> rdma_destroy_id():
>> destroy_id_handler_unlock():
>> _destroy_id():
>> cma_cancel_operation():
>> rdma_addr_cancel()
>>
>> // process_one_req() self removes it
>> spin_lock_bh(&lock);
>> cancel_delayed_work(&req->work);
>> if (!list_empty(&req->list)) == true
>>
>> ! rdma_addr_cancel() returns after process_on_req #1 is done
>>
>> kfree(id_priv)
>>
>> process_one_req(): for #2
>> addr_handler():
>> mutex_lock(&id_priv->handler_mutex);
>> !! Use after free on id_priv
>>
>> rdma_addr_cancel() expects there to be one req on the list and only
>> cancels the first one. The self-removal behavior of the work only happens
>> after the handler has returned. This yields a situations where the
>> req_list can have two reqs for the same "handle" but rdma_addr_cancel()
>> only cancels the first one.
>>
>> The second req remains active beyond rdma_destroy_id() and will
>> use-after-free id_priv once it inevitably triggers.
>>
>> Fix this by remembering if the id_priv has called rdma_resolve_ip() and
>> always cancel before calling it again. This ensures the req_list never
>> gets more than one item in it and doesn't cost anything in the normal flow
>> that never uses this strange error path.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: e51060f08a61 ("IB: IP address based RDMA connection manager")
>> Reported-by: syzbot+dc3dfba010d7671e05f5@syzkaller.appspotmail.com
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> ---
>> drivers/infiniband/core/cma.c | 17 +++++++++++++++++
>> drivers/infiniband/core/cma_priv.h | 1 +
>> 2 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
>> index c40791baced588..751cf5ea25f296 100644
>> --- a/drivers/infiniband/core/cma.c
>> +++ b/drivers/infiniband/core/cma.c
>> @@ -1776,6 +1776,14 @@ static void cma_cancel_operation(struct rdma_id_private *id_priv,
>> {
>> switch (state) {
>> case RDMA_CM_ADDR_QUERY:
>> + /*
>> + * We can avoid doing the rdma_addr_cancel() based on state,
>> + * only RDMA_CM_ADDR_QUERY has a work that could still execute.
>> + * Notice that the addr_handler work could still be exiting
>> + * outside this state, however due to the interaction with the
>> + * handler_mutex the work is guaranteed not to touch id_priv
>> + * during exit.
>> + */
>> rdma_addr_cancel(&id_priv->id.route.addr.dev_addr);
>> break;
>> case RDMA_CM_ROUTE_QUERY:
>> @@ -3413,6 +3421,15 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
>> if (dst_addr->sa_family == AF_IB) {
>> ret = cma_resolve_ib_addr(id_priv);
>> } else {
>> + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
>> + * rdma_resolve_ip() is called, eg through the error
>> + * path in addr_handler. If this happens the existing
>> + * request must be canceled before issuing a new one.
>> + */
>> + if (id_priv->used_resolve_ip)
>> + rdma_addr_cancel(&id->route.addr.dev_addr);
>> + else
>> + id_priv->used_resolve_ip = 1;
>
> Why don't you never clear this field? If you assume that this is one lifetime
> event, can you please add a comment with an explanation "why"?
Adding to that, don't you need {READ,WRITE}_ONCE when accessing used_resolve_ip? Or will the write to it obtain global visibility because mutex_unlock(&ctx->mutex) is executed before any other context can read it?
Thxs, Håkon
>
> Thanks
>
>> ret = rdma_resolve_ip(cma_src_addr(id_priv), dst_addr,
>> &id->route.addr.dev_addr,
>> timeout_ms, addr_handler,
>> diff --git a/drivers/infiniband/core/cma_priv.h b/drivers/infiniband/core/cma_priv.h
>> index 5c463da9984536..f92f101ea9818f 100644
>> --- a/drivers/infiniband/core/cma_priv.h
>> +++ b/drivers/infiniband/core/cma_priv.h
>> @@ -91,6 +91,7 @@ struct rdma_id_private {
>> u8 afonly;
>> u8 timeout;
>> u8 min_rnr_timer;
>> + u8 used_resolve_ip;
>> enum ib_gid_type gid_type;
>>
>> /*
>>
>> base-commit: ad17bbef3dd573da937816edc0ab84fed6a17fa6
>> --
>> 2.33.0
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-22 8:01 ` Leon Romanovsky
2021-09-22 9:38 ` Haakon Bugge
@ 2021-09-22 14:41 ` Jason Gunthorpe
2021-09-23 5:49 ` Leon Romanovsky
1 sibling, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2021-09-22 14:41 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Wed, Sep 22, 2021 at 11:01:39AM +0300, Leon Romanovsky wrote:
> > + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> > + * rdma_resolve_ip() is called, eg through the error
> > + * path in addr_handler. If this happens the existing
> > + * request must be canceled before issuing a new one.
> > + */
> > + if (id_priv->used_resolve_ip)
> > + rdma_addr_cancel(&id->route.addr.dev_addr);
> > + else
> > + id_priv->used_resolve_ip = 1;
>
> Why don't you never clear this field?
The only case where it can be cleared is if we have called
rdma_addr_cancel(), and since this is the only place that does it and
immediately calls rdma_resolve_ip() again, there is no reason to ever
clear it.
Jason
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-22 9:38 ` Haakon Bugge
@ 2021-09-22 14:44 ` Jason Gunthorpe
0 siblings, 0 replies; 10+ messages in thread
From: Jason Gunthorpe @ 2021-09-22 14:44 UTC (permalink / raw)
To: Haakon Bugge
Cc: Leon Romanovsky, Dmitry Vyukov, OFED mailing list,
syzbot+dc3dfba010d7671e05f5
On Wed, Sep 22, 2021 at 09:38:40AM +0000, Haakon Bugge wrote:
> >> @@ -3413,6 +3421,15 @@ int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
> >> if (dst_addr->sa_family == AF_IB) {
> >> ret = cma_resolve_ib_addr(id_priv);
> >> } else {
> >> + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> >> + * rdma_resolve_ip() is called, eg through the error
> >> + * path in addr_handler. If this happens the existing
> >> + * request must be canceled before issuing a new one.
> >> + */
> >> + if (id_priv->used_resolve_ip)
> >> + rdma_addr_cancel(&id->route.addr.dev_addr);
> >> + else
> >> + id_priv->used_resolve_ip = 1;
> >
> > Why don't you never clear this field? If you assume that this is one lifetime
> > event, can you please add a comment with an explanation "why"?
>
> Adding to that, don't you need {READ,WRITE}_ONCE when accessing
> used_resolve_ip?
The FSM logic guarantees there is no concurrent access here; this is
the only thread that can be in this state at this point.
> Or will the write to it obtain global visibility because
> mutex_unlock(&ctx->mutex) is executed before any other context can
> read it?
Global visibility flows indirectly through rdma_resolve_ip() to
the work. Basically, when rdma_resolve_ip() schedules the work it
does a full release, then the work does a spinlock/unlock which is
another full release, and finally the next time we go through this
function it does another spinlock/unlock which will act as an acquire
for this store.
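
The release/acquire pairing this reply relies on can be sketched in user space. This is an illustration only and makes a deliberate substitution: C11 atomics and pthreads stand in for the kernel's spinlock and workqueue hand-offs, and used_flag, published, writer(), reader() and observe_flag() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int used_flag;		/* plain, non-atomic store, like used_resolve_ip */
static atomic_int published;	/* stands in for the lock hand-off */

static void *writer(void *arg)
{
	(void)arg;
	used_flag = 1;		/* plain write, ordered by the release below */
	atomic_store_explicit(&published, 1, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	int *seen = arg;

	/* Spin until the acquire load observes the writer's release store. */
	while (!atomic_load_explicit(&published, memory_order_acquire))
		;
	*seen = used_flag;	/* the earlier plain write is now visible */
	return NULL;
}

/* Runs one writer and one reader and returns what the reader observed. */
int observe_flag(void)
{
	pthread_t w, r;
	int seen = -1;

	used_flag = 0;
	atomic_store(&published, 0);
	pthread_create(&r, NULL, reader, &seen);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return seen;
}
```

The plain store needs no READ_ONCE/WRITE_ONCE equivalent here because the reader only touches used_flag after its acquire load has seen the release store.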
Jason
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-22 14:41 ` Jason Gunthorpe
@ 2021-09-23 5:49 ` Leon Romanovsky
2021-09-23 11:45 ` Jason Gunthorpe
0 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2021-09-23 5:49 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Wed, Sep 22, 2021 at 11:41:19AM -0300, Jason Gunthorpe wrote:
> On Wed, Sep 22, 2021 at 11:01:39AM +0300, Leon Romanovsky wrote:
>
> > > + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> > > + * rdma_resolve_ip() is called, eg through the error
> > > + * path in addr_handler. If this happens the existing
> > > + * request must be canceled before issuing a new one.
> > > + */
> > > + if (id_priv->used_resolve_ip)
> > > + rdma_addr_cancel(&id->route.addr.dev_addr);
> > > + else
> > > + id_priv->used_resolve_ip = 1;
> >
> > Why don't you never clear this field?
>
> The only case where it can be cleared is if we have called
> rdma_addr_cancel(), and since this is the only place that does it and
> immediately calls rdma_resolve_ip() again, there is no reason to ever
> clear it.
IMHO, it is better to clear it than to rely on "the only place" semantics.
Thanks
>
> Jason
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-23 5:49 ` Leon Romanovsky
@ 2021-09-23 11:45 ` Jason Gunthorpe
2021-09-23 18:15 ` Leon Romanovsky
0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2021-09-23 11:45 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Thu, Sep 23, 2021 at 08:49:06AM +0300, Leon Romanovsky wrote:
> On Wed, Sep 22, 2021 at 11:41:19AM -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2021 at 11:01:39AM +0300, Leon Romanovsky wrote:
> >
> > > > + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> > > > + * rdma_resolve_ip() is called, eg through the error
> > > > + * path in addr_handler. If this happens the existing
> > > > + * request must be canceled before issuing a new one.
> > > > + */
> > > > + if (id_priv->used_resolve_ip)
> > > > + rdma_addr_cancel(&id->route.addr.dev_addr);
> > > > + else
> > > > + id_priv->used_resolve_ip = 1;
> > >
> > > Why don't you never clear this field?
> >
> > The only case where it can be cleared is if we have called
> > rdma_addr_cancel(), and since this is the only place that does it and
> > immediately calls rdma_resolve_ip() again, there is no reason to ever
> > clear it.
>
> IMHO, it is better to clear instead to rely on "the only place" semantic.
Then the code looks really silly:
	if (id_priv->used_resolve_ip) {
		rdma_addr_cancel(&id->route.addr.dev_addr);
		id_priv->used_resolve_ip = 0;
	}
	id_priv->used_resolve_ip = 1;
Jason
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-23 11:45 ` Jason Gunthorpe
@ 2021-09-23 18:15 ` Leon Romanovsky
2021-09-23 20:03 ` Jason Gunthorpe
0 siblings, 1 reply; 10+ messages in thread
From: Leon Romanovsky @ 2021-09-23 18:15 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Thu, Sep 23, 2021 at 08:45:57AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 23, 2021 at 08:49:06AM +0300, Leon Romanovsky wrote:
> > On Wed, Sep 22, 2021 at 11:41:19AM -0300, Jason Gunthorpe wrote:
> > > On Wed, Sep 22, 2021 at 11:01:39AM +0300, Leon Romanovsky wrote:
> > >
> > > > > + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> > > > > + * rdma_resolve_ip() is called, eg through the error
> > > > > + * path in addr_handler. If this happens the existing
> > > > > + * request must be canceled before issuing a new one.
> > > > > + */
> > > > > + if (id_priv->used_resolve_ip)
> > > > > + rdma_addr_cancel(&id->route.addr.dev_addr);
> > > > > + else
> > > > > + id_priv->used_resolve_ip = 1;
> > > >
> > > > Why don't you never clear this field?
> > >
> > > The only case where it can be cleared is if we have called
> > > rdma_addr_cancel(), and since this is the only place that does it and
> > > immediately calls rdma_resolve_ip() again, there is no reason to ever
> > > clear it.
> >
> > IMHO, it is better to clear instead to rely on "the only place" semantic.
>
> Then the code looks really silly:
>
> if (id_priv->used_resolve_ip) {
> rdma_addr_cancel(&id->route.addr.dev_addr);
> id_priv->used_resolve_ip = 0;
> }
> id_priv->used_resolve_ip = 1;
So write a comment explaining why you don't need to clear used_resolve_ip,
but don't leave it as it is now, where readers need to guess.
Thanks
>
> Jason
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-23 18:15 ` Leon Romanovsky
@ 2021-09-23 20:03 ` Jason Gunthorpe
2021-09-23 23:17 ` Leon Romanovsky
0 siblings, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2021-09-23 20:03 UTC (permalink / raw)
To: Leon Romanovsky; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Thu, Sep 23, 2021 at 09:15:44PM +0300, Leon Romanovsky wrote:
> On Thu, Sep 23, 2021 at 08:45:57AM -0300, Jason Gunthorpe wrote:
> > On Thu, Sep 23, 2021 at 08:49:06AM +0300, Leon Romanovsky wrote:
> > > On Wed, Sep 22, 2021 at 11:41:19AM -0300, Jason Gunthorpe wrote:
> > > > On Wed, Sep 22, 2021 at 11:01:39AM +0300, Leon Romanovsky wrote:
> > > >
> > > > > > + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> > > > > > + * rdma_resolve_ip() is called, eg through the error
> > > > > > + * path in addr_handler. If this happens the existing
> > > > > > + * request must be canceled before issuing a new one.
> > > > > > + */
> > > > > > + if (id_priv->used_resolve_ip)
> > > > > > + rdma_addr_cancel(&id->route.addr.dev_addr);
> > > > > > + else
> > > > > > + id_priv->used_resolve_ip = 1;
> > > > >
> > > > > Why don't you never clear this field?
> > > >
> > > > The only case where it can be cleared is if we have called
> > > > rdma_addr_cancel(), and since this is the only place that does it and
> > > > immediately calls rdma_resolve_ip() again, there is no reason to ever
> > > > clear it.
> > >
> > > IMHO, it is better to clear instead to rely on "the only place" semantic.
> >
> > Then the code looks really silly:
> >
> > if (id_priv->used_resolve_ip) {
> > rdma_addr_cancel(&id->route.addr.dev_addr);
> > id_priv->used_resolve_ip = 0;
> > }
> > id_priv->used_resolve_ip = 1;
>
> So write comment why you don't need to clear used_resolve_ip, but don't
> leave it as it is now, where readers need to guess.
>
I think it is a bit wordy, but I put this:
/*
* The FSM can return back to RDMA_CM_ADDR_BOUND after
* rdma_resolve_ip() is called, eg through the error
* path in addr_handler(). If this happens the existing
* request must be canceled before issuing a new one.
* Since canceling a request is a bit slow and this
* oddball path is rare, keep track once a request has
* been issued. The track turns out to be a permanent
* state since this is the only cancel as it is
* immediately before rdma_resolve_ip().
*/
And applied to for-rc
Jason
* Re: [PATCH rc] RDMA/cma: Ensure rdma_addr_cancel() happens before issuing more requests
2021-09-23 20:03 ` Jason Gunthorpe
@ 2021-09-23 23:17 ` Leon Romanovsky
0 siblings, 0 replies; 10+ messages in thread
From: Leon Romanovsky @ 2021-09-23 23:17 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: Dmitry Vyukov, linux-rdma, syzbot+dc3dfba010d7671e05f5
On Thu, Sep 23, 2021 at 05:03:58PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 23, 2021 at 09:15:44PM +0300, Leon Romanovsky wrote:
> > On Thu, Sep 23, 2021 at 08:45:57AM -0300, Jason Gunthorpe wrote:
> > > On Thu, Sep 23, 2021 at 08:49:06AM +0300, Leon Romanovsky wrote:
> > > > On Wed, Sep 22, 2021 at 11:41:19AM -0300, Jason Gunthorpe wrote:
> > > > > On Wed, Sep 22, 2021 at 11:01:39AM +0300, Leon Romanovsky wrote:
> > > > >
> > > > > > > + /* The FSM can return back to RDMA_CM_ADDR_BOUND after
> > > > > > > + * rdma_resolve_ip() is called, eg through the error
> > > > > > > + * path in addr_handler. If this happens the existing
> > > > > > > + * request must be canceled before issuing a new one.
> > > > > > > + */
> > > > > > > + if (id_priv->used_resolve_ip)
> > > > > > > + rdma_addr_cancel(&id->route.addr.dev_addr);
> > > > > > > + else
> > > > > > > + id_priv->used_resolve_ip = 1;
> > > > > >
> > > > > > Why don't you never clear this field?
> > > > >
> > > > > The only case where it can be cleared is if we have called
> > > > > rdma_addr_cancel(), and since this is the only place that does it and
> > > > > immediately calls rdma_resolve_ip() again, there is no reason to ever
> > > > > clear it.
> > > >
> > > > IMHO, it is better to clear instead to rely on "the only place" semantic.
> > >
> > > Then the code looks really silly:
> > >
> > > if (id_priv->used_resolve_ip) {
> > > rdma_addr_cancel(&id->route.addr.dev_addr);
> > > id_priv->used_resolve_ip = 0;
> > > }
> > > id_priv->used_resolve_ip = 1;
> >
> > So write comment why you don't need to clear used_resolve_ip, but don't
> > leave it as it is now, where readers need to guess.
> >
>
> I think it is a bit wordy, but I put this:
>
> /*
> * The FSM can return back to RDMA_CM_ADDR_BOUND after
> * rdma_resolve_ip() is called, eg through the error
> * path in addr_handler(). If this happens the existing
> * request must be canceled before issuing a new one.
> * Since canceling a request is a bit slow and this
> * oddball path is rare, keep track once a request has
> * been issued. The track turns out to be a permanent
> * state since this is the only cancel as it is
> * immediately before rdma_resolve_ip().
> */
>
> And into for-rc
Thanks
>
> Jason