* [PATCH] nvme-rdma: Fix early queue flags settings
@ 2016-09-20 18:57 Sagi Grimberg
  2016-09-20 19:01 ` Steve Wise
  2016-09-20 20:08 ` Christoph Hellwig
  0 siblings, 2 replies; 13+ messages in thread
From: Sagi Grimberg @ 2016-09-20 18:57 UTC (permalink / raw)


When we reconnect we can't simply clear the
queue flags (the DELETING flag in particular), because
we might end up triggering a use-after-free condition
if we fail to establish the rdma connection.

Fixes: e89ca58f9c90 ("nvme-rdma: add DELETING queue flag")
Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
---
 drivers/nvme/host/rdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index c2c2c28e6eb5..3437f0e8866f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -561,7 +561,6 @@ static int nvme_rdma_init_queue(struct nvme_rdma_ctrl *ctrl,
 
 	queue = &ctrl->queues[idx];
 	queue->ctrl = ctrl;
-	queue->flags = 0;
 	init_completion(&queue->cm_done);
 
 	if (idx > 0)
@@ -595,6 +594,7 @@ static int nvme_rdma_init_queue(struct nvme_rdma_ctrl *ctrl,
 		goto out_destroy_cm_id;
 	}
 
+	queue->flags = 0;
 	set_bit(NVME_RDMA_Q_CONNECTED, &queue->flags);
 
 	return 0;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 13+ messages in thread
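
For readers following the use-after-free reasoning above: the teardown
path uses the DELETING bit as a "teardown already ran" latch, so if a
failed reconnect wipes queue->flags, a later teardown no longer sees the
bit set and walks over RDMA resources that were already freed.  A minimal
sketch of that guard (illustration only; the helper name follows the
stop_and_free_queue() mentioned later in this thread and is not verbatim
4.8 source):

static void nvme_rdma_stop_and_free_queue(struct nvme_rdma_queue *queue)
{
        /* Only the first caller gets past this latch. */
        if (test_and_set_bit(NVME_RDMA_Q_DELETING, &queue->flags))
                return;
        nvme_rdma_stop_queue(queue);    /* disconnect and drain the QP */
        nvme_rdma_free_queue(queue);    /* destroy the QP and cm_id */
}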

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-20 18:57 [PATCH] nvme-rdma: Fix early queue flags settings Sagi Grimberg
@ 2016-09-20 19:01 ` Steve Wise
  2016-09-20 19:05   ` Sagi Grimberg
  2016-09-20 20:08 ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread
From: Steve Wise @ 2016-09-20 19:01 UTC (permalink / raw)


> When we reconnect we can't simply clear the
> queue flags (the DELETING flag in particular), because
> we might end up triggering a use-after-free condition
> if we fail to establish the rdma connection.
> 
> Fixes: e89ca58f9c90 ("nvme-rdma: add DELETING queue flag")
> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>

Tests good.  

Thanks Sagi!

Tested-by: Steve Wise <swise at opengridcomputing.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-20 19:01 ` Steve Wise
@ 2016-09-20 19:05   ` Sagi Grimberg
  0 siblings, 0 replies; 13+ messages in thread
From: Sagi Grimberg @ 2016-09-20 19:05 UTC (permalink / raw)



>> When we reconnect we can't simply clear the
>> queue flags (the DELETING flag in particular), because
>> we might end up triggering a use-after-free condition
>> if we fail to establish the rdma connection.
>>
>> Fixes: e89ca58f9c90 ("nvme-rdma: add DELETING queue flag")
>> Signed-off-by: Sagi Grimberg <sagi at grimberg.me>
>
> Tests good.
>
> Thanks Sagi!
>
> Tested-by: Steve Wise <swise at opengridcomputing.com>

Thanks Steve,

Jens, can you grab this last-minute fix for rc8?

You can pull it from
git://git.infradead.org/nvme-fabrics.git nvmf-4.8-rc

Thanks

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-20 18:57 [PATCH] nvme-rdma: Fix early queue flags settings Sagi Grimberg
  2016-09-20 19:01 ` Steve Wise
@ 2016-09-20 20:08 ` Christoph Hellwig
  2016-09-20 20:14   ` Steve Wise
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2016-09-20 20:08 UTC (permalink / raw)


On Tue, Sep 20, 2016 at 11:57:28AM -0700, Sagi Grimberg wrote:
> When we reconnect we can't simply clear the
> queue flags (the DELETING flag in particular), because
> we might end up triggering a use-after-free condition
> if we fail to establish the rdma connection.

Can you add a comment explaining this next to the assignment?  Looking
at the patch it looks like black magic to me, and that's even
after reading the above changelog.

^ permalink raw reply	[flat|nested] 13+ messages in thread
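
One way such a comment could read (hypothetical wording, not taken from
the actual patch), placed right above the assignment that the diff moves
down:

        /*
         * Only reset the queue flags once the RDMA connection has been
         * established.  If connecting fails, NVME_RDMA_Q_DELETING must
         * survive so that teardown doesn't try to free a cm_id/QP that
         * is already gone.
         */
        queue->flags = 0;
        set_bit(NVME_RDMA_Q_CONNECTED, &queue->flags);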

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-20 20:08 ` Christoph Hellwig
@ 2016-09-20 20:14   ` Steve Wise
  2016-09-21  3:38     ` Sagi Grimberg
  0 siblings, 1 reply; 13+ messages in thread
From: Steve Wise @ 2016-09-20 20:14 UTC (permalink / raw)


> 
> On Tue, Sep 20, 2016 at 11:57:28AM -0700, Sagi Grimberg wrote:
> > When we reconnect we can't simply clear the
> > queue flags (the DELETING flag in particular), because
> > we might end up triggering a use-after-free condition
> > if we fail to establish the rdma connection.
> 
> Can you add a comment explaining this next to the assignment?  Looking
> at the patch it looks like black magic to me, and that's even
> after reading the above changelog.

Maybe this changelog?

    nvme-rdma: only clear queue flags after successful connect

    Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
    try to stop/free rdma qps/cm_ids that are already freed.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-20 20:14   ` Steve Wise
@ 2016-09-21  3:38     ` Sagi Grimberg
  2016-09-21 14:01       ` 'Christoph Hellwig'
  0 siblings, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2016-09-21  3:38 UTC (permalink / raw)



>> On Tue, Sep 20, 2016 at 11:57:28AM -0700, Sagi Grimberg wrote:
>>> When we reconnect we can't simply clear the
>>> queue flags (the DELETING flag in particular), because
>>> we might end up triggering a use-after-free condition
>>> if we fail to establish the rdma connection.
>>
>> Can you add a comment explaining this next to the assignment?  Looking
>> at the patch it looks like black magic to me, and that's even
>> after reading the above changelog.
>
> Maybe this changelog?
>
>     nvme-rdma: only clear queue flags after successful connect
>
>     Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
>     try to stop/free rdma qps/cm_ids that are already freed.

I can modify the changelog. Christoph, do you still want a
comment in the code?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21  3:38     ` Sagi Grimberg
@ 2016-09-21 14:01       ` 'Christoph Hellwig'
  2016-09-21 14:18         ` Steve Wise
  2016-09-21 14:36         ` Sagi Grimberg
  0 siblings, 2 replies; 13+ messages in thread
From: 'Christoph Hellwig' @ 2016-09-21 14:01 UTC (permalink / raw)


On Tue, Sep 20, 2016 at 08:38:52PM -0700, Sagi Grimberg wrote:
>> Maybe this changelog?
>>
>>     nvme-rdma: only clear queue flags after successful connect
>>
>>     Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
>>     try to stop/free rdma qps/cm_ids that are already freed.
>
> I can modify the changelog. Christoph, do you still want a
> comment in the code?

Honestly, the more I look into this the less happy I am with the patch.
queue->flags is an atomic, and as the patch shows we can get
nvme_rdma_init_queue called on a queue that still has visibility in
other threads.  So I think we really should not even do that simple
queue->flags = 0 assignment at all.  We'll need to use clear_bit to
atomically clear anything that might be set, and we need to be careful
where we do that.  I think this whole situation, where we can get an
*_init_* function called on something that is already live and visible
to other threads, needs to be well documented at least, because it's just
waiting for a sucker like me who doesn't expect that.

^ permalink raw reply	[flat|nested] 13+ messages in thread
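
The distinction being drawn here, sketched for clarity:

        /* Plain store: clobbers every bit at once and is not atomic
         * with respect to concurrent set_bit()/test_bit() callers. */
        queue->flags = 0;

        /* Atomic read-modify-write: clears only the named bit and
         * leaves any other state alone. */
        clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);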

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21 14:01       ` 'Christoph Hellwig'
@ 2016-09-21 14:18         ` Steve Wise
  2016-09-21 14:52           ` Steve Wise
  2016-09-21 14:36         ` Sagi Grimberg
  1 sibling, 1 reply; 13+ messages in thread
From: Steve Wise @ 2016-09-21 14:18 UTC (permalink / raw)


> >
> > I can modify the changelog. Christoph, do you still want a
> > comment in the code?
> 
> Honestly, the more I look into this the less happy I am with the patch.
> queue->flags is an atomic, and as the patch shows we can get
> nvme_rdma_init_queue called on a queue that still has visibility in
> other threads.  So I think we really should not even do that simple
> queue->flags = 0 assignment at all.  We'll need to use clear_bit to
> atomically clear anything that might be set, and we need to be careful
> where we do that.  I think this whole situation, where we can get an
> *_init_* function called on something that is already live and visible
> to other threads, needs to be well documented at least, because it's just
> waiting for a sucker like me who doesn't expect that.


Sagi, you originally proposed this in a patch for debugging the crash where
a request is accessing a queue with rdma resources freed:

@@ -542,11 +542,12 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue,
                 goto out_destroy_qp;
         }
         set_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags);
+       clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);

         return 0;


Perhaps this is how we should proceed?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21 14:01       ` 'Christoph Hellwig'
  2016-09-21 14:18         ` Steve Wise
@ 2016-09-21 14:36         ` Sagi Grimberg
  2016-09-21 15:50           ` Steve Wise
  1 sibling, 1 reply; 13+ messages in thread
From: Sagi Grimberg @ 2016-09-21 14:36 UTC (permalink / raw)


>>> Maybe this changelog?
>>>
>>>     nvme-rdma: only clear queue flags after successful connect
>>>
>>>     Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
>>>     try to stop/free rdma qps/cm_ids that are already freed.
>>
>> I can modify the changelog. Christoph, do you still want a
>> comment in the code?
>
> Honestly, the more I look into this the less happy I am with the patch.

I agree that we should really do a proper queue state machine.

> queue->flags is an atomic, and as the patch shows we can get
> nvme_rdma_init_queue called on a queue that still has visibility in
> other threads.  So I think we really should not even do that simple
> queue->flags = 0 assignment at all.  We'll need to use clear_bit to
> atomically clear anything that might be set, and we need to be careful
> where we do that.  I think this whole situation, where we can get an
> *_init_* function called on something that is already live and visible
> to other threads, needs to be well documented at least, because it's just
> waiting for a sucker like me who doesn't expect that.

I don't think it's a multithreading issue. We aren't expected to get
here concurrently from different contexts (I think we're screwed if we
do). The issue was that we were not clearing the DELETING flag on
reconnects, causing possible leaks. Then I "fixed" it by resetting the
queue flags at init_queue start; the problem was that if we failed to
init the queue we lost the DELETING flag and got to a use-after-free
condition (nothing prevented stop_and_free_queue from making forward
progress...).

I can move it to clear_bit if you want...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21 14:18         ` Steve Wise
@ 2016-09-21 14:52           ` Steve Wise
  2016-09-21 15:10             ` Steve Wise
  0 siblings, 1 reply; 13+ messages in thread
From: Steve Wise @ 2016-09-21 14:52 UTC (permalink / raw)


> Sagi, you originally proposed this in a patch for debugging the crash where
> a request is accessing a queue with rdma resources freed:
> 
> @@ -542,11 +542,12 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue,
>                  goto out_destroy_qp;
>          }
>          set_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags);
> +       clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);
> 
>          return 0;
> 
> 
> Perhaps this is how we should proceed?

Just tested this, and it doesn't fix the problem.  This makes sense, because
we really only want to clear it if we successfully set up the connection...
So never mind. :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21 14:52           ` Steve Wise
@ 2016-09-21 15:10             ` Steve Wise
  0 siblings, 0 replies; 13+ messages in thread
From: Steve Wise @ 2016-09-21 15:10 UTC (permalink / raw)


> > Sagi, you originally proposed this in a patch for debugging the crash where
> > a request is accessing a queue with rdma resources freed:
> >
> > @@ -542,11 +542,12 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue,
> >                  goto out_destroy_qp;
> >          }
> >          set_bit(NVME_RDMA_IB_QUEUE_ALLOCATED, &queue->flags);
> > +       clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);
> >
> >          return 0;
> >
> >
> > Perhaps this is how we should proceed?
> 
> Just tested this, and it doesn't fix the problem.  This makes sense, because
> we really only want to clear it if we successfully set up the connection...
> So never mind. :)

What I'm trying to say is that if you use clear_bit(), it needs to be
exactly where the flags = 0 assignment was.

Thanks,

Steve.

^ permalink raw reply	[flat|nested] 13+ messages in thread
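
In other words, something along these lines in nvme_rdma_init_queue(),
sketched from the discussion rather than copied from the final committed
patch:

-	queue->flags = 0;
+	clear_bit(NVME_RDMA_Q_DELETING, &queue->flags);
 	set_bit(NVME_RDMA_Q_CONNECTED, &queue->flags);

 	return 0;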

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21 14:36         ` Sagi Grimberg
@ 2016-09-21 15:50           ` Steve Wise
  2016-09-21 19:57             ` Sagi Grimberg
  0 siblings, 1 reply; 13+ messages in thread
From: Steve Wise @ 2016-09-21 15:50 UTC (permalink / raw)


> 
> >>> Maybe this changelog?
> >>>
> >>>     nvme-rdma: only clear queue flags after successful connect
> >>>
> >>>     Otherwise, nvme_rdma_stop_and_clear_queue() will incorrectly
> >>>     try to stop/free rdma qps/cm_ids that are already freed.
> >>
> >> I can modify the changelog. Christoph, do you still want a
> >> comment in the code?
> >
> > Honestly, the more I look into this the less happy I am with the patch.
> 
> I agree that we should really do a proper queue state machine.
> 
> > queue->flags is an atomic, and as the patch shows we can get
> > nvme_rdma_init_queue called on a queue that still has visibility in
> > other threads.  So I think we really should not even do that simple
> > queue->flags = 0 assignment at all.  We'll need to use clear_bit to
> > atomically clear anything that might be set, and we need to be careful
> > where we do that.  I think this whole situation, where we can get an
> > *_init_* function called on something that is already live and visible
> > to other threads, needs to be well documented at least, because it's just
> > waiting for a sucker like me who doesn't expect that.
> 
> I don't think it's a multithreading issue. We aren't expected to get
> here concurrently from different contexts (I think we're screwed if we
> do). The issue was that we were not clearing the DELETING flag on
> reconnects, causing possible leaks. Then I "fixed" it by resetting the
> queue flags at init_queue start; the problem was that if we failed to
> init the queue we lost the DELETING flag and got to a use-after-free
> condition (nothing prevented stop_and_free_queue from making forward
> progress...).
> 
> I can move it to clear_bit if you want...
> 

I'd like this in 4.8 to avoid the crash during reconnect.  (The clear_bit()
version is fine and tests ok.)

I'll sign up to work with Sagi on a proper state machine fix for 4.9.  How's
that sound?

Steve.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH] nvme-rdma: Fix early queue flags settings
  2016-09-21 15:50           ` Steve Wise
@ 2016-09-21 19:57             ` Sagi Grimberg
  0 siblings, 0 replies; 13+ messages in thread
From: Sagi Grimberg @ 2016-09-21 19:57 UTC (permalink / raw)



>>> Honestly, the more I look into this the less happy I am with the patch.
>>
>> I agree that we should really do a proper queue state machine.
>>
>>> queue->flags is an atomic, and as the patch shows we can get
>>> nvme_rdma_init_queue called on a queue that still has visibility in
>>> other threads.  So I think we really should not even do that simple
>>> queue->flags = 0 assignment at all.  We'll need to use clear_bit to
>>> atomically clear anything that might be set, and we need to be careful
>>> where we do that.  I think this whole situation, where we can get an
>>> *_init_* function called on something that is already live and visible
>>> to other threads, needs to be well documented at least, because it's just
>>> waiting for a sucker like me who doesn't expect that.
>>
>> I don't think it's a multithreading issue. We aren't expected to get
>> here concurrently from different contexts (I think we're screwed if we
>> do). The issue was that we were not clearing the DELETING flag on
>> reconnects, causing possible leaks. Then I "fixed" it by resetting the
>> queue flags at init_queue start; the problem was that if we failed to
>> init the queue we lost the DELETING flag and got to a use-after-free
>> condition (nothing prevented stop_and_free_queue from making forward
>> progress...).
>>
>> I can move it to clear_bit if you want...
>>
>
> I'd like this in 4.8 to avoid the crash during reconnect.  (The clear_bit()
> version is fine and tests ok.)
>
> I'll sign up to work with Sagi on a proper state machine fix for 4.9.  How's
> that sound?

Sounds good to me; I'll send out the clear_bit() version.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2016-09-21 19:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-20 18:57 [PATCH] nvme-rdma: Fix early queue flags settings Sagi Grimberg
2016-09-20 19:01 ` Steve Wise
2016-09-20 19:05   ` Sagi Grimberg
2016-09-20 20:08 ` Christoph Hellwig
2016-09-20 20:14   ` Steve Wise
2016-09-21  3:38     ` Sagi Grimberg
2016-09-21 14:01       ` 'Christoph Hellwig'
2016-09-21 14:18         ` Steve Wise
2016-09-21 14:52           ` Steve Wise
2016-09-21 15:10             ` Steve Wise
2016-09-21 14:36         ` Sagi Grimberg
2016-09-21 15:50           ` Steve Wise
2016-09-21 19:57             ` Sagi Grimberg
